The real realtime preemption end game

Animats · 2023-11-16T19:26:24

QNX had this right decades ago. The microkernel has upper bounds on everything it does. There are only a few tens of thousands of lines of microkernel code. All the microkernel does is allocate memory, dispatch the CPU, and pass messages between processes. Everything else, including drivers and loggers, is in user space and can be preempted by higher priority threads.

The QNX kernel doesn't do anything with strings. No parsing, no formatting, no messages.

Linux suffers from being too bloated for real time. Millions of lines of kernel, all of which have to be made preemptable. It's the wrong architecture for real time. So it took two decades to try to fix this.

vacuity · 2023-11-16T19:37:31

For a modern example, there's seL4. I believe it does no dynamic memory allocation. It's also formally verified for various properties. (Arguably?) its biggest contribution to kernel design is the pervasive usage of capabilities to securely but flexibly export control to userspace.

_kb · 2023-11-17T00:03:45

And unfortunately had its funding dumped because it wasn’t shiny AI.

adastra22 · 2023-11-16T20:03:54

Capabilities are important, but I don’t think that was introduced by seL4. Mach (which underlies macOS) has the same capability-based system.

vacuity · 2023-11-16T21:30:28

I didn't say seL4 introduced capabilities. However, to my knowledge, seL4 was the first kernel to show that pervasive usage of capabilities is both feasible and beneficial.

adastra22 · 2023-11-17T00:54:17

There's quite a history of capabilities-based research OS's that culminated in, but did not start with L4 (of which seL4 is a later variant).

vacuity · 2023-11-17T01:12:19

Yes, but I believe seL4 took it to the max. I may be wrong on that count, but I think seL4 is unique in that it leverages capabilities for pretty much everything except the scheduler. (There was work in that area, but it's incomplete.)

adastra22 · 2023-11-17T01:57:23

L4 was developed in the 90's. Operating Systems like Amoeba, which were fundamentally capability-based to a degree that even exceeds L4, were a hot research topic in the 80's.

L4's contribution was speed. It was assumed that microkernels, and especially capability-based microkernels were fundamentally slower than monolithic kernels. This is why Linux (1991) is monolithic. Yet L4 (1994) was the fastest operating system in existence at the time, despite being a microkernel and capability based. It's too bad those dates aren't reversed, or we might have had a fast, capability-based, microkernel Linux :(

mananaysiempre · 2023-11-17T03:08:50

IIRC the KeyKOS/EROS/CapROS tradition used capabilities for everything including the scheduler. Of course, pervasive persistence makes those systems somewhat esoteric (barring fresh builds, they never shut down or boot up, only go to sleep and wake up in new bodies; compare Smalltalk, etc.).

adastra22 · 2023-11-17T06:46:12

Amoeba was my favorite, as it was a homogeneous, decentralized operating system. Different CPU architectures spread across different data centers, and it was all homogenized together into a single system image. You had a shell prompt where you typed commands and the OS could decide to spawn your process on your local device, in the server room rack, or in some connected datacenter in Amsterdam, it didn't make a difference. From the perspective of you, your program, or the shell, it's just a giant many-core machine with weird memory and peripheral access latencies that the OS manages.

Oh, and anytime as needed the OS could serialize out your process, pipe it across the network to another machine, and resume. Useful for load balancing, or relocating a program to be near the data it is accessing. Unless your program pays special attention to the clock, it wouldn't notice.

I still think about Amoeba from time to time, and imagine what could have been if we had gone down that route instead.

vacuity · 2023-11-17T07:29:03

Wouldn't there be issues following from distributed systems and CAP? Admittedly, I know nothing about Amoeba.

E.g. You spawn a process on another computer and then the connection drops.

vacuity · 2023-11-17T03:15:09

Guess I'm too ignorant. I need to read up on these. I did know about the persistence feature. I think it's not terrible but also not great, and systems should be designed for being shut down and apps being closed.

monocasa · 2023-11-17T00:16:33

The other L4s before it showed that caps are useful and can be implemented efficiently.

vacuity · 2023-11-17T01:24:13

https://dl.acm.org/doi/pdf/10.1145/2517349.2522720

" We took a substantially different approach with seL4; its model for managing kernel memory is seL4’s main contribution to OS design. Motivated by the desire to reason about resource usage and isolation, we subject all kernel memory to authority conveyed by capabili- ties (except for the ﬁxed amount used by the kernel to boot up, including its strictly bounded stack). "

I guess I should've said seL4 took capabilities to the extreme.

gigatexal · 2023-11-16T20:59:56

QNX is used in vehicle infotainment systems no? Where else?

I'm not bothered by the kernel bloat. There's a lot of dev time being invested in Linux and while the desktop is not as much of a priority as say the server space a performant kernel on handhelds and other such devices and the dev work to get it there will benefit the desktop users like myself.

notrom · 2023-11-16T23:22:56

I've worked with it in industrial automation systems in large scale manufacturing plants where it was pretty rock solid. And I'm aware of it's use in TV production and transmissions systems.

SubjectToChange · 2023-11-17T01:34:49

Railroads/Positive Train Control, emergency call centers, etc. QNX is used all over the place. If you want an even more impressive Microkernel RTOS, then Green Hills INTEGRITY is a great example. It's the RTOS behind the B-2, F-{16,22,35}, Boeing 787, Airbus A380, Sikorsky S-92, etc.

bkallus · 2023-11-16T21:32:54

I went to a conference at GE Research where I spoke to some QNX reps from Blackberry for a while. Seemed like they were hinting that some embedded computers in some of GE'S aerospace and energy stuff relies on QNX.

tyfon · 2023-11-16T22:00:46

It was used in my old toyota avensis from 2012. The infotainment was so slow you could measure performance in seconds pr frame instead of frames pr second :)

In the end, all I could practically use it for was as a bluetooth audio connector.

lmm · 2023-11-16T21:32:57

> QNX is used in vehicle infotainment systems no? Where else?

A bunch of similar embedded systems. And blackberry, if anyone's still using them.

dilyevsky · 2023-11-17T02:08:34

Routers, airplanes, satellites, nuclear power stations, lots of good stuff

Cyph0n · 2023-11-16T23:42:02

Cisco routers running IOS-XR, until relatively recently.

bregma · 2023-11-16T20:53:13

The current (SDP 8) kernel has 15331 lines of code, including comments and Makefiles.

gigatexal · 2023-11-16T21:01:52

> QNX had this right decades ago. The microkernel has upper bounds on everything it does. There are only a few tens of thousands of lines of microkernel code. All the microkernel does is allocate memory, dispatch the CPU, and pass messages between processes. Everything else, including drivers and loggers, is in user space and can be preempted by higher priority threads.

So much like a well structured main method in a C program or other C like language where main just orchestrates the calling of other functions and such. In this case main might initialize different things where the QNX kernel doesn't but the idea or general concept remains.

I'm no kernel dev but this sounds good to me. Keeps things simple.

vacuity · 2023-11-16T21:38:31

Recently, I've been thinking that we need a microkernel design in applications. You have the core and then services that can integrate amongst each other and the core that provide flexibility. Like the "browser as an OS" kind of things but applied more generally.

galdosdi · 2023-11-16T22:53:41

Yes! This reminds me strongly of the core/modules architecture of the apache httpd, as described by the excellent O'Reilly book on it.

The process of serving an HTTP request is broken into a large number of fine grained stages and plugin modules may hook into any or all of these to modify the input and output to each stage.

The same basic idea makes it easy to turn any application concept into a modules-and-core architecture. From the day I read (skimmed) that book a decade or two ago this pattern has been burned into my brain

blackth0rn · 2023-11-16T23:26:42

ECS systems for the gaming world are somewhat like this. There is the core ECS framework and then the systems and entity's integrate with each other

spookie · 2023-11-16T23:38:49

ECS is incredible. Other areas should take notice

whstl · 2023-11-17T00:07:25

Agreed. I find that we're going in this direction in many areas, games just got there much faster.

Pretty much everywhere there is some undercurrent of "use this ultra-small generic interface for everything and life will be easier". With games and ECS, microkernels and IPC-for-everything, with frontend frameworks and components that only communicate between themselves via props and events, with event sourcing and CQRS backends, Actors in Erlang, with microservices only communicating via the network to enforce encapsulation... Perhaps even Haskell's functional-core-imperative-shell could count as that?

I feel like OOP _tried_ to get to this point, with dependency injection and interface segregation, but didn't quite get there due to bad ergonomics, verbosity and because it was still too easy to break the rules. But it was definitely an attempt at improving things.

vbezhenar · 2023-11-17T05:23:37

COM, OSGI, Service architecture, microservice architecture and countless other approaches. This is correct way to build applications, because it gets reinvented over and over again.

js2 · 2023-11-16T23:52:02

VxWorks is what's used on Mars and it's a monolithic kernel, so there's more than one way to do it. :-)

dilyevsky · 2023-11-17T02:11:45

I think RT build also had to disable mmu

the8472 · 2023-11-16T19:53:38

For an example how far the kernel goes to get log messages out even on a dying system and how that's used in real deployments:

https://netflixtechblog.com/kubernetes-and-kernel-panics-ed6...

tyingq · 2023-11-16T15:44:22

I wonder if this being fixed will result in it displacing some notable amount of made-for-realtime hardware/software combos. Especially since there's now lots of cheap, relatively low power, and high clock rate ARM and x86 chips to choose from. With the clock rates so high, perfect real-time becomes less important as you would often have many cycles to spare for misses.

I understand it's less elegant, efficient, etc. But sometimes commodity wins over correctness.

tuetuopay · 2023-11-16T16:05:11

The thing is, stuff that require hard realtime cannot satisfy with "many cycles to spare for misses". And CPU cycles is not the whole story. A badly made task could lock down the kernel not doing anything useful. The point of hard realtime is "nothing cannot prevent this critical task from running".

For automotive and aerospace, you really want the control systems to be able to run no matter what.

tyingq · 2023-11-16T16:07:55

Yes, there are parts of the space that can't be displaced with this.

I'm unclear on why you put "many cycles to spare for misses" in quotes, as if it's unimportant. If a linux/arm (or x86) solution is displacing a much lower speed "real real time" solution, that's the situation...the extra cycles mean you can tolerate some misses while still being as granular as what you're replacing. Not for every use case, but for many.

tuetuopay · 2023-11-16T16:40:23

You won't be saved from two tasks deadlocking with cycles/second. this is what hard realtime systems are about. However, I do agree that not all systems have a real hard realtime requirements. But those usually can handle a non-rt kernel.

As for the quotes, it was a direct citation, not a way to dismiss what you said.

tremon · 2023-11-16T16:59:49

I don't think realtime anything has much to do with mutex deadlocks, those are pretty much orthogonal concepts. In fact, I would make a stronger claim: if your "realtime" system can deadlock, it's either not really realtime or it has a design flaw and should be sent back to the drawing board. It's not like you can say "oh, we have a realtime kernel now, so deadlocks are the kernel's problem".

Actual realtime systems are about workload scheduling that takes into account processing deadlines. Hard realtime systems can make guarantees about processing latencies, and can preemptively kill or skip tasks if the result would arrive too late. But this is not something that the Linux kernel can provide, because it is a system property rather than about just the kernel: you can't provide any hard guarantees if you have no time bounds for your data processing workload. So any discussion about -rt in the context of the Linux kernel will always be about soft realtime only.

tuetuopay · 2023-11-16T17:10:15

much agreed. I used deadlocks as an extreme example that's easy to reason about and straight to the point of "something independent of cpu cycles". something more realistic would be IO operations taking more time than expected. you would not want this to be blocking execution for hard rt tasks.

In the case of the kernel, it is indeed too large to be considered hard realtime. Best case we can make it into a firmer realtime than it currently is. But I would place it nowhere near avionics flight calculators (like fly-by-wire systems).

hamilyon2 · 2023-11-17T00:15:44

I had an introductory course on OS and learned about hard real-time systems. I had impression hard real-time is about memory, deadlocks, livelocks, starvation, and so on. And in general about how to design system that moves forward even in presence of serious bugs and unplanned-for circumstances.

syntheweave · 2023-11-17T01:57:20

Bugs related to concurrency - which is where you get race conditions and deadlocks - tend to pop up wherever there's an implied sequence of dependencies to complete the computation, and the sequence is determined dynamically by an algorithm.

For example, if I have a video game where there's collision against the walls, I can understand this as potentially colliding against "multiple things simultaneously", since I'm likely to describe the scene as a composite of bounding boxes, polygons, etc.

But to get an answer for what to do in response when I contact a wall, I have to come up with an algorithm that tests all the relevant shapes or volumes.

The concurrency bug that appears when doing this in a naive way is that I test one, give an answer to that, then modify the answer when testing the others. That can lead to losing information and "popping through" a wall. And the direction in which I pop through depends on which one is tested first.

The conventional gamedev solution to that is to define down the solution set so that it no longer matters which order I test the walls in: with axis aligned boxes, I can say "move only the X axis first, then move only the Y axis". Now there is a fixed order, and a built-in bias to favor one or the other axis. But this is enough for the gameplay of your average platforming game.

The generalization on that is to describe it as a constraint optimization problem: there are some number of potential solutions, and they can be ranked relative to the "unimpeded movement" heuristic, which is usually desirable when clipping around walls. That solution set is then filtered down through the collision tests, and the top ranked one becomes the answer for that timestep.

Problems of this nature come up with resource allocation, scheduling, etc. Some kind of coordinating mechanism is needed, and OS kernels tend to shoulder a lot of the burden for this.

It's different from real-time in that real-time is a specification of what kind of performance constraint you are solving for, vs allowing any kind of performance outcome that returns acceptable concurrent answers.

nine_k · 2023-11-16T19:18:38

How much more expensive and power-hungry an ARM core would be, if it displaces a lower-specced core?

I bet there are hard-realtime (commercial) OSes running on ARM, and the ability to use a lower-specced (cheaper, simpler, consuming less power) core may be seen as an advantage enough to pay for the OS license.

lmm · 2023-11-16T21:50:43

> How much more expensive and power-hungry an ARM core would be, if it displaces a lower-specced core?

The power issue is real, but it might well be the same price or cheaper - a standard ARM that gets stamped out by the million can cost less than a "simpler" microcontroller with a smaller production run.

bee_rider · 2023-11-16T16:28:07

It is sort of funny that language has changed to the point where quotes are assumed to be dismissive or sarcastic.

Maybe they used the quotes because they were quoting you, haha.

tuetuopay · 2023-11-16T16:37:34

it's precisely why I quoted the text, to quote :)

archgoon · 2023-11-16T16:30:44

I'm pretty sure they were just putting it in quotes because it was the expression you used, and they thus were referencing it.

zmgsabst · 2023-11-16T16:38:27

What’s an example of a system that requires hard real time and couldn’t cope with soft real time on a 3GHz system having 1000 cycle misses costing 0.3us?

LeifCarrotson · 2023-11-16T18:05:54

We've successfully used a Delta Tau real-time Linux motion controller to run a 24 kHz laser galvo system. It's ostensibly good for 25 microsecond loop rates, and pretty intolerant of jitter (you could delay a measurement by a full loop period if you're early). And the processor is a fixed frequency Arm industrial deal that only runs at 1.2 GHz.

Perhaps even that's not an example of such a system, 0.3 microseconds is close to the allowable real-time budget, and QC would probably not scrap a $20k part if you were off by that much once.

But in practice, every time I've heard "soft real time" suggested, the failure mode is not a sub-microsecond miss but a 100 millisecond plus deadlock, where a hardware watchdog would be needed to drop the whole system offline and probably crash the tool (hopefully fusing at the tool instead of destroying spindle bearings, axis ball screws, or motors and gearboxes) and scrap the part.

zmgsabst · 2023-11-17T06:52:49

Thanks for the detailed reply!

I’m trying to understand where the roadblock on a rPi + small FPGA hybrid board for $50 fails at the task… and it sounds like the OS/firmware doesn’t suffice. (Or a SoC, like a Zynq.)

Eg, if we could guarantee that the 1.5GHz core won’t “be off” by more than 1us on responding and the FPGA can manage IO directly to buffer out (some of) the jitter, then the cost of many hobby systems with “(still not quite) hard” real time systems would come down to reasonable.

krylon · 2023-11-16T23:57:24

I suspect a fair amount of hard real time applications are not running on 3GHz CPUs. A 100MHz CPU (or lower) without an MMU or FPU is probably more representative.

But it's not really so much about being fast, it's about being able to guarantee that your system can respond to an event within a given amount of time every time. (At least that is how a friend who works in embedded/real time explained it to me.)

lelanthran · 2023-11-16T16:44:12

> What’s an example of a system that requires hard real time and couldn’t cope with soft real time on a 3GHz system having 1000 cycle misses costing 0.3us?

Any system that deadlocks.

binary132 · 2023-11-16T16:06:09

I get the sense that applications with true realtime requirements generally have hard enough requirements that they cannot allow even the remote possibility of failure. Think avionics, medical devices, automotive, military applications.

If you really need realtime, then you really need it and "close enough" doesn't really exist.

This is just my perception as an outsider though.

abe_m · 2023-11-16T16:39:35

Having worked on a number of "real time" machine control applications:

1) There is always a possibility that something fails to run by its due date. Planes crash sometimes. Cars won't start some times. Factory machinery makes scrap parts sometimes. In a great many applications, missing a real time deadline results in degraded quality, not end of life, or regional catastrophy. The care that must be taken to lower the probability of failure needs to be in proportion to the consequence of the failure. Airplanes have redundant systems to reduce (but not eliminate) possibility of failure, while cars and trucks generally don't.

2) Even in properly working real time systems, there is a tolerance window on execution time. As machines change modes of operation, the amount of calculation effort to complete a cycle changes. If the machine is in a warm up phase, it may be doing minimal calculations, and the scan cycle is fast. Later it may be doing a quality control function that needs to do calculations on inputs from numerous sensors, and the scan cycle slows down. So long as the scan cycle doesn't exceed the limit for the process, the variation doesn't cause problems.

mlsu · 2023-11-16T16:44:29

That is true, but generally not acceptable to a regulating body for these critical applications. You would need to design and implement a validation test to prove timing in your system.

Much easier to just use an RTOS and save the expensive testing.

vlovich123 · 2023-11-16T20:36:28

But you still need to implement the validation test to prove that the RTOS has these requirements…

blt · 2023-11-16T20:34:07

How is your point 2) a response to any of the earlier points? Hard realtime systems don't care about variation, only the worst case. If your code does a single multiply-add most of the time but calls `log` every now and then, hard realtime requirement is perfectly satisfied if the bound on the worst-case runtime of `log` is small enough.

abe_m · 2023-11-16T21:28:30

I suppose it isn't, but I bristle when I see someone tossing around statements like "close enough doesn't really exist". In my experience when statements like that start up, there are people involved that don't understand variation is a part of every real process. My point is that if you're going to get into safety critical systems, there is always going to be some amount of variation, and there is always a "close enough", as there is never an "exact" in real systems.

jancsika · 2023-11-17T02:16:12

The point is to care about the worst case within that variation.

Most software cares about the average case, or, in the case of the Windows 10/11 start menu animation, the average across all supported machines apparently going 20 years into the future.

dripton · 2023-11-16T16:29:18

You can divide realtime applications into safety-critical and non-safety-critical ones. For safety-critical apps, you're totally right. For non-critical apps, if it's late and therefore buggy once in a while, that sucks but nobody dies.

Examples of the latter include audio and video playback and video games. Nobody wants pauses or glitches, but if you get one once in a while, nobody dies. So people deliver these on non-RT operating systems for cost reasons.

lll-o-lll · 2023-11-16T21:04:44

> You can divide realtime applications into safety-critical and non-safety-critical ones.

No. This is a common misconception. The distinction between a hard realtime system and a soft realtime system is simply whether missing a timing deadline leads to a) failure of the system or b) degradation of the system (but the system continues to operate). Safety is not part of it.

Interacting with the real physical world often imposes “hard realtime” constraints (think signal processing). Whether this has safety implications simply depends on the application.

jancsika · 2023-11-16T21:04:47

Your division puts audio performance applications in a grey area.

On the one hand they aren't safety critical.

On the other, I can imagine someone getting chewed out or even fired for a pause or a glitch in a professional performance.

Probably the same with live commercial video compositing.

eschneider · 2023-11-16T23:32:43

Audio is definitely hard realtime. The slightest delays are VERY noticeable.

jancsika · 2023-11-17T02:21:15

I mean, it should be.

But there are plenty of performers who apparently rely on Linux boxes and gumption.

binary132 · 2023-11-16T16:43:02

This kind of makes the same point I made though -- apps without hard realtime requirements aren't "really realtime" applications

tremon · 2023-11-16T17:31:27

No -- soft realtime applications are things like video conferencing, where you care mostly about low latency in the audio/video stream but it's ok to drop the occasional frame. These are still realtime requirements, different from what your typical browser does (for example): who cares if a webpage is rendered in 100ms or 2s? Hard realtime is more like professional audio/video recording where you want hard guarantees that each captured frame is stored and processed within the alotted time.

atq2119 · 2023-11-17T02:01:30

> who cares if a webpage is rendered in 100ms or 2s?

Do you really stand by the statement of this rhetorical question? Because if yes: this attitude is a big reason for why web apps are so unpleasant to work with compared to locally running applications. Depending on the application, even 16ms vs 32ms can make a big difference.

duped · 2023-11-16T16:46:59

The traditional language is "hard" vs "soft" realtime

binary132 · 2023-11-16T20:03:02

RTOS means hard realtime.

pluto_modadic · 2023-11-16T16:48:00

I sense that people will insist on their requirements being hard unnecessarily... and that the bug is the fault of it being on a near-realtime system instead of it being faulty even on a realtime one.

lmm · 2023-11-16T22:05:37

Like many binary distinctions, when you zoom in on the details hard-versus-soft realtime is really more of a spectrum. There's "people will die if it's late". "The line will have to stop for a day if it's late". "If it's late, it'll wreck the part currently being made". Etc.

Even hard-realtime systems have a failure rate, in practice if not in theory - even a formally verified system might encounter a hardware bug. So it's always a case of tradeoffs between failure rate and other factors (like cost). If commodity operating systems can push their failure rate down a few orders of magnitude, that moves the needle, at least for some applications.

wongarsu · 2023-11-16T16:36:53

There is some amount of realtime in factory control where infrequent misses will just increase your reject rate in QA.

ajross · 2023-11-16T17:50:22

> Think avionics, medical devices, automotive, military applications.

FWIW by-device/by-transistor-count, the bulk of "hard realtime systems" with millisecond-scale latency requirements are just audio.

The sexy stuff are all real applications too. But mostly we need this just so we don't hear pops and echos in our video calls.

binary132 · 2023-11-16T20:04:09

Nobody thinks Teams is a realtime application

ajross · 2023-11-16T21:20:25

No[1], but the people writing the audio drivers and DSP firmware absolutely do. Kernel preemption isn't a feature for top-level apps.

[1] Actually even that's wrong: for sure there are teams of people within MS (and Apple, and anyone else in this space) measuring latency behavior at the top-level app layer and doing tuning all the way through the stack. App latency excursions can impact streams too, though ideally you have some layer of insulation there.

cptaj · 2023-11-16T16:28:11

Unless its just music

duped · 2023-11-16T16:48:17

Unless that music is being played through a multi kW amplifier into a stadium and an xrun causes damage to the drivers and/or audience (although, they should have hearing protection anyway).

beiller · 2023-11-16T17:04:20

Your talk of xrun is giving me anxiety. When I was younger I dreamed of having a linux audio effects stack with cheap hardware on stage and xruns brought my dreams crashing down.

robocat · 2023-11-16T21:42:54

xrun definition: https://unix.stackexchange.com/questions/199498/what-are-xru...

(I didn't know the term, trying to be helpful if others don't)

tinix · 2023-11-17T03:02:05

just a buffer under/overrun

itishappy · 2023-11-16T16:38:20

It may not be safety critical, but remember that people can and will purchase $14k power chords to (ostensibly) improve the experience of listening to "just music".

https://www.audioadvice.com/audioquest-nrg-dragon-high-curre...

cwillu · 2023-11-16T19:24:11

FWIW, a power chord is a _very_ different thing than a power cord.

itishappy · 2023-11-16T20:05:54

LOL, what a typo! Good catch!

binary132 · 2023-11-16T16:46:52

what if your analog sampler ruins the only good take you can get? What if it's recording a historically important speech? Starting to get philosophical here...

calvinmorrison · 2023-11-16T16:09:11

If you really need realtime, and you really actually need it, should you be using a system like Linux at all?

synergy20 · 2023-11-16T16:23:47

no you don't, you use a true RTOS instead.

linux RTOS is at microseconds granularity but it still can not 100% guarantee it, anything in cache nature (L2 cache, TLB miss) are hard for hard real time.

a dual kernel with xenomai could improve it, but it is not widely used somehow, only used in industrial controls I think.

linux RT is great for audio, multimedia etc as well, where real-time is crucial, but not a MUST.

froh · 2023-11-16T20:41:59

> anything in cache nature (L2 cache, TLB miss) are hard for hard real time

yup that's why you'd pin the memory and the core for the critical task. which, alas, will affect performance of the other cores and all other tasks. and whoosh there goes the BOM...

which again as we both probably are familiar with leads to the SoC designs with a real time core microcontroller and a HPC microprocessor on the same package. which leads to the question how to architect the combined system of real-time microcontroller and compute power but soft real time microprocessor such that the overall system remains sufficiently reliable...

oh joy and fun!

eschneider · 2023-11-16T23:35:35

The beauty of multicore/multi-cpu systems is that you can dedicate cores to running realtime OSs and leave the non-hard realtime stuff to an embedded linux on it's own core.

tyingq · 2023-11-16T16:16:11

I'm guessing it's not that technical experts will be choosing this path, but rather companies. Once it's "good enough", and much easier to hire for, etc...you hire non-experts because it works most of the time. I'm not saying it's good, just that it's a somewhat likely outcome. And not for everything, just the places where they can get away with it.

froh · 2023-11-16T20:43:48

nah. when functional safety enters the room (as it does for hard real time) then engineers go to jail if they sign off something unsafe and people die because of that. since the challenger disaster there is an awareness that not listening to engineers can be expensive and cost lifes.

rcxdude · 2023-11-16T19:27:23

really depends on your paranoia level and the consequences for failure. soft to hard realtime is a bit of a spectrum in terms of how hard of a failure missing a deadline actually is, and therefore how much you try to verify that you will actually meet that deadline.

snickerbockers · 2023-11-16T17:48:14

Pretty sure most people who think they need a real-time thread actually don't tbh.

refulgentis · 2023-11-16T16:15:01

...yes, after realtime support lands

lumb63 · 2023-11-16T16:20:38

A lot of realtime systems don’t have sufficient resources to run Linux. Their hardware is much less powerful than Linux requires.

Even if a system can run (RT-)Linux, it doesn’t mean it’s suitable for real-time. Hardware for real-time projects needs much lower interrupt latency than a lot of hardware provides. Preemption isn’t the only thing necessary to support real-time requirements.

rcxdude · 2023-11-16T19:29:44

realtime just means execution time is bounded. It doesn't necessarily mean the latency is low. Though, in this sense RT-linux should probably be mostly thought of as low-latency linux, and the improvement in realtime guarantees is mostly in reducing the amount of things that can cause you to miss a deadline as opposed to allowing you to guarantee any particular deadline, even a long one.

refulgentis · 2023-11-16T18:15:55

Sure but that was already mentioned before the comment I was replying to. Standard hardware not being great for realtime has nothing to do with hypothetical realtime Linux.

skyde · 2023-11-16T17:19:17

what kind of Hardware is considered to have "lower interrupt latency"? Is there some kind of Arduino board I could get that fit those lower interrupt latency required for real-time but still support things like Bluetooth?

lumb63 · 2023-11-16T20:51:51

Take a look at the Cortex R series. The Cortex M series still has lower interrupt latency than the A series, but lower processing power. I imagine for something like Bluetooth that an M is more than sufficient.

moffkalast · 2023-11-16T17:34:22

I feel like at this point we have enough cores (or will soon, anyway) that you could dedicate one entirely to one process and have it run realtime.

KWxIUElW8Xt0tD9 · 2023-11-16T17:39:07

That's one way to run DPDK processes under LINUX -- you get the whole processor for doing whatever network processing you want to do -- no interruptions from anything.

JohnFen · 2023-11-16T16:56:28

When I'm doing realtime applications using cheap, low-power, high-clockrate ARM chips (I don't consider x86 chips for those sorts of applications), I'm not using an operating system at all. An OS interferes too much, even an RTOS. I don't see how this changes anything.

But it all depends on what your application is. There are a lot of applications that are "almost real-time" in need. For those, this might be useful.

PaulDavisThe1st · 2023-11-17T02:57:29

CPU speed and clock rate has absolutely nothing to do with realtime anything.

foobarian · 2023-11-16T16:00:30

Ethernet nods in vigorous agreement

imtringued · 2023-11-16T16:05:52

Sure, but this won't magically remove the need for dedicated cores. What will probably happen is that people will tell the scheduler to exclusively put non-premptible real time tasks on one of the LITTLE cores.

pardoned_turkey · 2023-11-16T23:44:31

The conversation here focuses on a distinction between "hard" real-time applications, where you probably don't want a general-purpose OS like Linux no matter what; and "soft" real-time applications like videoconferencing or audio playback, where you nothing terrible happens if you get a bit of stuttering or drop a couple of frames every now and then. The argument is that RT Linux would be a killer solution for that.

But you can do all these proposed "soft" use cases with embedded Linux today. It's not like low-latency software video or audio playback is not possible, or wasn't possible twenty years ago. You only run into problems on busy systems where non-preemptible I/O could regularly get in the way. That's seldom a concern in embedded environments.

I think there are compelling reasons for making the kernel fully-preemptible, giving people more control over scheduling, and so forth. But these reasons have relatively little to do with wanting Linux to supersede minimalistic realtime OSes or bare-metal code. It's just good hygiene that will result in an OS that, even in non-RT applications, behaves better under load.

eisbaw · 2023-11-16T16:09:40

Great to hear. However even if Linux the kernel is real-time, likely the hardware won't be due to caches and internal magic CPU trickery.

Big complex hardware is a no-no for true real-time.

That's why AbsInt and WCET tools mainly has simple CPU architectures. 8051 will truly live forever.

btw, Zephyr RTOS.

wholesomepotato · 2023-11-16T16:18:49

Features of modern CPUs don't really prevent them from real time usage, afaik. As long as something is bounded and can be reasoned about it can be used to build a real time system. You can always assume no cache hits and alikes, maximum load etc and as long as you can put a bound on the time it will take, you're good to go.

bloak · 2023-11-16T16:48:02

So the things that might prevent you are:

1. Suppliers have not given you sufficient information for you to be able to prove an upper bound on the time taken. (That must happen a lot.)

2. The system is so complicated that you are not totally confident of the correctness of your proof of the upper bound.

3. The only upper bound that can prove with reasonable confidence is so amazingly bad that you'd be better off with cheaper, simpler hardware.

4. There really isn't a worst case. There might, for example, be a situation equivalent to "roll the dice until you don't get snake eyes". In networking, for example, sometimes after a collision both parties try again after a random delay so the situation is resolved eventually with probability one but there's no actual upper bound. A complex CPU and memory system might have something like that? Perhaps you'd be happy with "the probability of this operation taking more than 2000 clock cycles is less than 10^-13" but perhaps not.

SAI_Peregrinus · 2023-11-16T16:37:05

Exactly. "Real-time" is a misnomer, it should be called "bounded-time". As long as the bound is deterministic, known in advance, and guaranteed, it's "real-time". For it to be useful it also must be under some application-specific duration.

The bounds are usually in CPU cycles, so a faster CPU can sometimes be used even if it takes more cycles. CPUs capable of running Linux usually have higher latency (in cycles) than microcontrollers, but as long as that can be kept under the (wall clock) duration limits with bounded-time it's fine. There will still be cases where the worst-case latency to fetch from DRAM in an RT-Linux system will be higher than a slower MCU fetching from internal SRAM, so RT-Linux won't take over all these systems.

dooglius · 2023-11-16T18:15:48

System management mode is one example of a feature on modern CPUs that prevents real-time usage https://wiki.linuxfoundation.org/realtime/documentation/howt...

synergy20 · 2023-11-16T16:25:43

mlock your memory, test with cache miss and cache invalidation scenarios will help, using no heap for memory allocation, but it's a bit hard

jeffreygoesto · 2023-11-16T20:49:14

Right. But still possible.

https://www.etas.com/en/applications/etas-middleware-solutio...

eschneider · 2023-11-16T23:38:07

Does anyone use paged memory in hard realtime systems?

nraynaud · 2023-11-16T16:24:33

I think it's really useful on 'big' MCU, like the raspberry pi. There exists an entire real time spirit there, where you don't really use the CPU to do any bit banging but everything is on time as seen from the outside. You have timers that receive the quadrature encoders inputs, and they just send interrupt when they wrap, the GPIO system can be plugged to the DMA, so you can stream the memory to the output pins without involving the CPU (again, interrupts at mid-buffer and empty buffer). You can stream to a DAC, stream from a ADC to memory with the DMA. A lot of that stuff bypasses the caches to get a predictable latency.

stefan_ · 2023-11-16T16:46:47

Nice idea but big chip design strikes again: on the latest Raspberry Pi, GPIO pins are handled by the separate IO chip connected over PCI Express. So now all your GPIO stuff needs to traverse a shared serial bus (that is also doing bulk stuff like say raw camera images).

And already on many bigger MCUs, GPIOs are just separate blocks on a shared internal bus like AHB/APB that connects together all the chip IP, causing unpredictable latencies.

SubjectToChange · 2023-11-16T18:40:30

Big complex hardware is a no-no for true real-time.

There are advanced real time cores like the Arm Coretex-R82. In fact many real time systems are becoming quite powerful due to the need to process and aggregate ever increasing amounts of sensor data.

0xDEF · 2023-11-16T16:47:26

>Big complex hardware is a no-no for true real-time.

SpaceX uses x86 processors for their rockets. That small drone copter NASA put on Mars uses "big-ish" ARM cores that can probably run older versions of Android.

ska · 2023-11-16T17:49:43

Does everything runs on those CPUs though? Hard realtime control is often done on much simpler MCU at the lowest level, with oversight/planning for a high level system....

zokier · 2023-11-16T18:45:09

In short, no. For Ingenuity (the Mars2020 helicopter) the flight computer runs on pair of hard-realtime Cortex R5 MCUs paired with a FPGA. The non-realtime Snapdragon SoC handles navigation/image processing duties.

https://news.ycombinator.com/item?id=26907669

ska · 2023-11-16T19:57:55

That's basically what I expected, thanks.

loeg · 2023-11-16T15:43:24

Synchronous logging strikes again! We ran into this some at work with GLOG (Google's logging library), which can, e.g., block on disk IO if stdout is a file or whatever. GLOG was like, 90-99% of culprits when our service stalled for over 100ms.

cduzz · 2023-11-16T16:05:38

I have discussions with cow-orkers around logging;

"We have Best-Effort and Guaranteed-Delivery APIs"

"I want Guaranteed Delivery!!!"

"If the GD logging interface is offline or slow, you'll take downtime; is that okay?"

"NO NO Must not take downtime!"

"If you need it logged, and can't log it, what do you do?"

These days I just point to the CAP theorem and suggest that logging is the same as any other distributed system. Because there's a wikipedia article with a triangle and the word "theorem" people seem to accept that.

[edit: added "GD" to clarify that I was referring to the guaranteed delivery logging api, not the best effort logging API]

msm_ · 2023-11-16T16:31:48

Interesting, I'd think logging is one of the clearest situations when you want best effort. Logging is, almost by definition, not the "core" of your application, so failure to log properly should not prevent the core of the program from working. Killing the whole program because logging server is clearly throwing the baby out with the bathwater.

What people probably mean is "logging is important, let's avoid losing log messages if possible", which is what "best" in "best effort" stands for. For example it's often a good idea to have a local log queue, to avoid data loss in case of a temporary log server downtime.

fnordpiglet · 2023-11-16T17:39:49

It depends.

Some systems the logs are journaled records for the business or are discoverable artifacts for compliance. In highly secure environments logs are not only durable but measures are taken to fingerprint them and their ordering (like ratchet hashing) to ensure integrity is invariant.

I would note that using disk based logging is generally harmful in these situations IMO. Network based logging is less likely to cause blocking at some OS level or other sorts of jitter that’s harder to mask. Typically I develop logging as an in memory thing that offloads to a remote service over the network. The durability of the memory store can be an issue in highly sensitive workloads, and you’ll want to do synchronous disk IO for that case to ensure durability and consistent time budgets, but for almost all application disk less logging is preferable.

shawnz · 2023-11-16T17:51:24

If you're not waiting for the remote log server to write the messages to its disk before proceeding, then it seems like that's not guaranteed to me? And if you are, then you suffer all the problems of local disk logging but also all the extra failure modes introduced by the network, too

lmm · 2023-11-16T22:07:45

> If you're not waiting for the remote log server to write the messages to its disk before proceeding, then it seems like that's not guaranteed to me?

Depends on your failure model. I'd consider e.g. "received in memory by at least 3/5 remote servers in separate datacenters" to be safer than "committed to local disk".

cduzz · 2023-11-17T00:36:51

You're still on one side or another of the CAP triangle.

In a network partition, you are either offline or your data is not consistent.

If you're writing local to your system, you're losing data if there's a single device failure.

https://en.wikipedia.org/wiki/CAP_theorem

fnordpiglet · 2023-11-17T00:41:44

For logs, which are immutable time series journals, any copy is entirely sufficient. The first write is a quorum. Also from a systems POV reads are not a feature of logs.

lmm · 2023-11-17T03:17:31

CAP is irrelevant, consistency does not matter for logs.

fnordpiglet · 2023-11-16T19:33:33

The difference is that network IO can be more easily masked by the operating system than block device IO. When you offload your logging to another thread the story isn’t over because your disk logging can interfere at a system level. Network IO isn’t as noisy. If durability is important you might still need to wait for an ACK before freeing the buffer for the message which might lead to more overall memory use, all the operations play nicely in a preemptable scheduling system.

Also, the failure modes of systems are very tied to durable storage devices attached to the system and very rarely to network devices. By reducing the number of things that need a disk (ideally to zero) you can remove disks from the system and its availability story. Once you get to fully disk less systems the system failure modes are actually almost nothing. But even with disks attached reducing the times you interact with the disk (especially for chatty things like logs!) reduces the likelihood the entire system fails due to a disk issue.

ReactiveJelly · 2023-11-16T18:09:22

If it's a journaled record for the business then I think I'd write it to SQLite or something with good transactions and not mix it in the debug logs

fnordpiglet · 2023-11-16T19:34:45

There are more logs than debug logs, and using SQLite as the encoding store for your logs doesn’t make it not logging.

cduzz · 2023-11-16T16:38:36

People use logging (appropriately or inappropriately; not my bucket of monkeys) for a variety of things including audit and billing records, which are likely a good case for a guaranteed delivery API.

People often don't think precisely about what they say or want, and also often don't think through corner cases such as "what if XYZ breaks or gets slow?"

And don't get me started on "log" messages that are 300mb events. Per log. Sigh.

linuxdude314 · 2023-11-16T17:22:22

It’s not the core of the application, but it can be the core of the business.

For companies that sell API access logs in one form or another are how bills are reconciled and usage metered.

insanitybit · 2023-11-16T17:06:48

If you lose logs when your service crashes you're losing logs at the time they are most important.

tremon · 2023-11-16T17:22:53

But if your service has downtime because the logs could not be written, that seems strictly inferior. As someone else wrote upthread, you only want guaranteed delivery for logs if they're required under a strict audit regime and the cost of noncompliance is higher than the cost of a service outage.

insanitybit · 2023-11-16T20:53:26

FWIW I agree, I'm just trying to be clear that you are choosing one or the other, as the grandparent was stating.

tux1968 · 2023-11-16T17:19:22

That's unavoidable if the logging service is down when your server crashes.

Having a local queue doesn't mean logging to the service is delayed, it can be sent immediately. All the local queue does is give you some resiliency, by being able to retry if the first logging attempt fails.

insanitybit · 2023-11-16T19:59:56

If your logging service is down all bets are off. But by buffering logs you're now accepting that problems not related to the logging service will also cause you to drop logs - as I mentioned, your service crashing, or being OOM'd, would be one example.

tux1968 · 2023-11-16T23:13:48

What's more likely? An intermittent network issue, the logging service being momentarily down, or a local crash that only affects your buffering queue?

If an OOM happens, all bets are off anyway, since it has as much likelihood of taking out your application as it does your buffering code. The local buffering code might very well be part of the application in the first place, so the fate of the buffering code is the same as the application anyway.

It seems you're trying very hard to contrive a situation where doing nothing is better than taking reasonable steps to counter occasional network hiccups.

insanitybit · 2023-11-17T01:15:19

> It seems you're trying very hard to contrive a situation where doing nothing is better than taking reasonable steps to counter occasional network hiccups.

I think you've completely misunderstood me then. I haven't taken a stance at all on what should be done. I'm only trying to agree with the grandparent poster about logging ultimately reflecting CAP Theorem.

andreasmetsala · 2023-11-16T17:20:28

No, you’re losing client logs when your logging service crashes. Your logging service should probably not be logging through calls to itself.

wolverine876 · 2023-11-16T17:36:33

Logging can be essential to security (to auditing). It's your record of what happened. If an attacker can cause logging to fail, they can cover their tracks more easily.

deathanatos · 2023-11-16T17:56:56

To me audit logs aren't "logs" (in the normal sense), despite the name. They tend to have different requirements; e.g., in my industry, they must be retained, by law, and for far longer than our normal logs.

To me, those different requirements imply that they should be treated differently by the code, probably even under distinct flows: synchronously, and ideally to somewhere that I can later compress like hell and store in some very cheap long term storage.

Whereas the debug logs that I use for debugging? Rotate out after 30 to 90d, … and yeah, best effort is fine.

(The audit logs might also end up in one's normal logs too, for convenience.)

wolverine876 · 2023-11-16T18:04:48

While I generally agree, I'll add that the debug logs can be useful in security incidents.

rezonant · 2023-11-16T17:14:18

> "If the GD logging interface is offline or slow, you'll take downtime; is that okay?"

> [edit: added "GD" to clarify that I was referring to the guaranteed delivery logging api, not the best effort logging API]

i read GD as god-damned :-)

salamanderman · 2023-11-16T17:26:44

me too [EDIT: and I totally empathized]

Zondartul · 2023-11-16T19:13:15

I have some wishfull thinking ideas on this, but it should be possible to have both at least in an imaginary, theoretical scenario.

You can have both guaranteed delivery and no downtime if your whole system is so deterministic that anything that normally would result in blocking just will not, cannot happen. In other words it should be a hard real-time system that is formally verified top to bottom, down to the last transistor. Does anyone actually do that? Verify the program and the hardware to prove that it will never run out of memory for logs and such?

Continuing this thought, logs are probably generated endlessly, so either whoever wants them has to also guarantee that that they are processedand disposed of right after being logged... or there is a finite ammount of log messages that can be stored (arbitrary number like 10 000) but the user (of logs) has to guarantee that they will take the "mail" out of the box sooner than it overfills (at some predictable, deterministic rate). So really that means even if OUR system is mathematically perfect, we're just making the downtime someone elses problem - namely, the consumer of the infinite logs.

That, or we guarantee that the final resources of our self-contained, verified system will last longer than the finite shelf life of the system as a whole (like maybe 5 years for another arbitrary number)

morelisp · 2023-11-16T19:17:37

PACELC says you get blocking or unavailability or inconsistency.

supriyo-biswas · 2023-11-16T16:57:27

The better way to do this is to write the logs to a file or an in-memory ring buffer and have a separate thread/process push logs from the file/ring-buffer to the logging service, allowing for retries if the logging service is down or slow (for moderately short values of down/slow).

Promtail[1] can do this if you're using Loki for logging.

[1] https://grafana.com/docs/loki/latest/send-data/promtail/

insanitybit · 2023-11-16T17:06:05

But that's still not guaranteed delivery. You're doing what the OP presented - choosing to drop logs under some circumstances when the system is down.

a) If your service crashes and it's in-memory, you lose logs

b) If your service can't push logs off (upstream service is down or slow) you either drop logs, run out of memory, or block

kbenson · 2023-11-16T17:33:36

Yeah, what the "best effort" actually means in practice is usually a result of how much resources you want to throw at the problem. Those give you runway on how much of a problem you can withstand and perhaps recover from without any loss of data (logs), but in the end you're usually still just buying time. That's usually enough though.

hgfghui7 · 2023-11-16T17:28:40

You are thinking too much in terms of the stated requirements instead of what people actually want: good uptime and good debugability. Falling back to local logging means a blip in logging availability doesn't turn into all hands on deck everything is on fire. And it means that logs will very likely be available for any failures.

In other words it's good enough.

mort96 · 2023-11-16T17:43:18

"Good uptime and good reliability but no guarantees" is just a good best effort system.

insanitybit · 2023-11-16T20:00:43

Good enough is literally "best effort delivery", you're just agreeing with them that this is ultimately a distributed systems problem and you either choose CP or AP.

o11c · 2023-11-16T19:31:25

Logging to `mmap`ed files is resilient to service crashes, just not hardware crashes.

sroussey · 2023-11-16T17:09:31

We did something like this at Weebly for stats. The app sent the stats to a local service via UDP, so shoot and forget. That service aggregated for 1s and then sent off server.

laurencerowe · 2023-11-16T17:35:55

Why UDP for a local service rather than a unix socket?

loeg · 2023-11-16T16:20:26

I read GD as “god damn,” which also seems to fit.

rezonant · 2023-11-16T17:16:21

aw you beat me to it

tuetuopay · 2023-11-16T16:06:28

We had prod halt once when the syslog server hanged. Logs were pushed through TCP which propagated the blocking to the whole of prod. We switched to UDP transport since: better to lose some logs than the whole of prod.

deathanatos · 2023-11-16T18:02:13

TCP vs. UDP and async best-effort vs. synchronous are completely orthogonal…

E.g., a service I wrote wrote logs to an ELK setup; we logged over TCP. But the logging was async: we didn't wait for logs to make it to ELK, and if the logging services went down, we just queued up logs locally. (To a point; at some point, the buffer fills up, and logs were discarded. The process would make a note of this if it happened, locally.)

tuetuopay · 2023-11-16T18:32:39

> TCP vs. UDP and async best-effort vs. synchronous are completely orthogonal…

I agree, when stuff is properly written. I don't remember the exact details, but at least with UDP the asyncness is built-in: there is no backpressure whatsoever. So poorly written software can just send udp to heart's end.

tetha · 2023-11-16T16:12:09

Especially if some system is unhappy enough to log enough volume to blow up the local log disk... you'll usually have enough messages and clues in the bazillion other messages that have been logged.

RobertVDB · 2023-11-16T18:13:31

Ah, the classic GLOG-induced service stall - brings back memories! I've seen similar scenarios where logging, meant to be a safety net, turns into a trap. Your 90-99% figure resonates with my experience. It's like opening a small window for fresh air and having a storm barrel in. We eventually had to balance between logging verbosity and system performance, kind of like a tightrope walk over a sea of unpredictable IO delays. Makes one appreciate the delicate art of designing logging systems that don't end up hogging the spotlight (and resources) themselves, doesn't it?

oneepic · 2023-11-16T18:11:02

Oh, we had this type of issue ("logging lib breaks everything") with a $MSFT logging library. Imagine having 100 threads each with their own logging buffer of 300MB. Needless to say it annihilated our memory and our server crashed, even on the most expensive sku of Azure App Service.

pests · 2023-11-16T19:51:13

Brilliant strategy.

Reminds me a litte of the oldtimers trick of adding a sleep(1000) somewhere so they could later come back and have some resources later, or if they needed a quick win with the client.

Now cloud companies are using malloc(300000000) it to fake resource usage. /s

lopkeny12ko · 2023-11-16T17:17:03

I would posit that if your product's availability hinges on +/- 100ms, you are doing something deeply wrong, and it's not your logging library's fault. Users are not going to care if a button press takes 100 more ms to complete.

fnordpiglet · 2023-11-16T17:33:41

100ms for something like say API authorization on a high volume data plane service would be unacceptable. Exceeding latencies like that can degrade bandwidth and cause workers to exhaust connection counts. Likewise, even in humans response space, 100ms is an enormous part of a budget for responsiveness. Taking again authorization, if you spend 100ms, you’re exhausting the perceptible threshold for a humans sense of responsiveness to do something that’s of no practical value but is entirely necessary. Your UI developers will be literally camped outside your zoom room with virtual pitch forks night and day.

loeg · 2023-11-16T19:17:17

Yes, and in fact the service I am talking about is a high volume data plane service.

saagarjha · 2023-11-17T05:18:06

Core libraries at, say, Google, are supposed to be reliable to several nines. If they go down for long enough for a human to notice, they’re failing SLA.

hamandcheese · 2023-11-16T17:35:58

Not every API is a simple CRUD app with a user at the other end.

kccqzy · 2023-11-16T19:17:00

Add some fan out and 100ms could suddenly become 1s, 10s…

deepsquirrelnet · 2023-11-16T16:45:42

What a blast from the past. I compiled a kernel for Debian with RT_PREEMPT about 17-18 years ago to use with scientific equipment that needed tighter timings. I was very impressed at the latencies and jitter.

I haven’t really thought about it since then, but I can imagine lots of used cases for something like an embedded application with raspberry pi where you don’t quite want to make the leap into a microcontroller running an RTOS.

HankB99 · 2023-11-17T01:07:30

Interesting to mention the Raspberry Pi. I saw an article just a day or two ago that claimed that the RpiOS was stated by and ran on top of RTOS. That's particularly interesting because at one time years ago, I saw suggestions that Linux could run as a task on an RTOS. Things that required hard real time deadlines could run on the RTOS and not be subject to the delays that a virtual memory system could entail.

I don't recall if this was just an idea or was actually implemented. I also have seen only the one mention of RpiOS on an RTOS so I'm curious about that.

rsaxvc · 2023-11-17T02:49:43

>That's particularly interesting because at one time years ago, I saw suggestions that Linux could run as a task on an RTOS.

I've worked with systems that ran Linux as a task of uITRON as well as threadX, both on somewhat obscure ARM hardware. Linux managed the MMU but had a large carveout for the RTOS code. They had some strange interrupt management so that Linux could 'disable interrupts' but while Linux IRQs were disabled, an RTOS IRQ could still fire and context switch back to an RTOS task. I haven't seen anything like this on RPi though, but it's totally doable.

HankB99 · 2023-11-17T05:21:03

Interesting to know that it was more than just an idea - thanks!

salamanderman · 2023-11-16T17:35:05

I had a frustrating number of job interviews in my early career where the interviewers didn't know what realtime actually was. That "and predictable delay" concept from the article frequently seemed to be lost on many folks, who seemed to think realtime just meant fast, whatever that means.

mort96 · 2023-11-16T17:38:47

I would even remove the "minimum" part altogether; the point of realtime is that operations have predictable upper bounds. That might even mean slower average cases than in non-realtime systems. If you're controlling a car's braking system, "the average delay is 50ms but might take up to 80ms" might be acceptable, whereas "the average delay is 1ms but it might take arbitrarily long, possibly multiple seconds" isn't.

ska · 2023-11-16T17:51:28

The old saying "real time" /= "real fast". Hard vs "soft" realtime muddies things a bit, but I think it's probably the majority of software developers don't really understand what realtime actually is either.

Aaargh20318 · 2023-11-16T16:21:46

What does this mean for the common user? Is this something you would only enable in very specific circumstances or can it also bring a more responsive system to the general public?

fbdab103 · 2023-11-16T16:53:06

My understanding is that real-time makes a system slower. To be real-time, you have to put a time allocation on everything. Each operation is allowed X budget, and will not deviate. This means if the best-case operation is fast, but the worst case is slow, the system has to always assume worst case.

ravingraven · 2023-11-16T16:28:16

If by "common" user you mean the desktop user, not much. But this is a huge deal for embedded devices like industrial control and communication equipment, as their devs will be able to use the latest mainline kernel if they need real-time scheduling.

andrewaylett · 2023-11-16T17:18:44

RT doesn't necessarily improve latency, it gives it a fixed upper bound for some operations. But the work needed to allow RT can definitely improve latency in the general case -- the example of avoiding synchronous printk() calls is a case in point. It should improve latency under load even when RT isn't even enabled.

I think I'm right in asserting that a fully-upstreamed RT kernel won't actually do anything different from a normal one unless you're actually running RT processes on it. The reason it's taken so long to upstream has been the trade-offs that have been needed to enable RT, and (per the article) there aren't many of those left.

stavros · 2023-11-16T16:24:50

As far as I can understand, this is for Linux becoming an option when you need an RTOS, so for critical things like aviation, medical devices, and other such systems. It doesn't do anything for the common user.

ska · 2023-11-16T17:59:24

For the parts of such systems that you would need an RTOS for this isn't really a likely replacement because the OS is way too complex.

The sort of thing it could help with is servicing hardware that does run hard realtime. For example, you have an RTOS doing direct control of a robot or medical device or whatever, and you have a UI pendant or the like that a user is interacting with. If linux on that pendant can make some realtime latency guarantees, you may be able to simplify communication between the two without risking dropping bits on the floor.

Conversely, for the common user it could improve things like audio/video streaming, in theory but I haven't looked into details or how much trouble there is currently.

SubjectToChange · 2023-11-16T16:38:19

The Linux kernel, real-time or not, is simply too large and too complex to realistically certify for anything safety critical.

rcxdude · 2023-11-16T19:32:18

the most common desktop end-user that might benefit from this is those doing audio work: latency and especially jitter can be quite a pain there.

dist-epoch · 2023-11-16T17:16:58

It could allow very low latency audio (1-2 ms). Not a huge thing, but nice for some audio people.

dataflow · 2023-11-17T00:27:58

I feel like focusing on the kernel side misses CPU level issues.

Is there any known upper bound on, say, how long a memory access instruction takes on x86?

rsaxvc · 2023-11-17T03:08:59

I don't know for x86.

But for things that really matter, I've tested by configuring the MMU to disable caching for the memory that the realtime code lives in and uses to emulate 0% hitrate. And there's usually still a fair amount of variance on top of that depending on if the memory controller has a small cache, and where the memory controller is in its refresh cycle.

dataflow · 2023-11-17T03:16:37

Yeah. And I'm not sure that even that would give you the worst case as far as the cache is concerned. Of course I don't know how these implementations work, but it seems plausible that code that directly uses memory could run faster than code that encounters a cache miss beforehand (or contention, if you're using multiple cores). Moreover there's also the instruction cache, and I'm not sure if you can disable caching for that in a meaningful way?

For soft real time, I don't see a problem. But for hard real time, it seems a bit scary.

saagarjha · 2023-11-17T05:19:13

You can continually take page faults in a Turing complete way without executing any code, so I would guess this is unbounded?

dataflow · 2023-11-17T05:21:17

I almost mentioned page faults, but that's something the kernel has control over. It could just make sure everything is in memory so there aren't any faults. So it's not really an issue I think.

rwmj · 2023-11-16T20:09:47

About printk, the backported RT implementation of printk added to the RHEL 9.3 kernel has deadlocks ... https://issues.redhat.com/browse/RHEL-15897 & https://issues.redhat.com/browse/RHEL-9380

alangibson · 2023-11-16T17:54:49

Very exiting news for those of us building CNC machines with LinuxCNC. The end of kernel patches is nigh!

NalNezumi · 2023-11-16T17:48:35

Slightly tangential, but does anyone know good learning material to understand real-time (Linux) kernel more? For someone with rudimentary Linux knowledge.

I've had to compile&install real-time kernel as a requirement for a robot arm (franka) control computer. It would be nice to know a bit more than just how to install the kernel.

ActorNightly · 2023-11-16T17:54:56

https://www.freertos.org/implementation/a00002.html

Generally, having experience with Greenhills in a previous job, for personal projects like robotics or control systems I would recommend programming a microcontroller directly rather than dealing with SoC with RTOS. Modern STM32s with Cortex chips have enough processing power to run pretty much anything.

sesm · 2023-11-16T23:30:58

IMO if you really care about certain process being responsive, you should allocate dedicated CPU cores and a contiguous region of memory to it, that shouldn’t be touched by the rest of OS. Oh, and also give a it direct access to a separate network card. I’m not sure if Linux supports this.

andy_ppp · 2023-11-16T15:39:48

What do other realtime OS kernels do when printing from various places? It almost seems like this should be done in hardware because it's such a difficult problem to not lose messages but also have them on a different OS thread in most cases.

xenadu02 · 2023-11-16T22:15:22

In many problem spaces you can optimize for the common success and failure paths if you accept certain losses on long-tail failure scenarios.

A common logging strategy is to use a ring buffer with a separate isolated process reading from the ring. The vast majority of the time the ring buffer handles temporary disruptions (eg slow disk I/O to write messages to disk) but in the rare failure scenarios you simply overwrite events in the buffer and increment an atomic overwritten event counter. Events do not get silently dropped but you prioritize forward progress at the cost of data loss in rare scenarios.

Microkernels and pushing everything to userspace just moves the tradeoffs around. If your driver is in userspace and blocks writing a log message because the log daemon is blocked or the I/O device it is writing the log to is overloaded it does the same thing. Your realtime thread won't get what it needs from the driver within your time limit.

It all comes down to CAP theorem stuff. If you always want the kernel (or any other software) to be able to make forward progress within specific time limits then you must be willing to tolerate some data loss in failure scenarios. How much and how often it happens depends on specific design factors, memory usage, etc.

（评论） (comments)

（评论）
(comments)