| Microsoft made the reasonable point that locking 3rd parties out of the kernel might have resulted in legal challenges in the EU [0]. It is an interesting case where everyone is certain in hindsight that they would have been ok with MS blocking access, but it is less obvious that they would have taken that view if MS had pressured a bunch of security products out of the kernel with no obvious prompting.
[0] https://www.theregister.com/2024/07/22/windows_crowdstrike_k... |
![]() |
| The 2009 agreement with the EU mentioned in the article seems to be the one about the integration of Internet Explorer (IE) into MS Windows.[1] But it only applied to IE, and the commitment was limited to 5 years.[2]
Or is the article referring to something else? I see no reason why the EU should object to Microsoft's adoption of eBPF as long as MS Defender simply uses the same API that is available to all competitors.
[1] Here is the original text: https://ec.europa.eu/competition/antitrust/cases/dec_docs/39...
[2] See section 4, paragraph 20. |
![]() |
| They've done a technology transition once already from legacy file system filter drivers to the minifilter model. If they see enough benefit to another change, it wouldn't be unprecedented.
Mind you, it looks like after 20-ish years Windows still supports loading legacy filter drivers. Given the considerable work that goes into getting even a simple filesystem minifilter driver working reliably, it's safe to assume that we'd be looking at a similarly protracted transition period.
As to the performance, I don't think the raw infrastructure to support minifilters is the major performance hit. The work the drivers themselves end up doing tends to be the bigger hit in my experience.
Some background for the curious: https://www.osr.com/nt-insider/2019-issue1/the-state-of-wind... |
![]() |
| It's just easier for everyone involved (outside Windows GUI clicker admins) if it runs on Linux. Containerization is easier, configuration is easier, and the operating system is much more robust. |
![]() |
| AFAIK, NTFS is a perfectly ok design. But the Windows file system never performed well. This is probably not for architectural reasons.
And no, it didn't perform better in the NT4, XP, or Win7 days. |
![]() |
| > In the future, computers will not crash due to bad software updates, even those updates that involve kernel code. In the future, these updates will push eBPF code.
eBPF is fantastic, and it can be used for many purposes and improve a lot of things, but this is IMO overselling it. Assuming that BPF itself is free of bugs, it’s still a rather large sprawl of kernel hooks, and those hooks invoke eBPF code, which can call right back into the kernel. Here’s a list: https://www.man7.org/linux/man-pages/man7/bpf-helpers.7.html
bpf_probe_read_kernel() is particularly heavily used, and it is not safe. It tries fairly hard not to OOPS or crash, but it is definitely not perfect. The rest of that list contains plenty of things that will easily take down a system, even if it doesn’t actually oops or panic in the process.
And, of course, any tool that detects userspace “malicious behavior” and stops it can start calling everything malicious, and the computer becomes unusable.
Meanwhile, eBPF has no real security model on the userspace side. Actual attachment of an eBPF program goes through the bpf() syscall, not through sensibly permissioned operations on the underlying kernel objects being attached to, and there is nothing whatsoever that confines eBPF to, say, a container that uses it. (See bpf_probe_read_kernel() -- it's fundamentally able to read all kernel memory.)
So, IMO, most of the benefit of eBPF over ordinary kernel C code is that eBPF is kind of like writing code in a safe language with a limited unsafe API surface. It's a huge improvement for this sort of work, but it is not perfect by any means.
> The verifier is rigorous -- the Linux implementation has over 20,000 lines of code
The verifier is absurdly complex. I'd rather see something based on formal methods than 20kLOC of hand-written logic. |
![]() |
| eBPF isn't "watching the watchers"; it's just a tool that lets other tools access low-level things in the kernel via a very picky sandbox. Think of it like this:
Old way: Load kernel driver, hook into bazillions of system calls (doing whatever it is you want to do), pray you don't screw anything up (otherwise you can get a panic, though not necessarily -- Linux is quite robust).
eBPF way: Just ask eBPF to tell you what you want by giving it some eBPF-specific instructions.
There's a rundown on how it works here: https://ebpf.io/what-is-ebpf/ |
![]() |
| eBPF programs cannot crash the kernel, assuming there are no bugs in the eBPF verifier. There have been such bugs in the past but they seem to be getting more and more rare. |
![]() |
| I don't see how this contradicts what I said. Indeed, there are helpers, but the verifier is supposed to check that the eBPF program isn't calling them with invalid arguments. |
![]() |
| On RHEL 8 variants, you can use the Oracle UEK to get eBPF.
https://blogs.oracle.com/linux/post/oracle-linux-and-bpf
|
![]() |
| Considering the number of systems running very obsolete OSes these days: WinNT (4x or 3x), Windows, DOS, or various proprietary Unixen, stale Linux flavours, etc., etc., ... yes, quite. |
![]() |
| RHEL 8 is based on kernel 4.18.
RHEL 9 is based on 5.14; I think it still has the same restriction (kernel.unprivileged_bpf_disabled).
I reckon Red Hat may match upstream's behavior by RHEL 10. |
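For anyone who wants to check the sysctl mentioned above on their own machine, here is a minimal sketch. It assumes the setting is exposed at `/proc/sys/kernel/unprivileged_bpf_disabled` and that the values 0, 1 and 2 mean what the upstream kernel sysctl documentation says; the function names `describe` and `read_setting` are made up for this example, and distro kernels may differ.

```python
from pathlib import Path

# Assumed procfs location of the sysctl on Linux.
SYSCTL_PATH = Path("/proc/sys/kernel/unprivileged_bpf_disabled")

# Meanings as described in the upstream kernel sysctl documentation.
MEANINGS = {
    0: "unprivileged bpf() calls are allowed",
    1: "unprivileged bpf() calls are disabled until reboot",
    2: "unprivileged bpf() calls are disabled, but an admin can re-enable them",
}


def describe(raw: str) -> str:
    """Turn the raw sysctl text into a human-readable description."""
    try:
        value = int(raw.strip())
    except ValueError:
        return f"unrecognized value: {raw!r}"
    return MEANINGS.get(value, f"unrecognized value: {value}")


def read_setting(path: Path = SYSCTL_PATH) -> str:
    """Read and describe the setting; report if the file cannot be read."""
    try:
        return describe(path.read_text())
    except OSError:
        return "sysctl not readable (no bpf() syscall support, or not Linux)"


if __name__ == "__main__":
    print(read_setting())
```

Reading the file does not need root; changing the value does.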
![]() |
| Hardly. For starters, wasm doesn’t guarantee that a piece of code terminates in bounded time. There are further safety guarantees in eBPF, such as that any lock acquired must be released. |
![]() |
| Does anyone know how far along the eBPF implementation for Windows actually is? In the sense that it could start feasibly replacing existing kernel drivers. |
![]() |
| I don't buy it... didn't a bug from Red Hat + CrowdStrike have a similar panic issue? I understand in that case it was because of Red Hat, but still. I don't think this, by itself, will change much. |
![]() |
| > it'll almost always be possible to cause a crash, if you try hard enough.
If you think you know a way to crash the Linux kernel by loading and running an eBPF program, you should report a bug. |
![]() |
| At Ring 3 it would crash an app, not the entire OS.
Yes, the kernel is fine and is not to blame. But running basically a rootkit controlled by a third party indeed is to blame. |
![]() |
| If you consider kernel programming to be inherently unsafe, then you would consider this to be inevitable, meaning it's not really the specific company's fault. They were just the unlucky ones. |
![]() |
| Let's walk this through: Canary deployment to Windows machines. If those Windows machines get hit with a BSOD, they will go offline. How do you determine whether they went offline because of the canary or because of regular maintenance in the customer's IT cycle?
You can guess, but you cannot be 100% sure. What if the targeted canary machines are employees' desktops that are offline during the rollout?
> I’m out of the loop if this crowdstrike update was such a scenario where best practices for software deployment were worth bypassing.
I did post a question: what about other cybersecurity vendors? Do you think they do canary deployments of their AV definitions?
Here's more context for understanding cybersecurity: https://radixweb.com/blog/what-is-mean-time-to-detect
Cybersecurity companies take part in annual security evaluations that measure and grade their performance. That grade is an input organizations use to select vendors, alongside their own metrics and measurements. I don't know if MTTD is included in the contract/SLA. If it is, you have some answer as to why certain decisions are made.
It's definitely interesting to see the software developers of HN giving their 2c on a niche cybersecurity industry. |
![]() |
| Why even do that? We have virtualization, they could emulate real clients and networks of clients. This particular bug would have been prevented for sure |
![]() |
| Agreed, CrowdStrike was an unlucky one, but it is more about the issue in general. If I remember correctly, others like sysdig also use their own kernel modules for collection. |
![]() |
| CrowdStrike is mentioned, but the goal of the article is to promote eBPF. CrowdStrike is tangentially related because it draws attention to a platform that Gregg has put a lot into. |
![]() |
| I wonder if microkernels ever had this kind of bullshit. Had it been a microkernel, would we all have been sitting twiddling our thumbs on Friday? Hot take: No. |
![]() |
| From the article:
> If the verifier finds any unsafe code, the program is rejected and not executed. The verifier is rigorous -- the Linux implementation has over 20,000 lines of code [0] -- with contributions from industry (e.g., Meta, Isovalent, Google) and academia (e.g., Rutgers University, University of Washington).
[0] links to https://github.com/torvalds/linux/blob/master/kernel/bpf/ver... which has this interesting comment at the top:
I haven't inspected the code, but I thought that checking for infinite loops would imply solving the halting problem. Where's the catch? |
![]() |
| I'm not able to comment on what this code is doing, but as for the theory:
The halting problem is only unsolvable in the general case. You cannot prove that any arbitrary piece of code will stop, but you can prove that specific types of code will stop and reject anything that you're unable to prove.
The trivial case is "no jumps": if your code executes strictly linearly and is itself finite then you know it will terminate. More advanced cases can also be proven, like a loop over a very specific bound, as long as you can place constraints on how the code can be structured.
As an example, take a look at Dafny, which places a lot of restrictions on loops [0], only allowing the subset that it can effectively analyze.
[0] https://ece.uwaterloo.ca/~agurfink/stqam/rise4fun-Dafny/#h25 |
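To make the "prove what you can, reject the rest" idea concrete, here is a toy sketch in Python. It is not how the Linux verifier works: the instruction set (`add`, `jz`, `jmp`, `exit`) and the `verify`/`run` functions are invented for illustration. The only rule is that every jump must go strictly forward, so the program counter always increases and execution takes at most `len(program)` steps.

```python
from typing import List, Tuple

Instr = Tuple[str, int]  # (opcode, operand); hypothetical toy instruction set


def verify(program: List[Instr]) -> bool:
    """Accept only programs that provably terminate.

    Every jump must go strictly forward and land inside the program, and
    the last instruction must be `exit`, so the program counter increases
    on every step until an `exit` is reached.
    """
    if not program:
        return False
    for pc, (op, arg) in enumerate(program):
        if op in ("jmp", "jz"):
            target = pc + 1 + arg
            if arg < 0 or target >= len(program):
                return False  # backward or out-of-bounds jump
        elif op not in ("add", "exit"):
            return False  # unknown opcode
    return program[-1][0] == "exit"


def run(program: List[Instr]) -> Tuple[int, int]:
    """Execute a verified program; return (accumulator, steps taken)."""
    acc, pc, steps = 0, 0, 0
    while True:
        op, arg = program[pc]
        steps += 1
        if op == "exit":
            return acc, steps
        if op == "add":
            acc += arg
            pc += 1
        elif op == "jmp" or (op == "jz" and acc == 0):
            pc += 1 + arg  # jump taken
        else:
            pc += 1  # jz not taken


ok = [("add", 1), ("jz", 1), ("add", 2), ("exit", 0)]
loop = [("add", 1), ("jmp", -2), ("exit", 0)]      # really loops forever
harmless = [("add", 1), ("jz", -2), ("exit", 0)]   # halts, but unprovable here

print(verify(ok), verify(loop), verify(harmless))
```

Note that `harmless` would actually halt (the backward `jz` is never taken because the accumulator is 1), yet it is rejected anyway. That is the trade-off described above: the checker is allowed to say no to safe programs, as long as it never says yes to an unsafe one.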
![]() |
| Adding on (and it's not terribly relevant to eBPF), it's also worth noting that there are trivial programs you can prove DON'T halt.
A trivial example[1]:
This program trivially runs forever[2], and indeed many static code analyzers will point out that everything after the `while (true) {}` line is unreachable.
I feel like the halting problem is incredibly widely misunderstood to be about "ANY program" when it really talks about "ALL programs".
[1]: In C++, this is technically undefined behavior, but C and most other programming languages define the behavior of this (or equivalent) function.
[2]: Fun relevant xkcd: https://xkcd.com/1266/ |
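The unreachable-code check mentioned above is easy to sketch. The following is a rough illustration using Python's `ast` module, with Python's `while True:` standing in for `while (true) {}`; the function names are made up, and real analyzers do far more than this. It flags statements that directly follow a constant-true `while` loop containing no `break` that could exit it.

```python
import ast
from typing import List


def _has_break(stmts) -> bool:
    """Return True if a `break` in stmts would exit the enclosing loop."""
    for node in stmts:
        if isinstance(node, ast.Break):
            return True
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            continue  # nested definitions cannot break out of our loop
        if isinstance(node, (ast.While, ast.For, ast.AsyncFor)):
            # A break in a nested loop's body targets that loop, but one
            # in its `else` clause targets ours.
            if _has_break(node.orelse):
                return True
            continue
        # Descend into compound statements such as if/with/try/match.
        for field in ("body", "orelse", "finalbody"):
            if _has_break(getattr(node, field, [])):
                return True
        for part in getattr(node, "handlers", []) + getattr(node, "cases", []):
            if _has_break(part.body):
                return True
    return False


def _is_infinite(stmt) -> bool:
    """True for `while <truthy constant>:` with no break that exits it."""
    return (
        isinstance(stmt, ast.While)
        and isinstance(stmt.test, ast.Constant)
        and bool(stmt.test.value)
        and not _has_break(stmt.body)
    )


def unreachable_lines(source: str) -> List[int]:
    """Line numbers of statements that directly follow an infinite loop."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        for field in ("body", "orelse", "finalbody"):
            block = getattr(node, field, None)
            if not isinstance(block, list):
                continue
            for i, stmt in enumerate(block):
                if _is_infinite(stmt):
                    found.update(s.lineno for s in block[i + 1:])
                    break
    return sorted(found)


SAMPLE = """
def f():
    while True:
        pass
    print("never")

def g():
    while True:
        break
    print("reached")
"""

print(unreachable_lines(SAMPLE))
```

This proves non-termination only for one very specific shape of program, which is the point: specific cases are easy, and it is the single algorithm covering all programs that cannot exist.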
![]() |
| EDIT: I am incorrect, please ignore. (Original text below, for posterity).
Nit: In many languages, doesn't this depend on what foo() does? e.g:
|
![]() |
| The halting problem cannot be solved in the general case, but in many cases you can prove that a program halts. eBPF only allows verifiably-halting programs to run. |
![]() |
| This. Others have said it less concisely, but a program without loops and arbitrary jumps is guaranteed to halt if we assume the external functions it calls into will halt. |
![]() |
| I'm glad to hear that Meta and Google code is "rigorous". I'd prefer INRIA, universities that fund theorem provers, industries where correctness matters like aerospace or semiconductors. |
![]() |
| > I think they mean that the code base is small enough to be audited thoroughly.
They wouldn't say it was "over 20,000 lines" in that case. And 20,000 lines of C is far too big to audit. |
![]() |
| The halting problem is a statement about all programs: there isn't a single algorithm that is valid for every one of them. You can still check for some kinds of infinite loops though! |
![]() |
| I should clarify that individual eBPF programs have to terminate, but more complex problems can be solved with multiple eBPF programs, which can be "scheduled" indefinitely using BPF timers. |
![]() |
| Well, it is useful in practice: there are some pretty useful products based on eBPF on Linux, most notably Cilium (and, shameless plug for the one I’m working on: Parca, an eBPF-based CPU profiler). |
This doesn’t seem grounded in reality. If you follow the link to the “hooks” that Windows eBPF makes available [1], it’s just for incoming packets and socket operations. IOW, MS is expecting you to use the Berkeley Packet Filter for packet filtering. Not for filtering I/O, or object creation/use, or any of the other million places a driver like Crowdstrike’s hooks into the NT kernel.
In addition, they need to be in the kernel in order to monitor all the other 3rd party garbage running in kernel-space. ELAM (early-launch anti-malware) loads anti-malware drivers first so they can monitor everything that other drivers do. I highly doubt this is available to eBPF.
If Microsoft intends eBPF to be used to replace kernel-space anti-malware drivers, they have a long, long way to go.
[1]: https://microsoft.github.io/ebpf-for-windows/ebpf__structs_8...