![]() |
| Calling C from Rust can be quite simple. You just declare the external function and call it. For example, straight out of the Rust book https://doc.rust-lang.org/book/ch19-01-unsafe-rust.html#usin... :
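A minimal sketch of such a declaration, modeled on the Rust book's `abs` example and assuming a toolchain recent enough to accept `unsafe extern` blocks (Rust 1.82 or later; older code writes plain `extern "C"`):

```rust
// Declare a function that lives in the C standard library.
// `i32` stands in for C's `int` here, as in the Rust book.
unsafe extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    // Calling a foreign function is unsafe: the compiler cannot check
    // that the declaration above matches the real C signature.
    let result = unsafe { abs(-3) };
    println!("Absolute value of -3 according to C: {}", result);
}
```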
Now, if you have a complex library and don't want to write all of the declarations by hand, you can use a tool like bindgen to automatically generate those extern declarations from a C header file: https://github.com/rust-lang/rust-bindgen
There's an argument to be made that something like bindgen could be included in Rust, so that you don't need a third-party dependency and a build.rs set up to invoke it, but that's not really the issue at hand in this article. The issue is not the low-level bindings, but higher-level wrappers that are more idiomatic in Rust. There's no way you're going to be able to have a general tool that can automatically do that from arbitrary C code. |
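To make the distinction between low-level bindings and idiomatic wrappers concrete, here is a small sketch. The `strlen` declaration is the kind of thing a generator can produce mechanically. The wrapper `c_string_len` is a hypothetical hand-written function (the name is an assumption for illustration) that encodes the safety contract in Rust types.

```rust
use std::ffi::{c_char, CStr};

// Low-level binding: mirrors the C prototype `size_t strlen(const char *s);`.
unsafe extern "C" {
    fn strlen(s: *const c_char) -> usize;
}

/// Idiomatic wrapper (hypothetical): callers pass a `&CStr`, so they
/// cannot hand over a null, dangling, or unterminated pointer.
fn c_string_len(s: &CStr) -> usize {
    // SAFETY: `CStr` guarantees a valid, NUL-terminated buffer that
    // stays alive for the duration of the call.
    unsafe { strlen(s.as_ptr()) }
}

fn main() {
    let s = CStr::from_bytes_with_nul(b"hello\0").expect("literal ends with NUL");
    println!("strlen says: {}", c_string_len(s));
}
```

Deciding that `&CStr` is the right parameter type requires knowing what `strlen` expects of its argument, which is the part a tool cannot infer from an arbitrary header.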
![]() |
| > `extern "C"` has nothing to do with linkage, all it does is disable name mangling, so you get the same symbol name as with a C compiler.
extern "C" also ensures that the C calling convention is used, which is relevant for callbacks. It's not just name mangling. This is the reason that extern "C" static functions exist. You can actually overload a C++ function by extern "C" vs extern "C++", and it will dispatch it appropriately based on whether the passed-in function is declared with C or C++ linkage.
And I'm not sure the terms are confused, because that's how most documentation refers to it: https://learn.microsoft.com/en-us/cpp/cpp/extern-cpp?view=ms...
> In C++, when used with a string, extern specifies that the linkage conventions of another language are being used for the declarator(s). C functions and data can be accessed only if they're previously declared as having C linkage. However, they must be defined in a separately compiled translation unit.
And https://en.cppreference.com/w/cpp/language/language_linkage
The post you're replying to had it completely right. extern "C" is entirely about linkage, which includes calling convention and name mangling.
> As you noted correctly, the calling conventions must match, but in practice this only matters on x86 Windows.
Or if you want your program to actually be correct, instead of just incidentally working for most common cases, including on future systems. If you're passing a callback to a C function from C++, it's wrong unless the callback is declared extern "C". |
![]() |
| > extern "C" also ensures that the C calling convention is used, which is relevant for callbacks. It's not just name mangling.
I stand corrected. I didn't know that `extern "C"` enforces the C calling convention. However, on modern platforms this doesn't really matter because, as I said, there is only a single calling convention (per platform). And I'm pretty sure that future platforms will keep it that way. Fortunately, if you try to pass a C++ callback of the wrong calling convention, you get a compiler error.
> If you're passing a callback to a C function from C++, it's wrong unless the callback is declared extern "C".
That's certainly not true because `extern "C"` is not the only way to specify the calling convention. In fact, you might need a different calling convention! As I mentioned, on x86 the Windows API uses stdcall for all API functions and callbacks, so `extern "C"` would be wrong. If you look at the Microsoft examples, you will see that they declare the callbacks as WINAPI (without `extern "C"`): https://learn.microsoft.com/en-us/windows/win32/procthread/c...
So I stand by my point that in practice you don't need `extern "C"` for passing C++ callbacks to C functions. You can pass a lambda function just fine, and when it doesn't work the compiler will tell you. |
![]() |
| > * Specifying calling convention at all is a compiler specific extension.
Yes, because the calling conventions themselves are platform/compiler specific.
> There is no standard way of specifying a C calling convention without `extern`.
Well, on modern platforms you don't need to because there is only a single calling convention that is shared between C and C++. For legacy platforms with multiple calling conventions, you need compiler specific extensions by definition.
> The only portable way to specify C linkage in a C++ program is extern "C". You will always get the right ABI for your platform and it will work on every compiler.
Again, on platforms with several calling conventions `extern "C"` absolutely won't give you the appropriate calling convention all the time. See again my Win32 API example.
> The compiler will very often not tell you
> An incorrect ABI will usually be accepted and will just do the wrong thing or crash at runtime.
That's absolutely not my experience! Functions with different calling conventions have different types, so a C++ compiler must reject such code. See https://godbolt.org/z/6EnncE5v5. (Note that for the lambda case MSVC is smart enough to automatically add __stdcall whereas MinGW refuses to compile. The free function is rejected by both compilers.) Can you show me an actual example where a C++ compiler silently accepts a function with the wrong calling convention?
> Your position works sometimes for some compilers and some platforms.
It has always worked for me so far and I write software for many different platforms. |
![]() |
| > Even if they're the same calling convention, that should fail, but it doesn't.
It's an interesting question. According to the standard, functions with different language linkage are indeed considered different types. As a consequence, "The only modern compiler that differentiates function types with "C" and "C++" language linkages is Oracle Studio, others do not permit overloads that are only different in language linkage, including the overload sets required by the C++ standard" https://en.cppreference.com/w/cpp/language/language_linkage
In practice, extern "C" does two things (as you correctly pointed out):
1. disable name mangling - This only affects the symbol name and is not relevant for callback functions
2. enforce the (default) C calling convention - On all (modern) platforms I know, C and C++ have the same default calling convention for free functions.
This means that from the view of a C++ compiler, pointers to `foo()` and `extern "C" foo()` have the exact same type.
Anyway, no need to be nervous. Even if the compiler treated these as different types, you would get a compiler error because C++ disallows implicit casts between different pointer types. |
![]() |
| > Is it guaranteed that an incorrect calling convention will always cause a compiler error?
A standard-conforming C++ compiler must not allow implicit pointer casts, so yes!
> I wasn't aware the calling convention was considered part of the pointer type.
Some well-designed C APIs define a macro for the calling convention that they add to all API functions and function pointer declarations. The user can then use the same macro when supplying their callbacks, which guarantees that the calling conventions match. (On modern platforms, the macro would be typically empty.) Here's an example: https://github.com/Celemony/ARA_API/blob/1f68fba7a374b14df19.... As you can see, it is part of the function pointer type: https://github.com/Celemony/ARA_API/blob/1f68fba7a374b14df19...
Another famous example is, of course, the WINAPI macro in the Win32 API. That's also what I tend to do with my own C APIs.
> I guess I had some assumptions about calling conventions that needed to be straightened out
I also learned a few things in this discussion, so thanks for that! |
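For comparison, Rust makes the ABI an explicit part of the function pointer type, so the same mismatch is caught at compile time there too. The sketch below passes a Rust callback to the C library's `qsort`. The `qsort` declaration is hand-written for illustration and the helper names `compare_i32` and `sort_with_qsort` are assumptions, not part of any library.

```rust
use std::ffi::{c_int, c_void};

// Hand-written binding for C's qsort; the callback parameter is typed
// as an `extern "C"` function pointer.
unsafe extern "C" {
    fn qsort(
        base: *mut c_void,
        nmemb: usize,
        size: usize,
        compar: extern "C" fn(*const c_void, *const c_void) -> c_int,
    );
}

// The callback must itself be `extern "C"`. A plain Rust `fn` with the
// same parameters has a different type and would not compile here.
extern "C" fn compare_i32(a: *const c_void, b: *const c_void) -> c_int {
    // SAFETY: qsort only passes pointers to elements of the i32 slice.
    let (a, b) = unsafe { (*(a as *const i32), *(b as *const i32)) };
    a.cmp(&b) as c_int
}

fn sort_with_qsort(values: &mut [i32]) {
    // SAFETY: the pointer, element count, and element size describe
    // exactly the slice we were given.
    unsafe {
        qsort(
            values.as_mut_ptr() as *mut c_void,
            values.len(),
            std::mem::size_of::<i32>(),
            compare_i32,
        );
    }
}

fn main() {
    let mut values = [3, 1, 2];
    sort_with_qsort(&mut values);
    println!("{:?}", values);
}
```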
![]() |
| Now imagine a hundred or two functions, structures and callbacks, some of them exposed only as CPP macros over internal implementation. PJSIP low level API is one example. |
![]() |
| They quickly become unwieldy on non-trivial APIs, with hundreds of definitions across dozens of files and with macros to boot. Naturally people would still get the job done, but it's far from simple. |
![]() |
| Not types, functions. Where the macro is essentially a forward declaration but the implementation is deep inside the code and is not exposed via headers. |
![]() |
| > Almeida put up a slide with the equivalent of iget_locked() in Rust, which was called get_or_create_inode().
Seems like the answer is that it's reimplementing and doesn't use the same names. |
![]() |
| Moving using a 70's technology breaks things. Rust is tested already on other OSes like Windows, Mac (or iOS) and Android and solves several pitfalls of C and C++. Some quotes from the Android team [1]:
"To date, there have been zero memory safety vulnerabilities discovered in Android’s Rust code."
"Safety measures make memory-unsafe languages slow"
Not saying Rust is the perfect solution to every problem, but it is definitely not an outlandish proposition to use it where it makes sense.
[1] https://security.googleblog.com/2022/12/memory-safe-language... |
![]() |
| Some of the comments below the lwn.net page are rather disrespectful.
Imagine getting this comment about the open source project you contribute to: "Science advances one funeral at a time" |
![]() |
| C is, in some important cases, less convenient than assembly in ways which have to be worked around by either fooling the compiler or adding intrinsics. A recent example: https://justine.lol/endian.html
Is the huge macro more convenient than the "bswap" instruction? No, but it's portable.
> I don't really see what other ones made sense historically.
Pascal chose differently in a couple of places. In particular, carrying the length with strings.
C refused to define semantics for arithmetic. This gave you programs which were "portable" so long as you didn't mind different behavior on different platforms. Good for adoption, bad for sanity. It was only relatively recently that they required signed integers to be two's complement.
16-bit Windows even used C with the Pascal calling convention. http://www.c-jump.com/CIS77/ASM/Procedures/P77_0070_pascal_s... |
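As a point of comparison, here is a rough sketch of the same byte-order problem in Rust. The function names are hypothetical. `read_be_u32_manual` mirrors the portable shift-and-or approach, while `read_be_u32` uses the standard library's `u32::from_be_bytes`, which expresses the same intent without the manual shifts.

```rust
/// Portable shift-and-or decoding of a big-endian 32-bit value,
/// equivalent in spirit to the C macro approach.
fn read_be_u32_manual(b: [u8; 4]) -> u32 {
    ((b[0] as u32) << 24) | ((b[1] as u32) << 16) | ((b[2] as u32) << 8) | (b[3] as u32)
}

/// The same operation using the standard library.
fn read_be_u32(b: [u8; 4]) -> u32 {
    u32::from_be_bytes(b)
}

/// An explicit byte swap, the operation the "bswap" instruction performs.
fn swap(x: u32) -> u32 {
    x.swap_bytes()
}

fn main() {
    let bytes = [0x12, 0x34, 0x56, 0x78];
    println!("{:#x}", read_be_u32_manual(bytes));
    println!("{:#x}", read_be_u32(bytes));
    println!("{:#x}", swap(0x1234_5678));
}
```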
![]() |
| Obviously, there are some cases where it makes sense to use an unsafe block. However, I think there might be fewer cases than people might think.
As an example, both the popular generic self-referencing crates ouroboros and self_cell have had memory safety bugs in them in the past. (links at the end) Both of them were carefully reviewed by experienced Rust developers before their first public release, and yet they still ended up with such bugs.
Admittedly, part of the issue is that both crates are trying to be more generic, so they have to be correct over a larger range of circumstances. But still, these crates have one job, are both less than 1500 LOC, and they were carefully reviewed to ensure they did that one job before their public releases, and they still ended up having issues that were not caught. They might still have issues.
Thus, while it might be fine to use unsafe to state that your array of zeros is a valid UTF-8 string without a runtime check, it's probably a good idea to twist yourself into a pretzel if the invariants are not trivial to prove and the overhead to maintenance/runtime isn't too high.
[0]: https://rustsec.org/advisories/RUSTSEC-2023-0042.html
[1]: https://rustsec.org/advisories/RUSTSEC-2023-0070.html |
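The "array of zeros" case might look like the sketch below (function names are hypothetical). The invariant is trivial to check by eye: every byte is 0x00, and NUL is a valid one-byte UTF-8 code point. The checked variant shows the runtime validation that the unsafe call skips.

```rust
static ZEROS: [u8; 8] = [0; 8];

/// Unchecked conversion, justified by an invariant that is easy to verify.
fn zeroed_str() -> &'static str {
    // SAFETY: every byte is 0x00, which is valid UTF-8 on its own.
    unsafe { std::str::from_utf8_unchecked(&ZEROS) }
}

/// Checked conversion for input whose validity is not obvious.
fn checked_str(bytes: &[u8]) -> Option<&str> {
    std::str::from_utf8(bytes).ok()
}

fn main() {
    println!("{}", zeroed_str().len());
    println!("{:?}", checked_str(&[0xff]));
}
```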
![]() |
| To my knowledge most if not all of these people driving Rust adoption in Linux are seasoned Linux contributors and/or maintainers. They are not outsiders "coming to other people's projects". |
![]() |
| Kent has been really vocal about writing bcachefs in Rust in the IRC channel. They even started some work already, but decided to wait until the common abstractions are ready and merged. |
![]() |
| Man, an omni-hater, not just hating people trying to bring Rust to Linux but also hating on Linux. Must be a very rewarding sense of superiority.
> ECC memory
This is entirely Intel's fault. |
![]() |
| Rust is a thing in the real world.
Both Windows and Android are shipping, today, with meaningful components written in Rust. Amazon S3 and Lambda are built on top of Rust. Apple is hiring Rust developers and they post about it on this platform [https://news.ycombinator.com/item?id=40849188]. Dropbox and Discord backend services are written in Rust. Cloudflare uses Rust very extensively in their infrastructure, which means that a large fraction of global internet traffic passes through routers and servers written in Rust. The UEFI firmware implementation of the next Surface products by Microsoft is written in Rust.
You are simply incorrect. Instead of arguing I will suggest that you do a slight modicum of research into who is using Rust and for what. While it won't be comparable in omnipresence with C and C++ for a long time, it is widely-enough used that there is a near-zero chance that you are not already using some tool or service that directly or indirectly uses Rust for some significant purpose. It is not a "forum and hobby project language".
The list I just provided is also by no means complete - Shopify, Disney, Facebook, Firefox... and many others... also use Rust.
Your claim of credibility via working on kernels falls completely flat in the face of Microsoft directly contradicting you: https://www.thurrott.com/windows/282471/microsoft-is-rewriti...
"According to Weston, Microsoft has already rewritten 36,000 lines of code in the Windows kernel in Rust, in addition to another 152,000 lines of code it wrote for a proof of concept DirectWrite Core library, and the performance is excellent with no regressions compared to the old C++ code. He also called out that “there is now a syscall, in the Windows kernel, written in Rust.”"
Whatever experience you have is out of date with the current reality. Not only is there interest in using Rust in these core areas, but it has already started happening. |
![]() |
| Not advocating for this, but you could also imagine a C superset where all of the new features only apply in 'safe' blocks, which would be backwards-compatible, and likely safer-in-practice. |
![]() |
| > additional complexity tax
Yes, but that should be offset by easier driver development. See the blog post about the Rust GPU driver for Asahi Linux, done in one month.
EDIT: Google "tales of the m1 gpu" (the author has very negative opinions about Hacker News; read it if you like by clicking the link https://asahilinux.org/2022/11/tales-of-the-m1-gpu/)
Is it universal? We'll see in the coming years. |
![]() |
| > The criminal liability here is "intentional inflection of emotional distress",
Intentional infliction of emotional distress is a tort, not criminal. |
![]() |
| There are obvious differences between criticism and harassment. Let's not act like we don't know why the Asahi Linux team is getting death threats, and maybe try to improve the situation. |
![]() |
| HN has actually drifted far, far to the left of where it started.
It was called Startup News at the beginning. Of course, it's gonna be interested in money and money making. Duh. |
![]() |
| There is no "safe subset" of C. MISRA is fairly close, but all sorts of things that you might need, like integer arithmetic, have potential UB in C.
(The best current effort is https://sel4.systems/ , which is written in C but has a large proof of safety attached. The language design question is basically: should the proof be part of the language?) |
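To illustrate the integer arithmetic point: signed overflow is undefined behavior in C, whereas Rust gives it defined semantics. By default, plain `+` panics on overflow in debug builds and wraps in release builds, and explicit methods let the caller pick the behavior. A small sketch with hypothetical helper names:

```rust
/// Returns None instead of overflowing.
fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// Wraps around using two's-complement arithmetic.
fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Clamps to the representable range.
fn add_saturating(a: i32, b: i32) -> i32 {
    a.saturating_add(b)
}

fn main() {
    println!("{:?}", add_checked(i32::MAX, 1));
    println!("{}", add_wrapping(i32::MAX, 1));
    println!("{}", add_saturating(i32::MAX, 1));
}
```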
![]() |
| The disconnect section of the article is a good example of exactly how not to do things, and of how things can turn sour if the existing community isn't taken along for the ride. |
If the lifecycle of inodes is filesystem-specific, it should be managed via filesystem-specific functions.