(comments)

Original link: https://news.ycombinator.com/item?id=41146860

When talking about undefined behavior (UB), you are discussing behavior that the programming language standard does not specify, leaving room for compiler interpretation. This gives compiler writers the opportunity for creative solutions and tailored optimizations that improve overall performance. However, the lack of predictability carries risk and can lead to subtle bugs that are hard to catch and reproduce. A classic example of undefined behavior comes from C, specifically signed integer overflow. In C, integer arithmetic that exceeds the representable range of the integer type is undefined behavior, which allows the compiler to apply optimizations under the assumption that overflow never happens, potentially introducing unexpected results. Other instances include null pointer dereference, out-of-bounds access, and aliasing violations. Some argue that undefined behavior gives compilers flexibility and allows faster code generation in specific scenarios. Others counter that it creates unnecessary complexity and encourages sloppy programming practices, leading to a proliferation of murky edge cases and challenging debugging experiences. Either way, it is essential to understand the undefined behavior present in a codebase and work to minimize it. Code should be developed carefully with maintainability, safety, and correctness in mind, while remembering that ignoring undefined behavior can create hidden pitfalls and larger maintenance challenges. Here is a link to an article that explores this concept in depth: . The blog post offers an in-depth discussion of topics such as integer overflow, null pointer dereference, and the impact of undefined behavior on performance and reliability.

Related articles

Original article


> compiler writers refuse to take responsibility for the bugs they introduced, even though the compiled code worked fine before the "optimizations". The excuse for not taking responsibility is that there are "language standards" saying that these bugs should be blamed on millions of programmers writing code that bumps into "undefined behavior"

But that's not an excuse for having a bug; it's the exact evidence that it's not a bug at all. Calling the compiler buggy for not doing what you want when you commit Undefined Behavior is like calling dd buggy for destroying your data when you call it with the wrong arguments.



I think this is actually a mistake by the author since the rant is mostly focused on implementation defined behavior, not undefined.

The examples they give are all perfectly valid code. The specific bugs they're talking about seem to be compiler optimizations that replace bit-twiddling arithmetic with branches, which isn't a safe optimization if the bit twiddling happens in a cryptographic context because it opens the door for timing attacks.

I don't think it's correct to call either the source code or the compiler buggy; it's the C standard that is under-specified to the author's liking, and it creates security bugs on some targets.

Ultimately though I can agree with the C standard authors that they cannot define the behavior of hardware, they can only define the semantics for the language itself. Crypto guys will have to suffer because the blame is on the hardware for these bugs, not the software.



The blog post does, at the very end, mention the thing you should actually do.

You need a language where you can express what you actually meant, which in this case is "Perform this constant time operation". Having expressed what you meant, now everybody between you and the hardware can co-operate to potentially deliver that.

So long as you write C (or C++) you're ruined, you cannot express what you meant, you are second guessing the compiler authors instead.

I think a language related to WUFFS would be good for this in the crypto world. Crypto people know maths already, so the scariest bits of such a language (e.g. writing out why you believe it's obvious that 0 <= offset + k + n < array_length so that the machine can see that you're correct or explain why you're wrong) wouldn't be intimidating for them. WUFFS doesn't care about constant time, but a similar language could focus on that.



> You need a language where you can express what you actually meant, which in this case is "Perform this constant time operation". Having expressed what you meant, now everybody between you and the hardware can co-operate to potentially deliver that.

Yeah, even with assembly, your timing guarantees are limited on modern architectures. But if you REALLY need something specific from the machine, that's where you go.



Most architectures have a subset of their instructions that can be called in constant time (constant time meaning data-independent time). Things like non-constant pointer loads and branches are obviously out, and so are div/mod in almost all chips, but other arithmetic and conditional moves are in that set.

CPUs are actually much better at making those guarantees than compilers are.
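
For illustration (a sketch, not from the comment above), here is the kind of branch-free selection that stays within such a data-independent-time subset, assuming the compiler doesn't turn it back into a branch:

  #include <stdint.h>

  /* Select a when choice == 1 and b when choice == 0, with no branch
     and no data-dependent memory access. */
  static uint32_t ct_select(uint32_t choice, uint32_t a, uint32_t b) {
      uint32_t mask = (uint32_t)0 - choice;   /* all ones if choice == 1, all zeros if 0 */
      return (a & mask) | (b & ~mask);
  }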



And besides, Assembly should not be a scary thing.

Even most managed languages provide a way to get down into Assembly dumps from their AOT and JIT compiler toolchains.

Maybe we need some TikTok videos showing how to do such workflows.



There's nothing terribly difficult to writing specific routines in asm. It's kinda fun, actually. Assembly _in the large_ is hard, just because you need to be next-level organized to keep things from being a mess.



I find it rather easy as long as you don’t have to interact with the OS. That’s where it becomes messy, with inflexible data structures and convoluted arguments in ABIs designed to be used from higher level languages.

If you are doing really low level stuff, sometimes it’s worth rolling out your own little RTOS with just the bits and pieces you need.

Not that long ago, I realised most 8-bit (and MS-DOS) computer games had their own RTOS woven into the game code, with multi-tasking, hardware management, IO, and so on.



> it's the C standard that is under-specified to the author's liking

Isn't this unreasonable? Here we are, 52 years down the road with C et al. and suddenly it's expected that compiler developers must consider any change in the light of timing attacks? At what point do such new expectations grind compiler development to a halt? What standard would a compiler developer refer to to stay between the lines? My instincts tell me that this would be a forever narrowing and forever moving target.

Does timing sensitivity ever end? Asked differently: is there any code that deals in sensitive information that can't, somehow, be compromised by timing analysis? Never mind cryptographic algorithms. Just measuring the time needed to compute the length of strings will leak something of use, given enough stimuli and data collection.

Is there some reason a cryptographic algorithm developer must track the latest release of a compiler? Separate compilation and linking is still a thing, as far as I know. Such work could be isolated to "validated" compilers, leaving the insensitive code (if that concept is even real...) to whatever compiler prevails.

Also, it's not just compilers that can "optimize" code. Processing elements do this as well. Logically, therefore, must we not also expect CPU designers to also forego changes that could alter timing behavior? Forever?

I've written a lot of question marks above. That's because this isn't my field. However, invoking my instincts again: what, short of regressing to in-order, non-speculative cores and freezing all compiler development, could possibly satisfy the expectation that no changes are permitted to reveal timing differences where it previously hadn't?

This all looks like an engineering purity spiral.



> Isn't this unreasonable? Here we are, 52 years down the road with C et al. and suddenly it's expected that compiler developers must consider any change in the light of timing attacks?

We already went through a similar case to this: when the C++11 multithreaded memory model was introduced, compiler authors were henceforth forced to consider all future optimizations in light of multithreading. Which actually forced them to go back and suppress some optimizations that otherwise appeared reasonable.

This isn't to say the idea is good (or bad), but just that "compiler development will grind to a halt" is not a convincing argument against it.



It is completely unreasonable though to assume that a compiler should now preserve some assumed (!) timing of source operations.

It would be reasonable to implement (and later standardize) a pragma or something that specifies timing constraints for a subset of the language. But somebody would need to do the work.



An attribute for functions that says "no optimisations may be applied to the body that would change timings" seems like a reasonable level of granularity here, and if you were conservative about which optimisations it allowed in version zero it'd probably not be a vast amount of work.

I'm sort of reminded of the software people vs. hardware people stuff in embedded work, where ideally you'd have people around who'd crossed over from one to another but (comparing to crypto authors and compiler authors) there's enough of a cultural disconnect between the two groups that it doesn't happen as often as one might prefer.



Instruction duration isn't constant even within the same arch. You cannot have branches in constant-time code.

I do wonder though how often cpu instructions have data-dependent execution times....



The author is somewhat known for his over-the-top rants. When you read his writing in the context of a 90's flamewar, you'll find them to be quite moderate and reasonable. But it comes from a good place; he's a perfectionist, and he merely expects perfection from the rest of us as well.



In all things, moderation. Security must be evaluated as a collection of tradeoffs -- privacy, usability, efficiency, etc. must be considered.

For example, you might suspect that the NSA has a better sieve than the public, and conclude that your RSA key needs to be a full terabyte*. We know that this isn't perfect, of course, but going much beyond that key length will prevent your recipient from decrypting the message in their lifetime.

* runtime estimates were not performed to arrive at this large and largely irrelevant number



We had ~30 years of "undefined behaviour" practically meaning "do whatever the CPU does". It is not new that people want predictable behaviour, it simply wasn't a talking point as we already had it.



You pretty much answered your own question: ~20 years ago and back. But I think it is also worth pointing out that it has gotten worse; those 20 years have been a steady trickle of new foot guns.



It's not even that. Yes, in the case of signed integer overflow, usually you get whatever the CPU gives as the answer for the sum. But you also have the famous case of an if branch checking for a null pointer being optimized away. And even in the case of integer overflow, the way to correctly check for it isn't intuitive at first, because you need to check for integer overflow without the check itself falling under UB.

EDIT: just to make my point clear: the problem with UB isn't just that it exists, it is also that compiler optimizations make it hard to check for it.
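
For reference, the usual shape of an overflow check that does not itself invoke UB looks something like this (a sketch, assuming int operands):

  #include <limits.h>

  /* Returns 1 and stores a+b in *out if the sum fits in an int,
     returns 0 otherwise. The test happens before the addition, so
     the addition itself can never overflow. */
  int add_checked(int a, int b, int *out) {
      if ((b > 0 && a > INT_MAX - b) ||
          (b < 0 && a < INT_MIN - b)) {
          return 0;
      }
      *out = a + b;
      return 1;
  }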



We have ill defined behaviour, implementation defined behaviour, erroneous behaviour, unspecified behaviour, undefined behaviour.

Undefined behaviour isn't exactly what most people think it is.



> Is there some reason a cryptographic algorithm developer must track the latest release of a compiler?

Tracking the latest release is important because:

1. Distributions build (most? all?) libraries from source, using compilers and flags the algorithm authors can't control

2. Today's latest release is the base of tomorrow's LTS.

If the people who know most about these algorithms aren't tracking the latest compiler releases, then who else would be qualified to detect these issues before a compiler version bearing a problematic optimization is used for the next release of Debian or RHEL?

> Logically, therefore, must we not also expect CPU designers to also forego changes that could alter timing behavior?

Maybe? [1]

> freezing all compiler development

There are many, many interesting areas of compiler development beyond incremental application of increasingly niche optimizations.

For instance, greater ability to demarcate code that is intended to be constant time. Or test suites that can detect when optimizations pose a threat to certain algorithms or implementations. Or optimizing the performance of the compiler itself.

Overall I agree with you somewhat. All engineers must constantly rail against entropy, and we are doomed to fail. But DJB is probably correct that a well-reasoned rant aimed at the community that both most desires and most produces the problematic optimizations has a better chance at changing the tide of opinion and shifting the rate at which all must diminish than yelling at chipmakers or the laws of thermodynamics.

[1]https://en.m.wikipedia.org/wiki/Spectre_(security_vulnerabil...



> This all looks like an engineering purity spiral.

To get philosophical for a second, all of engineering is analyzing problems and synthesizing solutions. When faced with impossible problems or infinite solution space, we must constrain the problem domain and search space to find solutions that are physically and economically realizable.

That's why the answer to every one of your questions is, "it depends."

But at the top, yes, it's unreasonable. The C standard specifies the observable behavior of software in C. It does not (and cannot) specify the observable behavior of the hardware that evaluates that software. Since these behaviors are architecture- and application-specific, it falls to other tools for the engineer to find solutions.

Simply put, it isn't the job of the C standard to solve these problems. C is not a specification of how a digital circuit evaluates object code. It is a specification of how a higher level language translates into that object code.



> short of regressing to in-order, non-speculative cores

I guess you are referring to GPU cores here.

It is a joke, but it hints that in-order, non-speculative cores are powerful computers nonetheless.



You make a fine point. If you follow this regression to its limit you're going to end up doing your cryptography on an MCU core with a separate, controlled tool chain. TPM hardware has been a thing for a while as well. Also, HSMs.

This seems a lot more sane than trying to retrofit these concerns onto the entire stack of general purpose hardware and software.



But then you're not writing C, except maybe as some wrappers. Wanting to use C isn't laziness. Making it nearly unfeasible to use C is the most suffering a C compiler can inflict.



As was pointed out elsewhere, fiddling bits with constant time guarantees isn't part of the C specification. You need a dedicated implementation that offers those guarantees, which isn't clang (or C, to be pedantic).



Fortunately you don't have to go 100% one way or the other. Write your code in C, compile and check it's correct and constant time, then commit that assembly output to the repo. You can also clean it up yourself or add extra changes on top.

You don't need to rely on C to guarantee some behaviour forever.



I agree that an extension (e.g. a pragma or some high-level operations similar to ckd_{add,sub,mul}) that allows writing code with fixed timing would be very useful.

But we generally have the problem that there are far more people complaining than actually contributing usefully to the ecosystem. For example, I have not seen anybody propose or work on such an extension for GCC.



The problem doesn't stop there, if you want to ensure constant time behaviour you must also be able to precisely control memory loads/stores, otherwise cache timings can subvert even linear assembly code. If you have to verify the assembly, might as well write it in assembly.



The cryptographic constant time requirement only concerns operations that are influenced by secret data. You can't learn the contents of say a secret key by how long it took to load from memory. But say we use some secret data to determine what part of the key to load, then the timing might reveal some of that data.
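
A minimal sketch of the kind of secret-indexed access being described (illustrative only, not from the comment):

  #include <stdint.h>

  /* The value loaded is not the problem; the address is. Which cache
     line gets touched depends on `secret`, so timing can leak it. */
  uint8_t leaky_lookup(const uint8_t table[256], uint8_t secret) {
      return table[secret];
  }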



The problem is that c and c++ have a ridiculous amount of undefined behavior, and it is extremely difficult to avoid all of it.

One of the advantages of rust is it confines any potential UB to unsafe blocks. But even in rust, which has defined behavior in a lot of places that are UB in c, if you venture into unsafe code, it is remarkably easy to accidentally run into subtle UB issues.



It’s true that UB is not intuitive at first, but “ridiculous amount” and “difficult to avoid” is overstating it. You have to have a proof-writing mindset when coding, but you do get sensitized to the pitfalls once you read up on what the language constructs actually guarantee (and don’t guarantee), and it’s not that much more difficult than, say, avoiding panics in Rust.



In my experience it is very easy to accidentally introduce iterator invalidation: it starts with calling a callback while iterating, add some layers of indirection, and eventually somebody will add some innocent looking code deep down the call stack which ends up mutating the collection while it's being iterated.



I can tell you that this happens in Java as well, which doesn’t have undefined behavior. That’s just the nature of mutable state in combination with algorithms that only work while the state remains unmodified.



Depending on your collection iterator invalidation _is_ UB. Pushing to a vector while iterating with an iterator will eventually lead to dereferencing freed memory as any push may cause the vector to grow and move the allocation. The standard iterator for std::vector is a pointer to somewhere in the vector's allocation when the iterator is created, which will be left dangling after the vector reallocates.
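
The same mechanism can be sketched in plain C (an analogy, not from the comment): a pointer into the old allocation dangles once the buffer is grown and moved.

  #include <stdio.h>
  #include <stdlib.h>

  int main(void) {
      int *data = malloc(4 * sizeof *data);
      if (data == NULL) return 1;
      data[0] = 42;
      int *p = &data[0];                                 /* "iterator" into the buffer */
      int *grown = realloc(data, 1000 * sizeof *grown);  /* may move the allocation */
      if (grown == NULL) { free(data); return 1; }
      printf("%d\n", *p);   /* undefined behavior: p refers to the old (freed) block */
      free(grown);
      return 0;
  }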



If it is UB, the compiler is allowed to optimize based on the assumption that it can't happen. For example, if you have an if in which one branch leads to UB and the other doesn't, the compiler can assume that the branch that led to UB will never happen and remove it from the program, and even remove other branches based on "knowing" that the branch condition didn't happen.

If it's simply erroneous, then it behaves like in every other language outside C and C++: it leaves memory in a bad state if it happens at runtime, but it doesn't have any effect at compile time.



A state the memory is not expected to be in, based on the theoretical semantics of the program.

For example, if you do an out of bounds write in C, you can set some part of an object to a value it never should have according to the text of the program, simply because that object happened to be placed in memory next to the array that you wrote past the end of.

According to the semantics of the C abstract machine, the value of a non-volatile object can only change when it is written to. But in a real program, writes to arbitrary memory (which are not valid programs in C's formal semantics) can also modify C objects, which would be called a "bad state".

For example, take this program, and assume it is compiled exactly as written, with no optimizations at all:

  void foo() {
    int x = 10;
    char y[3];
    y[4] = 1;
    printf("x = %d", x);
  }
In principle, this program should print "10". But, the OOB write to y[4] might overwrite one byte of x with the value 1, leading to the program possibly printing 1, or printing 266 (0x010A), or 16777226 (0x0100000A), depending on how many bytes an int has and how they are laid out in memory. Even worse, the OOB write may replace a byte from the return address instead, causing the program to jump to a random address in memory when hitting the end of the function. Either way, your program's memory is in a bad state after that instruction runs.


Yes, this is an example of UB leaving memory in a bad state.

If you want an example of something that is not UB leaving memory in a bad state, here is some Go code:

  package main

  import "fmt"

  var global = 7 // package-level declarations use var, not :=

  func main() {
    go func() {
      global = 1000000
    }()
    go func() {
      global = 10
    }()
    // Deliberate data race: nothing synchronizes the goroutines with this read.
    fmt.Printf("global is now %d\n", global)
  }
The two concurrent writes may partially overlap, and global may have a value that is neither 7 nor 10 nor 1000000. The program's memory is in a bad state, but none of this is UB in the C or C++ sense. In particular, the Go compiler is not free to compile this program into something entirely different.

Edit: I should also note that a similar C program using pthreads or win32 threading for concurrent access is also an example of a program which will go into a bad state, but that is not UB per the C standard (since the C standard has no notion of multithreading).



I'm familiar with that very sort of bug, but I don't see how it's a failure of the language. To be convinced of that I think I'd at least need to be shown what a good solution to that problem would look like (at the level of the language and/or standard library).



Rust statically enforces that you have exclusive access to a collection to mutate it. This prevents also having an active iterator.

You also have languages using immutable or persistent data structures in their std lib to side-step the problem.



So surely you know by heart the circa 200 UB cases documented in ISO C, and the even greater list documented in the ISO C++ standard documents.

Because I, despite knowing both since the 1990s, would rather leave that to static analysis tools.



I've spent hours debugging memory alignment issues. It's not fun. The problem is that you don't know (at first) the full space of UB. So you spend the first 10 years of programming suffering through all kinds of weird UB and then at the end of the pipeline claim "pfft, just git gud at it. C is perfect!".



Maybe I got lucky, because on my first C job I got told to make sure to stick to ISO C (by which they probably mostly meant not to use compiler-specific extensions), so I got down the rabbit hole of reading up on the ISO specification and on what it does and doesn’t guarantee.

Making sure you have no UB certainly slows you down considerably, and I strongly prefer languages that can catch all non-defined behavior statically for sure, but I don’t find C to be unmanageable.

Memory alignment issues only happen when you cast pointers from the middle of raw memory to/from other types, which, yes, is dangerous territory, and you have to know what you are doing there.



It isn't so much that it is unintuitive, for the most part[1], but rather that there are a lot of things to keep track of, and a seemingly innocuous change in one part of the program can potentially result in UB somewhere far away. And usually such bugs are not code that is blatantly undefined behavior, but rather code that is well defined most of the time, but in some edge case can trigger undefined behavior.

It would help if there was better tooling for finding places that could result in UB.

[1]: although some of them can be a little surprising, like the fact that overflow is defined for unsigned types but not signed types
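
A minimal pair illustrating the asymmetry in footnote [1] (a sketch, not from the comment):

  #include <limits.h>

  void demo(void) {
      unsigned int u = UINT_MAX;
      u = u + 1u;       /* well-defined: wraps around to 0        */

      int s = INT_MAX;
      s = s + 1;        /* undefined behavior: signed overflow    */
  }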



I agree. I do not find UB very problematic in practice. It is still certainly a challenge when writing security-sensitive code to fully make sure there is no issue left. (Also, of course, model checkers or run-time-verified code such as eBPF exist.)

But the amount of actual problems I have with UB in typical projects is very low by just some common sense and good use of tools: continuous integration with sanitizers, using pointers to arrays instead of raw pointers (where a sanitizer then does bounds checks), avoiding open coded string and buffer operations, also abstracting away other complicated data structures behind safe interfaces, and following a clear policy about memory ownership.



Would you mind sharing how you became sensitized to UB code? Did you just read the C spec, carefully digest it, and then read/write lots of C? Or do you have other recommendations for someone else interested in intuiting UB as well?



I hung out in comp.std.c, read the C FAQ (https://c-faq.com/), and yes, read the actual language spec.

For every C operation you type, ask yourself what is its "contract", that is, what are the preconditions that the language or the library function expects the programmer to ensure in order for their behavior to be well-defined, and do you ensure them at the particular usage point? Also, what are the failure modes within the defined behavior (which result in values or states that may lead to precondition violations in subsequent operations)? This contractual thinking is key to correctness in programs in general, not just in C. The consequences of incorrect code are just less predictable in C.



What helped me was to instrument an older game engine build with Clang's UB sanitizer and attempt to run it for a few weeks. Granted, I had to approve the research with management to have that much time, but I learned some things I had never seen in twentyish years of using C++.



I'm sorry, but OP seems to be vastly overestimating their abilities. Every study about bugs related to UB shows that even the best programmers will make mistakes, and often mistakes that are nearly impossible to have prevented without static tools because of the action-at-a-distance nature of the harder ones (unless you had the whole code base in your head, and you paid enormous attention to the consequences of every single instruction you wrote, you just couldn't have prevented UB).



> Every study about bugs related to UB

Are about C++. There's an order of magnitude difference in the cognitive level to visually spot UB in C code vs visually spotting UB in C++ code.



You mean studies from Google, which explicitly has a culture of dumbing down software development, and heavily focuses on theoretical algorithmic skills rather than technical ones?



Google hires the best developers in the world. They pay well beyond anyone else except the other big SV tech giants, who compete for the best. I don't work for them but if money was my main motivator and they had jobs not too far from me I would totally want to. My point is: don't pretend you're superior to them. You're very likely not, and even if you are really good, they're still about the same level as you. If you think they're doing "dumb" development, I can only think you're suffering from a very bad case of https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect , without meaning disrespect.


Ignoring the fact this isn't even true, you're completely misunderstanding my point.

As I said, Google does not prioritize for technical expertise, primarily because that's quite individual-centric. They're a large organization and their goal is to make sure people are formatted and replaceable.

They hire generally smart people and mold them to solve problems with the frameworks that they've previously built, with guidelines that are set such that anyone can come and pick it up and contribute to it, in order to facilitate maintenance in case of turnover or changing teams.

They also hire a lot of people straight out of university, with many people spending large portions of their career there without seeing much of the outside world.

As a result, their workforce is not particularly adept about using third-party tools in a variety of situations; they're optimized to know how things work at Google, which is its own sub- (or arguably mono-)culture.

Being an expert requires using a tool in many diverse codebases for widely different applications and domains. A large organization like this is not a particularly good data point to assess whether people can become good experts knowledgeable about the gotchas of a programming language.



You say this, and yet every single major project written in C has undefined behavior issues. The Linux kernel even demanded and uses a special flag in GCC to define some of this UB (especially the most brain dead one, signed integer overflow).



The linux kernel nowadays uses the fact that signed overflow is UB to detect problems using sanitizers. It turns out the defined unsigned wraparound is now the hard problem.



> but “ridiculous amount” and “difficult to avoid” is overstating it

Maybe you can argue that C doesn't have a “ridiculous amount” of UB (even though the number is large), but C++ is so much worse I don't think saying it's “ridiculous” is off the mark.

And not only is the amount already ridiculous, but every new feature introduced in modern versions of C++ adds its own brand new UB!



If you count the number of UB cases in the standard, then yes, 200 is high. There is some ongoing effort to eliminate many of them. But it should also be noted that almost all of those cases are not really problematic in practice. The problematic ones are signed overflow, out-of-bounds, use-after-free, and aliasing issues. Signed overflow is IMHO not a problem anymore because of sanitizers. In fact, I believe that unsigned wraparound is much more problematic. Out-of-bounds and use-after-free can be dealt with by having good coding strategies and for out-of-bounds issues I expect that we will have full bounds safety options in compilers soon. Aliasing issues can also mostly be avoided by not playing games with types. Use-after-free is more problematic (and where the main innovation of Rust is). But having a good ownership model and good abstractions also avoids most problems here in my experience. I rarely have actual problems in my projects related to this.
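
As a concrete illustration of the sanitizer workflow mentioned above (a minimal sketch): building the snippet below with UBSan's signed-integer-overflow check enabled (-fsanitize=signed-integer-overflow, or the umbrella -fsanitize=undefined, in GCC and Clang) turns the silent UB into a runtime diagnostic.

  #include <limits.h>

  int main(void) {
      int x = INT_MAX;
      return x + 1;   /* UBSan reports a signed integer overflow here */
  }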



> Signed overflow is IMHO not a problem anymore because of sanitizers

IIRC the overflow in SHA3's reference implementation was hard to catch also for static analysis tools, and had the practical impact of making it easy to generate collisions.



> Out-of-bounds and use-after-free can be dealt with by having good coding strategies

You're basically saying that every project in the wild has bad “coding strategy”…

> I expect that we will have full bounds safety options in compilers soon

Which will be disabled in most places because of the overhead it incurs.

> But having a good ownership model and good abstractions also avoids most problems here in my experience. I rarely have actual problems in my projects related to this.

It's easier said than done when you have no way of enforcing the ownership, and practically intractable when not working alone on a codebase.



No, I am not saying that every project in the wild has a bad "coding strategy". Some of the most reliable software I use every day is written in C. Some of it I have used for decades without ever encountering a crash or similar bug. So the meme that "all C code crashes all the time because of UB" is clearly wrong. It is not intractable; in my experience you just have to document some rules and occasionally make sure they are followed. But I agree that a formal system to enforce ownership is desirable.



> So the meme that "all C code crashes all the time because of UB" is clearly wrong.

It's not about crash at all, but “all software has security vulnerabilities because of UB” is unfortunately true.

> It is not intractable, in my experience you just have to document some rules and occasionally make sure they are followed.

If even DJB couldn't get that part perfectly right, I'm pretty certain you cannot either.



Not all software is security relevant. I run a lot of large simulations. I do not care at all if that software would crash on specially prepared inputs. I do care that it's as fast as possible.



I think the biggest problem is people conflating "undefined" with "unknowable". They act like because C doesn't define the behavior you can't expect certain compilers to behave a certain way. GCC handles signed overflows consistently, even though the concept is undefined at a language level, as do many other UBs. And the big compilers are all pretty consistent with each other.

Is it annoying if you want to make sure your code compiles the same in different compiler sets? Sure, but that's part of the issue with the standards body and the compiler developers existing independent of each other. Especially considering plenty of times C/C++ have tried to enforce certain niche behaviors and GCC/ICC/Clang/etc have decided to go their own ways.



This is dead wrong, and a very dangerous mindset.

All modern C and C++ compilers use the potential of UB as a signal in their optimization options. It is 100% unpredictable how a given piece of code where UB happens will actually be compiled, unless you are intimately familiar with every detail of the optimizer and the signals it uses. And even if you are, seemingly unrelated changes can change the logic of the optimizer just enough to entirely change the compilation of your UB segment (e.g. because a function is now too long to be inlined, so a certain piece of code can no longer be guaranteed to have some property, so [...]).

Your example of signed integer overflow is particularly glaring, as this has actually triggered real bugs in the Linux kernel (before they started using a compilation flag to force signed integer overflow to be considered defined behavior). Sure, the compiler compiles all signed integer operations to processor instructions that result in two's complement operations, and thus overflow on addition. But, since signed integer overflow is UB, the compiler also assumes it never happens, and optimizes your program accordingly.

For example, the following program will never print "overflow" regardless of what value num has:

  int foo(int num) {
    int x = num + 100;
    if (x < num) {
      printf("overflow occured");
    }
    return x;
  }
In fact, you won't even find the string "overflow" in the compiled binary, as the whole `if` is optimized away [0], since per the standard signed integer overflow can't occur, so num + 100 is always greater than num for any (valid) value of num.

[0] https://godbolt.org/z/zzdr4q1Gx



> This is dead wrong, and a very dangerous mindset.

It's "dead wrong" that compilers independently choose to define undefined behavior?

Oh, ok; I guess I must just be a stellar programmer to never have received the dreaded "this is undefined" error (or its equivalent) that would inevitably be emitted in these cases then.



I've explained and showed an example of how compilers behave in relation to UB. They don't typically "choose to define it", they choose to assume that it never happens.

They don't throw errors when UB happens, they compile your program under the assumption that any path that would definitely lead to UB can't happen at runtime.

I believe you think that because signed integer addition gets compiled to the `add` instruction that overflows gracefully at runtime, this means that compilers have "defined signed int overflow". I showed you exactly why this is not true. You can't write a C or C++ program that relies on this behavior, it will have optimizer-induced bugs sooner or later.



Isn't this a terrible failure of the compiler though? Why is it not just telling you that the `if` is a noop?? Damn, using IntelliJ and getting feedback on really difficult logic when a branch becomes unreachable and can be removed makes this sort of thing look like amateur hour.



    if(DEBUG) {
       log("xyz")
    }
Should the compiler emit a warning for such code? Compilers don't behave like a human brain; maybe a specific diagnostic could be added by pattern-matching the AST, but it will never catch every case.


There's a world of difference between code that's dead because of a static define, and code that's dead because of an inference the compiler made.

A dead code report would be a useful thing, though, especially if it could give the reason for removal. (Something like the list of removed registers in the Quartus analysis report when building for FPGAs.)



> There's a world of difference between code that's dead because of a static define, and code that's dead because of an inference the compiler made.

Not really, that’s the problem. After many passes of transforming the code through optimization it is hard for the compiler to tell why a given piece of code is dead. Compiler writers aren’t just malicious as a lot of people seem to think when discussions like this come up.



Yeah, I know the compiler writers aren't being deliberately malicious. But I can understand why people perceive the end result - the compiler itself - as having become "lawful evil" - an adversary rather than an ally.



Fair point, however your example is a runtime check, so shouldn't result in dead code.

(And if DEBUG is a static define then it still won't result in dead code since the preprocessor will remove it, and the compiler will never actually see it.)

EDIT: and now I realise I misread the first example all along - I read "#if (DEBUG)" rather than "if (DEBUG)".



I am guessing there would be a LOT of false negatives of compilers removing dead code for good reason. For example, if you only use a portion of a library's enum then it seems reasonable to me that the compilers optimizes away all the if-else that uses those enums that will never manifest.



I don't think it is unreasonable to have an option for "warn me about places that might be UB" that would tell you if it removes something it thinks is dead because it assumed UB doesn't happen?



The focus was certainly much more on optimization than on having good warnings (although some commercial products focus on that). I would not blame compiler vendors exclusively; certainly paying customers also prioritized this.

This is shifting though, e.g. GCC now has -fanalyzer. It does not detect this specific coding error, but it does catch, for example, issues such as dereferencing a pointer before checking for null.



There are only two models of UB that are useful to compiler users:

1) This is a bad idea and refuse to compile.

2) Do something sensible and stable.

Silently failing and generating impossible-to-predict code is a third model that is only of use to compiler writers. Hiding behind the spec benefits no actual user.



I think this is a point of view that seems sensible, but probably hasn't really thought through how this works. For example
  some_array[i]
What should the compiler emit here? Should it emit a bounds check? In the event the bounds check fails, what should it do? It is only through the practice of undefined behavior that the compiler can consistently generate code that avoids the bounds check. (We don't need it, because if `i` is out-of-bounds then it's undefined behavior and illegal).

If you think this is bad, then you're arguing against memory-unsafe languages in general. A sane position is the one Rust takes, which is, by default, yes indeed you should always generate the bounds check (unless you can prove it always succeeds). But there will likely always be hot inner loops where we need to discharge the bounds checks statically. Ideally that would be done with some kind of formal reasoning support, but the industry is far from that atm.

For a more in depth read: https://blog.regehr.org/archives/213
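
For illustration, here is a minimal C sketch (not from the comment or the linked post) of what an always-checked access amounts to, which is roughly what Rust emits by default:

  #include <stddef.h>
  #include <stdlib.h>

  /* Checked indexing: an out-of-range access becomes a defined failure
     (abort) instead of undefined behavior. */
  int checked_get(const int *arr, size_t len, size_t i) {
      if (i >= len) {
          abort();
      }
      return arr[i];
  }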



> What should the compiler emit here?

It should emit an instruction to access memory location some_array + i.

That's all most people that complain about optimizations on undefined behavior want. Sometimes there are questions that are hard to answer, but in a situation like this, the answer is "Try it and hope it doesn't corrupt memory." The behavior that's not wanted is for the compiler to wildly change behavior on purpose when something is undefined. For example, the compiler could optimize

  if(foo) {
      misbehaving_code();
      return puppies;
  } else {
      delete_data();
  }
into
  delete_data();


I think the "do the normal" thing is very easy to say and very hard to do in general. Should every case of `a / b` inject a `(b != 0) && ((a != INT_MAX && b != -1))`? If that evaluates to `true` then what should the program do? Or: should the compiler assume this can't happen. Languages with rich runtimes get around this by having an agreed upon way to signal errors, at the expense of runtime checking. An example directly stolen from the linked blog post:
  int stupid (int a) {
    return (a+1) > a;
  }
What should the compiler emit for this? Should it check for overflow, or should it emit the asm equivalent of `return 1`? If your answer is check for overflow: then should the compiler be forced to check for overflow every time it increments an integer in a for loop? If your answer is don't check: then how do you explain this function behaving completely weird in the overflow case? The point I'm trying to get at is that "do the obvious thing" is completely dependent on context.


The compiler should emit the code to add one to a, and then code to check if the result is greater than a. This is completely evident, and is what all C and C++ compilers did for the first few decades of their existence. Maybe a particularly smart compiler could issue a `jo` instead of a `cmp ax, bx; jg`.

The for loop example is silly. There is no reason whatsoever to add an overflow check in a for loop. The code of a standard for loop, `for (int i = 0; i < n; i++)` doesn't say to do any overflow check, so why would the compiler insert one? Not inserting overflow checks is completely different than omitting overflow checks explicitly added in the code. Not to mention, for this type of loop, the compiler doesn't need any UB-based logic to prove that the loop terminates - for any possible value of n, including INT_MAX, this loop will terminate, assuming `i` is not modified elsewhere.

I'd also note that the "most correct" type to use for the iteration variable in a loop used to access an array, per the standard, would be `size_t`, which is an unsigned type, which does allow overflow to happen. The standard for loop should be `for (size_t i = 0; i < n; ++i)`, which doesn't allow the compiler to omit any overflow checks, even if any were present.



The interesting case is what should the code do if inlined on a code path where a is deduced to be INT_MAX.

A compiler will just avoid inlining any code here, since it's not valid, and thus by definition that branch cannot be taken, removing cruft that would impact the instruction cache.



The original code is not invalid, even by the standard. It's not even undefined behavior. It is perfectly well defined as equivalent to `return true` according to the standard, or it can be implemented in the more straightforward way (add one to a, compare the result with a, return the result of the comparison). Both are perfectly valid compilations of this code according to the standard. Both allow inlining the function as well.

Note that `return 1 < 0` is also perfectly valid code.

The problem related to UB appears if the function is inlined in a situation where a is INT_MAX. That causes the whole branch of code to be UB, and the compiler is allowed to compile the whole context with the assumption that this didn't happen.

For example, the following function can well be compiled to print "not zero":

  int foo(int x) {
    if (x == 0) {
      return stupid(INT_MAX);
    } else {
      printf("not zero");
      return -1;
    } 
  }

  foo(0); //prints "not zero"
This is a valid compilation, because stupid(INT_MAX) would be UB, so it can't happen in a valid program. The only way for the program to be valid is for x to never be 0, so the `if` is superfluous and `foo` can be compiled to only have the code where UB can't happen.

Edit: Now, neither clang nor gcc seem to do this optimization. But if we replace stupid(INT_MAX) with a "worse" kind of UB, say `*(int*)NULL = 1`, then they do indeed compile the function to simply call printf [0].

[0] https://godbolt.org/z/McWddjevc



I don't know what you're ranting on about.

Functions have parameters. In the case of the previous function, it is not defined if its parameter is INT_MAX, but is defined for all other values of int.

Having functions that are only valid on a subset of the domain defined by the types of their parameters is a commonplace thing, even outside of C.

Yes, a compiler can deduce that a particular code path can be completely elided because the resulting behaviour wasn't defined. There is nothing surprising about this.



The point is that a compiler can notice that one branch of your code leads to UB and elide the whole branch, even eliding code before the UB appears. The way this cascades is very hard to track and understand - in this case, the fact that stupid() is UB when called with INT_MAX makes foo() be UB when called with 0, which can cascade even more.

And no, this doesn't happen in any other commonly-used language. No other commonly-used language has this notion of UB, and certainly not this type of optimization based on deductions made from UB. A Java function that is not well defined over its entire input set will trigger an exception, not cause code calling it with the parameters it doesn't accept to be elided from the executable.

Finally, I should mention that the compiler is not even consistent in its application of this. The signed int overflow UB is not actually used to elide this code path. But other types of UB, such as null pointer dereference, are.



It is perfectly possible to write a function in pure Java that would never terminate when called with parameter values outside of the domain for which it is defined. It is also possible for it to yield an incorrect value.

Your statement that such a function would throw an exception is false.

Ensuring a function is only called for the domain it is defined on is entirely at the programmer's discretion regardless of language. Some choose to ensure all functions are defined for all possible values, but that's obviously impractical due to combinatorial explosions. Types that encapsulate invariants are typically seen as the solution for this.



I didn't claim that all functions are either correct or throw an exception in Java. I said that UB doesn't exist in Java, in the sense of a Java program that compiles, but for which no semantics are assigned and the programmer is not allowed to write it. All situations that are UB in C or C++ are either well-defined in Java (signed integer overflow, non-terminating loops that don't do IO/touch volatile variables), throw exceptions (out of bounds access, divide by 0), or are simply not possible (use after free). Another few are what the C++ standard would call "unspecified behavior", such as unsynchronized concurrent access.

And yes, it's the programmer's job to make sure functions are called in their domain of application. But it doesn't help at all when the compiler prunes your code-as-written to remove flows that would have reached an error situation, making debugging much harder when you accidentally do call them with illegal values.



If you want the compiler to output exactly the code as written (or as close as possible to it for the target architecture), then most compilers support that. It's called turning off optimizations. You can do that if that's what you want.

Optimizing compilers on the other hand are all about outputting something that is equivalent to your code UNDER THE RULES OF THE LANGUAGE while hopefully being faster. This condition isn't there to fuck you over; it's there because it is required for the compiler to do more than very, very basic optimizations.



> Optimizing compilers on the other hand are all about outputting something that is equivalent to your code UNDER THE RULES OF THE LANGUAGE while hopefully being faster.

The problem here is how far you stretch this "equivalent under the rules of the language" concept. I think many agree that C and C++ compilers have chosen to play language lawyer games that gain little performance in real-world code but introduce very real bugs.

As it stands today, C and C++ are the only mainstream languages that have non-timing-related bugs in optimized builds that aren't there in debug builds - putting a massive burden on programmers to find and fix these bugs. The performance gain from this is extremely debatable. But what is clear is that you can create very performant code without relying on this type of UB logic.



Ah, but what if it writes so far off the array that it messes with the contents of another variable on the stack that is currently cached in a register? Should the compiler reload that register because the out of bounds write might have updated it? Probably not, let's just assume they didn't mean to do that and use the in-register version. That's taking advantage of undefined behavior to optimize a program.
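
A hedged sketch of the situation being described (variable names are mine, not from the comment):

  int example(int idx) {
      int limit = 10;
      int buf[4];
      buf[idx] = 99;   /* UB if idx > 3; the store may land on limit's stack slot */
      return limit;    /* the optimizer may keep limit in a register and return 10 regardless */
  }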



> Ah, but what if it writes so far off the array that it messes with the contents of another variable on the stack that is currently cached in a register? Should the compiler reload that register because the out of bounds write might have updated it? Probably not, let's just assume they didn't mean to do that and use the in-register version.

Yes, go ahead and assume it won't alias outside the rules of C and hope it works out.

> That's taking advantage of undefined behavior to optimize a program.

I don't know if I really agree with that, but even taking that as true, that's fine. The objection isn't to doing any optimizations. Assuming memory didn't get stomped is fine. Optimizations that significantly change program flow in the face of misbehavior and greatly amplify it are painful. And lots of things are in the middle.



> That's all most people that complain about optimizations on undefined behavior want

If this was true most of them could just adopt Rust where of course this isn't a problem.

But in fact they're often vehemently against Rust. They like C and C++ where they can write total nonsense which has no meaning but it compiles and then they can blame the compiler for not reading their mind and doing whatever it is they thought it "obviously" should do.



I could be wrong here since I don't develop compilers, but from my understanding many of the undefined behaviours in C are the product of not knowing what the outcome will be for edge cases or due to variations in processor architecture. In these cases, undefined behaviour was intended as a red flag for application developers. Many application developers ended up treating the undefined behaviours as deterministic provided that certain conditions were met. On the other hand, compiler developers took undefined behaviour to mean they could do what they wanted, generating different results in different circumstance, thus violating the expectations of application developers.



I think the problem is that some behaviours are undefined where developers expect them to be implementation-defined (especially in C's largest remaining stronghold, the embedded world) - i.e. do what makes sense on this particular CPU.

Signed overflow is the classic example - making that undefined rather than implementation-defined is a decision that makes less sense to those of us living in today's exclusively two's-complement world than it would have done when it was taken.

It's become more of an issue in recent years as compilers started doing more advanced optimisations, which some people perceived as the compiler being "lawful evil".

What it reminds me of is that episode of Red Dwarf with Kryten (with his behavioural chip disabled) explaining why he thought it was OK to serve roast human to the crew: "If you eat chicken then obviously you'd eat your own species too, otherwise you'd just be picking on the chickens"!



Unfortunately it's not necessarily specified what counts as "an optimisation". For example, the (DSP-related) compiler I worked on back in the day had an instruction selection pass, and much of the performance of optimised code came from it being a very good instruction selector. "Turning off optimisations" meant not running compiler passes that weren't required in order to generate code, we didn't have a second sub-optimal instruction selector.

And undefined behaviour is still undefined behaviour without all the optimisation passes turned on.



> It should emit an instruction to access memory location some_array + i.

That's definitely what compilers emit. The UB comes from the fact that the compiler cannot guarantee how the actual memory will respond to that. Will the OS kill you? Will your bare metal MCU silently return garbage? Will you corrupt your program state and jump into branches that should never be reached? Who knows. You're advocating for wild behavior but you don't even realize it.

As for your example. No, the compiler couldn't optimize like that. You seem to have some misconceptions about UB. If foo is false in your code, then the behavior is completely defined.



> If foo is false in your code, then the behavior is completely defined.

That's the point. If foo is false, both versions do the same thing. If foo is true, then it's undefined and it doesn't matter. Therefore, assume foo is false. Remove the branch.



Yes! This is exactly the point. It is undefined, so given that, it could do what the other branch does, so you can safely remove that branch.

you get it, but a lot of other people don't understand just how undefined, undefined code is.



We do. We just wish undefined was defined to be a bit less undefined, and are willing to sacrifice a bit of performance for higher debuggability and ability to reason.



It could do what the other branch does, in theory.

But let me put it this way. If you only had the misbehaving_code(); line by itself, the compiler would rightly be called crazy and malicious if it compiled that to delete_data();

So maybe it's not reasonable to treat both branches as having the same behavior, even if you can.



The result of a binary search is undefined if the input is not sorted.

How do you expect the compiler to statically guarantee that this property holds in all the cases you want to do a binary search?



> Silently fail and generate impossible to predict code is a third model that is only of use to compiler writers. Hiding behind the spec benefits no actual user.

A significant issue is that compiler "optimizations" aren't gaining a lot of general benefit anymore, and yet they are imposing a very significant cost on many people.

Lots of people still are working on C/C++ compiler optimizations, but nobody is asking if that is worthwhile to end users anymore.

Data suggests that it is not.



TFA? Quoting:
    Compiler writers measure an "optimization" as successful if they can find any example where the "optimization" saves time. Does this matter for the overall user experience? The typical debate runs as follows:

    In 2000, Todd A. Proebsting introduced "Proebsting's Law: Compiler Advances Double Computing Power Every 18 Years" (emphasis in original) and concluded that "compiler optimization work makes only marginal contributions". Proebsting commented later that "The law probably would have gone unnoticed had it not been for the protests by those receiving funds to do compiler optimization research."

    Arseny Kapoulkine ran various benchmarks in 2022 and concluded that the gains were even smaller: "LLVM 11 tends to take 2x longer to compile code with optimizations, and as a result produces code that runs 10-20% faster (with occasional outliers in either direction), compared to LLVM 2.7 which is more than 10 years old."

    Compiler writers typically respond with arguments like this: "10-20% is gazillions of dollars of computer time saved! What a triumph from a decade of work!"
We are spinning the compilers much harder and imposing changes on end programmers for roughly 10-20% over a decade. That's not a lot of gain in return for the pain being caused.

I suspect most programmers would happily give up 10% performance on their final program if they could halve their compile times.



> We are spinning the compilers much harder and imposing changes on end programmers for roughly 10-20% over a decade. That's not a lot of gain in return for the pain being caused.

> I suspect most programmers would happily give up 10% performance on their final program if they could halve their compile times.

10% at FAANG scale is around a billion dollars per year. There's a reason why FAANG continues to be the largest contributor by far to LLVM and GCC, and it's not because they're full of compiler engineers implementing optimizations for the fun of it.



> There's a reason why FAANG continues to be the largest contributor by far to LLVM and GCC, and it's not because they're full of compiler engineers implementing optimizations for the fun of it.

And, yet, Google uses Go which is glop for performance (Google even withdrew a bunch of people from the C/C++ working groups). Apple funded Clang so they could get around the GPL with GCC and mostly care about LLVM rather than Clang. Amazon doesn't care much as their customers pay for CPU.

So, yeah, Facebook cares about performance and ... that's about it. Dunno about Netflix who are probably more concerned about bandwidth.



Half of what? I'm not overly concerned about how long a prod build & deploy takes if it's automated. 10 minute build instead of 5 for 10% perf gain is probably worth it. Probably more and more worth it as you scale up because you only need to build it once then you can copy the binary to many machines where they all benefit.



Fun fact: you and GP are both right. The goals of a 'local' build a programmer does to check what he wrote are at odds with the goals of a 'build farm' build meant for the end user. The former should be optimized to reduce build time and the latter to reduce run time. In gamedev we separate them as different build configurations.



Right, and if anything, compilers are conservative about the optimization parameters they enable for release builds (i.e. with -O2/-O3). For most kinds of software even a 10x further increase in compile times could make sense if it meant a couple of percent faster software.



If something is good for compiler developers, it is good for compiler users, in the sense that it makes it easier for the compiler developers to make the compilers we need.



I think you're replying to a strawman. Here's the full quote:

> The excuse for not taking responsibility is that there are "language standards" saying that these bugs should be blamed on millions of programmers writing code that bumps into "undefined behavior", rather than being blamed on the much smaller group of compiler writers subsequently changing how this code behaves. These "language standards" are written by the compiler writers.

> Evidently the compiler writers find it more important to continue developing "optimizations" than to have computer systems functioning as expected. Developing "optimizations" seems to be a very large part of what compiler writers are paid to do.

The argument is that the compiler writers are themselves the ones deciding what is and isn't undefined, and they are defining those standards in such a way as to allow themselves latitude for further optimizations. Those optimizations then break previously working code.

The compiler writers could instead choose to prioritize backwards compatibility, but they don't. Further, these optimizations don't meaningfully improve the performance of real world code, so the trade-off of breaking code isn't even worth it.

That's the argument you need to rebut.



Perhaps the solution is also to rein in the language standard to support stricter use cases. For example, what if there were a constant-time { ... }; block in the same way you have extern "C" { ... };? Not only would it allow you to have optimizations outside of the block, it would also force the compiler to ensure that a given block of code is always constant-time (as a security check done by the compiler).
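A minimal sketch of what such an extension might look like, assuming a hypothetical constant_time block keyword (nothing like this exists in any current C standard or compiler; select_u32 is just an illustrative name):

    #include <stdint.h>

    /* Hypothetical syntax: the compiler would be required to emit
       branch-free, data-independent-latency code for the block body,
       or reject the program if it cannot. */
    uint32_t select_u32(uint32_t mask, uint32_t a, uint32_t b)
    {
        constant_time {
            /* mask is assumed to be all-ones or all-zeros */
            return (a & mask) | (b & ~mask);
        }
    }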



I would say that allowing undefined behavior is a bug in itself. It was an understandable mistake for 1970, especially for such a hacker language as C. But now, if a compiler can detect UB, it should warn you about it (and mostly it does by default), and you should treat that warning as an error.

So, well, yes, if the bug is due to triggering UB, some blame should fall on the developer, too.



We can debate whether it's reasonable or not to optimize code based on undefined behavior. But we should at least have the compiler emit a warning when this happens. Just like we have the notorious "x makes an integer from a pointer without a cast", we could have warnings for when the compiler decides to not emit the code for an if branch checking for a null pointer or an instruction zeroing some memory right before deallocation (I think this is not UB, but still a source of security issues due to extreme optimizations).
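To make the kind of silent transformation being described concrete, here is the textbook null-check pattern (a sketch; whether a particular compiler version actually removes the check depends on its optimization settings, e.g. -fdelete-null-pointer-checks being on by default at -O2 in GCC):

    /* Because the first dereference is UB if p is NULL, the compiler
       may assume p != NULL afterwards and silently delete the whole
       if branch below. */
    int read_value(int *p)
    {
        int v = *p;          /* dereference happens first */
        if (p == NULL)       /* this check may be optimized away */
            return -1;
        return v;
    }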



> Calling the compiler buggy for not doing what you want when you commit Undefined Behavior is like calling dd buggy for destroying your data when you call it with the wrong arguments.

No, it's like calling dd buggy for deliberately zeroing all your drives when you call it with no arguments.

How did we let pedantic brainless "but muh holy standards!!!1" religious brigading triumph over common sense?

The standards left things undefined in the hopes that the language would be more widely applicable and implementers would give those areas thought themselves and decide the right thing. Not so that compiler writers can become adversarial smartasses. It even suggests that "behaving in a manner characteristic of the environment" is a possible outcome of UB, which is what "the spirit of C" is all about.

In my observations this gross exploitation of UB started with the FOSS compilers, GCC and Clang being the notable examples. MSVC or ICC didn't need to be so crazy, and yet they were very competitive, so I don't believe claims that UB is necessary for optimisation.

The good thing about FOSS is that those in power can easily be changed. Perhaps it's time to fork, fix, and fight back.



> The standards left things undefined in the hopes that the language would be more widely applicable and implementers would give those areas thought themselves and decide the right thing.

That sounds like implementation-defined behavior, not undefined behavior.



I do not think there is a reason to fork. Just contribute. I found the GCC community very welcoming. But maybe not come in with an "I need to take back the compiler from evil compiler writers" attitude.



I don’t really think there is either, but I figured it was a funny way to present the “there never was anything to prevent you from forking in the first place” argument.



From personal experience, they couldn't care less if they can argue it's "undefined". All they do is worship The Holy Standard. They follow the rules blindly without ever thinking whether it makes sense.

> But maybe not come in with an "I need to take back the compiler from evil compiler writers" attitude.

They're the ones who started this hostility in the first place.

If even someone like Linus Torvalds can't get them to change their ways, what chances are there for anyone else?



Okay, so you're not up to making a boringcc compiler from the nine-years old proposal of TFA's author, and you don't believe that it's possible to persuade the C implementers to adopt different semantics, so... what do you propose then? It can be only three things, really: either "stop writing crypto algorithms altogether", or "write them in some different, sane language", or "just business as usual: keep writing them in C while complaining about those horrible implementers of C compilers". But perhaps you have a fourth option?

P.S. "All they do is worship The Holy Standard. They follow the rules blindly without ever thinking whether it makes sense" — well, no. Who do you think writes The Holy Standard? Those compiler folks actually comprise quite a number of the members of JTC1/SC22/WG14, and they are also the ones who actually get to implement that standard. So to quote JeanHeyd Meneide of thephd.dev, "As much as I would not like this to be the case, users – me, you and every other person not hashing out the bits ‘n’ bytes of your Frequently Used Compiler — get exactly one label in this situation: bottom bitch".



> They're the ones who started this hostility in the first place.

"How dare these free software developers not do exactly what I want."

Talk about being entitled. If you can't manage to communicate your ideas in a way that will convince others to do the work you want to see done, then you need to either pay (and find someone willing to do the work for payment) or do the work yourself.



Plenty of undefined behavior is actually perfectly good code the compiler has no business screwing up in any way whatsoever. This is C, we do evil things like cast pointers to other types and overlay structures onto byte buffers. We don't really want to hear about "undefined" nonsense, we want the compiler to accept the input and generate the code we expect it to. If it's undefined, then define it.

This attitude turns bugs into security vulnerabilities. There's a reason the Linux kernel is compiled with -fwrapv -fno-strict-aliasing -fno-delete-null-pointer-checks and probably many more sanity restoring flags. Those flags should actually be the default for every C project.



Disabling all optimizations isn't even enough- fundamentally what you need is a much narrower specification for how the source language maps to its output. Even -O0 doesn't give you that, and in fact will often be counterproductive (e.g. you'll get branches in places that the optimizer would have removed them).

The problem with this is that no general purpose compiler wants to tie its own hands behind its back in this way, for the benefit of one narrow use case. It's not just that it would cost performance for everyone else, but also that it requires a totally different approach to specification and backwards compatibility, not to mention deep changes to compiler architecture.

You almost may as well just design a new language, at that point.



> You almost may as well just design a new language, at that point.

Forget “almost”.

Go compile this C code:

    #include <stdlib.h>

    void foo(int *ptr)
    {
        free(ptr);
        *ptr = 42;   /* write through a freed pointer: use-after-free */
    }
This is UB. And it has nothing whatsoever to do with optimizations — any sensible translation to machine code is a use-after-free, and an attacker can probably find a way to exploit that machine code to run arbitrary code and format your disk.

If you don’t like this, use a language without UB.

But djb wants something different, I think: a way to tell the compiler not to introduce timing dependencies on certain values. This is a nice idea, but it needs hardware support! Your CPU may well implement ALU instructions with data-dependent timing. Intel, for example, reserves the right to do this unless you set an MSR to tell it not to. And you cannot set that MSR from user code, so what exactly is a compiler supposed to do?

https://www.intel.com/content/www/us/en/developer/articles/t...



It isn't just UB to dereference `ptr` after `free(ptr)` – it is UB to do anything with its value whatsoever. For example, this is UB:
    #include <assert.h>
    #include <stdlib.h>

    void foo(int *ptr)
    {
        assert(ptr != NULL);
        free(ptr);
        assert(ptr != NULL);   /* UB: even reading ptr's value after free */
    }
Why is that? Well, I think because the C standard authors wanted to support the language being used on platforms with "fat pointers", in which a pointer is not just a memory address, but some kind of complex structure incorporating flags and capabilities (e.g. IBM System/38 and AS/400; Burroughs Large Systems; Intel iAPX 432, BiiN and i960 extended architecture; CHERI and ARM Morello). And, on such a system, they wanted to permit implementors to make `free()` a "pass-by-reference" function, so it would actually modify the value of its argument. (C natively doesn't have pass-by-reference, unlike C++, but there is nothing stopping a compiler adding it as an extension, then using it to implement `free()`.)

See this discussion of the topic from 8 years back: https://news.ycombinator.com/item?id=11235385

> And you cannot set that MSR from user code, so what exactly is a compiler supposed to do?

Set a flag in the executable which requires that MSR to be enabled. Then the OS will set the MSR when it loads the executable, or refuse to load it if it won't.

Another option would be for the OS to expose a user space API to read that MSR. And then the compiler emits a check at the start of security-sensitive code to call that API and abort if the MSR doesn't have the required value. Or maybe even, the OS could let you turn the MSR on/off on a per-thread basis, and just set it during security-sensitive processing.

Obviously, all these approaches require cooperation with the OS vendor, but often the OS vendor and compiler vendor is the same vendor (e.g. Microsoft)–and even when that isn't true, compiler and kernel teams often work closely together.



> Set a flag in the executable which requires that MSR to be enabled. Then the OS will set the MSR when it loads the executable, or refuse to load it if it won't.

gcc did approximately this for decades with -ffast-math. It was an unmitigated disaster. No thanks. (For flavor, consider what -lssl would do. Or dlopen.)

> Another option would be for the OS to expose a user space API to read that MSR. And then the compiler emits a check at the start of security-sensitive code to call that API and abort if the MSR doesn't have the required value.

How does the compiler know where the sensitive code starts and ends? Maybe it knows that certain basic blocks are sensitive, but it’s a whole extra control flow analysis to find beginning and ends.

And making this OS dependent means that compilers need to be more OS dependent for a feature that’s part of the ISA, not the OS. Ick.

> Or maybe even, the OS could let you turn the MSR on/off on a per-thread basis, and just set it during security-sensitive processing.



> How does the compiler know where the sensitive code starts and ends?

Put an attribute on the function. In C23, something like `[[no_data_dependent_timing]]` (or `__attribute__((no_data_dependent_timing))` using pre-C23 GNU extension)
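As a sketch of what that could look like at the source level (the attribute name is the one proposed above; it is not part of C23 or any shipping compiler, and ct_swap is just an illustrative helper):

    #include <stdint.h>

    /* Hypothetical: the compiler, loader, and OS would together
       guarantee data-independent timing for this function's body. */
    [[no_data_dependent_timing]]
    void ct_swap(uint32_t *a, uint32_t *b, uint32_t do_swap_mask)
    {
        /* do_swap_mask is all-ones to swap, all-zeros to leave as-is */
        uint32_t t = (*a ^ *b) & do_swap_mask;
        *a ^= t;
        *b ^= t;
    }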

> And making this OS dependent means that compilers need to be more OS dependent for a feature that’s part of the ISA, not the OS. Ick.

There are lots of unused bits in RFLAGS, I don't know why Intel didn't use one of those, instead of an MSR. (The whole upper 32 bits of RFLAGS is unused – if Intel and AMD split it evenly between them, that would be 16 bits each.) Assuming the OS saves/restores the whole of RFLAGS on context switch, it wouldn't even need any change to the OS. CPUID could tell you whether this additional RFLAGS bit was supported or not. Maybe have an MSR which controls whether the feature is enabled or not, so the OS can turn it off if necessary. Maybe even default to having it off, so it isn't visible in CPUID until it is enabled by the OS via MSR – to cover the risk that maybe the OS context switching code can't handle a previously undefined bit in RFLAGS being non-zero.



Execution time is not considered Observable Behavior in the C standard. It's entirely outside the semantics of the language. It is Undefined Behavior, though not UB that necessarily invalidates the program's other semantics the way a use-after-free would.



This is pretty persnickety and I imagine you're aware of this, but free is a weak symbol on Linux, so user code can replace it at whim. Your foo cannot be statically determined to be UB.



Hmm, not sure, I think it would be possible to mark a function with a pragma as "constant time", and the compiler could make sure that it indeed is that. I think it wouldn't be impossible to actually teach it to convert branched code into unbranched code automatically for many cases as well. Essentially, the compiler pass must try to eliminate all branches, and the code generation must make sure to only use data-constant-time ops. It could warn/fail when it cannot guarantee it.



It’s not well-defined what counts as an optimization. For example, should every single source-level read access of a memory location go through all cache levels down to main memory, instead of, for example, caching values in registers? That would be awfully slow. But that question is one reason for UB.
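A small illustration of the register-caching point, assuming the usual type-based aliasing rules (the function and variable names are made up):

    /* Because an int object and a float object are assumed not to
       alias, the compiler may load *x once and keep the value in a
       register instead of re-reading memory after the store through f. */
    int read_around_store(int *x, float *f)
    {
        int a = *x;
        *f = 1.0f;
        int b = *x;   /* may be satisfied from the earlier load */
        return a + b;
    }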



Or writing code that relies on inlining and/or tail call optimization to successfully run at all without running out of stack... We've got some code that doesn't run if compiled O0 due to that.



There is a fundamental difference of priorities between the two worlds. For most general application code any optimization is fine as long as the output is correct. In security critical code information leakage from execution time and resource usage on the chip matters but that essentially means you need to get away from data-dependent memory access patterns and flow control.



Then such code needs to be written in a language that actually makes the relevant timing guarantees. That language may be C with appropriate extensions but it certainly is not C with whining that compilers don't apply my special requirements to all code.



The problem is that preventing timing attacks often means you have to implement something in constant time. And most language specifications and implementations don't give you any guarantees that any operations happen in constant time and can't be optimized.

So the only possible way to ensure things like string comparison don't have data-dependent timing is often to implement it in assembly, which is not great.

What we really need is intrinsics that are guaranteed to have the desired timing properties, and/or a way to disable optimization, or at least certain kinds of optimization, for an area of code.
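For reference, the usual portable-C attempt at a timing-safe comparison looks roughly like the sketch below; the point being made here is that nothing in the standard prevents a compiler from rewriting it into an early-exit loop:

    #include <stddef.h>
    #include <stdint.h>

    /* Accumulate differences with OR instead of returning early, so the
       loop's work does not depend on where the first mismatch occurs.
       The language gives no guarantee this survives optimization. */
    int timing_safe_equal(const uint8_t *a, const uint8_t *b, size_t n)
    {
        uint8_t diff = 0;
        for (size_t i = 0; i < n; i++)
            diff |= a[i] ^ b[i];
        return diff == 0;
    }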



Intrinsics which do the right thing seems like so obviously the correct answer to me that I've always been confused about why the discussion is always about disabling optimizations. Even in the absence of compiler optimizations (which is not even an entirely meaningful concept), writing C code which you hope the compiler will decide to translate into the exact assembly you had in mind is just a very brittle way to write software. If you need the program to have very specific behavior which the language doesn't give you the tools to express, you should be asking for those tools to be added to the language, not complaining about how your attempts at tricking the compiler into the thing you want keep breaking.



The article explains why this is not as simple as that, especially in the case of timing attacks. Here it's not just the end result that matters, but how it's done. If any code can be changed to anything else that gives the same results, then this becomes quite hard.

Absolutist statements such as this may give you a glowing sense of superiority and cleverness, but they contribute nothing and are not as clever as you think.



The article describes why you can’t write code which is resistant to timing attacks in portable C, but then concludes that actually the code he wrote is correct and it’s the compiler’s fault it didn’t work. It’s inconvenient that anything which cares about timing attacks cannot be securely written in C, but that doesn’t make the code not fundamentally incorrect and broken.



It is, in fact, pretty hard as evidenced by how often programmers fail at it. The macho attitude of "it's not hard, just write good code" is divorced from observable reality.



It's more complex than that for the example of car speed limits. Depending on where you live, the law also says that driving too slow is illegal because it creates an unsafe environment by forcing other drivers on e.g. the freeway to pass you.

But yeah, seeing how virtually everyone on every road is constantly speeding, that doesn't give me a lot of faith in my fellow programmers' ability to avoid UB...



And to be specific, some kinds of UB are painfully easy to avoid. A good example of that is strict aliasing. Simply don't do any type punning. Yet people still complain about it being the compiler's fault when their wanton casting leads to problems.
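For anyone unsure what counts as type punning here, a minimal example (the memcpy form is the well-known defined alternative; compilers typically lower it to a single move):

    #include <stdint.h>
    #include <string.h>

    uint32_t float_bits(float f)
    {
        /* UB under strict aliasing:
           uint32_t bits = *(uint32_t *)&f;  (reading a float object
           through an incompatible pointer type)                     */

        /* Defined: copy the object representation instead. */
        uint32_t bits;
        memcpy(&bits, &f, sizeof bits);
        return bits;
    }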



Some jurisdictions also set the speed limit at, e.g., the 85th percentile of drivers' speed (https://en.wikipedia.org/wiki/Speed_limit#Method) so some drivers are always going to be speeding.

(I'm one of those speeders, too; I drive with a mentality of safety > following the strict letter of the law; I'll prefer speed of traffic if that's safer than strict adherence to the limit. That said, I know not all of my peers have the same priorities on the road, too.)



People write buffer overflows and memory leaks because they are not careful. The rest of the UB cases are things I have never seen despite running sanitizers on a large codebase.



Only if developers act as grown ups and use all static analysers they can get hold of, instead of acting as they know better.

The tone of my answer is a reflection of what most surveys state, related to the actual use of such tooling.



I like Bernstein but sometimes he flies off the handle in the wrong direction. This is a good example, which he even half-heartedly acknowledges at the end!

A big chunk of the essay is about a side point — how good the gains of optimization might be, which, even with data, would be a use-case dependent decision.

But the bulk of his complaint is that C compilers fail to take into account semantics that cannot be expressed in the language. Wow, shocker!

At the very end he says “use a language which can express the needed semantics”. The entire essay could have been replaced with that sentence.



There's an important point to be made here: those who define the semantics of C and C++ shovel an unreasonable amount of behavior into the bucket of "undefined behavior". Much of this has dubious justifications, while making it more difficult to write correct programs.



To be pedantic, I think you're speaking about unspecified behavior and implementation defined behavior. Undefined behavior specifically refers to things that have no meaningful semantics, so the compiler assumes it never happens.

Unspecified behavior is anything outside the scope of observable behavior for which there are two or more ways the implementation can choose.

Since the timing of instructions on machines with speculative execution is not observable behavior in C, anything that impacts it is unspecified.

There's really no way around this, and I disagree that there's an "unreasonable" amount of it. Ultimately it is up to the judgement of the compiler developers what choice to make, and up to users to pick implementations based on those choices, or work around them as needed.



I am referring to undefined behavior.

For example, consider the case of integer overflow when adding two signed numbers. C considers this undefined behavior, making the program's behavior undefined. All bets are off, even if the program never makes use of the resulting value. C compilers are allowed to assume the overflow can never happen, which in some cases allows them to infer that numbers must fit within certain bounds, which allows them to do things like optimize away bounds checks written by the programmer.

A more reasonable language design choice would be to treat this as an operation that produces an unspecified integer result, or an implementation-defined result.
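To make the check-elimination point concrete, here is the classic pattern (a sketch; whether the check is actually removed depends on the compiler and on flags such as -fwrapv):

    /* Intended as an overflow test, but the test itself relies on
       wrapping. Since signed overflow is UB, a compiler may assume
       x + 100 cannot wrap, fold the comparison to 0, and delete any
       branch guarded by it. */
    int addition_would_overflow(int x)
    {
        return x + 100 < x;
    }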

Edit: The following article helps clear up some common confusion about undefined behavior:

https://blog.regehr.org/archives/213

Unfortunately this article, like most on the subject, perpetuates the notion that there are significant performance benefits to treating simple things like integer overflow as UB. E.g.: "I've heard that certain tight loops speed up by 30%-50% ..." Where that is true, the compiler could still emit the optimized form of the loop without UB-based inference, but it would simply have to be guarded by a run-time check (outside of the loop) that would fall back to the slower code in the rare occasions when the assumptions do not hold.



Signed integer overflow being undefined has these consequences for me: 1. It makes my code slightly faster. 2. It makes my code slightly smaller. 3. It makes my code easier to check for correctness, and thus makes it easier to write correct code.

Win, win, win.

Signed integer overflow would be a bug in my code.

As I do not write my own implementations to correctly handle the case of signed integer overflow, the code I am writing will behave in nonsensical ways in the presence of signed integer overflow, regardless of whether or not it is defined. Unless I'm debugging my code or running CI, in which case ubsan is enabled, and the signed overflow instantly traps to point to the problem.

Switching to UB-on-overflow in one of my Julia packages (via `llvmcall`) removed like 5% of branches. I do not want those branches to come back, and I definitely don't want code duplication where I have two copies of that code, one with and one without. The binary code bloat of that package is excessive enough as is.



Agreed. If anything, I'd like to have an unsigned type with undefined overflow so that I can get these benefits while also guaranteeing that the numbers are never negative where that doesn't make any sense.



It would also be nice if hardware would trap on signed integer overflow. Of course since the most popular architectures do not, new architectures also do not either.



The point is much of what the C standard currently calls undefined behavior should instead be either unspecified or implementation-defined. This includes the controversial ones like strict aliasing and signed overflow.

Additionally, part of the problem is compiler devs insisting on code transforms that are unsound in the presence of undecidable UB, without giving the programmer sufficiently fine control over such transforms (at best we have a few command line flags for some of them, worst case you'd need to disable all optimizations including the non-problematic ones.)



For example the recent realloc change in C23. I was surprised the previously used behaviour, even if inconsistent across implementations, was declared UB. Why not impdef?



> A big chunk of the essay is about a side point — how good the gains of optimization might be, which, even with data, would be a use-case dependent decision.

I think this was useful context, and it was eye-opening to me.



If you were not aware of this then you might reflect on the part of my comment that he doesn't bring up: how good or bad the gains are is use-case dependent. Every program optimizes for a use case, sometimes pessimizing for others (e.g. an n^2 algo that's worthwhile because it is believed to only be called on tiny vectors).

IMHO he was overgenerous on the optimization improvement of compilers. Often an optimization will make a difference in a tiny fraction of a percent. The value comes from how often that optimization can be applied, and how lots of optimizations can in aggregate make a bigger improvement just as a sand dune is made of tiny grains of sand.



C and C++ are unsuitable for writing algorithms with constant-time guarantees. The standards have little to no notion of real time, and compilers don't offer additional guarantees as extensions.

But blaming the compiler devs for this is just misguided.



That was my thought reading this article. If you want to produce machine code that performs operations in constant time regardless of the branch taken, you need to use a language that supports expressing that, which C does not.



If you want to get very paranoid most instructions probably use slightly different amounts of power for different operands which will change thermal output which will affect CPU throttling. I'm not sure there are any true constant time instructions on modern high-performance CPUs. I think we have just agreed that some instructions are as close as we can reasonably get to constant time.



It is not a problem that different CPUs have different execution times; the problem is if the same CPU, running the same instruction, has a timing difference depending on the data it operates on. In this regard CPUs have actually gotten better, specifically because it is a feature that AMD and Intel have pursued.



Not always. At least for RISC-V there is the Zkt extension which guarantees data independent execution time for some instructions. I assume there's something similar for ARM and x86.

It does pretty much require you to write assembly though. I think it would definitely make sense to have some kind of `[constant_time]` attribute for C++ that instructed the compiler to ensure the code is constant time.



> If you want to produce machine code that performs operations in constant time regardless of the branch taken

Nobody is asking for that. That's the whole point. Crypto code that needs to be constant time with respect to secret data needs to avoid branching on secret data, but the optimizer is converting non-branching code into branching code.



According to some comments under this submission, even x86 assembly isn't suitable, or only under specific circumstances that are generally not available in userspace.



At this time, the idea of a constant-time operation embedded into a language’s semantics is not a thing. Similar for CPU architectures. Our computing base is about being fast and faster.



> because correct code does not exist in user mode.

User mode code can run in the correct mode. What it cannot do is toggle the mode on/off. Once toggled on, it works perfectly fine for userspace; this could become e.g. a per-process flag enabled by a prctl syscall, with the MSR adjusted during scheduler task switching.
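A sketch of what such an opt-in could look like from user space. The prctl() call is a real Linux interface, but the PR_SET_DATA_INDEPENDENT_TIMING option and its value are purely hypothetical here, invented to illustrate the per-process-flag idea:

    #include <sys/prctl.h>
    #include <stdio.h>

    /* Hypothetical option: no such prctl exists today. The idea is that
       the kernel would set the relevant MSR bit for this process (or
       thread) and keep it set across context switches. */
    #ifndef PR_SET_DATA_INDEPENDENT_TIMING
    #define PR_SET_DATA_INDEPENDENT_TIMING 0x100   /* made-up value */
    #endif

    int enable_constant_time_mode(void)
    {
        if (prctl(PR_SET_DATA_INDEPENDENT_TIMING, 1, 0, 0, 0) != 0) {
            perror("prctl");
            return -1;
        }
        return 0;
    }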



So your compiler is supposed to emit a pair of syscalls each function that does integer math? Never mind that a pair of syscalls that do WRMSR may well take longer than whatever crypto operation is between them.

I have absolutely nothing good to say about Intel’s design here.



An instruction prefix that makes instructions constant time. A code segment bit (ugly but would work). Different instructions. Making constant time the default. A control register that’s a user register.



Since we already have some reasons to sign in an enclave, why not just design a cryptographic processor which is highly unoptimized and highly predictable? Since the majority of code benefits immensely from the optimizations, it doesn't seem reasonable to cripple it.



So instead of just doing the rather fast elliptic curve math when getting a TLS connection request by using a standard crypto library, I’m supposed to call out to a cryptographic coprocessor that may or may not even support the operation I need? Have you seen what an unbelievable mess your average coprocessor is to use, Intel or otherwise.

CPUs have done just fine doing constant time math for decades. It’s at best a minor optimization to add data dependence, and Intel already knows (a) how to turn it off and (b) that it’s sometimes necessary to let it be turned off. Why can’t they add a reasonable mechanism to turn them off?



The version of this that I want to see is a CPU that gives you a core that doesn't have caches or branch prediction on which you can write custom code without having to worry about timing attacks.



> [..] whenever possible, compiler writers refuse to take responsibility for the bugs they introduced

I have seldomly seen someone discredit their expertise that fast in a blog post. (Especially if you follow the link and realized it's just basic fundamental C stuff of UB not meaning it produces an "arbitrary" value.)



No, I think you're just speaking past each other here. You're using "bug" in reference to the source code. They're using "bug" in reference to the generated program. With UB it's often the case that the source code is buggy but the generated program is still correct. Later the compiler authors introduce a new optimization that generates a buggy program based on UB in the source code, and the finger-pointing starts.

Edit: What nobody likes to admit is that all sides share responsibility to the users here, and that is hard to deal with. People just want a single entity to offload the responsibility to, but reality doesn't care. To give an extreme analogy to get the point across: if your battery caught fire just because your CRUD app dereferenced NULL, nobody (well, nobody sane) would point the finger at the app author for forgetting to check for NULL. The compiler, OS, and hardware vendors would be held accountable for their irresponsibly-designed products, "undefined behavior" in the standard be damned. Everyone in the supply chain shares a responsibility to anticipate how their products can be misused and handle them in a reasonable manner. The apportionment of the responsibility depends on the situation and isn't something you can just determine by just asking "was this UB in the ISO standard?"



> just speaking past each other here

no I'm not

if your program has UB it's broken, and it doesn't matter if it currently happens to work correctly under a specific compiler version; it's also fully your fault

sure there is shared responsibility through the stack, but _one of the most important aspects when you have something like a supply chain is to know who supplies what under which guarantees taking which responsibilities_

and for C/C++ it's clearly communicated that it's solely your responsibility to avoid UB (in the same way that it's the battery vendor's responsibility to produce batteries which can't randomly catch on fire, the firmware vendor's responsibility to use the battery driver/charging circuit correctly, and your OS's responsibility to ensure that a random program faulting can't affect the firmware, etc.)

> be misused and handle them in a reasonable manner

For things provided B2B, that's in general only the case in contexts involving end users, likely accidents, and similar.

Instead it's the responsibility of the supplier to be clear about what can be done with the product and what not, and if you do something outside of the spec it's your responsibility to continuously make sure it's safe (or in general ask the supplier for clarifying guarantees wrt. your usage).

E.g. if you buy capacitors rated for up to 50C environmental temperature that happen to work up to 80C, you still can't use them at 80C, because there is 0% guarantee that even other capacitors from the same batch will also work at 80C. In the same way compilers are only "rated"(1) to behave as expected for programs without UB.

If you find that unacceptable because it's too easy to end up with accidental UB, then you should do what anyone in a supply chain with a too-risky-to-use component would do:

Replace it with something less risky to use.

There is a reason the ONCD urged developers to stop using C/C++ and similar where viable, because that is pretty much just following standard supply chain management best-practice.

(1: just for the sake of wording. Though there are certified, i.e. ~rated, compiler revisions)



> your program has UB it's broken and it doesn't matter if it currently happen to work correct under a specific compiler version, it's also fully your fault

Except that compiler writers essentially decide what's UB. Which is a conflict of interest.

And they add UB, making previously non-UB code fall under UB. Would you call such code buggy?



> Except that compiler writers essentially decide what's UB.

No, the C/C++ standards specify what is UB. So, as long as you don't switch targeted standard versions, the brokenness of your code never changes.

Compilers may happen to previously have never made optimizations around some specific UB, but, unless you read in the compiler's documentation that it won't, code relying on it was always broken. It's a bog standard "buggy thing working once doesn't mean it'll work always".



> No, the C/C++ standards specify what is UB.

And the compiler writers have a stranglehold on the standards bodies. They hold more than 50% of the voting power last time I checked.

So yeah, compiler writers decide what's UB.



The vast majority of UB usually considered problematic has been in the standards for decades, long before compilers took as much advantage of it as they do now (and the reasons for including said UB back then were actual hardware differences, not appeasing compiler developers).

Are there even that many UB additions? The only thing I can remember is realloc with size zero going from implementation-defined to undefined in C23.



Yes, but that does not change the fact that compiler writers have control of the standard, have had that control since probably C99, and have introduced new UB along with pushing the 00UB worldview.



What introduced UB are you thinking of? I'll admit I don't know how much has changed, but the usually-complained-about things (signed overflow, null pointer dereferencing, strict aliasing) are clearly listed as UB in some C89 draft I found.

C23's introduced stdc_trailing_zeros & co don't even UB on 0, even though baseline x86-64's equivalent instructions are literally specified to leave their destination undefined on such!

00UB is something one can argue about, but I can't think of a meaningful way to define UB that doesn't impose significant restrictions on even basic compilers, without precisely defining how UB-result values are allowed to propagate.

e.g. one might expect that 'someFloat == (float)(int8_t)someFloat' give false on an input of 1000, but guaranteeing that takes intentional effort - namely, on hardware whose int↔float conversions only operate on ≥32-bit integers (i.e. everything - x86, ARM, RISC-V), there'd need to be an explicit 8-to-32-bit sign-extend, and the most basic compiler just emitting the two f32→i32 & i32→f32 instructions would fail (but is imo pretty clearly within "ignoring the situation completely with unpredictable results" that the C89 draft contains). Sure it doesn't summon cthulhu, but it'll quite likely break things very badly anyway. (whether it'd be useful to not have UB here in the first place is a separate question)

Even for 'x+100 < x' one can imagine a similar case where the native addition & comparison instructions operate on inputs wider than int; using such for assuming-no-signed-wrap addition always works, but would mean that the comparison wouldn't detect overflow. Though here x86-64, aarch64, and RISC-V all do provide instructions for 32-bit arith, matching their int. This would be a bigger thing if it were possible to have sub-int-sized arith.



All of it. But especially anything added after C89 that was not already there implicitly.

Edit: okay, not all of it. I was hyperbolic. Race conditions and data races should be UB. But anything that can be implementation-defined should be.



So your issue is not at all any specific thing or action anyone took, but just in general having UB in places not strictly necessary. And "Especially anything [different from The Golden Days]", besides being extremely cliche, is a completely arbitrary cutoff point.

A given compiler is free to define specific behavior for UB (and indeed you can add compiler flags to do that for many things); the standard explicitly acknowledges that with "Possible undefined behavior ranges from […], to behaving during translation or program execution in a documented manner characteristic of the environment".



Sigh...yes, I don't want any UB where it's not necessary.

But if you must have a concrete example, how about realloc?

In C89 [1] (page 155), realloc with a 0 size and a non-NULL pointer was defined as free:

> If size is zero and ptr is not a null pointer, the object it points to is freed.

In C99 [2] (page 314), that sentence was removed, making it undefined behavior when it wasn't before. This is a pure example of behavior becoming undefined when it was not before.

In C11 [3] (page 349), that sentence remains gone.

In C17 [4] (page 254), we get an interesting addition:

> If size is zero and memory for the new object is not allocated, it is implementation-defined whether the old object is deallocated. If the old object is not deallocated, its value shall be unchanged.

So the behavior switches from undefined to implementation-defined.

In C23 [5] (page 357), the wording completely changes to:

> ...or if the size is zero, the behavior is undefined.

So WG14 made it UB again after making implementation-defined.

SQLite targets C89, but people compile it with modern compilers all the time, and those modern compilers generally default to at least C99, where the behavior is UB. I don't know if SQLite uses realloc that way, but if it does, are you going to call it buggy just because the authors stick to C89 and their users use later standards?

[1]: https://web.archive.org/web/20200909074736if_/https://www.pd...

[2]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1256.pdf

[3]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf

[4]: https://web.archive.org/web/20181230041359if_/http://www.ope...

[5]: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3047.pdf



If SQLite wants exactly C89, it can just require -std=c89, and then people compiling it with a different standard target are to blame. This is just standard backwards incompatibility, nothing about UB (in other languages requiring specific compiler/language versions is routine). Problems would arise even if it was changed from being a defined 'free(x)' to being a defined 'printf("here's the thing you realloc(x,0)'d: %p",x)'. (whether the C standard should always be backwards compatible is a more interesting question, but is orthogonal to UB)

I do remember reading somewhere that a real platform in fact not handling size 0 properly (or having explicitly-defined behavior going against what the standard allowed?) was an argument for changing the standard requirement. It's certainly not because compiler developers had big plans for optimizing around it, given that both gcc and clang don't: https://godbolt.org/z/jjcGYsE7W. And I'm pretty sure there's no way this could amount to any optimization on non-extremely-contrived examples anyway.

I had edited one of my parent comments to mention realloc, so if we both landed on the same example, there's probably not that many significant other cases.



> If SQLite wants exactly C89, it can just require -std=c89, and then people compiling it with a different standard target are to blame.

Backwards compatibility? I thought that was a target for WG14.

> This is just standard backwards incompatibility, nothing about UB

But UB is insidious and can bite you with implicit compiler settings, like the default to C99 or C11.

> whether the C standard should always be backwards compatible is more interesting, but is a question orthogonal to UB

If it's a target, then it should be.

And on the contrary, UB is not orthogonal to backwards compatibility.

Any UB could have been made implementation-defined and still be backwards compatible. But it's backwards-incompatible to make anything UB that wasn't UB. These count as examples of WG14 screwing over its users.

> I do remember some mention somewhere of a real platform in fact not handling size 0 properly being an argument for reducing the standard requirement.

So WG14 just decides to screw over users from other platforms? Just keep it implementation-defined! It already was! And that's still a concession from the pure defined behavior of C89!

> I had edited one of my parent comments to mention realloc, so if we both landed on the same example, there's probably not that many significant other cases.

I beg to differ. Any case where UB was implicit just because it wasn't defined in the standard could have easily been made implementation-defined instead.

Anytime WG14 adds UB that doesn't need to be UB, it is screwing over users.



> Backwards compatibility? I thought that was a target for WG14.

C23 removed K&R function declarations. Indeed backwards-compatibility is important for them, but it's not the be-all end-all.

Having a standard state exact possible behavior is meaningless if in practice it isn't followed. And it wasn't just implementation-defined, it had a specific set of options for what it could do.

> Any case where UB was implicit just because it wasn't defined in the standard could have easily been made implementation-defined instead. Any UB could have been made implementation-defined and still be backwards compatible. But anything that wasn't UB that now is counts as an example of WG14 screwing over its users.

If this is such a big issue for you, you could just name another example. It'd take, like, 5 words to say another feature in question unnecessarily changed. I'll happily do the research on how it changed over time.

It's clear that you don't like UB, but I don't think you've said anything more than that. I quite like that my compiler will optimize out dead null comparisons or some check that collapses to a 'a + C1 < a' after inlining/constant propagation. I think it's quite neat that not being able to assume signed wrapping means that one can run sanitizers that warn on such, without heaps of false-positives from people doing wrapping arith with it. If anything, I'd want some unsigned types with no unsigned wrapping (though I'd of course still want some way to do wrapping arith where needed)



> Having a standard state exact possible behavior is meaningless if in practice it isn't followed.

No, it means that the bug is documented to be in the platform, not the program.

> If this is such a big issue for you, you could just name another example. It'd take, like, 5 words to say another feature in question unnecessarily changed.

Okay, how about `signal()` being called in a multi-threaded program? Why couldn't they define it in C11 such that it could be called? Obviously, such a thing didn't really exist in C99, but it did in POSIX, and in POSIX, it wasn't, and still isn't, undefined. Why couldn't WG14 have simply made it implementation-defined?

> I quite like that my compiler will optimize out dead null comparisons or some check that collapses to a 'a + C1 < a' after inlining/constant propagation.

I'd rather not be forced to be a superhuman programmer.



> No, it means that the bug is documented to be in the platform, not the program.

Yes, it means that the platform is buggy, but that doesn't help anyone wanting to write portable-in-practice code. The standard specifying specific behavior is just giving a false sense of security.

> Okay, how about `signal()` being called in a multi-threaded program? Why couldn't they define it in C11 such that it could be called?

This is even more definitely not a case of compiler developer conflict of interest. And it's not a case of previously-defined behavior changing, so that set remains still at just realloc. (I wouldn't be surprised if there are more, but if it's not a thing easily listed off I find it hard to believe it's a real significant worry)

But POSIX defines it anyway; and as signals are rather pointless without platform-specific assumptions, it's not like it matters for portability. Honestly, having signals as-is in the C standard feels rather useless to me in general. And 'man 2 signal' warns to not use 'signal()', recommending the non-standard sigaction instead.

And, as far as I can tell, implementation-defined vs undefined barely matters, given that a platform may choose to define the implementation-defined thing as doing arbitrary things anyway, or, conversely, indeed document specific behavior for undefined things. The most significant thing I can tell from the wording is that implementation-defined requires the behavior to be documented, but I am fairly sure there are many C compilers that don't document everything implementation-defined.

> I'd rather not be forced to be a superhuman programmer.

All you have to do is not use signed integers for doing modular/bitwise arithmetic just as much as you don't use integers for doing floating-point arithmetic. It's not much to ask. And the null pointer thing isn't even an issue for userspace code (i.e. what 99.99% of programmers write).
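A small example of that split, assuming a hash function as the use case (FNV-1a constants are used purely for illustration):

    #include <stdint.h>

    /* Wrapping is well-defined for unsigned types, so code that wants
       modular arithmetic (hashes, checksums, ring-buffer indices) can
       use uint32_t instead of int and avoid the UB entirely. */
    uint32_t fnv1a_step(uint32_t hash, uint8_t byte)
    {
        return (hash ^ byte) * 16777619u;   /* wraps mod 2^32 by definition */
    }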

I do think configuring behavior of various things should be more prevalent & nicer to do; even in cases where a language/platform does define specific behavior, it may nevertheless be undesired (e.g. a+1



> if your battery caught fire just because your CRUD app dereferenced NULL, nobody (well, nobody sane) would point the finger at the app author for forgetting to check for NULL.

I think pretty much anyone sane would and would be right to do so. Incorrect code is, well, incorrect and safety critical code shouldn’t use UB. Plus, it’s your duty as a software producer to use an appropriate toolchain and validate the application produced. You can’t offload the responsibility of your failure to do so to a third party (doesn’t stop people for trying all the time with either their toolchains or a library they use but that shouldn’t be tolerated and be pointed as the failure to properly test and validate it is).

I would be ashamed if fingers were pointed towards a compiler provider there unless said provider certified that its compiler wouldn’t do that and somehow lied (but even then, still a testing failure on the software producer part).



> I think pretty much anyone sane would and would be right to do so. Incorrect code is, well, incorrect and safety critical code shouldn’t use UB

You missed the whole point of the example. I gave CRUD app as an example for a reason. We weren't talking safety-critical code like battery firmware here.



Because your exemple isn’t credible. But even then I don’t think I missed the point, no. You are responsible for what your application does (be it a CRUD app or any others). If it causes damage because you fail to test properly, it is your responsibility. The fact that so many programmers fail to grasp this - which is taken as evidence in pretty much any other domain - is why the current quality of the average piece of software is so low.

Anyway, I would like to know by which magic you think a CRUD app could burn a battery? There is a whole stack of systems to prevent that from ever happening.



> There is a whole stack of systems to prevent that from ever happening.

You've almost got the point your parent is trying to make. That the supply chain shares this responsibility, as they said.

> I would like to know by which magic you think a CRUD app could burn a battery?

I don't know about batteries, but there was a time when Dell refused to honour their warranty on their Inspiron series laptops if they found VLC to be installed. Their (utterly stupid) reasoning? That VLC allows the user to raise the (software) volume higher than 100%. It was their own damn fault for using poor quality speakers and not limiting allowable current through them in their (software or hardware) drivers.



> You've almost got the point your parent is trying to make. That the supply chain shares this responsibility, as they said.

Deeply disagree. Failsafe doesn’t magically remove your responsibility.

I’m so glad I started my career in a safety critical environment with other engineers working on the non software part. The amount of software people who think they can somehow absolve themselves of all responsibility for shipping garbage still shock me after 15 years in the field.

> It was their own damn fault for using poor quality speakers

Yes, exactly, I’m glad to see we actually agree. It’s Dell’s fault - not the speaker manufacturer’s fault, not the subcontractor who designed the sound part’s fault - Dell’s fault because they are the one who actually shipped the final product.



I think the author knows very well what UB is and means. But he’s thinking critically about the whole system.

UB is meant to add value. It’s possible to write a language without it, so why do we have any UB at all? We do because of portability and because it gives flexibility to compilers writers.

The post is all about whether this flexibility is worth it when compared with the difficulty of writing programs without UB.

The author makes the case that (1) there seems to be more money lost on bugs than money saved by faster generated code and (2) there's an unwillingness to do something about it because compiler writers have a lot of weight when it comes to what goes into language standards.



Even stipulating that part of the argument, the author then goes on a tear about optimizations breaking constant-time evaluation, which doesn’t have anything to do with UB.

The real argument seems to be that C compilers had it right when they really did embody C as portable assembly, and everything that’s made that mapping less predictable has been a regression.



But C never had been portable assembly.

Which I think is somewhat the core of the problem: people treating things in C in ways they just are not. Whether that is C as portable assembly, or the "it's just bits in memory" view of things (which is often doubly wrong, ignoring stuff like hardware caching). Or stuff like writing constant-time code based on assuming that the compiler probably, hopefully can't figure out that it can optimize something.

> The real argument seems to be that C compilers had it right when they really did embody C as portable assembly

But why would you use such a C? Such a C would be slow compared to its competition while still prone to problematic bugs. At the same time, people often seem to forget that part of UB is rooted in different hardware doing different things, including having behavior in some cases which isn't just a register/memory address having an "arbitrary value" but is more similar to C UB (like e.g. when it involves CPU caches).



The full quote is:

> Although it strove to give programmers the opportunity to write truly portable programs, the C89 Committee did not want to force programmers into writing portably, to preclude the use of C as a “high-level assembler:” the ability to write machine-specific code is one of the strengths of C. It is this principle which largely motivates drawing the distinction between strictly conforming program and conforming program (§4).

This doesn't say that C is a high-level assembly.

It just says that the committee doesn't (at that point in time) want to force the usage of "portable" C as a means to prevent the usage of C as a high-level assembler. But just because some people use something as a high-level assembler doesn't mean it is high-level assembly (like I did use a spoon as a fork once; it's still a spoon).

Furthermore, the fact that they explicitly mention forcing portable C with the term "to preclude" and not "to break compatibility" or similar I think says a lot about whether or not the committee thought of C as high-level assembly.

Most importantly the quote is about the process of making the first C standard which had to make sure to ease the transition from various non standardized C dialects to "standard C" and I'm pretty sure that through the history there had been C dialects/compiler implementations which approached C as high level assembly, but C as in "standard C" is not that.



It specifically says that the use of C as a "portable assembler" is a use that the standards committee does not want to preclude.

Not sure how much clearer this can be.



That statement means the committee does not want to stop it from being developed. The question is, has it? They mean a specific implementation could work as a portable assembler, mirroring djb's request for an 'unsurprising' C compiler. Another interpretation would be in the context of CompCert, which has been developed to achieve semantic preservation between assembly and its source. Interestingly, this of course hints at verifying an assembled snippet coming from some other source as well. Then that alternate source for the critical functions frees the rest of the compiler internals from the problems of preserving constant-timeness and leak-freedom through their passes.



No.

C already existed prior to the ANSI standardization process, so there was nothing "to be developed", though a few changes were made to the language, in particular function prototypes.

C was being used in this fashion, and the ANSI standards committee made it clear that it wanted the standard to maintain that use-case.



These are aspirational statements, not a factual judgment of what that standard or its existing implementations actually are. At least they do not cover all implementations, nor define precisely what they cover. Note the immediate next statement: "C code can be non-portable."

In my opinion, C has tried to serve two masters and they made a screw-hammer in the process.

The rest of the field has moved on significantly. We want portable behavior, not implementation-defined vomit that will leave you doubting whether porting introduces new UB paths that you haven't already fully checked against (by, e.g. varying the size of integers in such a way some promotion is changed to something leading to signed overflow; or bounds checking is ineffective).

The paragraph further down about explicitly and swiftly rejecting a validation test suite should also read as a warning. Not only would proposing to do modern software development without a test suite get you swiftly fired today, but they're explicitly acknowledging the insurmountable difficulties in producing any code with consistent cross-implementation behavior. But in the time since then, other languages have demonstrated you can reap many of the advantages of close-to-the-metal without compromising on consistent cross-target behavior, at least for many relevant real-world cases.

They really knew what they were building, a compromise. But that gets cherry-picked into absurdity such as stating C is portable in present-tense or that any inherent properties make it assembly-like. It's neither.



These are statements of intent. And the intent is both stated explicitly and also very clear in the standard document that the use as a "portable assembler" is one of the use cases that is intended and that the language should not prohibit.

That does not mean that C is a portable assembly language to the exclusion of everything and anything else, but it also means the claim that it is definitely in no way a portable assembly language at all is also clearly false. Being a portable assembly (and "high level" for the time) is one of the intended use-cases.

> In my opinion, C has tried to serve two masters and they made a screw-hammer in the process.

Yes. The original intent for which it was designed and in which role it works well.

> The rest of the field has moved on significantly. We want portable behavior, not implementation-defined vomit that will leave you doubting whether porting introduces new UB paths that you haven't already fully checked against

Yes, that's the "other" direction that deviates from the original intent. In this role, it does not work well, because, as you rightly point out, all that UB/IB becomes a bug, not a feature.

For that role: pick another language. Because trying to retrofit C to not be the language it is just doesn't work. People have tried. And failed.

Of course what we have now is the worst of both worlds: instead of either (a) UB serving its original purpose of letting C be a fairly thin and mostly portable shell above the machine, or (b) eliminating UB in order to have stable semantics, compiler writers have chosen (c): exploiting UB for optimization.

Now these optimizations alter program behavior, sometimes drastically and even impacting safety (for example by eliminating bounds checks that the programmer explicitly put in!), despite the fact that the one cardinal rule of program optimization is that it must not alter program behavior (except for execution speed).
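
A minimal, commonly cited sketch of that kind of behavior change (exact results depend on compiler and flags): the programmer's guard relies on signed wrap-around, which is UB, so an optimizer is entitled to conclude the condition is always false and delete the check.

    #include <stdlib.h>

    int increment_checked(int x) {
        if (x + 1 < x) {   /* intended overflow guard */
            abort();       /* typical optimizers remove this branch,
                              because signed overflow "cannot happen" */
        }
        return x + 1;
    }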

The completely schizophrenic "reasoning" for why this altering of program behavior is somehow OK is that, at the same time as we use UB to optimize all over the place, we are also free to assume that UB cannot and never does happen. This despite the fact that it is demonstrably untrue: UB is all over the C standard and all over real-world code, and it is used for optimization purposes while supposedly not existing.

> They really knew what they were building, a compromise.

Exactly. And for the last 3 decades or so people have been trying unsuccessfully to unpick that compromise. And the result is awful.

The interests driving this are also pretty clear. On the one hand, a few megacorps for whom the tradeoff of making code inscrutable and unmanageable for The Rest of Us™ is completely worth it as long as it shaves 0.02% off the running time of code they run across tens or hundreds of data centers and who knows how many machines. On the other hand, compiler researchers and open-source compiler engineers who are mostly financed by those same megacorps (the joy of open source!) and for whom there is little PhD-worthy or paid work to do outside of that constellation.

I used to pay for my C compiler, thus there was a vendor and I was their customer and they had a strong interest in not pissing me off, because they depended on me and my ilk for their livelihood. This even pre-dated the first ANSI-C standard, so all the compiler's behavior was UB. They still didn't pull any of the shenanigans that current C compilers do.



Back in 1989, when C abstract machine semantics were closer to being a portable macro processor, and stuff like the register keyword was actually something compilers cared about.



And even then there was no notion of constant-time being observable behavior to the compiler. You cannot write reliably constant-time code in C because execution time is not a property the C language includes in its model of computation.
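
As an illustration of that point, here is a common constant-time selection idiom (a sketch, not code from the thread): nothing in the language obliges a compiler to keep it branch-free, because timing is simply not part of the semantics it must preserve.

    #include <stdint.h>

    /* mask is expected to be all-ones (select a) or all-zeros (select b).
     * A compiler that figures this out is free to compile it to a branch
     * or to a conditional move; the C source cannot demand either. */
    uint32_t ct_select(uint32_t mask, uint32_t a, uint32_t b) {
        return (mask & a) | (~mask & b);
    }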



But having a straightforward/predictable mapping to the underlying machine and its semantics is included in the C model of computation.

And that is not just compatible with the C "model of computation" otherwise being quite incomplete; the two properties are really two sides of the same coin.

The whole idea of an "abstract C machine" that unambiguously and completely specifies behavior is a fiction.



> But having a straightforward/predictable mapping to the underlying machine and its semantics is included in the C model of computation.

While you can often guess what the assembly will be from looking at C code given that you're familiar with the compiler, exactly how C is to be translated into assembly isn't well-specified.

For example, you can't expect that every use of the multiplication operator "*" results in an actual x86 mul instruction. Many users expect constant propagation, so you can write something like "2 * SOME_CONSTANT" without computing that value at runtime; there is no guarantee of this behavior, though. Also, for unsigned integers, when optimizations are turned on, many expect compilers to emit left-shift instructions when multiplying by a constant power of two, but again, there's no guarantee of this. That's not to say this behavior couldn't be part of a specification, but right now it's just an informal expectation.
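
A small sketch of those informal expectations (SOME_CONSTANT and the function names are made up): typical compilers fold the first expression at compile time and turn the second multiplication into a shift, but neither transformation is required by the standard.

    #include <stdint.h>

    #define SOME_CONSTANT 21u

    uint32_t doubled(void) {
        return 2u * SOME_CONSTANT;   /* usually folded to 42 at compile time */
    }

    uint32_t times_eight(uint32_t x) {
        return x * 8u;               /* usually emitted as a left shift by 3 */
    }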

What I think people might want is some readable, well-defined set of attribute grammars[0] for translation of C into assembly for varying optimization levels - then, you really would be able to know exactly how some piece of C code under some context would be translated into assembly. They've already been used for writing code generator generators in compilers, but what I'm thinking is something more abstract, not as concrete as a code generation tool.

[0]: https://en.wikipedia.org/wiki/Attribute_grammar



> exactly how C is to be translated into assembly isn't well-specified.

Exactly! It's not well-specified so the implementation is not prevented from doing a straightforward mapping to the machine by some part of the spec that doesn't map well to the actual machine.



> But having a straightforward/predictable mapping to the underlying machine and its semantics is included in the C model of computation.

Not really, or at least not in a way that would count as "high-level assembler". If it did, the majority of optimizations compilers do today would not be standard-conforming.

Like there is a mapping to behavior but not a mapping to assembly.

Which is where the abstract C machine comes in: a hypothetical machine formed from the rules of the standard, kind of a mental model that runs the behavior mappings instead of running any specific assembly. But that machine not being unambiguous and complete doesn't change anything about C not being high-level assembly; if anything, it makes C even less of a high-level assembler.



So you can easily tell, just by looking at the C source code, whether plain assembly instructions from the four books of the ISA manual are being used, whether the compiler is able to automatically vectorize a code region (and which flavour of vector instructions it picks), or whether it replaces a specific math code pattern with a single opcode.



Nobody says that implementation-defined behavior must be sane or safe. The crux of the issue is that a compiler can assume that UB never happens, while IB is at least allowed to happen. Does anyone have an example where the assumption that UB never happens actually makes the program faster and better, compared to treating UB like IB?
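
One commonly cited illustration, offered only as a sketch of the kind of example being asked about: when signed overflow is UB, a compiler may fold the expression below to "return x;", because in every non-UB execution the two are equal. Under wrapping (IB-like) semantics, for instance with -fwrapv, x * 2 can wrap, the identity no longer holds, and both operations have to be kept.

    int halve_double(int x) {
        return (x * 2) / 2;   /* may be simplified to "x" only if overflow is UB */
    }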



The issue is that you’d have to come up with and agree on an alternative language specification without (or with less) UB. Having the compiler implementation be the specification is not a solution. And such a newly agreed specification would invariably either turn some previously conforming programs nonconforming, or reduce performance in relevant scenarios, or both.

That’s not to say that it wouldn’t be worth it, but given the multitude of compiler implementations and vendors, and the huge amount of existing code, it’s a difficult proposition.

What traditionally has been done, is either to define some “safe” subset of C verified by linters, or since you probably want to break some compatibility anyway, design a separate new language.



> UB is meant to add value. It’s possible to write a language without it, so why do we have any UB at all? We do because of portability and because it gives flexibility to compilers writers.

Implementation-defined behavior is here for portability for valid code. Undefined behavior is here so that compilers have leeway with handling invalid conditions (like null pointer dereference, out-of-bounds access, integer overflows, division by zero ...).

What does it mean for a language not to have UB? There are several ways to handle invalid conditions:

1) eliminate them at compile time - this is optimal, but currently practical only for some classes of errors.

2) have consistent, well-defined behavior for them - platforms may have vastly different ways of handling invalid conditions

3) have consistent, implementation-defined behavior for them - usable for some classes of errors (integer overflow, division by zero), but for others it would add extensive runtime overhead (a sketch of this approach follows the list)

4) have inconsistent behavior (UB) - the C way
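
As a minimal sketch of what options 2) or 3) can look like for one class of invalid condition, using the GCC/Clang __builtin_add_overflow extension (not standard C; the function name is made up): the overflow is detected and given one consistent, defined outcome instead of being left undefined.

    #include <stdio.h>
    #include <stdlib.h>

    int add_or_die(int a, int b) {
        int result;
        /* returns nonzero if a + b overflowed int */
        if (__builtin_add_overflow(a, b, &result)) {
            fprintf(stderr, "integer overflow\n");
            exit(EXIT_FAILURE);
        }
        return result;
    }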
