I work at Red Hat on GCC, the GNU Compiler Collection. I spent most of the past year working on how GCC emits diagnostics (errors and warnings) in the hope of making it easier to use. Let's take a look at 6 improvements to look forward to in the upcoming GCC 15.
1. Prettier execution paths
I added a static analyzer to GCC in GCC 10 that prints visualizations of predicted paths of execution through the user's code, demonstrating each problem it predicts.
Here's an example that shows some of the improvements I've made to this in GCC 15:
infinite-loop-linked-list.c: In function ‘while_loop_missing_next’:
infinite-loop-linked-list.c:30:10: warning: infinite loop [CWE-835] [-Wanalyzer-infinite-loop]
30 | while (n)
| ^
‘while_loop_missing_next’: events 1-3
30 | while (n)
| ^
| |
| (1) ⚠️ infinite loop here
| (2) when ‘n’ is non-NULL: always following ‘true’ branch... ─>─┐
| │
| │
|┌────────────────────────────────────────────────────────────────────────┘
31 |│ {
32 |│ sum += n->val;
|│ ~~~~~~
|│ |
|└─────────────>(3) ...to here
‘while_loop_missing_next’: event 4
32 | sum += n->val;
| ~~~~^~~~~~~~~
| |
| (4) looping back... ─>─┐
| │
‘while_loop_missing_next’: event 5
| │
|┌─────────────────────────────────┘
30 |│ while (n)
|│ ^
|│ |
|└────────>(5) ...to here
I've added a warning emoji (⚠️) to the event in the path where the problem occurs (event 1 in the above), and I've added "ASCII art" to show control flow, such as the lines connecting events 2 and 3, and those connecting events 4 and 5 (compare with the GCC 14 output seen here).
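The source behind this warning isn't shown above, so here is a minimal sketch of the kind of loop involved, written in its corrected form. The struct layout and the name sum_list are assumptions made for illustration, not the actual test case; only the loop shape (while (n) with sum += n->val) comes from the diagnostic.

```c
#include <stddef.h>

/* Hypothetical node type; the real definition is not shown in the article. */
struct node {
  struct node *next;
  int val;
};

/* Sums the values in a NULL-terminated singly linked list. */
int sum_list(const struct node *n)
{
  int sum = 0;
  while (n)
    {
      sum += n->val;
      /* Without this step 'n' never changes, so a non-NULL 'n' would
         loop forever; that is the pattern the analyzer reports.  */
      n = n->next;
    }
  return sum;
}
```

Dropping the n = n->next; line gives a loop of the shape the analyzer describes in events 1 through 5.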
Another example of an execution path can be seen in this new -fanalyzer warning, -Wanalyzer-undefined-behavior-ptrdiff, which warns about pointer subtractions involving different chunks of memory:
demo.c: In function ‘test_invalid_calc_of_array_size’:
demo.c:9:20: warning: undefined behavior when subtracting pointers [CWE-469] [-Wanalyzer-undefined-behavior-ptrdiff]
9 | return &sentinel - arr;
| ^
events 1-2
│
│ 3 | int arr[42];
│ | ~~~
│ | |
│ | (2) underlying object for right-hand side of subtraction created here
│ 4 | int sentinel;
│ | ^~~~~~~~
│ | |
│ | (1) underlying object for left-hand side of subtraction created here
│
└──> ‘test_invalid_calc_of_array_size’: event 3
│
│ 9 | return &sentinel - arr;
│ | ^
│ | |
│ | (3) ⚠️ subtraction of pointers has undefined behavior if they do not point into the same array object
│
There's a line on the left-hand side that visualizes the stack depth to highlight calls and returns. As of GCC 15 this uses Unicode box-drawing characters (where the locale supports this), and for purely intraprocedural cases like the -Wanalyzer-infinite-loop example above, we now omit it altogether, which saves some visual "noise."
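As for the pointer-subtraction warning itself, here is a small sketch of two well-defined ways to compute the element count of the array. The function names are hypothetical, and only the int arr[42] declaration is taken from the diagnostic above.

```c
#include <stddef.h>

static int arr[42];

/* Well-defined: both pointers refer to the same array object.
   Forming a one-past-the-end pointer is allowed for arithmetic.  */
ptrdiff_t array_size_by_pointers(void)
{
  return (arr + 42) - arr;
}

/* The usual idiom, which needs no pointer subtraction at all.  */
size_t array_size_by_sizeof(void)
{
  return sizeof arr / sizeof arr[0];
}
```

Subtracting arr from &sentinel, by contrast, relies on how two unrelated objects happen to be laid out, which is why the analyzer flags it.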
2. A new look for C++ template errors
Compiler errors involving C++ templates are notoriously difficult to read.
Consider this invalid C++ code:
struct widget {};
void draw (widget &);
struct diagram {};
void draw (diagram &);
template <class W>
concept drawable = requires(W w) { w.draw (); };
template <drawable T>
void draw(T);
int main ()
{
draw (widget ());
}
Attempting to compile it with GCC 14 with -fconcepts -fconcepts-diagnostics-depth=2 gives 34 lines of output. That is relatively simple as far as these go, and even so it can be hard to decipher. I'll post it here for reference, but I confess that my eyes tend to glaze over when I try to read it:
<source>: In function 'int main()':
<source>:15:8: error: no matching function for call to 'draw(widget)'
15 | draw (widget ());
| ~~~~~^~~~~~~~~~~
<source>:2:6: note: candidate: 'void draw(widget&)' (near match)
2 | void draw (widget &);
| ^~~~
<source>:2:6: note: conversion of argument 1 would be ill-formed:
<source>:15:9: error: cannot bind non-const lvalue reference of type 'widget&' to an rvalue of type 'widget'
15 | draw (widget ());
| ^~~~~~~~~
<source>:5:6: note: candidate: 'void draw(diagram&)'
5 | void draw (diagram &);
| ^~~~
<source>:5:12: note: no known conversion for argument 1 from 'widget' to 'diagram&'
5 | void draw (diagram &);
| ^~~~~~~~~
<source>:11:6: note: candidate: 'template<class T> requires drawable<T> void draw(T)'
11 | void draw(T);
| ^~~~
<source>:11:6: note: template argument deduction/substitution failed:
<source>:11:6: note: constraints not satisfied
<source>: In substitution of 'template<class T> requires drawable<T> void draw(T) [with T = widget]':
<source>:15:8: required from here
15 | draw (widget ());
| ~~~~~^~~~~~~~~~~
<source>:8:9: required for the satisfaction of 'drawable<T>' [with T = widget]
<source>:8:20: in requirements with 'W w' [with W = widget]
<source>:8:43: note: the required expression 'w.draw()' is invalid, because
8 | concept drawable = requires(W w) { w.draw (); };
| ~~~~~~~^~
<source>:8:38: error: 'struct widget' has no member named 'draw'
8 | concept drawable = requires(W w) { w.draw (); };
| ~~^~~~
One of the issues is that there is a hierarchical structure to these messages, but we're printing them as a "flat" list, which obscures the meaning.
I've been experimenting with a new way of presenting this information, taking inspiration from Sy Brand's excellent "Concepts Error Messages for Humans" paper. It's not ready to turn on by default in GCC for all C++ users, but it's available in GCC 15 through a command-line option for people who want to try it out: -fdiagnostics-set-output=text:experimental-nesting=yes
Here's the example we just saw, adding the -fdiagnostics-set-output=text:experimental-nesting=yes "cheat code":
demo.cc: In function ‘int main()’:
demo.cc:19:8: error: no matching function for call to ‘draw(widget)’
19 | draw (widget ());
| ~~~~~^~~~~~~~~~~
• there are 3 candidates
• candidate 1: ‘void draw(widget&)’ (near match)
demo.cc:6:6:
6 | void draw (widget &);
| ^~~~
• conversion of argument 1 would be ill-formed:
• error: cannot bind non-const lvalue reference of type ‘widget&’ to an rvalue of type ‘widget’
demo.cc:19:9:
19 | draw (widget ());
| ^~~~~~~~~
• candidate 2: ‘void draw(diagram&)’
demo.cc:9:6:
9 | void draw (diagram &);
| ^~~~
• no known conversion for argument 1 from ‘widget’ to ‘diagram&’
demo.cc:9:12:
9 | void draw (diagram &);
| ^~~~~~~~~
• candidate 3: ‘template<class T> requires drawable<T> void draw(T)’
demo.cc:15:6:
15 | void draw(T);
| ^~~~
• template argument deduction/substitution failed:
• constraints not satisfied
• demo.cc: In substitution of ‘template<class T> requires drawable<T> void draw(T) [with T = widget]’:
• required from here
demo.cc:19:8:
19 | draw (widget ());
| ~~~~~^~~~~~~~~~~
• required for the satisfaction of ‘drawable<T>’ [with T = widget]
demo.cc:12:9:
12 | concept drawable = requires(W w) { w.draw (); };
| ^~~~~~~~
• in requirements with ‘W w’ [with W = widget]
demo.cc:12:20:
12 | concept drawable = requires(W w) { w.draw (); };
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
• the required expression ‘w.draw()’ is invalid, because
demo.cc:12:43:
12 | concept drawable = requires(W w) { w.draw (); };
| ~~~~~~~^~
• error: ‘struct widget’ has no member named ‘draw’
demo.cc:12:38:
12 | concept drawable = requires(W w) { w.draw (); };
| ~~^~~~
This presentation of the errors uses indentation and nested bullet points to show the logical structure of what the compiler is doing, and eliminates redundant "visual noise" where it can. The wording now states clearly that the compiler tried 3 different candidates for the function call, and spells out why each candidate was unsuitable.
I've tried turning this on for my day-to-day C++ work and it feels like a huge improvement. I hope we'll be able to turn it on by default in GCC 16; you can try it yourself here.
3. Machine-readable diagnostics
SARIF is a file format intended for storing the results of static analysis tools in a machine-readable, interchangeable format; as such, it's a great fit for compiler diagnostics. I added support in GCC 13 for writing out GCC's diagnostics in SARIF form, but it was an all-or-nothing deal: you could choose either GCC's classic text output on stderr, or SARIF, but not both.
For GCC 15, I've reworked the insides of how we handle diagnostics so that there can be multiple "output sinks," and added a new command-line option, -fdiagnostics-add-output=ARGS, for adding new sinks. For example, using -fdiagnostics-add-output=sarif will get you diagnostics emitted both as text on stderr and in SARIF form to a file.
Various sub-options are available; for example, -fdiagnostics-add-output=sarif:version=2.2-prerelease will select SARIF 2.2 output for that sink. We're still working on the SARIF 2.2 specification, so this uses an unofficial draft and is subject to change.
I've also improved the SARIF that GCC emits. The output now captures all locations and labeled source ranges associated with a diagnostic. For example, for:
PATH/missing-semicolon.c: In function 'missing_semicolon':
PATH/missing-semicolon.c:9:12: error: expected ';' before '}' token
9 | return 42
| ^
| ;
10 | }
| ~
the GCC SARIF output now captures the secondary location (that of the trailing close brace), as well as that of the missing semicolon. Similarly, for:
bad-binary-ops.c: In function ‘bad_plus’:
bad-binary-ops.c:64:23: error: invalid operands to binary + (have ‘S’ {aka ‘struct s’} and ‘T’ {aka ‘struct t’})
64 | return callee_4a () + callee_4b ();
| ~~~~~~~~~~~~ ^ ~~~~~~~~~~~~
| | |
| | T {aka struct t}
| S {aka struct s}
the SARIF output captures those underlined ranges and their labels.
GCC's SARIF output now captures the command-line arguments (§3.20.2), timestamps for the start and end of compilation (§§3.20.7-8), and the working directory (§3.20.19). It also now sets the roles property for SARIF artifact objects (§3.24.6) and captures any embedded URLs in the text of messages (§3.11.6). For diagnostics relating to header files, the SARIF output now captures the chain of #include directives that led to the diagnostic's location (using SARIF locationRelationship objects, §3.34).
As well as improving the SARIF that GCC produces, I've added a tool to GCC 15 for consuming SARIF: sarif-replay. This is a simple command-line tool for viewing .sarif files, showing ("replaying") any diagnostics found in the .sarif files in text form as if they were GCC diagnostics, with support for details such as quoting source code, underlined ranges, fix-it hints, and diagnostic paths.
For example, here's a replay of a .sarif file emitted by GCC's -fanalyzer static analysis option:
$ sarif-replay signal-warning.sarif
In function 'custom_logger':
signal.c:13:3: warning: call to ‘fprintf’ from within signal handler [-Wanalyzer-unsafe-call-within-signal-handler]
13 | fprintf(stderr, "LOG: %s", msg);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
'main': events 1-2
│
│ 21 | int main(int argc, const char *argv)
│ | ^~~~
│ | |
│ | (1) entry to ‘main’
│......
│ 25 | signal(SIGINT, handler);
│ | ~~~~~~~~~~~~~~~~~~~~~~~
│ | |
│ | (2) registering ‘handler’ as signal handler
│
event 3
│
│GNU C17:
│ (3): later on, when the signal is delivered to the process
│
└──> 'handler': events 4-5
│
│ 16 | static void handler(int signum)
│ | ^~~~~~~
│ | |
│ | (4) entry to ‘handler’
│ 17 | {
│ 18 | custom_logger("got signal");
│ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~
│ | |
│ | (5) calling ‘custom_logger’ from ‘handler’
│
└──> 'custom_logger': events 6-7
│
│ 11 | void custom_logger(const char *msg)
│ | ^~~~~~~~~~~~~
│ | |
│ | (6) entry to ‘custom_logger’
│ 12 | {
│ 13 | fprintf(stderr, "LOG: %s", msg);
│ | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
│ | |
│ | (7) call to ‘fprintf’ from within signal handler
│
4. An easier transition to C23
When compiling C code, GCC 14 and earlier defaulted to -std=gnu17 (i.e., the "C17" version of the C standard, plus extensions). GCC 15 now defaults to -std=gnu23 (based on C23), so if your build system isn't specifying which C version to use, you might run into C17 versus C23 incompatibilities.
I attempted to rebuild a large subset of Fedora with GCC 15 (and thus with C23 as the default) to shake out the problems that would arise, and in doing so found various diagnostics that needed improving. For example, bool, true, and false are keywords in C23, so I've tweaked the error message that occurs on old code such as:
typedef int bool;
so that you immediately know it's a C23 compatibility problem:
<source>:1:13: error: 'bool' cannot be defined via 'typedef'
1 | typedef int bool;
| ^~~~
<source>:1:13: note: 'bool' is a keyword with '-std=c23' onwards
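One way to update such code, sketched here as a suggestion rather than as anything GCC itself prescribes, is to drop the typedef and include <stdbool.h>. From C99 through C17 that header supplies bool, true, and false; in C23 they are keywords and the include is harmless. The function is_even is a hypothetical example.

```c
#include <stdbool.h>  /* Supplies bool/true/false before C23; harmless in C23. */

/* Hypothetical helper, just to exercise the type in either dialect. */
bool is_even(int x)
{
  return x % 2 == 0;
}
```

Note that this bool is a real Boolean type rather than an int, so code that depended on the old typedef being an int may need a closer look.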
Similarly, C17 and C23 disagree about function declarations with an empty parameter list, such as int foo(); In C23, this is equivalent to int foo(void); whereas in earlier versions of C such a declaration plays fast and loose with the type system, essentially meaning "we don't know how many parameters this function takes or their types; let's hope your code is correct!" In my tests, this led to lots of errors on old code, such as in this example:
#include <signal.h>
void test()
{
void (*handler)();
handler = signal(SIGQUIT, SIG_IGN);
}
So I've clarified these error messages so that they tell you the types of the functions (or function pointers) and show you where the pertinent typedef is:
<source>: In function 'test':
<source>:6:11: error: assignment to 'void (*)(void)' from incompatible pointer type '__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-pointer-types]
6 | handler = signal(SIGQUIT, SIG_IGN);
| ^
In file included from <source>:1:
/usr/include/signal.h:72:16: note: '__sighandler_t' declared here
72 | typedef void (*__sighandler_t) (int);
| ^~~~~~~~~~~~~~
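A fix for code like this is to give the function pointer a full prototype matching what signal() returns. The sketch below assumes a POSIX system (for SIGQUIT); the typedef name handler_fn and the function ignore_sigquit are hypothetical, chosen only for illustration.

```c
#include <signal.h>

/* Hypothetical name for "pointer to function taking int, returning void",
   the handler type that signal() accepts and returns.  */
typedef void (*handler_fn)(int);

/* Ignores SIGQUIT and returns the previous disposition.
   This is valid in both C17 and C23.  */
handler_fn ignore_sigquit(void)
{
  handler_fn previous = signal(SIGQUIT, SIG_IGN);
  return previous;
}
```

Because the parameter list is spelled out, the declaration means the same thing in both dialects.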
Similarly, I've clarified the C front end's error messages for bad call sites:
struct p { int (*bar)(); };
void baz() {
struct p q;
q.bar(1);
}
t.c: In function 'baz':
t.c:7:5: error: too many arguments to function 'q.bar'; expected 0, have 1
7 | q.bar(1);
| ^ ~
t.c:2:15: note: declared here
2 | int (*bar)();
| ^~~
showing the expected parameter count versus the actual argument count, underlining the first extraneous argument at the call site, and showing the pertinent field declaration of the callback.
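The corresponding fix at the declaration is to state the callback's parameters explicitly. This is a minimal sketch: the int parameter type is inferred from the q.bar(1) call above, and the callback twice is a hypothetical addition so the example can run (the original leaves q uninitialized).

```c
/* The field's prototype now says the callback takes one int. */
struct p { int (*bar)(int); };

/* Hypothetical callback, present only to make the sketch runnable. */
static int twice(int x)
{
  return 2 * x;
}

int baz(void)
{
  struct p q = { .bar = twice };
  return q.bar(1);  /* One argument, matching the prototype. */
}
```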
5. A revamped color scheme
GCC will use color when emitting its text messages on stderr at a suitably modern terminal, using a few colors that seem to work well in a number of different terminal themes—but the exact rules for choosing which color to use for each aspect of the output have been rather arbitrary.
For GCC 15, I've gone through C and C++'s errors, looking for places where two different things in the source are being contrasted, such as type mismatches. These diagnostics now use color to visually highlight and distinguish the differences.
For example, this error (Figure 1) shows an invalid use of the binary + operator: the two operands are structs (via the typedefs S and T) rather than numeric types.
The two significant things here are the types, so GCC 15 uses two colors to consistently highlight the different types: in the message itself, in the quoted source, and in the labels. Here, the left-hand type (typedef struct s S;) is shown throughout in green and the right-hand type (typedef struct t T;) in dark blue. I hope this approach better ties together the message's text with the source code and makes such errors a little easier to figure out.
6. libgdiagnostics
Figure 1 above shows off some of the features that GCC's diagnostics subsystem has: code for colorization, quoting source code, labelling ranges of source, fix-it hints, execution paths, SARIF output, and so on. Previously this code was hidden inside GCC and only usable by GCC itself.
For GCC 15, I've made this functionality available as a shared library for other projects to use: libgdiagnostics. There is a C API, along with C++ and Python bindings. For example, I was able to use the Python bindings to write a "linting" script for our testsuite, and "for free" got source quoting, colorization, and fix-it hints. Adding SARIF output to this script would be a one-liner, rather than requiring lots of JSON-handling code.
Try GCC 15
We're still fixing bugs, but we hope that GCC 15 will be ready to officially release (as 15.1) sometime later this month. With my "downstream" hat on, we're already using the prerelease (GCC 15.0) within Fedora 42 Beta.
Finally, you can use the excellent Compiler Explorer site to play with the new compiler. Have fun!