Usability Improvements in GCC 15

Original link: https://developers.redhat.com/articles/2025/04/10/6-usability-improvements-gcc-15

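GCC 15 brings several improvements to its diagnostics, aimed at a better user experience.
**1. Clearer execution paths:** The static analyzer now uses a warning emoji and ASCII art to show control flow more clearly, highlighting the problem area and reducing visual noise.
**2. Improved C++ template errors:** A new experimental output format (-fdiagnostics-set-output=text:experimental-nesting=yes) uses indentation and bullet points to present hierarchical error messages more logically, making complex template errors easier to understand.
**3. Machine-readable diagnostics:** The enhanced SARIF output captures all locations, source ranges, command-line arguments, and #include chains. The new sarif-replay tool displays SARIF diagnostics as formatted GCC output, including colorization and fix-it hints.
**4. An easier transition to C23:** Error messages for C17/C23 incompatibilities have been clarified, with more context and guidance.
**5. An improved color scheme:** Diagnostics now use contrasting colors to highlight differences in the source, such as type mismatches, which improves readability.
**6. libgdiagnostics:** GCC's diagnostics features (colorization, source quoting, SARIF output) are now available as a shared library for other projects to use.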

Original article

I work at Red Hat on GCC, the GNU Compiler Collection. I spent most of the past year working on how GCC emits diagnostics (errors and warnings) in the hope of making it easier to use. Let's take a look at 6 improvements to look forward to in the upcoming GCC 15.

1. Prettier execution paths

I added a static analyzer to GCC in GCC 10 that prints visualizations of predicted paths of execution through the user's code, demonstrating each problem it predicts.

Here's an example that shows some of the improvements I've made to this in GCC 15:

infinite-loop-linked-list.c: In function ‘while_loop_missing_next’:
infinite-loop-linked-list.c:30:10: warning: infinite loop [CWE-835] [-Wanalyzer-infinite-loop]
   30 |   while (n)
      |          ^
  ‘while_loop_missing_next’: events 1-3
   30 |   while (n)
      |          ^
      |          |
      |          (1) ⚠️  infinite loop here
      |          (2) when ‘n’ is non-NULL: always following ‘true’ branch... ─>─┐
      |                                                                         │
      |                                                                         │
      |┌────────────────────────────────────────────────────────────────────────┘
   31 |│    {
   32 |│      sum += n->val;
      |│             ~~~~~~
      |│              |
      |└─────────────>(3) ...to here
  ‘while_loop_missing_next’: event 4
   32 |       sum += n->val;
      |       ~~~~^~~~~~~~~
      |           |
      |           (4) looping back... ─>─┐
      |                                  │
  ‘while_loop_missing_next’: event 5
      |                                  │
      |┌─────────────────────────────────┘
   30 |│  while (n)
      |│         ^
      |│         |
      |└────────>(5) ...to here

I've added a warning emoji (⚠️) to the event in the path where the problem occurs (event 1 in the above), and I've added "ASCII art" to show control flow, such as the lines connecting events 2 and 3, and those connecting events 4 and 5 (compare with the GCC 14 output seen here).
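
For readers who want to see what kind of code produces this warning, here is a minimal sketch of the pattern and one way to fix it. The article's source file is not shown, so the struct node layout and the sum_list name are assumptions made for illustration; only the loop shape (while (n) with sum += n->val and no update of n) comes from the diagnostic above.

```c
#include <stddef.h>

/* Hypothetical node type; the article does not show the real one. */
struct node {
  struct node *next;
  int val;
};

/* Shape of the loop the analyzer flagged: 'n' never changes, so a
   non-NULL 'n' means the loop can never terminate.

   while (n)
     {
       sum += n->val;
     }
*/

/* Fixed traversal: advance 'n' on every iteration. */
int sum_list(const struct node *n)
{
  int sum = 0;
  while (n)
    {
      sum += n->val;
      n = n->next;   /* the step missing from the flagged loop */
    }
  return sum;
}
```

With the n = n->next step in place, the loop condition can become false, so the analyzer should no longer have an infinite path to report.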

Another example of an execution path can be seen in this new -fanalyzer warning -Wanalyzer-undefined-behavior-ptrdiff, which warns about pointer subtractions involving different chunks of memory:

demo.c: In function ‘test_invalid_calc_of_array_size’:
demo.c:9:20: warning: undefined behavior when subtracting pointers [CWE-469] [-Wanalyzer-undefined-behavior-ptrdiff]
    9 |   return &sentinel - arr;
      |                    ^
  events 1-2
    │
    │    3 | int arr[42];
    │      |     ~~~
    │      |     |
    │      |     (2) underlying object for right-hand side of subtraction created here
    │    4 | int sentinel;
    │      |     ^~~~~~~~
    │      |     |
    │      |     (1) underlying object for left-hand side of subtraction created here
    │
    └──> ‘test_invalid_calc_of_array_size’: event 3
           │
           │    9 |   return &sentinel - arr;
           │      |                    ^
           │      |                    |
           │      |                    (3) ⚠️  subtraction of pointers has undefined behavior if they do not point into the same array object
           │

There's a line on the left-hand side that visualizes the stack depth to highlight calls and returns. As of GCC 15 this now uses Unicode box-drawing characters (where the locale supports this), and for purely intraprocedural cases like the -Wanalyzer-infinite-loop example above, we now omit it altogether, which saves some visual "noise."
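
As a rough illustration of the -Wanalyzer-undefined-behavior-ptrdiff case, the sketch below keeps the two globals from the diagnostic and shows two well-defined ways to get the element count. The function names array_size and array_size_by_subtraction are hypothetical, and this is only one plausible reading of what the flagged code intended.

```c
#include <stddef.h>

int arr[42];
int sentinel;

/* Flagged in the article: '&sentinel' and 'arr' point into different
   objects, so '&sentinel - arr' has undefined behavior, even if the
   linker happens to place 'sentinel' right after 'arr'. */

/* Well-defined: derive the element count from the array type. */
size_t array_size(void)
{
  return sizeof(arr) / sizeof(arr[0]);
}

/* Also well-defined: both operands point into, or one past the end
   of, the same array object. */
ptrdiff_t array_size_by_subtraction(void)
{
  return (arr + 42) - arr;
}
```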

2. A new look for C++ template errors

Compiler errors involving C++ templates are notoriously difficult to read.

Consider this invalid C++ code:

struct widget {};
void draw (widget &);

struct diagram {};
void draw (diagram &);

template <class W>
concept drawable = requires(W w) { w.draw (); };

template <drawable T>
void draw(T);

int main ()
{
  draw (widget ());
}

Attempting to compile it with GCC 14 with -fconcepts -fconcepts-diagnostics-depth=2 gives 34 lines of output. That is relatively simple as far as these go, yet even this can be hard to decipher. I'll post it here for reference, but I confess that my eyes tend to glaze over when I try to read it:

<source>: In function 'int main()':
<source>:15:8: error: no matching function for call to 'draw(widget)'
   15 |   draw (widget ());
      |   ~~~~~^~~~~~~~~~~
<source>:2:6: note: candidate: 'void draw(widget&)' (near match)
    2 | void draw (widget &);
      |      ^~~~
<source>:2:6: note:   conversion of argument 1 would be ill-formed:
<source>:15:9: error: cannot bind non-const lvalue reference of type 'widget&' to an rvalue of type 'widget'
   15 |   draw (widget ());
      |         ^~~~~~~~~
<source>:5:6: note: candidate: 'void draw(diagram&)'
    5 | void draw (diagram &);
      |      ^~~~
<source>:5:12: note:   no known conversion for argument 1 from 'widget' to 'diagram&'
    5 | void draw (diagram &);
      |            ^~~~~~~~~
<source>:11:6: note: candidate: 'template<class T>  requires  drawable<T> void draw(T)'
   11 | void draw(T);
      |      ^~~~
<source>:11:6: note:   template argument deduction/substitution failed:
<source>:11:6: note: constraints not satisfied
<source>: In substitution of 'template<class T>  requires  drawable<T> void draw(T) [with T = widget]':
<source>:15:8:   required from here
   15 |   draw (widget ());
      |   ~~~~~^~~~~~~~~~~
<source>:8:9:   required for the satisfaction of 'drawable<T>' [with T = widget]
<source>:8:20:   in requirements with 'W w' [with W = widget]
<source>:8:43: note: the required expression 'w.draw()' is invalid, because
    8 | concept drawable = requires(W w) { w.draw (); };
      |                                    ~~~~~~~^~
<source>:8:38: error: 'struct widget' has no member named 'draw'
    8 | concept drawable = requires(W w) { w.draw (); };
      |                                    ~~^~~~

One of the issues is that there is a hierarchical structure to these messages, but we're printing them as a "flat" list, which obscures the meaning.

I've been experimenting with a new way of presenting this information, taking inspiration from Sy Brand's excellent "Concepts Error Messages for Humans" paper. It's not ready to turn on by default in GCC for all C++ users, but it's available in GCC 15 through a command-line option for people who want to try it out: -fdiagnostics-set-output=text:experimental-nesting=yes

Here's the example we just saw, adding the -fdiagnostics-set-output=text:experimental-nesting=yes "cheat code":

demo.cc: In function ‘int main()’:
demo.cc:19:8: error: no matching function for call to ‘draw(widget)’
   19 |   draw (widget ());
      |   ~~~~~^~~~~~~~~~~
  • there are 3 candidates
    • candidate 1: ‘void draw(widget&)’ (near match)
      demo.cc:6:6:
          6 | void draw (widget &);
            |      ^~~~
      • conversion of argument 1 would be ill-formed:
      • error: cannot bind non-const lvalue reference of type ‘widget&’ to an rvalue of type ‘widget’
        demo.cc:19:9:
           19 |   draw (widget ());
              |         ^~~~~~~~~
    • candidate 2: ‘void draw(diagram&)’
      demo.cc:9:6:
          9 | void draw (diagram &);
            |      ^~~~
      • no known conversion for argument 1 from ‘widget’ to ‘diagram&’
        demo.cc:9:12:
            9 | void draw (diagram &);
              |            ^~~~~~~~~
    • candidate 3: ‘template<class T>  requires  drawable<T> void draw(T)’
      demo.cc:15:6:
         15 | void draw(T);
            |      ^~~~
      • template argument deduction/substitution failed:
        • constraints not satisfied
          • demo.cc: In substitution of ‘template<class T>  requires  drawable<T> void draw(T) [with T = widget]’:
          • required from here
            demo.cc:19:8:   
               19 |   draw (widget ());
                  |   ~~~~~^~~~~~~~~~~
          • required for the satisfaction of ‘drawable<T>’ [with T = widget]
            demo.cc:12:9:   
               12 | concept drawable = requires(W w) { w.draw (); };
                  |         ^~~~~~~~
          • in requirements with ‘W w’ [with W = widget]
            demo.cc:12:20:   
               12 | concept drawable = requires(W w) { w.draw (); };
                  |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
          • the required expression ‘w.draw()’ is invalid, because
            demo.cc:12:43:
               12 | concept drawable = requires(W w) { w.draw (); };
                  |                                    ~~~~~~~^~
            • error: ‘struct widget’ has no member named ‘draw’
              demo.cc:12:38:
                 12 | concept drawable = requires(W w) { w.draw (); };
                    |                                    ~~^~~~

This presentation of the errors uses indentation and nested bullet points to show the logical structure of what the compiler is doing, and eliminates redundant "visual noise" where it can. It also clarifies the wording: it states that the compiler tried 3 different candidates for the function call and spells out why each candidate was unsuitable.

I've tried turning this on for my day-to-day C++ work and it feels like a huge improvement. I hope we'll be able to turn it on by default in GCC 16; you can try it yourself here.

3. Machine-readable diagnostics

SARIF is a file format intended for storing the results of static analysis tools in a machine-readable, interchangeable format; as such, it's a great fit for compiler diagnostics. I added support in GCC 13 for writing out GCC's diagnostics in SARIF form, but it was an all-or-nothing deal: you could choose either GCC's classic text output on stderr, or SARIF, but not both.

For GCC 15, I've reworked the insides of how we handle diagnostics so that there can be multiple "output sinks," and added a new command-line option -fdiagnostics-add-output=ARGS for adding new sinks. For example, using -fdiagnostics-add-output=sarif will get you diagnostics emitted both as text on stderr and in SARIF form to a file.

Various sub-options are available; for example, -fdiagnostics-add-output=sarif:version=2.2-prerelease will select SARIF 2.2 output for that sink (though given that we're still working on the SARIF 2.2 specification, it uses an unofficial draft of the specification, and is subject to change).

I've also improved the SARIF that GCC emits. The output now captures all locations and labeled source ranges associated with a diagnostic. For example, for:

PATH/missing-semicolon.c: In function 'missing_semicolon':
PATH/missing-semicolon.c:9:12: error: expected ';' before '}' token
    9 |   return 42
      |            ^
      |            ;
   10 | }
      | ~

the GCC SARIF output now captures the secondary location (that of the trailing close brace), as well as that of the missing semicolon. Similarly, for:

bad-binary-ops.c: In function ‘bad_plus’:
bad-binary-ops.c:64:23: error: invalid operands to binary + (have ‘S’ {aka ‘struct s’} and ‘T’ {aka ‘struct t’})
  64 |   return callee_4a () + callee_4b ();
     |          ~~~~~~~~~~~~ ^ ~~~~~~~~~~~~
     |          |              |
     |          |              T {aka struct t}
     |          S {aka struct s}

the SARIF output captures those underlined ranges and their labels.

GCC's SARIF output now captures the command-line arguments (§3.20.2), timestamps for the start and end of compilation (§§3.20.7-8), and the working directory (§3.20.19). It also now sets the roles property for SARIF artifact objects (§3.24.6) and captures any embedded URLs in the text of messages (§3.11.6). For diagnostics relating to header files, the SARIF output now captures the chain of #include directives that led to the diagnostic's location (using SARIF locationRelationship objects, §3.34).

As well as improving the SARIF that GCC produces, I've added a tool to GCC 15 for consuming SARIF: sarif-replay. This is a simple command-line tool for viewing .sarif files, showing ("replaying") any diagnostics found in the .sarif files in text form as if they were GCC diagnostics, with support for details such as quoting source code, underlined ranges, fix-it hints, and diagnostic paths.

For example, here's a replay of a .sarif file emitted by GCC's -fanalyzer static analysis option:

$ sarif-replay signal-warning.sarif
In function 'custom_logger':
signal.c:13:3: warning: call to ‘fprintf’ from within signal handler [-Wanalyzer-unsafe-call-within-signal-handler]
   13 |   fprintf(stderr, "LOG: %s", msg);
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  'main': events 1-2
    │
    │   21 | int main(int argc, const char *argv)
    │      |     ^~~~
    │      |     |
    │      |     (1) entry to ‘main’
    │......
    │   25 |   signal(SIGINT, handler);
    │      |   ~~~~~~~~~~~~~~~~~~~~~~~
    │      |   |
    │      |   (2) registering ‘handler’ as signal handler
    │
  event 3
    │
    │GNU C17:
    │ (3): later on, when the signal is delivered to the process
    │
    └──> 'handler': events 4-5
           │
           │   16 | static void handler(int signum)
           │      |             ^~~~~~~
           │      |             |
           │      |             (4) entry to ‘handler’
           │   17 | {
           │   18 |   custom_logger("got signal");
           │      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~
           │      |   |
           │      |   (5) calling ‘custom_logger’ from ‘handler’
           │
           └──> 'custom_logger': events 6-7
                  │
                  │   11 | void custom_logger(const char *msg)
                  │      |      ^~~~~~~~~~~~~
                  │      |      |
                  │      |      (6) entry to ‘custom_logger’
                  │   12 | {
                  │   13 |   fprintf(stderr, "LOG: %s", msg);
                  │      |   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                  │      |   |
                  │      |   (7) call to ‘fprintf’ from within signal handler
                  │

4. An easier transition to C23

When compiling C code, GCC 14 and earlier defaulted to -std=gnu17 (i.e., the "C17" version of the C standard, plus extensions). GCC 15 now defaults to -std=gnu23 (based on C23), so if your build system isn't specifying which C version to use, you might run into C17 versus C23 incompatibilities.

I attempted to rebuild a large subset of Fedora with GCC 15 (and thus with C23 as the default) to shake out the problems that would arise, and in doing so found various diagnostics that needed improving. For example, bool, true, and false are keywords in C23, so I've tweaked the error message that occurs on old code such as:

typedef int bool;

so that you immediately know it's a C23 compatibility problem:

<source>:1:13: error: 'bool' cannot be defined via 'typedef'
    1 | typedef int bool;
      |             ^~~~
<source>:1:13: note: 'bool' is a keyword with '-std=c23' onwards
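
One way old code can adapt is to define bool only when the language does not already provide it. The following is a sketch under the assumption that the code must keep building as pre-C99 C, C99 through C17, and C23; the is_even function is purely illustrative and not from the article.

```c
/* Provide 'bool', 'true' and 'false' without redefining them where
   they are already keywords. */
#if defined(__cplusplus)
/* C++: bool, true and false are keywords. */
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
/* C23: bool, true and false are keywords. */
#elif defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#include <stdbool.h>   /* C99 to C17: provided by the standard header */
#else
typedef int bool;      /* pre-C99 fallback, as in the legacy code */
#define true 1
#define false 0
#endif

/* Illustrative use of the type. */
bool is_even(int x)
{
  return (x % 2) == 0;
}
```

If pre-C99 support is not needed, simply replacing the typedef with #include <stdbool.h> is a shorter option, since that header remains valid in C23.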

Similarly, C17 and C23 treat function declarations without parameters such as int foo(); differently. In C23, it's equivalent to int foo(void); whereas, in earlier versions of C, such a declaration plays fast and loose with the type system—essentially meaning, "we don't know how many parameters this function takes or their types; let's hope your code is correct!". In my tests, this led to lots of errors on old code, such as in this example:

#include <signal.h>

void test()
{
  void (*handler)();
  handler = signal(SIGQUIT, SIG_IGN);
}

So I've clarified these error messages so that they tell you the types of the functions (or function pointers) and show you where the pertinent typedef is:

<source>: In function 'test':
<source>:6:11: error: assignment to 'void (*)(void)' from incompatible pointer type '__sighandler_t' {aka 'void (*)(int)'} [-Wincompatible-pointer-types]
    6 |   handler = signal(SIGQUIT, SIG_IGN);
      |           ^
In file included from <source>:1:
/usr/include/signal.h:72:16: note: '__sighandler_t' declared here
   72 | typedef void (*__sighandler_t) (int);
      |                ^~~~~~~~~~~~~~
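
A likely fix for code like this is to spell out the parameter list on the function pointer so that it matches what signal returns. The sketch below assumes a POSIX system (SIGQUIT is not part of ISO C); the handler_fn typedef and the ignore_sigquit name are hypothetical.

```c
#include <signal.h>

/* Hypothetical typedef matching the type that signal() returns. */
typedef void (*handler_fn)(int);

/* Ignore SIGQUIT and return the previous disposition. */
handler_fn ignore_sigquit(void)
{
  void (*handler)(int);   /* explicit (int), not an empty () list */
  handler = signal(SIGQUIT, SIG_IGN);
  return handler;
}
```

Declaring the pointer as void (*)(int) means the same thing in C17 and C23, so the assignment is accepted under either default.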

Similarly, I've clarified the C front end's error messages for bad call sites:

struct p { int (*bar)(); };
    
void baz() {
    struct p q;
    q.bar(1);
}

t.c: In function 'baz':
t.c:7:5: error: too many arguments to function 'q.bar'; expected 0, have 1
    7 |     q.bar(1);
      |     ^     ~
t.c:2:15: note: declared here
    2 |         int (*bar)();
      |               ^~~

showing the expected parameter count versus the actual argument count, underlining the first extraneous argument at the call site, and showing the pertinent field declaration of the callback.
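
For the call-site case, a sketch of a C23-clean version is to declare the callback's parameter in the field type. The twice function and the initializer are assumptions added so that the example has something to call; the article's snippet leaves q uninitialized.

```c
/* The field now states that the callback takes one int. */
struct p { int (*bar)(int); };

/* Hypothetical callback used to initialize the field. */
static int twice(int x)
{
  return 2 * x;
}

int baz(void)
{
  struct p q = { twice };
  return q.bar(1);   /* one argument, matching the declared type */
}
```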

5. A revamped color scheme

GCC will use color when emitting its text messages on stderr at a suitably modern terminal, using a few colors that seem to work well in a number of different terminal themes—but the exact rules for choosing which color to use for each aspect of the output have been rather arbitrary.

For GCC 15, I've gone through C and C++'s errors, looking for places where two different things in the source are being contrasted, such as type mismatches. These diagnostics now use color to visually highlight and distinguish the differences.

For example, this error (Figure 1) shows an attempt to use the binary + operator that is bogus because the two operands are structs (via typedefs S and T) rather than numeric types.

Figure 1: A new color scheme for errors in GCC 15.

The two significant things here are the types, so GCC 15 uses two colors to consistently highlight the different types: in the message itself, in the quoted source, and the labels. Here, the left-hand type (typedef struct s S;) is shown throughout in green and the right-hand type (typedef struct t T;) in dark blue. I hope this approach better ties together the message's text with the source code and makes such errors a little easier to figure out.

6. libgdiagnostics

Figure 1 above shows off some of the features that GCC's diagnostics subsystem has: code for colorization, quoting source code, labelling ranges of source, fix-it hints, execution paths, SARIF output, and so on. Previously this code was hidden inside GCC and only usable by GCC itself.

For GCC 15, I've made this functionality available as a shared library for other projects to use: libgdiagnostics. There is a C API, along with C++ and Python bindings. For example, I was able to use the Python bindings to write a "linting" script for our testsuite, and "for free" got source quoting, colorization, and fix-it hints. Adding SARIF output to this script would be a one-liner, rather than having to write lots of JSON-handling code.

Try GCC 15

We're still fixing bugs, but we hope that GCC 15 will be ready to officially release (as 15.1) sometime later this month. With my "downstream" hat on, we're already using the prerelease (GCC 15.0) within Fedora 42 Beta.

Finally, you can use the excellent Compiler Explorer site to play with the new compiler. Have fun!