Math Is Hard

When developing software to run in a Unix environment, you will often be able to use the same system features and benefit from good developer tools, regardless of the particular platform you're working on, as most processors will provide a rich instruction set and virtual memory, among other things.

When you're on the other side of the fence, and working in the kernel, all the gory details which will heavily differ across platforms can no longer be ignored, and sometimes, the shortcomings of a given processor architecture can become a real pain in the arse.

For example, if you have read the m88k saga, you might remember that the operating system exception handler needs to perform all the pending loads and stores before returning from exception processing, and that this requirement had been a source of problems for years.

The 88100 processor is not the only processor which sometimes makes the kernel developer's life harder than it could have been.

Let me tell you about a processor design choice which turned out to have a significant cost in the kernel, albeit only in a rare situation.

The VAX architecture, introduced at the end of 1977, is one of the oldest 32-bit architectures. It has a large instruction set and plenty of addressing modes, but nothing fancy: no out-of-order execution, no branch delay slots, no register renaming, no hyper-threading, and not even cache memory on the earliest designs, which did not need any as they wouldn't run faster than the memory refresh cycle. (Back then, processor speeds were expressed as cycle times in micro- or nanoseconds rather than in megahertz: a 5MHz processor would be described as having a 200ns cycle time. In comparison, memory refresh cycles would be around 120ns, and slowly decreased as progress was made, with 100ns memory being common at the end of the 1980s, 80ns in the first half of the 1990s, and 70ns and 60ns later on.)

The exception model of the VAX was also quite simple, with the ``Exceptions and Interrupts'' chapter of the VAX Architecture Reference Manual being only 36 pages long in the first edition (and 43 in the second edition, mostly because of a slightly larger font rather than extra text.)

Quoting from it:

A trap is an exception that occurs at the end of the instruction that caused the exception. Therefore the PC saved on the stack is the address of the next instruction that would normally have been executed.
[...]
A fault is an exception that occurs during an instruction and that leaves the registers and memory in a consistent state such that elimination of the fault condition and restarting the instruction will give correct results. After an instruction faults, the PC saved on the stack points to the instruction that faulted.

So far, this is textbook processor design. If the processor encounters a situation which is not recoverable (and will cause your process to be killed), it's a trap.

If, however, there is a chance that some recovery action can be done and the offending instruction given another chance, then it's a fault.

For example, accessing a memory page which is not mapped will cause a fault. If the address is legitimate, the appropriate page and its contents will be fetched from swap (or from the binary file you are running), and the operation can be restarted. If the address is not legitimate, then your process will be sent a SIGSEGV signal and die.

Dividing by zero, on the other hand, is a trap. No matter how one may try to bend the laws of mathematics, there is no way for such a computation to ever deliver a meaningful result. Your process will be sent a SIGFPE (Floating-Point Exception) signal - even if this was an integer divide. (The siginfo_t extra information will let a hypothetical signal handler tell integer divide by zero (FPE_INTDIV) and floating-point divide by zero (FPE_FLTDIV) apart.)
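As an illustration of that last remark, here is a minimal sketch of a handler which uses the siginfo_t code to report the cause of a SIGFPE. The function names (fpe_reason, fpe_handler, install_fpe_reporter) are made up for this example; FPE_FLTOVF and FPE_FLTUND are further standard POSIX codes, not mentioned above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Map the si_code of a SIGFPE to a short description. */
const char *
fpe_reason(int si_code)
{
	switch (si_code) {
	case FPE_INTDIV:
		return "integer divide by zero";
	case FPE_FLTDIV:
		return "floating-point divide by zero";
	case FPE_FLTOVF:
		return "floating-point overflow";
	case FPE_FLTUND:
		return "floating-point underflow";
	default:
		return "other arithmetic exception";
	}
}

/* SIGFPE handler: report the cause, then exit instead of returning. */
void
fpe_handler(int signo, siginfo_t *info, void *context)
{
	const char *msg = fpe_reason(info->si_code);
	size_t len = 0;

	(void)signo;
	(void)context;
	while (msg[len] != '\0')
		len++;
	(void)write(STDERR_FILENO, msg, len);
	(void)write(STDERR_FILENO, "\n", 1);
	_exit(1);
}

/* Install the handler; returns 0 on success, -1 on failure. */
int
install_fpe_reporter(void)
{
	struct sigaction sa;

	sa.sa_sigaction = fpe_handler;
	sa.sa_flags = SA_SIGINFO;	/* ask for the siginfo_t argument */
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGFPE, &sa, NULL);
}
```

The handler exits rather than returns on purpose: POSIX leaves the behaviour undefined when a handler for a hardware-generated SIGFPE returns normally, and whether the offending instruction then gets re-executed depends on the platform.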

So far, so good - the VAX exception handler (trap() in sys/arch/vax/vax/trap.c) would let the VM system recover the missing page faults, and would send a SIGFPE signal down the throat of your process, for arithmetic traps. This code has been almost unchanged since 3BSD.

Did you know?

Back in 1980, the illegal instruction signal nowadays known as SIGILL was called SIGINS, SIGSEGV was called SIGSEG, SIGKILL was called SIGKIL, SIGFPE was called SIGFPT, SIGTERM was called SIGTRM, and roads were uphill both ways...

Excerpt from 3BSD sys/h/param.h, dated January 5th, 1980:

/*
 * signals
 * dont change
 */

#define NSIG    17
/*
 * No more than 16 signals (1-16) because they are
 * stored in bits in a word.
 */
#define SIGHUP  1       /* hangup */
#define SIGINT  2       /* interrupt (rubout) */
#define SIGQUIT 3       /* quit (FS) */
#define SIGINS  4       /* illegal instruction */
#define SIGTRC  5       /* trace or breakpoint */
#define SIGIOT  6       /* iot */
#define SIGEMT  7       /* emt */
#define SIGFPT  8       /* floating exception */
#define SIGKIL  9       /* kill, uncatchable termination */
#define SIGBUS  10      /* bus error */
#define SIGSEG  11      /* segmentation violation */
#define SIGSYS  12      /* bad system call */
#define SIGPIPE 13      /* end of pipe */
#define SIGCLK  14      /* alarm clock */
#define SIGTRM  15      /* Catchable termination */

In late April 2002, Todd Miller, who was - among other things - taking care of Perl in the OpenBSD base system, tried the latest Perl snapshot which would eventually become Perl 5.8, and noticed it would fail to build on the i386 and vax platforms, because miniperl (a subset of Perl itself used during the build to produce various files needed by the full-blown Perl) would sometimes spin, apparently stuck but keeping the processor busy.

Investigating, he managed to produce a standalone reproducer.

Date: Tue, 30 Apr 2002 16:24:50 -0600
From: Todd C. Miller
To: private OpenBSD mailinglist
Subject: i386 divide by zero bug

The following program hangs forever with:
 29142 a.out    PSIG  SIGFPE caught handler=0x1 mask=0x0 addr=0x17ba trapno=8

Vax has similar behavior when you overflow a double.

 - todd

#include <stdio.h>
#include <stdlib.h>
#include <signal.h>

int
main(int argc, char **argv)
{
        int i;

        signal(SIGFPE, SIG_IGN);
        i = 1 / 0;

        exit(0);
}

The i386 situation got taken care of quite quickly, but we were left with the Vax situation.

On May 7th, there was this very terse, but to the point, status report in the OpenBSD developers chatroom.

<deraadt> Todd, what about that SIGFPE stuff?
<millert> What about it?
<millert> It's still fucked as far as I know
<millert> And that means that when perl gets updated, it won't work on vax...

One week later, this was still pending...

<deraadt> ok, so Todd, the new perl just wants a vax FPE fix eh?
<millert> Yes.
<deraadt> the correct behaviour should be?
<millert> The problem is that when you try to ignore SIGFPE and an overflow
          occurs the kernel keeps delivering the signal and doesn't stop.  It
          should not deliver the signal at all since it is ignored.
<deraadt> and it should... do what?
<deraadt> advance over the instruction I suppose.
<millert> I guess.  There are ways to tell the vax to ignore FPU exceptions but
          I didn't find any real info on it.

The next day, I chimed in:

<miod> I was thinking about the SIGFPE-in-a-loop problem
<miod> and found this note:
<miod> When we get an arithmetic fault of types 8,9,10. The PC is backed up to
       point at the instruction causing the fault. If we just send a SIGFPE and
       return, and there is no SIGFPE hander, the program goes into an infinite
       loop
<hugh> heh
<miod> that might be what we are experiencing here
<miod> I'll check with the VARM this evening

(VARM here being the VAX Architecture Reference Manual.)

This note was actually an excerpt from the to-do list of the Linux-vax project, which was not dead yet at that time. This to-do list is no longer online, but has been saved by the Wayback Machine. The complete text from which I quoted was:

When we get an arithmetic fault of types 8,9,10. The PC is backed up to point at the instruction causing the fault. If we just send a SIGFPE and return, and there is no SIGFPE hander, the program goes into an infinite loop with the arith_fault handler, and the faulting instr. Should we a) try and advance PC, or b) send it a signal that kills it?

After some tinkering, I had a crude diff which had a chance to solve the problem.

Date: Wed, 15 May 2002 19:44:14 +0000
From: Miod Vallat
To: Hugh Graham, Todd C. Miller
Subject: the vax SIGFPE problem, WIP

As told on ICB, I think I've found the reason behind the SIGFPE loop.
Arithmetic fault can either be "traps", or restartable "faults". In the
fault case, the frame pc points to the instruction that faulted, and not
the following instruction, in case we could save the world and make it
not fault again.

Since we only deliver a signal in this case, it loops. The workaround is
to skip to the next instruction.

I cooked the following diff, but it's not finished compiling, so be
careful, it might not be a bright idea, but I think you might have
comments on the way I'm doing it...

Oh, and ddb needs fixes to properly recognize two-byte opcodes, but this
will be a later diff.

Miod
[...]

The problem was indeed simple: if the arithmetic exception was a fault, as opposed to a trap, and the SIGFPE signal was ignored, then we had to resume process execution after the faulting instruction. But the VAX exception model does not give us the ability to return from exception and skip that instruction.
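The loop can be reproduced with a toy model. This is only a sketch, not kernel code: the addresses, the instruction length and all the names are made up, the instruction is assumed to raise the exception every time it runs, and the signal is assumed to be ignored so that the kernel simply returns to the saved pc.

```c
#include <stdint.h>

enum exc_kind { EXC_TRAP, EXC_FAULT };

/*
 * PC saved on the stack when the instruction of length len located
 * at address insn raises an exception of the given kind.
 */
uint32_t
saved_pc(enum exc_kind kind, uint32_t insn, uint32_t len)
{
	return kind == EXC_TRAP ? insn + len : insn;
}

/*
 * Count how many exceptions are taken before execution moves past
 * the offending instruction. Returns -1 if the process is still
 * stuck on it after max_tries exceptions.
 */
int
exceptions_until_progress(enum exc_kind kind, int kernel_skips, int max_tries)
{
	const uint32_t insn = 0x1000, len = 3;	/* hypothetical values */
	uint32_t pc;
	int taken = 0;

	while (taken < max_tries) {
		/* Running the instruction raises the exception. */
		taken++;
		pc = saved_pc(kind, insn, len);
		if (kind == EXC_FAULT && kernel_skips)
			pc += len;	/* the kernel steps over it */
		if (pc != insn)
			return taken;	/* resumed after the instruction */
	}
	return -1;
}
```

With EXC_TRAP the model makes progress after one exception, with EXC_FAULT and no skipping it never does, and with EXC_FAULT plus skipping it behaves like the trap case again.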

So the kernel had to skip the instruction by itself. VAX instructions are of variable length, depending on the actual operands and addressing modes used. This meant that, in order to compute the correct instruction length, the kernel had to disassemble the instruction to skip. Which is no simple task since, when using some of the most insane addressing modes, a VAX instruction can span more than 16 bytes!
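To give an idea of what this decoding involves, here is a minimal sketch of the length computation for a single operand specifier. It is not the skip_opcode code, and the function name is invented for the example. It assumes the usual VAX operand specifier encoding, where the high nibble of the first byte selects the addressing mode and the low nibble the register. A complete routine would also need a table giving the number and size of the operands of every opcode, and would have to cope with two-byte opcodes and branch displacements.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Length, in bytes, of the operand specifier starting at p, for an
 * operand whose data is opsize bytes wide. Returns 0 if an index
 * prefix is followed by a base specifier which cannot be indexed.
 */
size_t
operand_specifier_length(const uint8_t *p, size_t opsize)
{
	unsigned mode = p[0] >> 4;
	unsigned reg = p[0] & 0x0f;

	switch (mode) {
	case 0x0: case 0x1: case 0x2: case 0x3:	/* short literal */
	case 0x5:				/* register */
	case 0x6:				/* register deferred */
	case 0x7:				/* autodecrement */
		return 1;
	case 0x4: {				/* indexed: base follows */
		unsigned base_mode = p[1] >> 4;

		if (base_mode <= 0x5)	/* literal, indexed or register */
			return 0;
		return 1 + operand_specifier_length(p + 1, opsize);
	}
	case 0x8:	/* autoincrement, or immediate if register is PC */
		return reg == 15 ? 1 + opsize : 1;
	case 0x9:	/* autoincrement deferred, or absolute if PC */
		return reg == 15 ? 1 + 4 : 1;
	case 0xa: case 0xb:			/* byte displacement */
		return 1 + 1;
	case 0xc: case 0xd:			/* word displacement */
		return 1 + 2;
	default:				/* longword displacement */
		return 1 + 4;
	}
}
```

An indexed operand with a longword displacement already takes 6 bytes, and an immediate 8-byte floating-point value takes 9, so instructions with several such operands grow large quickly.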

The high-level logic was simple and easy to document:

Index: vax/trap.c
===================================================================
RCS file: /cvs/src/sys/arch/vax/vax/trap.c,v
retrieving revision 1.22
diff -u -r1.22 trap.c
--- vax/trap.c  2002/03/14 03:16:02     1.22
+++ vax/trap.c  2002/05/15 19:38:24
@@ -313,8 +313,25 @@
        }

        if (trapsig) {
                sv.sival_ptr = (caddr_t)frame->pc;
                trapsignal(p, sig, frame->code, typ, sv);
+
+               /*
+                * Arithmetic exceptions can be of two kinds:
+                * - traps (codes 1..7), where pc points to the
+                *   next instruction to execute.
+                * - faults (codes 8..10), where pc points to the
+                *   faulting instruction.
+                * In the latter case, we need to advance pc by ourselves
+                * to prevent a signal loop.
+                *
+                * XXX this is gross -- miod
+                */
+               if (code == (T_ARITHFLT | T_USER) && frame->code >= 8) {
+                       extern void *skip_opcode(void *);
+
+                       frame->pc = skip_opcode(frame->pc);
+               }
        }

        if (umode == 0)

And all the gory details had to be put in that new skip_opcode function.

About 6 hours later, I had an ugly workaround: I was reusing part of the disassembler code from the kernel debugger to parse the faulting instruction and compute its length.

Date: Wed, 15 May 2002 21:28:53 +0000
From: Miod Vallat
To: Hugh Graham, Todd C. Miller, Theo de Raadt
Subject: working vax sigfpe diff

As Hugh and Todd already know, the SIGFPE problem is very simple:
Arithmetic fault can either be "traps", or restartable "faults". In the
fault case, the frame pc points to the instruction that faulted, and not
the following instruction, in case we could save the world and make it
not fault again.

Since we only deliver a signal in this case, it loops. The workaround is
to skip to the next instruction.

To do so, I'm borrowing some MD ddb code, hence a lot of ugly #ifdef to
ensure that non-DDB kernel can have this fix and not bring too much
stuff.

Miod
[...]

The feedback I received was mostly negative. While everyone acknowledged that this diff was solving a real problem, and that there was no way to skip an instruction other than parsing it to compute its length, nobody wanted to involve the kernel debugger code for that: we wanted to be able to build kernels without it, and computing an instruction length was not really a debugger's job anyway.

So I reworked my changes to make skip_opcode completely independent from the debugger code, at the cost of duplicating a few lines of code.

Date: Thu, 16 May 2002 00:49:16 +0000
From: Miod Vallat
To: Theo de Raadt, Hugh Graham, Todd C. Miller
Subject: improved vax sigfpe diff with goodies

This new diff:

- does not interfere with ddb anymore, at the expense of a few lines in
  machdep.c
- features my improved db_disasm that correctly recognizes two-byte
  opcodes.

Builds with or without option DDB, passes the fpe regress test, no
issues so far here.

Comments?

Miod
[...]

There were no objections to that new version of the diff, and it went in shortly after.

Fix a long standing problem on vax: on "arithmetic fault" exceptions,
we schedule a SIGFPE signal delivery to the faulting process.

However, arithmetic faults come in two flavors: "traps" that are "regular"
exceptions, and "faults" that are restartable exceptions.
In the "fault" case, the frame pc points to the faulting instruction, instead
of the next instruction, in case we could save the world by tweaking memory
and make the instruction not fault again when restarted.

In practice, this led to processes blocked in a SIGFPE loop madness.

To avoid this, add a skip_opcode() routine to compute the address of the
next opcode, effectively skipping the offending instruction ; this routine
is a very stripped-down db_disasm().

While there, enhance the ddb disassembler to correctly recognize and
disassemble two-byte opcodes.

ok hugh@, deraadt@

This fix made its way into NetBSD 7 years later.

However, two days later, Michael Hitch noticed a bug in this change and fixed it.

On the vax, the trapsignal() call will change frame->sp to point to a
callg on the user's stack that calls the user's signal handler, so do
the skip_opcode() before calling trapsignal().  A floating point
overflow no longer causes a signal loop.  This should stop the native
compile hangs trying to compile src/lib/libm/complex/catan.ln.

This time, it was my turn to let this slip past my radar. I only carried the fix over to OpenBSD three years later, as the import of SQLite in the base system caused that bug to be triggered when building on vax.

When handling SIGFPE, do the `advance pc if exception is a fault (as opposed
to a trap)' dance before invoking trapsignal(), which will mess with the pc
too. My bug initially, can't believe I never noticed; fixed first in NetBSD.
This makes libsqlite3 build.
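Why the order matters can be shown with a toy model. This is a deliberately simplified sketch, not the kernel code: the real trapsignal() on vax plays with the stack pointer and a callg trampoline, which this model collapses into a single resume_pc field, and every name and value below is hypothetical.

```c
#include <stdint.h>

struct toy_frame {
	uint32_t pc;		/* where the process resumes right now */
	uint32_t resume_pc;	/* where it resumes after the handler */
};

#define TOY_HANDLER	0x2000u	/* hypothetical handler entry point */
#define TOY_INSN_LEN	3u	/* hypothetical instruction length */

/* Stand-in for trapsignal(): save the pc, redirect to the handler. */
void
toy_trapsignal(struct toy_frame *f)
{
	f->resume_pc = f->pc;
	f->pc = TOY_HANDLER;
}

/* Stand-in for skip_opcode(): step over one instruction. */
void
toy_skip(struct toy_frame *f)
{
	f->pc += TOY_INSN_LEN;
}

/* Address the process comes back to once its handler has run. */
uint32_t
resume_after_fault(uint32_t insn, int skip_first)
{
	struct toy_frame f = { insn, 0 };

	if (skip_first) {
		toy_skip(&f);		/* fixed order */
		toy_trapsignal(&f);
	} else {
		toy_trapsignal(&f);	/* original order */
		toy_skip(&f);		/* too late: moves the handler pc */
	}
	return f.resume_pc;
}
```

In this model, skipping after the signal has been set up leaves the saved pc pointing at the faulting instruction, so the process faults again as soon as the handler returns. Skipping first makes it resume at the next instruction.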

So, all is well that ends well.

But there remains an unanswered question: with BSD having been running on VAX hardware since 1979, how come this problem was not fixed until 2002?

One possible reason is that few programs, if any, did ignore SIGFPE (or attempt to handle it), so when SIGFPE got delivered, these programs would be terminated immediately, without looping on the offending instruction.

But I think the real reason is different.

Mind you, in the early years of the VAX, these arithmetic faults did not exist - there were only arithmetic traps, where the exception already points to the next instruction. Which is something one can only know if either:

one had been a Digital employee at that time.
one had been a Digital customer who had a VAX system reworked by Digital technicians at that time.
one has a 2nd edition VAX Architecture Reference Manual (first published in 1991) and paid careful attention to the note on page 257.

I suppose very few of my readers will satisfy any of these three conditions, so I will explain.

In the first edition of the VAX Architecture Reference Manual, on page 231, table 5.1 lists the Arithmetic Exception Type Codes:

Table 5.1
Arithmetic Exception Type Codes

Exception Type	Mnemonic	Decimal	Hex
Traps
integer overflow	SS$_INTOVF	1	1
integer divide-by-zero	SS$_INTDIV	2	2
floating overflow	SS$_FLTOVF	3	3
floating or decimal divide-by-zero	SS$_FLTDIV	4	4
floating underflow	SS$_FLTUND	5	5
decimal overflow	SS$_DECOVF	6	6
subscript range	SS$_SUBRNG	7	7
Faults
floating overflow	SS$_FLTOVF_F	8	8
floating divide-by-zero	SS$_FLTDIV_F	9	9
floating underflow	SS$_FLTUND_F	10	A

("decimal" in the exception types above refers to computations involving data in "packed decimal" format, where each half-byte (nibble) stores one decimal digit.)

Note that the three fault conditions also exist as trap conditions.
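The table translates naturally into a small classifier. The sketch below uses made-up ARITH_ names (the SS$_ mnemonics of the manual are not portable C identifiers) and takes its values straight from Table 5.1.

```c
#include <stdbool.h>

/* Arithmetic exception type codes, as listed in Table 5.1. */
enum arith_type_code {
	ARITH_INTOVF = 1,	/* integer overflow */
	ARITH_INTDIV = 2,	/* integer divide-by-zero */
	ARITH_FLTOVF = 3,	/* floating overflow */
	ARITH_FLTDIV = 4,	/* floating or decimal divide-by-zero */
	ARITH_FLTUND = 5,	/* floating underflow */
	ARITH_DECOVF = 6,	/* decimal overflow */
	ARITH_SUBRNG = 7,	/* subscript range */
	ARITH_FLTOVF_F = 8,	/* floating overflow fault */
	ARITH_FLTDIV_F = 9,	/* floating divide-by-zero fault */
	ARITH_FLTUND_F = 10	/* floating underflow fault */
};

/* Codes 1 to 7 are traps, codes 8 to 10 are faults. */
bool
arith_is_fault(int code)
{
	return code >= ARITH_FLTOVF_F && code <= ARITH_FLTUND_F;
}

/* Trap code reporting the same condition as a fault code, else 0. */
int
arith_trap_twin(int code)
{
	return arith_is_fault(code) ? code - 5 : 0;
}
```

For instance, arith_trap_twin(ARITH_FLTOVF_F) yields ARITH_FLTOVF: the same floating overflow condition, reported either with the pc already advanced (trap) or still pointing at the offending instruction (fault).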

In fact, their descriptions are quite similar. For example:

Floating Overflow Trap -- A floating overflow trap is an exception that indicates that the last instruction executed resulted in an exponent greater than the largest representable exponent for the data type after normalization and rounding.
[...]
Floating Overflow Fault -- A floating overflow fault is an exception that indicates that the last instruction executed resulted in an exponent greater than the largest representable exponent for the data type after normalization and rounding.
[...]

This confusion gets cleared in the second edition. The same table (now table 5.2, on page 255) only lists:

Table 5.2
Arithmetic-Exception Type Codes

Exception Type	Mnemonic	Decimal	Hex
Traps
integer overflow	SS$_INTOVF	1	1
integer divide-by-zero	SS$_INTDIV	2	2
decimal divide-by-zero	SS$_FLTDIV	4	4
decimal overflow	SS$_DECOVF	6	6
subscript range	SS$_SUBRNG	7	7
Faults
floating overflow	SS$_FLTOVF_F	8	8
floating divide-by-zero	SS$_FLTDIV_F	9	9
floating underflow	SS$_FLTUND_F	10	A

Note how types 3 and 5 have disappeared, and description of type 4 no longer mentions floating-point.

After the various traps and faults descriptions, the note finally gives us the clue:

Note
Floating overflow, floating underflow, and floating divide by zero were originally implemented as traps on the VAX-11/780, and had type codes 3, 4, and 5, respectively. The architecture was later modified to include only floating-point faults, and all VAX-11/780s were upgraded.

Therefore, in the very beginning, when the only VAX systems were 11/780s, all arithmetic exceptions were traps, and could not get restarted. There was simply no need for the BSD kernel to skip the instruction, as the hardware had already done the work.

The only other exceptions which were faults, not traps, were memory management exceptions, which would always behave as "either the operating system can fix the problem and restart the instruction, or this is a non-recoverable error and your program can't continue" (and if you ignore SIGSEGV, it is considered perfectly acceptable that your program spins in a SIGSEGV loop until you kill it.)

When the architecture was changed to turn these into faults, I guess nobody paid enough attention to the consequences of that change to realize the need for the operating system to sometimes skip the faulting instruction, as one could not imagine software would mask SIGFPE or try to mess with the register values and restart the computation.

It is also very likely that the consequences of this change were only considered from a VMS point of view, with Unix (well, BSD) being considered irrelevant by Digital at the time.

But the net result is that the hardware does not provide any facility to get the address of the next instruction, in case a fault needs to be handled as a trap. After all, since the instruction has been executed, the address of the next one is known somewhere in the processor.

I wish there had been a way to get that address easily (either from the trap frame or from a special processor register), as this would have made fixing the problem simpler. But this situation is rare enough that the cost of having the kernel do the work turned out to be acceptable.

If you are a bit more curious about this, there are a few interesting documents which you can find on Bitsavers.

The VAX-11 System Reference Manual, revision 5, dated February 1979, contains edit history information. The exceptions are described in chapter 6, and it seems that the change from traps to faults has been documented in the 6th revision of the chapter, dated 31-Jan-79.
This implies the actual processor rework took place sometime earlier. Since the second VAX model, the VAX-11/750, was only announced in 1980, this is also consistent with all the mentions that only models 780 had the "every arithmetic exception is a trap" behaviour.

In VAX/VMS Internals and Data Structures, for VMS 2.2, published in April 1981, page 2-12 (page 71 of the pdf file) also mentions that
On the VAX-11/750, these three floating point exceptions are faults. On the VAX-11/780, they are traps.
...which hints that modification of the VAX-11/780 systems had not started, at the time of writing.
Figuring out when VAX-11/780 installations started to be modified by Digital field engineers (and in which order) would be interesting detective work, but I doubt the paperwork trail of these reworks still exists somewhere, especially with Digital having been bought by Compaq and then later by HP.
After all, we're talking about events having taken place about 45 years ago, which is an eternity, in computing times...

The first edition of the published VAX Architecture Reference Manual, from 1987. This is the book on the left, in the picture above (from which I learned almost all of my VAX knowledge.)

The DEC STD 032 Vax Architecture Standard document contains the same text as the second edition of the VAX Architecture Reference Manual, but typewritten and with crude drawings, while the second edition book uses a nice, much more readable, font. The note about the 11/780 systems being modified can be found on page 5-12 (page 361 of the pdf file.)

Also, in the few VMS release notes which can be found on Bitsavers, there does not seem to be any mention of an 11/780 rework being required (or advised).

The VMS 1.5 release notes, dated February 1979, in section 4.3, refer to a "CVTTP FCO" (CVTTP being a VAX instruction processing decimal data, FCO being a Field Change Order, when a Field Engineer is required to apply hardware changes to the system) required for proper Cobol-74 operation; but this is not related to floating-point arithmetic exceptions, thus not the change discussed here.
