Fixing a Buffer Overflow in Unix v4 Like It's 1973

Original link: https://sigma-star.at/blog/2025/12/unix-v4-buffer-overflow/

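In 2025, the only known copy of UNIX v4 was found on a magnetic tape. It is a pivotal release because it is the first in which UNIX was rewritten in C. After getting it running on a PDP-11 simulator, the author studied the core utilities and found a buffer overflow in the `su(1)` program, a setuid-root executable used for privilege escalation. The 50-year-old program consists of fewer than 50 lines of code: it retrieves the root password entry, disables terminal echo, and compares the hashed input against the stored hash. The bug is a missing bounds check when reading user input into a 100-byte buffer, so an overlong input can crash the program. Taking advantage of the UNIX tradition of shipping source code, the author patched the program with the `ed` line editor, adding a counter and a size check to the input loop. The patched code was then compiled and deployed, which required setting the setuid bit for it to work. The experience highlights the strength of UNIX's original design philosophy and shows how a security problem could be fixed quickly with the tools at hand. It also underlines the different security priorities of the 1970s, when bugs of this kind were not considered critical.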

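## Hacker News discussion: Fixing a Buffer Overflow in Unix v4

A Hacker News thread discusses the buffer overflow in the original Unix v4 system. The core issue is that writing past the end of the `password` array may overwrite the adjacent `pwbuf` (the buffer holding the password-file entry). Users speculate that a 100-byte password repeated twice might exploit this and yield root privileges. The discussion digs into details of the early Unix environment: small programs (50 to 100 lines each), teletype terminals that led to short command names, and the use of structs in C. Several commenters tried to write exploits, but most were criticized as incomplete, poorly formatted, or based on inaccurate assumptions about the password hashing and system behavior. The thread also links to related analysis of the recovered Unix v4 tape and to a live terminal demo. In the end, whether the bug can be exploited remotely in a practical way remains unclear, with suggestions leaning toward a timing attack if a direct overflow is not possible.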

Original text

Introduction

In 2025, the only known copy of UNIX v4 surfaced on a magnetic tape.
This version marks a pivotal moment in computer history: the rewriting of UNIX into C.
Enthusiasts quickly recovered the data and successfully ran the system on a PDP-11 simulator.

Fascinated by this artifact, I set up an instance to explore it. Because the distribution includes the source code, I examined the implementation of several core utilities. While auditing the su(1) program, I identified a bug. Let’s fix it.

The UNIX v4 su(1) program

Although more than 50 years old, the su program functions similarly to its modern variant. As a setuid-root executable, it validates the root password. If the user provides the correct credentials, the program spawns a root shell, allowing an unprivileged user to escalate privileges.

The source file, su.c, contains fewer than 50 lines of code.

/* su -- become super-user */

char    password[100];
char    pwbuf[100];
int     ttybuf[3];
main()
{
        register char *p, *q;
        extern fin;

        if(getpw(0, pwbuf))
                goto badpw;
        (&fin)[1] = 0;
        p = pwbuf;
        while(*p != ':')
                if(*p++ == '\0')
                        goto badpw;
        if(*++p == ':')
                goto ok;
        gtty(0, ttybuf);
        ttybuf[2] =& ~010;
        stty(0, ttybuf);
        printf("password: ");
        q = password;
        while((*q = getchar()) != '\n')
                if(*q++ == '\0')
                        return;
        *q = '\0';
        ttybuf[2] =| 010;
        stty(0, ttybuf);
        printf("\n");
        q = crypt(password);
        while(*q++ == *p++);
        if(*--q == '\0' && *--p == ':')
                goto ok;
        goto error;

badpw:
        printf("bad password file\n");
ok:
        setuid(0);
        execl("/bin/sh", "-", 0);
        printf("cannot execute shell\n");
error:
        printf("sorry\n");
}

In short, the program executes the following steps:

1. It calls getpw() to retrieve the passwd entry for the root user (UID 0) from /etc/passwd. Surprisingly, if the read fails or the line format is incorrect, su does not abort: it prints "bad password file", falls through to the ok label, and spawns the root shell without asking for a password. While unusual, this likely acts as a safeguard to ensure su remains usable on a partially corrupted system. It is a security issue in its own right, because an unprivileged user could consume enough resources to make the getpw() call fail. Ron Natalie pointed out that this attack vector was known at the time.
2. It disables the TTY echo mode and prompts the user for a password.
3. It reads byte-by-byte from the TTY into a buffer until it encounters a newline or a NUL character; a NUL causes the program to exit immediately.
4. Once reading is complete, it re-enables echo mode, hashes the input using the crypt() library function, and compares the result with the stored hash.
5. If the hashes match, it spawns a shell; otherwise, it terminates.
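
The hash comparison described in the steps above relies on a compact idiom: advance both pointers while the characters agree, then check that the hash stopped at its NUL and the passwd field stopped at its ':' separator. A minimal sketch of that logic in modern C follows. The function names field_matches and second_field are hypothetical, and unlike the original loop this version stops explicitly at the end of either string.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns 1 when `hash` (NUL-terminated) equals the passwd field that
 * starts at `field` and ends at the next ':'; returns 0 otherwise.
 * Same outcome as the su.c idiom, but never reads past either string.
 */
int field_matches(const char *hash, const char *field)
{
    assert(hash != NULL && field != NULL);
    while (*hash != '\0' && *field != '\0' && *hash == *field) {
        hash++;
        field++;
    }
    /* A match means the hash is exhausted exactly at the separator. */
    return *hash == '\0' && *field == ':';
}

/* Returns a pointer just past the first ':' in a passwd line, or NULL. */
const char *second_field(const char *line)
{
    assert(line != NULL);
    while (*line != ':') {
        if (*line == '\0')
            return NULL;        /* malformed line: su.c jumps to badpw */
        line++;
    }
    return line + 1;
}
```

For a made-up line such as "root:XyZ:0:1::/:", second_field returns a pointer to "XyZ:0:1::/:", and field_matches("XyZ", ...) on that pointer reports a match, while a shorter or longer hash does not.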

The logic is standard, except for one critical flaw: the password buffer has a fixed size of 100 bytes, yet the input loop lacks a bounds check. If a user enters 100 or more characters (the terminator needs a byte too), the loop writes past the end of the buffer and a buffer overflow occurs.

I confirmed this behavior by testing with a long input string, which successfully crashed the program. Not all long strings trigger a core dump. The outcome depends on which area of adjacent memory is overwritten; sometimes su simply exits.

# su
password:<long input>Memory fault -- Core dumped

Note: Because su disables TTY echo mode, a crash prevents the terminal from displaying subsequent input. To restore visibility, type stty echo blindly and press Enter.
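
To see why the outcome depends on what sits next to the buffer, the unchecked loop can be replayed against a flat region that stands in for memory. This is only a sketch under an assumption: that password and pwbuf sit back to back in declaration order, which the actual v4 compiler and linker may or may not do. The names fake_memory, fake_memory_init, and unchecked_read are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { PASSWORD_SIZE = 100, PWBUF_SIZE = 100 };

/* One flat region standing in for the two globals laid out back to back. */
struct fake_memory {
    char bytes[PASSWORD_SIZE + PWBUF_SIZE];
};

/* Clears the region and places `entry` where pwbuf would start. */
void fake_memory_init(struct fake_memory *mem, const char *entry)
{
    assert(mem != NULL && entry != NULL && strlen(entry) < PWBUF_SIZE);
    memset(mem->bytes, 0, sizeof mem->bytes);
    memcpy(mem->bytes + PASSWORD_SIZE, entry, strlen(entry));
}

/*
 * Replays the unpatched read loop against the flat region: copy input up to
 * the newline with no check against PASSWORD_SIZE, then NUL-terminate.
 * Returns how many bytes landed at or beyond offset PASSWORD_SIZE.
 */
size_t unchecked_read(struct fake_memory *mem, const char *input)
{
    size_t n = 0;

    assert(mem != NULL && input != NULL);
    /* The real loop has no limit; the simulation stops at the region's end. */
    while (input[n] != '\n' && input[n] != '\0' && n < sizeof mem->bytes - 1) {
        mem->bytes[n] = input[n];
        n++;
    }
    mem->bytes[n] = '\0';       /* terminator, as in `*q = '\0'` */
    return (n + 1 > PASSWORD_SIZE) ? n + 1 - PASSWORD_SIZE : 0;
}
```

With a 99-character line nothing spills over. At 100 characters the terminator alone lands in the neighbouring region, and anything longer starts replacing the stand-in passwd entry with attacker-chosen bytes.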

Fixing su(1)

UNIX traditionally includes the source code necessary for self-recompilation, and v4 is no exception. This allows us to patch and compile su directly on the system. In 1973, editor options were sparse. Neither vi nor emacs had been invented yet. However, the system provides ed, a line-oriented text editor designed for teletype terminals where output was printed on paper rather than displayed on a screen. ed allows us to list, delete, and append lines, which is sufficient for our needs.

We will edit su.c to prevent the overflow by maintaining a counter, i, and verifying it against the buffer size during the read loop. I initially attempted a fix using pointer arithmetic, but the 1973 C compiler didn’t like it: while it didn’t refuse the syntax, the code had no effect. I settled on a simpler index-based check instead.

--- a/s2/su.c
+++ b/s2/su.c
@@ -7,6 +7,7 @@ main()
 {
        register char *p, *q;
        extern fin;
+       register int i;
 
        if(getpw(0, pwbuf))
                goto badpw;
@@ -22,9 +23,13 @@ main()
        stty(0, ttybuf);
        printf("password: ");
        q = password;
-       while((*q = getchar()) != '\n')
+       i = 0;
+       while((*q = getchar()) != '\n') {
+               if (++i >= sizeof(password))
+                       goto error;
                if(*q++ == '\0')
                        return;
+       }
        *q = '\0';
        ttybuf[2] =| 010;
        stty(0, ttybuf);
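
The patched loop can also be exercised outside su. The sketch below is a modern C rendering, not the 1973 dialect: read_line_bounded is a hypothetical helper that takes its input from a string rather than getchar(), and returns -1 where the patch jumps to error.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copies one line from `input` into `buf` (capacity `cap`), using the same
 * counter check as the patch. Returns the line length on success, or -1 when
 * the line does not fit (the patch's `goto error`) or the input ends before
 * a newline (the original's early `return`).
 */
int read_line_bounded(char *buf, size_t cap, const char *input)
{
    size_t i = 0;
    char *q = buf;

    assert(buf != NULL && input != NULL && cap > 0);
    while ((*q = *input++) != '\n') {
        if (++i >= cap)
            return -1;          /* too long: reject instead of overflowing */
        if (*q++ == '\0')
            return -1;          /* input ended without a newline */
    }
    *q = '\0';                  /* replace the newline with a terminator */
    return (int)i;
}
```

With a 100-byte buffer, the longest accepted line is 99 characters, which leaves room for the terminator; the 100th character is still stored in the last slot, but the function bails out before the pointer moves past the array.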

# chdir /usr/source/s2
# ed su.c

Upon launch, ed outputs the file size in bytes and awaits input. The command i inserts text before the current line, d deletes the line, and p prints it. Entering a number moves the focus to that specific line, while pressing Return prints the current line’s content.

Below is a transcript of the editing session:

741
8
        register char *p, *q;

        extern fin;
i
        register int i;
.
24
        printf("password: ");

        q = password;
i
        i = 0;
.
p
        i = 0;

        while((*q = getchar()) != '\n')
d
i
        while((*q = getchar()) != '\n') {
.

                if(*q++ == '\0')
i
                if (++i >= sizeof(password))
                        goto error;
.

                if(*q++ == '\0')

                        return;

        *q = '\0';
i
        }
.
w
811
q

First, we jump to line 8 and press Return several times to locate a suitable spot for the variable declaration. We use i to enter insert mode, add the variable, and then type a single period (.) on a new line to exit insert mode. The critical change occurs around the while loop: we initialize i before the loop, add braces, and insert the boundary check at the top of the loop body. Finally, w writes the modified buffer to disk and prints the new size (811 bytes, up from 741), and q terminates the editor.

Building and Deploying

With the source code patched, we must rebuild the binary. Since su consists of a single C file, the compilation process is trivial:

# cc su.c

The compiler outputs a binary named a.out. To deploy it, we move the file to /bin/su:

# mv a.out /bin/su

However, the installation is incomplete. Because su requires root privileges to function, we must set the setuid bit and adjust the file permissions:

# ls -l /bin/su
-rwxrwxrwx 1 root     2740 Jun 12 19:58 /bin/su
# chmod 4755 /bin/su
# ls -l /bin/su
-rwsr-xr-x 1 root     2740 Jun 12 19:58 /bin/su
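
The octal mode 4755 combines the setuid bit (04000) with rwxr-xr-x (0755). As an illustration, the sketch below renders a mode the way the ls -l output above shows it. mode_string is a hypothetical helper, and the capital S for "setuid without owner execute" follows the modern ls convention, which the article does not show for v4.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Writes the nine permission characters for `mode` (what `ls -l` prints
 * after the file-type character) into `out`, which must hold 10 bytes.
 * Only the setuid bit (04000) gets special treatment in this sketch.
 */
void mode_string(unsigned mode, char *out)
{
    static const char rwx[] = "rwxrwxrwx";
    int i;

    assert(out != NULL);
    for (i = 0; i < 9; i++)     /* walk from owner-read down to other-exec */
        out[i] = (mode & (0400u >> i)) ? rwx[i] : '-';
    if (mode & 04000u)          /* setuid replaces the owner execute slot */
        out[2] = (mode & 0100u) ? 's' : 'S';
    out[9] = '\0';
}
```

Calling mode_string(0777, s) yields "rwxrwxrwx", matching the freshly moved a.out, and mode_string(04755, s) yields "rwsr-xr-x", matching the listing after chmod.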

Summary

UNIX v4 is a fascinating gem of computer history. It feels surprisingly similar to our current systems. While it lacks modern conveniences, the fundamental logic remains recognizable to anyone with modern UNIX experience.

The ability to fix su so quickly highlights the power of the early UNIX philosophy: shipping the operating system with its full source code and a C compiler. We patched, compiled, and deployed the fix directly on the system, with no external toolchain required.

Finally, this bug reminds us of the era’s different priorities. In the trusted, isolated environments of 1973, security was not the critical concern it is today. Furthermore, the knowledge that a buffer overflow could be exploited for arbitrary code execution had not yet come of age.

As an exercise for the reader to improve their ed skills, try adding the code to restore TTY echo mode to the overflow detection logic. This ensures the terminal functions correctly even after the program catches the error.
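
One possible shape for that exercise, sketched in modern C rather than as an ed session: keep a single exit path so the echo bit is set again whether the line fit or not. The tty flag word is simulated with a plain int, read_password and ECHO_BIT are hypothetical names, and the real program would still need the stty(0, ttybuf) call to apply the restored flags.

```c
#include <assert.h>
#include <stddef.h>

enum { ECHO_BIT = 010 };    /* the bit su.c clears and sets in ttybuf[2] */

/*
 * Reads one line into `buf` with echo "disabled" in `*flags`, and turns the
 * bit back on along every exit path, including the overflow path.
 * Returns 0 on success, -1 if the line is too long or has no newline.
 */
int read_password(int *flags, char *buf, size_t cap, const char *input)
{
    size_t i = 0;
    int status = 0;

    assert(flags != NULL && buf != NULL && input != NULL && cap > 0);
    *flags &= ~ECHO_BIT;            /* modern spelling of `=& ~010` */
    for (;;) {
        char c = *input++;
        if (c == '\n')
            break;
        if (c == '\0' || i + 1 >= cap) {
            status = -1;            /* remember the failure, do not return yet */
            break;
        }
        buf[i++] = c;
    }
    buf[i] = '\0';
    *flags |= ECHO_BIT;             /* modern spelling of `=| 010` */
    return status;
}
```

Because the failure is recorded in status instead of jumping away immediately, the flag word comes back unchanged after both a good and an overlong line.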