A Struct Sockaddr Sequel

Original link: https://lwn.net/Articles/1045453/

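LWN.net, a Linux news site for developers, asks readers to subscribe to support its work; it would rather write informative articles than invest in marketing. A recent in-depth article looks at the decade-long effort of the Linux Kernel Self-Protection Project (KSPP) to improve memory safety, and in particular at the problematic `struct sockaddr` networking structure. `sockaddr` was designed in the 1980s, and its fixed-size data field is now often too small, so developers treat it as a flexible array, a practice that defeats bounds checking. An attempt to formally redefine it as a flexible array produced a flood of compiler warnings, which stalled progress. The current solution introduces `sockaddr_unsized`, a new variant that allows a flexible size and is used in specific internal interfaces where the length information is still available. This allows `sockaddr` to be restored to its original definition, which enables bounds checking and eliminates the warnings. However, `sockaddr_unsized` still carries a risk of overflows, which may prompt the future addition of a length field to improve safety further. The article illustrates that, even though Rust integration offers long-term benefits, securing the kernel's huge C code base remains an ongoing challenge.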


Original text

By Jonathan Corbet
November 14, 2025

One of the many objectives of the Linux Kernel Self-Protection Project (KSPP), which just completed ten years of work, is to ensure that all array references can be bounds-checked, even in the case of flexible array members, the size of which is not known at compile time. One of the most challenging flexible array members in the kernel is not even declared as such. Almost exactly one year ago, LWN looked at the effort to increase safety around the networking subsystem's heavily used sockaddr structure. One year later, Kees Cook is still looking for a way to bring this work to a close.

In short, the problem is that struct sockaddr is traditionally defined as:

    struct sockaddr {
        short sa_family;
        char sa_data[14];
    };

The sa_data field was more than large enough to hold a network address in the early 1980s when this structure was first defined for BSD Unix, but it is not large enough now. As a result, a great deal of code, in both the kernel and user space, passes around struct sockaddr pointers that, in truth, point to different structures with more space for the addresses they need to hold. In other words, sa_data is being treated as a flexible array member, even though it is not declared as one. The prevalence of struct sockaddr has thrown a spanner into the works of many attempts to better check the uses of array members in structures.
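The mismatch described above can be seen from user space. The following is a minimal sketch, assuming a POSIX system with `<sys/socket.h>` and `<netinet/in.h>`; the helper names `real_addr_size()` and `ipv6_size_via_generic_pointer()` are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>

/*
 * Hypothetical helper: given a generic sockaddr pointer, report how many
 * bytes the object it really points to occupies, based on sa_family.
 * Returns 0 for families this sketch does not know about.
 */
static size_t real_addr_size(const struct sockaddr *sa)
{
    switch (sa->sa_family) {
    case AF_INET:
        return sizeof(struct sockaddr_in);
    case AF_INET6:
        return sizeof(struct sockaddr_in6);
    default:
        return 0;
    }
}

/*
 * Build an IPv6 address, then view it through a struct sockaddr pointer,
 * which is how it would be handed to calls such as bind() or connect().
 */
static size_t ipv6_size_via_generic_pointer(void)
{
    struct sockaddr_in6 addr6;
    const struct sockaddr *generic;

    memset(&addr6, 0, sizeof(addr6));
    addr6.sin6_family = AF_INET6;

    /* The pointer type says "struct sockaddr"; the object is larger. */
    generic = (const struct sockaddr *)&addr6;
    return real_addr_size(generic);
}
```

On common platforms the value returned by `ipv6_size_via_generic_pointer()` is larger than `sizeof(struct sockaddr)`, so code that walks the whole address through the generic pointer necessarily reads past the declared 14-byte `sa_data` array.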

At the end of last year's episode, much of the kernel had been changed to use struct sockaddr_storage (actually implemented as struct __kernel_sockaddr_storage), which has a data array large enough to hold any known network address. An attempt was made to change the definition of struct sockaddr to make its sa_data field into an explicit flexible array member, but that work ran into a minor snag. There are many places in the kernel where struct sockaddr is embedded within another structure. In most of these cases, sa_data is not treated as a flexible array member, so developers have freely embedded struct sockaddr anywhere within the containing structure, often not at the end.

If sa_data is redefined as a flexible array member, the compiler no longer knows how large the structure will actually be. That, in turn, means that the compiler does not know how to lay out a structure containing struct sockaddr, so it guesses and emits a warning. Or, in the case of a kernel build, tens of thousands of warnings. Kernel developers, as it turns out, would rather face the prospect of an array overflow than a warning flood of that magnitude, so this work came to a halt.
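The layout problem can be illustrated without triggering the warning itself. This is a sketch using two hypothetical stand-in structures, `fixed_addr` and `flex_addr` (not kernel types), that compares how many bytes of `sa_data` each one contributes to `sizeof`.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the classic definition: sa_data has a fixed size. */
struct fixed_addr {
    unsigned short sa_family;
    char sa_data[14];
};

/* Stand-in for the flexible variant: sa_data has no declared size. */
struct flex_addr {
    unsigned short sa_family;
    char sa_data[];
};

/* Bytes of sa_data that the compiler accounts for in the fixed layout. */
static size_t fixed_data_bytes_counted(void)
{
    return sizeof(struct fixed_addr) - offsetof(struct fixed_addr, sa_data);
}

/* Bytes of sa_data that the compiler accounts for in the flexible layout. */
static size_t flex_data_bytes_counted(void)
{
    return sizeof(struct flex_addr) - offsetof(struct flex_addr, sa_data);
}

/*
 * If struct flex_addr were embedded in the middle of another structure,
 * for example:
 *
 *     struct container { struct flex_addr addr; int flags; };
 *
 * the compiler would reserve no space for sa_data, so "flags" would sit
 * where the address bytes are expected. Compilers accept this only as an
 * extension and may warn about it.
 */
```

With the fixed definition, the compiler reserves all 14 bytes; with the flexible one, it reserves none, which is why a member placed after the embedded structure cannot be laid out safely.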

One possible solution would be to replace embedded struct sockaddr fields with struct sockaddr_storage, eliminating the flexible array member. But that would bloat the containing structures with memory that is not needed, so that approach is not popular either.

Instead, Cook is working on a patch series that introduces yet another struct sockaddr variant:

    struct sockaddr_unsized {
        __kernel_sa_family_t    sa_family;    /* address family, AF_xxx */
        char                    sa_data[];    /* flexible address data */
    };

Its purpose is to be used in internal network-subsystem interfaces where the size of sa_data needs to be flexible, but where its actual size is also known. For example, the bind() method in struct proto_ops is defined as:

    int	(*bind) (struct socket *sock,
		 struct sockaddr *myaddr,
		 int sockaddr_len);

The type of myaddr can be changed to struct sockaddr_unsized * since sockaddr_len gives the real size of the sa_data array. Cook's patch series does many such replacements, eliminating the use of variably sized sockaddr structures in the networking subsystem. With that done, there are no more uses of struct sockaddr that read beyond the 14-byte sa_data array. As a result, struct sockaddr can be reverted to its classic, non-flexible definition, and array bounds checking can be applied to code using that structure.
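The pattern of pairing an unsized address with an explicit length can be sketched in user-space C. This is not code from the patch series: the `_sketch` types, `copy_addr_data()`, and `demo_copy()` are hypothetical names, and the length is assumed to cover the whole structure including `sa_family`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* User-space stand-in for the kernel's struct sockaddr_unsized. */
struct sockaddr_unsized_sketch {
    unsigned short sa_family;   /* stand-in for __kernel_sa_family_t */
    char sa_data[];             /* flexible address data */
};

/*
 * Hypothetical helper: copy the address payload into dst, using the
 * caller-supplied addr_len (total length, header included) as the bound.
 * Returns the number of payload bytes copied, or -1 if the length is
 * invalid or the payload does not fit into dst.
 */
static int copy_addr_data(const struct sockaddr_unsized_sketch *addr,
                          int addr_len, char *dst, size_t dst_len)
{
    size_t header = offsetof(struct sockaddr_unsized_sketch, sa_data);
    size_t data_len;

    if (addr_len < 0 || (size_t)addr_len < header)
        return -1;
    data_len = (size_t)addr_len - header;
    if (data_len > dst_len)
        return -1;
    memcpy(dst, addr->sa_data, data_len);
    return (int)data_len;
}

/* Build a 4-byte address and copy it into a buffer of dst_capacity bytes. */
static int demo_copy(size_t dst_capacity)
{
    static const char payload[4] = { 127, 0, 0, 1 };
    size_t total = offsetof(struct sockaddr_unsized_sketch, sa_data)
                   + sizeof(payload);
    struct sockaddr_unsized_sketch *addr = malloc(total);
    char out[16];
    int copied;

    if (!addr)
        return -1;
    if (dst_capacity > sizeof(out))
        dst_capacity = sizeof(out);
    addr->sa_family = 2;        /* arbitrary family value for the sketch */
    memcpy(addr->sa_data, payload, sizeof(payload));
    copied = copy_addr_data(addr, (int)total, out, dst_capacity);
    free(addr);
    return copied;
}
```

Here the bound comes entirely from the separate length argument, so the check is only as good as the caller's bookkeeping; nothing in the structure itself records how large `sa_data` is.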

That change is enough to make all of those warnings go away, so many would likely see it as a good stopping point. There is still, though, the matter of all those sockaddr_unsized structures, any of which might be the source of a catastrophic overflow at some point. So, once the dust settles from this work, we are likely to see some attention paid to implementing bounds checking for those structures. One possible approach mentioned in the patch set is to eventually add an sa_data_len field, so that the structure would contain the length of its sa_data array. That would make it easy to document the relationship between the fields with the counted_by() annotation, enabling the compiler to insert bounds checks.
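What such a length-carrying structure might look like is sketched below. The layout, the field types, and the names `sockaddr_counted_sketch`, `SKETCH_COUNTED_BY`, `alloc_addr()`, `get_byte()`, and `demo_get_byte()` are all assumptions made for illustration; the article only says that an sa_data_len field and counted_by() are a possible future direction. The attribute is applied only when the compiler reports support for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Use the counted_by attribute only where the compiler supports it. */
#if defined(__has_attribute)
# if __has_attribute(counted_by)
#  define SKETCH_COUNTED_BY(member) __attribute__((counted_by(member)))
# endif
#endif
#ifndef SKETCH_COUNTED_BY
# define SKETCH_COUNTED_BY(member)
#endif

/* Hypothetical layout: the structure records the size of its own array. */
struct sockaddr_counted_sketch {
    unsigned short sa_family;
    unsigned short sa_data_len;                     /* bytes in sa_data */
    char sa_data[] SKETCH_COUNTED_BY(sa_data_len);
};

/* Allocate a zeroed address with room for data_len payload bytes. */
static struct sockaddr_counted_sketch *alloc_addr(unsigned short family,
                                                  unsigned short data_len)
{
    struct sockaddr_counted_sketch *addr =
        calloc(1, sizeof(*addr) + data_len);

    if (!addr)
        return NULL;
    /* Set the count before any access to sa_data. */
    addr->sa_data_len = data_len;
    addr->sa_family = family;
    return addr;
}

/* Explicit bounds check; returns the byte value, or -1 if out of range. */
static int get_byte(const struct sockaddr_counted_sketch *addr, size_t index)
{
    if (index >= addr->sa_data_len)
        return -1;
    return (unsigned char)addr->sa_data[index];
}

/* Allocate a 4-byte address, read one index, and release the memory. */
static int demo_get_byte(size_t index)
{
    struct sockaddr_counted_sketch *addr = alloc_addr(2, 4);
    int value;

    if (!addr)
        return -2;
    value = get_byte(addr, index);
    free(addr);
    return value;
}
```

With the annotation in place, a compiler that supports it can relate `sa_data` to `sa_data_len` for sanitizer and fortification checks, rather than relying only on hand-written tests like the one in `get_byte()`.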

While the ability to write new code in Rust holds promise for reducing the number of memory-safety bugs introduced into the kernel, the simple fact is that the kernel contains a huge amount of C code that will not be going away anytime soon. Anything that can be done to make that code safer is thus welcome. The many variations of struct sockaddr that have made the rounds may seem silly to some, but they are a part of the process of bringing a bit of safety to an API that was defined over 40 years ago. Ten years of KSPP have made the kernel safer, but the job is far from done.


