Falsehoods programmers believe about TCP

Original link: https://lwn.net/Articles/990281/

The original post is an LWN comment about TCP (Transmission Control Protocol) and what its reliability does and does not guarantee. It replies to a commenter who dropped 'NetworkManager', a utility for managing network connections, in favor of management based on 'wpa_supplicant', a tool for configuring WiFi connections: on flaky wireless links, any packet loss made NetworkManager report "no network" and applications reacted erratically, even though normal TCP recovery would have hidden the loss behind a spike in latency. In response, the author lists common beliefs about TCP that are false at least some of the time. For example, TCP is called reliable, but that does not mean that all transmitted data will reach the destination, nor that sender and receiver can know exactly which bytes made it across. Building such a guarantee with a higher-level application protocol on top of TCP does not work either: it requires something resembling Paxos or Raft, which needs at least three nodes. Networks also do not always behave transparently toward standard protocols, so systems should be designed with non-standard behavior in mind. The follow-up comments add congestion control, noting that opening multiple TCP connections is not always the right answer to low throughput, and mention networks that block Internet Control Message Protocol (ICMP) packets or drop traffic they do not understand. In short, TCP has limits, and developers and network administrators should account for network conditions and anomalies when designing networked applications.


Posted Sep 13, 2024 22:42 UTC (Fri) by NYKevin (subscriber, #129325)
In reply to: NetworkManager or networkd by mathstuf
Parent article: Debating ifupdown replacements for Debian trixie
> FWIW, I dropped NetworkManager years ago for `wpa_supplicant`-based management because I had flaky wireless situations (thick concrete walls in the dorms, roaming across campus, etc.) and any whiff of packet loss would announce to the whole machine "no network" and apps would start to freak out and react. However, it was likely to be back Real Soon™ and normal TCP recovery would make it "transparent" (if with a spike in latency).

Somebody ought to write one of those "falsehoods programmers believe" articles for TCP, because this is just reflective of a broader trend of software that thinks it knows better than TCP, and usually does not. Here, I'll even get the ball rolling (remember, all of the following statements are *false* at least some of the time, but for some of these, perhaps not very often):

1. TCP is reliable, so everything I send will be received by the other end.
2. OK, mostly reliable.
3. OK, fine, it's not reliable (in the above sense of the word), but the sender and recipient will always eventually agree on exactly which bytes made it over the transport.
4. It is possible to create a guarantee analogous to (3) by building some message-oriented application-level protocol on top of TCP, such as HTTP or SMTP.
5. There is such a thing as a TCP packet.
6. There is no such thing as a TCP packet.
7. If we fail to connect to a well-known remote host, then we must be offline.
8. Nagle's algorithm is good.
9. Nagle's algorithm is bad.
10. I don't have to care about Nagle's algorithm.
11. This is all low-level pedantry. I can think of TCP like a two-way Unix pipe that goes over the network, and completely ignore how it is implemented.
12. If the network is transparent to TCP, then it must be transparent to IP.
13. If the network is transparent to HTTP/1.1, then it must be transparent to TCP.
14. Weird networks that are not transparent to standard protocols are an aberration. I can safely ignore them.
15. TCP is implemented in terms of IP.
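
Items 5, 6, and 11 all touch on the fact that TCP hands the application a byte stream without message boundaries, so one `send` does not have to correspond to one `recv`. A minimal sketch of length-prefixed framing, assuming a 4-byte big-endian length header (the helper names `send_msg`, `recv_exact`, `recv_msg`, and `demo` are hypothetical, and the demo uses `socket.socketpair()` as a local stand-in for a TCP connection):

```python
import socket
import struct


def send_msg(sock: socket.socket, payload: bytes) -> None:
    """Send one message as a 4-byte big-endian length followed by the payload."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, looping because recv() may return fewer."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # An empty read means the peer closed the stream.
            raise ConnectionError("stream closed before the full message arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock: socket.socket) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo() -> list:
    # Assumption: a connected stream socket pair stands in for a TCP connection.
    a, b = socket.socketpair()
    try:
        send_msg(a, b"hello")
        send_msg(a, b"world")
        return [recv_msg(b), recv_msg(b)]
    finally:
        a.close()
        b.close()
```

Framing like this only recovers message boundaries. It does not tell the sender whether the last message was actually received.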

Explainer for 1-4: https://en.wikipedia.org/wiki/Two_Generals%27_Problem. TL;DR: If the connection breaks while an ACK is outstanding, the sender will have no way of knowing whether the segment was received, and this turns out to be an insoluble problem no matter how much complexity you pile on top of it. You need something resembling Paxos or Raft to get a guarantee like that, and that always requires a minimum of three nodes, so it can't be built on top of a single two-party TCP stream. See RFC 1047 for an SMTP-specific discussion of this problem (which still applies to modern SMTP, since RFC 2821 says that implementations MUST follow 1047's core advice), but note that some variation of this problem applies to literally every two-party TCP service (and for that matter, every UDP or IP service as well), regardless of how it works or what abstractions it introduces. SMTP is only special in that both sides are explicitly required to care about whether the message was received or not, which is marginally unusual for TCP services (compare and contrast: FTP file uploads, HTTP POST and PUT, etc., most of which omit significant discussion of client retry logic in favor of leaving it up to the application or end user).
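
The ambiguity can be made concrete with a toy model rather than real sockets. The sketch below assumes a single segment and a single ACK, each of which either crosses the link or does not; `Outcome` and `transfer` are hypothetical names used only for illustration:

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    receiver_has_data: bool  # ground truth, invisible to the sender
    sender_saw_ack: bool     # the only thing the sender can observe


def transfer(segment_delivered: bool, ack_delivered: bool) -> Outcome:
    """Model one segment and its ACK crossing a link that may break."""
    receiver_has_data = segment_delivered
    # An ACK exists only if the segment arrived, and counts only if it got back.
    sender_saw_ack = segment_delivered and ack_delivered
    return Outcome(receiver_has_data, sender_saw_ack)


# Two different worlds: the segment was lost, or only its ACK was lost.
segment_lost = transfer(segment_delivered=False, ack_delivered=False)
ack_lost = transfer(segment_delivered=True, ack_delivered=False)
```

In both worlds the sender sees no ACK, yet the receiver's state differs, so the sender cannot decide from its own observations whether a retry would produce a duplicate.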

15 is left as an exercise for the reader (hint: it is primarily of historical interest, but I'm not sure it's possible to entirely rule out modern counterexamples, since we don't know what weird stuff is going on in [any large organization]'s private network).
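
On items 8 to 10: Nagle's algorithm is a per-socket setting, so an application that cares can choose either behavior for each connection. A minimal sketch using the standard `TCP_NODELAY` socket option, which disables Nagle's algorithm when set (the helper names `make_tcp_socket` and `nagle_disabled` are hypothetical):

```python
import socket


def make_tcp_socket(disable_nagle: bool) -> socket.socket:
    """Create an unconnected TCP socket, optionally with Nagle's algorithm off."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    if disable_nagle:
        # TCP_NODELAY=1 asks the stack to send small writes without coalescing.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def nagle_disabled(sock: socket.socket) -> bool:
    """Report whether TCP_NODELAY is currently set on the socket."""
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
```

Whether disabling it helps depends on the traffic pattern: it tends to matter for small, latency-sensitive writes and much less for bulk transfers.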


NetworkManager or networkd

My point is not that there is no set of bytes the parties agree on. My point is that it is not possible for either party to know exactly which bytes are in the consensus set.

(To clarify: It is possible for a party to know the consensus set contains *at least* the first N bytes. It is not possible for either party to know that the consensus set contains *exactly* the first N bytes.)

Or networks that block ICMP, or networks that drop anything they don't understand...

16. I don't need to know anything about congestion control (a sub-category of this one is “If I don't get the speed I want, I should open multiple TCP connections”).
