Google opens Falcon, a reliable low-latency hardware transport, to the ecosystem

Original link: https://cloud.google.com/blog/topics/systems/introducing-falcon-a-reliable-low-latency-hardware-transport

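Google recently introduced Falcon, a reliable, low-latency hardware transport. Falcon combines five key concepts, Carousel, Snap, Swift, PLB, and CSIG, to deliver low latency in high-bandwidth but lossy Ethernet data center networks. At its lower layers, Falcon relies on fine-grained hardware-assisted round-trip time (RTT) measurement, hardware-enforced traffic shaping, and fast, accurate packet retransmission, combined with multipath-capable, PSP-encrypted connections, to provide high reliability and low latency. It is compatible with upper-layer protocols (ULPs) such as InfiniBand Verbs RDMA and NVMe, and adds flexible ordering semantics and graceful error handling to meet the needs of warehouse-scale applications. Falcon's hardware and software are co-designed to meet the rapidly growing demand for high bandwidth, high message rate, and low latency in areas such as AI and machine learning. Industry partners, including the Ultra Ethernet Consortium, Intel, Arista, Cisco, Juniper, and Marvell, have expressed enthusiasm about Falcon's potential for future Ethernet data center infrastructure. To learn more about Falcon, see Google's OCP Summit presentation at 11:45 AM. Google plans to contribute the Falcon specification to the Open Compute Project in the first quarter of 2024.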

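This article discusses Falcon, a hardware transport protocol developed by Google to address the problems that high-performance computing workloads face on traditional Ethernet networks. Alongside RoCEv2, SRv2, and SRD, Falcon aims to reduce latency for large-scale, latency-sensitive networked applications in data center and AI environments. Some critics, however, argue that these efforts serve specific workloads within the industry rather than fostering open standards, and that compared with existing standards such as TCP/IP, gRPC, and HTTP/2, Falcon looks like a step backward. Still, it gives developers more options as they look beyond traditional approaches, which are increasingly ill-suited to emerging workloads. Overall, Falcon marks a significant departure from today's de facto standards and underscores the potential for continued fragmentation and heterogeneity across technical infrastructure, platforms, and processes.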

Original text

The lower layers of Falcon use three key insights to achieve low latency in high-bandwidth, yet lossy, Ethernet data center networks. Fine-grained hardware-assisted round-trip time (RTT) measurements, flexible per-flow hardware-enforced traffic shaping, and fast and accurate packet retransmissions are combined with multipath-capable and PSP-encrypted Falcon connections. On top of this foundation, Falcon has been designed from the ground up as a multi-protocol transport capable of supporting ULPs with widely varying performance requirements and application semantics. The ULP mapping layer not only provides out-of-the-box compatibility with InfiniBand Verbs RDMA and NVMe ULPs, but also includes additional innovations critical for warehouse-scale applications such as flexible ordering semantics and graceful error handling. Last but not least, the hardware and software are co-designed to work together to help achieve the desired attributes of high message rate, low latency, and high bandwidth, while maintaining flexibility for programmability and continued innovation.
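
To make the interplay between RTT measurement and per-flow traffic shaping more concrete, here is a minimal Python sketch of a delay-based rate controller feeding a pacer. It is an illustration only and not Falcon's actual algorithm: the class name FlowPacer, the additive-increase and multiplicative-decrease rule, and every constant (target RTT, rate bounds, step sizes) are assumptions made for this example, and real hardware would do this per connection in the NIC rather than in software.

```python
from dataclasses import dataclass


@dataclass
class FlowPacer:
    """Hypothetical per-flow state: delay-based rate control plus pacing."""
    target_rtt_us: float = 50.0       # assumed target fabric RTT
    rate_gbps: float = 10.0           # current allowed sending rate
    min_rate_gbps: float = 0.1
    max_rate_gbps: float = 100.0
    additive_step_gbps: float = 0.5   # increase applied when RTT is at or under target
    max_decrease: float = 0.5         # cap on the fractional decrease per sample
    next_send_us: float = 0.0         # earliest time the next packet may depart

    def on_rtt_sample(self, rtt_us: float) -> None:
        """Adjust the rate from one RTT measurement."""
        if rtt_us <= self.target_rtt_us:
            self.rate_gbps += self.additive_step_gbps
        else:
            # Back off in proportion to how far the RTT overshoots the target.
            overshoot = (rtt_us - self.target_rtt_us) / rtt_us
            self.rate_gbps *= 1.0 - min(overshoot, self.max_decrease)
        self.rate_gbps = max(self.min_rate_gbps,
                             min(self.rate_gbps, self.max_rate_gbps))

    def schedule(self, now_us: float, packet_bytes: int) -> float:
        """Return the packet's departure time and advance the pacing clock."""
        depart_us = max(now_us, self.next_send_us)
        # bits / (Gbit/s) gives nanoseconds; divide by 1000 for microseconds.
        gap_us = packet_bytes * 8 / (self.rate_gbps * 1000.0)
        self.next_send_us = depart_us + gap_us
        return depart_us
```

In this sketch, each RTT sample nudges the flow's rate up or down, and schedule() spaces packets so the flow never exceeds that rate. A packet offered before the pacing clock allows it is simply assigned a later departure time instead of being sent in a burst.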

Falcon reflects the central role that Ethernet continues to play in our industry. Falcon is designed for predictable high performance at warehouse scale, as well as flexibility and extensibility. We look forward to working with the community and industry partners to modernize Ethernet to serve the networking requirements of our AI-driven future. We believe that Falcon will be a valuable addition to the other ongoing efforts in this space.

Industry perspectives

Our partners across the industry are enthusiastic about the promise that Falcon holds for developing the next generation of Ethernet.

“We welcome Google’s contribution of Falcon as it shares the Ultra Ethernet Consortium’s vision to drive Ethernet as the best data center fabric for AI and HPC, and look forward to continuing industry innovations in this important space.” - Dr. J Metz, Chair, Ultra Ethernet Consortium (led by AMD, Arista, Broadcom, Cisco, Eviden, Hewlett Packard Enterprise, Intel, Meta, Microsoft, and Oracle)

“Falcon is first available in the Intel IPU E2000 series of products. The value of these IPUs is further enhanced as the first instance of an Ethernet transport to add low tail latency and congestion handling at scale. Intel is a Steering Member of Ultra Ethernet Consortium, which is working to evolve Ethernet for high performance AI and HPC workloads. We plan to deploy the resulting standards-based enhancements in future IPU and Ethernet products.” - Sachin Katti, SVP & GM, Network and Edge Group, Intel

“We are pleased to see a high-performance transport protocol for critical workloads such as AI and HPC that works over standard Ethernet/IP networks and enables massive application bandwidth at scale.” - Hugh Holbrook, Group VP, SW Eng., Arista Networks

“Cisco is pleased to see the contribution of Falcon to the OCP. Cisco has long supported open standards and believes in broad ecosystems. The rate and scale of modern data center networks and particularly AI/ML networks is unprecedented, presenting a challenge and opportunity to the industry. Falcon addresses many of the challenges of these networks, enabling efficient network utilization.” - Ofer Iny, Cisco Fellow, Cisco

“Juniper is a strong supporter of open ecosystems, and therefore we are pleased to see Falcon being opened to the OCP community. Falcon allows Ethernet to serve as the data center network-of-choice for demanding workloads, providing high-bandwidth, low tail latency and congestion mitigation. Falcon provides the industry with a proven solution today for demanding AI & ML workloads.” - Raj Yavatkar, Chief Technology Officer, Juniper

“Marvell strongly supports and is committed to the open Ethernet ecosystem as it evolves to support emerging, demanding workloads such as AI. We applaud the contribution of Falcon to OCP and welcome Google sharing practical experiences with the industry.” - Nick Kucharewski, SVP & GM Network Switching Group, Marvell

Learn more

Networking is a foundational component in building the sustainable, secure, scalable societal infrastructure that we need for this AI-driven future. To learn more about Falcon, join us for the OCP Summit presentation, “A Reliable and Low Latency Ethernet Hardware Transport” by Google’s Nandita Dukkipati at 11:45am at the Expo Hall. We’ll contribute the Falcon specification to OCP in the first quarter of 2024.

To learn more about Google’s contributions to the Open Compute Project and our presence at the OCP Global Summit, check out the blog “How we’ll build sustainable, scalable, secure infrastructure for an AI-driven future”.
