April 16, 2025 — Pradeep Sindhu is the founder of the network equipment company Juniper Networks, and his standing in the field of computer network architecture is beyond question. At the beginning of last year, Sindhu was appointed a Technical Fellow at Microsoft and Corporate Vice President of its Silicon business. At Microsoft, his task is to develop a network card that better supports Microsoft's in-house Maia AI server chip, thereby reducing Microsoft's dependence on NVIDIA's GPUs. It was therefore natural for OFC to invite him as a keynote speaker to talk about the network architecture requirements of AI data centers. His talk was titled "Upgrading Data Centers Without Increasing Interface Speeds".
The core of Sindhu's OFC keynote was that, for AI interconnect architectures, 200 Gbps per lane is sufficient, and there is no need to chase higher speeds for their own sake. This view stood apart from that of many large companies at the show that are pursuing higher single-lane rates.
Why is 200 Gbps sufficient? Sindhu argued that the AI data center interconnect is different from earlier point-to-point connection systems: its main requirements are low latency, low cost, high reliability, and high energy efficiency. In the scale-up domain it now competes with copper-cable solutions and must deliver higher bandwidth density and higher radix. Whether for scale-up, scale-out, or data-center interconnect, the core internal mechanism is the exchange of packets, and the core connectivity pattern is multipoint-to-multipoint. Under these conditions, upgrading to 400 Gbps per lane is not only unnecessary but harmful.

There were parts of Sindhu's argument I did not fully grasp; interested readers can rewatch his talk. Roughly, his point is that pushing the single-lane speed further reduces the switch radix, which increases the number of switching tiers and in turn causes switching congestion; it also raises the difficulty of implementing the physical layer. The physical-layer difficulty is easy to understand; the rest may require more familiarity with switch technology. But the conclusion is that 200 Gbps per lane is a very good choice today: the speed is high enough to keep SerDes latency down, and it meets users' requirements for cost and power consumption.
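The radix-versus-tiers tradeoff can be made concrete with a back-of-envelope sketch. This is my own illustration, not from Sindhu's slides: the 51.2 Tbps ASIC budget, the one-lane-per-port simplification, and the folded-Clos host-count formula `2 * (k/2)**t` are all assumptions chosen for clarity.

```python
# Back-of-envelope sketch (my assumptions, not Sindhu's figures):
# with a fixed switch-ASIC SerDes budget, doubling the per-lane speed
# halves the radix, which can force an extra switching tier for the
# same cluster size.

def radix(asic_bw_gbps: int, lane_gbps: int) -> int:
    """Ports on a switch ASIC with a fixed SerDes bandwidth budget,
    assuming one lane per port (a simplification)."""
    return asic_bw_gbps // lane_gbps

def tiers_needed(endpoints: int, k: int) -> int:
    """Minimum fat-tree tiers to attach `endpoints` hosts with radix-k
    switches: a t-tier folded Clos supports about 2 * (k/2)**t hosts."""
    t = 1
    while 2 * (k // 2) ** t < endpoints:
        t += 1
    return t

ASIC_BW = 51_200  # e.g. a 51.2 Tbps switch ASIC
for lane in (200, 400):
    k = radix(ASIC_BW, lane)
    print(f"{lane}G lanes -> radix {k}, "
          f"tiers for a 32k-endpoint cluster: {tiers_needed(32_768, k)}")
```

With these assumptions, 200G lanes give a radix of 256 and a two-tier fabric for 32k endpoints, while 400G lanes halve the radix to 128 and push the same cluster to three tiers: more hops, more latency, and more opportunities for congestion, which is the shape of the argument as I understood it.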
Sindhu and Andy Bechtolsheim of Arista are of the same generation. Both spent years working to break Cisco's dominance: they studied network architecture on their own and built switches, and Sindhu also built chips and network cards. Andy has drawn mixed reviews in the industry in recent years for advocating LPO. He did not give an on-stage talk this year as he did last year; I only saw him from a distance, sitting alone at his booth, and wondered what he was thinking.
The greatness of OFC lies in the fact that we can always hear these different voices, and that there are such people and companies worthy of respect. Sindhu's talk included this passage: "In 1992, David Clark of MIT, one of the Internet's chief protocol architects, said, 'We reject kings, presidents, and voting. We believe in rough consensus and running code.' In 2012, I said, 'We reject committees and paper designs. We believe in rough consensus and chips that work.'" For the development of AI architecture, we hope to hear more voices like these.