The OCP Community Prepares for Deployment of Sustainable Large Scale AI Clusters
The expected uptake of AI to assist humans in the workplace has created a perfect storm of investment in data center computational infrastructure. Major analyst firms have adjusted their capex spend forecasts upwards, with the bulk of the spending to start in 2024. Underlying this expected spend is the number of use cases, many still being discovered, that rely on Large Language Models (LLMs) and are driving the deployment of large scale AI computational clusters. The enormous challenge in front of the industry is to provide the interconnect bandwidth this will require while lowering the power consumption curve of today’s interconnect technology. The solution will certainly include moving from electrical to optical interconnects wherever possible. For that reason the Open Compute Project (OCP) has a full day of talks on optical connectivity at the upcoming 2023 OCP Global Summit, with a morning session focused on next generation optical connectivity for large scale AI clusters, and an afternoon session on future generation in-server short reach optical interconnects.
The morning session on optics for large scale AI clusters tackles the scale-out issues of connecting AI clusters. These clusters will include a very large number of specialized AI and ML nodes that need to be interconnected to share large volumes of data at very high speed, with very low latency and minimal energy consumption. The next generation of optical networking will be a critical component needed to build these AI clusters. This breakout session will focus on the expected architectures for AI clusters and their interconnection requirements, the advances in optical networking we can expect in the next generation of products that will be deployed to meet the challenge, and the expected improvements in energy consumption.
The afternoon session is where many startup companies will explain their optical technology that we might see within servers and connecting servers to in-rack memory and I/O. Specifically, the talks cover xPU-xPU and xPU-memory links, server disaggregation, cache coherent interconnects, and CXL/PCIe over optical. The use of photonics for short reach interconnects offers the promise of solving many problems that limit compute system performance, including the memory wall, I/O limitations, power constraints, and server disaggregation. This session will focus on use cases that tackle these limitations, where requirements are quite different from those of more traditional networking applications. The session will also address promising technologies to meet these application requirements. We will have end-user, hyperscale DC operator, and HPC representatives discussing use cases, applications, requirements, and optical technology, as well as component firms discussing how their approaches can meet these needs.
You can engage with the latest developments at the upcoming 2023 OCP Global Summit, taking place October 17 to 19 in San Jose, CA. There are two relevant tracks dealing with optics:
- The Special Focus: Optical Connectivity for AI Clusters Breakout Sessions take place on the morning of Wednesday, October 18, starting at 8am, in 210AE of the SJCC Concourse Level. The session will be kicked off by Andy Bechtolsheim, OCP Board Member.
- As part of the Future Technologies Symposium, the Special Focus: In-server short reach optical interconnects takes place all of Wednesday afternoon, October 18, in SJCC - Lower Level - LL20D.
See the full schedule and register here: https://www.opencompute.org/summit/global-summit