Monday, June 25, 2012 · Posted by peterkrey at 8:32 PM
Finally, a Reason to Be Thankful for Airline Delays
The AMD Roadrunner server project was conceived at the end of the first Open Compute Summit, in Palo Alto in June 2011. Thanks to an airline’s global system-wide outage, Grant Richard, Matthew Liste and I were marooned overnight at SFO. During those 8-10 hours we talked a great deal about the many potential use cases and applications of Open Compute technologies and thought leadership. Some time before, we had reviewed many topics, including:
- Open Compute's data center power and cooling efficiencies
- Geographic locations
- Server and power distribution rack modules
The one that I recall being front and center was OCP's open motherboards and server chassis.
For context, Grant and I have known one another for many years through our prior team participation and technical leadership in building two of the largest HPC scale-out compute grids in financial services. The size and scope of these compute grids are confidential, but they were not far off in scale from those of the Web 2.0 community.
In the past we needed to strip motherboards of unnecessary proprietary components, open up and simplify management software, maximize hands-free management, and so on, in order to make them work efficiently for us. This was similar to Facebook’s early experiences with server OEMs. Despite our numerous attempts in the past to influence design, none of the server providers listened to our needs. Although the resulting designs were not ideal, we maximized power efficiency and automated system management as best we could.
What jumped out at us last summer at the OCP summit was that for the first time the non-hyperscale world could access many of the same design points, ODMs, simplifications and "freedom of choice" advantages indigenous to the hyperscale Web 2.0 world. This open access led to an evolution in thinking.
Building a Team to Design the Square Peg to Fit a Round Hole
In the summer of 2011, we (Fidelity Investments and Goldman Sachs) shared ideas and observations about how to test and benchmark the OCP servers for HPC and cloud infrastructure. At the time the biggest issues were:
- How to get the OCP servers to fit into traditional legacy data centers, specifically traditional 19" racks. Long story short, we could not.
- How to expand ODM participation, Open Compute server availability, and access.
By September of 2011 we discussed expanding our collaboration to include other firms and colleagues across financial services, and in October held an initial meeting to seek out those also interested in participating.
In the fall of 2011, in collaboration with Fidelity, Goldman Sachs hosted an exploratory meeting with 10 other financial services firms to gauge interest in forming a working group around Open Compute-enabled compute hardware and management. The vast majority of firms agreed there was great value in collaborating around Open Compute.
A Roadrunner Born from a Lunch Napkin
At the October 2011 NYC Open Compute Summit we ran into Walt Cataldo and Bob Ogrey of AMD. Over the years I had heard Bob's name mentioned as a key contributor to numerous hyperscale motherboard designs for a number of significant Web 2.0 firms, and I felt very fortunate to meet him.
Walt and Bob invited me and a lead architect from Fidelity Investments to lunch to talk about Open Compute. Our conversation concentrated on our huge interest in OCP and working through some of the aforementioned hurdles and limitations.
There was nothing negative; we were just working through how to get things to fit into traditional data centers and how to enable multi-vendor access to OCP server technologies. By the end of the lunch meeting, Bob had created an open motherboard design outline literally on a turkey sandwich napkin and committed to formalizing it into a design specifications document. That's when the Roadrunner project was born.
At the end of 2011 we received the initial Roadrunner design document and in early 2012 we received an impressive update that evolved and expanded to include CAD diagrams. At a high level, the key design points for Roadrunner include:
- Create and enable an open source x86 motherboard project
- Be a universal motherboard, in terms of size, being able to fit in both a legacy 19" data center rack and an OCP data center rack
- Closely manage ODMs and only allow component selections that maximize reliability and power efficiency; do not allow a $0.50 savings to undermine quality, efficiency and reliability.
- Approach, enable and collaborate with as many ODMs and integrators as possible
- Maximize optionality and flexibility of add-on silicon choices such as 10GbE chipset selection via "mountable modules", not PCIe ... "mini-card / daughter-card silicon add on"
- Open Standard Baseboard Management Controller (BMC) in sync with Grant Richard's Open Compute Open Hardware Management initiative
- Fit across a wide variety of non-proprietary 1U, 1.5U, 2U, 3U and 4U mechanical stamped sheet metal enclosures supporting both 2.5" and 3.5" disk drives
- Minimize semiconductor and on-board silicon to reduce both purchasing and operating costs
- Be a universal motherboard, in terms of functionality, supporting 70-80% of target enterprise infrastructure use cases, including:
  - HPC grid computing
  - Cloud server IaaS & PaaS nodes
  - Web application utility infrastructure server for Linux distributions
  - Developer tools utility infrastructure servers
  - Web application server developer platforms
  - Multi-hypervisor next generation VDI
  - Open source SQL servers
  - Big Data scale-out servers
  - NoSQL server nodes
  - Cloud scale-out distributed object store nodes
  - Data caching
  - Content management
  - Content search
  - Application servers for numerous proprietary software stacks, market data infrastructure, and so forth.
In April 2012, version 1.0 of the Roadrunner document was distributed by AMD to a group of interested firms for their review. All were invited to a series of conference calls designed to solicit their feedback.
Twelve firms were invited and only one formally declined. Eight firms provided detailed feedback on the document. In a nutshell, the feedback was all positive and constructive; no showstoppers were identified!
Which brings us to today, soon after the third Open Compute Summit.
Whew ... so that's the very long short story on the Roadrunner project's history!
Peter Krey is a consultant for Fidelity.