Taking the Heat: How OCP Is Approaching Thermal Design Challenges in Next-Generation Data Centers
With the continual growth of data-rich applications, hyperscale data centers have been put under enormous strain. Network traffic has increased significantly within the data center, causing architects to search for new ways to achieve higher data rates and throughput.
The current state-of-the-art network interface controller (NIC) operates at 200G per port. Now, to meet the increased demands on the data center, the industry is advancing toward the use of 400G NICs. But this evolution requires a series of advancements in related and supporting technologies at the same rate – no small feat.
In our upcoming presentation at this year’s Open Compute Project (OCP) Global Summit, we will drill down into the thermal challenges that come with this transition and the unique ways our collaborative work group is approaching them.
Thermal Challenges at 400G
This transition to 400G NICs introduces a variety of thermal challenges into next-generation data centers.
The first challenge we face is the increase in power dissipation that comes with higher data rates. Through extensive research, experimentation, and simulation, we have found that heat generation grows faster than linearly with data rate: doubling the data rate increases system heat by more than a factor of two. The result? The move from 200G to 400G NICs will drive significant increases in system heat.
The second challenge comes from the infrastructure needed to support 400G NICs. Unlike 200G NICs, which use passive direct-attach cables (DACs), 400G NICs can sometimes require high-power active optical cables (AOCs) to support these data rates. These AOCs, which can dissipate upwards of 8W, introduce their own heat into the system, on top of the heat generated by moving data at these rates.
Questioning the Infrastructure
These impending thermal challenges have led us to question the viability of certain components in the infrastructure of the current NIC environment. In collaboration with NVIDIA and Meta, we have begun to more thoroughly investigate this challenge.
A major focus of our investigation was form factor. Specifically, we investigated the viability of the OCP NIC 3.0 industry-standard small form factor (SFF) to see how it compared with the proposed tall SFF (TSFF). It is well known that the TSFF allows for more space and hence a better input/output (I/O) thermal solution, but ideally, system architects can continue with the SFF where possible. The real question is, does the SFF offer a viable solution for 400G NICs, or do we need to move to the TSFF as the industry standard?
The answer to this question is far from straightforward because several compounding variables can affect the conclusion. For this reason, our study considered many factors that could significantly impact thermal performance. These include:
- Form Factor: TSFF vs. SFF
- NIC ASIC Power Limitation (DAC cable only)
- Module Type: QSFP-DD Type 1 vs. Type 2A
- Monitor Location: Mean top back shell temperature, heatsink base temperature, and nose temperature
- Test Fixture: With vs. without test fixture
- Aisle Type: Cold aisle vs. hot aisle
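To give a sense of how quickly these factors multiply, the list above can be sketched as a full-factorial case matrix. This is only an illustration: the dictionary keys and the `build_cases` helper are hypothetical names, the study may not have run every combination, and monitor location is more likely a probe point read from each run than a separate simulation. The NIC ASIC power limit is left out because it is treated here as a result rather than an input.

```python
from itertools import product

# Factor levels taken from the list above. The keys are illustrative
# names chosen for this sketch, not identifiers from the study.
FACTORS = {
    "form_factor": ["SFF", "TSFF"],
    "module_type": ["QSFP-DD Type 1", "QSFP-DD Type 2A"],
    "monitor_location": ["top back shell (mean)", "heatsink base", "nose"],
    "test_fixture": [True, False],
    "aisle": ["cold", "hot"],
}


def build_cases(factors):
    """Return one dict per combination of factor levels (full factorial)."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


cases = build_cases(FACTORS)
print(len(cases))  # 2 * 2 * 3 * 2 * 2 = 48 combinations
print(cases[0])
```

Even this small matrix yields 48 combinations before airflow velocity is swept, which is part of why agreeing on a common set of boundary conditions matters.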
Simulation Setup and Assumptions
Because every degree Celsius bears on the viability conclusion, our simulation needed to represent a realistic and reasonable scenario.
To this end, our simulation modeled an OCP NIC 3.0 card in both the TSFF and SFF form factors. NVIDIA generously provided the thermal model of the simulated ASIC, the ConnectX-6 Dx, for our study. To simulate the ASIC, we assumed an upper power limit of 23W, and we modeled the device equipped with a standard aluminum heatsink.
For the QSFP-DD module, we used a multilane thermal model with a conservative power consumption of 10.2W. As with the ASIC, we chose to model a standard aluminum heatsink for the QSFP-DD that maximizes the covered heated surface area but does not employ any advanced cooling techniques or materials, as the intent was to understand the relative impact of the variables highlighted above.
For our simulation environment, we tested both hot aisle and cold aisle environments. The hot aisle was represented by an ambient temperature of 55°C, an air velocity range of 200 to 1000 linear feet per minute (LFM), and airflow direction from back to front (all per the OCP NIC 3.0 specification). In contrast, the cold aisle was modeled by an ambient temperature of 35°C, an air velocity range of 200 to 600 LFM, and airflow direction from front to back.
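For readers who want to reproduce these boundary conditions in their own tooling, the two environments can be captured in a small data structure. This is a minimal sketch: the `AisleCondition` class, its field names, and the 200 LFM sweep step are assumptions made for illustration, and only the temperatures, velocity ranges, and airflow directions come from the setup described above. The unit conversion uses 1 ft/min = 0.00508 m/s.

```python
from dataclasses import dataclass

LFM_TO_M_PER_S = 0.00508  # 1 ft/min = 0.3048 m / 60 s


@dataclass(frozen=True)
class AisleCondition:
    """Boundary conditions for one simulated environment (illustrative)."""
    name: str
    ambient_c: float
    min_lfm: int
    max_lfm: int
    airflow: str

    def velocities_m_per_s(self, step_lfm=200):
        """Sweep the velocity range and convert LFM to m/s.

        The 200 LFM step is an assumption, not a value from the study.
        """
        sweep = range(self.min_lfm, self.max_lfm + 1, step_lfm)
        return [round(v * LFM_TO_M_PER_S, 3) for v in sweep]


HOT_AISLE = AisleCondition("hot", 55.0, 200, 1000, "back-to-front")
COLD_AISLE = AisleCondition("cold", 35.0, 200, 600, "front-to-back")

for aisle in (HOT_AISLE, COLD_AISLE):
    print(aisle.name, aisle.ambient_c, aisle.velocities_m_per_s())
```

Keeping the conditions in one immutable structure makes it harder for two studies to differ silently in ambient temperature or airflow direction.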
As shown in Figure 1, our simulation employed the NVIDIA OCP NIC 3.0 test fixtures, including two identical cards installed inside the test chamber.
Figure 1. The test fixture and model setup used in our simulation.
Investigation Findings: The Impact of Form Factor
Our simulation gave us insight into how several boundary conditions and variables have a non-negligible impact (i.e., greater than a couple of degrees Celsius) on thermal performance.
The first notable result from our investigation was the significant role that form factor plays in the thermal performance of the QSFP-DD module. As shown in Figure 2, we found that TSFF has significantly better thermal performance than SFF, especially at lower airflows. In this scenario, the improvement in thermal performance was as much as 6°C. Although this result is not surprising, a margin of 6°C is significant.
Figure 2. TSFF was found to offer improved thermal performance over SFF in our simulation.
By the same token, our findings showed that when using TSFF in hot aisle applications, the thermal performance of the ASIC improved by as much as 10°C. Further, the power limit of the NIC ASIC (in passive DAC applications) rose by approximately 2.5W with TSFF compared with SFF in hot aisle conditions.
Investigation Findings: Other Variables
Beyond form factor, our investigation yielded insights into the impact of module type and monitor location on thermal results. When comparing the industry-standard QSFP-DD Type 1 modules with the QSFP-DD Type 2A modules, our results showed superior performance for Type 2A, improving thermal performance by approximately 4°C. This improvement is largely because the Type 2A QSFP-DD has an externally integrated heatsink on the nose of the module itself. Again, this is not a surprising result, but it is significant nonetheless.
Finally, we found that temperature readings differ between the monitor locations considered (i.e., the point on the module that is being probed). For example, our simulations showed that monitoring temperature at the heatsink base can yield results 5°C lower than monitoring at the module nose. As Figure 3 shows, monitor location is clearly a non-negligible consideration when quantifying the thermal performance of NIC modules.
Figure 3. The monitor location used has a significant impact on thermal results.
Our investigation provided insight into how certain variables and boundary conditions affect thermal performance, but the results themselves are not the major conclusion. More important than any finding about which setups are “reasonably representative of actual environments” is the need the study exposed: the industry must reach consensus on these variables and boundary conditions.
Take variables such as module type and monitor location, for example. Our results show that module type can have a significant impact (≈4°C) on thermal performance. This finding raises the question: Instead of ruling out SFF as a viable form factor for 400G, is it possible to keep SFF but switch to a Type 2A QSFP-DD? Currently, the industry has not reached consensus on module type; until that consensus is defined and agreed upon, no genuine conclusion on SFF viability can be reached.
Similarly, the industry currently has no agreed-upon standard for monitor location. Our investigation showed that where we monitor thermal performance can have a significant impact (up to 5°C) on simulation results. If we cannot agree on monitor location, then the lack of uniformity between studies will make it impossible to truly compare results. Again, for OCP and the industry as a whole to move toward 400G NICs, consensus must first be achieved.
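To illustrate why an unagreed monitor location makes results hard to compare, the sketch below applies the roughly 5°C offset reported above to a single simulated module. The temperature limit and the nose reading are hypothetical numbers chosen only for illustration; neither comes from our study, and `verdicts` is an illustrative helper name.

```python
# From the findings above: the heatsink-base reading can be about
# 5 °C lower than the reading taken at the module nose.
BASE_BELOW_NOSE_C = 5.0

# Hypothetical values for illustration only; neither is from the study.
ASSUMED_LIMIT_C = 70.0         # assumed module temperature limit
ASSUMED_NOSE_READING_C = 73.0  # assumed simulated nose temperature


def verdicts(nose_c, limit_c, offset_c=BASE_BELOW_NOSE_C):
    """Return pass/fail per monitor location for the same module."""
    readings = {"nose": nose_c, "heatsink base": nose_c - offset_c}
    return {loc: temp <= limit_c for loc, temp in readings.items()}


result = verdicts(ASSUMED_NOSE_READING_C, ASSUMED_LIMIT_C)
print(result)  # {'nose': False, 'heatsink base': True}
```

Under these assumed numbers, the same module fails when probed at the nose and passes when probed at the heatsink base, so two studies could reach opposite viability conclusions from identical hardware.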
Call to Action
How will we reach that critical industry consensus? We believe increased multidisciplinary participation from module, I/O, NIC, system, and data center architects is required. This collaboration will help OCP better align on what is achievable and determine the most appropriate environment in which to conduct these viability studies moving forward. Further, because the research to date is not all-encompassing, additional variables must be considered, including the viability of QSFP-DD active electrical cables (AECs), which are anticipated to dissipate less heat than AOCs. If the industry finds that the SFF is not viable for use with AOCs, the next step forward may be to use AECs instead. Additionally, if we are moving toward the TSFF NIC form factor, we will need to expand our study to cover the viability of the octal SFF pluggable riding heatsink (OSFP-RHS) ports as well.
Collaboration will be critical to achieving thermal design consensus, and OCP will play a vital role. Molex is honored to have worked with Meta and NVIDIA to study these next-generation solutions. By collaborating to design the test protocol and carefully running simulations to quantify the impact of each identified variable, we are working together to analyze the results and look for ways to reach new levels of performance as data center requirements heat up.