Data centre managers and equipment vendors looking for greener alternatives will begin to benefit this year from a major initiative aimed at reducing the power consumed by Ethernet equipment. IEEE 802.3az, the Energy-Efficient Ethernet (EEE) standard, will implement low-power idle (LPI) modes for the full range of Ethernet BASE-T transceivers (100Mbps, 1GbE and 10GbE) and the backplane physical layer standards (1GbE, 4-lane 10GbE and 10GbE).
Computing and network hardware has traditionally been benchmarked on performance, with no clear metrics for energy efficiency. Because the focus of development has been ever-higher performance, power consumption has increased rapidly, particularly since the advent of multi-GHz processors. The US Environmental Protection Agency (EPA) reports that energy usage in data centres doubled between 2000 and 2006 and is predicted to double again by 2011, hence the interest in energy efficiency.
Data centres are built to handle peak loads and often have excess capacity off peak. It's nice to have a lot of server cores available to tackle a big problem when you need them, but it costs a lot of money for power and cooling to keep those servers running constantly.
Clearly there are efficiencies to be gained by being able to operate the compute infrastructure in a manner that scales the power consumption down when the load is lower. Consider, for example, a data centre built to provide stock quotes within seconds. The servers are woefully underutilised when the market is closed. Making the power consumption proportional to the load would allow close to a ten-fold reduction in power consumption as the average utilisation tends to be less than 10% of peak capacity.
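The ten-fold figure follows from simple arithmetic. As a minimal sketch, assuming an idealised server whose power draw scales linearly with load (the function name and the 500 W peak figure are illustrative assumptions, not measured data):

```python
def reduction_factor(average_utilisation: float) -> float:
    """Ratio of always-on power to ideally load-proportional power.

    Assumes an always-on system draws peak power all the time, while an
    ideally proportional system draws peak power scaled by utilisation.
    """
    if not 0 < average_utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return 1.0 / average_utilisation


peak_watts = 500.0   # hypothetical per-server peak draw
utilisation = 0.10   # 10% average utilisation, the upper bound cited above

proportional_watts = peak_watts * utilisation
print(f"always-on: {peak_watts:.0f} W, proportional: {proportional_watts:.0f} W")
print(f"reduction: {reduction_factor(utilisation):.0f}x")
```

Real hardware draws some power even when idle, so the achievable saving is likely to fall somewhat short of this ideal, which is consistent with the "close to" in the estimate.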
Approaching the problem
While the EPA recently defined metrics for server energy efficiency, there was no measure to rate the energy efficiency of network equipment, so the EPA sought input from the Environmental Energy Technologies Division of the Lawrence Berkeley National Laboratory (LBNL), which is supported by the US Department of Energy. Mike Bennett and Bruce Nordman of LBNL chose to bring this task to the IEEE 802 LAN/MAN Standards Committee, which initiated a standardisation project (Project 802.3az).
When the energy efficiency standards activity started in the 802.3 working group, one of the options considered was stepping down the power consumption of Ethernet transceivers (PHYs) in stages when the required data rate was below peak. That idea was abandoned, after much debate, in favour of defining LPI modes and mechanisms to switch rapidly between the full operating speed and the low-power idle mode.
With this approach, the EEE standard will not only improve the efficiency of data centre network equipment, but also provide standardised signalling mechanisms that enable rapid transitions between normal operation and LPI states in the systems on either end of the physical layer link.
This capability is reminiscent of the Wake-on-LAN standard (which defines magic packets that can be sent to remotely wake up a computer from a sleep mode). However, EEE signalling has much lower latency, on the order of 10 microseconds.
Low power idle mode
LAN links generally average less than 10% utilisation and even at peak times, the utilisation does not reach 100%. When data is not being sent, the lower-speed PHYs (10Mbps and 100Mbps) would stop transmitting and hence would automatically consume less power. However, the higher-speed PHYs (1000BASE-T and 10GBASE-T), the ones that are relevant to data centres, continue to transmit actively when there is no data to send, and thereby continue to consume power when idle.
For the lower-speed standards, the new EEE standard (IEEE 802.3az) specifies the transport of LPI signalling from the system on one side of the link to the system on the other side and provides only small incremental savings in the PHY power itself, but it enables rapid adjustments in power modes of the connected devices.
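A rough way to see what is at stake for the higher-speed PHYs is to compare average power with and without a low-power idle state. The sketch below is illustrative only: the function and the power figures are hypothetical placeholders, not values from the standard or any datasheet, and it ignores transition and refresh overhead.

```python
def average_phy_power(active_w: float, idle_w: float, utilisation: float) -> float:
    """Time-weighted average power of a PHY that is busy `utilisation` of the time."""
    if not 0 <= utilisation <= 1:
        raise ValueError("utilisation must be in [0, 1]")
    return utilisation * active_w + (1 - utilisation) * idle_w


ACTIVE_W = 4.0      # hypothetical power while carrying data
LPI_W = 0.4         # hypothetical power in low-power idle
UTILISATION = 0.10  # link utilisation of 10%

# Without LPI, a 1000BASE-T or 10GBASE-T PHY keeps transmitting when idle,
# so its idle power is taken here to equal its active power.
without_lpi = average_phy_power(ACTIVE_W, ACTIVE_W, UTILISATION)
with_lpi = average_phy_power(ACTIVE_W, LPI_W, UTILISATION)

print(f"without LPI: {without_lpi:.2f} W, with LPI: {with_lpi:.2f} W")
```

With these assumed numbers, the idle term dominates the average at low utilisation, which is why cutting idle power matters more than trimming active power on lightly loaded links.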
For 1000BASE-T and 10GBASE-T transceivers, new LPI modes have been defined. Key features are:
1. They allow powering down the transmitters and three of the four receivers in a link when there is no data to send
2. They include a refresh cycle that requires transmission of short training sequences in LPI mode so the PHY parameters (clock tracking at the slave, receiver equalizer coefficients, echo canceller coefficients, crosstalk canceller coefficients, etc.) can be updated and kept current
3. They include the definition of an alert signal that can be used to rapidly wake up a PHY from sleep in the LPI mode
4. They can be initiated either from the local system, by signalling from the MAC or station management, or from the remote system over the PHY link
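The four features above can be pictured as a small state machine. This is a deliberately simplified sketch, not the state diagrams from the standard: the state names, event names and helper functions are all hypothetical labels chosen for illustration.

```python
from enum import Enum, auto


class State(Enum):
    ACTIVE = auto()   # normal data transmission
    QUIET = auto()    # transmitters and most receivers powered down
    REFRESH = auto()  # short training sequence keeps PHY parameters current
    WAKE = auto()     # alert seen, PHY returning to full operation


# (current state, event) -> next state. Event names are hypothetical.
TRANSITIONS = {
    # An LPI request may come from the local MAC/management or the remote end.
    (State.ACTIVE, "lpi_request"): State.QUIET,
    (State.QUIET, "refresh_timer"): State.REFRESH,
    (State.REFRESH, "refresh_done"): State.QUIET,
    (State.QUIET, "alert"): State.WAKE,
    (State.REFRESH, "alert"): State.WAKE,
    (State.WAKE, "wake_done"): State.ACTIVE,
}


def step(state: State, event: str) -> State:
    """Return the next state; events that do not apply leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, state: State = State.ACTIVE) -> State:
    """Apply a sequence of events and return the final state."""
    for event in events:
        state = step(state, event)
    return state


# One idle period: enter LPI, refresh once, then wake on an alert.
final = run(["lpi_request", "refresh_timer", "refresh_done", "alert", "wake_done"])
print(final.name)
```

The point of the refresh loop between the quiet and refresh states is that the PHY never has to retrain from scratch, which is what makes the fast wake described next possible.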
Because of these features, the LPI-to-active state transition can be made in less than 0.001% of the time it takes for the initial link-up of the PHY. During the sleep-to-wake transition, EEE requires that data transmission be held off for the PHY wake time so no data is lost.
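Holding off transmission during the wake time means the sender must buffer whatever arrives in that interval. A back-of-the-envelope sketch, assuming a wake time of 10 microseconds (the order of magnitude quoted earlier for EEE signalling, not a figure taken from the standard) and a hypothetical helper name:

```python
def bytes_to_buffer(link_rate_bps: int, wake_time_ns: int) -> int:
    """Worst-case bytes arriving at line rate during the wake time, rounded up."""
    bit_ns = link_rate_bps * wake_time_ns   # bits per second times nanoseconds
    return -(-bit_ns // (8 * 10**9))        # ceiling division, integers only


WAKE_TIME_NS = 10_000  # assumed wake time of 10 microseconds

for name, rate_bps in [("1 Gb/s", 10**9), ("10 Gb/s", 10**10)]:
    print(f"{name}: {bytes_to_buffer(rate_bps, WAKE_TIME_NS)} bytes")
```

Under this assumption the sketch gives 1250 bytes at 1 Gb/s and 12500 bytes at 10 Gb/s, which suggests the buffering needed to avoid loss during a wake is modest.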
The EEE standard also provides a mechanism to hold off transmission for longer than the specified minimum to allow for extended system wake times, and this can be coordinated across the network using the IEEE 802.1AB Link Layer Discovery Protocol (LLDP). This allows much finer-grained and more flexible cycling between normal and LPI modes. This capability will be a powerful tool for power savings in future data centres.
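One plausible way to picture the extended hold-off is that the receiving system advertises the wake time it needs, and the transmitter waits for the larger of that request and the PHY minimum. The sketch below is an assumption-laden illustration: the function, its parameters and the numbers are hypothetical and do not reproduce the IEEE 802.1AB message formats or the resolution rules in the standard.

```python
def hold_off_us(phy_min_wake_us: float, remote_requested_wake_us: float) -> float:
    """Time the transmitter waits after signalling wake before sending data.

    The hold-off can never be shorter than the PHY's own minimum wake time;
    a remote system that sleeps more deeply may ask for longer.
    """
    if phy_min_wake_us < 0 or remote_requested_wake_us < 0:
        raise ValueError("wake times must be non-negative")
    return max(phy_min_wake_us, remote_requested_wake_us)


PHY_MIN_US = 10.0  # assumed PHY minimum, on the order quoted in the text

# Remote system needs extra time to wake, so the transmitter waits longer.
print(hold_off_us(PHY_MIN_US, 50.0))
# Remote system asks for nothing extra, so the PHY minimum applies.
print(hold_off_us(PHY_MIN_US, 0.0))
```

The trade-off is that a longer hold-off lets the remote system use deeper sleep states at the cost of added latency and more buffering at the sender.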
The EEE standard has been through three cycles of working group balloting and has maintained an approval rating greater than 80%. It is slated to get approval to go to sponsor ballot next month. Full ratification is anticipated before year end.