Self-Protecting Microprocessor Saves Energy

Update: The paper “Active Management of Timing Guardband to Save Energy in POWER7” also earned 2011 Pat Goldberg Memorial Best Paper recognition.

Researchers at IBM’s Austin lab recently earned “Best Paper” at the 44th Annual IEEE/ACM International Symposium on Microarchitecture for “Active Management of Timing Guardband to Save Energy in POWER7” – a new way of managing and reducing microprocessor power consumption.

Charles Lefurgy, a Master Inventor on the team, discusses how this technology works, and how it might affect chips in everything from servers to smart phones. 

How much energy do microprocessors consume in a system – from mobile devices to mainframes? 

Charles Lefurgy: High-performance microprocessors can consume over 200 watts, while the microprocessors found in smart phones may consume only about half a watt.

How is this consumption monitored? 

CL: In IBM POWER7 processor-based systems, a dedicated “EnergyScale” microcontroller measures the power supplied to individual components, such as the microprocessors, disks, memory, and fans. While smart phones and PCs may have dozens of sensors to monitor their operating environments, a POWER7-based system has thousands of physical and virtual sensors to monitor its operation.

For this work, we use a new sensor built into the POWER7 chip called a “critical-path monitor.” It tells us when the chip is experiencing overly conservative conditions (high voltage or low temperature). This knowledge allows us to safely reduce the chip voltage and cut chip power consumption by about 20 percent. Another key sensor we added in POWER7 monitors chip-level performance to guarantee that optimizing for high energy efficiency does not harm performance.

Talk a little bit about the history of microprocessor energy usage, and energy saving breakthroughs. 

CL: Technology changes associated with ever smaller CMOS transistor sizes have had a major impact on reducing the energy used per computation. From a systems viewpoint, dynamic voltage and frequency scaling has been commercially available for over a decade, and is used in everything from smart phones to supercomputers. This reduces power by lowering the supply voltage and clock frequency used by the microprocessor when the processing demand falls.
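As a rough illustration of why scaling voltage and frequency together saves power, the sketch below uses the textbook approximation that dynamic CMOS power is proportional to switched capacitance times voltage squared times frequency. The function name, the capacitance value, and both operating points are assumptions chosen for illustration; they are not POWER7 measurements, and leakage power is ignored.

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Textbook approximation P = C * V^2 * f (dynamic power only, no leakage)."""
    return capacitance_f * voltage_v ** 2 * frequency_hz


# Hypothetical values, assumed purely for illustration.
C_EFF = 1e-9  # effective switched capacitance in farads

high = dynamic_power(C_EFF, 1.2, 4.0e9)  # busy: 1.2 V at 4.0 GHz
low = dynamic_power(C_EFF, 1.0, 3.0e9)   # light load: 1.0 V at 3.0 GHz

# Fraction of dynamic power saved by moving to the lower operating point.
saving = 1 - low / high
print(f"{high:.2f} W -> {low:.2f} W, saving {saving:.0%}")
```

Because voltage enters as a square, even a modest voltage reduction has a larger effect on power than the same relative reduction in frequency alone.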

We have also seen many micro-architecture improvements in how instructions are processed within the microprocessor. One example is speculating on the outcome of a computation so that the next one can begin before the previous one ends. This reduces the time that the microprocessor sits idle (using power) while waiting for data to become available for processing.

How does your use of the “Critical Path Monitors” technology further improve energy savings in microprocessors? 

CL: Today, the voltage level used to operate a microprocessor is overly conservative. The reason is that a large amount of engineering margin, in the form of additional voltage, is used to protect the microprocessor from worst-case conditions that may occur in the real world. For example, data centers may overheat or there may be an unexpected workload with high power consumption – both of which may cause the chip’s voltage level to droop. If the voltage becomes too low, electric signals cannot propagate through the chip in time, and it fails.

The new capability we’ve demonstrated safely removes some of this voltage margin. Operating at a lower voltage level reduces the chip’s power consumption, and this saving comes on top of what traditional techniques already provide.

The key to our work is to use the Critical Path Monitor’s (CPM) precision sensing of the time it takes for the circuits within the chip to complete a computation. If that timing changes due to a voltage droop, then the chip can react very quickly to protect itself.

The protection is enabled by a new element in POWER7 called a digital phase-locked loop (DPLL). It continuously watches the CPM and reacts within nanoseconds, temporarily slowing the chip clock frequency so that the chip can keep operating at the reduced voltage and avoid a failure. Once we had this safety mechanism in place, we modified the EnergyScale microcontroller to look for opportunities to lower a chip’s voltage when it experienced excessive voltage margins. Voltage is lowered until the DPLL just begins to lower the frequency to keep the chip operating correctly.
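The voltage search described above can be sketched as a simple loop. This is a toy model under stated assumptions, not IBM's firmware: `dpll_frequency`, `find_operating_voltage`, the nominal clock, the step size, and the single "minimum safe voltage" threshold are all hypothetical stand-ins for the real CPM, DPLL, and EnergyScale behavior. The sketch also stops at the last setting that still ran at full clock, which is one simplified reading of "until the DPLL just begins to lower the frequency."

```python
NOMINAL_FREQ_GHZ = 3.6  # hypothetical target clock, assumed for illustration


def dpll_frequency(voltage_mv, min_safe_mv):
    """Toy stand-in for the CPM + DPLL pair: hold the nominal clock while the
    voltage leaves timing margin, otherwise slow the clock proportionally."""
    if voltage_mv >= min_safe_mv:
        return NOMINAL_FREQ_GHZ
    return NOMINAL_FREQ_GHZ * voltage_mv / min_safe_mv


def find_operating_voltage(start_mv, min_safe_mv, step_mv=10):
    """Lower the voltage step by step until the modeled DPLL would begin to
    throttle, then keep the last setting that still ran at the nominal clock."""
    voltage_mv = start_mv
    while voltage_mv - step_mv > 0:
        candidate_mv = voltage_mv - step_mv
        if dpll_frequency(candidate_mv, min_safe_mv) < NOMINAL_FREQ_GHZ:
            break  # the DPLL just began to lower the frequency; stop here
        voltage_mv = candidate_mv
    return voltage_mv


# Start with generous margin (1200 mV) on a chip that, in this toy model,
# needs at least 1055 mV to sustain the nominal clock.
chosen_mv = find_operating_voltage(1200, 1055)
print(chosen_mv)
```

In this model the threshold is fixed; on real hardware it shifts with temperature and workload, which is why the DPLL must keep watching the CPM continuously rather than relying on a one-time search.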

An additional benefit of using a lower voltage is that the microprocessor is cooler. This allows the server fans to run quieter, at a reduced speed and power consumption. 

Can it be applied to microprocessors across all types of systems and devices? 

CL: Every microprocessor applies additional voltage margin to eliminate harmful voltage droops. I expect that any system from mobile devices to supercomputers could use our method to save energy. 

Is this a redesign that will take time to realize or something that can be implemented in today's systems and devices? 

CL: Our method requires new sensors, and the DPLL clocking, to be built into the microprocessor. Today, only the POWER7 processor has this capability. 

How much energy savings could be realized – and would it change system performance? 

CL: Our results on an IBM Power 755 system achieved a power reduction of about 20 percent for the microprocessor and 18 percent for the entire server.

One way this could be used in practice is to increase the server density in data centers. Data centers often have power limitations for the amount of equipment that can be installed in a rack. Therefore, installing these energy-saving servers could allow a data center to install 18 percent more servers within the same power budget, and achieve 18 percent higher performance.
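The density argument is simple arithmetic on a fixed power budget, sketched below. The rack budget and per-server draw are hypothetical round numbers, and only the 18 percent saving comes from the interview. With these assumed figures the server count rises by a little more than 18 percent, because dividing a fixed budget by a draw that is 18 percent lower scales the count by 1 / (1 - 0.18).

```python
def servers_within_budget(rack_budget_w, server_power_w):
    """Whole servers that fit in a fixed rack power budget."""
    return rack_budget_w // server_power_w


RACK_BUDGET_W = 50_000   # hypothetical rack power limit
BASE_SERVER_W = 1_000    # hypothetical per-server draw before the saving
SAVING_PERCENT = 18      # whole-server saving quoted in the interview

# Integer arithmetic keeps the example free of floating-point rounding.
reduced_server_w = BASE_SERVER_W * (100 - SAVING_PERCENT) // 100

before = servers_within_budget(RACK_BUDGET_W, BASE_SERVER_W)
after = servers_within_budget(RACK_BUDGET_W, reduced_server_w)
print(before, after)  # prints: 50 60
```

Real racks are also limited by space and cooling, so the achievable gain depends on more than the power budget alone.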

We observed no change in performance for industry-standard server benchmarks. Our control algorithms adjust voltage rapidly enough to hold performance tightly at the desired level.

What would this energy savings mean for a mobile device or other consumer products? 

CL: For mobile devices, the battery life could be extended at today's performance level. However, the microprocessor is a small part of the energy consumption of a mobile device compared to the display. Therefore, the energy savings would be less, percentage-wise, compared to servers.

Another benefit is that the microprocessor requires less cooling due to running at a lower voltage. You could imagine using this in game consoles, laptops, and PCs where you don’t want loud fan noise. 

When and in what industry might we see this technology first utilized? 

CL: Critical Path Monitors are actively used today in IBM Power 775 supercomputers to detect a loss of voltage level. But currently, rather than reducing the clock frequency, a different method to protect the system is used – and is the subject of a pending conference publication.

We expect that our on-demand reduction of microprocessor voltage margin, using CPMs and DPLLs, will appear in future IBM Power Systems starting in 2013.

How many patents were granted for CPM? 

CL: IBM holds 11 granted patents related to CPM, including the circuitry used to synthesize a critical path and the circuitry to finely control the clock frequency output by the DPLL. Another seven patent applications have been filed. They’re for things such as methods to select the chip voltage level using the CPM and DPLL. 

The co-authors on the research paper are IBMers Alan Drake, Michael Floyd, Malcolm Allen-Ware, Bishop Brock, Jose Tierno, and John Carter.
