In spite of significant gains in server energy efficiency, power consumption in data centers is still trending up. At the very least, we can make sure that the energy expended yields maximum benefit to the business. A first step in managing power in the servers in a data center is having a fairly accurate monitoring capability for power consumption. The second step is to have a number of levers that allow using the monitoring data to carry out an effective power management policy.
While we may not be able to stem the overall growth of power consumption in the data center, there are a number of measures we can take immediately:
- Implement a peak shaving capability. The data center power infrastructure needs to be sized to meet the demands of peak power. Reducing peaks effectively increases the utilization of the existing power infrastructure.
- Be smart about shifting power consumption peaks. Not all watts are created equal. The incremental cost of generating an extra watt during peak consumption hours is much higher than that of the same watt generated in the wee hours of the morning. For most consumers and smaller commercial accounts, flat-rate pricing still prevails. Real-time pricing (RTP) and negotiated SLAs will become more common as a way to put the appropriate economic incentives in place. The incentive of real-time pricing is a lower overall energy bill, although that outcome is not guaranteed: in pilot programs, residential consumers have complained that RTP results in higher electricity costs. With negotiated SLAs the customer can designate a workload to be subject to lower reliability; for instance, instead of three 9s (99.9 percent, or outages amounting to about 9 hours per year), the low-reliability workload can be designated as only 90 percent reliable, and can be out for an average of about 2.4 hours per day.
- Match the electric power infrastructure in the data center to server workloads to minimize over-provisioning. This approach assumes the existence of an accurate power consumption monitoring capability.
- Upgrading the electrical power infrastructure to accommodate additional servers is not an option in most data centers today. Landing additional servers at a facility that is working at the limit of its thermal capacity leads to the formation of hot spots, and that assumes electrical capacity limits are not reached first, with no room left in certain branch circuits. Hence measures that work within the existing power infrastructure are to be preferred over alternatives that require additional infrastructure.
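The reliability figures quoted in the SLA example above are rounded; the underlying arithmetic is simple enough to check directly. The following is a minimal sketch, with a hypothetical helper name (`downtime_hours`) and a 365-day year assumed for simplicity.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours; leap years are ignored (assumption)
HOURS_PER_DAY = 24


def downtime_hours(availability: float, period_hours: float) -> float:
    """Expected hours of outage over a period, given availability as a fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_hours


# "Three 9s" workload, expressed per year
three_nines_per_year = downtime_hours(0.999, HOURS_PER_YEAR)
# Low-reliability workload at 90 percent, expressed per day
low_tier_per_day = downtime_hours(0.90, HOURS_PER_DAY)

print(f"99.9% available: {three_nines_per_year:.2f} hours of outage per year")
print(f"90% available:   {low_tier_per_day:.2f} hours of outage per day")
```

Three 9s works out to a little under nine hours per year, and 90 percent to a little under two and a half hours per day.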
For the purposes of data center strategic planning, it may make economic sense to grow large data centers in a modular fashion. If the organization manages a number of data centers, consider making effective use of the existing data centers, and when new construction is justified, redistribute the workloads to the new data center to maximize the use of the new electrical supply infrastructure.
Intel has built into its server processor lineup a number of technology ingredients that allow data center operators to optimize the utilization of the available power system infrastructure in the data center.
Newer servers of the Nehalem generation are much more energy efficient, if only because of the side effect of increased performance per watt. These servers also have a more aggressive implementation of power proportional computing. Typical idle consumption figures are on the order of 50 percent of peak power consumption.
Beyond passive mechanisms that do not require explicit operator intervention, the Intel® Intelligent Power Node Manager (Node Manager) technology allows adjusting the power draw of a server and trading off power consumption against performance. This capability is also known as power capping. The control range is a function of server loading. For the Intel SR5520UR baseboard in the 2U chassis, the server will draw about 300 watts at full load, and its power consumption can be rolled down to about 200 watts. The control range tapers down gradually until it reaches zero at idle.
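To make the idea of a load-dependent control range concrete, here is a minimal sketch. It assumes the range tapers linearly from about 100 watts at full load (300 W down to 200 W) to zero at idle; the text only says the taper is gradual, so the linear shape is an assumption. The function names are hypothetical and are not part of the Node Manager interface.

```python
def capping_range_watts(utilization: float, max_range_watts: float = 100.0) -> float:
    """Watts that can be shaved off at a given load.

    Assumption: a linear taper from max_range_watts at full load
    (utilization = 1.0) down to zero at idle (utilization = 0.0).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return max_range_watts * utilization


def effective_cap(requested_cap: float, uncapped_draw: float, utilization: float) -> float:
    """Clamp a requested cap to what the server can actually honor."""
    floor = uncapped_draw - capping_range_watts(utilization)
    # A cap below the floor cannot be met; a cap above the draw has no effect.
    return min(max(requested_cap, floor), uncapped_draw)


# Full load: 300 W uncapped, so a 150 W request bottoms out at the 200 W floor.
print(effective_cap(150.0, 300.0, 1.0))
# Idle: no control range, so the cap cannot reduce the draw at all.
print(effective_cap(100.0, 150.0, 0.0))
```

The practical consequence is that a policy engine should not count on capping to reclaim power from lightly loaded servers; the headroom is available mostly when servers are busy.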
For power monitoring, selected models of the current Nehalem generation come with PMBus-compliant power supplies that allow real-time power consumption readouts.
The Node Manager power monitoring and capping capability applies to a single server. To make this capability really useful, it is necessary to exercise it collectively across groups of servers, and to add the notion of events and the ability to build a historical record of power consumption for the servers in a group. These additional capabilities have been implemented in software through the Data Center Manager Software Development Kit developed by the Intel Solutions and Software Group. An additional Software Development Kit, Cache River, allows programmatic access to components in servers and server building blocks produced by the Intel Enterprise Products Server Division (EPSD), including the baseboard management controller (BMC) and the management engine (ME), the subsystems that host or interact with the Node Manager firmware. EPSD products are incorporated in many OEM and system integrator offerings.
Data Center Manager implements abstractions that apply to collections of servers:
- A hierarchical notion of logical server groups
- Power management policies bound to specific server groups
- Event management and a publish/subscribe facility for acting upon and managing power and thermal events.
- A database for logging a historical record for power consumption on the collection of managed nodes.
The abstractions implemented by Data Center Manager (DCM) on top of Node Manager allow the implementation of power management use cases that involve up to thousands of servers.
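The four abstractions listed above can be illustrated with a small, self-contained sketch. This is not the Data Center Manager SDK API; every name here (`Node`, `Group`, `EventBus`, `check_policies`) is hypothetical, and the policy is reduced to a simple power budget per group for the sake of illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class Node:
    name: str
    power_watts: float  # latest power reading for this server


@dataclass
class Group:
    """A logical server group; groups can nest to form a hierarchy."""
    name: str
    nodes: List[Node] = field(default_factory=list)
    children: List["Group"] = field(default_factory=list)
    budget_watts: Optional[float] = None  # policy bound to this group, if any

    def total_power(self) -> float:
        own = sum(n.power_watts for n in self.nodes)
        return own + sum(c.total_power() for c in self.children)

    def walk(self) -> Iterator["Group"]:
        yield self
        for child in self.children:
            yield from child.walk()


class EventBus:
    """Minimal publish/subscribe facility for power events."""

    def __init__(self) -> None:
        self._handlers: List[Callable[[str, float, float], None]] = []

    def subscribe(self, handler: Callable[[str, float, float], None]) -> None:
        self._handlers.append(handler)

    def publish(self, group: str, power: float, budget: float) -> None:
        for handler in self._handlers:
            handler(group, power, budget)


def check_policies(root: Group, bus: EventBus, history: List[Tuple[str, float]]) -> None:
    """Log each group's power and publish an event when a budget is exceeded."""
    for group in root.walk():
        power = group.total_power()
        history.append((group.name, power))  # stand-in for the logging database
        if group.budget_watts is not None and power > group.budget_watts:
            bus.publish(group.name, power, group.budget_watts)


rack_a = Group("rack-a", nodes=[Node("a1", 280.0), Node("a2", 260.0)], budget_watts=500.0)
rack_b = Group("rack-b", nodes=[Node("b1", 210.0), Node("b2", 190.0)], budget_watts=500.0)
room = Group("room", children=[rack_a, rack_b], budget_watts=1000.0)

alerts: List[str] = []
history: List[Tuple[str, float]] = []
bus = EventBus()
bus.subscribe(lambda group, power, budget: alerts.append(group))
check_policies(room, bus, history)
print(alerts)
```

In this made-up example only rack-a exceeds its budget, while the room as a whole stays within its own. That is the kind of distinction a hierarchy of groups makes possible: a subscriber could respond by capping the servers in that one rack rather than the whole room.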
If this topic is of interest to you, please join us at the Intel Developer Forum in San Francisco at the Moscone Center on September 22-24. I will be facilitating course PDCS003, "Cloud Power Management with the Intel® Xeon® 5500 Series Platform." You will have the opportunity to talk with some of our fellow travelers who are developing power management solutions using Intel technology ingredients, and to get a feel for their early experience. Also, please make a note to visit booths #515, #710 and #712 to see demonstrations of the early end-to-end solutions these folks have put together.