In this installment on the uses of server power management, we continue the discussion of applying this capability to purposes beyond increasing server rack density.
Intel(r) Data Center Manager (Intel DCM) is a software development kit that provides real-time information for optimizing data center operations. It offers a comprehensive set of publish/subscribe event mechanisms that can form the basis of a sophisticated data center management infrastructure integrating multiple applications: each application is notified of relevant thermal and power events and can apply appropriate policies.
These policies can span a wide range of potential actions. Simple ones include dialing back power consumption to bring it below a reference threshold or to reduce thermal stress on the cooling system. Others can be complex, such as migrating workloads across hosts in a virtualized environment, powering down equipment, or even performing coordinated actions with building management systems.
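The event-and-policy pattern described above can be sketched in a few lines of Python. This is a minimal illustration of publish/subscribe, not the Intel DCM API: the `EventBus` class, the event names, the node name, and the 300 W threshold are all hypothetical, and the policy handlers only record the action they would take.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical event names and threshold, chosen for illustration only.
POWER_EVENT = "power_threshold_exceeded"
THERMAL_EVENT = "inlet_temperature_high"
POWER_LIMIT_WATTS = 300.0


@dataclass
class Event:
    kind: str     # event name that subscribers register for
    node: str     # server that raised the event
    value: float  # measured watts or degrees Celsius


class EventBus:
    """Tiny in-process publish/subscribe dispatcher."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def subscribe(self, kind: str, handler: Callable[[Event], None]) -> None:
        self._handlers[kind].append(handler)

    def publish(self, event: Event) -> int:
        """Deliver the event to every subscriber; return how many were notified."""
        handlers = self._handlers.get(event.kind, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


def make_policies(actions: List[str]):
    """Build two policy handlers that log the action they would take."""

    def cap_power(event: Event) -> None:
        actions.append(
            f"cap {event.node} at {POWER_LIMIT_WATTS:.0f} W (was {event.value:.0f} W)"
        )

    def migrate_workloads(event: Event) -> None:
        actions.append(
            f"migrate workloads off {event.node} ({event.value:.1f} C inlet)"
        )

    return cap_power, migrate_workloads


def run_demo() -> List[str]:
    actions: List[str] = []
    cap_power, migrate_workloads = make_policies(actions)
    bus = EventBus()
    bus.subscribe(POWER_EVENT, cap_power)
    bus.subscribe(THERMAL_EVENT, migrate_workloads)
    bus.publish(Event(POWER_EVENT, "node-17", 342.0))
    bus.publish(Event(THERMAL_EVENT, "node-17", 31.5))
    return actions
```

Calling `run_demo()` returns the two recorded actions in the order the events were published. In a real deployment the handlers would call into the management infrastructure instead of appending to a list, and several applications could subscribe to the same event.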
Intel DCM also provides inlet temperature, or front panel thermals, along with a historical record that can be used to identify trouble spots in the data center. This information provides insights for optimizing the thermal design of the data center. The actions needed to fix trouble spots need not be expensive at all; they may involve no more than relocating a few perforated tiles or installing blanking panels and grommets to minimize air leaks in the raised metal floor. Traditionally, the hardest part has been identifying the trouble spots, which involves time-consuming temperature and air flow measurements. This type of analysis is typically done by a consulting team, and the cost is high: anywhere from $50,000 to $150,000 for a 25,000 square foot data center. It also yields only a single snapshot in time, which becomes gradually more inaccurate as the equipment in the data center is refreshed and reconfigured. Intel DCM provides much of this data ready-made from normal operations.
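As a rough illustration of how a historical inlet temperature record could be mined for trouble spots, the sketch below flags locations whose readings exceed a limit for a large share of the recorded samples. The data layout (a dictionary of location names to lists of readings in degrees Celsius), the function name, the 27 C limit, and the 50% cutoff are all assumptions made for this example; they do not reflect how Intel DCM stores or exposes its data.

```python
from statistics import mean
from typing import Dict, List, Tuple


def find_trouble_spots(
    history: Dict[str, List[float]],
    limit_c: float = 27.0,      # illustrative inlet temperature limit
    min_fraction: float = 0.5,  # illustrative share of samples above the limit
) -> List[Tuple[str, float, float]]:
    """Return (location, mean temperature, fraction over limit) for hot spots.

    A location qualifies when at least `min_fraction` of its readings exceed
    `limit_c`. Results are sorted with the most persistent hot spot first.
    """
    spots = []
    for location, readings in history.items():
        if not readings:
            continue  # no data recorded for this location
        over = sum(1 for temp in readings if temp > limit_c)
        fraction = over / len(readings)
        if fraction >= min_fraction:
            spots.append((location, mean(readings), fraction))
    spots.sort(key=lambda spot: spot[2], reverse=True)
    return spots
```

Because the record accumulates continuously, a check like this can be rerun after each equipment refresh or tile relocation, rather than relying on a one-time survey.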
Deployments can scale from a small business managing a few co-located servers in a shared rack in a multi-tenant environment to organizations managing thousands of servers.
The event handling capability is a software abstraction implemented by the Intel DCM SDK running in a management console. From an architectural perspective, and given that the number of managed nodes can range in the hundreds, it makes more sense to implement this capability in software rather than in firmware. Node Manager, by contrast, is implemented as firmware and typically controls a single server. The choice of an SDK over a standalone management application was also deliberate. Although Intel DCM comes with a reference GUI that can manage a small number of nodes as a standalone application, it shines when used as a building block for higher-level management applications. The integration is done through a Web services interface. Documentation for Intel DCM can be found at http://software.intel.com/sites/datacentermanager/.