Continuing on the theme of measuring Data Centre efficiency - the power consumption of the facilities and the IT load is only one element, albeit a large one, that contributes to the overall efficiency of a data centre. Ultimately a DC has to deliver useful workload, and the amount of workload that can be achieved within a given physical DC is an increasing challenge. Lowering server power and increasing the cooling effectiveness of a DC are two of several ways to enable more equipment to be installed into an existing facility.
General consensus seems to be that the servers in many data centres do not always run at maximum utilisation - many are in the 10-15% utilisation range. This results from many IT shops following a policy of hosting one workload (application) per server and sizing the server to support worst-case usage of that workload, which leads to low average utilisation of the servers. There are several approaches that can be taken to increase server utilisation:
Consolidating several applications that have different utilisation profiles onto the same server - this is not perfect, as a problem with one application could impact the others on that server, causing significant business impact
Deploying virtualisation within the DC - this enables multiple OS/App instances to be run on the same server. There are multiple benefits here: server utilisation increases whilst the number of servers can potentially be decreased, reducing the overall electrical power consumption of the DC and consequently the utility bill. Another aspect of virtualisation is that, to achieve the highest levels of consolidation, it is best to deploy the latest generation high perf/low power servers. This can result in the removal of many older generation high power servers from the Data Centre and the deployment of a smaller number of newer, more power efficient servers
There are circumstances where virtualisation may not be appropriate and it is necessary to retain one workload per server. In this case an increase in the workload capacity of a DC can be achieved by replacing older, smaller servers with the latest generation high performance servers. This can enable the workload capacity of a DC to be significantly increased without building a new DC. Again, the side benefit here is that latest generation servers consume less power than the older servers they are replacing.
There are many different ways in which the workload capacity (and hence utilisation) of a DC can be increased; with care, most can also result in a reduction in the electrical power consumed by the DC.
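As a rough illustration of the consolidation arithmetic above, here is a minimal sketch in Python. The function name, the parameters and the figures in the usage lines (100 servers, 12.5% average utilisation, a 50% target, a new host with twice the capacity) are all assumptions chosen for illustration and do not come from any measured DC. It also ignores memory, I/O and peak-load effects, which matter a great deal in practice.

```python
import math


def hosts_needed(n_servers, avg_utilisation, target_utilisation, capacity_ratio=1.0):
    """Estimate how many hosts could carry the combined load of n_servers.

    avg_utilisation    -- average utilisation of the existing servers (0..1)
    target_utilisation -- utilisation we are willing to run each new host at (0..1)
    capacity_ratio     -- hypothetical capacity of one new host relative to
                          one existing server (1.0 = same capacity)
    """
    if not 0 < target_utilisation <= 1:
        raise ValueError("target_utilisation must be in (0, 1]")
    if not 0 <= avg_utilisation <= 1:
        raise ValueError("avg_utilisation must be in [0, 1]")
    if capacity_ratio <= 0:
        raise ValueError("capacity_ratio must be positive")

    # Total work, expressed in "fully busy existing servers".
    total_load = n_servers * avg_utilisation
    # Work one new host absorbs when run at the target utilisation.
    load_per_host = target_utilisation * capacity_ratio
    # Round away floating point noise before rounding up to whole hosts.
    return math.ceil(round(total_load / load_per_host, 9))


if __name__ == "__main__":
    # Illustrative figures only (assumptions, not measurements).
    print(hosts_needed(100, 0.125, 0.5))                      # same-size hosts
    print(hosts_needed(100, 0.125, 0.5, capacity_ratio=2.0))  # hosts twice as capable
```

With these assumed figures, 100 servers at 12.5% could in principle be consolidated onto 25 same-size hosts run at 50%, or 13 hosts if each new host has twice the capacity. The real number depends on how the workload peaks overlap.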
Given the right tools the utilisation of servers within a DC is 'relatively' easy to measure, so this element of DC effectiveness can be quantified. There is another major element that I believe contributes to the effectiveness of a DC - the processes that are in place to manage the DC, and hence the way a DC can respond to the new challenges placed on it by a business unit. Gartner have an infrastructure maturity model that is useful for quantifying how effective a DC is in responding to business needs; it looks at responsiveness, Service Level Agreements, IT processes etc. Currently I do not believe many DC managers are measuring how effective their DC is in terms of process, and when asked to judge where they sit within a model like Gartner's, many IT managers will judge themselves more efficient than they really are.
Are there other areas that contribute to the efficiency of a DC? I would be interested in your feedback.
An area where a vast amount of money and energy could be saved is by a concerted IT industry effort to do away with the need for air conditioning of computer rooms and data centres altogether. The exact percentage of computer room and data centre energy consumed by the air con depends on who you talk to, but it is somewhere around 40% to 60% of the total.
BT has recently gone public with a statement that it is running its hosting centres with fresh air cooling.
"To provide some sense of scale, BT’s data centres, the largest in Europe, use close to 0.7% of the total power output for Great Britain. The traditional formula for computer cooling is that for every 1 kilowatt used to actually run a computer, you need 1.2 kilowatts of power to cool it. With fresh air cooling as pioneered by BT, refrigeration becomes unnecessary for a majority of the year, reducing those costs by approximately 85%. BT also runs its servers hotter, extending parameters to between 5 and 40 degrees Celsius, further reducing cooling costs. Presently, BT has 107 data centre sites using fresh air cooling as well as 5600 telephone exchanges."
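The figures in that quote can be checked against the 40% to 60% range mentioned earlier with a little arithmetic. The sketch below is only a back-of-envelope model. It assumes that the 85% reduction in "costs" applies directly to cooling power, and that IT load and cooling are the only two consumers, ignoring power distribution losses, lighting and so on. The function and parameter names are my own.

```python
def cooling_breakdown(it_kw, cooling_kw_per_it_kw, cooling_reduction=0.0):
    """Return (cooling_kw, total_kw, cooling_share) for a simple two-part model.

    it_kw                -- power drawn by the IT equipment itself
    cooling_kw_per_it_kw -- kW of cooling needed per kW of IT load
    cooling_reduction    -- fraction by which cooling power is cut (0..1);
                            assumed here to map one-to-one onto the quoted
                            reduction in cooling costs
    """
    cooling_kw = it_kw * cooling_kw_per_it_kw * (1.0 - cooling_reduction)
    total_kw = it_kw + cooling_kw
    return cooling_kw, total_kw, cooling_kw / total_kw


if __name__ == "__main__":
    # Traditional formula from the quote: 1.2 kW of cooling per 1 kW of IT load.
    _, total_before, share_before = cooling_breakdown(1.0, 1.2)
    # Same load with cooling cut by approximately 85%.
    _, total_after, share_after = cooling_breakdown(1.0, 1.2, cooling_reduction=0.85)

    print(f"cooling share, traditional: {share_before:.1%}")
    print(f"cooling share, fresh air:   {share_after:.1%}")
    print(f"total power saved:          {1 - total_after / total_before:.1%}")
```

Under these assumptions cooling is about 54.5% of the total with the traditional formula, which sits inside the 40% to 60% range, and about 15.3% with the reduction applied. Total power per kilowatt of IT load falls from 2.2 kW to about 1.18 kW, a saving of roughly 46%.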
The remarkable findings of a huge Google study, in a white paper published early last year, show that disk drive failures are INCREASED by lower temperatures, and that the optimum ambient temperature for minimum failure rates is somewhere around 35 to 40 deg. C.
http://research.google.com/archive/disk_failures.pdf
In our UK office our company runs our 6 main network servers right in the open office, in specialist soundproofed cabs with fans in them to recirculate the office air. That office has never turned on its heating in the winter, and on a hot summer day they just open the windows. The building has no aircon. They have never had a hardware problem since the servers were put there four years ago.
Why do we persist in thinking that servers have to be operated only at 20 C or so? Because radical change of this sort is the hardest thing for an IT manager to execute on. He is judged on uptime, uptime and uptime. The status quo works. The server manufacturers continue to state maximum operating temperatures of 35 C, and Sun, for example, states a 22 C optimum in its best practice guide. How much of that is based on an unwillingness to risk bad PR and warranty claims by going into untested territory?
The server makers have a responsibility to work on testing that territory and improving their recommendations accordingly. If their servers really are intolerant of periodic exposure to higher temperatures, as might happen using fresh air cooling in a country with hot summers, they should get far more R&D resource onto fixing that. Think of the coup of being the first to market with servers that don't need cooling.
If people are still afraid of downtime at higher temperatures, they should do as Google does. Get the focus off the possibility of a hardware failure and put it onto super-sophisticated fail-over software and systems that make hardware failures more or less irrelevant to uptime.
Our society collectively put a man on the moon nearly 40 years ago. That society now requires that the IT industry drastically cuts its power consumption for reasons plain to all. I submit that if the right companies sat around the right table, and worked collectively, the need for computer room and data centre air con could be virtually eliminated. It cannot be beyond the wit of Man, or the leviathan IT industry, if responsibility is allowed to drive away technical inertia.
Tim Walsh
CEO
http://www.kellsystems.co.uk