
Intel Goes Green(er)

Posted by L_Wigle Jan 30, 2008

On Monday, the EPA announced that Intel now sits atop its Green Power Partner list, designating Intel as the largest purchaser of green power in the US. The purchase was significant as it represented the largest single purchase in the history of the program, which dates back more than ten years. Since the announcement, the press, blogs, and environmental pundits have commented on the significance of this purchase as a demonstration of Intel's eco-responsibility, while emphasizing the positive impact it may have toward driving greater demand for and supply of renewable energy.


The questions many inside Intel have been asking are along the lines of, "What are the implications and relevance of the Green Power announcement to our products and technology, and specifically to the Green IT trend?" The answer is twofold. First, by purchasing renewable energy credits covering over 40% of Intel's projected US electricity requirements for our facilities and factories, customers can be assured Intel is taking real action to reduce our impact on the environment as we design and produce our products.


The second answer is one of role modeling: making decisions based on environmental impact and sustainability. Intel brings to the table an example for data center operators, who can drive direct cost reductions through greater energy efficiency and improve the sustainability of their operations by considering renewable energy.


This may not be as direct a call to action as the Climate Savers Computing Initiative, or as well defined as the Green Grid's BKMs, but we can all agree that more green power usage and availability at competitive cost is good for business and the environment. As PSO pointed out in his ISMC opening keynote, Intel must lead and be a leader. The Green Power purchase is a good example of the "Impact" we can make as a company through responsible actions and citizenship.


So, what are you doing along the 'Green' lines?

OK, nothing is free, but some things are a pretty good deal. I spoke last time about the capacity boost delivered through virtualization. I threw out some big numbers, so here is a bit more detail. More accurately, this capacity comes from applying virtualization to a new model for data center management (you will have to do more than install a hypervisor). I felt pretty conservative with my 5x multiplier in five years.

 

Even if all you ever read is the in-flight magazine, you know virtualization is a big deal. Hype aside, virtualization is the foundation for realizing the next-generation data center (NGDC). Utilization on enterprise servers is pathetic. The number I used was 15%, but I have heard many customers talk of 5% or even less. The target I used for a super-efficient data center was 75% utilization, hence the 5x.
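As a quick sanity check on that 5x, the arithmetic is just the ratio of target to current utilization, using the 15% and 75% figures quoted above (everything else here is illustrative):

```python
# Napkin check on the consolidation multiplier quoted above. Assumes capacity
# scales linearly with utilization and ignores virtualization overhead.
current_utilization = 0.15   # typical enterprise server today
target_utilization = 0.75    # goal for a highly efficient, virtualized data center

print(f"capacity multiplier: {target_utilization / current_utilization:.1f}x")  # -> 5.0x
print(f"starting from 5%:    {target_utilization / 0.05:.0f}x")                 # -> 15x
```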

 

Getting to 75% average utilization will take a lot more than simple consolidation of physical servers onto a virtualized server. This is why I jump to the NGDC requirement. In reality, server utilization is all over the place, with odd spikes and many differences in where the bottleneck is. Capacity limitations can be in CPU, memory, disk, or network.

 

 

The key to maximizing consolidation is achieving what I call "Dynamic Resource Management" (DRM), or sometimes Dynamic Resource Pooling. DRM is what moves the NGDC beyond simple consolidation to policy-based balancing of data center resources. In the DRM model, a server becomes a virtual collection of compute, storage, and network resources. This model is beginning to emerge in commercial offerings from VMware, Microsoft, Sun, Cisco, Virtual Iron, and others.
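For the visually inclined, here is a minimal sketch of what that abstraction might look like in code: a physical host is just a capacity record plus the list of virtual servers drawing on it. The class and field names are mine for illustration, not any vendor's API.

```python
# Hypothetical sketch of the DRM abstraction: a "server" becomes a named
# bundle of compute, storage, and network resources drawn from a pool.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Resources:
    cpu_cores: int
    memory_gb: int
    disk_gb: int
    network_gbps: float

@dataclass
class VirtualServer:
    name: str
    allocated: Resources
    host: Optional[str] = None   # physical host currently backing this VM

@dataclass
class PhysicalHost:
    name: str
    capacity: Resources
    vms: List[VirtualServer] = field(default_factory=list)

    def cpu_utilization(self) -> float:
        """Fraction of this host's cores currently allocated to VMs."""
        return sum(vm.allocated.cpu_cores for vm in self.vms) / self.capacity.cpu_cores

host = PhysicalHost("hostA", Resources(16, 64, 2000, 10.0))
host.vms.append(VirtualServer("web1", Resources(4, 8, 100, 1.0), host="hostA"))
print(f"hostA CPU utilization: {host.cpu_utilization():.0%}")   # -> 25%
```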

 

 

The trick here is to couple the ability to move a VM from one set of hardware to another (as with VMotion from VMware) with policy-based moves. In my view this makes data center efficiency "just" another logistics optimization problem, not unlike airline scheduling or package delivery: a game to maximize utilization, minimize energy use, maximize availability, gracefully handle exceptions, and meet all my SLAs. In other words, a really hard problem. I have tried to capture this journey to the NGDC in a compelling graphic, but all my attempts seem to fall short. (Thinly veiled request for better pictures of the NGDC.)
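To make the logistics framing concrete, here is a deliberately naive sketch of one piece of that game: a greedy pass that moves VMs off hosts running above a utilization ceiling onto the least-loaded host with headroom. The data layout, thresholds, and migrate() stub are all hypothetical; a real DRM engine would also weigh energy, availability, and SLA constraints as described above.

```python
# Hypothetical greedy rebalancer illustrating policy-based VM moves.
# hosts: {host_name: {"capacity": cores, "vms": {vm_name: cores_used}}}

def utilization(host):
    return sum(host["vms"].values()) / host["capacity"]

def migrate(vm, src, dst, hosts):
    """Stand-in for a live-migration call (e.g., something like VMotion)."""
    hosts[dst]["vms"][vm] = hosts[src]["vms"].pop(vm)
    print(f"move {vm}: {src} -> {dst}")

def rebalance(hosts, ceiling=0.75):
    """One greedy pass: move the smallest VM off any host above the ceiling."""
    for src, host in hosts.items():
        while utilization(host) > ceiling and host["vms"]:
            vm, cores = min(host["vms"].items(), key=lambda kv: kv[1])
            # pick the least-loaded destination that can absorb the VM
            candidates = [
                (utilization(h), name) for name, h in hosts.items()
                if name != src
                and (sum(h["vms"].values()) + cores) / h["capacity"] <= ceiling
            ]
            if not candidates:
                break  # nowhere to put it; a real engine would queue or alert
            _, dst = min(candidates)
            migrate(vm, src, dst, hosts)

hosts = {
    "hostA": {"capacity": 16, "vms": {"web1": 6, "db1": 8}},   # ~88% loaded
    "hostB": {"capacity": 16, "vms": {"web2": 2}},             # ~13% loaded
}
rebalance(hosts)   # -> move web1: hostA -> hostB
```

A real scheduler treats this as a constrained bin-packing problem rather than a single greedy pass, which is exactly why I compare it to airline scheduling.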

 

 

For now, achieving the NGDC requires complex software stacks coupled with management heroics. Intel, IMHO, has the best roadmap and view of this future, as shown in the addition of virtualization features across compute, storage, and network. I would like to hear from others where you see barriers and bridges to the NGDC. Who are the rabbits leading the way to this dynamic data center?

Here's the second follow-up post in my 10 Habits of Great Server Performance Tuners series. This one focuses on the second habit: Start at the Top.

 

Let me start by relating a true (although simplified) story. My team at Intel has built up years of expertise running a particular benchmark. So when the time came to start running a new, similar benchmark, we thought: "No problem." We began running tests while the benchmark was still in development. Immediately we had an issue: the type of problem that would normally indicate our hardware environment wasn't set up properly. We checked everything that we had seen cause the issue in the past, and we couldn't find anything. So, we blamed the new benchmark. After all, we were experts, and we had been setting up these environments for years! We knew what we were doing. You can probably guess where this story is going: after weeks of working around the "benchmark issue", we figured out that we had misconfigured the environment, resulting in a bottleneck on one part of our testbed. We hadn't thoroughly tested that part of the environment because it had never caused us problems with the old benchmark. And of course, on the new benchmark it was critical. We had broken one of the most important rules of performance tuning: Start at the Top.

 

 

So now you know how easy it can be to not Start at the Top. Even seasoned performance engineers can get overconfident and forget this rule. But the consequences can be dire:

 

  • You have to eat major crow when you realize your mistake. I'm just now getting over the humiliation.

  • You might have put tunings in place to address issues that weren't really there. This is at best wasted work and at worst something that you have to painstakingly undo when you fix the real issue.

 

So...how do you avoid this situation? Simple: use the top-down performance tuning process. This means you start by tuning your hardware. Then you move to the application/workload, then to the micro-architecture (if possible). What you are looking for at each level are bottlenecks: situations where one component of the environment or workload is limiting the performance of the whole system. Your goal is to find any system-level bottlenecks before you move down to the next level. For example, you may find that your network bandwidth is bottlenecked and you need to add another NIC to your server, or that you need to add another drive to your RAID array, or that your CPU load is being distributed unevenly. Any bottleneck involving your server system hardware (processors, memory, network, HBAs, etc.), attached clients, or attached storage is a system-level bottleneck. Find these by using system-level tools (which I will touch on in a future post for Habit #8), remove them, then proceed to the application/workload level and repeat the process.
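As a rough illustration of what a system-level first pass might look like (the actual tools are the subject of Habit #8), here is a small script that samples CPU and memory and flags the obvious suspects. It assumes the third-party psutil package is installed, and the thresholds are arbitrary examples, not recommendations:

```python
# Rough first-pass system-level check: flag obvious suspects before moving
# down to application-level tuning. Thresholds are illustrative only.
import psutil

def system_level_check(sample_seconds=5):
    per_core = psutil.cpu_percent(interval=sample_seconds, percpu=True)
    mem = psutil.virtual_memory()

    findings = []
    if max(per_core) > 90 and min(per_core) < 20:
        findings.append("CPU load is unevenly distributed across cores")
    if sum(per_core) / len(per_core) > 85:
        findings.append("overall CPU looks saturated")
    if mem.percent > 90:
        findings.append("memory pressure is high (check paging/swapping too)")

    # Disk and network need rate calculations over an interval in practice;
    # psutil.disk_io_counters() and psutil.net_io_counters() expose the raw counters.
    return findings or ["no obvious system-level bottleneck in this sample"]

for finding in system_level_check():
    print(finding)
```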

Being vigilant about using the top-down process will ensure you don't waste time tuning a non-representative system. And it just may save you some embarrassment!

 

 

Always measure your bottlenecks!

 

 

Keep watching The Server Room for information on the other 8 habits in the coming weeks.

First, the supply side. Looking at the growth projections in compute capacity delivered by Intel-based servers (with 45nm silicon, more cores, and more efficiency enabling greater density), data centers have the potential to increase compute capacity by 40x over the next five years. <pause to let 40x sink in> Yes, I said 40 times the capacity in the same space and power footprint. My initial reaction to this thought was 'whoa', or better stated, 'whoa with several expletives'. Does this mean Intel, and the rest of the server market, will sell fewer bits? Employment-wise, I selfishly want the server business to grow.

 

As I began to explore the other side of the economy, the demand side, I began to relax, maybe even get a bit optimistic, even excited. Let's start with the trends. Multiple market indicators show data volumes doubling every year. Of course this is not uniformly distributed, but on average that is a potential 32x increase in data over the next five years. <another moment to ponder what you will do with 32 times as much data> This alone is probably enough to consume my 40x growth, but when I add the other magnifying trends, it blows past all my capacity estimates. For example, if the user population is growing at 10% per year, we boost the five-year growth in capacity demand to something near 50x (actually 51, but these are all calculations worthy of a napkin). 50x is bigger than 40x, but there is more. I am not sure how to quantify all the other factors, like the expanding desire of the business to do more. From the customers I have spoken with, I get the sense that most businesses are still pushing IT for more value through faster decisions, faster BI, and so on. If this only adds 5% capacity demand per year, we are suddenly knocking on the door of 65x. <another moment to ponder 65x capacity demand> That is 65 times as many transactions! Not 65% more, but 65 times more.
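For anyone who wants the napkin itself, the compounding works out like this (the 2x-per-year data growth, 10% user growth, and 5% "do more" factor are the assumptions stated above):

```python
# Napkin math behind the demand-side numbers above.
years = 5

data_growth = 2 ** years                     # data doubling every year -> 32x
with_users = data_growth * 1.10 ** years     # add 10% user growth per year -> ~51x
with_business = with_users * 1.05 ** years   # add 5% "do more" per year -> ~66x, i.e. knocking on 65x

print(f"data alone:        {data_growth:.0f}x")     # 32x
print(f"+ user growth:     {with_users:.1f}x")      # 51.5x
print(f"+ business demand: {with_business:.1f}x")   # 65.8x
```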

 

So, using all the best bits money can buy, many data center managers will still run out of capacity. The server business still looks good. Whew.

 

How to meet that capacity gap? Is it time to break ground on new data centers? Maybe, but maybe not... I mention some alternatives here, and plan to keep poking at ‘ways to avoid data center capital'.

As a follow-up to my first post on the 10 Habits of Great Server Performance Tuners, this post focuses on the first habit: Ask the Right Question.

 

 

Six years of performance work have taught me to start all my projects with this habit. Before I explain the kinds of questions I ask, let me demonstrate why this is important. Here are some examples of undesirable outcomes of performance tuning:

  • You spend months of experimentation trying to match a level of performance you saw reported in a case study on the internet, only to find out later that it used unreleased software you can't get yet.

  • You spend months optimizing your server for raw performance. As part of your optimization you fully load it with the best available memory and adapters. Then you find out that your management/users would have been happier with a lower level of performance but a less costly system.

  • Your team works hard to maximize the performance of your application server for the current number of users you have, but makes decisions that will result in bottlenecks and re-designs when the number of users increases.

 

The outcome we are all hoping for with our tuning projects is that we provide the best level of performance possible within the budgetary, time, and TCO constraints we have. And of course, without sacrificing any other critical needs we'll have for our server, either now or in the future. Since performance optimization can take a lot of time and resources, consider the following questions before embarking on a project:

 

  • Why are you tuning your platform? (This helps you decide the amount of resources to dedicate.)

    • As part of this question, consider this one: How will the needs and usage models for this server change over the course of its life?

  • What level of performance are you hoping to achieve?

  • Are your expectations appropriate for the software and server system you are using?

    • In determining if your expectations are appropriate, refer to benchmarking results or case studies where appropriate and make sure any comparisons you make are apples to apples!

    • A corollary to this question is: is the server being used appropriate for the application being run?

  • What qualities of your platform are you trying to optimize: raw performance, cost/performance, energy efficiency (performance/watt), or something else?

  • Is performance your top priority for the system, or is scalability, extensibility, or something else a higher goal?

 

Thinking about the answers to these questions can help you navigate the trade-offs and tough decisions that are sure to pop up, and will help make your tuning project successful.

 

Keep watching The Server Room for information on the other 9 habits in the coming weeks.

In a prior post I argued that a lot of the work happening in your data center could probably be done someplace else. One of the counterarguments to this approach is the potential loss of the competitive advantage achieved by owning your compute resource, especially where your competition cannot or does not own a parallel resource. There may be some situations where this is true, but in most situations external resources (e.g., cloud computing) can actually liberate a business from the capital constraints of building a private compute center. If compute capacity delivers a competitive advantage, external availability provides scale to the limits of what an organization can use. Like any other resource, the trick is in using it effectively. The ability to take advantage of this resource will be a future differentiator for compute-enabled companies. One of my favorite sound bites was an estimate in InformationWeek stating that a one-millisecond advantage in trading applications could be worth $100 million a year to a major brokerage firm.

 

Taking advantage of the computing cloud starts to look a lot like the fabled utility computing architecture. Utility computing is real, but Gartner* still places it on the descent into the "trough of disillusionment". I agree, and broad availability of utility computing is still a few years out. That doesn't mean IT managers should be waiting.

 

 

Why does Intel care? Will processor type matter in this emerging utility era, in the era of hosting, SaaS, and clouds? My short answer is yes. I think Intel has the right products and roadmap to be the "platform of choice" in the evolution to utility. My rationale for this position comes from the behaviors of companies doing leading work in these areas. It turns out that service providers want the very best value, where value is measured as a combination of performance, performance/watt, performance/$, platform efficiency, support for virtualization, management, and security. That is, pretty much the same stuff that every data center manager should value. Intel has focused server platform evolution toward delivering platform leadership in efficiency, virtualization, and performance. Success in these three pillars ensures continued leadership in the data center. Beyond these pillars, Intel is also working with the software ecosystem to enable effective integration and optimization of the rest of the solution stack. The combination of technical leadership and a shared core architecture that spans mobile, desktop, and servers gives Intel a distinct advantage in utility computing.
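To make the "value as a combination of dimensions" idea concrete, here is a toy weighted-score sketch. The weights and per-platform scores are made-up placeholders, not Intel data; real evaluations come from measured benchmarks and TCO analysis.

```python
# Illustrative only: a weighted "value" score across the dimensions listed above.
weights = {
    "performance": 0.25,
    "perf_per_watt": 0.20,
    "perf_per_dollar": 0.20,
    "virtualization": 0.15,
    "manageability": 0.10,
    "security": 0.10,
}

def value_score(scores):
    """Weighted sum of 0-10 scores for one platform across the dimensions above."""
    return sum(weights[k] * scores.get(k, 0) for k in weights)

platform_a = {"performance": 9, "perf_per_watt": 8, "perf_per_dollar": 7,
              "virtualization": 8, "manageability": 7, "security": 8}
print(f"platform A value score: {value_score(platform_a):.2f} / 10")   # -> 7.95
```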

 

 

Every now and then a colleague, customer, or acquaintance sends me a link to an article or blog that usually features either our products or those from one of our competitors. More often than not I see a lot of repeat sources (The Register, The Inquirer, CNET, etc.). The blog that comes my way most often is one from George Ou at ZDNet. One of his most recent posts (A comparison of quad-core server CPUs) shows a bunch of our latest quad-core CPUs and how they stack up against our previous versions as well as those from AMD. I won't rehash the article here aside from saying it was positive for Intel and noting that AMD's issues with their quad-core processors have been well documented.

 

 

 

Is Intel winning now because our products are superior? Are we winning because our competitor is struggling? Do the benchmarks mentioned in George's blog tell the whole story? As you can imagine, we constantly ask ourselves these questions and many more internally. Our conclusion is that for processors and server platforms, as long as we provide leadership along several key vectors, our market share and overall market position will improve.

 

 

Manufacturing process, processor architecture, system architecture, and cache size: these are four critical vectors that we have direct control over when we are making design and enabling decisions. At times in our past and in the present we have had leadership on all four. In those times we have won hands down. There have also been times when a competitor has chosen to focus on one or two vectors, and that has led to their products being better in a specific area. The four vectors above are things that Intel focuses on, but we always have to keep an eye on what end-user value they deliver.

 

 

Our customers tell us they care about three main things: Price, Performance, and Power. The three P's. George's blog shows that for one of the P's (Performance) Intel has leadership, particularly on integer and floating point. There are similar-looking examples for database, virtualization, and pretty much any performance benchmark we have looked at recently. Thankfully for Intel, Performance is the "P" with the strongest correlation to success in the server market from an MSS perspective. We are also doing some amazing things with regard to Power. Some have been launched already and some will be coming soon with new products in 2008. The market is segmenting, and we now make CPUs, chipsets, and networking components that help OEMs build platforms targeted at high-performance computing, mainstream enterprise, blades, workstations, and emerging markets. Each has unique requirements with respect to the three P's, and one size no longer fits all.

 

 

I believe that overall, George's blog highlights the success we are having today. I also think there will be a steady stream of innovations delivered in 2008 and beyond that will cause us to rethink how we deliver performance at the most efficient power level for the best possible price point. Virtualization, utility computing, and chargeback models for data center environments are all stepping up to take center stage. We all must innovate or become irrelevant; technological evolution waits for no one.

 

 

Shannon
