The Server Room Blog

70 Posts

Part four of three

Hopefully, if you are watching this, you have already seen the first three installments I did on surviving the data center crisis. A quick recap: the premise (aka the crisis) is that you are running out of capacity.

According to Green Tech World, TMC 2007, "81% of IT mgrs will exceed capacity for power or space in the next 5 years".

In the first three video segments I described three complementary approaches that, taken together, could give you as much as 50x the data center capacity in your existing power and space.

Summarizing:

Data Center Crisis - How to Survive... Refresh with today's advanced, high-performing servers
Data Center Crisis - Part 2 - Using Virtualization... Virtualize and Consolidate
Data Center Crisis - Part 3 - Getting Dense... Use every Watt

Today I want to address two follow-up questions:

One, where do I go next when I have used up all this new capacity?
Two, who can help me get there?
The answers, it turns out, are related.

Moving outside the box is the 4th strategy and, like the others, it can be used at any time to complement the other three.

Step to outside the boxness:

outside the box2.jpg
Moving outside the box allows the IT manager to move work that can be run efficiently elsewhere (things like email) outside the data center, and to focus on the highest-business-value or least movable work inside.

As to who can help you get there: the system integrator/IT outsourcer community offers support in all four strategies I have outlined.

My recommendation is to examine your situation, and your growth projection, and create a plan using all four strategies that will preclude the major capital expense of data center construction. Avoiding that 10 to 50 million dollar capital hit should be a very compelling proposal.

0 Comments Permalink

Have you ever asked yourself that question when you are bombarded with marketing messages from multiple companies on why you should choose their products over a competitor's? As a non-engineer in an engineer-centric company, I have certainly thought about this several times and asked myself a very simple question: why should I choose one architecture type over another?

I suppose the best place to start is at the beginning, by trying to decipher the acronym soup of RISC, x86, etc. I decided to use my ‘old friend’ Wikipedia http://www.wikipedia.org/ to help with this process. What I found was another alphabet soup that I could have researched for hours, but I try to simplify it below. I attach my detailed definition findings at the end of this blog.

Simply put, RISC (reduced instruction set computer, pronounced "risk") is a CPU design approach that uses simplified instructions that execute very fast, with the aim of higher performance. x86 is a generic term for the instruction set used by a family of processors that began with Intel's 8086. So basically, RISC describes a style of instruction set design, while x86 is one specific instruction set; both terms are tied to CPU architecture.

So which one should I choose?
Call me old-fashioned, but as a business guy, it always comes down to 3 basic tenets when making a decision:
1) I like choice: the ability to pick and choose between multiple suppliers to get the best deal to meet my needs (and the ability to change supplier without major obstacles).
2) Performance is really important. Higher performance means that I get my work done quicker, which reduces overall cost, improves time to revenue and ultimately improves the productivity of my business.
3) System cost and total cost of ownership are key decision points in today’s era, which is vastly different from the ‘dot.com’ boom. It is all about managing the bottom line through good decisions around CAPEX and OPEX spending.

I applied my decision criteria and quickly found out that there is not a lot of choice from a hardware and operating system perspective with RISC architecture. In fact, it looks like quite the opposite of choice, which always concerns me. Call me pro-choice if you like, but I like the ability to move between suppliers! On the other hand, I found x86 to have lots of choice, with too many hardware vendors to list and a range of operating systems from Windows to Linux and Solaris.

With choice out of the way, I then moved on to performance for my business and looked at published results from many hardware vendors on websites like http://www.spec.org. What I found was that Intel-based systems had a lot of leading results against architectures like SPARC from Sun or Fujitsu and POWER from IBM.

I then looked at price and (being an ex-accountant in my past career) nearly jumped for joy when I saw that prices for x86 systems were low compared with comparable RISC systems.

This analysis helped me understand it better and helped simplify my decision making.

Here is a short video with a little bit more detail. I would be interested in your thoughts. Have you had any similar experiences that you would like to share?

2 Comments Permalink

I have visited a number of customers recently. The discussions are usually straightforward: I provide them with a download of our current products, I tell them about things that we are doing in the future, and along the way I ask them some questions about trends that they are seeing in their businesses. It will come as no surprise that enterprises are trying to keep up with their current requirements while also squeezing increasingly flat or dwindling budgets to do something new. Many are turning to virtualization as a way to do more.

So who cares? CFOs care. I went out to visit a leading Fortune 500 company based on the West Coast of the US. Keep in mind I am planning to discuss our server platforms, why I believe they lead on performance and power, and all of the great new virtualization features we have recently introduced or will introduce in the future. Before we get started, they proudly walk me through their new datacenter and I stop in front of a rack that has two servers in it. Two 2U two-processor servers. It is right next to another rack that has four servers in it. I ask why both racks are only partially full and am told that one is owned by Finance and one is owned by a business unit; IT just manages them. You can look at this two ways. The glass-half-empty way would be that they are wasting an incredible amount of datacenter space and they are hopeless. The glass-half-full way would be that this is a great opportunity to really deliver value to this company's bottom line by first convincing them that physical consolidation (filling up their racks) is important, then showing them a path toward application consolidation, and finally sharing a vision of datacenter virtualization that includes compute, storage and networking. Their CFO will care.

IT employees care. One theme that seems to be coming through loud and clear is that people who drive some form of virtualization are usually considered innovators or leading-edge thinkers within their company. I have heard the term "IT Hero" used for someone who has delivered a high-ROI project, usually these days through the use of virtualization. I have met a number of IT folks at conferences and during visits, and it is uncanny how many are digging for more product information and how eager they are to hear about the new features we're putting into CPUs, chipsets and networking devices. A quick search of YouTube found this case study (here) that sums up the sorts of things I have heard.

It is also increasingly important that all of this stuff works well with the software, VMM and OS vendors' product offerings. I know we are working closely with all of the ecosystem players, because if we come out with an amazing new feature in our components, it would be wasted if the VMM, OS or software didn't take advantage of it. There is some interesting banter here (here) about some of the pros and cons of virtualization. We are busy working on features that improve performance and simplify the experience end users have when they virtualize. Why do you care about virtualization? What are you doing today that you couldn't do a year or two ago that has been made possible by virtualization-related technology?

0 Comments Permalink

As part of the Sun Microsystems and Intel alliance, the two companies have collaborated to bring open source Threading Building Blocks (TBB) support to the Solaris Operating System (OS) and the Sun Studio software toolchain. Check out the Sun blog for additional information. Click the video below for a short interview with Deepanker Bairagi, Principal Engineer for Sun Studio.

Software parallelism can unleash the processing power that the newer multi-core architectures provide, including the Quad-Core Intel® Xeon® processors. For developers, multithreading offers a software parallelism model, but many existing solutions require a lot of low-level coding. Threading Building Blocks offers a rich approach to expressing parallelism in a C++ program: higher-level, task-based parallelism that abstracts away platform details and threading mechanisms for performance and scalability.

The Solaris OS is able to take advantage of multi-core architectures, including the Intel Architecture, with features such as lightweight processes (LWPs), load balancing across cores, and processor affinities. Sun Studio software offers a complete integrated toolchain for Solaris and Linux platforms, including parallelizing compilers, performance and thread analysis tools, memory and code debuggers, a NetBeans-based Integrated Development Environment, and more.

Combined with Threading Building Blocks, developers for the Solaris platform now have a fully loaded toolbox that simplifies the development of optimized multithreaded applications for multi-core Intel processors. Click here to learn more about Threading Building Blocks and optimizing performance for multi-core processors.

Would like to hear from the community on how you see this impacting the next generation of software development for Solaris running on Intel Architecture.

2 Comments Permalink

Hi all, I just found out about this new site, check it out here: http://www.intel.com/references/

1 Comments Permalink

Yes, Interop has Virtualization training. It seems to be everywhere these days. The question is, how much quality is in the quantity?

Well, I am going to find out.

I am scheduled to attend Interop next week (April 28 - May 2) and am signed up for over a dozen classes/sessions that have to do with Virtualization. Here is a sampling:

  • The ABC's of Virtualization: A shortcut Guide to Virtual Technology
  • Virtualization and Security
  • Virtualization beyond Consolidation; Driving down OPEX, Not just CAPEX
  • Virtualization's Phantom Menace: Security
  • Planning the move from physical to virtual: Migration and Deployment
  • Storage Virtualization: What, Why, Where and How?
  • Virtualized Data Centers - Beyond the Virtual Sum of Virtual Parts
  • Microsoft's New Virtualization Strategy
  • One for all and all for Xen

Here is the official Virtualization Track site for the event.


I'll post updates along the way... keep your browser running so you don't have to warm it up again.

;o)

7 Comments Permalink

After coming back from IDF a couple of weeks ago, I've had some time to go through the mountains of online material: presentations mostly, and a few interesting videos. This video is from Pat Gelsinger's keynote address and features Mendel Rosenblum from VMware. Pat and Mendel discuss new technologies in virtualization and demonstrate "Flex Migration". Just hit the play button below to view.


This is very interesting for those IT shops with multiple legacy platforms and new generation servers coming online. We will have more discussion on this topic in the future, and so in the meantime, let us know if you have questions on how this could benefit your datacenter.

1 Comments Permalink

45nm and Beyond

Posted by C_Peters Apr 23, 2008

Technology moves at such a rapid pace that it can often be mind-boggling. Even working directly with the product teams at Intel, I sometimes have difficulty keeping pace. The good news is that this rapid innovation creates a tremendous opportunity today, along with a steady stream of advanced technology that IT can use to better support the business and gain a competitive advantage. Recently I was interviewed by Tim Phillips from The Register about the current 45nm Quad-Core Intel Xeon products and the next generation Intel platforms based on the Nehalem processor.

A few years back, Intel fundamentally changed the way we design and develop our underlying microprocessor technology. We streamlined our innovation and accelerated its pace. Internally, we call this new model Tick-Tock. I like to call it shrink and innovate.

  • A "Tick" is a manufacturing process shrink that delivers smaller silicon with higher speeds, more transistors and lower power consumption (example: moving from 65nm to 45nm process technology). The 45nm Quad-Core Intel Xeon processors (available since Nov '07) utilize unique materials (a high-k dielectric) that are delivering industry-leading performance/watt as measured by the industry's first and only standard benchmark, SPECpower.
  • A "Tock" represents a more extensive architectural innovation (ex. Intel Core Microarchitecture), introducing new micro-architecture features and functionality that fully utilize the higher transistor count set up by the shrink. For Intel Xeon-based servers, the next "tock" is Nehalem. In addition to the new micro-architecture based on 45nm, a system re-design will incorporate next generation memory, I/O and virtualization technology for high performance, high bandwidth solutions compatible with today's leading software solutions.

Listen to my podcast interview to learn more about the benefits of using today's products and the timing of next generation Intel technology featuring Nehalem. Is this information useful to you? If so, how? Have any questions?

I'd be happy to hear from you. Chris



0 Comments Permalink


Here's the 4th follow-up post in my 10 Habits of Great Server Performance Tuners series. This one focuses on the fourth habit: Know Your BIOS.

http://communities.intel.com/openport/servlet/JiveServlet/downloadImage/1357/IMG_2318-noExif.jpg

My last blog talked about beginning your system tuning by consulting a block diagram. The other thing you should always look at is your system's BIOS. Many server BIOSes these days allow you to configure options that affect performance. Like everything in the performance world, which set of BIOS options is best depends on your workload!

First things first, how do you find this "BIOS"? Most servers have a menu called "Setup" (or something similar) that you can access while the system is booting, before it starts loading the operating system. This "Setup" menu allows you to access your system's BIOS. Changes that you make here will affect how the operating system can utilize your hardware, and in some cases how the hardware works. If you change something here, you usually have to reboot and then the change will "stick" through all future reboots (until you change it again). As platforms grow increasingly sophisticated, they are offering a widening array of user-configurable options in Setup. So a good practice is to examine all the menu options available whenever you get a new platform. Here are some of the most common options on Intel platforms that could affect performance:

  • Power Management - Intel's power management technology is designed to deliver lower power at idle and better performance/watt (without significantly lowering overall performance) in most circumstances. There are 2 types: P-States, which attempt to manage power while the processor is active, and C-States, which work while the processor is idle. In some BIOSes, both of these features are combined into one option, which you should enable. In other cases they are separated. If they are separate, here's what to look for:
    • Intel EIST (or "Enhanced Intel SpeedStep" or "Intel SpeedStep" or "GV3" on older platforms) - This is the P-State power management that works while the processor is active. Leave it enabled unless directed to change it by an Intel representative.
    • Intel C-States - If you have this option or something similar, it refers to the power management used when the processor is idle. Enable all C-States unless directed otherwise by an Intel representative.
  • Hardware Prefetch or Adjacent Sector Prefetch - These options try to lower overall latencies in your platform by bringing data into the caches from memory before it is needed (so the application does not have to wait for the data to be read). In many situations the prefetchers increase performance, but there are some cases where they may not. If you don't have time to test these options, then go with the default. Intel tests the prefetch options on a variety of server workloads with each new processor and makes a recommendation to our platform partners on how they should be set. If, however, you are tuning and you have the time to experiment, try measuring performance using each of the prefetch setting combinations.

There are several other options that might affect performance on specific platforms. Some examples might be a snoop filter enable/disable switch, a setting to emphasize either bandwidth or latency for memory transactions, or a setting to enable or disable multi-threading. In these cases, if you don't have time to test, use your Intel or OEM representative's suggestion or go with the default setting.
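If you do have time to experiment with the prefetchers, it helps to write down the full test matrix before you start rebooting. Here is a minimal Python sketch of that bookkeeping. The option names are hypothetical labels chosen for this example (your BIOS will name them differently), and the scores are numbers you would collect yourself by rebooting into each configuration and running your own workload; nothing here talks to the BIOS.

```python
from itertools import product

# Hypothetical option labels; real BIOS menus name these differently per platform.
PREFETCH_OPTIONS = ("hardware_prefetch", "adjacent_sector_prefetch")


def prefetch_test_matrix(options=PREFETCH_OPTIONS):
    """Return every enabled/disabled combination of the given BIOS options."""
    return [
        dict(zip(options, states))
        for states in product((True, False), repeat=len(options))
    ]


def best_setting(results):
    """Return the settings dict with the highest measured score.

    `results` is a list of (settings_dict, score) pairs recorded by hand
    after running the workload under each configuration.
    """
    return max(results, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    # Print the checklist of configurations to test, one per reboot.
    for settings in prefetch_test_matrix():
        print(settings)
```

With two on/off options there are four configurations to measure. Adding a third option (say, a snoop filter switch) doubles that to eight, which is a good reason to decide up front which options are worth the reboots.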

Being familiar with how your system's BIOS is configured is another basic component of system tuning.

Keep watching The Server Room for information on the other 6 habits in the coming weeks.

1 Comments Permalink

Kirk was out at the Microsoft Server 2008 and talked about the "data center of the future". He shared some particularly interesting tidbits on the predictive enterprise, the world of Tera, emerging technologies and goings-on in Intel's IT shop.


Have you heard of the "predictive enterprise"? If you want to know more, let me know, as it is a very interesting topic.

0 Comments Permalink

Dynamic Power Management Has Significant Value - a Baidu Case Study
Jackson He, Intel Corporation
We have just completed a proof of concept (POC) project with Baidu.com, the biggest search portal company in China (60+% market share in China), using the Intel® Dynamic Power Node Manager Technology (Node Manager) to dynamically optimize server performance and power consumption and so maximize the server density of a rack. We used Node Manager to identify optimal control points, which became the basis for setting power optimization policies at the node level. A management console, Intel® Datacenter Manager (Datacenter Manager), was used to manage servers at the rack level, coordinating power and performance optimization between servers to ensure maximum server density and performance yield for a given rack power envelope. The POC showed significant benefits, and the customer likes the results:

  • At a single node level, up to 40W savings per system without performance impact when an optimal power management policy is applied
  • At rack level, up to 20% additional capacity could be achieved within the same rack-level power envelope when an aggregated optimal power management policy is applied
  • Compared with today's datacenter operation at Baidu, using Intel Node Manager could increase rack density by 20~40%

Some background on the technologies tested in this POC:

Intel® Dynamic Power Node Manager (Node Manager)

Node Manager is an out-of-band (OOB) power management policy engine that is embedded in the Intel server chipset. It works with the BIOS and OS power management (OSPM) to dynamically adjust platform power to achieve maximum performance/power at the node (server) level. Node Manager has the following features:

  • Dynamic Power Monitoring: Measures the actual power consumption of a server platform within an acceptable error margin of +/- 10%. Node Manager gathers information from a PSMI-instrumented power supply, provides real-time power consumption data (point in time, or average over an interval), and reports through the IPMI interface.
  • Platform Power Capping: Sets platform power to a targeted power budget while maintaining maximum performance for the given power level. Node Manager receives the power policy from an external management console through the IPMI interface and maintains power at the targeted level by dynamically adjusting CPU P-states.
  • Power Threshold Alerting: Node Manager monitors platform power against the targeted power budget. When the target power budget cannot be maintained, Node Manager sends alerts to the management console.
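To see why a per-node power cap translates into rack density, here is a minimal Python sketch of the arithmetic. It is only an illustration, not how Node Manager or Datacenter Manager work internally. The 40 W per-node saving comes from the results above; the rack budget and the per-server draw are made-up numbers chosen for this example, and the function names are hypothetical.

```python
def servers_per_rack(rack_budget_watts, per_server_watts):
    """How many servers fit under a rack-level power envelope."""
    if per_server_watts <= 0:
        raise ValueError("per_server_watts must be positive")
    return int(rack_budget_watts // per_server_watts)


def density_gain(rack_budget_watts, uncapped_watts, capped_watts):
    """Return (servers before, servers after, fractional gain) for a power cap."""
    before = servers_per_rack(rack_budget_watts, uncapped_watts)
    after = servers_per_rack(rack_budget_watts, capped_watts)
    return before, after, (after - before) / before


if __name__ == "__main__":
    # Assumed values: a 4000 W rack envelope and 250 W per uncapped server.
    # Capping each node 40 W lower (210 W) is the only figure taken from the POC.
    before, after, gain = density_gain(4000, 250, 250 - 40)
    print(f"{before} -> {after} servers per rack ({gain:.1%} more)")
```

With these assumed inputs the rack goes from 16 to 19 servers, a gain of a little under 19%, which is in the same neighborhood as the rack-level figure reported above. Your own numbers will depend on your rack envelope and on how far each node can be capped without hurting performance.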

More detailed findings from this POC are published in Intel Dynamic Power Node Manager POC with Baidu. We'd love to hear your comments and questions about this POC and Intel Dynamic Power Management Technology.

0 Comments Permalink

This is part three - the implication being that it is a sequel to part one and part two. It is. That said, each of the sections has its own message and may or may not help your data center. The first part talked about the benefits of bringing in the latest hardware. Intel has been delivering performance increases at a pace beyond "Moore's Law". Getting rid of old, slow, inefficient servers can give you 2-12 times the capacity instantly. The second "episode" talked about getting everything you can from each server. Use virtualization and consolidation to make sure your servers are full and busy. The most efficient bus is a full bus (this is a metaphor; I am talking about the big yellow things carrying students, not the circuitry in the box).

My focus in part three is on density. My operating premise is that the data center manager wants to get everything out of the current data center and avoid, or at least defer, construction of a new data center. If you're in the data center construction business, this is not for you.

To get the most out of our data center, we want to pack every server we can power into the space. You can do this by executing three actions: 1) Use every watt, 2) Build the right servers, and 3) Optimize HVAC. In many cases twice the servers can be crammed into the existing rack space even without adding power. If you are able to redirect your HVAC power savings to your racks, your results could be even better.

So, we potentially got 5x capacity from new quad-core servers, 5x capacity from boosting utilization with consolidation, and 2x capacity with higher density. My math says 5x * 5x * 2x = 50x the capacity (in the same space and power!)
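The multipliers stack because each one applies to whatever capacity the previous step left you with. Here is a minimal Python sketch of that math so you can plug in your own estimates. The dictionary keys and the function name are just labels invented for this example, and the 5x, 5x and 2x values are the rough figures from this series, not guarantees.

```python
from math import prod


def combined_capacity(multipliers):
    """Multiply the per-strategy capacity gains into one overall factor."""
    return prod(multipliers.values())


if __name__ == "__main__":
    # Rough figures from the three video segments; substitute your own.
    strategies = {
        "refresh": 5,
        "virtualize_and_consolidate": 5,
        "density": 2,
    }
    print(f"Overall capacity: {combined_capacity(strategies)}x")
```

If your refresh only buys you 3x and consolidation 4x, the same function gives 24x with the 2x density gain, which is still a long way from building a new data center.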

0 Comments Permalink

Here's a good primer animation on Virtualization.

0 Comments Permalink

In part one of this "series" (ok, mini-series) I spoke about the benefits of server refresh. It is pretty huge for most installed servers. In many cases an IT manager could see a 5x jump in compute capacity by replacing depreciated servers. If these are older single-core processor-based servers, the number is probably even greater. Hopefully a 5x increase in capacity can push out your data center construction needs.

My next recommendation revolves around virtualization, or more specifically consolidation through virtualization. You can skip the words now and jump to the video below... but since you are still reading, here is an intro to the video. I have seen a lot of different data on "enterprise server utilization", but most of it pegs the meter at 10-15% utilization for volume landscape servers. (By the way, that is a low number, not something to be proud of.) Now, if you follow my advice and replace all these less-efficient older servers with cutting-edge, high-efficiency Intel quad-core machines on a one-for-one basis, you are going to see some pretty unpleasant utilization. Think single digit. In a nutshell, it is time to virtualize and consolidate. If you both virtualize and carefully manage and balance your workloads, it is reasonable to expect another 5x capacity boost through improved utilization. And 5x * 5x = 25x more capacity (in the same space and power!). (Try out the Intel consolidation calculator.)
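Here is a minimal Python sketch of the utilization math behind that paragraph, in case you want to try your own numbers. It is a back-of-the-envelope estimate, not the Intel consolidation calculator. The 12% starting utilization sits inside the 10-15% range quoted above; the 60% target, the fleet size of 100 and the function names are assumptions made up for this example.

```python
import math


def utilization_after_refresh(old_utilization_pct, speedup):
    """Utilization if the same work runs one-for-one on faster servers."""
    return old_utilization_pct / speedup


def hosts_needed(server_count, utilization_pct, target_utilization_pct):
    """Hosts required to carry the same work at a higher target utilization."""
    return math.ceil(server_count * utilization_pct / target_utilization_pct)


if __name__ == "__main__":
    # One-for-one refresh with servers that are 5x faster.
    print(utilization_after_refresh(12, 5), "% utilization after refresh")

    # Consolidating 100 servers at 12% onto hosts run at an assumed 60% target.
    hosts = hosts_needed(100, 12, 60)
    print(hosts, "hosts, a", 100 / hosts, "x capacity gain from utilization")
```

The first call shows the "think single digit" problem: 12% utilization on 5x faster hardware drops to 2.4%. The second shows where a 5x gain can come from: the same work fits on one fifth of the hosts when each host is kept five times busier.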

2 Comments Permalink

InfoWorld recently published some pretty scary data on the data center crunch. Excerpt: "Forty-two percent of the respondents said their datacenters would exceed power capacity within 12 to 24 months unless they carried out expansion. Another 23 percent said it would take 24 to 60 months to run out of power capacity. The managers reported similar figures for cooling: 39 percent said they would exceed cooling capacity in 12 to 24 months, and 21 percent said it would take 24 to 60 months."

I have done a series of blog entries on the topic: Almost Free Data Center Capacity and Big Numbers in the Data Center - The Data Tsunami

In these I have focused the solution (or at least the treatment) for data center pain on three strategies: Refresh, Virtualize, and Densify. I don't think I have used the word densification in a sentence before, but spell-check says it is real... For those who prefer a mixed media message, I agreed to record a series of short videos talking about each approach and the benefits of these strategies, starting with the video on refresh.


The next two - virtualization and densification, will be posted soon.

Thanks for tuning in.

1 Comments Permalink