
The Data Stack


Re-introducing myself, in case you didn't feel like clicking on me to see what I look like: I am an ETS (Enterprise Technical Specialist) for Intel, which is what the rest of the world would call a Sales Engineer.  I cover the Fortune 2000, state and local education, government, and healthcare in NW North America.  I have a pretty solid technical background, but IT is a big place.  I manage to hold my own on Intel topics, but I also find that with every customer meeting I learn something – sometimes I learn a lot.

 

Since I started posting articles, I have been positioning Xeon as the logical successor to RISC (IBM Power/AIX and Oracle SPARC/Solaris) based systems.  This seems like a good thing for Intel, and it is, but it is also a good thing for the customer.  I have posted several entries on RISC migration where I have tried to address the challenges customers might consider.

Recently I find my role has flipped.  I am no longer proselytizing to customers on the benefits of Unix migration, but instead I am being asked for any information on how to get there faster.  Blame it on the economy, the collective IT zeitgeist, or credit my persuasion – whatever the cause, my customers seem to have internalized the message.  I am working with one of my last Power/AIX purchasing holdouts to choreograph their journey to Xeon.

 

I often get the question, "What size server do I need for my XYZ application?"  This can be tough to answer for a couple of reasons, and I hate responding with "it depends."  Benchmarks are OK, but at best they give you a rough relative comparison of a specific use of an application or code.  Virtually every published server benchmark has current Xeon results, but for many benchmarks the RISC vendors just don't publish.  I guess if you can't say anything nice…

 

For SAP, a common app, there are generally benchmarks available.  I usually recommend the SAP SD two-tier scores for comparison.  As of this writing, the best published four-socket POWER7 score is about 25% higher than the best published four-socket Xeon E7 processor-based system.  The big difference is in system cost and support cost: Intel-based systems can cost as little as 1/5th of a comparable POWER7 platform.  I think every company has developed expertise in operating Xeon environments, and the operations, support, and licensing costs are well understood.
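
To make the trade-off concrete, here is a rough sketch of that price/performance arithmetic. The 25% score gap and the ~5x cost ratio come from the paragraph above; the absolute SAPS and dollar figures are illustrative placeholders, not published results or price quotes.

```python
# Rough SAP SD price/performance comparison using the ratios cited above.
# Absolute numbers are illustrative placeholders, not published figures.

xeon_saps = 57_000              # hypothetical 4-socket Xeon E7 SD 2-tier score
power_saps = xeon_saps * 1.25   # POWER7 score ~25% higher, per the post

xeon_cost = 100_000             # hypothetical system cost (USD)
power_cost = xeon_cost * 5      # Power platform ~5x the cost, per the post

xeon_ppd = xeon_saps / xeon_cost
power_ppd = power_saps / power_cost

print(f"Xeon SAPS per dollar:  {xeon_ppd:.3f}")
print(f"Power SAPS per dollar: {power_ppd:.3f}")
print(f"Xeon advantage:        {xeon_ppd / power_ppd:.1f}x")  # 5 / 1.25 = 4.0x
```

Even granting Power the 25% performance edge, the 5x cost difference leaves Xeon roughly 4x ahead on performance per dollar.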

 

Frequently the target "XYZ" applications are databases.  It would be great if everybody chose to publish TPC-C, TPC-E, and TPC-H, but many times the benchmarks just are not available, or if they are, they cover different database products.  Customers ask me to clarify how we stack up as a database platform, but without published results on POWER7 there is little I can say.  My strong preference and recommendation for any migration evaluation is to run your own benchmarks.  I have built test harnesses and benchmark tests, and I know it is hard.  But to really understand how your application will perform on your network configuration, with your storage architecture, running your data – there is no substitute.  Remaining questions on performance, migration/porting, and architecture can be answered, or at least accurately projected.  I have yet to see a customer run their own benchmarks and then choose a RISC/Unix platform.  The potential ROI makes Xeon the logical, and most defensible, choice.
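
A full test harness is a real project, but the skeleton is simple. Here is a minimal sketch of the run-your-own-benchmark idea; run_transaction() is a hypothetical stand-in that you would replace with a representative transaction against your own data, on your own network and storage.

```python
# Minimal benchmark-harness sketch: time your own workload, report latency
# percentiles and throughput. Replace run_transaction() with a real unit of
# work from your application (a query, a batch step, a transaction).

import time
import statistics

def run_transaction():
    # placeholder workload; substitute a real operation against real data
    sum(i * i for i in range(100_000))

def benchmark(iterations=100):
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_transaction()
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    print(f"median latency: {statistics.median(latencies) * 1000:.2f} ms")
    print(f"p95 latency:    {latencies[int(0.95 * len(latencies))] * 1000:.2f} ms")
    print(f"throughput:     {iterations / sum(latencies):.1f} tx/s")

if __name__ == "__main__":
    benchmark()
```

Run the same script on the legacy box and on the Xeon candidate, and you have a comparison that actually reflects your workload.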

Almost a year ago I posted an article about why the time is right for cloud computing.

 

In that post, I spoke a lot about the changes that made the cloud an interesting option.  I will stop here to define my terms (note: I did not say define the terms, but define my terms as I am using them, at least for today).  For the next minute or so, "cloud" is an environment where I can host some of my business compute functionality while retaining management and control of the "applications" and "servers."  Cloud means just about everything to somebody today...

 

 


Here is where I say "I've looked at the cloud from both sides now," but then I get that song by Joni Mitchell stuck in my head for the rest of the day, so I am not going to say that – no way.  It is also a pretty good indicator of my age demographic that a Joni Mitchell song can get stuck in my head.

 

 

Moving on, the key idea in my earlier post was that virtualization has changed the game.  Virtualization provided a container that made the future of cloud technology possible.  Intel has done a lot to make virtualization better.  With the myriad of technologies (VT-x, VT-d, VT-c, …) layered into the processor, chipset, network adapters, and so on, Intel made it possible to virtualize everything.  With overhead as low as 4-6%, why not virtualize every server?

 

Finally, I want to talk about some of the other "barriers" to cloud adoption.  Virtualization made the cloud possible, but there are reasons not to play there today – namely safety, privacy, and security.

 

The first Intel technology I want to mention is AES-NI (an oh-so-clever, engineering-driven name).  AES-NI is a set of new instructions supported across all current Intel Xeon processors.  These instructions are called by encryption/decryption algorithms to improve encrypt/decrypt performance by as much as 400%.   What this enables for the folks counting coins and running servers and applications is an end to the encryption trade-off.  If encrypting databases eats an extra 10-15% of my server, I might sweat the cost/benefit before I click the encrypt checkbox.  With encryption overhead pushed down to 2 or 3%, it is a no-brainer: safer is better, and I can afford to encrypt everything.  Even if someone or something gets access to my data on disk, it will look like this: #$%^&*()_ :).  Well, not exactly, but it will not be valuable.  AES-NI delivers the encryption performance to eliminate the encryption cost/benefit gamble.
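
If you want to check whether a given box has the feature, here is a quick sketch (Linux only, and assuming the standard /proc/cpuinfo layout): the kernel reports an "aes" CPU flag when AES-NI is present.

```python
# Check (Linux) for AES-NI support: the kernel exposes an "aes" flag
# in /proc/cpuinfo on processors that implement the instructions.

def has_aes_ni(cpuinfo_path="/proc/cpuinfo"):
    with open(cpuinfo_path) as f:
        for line in f:
            if line.startswith("flags"):
                return "aes" in line.split(":", 1)[1].split()
    return False

if __name__ == "__main__":
    print("AES-NI supported:", has_aes_ni())
    # To see the speedup itself, compare `openssl speed aes-128-cbc`
    # (software path) with `openssl speed -evp aes-128-cbc` (EVP path,
    # which uses AES-NI when the processor has it).
```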


The second technology that will make clouds "safer" is Intel TXT (aka Trusted Execution Technology).  Here is Ken's explanation of TXT and its benefit: in a non-virtualized world, you load a series of applications onto your server.  The operating system has various rules about what code can see what, and what code can touch certain bits of memory.  This is good enough for most businesses; as long as they have control of the operating system and take appropriate steps to prevent OS corruption, they feel "reasonably" safe with their software jewels on the server.  In reality the hardware has access to everything, but hacking the processor and chipset has to date been sufficiently difficult to make this situation "good enough."

 

Then along comes virtualization.  In a virtualized environment I can still have that sense of blissful safety in my management and control of my operating system in my VM.  The issue comes in what is under my VM.  Instead of raw iron (silicon and microcode) there is a hypervisor.  This hypervisor is a chunk of software that has god-like access to anything in any of the VMs it controls.  Actually, it is a lot like the hardware in the non-virtualized example.  The issue is the "soft" part of software: a hypervisor could be corrupted.  It's not trivial and not common, but it is quite possible.  This is what TXT was built to address.  TXT "measures" the boot of the hypervisor and can assure that this critical chunk of software has not been tampered with.  TXT enables a VM owner to trust that the hypervisor has not been corrupted, and therefore trust the cloud platform.
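
The word "measures" is doing a lot of work in that sentence. Here is a conceptual sketch of the idea – a hash chain in the spirit of a TPM PCR extend, where each boot component's digest is folded into a running register. This illustrates measured launch in general; it is not Intel's actual TXT implementation.

```python
# Conceptual sketch of a measured launch: each component's hash is folded
# into a running register, so any change to any measured component changes
# the final value. Illustrative only; not the real TXT/TPM code path.

import hashlib

def extend(register: bytes, component: bytes) -> bytes:
    # new register = SHA-256(old register || SHA-256(component))
    return hashlib.sha256(register + hashlib.sha256(component).digest()).digest()

register = b"\x00" * 32  # measurement registers start zeroed at power-on
for blob in [b"firmware image", b"boot loader", b"hypervisor image"]:
    register = extend(register, blob)

# A verifier compares this against a known-good value; a single flipped
# bit in the hypervisor image produces a completely different measurement.
print("launch measurement:", register.hex())
```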

 

With VT, AES-NI, and TXT, Intel has made the cloud explosion possible.

I think it was the mid-80s when McDonald's advertised the "McDLT" (I loved that music).  The claim to fame for this 'burger' was the packaging.  It was all about separation by temperature: the hot meat separated from the cool, delicate lettuce, to be joined sometime later by the consumer.  At that point, my burger purchase-to-consumption delta-t was about 5 seconds, and I didn't really benefit from the separation.  I never bought one...

 


 

25 years later, I look at a customer data center and I say "you need to keep your hot side hot and your cold side cold". Then, I inexplicably (to the customer) chuckle. The stuff we remember...

 

But I was correct – you really do want to keep them separate.  I started digging around on the Internet and found that hot/cold separation is a well-established method for data center efficiency, with solid intellectual discussion of the benefits.

 

I do not advocate anybody's solution, but the benefits of separation seem obvious.

 

Separating your hot and cold air streams optimizes your use of cooling and fan energy. Separation also makes it possible to adopt all kinds of cool (pun intended) energy saving alternatives.

With the hot and cold streams separated, it becomes possible to:

  • Inject cool outside ambient air into the hot stream – free cooling
  • Completely vent the hot stream outside and pull outside air into the chillers (which may not need to do any chilling much of the year)
  • Use that heat to make living space warmer – supplementing the heating plant for office space
  • Use available coolants like water to pre-chill the hot stream – and so many more

 

Lastly, it is relatively easy to achieve.  The barrier need not be perfect; a heavy plastic curtain can be a cheap way to isolate the air flows (think of the freezer sections in some grocery stores).

 

Virtually every customer I speak with knocks on the door of power, space, or cooling constraints.  Hot aisle/cold aisle separation can go a long way to reduce the cooling problem. Fortunately, I also have a solution to the power and space problem! I'll save that for my next post!

I have been on a theme as of late with posts related to legacy migration.  The majority of the focus has been the performance of Xeon vs. legacy SPARC and Power, and the stability/availability of today's Xeon solutions.  And Wally has been discussing the process of data migration.  This post is going to look at the softer side of migration – the people.

 

Every IT manager I have discussed migration with has made it a central point to mention the people challenges in legacy migration.  So, putting on my OD (Organization Development) hat, I want to share a BKM (best known method) I witnessed which delivered a smooth and flawless migration.

 

I see two primary soft barriers to migrating off legacy platforms:

    1. Desire to Win - I root for "My OS/Platform"
    2. Fear to Lose - My job depends on the legacy OS

 

The most successful migrations must deal with both issues early.

 

The first issue is not unlike cheering for "my team."  People want to be right.  If the migration is perceived as 'us and them,' they will typically pick 'us.'  This belief creates a cognitive bias whereby information that challenges the superiority of their 'belief' is doubted, and only legacy-positive information is accepted.   It is difficult to win this discussion using just facts and data; changing a belief system takes time.

 

While the first issue is mostly perceptual, the second issue can be profoundly real.  Every company today has a mature staff supporting x86 platforms, many with both Windows and Linux teams.  If my value and expertise are in Solaris or AIX, the loss of those environments would make me redundant.  The challenge here is to capture the knowledge, wisdom, and experience of these senior IT professionals without sacrificing their value.  Disgruntled IT professionals seldom deliver successful projects.

 

The best migration story I have ever witnessed was really the result of one person who perceived these challenges and addressed them elegantly.  His understanding of the challenges was matched by his ability to perceive the trend early (circa 2006) and build a long-term plan that would optimize the migration journey.

 

He was in a position to alter the roles of the Unix admins, and in 2007 he had them begin managing a set of Linux servers.  He also gave them Linux desktops and made ample training and development opportunities available.  The key here was that this was not done as a convert-or-die scenario; it was done as a skill-expansion opportunity.  These are geeky IT pros, like us, and given a new set of toys they dug in and found out how they worked.

 

By 2009, the group that would stereotypically be the harshest critics of legacy migration was actively coming to the manager to discuss the performance and cost advantages of Linux-Xeon platforms.   He had created his own advocates from the very group that could have been most resistant.

 

In 2009 he kicked off the first migration projects.  They were a resounding success.  The critics he did not anticipate were the business groups that didn't believe Xeon could be as good as their legacy platforms.  Fortunately, they trusted their admins, whom they had worked with for years.  The pilot convinced even the most hesitant that life was better on Xeon (better performance, lower cost).  By the end of 2011, all the legacy platforms will have been replaced.

 

I think the things I admire most about this story are the manager's combination of vision and patience: one person's ability to read the tea leaves and put the pieces in place to make BOTH the people and the technology successful.

K_Lloyd

How big is my Sparc?

Posted by K_Lloyd Jun 1, 2011

Normally I post some "opinion" or "interesting reference".  Consider this post an open request.

 

Per my earlier posts on SPARC (Sparc Arrest), lots of folks are migrating off legacy SPARC to current x86.  Every couple of weeks an OEM, reseller, or sometimes a customer pings me with a question about system sizing.  This isn't typically a head-to-head bake-off of the latest SPARC vs. the latest Intel server processors.  It is something like: "How big of a server do I need to replace my SunFire V490 UltraSPARC? It is about five or six years old..."

 

It would be great to just key this into the evergreen super performance tool, if such a thing existed.  Does it exist?  If so, please tell me.

 

The reality I have found is that historical performance publications are a sparse matrix, made sparser by the fact that for many years Sun published very few benchmarks – especially my favorite generic indicator of enterprise performance, SPECint_rate_base.   With this reality, each request becomes a combination of archaeology and documented assumptions... not my favorite process.

 

In an Intel deck I found the historical SAP SD SPARC benchmarks below.  Maybe not the perfect comparison tool, but at least it is something.

With this, and what I can find online via searches and benchmark sites, I can usually construct a supportable response (there is a rough sizing sketch after the data below).

 

As for the results I deliver, it is almost always a relatively small Intel Xeon server replacing a relatively large legacy SPARC box.  The savings in power, administration, licensing, etc. are large; ROI can often be measured in months.  The biggest challenge is usually organizational, but that will be a topic for another post.

 

 

SAP SD SPARC Historical Performance Data (PDFs)

 

  • 4 processors / 32 cores / 256 threads, UltraSPARC T2 Plus, 1.4 GHz, 8 KB(D) + 16 KB(I) L1 cache per core, 4 MB L2 cache per processor, 128 GB main memory. Benchmark users: 7,520 SD (Sales & Distribution); average dialog response time: 1.99 seconds; throughput: 753,000 fully processed order line items/hour, 2,259,000 dialog steps/hour; SAPS: 37,650; average DB request time (dialog/update): 0.098 sec / 0.278 sec; CPU utilization of central server: 99%; operating system, central server: Solaris 10; RDBMS: Oracle 10g; SAP ECC release: 6.0. Certification number: 2008058

  • 4 processors / 32 cores / 64 threads, Intel Xeon processor X7560, 2.26 GHz, 64 KB L1 cache and 256 KB L2 cache per core, 24 MB L3 cache per processor, 256 GB main memory. Benchmark users: 10,450 SD; average dialog response time: 0.98 seconds; throughput: 1,142,330 fully processed order line items/hour, 3,427,000 dialog steps/hour; SAPS: 57,120; average DB request time (dialog/update): 0.021 sec / 0.017 sec; CPU utilization of central server: 99%; operating system, central server: Windows Server 2008 Enterprise Edition; RDBMS: DB2 9.7; SAP Business Suite: SAP enhancement package 4 for SAP ERP 6.0. Certification number: 2010012

  • 16 processors / 32 cores / 64 threads, SPARC64 VI, 2.4 GHz, 256 KB L1 cache per core, 6 MB L2 cache per processor, 256 GB main memory. Benchmark users: 7,300 SD; average dialog response time: 1.98 seconds; throughput: 731,330 fully processed order line items/hour, 2,194,000 dialog steps/hour; SAPS: 36,570; average DB request time (dialog/update): 0.018 sec / 0.041 sec; CPU utilization of central server: 99%; operating system, central server: Solaris 10; RDBMS: Oracle 10g; SAP ECC release: 6.0

  • 24 processors / 48 cores / 48 threads, UltraSPARC IV+, 1950 MHz, 128 KB(D) + 128 KB(I) L1 cache, 2 MB L2 cache on-chip, 32 MB L3 cache off-chip, 96 GB main memory. Benchmark users: 6,160 SD; average dialog response time: 1.99 seconds; throughput: 616,330 fully processed order line items/hour, 1,849,000 dialog steps/hour; SAPS: 30,820; average DB request time (dialog/update): 0.018 sec / 0.033 sec; CPU utilization of central server: 99%; operating system, central server: Solaris 10; RDBMS: Oracle 10g; SAP ECC release: 6.0

  • 4 processors / 8 cores / 8 threads, UltraSPARC IV+, 1800 MHz, 128 KB(D) + 128 KB(I) L1 cache, 2 MB L2 cache on-chip, 32 MB L3 cache off-chip, 32 GB main memory. Benchmark users: 1,200 SD; average dialog response time: 1.86 seconds; throughput: 121,330 fully processed order line items/hour, 364,000 dialog steps/hour; SAPS: 6,070; average DB request time (dialog/update): 0.044 sec / 0.035 sec; CPU utilization of central server: 97%; operating system, central server: Solaris 10; RDBMS: MaxDB 7.5; SAP ECC release: 5.0

  • 104-way SMP, UltraSPARC III, 1200 MHz, 8 MB L2 cache, 576 GB main memory. Benchmark users: 8,000 SD; average dialog response time: 1.81 seconds; throughput: 813,000 fully processed order line items/hour, 2,439,000 dialog steps/hour; SAPS: 40,650; average DB request time (dialog/update): 0.067 sec / 0.045 sec; CPU utilization of central server: 97%; operating system, central server: Solaris 9; RDBMS: Oracle 9i; SAP R/3 release: 4.6C; total disk space: 1,818 GB

  • 36-way SMP, UltraSPARC IV with Chip Multi-Threading technology (CMT), 1200 MHz, 192 KB L1 cache, 16 MB L2 cache, 288 GB main memory. Benchmark users: 5,050 SD; average dialog response time: 1.72 seconds; throughput: 517,330 fully processed order line items/hour, 1,552,000 dialog steps/hour; SAPS: 25,870; average DB request time (dialog/update): 0.050 sec / 0.056 sec; CPU utilization of central server: 98%; operating system, central server: Solaris 9; RDBMS: Oracle 9i; SAP R/3 release: 4.70; total disk space: 3,816 GB

  • 72-way SMP, UltraSPARC IV, 1200 MHz, 128 KB(D) + 64 KB(I) L1 cache, 16 MB L2 cache, 576 GB main memory. Benchmark users: 10,175 SD; average dialog response time: 1.95 seconds; throughput: 1,021,330 fully processed order line items/hour, 3,064,000 dialog steps/hour; SAPS: 51,070; average DB request time (dialog/update): 0.060 sec / 0.074 sec; CPU utilization of central server: 98%; operating system, central server: Solaris 9; RDBMS: Oracle 9i; SAP R/3 release: 4.70; total disk space: 3,816 GB
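
As an example of the kind of supportable response I mentioned above, here is a rough sizing sketch using the SAPS figures from the list. The 4-processor UltraSPARC IV+ entry is roughly the class of box a SunFire V490 question points at; the utilization and headroom numbers are assumptions you would replace with your own measurements.

```python
# Sizing sketch using SAPS figures from the data above.
# Legacy box: 4p/8c UltraSPARC IV+ at 6,070 SAPS.
# Candidate replacement: 4-socket Xeon X7560 at 57,120 SAPS.
# Utilization and headroom are assumptions, not measurements.

legacy_saps = 6_070        # 4 processors / 8 cores UltraSPARC IV+, from above
replacement_saps = 57_120  # 4 processors / 32 cores Xeon X7560, from above

peak_utilization = 0.70    # assumed: the legacy box peaks at 70% busy
headroom = 1.5             # assumed: size for 50% growth headroom

required_saps = legacy_saps * peak_utilization * headroom
fraction_of_new_box = required_saps / replacement_saps

print(f"required capacity: {required_saps:,.0f} SAPS")
print(f"fraction of one X7560 server: {fraction_of_new_box:.1%}")  # ~11%
```

Even with generous headroom, the old four-socket SPARC box fits in a small slice of one modern Xeon server – which is why the answer is so often "much smaller than you think."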
K_Lloyd

AMT in Workstations

Posted by K_Lloyd May 31, 2011

Finally! Out of Band Management!

'Real' (read: Xeon) workstations have been the unsupported crossbreeds between servers and clients.

For years, servers have been managed using a baseboard management controller.  In the OEM lexicon this includes technologies like Director, iLO, iDRAC, ...

In clients (desktops and laptops), this need was filled by Intel-based systems with vPro, which include AMT – Active Management Technology.

 

In either of the above, out-of-band management allows remote device management at the hardware level.  This is fundamentally different from what you can do with a software agent.  OOB management allows you to remotely perform low-level functions – like power on, power off, reboot, format, partition, and BIOS configuration – none of which are exposed through an agent that runs on top of the operating system.
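
As a small illustration of "independent of the OS": AMT's out-of-band interface listens on its own TCP ports (16992 for HTTP, 16993 for TLS), served by the management engine rather than the host operating system. Here is a hedged sketch of a simple reachability probe – the address is a placeholder, and real management traffic would go through a proper AMT/WS-Management client rather than a raw socket.

```python
# Probe whether a machine's AMT interface is listening. AMT answers on
# TCP 16992 (HTTP) / 16993 (HTTPS) even when the host OS is down, so a
# successful connect tells you the management engine is reachable.

import socket

def amt_reachable(host: str, port: int = 16992, timeout: float = 2.0) -> bool:
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    host = "192.0.2.10"  # placeholder address for a vPro/AMT workstation
    print(f"AMT interface reachable on {host}:", amt_reachable(host))
```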

 

OOB management is a critical tool for server management, and with vPro it is becoming a critical tool for client management as well.

 

Up until now, OOB has been missing on workstations.

 

With the Xeon E3 family and the C206 chipset, Intel introduces AMT into the workstation family for the first time.  This will continue into the two-socket space when the 'Sandy Bridge' products launch this fall!  This is seriously exciting: customers can use the same tools to manage fleets of Xeon-based workstations as they use to manage their vPro laptops and desktops!

 

OOB management can dramatically reduce support cost and travel time, keep support staff efficient, and get your workers back to work faster.

K_Lloyd

Sparc Arrest

Posted by K_Lloyd Mar 21, 2011

Pun intended.

 

SPARC migration has been the "topic du jour" or, more accurately if less cliché, the "topic de l'année" with many of my customers.  I am a solution engineer covering the Northwest and Canada.  In this region there is a substantial installed base of SPARC systems, and many of them are getting a bit old.  Virtually every user I have spoken with would like to migrate these systems to x86.

 

Their reasons for migration are varied, but generally hit on some common themes.

  • Reduce the number of hardware architectures supported
  • Reduce the number of operating systems supported
  • Reduce maintenance contracts
  • Address licensing concerns
  • Move to supported (or earlier-supported) platforms
  • Address performance gaps
  • Concerns about ecosystem

 

In general, SPARC has not kept up with Moore's Law.  I do not mean to imply that there have not been advances and some great products, but if we compare performance and price/performance of the silicon, Intel Xeon is a strong leader.

 

This performance gap is especially apparent for older systems.  For example, if we take SPECint_rate_base2006 as a pseudo-indicator of "general enterprise workload performance" (I hate benchmarks, but you have to use something), we see that a single four-socket Xeon 7560 based system delivers about the same performance as a 2004-vintage 72-socket SunFire E25K UltraSPARC IV system.

 

In other words, the 72-processor system that seven years ago was sized to run your "large" ERP, decision support, or CRM systems can be replaced by a single compact blade or rack Xeon server.
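
The back-of-envelope math behind that is worth a second look. Treating the rough equivalence itself as the input (not as a measured score), the per-socket ratio falls straight out:

```python
# If one 4-socket Xeon 7560 system roughly matches a 72-socket SunFire
# E25K on SPECint_rate_base2006, the per-socket ratio follows directly.
# The equivalence is this post's claim; the code just does the division.

e25k_sockets = 72
xeon_sockets = 4

print(f"per-socket performance ratio: ~{e25k_sockets / xeon_sockets:.0f}x")  # ~18x
# Power, space, maintenance, and per-socket licensing all shrink with
# socket count, which is where most of the migration ROI lives.
```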

 

Using this benchmark, Xeon beats even the latest SPARC T3-4 system socket for socket.  Price/performance is even better.

 

I get that migration is hard, and a bit scary.  It may seem better to stay on SPARC than to risk the company's uptime... but the risk can be minimized.  Many companies have already made the move.  The Xeon architecture, especially in the EX class, is very robust.  High-availability configurations are available.  Virtualization provides the lubrication for easy and dynamic scaling across machines and sizes.

 

The time is right to make the move.

Comparisons sourced from spec.org.

The Top500 is a highly watched list in the high performance computing community.  My employer, Intel, is in no way immune to bragging about its success in this space.

 

Recently I have heard many complaints that the measure of performance for the Top500 (LINPACK) is something less than ideal.  For about 20 years the LINPACK benchmark has defined "leadership" in the Top500 supercomputing list, causing governments and universities to focus on this one benchmark as a measure of status.

 

The problem is that LINPACK is a measure of compute, not a measure of work.  That is, just because a system can rock the LINPACK benchmark does not mean it will be the best at finding the next protein fold.  There is even concern that some solutions can game the benchmark to score higher in a Top500 competition.  It appears these complaints are now manifesting themselves as actual challenges to the status quo: Intel and IBM both questioned the Top500 criteria at the recent SC2010 conference.

 

I was excited today to read about a potential replacement for LINPACK: Graph500 is a benchmark that can measure actual work done.  Adopting something like Graph500, or maybe a series of Top500 benchmarks, could make HPC bragging rights much more relevant to what we actually want to use these machines for.

 

My favorite quote on Graph500: "Some, whose supercomputers placed very highly on simpler tests like the Linpack, also tested them on the Graph500, but decided not to submit results because their machines would shine much less brightly," said Sandia computer scientist Richard Murphy, a lead researcher in creating and maintaining the test.

I have been in this industry a long time. In that time there have been a lot of trends that were the “next big thing” and “going to change everything.”

 

Some of them really did – like the relational database. Some were all but missed by the experts – like the early web. Some went the way of Microsoft Bob…Ok, Bob was never suspected of changing anything, but I do take any opportunity to bring it up.

 

Clouds are the buzz du jour.  There is very little innovation today that is not being stapled to a cloud offering.  Guess what?  Clouds are not new.  When asked by customers to define the cloud, my slightly sarcastic answer is "Hotmail."  Actually, "Hotmail" and the myriad of other completely externally hosted internet applications are the forerunners of cloud computing.  And they work well.  The flagship externally hosted application today is probably Salesforce.com.

 

These examples are one view of cloud computing – moving the app in its entirety, with no required integration to corporate systems, into an externally hosted environment.

 

Clouds today are more than this. I’ve looked at clouds from both sides now (ok, I really wanted to type that).

 

Clouds today encompass pretty much everything "as a service."  You can append almost anything IT does with "as a service" and you have named a potential cloud offering.  This can be anywhere in the stack – hardware: infrastructure as a service; OS: platform as a service; compute: function calls as a service; or applications like Hotmail.

 

What has changed? (Why is it boom time for clouds?)

 

In a word – Virtualization. Virtualization provides the ideal “container” for selectively moving parts of an IT environment. It provides boundaries, isolation, and sufficient abstraction of the physical to provide the migratory lubrication never before seen.

 

My personal belief is that the offerings that win the race will be those that allow near-seamless migration of containers (VMs) from an internal cloud (normal operations) into externally hosted environments.  I am not saying this will be easy, but the ecosystem is rallying to deliver this capability.  Initiatives like the Open Data Center Alliance and programs like Intel Cloud Builders are providing the reference designs that will open up flexible hosting.

 

In the end the cloud will reign, even if the notion of calling it the “cloud” is an anachronism from “shortly after the turn of the century.” Having compute evolve to be a utility just makes sense.

I have watched the growth of cloud-mania, and although I do think it is part hype à la the lexicon du jour, that doesn't mean it isn't a really big deal.  Service-based computing offers so many solutions to so many problems.  In many ways it is the obvious destination.  That said, it is not without significant hurdles.

 

Today's cloud adoption seems to be all or none, lacking the federated nature that was called out as critical to further growth.  To really get there, IT needs to be able to move data and code (containers) fluidly into the cloud and maintain performant, secure communication with those containers.  Furthermore, containers need to be mutually exclusive: no data can touch another container.  These containers must also be safe from denial of service – no hostile container can take their resources or block their access.  Isolation and communication are both critical, and they will sometimes be at odds...

 

Because of this need, I believe the sensitive (business-critical) solutions will first be realized as "private clouds."  Using the frameworks being defined by the Cloud Builders forum, IT can gain the expertise needed to move forward with utility "as a service" clouds.  Companies building that expertise today will be among the first to take full advantage of utility/service/cloud computing.  This will give them a cost and performance advantage that may put the late adopters in jeopardy.

 

I sort of doubt we will be calling them clouds in 20 years, but my guess is most of the processing in the world will live there by then.

This new crop of servers based on the Xeon 7500 processor is seriously game-changing.  Take everything that was good about the Xeon 5500 ("Nehalem") processor last year and turn the dial up a few clicks.  More cores, more interconnects, more memory, more performance – more of everything.

 

The result is a server that can go head to head with anything out there.  I was chastised for writing a pointed blog post last year that said the futures of the SPARC and Power processors were "challenged."

 

Fast forward and today a Xeon platform can match or exceed the performance of virtually any other enterprise platform. So, for enterprise applications today, performance is not the issue.  Selection of a platform is a function of:

 

  • Capital Cost of the platform

  • Support Cost of the platform

  • Operating System support

  • Platform Reliability


For capital cost, Xeon systems typically deliver much higher performance per dollar than Power or SPARC platforms – sometimes by a huge margin.

 

Support cost: the reality is that virtually every data center has expertise in supporting Windows or Linux on Xeon servers.  They may also have some people with expertise in other operating systems, but the cost to support incremental Xeon servers is relatively low, and no new training is required.

 

OS support can be an issue if the customer is dependent on lots of custom code that cannot be recompiled.  But for many commercial applications and databases, Xeon is often the primary supported platform, not a later port.

 

Platform reliability is a big concern for all customers, but in truth much of this is a measure of software reliability.  Xeon platforms today are amazingly solid.  Windows and Linux have established themselves as real enterprise operating systems.  Solaris runs faster on Xeon than on any other hardware.  Virtualization provides added levels of system reliability, delivering high availability and live failover.  Lastly, at the "processor hardness" level, Intel has added over 20 processor reliability features to the Xeon 7500 platform, making it truly a mission-critical processor.

 

Absolutely I am biased toward Intel Xeon platforms, but when I add up the TCO/ROI numbers, Xeon servers just seem like a much better value for businesses that are driving for efficiency.

A: It depends.

No, seriously, as much as it sounds like a copout, that is actually the correct answer.

 

I still get asked this question several times a week.  After a deep breath, I carefully say "it depends," but then try to explain my position.  Part of the problem comes from looking at virtualization as something you might do to a server, instead of looking at it as part of how you manage all your servers.

 

In the olden days, say four years ago, it was a pretty simple question in that the options were limited.  Processors had a single core, and you had a choice: do I go with two processors or four?  Four-processor systems had more room for memory, but they also had more processors.

 

Today things are more flexible and more fluid, and this trend is only increasing.  Processors have multiple cores, and the options are vast.  Intel is introducing new processors into the Xeon family soon – the Nehalem-EX and the Westmere-EP – and all of them benefit from the architectural advantages that came with the Nehalem architecture.  All of them have multiple cores.  So how do we pick the right virtualization server – the "best" virtualization server?

 

There are a lot of dials we can turn.   A given server could have two, four, or eight processors.  Each processor could have four, six, or eight cores.  Different servers will have different memory capacities and I/O capabilities.

To make a good choice you will need to understand which resources constrain the addition of more VMs to your servers.  Understanding your workload is the key.

 

As you load virtual machines onto your platform what barriers do you run into first – memory? CPU? Disk I/O? Network I/O?  Something else?

 

Choosing the right server will require understanding your workload, and selecting the hardware that best addresses your virtualization constraints, without breaking your licensing or budget.

 

There is no magic answer – the right server for your VMs depends on what your VMs do.  Are they web heads, SharePoint servers, databases, or ERP modules?  The right answer depends on understanding your workload.  Or, as I said before: it depends…
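
To make that concrete, here is a sketch of the exercise: given per-VM resource demands and a server's capacity, compute how many VMs each resource supports and find the binding constraint. Every number here is an illustrative assumption – substitute measurements of your own workload.

```python
# Find the binding constraint on VM count for a candidate server.
# All figures are illustrative assumptions, not recommendations.

per_vm = {"cpu_cores": 2, "memory_gb": 8, "disk_iops": 400, "network_mbps": 100}
server = {"cpu_cores": 32, "memory_gb": 256, "disk_iops": 20_000, "network_mbps": 10_000}

limits = {res: server[res] // per_vm[res] for res in per_vm}
binding = min(limits, key=limits.get)

for res, count in sorted(limits.items(), key=lambda kv: kv[1]):
    print(f"{res:>12}: {count} VMs")
print(f"binding constraint: {binding} ({limits[binding]} VMs per server)")
```

With these particular numbers the server runs out of CPU cores at 16 VMs while every other resource still has room – so buying more memory would be wasted money, and the next dial to turn is core count.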

K_Lloyd

Happy New Year

Posted by K_Lloyd Dec 14, 2009

I am excited for 2010, and a bit misty about 2009 ending.  Not to imply 2009 didn't have issues, but it was a bang-up ( http://www.answers.com/topic/bang-up ) year for Intel's server products and technologies.   2009 was a transformational year.  Why?  In a word, Nehalem.  The Intel Xeon 5500 was the biggest jump in performance and efficiency in a single processor generation that I have ever seen.

 

To put some perspective on this: in the four generations before the Xeon 5500, Intel increased two-socket Xeon performance by almost 600%.  To do that required an improvement of about 80% per year.  Not a shabby achievement, and it was a good reason to move to the next generation of Intel Xeon servers.

 

Looking at these four years before Nehalem, a cynical person could argue that a lot of Xeon's performance gain was simply an exercise in adding cores – not that core addition is simple.  And truly, a fair chunk of that 80% can be attributed to more cores per processor.  What is interesting, and profound, is Nehalem's leap in performance.  With the same number of cores as the Xeon 5400, operating at a slower clock speed and with less processor cache, the Xeon 5500 delivered a jump of about 2.5 times over the 5400.  This established an Intel lead in two-socket performance and efficiency.

 

Re-reading what I just wrote, it sounds a bit like a puff piece, but I really mean it: Nehalem made 2009 an incredible year.  Intel had its challenges this last decade, but delivering server products and technology in 2009 was not one of them.  In 2009, Xeon rocked.

 

Now back to my excitement about 2010.  Take all that goodness of the Xeon 5500 in two sockets, inject the silicon equivalent of steroids, and you get a sense of the Nehalem-EX four-plus-socket processor.  This monster is positioned to change forever what it means to be a "high-end Xeon server."  With up to eight cores per socket, and designs of four and eight sockets (that would be 64 Nehalem cores, 128 threads), there are not very many jobs in the enterprise that won't fit on one of these platforms.  The addition of mission-critical reliability features hardens this platform to a level never before seen in the x86 market.  This is a machine that can do it all: scale to the biggest enterprise jobs, with reliability features for mission-critical applications.  2010 should be very interesting indeed.

 

Happy New Year!

These are dog years for servers.   Pretty much every year Intel introduces a new Xeon processor.  Those who have heard the story recognize this as the Tick-Tock model: on "tick" years the manufacturing process is updated; on "tock" years the chip architecture is updated.  Every year customers get a boost in performance, and often a cut in power.  Typically this boost is in the 50% neighborhood – enough to make it worth the upgrade, and still achievable by engineering teams on a two-year cycle.  Except, we are in dog years.

 

 

The Nehalem – Xeon 5500 – processor broke all prior boundaries on single-generation performance gain, delivering two to three times the compute capacity of the Xeon 5400 (Harpertown) generation.  This is a big change, probably a once-in-a-lifetime change – unless that quantum thing happens in my lifetime.  Roughly a 10X performance boost in less than 5 years.
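
As a quick sanity check on that claim, here is the compound-growth arithmetic: what annual improvement rate does a 10x gain over five years imply?

```python
# Compound-growth check on "10X in less than 5 years."

total_gain = 10.0
years = 5
annual_rate = total_gain ** (1 / years) - 1
print(f"implied annual improvement: {annual_rate:.0%}")  # ~58% per year
```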

 

During these same five years we have seen virtualization technology go from a lab project – something for test and dev – to mainstream data center practice.  In 2005 it would have been heresy to suggest virtualizing the corporate ERP.  At that point virtualization overhead on the server could be as high as 25%, and the entire server was needed to do "real work."  Fast forward to today: virtualization technology in both the hypervisor and the processor has reduced overhead to only a few percent, AND servers are 10X faster.  Not only can you virtualize the ERP, you are irresponsibly wasting resources if you do not.  Unless your ERP demands have grown 10X in 5 years, your ERP alone won't even make a new Xeon 5500 system sweat.
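
Combining those two effects shows how lopsided the comparison with 2005 has become. A rough sketch, taking the 25% overhead and the 10x speedup from the paragraph above, and assuming modern overhead of ~4% (the low end of "only a few percent"):

```python
# Usable capacity of a virtualized server, 2005 vs. today.
# The 25% overhead and 10x speedup are from the post; the ~4% modern
# overhead figure is an assumption at the low end of "a few percent."

old_usable = 1.0 * (1 - 0.25)   # one 2005 server, ~25% virtualization overhead
new_usable = 10.0 * (1 - 0.04)  # 10x the compute, ~4% overhead

print(f"usable capacity gain: {new_usable / old_usable:.1f}x")  # ~12.8x
```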

 

If this advancement wasn’t enough, the announcements last month from Intel about the coming Xeon 7500 (4+ socket) processor were amazing.  All the benefits of the Xeon 5500, but on steroids.  The  new biggest leap ever.  With up to eight cores and four memory channels per socket, this is a monster.  Your ERP system will be barely a blip in perfmon.  It isn’t unreasonable that an entire data center for a SMB business could be virtualized onto one of these beasts.  And, how big is a Xeon 7500 server?  My guess is about the size of a breadbox

K_Lloyd

Data Center Security

Posted by K_Lloyd Sep 19, 2009

Even the name is sort of a misnomer.  Not that there isn't a lot of physical security around most data centers: the doors are locked, and not even regular employees have access.  This is necessary, and if someone gained physical access they could really mess things up.  But this is not where the big risk typically lies.

 

The growing challenge is data security – i.e., protection from threats that come across the wire.  With ubiquitous networks and data moving everywhere, protecting the crown jewels is a full-time job.  Hackers, malware, employee abuse, and other threats can lead to data exposure that is potentially devastating, and almost certainly embarrassing for the IT manager.

 

Gartner recently declared IT security the number-one worry of Fortune 1000 companies.  This is not surprising when a report from Symantec showed exponential growth in internet security threats.

 

There is no silver bullet, and there is no system that can never be defeated.  We need to do the best we can with the tools we have.  Doing anything less could be seen as negligent.

 

Like security in the physical world, data security is a combination of business process and technology.  Neither can be effective alone.  Business processes must make clear which roles govern data access, data stewardship, data ownership, and data disposal.

 

<sidebar>Data disposal is going to be one of the biggest challenges to the promises of cloud computing.  If we consider a hosted app like Gmail to be part of the cloud, then we must either accept privacy policies like "all data belongs to the host" or try to stick to using internal systems.</sidebar>

 

The other half of the security solution is technology.  Intel, and others, are delivering new technologies to the server to assist with security enforcement.  New string-accelerator functions dramatically speed content scans for malicious data.  Technologies like Execute Disable and SM range registers provide improved protection against buffer and cache attacks.  The next generation of Intel server processors will introduce new features that can validate that code is unaltered and remove much of the overhead from encryption.

 

Security cannot be an occasional focus any longer.  Every security manager will need to be up to date on the state of technology and tools, and have the social skills to drive good data practices into the work environment.
