
The Server Room Blog

21 Posts tagged with the hpc tag

Nehalem-EX: Big Memory for Big Science

I was at SuperComputing’09 last week in Portland, Oregon. I talked with some brilliant people, and saw some fantastic stuff.

It was good timing on my part because last week Intel also announced that it would offer a 6-core, frequency-optimized version of its Nehalem-EX product due out next year. This part is intended for use in tackling some of the types of high performance computing (HPC) workloads prominently displayed at SC’09.

Most people know that the majority of HPC workloads today are based on clusters of relatively small-memory, 2-socket systems. That is because most HPC workloads may be broken into smaller, discrete units of work that can be efficiently processed using such clusters. For these workloads the primary hardware capability selection criterion is typically a balance of both memory bandwidth and compute FLOPs (floating point operations per second).

But there are other types of HPC workloads: specifically, those that deal with very large datasets (some as large as a terabyte) or with non-sequential memory access. These workloads can’t easily or efficiently be divided into the relatively small memory footprints used in traditional clustered 2-socket HPC solutions. Examples of these bigger-memory applications can be found in a variety of fields such as weather prediction, manufacturing structure analysis, and financial services.

The high-speed processing requirements and size of these workloads put a greater premium on system memory capacity/bandwidth than on compute FLOPs.

If the larger dataset won’t fit into available memory, and dividing the dataset across multiple nodes cannot easily be done, then data has to be moved back and forth between memory and hard disk. But hard disk drives are many times slower than RAM, so relying on them can drastically impair performance.

There are now two better alternatives to hard drives: SSDs and a larger memory footprint. Solid state drives have fairly high data density compared with RAM and much faster access than hard disk drives, albeit still markedly slower than RAM. The other solution is simply to have more of the faster RAM, and that is what the Nehalem-EX HPC part is aimed at.

Nehalem-EX is the Expandable Class of Nehalem. The Expandable Class brings all the goodness of the Nehalem architecture (Xeon 5500 product line) to the HPC market, but in the form of a “super node” that has greater: a) core/thread count, b) socket scaling (up to 256), c) I/O and memory capacity (up to 1 terabyte in a 4-socket system) and bandwidth at capacity, d) reliability features, and e) other features.

The 6-core frequency-optimized Nehalem-EX part has also been tuned to offer the highest core frequency possible for this chip. In creating this part, Intel is meeting the needs of those in the HPC community who want higher scalar performance along with the benefit of large memory capacity and bandwidth per core.

Of course, the 8-core version of NHM-EX is still an option for HPC workloads that scale well with more cores and still need the high memory capacity of the Expandable Class.

Having both 8-core and frequency-optimized 6-core versions of the NHM-EX class of processors means HPC researchers have greater choice in selecting the processor best suited to their specific workloads.

After talking with some of the researchers at SC’09 last week I’m really excited to see how the Nehalem-EX “super node” will deliver the necessary compute and memory capabilities to help those researchers solve some of their biggest challenges.


This week I'm in Portland, Oregon, which I call home. It's interesting for me since this is my first Supercomputing conference, and so far I'm really impressed, not only by the intense knowledge and the plethora of scientific discovery all around, but also by the fact that this conference is so well attended. There's a huge trade show floor, filled to capacity, where you can see everything from genome research to oil and gas exploration to bio-computing. It's very cool to see NASA, Oak Ridge Laboratory, and many top universities all showing off the latest in high performance computing; some very cool stuff indeed. From the point of view of higher learning and how supercomputers are changing the world, this is the place to be. Here are a few shots of the Intel booth in case you get a chance to come by and see us.

 

SC09-Intel Booth01.JPG

 

SC09-Intel Booth02.JPG

 

SC09-Intel Booth03.JPG

SC09-Intel Booth05.JPG

 

I'll be capturing some cool videos from the conference and you should keep a look out for these on Channel Intel at YouTube. Thanks for stopping by The Server Room.


In order to deliver on the continued promise of Moore’s Law, Intel’s Information Technology team needs to equip Intel’s silicon designers with the tools, capabilities and streamlined processes to bring higher performing processors to market every year.  The latest generation of 45nm products (i.e., the Intel microarchitecture codenamed Nehalem) was an especially challenging project for us.

 

With Intel design computing demand growing an average of 45% year over year, coupled with the rich technology capabilities in the 45nm-based Nehalem microarchitecture, the computational requirements of silicon tape-out (the last stage of design before manufacturing) represented an approximately 13x increase in demand over prior 65nm processors. Staring at this demand (1.2 million hours of compute demand per day), plus a need to bring products to market faster and more efficiently, our IT team realized we needed to do something different: our standard grid computing solution, which was sufficient for earlier-stage design work, was insufficient for tape-out.

 

Solution: Intel IT built a High Performance Computing (HPC) solution whose systems currently rank in the Top 500 list of supercomputers (#261, #308, Nov 09) and which features a new parallel storage environment to support our 45nm silicon tape-out process.  The details of this effort are captured in this whitepaper.

In summary, the Intel IT HPC solution employs two of the world’s fastest supercomputers to create the fastest microprocessors, helping Intel achieve the following results.

 

·         Completed 45nm tape-out in 10 days, less than half the time of prior products

·         Delivered an estimated incremental value of $44M to Intel

 

I can’t wait for what tomorrow will bring, as Intel IT is already upgrading and evolving this HPC solution to support our future generations of microprocessor designs. Tune in tomorrow at SuperComputing 2009 in Portland, where Shesha Krishnapura from Intel IT will present more details on our HPC environment, or join us December 8th, 2009 from 10-12am PST for a live chat with Intel IT experts in the Server Room.

 

Chris (twitter)

 

HPC roadmap.JPG



Are you ready to innovate faster or explore more design options in less time than ever before?

The digital workbench powered by two Intel Xeon 5500 processors gives you the opportunity to create, test and modify your idea right at your workstation. Have no doubt, workstations powered by two processors, with eight total cores, sixteen computational threads, and memory capacities up to 192GB are proving extremely capable at analysis-driven design.

Today’s digital workbench is nothing at all like last year’s workstation, which may have struggled to design and simulate. This new breed of a workstation presents you with the capability to rapidly play “what if?”

What is driving the interest in the digital workbench?

Organizations of all shapes and sizes are looking for opportunities to reduce design cycle times and associated costs without negatively impacting product performance. One potential method of achieving this is by enabling designers to consider the validity of a greater number of design concepts earlier in the design cycle. This may not only shorten design cycles, but it may also enable you to ultimately deliver a more favorable product configuration.

The product development rules are changing.

Manufacturers are recognizing that by reordering product design activities, they may be able to achieve a more efficient product development process. Empowered with easy-to-use and powerful 3D conceptual design tools, together with early access to CAE applications, engineers may be able to develop the most advantageous designs before committing them to labor-intensive detailed design processes.

Isn’t this old news?

Many manufacturers agree the greatest opportunity to impact product development cost is by bringing simulation forward. That is old news. Manufacturers know that when product analysis or simulation results trail the detailed design process then product changes become extremely expensive and negatively impact new product release schedules. Worse yet, they also realize that changes made downstream in a design cycle are “last minute” and almost always imply compromises on original design goals. This, of course, cuts into the product performance and profits of the new or updated product.

Using simulation and getting results before the detailed design process begins helps ensure that the CAD models meet performance requirements, mitigating last-minute and expensive design changes.

OK, the product development rules may be changing, but I still need an expert.

No doubt, the expert is still needed. However, advancements at companies like ALTAIR, ANSYS, SIMULIA, MSC, SpaceClaim and others are all making it easier to bring simulation and analysis further upstream in the design process.

As one example, let’s look at the ANSYS Workbench platform. This solution provides an easy-to-use framework that guides the user through even complex multi-physics analyses with drag-and-drop simplicity. It supports bi-directional CAD connectivity and enables the idea of simulation-driven product development.

ANSYS is an example of what ISVs are doing to create tools that capture what the experts know and make that knowledge available to others who need it. Yes, the expert is still very much needed, but leveraging the expert’s knowledge and driving it upstream in the design process is needed even more.

The new model

Using the combined hardware and software technologies delivered through a digital workbench, engineers can now create a single digital model that gives them the ability to design, visualize and simulate their products faster than ever.

This hardware and software suite enables users to create a digital prototype and can help engineers to reduce their reliance on costly physical prototypes and get more innovative designs to market faster.

The digital workbench helps users bring together design data from all phases of the product development process into a single digital model that can be rapidly changed, tested and validated.

What can you do to test the promise of the digital workbench?

Today’s workstation can provide you with a magnificent digital canvas to create tomorrow today. You need to decide if you want to explore reordering your product design activities and potentially achieve a more efficient product development process.

Today’s workstation gives engineers a new tool that can be likened to a digital workbench. This tool, powered by two Intel Xeon 5500 series processors, hosts a suite of software applications that engineers can employ to create and test their ideas. The pliers, hammer and nails found on a workbench in a garage or basement have now been replaced with digital tools that promise to accelerate innovations via a process known as digital prototyping. Its enablers include application tools like detailed CAD, CAE and PIM. Together they represent the new digital workbench—a powerful innovation tool you can use to bring your ideas forward faster than ever before.

Are you ready to use a digital workbench?

Visit www.intel.com/go/workstation to see which workstation is right for you.


Interactive Modeling and Simulation – Come on you are kidding!!

Recent advancements in mathematical modeling, computational algorithms, and the speed of computers based on technologies like the Intel® Xeon® processor 5500 series have brought the field of computer simulation to the threshold of a new era.  While not quite interactive, simulation and analysis can now occur at a pace that impacts decisions further upstream in the design process. 

Simulation and analysis tools are also no longer the domain of the expert.  Organizations can now potentially achieve a more efficient product development process by considering a reordering of product design activities and empowering engineers with easy-to-use and powerful 3D conceptual design tools and early access to CAE applications.

Why consider reordering your product development process?

This is not new news. Manufacturers know that when product analysis or simulation results trail the detailed design process, product changes become significantly more expensive and will most likely negatively impact new product release schedules. Worse yet, they also realize that changes made downstream in any design cycle are often “last minute” and almost always imply compromises on original design goals. This, of course, cuts into the product performance and profits of the new or updated product.

By reordering product design activities, manufacturers may be able to achieve a more efficient product development process and reduce overall product development cost, time and risk.

No experts needed.

Don’t be fooled.  While ISVs such as ANSYS, ALTAIR, MSC, PTC, Siemens PLM, SIMULIA, SolidWorks and others have made tremendous strides in making their simulation products easier to use, you probably still need an expert.  However, their collective advancements in tools, wrappers, and easy-to-use frameworks that guide engineers through complex multi-physics analyses with drag-and-drop simplicity make it easier to move analysis further upstream.

That means your expert can now focus on the really hard problems.

Workgroup Computing – Bringing “Real” HPC Computing To Your Department

Using analysis and simulation to get results before the detailed design process begins will help ensure the CAD models meet performance requirements and will almost always mitigate last-minute and expensive design changes.

Large-scale, compute-intensive jobs used to require investment in, and/or access to, a divisionally shared, large-scale cluster housed in a controlled data center environment and supporting hundreds of users.

While this may have been true a few years ago, the advancements in mathematical modeling, computational algorithms, and the speed of computers based on technologies like the Intel® Xeon® processor 5500 series now make it possible to solve large-scale problems quickly and efficiently, closer to the engineers responsible for dealing with them. This can be done on compute clusters supporting small workgroups or departments of engineers rather than on large-scale clusters shared by hundreds of engineers.

As an example let’s look at the Cray CX1™ deskside personal supercomputer.  Like others in this new usage category, it presents an organization with a solution that is the "right size" in performance, functionality, and cost for individuals and departmental workgroups who want to harness HPC without the complexity of traditional clusters.  Equipped with powerful Intel Xeon 5500 series processors the Cray CX1 delivers the power of a high performance cluster with the ease-of-use and seamless integration of a workstation.

OK, You Can Give Me The Performance, But The Support Can Be A Nightmare

The Intel® Cluster Ready program makes it simpler to experience the power of high-performance computing, helping you boost productivity and take on new problems.

Intel Cluster Ready presents HPC users a certification program that is designed to establish a common specification among original equipment manufacturers, independent software vendors (ISVs) and others for designing, programming and deploying high performance clusters built with Intel components.

For users, this certification means that certified HPC systems will run a wide range of Intel Cluster Ready ISV applications right out of the box.  Tested, validated and simple.

By selecting a certified Intel Cluster Ready system for your registered Intel Cluster Ready applications you can be confident that hardware and software components will work together, right out of the box. Software tools such as Intel® Cluster Checker help ensure that those components continue to work together, delivering a high level of quality and a low total cost of ownership over the course of the cluster’s lifetime.

To learn more about Intel HPC Technology visit www.intel.com/go/HPC


I finally finished the content for the talk that I am scheduled to deliver next week at IDF on Sept 22 (TCIS001 10:15 am – Room 2004).  The content covers examples of optimizing for multi-core using our software tools to accelerate performance and, more importantly, the seamless use of the same software base with minimal or no changes on next-generation architectures (what we call scaling performance forward). Personally, I am excited about the potential of multi-core optimizations with today’s architectures. When I was a graduate student in parallel computing from 1988-1994, it was extremely difficult to take any algorithm and map it to the parallel architectures, since most of the algorithms were not very efficient once you took the communication delays into account.  The key is to get total delivered performance at the application level, not at the kernel level. Today, however, given better-balanced architectures, the availability of multi-core, the available memory bandwidth, software tools that work, and faster interconnects, the number of algorithms that can be parallelized and that actually benefit from accelerated performance (delivered total application time) is huge. Pretty much every industry vertical is taking advantage of multi-core architectures, software tools, and clusters.

John Gustafson from Intel Labs, an industry HPC veteran, is my co-author, and I am thrilled to have him speak about Balanced Computing. Wes Shimanek, a colleague of mine at Intel, introduced me to John, and after listening to his explanation of balanced computing and his views on what works and what doesn’t, we immediately knew that John’s expertise would be greatly valued by the IDF audience and invited him to be part of the talk. John graciously agreed to participate, and I hope that folks interested in computing architectures, especially in the HPC world, will make the time to come listen to John’s talk.

I will also be giving you a high-level view of the challenges that drive our products and briefly introduce you to the various aspects of our strategy. My talk will be followed by 3 more talks that cover the key aspects of what we do at Intel in HPC: Software Tools for Scaling Application Performance Forward (TCIS002), Delivering more to HPC than just Performance (TCIS003), and Intel® Cluster Ready (TCIS009).

I am looking forward to IDF next week. See you all at the developer forum.

Nash Palaniswamy (Intel)


Found this video about how Intel IT converted what was a high-volume manufacturing facility into a high performance computing data center that is now on the Top 500 list.   Watch Tom Greenbaum, Data Center Operations Manager for Intel IT, describe this retrofit and give a tour of the new facility.

 

Some key facts highlighted in the video:

  • avoided several million dollars in facility costs
  • landed the traditional enterprise environment on a raised floor with a hot/cold aisle design in one section of the facility
  • landed the HPC environment on the existing concrete slab floor, which enabled higher-density deployment of servers
  • 6 MW, 10K server capacity (4.7K today)
  • room to grow in the future to support data center consolidation

 

chris


At Intel when we think of scaling performance forward we think of one word, evolution, not revolution.

 

By evolution we mean developing high performance computing solutions that offer the balance your applications require in order to deliver the best performance they can.  We do not maximize processor performance without matching it with the necessary memory capacity, bandwidth and system I/O.  We look to match these important components of performance to ensure the data is where it needs to be, when it needs to be there, so it can be quickly and efficiently turned into actionable information.  We maximize your performance by minimizing your latencies.

 

Maximize your performance today, simplify your software development needs, and scale your performance forward as newer microarchitectures debut.

 

Seamless performance – bigger science – that is what we help you achieve faster than ever before.

 

To learn more about our approach to delivering highly effective HPC processors and software tools come to the Sun HPC Virtual Tradeshow on September 17th 2009 starting at 8am PDT.  In the virtual event attend the Intel presentation on “Accelerating Your Applications And Scaling It Forward” by Wes Shimanek & Dr. Nash Palaniswamy at 10:30am PDT.


I have been around the supercomputing market for over 25 years and have had an opportunity to see some interesting ideas come and go.  Let me share two that I experienced firsthand. 

·         CDC’s Cyber 205 or a Cray 1S.  The Cray 1S and the CDC Cyber 205 both offered effective vector processing; however, code conversion between them may have required some significant algorithmic changes. Cray of course won the HPC race at that time.  Note that the Cyber 205 was a tremendous performer when you could keep its extremely long vector pipeline busy. However, one branch or gap in the vector processing pipeline would cause a flush of the vector unit, and whatever performance advantage you appeared to have vs. a Cray 1S was quickly erased.

·         An early-day accelerator was Floating Point Systems.  In particular, the FPS 164 was an awesome “off load” system where the needs of a few users were satisfied with better throughput than the Cray X-MP and Y-MP of the day. Convex had a better idea.  Its system was better at serving the needs of more users than an FPS 164 and made it simpler to develop, maintain and scale software to next-generation systems.

So what are the lessons from history? Perhaps it is that there is a tight connection between applications, architectures and algorithms, and that it is extremely important to maintain a level of application flexibility and versatility in order to adopt new architectures as they become available in the market.  The old adage still remains true: software will outlive the useful life of hardware.  So it is important to be able to quickly adapt to new shifts.

The same questions probably still apply today as they did when Cray, CDC and FPS were around.

When does an accelerator computing strategy work best?

The easiest answer is if your application is extremely data parallel in nature, then it may be well suited for an accelerator strategy. The word extremely is the critical part. 

If your application is only partly data parallel and also relies on task, thread and cluster level parallelism, contains even a small fraction of branching, or works on irregular data sizes, then an accelerator may not be the best fit.

How much real performance will an accelerator strategy deliver? 

Often times we hear claims of 10X, 20X or even greater than 30X. 

These are great headlines, but as many have noted, you need to understand an accelerator’s impact on the total execution time of your application.  What may have been 10X to 30X or more on a kernel of the application may deliver a mere 2X to 3X, or even less, in terms of total application performance improvement.
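The gap between a kernel speedup and a whole-application speedup can be sketched with Amdahl's law. The snippet below is only an illustration: the function name `overall_speedup` and the 60% kernel share are assumptions chosen for the example, not measurements from any real application.

```python
def overall_speedup(kernel_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only part of the runtime is accelerated.

    kernel_fraction -- share of the original runtime spent in the accelerated kernel (0..1)
    kernel_speedup  -- how many times faster that kernel runs once accelerated
    """
    if not 0.0 <= kernel_fraction <= 1.0:
        raise ValueError("kernel_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    # New runtime (relative to the old one): untouched part + accelerated part.
    remaining = (1.0 - kernel_fraction) + kernel_fraction / kernel_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    # Hypothetical application: 60% of its runtime sits in the accelerated kernel.
    for kernel_speedup in (10, 20, 30):
        total = overall_speedup(0.6, kernel_speedup)
        print(f"kernel {kernel_speedup}X -> application {total:.2f}X")
```

Under that assumed 60% share, even a 30X kernel leaves the application a little over 2X faster, because the untouched 40% of the runtime now dominates.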

Of course the real question is what are we really comparing performance speed ups to?

I have seen well-tuned software on accelerators compared to “baseline” code running on one core of an old processor.  However, when you use available software technology, turn compiler flags on, and add in a math kernel library call, the performance on multi-core solutions can jump by over 10X and in some cases by more than 30X for total execution time.  This standards-based accelerated software will scale forward as newer microarchitectures are made available from Intel.

Why is the difference between the promise and the actual performance so great? 

Always a good question :)

The promise deals with a small part or a kernel of the software that is data parallel and can potentially scale linearly as more compute resources are added.  Again if the application is extremely data parallel, then an accelerator strategy may be the correct approach.

However, when the actual performance result, or total application performance, is significantly different it is often because of several things. 

·         One common reason is that you may be comparing unoptimized software on multi-core systems to optimized software on an accelerator.  When I compare similarly optimized software on a multi-core system, I see that the 20 – 30X difference often fades to less than 2X, and in most cases the multi-core system performs better than the hardware accelerator.  This is because optimized software on a multi-core solution accelerates all components of the application.

·         Another situation is the bandwidth imbalance at the attach points of the accelerators. Typically the attach speeds do not match the memory bandwidth or the ALU speed on the accelerators, so the theoretical peak flops are tough to achieve.  Sometimes, for larger workloads, performance deteriorates because of the limited amount of memory on the accelerator card.

·         Another situation may be that your application depends on different forms of parallelism, which include task, thread or cluster level parallelism, and even in some cases sequential portions of your software.
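The attach-point imbalance can be put into rough numbers with a simple model: time on the accelerator is the compute time plus the time to move data across the link. This is a simplified sketch that assumes transfers do not overlap with compute; the function name `offload_speedup` and all figures in the example are hypothetical.

```python
def offload_speedup(host_seconds: float,
                    accel_compute_speedup: float,
                    bytes_moved: float,
                    link_bytes_per_second: float) -> float:
    """Effective speedup of an offloaded kernel once data movement is counted.

    Assumes transfers and compute do not overlap (a deliberately simple model).
    """
    if min(host_seconds, accel_compute_speedup, link_bytes_per_second) <= 0.0:
        raise ValueError("times, speedups and link rates must be positive")
    if bytes_moved < 0.0:
        raise ValueError("bytes_moved cannot be negative")
    compute_seconds = host_seconds / accel_compute_speedup
    transfer_seconds = bytes_moved / link_bytes_per_second
    return host_seconds / (compute_seconds + transfer_seconds)


if __name__ == "__main__":
    # Hypothetical kernel: 10 s on the host, 20X faster on the accelerator,
    # but 8 GB must cross a 4 GB/s attach link.
    print(offload_speedup(10.0, 20.0, 8e9, 4e9))
```

With these made-up numbers the kernel computes in 0.5 s but spends 2 s moving data, so the 20X compute advantage shrinks to 4X before the rest of the application is even considered.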

So back to the differences in performance between the Cray 1 and CDC Cyber 205.

While the Cyber 205 was great at the edges of science, the Cray proved to be the workhorse of high performance computing.  It offered better system balance than the Cyber 205.  Here is an example: if you take great care to optimize your software for a particular architecture, you will no doubt see tremendous performance gains.  However, as with the Cyber 205, if you break that pipeline you need to pay the overhead to restart the long vector pipeline.  Often, just as with today’s accelerators, that start-up cost reduced what appeared to be stellar performance gains of the Cyber 205 to being no better than, or sometimes even slower than, the Cray 1.  There were of course examples with the Cyber 205, as there are today with accelerators, where select sciences can see tremendous advantages over traditional computing solutions.

What other considerations may weigh in your decision to adopt an accelerator strategy?

Are you constantly refining your software?
Many researchers would probably answer yes.  They are constantly refining their software to improve the results, the performance, or both.

As I mentioned at the beginning of the blog, the old adage still remains true, software will outlive the useful life of hardware.  So it is important to be able to quickly adapt to new shifts.  One way to simplify these moves is to use standards based tools which can give you the flexibility to create applications that can use the multiple types of parallelism mentioned above via tools, compilers, and libraries.  You may also want to use standards based tools to acquire the versatility you need in order to scale your software across multiple architectures – e.g. large, many and heterogeneous cores. 

The caveat with respect to using non-standard tools is that you become locked into a specific architecture.  If that vendor’s architecture happens to change, you may be required to make some significant changes (e.g. tuning to grain sizes).

Do you want to maintain, support and update multiple code bases?
I don’t.  I want to invest in the development of parallel algorithms.  The old adage that software will far outlive any hardware implementation still applies, and I need the flexibility and versatility to adopt new architectures as quickly and painlessly as possible as they are made available.  I do not want to invest in maintaining, supporting and updating an ever-increasing set of code streams as newer architectures are made available.

Our team goal at Intel is to develop software tools and hardware technology that can help you scale-forward your application performance to future platforms without requiring a massive rebuild – just drop-in a new runtime that is optimized for the new platform to experience the improvement (akin to the printer/display driver model, buy a new printer/display, install the respective driver, and your system enjoys improved benefits).  That is the goal.

If you want to learn more about what we are doing to deliver high performing HPC solutions that are both flexible and versatile please visit www.intel.com/go/hpc


Picture1.jpg

At ISC09, the Top 500* results were announced: 399 out of 500, nearly 80%, of the world’s top supercomputers are using Intel processors.  The Top500 list is based upon one benchmark, Linpack.  While powering most of the world’s fastest computers is a great endorsement of the role Intel’s technology is playing to help solve the most complex high performance computing problems, no one buys a supercomputing machine just to run Linpack.  Linpack is a kernel that does not necessarily resemble any real application.  It’s just one evaluation vector among many. So, should you demand more?

Yes, look beyond the flops:  look at real application performance or benchmarks that might more closely resemble yours, look at the versatility, and look at ease of deployment of your solution.  

Today, Intel processors deliver more performance and throughput in less space and require less power than ever before.  The Intel® Xeon® 5500 Platform delivers up to 3X the performance of the previous-generation Intel Xeon 5400 to decrease your time to discovery.   The Top500 list has 33 new entries based on the Xeon 5500, which launched only 3 months ago. Intel tools (compilers, libraries, and cluster kits) bring new levels of software versatility by enabling HPC users and ISVs to write applications that extract peak performance and scale forward.  The Intel Cluster Ready program is easing cluster deployments, increasing reliability and lowering TCO by making it simpler to purchase, deploy and manage an HPC cluster.

So while providing flops is great, don’t forget to look at (and demand) real application performance, and ask for software tools and technologies that maximize the value of your HPC system.

Jimmy

*Other names and brands may be claimed as the property of others


I have been watching the social chatter on Twitter today about the latest Top500 supercomputing list, seeing companies, manufacturers, application vendors and even countries compete for mind share on this most recent list.

 

However, as I read about and explored this list, the things that jumped out at me were not who is number one, two, three … or who grew by what number of spots ... but rather the trends that have occurred over time. These trends have not happened in the last 6 months or the last 6 years but instead over the course of nearly a decade of innovation.

 

1)      Today, the #10 entry (a cluster using the 3-month-old Xeon X5570 processor (Nehalem-EP)) delivers FLOPS performance equal to that of the entire June 2000 TOP500 list combined (see below).

 

 

[Figure: TOP500 performance development over time, June 2009]

source: http://www.top500.org/lists/2009/06/performance_development

 

2)      Also, the emergence of multi-core Intel-based servers, complemented by affordable open-source software solutions, has enabled a transformation of how supercomputing performance is delivered. Intel-based systems on the list have gone from nearly “0” to nearly “400” over this decade.

 

 

[Figure: Intel-based systems in the TOP500 over time]

source: http://www.intel.com/pressroom/images/IntelTOP500history.jpg

 

I recently had the opportunity to co-present a webinar with Matt Jacobs of Penguin Computing where we talked about how High Performance Computing is changing the way that businesses innovate, research, design, analyze and create. What used to be done only in large datacenters and universities is now available to mainstream IT and businesses.

 

This is extremely important for areas like health care, financial services, manufacturing and many other industries.  Equally important are the software technologies (such as Intel Cluster Ready software) that make clustering available and easy to use, so that this performance capability can be tapped without a ton of complexity.

 

So, while the Top500 list may be interesting for bragging rights, what excites me, and many of the end users that I talk to, is the power, affordability and accessibility that high performance computing now offers mainstream business users, and the innovation and creativity that brings to the marketplace.

 

How are you using computing performance to do things that once were not possible in your business?  Share your story with us!

 

Chris

http://twitter.com/chris_p_intel

 

 


BMW automobiles are known for speed, agility, quality, style and probably some other attributes I’m forgetting. Their IT infrastructure requires the same attributes for them to remain competitive in their industry.

Proactive server refreshes, now using Xeon 5500, are part of that equation.  This recent case study outlines how BMW’s migration to the Xeon 5500 series lowers total cost of ownership and increases flexibility for their business.

Server refresh with Xeon 5500 delivers 30% higher IT performance with 75% less hardware, compared to dual core Xeon 5100 technology. 
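As a rough sanity check, the per-server gain implied by "30% higher performance with 75% less hardware" can be worked out directly. This is only a sketch: it assumes both percentages are measured against the same Xeon 5100 baseline and that total performance scales linearly with server count, neither of which is stated here. The function name is illustrative.

```python
def implied_per_server_speedup(perf_gain: float, hardware_reduction: float) -> float:
    """Per-server performance ratio implied by a fleet-level claim.

    perf_gain: fractional increase in total performance (0.30 for 30%).
    hardware_reduction: fractional decrease in server count (0.75 for 75%).
    """
    if not 0 <= hardware_reduction < 1:
        raise ValueError("hardware_reduction must be in [0, 1)")
    total_perf_ratio = 1.0 + perf_gain             # new fleet vs. old fleet
    server_count_ratio = 1.0 - hardware_reduction  # new count vs. old count
    return total_perf_ratio / server_count_ratio


speedup = implied_per_server_speedup(0.30, 0.75)
print(f"Implied per-server speedup: {speedup:.1f}x")
```

Under those assumptions, each new server would be doing the work of roughly five of the old ones.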

The case study also says that BMW’s next refresh targets are its RISC-based servers.

Can you gain a competitive edge by replacing aging servers in your infrastructure?

Estimate your savings today (www.intel.com/go/xeonestimator)


Looks like the Intel® Xeon® processor 5500 series is making lots of noise in HPC.  The QPI and integrated memory controller are really providing the boost necessary to make it an all around performance leader for HPC applications.  With all this performance why did Intel add a third memory channel?

The third memory channel enables the platform to support a boatload of memory.  As a matter of fact, up to 192GB can be supported in a two socket configuration.  It wasn’t too long ago when only 32GB was supported in a dual socket configuration.  By having the ability to support so much memory you can now meet the needs of almost every HPC application.  The 5500 series is intended for all server markets, but let’s face it: with the design changes Intel made in the new architecture, the server segment gaining the most benefit appears to be HPC.

It seems like only yesterday that the only way to have access to large memory configurations was through expensive, proprietary SMP systems.  The HPC market for large SMP systems is still out there but it is shrinking…fast.  Today, we are clustering low cost solutions to create some of the most powerful systems in the world.  Standard components are leading to lower and lower system costs, delivering a price/performance advantage alternative solutions cannot match.

Now that a single dual socket node can support up to 192GB, it is important to understand how to get there.  First, to enable 192GB you need 16GB DIMMs in all 12 memory slots (12 x 16GB).  There will be a premium for a 16GB DIMM.  Knowing the options and determining the best, most cost effective solution is going to be dependent upon your environment.  When a large memory node is required, do you purchase the 16GB DIMMs or go up to a multi-socket solution?  If I decide to scale back on the memory (use 4GB or 8GB DIMMs instead of 16GB DIMMs), what is the performance impact to my application?  If I am cost sensitive, will the lower cost outweigh the lack of performance?  Can I use SSDs (solid-state drives) to compensate for any performance loss due to lower memory capacity?  There are many questions to think about when deciding the right configuration for your application and environment, and I certainly can’t answer them all here.
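To make the capacity arithmetic concrete, here is a small sketch of the totals you get when all 12 slots of a two socket node hold the same DIMM size. The slot count and DIMM sizes come from the paragraph above; the function and constant names are illustrative, and the sketch deliberately ignores pricing, DIMM rank limits and memory speed, which vary by platform.

```python
SLOTS_PER_NODE = 12  # dual socket node, as described in the post


def node_capacity_gb(dimm_size_gb: int, populated_slots: int = SLOTS_PER_NODE) -> int:
    """Total memory in GB when every populated slot holds the same DIMM size."""
    if not 0 <= populated_slots <= SLOTS_PER_NODE:
        raise ValueError("populated_slots out of range")
    return dimm_size_gb * populated_slots


# Compare the three DIMM sizes mentioned in the post.
capacities = {size: node_capacity_gb(size) for size in (4, 8, 16)}
for size, total in capacities.items():
    print(f"{size:>2}GB DIMMs x {SLOTS_PER_NODE} slots = {total}GB")
```

Stepping down from 16GB to 8GB DIMMs halves the node's capacity to 96GB, so the question becomes whether your working set still fits.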

Let’s not forget that the third memory channel enables a different set of optimal memory configurations.  Think x3 when deciding on how much memory to install into your node: 12GB, 24GB, 48GB, etc.  What happens when you don’t use an optimal configuration?  Well, it depends. In most cases the impact is minimal, but let me add a bit of context around minimal:

·         Low bandwidth sensitivity (more dependent upon the processor for performance)

        E.g. Monte Carlo, Black-Scholes (financial modeling), BLAST (bioinformatics), AMBER (molecular dynamics)

        Expect less than a 2% difference between memory configurations*

·         Medium bandwidth sensitivity (somewhat balanced between memory and CPU usage)

        E.g. CFD, Explicit FEA, Implicit FEA (with robust I/O system)

        Expect approx. 5% degradation for non-optimal symmetrical configurations*

·         High bandwidth sensitivity (high access to the system memory)

        E.g. WRF (weather), POP (climate), MILC (physics), Reservoir Simulation

        Expect approx. 10% degradation for non-optimal symmetrical configurations*

The results are interesting.  In all three cases above, the degraded performance is always better than the performance you would have with only two memory channels.

When you hear about the performance impact of non-optimal memory configurations, remember that, as the examples above show, it is application dependent and will not have a severe impact on your overall system performance.
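The rule of thumb above can be sketched as a small helper. This is a minimal illustration, not an Intel tool. It assumes a configuration is "optimal" when each socket's DIMM count is a multiple of three, so that all three channels are populated equally, and it uses the approximate figures quoted in the list above as rough worst-case estimates. All names here are hypothetical.

```python
CHANNELS_PER_SOCKET = 3

# Approximate degradation for non-optimal configurations, taken from the
# figures quoted in the post (Intel internal measurements). "low" is quoted
# as "less than 2%", so 0.02 is used here as an upper bound.
APPROX_DEGRADATION = {"low": 0.02, "medium": 0.05, "high": 0.10}


def is_balanced(dimms_per_socket: int) -> bool:
    """True when the DIMMs can be spread evenly over all three channels."""
    return dimms_per_socket > 0 and dimms_per_socket % CHANNELS_PER_SOCKET == 0


def estimated_relative_performance(dimms_per_socket: int, sensitivity: str) -> float:
    """Rough performance relative to an optimal (balanced) configuration."""
    if is_balanced(dimms_per_socket):
        return 1.0
    return 1.0 - APPROX_DEGRADATION[sensitivity]


for dimms in (3, 4, 6):
    for sensitivity in ("low", "medium", "high"):
        rel = estimated_relative_performance(dimms, sensitivity)
        print(f"{dimms} DIMMs/socket, {sensitivity} sensitivity: ~{rel:.0%}")
```

For real sizing decisions, measure your own application; the percentages are only as good as the rough classes they come from.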

The Intel Xeon processor 5500 series offers support for huge memory nodes with the addition of the third memory channel.  Memory configurations in multiples of three are ideal, but if you decide to stay with a power of two configuration the performance should still exceed that of a solution based upon only two memory channels.

*Based upon Intel internal measurements


Often this term is used around the HPC industry referring to the use of HPC to help companies and R&D accelerate the process of innovation.   One close to home example that comes to mind, because I own one of his products, is James Dyson, inventor of the Dyson* vacuum cleaners.   By all means he is a success story, but the road to success, by his own account, was paved by many failures along the way.  According to Dyson, it took him 15 years, nearly his entire savings and 5127 prototypes to develop his creation.   Could HPC technology have streamlined his success?

[Image: James Dyson]

A May 2008 study on the Industrial use of HPC for innovation, sponsored by DARPA, DOE, and Council on Competitiveness, concluded that “HPC-based virtual prototyping and large-scale data modeling provide breakthrough insights that dramatically accelerate and streamline not only ‘upstream’ R&D and engineering, but also ‘downstream’ business processes such as data mining, logistics and custom manufacturing.”    And while, “the United States is the largest consumer of HPC…. some U.S. firms are not applying HPC as aggressively as they could.”

Going back to Dyson’s example: granted, a decade ago the use of HPC was limited to a few industries, but the HPC industry has grown dramatically, and so have the availability, accessibility and affordability of computing resources that can significantly shorten the time to discovery and accelerate innovation.  The combination of powerful, affordable workstations, servers, and software allows a designer to iterate more quickly and efficiently on ideas for form, fit and function.  You might want to read this blog on “Why you need a Digital Workbench” by Thor Sewell.  And while cost might be mentioned as one of the barriers to HPC adoption, can you afford to persevere for 15 years and risk your savings, like Dyson did?  As a matter of fact, companies are innovating around providing HPC services to the “masses,” lowering the adoption barriers on a “pay-as-you-go” basis.  So while Dyson learned from each of his 5127 so-called “failures” over 15 years and at the cost of nearly his entire savings, HPC technology allows us to shorten the time from idea to reality by streamlining every stage of the innovation cycle: from creating, to simulating, to analyzing, to visualizing.

Jimmy

*Other names and brands may be claimed as the property of others


My name is Steve Thorne, and this is my first blog post in The Server Room. I’m the product line manager for the Intel Xeon processor 5000 family, and I’m based out of our Hillsboro, Oregon facility. I’ve been looking forward to this blog post for quite some time, since I’ve been meeting with a wide variety of customers over the past few weeks.

 

It’s been just over a month since we introduced the Intel Xeon processor 5500 series (the processor formerly known as “Nehalem-EP”). We are certainly pleased with the response from the industry at this point. Below you will see some of my observations about what has transpired over the first 30 days of release. At the same time, I invite you to share some of your stories about recent installations of the Xeon 5500. Where is it being used? What kind of environments are you using it in? What kind of improvements have you observed in your deployments?

The industry response has been extremely encouraging to me. Our marketing teams spent more than three years diligently preparing for the successful introduction. Some of my observations from the first month include:

·         The list of vendors that support the Xeon 5500 continues to grow. We started with over 70 system manufacturers on March 30, 2009. And on April 14, 2009, Sun Microsystems introduced a new line of x64 blade servers, rack servers and workstations powered by the Intel Xeon processor 5500 series. Of particular interest is the Sun Blade X6275 server module. You can find more info at: http://www.sun.com/solutions/hpc/compute.jsp.

·         I attended our launch event in Santa Clara on March 30, 2009. While at the event, I was pleasantly surprised by the adulation from the customers who were in attendance. In particular, our friends in the Digital Content Creation (DCC) industry are eager to apply the capabilities of the Xeon 5500 for movie special effects and animated features. Being a father of three school age children, I’ve always been fond of our products’ role in the moviemaking process. It’s fun to take your kids to the theater and show them a concrete example of how these incredibly complex processors are used to generate chuckles and special effects in movies ranging from “Cars” to “Monsters vs. Aliens.”

·         Positive recognition has been accorded to the Xeon 5500 from a wide variety of independent press reviewers and articles. A recent internet search revealed almost 875 news references. Recently, George Ou of DailyTech published an interesting article titled “Server roundup: Intel “Nehalem” Xeon versus AMD “Shanghai” Opteron”. You can read the entire article at: http://www.dailytech.com/article.aspx?newsid=15036

·         On May 4, 2009 two independent financial analysts upgraded Intel Corp. stock. Both analysts attributed part of their positive outlook to the introduction and ramp of Xeon 5500 servers.

·         On April 8, 2009 the new Xeon 5500 was a centerpiece of our IDF event in Beijing. In his enterprise keynote, Pat Gelsinger said the “Nehalem” microarchitecture has received worldwide acclaim.

·         Customer deployments are underway at leading data centers around the globe – particularly in High Performance Computing (HPC) applications. The HPC accounts encompass university research labs, commercial research and development and large scale clusters. These HPC customers are pushing the outer limits of scientific discovery and innovation, and the best examples are yet to come!

Personally, I was proud to be a part of the introduction of the Xeon 5500. There is a strong sense of satisfaction when the silicon is deployed in real-world environments. And in case you hadn’t heard, we are busy getting ready for the next addition to the Xeon family, codenamed “Westmere-EP.” We expect this new 32nm processor to be socket and pin-compatible with the Xeon 5500, and it will stretch the processor to support six individual CPU cores per socket. Stay tuned for this release in 2010!



 


