If you want to virtualize IBM InfoSphere* but have put it off, I have some new information that might give you the bump you need to move forward: How does 98 percent throughput sound?
In a nutshell: extensive testing shows that InfoSphere can perform very well when virtualized. How well? We achieved between 90 and 98 percent of the throughput that we typically see in a physical environment, with an overhead tax of only about 10 percent on I/O-intensive workloads.
You can get all of the configuration and results details in the white paper about InfoSphere virtualization, but I’ll hit the highlights for you in this post.
While virtualizing InfoSphere Information Server is cool, here’s the icing on the cake: virtualization with VMware vSphere* is now supported as an IBM PureApplication* pattern. “Patterns” are a growing ecosystem of software stacks for IBM PureSystems* running on Intel® Xeon® processors. Adding the VMware vSphere 5.1 stack broadens what you can do with these appliance systems compared with other industry systems. PureSystems are quick to bring up and run industry-standard software stacks.
Testing InfoSphere Virtualization
IBM, VMware, and Intel teamed up recently to see how IBM InfoSphere performs when it’s virtualized. Specifically, we looked at the runtime performance of:
- IBM InfoSphere DataStage* 8.7
- Running on VMware vSphere 5.0
- On a server powered by the Intel Xeon processor E7 family
The tests found that InfoSphere DataStage scaled smoothly as we cranked up the number of virtual CPUs (vCPUs), while clocking throughput at up to 98 percent of that found in a physical environment.
More good news: we saw only a slight performance difference when using VMFS (Virtual Machine File System) versus RDM (Raw Device Mapping) data stores. This means that you can reap the benefits of VMFS for storage provisioning with virtualized workloads without concern over performance.
Now let’s get down into some of the details so you can put these results in context.
InfoSphere Virtualization Test Configuration
We used an IBM* System x3850 X5 with a network-attached IBM System Storage* DS5300:
- Four-socket system
- 40 physical cores
- Intel Xeon E7-8870 processors
- Configured as “optimized for performance”
To get a good comparison between physical and virtual environments, we controlled RAM and processor availability to the native environment so that we could match the virtual environment as closely as possible. We enabled all of the virtualization-related options, except for Intel® Hyper-Threading Technology (Intel® HT Technology), which we disabled to simplify the comparison.
We installed all InfoSphere DataStage components on one physical server in the native environment and on one virtual machine in the virtualized environment:
- IBM WebSphere* 8.1 Application Server (WAS)
- XMeta repository
- DataStage engine
We found that the DataStage engine had higher throughput when running in a single virtual machine. The specific reasons for this result weren’t clear, but we’re hopeful that further testing will provide more detail.
ETL Workload Virtualization Test Results
As expected, throughput in both environments increased as the number of processor cores increased, with the virtual environment trailing the physical one by only 2 to 10 percent. In other words, the overhead for the virtual environment never exceeded about 10 percent in any tested configuration.
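If you want to run the same comparison on your own numbers, the arithmetic is simple. Here is a minimal Python sketch; the function name and the sample throughput values are hypothetical, chosen only to reproduce the 2 and 10 percent endpoints mentioned above, and are not measurements from the white paper.

```python
def virtualization_overhead_pct(physical_throughput: float,
                                virtual_throughput: float) -> float:
    """Return the percentage of throughput lost relative to the physical baseline."""
    if physical_throughput <= 0:
        raise ValueError("physical_throughput must be positive")
    return (1.0 - virtual_throughput / physical_throughput) * 100.0


# Hypothetical sample values (rows per second), not figures from the tests.
best_case = virtualization_overhead_pct(1000.0, 980.0)   # VM at 98% of physical
worst_case = virtualization_overhead_pct(1000.0, 900.0)  # VM at 90% of physical
print(f"best case: {best_case:.1f}% overhead, worst case: {worst_case:.1f}% overhead")
```

Measure both environments with the same job, data volume, and core count before plugging the numbers in; otherwise the percentage tells you little.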
Host Server Memory-Management Test Results
We also wanted to see what would happen with DataStage performance when overcommitting the host server. We measured a 34 percent drop in throughput when the system was 100 percent committed, and throughput continued to drop as the host processors were further committed.
Working with VMware engineers, we determined that this drop occurred because the eight vCPUs were not mapping neatly onto the 10-core Non-Uniform Memory Access (NUMA) node design of the Intel Xeon processor E7 family’s microarchitecture.
To work around this issue, you could use a five-vCPU configuration instead of an eight-vCPU configuration. Five vCPUs map well to the 10-core NUMA nodes, because two such VMs fill a node exactly. Another workaround would be to turn off NUMA scheduling in the BIOS or in VMware vSphere, which would allow all of the CPU cores to be used, though you would see a lag in memory performance. The lesson? Understand NUMA configuration to optimize VM performance.
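To see why five vCPUs fit better than eight, here is a rough back-of-the-envelope sketch in Python. It assumes a deliberately simplified model in which every VM must sit entirely inside one NUMA node; the real vSphere scheduler is more sophisticated than this, and the class, function, and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class NumaFit:
    vms_per_node: int          # whole VMs that fit inside one NUMA node
    idle_cores_per_node: int   # cores in a node that no whole VM can claim
    total_vms: int
    total_idle_cores: int


def numa_fit(cores_per_node: int, nodes: int, vcpus_per_vm: int) -> NumaFit:
    """Simplified model: a VM may not span NUMA nodes (an assumption)."""
    if min(cores_per_node, nodes, vcpus_per_vm) <= 0:
        raise ValueError("all arguments must be positive")
    vms_per_node = cores_per_node // vcpus_per_vm
    idle_per_node = cores_per_node - vms_per_node * vcpus_per_vm
    return NumaFit(
        vms_per_node=vms_per_node,
        idle_cores_per_node=idle_per_node,
        total_vms=vms_per_node * nodes,
        total_idle_cores=idle_per_node * nodes,
    )


# The test host described earlier: four sockets with 10 cores each.
eight = numa_fit(cores_per_node=10, nodes=4, vcpus_per_vm=8)
five = numa_fit(cores_per_node=10, nodes=4, vcpus_per_vm=5)
print(eight)
print(five)
```

Under this model, each eight-vCPU VM leaves two of a node's 10 cores with no whole VM to serve, while five-vCPU VMs pack two to a node with nothing left over.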
We repeated this over-commitment test with the VMware vSphere “reserve CPU” option for the DataStage guest set to maximum, and the result showed minimal performance impact. However, this move can potentially impact the performance of the other non-reserved virtual machines running on the same host.
If your system is overcommitted and you’re not seeing the DataStage runtime performance you want, the best option would be to increase the host system capacity or to move the VM to another host.
Go Ahead and Virtualize InfoSphere
Our tests showed that the InfoSphere Information Server runtime performed very well in a virtualized environment hosted on a platform powered by the Intel Xeon processor E7 family and using VMware vSphere 5.0. So if you’ve wanted to virtualize your InfoSphere applications but were concerned about performance, now might be the time.
Take a look at the white paper to see the test configuration, procedure, and results. And follow me on Twitter, @TimIntel, to get more useful InfoSphere and DB2* tidbits. You can also get the latest news and technology updates at the joint Intel and IBM DB2 website.