Recently I purchased a new Intel server (SR1612UR platform) with an Intel RAID RS2BL040 controller, and I have installed a 40GB SSD cache drive (CacheCade).
The server has two Xeon 5620 CPUs, 32 GB RAM, and six 1TB SAS 6Gb/s drives configured in RAID 10 (two virtual drives, 1.3 TB each).
The BIOS and RAID controller firmware are up to date.
I am running ESXi 4.1 U1 with all the latest patches.
I have installed three VMs: Windows 7 x32, Windows 7 x64, and Debian 6 x64, all with the VMware drivers installed.
For benchmarking I used Iometer on Windows and stress on Linux.
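For anyone unfamiliar with these tools, here is a rough illustration in Python of the kind of thing they measure; this is a toy sequential-write throughput sketch (file name arbitrary), not a substitute for Iometer or stress:

```python
import os
import time

def seq_write_mb_s(path, total_mb=64, block_kb=64):
    """Write total_mb of data in block_kb chunks and return throughput
    in MB/s. A toy stand-in for what Iometer/stress exercise."""
    block = b"\0" * (block_kb * 1024)
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(total_mb * 1024 // block_kb):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force the data to disk, not just the page cache
    elapsed = time.perf_counter() - start
    os.remove(path)
    return total_mb / elapsed

print(round(seq_write_mb_s("testfile.bin"), 1), "MB/s")
```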
The results are very strange and bad. As soon as I start the test on all three VMs, one machine runs at maximum performance, about 600 commands/s according to esxtop, while the other two run very slowly, even dropping to 0 at some points (this test used a 20GB test file). We set up the same test with smaller test files and different Iometer settings, and the behavior is the same: one machine runs at full performance while the rest barely run.
We also tested with two Windows machines running just a torrent client on the same files: one VM has sustained performance and the other has almost no access to the disk.
When the load is not very high, both VMs start out at the same performance, but within 2 to 10 minutes (depending on the disk load) the performance of one VM degrades to the point of unusability. As soon as I stop the fast VM, one of the other two takes over the resources, crippling the remaining VM.
If I run the same tests without the SSD cache drive, the load is distributed evenly across all VMs and system performance is very good.
Does anyone have any idea why the server behaves like this? Is there a special setting that has to be made on the RAID controller?
You say you've installed a 40GB SSD drive and then you call it CacheCade. Are you using one of the SSD Cache with FastPath (AXXRPFKSSD) premium feature keys?
See the attached Intel® RAID Controller Premium Feature Key Training document for information on setup and use of the SSD Cache feature key.
You can also find information on the RAID Solutions website.
Yes, we are using the hardware premium feature key for FastPath.
We found one problem related to the length of the mini-SAS cable, which was too long (1 meter); after switching to a 60mm cable, VMware now runs with the SSD cache and both machines share the I/O evenly, but with no increase in performance.
We ran lots of tests with Iometer plus some custom tests, and in every case performance with CacheCade was worse than without it.
In every test combination, performance dropped when the SSD cache was in use. We also ran a test with background archiving of a 25GB archive containing 1.3 million files: it finished in 39 minutes without CacheCade and in 51 minutes with CacheCade.
For example:

RAID 5, six 1TB SAS 7200rpm disks
Two VMs running Iometer
Iometer settings: 4KB, 100% sequential, 100% read

With SSD CacheCade:
  Total IO/s:            VM1: 9365.62     VM2: 8864.11
  Total MB/s:            VM1: 36.58       VM2: 34.63
  Avg IO response (ms):  VM1: 0.1060      VM2: 0.1120
  Max IO response (ms):  VM1: 153.1653    VM2: 206.5525

Without SSD CacheCade:
  Total IO/s:            VM1: 10512.24    VM2: 11172.80
  Total MB/s:            VM1: 41.06       VM2: 43.64
  Avg IO response (ms):  VM1: 0.0945      VM2: 0.0888
  Max IO response (ms):  VM1: 17.4893     VM2: 3.7226

RAID 10, six 1TB SAS 7200rpm disks
Two VMs running Iometer
Iometer settings: 4KB, 50% sequential, 70% read

With SSD CacheCade:
  Total IO/s:            VM1: 153.87      VM2: 160.24
  Total MB/s:            VM1: 0.60        VM2: 0.63
  Avg IO response (ms):  VM1: 6.4979      VM2: 6.2383
  Max IO response (ms):  VM1: 183.6372    VM2: 242.0971

Without SSD CacheCade:
  Total IO/s:            VM1: 174.17      VM2: 170.65
  Total MB/s:            VM1: 0.68        VM2: 0.67
  Avg IO response (ms):  VM1: 5.7397      VM2: 5.8576
  Max IO response (ms):  VM1: 102.9204    VM2: 105.4651
We haven't seen this behavior in our lab; however, we have not done any testing with VMware.
Here's an LSI-created Advanced Software Evaluation Guide that details how to set up Iometer to mimic a workload that has locality. You might find it useful in configuring Iometer for your tests.
I'm looking at SSD Cache with an RS2BL080 controller and six 2TB 6Gb/s disks configured as RAID 5.
It may be that your SSD can't deliver the performance of your drives. Most small SSDs are in the low performance range, delivering about 200 MB/s. Maybe lots of IOPS, but that depends on your access profile.
Maybe your SSD is only capable of 3 Gb/s. When caching in the SSD, all of your data goes through a single SSD.
My RAID 5 delivers about 600 MB/s reading and writing, which none of the current cheap (up to 500 EUR) SSDs can match. So it may be a good idea to put at least two or more SSDs into such a configuration, so the cache can at least match the performance of the drives.
Your SSD may be slower than your drives; maybe try adding more, or faster, SSDs.
Mihai, I wanted to add to my post below. I noticed a number of issues with the testing you did.
- First, CacheCade only accelerates random reads. This is not just true for CacheCade, but for similar products.
- For writes, the DRAM write cache on your RAID controller is orders of magnitude faster than Flash. You can expect writes to DRAM to always outperform writes to Flash. In many cases, even without CacheCade, SSDs are slower than HDDs in write-intensive applications. That is why (for example) in TPC-C benchmarks they never put the redo logs (which is a sequential write workload) on SSD. They always put the redo logs on RAID-10 spinning disks.
- For sequential reads, spinning disk is generally equivalent to Flash SSD. In fact, dollar-for-dollar spinning disk is several times faster than Flash SSD for sequential operation! Whether using CacheCade or not, don't expect to see performance increases on sequential read operations from any Flash SSD.
- CacheCade (and similar products) require some time to figure out the read workload and copy "hot" data into the SSD. For CacheCade this timeframe is roughly 90 minutes according to the LSI performance docs I have seen. CacheCade is fast in this regard -- some competitive products take 10 or more hours to adapt to a workload. Try letting your IOmeter test runs go for at least 90 minutes and see what happens.
- Also, most SSDs rely on parallelism to get the high IOPS numbers, for example the Intel X25s are 10-way parallel. In other words, an X25 SSD is like a RAID-0 array of 10 independent Flash-chips. Individually the Flash chips are not nearly as fast as they are when running lots of "threads". If you are using IOmeter, you need to set your outstanding IOs per disk to something greater than 10 or so in order to see the speed advantage over HDD. Also, make sure to use the Windows-based version of IOmeter. The Linux versions are not capable of issuing deep queues of outstanding IO requests.
- Which version of IOmeter do you use? Newer versions use a more realistic form of "random" IO that contains real-world "patterns" that CacheCade may be able to interpret.
- Finally, how big an extent are you testing? If (in your 4KB, 50% seq, 70% read test) you used too large an extent, or you used a 'raw' partition, then your random operations are spread across the entire LBA range -- this never happens in real life and would be very difficult for CacheCade to "optimize" around. On the other hand, if you used too small an extent, then it may be that what you were testing was held entirely in the DRAM cache on the RAID card. If so, then you would expect to see better performance from the "cache-less" scenario, because again, DRAM is faster than Flash.
- Were you using SLC or MLC Flash? MLC Flash is horribly slow on writes when compared to SLC and sometimes even slower than HDD!
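The outstanding-IOs point above can be seen even in a toy experiment. This Python sketch (file name and sizes arbitrary, and the speedup may be muted when the file is fully page-cached rather than on an SSD) issues 4KB random reads at different levels of concurrency:

```python
import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

def random_read_iops(path, workers, reads=2000, block=4096):
    """Issue `reads` 4KB reads at random offsets with up to `workers`
    requests in flight; returns IOPS. A toy illustration of queue depth."""
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    def one_read(_):
        off = random.randrange(0, size - block)
        os.pread(fd, block, off)  # positional read, safe across threads
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(one_read, range(reads)))
    os.close(fd)
    return reads / (time.perf_counter() - start)

# Build a small test file, then compare queue depth 1 vs 16.
with open("qd.bin", "wb") as f:
    f.write(os.urandom(8 * 1024 * 1024))
print("QD1 :", round(random_read_iops("qd.bin", 1)))
print("QD16:", round(random_read_iops("qd.bin", 16)))
os.remove("qd.bin")
```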
One last thing to point out here is that all Flash-based storage suffers from asymmetrical performance -- that is to say that writes are 100x - 1000x slower than reads. Many workloads, such as OLTP, are bottlenecked by the speed of write operations.
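The impact of that read/write asymmetry on a mixed workload can be quantified: aggregate IOPS is a weighted harmonic mean of the read and write rates, so a slow write path drags everything down. The device numbers below are invented for illustration:

```python
def mixed_iops(read_iops, write_iops, read_fraction=0.7):
    """Aggregate IOPS for a read/write mix: average time per IO is the
    weighted average of per-op service times, so the result is a
    weighted harmonic mean of the two rates."""
    time_per_io = read_fraction / read_iops + (1 - read_fraction) / write_iops
    return 1 / time_per_io

# Hypothetical Flash device: fast reads, much slower writes.
# Even at only 30% writes, the write path dominates the result.
print(round(mixed_iops(20000, 500)))
```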
Hope this helps...
Thank you for the analysis.
As of 2012 there is a new SSD feature key for Cache 2.0, which should do read and write caching at once.
Does it deliver better results than Cache 1.0?
Is there any suggestion on how many SSDs, and of what size, should be used?
What happens in case of a power outage without a BBU?
What happens if there is an SSD error?
Should the SSDs be in a mirror configuration? Is there a configuration option for mirrored SSDs on the RAID controllers?
For a low-cost environment, is it OK to use an Intel 520 SSD with 120 GB? It should deliver about 550 MB/s on a 6 Gb/s SATA connector.
Thank you for a fast answer.
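That 550 MB/s figure is plausible on a SATA 6 Gb/s link: 8b/10b encoding carries 8 payload bits per 10 line bits, which puts the raw ceiling near 600 MB/s before protocol overhead. A quick check:

```python
def sata_payload_mb_s(raw_gbit_s=6.0):
    """Usable bandwidth of a SATA link: 8b/10b encoding means 8 payload
    bits per 10 line bits, so a 6 Gb/s link tops out near 600 MB/s
    (before SATA protocol overhead)."""
    return raw_gbit_s * 1e9 * 8 / 10 / 8 / 1e6

print(sata_payload_mb_s(), "MB/s ceiling on SATA 6 Gb/s")
```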
We purchased the SSD key for our RS2BL040 controllers too. We purchased the first version, not realizing there is an update. INTEL WILL EXCHANGE IT FOR YOU VIA RMA!
We are running two of the new Intel 311 24GB SSDs in a RAID 1 as cache, and six 1TB SAS drives in a RAID 10.
However, we plan to add a similar setup to some larger-capacity SAS servers we have too, and will most likely use the Intel 700 series SSD in a RAID 1 for their cache.
They are most beneficial if the DB read requests can hit the SSD, and that requires an SSD array with enough capacity to cache the entire set of data. A smaller SSD array will be less effective, of course.
I do have one question, though: should the SSD array I/O policy be set to Cached or Direct? I have it set to Cached, with the other policies set to Write Through, BGI and disk cache disabled, and Normal for read ahead.