Does anyone at Intel have information on the slow RAID5 performance that some of us are seeing with our new Sandy Bridge desktops? I've been reading the forums here at communities.intel.com and it seems I'm not the only one having performance problems.
Running Win7 Enterprise w/SP1, i7-2600k, 16GB RAM on an Asus P8H67-M Evo motherboard. I used four WD Green 2TB drives to set up first a 100GB partition for the OS, then the remainder of 5489GB for data. I accepted the 64k default strip size for the OS partition, but was forced to use 128k for the data partition. After building the machine and installing Win7 and updates, it took 20 hours to load 920GB at an average speed of 12-15MB/sec. Any ideas?
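As a quick sanity check on those numbers, here is a minimal Python sketch of the arithmetic. It assumes decimal units (1 GB = 1000 MB); the function name is just for illustration.

```python
def average_rate_mb_s(gigabytes: float, hours: float) -> float:
    """Average transfer rate in MB/s, assuming 1 GB = 1000 MB."""
    seconds = hours * 3600
    return gigabytes * 1000 / seconds


# 920 GB copied in 20 hours, as reported above
rate = average_rate_mb_s(920, 20)
print(f"{rate:.1f} MB/s")  # prints "12.8 MB/s"
```

That lands at the low end of the 12-15MB/sec range quoted, so the 20-hour figure and the reported speed are consistent with each other.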
Are you using WD20EADS or WD20EARS drives? The EARS are the infamous "advanced format" ones with 4K sector size. If you didn't align the partitions correctly, the performance ends up being disastrous. Generally speaking, you shouldn't use those "green" drives in RAID sets.
Was the RAID set fully synchronized when you started the copy? Otherwise the rebuild may have slowed you down.
Yes, all four of these drives are WD20EARS models with the 4k sector size (and no jumper). All four show up in the IRST management program (version 10.1.0.1008) as model WDC WD20EARS-00S8B1 with Firmware: 80.00A80. I'm not sure what you mean by "fully synchronized". The RAID array had been defined in the BIOS and two volumes created: a 100GB one for the OS and the remaining 5489GB for data. Then the Win7 OS was installed on the first 100GB partition (Disk0 in Disk Management). After the installation of the OS, Computer "Disk Management" showed DISK0 for the OS. The status for DISK1 (the 5489GB) was "Healthy (Primary Partition)" before the copy began.
Please check out "Intel Details the Source of the P67/H67 Chipset SATA Bug" at Softpedia.
"Intel has recently informed the press that the issue which forced the company to halt production of its P67 and H67 chipsets is due to a transistor in the 3Gbps PLL clocking tree that was mistakenly over volted when the PCH was designed."
Apparently this only affects ports 2 and 5.
I am not technically up on such things, but I was checking out the chipset for a Dell that is supplied with it when I found the above article and your posts, and thought they might be related.
Hope this helps
I gave up trying to get acceptable RAID5 performance out of my Asus P8H67-M Evo motherboard with an i7-2600k CPU. I ran out of time to play with it. I don't think your comments about only ports 2 and 5 being affected pertain to my motherboard. I waited for Intel to give the manufacturers the B3 version of the H67 chipset before I purchased my hardware. I attribute my bad RAID5 performance (~13MB/sec write speed) to two separate problems. First was my decision to place the OS on the same physical drives as the data. You can't get much I/O done when the disk heads are constantly being called away to service OS reads/writes. The second reason relates to my hard drive decision. I purchased four Western Digital Green 2TB WD20EARS drives. These drives are known to use variable speeds between 5400-5900 RPM. Now that I look back, how could I expect the four drives to spin synchronously when they are each "doing their own thing"? I am a photographer and I am looking for ways to build redundancy into my workflow. Maybe in the future I'll buy a pair of expensive enterprise-level drives and run them as RAID 1. I also do not wish to pay $100-$500 for a separate RAID controller. I paid for RAID on the motherboard and I wanted to use it. I did spend $200 for a Crucial C300 128GB SSD just for the OS. THAT made a lot of difference. For the moment, I'm not using any kind of RAID. Just the SSD for the OS, and a 2TB WD for the data drive.
There is indeed something funky going on with RAID5 on Intel Rapid Storage. See my post here: http://communities.intel.com/message/124880#124880
In short, I have two RAID volumes in the same array (due to the 2TB limit on the boot volume) with similar settings, and I get good performance on one and poor performance on the other. Go figure.
However, the advanced format (4k sector) drives are even worse in this case. (For this reason I chose 512-byte sector drives.)
The issue with them is not with Intel but due to alignment issues when placing your partitions. They can cause issues in single drive situations too!
Check this link to read a bit about it: http://consumer.media.seagate.com/2010/03/the-digital-den/4k-sector-hard-drive-primer/
All manufacturers will switch over to 4k sectors (or larger) in the future, but OSes and storage controllers need to catch up!
With a Raid implementation you have to match the following alignment criteria for good performance:
To get the partition offset: Start->Run->MSinfo32. In MSinfo32: Components->Storage->Partition Starting Offset. (This is in bytes, so divide it by 1024 to get it in KB.)
For spreadsheets and some online alignment calculators: http://www.techpowerup.com/forums/showthread.php?t=107126
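If you would rather not use a spreadsheet, the offset check can be sketched in a few lines of Python. This is only a rough illustration: the function name and the default strip and sector sizes are assumptions picked for the example, and the two offsets are typical values (1 MiB and the older 63-sector layout), not numbers read from anyone's machine in this thread.

```python
def check_alignment(offset_bytes: int, strip_kb: int = 64, sector_bytes: int = 4096) -> dict:
    """Report whether a partition starting offset (in bytes, as shown by
    MSinfo32) falls on a physical-sector boundary and a strip boundary."""
    return {
        "offset_kb": offset_bytes / 1024,
        "sector_aligned": offset_bytes % sector_bytes == 0,
        "strip_aligned": offset_bytes % (strip_kb * 1024) == 0,
    }


# Example offsets (assumed, for illustration only)
print(check_alignment(1_048_576))  # 1 MiB offset: aligned to both 4K and 64K
print(check_alignment(32_256))     # 63 sectors x 512 bytes: aligned to neither
```

Note that this only checks the partition offset inside the volume. It says nothing about where the RAID volume itself starts on the physical disks.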
How you place your partitions has nothing to do with Intel Rapid Storage. Win 7 automatically aligns its GPT partitions, but not the system NTFS one. (Hopefully we can soon boot from a large 2TB+ volume containing a GPT partition.)
The only open question I see with 4k sector drives is whether Intel aligns the volumes on the actual disks when creating them.
http://www.intel.com/support/chipsets/imsm/sb/CS-031502.htm is extremely vague and does not state in what way and configurations they support them.
I have an H67-B3 (MSI) board with the 2600K, and I'm running a 5-disk RAID 5 with 2TB Hitachi Deskstars (which have the 512e advanced format sectors). It is configured as a 1TB volume and a 6.3TB volume (both RAID 5), and I have typical read performance > 480 MB/sec and typical write performance > 350 MB/sec (N.B. writes are SLOW with write caching disabled). I copied 100 GB from the first volume to the second volume and the average transfer rate was about 110 MB/sec (note that this is reading and writing on the same array).
I have the stripe size and NTFS cluster size both at 64k, and the volumes and partitions are properly aligned to a 4k boundary (at the physical disk level). You can boot a live Linux CD and use mdadm to examine the volume alignment. BTW, I didn't do anything special to align the array. In fact, I migrated it over from an old Core 2 board with the G45 chipset (still ICH10R) in a 4-disk RAID 10 configuration, added the 5th drive, and converted it to RAID 5. It's actually faster all around (and twice as big) than the 4-disk RAID 10.
It is important to note that it is generally not good to run a RAID 5 with an even number of disks. Use 3 or 5. For example, in my configuration, a 64KB stripe consists of four 16KB data blocks (four 4K sectors each) and one 16KB parity block. This aligns nicely across 5 disks. In a four-disk configuration, the 64K data is divided over 3 drives, and the parity block will be (I think) 64K / 3, so basically you have 21 1/3 KB "chunks" per drive. I actually have no idea how the RAID ROM manages this (maybe it rounds to the nearest sector, or maybe it "packs" partial sectors together?), but I imagine whatever it does leads to extremely poor performance, as any read will surely result in an unaligned access on one or more drives. This is pretty much true for any kind of RAID controller (h/w, s/w or "fake").
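To make that arithmetic concrete, here is a small Python sketch. It rests on the assumption made above, namely that the 64KB figure is the total data in one stripe and gets divided across the data disks. If the setting is really a per-disk strip size (which is how the "strip size" wording earlier in the thread is usually read), each disk would hold a full 64KB strip regardless of disk count and this division would not apply. The function name is made up for the example.

```python
from fractions import Fraction


def chunk_per_data_disk(stripe_kb: int, total_disks: int) -> Fraction:
    """KB of data per disk if stripe_kb of data is split evenly across
    the data disks of a RAID 5 set (assumed model, see note above)."""
    data_disks = total_disks - 1  # one disk's share of each stripe holds parity
    return Fraction(stripe_kb, data_disks)


for disks in (3, 4, 5):
    chunk = chunk_per_data_disk(64, disks)
    whole_4k_sectors = (chunk / 4).denominator == 1
    print(f"{disks} disks: {float(chunk):.2f} KB per disk, "
          f"whole number of 4K sectors: {whole_4k_sectors}")
```

Under that model, 3 and 5 disks give 32KB and 16KB chunks, while 4 disks gives the awkward 21 1/3 KB figure.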
But seriously, I have a 256GB SATA III SSD (Samsung 830) that I just installed and moved my OS over to, and while it smokes the RAID in access latency (I don't ever even see the "starting windows" screen during boot), the transfer rates are only about 15% better. Add another disk and grow your RAID, it'll probably start screaming!
Edit: Also, I used the 64KB NTFS cluster size to match the stripe size for a very particular reason. Note that any write to the RAID 5 array that is smaller than the stripe size will force the controller to read the whole stripe, modify it, calculate new parity, and write that all back to disk. I don't know how Windows is issuing the actual writes to the drive, or if the Intel controller is smart enough to "know" that a full stripe write does not require a read-modify-write cycle, but if you do use the default 4k cluster and have lots of small writes, it will definitely not have optimal performance.
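For anyone wondering why small writes cost more, here is a toy Python sketch of RAID 5 XOR parity. It is not how the Intel controller is implemented (I don't know that); it just shows one common way a partial write can be handled, where the old chunk and old parity have to be read back before the new parity can be computed. Chunk sizes and values are made up.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Byte-wise XOR of equally sized blocks (RAID 5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Toy stripe: four 8-byte data chunks plus one parity chunk
data = [bytes([value]) * 8 for value in (1, 2, 3, 4)]
parity = xor_blocks(*data)

# Full-stripe write: all new data is in hand, so parity needs no disk reads
new_stripe = [bytes([value]) * 8 for value in (5, 6, 7, 8)]
full_stripe_parity = xor_blocks(*new_stripe)

# Partial write to chunk 2: read old chunk and old parity first,
# then new parity = old parity XOR old chunk XOR new chunk
new_chunk = bytes([9]) * 8
rmw_parity = xor_blocks(parity, data[2], new_chunk)
data[2] = new_chunk

# The shortcut matches parity recomputed from the whole updated stripe
assert rmw_parity == xor_blocks(*data)
```

In this sketch the full-stripe write needs zero reads, while the single-chunk update needs two reads and two writes, which is the extra work that matching the cluster size to the stripe size is meant to avoid.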