IT Peer Network

3 Posts authored by: ChristianBlack

24 months of Intel SSDs…. What we’ve learned about MLC in the enterprise…

 

The Enterprise Integration Center (EIC) private cloud lab (a joint Intel IT and Intel Architecture Group program) has been working with Intel SSDs (solid state disks) for the last two years in a number of configurations, ranging from individual boot/swap volumes for servers to ultra-performance, software-based iSCSI mini-SANs. So, what have we learned about performance, tuning, and use cases?

 

There are plenty of industry resources and comparisons available at any number of trusted review sites, but most of these revolve around client usage and not server/datacenter uses. From my contact with industry, most engineers seem to think that using an SSD in the datacenter requires an SLC NAND device (Single-Level Cell - Intel X25-E product) due to endurance requirements. For those new to NAND characteristics, endurance (usable lifetime) is determined by writes to the NAND device, as block-erase cycles stress the flash cells and degrade their ability to be read back. Basically, SLC devices last through more block-erase cycles than their less expensive and larger capacity MLC cousins (Multi-Level Cell - Intel X25-M product). The assumption that ‘only SLC will do’ for the enterprise raises the $/GB cost flag and mires discussion. Endurance is the number one “but those won’t work for my use-case” argument.

 

The EIC cloud lab has some good news here: lower cost MLC or consumer grade devices can do just as well, especially in RAID arrays. To get the best out of these MLC devices, though, we have to employ a few techniques that allow the drive and its components to function more efficiently. These techniques manipulate the three vectors in MLC (space, speed, and endurance) by altering the usable size of the disk.

 

Assume I have a 160 GB X25-M MLC drive; this device is spec’ed at 250MB/s read and 100MB/s write (sequential) and has a lifetime of around 4-5 years in a ‘consumer’ use case (laptop/desktop). If I were to use this same device as a repository for a database transaction log (lots of writes), the lifetime would shorten significantly (maybe to as little as a year). There are specific formulas to determine endurance & speed, some of which are unavailable to the public, but Principal Engineer Tony Roug wraps up the case for MLC in the enterprise quite well in this presentation from Fall 2010 Storage and Networking World.
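For a feel of why write-heavy workloads eat lifetime so quickly, here is a rough back-of-the-envelope sketch. It is not one of the Intel formulas mentioned above; it is the generic approximation that a drive's total write budget is its capacity times its rated program/erase cycles, divided by what the workload actually writes to NAND. The function name and every input (P/E cycles, daily writes, write amplification) are placeholders for illustration, not Intel specifications.

```python
def estimated_lifetime_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                             write_amplification=1.0):
    """Rough lifetime estimate: NAND write budget / NAND writes per year.

    All arguments are assumptions supplied by the caller:
      capacity_gb            raw NAND capacity of the drive
      pe_cycles              rated program/erase cycles per cell
      host_writes_gb_per_day what the application writes per day
      write_amplification    extra internal writes per host write (>= 1.0)
    """
    if host_writes_gb_per_day <= 0 or write_amplification <= 0:
        raise ValueError("writes per day and write amplification must be positive")
    write_budget_gb = capacity_gb * pe_cycles
    nand_writes_per_year_gb = host_writes_gb_per_day * write_amplification * 365
    return write_budget_gb / nand_writes_per_year_gb
```

The useful takeaway is the shape of the relationship rather than any single number: under this model a workload that writes five times as much per day gets one fifth of the lifetime, which is in line with the jump from 4-5 years on a laptop to roughly a year under a transaction log.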

 

Back to the trade-offs (space, speed, and endurance): my 160GB MLC drive won’t work for my database transaction log because the workload is too write intensive. What I can do about this is take the 160GB drive and modify it to use only 75% (120GB) of the available capacity. Reducing the ‘user’ available space gives the wear-leveling algorithm in the drive more working room and increases both the speed (write speed, as reads are unaffected by this) and the endurance of the drive, but also increases the $/GB as you have less available space.

 

With the ‘user’ space reduced to 120GB (over-provisioned is the official term), that same 160GB drive is now capable of 250MB/s read and 125MB/s write (sequential) and has a lifetime of 8-10 years in the ‘consumer’ use case. Not terribly appealing to the average end-user who just spent $350 on an SSD, as they lose 25% of the capacity, but in the performance and enterprise space this is huge. Once modified, my ‘consumer grade’ MLC drive gets roughly 75-80% of the speed & endurance of the X25-E SLC drive with 4x the space at about the same ‘unit cost’ per drive. Since the drive is 4x larger than SLC, will likely last as long as a standard hard disk once over-provisioned, has great throughput at 125-250MB/s, and can reach 100-400x the IO operations of a standard hard drive, we can now begin the discussion around which particular enterprise applications benefit from Intel MLC SSDs.
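To put numbers on the space side of the trade-off, here is a small sketch of the arithmetic using the figures from this post (160GB drive, 25% held back, $350 price). The function names are mine, and the sector count assumes decimal gigabytes and 512-byte sectors; a real LBA tool would work from the maximum address the drive itself reports.

```python
def over_provision(raw_gb, reserve_fraction):
    """User-visible capacity after holding back reserve_fraction of the drive."""
    if not 0 <= reserve_fraction < 1:
        raise ValueError("reserve_fraction must be in [0, 1)")
    return raw_gb * (1 - reserve_fraction)


def cost_per_gb(price, usable_gb):
    """Dollars per usable gigabyte."""
    return price / usable_gb


def approx_sector_count(usable_gb, sector_bytes=512):
    """Approximate number of sectors to expose to the host.

    Assumes decimal gigabytes (10**9 bytes) and 512-byte sectors;
    both are illustration-only assumptions.
    """
    return int(usable_gb * 10**9) // sector_bytes


usable = over_provision(160, 0.25)       # 120.0 GB left for the user
before = cost_per_gb(350, 160)           # $/GB at full capacity
after = cost_per_gb(350, usable)         # $/GB once over-provisioned
sectors = approx_sector_count(usable)    # value a capacity-limiting tool would need
```

With these inputs the price per usable gigabyte goes from about $2.19 to about $2.92, which is the cost you pay for the extra write speed and endurance.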

 

For the enterprise, once we overcome the endurance hurdle, the value discussion can begin. For the performance enthusiast at home, this same technique allows a boost in disk write throughput, higher benchmark scores, and of course more FPS (frames per second) in whatever game they are thoroughly stressing their over-clocked water-cooled super-system with at the moment.

 

BKMs (Best Known Methods) for enterprise and use-case evaluation… AKA: The technical bits…

 

  • Get to know the IO characterization (reads/writes) of the target application & use case
  • Baseline the application before any SSD upgrades with standard disks, collecting throughput and utilization metrics
  • Knock a maximum of 25% off the top of any MLC drive you’re using in the datacenter
    • More than 25% has diminishing value
    • Use either an LBA tool, RAID controller, or partitioning tool after a fresh low level format
    • That % can be smaller based on the write intensity of the target application - fewer writes = less % off the top on a case by case basis
  • SAS/SATA RAID controller settings
    • Activate on-drive cache – OK to do in SSD
    • Stripe size of 256k if possible to match block-erase cycle of drive
    • Read/write on-controller DRAM cache should be on and battery backed
  • Make sure any drive to controller channel relationship in SAS controllers stays at 1:1
    • Avoids reducing drive speed from 3.0 Gbps to 1.5 Gbps
  • Avoid using SATA drives behind SAS expanders
    • Again… avoids reducing drive speed from 3.0 Gbps to 1.5 Gbps
  • SSDs are 5V devices; make sure the 5V rail in the power supplies has a high enough rating to handle the power-on of X number of SSDs
    • Only necessary if you’re putting 16+ drives in any particular chassis
  • Baseline the application after SSD upgrade to determine performance increase collecting throughput and utilization metrics
    • Look for higher IOPS and application throughput but also be looking for higher CPU utilization numbers now that you have eliminated the disk bottleneck from your system
    • There will likely be a new bottleneck in other components such as network, memory, etc… look for that as a target for your next improvement
  • Last but not least, when testing an application you’ll need to ‘season’ your SSDs for a while before you see completely consistent results
    • For benchmarks, fill the drive completely twice and then run the target test 2-3 times before taking final measurements
    • For applications, run the app for a few days to a week before taking final performance measurements
    • Remember, a freshly low level formatted SSD doesn’t have to perform a block-erase cycle before writing to disk
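
The before/after baselining and the ‘seasoning’ check in the list above can be sketched in a few lines. This is only an illustration: the metric names, the three-run window, and the 5% tolerance are my assumptions, not part of the BKMs, so tune them to your own application.

```python
def percent_change(before, after):
    """Percent change from the baseline value to the post-upgrade value."""
    if before == 0:
        raise ValueError("baseline value must be non-zero")
    return (after - before) / before * 100.0


def compare_baselines(before, after):
    """Percent change for every metric that appears in both runs."""
    return {name: percent_change(before[name], after[name])
            for name in before if name in after}


def is_seasoned(results, window=3, tolerance=0.05):
    """True once the last `window` runs agree to within `tolerance` of their mean.

    `results` is a list of one metric (for example write MB/s) from
    successive runs of the same test, oldest first.
    """
    if len(results) < window:
        return False
    recent = results[-window:]
    mean = sum(recent) / window
    return (max(recent) - min(recent)) <= tolerance * mean
```

Feed compare_baselines two dictionaries of the same metrics (IOPS, throughput, CPU utilization) collected before and after the upgrade, and only trust the ‘after’ numbers once is_seasoned reports that consecutive runs have stopped drifting.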

 

Well, that’s it in a fairly large nutshell… We see using MLC disks in enterprise use cases as something that is growing now that the underlying techniques for increasing endurance are better understood. In addition, as Intel’s product lines and individual device capacities expand… so can enterprise use cases of these amazing solid-state disks. The question left to answer is, “In your datacenter, are there applications and end-users you can accelerate using lower cost MLC based Intel SSDs?”

 

- Chris

Cloud Computing & the Psychology of Mine

Legacy Thinking in the Evolving Datacenter

The 1957 Warner Brothers* cartoon “Ali Baba Bunny” shows a scene where an elated Daffy Duck bounds about a pile of riches and gold in Ali Baba’s cave exclaiming, “Mine, Mine.. It’s all Mine!” Daffy Duck, cartoons, Ali Baba… what do these have to do with the evolving datacenter and cloud computing?

The answer to this question is ‘everything’! Albeit exaggerated, Daffy’s exclamation is not far from the thinking of the typical application owner in today’s datacenter. The operating system (OS), application, servers, network connections, support, and perhaps racks are all the stovepipe property of the application owner. “Mine, Mine… It’s all Mine!” Most IT workloads get a singularly purposed stack of servers, 50-70% over-provisioned for peak load and conservatively sized at 2-4x capacity for growth over time. The result of this practice is an entire datacenter running at 10-15% utilization in case of unforeseen load spikes or faster than expected application adoption. Given that a server consumes 65% of its power budget when running at 0% utilization, the problem of waste is self-evident.
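
To see how much waste that is, here is a rough sketch. It takes the 65% idle figure above and assumes, purely for illustration, that power rises linearly from idle to full load; the function names, the ten-server fleet, and the 50% consolidation target are hypothetical.

```python
import math

IDLE_FRACTION = 0.65  # share of the power budget drawn at 0% utilization


def power_fraction(utilization, idle=IDLE_FRACTION):
    """Fraction of a server's power budget drawn at a given utilization.

    Assumes a straight line from `idle` at 0% to 1.0 at 100% load.
    """
    if not 0 <= utilization <= 1:
        raise ValueError("utilization must be between 0 and 1")
    return idle + (1 - idle) * utilization


def fleet_power(servers, utilization):
    """Total power, in units of one server's full power budget."""
    return servers * power_fraction(utilization)


def servers_needed(servers, utilization, target_utilization):
    """Servers required to carry the same work at a higher utilization."""
    total_work = servers * utilization
    return math.ceil(round(total_work / target_utilization, 9))


before = fleet_power(10, 0.10)              # ten servers idling along at 10%
consolidated = servers_needed(10, 0.10, 0.50)
after = fleet_power(consolidated, 0.50)     # same work on fewer, busier servers
```

Under these assumptions ten servers at 10% utilization draw 6.85 server-budgets of power, while the same work on two servers at 50% draws 1.65.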

Enter server virtualization, the modern Hypervisor or VMM, and the eventual ubiquity of cloud computing. Although variations in features exist between VMware*, Microsoft*, Xen*, and other flavors of virtualization, all achieve abstraction of the guest OS and application stack from the underlying hardware and workload portability.

This workload portability, combined with abysmal utilization rates, allows consolidation of multiple OS-App stacks into single physical servers, and the division of ever larger resources such as the 4-socket Intel Xeon 7500 series platform, which surpasses the compute capacity of mid-90s supercomputers. Virtualization is a tool that helps reclaim datacenter space, reduce costs, and simplify the provisioning and re-provisioning of OS-App stacks. However, much like a hammer, virtualization requires a functioning intelligence to wield and could result in more management overhead if one refuses to break the paradigm of ‘mine’...

A portion of this intelligence lies with the application owner. In the past, the application owner had to sequester dedicated resources and over-provision to ensure availability and accountability. Although this thinking is still true to a degree, current infrastructure is much more fungible than the static compute resources of 10 or even 5 years ago. The last eight months working on the Datacenter 2.0 project, a joint Intel IT and Intel Architecture Group (IAG) effort, brought this thinking to the forefront as every Proof of Concept (PoC) owner repeatedly asked for dedicated resources within the project’s experimental ‘mini-cloud’. Time and time again, end users asked for isolated and dedicated servers, network, and storage, demonstrating a fundamental distrust of the ability of cloud to meet their expectations. Interestingly, most of the PoC owners cited performance as the leading reason for their dedicated resource requests, yet were unable to articulate specific requirements such as network bandwidth consumption, memory usage, or disk IO operations.

The author initially shared this skepticism, as virtualization and ‘the cloud’ have some as-yet immature features. For broad adoption, the cloud compute model must demonstrate both the ability to secure & isolate workloads and the ability to actively respond to demands from all four resource vectors: compute, memory, disk i/o, and network i/o. Current solutions easily respond to memory and compute utilization; however, most hypervisors are blind to disk and network bottlenecks. In addition, current operating systems lack the mechanisms for on-the-fly increase or decrease in the number of CPUs and memory available to the OS. Once the active measurement, response, trend analysis, security, and OS flexibility issues are resolved, virtualization and cloud compute are poised to revolutionize the way IT deploys applications. However, this is the easy piece, as it is purely technical and one of inevitable technology maturation.
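
As a minimal sketch of what ‘responding to all four vectors’ means in practice, the check below flags any vector running hot instead of looking only at compute and memory. The vector names, the 80% threshold, and the idea of feeding it normalized utilization figures are my assumptions for illustration; this is not any hypervisor's API.

```python
VECTORS = ("compute", "memory", "disk_io", "network_io")


def constrained_vectors(utilization, threshold=0.80):
    """Return the resource vectors at or above `threshold` utilization.

    `utilization` maps each of the four vector names to a value between
    0 and 1, however the caller chooses to measure it.
    """
    missing = [v for v in VECTORS if v not in utilization]
    if missing:
        raise KeyError("missing measurements for: " + ", ".join(missing))
    return [v for v in VECTORS if utilization[v] >= threshold]
```

A workload that looks healthy on compute and memory can still come back with disk_io or network_io flagged, which is exactly the case most hypervisors miss today.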

The more difficult piece of this puzzle is the change in thinking and paradigm shift that the end users and application owners must make. This change in thinking happens when the question asked becomes, “is my application available” instead of, “is the server up?” and when application owners think in terms of meeting service level agreements and application response time requirements instead of application uptime. After much testing and demonstration, end users will eventually become comfortable with the idea that the cloud can adapt to the needs of their workload regardless of the demand vector.

Although not a panacea, cloud computing promises flexibility, efficiency, demand-based resourcing, and an ability to observe and manage the resources consumed by application workloads like never before. As this compute model matures, our responsibility as engineers and architects is to foster credibility, deploy reliable solutions, and push the industry to mature those underdeveloped security and demand-vector response features.

Christian D. Black, MCSE

Technologist/Systems Engineer

Intel IT – Strategy, Architecture, & Innovation

Hey, I started this in a discussion forum and thought it was better off in a blog. Please comment!

 

Original Post:  SSD: Throw out your hard disks!

 

SSD: Throw out your hard disks!

 

Wait, not all of them just yet! I'm officially jumping on the hype bandwagon as I've been exercising and testing Intel SSDs (read long hours here) for the last three months. The comments from several online hardware reviewers are flattering, but they don't tell the whole story as they focus on single disk client machines. BTW, my Vista Ultimate Intel quad-core takes 20 seconds from POST complete to login with an Intel SSD. But I'm a hardware guy at heart, and I need to know what implications these devices have for the datacenter and my apps. So we took the SSDs and dumped them into a number of differing servers and controllers to really work them and find out what breaks and where. The results? Fantastic. RAIDed SSDs beat out their 15k spinning cousins in sequential reads and almost best them in sequential writes. Ho hum, you say? This is where the paradigm shifts; the more interesting story is around random I/O. Somewhere between 6x and 12x the performance of traditional disk is where these cool operators land, depending on block size and queue depth of course. These are massive throughput gains and latency reductions for a SATA device attached to a SAS controller, which downgrades the SATA bus speed to 1.5 Gbps and imposes the overhead of SATA Tunneling Protocol. Not bad for a hamstrung Olympian, right?

 

 

 

Backing away from the face melting speed of the SSD's random I/O, what does this mean for computing? Yes fast, yes low latency, yes.... We've been designing applications, firmware, and everything we know about I/O to get around the random stuff ever since your first ST-506. We cache it, lay it down in stripes, defrag it religiously, anything to make it more sequential than random. What if that doesn't matter anymore? What if we don't have to engineer and program around the I/O bottleneck? Time to pull out the random I/O paradigm and watch it crumble. The great part about this shift is that it starts today and only gets better, with initial SATA and then native SAS devices at 3 Gbps. Today NAND is limited by write cycles, needs some time to charge cells (more time in a multi-level cell (MLC) device), and has block erase cycles and write amplification to contend with. 3-5 years from now... BOOM - phase change memory replaces traditional NAND for more endurance (+100x), faster read/write (ns vs. today's µs), and single bit alterability. Passengers, please sit back and place an extra fan on your Northbridge or array controller; we are entering a time when your I/O doesn't lag behind your CPU by a factor of 100x. What can this change, what can it improve, and what will this do for your business?
