The Data Stack

2 Posts authored by: Tony

So what has Intel been doing with NAND-based Solid State Disks (SSDs) since my blog on our next generation broadband video streaming demo (http://communities.intel.com/openport/blogs/server/2007/11/14/)? Two things: 1) we're close to launching Intel's SATA-based SSD products, and 2) we've been engaging you to get more detail on your usage models and value propositions. In the last few months there have been a number of announcements for SSDs in server and enterprise storage applications (e.g. EMC: http://www.emc.com/about/news/press/us/2008/011408-1.htm), including a number of small startups offering solutions targeted at server deployments. Based on my discussions with you and on what's going on in the industry, here's my view of the value of SSDs in servers and how that maps to server usage models.

 

As a person who typically focuses on end-users, I find SSDs interesting because they weren't designed to solve a specific end-user server problem. As I said in my previous blog "because we could", SSDs largely exist "because they can". They are what Clayton Christensen would call a disruptive technology. As SSDs are considered for server-based applications, I look at how SSDs as a technology can provide greater value when replacing server hard disk drives (HDDs) or server memory, and then build possible usage models from there.

 

When comparing SSDs to HDDs in servers, I start with the following:

 

Performance: SSDs can have much better random access performance, as measured by higher IOPS, higher throughput, and lower read/write latency. SSDs typically achieve at least 10 times the IOPS of HDDs, at least 2-3 times the random access read rate, and roughly one tenth the read and write latency. For random access performance, most SSDs blow away even the highest performing 15K RPM hard drives.

 

Power: SSDs use less power, especially when compared to a disk that is active (i.e. spinning). Given that for most server-based applications the hard disk is always active, this is especially significant. My general observation is that SSDs typically use less than 1/5th the power of an active HDD. Here they look to be a key technology for making data centers more power efficient.

 

Cost: When comparing cost per gigabyte, SSDs are higher priced. Given this, SSDs today are largely being considered for applications where storage IO is the bottleneck, where many hard drives can be replaced with just a few SSDs.
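
To make those three vectors concrete, here is a minimal Python sketch of the sizing arithmetic: how many drives it takes to hit a random-IOPS target, and what that implies for power and cost. Every per-drive figure below is an illustrative assumption chosen only to reflect the rough ratios discussed above (roughly 10x the IOPS, about 1/5th the active power, higher price per drive), not an Intel specification or a measurement.

```python
# Back-of-envelope sizing: drives needed to hit a random-IOPS target, and
# the resulting power and cost. All per-drive numbers are illustrative
# assumptions, not measured values or Intel specifications.

def drives_needed(target_iops, iops_per_drive):
    """Smallest whole number of drives that meets the IOPS target."""
    return -(-target_iops // iops_per_drive)  # ceiling division

def size_array(name, target_iops, iops_per_drive, watts_per_drive, cost_per_drive):
    n = drives_needed(target_iops, iops_per_drive)
    print(f"{name:>8}: {n:3d} drives, {n * watts_per_drive:5.0f} W, ${n * cost_per_drive:,.0f}")

TARGET_IOPS = 50_000  # hypothetical random-read requirement of an IO-bound application

# Assumed figures: a few hundred random IOPS for a 15K RPM HDD, at least
# 10x that for an SSD, and roughly 1/5th the active power per drive.
size_array("15K HDD", TARGET_IOPS, iops_per_drive=300,   watts_per_drive=15, cost_per_drive=400)
size_array("SSD",     TARGET_IOPS, iops_per_drive=5_000, watts_per_drive=3,  cost_per_drive=1_000)
```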

 

SSDs can be compared to DDR memory with the same three value vectors:

 

Performance: Unlike the SSD-to-HDD comparison, here memory has higher throughput and lower latency than an SSD. When comparing SSDs to memory for server usages, the primary consideration looks to be latency. SSD reads and writes are on the order of hundreds of microseconds, while memory reads and writes are typically less than 100 nanoseconds. Even so, for some applications (e.g. video on demand streaming) hundreds of microseconds of latency looks to be acceptable.
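
To put the latency gap in context, here is a small sketch that compares the orders of magnitude above against a hypothetical streaming read budget. The latencies are the rough figures quoted in this post; the 250 ms client buffer is an assumption for illustration only.

```python
# Rough latency comparison for a single read, using the orders of magnitude
# discussed above (not measurements), and a hypothetical client-side buffer
# a video streamer must refill in time.

DRAM_READ_S = 100e-9   # ~100 nanoseconds
SSD_READ_S  = 200e-6   # a few hundred microseconds
HDD_READ_S  = 10e-3    # seek plus rotational latency, around 10 milliseconds

BUFFER_S = 0.250       # assumed client buffer for a 3.75 Mb/s stream

for name, latency in [("DRAM", DRAM_READ_S), ("SSD", SSD_READ_S), ("HDD", HDD_READ_S)]:
    headroom = BUFFER_S / latency
    print(f"{name:4s}: {latency * 1e6:9.1f} us per read, "
          f"{headroom:12,.0f}x headroom against a {BUFFER_S * 1000:.0f} ms buffer")
```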

Power: As in the HDD comparison, when looking at active power usage SSDs draw much less power than DDR memory, as measured by watts per gigabyte. How much less depends on how the application uses memory, but generally SSDs look to consume about 1/10th of the power.

Cost: Unlike the HDD comparison, when comparing cost per gigabyte SSDs are significantly lower priced than DDR memory. Generally, I start with NAND-based SSDs as being half the price of DDR-based memory. Depending on the size of the SSD and the technology (Single Level Cell (SLC) or Multi Level Cell (MLC)), the difference can be much more.
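
As a concrete illustration of the memory comparison, here is a sketch of the cost and power arithmetic for holding a working set on SSD instead of in DRAM. The absolute $/GB and W/GB figures are placeholders; only the ratios (about half the price, about 1/10th the power) follow the generalizations above.

```python
# Working-set cost/power comparison, SSD versus DDR DRAM. Absolute numbers
# are placeholders; the ratios (~1/2 the $/GB, ~1/10 the W/GB) follow the
# generalizations in the post.

WORKING_SET_GB = 256  # hypothetical size of an in-memory database or disk buffer

options = {
    "DDR DRAM": {"usd_per_gb": 100.0, "watts_per_gb": 1.0},   # assumed baseline
    "NAND SSD": {"usd_per_gb": 50.0,  "watts_per_gb": 0.1},   # ~1/2 price, ~1/10 power
}

for name, o in options.items():
    cost  = o["usd_per_gb"]   * WORKING_SET_GB
    watts = o["watts_per_gb"] * WORKING_SET_GB
    print(f"{name}: ${cost:10,.0f} and {watts:6.1f} W to hold {WORKING_SET_GB} GB")
```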

One final vector to look at is the reliability of SSDs compared to hard disk drives and memory. Going just by the published MTBF numbers, SSDs look to be better than HDDs and just as reliable as memory. One area that generates confusion is how the write cycle limitations of NAND technology affect the lifetime (as measured by MTBF) of SSDs in server applications. The details are a good subject for a future blog, but based on discussions with you, I haven't encountered a server application where the write cycle limitation is the deciding factor in a deployment for SLC SSDs (at least for how we expect Intel's SSDs to perform). For many server applications, it's not the deciding factor for MLC SSDs either.
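
Without getting into that future blog's detail, here is the usual write-endurance arithmetic as a sketch. The capacity, P/E cycle count, write amplification factor, and sustained write rate below are generic assumptions for illustration, not Intel drive specifications.

```python
# Endurance estimate: years of sustained writes before a NAND SSD exhausts
# its program/erase (P/E) cycles. All inputs are generic assumptions.

CAPACITY_GB         = 32        # hypothetical SLC drive capacity
PE_CYCLES           = 100_000   # order of magnitude commonly quoted for SLC NAND
WRITE_AMPLIFICATION = 2.0       # assumed controller overhead (wear leveling, etc.)
HOST_WRITE_MB_S     = 5.0       # assumed sustained application write rate

total_writable_gb = CAPACITY_GB * PE_CYCLES / WRITE_AMPLIFICATION
host_gb_per_year  = HOST_WRITE_MB_S * 3600 * 24 * 365 / 1024

# Under these assumptions the write-cycle limit is roughly a decade away.
print(f"Estimated endurance-limited lifetime: "
      f"{total_writable_gb / host_gb_per_year:.1f} years")
```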

Using these value vectors, here are my generalizations about the value of SSDs for enterprise and portal applications:

  • Use SSDs for the server boot device. When compared to HDDs, SSDs enable faster boot (typically 30% faster), consume less power, and are more reliable.

  • Use SSDs for high throughput, high IOPS, low latency application storage. If storage IO is the application bottleneck, replacing HDDs with SSDs shifts the bottleneck back to CPU utilization. Example applications include video streaming, search query, and OLTP.

  • Use SSDs for building a high performance storage tier. Many applications have hot and cold (or long tail) data. By creating a storage tier, the solution cost of a deployment can be reduced significantly. Example applications include using SSDs to improve performance in a NAS or SAN (e.g. what EMC calls Tier 0) or to create a high performance direct attached storage (DAS) solution (e.g. an SSD optimized server). A simple sketch of this kind of hot/cold placement follows the list below.

  • Consider SSDs as a lower cost alternative to placing application data in memory. Many applications create memory-based databases to achieve low latency access times. These applications create custom data structures, use RAM disks, or rely on caching through the OS (e.g. swap). For many IO-bound applications, memory is typically being used as a buffer for disk data. The lower latency and higher throughput of SSDs promise to require less memory for buffering while maintaining the application's quality of service objectives.
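
Here is the minimal sketch of a hot/cold placement policy referred to in the tiering bullet above. The tier size, the catalog, and the access counts are all invented for illustration; a production tiering layer would rank objects from real access telemetry.

```python
# Toy two-tier placement: keep the hottest objects (by accesses per GB) on
# a small SSD tier, spill the long tail to HDD. All data below is made up.

SSD_TIER_GB = 8.0  # assumed capacity of the fast tier

def place_objects(objects, ssd_budget_gb=SSD_TIER_GB):
    """objects: iterable of (name, size_gb, accesses_per_day) tuples.
    Returns a dict mapping each object name to "SSD" or "HDD"."""
    placement, used = {}, 0.0
    # Hottest first, ranked by access density so small hot objects win.
    for name, size_gb, accesses in sorted(objects, key=lambda o: o[2] / o[1], reverse=True):
        if used + size_gb <= ssd_budget_gb:
            placement[name], used = "SSD", used + size_gb
        else:
            placement[name] = "HDD"
    return placement

catalog = [
    ("new-release-1.mpg",  4.0, 5000),  # hot title
    ("new-release-2.mpg",  4.0, 3000),
    ("back-catalog-1.mpg", 4.0, 10),    # long-tail title
    ("back-catalog-2.mpg", 4.0, 3),
]

for name, tier in place_objects(catalog).items():
    print(f"{name:20s} -> {tier}")
```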

 

Bottom line for servers today: SSDs look to be cost effective for applications where storage IO throughput and low latency are key. They move the application bottleneck from IO back to CPU utilization. Get back to me on whether you agree and on what additional usage models you're finding.

I'm excited about our server room blogs as a way for us to get feedback from you quickly. I would love to get your comments on a technology concept demo we did over the last 6 months.

 

I have been looking at the Internet video phenomenon over the last year. One interesting usage model is making most of what we see on TV today available as video on demand (Wikipedia has a good description: http://en.wikipedia.org/wiki/Video_on_demand), either over the web (e.g. Google YouTube) or provided by a service provider via IPTV (e.g. AT&T Uverse). At Intel, we of course want to understand how we can optimize the on-demand video workload on Intel server technology.

On-demand video deployments today are engineered largely around three resources:

Server: typically a 2-socket rack mount server with dual-core processors per socket and 8GB of DRAM (the workload is writing new videos to disk, reading requested videos from disk, formatting the video packets, and transmitting video to clients)

WAN: using GE ports, some configurations pushing to exceed 10GE

Storage: as a JBOD, in the past SCSI, moving to SAS and SATA Hard Disk Drives (HDDs)

Understanding this, we challenged ourselves to create a next generation configuration using our leading technology.

Here's what we ended up with:

Server: Fit into a 2U form factor with an integrated JBOD (http://www.intel.com/design/servers/storage/ssr212mc2/)

WAN: Replace GE with the Intel Dual 10GE NIC, targeting 20Gbps of throughput (http://www.intel.com/network/connectivity/products/10GbE_XF_SR_server_adapter.htm)

Storage: Replace HDDs in the JBOD with prototypes of the Intel enterprise solid state disk drives (SSD)

 

We worked with Kasenna (http://www.kasenna.com/) to pull the technology together into a prototype demonstration. Actually, they did most of the work, being the experts in high throughput on-demand video streaming. In the test, Kasenna achieved about 16Gb/s of streaming throughput; in IPTV terms, approximately 4000 simultaneous standard definition (3.75Mb/s MPEG-2) streams. The demonstration largely focused on the HDD versus SSD engineering.
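
The stream count is simply achieved throughput divided by the per-stream bit rate; here is that arithmetic as a one-liner, using only the numbers quoted above.

```python
# Streams = achieved throughput / per-stream bit rate (numbers from the demo above).
STREAM_MBPS   = 3.75   # standard definition MPEG-2 stream
ACHIEVED_GBPS = 16.0   # streaming throughput Kasenna reached in the test

print(f"{ACHIEVED_GBPS * 1000 / STREAM_MBPS:.0f} simultaneous SD streams")  # ~4267, i.e. roughly 4000
```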

 

If you're not familiar with SSD technology, Wikipedia has a good overview (http://en.wikipedia.org/wiki/Solid_state_drive). Intel also discussed our NAND-based solid state drive technology at Fall IDF (http://www.intel.com/pressroom/kits/events/idffall_2007/webcasts.htm). Pat Gelsinger introduces the technology about 40 minutes into his Tick-Tock - Powerful, Efficient, and Predictable presentation. Knut Grimsrud gives a good overview of the NAND technology in his Challenges and Opportunities for Non-Volatile Memory in Platforms presentation.

I won't post the gory details of the configurations today; if you're interested, send me an email. The simple net: it took approximately 60 15K RPM HDDs to achieve the same throughput as 12 Intel prototype SSDs. Two major takeaways, with a quick check of the arithmetic after the list:

1. Intel solid state drives look to be ideal for high throughput workloads like on-demand video that require random access from disk. Kasenna achieved about 5 times the throughput per solid state drive compared to the hard disk drives.

2. In this case, the Intel SSD configuration lowered the peak power of the full configuration (disks, server, NICs, memory) to about 1/3 of the HDD configuration.
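
Here is the arithmetic check promised above, using only the drive counts from the results. The power result in takeaway 2 was measured across the whole configuration, so it is not derivable from drive counts alone.

```python
# Takeaway 1 restated as arithmetic: equal throughput from 60 HDDs and
# 12 prototype SSDs implies the per-drive gain below.
HDD_COUNT, SSD_COUNT = 60, 12

print(f"Per-drive throughput gain: ~{HDD_COUNT / SSD_COUNT:.0f}x")  # ~5x, matching takeaway 1
```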

The demo also raised a number of other questions, such as whether the higher performance of the SSDs could reduce the amount of memory required in the server. No conclusions on this yet.

This was my first step in understanding the advantage of solid state drive technology for a server application. My conclusion: Intel NAND-based solid state drive technology looks to be a promising way to achieve higher throughput and lower power when compared to a hard disk. I'll be posting more examples in the future of where SSDs look to be a good fit for applications. I would be interested in hearing your feedback on this concept demo and about server applications where you see SSDs as having high value.
