3 Replies Last post: Jun 9, 2008 12:40 PM by tomtzigt

Who uses cloud computing services?

May 29, 2008 1:14 AM

rick.j.white 4 posts since
Aug 24, 2007
Cloud computing is one of the hot topics for 2008. One of the hardest things about this topic is coming up with a consistent definition, but a lot of folks seem to be converging on the following taxonomy: Cloud services are generally either Software as a Service (like subscribing to an online CRM service) or Infrastructure as a Service (like renting virtual machines for compute or remote storage services).

What cloud services are you using? Are there some specific issues you want to see resolved (security, compliance, ...) before you would subscribe to a cloud service? Let us know.
tomtzigt 2 posts since
Jun 5, 2008
Reply 1. Re: Who uses cloud computing services? Jun 5, 2008 2:06 PM

Stillwater Supercomputing has been evaluating two technologies on Amazon EC2. The first was a test to see if you could use EC2 as a compute cluster for HPC applications in the CAE space. This effort led to the following insights:

The virtualization that is the basis of all cloud computing is the death knell for any HPC application. Two things run counter to effective usage of clouds for HPC. First, virtualization kills the memory footprint control that is essential to HPC applications. An HPC data structure is carefully constructed as a function of system memory and of microprocessor cache and TLB attributes. Not having control over that makes the HPC application uncontrollable and thus unpredictable. Second, the gear used in clouds is typically cheap and not uniform; that is, cost is taken out of the system on the components that are essential for HPC: network latency and throughput. Variable CPU gear creates problems in load balancing, and low-cost networking creates problems for the efficiency of the algorithms.
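To make the point about cache-dependent data structures concrete, here is a minimal sketch of cache blocking (tiling) in Python. It is an illustration only, not code from the evaluation described above: the function names `tile_edge` and `blocked_matmul` are hypothetical, and the 32 KiB cache size is an assumed figure. The tile size is derived from the cache size, so if the virtualized hardware presents a different effective cache than the one assumed, the tuning no longer matches the machine.

```python
import math


def tile_edge(cache_bytes, element_bytes=8, tiles_resident=3):
    """Largest square tile edge such that `tiles_resident` tiles of
    `element_bytes`-sized elements fit in `cache_bytes` (assumed sizes)."""
    return max(1, math.isqrt(cache_bytes // (tiles_resident * element_bytes)))


def blocked_matmul(a, b, n, tile):
    """Multiply two n x n matrices (lists of lists), one tile at a time."""
    c = [[0.0] * n for _ in range(n)]
    for ii in range(0, n, tile):
        for kk in range(0, n, tile):
            for jj in range(0, n, tile):
                # Work on one tile of a, b and c so the working set stays small.
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, n)):
                        aik = a[i][k]
                        for j in range(jj, min(jj + tile, n)):
                            c[i][j] += aik * b[k][j]
    return c


if __name__ == "__main__":
    # Assumption: a 32 KiB data cache and 8-byte floating-point elements.
    edge = tile_edge(32 * 1024)
    print("tile edge:", edge)
    print(blocked_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 2, edge))
```

The result of the multiplication does not depend on the tile size; only the memory access pattern does, which is exactly the part an application cannot tune when it does not know or control the underlying hardware.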

Having gone through that exercise we now understand cloud computing and virtualization much better, and what it means for the applications that are good candidates. The MapReduce concept and applications that can leverage MapReduce are very natural fits for cloud computing. So we are working with Hadoop/Lucene/Nutch and working through the issues for those types of applications. In that Hadoop space it is interesting that there are already quite a few companies that leverage Hadoop on EC2/S3.
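For readers unfamiliar with the MapReduce concept mentioned above, here is a minimal single-process sketch of the pattern in Python, using word counting as the example. It does not use Hadoop or its API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. In a real framework the map and reduce calls would be spread over many machines, with no low-latency communication between them, which is why the pattern tolerates cloud hardware well.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values emitted for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    docs = ["the cloud is the computer", "the network is the computer"]
    print(word_count(docs))
```

Each call to `map_phase` depends only on its own document, and each call to `reduce_phase` only on its own key, so both phases can run in parallel on independent, non-uniform machines.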

In summary, our assessment of cloud computing is that for workloads that do not care about latency, clouds are useful. However, cloud computing combined with data sets, such as those held by Google or Amazon, is much more interesting. We believe that cloud computing as a pure compute service is not viable: it has to be coupled with an application and, even more attractively, with a valuable data set. When you think about the success of outsourced web servers and web application servers this should not come as a surprise. Organizations such as Rackspace attract the end user of an application, in their case web servers, but they don't have any leverage to add value to their offering. Google and Amazon do have that option, and the richness of their offerings easily surpasses Rackspace.

rick.j.white 4 posts since
Aug 24, 2007
Reply 2. Re: Who uses cloud computing services? Jun 8, 2008 11:50 AM
in response to: tomtzigt
Thank you so much for this reply. Your comments get to the heart of the debate of where cloud computing architecture is going. Is it a general purpose virtual datacenter? For your HPC applications this isn't going to work. Will there be better "purpose built" clouds that can handle HPC applications, and where do the datasets come from? We've spoken with several academic groups who are looking at building scale-out "supercomputer clusters" to handle specific areas like genomics, weather and others - how do we get the cooperation between the groups who own the data to make it available for mining and synthesis? Do programming techniques like map/reduce really allow us to provide adequate programming for anything other than search? OK - enough questions for now. We've had almost 400 hits on this discussion thread - does anyone else have an experience they'd like to share or an opinion they'd like to register in response to my barrage of questions? Thanks!
tomtzigt 2 posts since
Jun 5, 2008
Reply 3. Re: Who uses cloud computing services? Jun 9, 2008 12:45 PM
in response to: rick.j.white
Useful data to compute on will always be a problem, not just for academics but also for small and medium businesses (the other multi-trillion-dollar portion of the US economy). Governmental efforts (NIH, NSF, SciDAC) can afford to invest here, but industrial data is too important an asset to part with. For example, Google will never allow folks to get to their core web snapshot data or their user click data, nor will Intel part with its chip process data or core circuit data so that academics can innovate on process management or circuit analysis. It is hard to see how this will ever change: the data is the asset, and companies and academics have typically invested heavily to create that asset and thus want to leverage it, either as a commercial entity or as an intellectual entity.


Personally, I think we'll see a huge increase in the "dark matter" of the internet over the next couple of years. Dark matter is defined here as the data created by organizations with valuable data assets and offered for sale on the internet. Examples would be market makers like Dun and Bradstreet, Gartner, IDC, or financial service brokers, and of course any product organization like GE and Boeing. These organizations are motivated either by generating revenue from their data asset or by leveraging the internet to amplify their geographical diversity. This makes it more plausible that federated systems are the end game, not cloud computing.

In one way, you can think of Google as a federated system. The world's web servers are the federated systems whose data Google aggregates, caches, and adds value to. The world can self-organize these web servers into a collection of server farms managed by IBM, Google, eBay, and Amazon, or these web servers will become larger hubs themselves, aggregating the data behind the firewall for profit or productivity.