Selecting a cloud computing provider is becoming increasingly complex. As cloud environments mature, many cloud providers attempt to differentiate themselves by focusing on specific aspects of their offerings, such as technology stacks or service-level agreements (SLAs). In short, not all cloud providers are created equal.
At the same time, enterprises are beginning to rely on cloud providers for hosting mission-critical applications, which raises the stakes for selecting the right cloud service. So how do organizations navigate this multifarious landscape? Below you’ll find a few key factors for evaluating services as well as some resources to use.
Performance
One of the main concerns for enterprises that are considering cloud computing is performance. Achieving high-speed delivery of applications in the cloud is a multifaceted challenge that requires a holistic approach and an end-to-end view of the application request-response path.
Performance issues include the geographical proximity of the application and data to the end user; network performance both within the cloud and into and out of it; and I/O access speed between the compute layer and the multiple tiers of data stores. A number of services and research reports, such as CloudSleuth and CloudHarmony, have recently attempted to measure the performance of cloud providers from various locations and with different application use cases.
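As a rough illustration of measuring the request-response path, the sketch below times repeated calls and reduces them to a median and a worst case per test location. It is only a sketch under stated assumptions: `time_calls`, `summarize`, the `fake_request` stand-in and the `provider-a/us-east` label are all hypothetical, and a real test would replace `fake_request` with an actual request to the application.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_calls(request: Callable[[], object], runs: int = 5) -> List[float]:
    """Return the wall-clock duration, in seconds, of each call to `request`."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        request()
        durations.append(time.perf_counter() - start)
    return durations


def summarize(samples: Dict[str, List[float]]) -> Dict[str, Dict[str, float]]:
    """Reduce raw timings per label to a median and a worst-case value."""
    return {
        label: {"median": statistics.median(times), "worst": max(times)}
        for label, times in samples.items()
    }


def fake_request() -> None:
    """Hypothetical stand-in for a real request to the deployed application."""
    time.sleep(0.01)


# The label is an assumption; use whatever provider/location names apply.
samples = {"provider-a/us-east": time_calls(fake_request, runs=3)}
report = summarize(samples)
```

Reporting the worst case alongside the median matters because a single slow tier in the path can dominate what end users experience.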
Technology stack
Several cloud providers have focused their services on a particular software stack. This typically moves them from being Infrastructure as a Service (IaaS) providers to the realm of Platform as a Service (PaaS). As one would expect, the different stack-specific clouds align with the most popular software stacks out there.
Examples include Heroku and Engine Yard for Ruby on Rails; VMforce and Google App Engine (GAE) for Java/Spring (GAE also supports Python); PHP Fog for PHP; and Microsoft’s Windows Azure for .NET.
If your application is built using one of these stacks, you may want to consider these cloud platforms. They can offer tremendous savings in terms of time and expense by shielding you from having to deal with lower-level infrastructure setup and configuration. The flip side is that they often require developers to follow certain best practices in architecting and writing their apps, which creates a higher degree of vendor lock-in.
Service-level agreements and reliability
Some cloud providers offer guarantees for higher levels of service as a way to separate themselves from the pack. In "Rackspace: The Avis of Cloud Computing," I describe how Rackspace offers higher levels of cloud service SLAs to compete with Amazon, the 800-pound gorilla of cloud computing.
Note that SLAs are often merely an indication of the consequences when the service fails and not the service’s actual reliability. A great example of this is GoGrid’s 10,000% Guaranteed SLA. In other words, GoGrid offers a 100% uptime guarantee. Should it fail to meet that level of availability, it will compensate the customer with 100 times the fee paid for the downtime.
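To make the arithmetic concrete, the sketch below computes the credit owed under a "100 times the fee paid for the downtime" rule. The function name, the hourly fee and the outage length are hypothetical values chosen for illustration, and any caps or exclusions in an actual contract are not modeled.

```python
def sla_credit(hourly_fee: float, downtime_hours: float,
               multiplier: float = 100.0) -> float:
    """Credit owed when compensation is `multiplier` times the fee
    that the customer paid for the period of downtime."""
    fee_for_downtime = hourly_fee * downtime_hours
    return fee_for_downtime * multiplier


# Hypothetical numbers: a server billed at 0.50 per hour, down for 2 hours.
# The fee for the downtime is 1.00, so the credit is 100 times that.
credit = sla_credit(hourly_fee=0.50, downtime_hours=2)
```

The credit scales with the fee paid for the outage window, not with the business cost of the outage, which is why the SLA says little about actual reliability.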
Although the SLA is a good indicator of any provider’s level of commitment, knowing the real uptime levels of a particular cloud provider is a trickier proposition. Most vendors have a status page that acts as a dashboard for the health of their services, but these generally display statistics going back only a few days. To get actual long-term numbers for reliability and availability, it’s better to rely on customer testimonials and comparison services such as CloudSleuth and CloudHarmony.
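When long-term outage data is available, turning it into an availability figure is simple arithmetic. The sketch below assumes a list of outage durations in minutes collected over a known period; the function names and sample numbers are hypothetical.

```python
from typing import Iterable


def availability_percent(outage_minutes: Iterable[float], period_days: int) -> float:
    """Percentage of the period during which the service was up."""
    total_minutes = period_days * 24 * 60
    down_minutes = sum(outage_minutes)
    return 100.0 * (total_minutes - down_minutes) / total_minutes


def allowed_downtime_minutes(target_percent: float, period_days: int) -> float:
    """Downtime budget, in minutes, implied by an availability target."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - target_percent) / 100.0


# Hypothetical 30-day window with two outages.
measured = availability_percent([30.0, 13.2], period_days=30)
budget = allowed_downtime_minutes(99.9, period_days=30)
```

Comparing the measured figure with the budget implied by an advertised target shows how little downtime a "three nines" commitment actually leaves: roughly 43 minutes in a 30-day month.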
APIs: Lock-in, community and ecosystem
Another critical aspect of selecting a cloud provider is the application programming interface (API) it exposes for accessing the infrastructure and performing operations such as provisioning and de-provisioning servers. The API is important in a number of ways.
First, an API that is supported by multiple providers and vendors reduces lock-in because migration from one provider to another — or simultaneously working with multiple providers — requires less change to the application and is, therefore, easier.
Second, an API that is widely supported by a community of developers and vendors has an entire ecosystem of complementary services and capabilities around it. The APIs offered by Amazon Web Services (AWS) and the various VMware cloud offerings have large ecosystems built around them, which include tools for governance (such as enStratus), monitoring and management (such as Cloudkick and RightScale) and a slew of other services that round out the cloud service.
VMware itself does not have a cloud service, but various providers, such as Terremark and Savvis, use the VMware stack and APIs, specifically vCloud.
Both Amazon and VMware, and perhaps Windows Azure as well, allow customers to implement in-house clouds using their stack and APIs, which makes it easier to manage and run applications on what some call a hybrid cloud. A hybrid cloud is one that runs partly at a hosting provider and partly in the company’s on-premises data center. In the case of Amazon, this can be done through Eucalyptus, a startup that provides a software stack for implementing private clouds using the AWS APIs.
A recent development in the space is worth mentioning. Rackspace, jointly with NASA and supported by many vendors and cloud providers, has open sourced its software stack in a project called OpenStack. OpenStack is the closest thing to what might be considered an industry standard, and the move creates a viable alternative to the Amazon and VMware ecosystems.
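The lock-in argument in this section can be sketched in code: if application logic depends only on a small provisioning interface, switching providers means swapping one adapter rather than rewriting the application. Everything below is hypothetical, including `ComputeProvider`, `InMemoryProvider` and `scale_out`; a real adapter would wrap a provider's actual API instead of a dictionary.

```python
from typing import Dict, List, Protocol


class ComputeProvider(Protocol):
    """Minimal provider-neutral interface (an assumption, not a real API)."""

    def provision(self, name: str) -> str: ...

    def deprovision(self, server_id: str) -> None: ...


class InMemoryProvider:
    """Hypothetical stand-in for an adapter around one provider's API."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self.servers: Dict[str, str] = {}
        self._counter = 0

    def provision(self, name: str) -> str:
        self._counter += 1
        server_id = f"{self.prefix}-{self._counter}"
        self.servers[server_id] = name
        return server_id

    def deprovision(self, server_id: str) -> None:
        del self.servers[server_id]


def scale_out(provider: ComputeProvider, count: int) -> List[str]:
    """Application code that depends only on the shared interface."""
    return [provider.provision(f"web-{i}") for i in range(count)]


# The same application code runs unchanged against two different adapters.
cloud_a = InMemoryProvider("a")
cloud_b = InMemoryProvider("b")
ids_a = scale_out(cloud_a, 2)
ids_b = scale_out(cloud_b, 1)
cloud_a.deprovision(ids_a[0])
```

An API that several providers already share removes the need to write and maintain such adapters in the first place, which is the practical benefit described above.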
Security and compliance
Two of the biggest barriers for companies considering cloud computing continue to be security and compliance. In a recent Zenoss Inc. survey conducted during the second quarter of 2010, nearly 40% of respondents listed security when asked about their biggest concerns about cloud computing. The second most common answer was management, which received only 26.5% of the responses. The Zenoss survey is consistent with a number of other surveys related to cloud computing.
The real concern for enterprises is not actually security threats but rather their inability to achieve compliance with security-related standards such as PCI. In response, many cloud providers are now touting their security and compliance chops with SAS 70 Type II audits, security white papers and other measures.
Banking on the opportunity, cloud provider Logicworks has dubbed its offering the Compliant Cloud and has recently announced the Level 1 PCI accreditation of its cloud.
Cost
A straightforward way to compare cloud providers would appear to be cost, but it turns out to be anything but. The problem is that there is no consistency among providers with regard to the resources customers actually receive and pay for. Providers offer virtual machines (VMs) that vary widely in memory capacity, CPU clock speed and other features. Furthermore, the units that are actually provided to customers are often virtualized, creating even further confusion as to what the customer is actually getting and how it might be affected by other customers on the same cloud.
Amazon has EC2 Compute Units, Heroku offers Dynos and other vendors have created their own measurement units. The only truly reliable way to measure the cost-performance of different cloud providers at this point is to conduct an experiment with the same application or prototype on multiple providers and compare the results.
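One way to compare the results of such an experiment is to normalize each provider's price by the throughput the same application achieved on it. The sketch below is illustrative only: the provider names, prices and throughput figures are invented, and `cost_per_million_requests` is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BenchmarkRun:
    provider: str               # hypothetical label
    hourly_cost: float          # price paid per hour for the tested setup
    requests_per_second: float  # throughput measured with the same application


def cost_per_million_requests(run: BenchmarkRun) -> float:
    """Normalize price by measured throughput so unlike units become comparable."""
    requests_per_hour = run.requests_per_second * 3600
    return run.hourly_cost / requests_per_hour * 1_000_000


# Invented results from running one prototype on two providers.
runs: List[BenchmarkRun] = [
    BenchmarkRun("provider-a", hourly_cost=0.36, requests_per_second=100.0),
    BenchmarkRun("provider-b", hourly_cost=0.54, requests_per_second=200.0),
]

# Cheapest per unit of work first, regardless of the sticker price per hour.
ranked = sorted(runs, key=cost_per_million_requests)
```

In this invented data set the provider with the higher hourly price ranks first because it delivers twice the throughput, which is exactly the kind of result that vendor-specific units hide.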
Choosing the best cloud provider for an application is a multidimensional problem. As the number of cloud providers increases, and as many of them focus on specialized needs and use cases, more choices require more focused examinations. Fortunately, services are emerging that help compare cloud services so that customers can tell which provider is best suited for which application.