With OnApp Cloud v3.0, storage isn’t boring any more
20 February 2013 by
It's difficult to write a blog post about OnApp Cloud v3.0, released today. There are so many new features that it's hard to know where to begin.
Maybe we should talk about our new VMware support. That's at the top of the list for a lot of our customers. Perhaps it's the addition of video to our CDN (which, by the way, just hit 148 PoPs). Maybe it's our fantastic new control panel.
For a lot of cloud providers, though, OnApp Storage is the real highlight of v3.0. Ubiquity Servers is one of the first companies to build a new service around our distributed SAN, and as their CTO Clint Chapman says, "it's the addition of OnApp Storage that's the real game-changer here." We're inclined to agree, of course. But what is it that makes OnApp Storage so important?
'Designed for cloud providers'
OnApp Storage is designed for cloud workloads. It's designed to help cloud providers who've always struggled to find the right mix of performance, resilience, scalability and price with traditional, centralized SANs.
The price part is always important, but let's put that to one side for the moment. You get OnApp Storage bundled with OnApp Cloud, at no extra cost - so enough said. Scalability is important too, but again that's a pretty straightforward argument: with OnApp Storage, your SAN scales naturally as your cloud grows, because you're using your cloud infrastructure to deliver your storage.
Instead, I'd like to focus on what distributed storage really means for OnApp Cloud providers: how it helps you make storage part of your cloud USP, by customizing it for the different needs of your customers. To understand why, we need to look at what a distributed SAN is, and how OnApp Storage works at a high level.
Building your distributed SAN
OnApp Storage lets you create a distributed SAN in your existing OnApp Cloud infrastructure. The basic components are the same: you have a controller server, which manages a number of hypervisor servers and hosts the OnApp Control Panel. However, instead of hooking up your cloud to a centralized SAN, you populate your hypervisors with additional physical disk drives. Those drives can be SSD, SATA or SAS devices.
The first step in setting up your distributed SAN is to identify which hypervisors (and their disks) you want to include in it. OnApp Cloud will automatically discover your hypervisor servers (that's another new v3.0 feature) and you just select the hypervisors you want with a few clicks in the control panel.
Defining your storage service
So far so good. The next step is to create one or more virtual datastores in the OnApp Control Panel, and assign disks from those hypervisors to each datastore. Datastores are categorized by performance, so you might create high performance, mid performance and lower performance datastores containing SSD, SAS and SATA disks, respectively. The disks in each datastore are distributed across multiple hypervisor servers.
Each datastore can have its own policies for striping, redundancy and overcommit percentage, too - and of course, its own pricing. All of this is set with a few clicks in the control panel.
Provisioning storage to customers
Now we get to the interesting part. When a customer sets up a virtual machine, you present them with storage options based on whatever datastores you have set up. For example, your 'turbo' package based on a datastore with high performance SSDs, 4 stripes and 2 redundant copies of data; or your 'slow but bulletproof' package based on SATA disks, with 4 redundant copies and no striping. Your pool of disks, of different types and sizes, can be sliced up however you like and presented to customers in many different ways - but with a single SAN in the back end, and a simple management GUI.
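To make the packaging idea concrete, here is a minimal sketch in Python of the two example packages as plain data. It is an illustration only, not OnApp's configuration format or API: the `Datastore` class, its field names and the `storage_options` helper are all hypothetical, and "no striping" is modelled as a stripe count of 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Datastore:
    """Hypothetical model of a datastore and its policy (not an OnApp type)."""
    name: str
    disk_type: str  # "SSD", "SAS" or "SATA"
    stripes: int    # 1 is used here to mean "no striping"
    replicas: int   # number of copies of the data that are kept


# The two example packages described in the text above.
PACKAGES = {
    "turbo": Datastore("turbo", "SSD", stripes=4, replicas=2),
    "slow but bulletproof": Datastore(
        "slow but bulletproof", "SATA", stripes=1, replicas=4
    ),
}


def storage_options(packages):
    """Return one human-readable line per package, in definition order."""
    return [
        f"{d.name}: {d.disk_type}, {d.stripes} stripe(s), {d.replicas} copies"
        for d in packages.values()
    ]


if __name__ == "__main__":
    for line in storage_options(PACKAGES):
        print(line)
```

In practice you set all of this through the control panel. The sketch only shows that each package is a small bundle of choices sitting on top of the same pool of disks.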
Behind the scenes is where the real magic happens. The customer's virtual machine is provisioned with a virtual disk (a vdisk), just as it would be with a traditional SAN in your cloud. However, OnApp Storage chooses which physical drives in the datastore actually handle the data for that virtual machine.
- Which drives are chosen depends on how much available space they have, the number of vdisks already stored on each drive, and a few other factors.
- How many drives are chosen depends on the striping and redundancy policy of the datastore used. For example, a vdisk on a datastore with 1 replica and 4 stripes will be stored across 4 physical disks. A vdisk on a datastore with 4 replicas and 4 stripes will have 16 physical disks as 'owners' of its content.
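The arithmetic in those examples is simply stripes multiplied by replicas. The Python sketch below works through it, and adds a deliberately simplified drive ranking. The ranking is an assumption made for illustration (most free space first, then fewest vdisks); the real selection weighs other factors too, and the names `Drive`, `owner_disk_count` and `pick_drives` are hypothetical rather than part of OnApp Storage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    """Hypothetical view of one physical drive in a datastore."""
    drive_id: str
    free_gb: int
    vdisk_count: int


def owner_disk_count(stripes, replicas):
    """Number of physical disks that own part of a vdisk's content."""
    if stripes < 1 or replicas < 1:
        raise ValueError("stripes and replicas must both be at least 1")
    return stripes * replicas


def pick_drives(drives, stripes, replicas):
    """Pick owner drives using an assumed, simplified ranking.

    Assumption: prefer more free space, then fewer existing vdisks,
    with the drive id as a final tie-breaker for a stable order.
    """
    needed = owner_disk_count(stripes, replicas)
    if needed > len(drives):
        raise ValueError("not enough drives in the datastore")
    ranked = sorted(
        drives, key=lambda d: (-d.free_gb, d.vdisk_count, d.drive_id)
    )
    return ranked[:needed]


if __name__ == "__main__":
    print(owner_disk_count(stripes=4, replicas=1))  # 4 physical disks
    print(owner_disk_count(stripes=4, replicas=4))  # 16 physical disks
```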
'VM-aware' optimizes performance
At the same time, OnApp Storage will always attempt to store a vdisk locally to its virtual machine - i.e. on a physical drive in the same hypervisor. It knows which virtual machine owns which vdisk, and on which physical disk that vdisk lives. This 'VM-aware' technology helps maintain a high throughput for cloud workloads - up to 95% of the speed of the disks you use, in fact - because a copy of the virtual machine's data is always stored locally to the virtual machine, whenever possible.
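As a rough illustration of the "local whenever possible" idea, the Python sketch below puts one drive from the virtual machine's own hypervisor at the front of the candidate list, and falls back to remote drives when no local drive exists. This is an assumed simplification, not OnApp's placement algorithm, and the `Drive` class and `place_with_local_preference` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Drive:
    """Hypothetical view of a physical drive and the hypervisor hosting it."""
    drive_id: str
    hypervisor: str


def place_with_local_preference(drives, vm_hypervisor, needed):
    """Choose `needed` drives, putting one local drive first if there is one.

    Assumption: a single local copy is enough to serve the VM locally,
    so the remaining copies go to drives on other hypervisors before
    any further local drives are used.
    """
    if needed > len(drives):
        raise ValueError("not enough drives in the datastore")
    local = [d for d in drives if d.hypervisor == vm_hypervisor]
    remote = [d for d in drives if d.hypervisor != vm_hypervisor]
    candidates = local[:1] + remote + local[1:]
    return candidates[:needed]


if __name__ == "__main__":
    pool = [Drive("d1", "hv2"), Drive("d2", "hv1"), Drive("d3", "hv3")]
    chosen = place_with_local_preference(pool, vm_hypervisor="hv1", needed=2)
    print([d.drive_id for d in chosen])
```

If no drive lives on the virtual machine's hypervisor, the function simply returns remote drives, which matches the "whenever possible" wording above.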
'Smart disks' increase resilience
Even cleverer is OnApp Storage's 'smart disk' technology, which is patent pending. As well as being aware of the relationship between virtual machines, virtual disks, physical disks and hypervisors, the system knows which physical disks hold copies of data and - crucially - can use that information to rebuild vdisks if a problem occurs with any of the physical copies.
This doesn't require a centralized management system - in fact, it depends on not having one. Instead the intelligence is on each physical disk. That's important for two reasons. Firstly, each physical disk in your distributed SAN is a smart, self-managing, self-discovering and self-contained unit. It can make decisions about data synchronization and load balancing, without depending on a central controller. Secondly, it means disks are hot-pluggable. You can move disks between hypervisors and preserve the integrity of the data they hold.
Adding value with your SAN
So, putting all of this together... how is our distributed storage different from traditional SANs? If you're used to building cloud services with centralized SANs, it can be tempting to think about storage in a kind of binary way: it's either fast and expensive (and difficult to manage), or cheap and slow (and difficult to manage).
With our distributed SAN, however, you can start to think about storage from your customers' perspective, and design the storage part of your service to meet their needs.
You start with the basic performance characteristics of your disks. You then decide how much additional performance to offer your users by striping data across the SAN, and how much resilience to give them through redundant copies. Finally, you price and package the result.
You get a very simple way to manage your SAN through the OnApp control panel, as a seamless part of your cloud. Your customers can provision storage from as many different options as you care to give them. And behind the scenes the system takes care of performance optimization and data integrity, automatically.
With OnApp Storage, your SAN might be based on commodity hardware, but you don't have to provide boring commodity storage with your cloud. You get much more flexibility, and you can use your SAN to add real value for your clients.