Why does StorSimple Matter?

Today is one of those spotlight days when people who don’t know much about StorSimple will want to find out more.

In a nutshell, we have been developing what we believe is the best cloud data management technology that allows our customers to use cloud storage services to manage their enterprise data and storage.

We don’t make the cloud look like a disk drive or tape drive; we make the cloud available as a place to manage data. Our technology segments the data stored in our systems into small pieces, and we track each and every one of those segments wherever it happens to be – whether in SSD storage, on hard disks or in the cloud. As the data is updated, we track all of those changes too.

Why? Because it makes things like recovery from the cloud a whole lot faster than pretending to be a tape drive, and it creates a system where data portability between the enterprise data center and the cloud is possible. If you are going to move data between earth and sky, you need to keep track of it somehow and keep up with the changes. We have a system for doing that. The various segments of a volume can be anywhere within reach, including the cloud, and customers can mount the volume. We assemble all the segments and serve them to applications as they are needed. That’s why disaster recovery is so fast with StorSimple: customers mount the volume in the cloud, have access to everything in it, but only download the data they need to get up and running again.
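The segment-tracking idea above can be sketched in a few lines. This is a minimal, hypothetical illustration – all names and the fixed segment size are assumptions, not StorSimple’s actual implementation – showing why a mounted volume only needs to download the segments an application actually reads:

```python
# Hypothetical sketch of segment tracking and "thin restore": a volume's
# segments live in different tiers, and a mounted volume only pulls the
# segments an application actually reads. All names are illustrative.

CLOUD = {}  # stands in for a cloud object store: segment_id -> bytes

class Volume:
    def __init__(self, segment_size=4):
        self.segment_size = segment_size
        self.local = {}      # segment_id -> bytes held on SSD/HDD
        self.location = {}   # segment_id -> "local" or "cloud"

    def write(self, offset, data):
        # Split the write into fixed-size segments and track each one.
        for i in range(0, len(data), self.segment_size):
            seg_id = (offset + i) // self.segment_size
            self.local[seg_id] = data[i:i + self.segment_size]
            self.location[seg_id] = "local"

    def tier_to_cloud(self, seg_id):
        # Move a segment to the cloud; only its location record stays local.
        CLOUD[seg_id] = self.local.pop(seg_id)
        self.location[seg_id] = "cloud"

    def read(self, seg_id):
        # Thin restore: download a segment only when it is actually read.
        if self.location[seg_id] == "cloud":
            self.local[seg_id] = CLOUD[seg_id]
            self.location[seg_id] = "local"
        return self.local[seg_id]

vol = Volume()
vol.write(0, b"ABCDEFGH")        # two 4-byte segments: 0 and 1
vol.tier_to_cloud(1)             # segment 1 now lives only in the cloud
print(vol.read(1))               # fetched on demand: b"EFGH"
print(0 in CLOUD, 1 in CLOUD)    # segment 0 never left local storage
```

The point of the sketch is the `location` map: because every segment is tracked individually, a recovery can mount the volume immediately and let reads pull segments down one at a time.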

For those readers who are not into enterprise storage, the technology has similarities to both Data Domain and 3PAR. All the data in the system is deduplicated the way Data Domain systems do it, and all the small pieces of the data are presented as live online data the way 3PAR does. It’s fine-grained storage virtualization that includes both deduplication and cloud storage.

There is no special hardware required to do this. StorSimple has built systems that have a certain blend of SSDs and hard disks in order to meet certain performance expectations. While there are many exciting opportunities to further leverage our technology in the days ahead, for now we are enjoying the news and looking forward to the excitement of suddenly becoming much more visible and important to a lot of potential customers.

The cloud wants your junk data

What do you think about when you think about cloud?   A lot of people think of shiny, new technology made of all new APIs and hypervisors and mobile devices and cutting edge code and things that only the next generation will understand. And for a lot of cloud customers, that’s reality. New, new, new.

What you probably didn’t know, however, is that the storage side of the cloud service provider business isn’t hung up on new. In fact, they are ecstatic about old. Old junk data that you would rather forget about, get out of your life and out of your data center. Data that you know you shouldn’t just delete because an attorney somewhere will ask for it. But data that’s taking up expensive tier 1 storage, the digital equivalent of engine sludge.

Cloud storage services want it – even if you end up deleting it later. It doesn’t matter to them. You might be thinking they just want to mine your data. Nope. They are perfectly fine storing encrypted data that they will never be able to read. To them, it’s all the same flavor of money at whatever the going rate is. They don’t care if the data was a lot bigger before it was deduped or compressed or whatever you have done to it to reduce the cost. Why should they care if you send them 100 GB of data that was originally 1 TB? They don’t.

It’s good business for them – they’ll even replicate it numerous times to prevent data loss. You might be thinking “but it’s garbage data, I’d never replicate it”. True, but if it’s garbage data, then why do you have so many backup copies of it on tape and possibly in other locations? Why are you managing garbage over and over again?

It’s a double win. They want it and you don’t. All you need is the equivalent of a pump to move it from your expensive tier 1 storage to their data storage services. There are a number of ways this can be done, including using products from StorSimple, the company I work for. A StorSimple system ranks data based on usage, compacts it, tags it (in metadata), encrypts it and migrates it to a storage tier in the cloud where it can be downloaded or deleted later if that’s what you decide to do with it. How much money do you think your company is wasting taking care of junk?
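The “pump” described above – rank by usage, compact, tag, encrypt, migrate – can be sketched as a simple pipeline. Everything here is illustrative: the idle-time threshold, the metadata format, and especially the toy XOR “encryption” (a placeholder, not a real cipher) are assumptions for demonstration only:

```python
# Illustrative sketch of tiering cold "junk" data to cheap cloud storage:
# rank blocks by last access, then compress, tag, encrypt, and upload the
# coldest ones. The XOR "encryption" is a toy stand-in, not a real cipher.
import time, zlib, json

cloud_bucket = {}  # pretend object store: object name -> payload

def toy_encrypt(data, key=0x5A):
    return bytes(b ^ key for b in data)

def migrate_cold_blocks(blocks, idle_seconds=30 * 24 * 3600):
    """blocks: dict name -> (last_access_epoch, bytes)."""
    now = time.time()
    for name, (last_access, data) in list(blocks.items()):
        if now - last_access < idle_seconds:
            continue  # still warm; leave it on tier 1
        compacted = zlib.compress(data)               # compact
        meta = json.dumps({"name": name,              # tag (metadata)
                           "orig_len": len(data)})
        cloud_bucket[name] = {"meta": meta,
                              "payload": toy_encrypt(compacted)}
        del blocks[name]  # reclaim expensive local capacity

tier1 = {"report_2009.doc": (0, b"old" * 1000),          # untouched for years
         "orders.db":       (time.time(), b"hot data")}  # accessed just now
migrate_cold_blocks(tier1)
print(sorted(tier1))            # only the hot block remains locally
print(sorted(cloud_bucket))     # the cold block moved to the cloud
```

In a real system the ranking would be continuous and policy-driven rather than a one-shot threshold, but the shape of the pipeline is the same.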

Are you feeling lucky, or just confident?

Chris Mellor wrote an article for The Register yesterday on cloud storage. At the end of it all, Chris malappropriated the famous monologue from the movie Dirty Harry:

“Being this is a .44 Magnum, the most powerful cloud storage service in the world, and would blow your SAN head clean off, you’ve got to ask yourself one question: ‘Do I feel lucky?’ Well, do ya, punk?” ®

For those unfamiliar with the movie, the context here is that a violent detective (Dirty Harry) has caught a psychotic serial killer and asks him the ultimate question about his fate.  Tension builds with the realization that Harry is asking himself the same question because he is unsure if there are any bullets left in his gun.  He obviously wants to find out, but struggles with a good cop vs evil cop dichotomy. He needs his psychopathic adversary to make the first move, but he seems awfully confident.

It doesn’t have much to do with cloud storage, other than raising the question of fate – something that storage administrators think about with regard to data more often than they think about their own.

So what is the fate of data stored in the cloud, and what steps do cloud service providers take to reassure customers that their data is safe? You can’t plan for everything, but you can plan to cover an awful lot of the mayhem that can occur.

For starters, you can store data in multiple locations to protect against being unable to access data from a single cloud site. As Chris’ article pointed out, StorSimple allows customers to do that. They can store data in discrete regions run by a single service provider, or they can store data in cloud data centers run by different cloud service providers. Different customers will have different comfort levels where cloud redundancy is concerned.

But it’s important to know that cloud storage service providers already store data in multiple locations anyway to protect against an outage at a single site that could cause a data loss. Data in the cloud is typically stored multiple times at the site where it is first uploaded and then stored again at other sites in the cloud service provider’s network.  Customers who are concerned about the fate of their data should discuss how this is done with the storage service providers they are considering because they are all a little different.
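A rough way to see why those multiple copies matter: if each copy has some independent chance of being lost in a year, the chance of losing all of them falls off exponentially with the copy count. The 1% figure below is an arbitrary assumption for illustration, not a real provider’s number, and real copy failures are not perfectly independent:

```python
# Back-of-envelope illustration: with an assumed independent per-copy
# annual loss probability p, the chance of losing all n copies is p**n.
# The 1% figure is an assumption for illustration only.
def loss_probability(p_single, n_copies):
    """Probability that every one of n independent copies is lost."""
    return p_single ** n_copies

p = 0.01  # assumed 1% annual chance of losing any single copy
for n in (1, 2, 3):
    print(n, "copies ->", loss_probability(p, n))
```

This is why the question to ask a provider is not just “how many copies?” but where those copies live and how independent their failure modes really are.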

There is an awful lot of technology that has gone into cloud storage. We tend to think of it like a giant disk drive in the sky, but that is only the easiest way to think about it. Cloud storage – especially object storage of the kind StorSimple uses, based on RESTful protocols – has been amazingly reliable. There have been problems with other aspects of the cloud, including block storage, but object storage has been rock solid. It’s not really about feeling lucky, as Dirty (Chris) Harry suggested; it’s about the scalable and resilient architectures that have been built.

We would love to talk to you about cloud storage and how you can start using it. If you have a cloud service provider in mind, we are probably already working with them.

Not dead yet, but when will you get rid of tape?

Do you have any more tapes you want to get rid of?

People have predicted the ending of tape as a storage medium since the first rotating storage drums were made by wrapping recording tape around modified washing machine drums. Too cumbersome and too error prone, tape has survived because people use it for archiving and off-site DR storage. It has always been the storage backstop for all the other things that can go wrong – from human error to combinations of calamities that are stranger than fiction.

But tape itself has been a big problem. It is a byzantine technology with impressive data transfer rates, but it is saddled with cumbersome management that requires many touch points where things can go wrong. Restoring from multiple tapes is time-consuming and unnerving, but considered normal. Contrast that with dedupe technology that can access and restore data much more quickly. The main problem with dedupe is its cost: the most popular disk-based dedupe systems are not cheap. The other problem is that many customers still use tape alongside dedupe for DR purposes. Used this way, tape is less intrusive, but it is still a pain.

Disk-based dedupe has taken a big bite out of tape’s business, yet tape has continued limping along like an unkillable zombie. Now, with cloud backup looking like it could take even more out of tape’s market, is tape finally going to keel over?

Tape is tired

Putting tape backups on less expensive virtual tape cloud storage could look like an obvious solution, but like all things in storage, initial impressions are usually misleading. While cloud storage can be made to look like a big disk or tape drive in the sky, it is much slower than the old frenemy tape. The difference is most pronounced when you want it to be most transparent – during restores. Technologies for data reduction, such as deduplication and compression help, but the fastest restores from the cloud will use technologies like thin restores that were developed by StorSimple. Why restore data that you probably won’t need again? Just leave it in the cloud.

But getting back to tape, the cloud industry is making enormous investments in service offerings, including storage services, which will continue to be improved and expanded.  The cloud service providers are not stupid. They want your data so they can get your computing business when you are ready to start doing that.

Tape technology vendors do not have the marketing muscle to protect their installed base, regardless of how entrenched those customers may appear. The fact is, only the largest IT shops have the resources to “do tape” well. Everybody else struggles with the stuff and will be happy to jettison it as soon as they can.

So will tape disappear completely if most of the market goes away? Probably not. For starters, cloud storage service providers will probably use a lot of tape themselves, and large customers that know how to make it work will continue to want it.

My guess is that tape will follow the path of mainframe technologies into the mostly invisible corners of the technology industry where vendors are few and margins are high. Tape won’t die, it will only seem like it did.

Some gigabytes are worth more than others

Getting clarity on the cost and relative worth of enterprise technology has always been a challenge because of the complex environments and diverse requirements involved. For every good question about which product is better, there is the almost universal answer: “it depends”. One product might have more capacity than its competitors, another might have a unique feature that supports a new application, and yet another might have an operating or management approach that increases productivity. Beauty is in the eye of the beholder, and enterprise customers dig a lot deeper than what appears in competitors’ spec sheets. In some respects, it’s like comparing real estate properties, where location and design trump square footage.

One of the traps people fall into when comparing the value of cloud services to legacy infrastructure technologies is limiting their analysis to a direct cost-per-capacity comparison. This article in InformationWeek did just that, in a painstaking way: the author, Art Wittman, made a commendable effort to level the cost comparison, but he left out the location and design elements. He concludes that IaaS services are not worthwhile because their costs per capacity are not following the same cost curve as legacy components and systems. There is certainly some validity to his approach – if the capacity cost of disk drives has dropped an order of magnitude in four years, why should the cost of Amazon’s S3 service be approximately 39% higher?

Conceding that productivity gains can be realized from cloud services, he limits their value to application services and summarily rejects that they could apply to IaaS. After all the work he had done to make a storage capacity cost comparison, he refused to factor in the benefits of using a service.  Given that omission, Mr. Wittman concludes there is no way for an IaaS business model to succeed.

I agree with Mr. Wittman in one respect: if a service can’t be differentiated from on-site hardware, it will fail. But that is not the case with enterprise cloud storage, and it is especially not true of cloud storage that is integrated with local enterprise storage. Here’s why:

Storage is an infrastructure element, but it has specialized applications, such as backup and archiving that require significant expense to manage media (tapes). Moving tapes on and off-site for disaster recovery purposes is time-consuming and error-prone. While the errors are usually not damaging, they can result in lost data or make it impossible to recover versions of files that the business might need. The cost of lost data is one of those things that is very difficult to measure, but it can be very expensive if it involves data needed for legal or compliance purposes.  Using cloud storage as virtual tape media for backup kills two birds with one stone by eliminating physical tapes and the need for off-site tape rotations. It still takes time to complete the backup job and move data to the cloud, but many hours a month in media management can be recaptured as well as tape-related costs.

There are even greater advantages available with backup if it can be integrated from primary storage all the way to the cloud, as it is with StorSimple’s cloud-integrated enterprise storage (CIES). Using snapshot techniques on CIES storage, the amount of backup data generated is kept to a minimum, which means the amount of storage consumed from the cloud storage service provider is far less than if a customer used the cloud for virtual tape backup storage. Cloud-resident data snapshots have a huge capacity advantage over backup storage where the retention of files for legal and compliance purposes is concerned, and that demonstrates how the design of a cloud appliance can deliver even more value from cloud storage.
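The capacity difference between snapshot-style backup and virtual tape backup comes down to one idea, sketched below with hypothetical names (this is not StorSimple’s snapshot format, just the general technique): each snapshot records only the blocks that changed since the previous one, instead of a full copy every time.

```python
# Minimal sketch of incremental snapshots: upload only blocks that changed
# since the previous snapshot, rather than a full virtual-tape copy.
def take_snapshot(prev_blocks, curr_blocks):
    """Return only the blocks that differ from the previous snapshot."""
    return {addr: data for addr, data in curr_blocks.items()
            if prev_blocks.get(addr) != data}

day1 = {0: b"AAAA", 1: b"BBBB", 2: b"CCCC"}
day2 = {**day1, 1: b"BXBB"}               # one block changed overnight

full_backup = day2                         # virtual-tape style: everything
snap = take_snapshot(day1, day2)           # snapshot style: just the delta
print(len(full_backup), len(snap))         # 3 blocks vs 1 block uploaded
```

Multiply that delta-versus-full difference across daily backups for a year and the cloud capacity (and bandwidth) gap becomes enormous.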

The next increase in cloud storage value comes from integrating deduplication, or dedupe, technology with cloud storage. Dedupe minimizes the amount of storage capacity consumed by eliminating redundant information within the data itself. Sometimes the reduction can be quite large – as occurs with virtualized systems. StorSimple’s CIES systems automatically apply dedupe to the data stored in the cloud and squish capacity consumption to its minimum level – which also minimizes the amount of data transferred to and from the cloud. With the help of a cloud-integrated enterprise storage system, the capacity of cloud storage increases in value a lot because so much less of it is consumed.
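The core dedupe mechanism can be shown in a few lines of content-addressed storage. This is a generic sketch of the technique, not StorSimple’s implementation; the tiny fixed chunk size is purely for readability:

```python
# A minimal content-addressed dedupe sketch: chunks are stored once, keyed
# by their hash, so identical chunks across datasets consume capacity once.
import hashlib

store = {}  # chunk hash -> chunk bytes (what the cloud would actually hold)

def dedupe_write(data, chunk_size=4):
    """Return the list of chunk hashes (the 'recipe') for this data."""
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # stored once, however often it recurs
        recipe.append(digest)
    return recipe

def dedupe_read(recipe):
    """Reassemble the original data from its recipe of chunk hashes."""
    return b"".join(store[d] for d in recipe)

# Two "virtual machine images" with mostly identical content.
vm1 = dedupe_write(b"BOOTBOOTDATA")
vm2 = dedupe_write(b"BOOTBOOTLOGS")
print(len(store))                        # 3 unique chunks stored, not 6
print(dedupe_read(vm1) == b"BOOTBOOTDATA")
```

This is also why virtualized systems dedupe so well: dozens of VM images share the same operating system chunks, which are stored exactly once.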

But the worth of cloud storage is not all about consuming capacity, it’s about accessing data faster than you can from legacy data archives. Data stored in the cloud with a CIES system is online and can be accessed by workers and administrators without the need to find it in a separate archive pool of storage. If you don’t work in IT, you might not know how much time that can save the IT staff, but if you do work in IT, you know this is a huge advantage that returns a lot of administrator time for other projects.

The access to data in cloud storage is probably most valuable when it occurs following a disaster.  Cloud storage provides the ultimate flexibility in recovery by being location-independent.  Backup or snapshot data stored in the cloud can be accessed from almost any location with an Internet connection to the cloud storage service provider.  Again, cloud-integrated storage has some important advantages that further increase the value of cloud storage by requiring only a small subset of the data to be downloaded before application systems can resume production work. This is much faster than downloading multiple virtual tapes and then restoring data to application servers.

I could go on – and I will in future blog posts. This one is long enough already. There are numerous ways that cloud storage is worth more than its raw capacity. Some of this worth comes from its role in disaster recovery, but a lot of it comes from how it is used as part of an integrated storage stack that incorporates primary, backup, archive and cloud storage.

Speed excites, but how about cool, tight storage?

In 2011, Fusion-io disrupted the enterprise storage industry with its high-performance PCIe flash memory cards. EMC responded this week by announcing its own VFCache product. Suddenly there is a hotly contested race, and it’s up to the rest of the industry to respond.

I love going fast on bikes and skis, but fast is a relative thing. There are always people who will go a lot faster than me, but I don’t need to keep up with them to be happy. The same is true for driving – I drive a 4-cylinder Ford Fusion because I love Sync and I don’t care if it’s not fast.

Cold will be hot

The “good enough for me” principle works for storage too. Most companies have a lot of data that doesn’t need high performance I/O. Server flash products address the hottest data a company has, but what about all the “cool” or “cold” data that is infrequently or never accessed? It typically occupies storage systems that were built to compete based on performance. Even the lowest-performing tiers of most enterprise storage systems significantly over-serve customers by providing service levels far beyond what is needed.  At some point another industry disruption will occur as new products emerge that significantly reduce the cost of storing cool data.

A difficult problem for storage administrators is that there is no way to separate cool data that will be accessed again in the future from cold data that will never be accessed again. One approach is to archive cool data to tape, but the delays and difficulties in locating and restoring cool data when it reheats are not all that comforting. Another approach is to send cool data to an online cloud storage tier provided by enterprise vendors such as Microsoft, Amazon, Rackspace, AT&T, and HP. Cool data in the cloud that reheats is transparently moved back to a local, warmer tier until it cools off again. Data stored in a cloud tier does not require the power, cooling and footprint overhead of data stored in the corporate data center storage and it also reduces the cost and impact of storage system end-of-life events.

"Tight" storage looks good

But cloud storage tiers are not the whole answer. Customers will want to ensure that cool/cold data doesn’t consume any unnecessary storage capacity. Cloud storage products that incorporate the various forms of data reduction technology – thin provisioning, deduplication and compression – will provide the “tightest” fit for this data by running at capacity utilizations that are unheard of with primary enterprise storage today. In addition to saving customers money on storage, these products will also increase return on investment by reducing the bandwidth and transaction costs that some cloud service providers charge. Keeping a tight grip on storage expenses will become synonymous with using the tightest, most efficient cloud-integrated storage systems.
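To see why those data reduction stages matter so much for a pay-per-gigabyte service, note that they compound. The ratios below are illustrative assumptions, not measured figures from any product:

```python
# Back-of-envelope sketch: data reduction stages multiply, so "tight"
# cloud-integrated storage consumes far less paid cloud capacity.
# The 5x dedupe and 2x compression ratios are illustrative assumptions.
def effective_capacity_gb(logical_gb, dedupe_ratio, compress_ratio):
    """Cloud capacity actually consumed after dedupe, then compression."""
    return logical_gb / dedupe_ratio / compress_ratio

logical = 1000.0                               # 1 TB of logical data
stored = effective_capacity_gb(logical, 5.0, 2.0)
print(stored)                                  # 100.0 GB actually stored
print(f"overall reduction: {logical / stored:.0f}x")
```

Since cloud providers bill on the bytes actually stored and transferred, a 10x reduction in consumed capacity translates directly into a 10x reduction in the monthly bill for that data.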