I talk to me about HPE

(This blog post was also published on LinkedIn)

Let’s get the disclaimers out of the way. HPE paid for my travel, lodging and food to attend HPE Discover 2016 earlier this month. That’s it (besides my having been an employee there in the past); they aren’t paying me for my opinions, writings, rants or videos.

I see a lot of good things going on at HPE. They say they deliver solutions to help customers’ IT organizations become more agile, and they appear to be eating their own dog food. For example, after they tried to compete in the public cloud business, they decided to focus instead on producing their excellent Helion private cloud management software. They also adopted flash technology in their flagship 3PAR storage product line in a very effective, straightforward way, as opposed to creating confusing and diverging product lines the way EMC and NetApp did. It appears they are even starting to figure out how to leverage the technology and team from their disastrous Autonomy acquisition to develop enterprise software for Big Data and IoT. In short, they seem to have figured out the markets that are important to them, the products they can sell today and the investments they need to make to compete in the future.

The pendulum of focus appears to have swung to the technology and product side of the business and away from the marketing side. This was necessary, but HPE also needs to figure out how to communicate effectively about their technology and products, which is not easy for a company suffering from branding/naming confusion. Names are a tough challenge for many large IT vendors, and HPE often struggles by inflicting unfortunate names on good technology – including flagship tech like “Composable Infrastructure”. The word composable does not mean anything to anybody, and dictionary definitions shed no light whatsoever on what HPE is trying to communicate. This does not create a sense of mystery and capability as much as a sense of baloney. Ambiguity will not help HPE get where they want to go, and they need all the clarity they can get.

HPE is not back to where they used to be because there really isn’t a “back there” anymore – the world has moved on and HPE is much better situated to pursue enterprise technology opportunities than HP ever was.

Fun Times in LinuxLand at SCaLE14x: Food, Drinks, Deadpool T-shirts & More!

The SCaLE14x conference begins tomorrow in Pasadena. Seeing as how I’ve been a Windows user all my life, there is a lot going on there that I don’t know anything about. I guess that’s what they call a learning opportunity.

But I do know a little bit about fun (although I am a Windows user), and one thing going on near the conference is happening Saturday night, January 23rd, at the Kings Row Gastropub. Starting at 7PM and ending sometime later, Datera (the company I work for) is hosting the world’s first #LIOBeers. LIO refers to Linux IO, especially Linux SCSI IO. The beers part of LIOBeers is self-evident. You don’t have to drink beer if you don’t want to; it’s just a name.

One of Datera’s founders, Nick Bellinger, who did a lot of work on LIO and maintains it, will be there. You can ask him anything about LIO or take the conversation elsewhere; he’s a lot of fun to hang with. In addition, Mike Perez, who works for the OpenStack Foundation herding cats (Cross Project Development Coordinator) and was a Cinder Project Technical Lead, will also be there. He also knows a lot about cats – the animals – which you know already if you follow him on Twitter @Thingee.

And there’s more!! We will give away 10 Deadpool t-shirts to the first 10 peeps that show up! I am modeling one of them in the photo on the right, where the shirt is obviously transferring super powers of some sort to me. (disclaimer: the powers you get from these shirts may or may not be super. Datera takes no responsibility for the effects transmitted.)

If you want to know about Linux IO and OpenStack IO, the Kings Row Pub will be the place to be Saturday. You don’t need to attend SCaLE14x to attend our event at the Kings Row. Datera is buying, burning some of our investors’ money. See you there.

Rubrik in the clear: clusters, clouds and automation for data protection

When I posted about Rubrik back at the end of March, I didn’t expect it to be Part 1 of a 2-part blog. But that’s blogging for you – and today Rubrik announced its product and a big B-round of funding ($41M). I also know a lot more about Rubrik, the people behind the company and their technology, so I can put together a better picture of them now.

For starters, I think they are doing something very big and executing very well. The bigness is a matter of eliminating data protection complexity with automation and connectivity, and the execution is a matter of sophisticated integration that does not over-reach. Unlike StorSimple, where I worked prior to Quaddra, Rubrik’s solution is not primary storage and doesn’t depend on customers migrating data from where it is to Rubrik’s platform. Managing data in place is a big deal because it allows customers to try it without first disrupting their environment. The combination of doing something big and doing it well gives Rubrik a chance to change the technology landscape. If it catches on, everybody else in the data protection business will be forced to do something similar to keep their customers, except it’s not as simple as adding a feature, like dedupe – it will require significant architectural changes.

The Rubrik solution is composed of some number of nodes residing on a local network, where they discover each other automatically and gain access to a vSphere environment through a vCenter login. It uses VMware APIs both to get the configuration information needed to operate and to make backup copies of VM data. Their use of VMware APIs is discussed briefly in a blog post by Cormac Hogan. In addition, Rubrik is also developing their own API so it can someday be operated under the control of whatever management system a customer wants to use. This is exactly the technology approach business managers want for controlling their organization’s storage costs. It is designed to be a good interoperator. (I know, interoperator is not a real word, but shouldn’t it be?) It’s an interesting thing to ponder: Rubrik’s system software replacing backup software, but Rubrik’s software becoming transparent under the operation of some other orchestrator or software-controlled something-or-other.
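Rubrik hasn’t published code for this integration, but the basic pattern of logging into vCenter and walking the VM inventory is easy to illustrate. The sketch below uses pyVmomi with placeholder hostnames and credentials; it is only an approximation of the discovery step, not Rubrik’s implementation.

```python
# Illustrative only: enumerate VMs through a vCenter login with pyVmomi.
# Hostname and credentials are placeholders, not anything Rubrik ships.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

ctx = ssl._create_unverified_context()   # lab shortcut; use real certs in production
si = SmartConnect(host="vcenter.example.com", user="backup-svc",
                  pwd="********", sslContext=ctx)
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    for vm in view.view:
        # A backup product would record these details and then drive
        # snapshot/changed-block APIs per VM; here we just list what was found.
        print(vm.name, vm.summary.config.guestFullName,
              vm.summary.storage.committed)
finally:
    Disconnect(si)
```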

Rubrik’s cluster nodes have local capacity for protecting data as well as software that manages all operations, including using cloud storage for tertiary storage. Data that is copied to the Rubrik cluster is deduped upon arrival. Considering that they are copying data from VMs, which usually share many of the same operating system files, they will likely get decent dedupe ratios. Data placement and retention policies for secondary and tertiary storage are managed through a web interface. There are several editable backup retention templates that can be used to approximate a customer’s own data retention policies. Rubrik will likely learn a lot more about data retention than they expect in the years to come from customers who want all possible options. Nonetheless, they have done a decent job coming up with a base set.
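Rubrik hasn’t detailed its dedupe engine, but deduplicating on ingest boils down to fingerprinting incoming chunks and storing only the ones that haven’t been seen before. Here is a toy, fixed-size-chunk sketch of that idea; real engines use variable-length chunking, persistent indexes and reference counting.

```python
# Toy illustration of inline (on-ingest) dedupe with fixed-size chunks.
# Not Rubrik's implementation; just the basic idea behind the capacity savings.
import hashlib

CHUNK_SIZE = 64 * 1024
chunk_store = {}                     # fingerprint -> chunk bytes (stand-in for disk)

def ingest(stream):
    """Store a stream and return the fingerprints needed to rebuild it."""
    recipe = []
    while True:
        chunk = stream.read(CHUNK_SIZE)
        if not chunk:
            break
        fp = hashlib.sha256(chunk).hexdigest()
        if fp not in chunk_store:    # only never-before-seen data costs capacity
            chunk_store[fp] = chunk
        recipe.append(fp)
    return recipe

# Two VM images built from the same OS template share most of their chunks,
# so ingesting the second one adds very little to chunk_store.
```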

Rubrik’s policies include settings for when data can be moved to the cloud and when it should expire. Initially they support Amazon S3, but they indicate other options are in the works. I believe Rubrik’s solution mandates cloud storage and a cloud storage account, but there can be exceptions to every rule. The important thing to recognize here is that policies for data snapshots can be extended in ways that on-array snapshots can’t, simply due to the enormous capacity available in the cloud. There are still going to be bandwidth issues to deal with in getting data to the cloud, but Rubrik clusters are scale-out systems with their own “expanding bag” of storage hitched to a powerful dedupe engine. As newcomers to storage, they seem to have figured out a few things pretty well.
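Rubrik hasn’t published the mechanics of these policies, so the following is only a sketch of how an age-based tier-and-expire rule might be expressed; the class and field names are hypothetical, not Rubrik’s API.

```python
# Hypothetical age-based snapshot policy: keep recent snapshots on the local
# cluster, push older ones to object storage (e.g., an S3 bucket), expire the rest.
# Names and thresholds are made up for illustration.
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class RetentionPolicy:
    keep_local: timedelta = timedelta(days=30)
    keep_in_cloud: timedelta = timedelta(days=365 * 3)

def placement(snapshot_time: datetime, policy: RetentionPolicy) -> str:
    age = datetime.utcnow() - snapshot_time
    if age <= policy.keep_local:
        return "local"     # serve restores from the cluster's own capacity
    if age <= policy.keep_in_cloud:
        return "cloud"     # archive to the configured cloud storage account
    return "expire"        # past retention; reclaim the space

print(placement(datetime.utcnow() - timedelta(days=90), RetentionPolicy()))  # -> "cloud"
```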

Rubrik has come up with interesting recovery/restore capabilities too. Rubrik contends that a primary array shouldn’t have to provide snapshot volumes when they could be mounted instead from the Rubrik backup cluster. They point out that there is no reason to restore secondary volumes to primary storage when they could be more easily and quickly mounted from the Rubrik system. In addition, volume clones can also be created in the Rubrik cluster and mounted by whatever VMs need to access the data. There is a bit of a throwback here to EMC’s TimeFinder, and I expect that Rubrik will find customers that justify purchasing their solution in order to simplify making and managing clones for DevTest. FWIW, I don’t see Rubrik’s volume restore capabilities working for large-scale DR scenarios because of the massive resources needed by large-scale recovery efforts. This is an area Rubrik will be educated about by their customers in the next couple of years.

Rubrik talks about their technology as consolidated data protection, which seems apt to me because I believe they have a chance to permanently alter the worlds of backup, snapshots and recovery. But that’s a tall order, and there are some red flags to consider as they make their way in the market: 1) Rubrik’s solution appears to be a vSphere-only solution, which will exclude them from some number of opportunities; 2) Rubrik has now raised $51M, which has its advantages, but it also means they need to start having sizable successes sometime too; 3) the weather forecast calls for uncertain storms when things don’t work quite as advertised or when – worse – a customer loses data; it’s not impossible or unthinkable – it’s the storage business; 4) combined with #3, they will have to learn a whole lot of things they don’t know yet about storage customers and the number of practices they have that can complicate otherwise “perfect” technology.

 

Quaddracomix: Maybe you should take the ice bucket challenge

Honey badgers are busy

Starting up at Quaddra Software

There’s something about working for a startup that gets my motor running. There’s more at stake (more thrills) and more freedom to do things that don’t fly in large organizations. So here I am, back at a startup, Quaddra Software, after a year and a half with Microsoft. Microsoft was a good place to be an employee, but I’m a “journey-guy” and am more interested in overcoming new obstacles than repeating old ones.

Quaddra is developing highly scalable, high-performance file analytics software that leverages open source software with a pluggable architecture for adding new functionality. I really like the potential of this technology because its value comes from creating intelligence from opaque data, and creative customers always find interesting new ways to use intelligence.

Timing matters

Beyond the concerted efforts of talented people working together, two of the most important elements of a startup’s success are luck and timing. A startup always needs as much luck as possible and should do all it can to create its own. Timing is one of those cruel things a startup can’t do much about. If it’s too early, the company has to create a market by itself without an ecosystem to support it, and the odds are very good that the company will run out of money and go out of business. On the other hand, if the startup is too late, other companies’ products get the best opportunities and the startup suffers from slow growth and lower margins – a slower path to going out of business, or becoming one of those zombie companies that is not really alive but not yet dead either.

The best scenario for a startup is to have technology siblings that compete and grow together, increasing awareness of the solution set and expanding the market much faster than either could on its own. It’s that old strange math where 1+1=3. This go-round, it appears Quaddra will be sharing a room with a company populated by friends from my EqualLogic days: DataGravity. From reading their blog and seeing their public announcement today in the Wall Street Journal, it appears we are on very similar vectors. I have a lot of admiration for Paula Long, John Joseph and the rest, and I hope they are wildly successful and that Quaddra competes with them in the file analytics business for a long time.

Applications for Quaddra Software

IT teams that know a lot more about their company’s unstructured data can control it, manage it and generate reports about it.

Control – Quaddra’s software scans and searches unstructured data in place where it resides and builds independent indices that system administrators can search to identify and tag files for various actions. For instance, certain media file types that are not part of the normal corporate work streams could be identified as non-work files and removed from corporate storage. Likewise, files with sensitive data could be found in locations where they might pose a risk of loss or leakage.

Manage – Quaddra’s software can copy, migrate and archive data from its current location to virtually any secondary store, including cloud object storage. Files can be identified using many different criteria and acted on according to management policies that safely store historical files and make free space available on primary storage.

Report – The results of searches can be used to generate reports to analyze and share with managers and co-workers to make informed decisions. For instance, files can be analyzed by their access dates and correlated with the amount of capacity they consume to determine candidates for migration to archival storage.
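Quaddra hasn’t published its internals, but the age-and-capacity report described above comes down to walking a file tree, indexing metadata in place and aggregating it. Here is a minimal stand-in for that flow; the path and the one-year threshold are made up for illustration.

```python
# Minimal stand-in for an age/capacity report over unstructured data:
# walk a tree, record size and last-access time, then total up how much
# capacity is held by files untouched for more than a year.
import os, time
from collections import defaultdict

def scan(root):
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue                    # skip files we can't stat
            yield path, st.st_size, st.st_atime

def age_report(root, stale_days=365):
    cutoff = time.time() - stale_days * 86400
    totals = defaultdict(int)
    for _path, size, atime in scan(root):
        totals["stale" if atime < cutoff else "active"] += size
    return dict(totals)

print(age_report("/mnt/projects"))          # e.g. {'stale': ..., 'active': ...}
```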

If you want to talk to us about developing intelligence for your opaque, unstructured data, please send an email to: info@quaddra-sw.com

The new world of DR: cloud-integration

When most people think about disaster recovery they automatically assume it requires a complicated configuration with replicated data on redundant storage systems in two locations some distance apart from each other.  There are many details to pay attention to, including storage performance, network performance, available bandwidth and data growth. It costs a lot and takes a long time to implement.

But more and more customers are discovering that it doesn’t have to be that way. Just as cloud technology is changing how developers think about structuring and deploying applications, it is also changing the face of business continuity.

One of the biggest ways DR is changing with cloud technology is by removing the requirement for a separate DR site with all its networking, storage and server equipment. Customers are starting to realize instead that backup and recovery data can be automatically stored at one or more cloud storage service providers, such as AWS, Azure, EMC Atmos, Google, HP, Nirvanix and Rackspace. Using the cloud for DR provides the following key benefits:

  1. Transfers infrastructure costs to cloud service providers
  2. Facilitates DR testing and validation
  3. Eliminates physical tapes and tape management
  4. Provides flexibility for the recovery location
  5. Centralizes DR storage from multiple sites, including ROBOs
  6. Improves RTO
  7. Enables recovery-in-cloud

StorSimple makes cloud-integrated enterprise storage that does all of these things by automating data protection between on-premises storage and cloud storage services.

Transfer infrastructure costs

Equipment and resources for DR have costs with a very small chance of generating a return on the investment. There is no point in owning resources such as storage, networking, servers, racks, power and cabling that you hope to never use. Clearly, the cloud mantra of paying only for what is used applies here.  Don’t overpay for insurance.

Facilitate testing

Of course everything has to work when you need it to. The interesting thing about cloud DR is that it is even easier to test and validate than traditional DR because it can be done without interrupting production systems. Many of our customers at StorSimple cite this as a very important benefit.

Eliminate tapes

One of the worst parts of any recovery operation is anything and everything involving tapes. Naming tapes, loading tapes, unloading tapes, moving tapes, retensioning tapes, copying tapes, deleting tapes, disposing of tapes, and all things tape-related. They aren’t needed with cloud DR.

Recovery location flexibility

Cloud-based recovery can happen at any site with a reasonably good Internet connection. Moreover, it can happen at multiple sites, which means it is easier to make contingency plans for multiple-site complications as well as being able to spread the recovery load over more resources.

Centralize DR storage

Another aspect of location flexibility with DR is the ability for companies to store DR data in the cloud from many sites or remote branch offices (ROBOs). While each site or branch office will have a unique URL to store its data, access to this data is centralized in the cloud, where it can all be easily reached from a single Internet connection in the primary data center. In other words, the DR data from any ROBO can be instantly accessed at headquarters.

Improve RTO

The data that is needed to resume operations after a disaster can be limited to only the data that is needed by applications – as opposed to downloading multiple tape images in full and restoring data from them. This can save weeks during a large-scale recovery. Data that is not needed immediately does not consume any bandwidth or other resources that would interfere with the restore process. This approach to DR uses a concept called “the working set”, which is the collection of data that applications are actively using. Working-set-based DR is the most efficient way to recover data.
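StorSimple’s actual working-set logic isn’t spelled out here, but the prioritization idea is simple to illustrate: bring back what was recently in use first and let cold data stream back later or on demand. The catalog structure and dates below are made up for illustration.

```python
# Toy illustration of working-set-first recovery: restore the most recently
# accessed volumes immediately; fetch everything else lazily so it never
# competes for bandwidth with the urgent restores. Data here is invented.
from datetime import datetime, timedelta

catalog = [
    {"volume": "sql-data",  "last_access": datetime(2013, 6, 1),  "gb": 200},
    {"volume": "archive01", "last_access": datetime(2011, 2, 9),  "gb": 4000},
    {"volume": "home-dirs", "last_access": datetime(2013, 5, 28), "gb": 800},
]

def recovery_plan(catalog, window=timedelta(days=30), now=datetime(2013, 6, 2)):
    hot  = [v for v in catalog if now - v["last_access"] <= window]
    cold = [v for v in catalog if now - v["last_access"] > window]
    # Restore hot volumes newest-first; cold volumes wait until requested.
    return sorted(hot, key=lambda v: v["last_access"], reverse=True), cold

hot, cold = recovery_plan(catalog)
print([v["volume"] for v in hot], [v["volume"] for v in cold])
# ['sql-data', 'home-dirs'] ['archive01']
```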

Recovery in-cloud

Related to recovery flexibility is the ability to resume operations in the cloud by using cloud compute services. In this case, the DR data stays in the cloud, where it is accessed by cloud-resident applications. Application users connect to the application through a connection to their cloud service provider. The data that stays in the cloud needs to be presented to the application in its usual fashion – as a file share, for instance.