ClearSky is a Boston-based startup founded in 2014 by industry veterans Lazarus Vekiarides and Ellen Rubin. It comes with a unique proposition which, if successful, might revolutionize the way primary storage is consumed. I introduced ClearSky in my previous TFD14 preview article, where I described their solution: the objective is to drastically reduce the Data Center footprint of traditional primary storage by shifting it to the Cloud, while at the same time simplifying DR operations and ensuring accessibility of data from any location. This outcome seemed impossible to achieve because of the strict latency requirements that primary storage inherently carries, but ClearSky has found an elegant and effective solution to this conundrum. There is one caveat, however, and it will become evident in the next paragraph.
Here is how it works: ClearSky delivers a 2U all-flash “edge” caching appliance to your Data Center and connects it to their Global Storage Network through their nearest Point Of Presence (POP) by means of a dual Gigabit dedicated leased line that ClearSky contracts and deploys for you as part of the package. The strict requirement here is that your DC must be at metro distance from the nearest POP, so that latency stays below two milliseconds. The caveat I was referring to is that, as of today, these POPs are available only in key US locations, mostly on the East Coast, although the plan is to expand ClearSky’s presence, first within the USA and then globally. This will be made possible by alliances with local or global service providers that can provide dedicated connectivity and hosting at competitive prices; Akamai recently made a strategic contribution to ClearSky’s Round B funding, which is a clear indication of technological and business synergies to come. The POPs are connected (again with high-speed dedicated links) to a cloud backend where the single, complete, fully protected copy of the customer’s data is stored. Right now, this cloud backend is located within the AWS US East Region, but there is no strict dependency on AWS: any Cloud Provider could perform that role, as long as it can guarantee the same strict availability and durability that AWS ensures to ClearSky. Securing deals with providers who can offer high-speed dedicated connectivity and durable, “infinite” elastic storage will be the key to ClearSky’s global expansion and business sustainability.
Going back to the technical details, the Edge Appliance is designed for full redundancy and connects to compute nodes through the iSCSI or FC protocols. Writes directed to the Edge Appliance actually pass through it and are immediately stored in the main storage at the POP. The Edge therefore acts as a pure read cache hosting “Hot” Data, while the POP hosts both “Hot” and “Warm” Data. Every 10 minutes the data in the POP is synchronized with the Cloud Backend, which ultimately holds the one and only copy of the full data set. ClearSky’s algorithm ensures that the read hit rate on the Edge Appliance is always at least 90% and, if data must be retrieved from the underlying layers, the low-latency links should be able to serve it without major performance degradation. In any case, the local storage capacity of the Edge Appliance can scale up and, if its limit is reached, more appliances (with the associated additional bandwidth) can be deployed. The availability and durability of the data are ensured by the backend cloud providers and guaranteed by the SLA between them and ClearSky.
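To make the write and read paths described above more concrete, here is a minimal toy model in Python. It is only a sketch of the idea, not ClearSky's implementation: the class and method names are hypothetical, the LRU eviction policy and the choice to refresh the edge cache on writes are my own assumptions, and the periodic sync is triggered by hand rather than by a 10-minute timer.

```python
from collections import OrderedDict


class TieredStore:
    """Toy model of the edge / POP / cloud tiers (all names are hypothetical)."""

    def __init__(self, edge_capacity):
        self.edge_capacity = edge_capacity
        self.edge = OrderedDict()  # "hot" data only, used as a read cache
        self.pop = {}              # "hot" + "warm" data, receives every write
        self.cloud = {}            # full data set, refreshed on each sync
        self.dirty = set()         # blocks written to the POP since the last sync

    def write(self, block, data):
        # Write-through: the POP, not the edge, is where the write lands.
        self.pop[block] = data
        self.dirty.add(block)
        self._cache(block, data)   # assumption: keep the edge copy fresh

    def read(self, block):
        # Try the edge first, then the POP, then the cloud backend.
        if block in self.edge:
            self.edge.move_to_end(block)
            return self.edge[block], "edge"
        if block in self.pop:
            data = self.pop[block]
            self._cache(block, data)
            return data, "pop"
        data = self.cloud[block]   # raises KeyError if the block never existed
        self.pop[block] = data
        self._cache(block, data)
        return data, "cloud"

    def sync_to_cloud(self):
        # Stand-in for the periodic POP-to-cloud synchronization.
        for block in self.dirty:
            self.cloud[block] = self.pop[block]
        self.dirty.clear()

    def evict_from_pop(self, block):
        # Only blocks already synced to the cloud may leave the POP.
        if block not in self.dirty:
            self.pop.pop(block, None)

    def _cache(self, block, data):
        # Assumed LRU policy: drop the least recently used block when full.
        self.edge[block] = data
        self.edge.move_to_end(block)
        while len(self.edge) > self.edge_capacity:
            self.edge.popitem(last=False)
```

Calling `read()` returns both the data and the tier that served it, which makes it easy to see when a request falls through from the edge to the POP or to the cloud.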
The beauty of this model is that, if your DR Data Center is within the same metro area (i.e. if it connects to the same POP), there is no need for data replication: your full data set is in the cloud and can be accessed by the compute nodes in the DR site by just pointing the DR Edge appliance to the same POP. This implies an RPO=0 in metro environments and an RTO<1 min, counted from the moment data is requested until it is fetched from the cloud backend. If your DR site is in another geography and a POP exists there, the RPO becomes 10 minutes, because the Primary POP flushes data to the Cloud Backend only at that interval.
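The RPO reasoning above can be checked with a few lines of Python. This is a back-of-the-envelope sketch under simplifying assumptions of my own: syncs happen exactly at multiples of the 10-minute interval and complete instantly, and every function and parameter name is hypothetical.

```python
SYNC_INTERVAL_S = 600  # the 10-minute POP-to-cloud sync interval, in seconds


def last_sync_before(t, interval=SYNC_INTERVAL_S):
    """Time of the most recent sync at or before time t (seconds)."""
    return (t // interval) * interval


def lost_writes(write_times, failure_time, same_pop, interval=SYNC_INTERVAL_S):
    """Write timestamps that a DR site would not see after a failure."""
    if same_pop:
        # The DR edge reads from the same POP, which holds every write: RPO=0.
        return []
    # A remote POP only sees what reached the cloud at the last sync.
    cutoff = last_sync_before(failure_time, interval)
    return [t for t in write_times if cutoff < t <= failure_time]


writes = [30, 590, 610, 1150]
print(lost_writes(writes, failure_time=1199, same_pop=True))
print(lost_writes(writes, failure_time=1199, same_pop=False))
```

With a failure one second before the next sync, the remote DR site misses almost a full interval of writes, which is where the 10-minute worst-case RPO comes from.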
In addition to this, ClearSky announced at TFD14 the availability of a Virtual Edge Appliance (as an Amazon instance), which would allow a customer to deploy a fully virtualized DR site in the cloud. The economies of scale coming from the “single copy, accessible anywhere” model are immediately evident.
Another one of the new capabilities announced by ClearSky at TFD14 is that the Cloud Backend can be leveraged to perform scheduled backups, which consequently are stored off-site by default. The architectural complexities of a fully available, redundant and protected legacy infrastructure are removed or, better said, hidden inside the Cloud by ClearSky.
ClearSky also mentioned a few very cool features at TFD14, some already available and one on their roadmap (Containers support). Let’s start with the immediately available ones: with end-to-end encryption, data is encrypted at rest and in flight using keys under the complete control of the customer. This, together with the adoption of TPM technologies within the appliance, ensures that data remains inaccessible to third parties even if the Edge appliance is stolen from the customer’s premises. The other cool feature is the integration with VMware vSphere: besides “speaking” VAAI and VASA, ClearSky plugs seamlessly into the vSphere Web Client to manage backups, which are of course written to the Cloud Backend. The VMware demo is worth a look on the TFD website. Finally, ClearSky mentioned future support for Containers, as they are working with Docker and Kubernetes to provide persistent storage to containerized apps within ClearSky’s model.
My verdict is simple: I believe that ClearSky’s solution can really make a difference in a storage market that, despite all the innovations from new players (Nimble, Pure, etc.), is still bound to a traditional, on-premises architectural approach. For ClearSky to succeed, they have to strike deals with connectivity and hosting partners to bring their solution to Global Customers. Whether they succeed is not a technological matter, but rather one of business model sustainability. I wish them a breakthrough and I hope to write more about ClearSky in the future.
Disclaimer: I was invited to Tech Field Day 14 by Gestalt IT, who paid for travel, hotel, meals and transportation. I did not receive any compensation to attend TFD and I am under no obligation whatsoever to write any content related to TFD. The contents of these blog posts represent my personal opinions about the products and solutions presented during TFD14.