
Multi-Cloud and Data Replication Over a Wide Area Network: A High Interest Topic At SpringOne Platform

Interest in multi-cloud was evident at SpringOne Platform this year, with frequent mentions and two well-attended sessions dedicated solely to the topic. This is consistent with market data from industry analysts, which also points to strong interest in multi-cloud. Things are moving fast: not long ago we were talking about cloud-first strategies, and now we’re already hearing about multi-cloud-first strategies. This is where WAN replication comes in. It maintains copies of data across multiple clouds that are geographically separated, and updates to the data are replicated asynchronously, so the copies are eventually consistent.

First up was the Apache Geode™ Summit, a half-day pre-conference event featuring use cases and best practices related to this in-memory data grid technology. We had a great agenda, and I couldn’t think of a better way to set the tone for the conference. The room was so packed that we had to ask people to step out for 10 minutes so we could add more chairs.

Multi-cloud was the focus of several sessions at the Summit, such as “Design Patterns Facilitated by Geode's WAN Distribution,” presented by Pivotal’s Helena Bales, Diane Hardman, and Karen Miller, and “It’s a Multi-Cloud World, But What About the Data?” presented by Pivotal’s Pulkit Chandra and Nikhil Chandrappa.

Interest in this topic stems from some important business requirements:

User Experience (Meeting Response Time and Quality of Service Requirements): Response time to user requests is highly dependent on network latency. If user requests have to travel far, response time and system performance will not be good enough for a seamless user experience, resulting in a significant drop-off in users. Replicating application instances and data across geographically separated data centers makes it possible to respond to user requests from a nearby data center and meet the quality of service requirements.

High Availability and Disaster Recovery: If an outage impacts an entire data center, the only recourse is to route traffic to another geographically separated data center. This requires application instances and data to be in both places.

Regulatory Compliance: Some industries, like banking, have regulations that require data to be protected by maintaining copies over a specified distance.

As the presenters at these sessions explained, this capability is delivered via our caching products—Pivotal GemFire and Pivotal Cloud Cache. Both products are based on open source Apache Geode, and WAN replication is an integral part of Geode. I recently published a blog post explaining the relationship between these products.

At the Geode Summit, Helena, Diane, and Karen walked through various design patterns facilitated by Geode’s WAN distribution. These design patterns include blue-green disaster recovery; active-active; command query responsibility segregation (CQRS); hub and spoke, or star, topology; and the follow-the-sun pattern. Here’s a recording of a demo that Diane did, showing how to set up an active-active WAN topology. We’ll post a recording of the full session in the next few days.
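To make the active-active idea a bit more concrete, here is a toy Python sketch of two sites that each accept writes locally and ship them to the other side asynchronously. It is not Geode code and does not use any Geode API. The `Site` class, its method names, and the simple "higher version wins" rule are all assumptions made up for illustration, and the sketch does not try to describe how Geode actually resolves conflicting updates.

```python
from collections import deque


class Site:
    """Hypothetical stand-in for one cluster in an active-active WAN topology."""

    def __init__(self, name):
        self.name = name
        self.data = {}            # key -> (version, value)
        self.outbound = deque()   # updates waiting to cross the WAN
        self.peers = []           # remote sites that receive our updates

    def put(self, key, value, version):
        """Apply a write locally right away, then queue it for remote sites."""
        self._apply(key, value, version)
        self.outbound.append((key, value, version))

    def _apply(self, key, value, version):
        # Assumed rule for this sketch: keep an update only if it is newer.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def flush(self):
        """Ship queued updates to every peer (the asynchronous step)."""
        while self.outbound:
            key, value, version = self.outbound.popleft()
            for peer in self.peers:
                peer._apply(key, value, version)

    def get(self, key):
        entry = self.data.get(key)
        return None if entry is None else entry[1]


# Two sites, each sending to the other.
east, west = Site("east"), Site("west")
east.peers.append(west)
west.peers.append(east)

east.put("cart:42", "1 item", version=1)
west.put("cart:7", "3 items", version=1)

# Before the queues drain, west has not seen east's write yet.
stale = west.get("cart:42")

east.flush()
west.flush()

# Once both queues have drained, the two copies hold the same entries.
converged = east.data == west.data
```

The window between `put` and `flush` is the "eventually" in eventually consistent: each site answers from its own copy in the meantime, which is what lets nearby users get fast responses.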

Pulkit and Nikhil focused on how WAN distribution works on Pivotal Cloud Foundry (PCF), using the Pivotal Cloud Cache (PCC) service. They described the availability problem by showcasing some high-profile service outages that resulted in severe business impact. Pulkit and Nikhil then covered how PCC handles failures by sharding data so that a failure of one node in a cluster does not disrupt availability or result in data loss. They also described how disk persistence provides a recovery mechanism even when an entire cluster fails. Both of these failure recovery mechanisms are also available with Geode and GemFire. In the case of PCC, the system gets a major assist from the platform, via BOSH, which monitors nodes and spins up a new node if one fails. The data store is re-attached to the new VM to maintain access to the data. Recently, my colleague Mike Stolz wrote a blog post about how PCC supports availability in a way that minimizes the trade-offs related to the CAP theorem. PCF also makes it very easy for developers to self-serve the provisioning of their clusters.
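For readers who want to see why a single node failure need not lose data, here is a small Python sketch of sharding with one redundant copy. It assumes every key is stored on two different nodes, which is an assumption of this sketch rather than a statement about how PCC is configured. The `Cluster` class, `bucket_for` helper, node names, and bucket count are all hypothetical, and disk persistence and BOSH-driven recovery are left out.

```python
import zlib

NUM_BUCKETS = 8  # arbitrary bucket count chosen for this sketch


def bucket_for(key):
    """Map a key to a bucket using a stable hash (crc32)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_BUCKETS


class Cluster:
    """Hypothetical cluster that keeps each key on two different nodes."""

    def __init__(self, node_names):
        self.names = list(node_names)
        self.stores = {name: {} for name in self.names}  # node -> {key: value}
        self.alive = set(self.names)

    def _hosts(self, key):
        # Primary and redundant copy land on neighbouring nodes, so they
        # are always distinct as long as there are at least two nodes.
        b = bucket_for(key)
        primary = self.names[b % len(self.names)]
        secondary = self.names[(b + 1) % len(self.names)]
        return primary, secondary

    def put(self, key, value):
        for node in self._hosts(key):
            if node in self.alive:
                self.stores[node][key] = value

    def get(self, key):
        # Read from the first host that is still up.
        for node in self._hosts(key):
            if node in self.alive:
                return self.stores[node].get(key)
        raise RuntimeError("all copies of this key are offline")

    def fail(self, node):
        """Simulate losing a node and everything it held in memory."""
        self.alive.discard(node)
        self.stores[node] = {}


cluster = Cluster(["node-a", "node-b", "node-c"])
keys = [f"order:{i}" for i in range(20)]
for k in keys:
    cluster.put(k, k.upper())

cluster.fail("node-b")

# Every key is still readable from its surviving copy.
survived = all(cluster.get(k) == k.upper() for k in keys)
```

In this toy version nothing restores the lost copy afterward, so a second failure could take out the remaining copy of some keys. That gap is the kind of thing the platform assist and disk persistence described above are meant to cover.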

This topic is currently on fire, but we’ve actually been at it for a while. These caching products have supported WAN replication for years, and we’ve featured it on several occasions. A few months ago, I hosted a multi-cloud webinar with Mike Stolz and industry analyst Mike Gualtieri from Forrester entitled “Overcoming Data Gravity in Multi-Cloud Data Architectures.” This webinar goes into the business drivers, challenges, and application design thinking for multi-cloud architectures. Mike Gualtieri presented research data associated with multi-cloud. Also, Mike Stolz literally wrote the book on GemFire, dedicating an entire chapter to this topic.

When it comes to wrangling data in a multi-cloud environment, we’ve got your back with Apache Geode, Pivotal GemFire, and Pivotal Cloud Cache. We’ll also try our best to get better at forecasting attendance at our conference sessions so that we don’t have to play musical chairs to get everybody seated.