Data is one of the most fluid elements in the world, and it continues to change and develop at a rapid pace. This presents a number of rich challenges and practicalities that need to be taken into account, such as how to get the data effectively to your applications and ultimately into your customers’ hands, or how to move it between different database types so that you can make the best use of the different environments.
Last week we announced VMware Continuent 4.0, an important release not just because it supports these key types of deployment, but also because for the first time cluster and replicator deployments can be mixed and matched, so that if you want to replicate directly from your transactional cluster out to your data warehouse, you can do so in one complete step.
All of these solutions make use of a powerful central replication engine, which supports both statement- and row-based data transfer, filtering, modification, and multiple source and target databases. Although it’s a powerful system, its basic structure and layout are very simple.
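To make the idea of filtering and modification concrete, here is a minimal sketch of a replication pipeline in Python. It is not the product's code or API: the Event fields and the drop_schema, rename_schema and run_pipeline names are all hypothetical, chosen only to show how extracted changes might pass through a chain of stages before being applied to a target.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, Iterator, Optional, Sequence


@dataclass(frozen=True)
class Event:
    """One replicated change (hypothetical shape, for illustration only)."""
    seqno: int   # position of the change in the source log
    schema: str
    table: str
    row: dict


# A stage returns the (possibly modified) event, or None to drop it.
Stage = Callable[[Event], Optional[Event]]


def drop_schema(name: str) -> Stage:
    """Filter: discard every event that belongs to the given schema."""
    return lambda ev: None if ev.schema == name else ev


def rename_schema(old: str, new: str) -> Stage:
    """Modification: rewrite the schema name on matching events."""
    return lambda ev: replace(ev, schema=new) if ev.schema == old else ev


def run_pipeline(events: Iterable[Event], stages: Sequence[Stage]) -> Iterator[Event]:
    """Pass each extracted event through the stages in order."""
    for ev in events:
        for stage in stages:
            ev = stage(ev)
            if ev is None:
                break          # a filter dropped the event
        else:
            yield ev           # the event survived every stage
```

Feeding three events through [drop_schema("scratch"), rename_schema("app", "warehouse")] would drop the one in the scratch schema and deliver the other two, in order, with their schema rewritten.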
The clustering models also use a manager and connector component to provide cluster management and proxy services.
We’ll build up to that, and start by looking at some of the key deployment topologies we have available.
Simple DR Deployment
Using VMware Continuent for Disaster Recovery gives you a simplified option for providing a full replication service to an offsite server or datacenter, including vCloud Air, to ensure that your data is safe in the event of a failure. The high availability (HA) option works for single MySQL or Oracle data servers. In the example below, we’re providing the DR functionality for an entire cluster in San Jose to New York.
In this scenario, we’re making use of the replication component to provide replication between the two clusters. VMware Continuent replication is asynchronous, and also works off the MySQL binary log, which means a database daemon failure does not prevent us from reading the binlog and applying transactions.
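As a rough illustration of what asynchronous, log-based replication means, the toy model below has a master that appends every commit to a log and a replica that applies the log later, tracking how far it has got. The Master and Replica classes are hypothetical and hold data in memory; the real replicator reads the MySQL binary log, which this sketch only imitates with a Python list.

```python
class Master:
    """Commits locally and records each change in a log."""

    def __init__(self):
        self.data = {}
        self.log = []  # stand-in for the binary log: (seqno, key, value)

    def commit(self, key, value):
        # The commit returns immediately; it never waits for a replica.
        self.data[key] = value
        self.log.append((len(self.log) + 1, key, value))


class Replica:
    """Applies logged changes later, remembering its position."""

    def __init__(self):
        self.data = {}
        self.applied_seqno = 0  # last log position applied

    def catch_up(self, log):
        # Sequence numbers start at 1 and are contiguous, so slicing from
        # applied_seqno yields exactly the entries not yet applied.
        for seqno, key, value in log[self.applied_seqno:]:
            self.data[key] = value
            self.applied_seqno = seqno
```

Because the replica works only from the log and its own position, it can fall behind and catch up at any time, and it needs nothing from the master process beyond access to the log itself.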
A management layer enables the services to be monitored, handles switches and failover within the clusters, and lets you manually fail over between sites transparently.
MySQL Cluster Deployment
A MySQL Cluster is a more typical deployment, and enables you to configure a group of MySQL servers in a master/slave architecture, supporting scale-out for your MySQL database needs and providing a management and connector interface. The cluster deployment is incredibly powerful, and as before, the manager handles the cluster and provides the functionality to start, stop and pause the replication and database services, and to handle switches and failover. The manager is also responsible for changes of topology, for example adding nodes to provide more read capacity, or a planned or automatic failover. In each case the manager handles the switch process, and also tells the connector about the change of topology.
The connector, meanwhile, is responsible for providing your applications with connectivity to the underlying database servers. The connector acts as a transparent proxy, routing requests from your application servers to the right server underneath, sending writes to your master and reads to your slaves. It keeps in constant touch with the manager, so when the master fails over and is switched by the manager, or when you deliberately take a server out of the cluster for maintenance, the connector just reroutes the data to the correct server.
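The interplay between manager and connector can be sketched in a few lines of Python. This is a simplified model, not the product's implementation: the Connector and Manager classes, the set_topology and failover methods, and the rule of treating any statement that starts with SELECT as a read are all assumptions made for illustration.

```python
import itertools


class Connector:
    """Routes writes to the master and spreads reads over the slaves."""

    def __init__(self, master, slaves):
        self.set_topology(master, slaves)

    def set_topology(self, master, slaves):
        # Called whenever the manager reports a change in the cluster.
        self.master = master
        self.slaves = list(slaves)
        # With no slaves left, reads fall back to the master.
        self._next_reader = itertools.cycle(self.slaves or [master])

    def route(self, sql):
        # Assumption: a statement beginning with SELECT is a read.
        is_read = sql.lstrip().upper().startswith("SELECT")
        return next(self._next_reader) if is_read else self.master


class Manager:
    """Handles failover and tells the connector about the new topology."""

    def __init__(self, connector):
        self.connector = connector

    def failover(self):
        # Promote the first slave; the remaining slaves keep serving reads.
        slaves = self.connector.slaves
        if not slaves:
            raise RuntimeError("no slave available to promote")
        new_master, *remaining = slaves
        self.connector.set_topology(new_master, remaining)
```

With a master db1 and slaves db2 and db3, writes route to db1 and reads alternate between db2 and db3. After Manager.failover(), writes route to db2 and reads to db3, and the application code calling route() does not change.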
The connector is completely transparent, and can either speak the native MySQL protocol or operate in what is called bridge mode and provide routing connectivity. Either way, your applications do not need to be modified in any way. In fact, it’s best to think of the cluster as a single database – and let VMware Continuent worry about how the cluster operates and works underneath.
Because the connector provides this transparent interface to the cluster, in some situations the deployment can be implemented without any change to your running servers.
Multi-Site Multi-Master Deployment
Many companies want more than just clustering within a single site. They need to be able to have multiple datacenters around the world that are notionally close to their users, and therefore able to support the workload. But all the datacenters need the same data. This is the role of the Multi-Site Multi-Master (MSMM) deployment; it combines clustering and bi-directional replication to enable local load balancing with global data sharing.
At each of the local sites sits a cluster handling the local load, while the replication between sites keeps them all in sync. This geographically distributed cluster model is already used by many customers to run and manage their operation. It’s also important to keep in mind that the same HA and failover rules apply in this model. We can fail over between sites in the event of a local problem and recover as necessary.
VMware Continuent for replication supports replication into and out of Oracle. You can replicate data from Oracle to MySQL, from Oracle to Oracle for the purposes of DR, and from MySQL to Oracle.
There are a myriad of different uses for this model; some customers use it to distribute load from Oracle into MySQL for supporting web applications. It can also be used to concentrate data from multiple MySQL installations into Oracle for analysis.
Replication into a Data Warehouse
Transactional data stores are great, quite unsurprisingly, for transactional data. But what if you have more complex, or more extreme amounts of data that need to be queried and handled? This is where data warehouses come in. Whether you are capturing all of your data, concentrating the data from multiple databases, or just collecting it into a system for easier analysis, VMware Continuent for Analytics and Big Data can help. With 4.0, we can also do it in one configuration from a cluster into your chosen data warehouse.
Unlike traditional ETL tools, VMware Continuent makes use of the replication technology to stream live updates over to the data warehouse. We don’t do big extractions of data, even periodically; instead, we stream the live changes over to the target database.
This provides a stream of change data – you can either use that directly, and look for changes to data over time, or you can use it to create carbon copies of the data. That gives us a lot of flexibility in how we process the information.
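The two uses of a change stream can be shown with a small Python sketch. The (operation, key, row) tuple format and the materialize and history functions are hypothetical, picked for illustration; they do not describe the format the replicator actually delivers to a data warehouse.

```python
def materialize(changes):
    """Replay a change stream to build a current copy of the table."""
    table = {}
    for op, key, row in changes:
        if op == "delete":
            table.pop(key, None)
        else:  # "insert" or "update" both leave the latest row in place
            table[key] = row
    return table


def history(changes, key):
    """Return every change recorded for one primary key, in order."""
    return [(op, row) for op, k, row in changes if k == key]


# A short, made-up stream of changes to an orders table.
changes = [
    ("insert", 1, {"status": "new"}),
    ("insert", 2, {"status": "new"}),
    ("update", 1, {"status": "shipped"}),
    ("delete", 2, None),
]
```

Here materialize(changes) yields the carbon copy, a table holding only order 1 in its shipped state, while history(changes, 1) keeps both the insert and the later update, so the change over time stays available for analysis.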
The replicator, connector, and manager components work together to provide a very flexible and manageable system. With a cluster, DR or MSMM deployment, you have the power of a cluster with automated failover and powerful replication.