
Tag Archives: MySQL

Disaster Recovery Jackpot: Active/Active WAN-based Replication in GemFire vs Oracle and MySQL

Ensuring your systems run smoothly when your data center has a hiccup, or when a real disaster strikes, is critical to many companies’ survival. As we enter the age of the zettabyte, seamless disaster recovery has become even more critical, and more difficult: there is more data than we have ever handled before, and much of it is very, very big.

Most disaster recovery (DR) sites are in standby mode—assets sitting idle, waiting for their turn. The sites are either holding data copied through a storage area network (SAN) or using other data replication mechanisms to propagate information from a live site to a standby site.  When disaster strikes, clients are redirected to the standby site where they’re greeted with a polite “please wait” while the site spins up.

At best, the DR site is a hot standby that is ready to go on short notice.  DNS redirects clients to the DR site and they’re good to go.

What about all the machines at the DR site?  With active/passive replication you can probably run queries on the slave site, but what if you want to make full use of all that expensive gear and go active/active?  The challenge is in the data replication technology. Most current data replication architectures are one-way, and those that aren’t often come with restrictions; for example, you may need to avoid opening files with exclusive access. Continue reading

Scaling and Modernizing .NET and Java: SQLFire Performance Test Blows Away Traditional RDBMS

We all know the devil is in the details when it comes to technology.

Yet our recent vFabric SQLFire performance test (a benchmark from the vFabric SQLFire Best Practices Guide) is certainly worth reviewing if you need to scale a Java app, a .NET app, or another legacy data source.

If you don’t know what vFabric SQLFire is, it is basically what happens when Apache Derby gets married to vFabric GemFire:

  • Apache Derby is used for its RDBMS components, JDBC driver, query engine, and network server.
  • The partitioning technology of GemFire is used to implement horizontal partitioning features of vFabric SQLFire.
  • vFabric SQLFire enhances specific Apache Derby components, such as the query engine, the SQL interface, data persistence, and data eviction, and adds new components, including SQL commands, stored procedures, system tables, functions, persistence disk stores, listeners, and locators, to operate a highly distributed and fault-tolerant data management cluster.

Continue reading

How to Prepare for a Hyperic 5.0 Upgrade

Recently, we announced a new version of our popular web infrastructure management tool, Hyperic 5.0.  We also explained how, in this new version, the backend data store is moving to PostgreSQL. In this article, we provide existing Hyperic users with the information they need to plan their upgrade to Hyperic 5.0. This article is for anyone already running Hyperic HQ (open source), vFabric Hyperic, vFabric Application Performance Manager, or vCenter Operations Manager.

This post is meant to augment the knowledge base article, KB 2033940, published back in August.

Planning the Upgrade

We know platform changes can make an upgrade more difficult and certainly raise eyebrows. So, we’ve taken measures to help make the migration as seamless and simple as possible. So far, the cases we’ve seen take about an hour. As with any data migration, the greater the volume of database records, the longer it can take. Continue reading

Why Hyperic is Going to Support PostgreSQL Only As a Backend Database

The next release of Hyperic is coming up soon, and the biggest change is to the backend: we will support only one database, namely PostgreSQL. Those of you who have been with Hyperic as long as I have may be surprised, considering our history with PostgreSQL, but as you read through this blog, it will start to make sense.

History of PostgreSQL and Hyperic

For the last few years, Hyperic has supported only two databases for production use at scale: Oracle and MySQL. This in itself was a big change, since at one point PostgreSQL was our bread and butter.  Hyperic was originally designed on PostgreSQL 7.x. As an open source project, PostgreSQL has a license that is very easy to distribute under, and as a startup we had to get our product into the marketplace quickly and affordably, so PostgreSQL made sense.

Continue reading

2 Lessons on RabbitMQ from HighLoad – Extending the server with Plugins and HA Queues

Recently, I presented at the HighLoad++ conference in Moscow, Russia. Many of the developers I spoke with were interested in two RabbitMQ topics, and I wanted to share them with a broader audience:

  • Extending through Plugins
  • High Availability Queuing

If you haven’t heard of HighLoad++, the conference is pretty special since it focuses only on high-traffic websites (mostly from Russia). Its main purpose is to discuss new architectures and approaches for highly complex systems, covering things like:

Continue reading

PostgreSQL contributor Heikki Linnakangas joins VMware vFabric Team

In this short Q&A, we get the perspective of Heikki Linnakangas, who has just joined VMware after working as a senior software architect and contributing to PostgreSQL for six years.

1. You’ve been involved with PostgreSQL for a while, could you give us a bit about your background and how you’ve been involved?
It all started in 2003, when my second child was born. I was at home with the baby for a month or two, and thought it would be fun to take a look at how a DBMS works under the covers. I have done programming as a hobby since I was a kid, but had not had a chance to do much outside a work environment for some time. Continue reading

Spring and RabbitMQ – Behind India’s 1.2 Billion Person Biometric Database

Aadhaar was conceived as a way to provide a unique, online, portable identity so that every single resident of India can access and benefit from government and private services. The Aadhaar project has received coverage from all possible media: television, press, articles, debates, and the Internet. It is seen as an audacious use of technology, albeit for a social cause. UIDAI, the authority responsible for issuing Aadhaar numbers, has published white papers, data, and newsletters on the progress of the initiative. A common question to the UIDAI technology team in conferences, at events, and over coffee is: what technologies power this important nationwide initiative? In this blog post, we wanted to give a sense of several significant technologies and approaches.

Fundamental Principles

While the deployment footprint of the systems has grown from half a dozen machines to a few thousand CPU cores processing millions of Aadhaar-related transactions, the fundamental principles have remained the same:

Continue reading

Using SQLFire as a read-only cache for MySQL

A question came up in our forums about using SQLFire to cache data from a database with lazy loading. I wanted to provide a fairly concrete example, using MySQL, that should give you all the info you need to try it out yourself.

Why do this?

SQLFire can play a number of roles. For one thing, it has its own persistence right out of the box, so it can be your database of record, unassociated with any other database, and it brings big advantages in terms of speed, eliminating single points of failure, and so on. But chances are you have a database already and would like to get a feel for SQLFire without a huge commitment. In this case there are a couple of options: SQLFire can act as a read/write layer atop the database, persisting data through to a more traditional RDBMS, or SQLFire can act as a read-only cache that loads data in as you need it. The latter is often the lowest-commitment way to try SQLFire out; after all, your database may have stored procedures or other things that make it difficult to port to another database.

Putting a read-only cache atop an RDBMS helps by offloading some reads to another server. Think about a database that is constantly getting reads and writes from a lot of different users (high concurrency). This sort of situation easily leads to disk thrashing, which in turn leads to extremely bad performance unless you're willing to throw lots of money at storage. Serving a portion of reads in-memory cuts down on this thrashing, raising write throughput on the underlying database. In addition, an in-memory architecture like SQLFire handles high concurrency extremely well, since it doesn't suffer from thrashing the way disk-based databases do. Caching also brings read throughput up by not forcing every read to be consistent with every write. I'll talk about this more in the next section.

Before you begin, understand these two points.

  1. When it comes to read-only caching, SQLFire 1.0 supports accessing database rows by primary key. This access pattern is pretty similar to what you might do with memcached or Redis, except that you get RDBMS rows instead of objects, which has some advantages in terms of reusing code you already have.
  2. SQLFire caches the data. That might sound silly to say, since after all that's the goal, but it also means that when you use SQLFire to cache, the data in SQLFire will be stale to some extent. If SQLFire were to reach out to the database on every access there would be no performance benefit, probably just the opposite. Your application and use case need to be okay with somewhat stale data. By adopting this "eventually consistent" approach, the system as a whole does less work and higher throughput is possible. As more and more applications are exposed to the big-I Internet, people have been forced to realize that there are quite a few places in their applications where eventual consistency is perfectly fine. How stale the data can get is configurable, as we'll see below.

Step 1: Install and configure MySQL, including the MySQL JDBC Connector

I'm not going to give detailed instructions on this one; I'm going to assume you have a database up and running already. You should also get the MySQL JDBC connector. If you're on Ubuntu, for instance, just "apt-get install libmysql-java" and it will appear at /usr/share/java/mysql-connector-java.jar. It's similar on Fedora using yum. If you're on another platform, I wish you luck in your adventure to track that file down; this blog will still be here when you return.

By the way, if you're using some database other than MySQL, very little in these instructions changes: all you need to do is find the appropriate JDBC driver and change the connection strings.

Step 2: Create a table and some data in MySQL.

Again, I'm mostly going to assume that you know how to do this, but I'll create a table called names and insert a few rows, as sketched below. It's important that the table have a primary key.
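
To make this concrete, here's a minimal sketch of what that might look like in MySQL. The names table and its columns are illustrative (the original post didn't spell them out); any table with a primary key will do:

  CREATE DATABASE test;
  USE test;

  -- The primary key is what SQLFire will use to look rows up.
  CREATE TABLE names (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(100)
  );

  INSERT INTO names VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');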

Step 3: Compile the JDBCRowLoader sample (and put it in a JAR file).

So it turns out there is some assembly required if you want to use SQLFire as a read-only cache. Luckily it's not too painful: everything you need is included in the SQLFire kit. When you install it, you'll have a directory called examples with some Java code in it. One of these files is JDBCRowLoader.java, a sample class that implements the RowLoader interface. There are a number of ways to approach the RowLoader interface, but this example is quite general-purpose. To get things ready to use, run these three commands from within the examples directory:

  1. javac -classpath ../lib/sqlfire.jar JDBCRowLoader.java
  2. cd ..
  3. jar -cf examples.jar examples

Now you have a JAR file you can use as a row loader. This generic row loader even handles cases where you want fewer columns in SQLFire than in MySQL, via the query-columns parameter. You can also implement your own row loader if you need to. Why would you do that? Most likely to apply some transformation to the data as you read it in. For example, you may have noticed that SQLFire 1.0 doesn't support the boolean datatype yet; if your MySQL table had boolean data, you could use your RowLoader to transform it into an integer value of 0 or 1 as appropriate. For this demo the generic RowLoader is more than enough.

Step 4: Start SQLFire with an extended classpath.

We need SQLFire to know where the MySQL JDBC driver and our JDBCRowLoader code are. To do this, we use -classpath when starting SQLFire. Here's an example:

  • sqlf server start -classpath=/usr/share/java/mysql-connector-java.jar:/home/carter/sqlf/examples.jar

You will need to substitute the paths to your examples.jar and MySQL connector. After a successful start you can connect as you normally would.
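
For instance, using the interactive sqlf shell, a connection might look like the following, assuming the server is listening on SQLFire's default client port of 1527 (substitute your own host and port):

  sqlf
  sqlf> connect client 'localhost:1527';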

Step 5: Re-create the table in SQLFire and call SYS.ATTACH_LOADER to hook the two together.

First we create the same table as in MySQL, except we don't populate any data. Next, the special SYS.ATTACH_LOADER call is invoked to hook SQLFire up to MySQL.

Here's what this should look like in your session.

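The original post showed this as a screenshot. As a rough sketch, assuming the names table from Step 2, SQLFire's default APP schema, and a MySQL database named test, it might go like this; the loader's init string below is illustrative, since its exact format is defined by the JDBCRowLoader.java sample, so check that source for the parameters it expects:

  -- Re-create the table from Step 2 in SQLFire, but leave it empty.
  CREATE TABLE names (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(100)
  );

  -- Hook the table to MySQL through the sample row loader.
  -- The fourth argument is the loader's init string (JDBC URL, etc.);
  -- treat this one as illustrative and consult JDBCRowLoader.java.
  CALL SYS.ATTACH_LOADER('APP', 'NAMES', 'examples.JDBCRowLoader',
    'url=jdbc:mysql://localhost/test?user=root&password=secret');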

Now when you query SQLFire by primary key, rows will be loaded in on demand, as shown below. You should also understand that if you don't specify a WHERE clause when querying the table, SQLFire will report whatever rows have been cached so far rather than everything in the underlying table. Chances are this is not what you want, so be careful of it.

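A quick illustration, again using the hypothetical names table:

  -- A primary-key lookup; on a cache miss, SQLFire invokes the
  -- row loader and fetches the row from MySQL.
  SELECT * FROM names WHERE id = 2;

  -- Careful: without a WHERE clause on the primary key, this only
  -- returns rows already cached, not everything in the MySQL table.
  SELECT * FROM names;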

Step 6: Control cache size and data freshness.

The table created above is not the sort of thing you really want when you're caching with SQLFire for two reasons:

  1. First of all, data is never updated once it's placed into the cache. Ever. No matter how much the underlying data changes in MySQL, those changes will never make their way into SQLFire. Not likely what you want.
  2. The table will grow indefinitely large. Chances are you'd rather cache a subset of your data since you're using SQLFire as an in-memory layer.

Overcoming both of these problems is easy in SQLFire using CREATE TABLE clauses. Here's an example:
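
(The original statement was shown as an image. The following is a reconstruction consistent with the description below, reusing the hypothetical names table; verify the clause syntax against the CREATE TABLE reference for your SQLFire version.)

  CREATE TABLE names (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(100)
  )
  EVICTION BY LRUCOUNT 50000 EVICTION ACTION DESTROY
  EXPIRE ENTRY WITH TIMETOLIVE 300 ACTION DESTROY;

If you drop and re-create the table this way, remember to call SYS.ATTACH_LOADER on it again, as in Step 5.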

Let's take a quick look at what this does. The first difference between this statement and the one above is the EVICTION BY LRUCOUNT 50000 clause. This means that SQLFire will maintain up to 50,000 entries in the cache before taking action. The action is configurable, but this setting causes SQLFire to simply discard the data, which is perfectly appropriate for a cache. The second clause, EXPIRE ENTRY WITH TIMETOLIVE 300, is more interesting to most people. It says that an entry should be kept in the cache for no more than 5 minutes. The net effect is to guarantee that data is never more than 5 minutes stale: after 5 minutes the data is discarded and must be re-fetched from the underlying database. If you have a fairly short TTL, you might not even want to set a maximum cache size.

Variations.

Here are a few simple variations on this demo:

  1. Change your cache policy by changing the LRUCOUNT or TIMETOLIVE parameters in the CREATE TABLE statement.
  2. If you want to act as a cache on top of Oracle or some other database, just change the url parameter to mention the appropriate driver and URL. Note you will need the respective JDBC JAR for that database to be in the classpath when you start SQLFire.

Questions? Comments? Join us in the SQLFire Community.