The next release of Hyperic is coming up soon, and the biggest change is to the backend. In the next release, we will support only one database: PostgreSQL. Those of you who have been with Hyperic for as long as I have may be surprised, considering our history with PostgreSQL, but as you read through this post, it will start to make sense.
History of PostgreSQL and Hyperic
For the last few years, Hyperic has supported only two databases for production use at scale: Oracle and MySQL. This in itself was a big change, since at one point PostgreSQL was our bread and butter. Hyperic was originally designed on PostgreSQL 7.x. As an open source project, PostgreSQL has a very easy license for distribution. As a startup company, we had to get our product out into the marketplace quickly and affordably, so PostgreSQL made sense.
As the years went on, it became obvious that we could gain adoption for the product by supporting more databases. Larger customers started to demand Oracle for large deployments and MySQL for medium-sized deployments. As time passed, supporting three databases proved to be very taxing on our Engineering and QA teams. For each database, we had to maintain separate code paths in the product to ensure that data access was as efficient as possible. QA then had to run full regression tests for each release against our entire matrix of supported databases. Additionally, the help that engineering provides to our support, sales, and professional services teams became much more time consuming.
This meant a lot of extra effort, but we made it work for a while to keep our customers happy. However, as we continued to look into the performance of the three databases, we eventually found that MySQL had the best performance (some of you may remember this performance study we published to show how well it worked). Internally and externally, we had the fewest issues on MySQL. The data was convincing enough that we eventually made the hard decision to cut support for PostgreSQL, so that we would support only one open source database and one commercial database. Still, we were well aware that the best scenario for the quality and cost of supporting the product was to support only one database.
Over the past few years, our experience has been that the tides of customer demand are shifting. We have not been seeing as strong a demand for either Oracle or MySQL. A couple of things have changed in the industry: Oracle bought Sun/MySQL, and NoSQL databases have flooded the market. These changes, in my opinion, have made customers less likely to demand a specific database and signal that they are more willing to support a variety of databases for different purposes. Essentially, customers today are all about choice, and with many more options available and Oracle and MySQL now owned by the same company, the demand from our install base is lower. So while Oracle and MySQL continue to be good products, the freedom of choice is spilling into the software developer’s arena as well. Customers are now more willing to accept the developer’s choice to use the right tool for the job in the right way.
Why are we moving back to PostgreSQL?
It is no secret that VMware has invested in PostgreSQL development for virtualized deployments. Our product, vFabric Postgres, also called vPostgres for short, makes improvements to the core of PostgreSQL and has many features that give it more advantages in a virtualized environment than other databases of its kind. About one year ago, I was approached with the idea of moving Hyperic back to PostgreSQL and dropping support for both MySQL and Oracle. I’ll admit, my personal reaction to this idea was not favorable. I knew that supporting only one database would be a good thing for Hyperic, and because we had supported PostgreSQL previously we would have a head start. Even so, it meant a lot of work and change for the product, for what was essentially a gamble on a direction that I, at least initially, wasn’t comfortable fully supporting.
In the grand scheme of things, I consider myself a pragmatic person. So I looked at the big picture and started seeking out the opportunities that could come with this change and make it work both for us as an engineering team and for our customers and users.
Some of the key things I focused on were:
- When Hyperic stopped supporting PostgreSQL, we were on version 8.2.5. PostgreSQL has made great strides in performance with 8.3 and 9.x, which would likely work to our advantage.
- Hyperic is much more efficient at interacting with the database since version 4.0, which was around the time that we dropped PostgreSQL support. As a result, the Hyperic application was able to achieve a higher end scale regardless of the database attached to it.
- I have been very impressed with our internal vPostgres team. We have never had direct database engineering support before, and with it we could cut days or weeks of researching configuration issues down to a couple of hours of internal technical conversation, gaining engineering cycles in every release.
Re-Optimizing for vPostgres
Our high level goals for Hyperic on PostgreSQL were:
- 2000+ agents
- 150k metrics / min
- 50k+ managed resources
- 500 compatible groups with >= 500 resources / group
- Ability to have up to 10 concurrent users continuously pounding the HQ UI
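To put these targets in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: the per-agent figures assume load is spread evenly across agents, which is an assumption for illustration and not something we measured, and the "+" targets are treated as minimums.

```python
# Scale targets taken from the goals listed above.
AGENTS = 2000
METRICS_PER_MIN = 150_000
RESOURCES = 50_000
GROUPS = 500
RESOURCES_PER_GROUP = 500


def scale_summary():
    """Derive average rates from the targets (assumes an even spread across agents)."""
    return {
        # Sustained metric rows the database must absorb each second.
        "metric_inserts_per_sec": METRICS_PER_MIN / 60,
        # Average metrics each agent reports per minute.
        "metrics_per_agent_per_min": METRICS_PER_MIN / AGENTS,
        # Average managed resources behind each agent.
        "resources_per_agent": RESOURCES / AGENTS,
        # Minimum number of group-to-resource memberships.
        "group_memberships": GROUPS * RESOURCES_PER_GROUP,
    }


if __name__ == "__main__":
    for name, value in scale_summary().items():
        print(f"{name}: {value:,.0f}")
```

The number that matters most for the database is the first one: 150k metrics per minute works out to a sustained 2,500 metric rows per second.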
Luckily for us, we had always kept PostgreSQL compatibility in Hyperic, so we didn’t have to modify our schema. The major challenges were:
- PostgreSQL 9.x changed its blob implementation, so the driver needed to be updated to a 9.x-compatible driver.
- Migrating from MySQL / Oracle to PostgreSQL for existing customers.
- Figuring out where database slowness was occurring while applying real load to the Hyperic server.
- Optimizing our Data Inserter for PostgreSQL, which was a major pain point in the past.
- Correctly configuring PostgreSQL.
Before getting started, we made a best effort to tune these areas, with the expertise of our vPostgres team guiding us in the right direction. We made several code changes to prepare for the real test in a scale environment.
Experience using vPostgres 9.x with Hyperic
To scale up, we started with a large Hyperic HQ instance and a large vPostgres instance and progressively tried to break them until we reached our goals. We used Apache JMeter to simulate concurrent user load on the UI. This allowed us to continuously monitor the slow query log and figure out which queries and flows in various areas of the application needed work. Additionally, we used tcpdump analysis to gather response time metrics on a per-page basis and understand the difference between our MySQL / Oracle instances and our new, scaled-up vPostgres instance. In the end, we were able to tune our vPostgres implementation to be, at minimum, on par with MySQL and Oracle. We then tuned it further to outperform these backends in certain cases.
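As a rough illustration of the slow-query-log part of this workflow, the sketch below groups logged statements by shape and ranks them by total time. It assumes PostgreSQL is logging statement durations (for example through `log_min_duration_statement`), which produces lines containing `duration: ... ms  statement: ...`. The sample log lines and table names are made up for the example, and this is not the tooling we actually used.

```python
import re
from collections import defaultdict

# Matches the duration/statement portion of a PostgreSQL log line, e.g.
#   LOG:  duration: 1234.567 ms  statement: SELECT ...
DURATION_RE = re.compile(
    r"duration: (?P<ms>\d+(?:\.\d+)?) ms\s+statement: (?P<sql>.*)"
)


def normalize(sql):
    """Replace literals with '?' so queries of the same shape group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # string literals
    sql = re.sub(r"\b\d+\b", "?", sql)   # standalone numeric literals
    return re.sub(r"\s+", " ", sql).strip()


def summarize_slow_queries(lines):
    """Return (query_shape, count, total_ms) tuples, slowest total first."""
    totals = defaultdict(lambda: [0, 0.0])
    for line in lines:
        match = DURATION_RE.search(line)
        if match is None:
            continue  # not a duration line (checkpoints, connections, ...)
        entry = totals[normalize(match.group("sql"))]
        entry[0] += 1
        entry[1] += float(match.group("ms"))
    rows = [(sql, count, total) for sql, (count, total) in totals.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical log excerpt; the table names are illustrative only.
SAMPLE_LOG = [
    "2012-01-01 00:00:01 UTC LOG:  duration: 1200.500 ms  statement: SELECT * FROM resource WHERE id = 42",
    "2012-01-01 00:00:02 UTC LOG:  checkpoint starting: time",
    "2012-01-01 00:00:03 UTC LOG:  duration: 1500.000 ms  statement: SELECT count(*) FROM metric_data WHERE name = 'cpu'",
    "2012-01-01 00:00:04 UTC LOG:  duration: 800.250 ms  statement: SELECT * FROM resource WHERE id = 7",
]

if __name__ == "__main__":
    for sql, count, total_ms in summarize_slow_queries(SAMPLE_LOG):
        print(f"{total_ms:10.2f} ms  x{count}  {sql}")
```

Ranking by total time rather than by the single slowest statement tends to surface queries that are individually tolerable but run constantly under UI load.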
While going through this process to increase the Hyperic server load, I was impressed that vPostgres did not break down as it had in the past. The configuration we used was close to the default settings; the main change was sizing effective_cache_size properly. At this time, there is only one area where performance has not met or exceeded previous releases: the I/O write performance of our metric data inserts. Our vPostgres team is currently investigating this, and we expect that within the next couple of releases it will no longer be an issue. Even though we achieved the same scale as with MySQL and Oracle, we are working to substantially increase the scale in future versions.
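Assuming the setting referred to above is PostgreSQL's `effective_cache_size`, the sketch below shows one way to derive a starting value from the memory available to the database host. The fractions are a commonly cited community rule of thumb (about half of RAM as a conservative value, three quarters as a more aggressive one), not the values from our test environment, and the helper names are hypothetical. Keep in mind that this setting is only a hint to the query planner about how much data is likely to be cached; it does not allocate any memory.

```python
def suggest_effective_cache_size_mb(total_ram_mb, fraction=0.5):
    """Return a starting effective_cache_size in MB.

    `fraction` is a rule-of-thumb assumption: 0.5 is conservative,
    0.75 is more aggressive on a host dedicated to the database.
    """
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return int(total_ram_mb * fraction)


def as_conf_line(size_mb):
    """Format the value as a postgresql.conf entry."""
    return f"effective_cache_size = {size_mb}MB"


if __name__ == "__main__":
    # Example: a database VM with 16 GB of RAM.
    print(as_conf_line(suggest_effective_cache_size_mb(16 * 1024)))
```

Treat the result as a starting point and validate it against your own workload.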
Using Hyperic to Analyze vPostgres Performance
One nice thing about optimizing Hyperic is that I was able to use Hyperic’s monitoring capability, with our newly enhanced vPostgres plugin, to monitor its own performance.
Here is a sample of the configuration to get the monitoring started:
To monitor the I/O, it is very simple to create a resource for monitoring the data mount and the log mount:
Once this is complete, here is a sample of the data available for viewing.
LogDir Mount Sample Data:
VPostgres Plugin Sample Data:
In all, this experience has made me a believer again that vPostgres is moving in the right direction, with more to come. We have invested a lot in making vPostgres work well for Hyperic and for cloud deployments, and since the database does not need heavy configuration to perform well, I believe it will be easy for customers to support internally. I am excited to see how the evolution turns out, so as you try it out, either in the current beta program or once we release it, be sure to let us know how it’s going.
About the Author: Scott Feldstein has spent his entire 12+ year career implementing datacenter management solutions. Starting at Sun Microsystems, Scott spent the first half of this time working on very large scale monitoring solutions for Sun’s internal compute grids. In 2007 he started at a small startup company called Hyperic, which eventually became part of VMware. Scott’s technology passions are working on large scale software implementations and working with data storage solutions. Scott received a Computer Engineering degree from Cal Poly (SLO) and currently works out of the VMware San Francisco office.