Monthly Archives: February 2013

From the Front-line of StrataConf: A VMware Perspective

Day 2 of the O’Reilly Strata Conference is starting here in Santa Clara, California and the focus is very much on data. In 2005, Tim O’Reilly predicted: “Data is the Next Intel Inside.” At VMware, big, fast data has never been so critical for our customers and innovations are transforming the cloud applications landscape at an unprecedented rate. This conference comes at the perfect time to reset what everyone knows about big, fast data.

The conference kicked off yesterday with several brief 20-minute keynotes, all succinct and to the point. Greenplum’s Scott Yara reflected on how the big data market has grown tremendously over the past few years and mentioned several key data scientist practitioners. Scott also mentioned the increased investment in open source Hadoop. Of course, Strata comes on the heels of Monday’s Greenplum Pivotal HD announcement, which launched a Hadoop distribution that can improve performance 50X to 500X compared to existing SQL-like services on top of Hadoop.

Another great keynote presentation was from Yael Garten, a Senior Data Scientist from LinkedIn. Yael leads the mobile data analytics team. She began by polling the audience and noting that many in the audience had already been on 3 different devices that morning and it wasn’t even 9:30 am yet. She noted we’re constantly connected, and we need to use data to personalize the experience for users no matter what device we’re on.  She had an interesting graph highlighting device use and laptop use during our morning time of “coffee to couch”.  And those uses are different in the US compared to places like India. Continue reading

Behind the Scenes: Patching PostgreSQL for Performance–vFabric Team Contributes to Open Source

After PostgreSQL 9.2 was released, users who relied on PostgreSQL for scale may have noticed a performance hit. In fact, the PostgreSQL community, alongside the VMware vFabric Postgres team, was able to show that the new version took a 10% performance hit relative to version 9.1. As part of the VMware Postgres team, we wanted to fix this problem for our own distribution, but as mentioned in previous posts, we also wanted to contribute our fixes back to the common core. This post provides additional detail on how this problem was identified and how we worked with the open source PostgreSQL community to restore performance.

Background on the Performance Issue in PostgreSQL 9.2

Last year, during routine regression testing of vFabric Postgres, we found that PostgreSQL 9.2, the latest major release of PostgreSQL, demonstrated a significant performance regression from version 9.1. Using DBT-2, an open-source and fair-use implementation of the TPC-C benchmark [1], we measured a 10% performance degradation, which we then reported to the community [2].

To troubleshoot the problem, we used git bisect to find the commit that caused the performance problem and cross-examined the statistical profiles using oprofile. As it turns out, the regression was caused by a commit that changed the way memory was allocated when SPI queries were executed. The commit was intended to reduce the number of allocations for queries using a cached plan at the cost of more bookkeeping work. However, the DBT-2 test showed that this tradeoff was unfavorable for dynamic queries. So to fix it, we would need to apply the tradeoff conditionally, only to the queries it was intended for [3].
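The bisection idea behind git bisect can be sketched in a few lines of Python. This is only an illustration of the search strategy, not the tooling we actually ran: the commit names, the throughput numbers, and the 5% threshold below are all made up, and the sketch assumes that once the regression appears it stays in every later commit.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, the first one is known
    good, the last one is known bad, and every commit after the first bad
    one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # regression is at mid or earlier
        else:
            lo = mid  # regression is after mid
    return commits[hi]


# Hypothetical benchmark results (transactions per minute) per commit.
throughput = {"c0": 1000, "c1": 1002, "c2": 998, "c3": 901, "c4": 899, "c5": 900}
commits = list(throughput)
baseline = throughput[commits[0]]

# Treat a commit as "bad" when it is more than 5% slower than the baseline.
culprit = first_bad_commit(commits, lambda c: throughput[c] < 0.95 * baseline)
print(culprit)
```

Each step halves the range of candidate commits, so even a long history needs only a handful of benchmark runs to isolate the offending change.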

We proposed the fix to the wider PostgreSQL community and the ensuing discussion led to a refined resolution which was implemented in a patch [4]. This patch has been back-ported to the latest PostgreSQL 9.2.3 release and is included in the latest vFabric Postgres release [5]. Continue reading

Why Lean Application Servers Are Faster, Cheaper, and Better For Business

The application server has been the centerpiece of modern architectures for web-based applications for over a decade. However, there are trends in technology that make us rethink how we use application servers and how we can get the most value out of them.

Over the years enterprises have built up considerable technical debt. This debt is made up of outdated processes, legacy applications, and stale technologies. We are all familiar with the types of headaches caused by older apps:

  1. Development is slow.
  2. Costs continue to rise, not fall.
  3. Business needs are increasing in speed and complexity.

The good news is that there are solutions today that address all of these challenges. This post and the accompanying video are aimed at helping you understand how to evolve your applications to a modern approach that will benefit your company and your customers alike. Using VMware and open source technologies such as Spring, Apache Tomcat, vSphere, Spring Insight, and Hyperic, we will explain how these tools and methodologies come together with tc Server to evolve your development organization and applications to tap into the full potential of lean development and cloud computing.

Continue reading

Introducing A New Reference Architecture That Will Speed Knowledge & Development of Modern Cloud Applications

Technology is evolving at breakneck speeds.

Universally, applications are faster, deal with large data sets, and provide more compelling user experiences than ever before.

Competition is steep.

As a result, competitive organizations demand that IT leaders speed the rate of new application innovation and development. IT must rise to the challenge or face competitive threats, missed business opportunities, and lost momentum within their user base. In short, IT leaders and providers that do not accelerate will face a backlash from executives.

In order to meet these challenges, IT is renovating application architectures to thrive in the cloud. This is an organization-wide change involving people redirection, process redesign, and technology exploitation. For many, there is a steep learning curve. Continue reading

Join Us at Strata – Feb 26-28 in Santa Clara

The vFabric and Greenplum teams will be at Strata on Feb 26-28 at the Santa Clara Convention Center.

While the Pivotal Initiative is forming, both vFabric and Greenplum groups will be represented separately. Of course, you can also learn what’s going on by checking out Strata Greenplum or Strata VMware on Twitter.

If you aren’t familiar with Strata, it is a great conference for those building apps in the cloud. Its focus is the future of big data and how to use big data successfully. Speakers include representatives from Google, VMware, Amazon, Microsoft, and many other software companies focused on the big data space. Topics include: Continue reading

Dynamic Memory Management of vFabric Postgres

In a nutshell, dynamic memory management in vFabric Postgres is conceptually like Elastic Memory for Java (EM4J), but for a virtualized, enterprise-class, open source database instead of an application server.

Compared to a normal PostgreSQL server, vFabric Postgres brings two additions necessary for flexible virtualization of the database server. These two features can help companies realize the benefits of virtualizing the database and the associated cost savings from running an open source database on an extremely cost-effective infrastructure.

  1. Elastic shared memory management
  2. Automatic memory configuration

Elastic Shared Memory Management

Embedded directly in the PostgreSQL core, elastic shared memory management is a new feature of vFabric Postgres. This capability allows memory to be released or obtained according to the needs of the other virtual machines on the same server. Continue reading

Choosing Your Messaging Protocol: AMQP, MQTT, or STOMP

One of the most common questions I’m asked to cover when I discuss software architecture topics is the difference between the various application messaging protocols that exist today—issues like how and why the protocols came about, and which one should be used in a particular application.

Their question is valid.

Today, application architects need to use a messaging broker to speed and scale their applications, particularly in the cloud. Even after the messaging middleware has been selected, application developers still need to choose a protocol. Understanding the subtle differences between them can be difficult.

Today, we will consider three of the most common and popular TCP/IP-based messaging protocols, and provide a quick summary on the advantages of each: AMQP, MQTT and STOMP. Before we go on, I should also point out that all three of these protocols are supported in RabbitMQ version 3.0—something we will use as an example and come back to later.

So, in alphabetical order…

AMQP in a Nutshell Continue reading

5 Characteristics of a Modern Mainframe Cloud App – Avoid Tornado IT

No one likes being rushed into bad decisions.

Yet, the pace of information technology often forces IT executives to do that.

In today’s world, mainframe-to-cloud decisions need solid thinking or we risk a technology tornado. This article outlines some key lessons learned at the front-line of IT decision-making.

As previously discussed, it’s possible to “modernize” mainframe legacy applications to the cloud. You can get there with little to no modification by using a “lift-and-shift” strategy.  Several of my clients have taken this approach to quickly satisfy a “cloud mandate”. The results have been less than desirable:

  • Without the use of pooled resources, the applications do not scale well.
  • Timely user provisioning and access from any device is still a challenge because the apps do not provide on-demand, ubiquitous access.
  • In addition, utility-based pricing/costing is performed manually, with little accuracy to the realities of actual usage.
  • Most importantly, the applications continue to have monolithic, stove-piped architectures, which are difficult and expensive to maintain and enhance.

These “cloud” applications are more like funnel cloud apps or tornado apps, waiting to wreak havoc on IT organizations. Assuming you want to avoid funnel clouds and IT tornadoes, consider applying the following five application architecture and design principles indicative of a true cloud application: Continue reading

3 Key Performance and Scale Improvements with VMware vFabric Postgres 9.2

Recently, vFabric Postgres 9.2 launched with additional cloud computing capabilities like elastic memory management. Some of the most compelling new features are performance-related and take linear scaling to new levels.

This article will cover 3 key improvements as listed below:

  • 4x Improvement with vertical linear scaling for reads
  • 2x Improved write efficiency for write ahead logs
  • Index Only Scans and More

4x Improvement with vertical linear scaling for reads

Modern websites are almost all database driven. When consumers browse online retailer catalogs, 99% of the load is reads and 1% of the load is updates to the data in the tables. Even on frequently updated websites, the vast majority of load is from reads. In these high-read usage scenarios, the database needs to handle a much higher read load on certain tables than on the other tables in the database. We’ve seen this behavior drive enhancements within databases. For example, many application designs started putting a caching mechanism in their application to limit the database hits. Continue reading
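To make the application-side caching idea concrete, here is a minimal read-through cache sketch in Python. It is an illustration under assumptions, not how any particular application or vFabric Postgres works: the `ReadThroughCache` class, the `load_product` function, and the product ID are hypothetical, and a list stands in for the database so the number of real reads can be counted.

```python
from typing import Callable, Dict, List


class ReadThroughCache:
    """Serve repeated reads from memory and fall back to a loader on a miss."""

    def __init__(self, load: Callable[[str], str]) -> None:
        self._load = load  # function that performs the real database read
        self._entries: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> str:
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._load(key)  # only cache misses reach the database
        self._entries[key] = value
        return value

    def invalidate(self, key: str) -> None:
        """Drop a cached entry, for example after the underlying row is updated."""
        self._entries.pop(key, None)


db_reads: List[str] = []  # stand-in for the database: records every real read


def load_product(product_id: str) -> str:
    db_reads.append(product_id)
    return f"product-{product_id}"


cache = ReadThroughCache(load_product)
for _ in range(99):  # 99 reads of the same catalog entry
    cache.get("42")
cache.invalidate("42")  # 1 update invalidates the cached copy
cache.get("42")  # the next read goes back to the database

print(cache.hits, cache.misses, len(db_reads))
```

In this toy run, 100 application reads turn into only two database reads, which is the effect such caches aim for on read-heavy tables. The price is that every update path has to remember to invalidate the affected entries.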

Run DBaaS Metering & Chargebacks w/Data Director, then add Hyperic

Ever heard or thought this?

“We have so many databases—the cost is so significant and hard to track. If we could just measure and charge groups based on usage, we could better manage this and cut down on license costs.”

Many companies are realizing that they must run all services like a utility. There are “table stakes” for playing in the model. It means IT must manage all data-related services with metering and chargebacks. You just can’t be a true software service if you don’t know your real costs. Not to mention, other executives look at you with big eyes when you say, “I am not sure how much all our databases cost us.”

If you are unaware of what vFabric Data Director does, here is a short overview explaining how it can help consolidate databases by 13 to 1.

Through database-aware virtualization and self-service lifecycle management, vFabric Data Director (vFDD) enables organizations to offer DBaaS for reduced TCO, increased agility, accelerated application development, and more. Like all other XaaS offerings, vFDD 2.5 is implemented based on a  cloud architecture—resource pooling and elasticity through resource pools/resource bundles, multi-tenancy through the organization/database group/database structure, self-service through automated database lifecycle management workflows, and additional security.

However, some suspect there is one important, business-centric cloud component missing in vFDD—the service metering capability, a.k.a. chargeback. Many have wondered if vCenter Chargeback Manager (vCBM) can be utilized together with vFDD to fill the gap. Can it? In short, the answer is yes. Continue reading