Have you ever heard of a zettabyte? If you work in IT, you'll be hearing more and more about zettabytes, exabytes, and petabytes, while the terms we now think of as big, such as terabytes and gigabytes, fade from our vocabulary. Right now, data stores are growing by 50% year-over-year, and it's only accelerating.
While data volumes are skyrocketing, the type of data is also becoming more difficult for traditional databases to handle. Over 80% of it will be unstructured, file-based data that does not fit well with the block-based storage typical of relational database management systems (RDBMS). So, even if hardware innovations could keep up with the growing volume, the kinds of data we now store break traditional RDBMS at today's speeds.
The bottom line is that the volume and types of data being stored are too much for a single, monolithic, structured RDBMS data store. Such architectures need to be broken apart and re-architected to survive the Information Explosion we are experiencing today.
Apache Derby is used for its RDBMS components: the JDBC driver, the query engine, and the network server.
GemFire's partitioning technology implements the horizontal partitioning features of vFabric SQLFire.
vFabric SQLFire enhances the Apache Derby components, such as the query engine, the SQL interface, data persistence, and data eviction, and adds components of its own, including SQL commands, stored procedures, system tables, functions, persistence disk stores, listeners, and locators, to operate a highly distributed and fault-tolerant data management cluster.
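To give a feel for what horizontal partitioning does, the sketch below hashes a partitioning column to choose a bucket and assigns each bucket to a cluster member. This is a toy illustration only; the bucket count, member names, and round-robin assignment are assumptions for the example, not SQLFire's actual internals.

```python
import zlib

# Illustrative values only; not SQLFire defaults.
BUCKETS = 8
MEMBERS = ["server1", "server2", "server3"]

def bucket_for(partition_key):
    """Map a partitioning column value to a bucket (stable hash)."""
    return zlib.crc32(str(partition_key).encode()) % BUCKETS

def member_for(bucket):
    """Assign each bucket to a cluster member (round-robin here)."""
    return MEMBERS[bucket % len(MEMBERS)]

def route(row, key_column):
    """Route a row to the member that owns its bucket."""
    return member_for(bucket_for(row[key_column]))

orders = [
    {"order_id": 1, "customer_id": "c42"},
    {"order_id": 2, "customer_id": "c42"},
    {"order_id": 3, "customer_id": "c7"},
]

# Rows with the same partitioning value always land on the same member,
# so queries and joins on that column can be satisfied locally.
placements = {o["order_id"]: route(o, "customer_id") for o in orders}
```

The key property this sketch shows is co-location: because routing depends only on the partitioning column, all of a customer's orders end up on one member, which is what lets a partitioned cluster scale writes without cross-node coordination on every statement.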
Application and operations teams sometimes reach a point where they must upgrade the database. Whether it's due to data growth, insufficient throughput, too much downtime, the need to share data globally, additional ETL jobs, or something else, it's never a small project. Since these projects are expensive, any recommendation requires solid justification. This article a) characterizes three signs that a traditional database is hitting a wall, b) explains how vFabric SQLFire provides an advantage over traditional databases in each case, and c) should help you make a case for moving toward an in-memory, distributed, SQL-based data grid.
For those of us tasked with upgrading (or architecting) the data layer, we all go through similar steps. We build a project plan, make projections and sizing estimates, perform architecture and code reviews, create configuration checklists, provide hardware budgets and plans, talk to vendors about options, and more. Then, we work to plan the deployment with the least downtime, procure hardware and software, test different data load times, evaluate project risks, develop back-up plans, prepare communications to users about downtime, etc. You know the drill. These projects can take months and consume a fair amount of internal resources or consulting dollars. If you are starting or working on one of these projects with a traditional database architecture in mind, are you considering these three signs as you weigh your options?
As we’ve previously covered, data growth is staggering, and traditional database models are being stretched. On Tuesday, November 13, 2012 at 9:00 AM PST, VMware’s Joe Russell will present several topics related to Big, Fast, Flexible Data and how VMware’s key data management technologies help companies overcome some of the key challenges with traditional RDBMS.
Attend to learn:
How Hadoop and new analytics technologies are allowing companies to use Big Data in new ways to gain meaningful business insights
What’s new with Project Serengeti, a VMware initiative to help you deploy and manage elastic Hadoop clusters in minutes
How Fast Data is bringing data logic in-memory, allowing for dramatic scale, reduced costs, and improved performance
How Flexible Data, including NoSQL and open source relational data technologies, can improve your data model
How virtualizing the database layer enables a new Cloud Delivery Model, allowing enterprise IT departments to offer self-service data services elastically on demand, maintain centralized control, and operate within regulatory guidelines
How do you plan a roadmap for moving from a legacy data architecture to a cloud-enabled data grid? In this article, we offer a pragmatic, three-stage approach. At SpringOne 2012, the “Effective design patterns with NewSQL” session (see presentation embedded below) generated a lot of interest. (Thank you to everyone who joined us!) Jags Ramnarayan and I discussed problems with legacy RDBMS systems, NewSQL driving principles, SQLFire architecture, and application design patterns, as well as data consistency and reliability.
We went deep into vFabric SQLFire, a pragmatic solution that addresses these data challenges:
How do I architect my data tier for very high concurrent workloads?
How do I achieve predictability both for data access response time and availability?
How do I distribute data efficiently and in real time to multiple data centers (and to external clouds)?
How do I process these large quantities of data in an efficient manner to allow for better real-time decision-making?
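To make the multi-datacenter question concrete, here is a toy sketch of the asynchronous gateway idea that distributed data grids commonly use: a write commits to the local site immediately and is queued for later delivery to remote sites, trading a brief window of staleness for low local write latency. The class names and site names are invented for this illustration; this is not SQLFire code.

```python
from collections import deque

class Site:
    """Toy model of one data center's in-memory table."""
    def __init__(self, name):
        self.name = name
        self.table = {}
        self.outbox = deque()  # updates queued for remote sites

    def put(self, key, value):
        # The local write completes immediately...
        self.table[key] = value
        # ...and is queued for asynchronous WAN distribution.
        self.outbox.append((key, value))

def drain(source, target):
    """Deliver queued updates from one site to another (the 'gateway')."""
    while source.outbox:
        key, value = source.outbox.popleft()
        target.table[key] = value

ny = Site("new-york")
ldn = Site("london")

ny.put("cust:42", {"status": "gold"})
# Until the gateway runs, the remote site is briefly stale.
stale = "cust:42" not in ldn.table
drain(ny, ldn)  # gateway delivers the queued update
```

The design choice worth noticing is that the writer never waits on the WAN: cross-datacenter latency is paid by the background gateway, not by the application thread, which is how a data grid can span sites without making every transaction globally synchronous.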
In this short Q&A, we get the perspective of Heikki Linnakangas who’s just joined VMware after being a senior software architect and contributing to PostgreSQL for six years.
1. You’ve been involved with PostgreSQL for a while, could you give us a bit about your background and how you’ve been involved?
It all started in 2003, when my second child was born. I was at home with the baby for a month or two, and thought it would be fun to take a look at how a DBMS works under the covers. I have done programming as a hobby since I was a kid, but had not had a chance to do much outside a work environment for some time.
Despite what people tell you, managing online applications at cloud scale is hard. One of the main challenges is that as an application grows more and more popular, the underlying database often becomes the bottleneck.
When demand spikes, organizations are comfortable scaling their Web and App Server layers. However, as they increase the number of application instances to accommodate the growing demand, their data layer is unable to keep up.
We all know that a solution’s overall performance is only as good as its weakest link. Increasingly, the weakest link in today’s online applications is the database.
A Customer Example
Recently, a large retail customer spoke to us about their experiences in dealing with demand spikes during holidays. Their virtualized infrastructure was more than capable of scaling horizontally to address the growing demand. However, their underlying, traditional database could not handle the large load increases. The database started to experience deadlocks, connection timeouts, and various other problems.
In a nutshell, relational databases weren’t built for the cloud. With vFabric Postgres, VMware customers can get a proven, enterprise database integrated with VMware virtualization and ready for cloud computing.
As announced earlier this week, vFabric Postgres (vPostgres) is now available within vFabric Suite 5.1 Advanced. With vPostgres, the well-respected open-source database gains built-in best practices, optimized configuration, and cloud-ready features. While vFabric Postgres is synced with the PostgreSQL 9.1.3 minor release and includes all the new features of that version (see the PostgreSQL wiki for more), vFabric adds many features and considerable improvements in four categories:
1. Development and deployment become simpler, smarter, and cloud ready
2. Performance improvements with elastic memory and more
3. Monitoring and administration get an upgrade
4. Lower TCO and increased staff efficiency
Development and Deployment with vFabric Postgres
First, vPostgres is available in two form factors:
vPostgres Virtual Appliance
vPostgres RPMs for 64-bit Linux servers (RHEL 6, SUSE 11 SP1+)