VMware is strongly committed to PostgreSQL and believes it will be a broadly used and popular IT technology for decades to come. This latest release marks a significant advancement that underscores this belief. First, VMware has chosen to standardize on a single common core, donating all advancements to the core to the community at large. As a result, the Postgres community will benefit from consistent, professional engineering that will ensure this project continues to advance with the demands of industry, in particular with cloud computing. The new VMware distribution now shares the same common core as the open source PostgreSQL 9.2 release of September 2012.
This release builds on PostgreSQL 9.1, most notably with the addition of new developer-oriented capabilities, including JSON support, and enterprise IT-oriented capabilities such as cascading replication and index-only scans. These advancements solidify Postgres as a database that can handle the vast majority of data types and workloads.
In addition to improvements to the core, VMware will continue to extend the vFabric Postgres distribution to better meet the demands of large scale web applications running on virtualized and cloud deployments.
In our last post, we 1) covered how geographic data can release value in mobile and machine-based applications, 2) explained how technology is used to overcome barriers to these types of big data scenarios, and 3) detailed the architecture for a data fabric or grid (like vFabric GemFire) that works with geographic data and specialized or alternative indexes. There were also code examples to explain the object model, the spatial index, and data changes.
Now, we will continue the examples, show you how to make the index highly available, and use a function to access the data via the index.
The Scenario for a Highly Available Index
In some cases, a piece of data may be added to a node, or become primary on a node, without a clean method call. This happens in both failover and rebalancing. In the case of failover, a bucket held on a node as a redundant copy may suddenly become the primary copy if the node that held the primary fails.
In the case of rebalancing, an entire bucket can be moved to a new node that was added to the system without the benefit of capturing the “put” call on each piece of data.
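The failover case can be sketched in a few lines. This is a minimal illustration in plain Python, not GemFire code: the `Node` class, its `put` and `promote_to_primary` methods, and the dictionary that stands in for the spatial index are all hypothetical names introduced here. It shows why an index that is maintained only inside the put path is empty on a node that suddenly becomes primary, and one possible way to rebuild it from the bucket contents.

```python
class Node:
    """Hypothetical grid node holding one bucket and a secondary index."""

    def __init__(self, name, is_primary):
        self.name = name
        self.is_primary = is_primary
        self.bucket = {}  # key -> value (the data itself)
        self.index = {}   # value -> set of keys (maintained on the primary only)

    def _index_add(self, key, value):
        self.index.setdefault(value, set()).add(key)

    def put(self, key, value):
        """Store an entry; the index is updated only when this node is primary."""
        old = self.bucket.get(key)
        self.bucket[key] = value
        if self.is_primary:
            if old is not None and old in self.index:
                self.index[old].discard(key)  # drop the stale index entry
            self._index_add(key, value)

    def promote_to_primary(self):
        """Simulated failover: no put() was indexed here, so rebuild from the bucket."""
        self.is_primary = True
        self.index = {}
        for key, value in self.bucket.items():
            self._index_add(key, value)


primary = Node("node-a", is_primary=True)
redundant = Node("node-b", is_primary=False)

# Each write goes to the primary and is copied to the redundant node.
for key, city in [("truck-1", "Austin"), ("truck-2", "Boston"), ("truck-3", "Austin")]:
    primary.put(key, city)
    redundant.put(key, city)

index_before = dict(redundant.index)  # the redundant copy never built an index
redundant.promote_to_primary()        # simulate the failure of node-a
index_after = redundant.index
```

The same rebuild step would apply to a bucket that arrives through rebalancing, since the receiving node also never saw the individual put calls.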
Apache Derby is used for its RDBMS components, JDBC driver, query engine, and network server.
The partitioning technology of GemFire is used to implement horizontal partitioning features of vFabric SQLFire.
vFabric SQLFire specifically enhances the Apache Derby components, such as the query engine, the SQL interface, data persistence, and data eviction. It also adds components such as SQL commands, stored procedures, system tables, functions, persistence disk stores, listeners, and locators to operate a highly distributed and fault-tolerant data management cluster.
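As a rough illustration of what horizontal partitioning means, the sketch below spreads rows across a fixed number of partitions by hashing a key column. It is a generic hash-partitioning example in Python, not the algorithm SQLFire or GemFire actually uses; the function names `partition_for` and `partition_rows`, the `customer_id` column, and the choice of CRC32 as the hash are all assumptions made for illustration.

```python
import zlib


def partition_for(key, num_partitions):
    """Map a partitioning key to a partition number using a stable hash."""
    return zlib.crc32(str(key).encode("utf-8")) % num_partitions


def partition_rows(rows, key_column, num_partitions):
    """Group rows by the partition that their key column hashes to."""
    partitions = {p: [] for p in range(num_partitions)}
    for row in rows:
        partitions[partition_for(row[key_column], num_partitions)].append(row)
    return partitions


# Hypothetical table contents: eight customer rows spread over three partitions.
rows = [{"customer_id": i, "name": f"customer-{i}"} for i in range(1, 9)]
partitions = partition_rows(rows, "customer_id", 3)
```

Because the hash is stable, a lookup by `customer_id` can be routed directly to the single partition that holds the row instead of being broadcast to every node.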
Next year is going to be even bigger with the Pivotal Initiative, where several of the products covered on this blog will be moving to the new venture. This is still in the planning stages, so we expect to share the plans for our products alongside the formal communications from each of the companies involved. (Sorry, no extra information is available right now.)
Application and operations teams sometimes reach a point where they must upgrade the database. Whether it’s due to data growth, lack of throughput, too much downtime, the need to share data globally, adding ETLs, or otherwise, it’s never a small project. Since these projects are expensive, any recommendation requires a solid justification. This article a) characterizes 3 signs where traditional databases hit a wall, b) explains how vFabric SQLFire provides an advantage over traditional databases in each case, and c) should help you make a case for moving towards an in-memory, distributed data grid based on SQL.
For those of us tasked with upgrading (or architecting) the data layer, we all go through similar steps. We build a project plan, make projections and sizing estimates, perform architecture and code reviews, create configuration checklists, provide hardware budgets and plans, talk to vendors about options, and more. Then, we work to plan the deployment with the least downtime, procure hardware and software, test different data load times, evaluate project risks, develop back-up plans, prepare communications to users about downtime, etc. You know the drill. These projects can take months and consume a fair amount of internal resources or consulting dollars. If you are starting or working on one of these types of projects with a traditional database architecture in mind, are you considering these 3 signs as you consider your options?
The next release of Hyperic is coming up soon and the biggest change is to the backend. In the next release, we will only support one database, namely PostgreSQL. Those of you who have been with Hyperic for as long as I have may be surprised considering our history with PostgreSQL, but, as you read through this blog, it will start to make sense.
History of PostgreSQL and Hyperic
For the last few years, Hyperic has supported only two databases for production use at scale: Oracle and MySQL. This in itself was a big change since, at one point, PostgreSQL was our bread and butter. Hyperic was originally designed on PostgreSQL 7.x. As an open source project, PostgreSQL has a very easy license for distribution. As a startup company, we had to get our product out into the marketplace quickly and affordably, so PostgreSQL made sense.
As we’ve previously covered, data growth is quite unbelievable, and this means traditional database models are being stretched. On Tuesday, November 13, 2012 at 9:00 AM PST, VMware’s Joe Russell will be presenting on several topics related to Big, Fast, Flexible Data and how VMware’s key data management technologies help companies overcome some of the key challenges with traditional RDBMS.
Attend to learn:
How Hadoop and new analytics technologies are allowing companies to use Big Data in new ways to gain meaningful business insights
What’s new with Project Serengeti, a VMware initiative to help you deploy and manage elastic Hadoop clusters in minutes
How Fast Data is bringing data logic in-memory, allowing for dramatic scale, reduced costs, and improved performance
How Flexible Data, including NoSQL and open source relational data technologies, can improve your data model
How virtualizing the database layer enables a new Cloud Delivery Model, allowing enterprise IT departments to offer self-service data services elastically on demand, maintain centralized control, and operate within regulatory guidelines
How do you plan a roadmap for moving from a legacy data architecture to a cloud-enabled data grid? In this article, we will offer a pragmatic, three-stage approach. At SpringOne-2012, the “Effective design patterns with NewSQL” session (see presentation embedded below) generated a lot of interest. (Thank you to everyone who joined us!) Jags Ramnarayan and I discussed problems with legacy RDBMS systems, NewSQL driving principles, SQLFire architecture, application design patterns as well as data consistency and reliability.
We went deep into vFabric SQLFire which is a pragmatic solution that addresses these data challenges:
How do I architect my data tier for very high concurrent workloads?
How do I achieve predictability both for data access response time and availability?
How do I distribute data efficiently and real time to multiple data centers (and to external clouds)?
How do I process these large quantities of data in an efficient manner to allow for better real-time decision-making?
Virtualization continues to be one of the top priorities for CIOs. As the share of virtualized workloads approaches 60%, the enterprise is looking at database and big data workloads as the next target. Their goal is to realize the virtualization benefits with the plethora of relational databases sprawling in their data centers. With the increasing popularity of analytic workloads on Hadoop, virtualization presents a fast and efficient way to get started with existing infrastructure, and scale the data dynamically as needed.
VMware’s vFabric Data Director 2.5 now extends the benefits of virtualization to both traditional relational databases like Oracle, SQL Server and Postgres as well as Big Data, multi-node data solutions like Hadoop. SQL Server and Oracle represent the majority of databases in enterprises, and Hadoop is one of the fastest growing data technologies in the enterprise.
vFabric Data Director enables the most common databases found in the enterprise to be delivered as a service with the agility of public cloud and enterprise-grade security and control.
The key new features in vFabric Data Director 2.5 are:
Support for SQL Server – Currently supported versions of SQL Server are 2008 R2 and 2012.
Support for Apache Hadoop 1.0-based distributions: Apache Hadoop 1.0, Cloudera CDH3, Greenplum HD 1.1, 1.2 and Hortonworks HDP-1. Data Director leverages VMware’s open source Project Serengeti to deliver this capability.
Streamlined Data Director Setup – Complete setup in less than an hour
One-click template creation for Oracle and SQL Server through ISO based database and OS installation
Oracle database ingestion enhancements – Now includes Point In Time Refresh (PITR)
Data Director’s self-provisioning enables a whole new level of operational efficiencies that greatly accelerate application development. With this new release, Data Director now delivers these efficiencies in a heterogeneous database environment.