Today VMware is shipping a significant new release of Serengeti, its open source big data virtualization project: M4, or version 0.8.0. Designed to make it easier for Hadoop users to deploy, run and manage mixed workload clusters on a virtualized platform, this release broadens support across the Hadoop community's distributions, including new support for Cloudera CDH4, MapR, and HBase. Serengeti M4 also includes updated performance configuration improvements and a hardware reference architecture guide.
This release comes at a perfect time for an exploding data market. This year the world will create 4 zettabytes of new data, and more than 80% of that will be unstructured data that does not fit in a traditional database management system. At the same time, businesses are learning to harness that data and use it to their advantage.
A popular strategy for succeeding in the data market is Hadoop, an open source data framework that allows for the massive distributed processing of large data sets across clusters of nodes using simple programming models. Additionally, Hadoop offers a scalable file system (HDFS) that lets users store huge amounts of data on inexpensive disks in commodity servers. The powerful framework has spawned many new startups in Silicon Valley and has enterprise IT departments clamoring to harness the power of this technology. Huge web applications like Facebook, LinkedIn, Yahoo! and eBay all rely on Hadoop to process and store data for hundreds of millions of users. Continue reading →
We are very fortunate to post an interview with Shay Banon, the founder of elasticsearch. Elasticsearch is a technology that is very popular among some of the coolest companies on the web today, including SoundCloud, StumbleUpon, Mozilla and Klout. These companies use elasticsearch to deploy powerful search capabilities in their applications that are easy to set up, scalable and built for the cloud. In this interview, we get to learn all kinds of cool things:
How Shay got into search
How he came up with the idea for elasticsearch
Why elasticsearch is different from other OSS search projects
Running elasticsearch on virtualized infrastructure
Without further ado, here is the interview.
Q1. So, how did you end up getting into search?
About 10 years ago, I moved from Israel to London because my wife was going to study to be a chef at the Cordon Bleu. I had no job. I was in a new country. I was unemployed. So, I started to get into the latest, cool, new technologies. Continue reading →
Disabled SSL/TLS Compression. OpenSSL compression is now disabled by default for protection against the CRIME exploit vector. The mod_ssl “SSLCompression on” configuration option is added to allow the administrator to re-enable compression. See Vulnerability Summary for CVE-2012-4929. Continue reading →
The next release of Hyperic is coming up soon, and the biggest change is to the backend. In the next release, we will support only one database, namely PostgreSQL. Those of you who have been with Hyperic for as long as I have may be surprised considering our history with PostgreSQL, but, as you read through this blog, it will start to make sense.
History of PostgreSQL and Hyperic
For the last few years Hyperic has supported only two databases for production use at scale: Oracle and MySQL. This in itself was a big change, since at one point PostgreSQL was our bread and butter. Hyperic was originally designed on PostgreSQL 7.x. As an open source project, PostgreSQL has a very easy license for distribution. As a startup company we had to get our product out into the marketplace quickly and affordably, so PostgreSQL made sense.
Virtualization continues to be one of the top priorities for CIOs. As the share of virtualized workloads approaches 60%, the enterprise is looking at database and big data workloads as the next target. Their goal is to realize the benefits of virtualization across the plethora of relational databases sprawling in their data centers. With the increasing popularity of analytic workloads on Hadoop, virtualization presents a fast and efficient way to get started with existing infrastructure, and scale the data dynamically as needed.
VMware’s vFabric Data Director 2.5 now extends the benefits of virtualization to both traditional relational databases like Oracle, SQL Server and Postgres and Big Data, multi-node data solutions like Hadoop. SQL Server and Oracle represent the majority of databases in enterprises, and Hadoop is one of the fastest growing data technologies in the enterprise.
vFabric Data Director enables the most common databases found in the enterprise to be delivered as a service with the agility of public cloud and enterprise-grade security and control.
The key new features in vFabric Data Director 2.5 are:
Support for SQL Server – Currently supported versions of SQL Server are 2008 R2 and 2012.
Support for Apache Hadoop 1.0-based distributions: Apache Hadoop 1.0, Cloudera CDH3, Greenplum HD 1.1, 1.2 and Hortonworks HDP-1. Data Director leverages VMware’s open source Project Serengeti to deliver this capability.
Streamlined Data Director Setup – Complete setup in less than an hour
One-click template creation for Oracle and SQL Server through ISO based database and OS installation
Oracle database ingestion enhancements – Now includes Point In Time Refresh (PITR)
Data Director’s self-provisioning enables a whole new level of operational efficiencies that greatly accelerate application development. With this new release, Data Director now delivers these efficiencies in a heterogeneous database environment.
While vFabric Application Director supports a variety of products out of the box (mostly vFabric) and a growing number of products on the Cloud Application Management Marketplace (beta), such as Puppet Integration, it is easy to extend Application Director to support additional applications. Let’s take a look at how to use Application Director with Apache’s open source database, Cassandra. If you are new to Application Director, you might check out this 5-minute explanation. Otherwise, this post will show you how to automate the provisioning and setup of a Cassandra cluster with Application Director in two main steps: 1) creating the catalog service and 2) defining a blueprint. Then, we will look at an example. Continue reading →
If you follow this blog, you know we keep hearing people talk about simplicity when discussing app servers and architectures. We certainly heard this at JavaOne and also at VMworld, but it’s been popular for a while.
The fact is that traditional Java EE (JEE) app servers bring complexity to the mix. In addition, they are costly and consume a lot of resources. Forrester wrote Continue reading →