Tag Archives: bigfastdata

Disaster Recovery Jackpot: Active/Active WAN-based Replication in GemFire vs Oracle and MySQL

Ensuring your systems run smoothly, whether your data center has a hiccup or a real disaster strikes, is critical to many companies’ survival. As we enter the age of the zettabyte, seamless disaster recovery has become even more critical and difficult. There is more data than we have ever handled before, and most of it is very, very big.

Most disaster recovery (DR) sites are in standby mode—assets sitting idle, waiting for their turn. The sites are either holding data copied through a storage area network (SAN) or using other data replication mechanisms to propagate information from a live site to a standby site.  When disaster strikes, clients are redirected to the standby site where they’re greeted with a polite “please wait” while the site spins up.

At best, the DR site is a hot standby that is ready to go on short notice.  DNS redirects clients to the DR site and they’re good to go.

What about all the machines at the DR site?  With active/passive replication you can probably do queries on the slave site, but what if you want to make full use of all of that expensive gear and go active/active?  The challenge is in the data replication technology. Most current data replication architectures are one-way. If it’s not one-way, it can come with restrictions—for example, you need to avoid opening files with exclusive access. Continue reading

Why Every Database Must Be Broken Soon

Have you ever heard of a zettabyte? If you work in IT, you’ll be hearing more and more about zettabytes, exabytes, and petabytes, while the data terms we think of as big, such as terabytes and gigabytes, fade from our vocabulary. Right now, we are growing our data stores by 50% year-over-year, and it’s only accelerating.

In 2010, we crossed the barrier of the zettabyte (ZB) across all online data. This year, we will produce 4 ZB of data worldwide. In 2016, global IP traffic will reach 1.3 ZB.

While data volumes are skyrocketing, the type of data is also becoming more difficult for traditional databases to handle. Over 80% of it will be unstructured, file-based data that does not work well with the block-based storage typical of relational databases (RDBMS). So, even if hardware innovations could keep up to support greater volume, the kinds of data we are now storing break traditional RDBMS at today’s speeds.

The bottom line is that the volume and types of data being stored are unrealistic for a single, monolithic, structured RDBMS data store. Databases need to be broken apart and re-architected to survive the Information Explosion we are experiencing today.

Continue reading

From the Front-line of StrataConf: A VMware Perspective

Day 2 of the O’Reilly Strata Conference is starting here in Santa Clara, California and the focus is very much on data. In 2005, Tim O’Reilly predicted: “Data is the Next Intel Inside.” At VMware, big, fast data has never been so critical for our customers and innovations are transforming the cloud applications landscape at an unprecedented rate. This conference comes at the perfect time to reset what everyone knows about big, fast data.

The conference kicked off yesterday with several brief 20-minute keynotes, all succinct and to the point. Greenplum’s Scott Yara reflected on how the big data market has grown tremendously over the past few years and mentioned several key data scientist practitioners. Scott also mentioned the increased investment in open source Hadoop. Of course, Strata comes on the heels of Monday’s Greenplum Pivotal HD announcement, which launched their distribution of Hadoop, one that can improve performance 50X to 500X compared with existing SQL-like services on top of Hadoop.

Another great keynote presentation was from Yael Garten, a Senior Data Scientist from LinkedIn. Yael leads the mobile data analytics team. She began by polling the audience and noting that many had already been on three different devices that morning, and it wasn’t even 9:30 am yet. She noted we’re constantly connected, and we need to use data to personalize the experience for users no matter what device they’re on. She had an interesting graph highlighting device use and laptop use during our morning time of “coffee to couch.” And those uses are different in the US compared to places like India. Continue reading

Build Your First Mobile App in the Cloud in 45 Minutes (Tutorial)

Two of the hottest topics in technology today are “mobile” and “cloud.” They are at the top of most CTOs’ lists of objectives, yet they also seem to be the ones most shrouded in mystery. So where do you start?

With the video and do-it-yourself guide below!

This past year, at VMworld 2012 San Francisco and Barcelona, I ran a session where we built a complete database-backed web application from scratch using the SpringSource Tool Suite and the Grails framework for Java. Then, we published the application to Cloud Foundry—our open Platform-as-a-Service offering. Finally, we proceeded to build a mobile application that consumed the data from the web application built earlier.  I broke a cardinal rule by doing the entire session live, but it all went off without a hitch and audience participation with the application was an absolute blast. By the time we were done, we had built two applications from the ground up, and folks had an application that looked, smelled, and tasted like a native mobile application running on their phones. And, we did all of this in less than one hour! Continue reading

Part 2: The Value, Architecture, & Code for Building Geography-Based Apps

In our last post, we 1) covered how geographic data can release value in mobile and machine-based applications, 2) explained how technology is used to overcome barriers to these types of big data scenarios, and 3) detailed the architecture for a data fabric or grid (like vFabric GemFire) that works with geographic data and specialized or alternative indexes. There were also code examples to explain the object model, the spatial index, and data changes.

Now, we will continue the examples, show you how to make the index highly available, and use a function to access the data via the index.

The Scenario for a Highly Available Index

In some cases, a piece of data may be added to a node, or become primary on a node, without a clean method call. This happens during both failover and rebalancing. In the case of failover, a bucket held on a node as a redundant copy may suddenly become the primary copy if the node that held the primary fails.

In the case of rebalancing, an entire bucket can be moved to a new node that was added to the system without the benefit of capturing the “put” call on each piece of data. Continue reading
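To make the scenario concrete, here is a minimal, language-neutral sketch in Python. It is not GemFire’s API: the class `BucketIndexNode`, the hook name `on_became_primary`, and the use of a simple cell-to-keys dictionary in place of a real spatial index are all assumptions made for illustration. It shows why an index that is only updated inside `put` goes stale after failover or rebalancing, and how a “became primary” hook can repair it.

```python
from collections import defaultdict


class BucketIndexNode:
    """Hypothetical node that stores buckets and keeps a local index
    for the buckets it is primary for (illustration only, not GemFire)."""

    def __init__(self):
        self.buckets = defaultdict(dict)  # bucket_id -> {key: cell}
        self.primary = set()              # bucket ids this node is primary for
        self.index = defaultdict(set)     # cell -> keys (stand-in for a spatial index)

    def put(self, bucket_id, key, cell):
        """Normal write path: store the entry and index it if we are primary."""
        old_cell = self.buckets[bucket_id].get(key)
        self.buckets[bucket_id][key] = cell
        if bucket_id in self.primary:
            if old_cell is not None:
                self.index[old_cell].discard(key)  # drop the stale index entry
            self.index[cell].add(key)

    def receive_bucket(self, bucket_id, entries):
        """Replication or rebalance path: data arrives in bulk,
        so no per-entry put call is ever observed."""
        self.buckets[bucket_id] = dict(entries)

    def on_became_primary(self, bucket_id):
        """Failover or rebalance hook: index everything already in the bucket."""
        self.primary.add(bucket_id)
        for key, cell in self.buckets[bucket_id].items():
            self.index[cell].add(key)


node = BucketIndexNode()
node.receive_bucket(7, {"truck-1": "cell-A", "truck-2": "cell-B"})
node.on_became_primary(7)          # index is rebuilt from the bucket contents
node.put(7, "truck-3", "cell-A")   # later writes are indexed on the normal path
print(sorted(node.index["cell-A"]))
```

The design choice in this sketch is that the index is derived entirely from bucket contents, so it can always be rebuilt when ownership changes rather than relying on having seen every write.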

4 Reasons You Should Consider Modernizing Mainframe Apps in 2013

IT organizations face significant challenges in maintaining legacy mainframe applications, ranging from the high cost of proprietary hardware and software, to the attrition of people with mainframe skills and experience, to the inability to support the modern computing demands of mobile and big, fast data.

Cloud computing offers an opportunity to rationalize and modernize application portfolios, which can include migrating legacy mainframe apps to the cloud. Unfortunately, many IT organizations see the prospect of modernizing mainframe apps as a “mission impossible”: the path forward is too cloudy, and the costs and risks are too great.

As a result, many resign themselves to living with the burdens of a legacy mainframe environment. And while preserving the status quo may appear to be the best option, over time it only intensifies the challenges associated with maintaining mainframe apps. Eventually the business loses confidence in IT’s ability to deliver, and costs continue to rise without corresponding value. Continue reading

Building a Multi-Tenant Development Mindset in the Travel Software Industry

The travel industry has been a technology innovator for decades.

But how do these tech innovators use a cloud application platform like vFabric?

In this article, we get a real-world, inside perspective from a cloud architect who designs and leads development teams for airline check-in and baggage software and cloud-based services. We will dive into his requirements and approaches to cloud-centric devops tools that keep systems running in high performance environments.

Travel companies use technology everywhere they can. For example, their technology lets us buy tickets over the web, check in via self-serve kiosks, and use iPhones or Androids as boarding passes. It wasn’t long ago that these capabilities didn’t exist, but innovative companies like American Airlines use technology everywhere to differentiate their company and connect with customers. For example:

  • AA.com gets 1.6 million visits per day.
  • Their mobile app has over 3 million downloads:
    • It’s available on iPhone, iPad, iPod Touch, Android, Blackberry, Nook, Windows Phone 7, and Amazon Kindle Fire.
    • Mobile boarding passes are available in 77 cities.
    • The apps include wifi flight search, flight status notification, and more.
  • Over 800 kiosks allow customers to check in, and customers can also check in from the website. The kiosks provide passport, credit card, and barcode scanning.
  • The list continues with on-board purchases, advanced loyalty programs, RFID, and more. Continue reading

The Best VMware vFabric Stories of 2012 & What’s In Store for 2013

As this year comes to a close, it’s time to reflect on what happened over the past year and start planning for the new one. The vFabric team has had some major achievements this year, introducing several new products to the market, including the innovative vFabric Application Director, the widely anticipated Project Serengeti to enable rapid cloud deployments for Hadoop, and a new tool for vFabric Suite users called vFabric Administration Server (VAS). We also announced a new VMware Cloud Applications Marketplace to help further accelerate application development with a professionally moderated library of enterprise-grade, ready-to-use application components that can be run on any cloud.

Next year is going to be even bigger with the Pivotal Initiative, as several of the products covered on this blog will be moving to the new venture. This is still in the planning stages, so we expect to share the plans for our products alongside the formal communications from each of the companies involved. (Sorry, no extra information is available right now.)

One thing we will be doing in early 2013 is bringing the conversation about how you manage applications together with the conversation about how you manage virtual infrastructure. To that end, we will be moving all topics of Application Performance Manager, AppInsight, Application Director, Hyperic, and Spring Insight to the VMware Management Blog as of January 1st. To make sure you keep up with the management topics, please be sure to follow us @vmwareappmgmt and @vmwaremgmt.

In the meantime, we’d like to reshare with you the top 20 stories we had for 2012, and invite you to comment here on what stories you would like to see us cover on either blog for 2013.

Continue reading

Part 1: The Value, Architecture, & Code for Building Geography-Based Apps

Will machine-generated data be larger than mobile and tablet-generated data?

No matter where you might place your chips on that bet, they both rely on geographic data for quite a number of business applications. These geographic data applications stand to release a tremendous amount of business value, and, in this two-part series, we will explain:

  1. How geographic data can release business value through applications.
  2. Where technology overcomes big data barriers to release the business value.
  3. The concepts behind vFabric GemFire’s data fabric as well as an object model and data architecture for software services connecting to geographic data fabrics.
  4. How to use an open source quadtree index and the related Java code for interacting with geographic data in vFabric GemFire.

How Geographic Data Releases Business Value

Many early versions of geography, location, or proximity-based applications can be found in the market. Recently, we published a few examples of these types of applications in articles about ocean sensor data and mobile applications, but there are more: Continue reading

Big, Fast Data Opportunities in Mobile Applications

Mobile applications are one thing, but mobile apps WITH fast data requirements are another.

The combination of mobile apps and fast data requirements can cause major data scale issues. Whether you are trying to update an existing application or build a new application, mobile apps with personalization, pricing, location, or gaming functionality must consider data architecture differently from the outset.

The Fundamental Growth Characteristics of Mobile

It’s official: mobile applications now account for over 50% of the computing market. It crept up on us quickly, and it’s easy to forget that mobile growth is different from what we’ve seen before. Of course, most people in the mobile space know about the growth. You would have to be living in a country without internet, TV, or radio not to have heard that mobile usage is growing tremendously along with mobile applications. But let’s put this in perspective: AOL took 2.5 years to reach 50 million users, and Angry Birds took 35 days. Yes, Angry Birds was a smash hit, but so was AOL.

An AT&T Senior EVP recently wrote, “Over the past five years, AT&T’s wireless data traffic has grown 20,000%. The growth is now primarily driven by smartphones.” In fact, many say that mobile use will cause a spectrum deficit in the U.S. According to the Telegraph, smartphones are mostly used for internet (24 minutes and 29 seconds per day) and social media (17 minutes and 29 seconds per day), while phone calls rank 5th (12 minutes and 6 seconds per day). Similarly, mobile commerce is projected to rise from 1% of all e-commerce sales in 2010 to 7% in 2016 (i.e., from $3 billion to $31 billion over a six-year period). Apps are also accounting for more minutes of usage. So it is no wonder business groups are clamoring for mobile-centric programs and applications.
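To put those growth figures on a common footing, a rough compound-annual-growth calculation helps. The sketch below is our own illustration rather than anything from the cited sources: the function name `annual_growth_rate` is hypothetical, and it assumes that “grown 20,000%” means traffic ended at 201 times its starting level.

```python
def annual_growth_rate(start, end, years):
    """Compound annual growth rate that takes `start` to `end` in `years` years."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end, and years must all be positive")
    return (end / start) ** (1.0 / years) - 1.0


# Mobile commerce: $3 billion (2010) to $31 billion (2016), six years.
mobile_commerce = annual_growth_rate(3.0, 31.0, 6)

# AT&T wireless data: a 20,000% increase over five years,
# read here as ending at 201x the starting level (an assumption).
att_traffic = annual_growth_rate(1.0, 201.0, 5)

print(f"mobile commerce: {mobile_commerce:.1%} per year")
print(f"AT&T data traffic: {att_traffic:.1%} per year")
```

Under those assumptions, mobile commerce compounds at a little under 50% a year, while AT&T’s data traffic nearly triples annually.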

The bottom line is that mobile applications are growing data differently than traditional database applications.

Continue reading