How Even the Ocean (Data) Is In The Cloud

Recently, VMware worked with the Ocean Observatories Initiative on an interesting case study that affects us all. The U.S. has built an ocean of big data on the ocean itself. Currently, we are collecting about 8 terabytes a day, or roughly 3 petabytes a year, of data about the ocean in order to study the body of water that covers over 70% of the Earth more efficiently and safely.
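As a quick sanity check on those figures, here is a minimal sketch of the arithmetic. It assumes decimal units (1 petabyte = 1,000 terabytes) and a 365-day year; the function name is purely illustrative.

```python
# Rough check of the data-rate figures quoted above.
TB_PER_DAY = 8        # terabytes collected per day, as stated in the article
DAYS_PER_YEAR = 365   # assumption: non-leap year
TB_PER_PB = 1000      # assumption: decimal units, 1 PB = 1000 TB


def yearly_petabytes(tb_per_day: float) -> float:
    """Convert a daily volume in terabytes to a yearly volume in petabytes."""
    return tb_per_day * DAYS_PER_YEAR / TB_PER_PB


print(f"{yearly_petabytes(TB_PER_DAY):.2f} PB per year")  # prints "2.92 PB per year"
```

So 8 terabytes a day works out to a little under 3 petabytes a year, which matches the rounded figure above.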

The Ocean Observatories Initiative (OOI) is a 25-year program responsible for managing a networked set of hundreds of sensor instruments that sit in the ocean, take measurements, and send data back to a massive data infrastructure, which makes data sets and reports available to oceanographers, scientists, educators, and the public on a very broad scale. This system, quite literally, is a Hubble Telescope for observing the ocean. While this mega-system has an amazing history and tons of interesting capabilities, we think it’s pretty cool that VMware vSphere and vFabric RabbitMQ play key roles.

Monitoring the Oceans

[Figure: Example ocean monitoring]

As information technologists, we’ve all dealt with monitoring systems in some way, shape, or form, but imagine if the ocean were your data center.

Instead of monitoring disk, CPU, or memory usage, the OOI captures data from global, regional, and coastal sensors and pulls the information into a common information management system called the Common Operating Infrastructure (COI). The sensors include telemetering buoys, electro-optical seafloor cables, underwater gliders, AUVs, profilers, moorings, fixed instrument chains, seafloor equipment, and sub-seafloor installations: a total of 49 classes and over 700 instruments deployed off of 6 coastlines. The diagram above shows examples.

These sensors help scientists to:

- Provide continuous, real-time information about climate, circulation, ecosystem dynamics, air-sea exchange, seafloor processes, and plate-scale dynamics.
- Track dozens of measurements like humidity, water velocity, salinity, pressure, chlorophyll, and nitrates across various physical, chemical, geological, and biological variables on a coastal, regional, and global scale.
- Measure at microscopic and global levels.
- Capture a massive amount of time- and space-based information (think geo/map data combined with calendar/time data on steroids).

Processing and Using the Ocean Observatories Data

[Figure: OOI CI Data Processing Diagram]

Most importantly, all of this data is made available to scientists via “virtual observatories”. In the diagram, the data moves from sensors on the left through physical interfaces as part of Marine Management and Operations. The data is then processed in various ways to correlate information, deal with calibration of instruments, add quality assurance, provide management capabilities (e.g., security policies, metadata cataloging, notifications), and more.
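To make the calibration and quality-assurance steps a little more concrete, here is a minimal sketch of what such a processing stage might look like. Everything in it is an assumption for illustration: the `Reading` and `Product` types, the linear calibration, the instrument ID, and the valid range are hypothetical and are not taken from OOI's actual processing code.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    instrument_id: str
    variable: str
    raw_value: float


@dataclass
class Product:
    instrument_id: str
    variable: str
    value: float
    qa_flag: str  # "pass" or "suspect"


# Hypothetical per-instrument linear calibration: value = gain * raw + offset
CALIBRATION = {"ctd-01": (1.02, -0.15)}
# Hypothetical plausible range per variable, used as a simple QA check
VALID_RANGE = {"salinity": (0.0, 42.0)}


def process(reading: Reading) -> Product:
    """Apply calibration, then flag values that fall outside the valid range."""
    gain, offset = CALIBRATION.get(reading.instrument_id, (1.0, 0.0))
    value = gain * reading.raw_value + offset
    low, high = VALID_RANGE.get(reading.variable, (float("-inf"), float("inf")))
    flag = "pass" if low <= value <= high else "suspect"
    return Product(reading.instrument_id, reading.variable, value, flag)


print(process(Reading("ctd-01", "salinity", 35.0)).qa_flag)  # prints "pass"
```

Keeping calibration constants and QA ranges in lookup tables, rather than in the code path, is one way a system like this could adjust to new instruments without changing the processing logic.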

As the data is made available to end users, it is distributed through GUIs and downloaded as data sets in formats such as CSV and, potentially, Keyhole Markup Language (KML) or MATLAB binary files. The data supports a variety of consumption use cases that ultimately support scientific analysis and presentation. See an example of the end-user GUI below.
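As a rough illustration of the CSV download path, the sketch below serializes a couple of time- and position-stamped measurements using Python's standard `csv` module. The column names and sample values are invented for the example and do not reflect OOI's actual data-set layout.

```python
import csv
import io

# Hypothetical data-set rows: time (ISO 8601), position, and one measurement.
ROWS = [
    {"time": "2012-08-01T00:00:00Z", "lat": 44.64, "lon": -124.30, "salinity": 33.1},
    {"time": "2012-08-01T00:01:00Z", "lat": 44.64, "lon": -124.30, "salinity": 33.2},
]


def to_csv(rows):
    """Serialize a list of dict rows to CSV text with a header line."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


print(to_csv(ROWS))
```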

Under the Hood – A System Architecture Overview

The entire system is referred to as the Integrated Observatory Network (ION), and ION provides a unifying information conduit with additional capabilities like identity management, governance, state management, resource management, a service framework, and a presentation framework. The subsystems and COI capabilities are held together by a messaging service, based on RabbitMQ, that is applied in a cross-cutting way to the interaction of all elements across all subsystems.

OOI calls this messaging system the “Exchange”. The diagram shows the Exchange components. It uses publish/subscribe (pub/sub) messaging and queues as the central paradigm to wire together and integrate information between all applications. The subsystems (i.e., applications) include:

- The Data Management subsystem – manages the dynamic data distribution network of data products and metadata based on the OOI-CI common data model.
- The Sensing and Acquisition (SA) subsystem – provides the life cycle and operational management of sensor network environments as well as observing activities (i.e., scheduling, collecting, processing, calibration) associated with sensor data acquisition.
- The Analysis and Synthesis (AS) subsystem – provides capabilities to support advanced data analysis and output synthesis applications. This includes the visualization of science data products, the execution of user-provided real-time and interactive analysis workflows, and the operational management of numerical models.
- The Planning and Prosecution (PP) subsystem – provides services together with the standard models for the management of stateful and taskable resources. It provides controller processes with the semantics to monitor and control the operating state of an active resource as well as to initiate, monitor and amend tasks being carried out by a taskable resource.
- The Common Execution Infrastructure (CEI) – provides an infrastructure for the virtualization of computing across the OOI, including taskable resource provisioning, remote operational management, and process execution.
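The pub/sub-and-queues paradigm that wires these subsystems together can be sketched with a toy in-memory broker. This is only an illustration of the pattern, not the RabbitMQ or ION API; the `Broker` class, the topic name, and the idea that two particular subsystems both subscribe to raw sensor data are assumptions made for the example.

```python
from collections import defaultdict, deque


class Broker:
    """Toy in-memory stand-in for a message broker (not the RabbitMQ API)."""

    def __init__(self):
        self._bindings = defaultdict(list)  # topic -> queues bound to it

    def bind(self, topic):
        """Create a queue bound to `topic` and hand it to the subscriber."""
        queue = deque()
        self._bindings[topic].append(queue)
        return queue

    def publish(self, topic, message):
        """Copy the message into every bound queue; return the delivery count."""
        queues = self._bindings.get(topic, [])
        for queue in queues:
            queue.append(message)
        return len(queues)


broker = Broker()
# Hypothetical subscribers: two subsystems each get their own queue.
data_management_queue = broker.bind("sensor.raw")
analysis_queue = broker.bind("sensor.raw")

delivered = broker.publish("sensor.raw", {"instrument": "glider-7", "salinity": 33.1})
print(delivered)  # prints 2
```

The publisher never names its consumers. It only names a topic, and each subscriber drains its own queue at its own pace.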

External integrations are also considered a key type of system:

- User experience and application interfaces are provided via HTML (web user interfaces), direct data access, and exchange APIs (if authorized).
- Marine integration provides facilities for agents to manage physical instruments through a common interface to provide status, represent capabilities, perform data acquisition, take commands, and more.
- External observatory integration allows for adapters, scripts, and integration tools for external data input and output.

How the Exchange Works with RabbitMQ

[Figure: Data Product Generation Scenario]

The Exchange shown in the diagram uses a common message format and manages “Exchange Points” and “Exchange Spaces”. These are where Message Clients interface.

- Message Clients are the interfaces to application logic.
- Exchange Points are the ‘postboxes’: where messages can be sent to and received from.
- Exchange Spaces group points and permitted users, i.e., they are like ‘postal services’.
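The postbox and postal-service analogy above can be modeled in a few lines. The classes below are a hypothetical sketch of the concepts only; the class names mirror the terms in the list, but the methods, the permission check, and the user names are assumptions and not ION's real interfaces.

```python
class ExchangePoint:
    """A 'postbox': messages can be sent to it and received from it."""

    def __init__(self, name):
        self.name = name
        self._messages = []

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        """Return the oldest message, or None if the postbox is empty."""
        return self._messages.pop(0) if self._messages else None


class ExchangeSpace:
    """A 'postal service': groups points and the users permitted to use them."""

    def __init__(self, name, permitted_users):
        self.name = name
        self._permitted = set(permitted_users)
        self._points = {}

    def point(self, user, point_name):
        """Return the named point, creating it on first use, if `user` is permitted."""
        if user not in self._permitted:
            raise PermissionError(f"{user} may not use space {self.name}")
        if point_name not in self._points:
            self._points[point_name] = ExchangePoint(point_name)
        return self._points[point_name]


space = ExchangeSpace("ooi", permitted_users={"alice"})
space.point("alice", "raw-data").send({"salinity": 33.1})
print(space.point("alice", "raw-data").receive())  # prints {'salinity': 33.1}
```

In this sketch a Message Client would be whatever application code calls `send` and `receive` on a point it has been granted access to.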

For example, the diagram depicts raw data and metadata coming in from an instrument in Portland and being put on the Exchange for consumption by any interested party. In this case, the raw stream is noted in the repository and picked up for data processing in an Amazon cloud. The raw data is also turned into processed data and put back on the Exchange, where it updates the repository again and feeds a research team’s event detection.
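The flow in this scenario can be sketched end to end with a small callback-based exchange. This is a simplified, single-process illustration under stated assumptions: the `Exchange` class, the stream names, the scaling factor, and the detection threshold are all hypothetical, and the real system distributes these steps across separate services.

```python
from collections import defaultdict


class Exchange:
    """Toy callback-based exchange; a stand-in, not the ION or RabbitMQ API."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, stream, handler):
        self._subscribers[stream].append(handler)

    def publish(self, stream, message):
        for handler in list(self._subscribers[stream]):
            handler(message)


exchange = Exchange()
repository = []  # catalog of (stream, message id) pairs that have been noted
events = []      # message ids flagged by the research team's detector


def record(stream):
    """Build a handler that notes each message on `stream` in the repository."""
    return lambda msg: repository.append((stream, msg["id"]))


def process_raw(msg):
    # Hypothetical processing step: scale raw counts, then republish.
    exchange.publish("processed", {"id": msg["id"], "value": msg["counts"] * 0.01})


def detect_event(msg):
    if msg["value"] > 1.0:  # hypothetical detection threshold
        events.append(msg["id"])


exchange.subscribe("raw", record("raw"))
exchange.subscribe("raw", process_raw)
exchange.subscribe("processed", record("processed"))
exchange.subscribe("processed", detect_event)

exchange.publish("raw", {"id": "portland-001", "counts": 250})
print(repository)
print(events)
```

The instrument only publishes to the raw stream. The repository, the processing step, and the event detector are each attached independently, so any of them could be added or removed without touching the publisher.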

This is an example of how message-oriented systems enable “loosely coupled” integration, because the message senders are not directly coupled to message receivers. Instead, everyone is connected to the Exchange. Loose coupling is an important architectural property that has beneficial influences on maintainability, extensibility, robustness, scalability, and other quality properties of the system and its individual software components.

Additional Information

According to OOI’s documentation on release 1, ION uses AMQP 0.9.1 and RabbitMQ-Server v. 2.3.1 on CentOS 5.5.

- To learn more about RabbitMQ, visit the website, podcast, or product page, or download a trial.
- See the full case study, “With VMware RabbitMQ, OOI Gives Ocean Scientists Vast New Infrastructure”.

About the Author: Stacey Schneider has over 15 years of experience working with technology, with a focus on sales and marketing automation as well as internationalization. Schneider has held roles in services, engineering, and products, and was the head of marketing and community for Hyperic before it was acquired by SpringSource and VMware. She is now working as a product marketing manager across the vFabric products at VMware, including supporting Hyperic. Prior to Hyperic, Schneider held various positions at CRM software pioneer Siebel Systems, including Group Director of Technology Product Marketing, a role in which her contributions earned her a patent. Schneider received her BS in Economics with a focus in International Business from the Pennsylvania State University.
