Map-Reduce Style: Data Aware Distributed Querying in vFabric GemFire

Big, fast data is powering some of the most interesting computing opportunities in today’s market. But to get there, we need to change our approach to the data tier. Enterprises are trying to move from costly mainframe architectures to virtualized datacenters and to use commodity hardware more efficiently. For the data tier, this means an architecture that scales horizontally by adding more commodity computing and storage at runtime.

To scale the data tier horizontally, companies use systems like vFabric GemFire, a distributed data system designed specifically to accommodate large data sets across commodity hardware nodes. In GemFire, data is spread across the members of a cluster, referred to as “nodes,” and the distribution of data across those nodes is called “partitioning.” vFabric GemFire then lets developers query data that resides across many nodes while retaining very high performance at scale. How? In short, the answer is “Data Aware Querying”: a query API that allows a query to execute on selected nodes instead of all nodes (i.e., in a map-reduce style).

To answer this question, this article covers the following:

  • Understanding Data Partitioning
  • Understanding Basic Data Querying
  • Using Custom Partitioning to Achieve Data-Aware Querying
  • Implementing Function Execution with Custom Partitioning

Understanding Data Partitioning

First, we should understand how data is mapped out in order to understand how we can store and access a lot of it quickly and in dynamic ways.

GemFire partitions data using keys, so a subset of keys and their corresponding values is stored on a single node. This approach facilitates concurrent access to a large data set with very high throughput without affecting store/access latency in a cluster of nodes. A key is a unique entity, which makes store/access an O(1) operation (i.e., one that takes roughly the same amount of time regardless of the size of the data set), while the values themselves may be duplicated. Also, a key can either be an independent entity like a sequence number or be composed of references to multiple attributes in a value, which lets the partitioning be based on a composite key.

Partitioning data can increase query performance because a query can scan part of a large data set instead of performing a full data store scan or multiple random reads scattered across the whole data store.

Within GemFire, data is partitioned using a partitioned region. Each partition, or node, holds multiple buckets, and the number of buckets is configured at startup. Buckets are distributed across the nodes, and each entry is assigned to a bucket deterministically based on its key. As one additional piece of background, a bucket is the smallest unit of data in a partitioned region that can be moved from one partition (JVM) to another during rebalancing.
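The key-to-bucket-to-node mapping described above can be sketched in plain Java. This is only an illustration under stated assumptions: the hash-modulo bucket choice, the round-robin bucket placement, and the names BucketRoutingSketch, bucketFor, and nodeFor are hypothetical and are not GemFire's actual implementation or API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: not GemFire's actual hashing or bucket placement.
class BucketRoutingSketch {

    // Map a key deterministically to a bucket ID in [0, totalBuckets).
    static int bucketFor(Object key, int totalBuckets) {
        return Math.floorMod(key.hashCode(), totalBuckets);
    }

    // Place buckets on nodes round-robin (an assumption made for this sketch).
    static int nodeFor(int bucketId, int nodeCount) {
        return bucketId % nodeCount;
    }

    public static void main(String[] args) {
        int totalBuckets = 9;
        int nodeCount = 3;
        Map<Integer, List<String>> keysByNode = new TreeMap<>();
        for (int seq = 1; seq <= 12; seq++) {
            String key = "passenger-" + seq;
            int bucket = bucketFor(key, totalBuckets);
            int node = nodeFor(bucket, nodeCount);
            keysByNode.computeIfAbsent(node, n -> new ArrayList<>())
                      .add(key + "@bucket" + bucket);
        }
        // Print which keys (and buckets) ended up on which node.
        keysByNode.forEach((node, keys) -> System.out.println("node " + node + ": " + keys));
    }
}
```

Because the same key always maps to the same bucket, a lookup only needs to compute the bucket and go to the node that currently hosts it. Rebalancing would change the bucket-to-node placement, not the key-to-bucket mapping.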

Understanding Basic Data Querying

GemFire provides a modern way to query distributed data. A query is executed in a scatter-gather fashion: it starts from a coordinator node, which gathers results from the other nodes concerned and finally provides the results to the application. All nodes where the query is executed are considered data nodes, and the first node, which starts the query (or receives it from a client), becomes the coordinator. This allows the query to run in parallel on the data nodes concerned, with results gathered on the coordinator node for final processing. For example, the coordinator for an ORDER BY query performs the merge sort of the ordered result-set chunks.
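The gather step for an ORDER BY query can be pictured as a k-way merge of chunks that each data node has already sorted. The following is a minimal sketch of that idea in plain Java. The class and method names are hypothetical, and it does not reproduce GemFire's internal merge logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Illustrative only: a k-way merge of already-sorted per-node results,
// standing in for the coordinator's final ORDER BY step.
class ScatterGatherSketch {

    static List<Integer> mergeSortedChunks(List<List<Integer>> chunks) {
        // Each queue entry is {value, chunkIndex, positionInChunk}.
        PriorityQueue<int[]> heads =
                new PriorityQueue<int[]>((a, b) -> Integer.compare(a[0], b[0]));
        for (int c = 0; c < chunks.size(); c++) {
            if (!chunks.get(c).isEmpty()) {
                heads.add(new int[] {chunks.get(c).get(0), c, 0});
            }
        }
        List<Integer> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            int[] head = heads.poll();
            merged.add(head[0]);
            // Advance within the chunk the smallest element came from.
            int next = head[2] + 1;
            List<Integer> chunk = chunks.get(head[1]);
            if (next < chunk.size()) {
                heads.add(new int[] {chunk.get(next), head[1], next});
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Passenger ages returned by three data nodes, each sorted locally.
        List<List<Integer>> chunks = Arrays.asList(
                Arrays.asList(19, 27, 33),
                Arrays.asList(21, 22, 34),
                Arrays.asList(18, 30));
        System.out.println(mergeSortedChunks(chunks)); // [18, 19, 21, 22, 27, 30, 33, 34]
    }
}
```

The sorting work is done in parallel on the data nodes, and the coordinator only needs to interleave the chunks.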

Before we get more advanced, let’s begin with a basic example. Again, GemFire distributes the data using keys in a key-value pair. Querying this data involves a SQL-like query language known as Object Query Language, or OQL. Without any special partitioning in GemFire (as discussed later), keys end up having no relation to values. OQL queries are executed on values without specifying how the data is distributed across nodes (i.e., where to scatter the query). Without that information, all nodes must be queried, which is both inefficient and expensive to do across the network.

For example, let’s say we have a Passenger object with a Flight field.

class Passenger {
    String name;
    Date travelDate;
    int age;
    Flight flt;
}

class Flight {
    int flightId;
    String origin;
    String dest;
}

Let’s say 100 million Passenger objects are stored in the “Travelers” datastore (a datastore is called a “Region” in GemFire), which is partitioned across 3 nodes, and we want to run the following query for all Passengers in the Region.

"SELECT p.travelDate, p.age
FROM /Travelers p, p.flt f
WHERE f.origin IN ('Boston', 'Chicago')
AND f.dest = 'Seattle'
AND p.age < 35"

(Note: An airline could use the example query above to decide which movies to offer on flights from Boston or Chicago to Seattle, based on Passenger age being under 35. The idea is that young travelers or families with children will want more family-oriented movies, while adults may want news or recent episodes of a popular television series.)

This query would essentially perform a full scan of 100 million records, which is highly inefficient. While GemFire supports the creation of indexes, we omit them here to isolate the improvement due to data-aware partitioning alone.

Using Custom Partitioning to Achieve Data-Aware Querying

Logically, a query will be more efficient if it is targeted to a specific location. Custom partitioning or fixed partitioning (sometimes also referred to as “column-based partitioning” in relational database terms) is a GemFire option used to deterministically distribute data. In GemFire 6.6.2, we can selectively query the distributed (i.e., partitioned) data based on a column.

Using the same example as above, all Passenger data is partitioned across multiple GemFire nodes. In the Passenger object, Flight has an “origin” field. Data can be routed to particular buckets (i.e., sections within a partition) based on the origin city if we make it part of the key. This means the routing of Passenger data to a particular node is based on the “origin” provided in the Flight field. As a result, only certain buckets are queried, rather than entire nodes or partitions, so there are NOT 100 million Passenger objects to iterate over. With data-aware querying set up, the above query is executed against a limited data set. If we assume there are 100 million passengers, 50% of Passengers are under 35, there are 9 origin cities in total, and no indexes exist, GemFire’s query engine iterates over ~11 million passenger objects [100 million × 50% × (2/9) ≈ 11.11 million].
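The estimate above is simple enough to check by hand, but a small sketch makes the inputs explicit. It uses only the figures assumed in the text and additionally assumes passengers are spread evenly across the 9 origin cities, which the 2/9 factor implies. The class and method names are hypothetical.

```java
import java.util.Locale;

// Illustrative arithmetic only; the inputs are the article's stated assumptions.
class ScanEstimateSketch {

    // Returns the estimated number of objects (in millions) the query iterates over.
    static double estimatedObjectsMillions(double totalMillions, double fractionUnder35,
                                           int queriedOrigins, int totalOrigins) {
        return totalMillions * fractionUnder35 * queriedOrigins / totalOrigins;
    }

    public static void main(String[] args) {
        // 100 million passengers, 50% under 35, 2 of 9 origin cities queried.
        double estimate = estimatedObjectsMillions(100.0, 0.5, 2, 9);
        System.out.printf(Locale.ROOT, "~%.2f million passenger objects%n", estimate); // ~11.11 million
    }
}
```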

To custom-partition the data, an application developer implements a PartitionResolver to plug their partitioning strategy into GemFire. A PartitionResolver might look as follows:

import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import com.gemstone.gemfire.cache.EntryOperation;
import com.gemstone.gemfire.cache.PartitionResolver;

/**
 * This resolver stores all Passengers with the same origin in one bucket.
 * The region is configured to contain 9 buckets.
 */
public class MyPartitionResolver implements PartitionResolver {

    // The number of buckets is configurable for a partitioned region.
    // 9 different locations are possible; two are shown here.
    private final Map<String, Integer> keyToRoutingObject = new HashMap<String, Integer>();

    public MyPartitionResolver() {
        keyToRoutingObject.put("Seattle", 8);
        keyToRoutingObject.put("Chicago", 4);
    }

    public Serializable getRoutingObject(EntryOperation opDetails) {
        // - - - - - -
        // opDetails.getKey() returns the key used in region.put(key, value).
        // The key could be "seq_num+origin"; it is assumed here to expose getOrigin().
        return keyToRoutingObject.get(opDetails.getKey().getOrigin());
    }
}

All Passengers with the same origin in the Flight field are routed to the same bucket on the same node, as shown in the diagram below.

[Diagram: GemFire Function Execution Service]

Implementing Function Execution with Custom Partitioning

GemFire’s Function Execution Service can then be used on this partitioned data to operate on distributed data in a map-reduce fashion and to query data where it is located. This is known as data-aware querying. A Function Execution Service task can be executed on a particular node or set of nodes. Functions are shipped to the filtered nodes (in the above diagram, Partition B for “Chicago” and Partition C for “Seattle”) and execute their code locally on each node. Query execution also happens ONLY locally using the new API; no remote or distributed querying is performed on a node. The difference between querying without a function context and with one is that in the former case the query runs on all local buckets, while in the latter it runs only on Buckets C and S.

To query through the new Query API inside a Function:

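import com.gemstone.gemfire.cache.CacheFactory;
import com.gemstone.gemfire.cache.execute.FunctionAdapter;
import com.gemstone.gemfire.cache.execute.FunctionContext;
import com.gemstone.gemfire.cache.execute.RegionFunctionContext;
import com.gemstone.gemfire.cache.query.Query;
import com.gemstone.gemfire.cache.query.QueryService;
import com.gemstone.gemfire.cache.query.SelectResults;

public class EmpFunction extends FunctionAdapter {
    // - - - - -
    public void execute(FunctionContext context) {
        // - - - - -
        // The OQL string arrives as the function argument.
        QueryService queryService = CacheFactory.getAnyInstance().getQueryService();
        Query query = queryService.newQuery((String) context.getArguments());
        // New API: executes only on the local buckets selected by the filter.
        SelectResults results = (SelectResults) query.execute((RegionFunctionContext) context);
        // - - - - -
    }
    // - - - - -
}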


Execute the above Function from your application code as follows to run the query:

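// Application client code.
// The call chain below uses the standard FunctionService API; "region" (the
// Travelers Region) and "queryString" (the OQL text) are assumed to be
// defined by the application.

// Keys used to route the function to the target buckets (not shown).
Set filter = new HashSet();

Function empFunc = new EmpFunction("NAZFunction");

// Execute Function
ResultCollector rColl = FunctionService
        .onRegion(region)
        .withFilter(filter)
        .withArgs(queryString)
        .execute(empFunc);

// Get Results
Object result = rColl.getResult();
SelectResults queryResults = getResults(result);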
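To see the effect of the filter without a running GemFire cluster, the idea can be simulated in plain Java. The sketch below is not GemFire code: the in-memory map stands in for buckets partitioned by origin, executeOnFilter stands in for a function shipped only to the filtered buckets, and all names (DataAwareQuerySketch, Passenger, executeOnFilter, visitedCount) are hypothetical. The sample data is made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Illustrative only: an in-memory stand-in for buckets and function execution.
class DataAwareQuerySketch {

    static final class Passenger {
        final String name;
        final int age;
        final String origin;
        final String dest;

        Passenger(String name, int age, String origin, String dest) {
            this.name = name;
            this.age = age;
            this.origin = origin;
            this.dest = dest;
        }
    }

    // One bucket per origin, mimicking custom partitioning on Flight.origin.
    private final Map<String, List<Passenger>> bucketsByOrigin = new HashMap<>();
    private int visited; // number of passengers examined so far

    void put(Passenger p) {
        bucketsByOrigin.computeIfAbsent(p.origin, o -> new ArrayList<>()).add(p);
    }

    // The "map" step runs only on buckets named in the filter;
    // the "reduce" step gathers the matches into one result list.
    List<Passenger> executeOnFilter(Set<String> filter, Predicate<Passenger> query) {
        List<Passenger> gathered = new ArrayList<>();
        for (String origin : filter) {
            for (Passenger p : bucketsByOrigin.getOrDefault(origin, new ArrayList<>())) {
                visited++;
                if (query.test(p)) {
                    gathered.add(p);
                }
            }
        }
        return gathered;
    }

    int visitedCount() {
        return visited;
    }

    public static void main(String[] args) {
        DataAwareQuerySketch region = new DataAwareQuerySketch();
        region.put(new Passenger("Ann", 29, "Boston", "Seattle"));
        region.put(new Passenger("Bob", 41, "Chicago", "Seattle"));
        region.put(new Passenger("Cy", 33, "Chicago", "Denver"));
        region.put(new Passenger("Di", 25, "Dallas", "Seattle"));

        Set<String> filter = new HashSet<>(Arrays.asList("Boston", "Chicago"));
        List<Passenger> results = region.executeOnFilter(
                filter, p -> p.dest.equals("Seattle") && p.age < 35);

        // Prints: 1 match(es), 3 of 4 passengers visited
        System.out.println(results.size() + " match(es), "
                + region.visitedCount() + " of 4 passengers visited");
    }
}
```

The Dallas bucket is never touched, which is the point of the filter: the work scales with the size of the targeted buckets rather than with the size of the whole Region.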

This approach provides a sophisticated way to effectively query distributed data while retaining very predictable performance.

About the Author: Shobhit Agarwal is a member of VMware’s Technical Staff and has worked on high-availability, low-latency, in-memory data management systems for virtual environments for the past two years. Agarwal graduated from Northeastern University with an MS in Computer Science, specializing in Systems Engineering. His specialties include Java development, distributed systems, and data structures.
