
Spring and RabbitMQ – Behind India’s 1.2 Billion Person Biometric Database

Aadhaar was conceived as a way to provide a unique, online, portable identity so that every single resident of India can access and benefit from government and private services. The Aadhaar project has received coverage from all possible media – television, press, articles, debates, and the Internet. It is seen as an audacious use of technology, albeit for a social cause. UIDAI, the authority responsible for issuing Aadhaar numbers, has published white papers, data, and newsletters on the progress of the initiative.

A common question to the UIDAI technology team in conferences, events and over coffee is – what technologies power this important nation-wide initiative? In this blog post, we wanted to give a sense of several significant technologies and approaches.

Fundamental Principles

While the deployment footprint of the systems has grown from half-a-dozen machines to a few thousand CPU cores processing millions of Aadhaar related transactions, the fundamental principles have remained the same:

  • Simplicity in design & development, commodity hardware for deployment
  • Leverage best of breed solutions
  • Use of open standards and open source where prudent to avoid vendor lock-in

Categorizing Workloads

We made unbiased technology choices by precisely listing various workloads on the system and then mapping solutions best suited to process these workloads. The Aadhaar workloads were categorized into the following mutually distinct types:

  • Batch oriented, asynchronous tasks that may be parallel-processed.  Workloads are assigned using a scheduler. Transaction throughput and data-integrity are non-negotiable.
  • Synchronous, OLTP style API gateway. Workloads are user triggered. High availability and low latency data reads are critical needs.

Solutions to handle the above types of workloads needed to scale linearly and handle millions of transactions per day. A distinct characteristic of the Aadhaar systems with respect to resource utilization is that most transactions are I/O bound.

Principles and Patterns

Based on the fundamental principles above, we developed more concrete ones and adopted several architecture patterns:

  • Avoid container bloat and J2EE application server features that we don’t need. We used a custom-built, J2SE-based runtime that hosts POJO-based applications instead.
  • Use technologies that help us distribute work across a number of nodes. SEDA (staged event-driven architecture) was a natural fit, with high-speed messaging providing the transport.
  • Ease of integration across the stack.
  • Use of a distributed file system to store and serve terabytes of biometric data. Adopt data-locality compute patterns to move compute closer to data.
  • Data Sharding as a technique to distribute data on both SQL and NoSQL data stores.
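
The SEDA idea in the list above can be sketched without any messaging product: each stage owns an inbound queue and a few worker threads, and hands its result to the next stage's queue. The sketch below uses in-process `BlockingQueue`s where the Aadhaar runtime used RabbitMQ as the transport. The class names (`Stage`, `SedaSketch`) and the two toy stages are assumptions made for illustration, not the project's code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

/** One SEDA stage: an inbound queue drained by a fixed number of worker threads. */
final class Stage<I, O> {
    private final BlockingQueue<I> inbox = new LinkedBlockingQueue<>();
    private final List<Thread> workers = new ArrayList<>();

    Stage(String name, int workerCount, Function<I, O> handler, BlockingQueue<O> next) {
        for (int i = 0; i < workerCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    while (true) {
                        I event = inbox.take();          // block until work arrives
                        next.put(handler.apply(event));  // hand the result to the next stage
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();  // shutdown was requested
                }
            }, name + "-" + i);
            t.setDaemon(true);
            workers.add(t);
            t.start();
        }
    }

    BlockingQueue<I> inbox() { return inbox; }

    void shutdown() { workers.forEach(Thread::interrupt); }
}

final class SedaSketch {
    /** Pushes each input through two stages (trim, then upper-case) and returns the sorted results. */
    static List<String> run(List<String> inputs) {
        BlockingQueue<String> results = new LinkedBlockingQueue<>();
        Stage<String, String> upperCase = new Stage<>("upper", 2, String::toUpperCase, results);
        Stage<String, String> trim = new Stage<>("trim", 2, String::trim, upperCase.inbox());

        List<String> out = new ArrayList<>();
        try {
            trim.inbox().addAll(inputs);
            for (int i = 0; i < inputs.size(); i++) {
                out.add(results.take());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for results", e);
        } finally {
            trim.shutdown();
            upperCase.shutdown();
        }
        Collections.sort(out);  // workers run concurrently, so arrival order is not guaranteed
        return out;
    }
}
```

Because stages only share queues, each one can be given its own worker count, which is the property that lets a slow, I/O-bound stage be scaled independently of the others.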

Processing, Messaging and Data storage nodes will fail. The system should support recovery and replaying of failed transactions using techniques like check-pointing execution state.
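
Check-pointing can be illustrated with a job that records the index of the first item it has not yet finished, so that a restart resumes from that point instead of starting over. This is a minimal in-memory sketch under stated assumptions: `CheckpointedJob`, `CheckpointDemo`, and the simulated failure are hypothetical, and a real system would persist the checkpoint to durable storage.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

final class CheckpointedJob {
    // Checkpoint: index of the first unprocessed item (kept in memory here, durable in practice).
    private int nextIndex = 0;

    /** Processes items from the checkpoint onward and advances it only after each success. */
    void run(List<String> items, Consumer<String> processor) {
        while (nextIndex < items.size()) {
            processor.accept(items.get(nextIndex));  // may throw; the checkpoint is then left untouched
            nextIndex++;
        }
    }

    int checkpoint() { return nextIndex; }
}

final class CheckpointDemo {
    /** Runs the job with one simulated failure before item number failAt, then replays. */
    static List<String> processWithOneFailure(List<String> items, int failAt) {
        List<String> done = new ArrayList<>();
        boolean[] alreadyFailed = {false};
        CheckpointedJob job = new CheckpointedJob();

        Consumer<String> flaky = item -> {
            if (!alreadyFailed[0] && done.size() == failAt) {
                alreadyFailed[0] = true;
                throw new IllegalStateException("simulated node failure at " + item);
            }
            done.add(item);
        };

        try {
            job.run(items, flaky);   // first attempt stops part-way through
        } catch (IllegalStateException e) {
            job.run(items, flaky);   // replay resumes from the checkpoint
        }
        return done;
    }
}
```

In this sketch the checkpoint moves only after the processor returns, so an item whose processing crashed half-way is processed again on replay. That is at-least-once behavior, which is why replayed work generally needs to be idempotent.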

While this article only covers a few key technologies, the overall solution included:

  • Hadoop: HDFS, HBase, Hive, Pig, Zookeeper
  • MySQL: sharded, partitioned, distributed
  • SEDA: Mule, RabbitMQ
  • Search: MongoDB, sharded Solr
  • Compute Grid: Spring, GridGain
  • Monitoring: Custom built, Nagios
  • Analytics & Visualization
  • Deployment footprint: thousands of CPU cores
  • Extensive Data archival, DR

Spring Application Runtime

All Aadhaar application runtimes were custom built using the Spring framework. We created various runtime profiles to suit the workload characteristics described above:

  • Basic profile – supports application bootstrapping, management, and loading application extensions defined as Spring Application Contexts.
  • Batch profile – extension of Basic profile using Spring Batch and enriched with administration and deployment capabilities.
  • Service profile – extension of Basic profile to support Service orientation, a registry of deployed services and invocation broker.
  • SEDA profile – extension of Service profile using Mule framework to support orchestration using RabbitMQ as messaging layer.

The application runtime makes extensive use of Spring framework modules, namely Core, AOP, and JEE.

RabbitMQ Messaging

RabbitMQ was a perfect fit in the Aadhaar application runtime system for these reasons:

  • Low process footprint of the server (i.e. broker).
  • Great quality AMQP Java client libraries.
  • Ease of integration with rest of the stack – we wrote an AMQP transport for Mule that could be configured and managed using Spring.

Point-to-point (P2P) messaging in RabbitMQ, where each message is delivered to exactly one consumer of a queue, helped us distribute work across the various SEDA runtimes. A single-node RabbitMQ instance could easily scale to deliver millions of messages per day while exhibiting high degrees of system availability.
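
The work-distribution effect of point-to-point delivery can be shown with several workers competing for messages on one queue. The sketch below stands in a `LinkedBlockingQueue` for the broker queue. `CompetingConsumers`, the worker names, and the `STOP` sentinel are assumptions for illustration, and the messages are assumed to be distinct strings.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;

final class CompetingConsumers {
    private static final String STOP = "__stop__";  // hypothetical sentinel telling a worker to exit

    /** Delivers every message to exactly one of workerCount workers; returns message -> worker name. */
    static Map<String, String> distribute(List<String> messages, int workerCount) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        Map<String, String> handledBy = new ConcurrentHashMap<>();
        CountDownLatch finished = new CountDownLatch(workerCount);

        for (int i = 0; i < workerCount; i++) {
            String name = "worker-" + i;
            new Thread(() -> {
                try {
                    for (String msg = queue.take(); !msg.equals(STOP); msg = queue.take()) {
                        // A comma in the stored value would reveal that a message was delivered twice.
                        handledBy.merge(msg, name, (first, second) -> first + "," + second);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    finished.countDown();
                }
            }, name).start();
        }

        queue.addAll(messages);
        for (int i = 0; i < workerCount; i++) {
            queue.add(STOP);  // one sentinel per worker, queued after all real messages
        }
        try {
            finished.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return handledBy;
    }
}
```

Adding workers increases throughput without any change to the producer, since whichever worker is free takes the next message.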

SQL and NoSQL Data Stores

The Aadhaar systems used a number of data stores that may be broadly classified as SQL and NoSQL. We adopted Data Sharding techniques to distribute data across clusters. We implemented a JPA-like persistence framework for this, used Spring DAO implementations for Transaction Management, and created Routing Data sources (each pointing to a data shard). Spring’s support for AOP and proxying was used extensively in building the persistence framework.
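
The core of a routing data source is a function from a shard key to one of several underlying stores. The following is a minimal sketch assuming hash-modulo routing. The article does not say which routing rule Aadhaar used, and `ShardRouter` and the JDBC URLs in the usage note are hypothetical. (Spring ships an `AbstractRoutingDataSource` whose `determineCurrentLookupKey()` method plays a similar role. Whether the project used it is not stated.)

```java
import java.util.ArrayList;
import java.util.List;

final class ShardRouter {
    private final List<String> shardUrls;

    ShardRouter(List<String> shardUrls) {
        if (shardUrls.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardUrls = new ArrayList<>(shardUrls);  // defensive copy
    }

    /** Maps a shard key to a stable index in [0, shardCount). */
    int shardIndexFor(String shardKey) {
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(shardKey.hashCode(), shardUrls.size());
    }

    /** Returns the connection URL of the shard that owns the given key. */
    String shardUrlFor(String shardKey) {
        return shardUrls.get(shardIndexFor(shardKey));
    }
}
```

With four hypothetical shards named `jdbc:mysql://shard0/uid` through `jdbc:mysql://shard3/uid`, every lookup for the same key lands on the same shard. Plain modulo routing remaps most keys when the shard count changes, so a production design would usually add a lookup table or consistent hashing on top.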


Monitoring

Managing a deployment footprint of a thousand plus CPU cores required extensive monitoring to maintain business SLAs. We implemented an agent-less, custom monitoring solution where applications emit metrics as Spring ApplicationEvents published to the runtime’s Application Context. Metrics are aggregated using timers and published to the monitoring server using Spring Remoting HTTP endpoints. Metrics are cached in memory and also published to RabbitMQ queues that are then persisted to an RDBMS data store by the SEDA profile runtime.
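
The aggregate-then-publish step can be sketched as a small in-memory window that application code records into and a timer periodically flushes. This is an illustrative sketch only: `MetricsAggregator` and the metric names in the usage note are hypothetical, and the Spring event and remoting plumbing described above is left out.

```java
import java.util.HashMap;
import java.util.LongSummaryStatistics;
import java.util.Map;

final class MetricsAggregator {
    private Map<String, LongSummaryStatistics> window = new HashMap<>();

    /** Called whenever the application emits a metric value. */
    synchronized void record(String metric, long value) {
        window.computeIfAbsent(metric, name -> new LongSummaryStatistics()).accept(value);
    }

    /** Returns the aggregates collected so far and starts a fresh window. */
    synchronized Map<String, LongSummaryStatistics> flush() {
        Map<String, LongSummaryStatistics> snapshot = window;
        window = new HashMap<>();
        return snapshot;
    }
}
```

Recording 10 and 30 under a hypothetical `latencyMs` metric and then calling `flush()` yields a count of 2, an average of 20.0, and a maximum of 30 for that metric. A `ScheduledExecutorService.scheduleAtFixedRate` task could call `flush()` at a fixed interval and forward the snapshot to the monitoring server, so only one small summary per metric crosses the network per interval.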

See Related Slides from the Fifth Elephant Conference on Big Data

About the Author: Regunath Balasubramanian is the Principal Architect of the Aadhaar project and works for MindTree. He has over 15 years of experience in technology consulting and implementation. He is passionate about using and contributing to Open Source. Regunath is presently part of the Flipkart CTO organization and working on the Customer Platform.
He was the Principal Architect of the Gov. of India UID project – the world’s largest identity database. Regunath blogs frequently and is an occasional guest columnist for CIOUpdate.com.

50 thoughts on “Spring and RabbitMQ – Behind India’s 1.2 Billion Person Biometric Database”

  1. Nitin Kaulavkar

    This is a great post Regu! Way to go..

  2. David Mytton

    Do you have any information about the kind of hardware infrastructure? You mentioned the software components and that i/o is important so it’d be interesting to understand how this was deployed. Did you use dedicated or virtualised instances? SSDs? What kind of capacity is used to handle that 5TB of replication traffic?

  3. Regunath B

    All instances were physical on blade servers using Intel chips – standard ones from the likes of Dell, HP and IBM.
    Storage for the 5TB per day data was on FC disks. Data is moved to SATA disks after processing. All storage is managed off SAN(s). SSDs were used only for storing indexes, bin logs and the like.

  4. biometric systems

    This is quite interesting , i need it hardware configuration for it application that i want know more about this , actually i want to go more one step a head as you mention in your blog

  5. biometric systems

    This is quite interesting , i need it hardware configuration for it application that i want know more about this , actually i want to go more one step a head as you mention in your blog

  6. biometric systems

    It’s good to hear that the US Government has increased their commitment to cleaner energy. Today people are using alternative energy such as sun and wind due to the benefits it gives to us. Excellent post, it’s worth reading.

  7. biometric systems

    I’m still mastering from you, but I’m trying to obtain my objectives. I certainly appreciate looking at all that is placed on your website.Keep the details arriving. I beloved it.

  8. Jeff Ostrin

    Regunath –

    Thanks for the article. I found this article while searching “Data-Locality compute patterns”. Do you have any more information about what patterns you used here?


  9. Sprinkler Tune Up Monument CO

    I’ve learn several just right stuff here. Certainly worth bookmarking for revisiting.
    I surprise how a lot attempt you put to make any such fantastic
    informative site.

    My web page – Sprinkler Tune Up Monument CO

  10. animal and veterinary

    Appreciate this post. Will try it out.

    Have a look at my homepage … animal and veterinary

  11. social networking programs

    This website definitely has all of the information I needed about this subject and didn’t know who
    to ask.

    Check out my web blog – social networking programs

  12. the best free ipad apps

    MTS), even common Video files like AVI WMV MPEG that want to play Movie on i – Pad, you must need to.
    The second edition i – Pad, the i – Pad 2, got a more sleek
    design with some boosted hardware that it much faster.
    PPT files and hit the road, provided that changes aren’t needed.

    Here is my web blog … the best free ipad apps

  13. modern love by david bowie

    Appreciating the commitment you put into your site and in depth information you offer.

    It’s good to come across a blog every once in a while that isn’t the same out of date rehashed
    information. Excellent read! I’ve saved your site and I’m adding
    your RSS feeds to my Google account.

    my web-site – modern love by david bowie

  14. خرید هاست وردپرس

    This is a great post Regu

  15. آموزش برنامه نویسی اندروید

    Keep in mind that native iOS or Android programming could still leverage the same calls to this back-end. In other words, the back-end provides a common set of services no matter which client is used

  16. google 2

    great post thx.

  17. هنرمندان


  18. خودرو


  19. خودرو


  20. سئو در کرج

    Website design and SEO company in Karaj

  21. dtq


  22. هدایای تبلیغاتی

    This is a great post Regu

  23. مسائل زناشوئی

    good post

  24. قالب تزریق پلاستیک

    Appreciating the commitment you put into your site and in depth information you offer.

  25. فروش مبل تختخواب شو

    wow, thanks.

  26. سولجر قالب

    very good and very nice.

  27. طراحی سایت اصفهان


  28. تعمیرگاه مجاز لوازم خانگی

    very good and very nice.

  29. apple

    the best

  30. modem

    the best modem

  31. بررسی

    the best review

  32. تعمیرگاه مجاز لوازم خانگی

    the best review

  33. takhfifan

    Thanks for your helpful information.

  34. خدمات تلويزيون شهري

    Urban television services

  35. Lynn Cohen

    I work at Salesforce.com in the Life Sciences and Health Care verticals, working directly with partners that build on our platform. I’d be very interested in pursuing a PoC with you to publish your data into our Big Objects store. It could be quite interesting to see this data in the context of our Health Cloud app or custom situations. I assume your data is anonymized? If you are interested, please contact me.

  36. دیجیتال مارکتینگ

    so thanks for sharing information

  37. تلگرام باز

    thanks .

  38. Nair Pitta

    Very good your article, really is of great relevance, I will follow your blog. Thank you for sharing.

  39. تعمیر کولر گازی مدیا

    Very good your article, really is of great relevance, I will follow your blog. Thank you for sharing.

  40. Anusuya Manohar

    Thanks for this article Regu.

  41. دانلود فیلم

    Thanks for this post

  42. دستگاه تزریق پلاستیک

    thanks a lot…

  43. دانلود فیلم

    Thanks alot for this post……

  44. دانلود آهنگ

    thanks a lot….

  45. orquídea

    Did you use dedicated or virtualised instances? SSDs? What kind of capacity is used to handle that 5TB of replication traffic?

  46. دانلود فیلم

    The naughty site is a great source of downloading the Iranian movie with a direct link, the new foreign dubbing movie, serial download, free movie movie, hd movie animation.

  47. زیرنویس فارسی

    very good . thanks

  48. دانلود فیلم سینمایی آینه بغل


  49. خرید ماساژور

    View, review, and buy all kinds of massage devices and massage chairs on the iRest website ♥

  50. تناسب اندام با مِتد ذهنی

    Post was interesting, thank you for sharing this content

