Spring and RabbitMQ – Behind India’s 1.2 Billion Person Biometric Database

Aadhaar was conceived as a way to provide a unique, online, portable identity so that every single resident of India can access and benefit from government and private services. The Aadhaar project has received coverage from all possible media – television, press, articles, debates, and the Internet. It is seen as an audacious use of technology, albeit for a social cause. UIDAI, the authority responsible for issuing Aadhaar numbers, has published white papers, data, and newsletters on the progress of the initiative.

A common question to the UIDAI technology team at conferences, at events, and over coffee is: what technologies power this important nation-wide initiative? In this blog post, we want to give a sense of several significant technologies and approaches.

Fundamental Principles

While the deployment footprint of the systems has grown from half-a-dozen machines to a few thousand CPU cores processing millions of Aadhaar related transactions, the fundamental principles have remained the same:

  • Simplicity in design & development, commodity hardware for deployment
  • Leverage best of breed solutions
  • Use of open standards and open source where prudent to avoid vendor lock-in

Categorizing Workloads

We made unbiased technology choices by precisely listing various workloads on the system and then mapping solutions best suited to process these workloads. The Aadhaar workloads were categorized into the following mutually distinct types:

  • Batch oriented, asynchronous tasks that may be parallel-processed.  Workloads are assigned using a scheduler. Transaction throughput and data-integrity are non-negotiable.
  • Synchronous, OLTP style API gateway. Workloads are user triggered. High availability and low latency data reads are critical needs.

Solutions to handle the above types of workloads needed to scale linearly and handle millions of transactions per day. A distinct characteristic of the Aadhaar systems with respect to resource utilization is that most transactions are I/O bound.

Principles and Patterns

Based on the fundamental principles above, we developed more concrete ones and adopted several architecture patterns:

  • Avoid container bloat and J2EE application server features that we don’t need. We used a custom-built, J2SE-based runtime that hosts POJO-based applications instead.
  • Use technologies that could help us distribute work across a number of nodes. SEDA (staged event-driven architecture) was a natural fit, with high-speed messaging providing the transport.
  • Ease of integration across the stack.
  • Use of a distributed file system to store and serve terabytes of biometric data. Adopt data-locality compute patterns to move compute closer to data.
  • Data Sharding as a technique to distribute data on both SQL and NoSQL data stores.
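
To make the SEDA idea concrete, here is a minimal in-process sketch, not the Aadhaar implementation: each stage owns an inbound queue and a small thread pool, and hands its result to the next stage. In the system described here the hand-off between stages went over Mule and RabbitMQ; the `java.util.concurrent` queues below stand in for that transport, and the class names `SedaStage` and `SedaDemo` are hypothetical.

```java
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

/** One stage: an inbound queue drained by its own worker threads (hypothetical class). */
final class SedaStage<I, O> {
    private final BlockingQueue<I> inbox = new LinkedBlockingQueue<>();
    private final ExecutorService workers;
    private volatile boolean running = true;

    SedaStage(int threads, Function<I, O> handler, Consumer<O> next) {
        workers = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            workers.execute(() -> {
                try {
                    // Keep working until stop is requested AND the queue is empty.
                    while (running || !inbox.isEmpty()) {
                        I event = inbox.poll(50, TimeUnit.MILLISECONDS);
                        if (event != null) {
                            next.accept(handler.apply(event));
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
    }

    /** Called by the previous stage (or the producer) to hand over an event. */
    void enqueue(I event) {
        inbox.add(event);
    }

    /** Lets queued events finish, then stops the workers. */
    void drainAndStop() {
        running = false;
        workers.shutdown();
        try {
            if (!workers.awaitTermination(5, TimeUnit.SECONDS)) {
                workers.shutdownNow();
            }
        } catch (InterruptedException e) {
            workers.shutdownNow();
            Thread.currentThread().interrupt();
        }
    }
}

/** Wires two stages together: normalise the input, then "store" it. */
final class SedaDemo {
    static List<String> run(List<String> input) {
        List<String> stored = new CopyOnWriteArrayList<>();
        SedaStage<String, String> store = new SedaStage<>(2, s -> "stored:" + s, stored::add);
        SedaStage<String, String> parse = new SedaStage<>(2, String::trim, store::enqueue);
        input.forEach(parse::enqueue);
        parse.drainAndStop(); // upstream first, so everything reaches the next stage
        store.drainAndStop();
        return stored;
    }
}
```

Because every stage has its own queue and pool, a slow stage can be given more threads (or more nodes) without touching the others. Output order is not guaranteed when a stage has more than one worker.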

Processing, Messaging and Data storage nodes will fail. The system should support recovery and replaying of failed transactions using techniques like check-pointing execution state.
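
As an illustration of check-pointing and replay, the sketch below records the position of the next unprocessed item after each successful step, so a rerun after a failure resumes where the previous run stopped. It is a simplified assumption of how such recovery can work, not the project's code: `CheckpointStore` and `CheckpointedJob` are hypothetical names, and the in-memory map stands in for a durable store.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical checkpoint store; a real one would persist to disk or a database. */
final class CheckpointStore {
    private final Map<String, Integer> nextIndexByJob = new ConcurrentHashMap<>();

    int nextIndex(String jobId) {
        return nextIndexByJob.getOrDefault(jobId, 0);
    }

    void commit(String jobId, int nextIndex) {
        nextIndexByJob.put(jobId, nextIndex);
    }
}

/** Runs a list of items through a step, committing progress after each success. */
final class CheckpointedJob {
    private final CheckpointStore store;

    CheckpointedJob(CheckpointStore store) {
        this.store = store;
    }

    /** Resumes from the last committed position; returns true if the job completed. */
    boolean run(String jobId, List<String> items, Consumer<String> step) {
        for (int i = store.nextIndex(jobId); i < items.size(); i++) {
            try {
                step.accept(items.get(i));
            } catch (RuntimeException e) {
                // The checkpoint still points at the failed item, so a rerun replays it.
                return false;
            }
            store.commit(jobId, i + 1);
        }
        return true;
    }
}
```

If a node dies between running a step and committing the checkpoint, that item is processed again on replay, so the step should be idempotent.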

While this article only covers a few key technologies, the overall solution included:

  • Hadoop: HDFS, HBase, Hive, Pig, Zookeeper
  • MySQL: sharded, partitioned, distributed
  • SEDA: Mule, RabbitMQ
  • Search: MongoDB, sharded Solr
  • Compute Grid: Spring, GridGain
  • Monitoring: Custom built, Nagios
  • Analytics & Visualization
  • Deployment footprint: thousands of CPU cores
  • Extensive Data archival, DR

Spring Application Runtime

All Aadhaar application runtimes were custom built using the Spring framework. We created various runtime profiles to suit the workload characteristics described above:

  • Basic profile – supports application bootstrapping, management, and loading application extensions defined as Spring Application Contexts.
  • Batch profile – extension of Basic profile using Spring Batch and enriched with administration and deployment capabilities.
  • Service profile – extension of Basic profile to support Service orientation, a registry of deployed services and invocation broker.
  • SEDA profile – extension of Service profile using Mule framework to support orchestration using RabbitMQ as messaging layer.

The application runtime makes extensive use of the Spring framework modules Core, AOP, and JEE.

RabbitMQ Messaging

RabbitMQ was a perfect fit in the Aadhaar application runtime system for these reasons:

  • Low process footprint of the server (i.e. broker).
  • High-quality AMQP Java client libraries.
  • Ease of integration with the rest of the stack – we wrote an AMQP transport for Mule that could be configured and managed using Spring.

Point-to-point (P2P) messaging in RabbitMQ helped us distribute work across the various SEDA runtimes. A single-node RabbitMQ instance could easily scale to deliver millions of messages per day while exhibiting a high degree of system availability.

SQL and NoSQL Data Stores

The Aadhaar systems used a number of data stores that may be broadly classified as SQL and NoSQL. We adopted Data Sharding techniques to distribute data across clusters. We implemented a JPA-like persistence framework for this, used Spring DAO implementations for Transaction Management, and created Routing Data sources (each pointing to a data shard). Spring’s support for AOP and proxying was used extensively in building the persistence framework.
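
The routing step can be sketched in plain Java as below. This is an assumption about the general technique, not the framework described above: the post does not say how keys were mapped to shards, so the hash-modulo rule, the `ShardRouter` class, and the shard names are all illustrative. In a Spring application, the returned name could serve as the lookup key of an `AbstractRoutingDataSource` (via `determineCurrentLookupKey()`), though the post does not confirm that this class was used.

```java
import java.util.List;

/** Maps a record key to one of N shards; shard names are placeholders. */
final class ShardRouter {
    private final List<String> shardNames;

    ShardRouter(List<String> shardNames) {
        if (shardNames.isEmpty()) {
            throw new IllegalArgumentException("at least one shard is required");
        }
        this.shardNames = List.copyOf(shardNames);
    }

    /** The same key always lands on the same shard while the shard list is unchanged. */
    String shardFor(String key) {
        // floorMod keeps the bucket non-negative even for negative hash codes.
        int bucket = Math.floorMod(key.hashCode(), shardNames.size());
        return shardNames.get(bucket);
    }
}
```

Hash-modulo routing is simple but moves most keys when the number of shards changes; a lookup table or consistent hashing avoids that at the cost of more bookkeeping.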

Monitoring

Managing a deployment footprint of a thousand plus CPU cores required extensive monitoring to maintain business SLAs. We implemented an agent-less, custom monitoring solution where applications emit metrics as Spring ApplicationEvents published to the runtime’s Application Context. Metrics are aggregated using timers and published to the monitoring server using Spring Remoting HTTP endpoints. Metrics are cached in memory and also published to RabbitMQ queues that are then persisted to an RDBMS data store by the SEDA profile runtime.
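
The aggregation step might look roughly like the following sketch. It leaves out Spring, the timer, and the remoting call, and shows only the in-memory part: metric events are summed per name, and a periodic `flush()` hands the totals to whatever publishes them. The `MetricAggregator` class and the metric names used with it are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Sums metric values per name between flushes (hypothetical class). */
final class MetricAggregator {
    private final Map<String, LongAdder> totals = new ConcurrentHashMap<>();

    /** Called for every metric event an application emits. */
    void onMetric(String name, long value) {
        totals.computeIfAbsent(name, k -> new LongAdder()).add(value);
    }

    /** Returns the totals since the previous flush and resets them; meant to be called by a timer. */
    Map<String, Long> flush() {
        Map<String, Long> snapshot = new TreeMap<>();
        totals.forEach((name, adder) -> {
            long sum = adder.sumThenReset();
            if (sum != 0) {
                snapshot.put(name, sum);
            }
        });
        return snapshot;
    }
}
```

Aggregating before publishing keeps the traffic to the monitoring server proportional to the number of metric names rather than the number of transactions. An update that races with `flush()` may be counted in either the current or the next interval.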

See Related Slides from the Fifth Elephant Conference on Big Data

About the Author: Regunath Balasubramanian is the Principal Architect of the Aadhaar project and works for MindTree. He has over 15 years of experience in technology consulting and implementation. He is passionate about using and contributing to Open Source. Regunath is presently part of the Flipkart CTO organization and working on the Customer Platform.
He was the Principal Architect of the Gov. of India UID project – the world’s largest identity database. Regunath blogs frequently and is an occasional guest columnist for CIOUpdate.com.

58 thoughts on “Spring and RabbitMQ – Behind India’s 1.2 Billion Person Biometric Database”

  1. Nitin Kaulavkar

    This is a great post Regu! Way to go..

  2. David Mytton

    Do you have any information about the kind of hardware infrastructure? You mentioned the software components and that i/o is important so it’d be interesting to understand how this was deployed. Did you use dedicated or virtualised instances? SSDs? What kind of capacity is used to handle that 5TB of replication traffic?

  3. Regunath B

    All instances were physical on blade servers using Intel chips – standard ones from the likes of Dell, HP and IBM.
    Storage for the 5TB per day data was on FC disks. Data is moved to SATA disks after processing. All storage is managed off SAN(s). SSDs were used only for storing indexes, bin logs and the like.

  8. Jeff Ostrin

    Regunath –

    Thanks for the article. I found this article while searching “Data-Locality compute patterns”. Do you have any more information about what patterns you used here?

    Thanks

  35. Lynn Cohen

    I work at Salesforce.com in the Life Sciences and Health Care verticals, working directly with partners that build on our platform. I’d be very interested in pursuing a PoC with you to publish your data into our Big Objects store. It could be quite interesting to see this data in the context of our Health Cloud app or custom situations. I assume your data is anonymized? If you are interested, please contact me.

  38. Nair Pitta

    Very good article; it is really of great relevance. I will follow your blog. Thank you for sharing.

  40. Anusuya Manohar

    Thanks for this article Regu.
