
Ingestion in a Modern IoT Application

By Luis M. Valerio Castillo, Consultant, PS Research Labs; Neeraj Arora, Staff Architect, PS Research Labs


In the article Infrastructure Sub-Systems of an IoT Solution, we examined the high-level infrastructure architecture that applies to IoT deployments. In the follow-up article, Introduction to a Modern IoT Application, we introduced the solution components of an IoT solution. Using simple use cases, we also introduced the Ingest, Analyze, and Engage pattern that applies to most IoT solutions, along with a layered design approach that combines infrastructure sub-systems, solution components, and the six processes of multi-cloud to meet the requirements of this pattern.

In this blog post, we begin exploring ingestion and the solution components used in our sample application's design, using the People Counter Modern IoT Application as the running example.

Infrastructure Sub-Systems of Ingestion

Ingest, the first step in the Ingest, Analyze, and Engage pattern, has the following sub-systems:

  • Devices
  • Protocol Gateways
  • Device Edges

The devices only need to have enough networking or communications capability to connect with device edges. Depending on the protocols used by the network of devices, and capabilities of the device edges, we may require a protocol gateway to translate the protocol of the network of devices to one that device edges can understand.

In our People Counter sample IoT application, the device and device edge are a single physical unit. The device is a physically connected camera and the device edge is an IP-capable Raspberry Pi board. The hardware needs software or firmware to give it ingestion capabilities. In our case, the People Counter Ingestion Service acts as that firmware.

Ingestion Microservice

The ingestion microservice has three core tasks:

  • Capture a photo at a regular interval,
  • Upload the captured photo to an object store, making it available for upstream components to download, and
  • Communicate with next-in-line microservices, informing them that a new image is available for processing

Although the ingestion microservice uses libraries to communicate with the directly connected camera and take photographs, it needs the IP communication infrastructure to talk to storage services and other application components.
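
The "regular interval" in the first task can be sketched as a small fixed-rate loop. This is a minimal illustration rather than the People Counter code: the function name `run_at_interval` and its parameters are hypothetical, and `task` stands in for whatever performs one capture, upload, and notify cycle.

```python
import time


def run_at_interval(task, interval_seconds, iterations,
                    sleep=time.sleep, clock=time.monotonic):
    """Call task() `iterations` times, aiming to start calls interval_seconds apart."""
    results = []
    next_start = clock()
    for _ in range(iterations):
        results.append(task())
        next_start += interval_seconds
        # Sleep only for what is left of this interval; if the task overran
        # the interval, start the next call immediately instead.
        remaining = next_start - clock()
        if remaining > 0:
            sleep(remaining)
    return results
```

Scheduling against a monotonic clock, instead of sleeping a fixed amount after each call, keeps the capture rate steady even when an upload takes longer than usual. A long-running service would loop indefinitely; the `iterations` argument is only there to keep the sketch easy to try out.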

Solution Components

The People Counter ingestion architecture makes use of the following solution components:

  • Storage Platform
  • Ingestion Platform

MinIO is our storage platform while the ingestion microservice code and MQTT form our ingestion platform.


We selected MQTT as our messaging layer because it is lightweight and needs little network bandwidth. That matters because, in a production deployment of an IoT application, devices will probably be in remote locations that do not always have fast or reliable network connections. A single MQTT broker can support tens of thousands of connections, which allows our application to scale to the large number of devices we can typically expect in an IoT use case. In our sample implementation, MQTT carries metadata about the photos taken by the camera.


We chose MinIO as the storage platform for the photos taken by the Raspberry Pi 3 camera. It is a high-performance object store that can hold large files. The object store allows our microservice to share larger artifacts, such as images, with the other microservices that make up the Modern IoT Application.

Development Considerations

We must carefully consider which programming language to use when creating the microservice that extracts data from devices. The availability of libraries for interacting with the devices will often strongly influence the selection. In our case, we chose Python because the Raspberry Pi Camera module client library only supports Python.

A highly desirable feature of both MQTT and MinIO is that they integrate easily with applications through client libraries and APIs. Besides easing the development workload, this also allows for rapid development and prototyping, which is one of the fundamental tenets of DevOps.

Deployment Considerations

Whether the solution components are deployed as self-managed software or consumed as-a-service does not affect the ingestion design. In our case, MQTT is consumed as-a-service from a public cloud provider and MinIO is self-managed, running in our private cloud. However, either or both could be deployed to private or public clouds without changing the design. Both components are also platform-agnostic and can run in VMs or as containers. Because MQTT and MinIO run on separate clouds, the IoT solution is effectively a multi-cloud solution, as explained in the article Defining The Modern Application.
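
One way to keep the microservice code independent of where the components run is to read their locations from the environment. The sketch below is an assumption about how this could be done, not a description of the People Counter implementation: the `IngestionSettings` class, the `load_settings` function, the environment variable names, and the defaults are all hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class IngestionSettings:
    mqtt_host: str
    mqtt_port: int
    minio_endpoint: str
    minio_bucket: str


def load_settings(env=None):
    """Read broker and object store locations from environment variables.

    All variable names and default values here are hypothetical.
    """
    env = os.environ if env is None else env
    return IngestionSettings(
        mqtt_host=env.get("MQTT_HOST", "localhost"),
        mqtt_port=int(env.get("MQTT_PORT", "1883")),
        minio_endpoint=env.get("MINIO_ENDPOINT", "localhost:9000"),
        minio_bucket=env.get("MINIO_BUCKET", "people-counter-images"),
    )
```

With this approach, moving the broker from a public cloud service to a self-managed instance would only change the values of the variables, not the ingestion code.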

Data Orchestration Pipeline

The purpose of integrating the solution components we mentioned earlier is to orchestrate the data coming from the device. We created the People Counter Ingestion microservice with that purpose in mind, and it forms the first part of our data orchestration pipeline. Note that three microservices make up the data orchestration pipeline. We will cover the other two in later posts.

To integrate the Raspberry Pi 3 camera, we use a Python library called picamera. It lets a Python program capture images at a configurable interval and at any resolution the hardware supports. There are many other configuration options in the ingestion microservice, which we'll discuss in a later post.
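
A minimal sketch of the capture step might look like the following. It assumes the picamera library's `PiCamera` class and its `capture` method; the helper names `capture_photo` and `open_pi_camera`, the default resolution, and the file naming scheme are illustrative assumptions. The picamera import is deferred so the capture helper can be tried with a stand-in camera object on a machine without the camera module.

```python
import os
import uuid


def capture_photo(camera, directory):
    """Capture one JPEG into `directory` and return its path.

    `camera` is any object with a capture(path) method, such as picamera.PiCamera.
    """
    os.makedirs(directory, exist_ok=True)
    # Naming files "image<UUID>.jpg" is an assumption made for this sketch.
    path = os.path.join(directory, f"image{uuid.uuid4()}.jpg")
    camera.capture(path)
    return path


def open_pi_camera(resolution=(1024, 768)):
    """Create a PiCamera at the given resolution (requires a Raspberry Pi)."""
    from picamera import PiCamera  # deferred: only importable on the device
    return PiCamera(resolution=resolution)
```

On the device, `capture_photo(open_pi_camera(), "/tmp/people-counter")` would write one photo to disk, where the directory is again only an example.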

After the photo is on disk, we use the Paho MQTT Python client library to send metadata about the photo to the MQTT broker. The metadata looks something like this:

{
  'type': 'data',
  'deviceID': '60261e35-7c0c-4ad2-9543-43855f35a1e6',
  'filePath': 'people-counter-images/imagef9f90802-d5d2-44fb-bcd5-16cfc6ad6035.jpg',
  'creationTimestamp': 1577830187.990863
}
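
Building and publishing such a message could be sketched as follows. The four field names come from the sample above; everything else is an assumption: the helper names, the JSON encoding, the topic, and the use of QoS 1 are not confirmed by this post. The `client` argument is expected to be a connected `paho.mqtt.client.Client`, whose `publish` method accepts a topic, a payload, and a QoS level.

```python
import json
import time


def build_metadata(device_id, file_path, created=None):
    """Build a metadata message with the four fields shown in the sample."""
    return {
        "type": "data",
        "deviceID": device_id,
        "filePath": file_path,
        "creationTimestamp": time.time() if created is None else created,
    }


def publish_metadata(client, topic, metadata, qos=1):
    """Serialize the metadata as JSON and publish it on `topic`.

    JSON encoding, the topic name, and QoS 1 are assumptions of this sketch.
    """
    payload = json.dumps(metadata)
    return client.publish(topic, payload=payload, qos=qos)
```

Because the message carries only a few short fields and a path, it stays small regardless of the photo size, which fits the low-bandwidth role MQTT plays in this design. Note that the arguments needed to construct a paho client differ between the 1.x and 2.x releases of the library.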

Concurrently, we use the MinIO Python client library to send the actual photo file to the MinIO object store. From there, other microservices will download the photo for further processing.
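
The upload could be sketched with the MinIO client's `fput_object` method, which uploads a file from local disk. The helper name `upload_photo` is hypothetical, and treating the first segment of `filePath` as the bucket name is an assumption drawn from the sample metadata rather than something this post states.

```python
import os


def upload_photo(client, bucket, local_path):
    """Upload a local JPEG and return '<bucket>/<object>' for use in the metadata.

    `client` is expected to be a minio.Minio instance, created for example as
    Minio("minio.example.internal:9000", access_key="ACCESS_KEY",
          secret_key="SECRET_KEY", secure=False)
    where the endpoint and credentials are placeholders.
    """
    object_name = os.path.basename(local_path)
    client.fput_object(bucket, object_name, local_path, content_type="image/jpeg")
    return f"{bucket}/{object_name}"
```

Returning the combined bucket and object name gives the caller a value it can place directly in the `filePath` field of the metadata message.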

Once the microservice transfers the image and image metadata, we schedule the image for deletion after a configurable time interval.
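
One simple way to delay the cleanup is a background timer. The sketch below uses the standard library's `threading.Timer` and assumes the file being removed is the copy on the device's local disk; the function name `schedule_deletion` is hypothetical, and the actual microservice may schedule deletion differently.

```python
import os
import threading


def schedule_deletion(path, delay_seconds):
    """Delete `path` after delay_seconds on a background timer; return the timer."""
    def _delete():
        try:
            os.remove(path)
        except FileNotFoundError:
            pass  # Already removed; nothing to clean up.

    timer = threading.Timer(delay_seconds, _delete)
    timer.daemon = True  # Do not keep the process alive just for pending cleanups.
    timer.start()
    return timer
```

Returning the timer lets the caller cancel the deletion with `timer.cancel()`, for example if an upload needs to be retried before the file goes away.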

Below is a diagram of the application data flow and control:

Notice the direction of the arrows between the components, which show the actions taken by the ingestion microservice. In order, the microservice:

  1. Drives the camera and captures data from it
  2. Stores the photo in MinIO
  3. Pushes metadata to MQTT, which lets the other microservices know the photo is available in MinIO for download
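
The three steps above can be sketched as a single function that takes each action as a callable. This is an illustration of the ordering only; `ingest_once` and its parameters are hypothetical names, and the callables stand in for the camera, MinIO, and MQTT code.

```python
def ingest_once(capture, upload, notify):
    """Run one ingestion cycle in the listed order and return the stored path."""
    local_path = capture()            # 1. drive the camera; the photo lands on local disk
    stored_path = upload(local_path)  # 2. store the photo in the object store
    notify(stored_path)               # 3. publish metadata so consumers can fetch the photo
    return stored_path
```

In this sketch, an exception raised by `upload` prevents `notify` from running, so other microservices are never told about a photo that is not yet available for download.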

We included "Camera ... X" and "Raspberry Pi 3 ... X" in the diagram to illustrate that there could be one or more devices and device edges while the architecture stays the same. Scalability is an important consideration when architecting the ingestion of an IoT solution, given the vast number of devices we can expect in a production environment.


In this article, we covered the infrastructure sub-systems, solution components, and the data orchestration pipeline for ingestion in a modern IoT application. We also walked through People Counter Ingestion, our sample implementation of the ingestion portion of an IoT architecture, and described the rationale behind the choice of each component and how they all fit together. In the next few posts, we will dive deeper into the technical details of the implementation and the inference portion of the architecture.

About the Authors

Luis M. Valerio Castillo is a Solutions Development Consultant with PS Research Labs at VMware, focusing on IoT and Edge Computing. Prior to this role, Luis worked in the field implementing solutions for customers, which included application deployment automation, third-party system integrations, automated testing, and documentation. His six years of experience started at Momentum SI, which was acquired by VMware in 2014. He holds a Bachelor of Science with a major in Computer Science.

Neeraj Arora is a Staff Architect with PS Research Labs at VMware. He leads the development of service offerings for Machine Learning, IoT, and Edge Computing. Previously, Neeraj was part of the VMware Professional Services field organization delivering integrations to Fortune 500 companies using VMware and non-VMware products. Industry experience includes gaming, utilities, healthcare, communications, finance, manufacturing, education, and government sectors. Neeraj has published research papers in the areas of Search Engines, Standards Compliance, and use of Computer Science in Medicine.

