At DoorDash, It’s Running with Data

Take a quick look at DoorDash’s website, and you’d be forgiven for thinking that they’re simply a food delivery company.

Ask them, and they see themselves as a logistics company; food just happens to be the focus, at least for now. Says DoorDash iOS Engineer Jeff Cosgriff in a company profile, “At its core, DoorDash is a technology company. We are working to solve last-mile logistics by allowing local merchants to outsource delivery.” That mission is much broader than food.

This means that, once they nail food, they could expand elsewhere. Given infusions of over $180M in a lean investment environment, their investors likely see that potential as well.

DoorDash isn’t the only company providing on-demand logistics. Several contenders are vying to conquer the “last mile,” and standing out is tough. As in most crowded marketplaces, margins are tight, so even small efficiency gains can separate the leaders from the laggards.

Within this environment, DoorDash has had to establish a long-term sustainable advantage: a culture built around the worship of all things data, one that lets them both excel today and keep improving.

What they’ve taken on is no easy task. They must bring together merchants, delivery people (called “Dashers”), and consumers:

  • They need to make merchants happy by bringing them business.
  • They need to make Dashers happy by maximizing their income through efficient assignment of deliveries.
  • They need to make consumers happy so that they’ll order again and again.

    The challenge lies in creating a monster model that optimizes all aspects of this operation and integrating it into a complex software system running across the cloud and mobile apps.

    When a consumer creates an order with a specific desired delivery time, the merchant has to know when to start preparing the order so that it’s ready when the Dasher arrives.

    The Dasher needs to get the assignment in time to arrive at the restaurant as the food comes out of the kitchen.

    So “go time” in the kitchen depends on how long the item takes to prepare, how busy the kitchen is that time of day, and how long the delivery will take – which depends on how far away the customer lives and how bad traffic will be at this time of day.
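To make the timing dependency concrete, here is a minimal sketch that works backward from the desired delivery time. It is not DoorDash's model; the function name, the multiplier-style load and traffic factors, and all the numbers are assumptions chosen purely for illustration.

```python
from datetime import datetime, timedelta


def schedule_order(deliver_by, prep_minutes, kitchen_load_factor,
                   drive_minutes, traffic_factor):
    """Work backward from the desired delivery time.

    All parameters are hypothetical: the two factors are simple
    multipliers (1.0 = normal) standing in for how busy the kitchen
    is and how bad traffic is at that time of day.
    """
    travel = timedelta(minutes=drive_minutes * traffic_factor)
    prep = timedelta(minutes=prep_minutes * kitchen_load_factor)
    pickup_time = deliver_by - travel      # when the Dasher should leave the restaurant
    kitchen_start = pickup_time - prep     # "go time" in the kitchen
    return kitchen_start, pickup_time


# Illustrative values only: a 16-minute dish in a busy kitchen (x1.25),
# a 10-minute drive in heavy traffic (x1.5), delivery wanted at 19:00.
deliver_by = datetime(2024, 1, 15, 19, 0)
kitchen_start, pickup_time = schedule_order(deliver_by, 16, 1.25, 10, 1.5)
print(kitchen_start.time(), pickup_time.time())
```

With these made-up inputs the kitchen would start at 18:25 so that the food is ready for pickup at 18:45. A production model would presumably learn the factors from historical data rather than take them as fixed inputs.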

    And DoorDash is motivated to find other nuances that further refine how to optimize the timing.

    This software system can be a huge beast to manage, since it must be available and stable in production, and it must predict the right behaviors at the right time. How do they make this happen?

    By collecting massive amounts of data on every aspect of the operation, determining the metrics most important to the business, actively monitoring them, and then relating it all back to the underlying data when they notice anomalies.

    Hendra Tjahayadi, DoorDash DevOps/Data Infrastructure Manager, emphasizes the challenge in yet another company profile: “We really can’t see without data — the challenge is so huge.”

    So DoorDash’s organization and culture revolve around the primacy of data and metrics. They form the basis of all operations and engineering decisions.

    For operations, the mantra is simple: measure and record everything. You then monitor key metrics to confirm that they remain within expectations. DoorDash tracks hundreds of metrics to make sure a customer’s food arrives on time and fresh. If something looks off, you drill down to find out why.
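The "monitor key metrics, then drill down" loop can be sketched in a few lines. This is only an illustration under stated assumptions: the metric names and expected ranges below are hypothetical, and this is not how DoorDash or Wavefront implements alerting.

```python
def out_of_range(latest, expected):
    """Return the metrics whose latest value falls outside its expected range.

    `latest` maps metric name -> current value.
    `expected` maps metric name -> (low, high) bounds, inclusive.
    """
    alerts = {}
    for name, value in latest.items():
        low, high = expected[name]
        if not (low <= value <= high):
            alerts[name] = value
    return alerts


# Hypothetical metrics and bounds, chosen only for illustration.
expected = {
    "avg_delivery_minutes": (20, 45),
    "order_error_rate": (0.0, 0.02),
}
latest = {
    "avg_delivery_minutes": 52,
    "order_error_rate": 0.01,
}

alerts = out_of_range(latest, expected)
print(alerts)  # only the delivery-time metric is outside its range
```

Anything returned here is the starting point for the drill-down into the underlying data; fixed bounds are the simplest possible notion of "within expectations," and a real system would likely derive them from historical behavior.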

    A similar philosophy applies to engineering, but with a twist. Existing products have historical benchmarks for metrics, but new products don’t. So engineering must surround any new capability with critical metrics that measure whether the feature is operating as expected and whether customers like it.

    If those metrics suggest there’s trouble, then they lead to the cause of that trouble so that, if necessary, the code can be quickly updated. Rinse and repeat as necessary.

    “On the engineering side, people are using data metrics to find the results of experiments. People are constantly trying new things against our control groups, to see what customers like and don’t like,” says Mr. Tjahayadi. As a result, data helps drive a culture of experimentation at DoorDash.

    This sounds straightforward, but there’s one critical requirement for it to work: all the data must reside in a single, unified data store so that everyone can see and work with the same data, with minimal misinterpretation.

    Says Mr. Tjahayadi, “If you don’t have this process, my definition of something might be different than yours, so we’re talking a different language. Whereas if we allocate everything correctly, there’s only one definition.”

    The challenge is that there are numerous tools for measuring and recording different aspects of the data, and each tool tends to create its own data silo.

    The final, critical tool pulls together all those silos into one high-quality repository, including the key metrics. In order to unify and analyze metrics, the tool must be flexible and customizable to the specific requirements of DoorDash or any other business.

    Data reliability and accuracy are also essential, as the quality of the data drives the quality of the entire system that builds upon that data.

    This is one way Wavefront helps teams across DevOps improve collaboration and productivity for business advantage: moving faster with more stability.

    It’s also critical that software developers, cloud operations, product marketers, and anyone else driving operational excellence be able to access metrics and data easily, defining analytics and alerts as necessary.

    That makes internal tools an important engineering product. “If we collect everything but people can’t search, it’s like finding a needle in a haystack,” says Mr. Tjahayadi. “On the DevOps side of things, this includes site reliability, developer productivity, and our production quality. We don’t want to slow people down, but we set the balance between moving fast and stability.” Good tools drive developer productivity and trust.

    Companies like DoorDash live and die by centralized, easily accessed, uniform data and metrics that everyone can see. There’s nothing that happens here without data. As noted by Jessica Lachs, Head of BizOps and Analytics, “In order to solve problems you have to go to the data, because that’s where the answers are.”

    Talk to Wavefront today if your organization can benefit from a more unified and scalable approach to metrics monitoring. If you prefer to get started first with a hands-on demo and trial of our metrics-as-a-service platform, click here.