Iterating Real-time Assignment Algorithms Through Experimentation

DoorDash operates a large, active on-demand logistics system facilitating food deliveries in over 4,000 cities. When customers place an order through DoorDash, they can expect it to be delivered within an hour. Our platform determines which Dasher, our term for a delivery driver, is best suited to the task and offers them the assignment. Complex real-time logistics fulfillment systems such as ours carry many constraints and tradeoffs, and we apply state-of-the-art modeling and optimization techniques to improve the quality of assignment decisions.

While the field of delivery logistics has been well studied in academia and industry, we found the common methodologies used to optimize these systems less applicable to improving the efficiency of DoorDash’s real-time last-mile logistics platform. These methodologies require a stable prototype environment, which is difficult to build for our platform, and they do not allow us to accurately measure the impact of an algorithm change.

To address our specific use case, we designed an experiment-based framework where operations research scientists (ORS) dive into the assignment production system to gain insights, convert those insights into experiments, and implement their own experiments in production. Our framework allows us to rapidly iterate our algorithms and accurately measure the impact of every algorithm change.

Common solutions to logistics optimization problems

In our previous roles as ORS developing optimization algorithms in the railway and airline industries, we normally worked in a self-contained offline environment before integrating algorithms into the production system. This process consists of three stages: preparation, prototyping, and production (3P).

Workflow diagram showing 3P model — Figure 1: A typical workflow for algorithm optimization in industry involves a three-step process, Preparation, Prototyping, and Production.

Preparation:
- Collecting business requirements: ORS need to work with business and engineering teams to collect the requirements, such as how the model will be used (as decision support or in production system), the metrics to optimize, and constraints that must be followed.
- Finding data: ORS need to understand what kind of data is needed for modeling and what data is practically available.
- Making assumptions: Given the requirements, ORS must make assumptions about requirements that are unclear or about data that is not available.
Prototyping:
- Once we have collected the requirements, found the data, and made appropriate assumptions, we can create a model to solve the problem.
- Once the model is available, sometimes a prototype environment needs to be built to help iterate on the model. The prototype environment can be as simple as a tool to calculate and visualize all the metrics and solutions, or as complex as a sophisticated simulation environment to evaluate the long term effect of a model or algorithm.
- During the iteration process, ORS may need to work with a business or operations team to validate the solutions from the model and make sure that the results are consistent with expectations.
Production:
- When the model has been validated in a prototype environment, the engineering team needs to re-implement and test the prototype model and algorithm. ORS will need to validate the re-implementation and make sure the performance of the production model is similar to the prototype. In certain rare scenarios, ORS may work with the engineering team to validate the model through an experiment.
- Normally the roll-out is first performed at a small scale, in a subset of locations or for a short period of time, to validate the impact.
- After the model is fully rolled out in production, ORS, as well as the business and operations team, will monitor the metrics to make sure that the new model achieves the desired results without breaking other metrics. This measurement is essentially a pre-post observational study.

The challenges of applying the 3P framework

In the real-time food delivery environment, we find it extremely hard to apply the 3P modeling approach. Our quickly evolving production system makes it difficult to maintain a self-contained environment for modeling, a necessity for the preparation phase. As a fast-growing business, our engineering team is adding more and more new requirements to the system. Plus, software engineers are constantly looking to optimize the efficiency of code, which may impact how the data is processed and how the assignment decisions are post-processed.

The challenges in prototyping are even larger. The key to the 3P framework is creating a self-contained environment so that ORS can get accurate feedback from it to iterate on the model. However, this is much harder for our real-time logistics problem because real-time dispatch data comes in continuously. We need to make an assignment decision for each delivery within minutes, and Dashers, as independent contractors, may accept or decline delivery assignments.

To cope with this volatile environment, optimization decisions need to be made continuously over time based on continuously updated information. This creates many challenges in building the prototyping environment. For example, given that information such as Dashers’ decisions arrives continuously and assignment decisions are made over time, every assignment decision may have a dramatic impact on future decisions. To make a model prototype possible, an accurate and elaborate simulation system needs to be built, which is as hard as, if not harder than, solving the logistics problem itself.

Finally, it is difficult to measure the production impact of new models through pre-post analysis. Because supply, demand, and traffic conditions are highly volatile, our metrics fluctuate substantially from day to day, and that noise can easily mask or mimic the effect of an algorithm change.

A framework to iterate our real-time assignment algorithm

To address the issues we face with the 3P methodology, we developed a framework that incorporates experimentation, enabling us to develop, iterate on, and productionize algorithms much faster. In this framework, each ORS gains a deep understanding of the production codebase where the algorithm lives, relentlessly experiments with new ideas on all aspects of the algorithm (including its input, its output, and how its decisions are executed), and productionizes their own algorithms. The framework not only increases the scope and impact of an ORS, but also increases the cohesion between the algorithm and the production system it lives in, making the algorithm easier to maintain and iterate on.

In this new framework, algorithm development follows three steps: preparation, experiment, and production.

Workflow diagram showing modified optimization model — Figure 2: Our new framework for algorithm iteration replaces the prototyping step of the 3P process with an experimentation step. This new step lets us develop and test algorithms in the production environment.

Preparation: The first step is to dive into the codebase where the algorithm lives and gain insights into how assignment algorithms could be impacted by each line of code. As a fast-growing business, our codebase for the assignment service evolves quickly to cope with new operational and business requirements. Accordingly, the owner of the assignment algorithm, the ORS in most cases, cannot effectively iterate on the algorithm without a thorough understanding of the codebase. In addition, deep knowledge of the codebase enables ORS to see beyond the model and focus on the bigger picture, including input preparation, post-processing of the algorithm output, and engineering system design.

Our experience suggests that refining those processes may be more fruitful than narrowly focusing on the algorithm or modeling part. For example, our order-ready time estimator is essential to making good assignment decisions to avoid delivery lateness and excessive Dasher wait times at merchants. Instead of treating the order-ready time as a given fixed input into the algorithm, we relentlessly work with our partners on the Data Science team to refine prediction models for order-ready time so as to improve assignment decisions.
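
To make the role of the order-ready time concrete, here is a minimal sketch of how an estimate that is too optimistic can change which Dasher looks best. Everything in it is an illustrative assumption rather than our production logic: the function names, the equal weighting of Dasher wait and lateness, and all of the numbers are made up.

```python
def evaluate(arrival_min, ready_min, drive_min, promised_min):
    """Return (dasher_wait, lateness) in minutes for one candidate assignment.

    All arguments are minutes from now. Hypothetical cost model for illustration.
    """
    pickup = max(arrival_min, ready_min)          # cannot leave before food is ready
    wait = max(0.0, ready_min - arrival_min)      # Dasher idles at the merchant
    lateness = max(0.0, pickup + drive_min - promised_min)
    return wait, lateness


def best_dasher(arrivals, ready_min, drive_min, promised_min):
    """Pick the Dasher with the lowest wait + lateness (equal weights assumed)."""
    def cost(name):
        wait, lateness = evaluate(arrivals[name], ready_min, drive_min, promised_min)
        return wait + lateness
    return min(arrivals, key=cost)


# Made-up example: minutes until each Dasher could reach the merchant.
arrivals = {"near": 5.0, "far": 15.0}

# If the order is predicted to be ready in 5 minutes, "near" looks best.
choice_with_estimate = best_dasher(arrivals, ready_min=5.0, drive_min=20.0, promised_min=30.0)

# If the order is actually ready in 15 minutes, "far" would have been the better pick.
choice_with_truth = best_dasher(arrivals, ready_min=15.0, drive_min=20.0, promised_min=30.0)
```

In this toy setup, acting on the optimistic estimate sends the nearby Dasher, who then waits at the merchant, which is the kind of outcome a more accurate order-ready prediction helps avoid.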

Experiment: With deep understanding of the assignment system, ORS are able to propose improvements that cover a broad range of areas:

- Preparing better input data for the core algorithm
- Refining the MIP model, including the objective function and constraints
- Finding new information for more informed decisions
- Improving how the algorithm output is executed
- Improving engineering design that may hurt solution quality

These ideas can be validated through analysis of our internal historical data or results from a preliminary simulator.

After validating the idea, the ORS sets up and runs an experiment in production. We use a switchback experiment framework, which divides every day into a set of disjoint time windows. For each time window and geographic region (for example, San Francisco is one region), we randomly select either the control algorithm (the incumbent) or the treatment algorithm (our proposed change). Given that our decisions are made in real time and the lifespan of a delivery is normally under an hour, the window size can be as short as a few hours. A short window size allows us to get experiment results within a few weeks.
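
The bucketing described above can be sketched roughly as follows. This is a simplified illustration, not our production code: the two-hour window, the experiment salt, and the function names are hypothetical, and a salted hash stands in for the random selection so that every delivery in the same region and window deterministically sees the same algorithm.

```python
import hashlib
from datetime import datetime, timedelta, timezone

WINDOW_HOURS = 2              # hypothetical window size
SALT = "assignment-exp-001"   # hypothetical experiment identifier


def window_start(ts, window_hours=WINDOW_HOURS):
    """Floor a timestamp to the start of its switchback window within the day."""
    day_start = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    windows_elapsed = ts.hour // window_hours
    return day_start + timedelta(hours=windows_elapsed * window_hours)


def variant(region, ts, window_hours=WINDOW_HOURS, salt=SALT):
    """Choose control or treatment for one (region, time window) unit."""
    key = f"{salt}|{region}|{window_start(ts, window_hours).isoformat()}"
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # The first byte of the salted hash acts as a reproducible coin flip.
    return "treatment" if digest[0] % 2 else "control"


# Two deliveries in the same region and window always get the same algorithm.
t1 = datetime(2020, 12, 1, 12, 5, tzinfo=timezone.utc)
t2 = datetime(2020, 12, 1, 13, 55, tzinfo=timezone.utc)
same_unit = variant("San Francisco", t1) == variant("San Francisco", t2)
```

Randomizing whole region-window units, rather than individual deliveries, keeps the treatment and control algorithms from competing for the same Dashers at the same time.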

Through numerous trials, we find it most effective to have the ORS implement most of their own experiments in the production system. Software engineers only step in when changes are complex and involve deep engineering knowledge. Our rigorous code review process ensures that changes made by ORS are subject to our engineering practice and do not cause production issues. This approach has the following benefits:

- It dramatically lowers the barrier to implementing new experiment ideas, since it eliminates the communication and coordination overhead between ORS and software engineers.
- Most experiments require not only an understanding of the assignment system, but also domain knowledge such as optimization or machine learning. Our experience suggests that sometimes a small difference in the implementation can have dramatic consequences for assignment quality. This process makes sure that the algorithm designer’s intention is fully reflected in the implementation. It also makes the experiment analysis process much more efficient, since ORS know every detail of their own implementation.

Production: Normally, it takes around two weeks to get enough data for each of our experiments. If the experiment achieves a good tradeoff between delivery quality and delivery efficiency, the change is rolled out in production. Given that the new algorithm is already implemented as an experiment, rolling out the change is straightforward and takes little time. This manner of productionization has the following benefits:

- Compared to the traditional 3P framework, it dramatically shortens the lead time between when an algorithm is validated and when it is fully rolled out in production.
- It almost eliminates the discrepancy between the validated algorithm and the productionized algorithm.
- In our new framework, every change to our assignment algorithm is tested in a switchback experiment, so its impact is measured accurately. In the 3P framework, the pre-post observational study has many pitfalls, such as unanticipated confounding effects (for more information, refer to chapter 11 of the book Trustworthy Online Controlled Experiments, by Ron Kohavi, Diane Tang, and Ya Xu).
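
As a rough illustration of how a switchback experiment can be read out, the sketch below averages a metric within each region-window unit first and then compares the unit-level means across variants. It is an assumption-laden simplification: the function name and the delivery durations are made up, and a real analysis would also need a variance estimate before drawing conclusions.

```python
from collections import defaultdict
from statistics import mean


def switchback_effect(deliveries):
    """Estimate treatment minus control using region-window units.

    deliveries: iterable of (region, window_id, variant, metric) tuples.
    """
    per_unit = defaultdict(list)
    for region, window_id, variant, metric in deliveries:
        per_unit[(region, window_id, variant)].append(metric)

    # One observation per randomized unit, so busy windows do not dominate.
    unit_means = defaultdict(list)
    for (_, _, variant), values in per_unit.items():
        unit_means[variant].append(mean(values))

    return mean(unit_means["treatment"]) - mean(unit_means["control"])


# Made-up delivery durations in minutes.
sample = [
    ("SF", 0, "control", 40.0), ("SF", 0, "control", 44.0),
    ("SF", 1, "treatment", 38.0), ("SF", 1, "treatment", 40.0),
    ("LA", 0, "treatment", 35.0),
    ("LA", 1, "control", 38.0), ("LA", 1, "control", 36.0),
]
effect = switchback_effect(sample)  # -2.5 minutes for this toy data
```

Aggregating to the unit of randomization reflects the fact that deliveries inside the same window share Dashers and conditions and are therefore not independent observations.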

If the experiment doesn’t work as intended, we normally perform a deep dive into the experiment results. We examine how the algorithm change impacts every aspect of our assignment metrics and process, and try to find a new iteration that fixes the issues in the previous version. Given that ORS implement the experiment themselves, they can connect the implementation to any metric changes in the assignment process, provide better insights into the results, and propose a new iteration.

Conclusion

With the challenges of DoorDash’s on-demand real-time logistics system, ORS find it difficult to apply the 3P framework to develop and iterate models and algorithms. Instead, we work at the intersection of engineering and modeling: we seek to thoroughly understand the production system, iterate the assignment algorithm through switchback experimentation, and productionize our experiments. With the new framework, we improve our assignment algorithm at a much faster pace and accurately gauge the impact of every algorithm change.

Given that ORS have knowledge in both modeling (including ML and optimization) and software engineering, they can serve as adaptors to connect data scientists to the production system. Data scientists normally are not aware of how their models are used in production and may not be aware of scalability and latency issues in their model. ORS with deep knowledge in the production system can help other data scientists shape their modeling ideas so that they fit better into the production system.

Diagrams comparing a production system with and without an ORS — Figure 3: Rather than have data scientists hand off models to be implemented into the production system, operations research scientists work within the production system, iterating models on real-time data.

Our new framework is an extension of the DevOps movement. Instead of working in a self-contained prototype environment offline, ORS integrate modeling and algorithm iterations into the day-to-day software engineering process. This integration helps increase efficiency in many aspects: it reduces the communication and coordination overhead to make algorithm changes and it allows the algorithm designer or owner to maintain the algorithm. As a result, the whole process dramatically reduces the time it takes to form an experiment idea, shape it into a testable experiment, and detect and fix any errors in the algorithm design.

What’s next?

Armed with the experiment-based framework, we have been improving our real-time assignment algorithm over time. With diminishing returns on our incremental gains, we are working closely with our Experiment and Data Science teams to increase the experiment power so that we can measure smaller improvements. Given the proven power of our framework, we believe we can apply it to solve many other real-time problems at DoorDash.
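
To illustrate why experiment power limits how small an improvement we can measure, the sketch below uses the textbook two-sample normal approximation for the minimum detectable effect. It is not a description of our experiment platform: the function name and default parameters are assumptions, and the formula treats region-window units as independent, which switchback data may not fully satisfy.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(sigma, units_per_arm, alpha=0.05, power=0.8):
    """Smallest true difference detectable with the given power.

    sigma: standard deviation of the unit-level metric (assumed equal in both arms).
    units_per_arm: number of region-window units in each variant.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance threshold
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sigma * sqrt(2 / units_per_arm)


# Quadrupling the number of units halves the detectable effect,
# and so does halving the metric's standard deviation.
mde_small = minimum_detectable_effect(sigma=10.0, units_per_arm=100)
mde_large = minimum_detectable_effect(sigma=10.0, units_per_arm=400)
mde_low_variance = minimum_detectable_effect(sigma=5.0, units_per_arm=100)
```

Under these assumptions, measuring smaller improvements requires either more experimental units or a less noisy metric.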

Header photo by Alina Grubnyak on Unsplash.
