AWS Machine Learning Blog

Delivering real-time racing analytics using machine learning

AWS DeepRacer is a fun and easy way for developers with no prior experience to get started with machine learning (ML). At the end of the 2019 season, the AWS DeepRacer League engaged the Amazon ML Solutions Lab to develop a new sports analytics feature for the AWS DeepRacer Championship Cup at re:Invent 2019.

The purpose of these real-time analytics was to give viewers context and a more in-depth look at top competitors’ strategies and tactics. This helped viewers tangibly interpret how a specific model strategy translated to on-track performance, which further demystified ML development and demonstrated its real-world application. The enhancement also enabled fans to monitor the performance and driving style of competitors from around the world.

In this post, we share how we developed these analytics, deployed them into production, and delivered them to the fans.

Using insights from ML and classical statistics

Drawing from our expertise in motorsports, the ML Solutions Lab built a custom analytics suite powered by both ML and classical statistics.

A competitor’s momentum is a crucial indicator of future performance. For example, being on a hot streak can boost your confidence as you record one blazing fast lap after another. A cold streak, however, can do the opposite and make it hard to stay on the track. We communicated this trend to fans by predicting a competitor’s next lap time. After comparing a diverse group of forecasting methods using AutoML on Amazon Forecast, we found that the Exponential Smoothing (ETS) algorithm produced accurate forecasts despite the small datasets we had available.
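As a rough illustration of the idea, the following sketch applies simple exponential smoothing, the most basic member of the ETS family, to a short list of lap times. It is not the Amazon Forecast implementation; the function name, the smoothing factor `alpha`, and the lap times are illustrative assumptions.

```python
def forecast_next_lap(lap_times, alpha=0.5):
    """One-step-ahead forecast using simple exponential smoothing.

    alpha near 1 weights recent laps heavily; alpha near 0 smooths more.
    """
    if not lap_times:
        raise ValueError("need at least one lap time")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = lap_times[0]  # initialize the level with the first observation
    for t in lap_times[1:]:
        # Blend the newest lap with the running level.
        level = alpha * t + (1.0 - alpha) * level
    return level


# Hypothetical lap times (seconds) for a competitor on a hot streak.
laps = [10.0, 9.0, 8.0]
print(forecast_next_lap(laps, alpha=0.5))  # prints 8.75
```

Because each lap is faster than the last, the forecast (8.75 seconds) sits below the average of the three laps (9.0 seconds), which is how the smoothed level reflects momentum.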

While metrics like lap time consistency allowed fans to interpret different driving styles during the Championship Cup opening day round of 64 competitors, it was unclear which styles would prevail in the second day bracket matchups. Would the consistent, surgically precise style of Fumiaki triumph over the aggressive, breakneck pace of Sola?

Using simulated races, we could predict the winner of each matchup and help fans anticipate races that would go down to the wire. We took a statistical approach and modeled the lap times of each competitor with a probability distribution. By referencing data from the 2019 AWS DeepRacer Summits, we found that the distribution of lap times for a given competitor is generally right-skewed and that lap times improve over time. To capture this skew in the lap times, we fit a Weibull distribution using maximum likelihood estimation to find the optimal scale and shift parameters.
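To make the fitting step concrete, here is a minimal sketch of a Weibull maximum likelihood fit in plain Python. The text does not give the fitting details, so the sketch makes a simplifying assumption: the shift is fixed by hand (for example, a value just below the fastest observed lap), and only the shape and scale are estimated. The function name and the lap times are hypothetical.

```python
import math


def fit_weibull_mle(lap_times, shift=0.0, tol=1e-10):
    """Fit Weibull shape and scale by maximum likelihood for a fixed shift."""
    xs = [t - shift for t in lap_times]
    if len(xs) < 2 or min(xs) <= 0.0:
        raise ValueError("need at least two lap times, all greater than shift")
    if max(xs) == min(xs):
        raise ValueError("lap times must not all be identical")
    logs = [math.log(x) for x in xs]
    mean_log = sum(logs) / len(logs)
    max_log = max(logs)

    def score(k):
        # Likelihood equation for the shape k; its root is the MLE.
        # Values are rescaled by the largest one so x**k cannot overflow.
        weights = [math.exp(k * (lg - max_log)) for lg in logs]
        weighted_log = sum(w * lg for w, lg in zip(weights, logs)) / sum(weights)
        return weighted_log - 1.0 / k - mean_log

    lo, hi = 1e-6, 1.0
    while score(hi) < 0.0:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:  # bisection; score is increasing in k
        mid = 0.5 * (lo + hi)
        if score(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    # Given the shape, the scale has a closed form: (mean(x**k)) ** (1 / k).
    mean_pow = sum(math.exp(shape * (lg - max_log)) for lg in logs) / len(logs)
    scale = math.exp(max_log + math.log(mean_pow) / shape)
    return shape, scale


# Hypothetical lap times (seconds); the shift of 8.0 is an assumed floor.
laps = [9.4, 8.9, 9.1, 10.2, 8.7, 9.0, 11.5, 8.8]
shape, scale = fit_weibull_mle(laps, shift=8.0)
print(f"shape={shape:.3f}, scale={scale:.3f}")
```

Estimating the shift jointly with the other parameters is harder, because the likelihood can behave badly as the shift approaches the fastest observed lap, which is why this sketch holds it fixed.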

The following graphs show the distribution of lap times for the top three competitors.

We reframed the winner prediction problem as the likelihood of beating the fastest time of the better competitor (the tail probability) and used a Monte Carlo simulation to sample from each competitor’s Weibull distribution and calculate this likelihood.
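The tail probability described above can be sketched with a small Monte Carlo simulation. This is an illustration under stated assumptions, not the production code: the function names, the parameter values, and the `n_laps` argument (how many lap attempts the challenger gets) are hypothetical. A closed-form version based on the Weibull CDF is included only as a sanity check on the simulation.

```python
import math
import random


def prob_beating_time(shape, scale, shift, target_time,
                      n_laps=1, n_trials=100_000, seed=None):
    """Monte Carlo estimate of P(at least one of n_laps beats target_time)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_trials):
        # random.weibullvariate takes (scale, shape), in that order.
        best = min(shift + rng.weibullvariate(scale, shape)
                   for _ in range(n_laps))
        if best < target_time:
            wins += 1
    return wins / n_trials


def exact_prob_beating_time(shape, scale, shift, target_time, n_laps=1):
    """Closed-form check using the shifted Weibull CDF."""
    if target_time <= shift:
        return 0.0
    single = 1.0 - math.exp(-(((target_time - shift) / scale) ** shape))
    return 1.0 - (1.0 - single) ** n_laps


# Hypothetical challenger: shape 1.5, scale 2.0 s, shift 8.0 s.
# Hypothetical favorite's fastest lap: 9.0 s. Challenger gets 3 laps.
estimate = prob_beating_time(1.5, 2.0, 8.0, 9.0, n_laps=3, seed=1)
exact = exact_prob_beating_time(1.5, 2.0, 8.0, 9.0, n_laps=3)
print(f"simulated={estimate:.3f}, closed-form={exact:.3f}")
```

For a single competitor the closed form is enough; simulation earns its place once the quantity of interest has no convenient formula, such as a head-to-head comparison of two fitted distributions.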

Deploying the solution with a serverless architecture

To compute these insights in real time, we deployed our analytics suite to a low-latency, serverless architecture on AWS. The following diagram illustrates this architecture.

The architecture includes the following steps:

  1. As competitors complete laps, their times upload to an Amazon Relational Database Service (RDS) cluster running Amazon Aurora Serverless. By using RDS as a datastore, we could implement our entire analytics suite as a collection of lightweight, stateless AWS Lambda functions.
  2. We exposed an interface to trigger the notifications so we could dynamically integrate our insights with the commentary at the MGM Garden Arena.
  3. The triggered Lambda function queries RDS to acquire historical data and uses it to compute the analytics.
  4. Our Lambda function uses a webhook to publish the insights to Amazon Chime, a business conferencing tool that supports instant messaging.
  5. A diverse group of stakeholders from social media, AWS DeepRacer TV, and production teams can use tablets with Amazon Chime installed to tune in to the analytics feed.
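Steps 3 and 4 can be sketched as a single function that queries historical lap times, computes an insight, and posts it to an Amazon Chime incoming webhook, which accepts a JSON body with a `Content` field. Everything else here is an assumption made for illustration: the event fields, the `fetch_lap_times` callable standing in for the RDS query, and the message format. In a real Lambda deployment the handler takes only `(event, context)`, so the extra arguments would be bound at module level.

```python
import json
import urllib.request


def build_chime_request(webhook_url, message):
    """Build the HTTP POST for an Amazon Chime incoming webhook."""
    body = json.dumps({"Content": message}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def format_insight(competitor, lap_times):
    """Turn historical lap times into a short message for the feed."""
    best = min(lap_times)
    average = sum(lap_times) / len(lap_times)
    return (f"{competitor}: best lap {best:.3f}s, "
            f"average {average:.3f}s over {len(lap_times)} laps")


def handler(event, context, fetch_lap_times, send=urllib.request.urlopen):
    """Sketch of the triggered function: query, compute, publish."""
    competitor = event["competitor"]          # hypothetical event field
    webhook_url = event["webhook_url"]        # hypothetical; normally from config
    lap_times = fetch_lap_times(competitor)   # stand-in for the RDS query
    message = format_insight(competitor, lap_times)
    send(build_chime_request(webhook_url, message))
    return {"statusCode": 200, "body": message}
```

Passing the query and the sender in as arguments keeps the function stateless and lets it be exercised locally without a database or a network connection.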

Going serverless allowed our team to iterate rapidly while having the scalability and fault-tolerance that production systems require. Our team of three built this architecture in just under 3 weeks. The architecture was extremely cost-effective to run because we didn’t incur charges for idling resources. After the Championship Cup, we didn’t even have any servers to turn off!

Using analytics in real time

Finally, to integrate our insights into the programming for the Championship Cup, we collaborated with the commentators to deliver the analytics at the right times. We used headsets with live audio supplied by the production team to listen for certain cues from the commentators, such as the following:

  • Interviewing a specific competitor
  • Calling out that a competitor is starting or finishing a lap
  • Noting major events, like setting a new world record or two closely matched competitors about to face off

Stay tuned for more in 2020

Our analytics gave fans an enhanced experience during the Championship Cup. But we’re not done yet. Over the course of re:Invent 2019, we captured over 12,000 lap times from over 500 competitors. This data will power even more advanced analytics in 2020 as AWS DeepRacer rolls out new cars and new competition formats.

We would like to give a big thank you to the AWS DeepRacer team for this amazing opportunity, and to the production, social media, and commentator teams for working with us to bring this to life! If you’d like help accelerating your use of ML in your products and processes, please contact the ML Solutions Lab.

Start your machine learning journey with AWS DeepRacer at www.awsdeepracerleague.com.


About the Authors

Ryan Cheng is a Deep Learning Architect in the Amazon ML Solutions Lab. He has worked on a wide range of ML use cases from sports analytics to optical character recognition. In his spare time, Ryan enjoys cooking.

Delger Enkhbayar is a data scientist in the Amazon ML Solutions Lab. She has worked on a wide range of deep learning use cases in sports analytics, public sector and healthcare. Her background is in mechanism design and econometrics.

Saman Sarraf is a data scientist in the AWS Machine Learning Solutions Lab. His background is in applied machine learning including deep learning, computer vision and time-series data prediction.