PyData London 2022

A Hitchhiker’s Guide to MLOps
06-19, 10:15–11:00 (Europe/London), Tower Suite 2

Bringing Machine Learning (ML) applications into live production comes with all the challenges of traditional software development, and more: large datasets, tracking data and model quality, experiment reproducibility, and monitoring a live application. This talk is a grounded introduction to monitoring the ML lifecycle using only open source software.


Bringing your Machine Learning (ML) project from your local Jupyter notebook to production is often not straightforward. The set of tasks needed for "productionalizing" an ML project is now called MLOps.

Although MLOps shares similarities with traditional software development and IT operations (DevOps), it also introduces new challenges such as large datasets, ongoing experiments, and the unpredictable behavior of live ML models. One of the most common pitfalls is releasing an ML application into the wild without monitoring it, only to find out it has gone out of control.

But don't panic! Following your ML application through its lifecycle is easier than you think with Python and readily available open source software.

As a part of this practical introduction I will cover:

  • Traditional ML development vs. MLOps-driven ML development (3 mins)
  • Key concepts and common pitfalls in MLOps (5 mins)
  • MLflow for data and model tracking in Python (5 mins)
  • MLflow for experiment reproducibility (5 mins)
  • Grafana for human-friendly monitoring of a live ML application (5 mins)
  • Conclusions and examples on how to contribute to these projects (2 mins)

Why should you attend this talk?

This talk is for anyone interested in developing ML applications in a production environment, such as ML Engineers, Data Scientists, or ML product owners. I specifically focus on tips and tricks that will help you get started with monitoring ML applications.

Who can attend this talk?

To maximize the value of this talk, it is recommended that you are familiar with the basics of machine learning and Python programming.


Prior Knowledge Expected

No previous knowledge expected

Heya!

I am Davide, Machine Learning Operations Engineer at Massive Entertainment – A Ubisoft Studio, and Pythonista at heart.

I have spent the past 7 years working in data science, both researching and developing machine learning applications and data platforms. My main interest is in bridging the gap between development and production in the machine learning world.
Eventually I decided to go back to one of my original passions and landed in the videogames industry, excited to get Avatar: Frontiers of Pandora and Star Wars released soon!

When I am not making or playing videogames, I like to keep myself active by biking, bouldering, practicing yoga, and, finally, learning how to swim.
I am also an active community builder in and outside of IT, with a strong interest in local NGO communities in my city, Copenhagen. It is not hard to find me around the city making food with my non-profit restaurant One Bowl.