Profile, Optimize, Repeat: One Core Is All You Need™

Track:: PyData: Data Engineering (2024)
Type:: Talk (long session)
Level:: intermediate
Room:: Terrace 2A
Start:: 10:30 on 11 July 2024
Duration:: 45 minutes

Abstract

Your data analysis pipeline works. Nice! Could it be faster? Probably. Do you need to parallelize? Not yet.

Discover optimization steps that boost the performance of your data analysis pipeline on a single core, reducing both runtime and cost.

This walkthrough shows tools for identifying bottlenecks via profiling and strategies for mitigating them, demonstrated on an example pipeline. To improve memory use and runtime, we will use NumPy, Numba JIT compilation, and pybind11 extensions.
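As a rough illustration of the profile-then-optimize loop described above, here is a minimal sketch using only the standard library's `cProfile` and `pstats`. It is not taken from the talk: the functions `load_values`, `pairwise_sum_slow`, `pairwise_sum_fast`, `pipeline` and `profile_report` are hypothetical stand-ins, and the optimization shown is a plain algorithmic rewrite rather than NumPy, Numba or pybind11.

```python
import cProfile
import io
import pstats


def load_values(n):
    """Hypothetical cheap pipeline step: produce the input data."""
    return list(range(n))


def pairwise_sum_slow(values):
    """Hypothetical bottleneck: sum of all pairwise products, O(n^2)."""
    total = 0
    for a in values:
        for b in values:
            total += a * b
    return total


def pairwise_sum_fast(values):
    """Same result in O(n): sum_a sum_b a*b equals (sum of values) squared."""
    s = sum(values)
    return s * s


def pipeline(n):
    """Hypothetical pipeline: load the data, then run the expensive step."""
    values = load_values(n)
    return pairwise_sum_slow(values)


def profile_report(func, *args):
    """Run func under cProfile; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()


result, report = profile_report(pipeline, 300)
print(report)
```

In the printed report, `pairwise_sum_slow` should account for nearly all of the cumulative time, which marks it as the step worth optimizing. After swapping in `pairwise_sum_fast`, profiling again confirms whether the change helped: profile, optimize, repeat.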

Recording

Resources

Slides