Profile, Optimize, Repeat: One Core Is All You Need™

Track: PyData: Data Engineering
Type: Talk (long session)
Level: intermediate
Room: Terrace 2A
Start: 10:45 on 10 July 2024
Duration: 45 minutes

Abstract

Your data analysis pipeline works. Nice! Could it be faster? Probably. Do you need to parallelize? Not yet.

Discover optimization steps that boost the performance of your data analysis pipeline on a single core, reducing time & costs.

This walkthrough shows tools for identifying bottlenecks through profiling and strategies for mitigating them, demonstrated on a worked example. To improve memory usage and runtime, we will use NumPy, Numba JIT compilation, and pybind11 extensions.
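
As a rough illustration of the profile-then-optimize loop described above, the sketch below profiles a pure-Python function with the standard library's cProfile and then swaps in a NumPy version of the same computation. It is not taken from the talk: the functions `normalize_loop`, `normalize_numpy`, and `profile`, as well as the sample data, are hypothetical names chosen for this example.

```python
import cProfile
import io
import pstats

import numpy as np


def normalize_loop(values):
    """Hypothetical hotspot: scale values to [0, 1] with a pure-Python loop."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def normalize_numpy(values):
    """Same computation, vectorized with NumPy."""
    arr = np.asarray(values, dtype=np.float64)
    lo, hi = arr.min(), arr.max()
    return (arr - lo) / (hi - lo)


def profile(func, *args):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Show the five entries with the highest cumulative time.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


# Made-up sample data, only for demonstration.
data = [float(i % 1000) for i in range(200_000)]

slow_result, slow_report = profile(normalize_loop, data)
fast_result, fast_report = profile(normalize_numpy, data)

print(slow_report)
print(fast_report)
```

Comparing the two reports shows where the time goes in each variant. The actual speedup depends on the data and the machine, so it is worth measuring rather than assuming.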


The speakers

Valentin Nieper

Valentin is a software and machine learning engineer at scalable minds. He works on implementing the newest models for biological image analysis and makes sure the data analysis pipeline scales on the cluster.

Jonathan Striebel

Jonathan is a senior ML software engineer at Aignostics in Berlin, Germany. He works on machine-learning pipelines for medical image analysis, ensuring scalability and maintainability.