Optimizing Manipulations Using Cython
Explore how to enhance pandas data manipulation by using Cython to compile functions for native speed improvements. Understand when to use apply in string operations, how to optimize with typed Cython code, and leverage vectorized methods and regular expressions for efficient data processing.
The previous example uses apply, and by now it’s clear that we avoid that method where we can because it’s slow. Let’s take a detour from strings for a minute and look at making the apply operation quicker using Cython.
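Cython is a superset of Python that is translated to C and then compiled to native code. To enable it in Jupyter, you’ll need the Cython package installed, and then you load the extension by running the `%load_ext Cython` line magic in a cell.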
Then you can define functions with Cython by starting a cell with the `%%cython` cell magic. As a first step, we’re going to “cythonize” the between function:
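Here is a minimal sketch of what such a function might look like. It assumes between checks whether a row’s value falls inside an inclusive range; the field names `low`, `high`, and `value` are hypothetical and chosen only for illustration. In a notebook, putting the function definition in a cell that starts with `%%cython` would compile the same body. Plain dictionaries stand in for DataFrame rows here so that the sketch runs without pandas.

```python
# Hypothetical stand-in for the lesson's `between` function. The field
# names "low", "high", and "value" are assumptions, not the lesson's own.
def between(row):
    """Return True when row["value"] lies in [row["low"], row["high"]]."""
    return row["low"] <= row["value"] <= row["high"]


# Dictionaries stand in for DataFrame rows so the sketch needs no pandas.
rows = [
    {"low": 10, "high": 20, "value": 15},
    {"low": 10, "high": 20, "value": 25},
]

# Row-by-row evaluation, the same access pattern that apply uses.
results = [between(row) for row in rows]
print(results)
```

With a DataFrame that has matching column names, the equivalent row-wise call would be `df.apply(between, axis=1)`. The function body uses no type declarations, so it still works entirely with Python objects whether or not it is compiled.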
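When we benchmark this, it’s no faster than our current code. That is largely expected, since untyped Cython code still works with generic Python objects, so the compiled function does much the same work as the interpreter. If we add C type declarations to the Cython code, Cython can generate native comparisons and arithmetic, and we can get a speed increase. We’ll try that here: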
Because we’re ...