New features in `pandas` 2.0

pandas 2.0 was released in April 2023, about three years after pandas 1.0, and it arrived with plenty of fanfare. Given how widely the library is used, the move from pandas 1.x to 2.0 includes a number of changes that affect many users. Let’s take a look at some of the key new features introduced in pandas 2.0, which is the version we use in this course.

Improved performance and memory efficiency

The pandas 2.0 update added support for PyArrow (the Python bindings for the Apache Arrow C++ library) as an optional backing memory format for DataFrames, alongside the default NumPy data structures, which are less efficient for some kinds of data such as strings and columns with missing values. With Arrow extension arrays as the backend, many operations can run faster and use less memory because they leverage the C++ implementation of Arrow. The Arrow backend is opt-in: NumPy remains the default unless we request Arrow-backed data types.

Previously, inefficient memory usage caused by the original ...
