What's New in pandas 2.0
Explore the significant new features introduced in pandas 2.0, such as improved memory efficiency with PyArrow, enhanced nullable data types, broader NumPy dtype support, and optimized copy-on-write operations. Learn how these updates improve data handling and analysis in pandas.
New features in pandas 2.0
pandas 2.0 was released in April 2023, about three years after pandas 1.0, to plenty of fanfare and excitement. Because the library is so widely used, the changes in this major release affect a large number of users. Let’s take a look at some of the key new features introduced in pandas 2.0, which is the version we use in this course.
Improved performance and memory efficiency
The pandas 2.0 update introduced PyArrow (the Python library for Apache Arrow) as an optional backing memory format for DataFrames. NumPy remains the default backend, but its data structures were not designed for tabular data and handle strings and missing values inefficiently. With Arrow extension arrays as the backend, many operations, particularly those on string data, can see substantial improvements in speed and memory utilization because they leverage the C++ implementation of Arrow.
Previously, inefficient memory usage caused by the original ...