Sorting Lists

Learn how to sort objects in Python lists by defining custom comparison methods, allowing flexible sorting rules.

Overview

Without any parameters, a list object’s sort() method will generally do as expected. If we have a list[str] object, the sort() method will place the items in alphabetical order. This operation is case-sensitive, so all capital letters will be sorted before lowercase letters; Z comes before a. If it’s a list of numbers, they will be sorted in numerical order. If a list of tuples is provided, the tuples are compared element by element: the first elements are compared first, and later elements are used only to break ties. If a mixture containing items that cannot be compared with each other is supplied, the sort will raise a TypeError exception.
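The default behaviors described above can be sketched as follows. The list contents are made up for illustration only.

```python
# Strings: case-sensitive, so capital letters sort before lowercase ones.
names = ["cherry", "Zebra", "apple", "Banana"]
names.sort()

# Numbers: sorted in numerical order.
numbers = [10, 2, 33, 4]
numbers.sort()

# Tuples: compared element by element, left to right.
pairs = [(2, "b"), (1, "z"), (2, "a")]
pairs.sort()

# Mixed, non-comparable items: sort() raises TypeError.
mixed = [1, "two", 3]
try:
    mixed.sort()
    sort_failed = False
except TypeError:
    sort_failed = True

print(names)        # ['Banana', 'Zebra', 'apple', 'cherry']
print(numbers)      # [2, 4, 10, 33]
print(pairs)        # [(1, 'z'), (2, 'a'), (2, 'b')]
print(sort_failed)  # True
```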

Comparing objects

If we want to place objects of classes we’ve defined ourselves into a list and make those objects sortable, we have to do a bit more work. The special __lt__() method, which stands for less than, must be defined on the class to make instances of that class comparable. The sort() method on the list will call this method on pairs of objects to determine where each one goes in the list. This method should return True if our instance is somehow less than the passed parameter and False otherwise.
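As a minimal sketch of this idea, the hypothetical WeightedItem class below defines only __lt__() and compares instances by a weight attribute. The class name and its attributes are assumptions chosen for illustration.

```python
class WeightedItem:
    """Hypothetical class whose instances sort by their weight."""

    def __init__(self, label, weight):
        self.label = label
        self.weight = weight

    def __lt__(self, other):
        # Let Python handle comparisons against unrelated types.
        if not isinstance(other, WeightedItem):
            return NotImplemented
        # True when this instance should come before `other`.
        return self.weight < other.weight

    def __repr__(self):
        return f"WeightedItem({self.label!r}, {self.weight})"


items = [WeightedItem("c", 3), WeightedItem("a", 1), WeightedItem("b", 2)]
items.sort()  # relies only on __lt__()
labels = [item.label for item in items]
print(labels)  # ['a', 'b', 'c']
```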

When we need comparisons like this, we’ll often use a dataclass. As we discussed in the previous lesson, the @dataclass(order=True) decorator will ensure that all of the comparison methods are built for us. A named tuple also has the ordering operations defined by default.
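A short sketch of both options follows. The Version and Point classes are hypothetical examples; in each case, instances are compared field by field in the order the fields are declared, just as tuples are.

```python
from dataclasses import dataclass
from typing import NamedTuple


@dataclass(order=True)
class Version:
    # order=True generates __lt__, __le__, __gt__, and __ge__ for us.
    major: int
    minor: int


class Point(NamedTuple):
    # Named tuples inherit tuple ordering.
    x: int
    y: int


versions = [Version(2, 0), Version(1, 5), Version(1, 2)]
versions.sort()

points = sorted([Point(1, 2), Point(0, 9), Point(1, 0)])

print(versions)
print(points)
```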

Tagged union

One tricky situation that arises with sorting is handling a data structure sometimes called a tagged union. A union is a description of an object where not every attribute is always relevant. If an attribute’s relevance depends on another attribute’s value, this can be seen as a union of distinct subtypes with a tag to distinguish between them.

Example

Here are some example data where a tag value, the “Data Source” column, is required to decide how best to deal with the remaining columns. Some values of “Data Source” tell us to use the timestamp, whereas other values tell us to use the creation date.
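One way such a comparison might be written is sketched below. Everything here is an assumption made for illustration: the Record class, its field names, the tag values "Local" and "Remote", the rule that "Local" rows use the timestamp while all other rows use the creation date, and the sample rows themselves.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Record:
    """Hypothetical tagged union: data_source is the tag."""

    data_source: str
    timestamp: Optional[datetime] = None
    creation_date: Optional[datetime] = None

    @property
    def when(self) -> datetime:
        # Assumed rule: the tag decides which attribute is relevant.
        if self.data_source == "Local":
            value = self.timestamp
        else:
            value = self.creation_date
        if value is None:
            raise ValueError(f"missing date for source {self.data_source!r}")
        return value

    def __lt__(self, other):
        if not isinstance(other, Record):
            return NotImplemented
        # Compare on whichever date the tag selects for each record.
        return self.when < other.when


# Made-up rows, not taken from the lesson's data.
records = [
    Record("Remote", creation_date=datetime(2021, 3, 1)),
    Record("Local", timestamp=datetime(2021, 1, 15)),
    Record("Remote", creation_date=datetime(2021, 2, 1)),
]
records.sort()
ordered_sources = [record.data_source for record in records]
print(ordered_sources)  # ['Local', 'Remote', 'Remote']
```

Keeping the tag logic in a single property means that __lt__() stays a one-line comparison, and the same rule can be reused elsewhere, for example as a key function passed to sort().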
