Recall
In this chapter, we’ve looked at the following topics:
- The ways to encode strings into bytes and decode bytes into strings. While some older character encodings (like ASCII) treat bytes and characters alike, this leads to confusion. Python text can contain any Unicode character, while Python bytes are integers in the range 0 to 255.
- String formatting lets us prepare string objects that have template pieces and dynamic pieces. This works for a lot of situations in Python. One is to create readable output for people, but we can use f-strings and the string format() method anywhere we’re creating a complex string from pieces.
- We use regular expressions to decompose complex strings. In effect, a regular expression is the opposite of a fancy string formatter. The difficulty with regular expressions is separating the characters we’re matching from the meta-characters that provide additional matching rules, like repetition or alternative choices.
- We’ve looked at a few ways to serialize data, including Pickle, CSV, and JSON. There are other formats, including YAML, that are similar enough to JSON and Pickle that we didn’t need to cover them in detail. Other serializations like XML and HTML are quite a bit more complex, and we’ve avoided them.
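The idea that a regular expression is the opposite of a string formatter can be sketched in a few lines. This is a minimal illustration, not code from the chapter: the values `name`, `quantity`, and `price`, and the line layout itself, are hypothetical choices made only for the example.

```python
import re

# Hypothetical sample values, chosen only for illustration.
name = "widget"
quantity = 3
price = 4.5

# Formatting: fixed template pieces combined with dynamic values.
line = f"{name}: {quantity:d} @ {price:.2f}"

# Parsing: a regular expression recovers the pieces from the string.
# The colon, "@", and spaces match themselves; \w, \d, +, and the
# parentheses are meta-characters; \. escapes the dot so that it
# matches a literal period instead of any character.
pattern = re.compile(r"(?P<name>\w+): (?P<quantity>\d+) @ (?P<price>\d+\.\d+)")
match = pattern.fullmatch(line)
if match is None:
    raise ValueError(f"unexpected line format: {line!r}")
parsed = match.groupdict()

print(line)
print(parsed)
```

Note that the parsed values all come back as strings; converting `quantity` and `price` back to numbers is a separate step the formatter did for us in the other direction.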
Synopsis
We’ve covered string manipulation, regular expressions, and object serialization in this chapter. Hardcoded strings and program variables can be combined into outputtable strings using the powerful string formatting system. It is important to distinguish between binary and textual data, and bytes and str have specific purposes that must be understood. Both are immutable, but the bytearray type can be used when manipulating bytes.
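The distinction between str, bytes, and bytearray can be seen in a short sketch. The sample text and the choice of UTF-8 are assumptions made for the example; any encoding that covers the characters in question would behave similarly.

```python
# Sample text and encoding are assumptions chosen for illustration.
text = "café"
data = text.encode("utf-8")   # str -> bytes

# Four characters become five bytes, because "é" needs two bytes in UTF-8.
char_count = len(text)
byte_values = list(data)      # each byte is an integer from 0 to 255

# Decoding with the same encoding restores the original text.
round_trip = data.decode("utf-8")

# bytes objects are immutable: item assignment raises TypeError.
try:
    data[0] = ord("C")
    bytes_mutable = True
except TypeError:
    bytes_mutable = False

# A bytearray holds the same values but can be changed in place.
buffer = bytearray(data)
buffer[0] = ord("C")
changed = buffer.decode("utf-8")

print(char_count, byte_values, round_trip, bytes_mutable, changed)
```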
Regular expressions are a complex topic, and we only scratched the surface. There are many ways to serialize Python data; pickles and JSON are two of the most popular.
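As a brief reminder of how the two formats differ, here is a small round trip through both. The `record` dictionary is a hypothetical example, not data from the chapter.

```python
import json
import pickle

# Hypothetical record used only for illustration.
record = {"name": "widget", "tags": ("a", "b"), "count": 3}

# Pickle produces bytes and restores Python objects faithfully,
# including the tuple.
pickled = pickle.dumps(record)
from_pickle = pickle.loads(pickled)

# JSON produces text and only knows a few types, so the tuple
# comes back as a list.
as_json = json.dumps(record)
from_json = json.loads(as_json)

print(type(pickled).__name__, from_pickle == record)
print(type(as_json).__name__, from_json["tags"])
```

Pickle is Python-specific and should only be loaded from trusted sources, since unpickling can execute arbitrary code; JSON is text-based and portable across languages, at the cost of a narrower set of types.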