Unicode and Bytes Data Type

Explore how Python represents text using Unicode and handles binary data with the bytes type. Understand code points, encoding schemes such as UTF-8, and how to convert between strings and bytes. This lesson prepares you to work effectively with text and binary data in Python programming.

Unicode

Unicode is a standard for the representation, encoding, and handling of text expressed in all scripts of the world.

It is a myth that every character in Unicode is two bytes long. Unicode already defines more than 65,536 characters, which is the maximum number of distinct values that two bytes (16 bits) can represent.
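To make the point concrete, here is a minimal sketch that checks how many bytes a few sample characters occupy in two common encodings. The helper name `encoded_lengths` and the sample characters are illustrative choices, not part of the lesson.

```python
def encoded_lengths(ch: str) -> tuple[int, int, int]:
    """Return (code point, UTF-8 byte count, UTF-16 byte count) for one character."""
    code_point = ord(ch)
    utf8_len = len(ch.encode("utf-8"))
    # "utf-16-le" is used so that no byte order mark is added to the count.
    utf16_len = len(ch.encode("utf-16-le"))
    return code_point, utf8_len, utf16_len


# Sample characters: Latin A, e with acute accent, euro sign, grinning face emoji.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    code_point, utf8_len, utf16_len = encoded_lengths(ch)
    print(f"U+{code_point:04X}: UTF-8 = {utf8_len} bytes, UTF-16 = {utf16_len} bytes")
```

The last sample has a code point above 65,535, so it cannot fit in two bytes: it takes four bytes in both UTF-8 and UTF-16.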

In Unicode, every character is assigned an integer value called a code point, which is usually ...
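As a small illustration of code points, Python's built-in `ord()` returns the integer code point of a character and `chr()` maps an integer back to the character. The wrapper names `code_point_of` and `char_from` below are hypothetical, used only to label the two directions.

```python
def code_point_of(ch: str) -> int:
    """Return the integer code point assigned to a single character."""
    return ord(ch)


def char_from(code_point: int) -> str:
    """Return the character assigned to an integer code point."""
    return chr(code_point)


# Latin A, e with acute accent, euro sign.
for ch in ["A", "\u00e9", "\u20ac"]:
    cp = code_point_of(ch)
    # Show the code point in decimal and in hexadecimal.
    print(ch, cp, hex(cp))

# Converting to a code point and back yields the original character.
print(char_from(code_point_of("\u20ac")) == "\u20ac")
```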