How to convert strings to bytes in Python

A bytes object is an immutable sequence of bytes, that is, integers from 0 to 255. Bytes objects are machine-readable and can be written directly to disk. Strings, on the other hand, are sequences of human-readable characters and must be encoded into bytes before they can be stored on disk.

There are several encodings that map text to bytes. Two of the most popular are ASCII and UTF-8.
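
To make the difference between the two encodings concrete, here is a minimal sketch. The sample string "café" is our own choice for illustration and is not the string used in the rest of this article; the sketch relies on the encode() method covered in a later section.

```python
# Illustrative text chosen for this sketch: it contains one non-ASCII character.
text = "café"

# UTF-8 can represent every Unicode character; 'é' takes two bytes.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)       # b'caf\xc3\xa9'
print(len(utf8_bytes))  # 5

# ASCII covers only code points 0-127, so encoding 'é' fails.
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False
print(ascii_ok)  # False
```

For text made up only of ASCII characters, both encodings produce identical bytes, which is why the examples in this article print the same values for each.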

Convert strings to bytes

In this section, we will explore two different methods to convert a string to bytes.

Using the bytes() class

We can use the built-in bytes class in Python to convert a string to bytes: simply pass the string as the first argument of the bytes constructor and the encoding as the second argument.

Printing the object shows a user-friendly textual representation, but the data it contains is a sequence of bytes.

string = "Hello World"
# string with encoding 'utf-8'
arr = bytes(string, 'utf-8')
arr2 = bytes(string, 'ascii')
print(arr, '\n')
# actual bytes in the string
for byte in arr:
    print(byte, end=' ')
print("\n")
for byte in arr2:
    print(byte, end=' ')
  • Line 3: We define a variable named arr, which stores the string encoded as UTF-8.

  • Line 4: We define a variable named arr2, which stores the string encoded as ASCII.

  • Lines 7-8: We loop over arr and print each byte as an integer, followed by a space.

  • Lines 10-11: We loop over arr2 in the same way. Because "Hello World" contains only ASCII characters, both loops print the same values.
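
The point that a bytes object holds integers rather than characters can be checked directly. The following is a small sketch using the same string as above; the variable names data, first, values, hex_view, and needs_encoding are ours, introduced only for illustration.

```python
data = bytes("Hello World", "utf-8")

# Indexing a bytes object returns an integer, not a one-character string.
first = data[0]
print(first)  # 72, the code for 'H'

# list() exposes every byte value at once.
values = list(data)
print(values)

# hex() gives a compact hexadecimal view of the same data.
hex_view = data.hex()
print(hex_view)  # 48656c6c6f20576f726c64

# Passing a string without an encoding is an error.
try:
    bytes("Hello World")
    needs_encoding = False
except TypeError:
    needs_encoding = True
print(needs_encoding)  # True
```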

Using the encode() method

In this section, we will use encode(), a built-in string method, to convert strings to bytes.

string = "Hello World"
# Convert string to bytes using encode() method
arr = string.encode('utf-8')
arr2 = string.encode('ascii')
print(arr, '\n')
# Actual bytes in the string (UTF-8)
for byte in arr:
    print(byte, end=' ')
print("\n")
# Actual bytes in the string (ASCII)
for byte in arr2:
    print(byte, end=' ')
  • Line 3: We define a variable named arr, which stores the UTF-8 encoded bytes produced by the encode() method.

  • Line 4: We define a variable named arr2, which stores the ASCII encoded bytes produced by the encode() method.
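
As a quick check that the two approaches agree, the sketch below compares their results and converts the bytes back into a string. The names default_bytes, same, and round_trip are ours, and decode() is included only to show the reverse direction.

```python
string = "Hello World"

# With no argument, encode() uses UTF-8.
default_bytes = string.encode()

# Both approaches produce the same bytes object.
same = default_bytes == bytes(string, "utf-8")
print(same)  # True

# decode() reverses the conversion and returns the original string.
round_trip = default_bytes.decode("utf-8")
print(round_trip)  # Hello World
```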

Copyright ©2024 Educative, Inc. All rights reserved