A computer can only store data as bytes. To store anything on a computer, you must first convert it into a form that the computer can understand and store. This conversion is called encoding. Therefore, to store data, you must first encode it into a bytes-like object. Let’s look at a few types of data and their common encodings:
Type | Encoding(s) |
---|---|
Music | MP3, WAV |
Text | ASCII, UTF-8 |
Image | JPEG, PNG |
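The choice of text encoding matters once a string goes beyond basic English characters. The short sketch below (the sample word is an illustrative choice, not from the original example) shows the same text encoded with UTF-8 and rejected by ASCII:

```python
# Sample text containing one non-ASCII character (illustrative choice)
text = "café"

# UTF-8 can represent every Unicode character; "é" takes two bytes
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)                  # b'caf\xc3\xa9'
print(len(text), len(utf8_bytes))  # 4 5

# ASCII only covers code points 0-127, so encoding "é" fails
try:
    text.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError as err:
    ascii_ok = False
    print("ASCII cannot represent this text:", err.reason)
```

UTF-8 is usually the safer default for text, since it agrees with ASCII on the first 128 characters and still covers everything else.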
In Python, a string object is a sequence of characters. In the same manner, a bytes object is a sequence of bytes that represents data. Strings are human-readable, while bytes are computer-readable.
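As a small illustration of the difference, the sketch below compares a str and a bytes object holding the same two letters (the variable names are only for this example):

```python
text = "Hi"    # str: a sequence of characters
data = b"Hi"   # bytes: a sequence of integers in the range 0-255

print(type(text), type(data))  # <class 'str'> <class 'bytes'>

# Indexing a str gives a one-character string,
# while indexing bytes gives an integer
print(text[0])  # H
print(data[0])  # 72

# The two types do not mix without an explicit encode/decode step
try:
    combined = text + data
except TypeError as err:
    combined = None
    print("Cannot mix str and bytes:", err)
```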
Data is converted into bytes before it is stored on a computer. Likewise, data must be encoded before it is sent in client-server transmissions. Let’s look at an example of how to encode and decode a string in Python.
newstr = "Hello World"
newstr_bytes = newstr.encode("ascii")
print(newstr_bytes)
When you run the code above, it will give you:
b'Hello World'
The b prefix indicates that the value is a bytes object, which here holds the ASCII encoding of the string.
To decode it back into its original form, use the decode() method. Look at the code below:
newstr_decode = newstr_bytes.decode("ascii")
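To check that decoding really restores the original text, a round trip can be compared against the input. The sketch below also shows, as an added illustration, what happens when bytes are decoded with a different encoding than the one used to encode them:

```python
original = "Hello World"
encoded = original.encode("ascii")
decoded = encoded.decode("ascii")

print(decoded)              # Hello World
print(decoded == original)  # True

# Decoding with the wrong encoding may silently produce garbled text:
# "é" is two bytes in UTF-8, and Latin-1 reads each byte as its own character
wrong = "é".encode("utf-8").decode("latin-1")
print(wrong)                # Ã©
```

The encoding passed to decode() should therefore match the one that was used with encode().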
Encoding and decoding are inverse operations. Data must be encoded before it is stored on a computer and decoded before a human reads it. Bytes-like objects are used across operations that require data in binary form, e.g., compression, file transfer, data storage, and socket programming.
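Compression is one such operation. As a minimal sketch, assuming the standard-library zlib module as the compressor (the article does not name one), the text has to be encoded before it is compressed and decoded after it is decompressed:

```python
import zlib

message = "Hello World " * 10

# zlib works on bytes, so the string is encoded first
payload = message.encode("utf-8")
compressed = zlib.compress(payload)
print(len(payload), "bytes before,", len(compressed), "bytes after")

# Reverse the steps: decompress to bytes, then decode to a string
restored = zlib.decompress(compressed).decode("utf-8")
print(restored == message)  # True

# Passing the str directly is rejected
try:
    zlib.compress(message)
    str_accepted = True
except TypeError as err:
    str_accepted = False
    print("str rejected:", err)
```

The same pattern applies to files opened in binary mode and to sockets: encode before writing or sending, decode after reading or receiving.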
# a string that we will convert to bytes
str_string = "Educative"

# convert the string to bytes using ASCII encoding
# encode() is called on the string that needs to be converted and
# takes the name of the encoding, e.g. "ascii", "utf-8", etc.
str_bytes = str_string.encode("ascii")

# will print the string, but as bytes
print("encoded = ", str_bytes)

# the type will be the bytes class
print(type(str_bytes))

# the ASCII code of each character in the converted string
print(list(str_bytes))

list_bytes = list(str_bytes)
for b in list_bytes:
    print(chr(b), "is represented by", b)

# decoding the bytes back into a string
str_decode = str_bytes.decode("ascii")
print("decoded = ", str_decode)