How Python implements super-long integers

Share

In Python, long integers (also known as "arbitrary-precision integers" or "bignums") are implemented using a variable-length integer representation. You can work with extremely large integers without worrying about overflow issues. This is particularly useful in cases where precise calculations involving very large numbers are required, such as cryptography, number theory, and scientific computations.

On the other hand, programmers working with languages like C/C++ must be mindful of memory management and selects suitable variable types (e.g., int or long) to prevent integer overflow problems.

Integer representation in Python

In Python, integers can be represented using a struct. An illustration of this integer representation is given below:

struct {
sssize_t ob_refcnt;
struct _typeobject *ob_type;
ssize_t ob_size;
uint32_t ob_digit[n];
};
Integer representation

The element ob_refcnt is used in Python’s garbage collector and the ob_type is used for type identification. In this case, the object is an integer. These two variables are not particularly significant when discussing integer representation in Python.

The representation of the integer value in Python relies on two other variables which are the ob_digit and ob_size

  • Python uses the ob_digit array to store individual digits of the number at distinct index positions.

  • The ob_size variable serves a dual purpose. First, stores the array's length nn and indicates the sign of the integer (whether it is positive or negative).

  • The uint32_t is a data type in C/C++ that represents an unsigned 32-bit integer.

  • The ssize_t is a signed integer data type in C/C++ commonly used on POSIX-compliant systems to represent the size of objects, typically 64 bits on modern systems and 32 bits on older systems.

The technique of representing integer values using a sequence of digits through strings or arrays is called Bignum Arithmetic.

Base system

Usually, we rely on the base 10 system in our daily lives, but it has a drawback when dealing with large numbers due to limited space. In Python, we cannot utilize all 32 bits to represent an integer effectively because of memory constraints, allowing a maximum of 30 bits for representation. To address this limitation, we convert numbers to base 2302^{30}, meaning all digits in the array have values ranging from 00 to 23012^{30}-1 which is 10737418231073741823.

Moreover, this integer representation system in Python stores the array in little-endian orderLittle-endian is a byte order in which the least significant byte of a multi-byte data is stored at the lowest memory address., meaning the least-significant value comes first (at a lower index). To clarify, suppose the numbers were stored in base 10 instead of base 2302^{30}, the number 432432 represented using the array would be: <2,3,4><2,3,4>.

Example

Let's say we want to represent the number 24868541023353707297502486854102335370729750 in Python. Our very first step will be to convert this large number into base 2302^{30}. The idea is similar as we convert base 10 to other bases such as hexadecimal etc.

24868541023353707297502486854102335370729750 in base 2302^{30} would be represented as <45259030,2250912,2157><45259030,2250912,2157>because 45259030×(230)0+2250912×(230)1+2157×(230)2=248685410233537072975045259030 \times (2^{30})^{0} + 2250912 \times (2^{30})^{1} + 2157 \times (2^{30})^{2} = 2486854102335370729750.

Representing a very large number in Python
Representing a very large number in Python

Note: If we were to represent 2486854102335370729750-2486854102335370729750 (negative of original number) we would simply change the variable ob_size from 3 to -3.

Code

We will now see how Python can handle very large numbers, whereas other languages like C/C++ will overflow and won't be able to deal with large numbers. We will explore whether it is possible to represent the extremely large number 24868541023353707297502486854102335370729750 in C/C++ or Python and demonstrate whether performing a square of this number is feasible or not.

largeNumber = 2486854102335370729750
print("Large number:",largeNumber)
print("Square of large number:",largeNumber*largeNumber)
Representing extremely large integers in different programming languages

As shown, in Python, we can easily represent extremely large integers and perform various operations on them, ensuring feasibility and accurate results. However, in C/C++, representing such large integers is not possible due to overflow limitations, leading to a loss of precision and incorrect outcomes.

Limitations

A limitation of very large numbers in Python is that they can consume a significant amount of memory and lead to slower computation times. Although Python supports arbitrary-precision arithmetic, handling extremely large numbers can become impractical in terms of performance and resource usage.

Conclusion

Python3 offers capabilities for effectively representing and handling large numbers, enabling arithmetic operations with high precision even for numbers that exceed the limits of other common programming languages. Python's flexibility and built-in support for arbitrary-precision arithmetic make it an ideal choice for dealing with extremely large numerical values.

Copyright ©2024 Educative, Inc. All rights reserved