Float vs. double in Java

A data type classifies the values that a variable can store and determines the operations that can be performed on them. Different programming languages offer different data types, including integers, floating-point numbers, characters, strings, and booleans. In this blog, we’ll focus on floating-point numbers, and more precisely on the differences between float and double.

The main difference between float and double is their precision. A float is a 32-bit single-precision floating-point type, whereas a double is a 64-bit double-precision floating-point type.

However, there is more to learn about these data types: How are they stored? What are their use cases? And, more importantly, how do float and double behave differently in Java?

The two representations of a floating-point number#

A real number written in binary can be described by three parts:

Sign: Represents whether the number is positive or negative. It is usually represented by a bit, with $0$ indicating a positive and $1$ indicating a negative number.
Integer: Represents the whole number that appears before the decimal point.
Fraction: Represents the fraction that appears after the decimal point.

Fixed-point representation of a binary number uses a fixed number of bits for the integer and fraction parts. Although convenient, it has a limited range and precision: the number of integer bits caps the largest value that can be stored, and the number of fraction bits caps how finely values can be resolved.

Take an example of a 16-bit representation of $3.1416$ where we fix $7$ bits for the integer and $8$ bits for the fraction after the decimal point. The remaining $1$ bit is reserved for the sign bit.

The sign bit will be $0$ since the number is positive.
The integer $3$ is converted into binary as $0000011$.
The fraction $0.1416$ is converted into binary as $00100100$, which is $36/256 = 0.140625$. The stored value is therefore $3.140625$ rather than $3.1416$.
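
The conversion above can be reproduced in a few lines of Java. This is only a minimal sketch of the 1 + 7 + 8 bit layout from the example. The class and method names (`FixedPointDemo`, `encode`, `storedValue`) are made up for illustration, the fraction is truncated rather than rounded, and the input is assumed to have a magnitude below 128.

```java
final class FixedPointDemo {
    // Layout assumed from the example: 1 sign bit, 7 integer bits, 8 fraction bits.
    static final int INTEGER_BITS = 7;
    static final int FRACTION_BITS = 8;

    /** Returns the 16-bit pattern as "sign integer fraction". Assumes |value| < 128. */
    static String encode(double value) {
        int sign = value < 0 ? 1 : 0;
        int scaled = scale(value);
        int integerPart = scaled >> FRACTION_BITS;               // bits before the point
        int fractionPart = scaled & ((1 << FRACTION_BITS) - 1);  // bits after the point
        return sign + " " + toBits(integerPart, INTEGER_BITS)
                + " " + toBits(fractionPart, FRACTION_BITS);
    }

    /** Returns the value that the truncated bit pattern actually stores. */
    static double storedValue(double value) {
        double magnitude = scale(value) / (double) (1 << FRACTION_BITS);
        return value < 0 ? -magnitude : magnitude;
    }

    // Multiplies the magnitude by 2^8 and truncates, so that the integer
    // and fraction bits end up side by side in a single int.
    static int scale(double value) {
        return (int) (Math.abs(value) * (1 << FRACTION_BITS));
    }

    // Formats the lowest `width` bits of value, most significant bit first.
    static String toBits(int value, int width) {
        StringBuilder bits = new StringBuilder();
        for (int i = width - 1; i >= 0; i--) {
            bits.append((value >> i) & 1);
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode(3.1416));      // 0 0000011 00100100
        System.out.println(storedValue(3.1416)); // 3.140625
    }
}
```

The second line of output shows the cost of having only 8 fraction bits: everything below $2^{-8}$ is lost.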

Another approach is to use one set of bits for the digits of the number and another set of bits to indicate the position of the point within those digits. This is called floating-point representation. We call the digits without the point the mantissa and the position of the point the exponent. Floating-point representation is beneficial for applications in which the range of the values is large and the precision requirements are high.

Let’s continue the previous example and see how $3.1416$ is represented in a 16-bit floating-point representation. Considering $10$ bits for the mantissa and $5$ for the exponent, the sign bit will be $s=0$ since the number is positive. For the mantissa, we start with the binary equivalent $3.1416 \approx 11.00100100$, which can be written in normalized form as $1.100100100 \times 2^1$. The mantissa is therefore $1100100100$, and the exponent is the binary representation of $1$, that is, $00001$.
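
The same steps can be sketched in Java. This follows the simplified 16-bit scheme of the example, not a real IEEE 754 format. The class and method names are hypothetical, the mantissa is truncated rather than rounded, and the input is assumed to be a finite, nonzero number with a non-negative exponent.

```java
final class ToyFloatDemo {
    // Field widths taken from the example; the mantissa includes the leading 1.
    static final int MANTISSA_BITS = 10;
    static final int EXPONENT_BITS = 5;

    /** Returns e such that 1 <= |value| / 2^e < 2. */
    static int exponentOf(double value) {
        return Math.getExponent(value);
    }

    /** Returns the first 10 binary digits of the normalized value. */
    static String mantissaBits(double value) {
        // Shift the point so that exactly one digit remains in front of it.
        double normalized = Math.scalb(Math.abs(value), -exponentOf(value));
        // Keep 1 digit before and 9 digits after the point; drop the rest.
        int digits = (int) (normalized * (1 << (MANTISSA_BITS - 1)));
        return toBits(digits, MANTISSA_BITS);
    }

    /** Returns the exponent as a 5-bit pattern (non-negative exponents only). */
    static String exponentBits(double value) {
        return toBits(exponentOf(value), EXPONENT_BITS);
    }

    // Formats the lowest `width` bits of value, most significant bit first.
    static String toBits(int value, int width) {
        StringBuilder bits = new StringBuilder();
        for (int i = width - 1; i >= 0; i--) {
            bits.append((value >> i) & 1);
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        double value = 3.1416;
        System.out.println("sign     : " + (value < 0 ? 1 : 0));   // 0
        System.out.println("mantissa : " + mantissaBits(value));   // 1100100100
        System.out.println("exponent : " + exponentBits(value));   // 00001
    }
}
```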

Float vs. double in Java#

Having understood the concept of floating-point representation, it becomes easy to distinguish between float and double. As stated earlier in this blog, the primary contrast between float and double lies in their precision.

According to the IEEE 754 standard, a float is a 32-bit binary format, while double is a 64-bit binary format. The difference in the number of bits used for an exponent and a mantissa is summarized in the table below:

Now that we know the distribution of bits in float and double, we can determine the range of the data types — the maximum and minimum values that can be stored.

A float can store values whose magnitude ranges from approximately $1.4 \times 10^{-45}$ to $3.4 \times 10^{38}$, positive or negative.
A double can store values whose magnitude ranges from approximately $4.9 \times 10^{-324}$ to $1.8 \times 10^{308}$, positive or negative.
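
These limits are available in Java as constants on the `Float` and `Double` wrapper classes. The short program below prints them and shows what happens when a value falls outside the float range. The class name and the two helper methods are made up for this illustration.

```java
final class RangeDemo {
    /** True if the value is too large in magnitude to fit in a float. */
    static boolean overflowsFloat(double value) {
        return Float.isInfinite((float) value);
    }

    /** True if a nonzero value is too small in magnitude and becomes zero as a float. */
    static boolean underflowsFloat(double value) {
        return value != 0.0 && (float) value == 0.0f;
    }

    public static void main(String[] args) {
        // Smallest positive and largest finite values of each type.
        System.out.println(Float.MIN_VALUE);   // 1.4E-45
        System.out.println(Float.MAX_VALUE);   // 3.4028235E38
        System.out.println(Double.MIN_VALUE);  // 4.9E-324
        System.out.println(Double.MAX_VALUE);  // 1.7976931348623157E308

        // Values outside the float range turn into infinity or zero.
        System.out.println(overflowsFloat(1e39));    // true
        System.out.println(underflowsFloat(1e-46));  // true
    }
}
```

Note that `MIN_VALUE` is the smallest positive value, not the most negative one. The most negative finite value is `-Float.MAX_VALUE` or `-Double.MAX_VALUE`.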

Applications#

Let’s talk about some common applications in which float and double are used.

Float#

In general, applications that require less precision, are bound by processing power, or are limited by storage are suitable for using float instead of double. Some common examples of these applications are as follows:

Mobile devices: Memory is usually limited on mobile devices, which makes float a natural choice. A float takes half the memory of a double and can be cheaper to process.
Time-critical systems: Time-critical systems are often constrained by latency, making them a natural use case for float. A typical example is a self-driving car, in which fast processing and low latency are critical. Note that using float can make the processing faster at the expense of precision.
Graphics and audio processing: Float is also suitable for graphics and audio processing, applications in which its lower precision is usually still sufficient.
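
The memory argument is easy to quantify with the `Float.BYTES` and `Double.BYTES` constants. The sketch below counts only the raw element data of an array and ignores the per-object overhead that the JVM adds. The class name, the helper `payloadBytes`, and the sample count are assumptions chosen for illustration.

```java
final class MemoryDemo {
    /** Raw bytes needed to store the given number of elements (array overhead ignored). */
    static long payloadBytes(long elements, int bytesPerElement) {
        return elements * bytesPerElement;
    }

    public static void main(String[] args) {
        long samples = 1_000_000; // e.g., one million sensor readings

        System.out.println(payloadBytes(samples, Float.BYTES));  // 4000000
        System.out.println(payloadBytes(samples, Double.BYTES)); // 8000000
    }
}
```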

Double#

Since double provides greater precision, its use cases differ from those of float. Here are some examples in which double would be suitable instead of float:

Financial calculations: Since precision is key here, double is preferred over float in financial calculations to reduce rounding errors. Note that double still cannot represent most decimal fractions exactly, so exact monetary amounts are commonly handled with Java’s BigDecimal class instead.
Scientific computing: Another use case for double is in scientific computations requiring accuracy. Examples include physics simulations, statistical simulations, and climate modeling.
Defense systems: A vital application in which precision is essential is defense systems. In systems such as missile guidance, coordinates must be represented precisely because small errors significantly affect the accuracy of the results.
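
To make the precision gap concrete, the following sketch compares the two types on the same inputs. A float has a 24-bit significand (23 stored bits plus an implicit leading 1), so it cannot hold every integer above $2^{24} = 16777216$. The class name and the helper `floatKeeps` are hypothetical.

```java
final class PrecisionDemo {
    /** True if the int survives a round trip through float unchanged. */
    static boolean floatKeeps(int n) {
        // Both sides are widened to double, which represents every int exactly.
        return (double) (float) n == (double) n;
    }

    public static void main(String[] args) {
        // The same division carried out in both types.
        System.out.println(1.0f / 3.0f); // 0.33333334
        System.out.println(1.0 / 3.0);   // 0.3333333333333333

        // Integers just above 2^24 no longer fit into a float.
        System.out.println(floatKeeps(16_777_216)); // true
        System.out.println(floatKeeps(16_777_217)); // false
    }
}
```

A float carries roughly 7 significant decimal digits, while a double carries roughly 15 to 16.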

The diagram below briefly summarizes how to choose between a float and a double. Gauging the requirements of the underlying application can help you choose.

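Rounding errors#

Neither float nor double can represent every decimal fraction exactly, so a computed result can differ slightly from the expected one. The following program adds $0.1$ ten times and compares the sum with $1.0$:

Java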

public class Main {
    public static void main(String[] args) {
        // The f suffix makes these literals floats instead of doubles
        float exp_result = 1.0f;
        float fraction = 1.0f / 10.0f;

        // Add 0.1 ten times; each addition is rounded to the nearest float
        float sum = 0.0f;
        for (int i = 0; i < 10; i++) {
            sum += fraction;
        }

        System.out.println("Expected Result: " + exp_result);
        System.out.println("Actual Sum     : " + sum);

        // Exact comparison of the expected and the calculated result
        if (exp_result == sum) {
            System.out.println("The expected result is equal to the calculated result");
        } else {
            System.out.println("The expected result is not equal to the calculated result");
        }
    }
}

Here, we append f or F to a literal to make it a float, as in the declarations of exp_result and fraction. We then add fraction to the variable sum ten times in a for loop. Finally, the expected and actual results are compared in the if statement.

Note: The comparison still fails if we change the variable types from float to double. Only the printed sum differs. Go ahead and try it yourself!

The output shows that the actual sum, $1.0000001$, differs from the expected result, $1.0$. This is because $0.1$ has no exact binary representation: the stored fraction is already slightly off, every addition is rounded again, and these rounding errors accumulate.
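
One way to see where the error comes from is to print the exact value that is stored for $0.1$. The sketch below uses `java.math.BigDecimal`, whose `BigDecimal(double)` constructor converts a binary floating-point value to decimal without rounding. The class and method names are made up for illustration.

```java
import java.math.BigDecimal;

final class ExactValueDemo {
    /** Exact decimal expansion of the value stored in a float. */
    static String exactFloat(float value) {
        // The float is widened to double without loss before conversion.
        return new BigDecimal(value).toPlainString();
    }

    /** Exact decimal expansion of the value stored in a double. */
    static String exactDouble(double value) {
        return new BigDecimal(value).toPlainString();
    }

    /** Adds term to a float accumulator the given number of times. */
    static float sumFloat(float term, int times) {
        float sum = 0.0f;
        for (int i = 0; i < times; i++) {
            sum += term; // each addition rounds to the nearest float
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(exactFloat(0.1f));
        // 0.100000001490116119384765625
        System.out.println(exactDouble(0.1));
        // 0.1000000000000000055511151231257827021181583404541015625
        System.out.println(sumFloat(0.1f, 10));
        // 1.0000001
    }
}
```

Both types store a value slightly above $0.1$. The double is much closer, but it is still not exact.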

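Using tolerance#

Instead of testing floating-point values for exact equality, we can check whether their difference is smaller than a small tolerance:

Java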

public class Main {
    public static void main(String[] args) {
        float exp_result = 1.0f;
        float fraction = 1.0f / 10.0f;
        // Largest difference that we still accept as "equal"
        float tolerance = 0.000001f;

        // Add 0.1 ten times; each addition is rounded to the nearest float
        float sum = 0.0f;
        for (int i = 0; i < 10; i++) {
            sum += fraction;
        }

        System.out.println("Expected Result: " + exp_result);
        System.out.println("Actual Sum     : " + sum);

        // Compare the absolute difference with the tolerance instead of using ==
        if (Math.abs(exp_result - sum) < tolerance) {
            System.out.println("The expected result is equal to the calculated result");
        } else {
            System.out.println("The expected result is not equal to the calculated result");
        }
    }
}
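
A fixed tolerance such as $0.000001$ works well for values near $1$, but it is too strict for large values, where neighboring floats are much further apart. A common variation, sketched below, also accepts a difference that is small relative to the magnitude of the operands. The helper `nearlyEqual` and its parameters are hypothetical and not part of the Java standard library, and suitable tolerance values depend on the application.

```java
final class ToleranceDemo {
    /**
     * Returns true if a and b differ by at most absTol, or by at most
     * relTol times the larger of their magnitudes.
     */
    static boolean nearlyEqual(float a, float b, float absTol, float relTol) {
        if (a == b) {
            return true; // identical values, including equal infinities
        }
        float diff = Math.abs(a - b);
        float scale = Math.max(Math.abs(a), Math.abs(b));
        return diff <= Math.max(absTol, relTol * scale);
    }

    public static void main(String[] args) {
        float sum = 0.0f;
        for (int i = 0; i < 10; i++) {
            sum += 0.1f;
        }
        System.out.println(nearlyEqual(1.0f, sum, 1e-6f, 1e-6f)); // true

        // Two neighboring floats near 100 million are 8 apart.
        float big = 1.0e8f;
        float next = Math.nextUp(big);
        System.out.println(Math.abs(next - big) < 1e-6f);          // false
        System.out.println(nearlyEqual(big, next, 1e-6f, 1e-6f));  // true
    }
}
```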


Bit allocation in float and double (IEEE 754):

Field	Float (bits)	Double (bits)
Sign	1	1
Exponent	8	11
Mantissa	23	52
Total	32	64
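
These field widths can be checked on real values with `Float.floatToIntBits` and `Double.doubleToLongBits`, which return the raw IEEE 754 bit pattern. The sketch below splits the pattern into its three fields. The class and method names are made up for illustration. Keep in mind that the exponent field is stored with a bias (127 for float, 1023 for double) and that the leading 1 of the mantissa is implicit for normal numbers, which differs from the simplified 16-bit example earlier.

```java
final class BitLayoutDemo {
    // float: 1 sign bit, 8 exponent bits, 23 mantissa bits
    static int floatSign(float v) {
        return Float.floatToIntBits(v) >>> 31;
    }

    static int floatExponent(float v) {
        return (Float.floatToIntBits(v) >>> 23) & 0xFF;
    }

    static int floatMantissa(float v) {
        return Float.floatToIntBits(v) & 0x7FFFFF;
    }

    // double: 1 sign bit, 11 exponent bits, 52 mantissa bits
    static int doubleSign(double v) {
        return (int) (Double.doubleToLongBits(v) >>> 63);
    }

    static int doubleExponent(double v) {
        return (int) ((Double.doubleToLongBits(v) >>> 52) & 0x7FF);
    }

    static long doubleMantissa(double v) {
        return Double.doubleToLongBits(v) & ((1L << 52) - 1);
    }

    public static void main(String[] args) {
        // -0.15625 = -1.25 * 2^-3, so the exponent field is -3 + 127 = 124
        // and the mantissa stores the fraction .01 (binary) in its top bits.
        float f = -0.15625f;
        System.out.println(floatSign(f));                          // 1
        System.out.println(floatExponent(f));                      // 124
        System.out.println(Integer.toHexString(floatMantissa(f))); // 200000

        // 3.1416 = 1.5708 * 2^1, so the exponent fields are 1 + bias.
        System.out.println(floatExponent(3.1416f));  // 128
        System.out.println(doubleExponent(3.1416));  // 1024
    }
}
```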
