A computer works with two kinds of numeric values: fixed point and
floating point. Fixed point values are stored in memory directly in
binary. A character is stored the same way, as the binary form of its ASCII code.

Example:-

‘A’ can be stored as 1000001, because 65 is the ASCII value of ‘A’.
Floating point values are stored in memory according to the IEEE 754
standard. Whenever a program declares

*float a;*

the value of the variable ‘a’ is stored in memory following the IEEE 754 standard.
The standard specifies a single precision and a double precision format.
In C, C++ and Java, the *float* and *double* data types correspond to single and double precision, which require 32 bits (4 bytes) and 64 bits (8 bytes) respectively to store the data.
Let's have a look at these precision formats.

Single Precision:-

It requires 32 bits to store. Following is the format of single
precision.

Sign – 1 bit

Exponent – 8 bits

Mantissa – 23 bits

In order to store a float value in computer memory, a specific
algorithm is followed.

Let's take an example of the float value 3948.125.

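- Convert 3948 to binary, i.e. 111101101100
- Convert .125 to binary by repeatedly multiplying by 2 and noting the integer part each time:

0.125 x 2 = 0.25, integer part 0

0.25 x 2 = 0.5, integer part 0

0.5 x 2 = 1.0, integer part 1

Reading the integer parts from top to bottom gives .125 = 0.001 in binary.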

3948.125 = 111101101100.001

- Normalize the number so that the binary point is placed just after the most significant 1, i.e.

111101101100.001
= 1.11101101100001 x 2^{11}

- Now, for this number, s = 0, as the number is positive.

exponent' = 11 and

mantissa = 11101101100001

- The bias used for single precision is 127, so

exponent = exponent' + 127, i.e.

11 + 127 = 138 = 10001010 in binary.

- Final value-

0 10001010 11101101100001000000000

(sign, exponent, and mantissa padded with zeros to 23 bits)

In this format the number 3948.125 will be stored in main memory.

For double precision values, the following changes are expected:

Total bits required – 64

Exponent – 11 bits

Mantissa – 52 bits

Bias value – 1023

If you want to find the IEEE 754 representation of any floating point
number, the following program can be used.

```c
#include <stdio.h>

/* Print the lowest i bits of n, most significant bit first. */
void binary(unsigned int n, int i)
{
    unsigned int k;
    for (i--; i >= 0; i--)
    {
        k = n >> i;
        if (k & 1)
            printf("1");
        else
            printf("0");
    }
}

/* The float and the bit-field structure share the same 4 bytes.
   The order of bit fields is implementation-defined; this layout
   (mantissa first) matches common little-endian compilers. */
typedef union
{
    float f;
    struct
    {
        unsigned int mantissa : 23;
        unsigned int exponent : 8;
        unsigned int sign : 1;
    } field;
} myfloat;

int main()
{
    myfloat var;
    printf("Enter any float number: ");
    scanf("%f", &var.f);
    printf("%d ", var.field.sign);
    binary(var.field.exponent, 8);
    printf(" ");
    binary(var.field.mantissa, 23);
    printf("\n");
    return 0;
}
```

Explanation-

The function binary( ) converts the number ‘n’ into binary
format and prints its lowest ‘i’ bits.

In C, structure members can be declared with their size given in bits. Such
members are known as *bit fields*. In ‘*union myfloat*’, ‘*float f*’ shares its memory with the internal structure, whose members use 23 bits for the mantissa, 8 bits for the exponent and 1 bit for the sign. The variable ‘*var*’ is of type *myfloat*, so the mantissa can be accessed as ‘*var.field.mantissa*’. Here, *field* is the name of the internal structure. In this way the bits of a float value can be accessed separately as *sign*, *exponent* and *mantissa*.

Run the program and see the output for the said example!

