Numbers can be fixed point or floating point, and either positive or negative, so the overall classification of number representation methods is as follows:
In digital architecture, fixed-point representation is a method of storing numbers in binary format. It is widely used in DSP products for telecommunications. One reason to use fixed-point format (rather than floating point) is the cost savings in the digital signal processing chips designed to implement a system.
We have the following hierarchy of representations in fixed point numbers:
A fixed-point number is simply a real number whose radix point sits at a fixed position. In the simplest case, described here, the point is placed at the extreme right of the number, so the stored value is an integer. We use a register large enough to store numbers of this kind.
A negative number can be represented in any of the following three ways:
In the signed magnitude representation, we again use a register large enough to hold every bit of the number. The MSB position in the register works as the sign bit: it is "1" for a negative number and "0" for a positive one. The remaining bit positions hold the magnitude of the number in ordinary (positive) binary form.
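The sign-bit-plus-magnitude scheme described above can be sketched in a few lines of Python. This is only an illustration: the function name `sign_magnitude` and the default 8-bit register width are assumptions made for the example, not part of the original text.

```python
def sign_magnitude(value: int, bits: int = 8) -> str:
    """Return the bit pattern of value using a sign bit followed by the magnitude."""
    magnitude = abs(value)
    # One bit is reserved for the sign, so the magnitude has bits - 1 positions.
    if magnitude >= 1 << (bits - 1):
        raise ValueError("magnitude does not fit in the register")
    sign = "1" if value < 0 else "0"
    return sign + format(magnitude, f"0{bits - 1}b")


print(sign_magnitude(5))   # 00000101
print(sign_magnitude(-5))  # 10000101
```

Only the leftmost bit differs between +5 and -5; the other seven bits are the plain binary value of 5.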
In the 1's complement representation, we again use a register large enough to hold the number, and the MSB position in the register again works as the sign bit. However, the number itself is not stored in its ordinary form. Instead, it is stored in its 1's complement form.
Thus all we have to do is compute the 1's complement of the number's n-bit positive form, so that it fits an n-bit register, and place each bit in its respective position in the register from right to left. Since the MSB is always "0" for a positive number, the 1's complement turns it into "1", which marks the number as negative.
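As a rough sketch of the procedure above, the following Python function inverts every bit of the n-bit positive form to obtain the pattern for a negative number. The name `ones_complement` and the 8-bit default width are assumptions chosen for illustration.

```python
def ones_complement(value: int, bits: int = 8) -> str:
    """Return the 1's complement bit pattern of value in an n-bit register."""
    magnitude = abs(value)
    if magnitude >= 1 << (bits - 1):
        raise ValueError("value does not fit in the register")
    if value >= 0:
        # Positive numbers are stored unchanged, with MSB 0.
        return format(magnitude, f"0{bits}b")
    mask = (1 << bits) - 1  # n ones, e.g. 11111111 for 8 bits
    # XOR with all ones flips every bit, including the MSB.
    return format(magnitude ^ mask, f"0{bits}b")


print(ones_complement(5))   # 00000101
print(ones_complement(-5))  # 11111010
```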
In the 2's complement representation, we again use a register large enough to hold the number, and the MSB position in the register again works as the sign bit. However, the number itself is stored neither in its ordinary form nor in its 1's complement form. Instead, it is stored in its 2's complement form.
Thus all we have to do is compute the 2's complement of the number's n-bit positive form, so that it fits an n-bit register, and place each bit in its respective position in the register from right to left. Since the MSB is always "0" for a positive number, the 2's complement of a nonzero number has "1" in that position, which marks the number as negative.
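The same idea can be sketched for 2's complement, using the common "invert every bit, then add one" rule. The function name `twos_complement` and the 8-bit default width are assumptions for the example.

```python
def twos_complement(value: int, bits: int = 8) -> str:
    """Return the 2's complement bit pattern of value in an n-bit register."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= value <= high:
        raise ValueError("value does not fit in the register")
    if value >= 0:
        # Positive numbers are stored unchanged, with MSB 0.
        return format(value, f"0{bits}b")
    mask = (1 << bits) - 1
    # Invert every bit of the positive form, add 1, and keep n bits.
    pattern = ((abs(value) ^ mask) + 1) & mask
    return format(pattern, f"0{bits}b")


print(twos_complement(5))   # 00000101
print(twos_complement(-5))  # 11111011
```

Compared with the 1's complement pattern for -5 (11111010), the 2's complement pattern is exactly one greater.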
A floating-point number (or real number) can represent a very large (1.23x10^{88}) or a very small (1.23x10^{-88}) value. It can also represent a very large negative number (-1.23x10^{88}) and a very small negative number (-1.23x10^{-88}), as well as zero, as illustrated:
A floating-point number is typically expressed in scientific notation, with a fraction [ m ] and an exponent [ e ] of a certain radix [ r ], in the form [ m x r^{e} ]. Decimal numbers use a radix of 10 [ m x 10^{e} ], while binary numbers use a radix of 2 [ m x 2^{e} ].
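To make the [ m x 2^{e} ] form concrete, here is a small Python sketch that splits a value into a fraction m and an exponent e for radix 2. It relies on the standard library function `math.frexp`, which returns a fraction with 0.5 <= |m| < 1; the helper name `normalize_binary` and the choice to rescale to 1 <= |m| < 2 are assumptions made for illustration.

```python
import math


def normalize_binary(x: float):
    """Return (m, e) such that x == m * 2**e and 1 <= |m| < 2."""
    if x == 0:
        raise ValueError("zero has no normalized form")
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    # Shift the radix point one place so the fraction starts with 1.
    return m * 2, e - 1


print(normalize_binary(6.5))       # (1.625, 2)
print(normalize_binary(-0.15625))  # (-1.25, -3)
```

So 6.5 is 1.625 x 2^{2}, and -0.15625 is -1.25 x 2^{-3}.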
Normally, the IEEE 754 standard is used to represent floating-point numbers in 32 bits (single precision). In this floating-point representation:
NOTE: The reason for adding 127 to the exponent is worth explaining. During normalization the radix point may have to move in either direction: moving it to the left produces a positive exponent, while moving it to the right produces a negative exponent. To make sure that the exponent stored in the register is always non-negative, we add a bias of 127 ( 2^{8-1} - 1 ) to the exponent produced by normalization.
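The effect of the bias can be observed by unpacking a 32-bit IEEE 754 value into its fields. The sketch below uses Python's standard `struct` module and the standard single-precision layout (1 sign bit, 8 exponent bits, 23 fraction bits); the function name `float32_fields` is an assumption for the example, and the interpretation in the comments applies to normalized numbers only.

```python
import struct


def float32_fields(x: float):
    """Return (sign, stored_exponent, fraction) of x encoded as IEEE 754 single precision."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned integer.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    sign = raw >> 31                     # 1 bit
    stored_exponent = (raw >> 23) & 0xFF  # 8 bits, already biased by 127
    fraction = raw & 0x7FFFFF            # 23 bits after the implicit leading 1
    return sign, stored_exponent, fraction


sign, stored_exponent, fraction = float32_fields(-6.5)
print(sign)                      # 1
print(stored_exponent)           # 129
print(stored_exponent - 127)     # 2
print(format(fraction, "023b"))  # 10100000000000000000000
```

Here -6.5 is -1.625 x 2^{2}: the true exponent 2 is stored as 2 + 127 = 129, and the fraction bits 101... encode the .625 that follows the implicit leading 1.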