Real Numbers

>> Wednesday, October 7, 2009

How are real numbers represented in a computer?

To be used in instructions of the Intel processor (the arithmetic coprocessor[3]), real numbers must be stored in computer memory in normalized form. In general, the normalized form of a number looks like this:


  • A = (NS) × M × N^q

Here, NS designates the sign of the number; M stands for the mantissa, which in the binary formats described below satisfies 1 ≤ M < 2; N is the base of the numeral system; and q is the exponent, which may be positive or negative. Numbers represented this way are often called floating-point numbers.

Consider a practical example: represent 5.75 in normalized form. First, convert the number to binary notation. This is straightforward: 5 in binary is 101, and 0.75 equals (1/2) + (1/4). In other words, 5.75 = 101.11B. Furthermore, 101.11B = 1.0111B × 2^2. Thus, the normalized number comprises the following components: NS = +1, M = 1.0111B, N = 2, and q = 2. Note that with this representation the first digit of the mantissa always equals one; consequently, it does not have to be stored. The short and long Intel formats rely on this possibility. In addition, bear in mind that the q exponent is stored in memory as a sum with a fixed number (the bias), which ensures that the stored value is always positive. The Intel processor can work with the following three types of real numbers:


  • Short real number: 32 bits are allocated. Bits 0-22 hold the mantissa (without its leading one). Bits 23-30 hold the q exponent added to 127. The last bit, bit 31, holds the sign (if this bit is set to one, the number is negative; otherwise, it is positive).


  • Long real number: 64 bits are allocated. Bits 0-51 hold the mantissa (without its leading one). Bits 52-62 hold the q exponent added to 1023. The last bit, bit 63, holds the sign (if this bit is set to one, the number is negative; otherwise, it is positive).


  • Extended real number: 80 bits are allocated. Bits 0-63 hold the mantissa; unlike the two shorter formats, the leading one is stored explicitly, in bit 63. Bits 64-78 hold the q exponent added to 16,383. The last bit, bit 79, holds the sign (if this bit is set to one, the number is negative; otherwise, it is positive).
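The normalized form and the short and long layouts can be checked with a few lines of C. What follows is only a sketch: the names normalize_binary, split_short_real, and split_long_real are made up for this illustration, and the code assumes that float and double are the 32-bit and 64-bit formats described above (true on Intel processors, but not guaranteed by the C standard). The extended format is left out because long double is not 80 bits wide with every compiler.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Split a finite, nonzero x into M and q so that |x| = M * 2^q, 1 <= M < 2. */
double normalize_binary(double x, int *q)
{
    int e;
    double m = frexp(fabs(x), &e);   /* frexp returns 0.5 <= m < 1 */
    *q = e - 1;                      /* rescale so that 1 <= M < 2 */
    return m * 2.0;
}

/* Fields of a short real (assumes float is a 32-bit IEEE 754 number). */
typedef struct {
    unsigned sign;       /* bit 31 */
    unsigned exponent;   /* bits 23-30, stored as q + 127 */
    uint32_t fraction;   /* bits 0-22, leading one not stored */
} short_real_fields;

short_real_fields split_short_real(float value)
{
    uint32_t bits;
    short_real_fields f;
    memcpy(&bits, &value, sizeof bits);   /* reinterpret without aliasing problems */
    f.sign = bits >> 31;
    f.exponent = (bits >> 23) & 0xFFu;
    f.fraction = bits & 0x7FFFFFu;
    return f;
}

/* Fields of a long real (assumes double is a 64-bit IEEE 754 number). */
typedef struct {
    unsigned sign;       /* bit 63 */
    unsigned exponent;   /* bits 52-62, stored as q + 1023 */
    uint64_t fraction;   /* bits 0-51, leading one not stored */
} long_real_fields;

long_real_fields split_long_real(double value)
{
    uint64_t bits;
    long_real_fields f;
    memcpy(&bits, &value, sizeof bits);
    f.sign = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FFu);
    f.fraction = bits & (((uint64_t)1 << 52) - 1);
    return f;
}
```

For 5.75, normalize_binary gives M = 1.4375 (1.0111B) and q = 2, so the stored exponent should be 2 + 127 = 129 in the short format and 2 + 1023 = 1025 in the long one.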

Consider a practical example illustrating how a floating-point number is represented in memory. Assume that the following variable is declared in a program written in C:

float a = -80.5;

The float type corresponds to the short real number, so its memory representation takes 32 bits. Now, view the memory that the variable occupies, for example in a debugger. Here are the 4 bytes that represent the number, listed from the lowest address to the highest:

00 00 a1 c2

To make these bytes easier to interpret, convert them to binary:

00000000 00000000 10100001 11000010

Intel processors store the least significant byte at the lowest address, so rewrite the bytes starting from the most significant one to expose the sign, exponent, and mantissa:

11000010 10100001 00000000 00000000
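The same views of the variable can be produced from C. This is a minimal sketch under the assumption that float is the 32-bit short real; the helper names float_bits, memory_bytes_hex, and bits_to_binary are invented for the example.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Copy the 32 bits of a float into an integer (assumes a 32-bit float). */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Format the four bytes in memory order, lowest address first.
   The result depends on the byte order of the machine. */
char *memory_bytes_hex(float value, char out[12])
{
    unsigned char bytes[4];
    memcpy(bytes, &value, sizeof bytes);
    snprintf(out, 12, "%02x %02x %02x %02x",
             bytes[0], bytes[1], bytes[2], bytes[3]);
    return out;
}

/* Format the bits from bit 31 down to bit 0, with a space between bytes.
   The buffer must hold 32 digits, 3 spaces, and the terminator. */
char *bits_to_binary(uint32_t bits, char out[36])
{
    int pos = 0;
    for (int i = 31; i >= 0; --i) {
        out[pos++] = ((bits >> i) & 1u) ? '1' : '0';
        if (i % 8 == 0 && i != 0)
            out[pos++] = ' ';
    }
    out[pos] = '\0';
    return out;
}
```

On a little-endian Intel machine, memory_bytes_hex(-80.5f, buf) should give 00 00 a1 c2, while bits_to_binary(float_bits(-80.5f), buf) gives the most-significant-first form regardless of byte order.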

Now, separate the mantissa. Recall that 23 bits are allocated for storing it (bits 0-22). Reading them from the most significant one (in this case, bit 22) gives the binary number 0100001. The trailing zeros are discarded because the whole stored mantissa lies to the right of the binary point. However, this number doesn't represent the mantissa exactly. As already mentioned, the first digit of the mantissa is always equal to one; consequently, it is not stored and must be restored when decoding the Intel representation. Therefore, the mantissa is 1.0100001B.

The sign of the whole number is negative because bit 31 is set to one. The exponent is obtained from bits 23-30, which contain the binary number 10000101B, or 133 in decimal. To obtain the exponent of a short real number, subtract 127 from this value; the result is 6. Thus, to obtain the magnitude of the number from the mantissa, shift the binary point six positions to the right. The result is 1010000.1B. In hex notation, this is 50.8H, or 80.5 in decimal. Taking the sign into account, the number is -80.5.
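The decoding steps above can be written down as a short C function. This is a sketch rather than a complete decoder: decode_short_real is a name chosen for this example, and it handles only normalized numbers (a stored exponent of 0 or 255 marks zeros, denormals, infinities, and NaNs, which are not covered here).

```c
#include <math.h>
#include <stdint.h>

/* Rebuild the value of a normalized short real from its 32-bit pattern. */
double decode_short_real(uint32_t bits)
{
    int negative = (bits >> 31) != 0;                  /* bit 31 */
    int stored_exponent = (int)((bits >> 23) & 0xFFu); /* bits 23-30 */
    uint32_t fraction = bits & 0x7FFFFFu;              /* bits 0-22 */

    /* Restore the leading one that is not stored: mantissa = 1.fraction */
    double mantissa = 1.0 + (double)fraction / 8388608.0;   /* 2^23 */

    /* Remove the bias of 127 and shift the binary point accordingly. */
    double magnitude = ldexp(mantissa, stored_exponent - 127);

    return negative ? -magnitude : magnitude;
}
```

Calling decode_short_real(0xC2A10000) follows the same path as the manual calculation: the mantissa is 1.0100001B, the exponent is 133 - 127 = 6, and the result is -80.5.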

For hands-on practice, consider the following sequence of bytes:

00 80 fb 42

Try to prove that this sequence of bytes corresponds to the representation of 125.75.
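After working through the exercise by hand, the answer can be checked with a small C helper. This is a sketch under the same assumption as before (float is the 32-bit short real); short_real_from_le_bytes is a hypothetical name, and it takes the bytes in the order they appear in Intel memory, least significant first.

```c
#include <stdint.h>
#include <string.h>

/* Build a float from four bytes given in little-endian (Intel) memory order. */
float short_real_from_le_bytes(const unsigned char bytes[4])
{
    /* Assemble the 32-bit pattern explicitly so that the result does not
       depend on the byte order of the machine running this code. */
    uint32_t bits = (uint32_t)bytes[0]
                  | ((uint32_t)bytes[1] << 8)
                  | ((uint32_t)bytes[2] << 16)
                  | ((uint32_t)bytes[3] << 24);
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Passing the bytes 0x00, 0x80, 0xFB, 0x42 to this function should return the value named in the exercise.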

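On the basis of the material in this section, it is possible to conclude that real numbers used in a program might become approximate before any operations are carried out on them. This is because every real number must be converted to a normalized form with a fixed number of mantissa bits before it can be written into memory. A value that needs more bits than are available (for example, 0.1, whose binary expansion is infinite) is rounded to the nearest representable number.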
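A quick way to see this effect is to push a value through the short real format and compare it with the original. The two functions below are an illustrative sketch with made-up names; they use double as the reference, which is itself only an approximation of most decimal fractions.

```c
/* Return 1 if x keeps its exact value when stored as a short real (float). */
int survives_short_real(double x)
{
    return (double)(float)x == x;
}

/* Difference introduced by storing x as a short real. */
double short_real_error(double x)
{
    return (double)(float)x - x;
}
```

Values such as 80.5 and 125.75 need only a few mantissa bits, so they survive unchanged, whereas 0.1 does not fit in 23 bits and comes back slightly different.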
