Largest number that can be stored in a floating word of 7 bits

QUESTION: What is the largest base-10 positive number that can be stored using 7 bits, where the 1st bit is used for the sign of the number; the 2nd bit for sign of the exponent; 3 bits for mantissa, and the rest of the bits for the exponent?

ANSWER: Remember the base is 2.
The 1st bit must be zero because the number is positive.

The 2nd bit must be zero because that makes the exponent positive, and 2 raised to a positive number is larger than 2 raised to a negative number.

The mantissa bits must be 111 because you are looking for the largest number. That makes the mantissa 1.111 in base 2 (the 1 before the radix point is automatic), which is 1*2^0+1*2^(-1)+1*2^(-2)+1*2^(-3)=1.875 in base 10.

Now the exponent: it uses the remaining 2 bits. These must be 11 in base 2, which is 3 in base 10. So the exponent part is 2^(+3)=8.

The largest number is +1.875*8=15.
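The arithmetic above can be checked with a short Python sketch. The function name decode_sign_exponent_word is made up for this illustration, and the sketch assumes every bit pattern is a normalized number with an implied leading 1 (no pattern is reserved for zero or other special values).

```python
def decode_sign_exponent_word(bits: str) -> float:
    """Decode a 7-bit word laid out as:
    sign | exponent sign | 3 mantissa bits | 2 exponent bits.
    Assumption: every pattern is a normalized number (implied leading 1)."""
    if len(bits) != 7 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of exactly 7 binary digits")
    sign = -1.0 if bits[0] == "1" else 1.0
    exponent_sign = -1 if bits[1] == "1" else 1
    # Implied leading 1; the three stored bits weigh 2^-1, 2^-2 and 2^-3.
    mantissa = 1.0 + int(bits[2:5], 2) / 8.0
    exponent = exponent_sign * int(bits[5:7], 2)
    return sign * mantissa * 2.0 ** exponent


# The word chosen in the answer: 0 | 0 | 111 | 11
print(decode_sign_exponent_word("0011111"))  # 15.0

# Brute-force check over all 128 possible words.
all_values = [decode_sign_exponent_word(format(i, "07b")) for i in range(128)]
print(max(all_values))  # 15.0
```

The brute-force loop is only practical because the word is so short, but it confirms that no other bit pattern gives a larger value.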

Now think what will give you the smallest positive number.

_______________________________________________

A Floating Point Question Revisited

QUESTION: A machine stores floating point numbers in 7-bit word. The first bit is stored for the sign of the number, the next three for the biased exponent and the next three for the magnitude of the mantissa. You are asked to represent 33.35 in the above word. The error you will get in this case would be
(A) underflow
(B) overflow
(C) NaN
(D) No error will be registered

The solution to the problem is given here.

However, a student asked me a follow-up question, and here is the answer.

QUESTION: I was doing the multiple choice question and I am having trouble understanding it. I looked at the solution but I am still having trouble. I began by turning 33.35 into binary and I get 100001.01011. I am just having trouble putting it into the format. The max exponent value is 4 in this case, but the solution says you need 5. Maybe I do not understand exactly what underflow and overflow are.

ANSWER: The solution is given as you have pointed out.

The binary number in fixed format needs to be converted to floating point format. That would be 100001.01011=1.0000101011*2^5, as you move the radix point 5 places to the left. We move it 5 places because that leaves exactly one non-zero digit to the left of the radix point. This is no different from the procedure you use for converting a base-10 number from decimal format to scientific format.
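The normalization step can be sketched in Python as follows. The helper name normalize_base2 is made up for this illustration; it relies on the standard library function math.frexp, which returns a fraction in [0.5, 1), so the fraction is doubled and the exponent reduced by one to get a mantissa in [1, 2).

```python
import math
from typing import Tuple


def normalize_base2(x: float) -> Tuple[float, int]:
    """Return (m, e) such that x == m * 2**e and 1 <= m < 2, for x > 0."""
    if x <= 0:
        raise ValueError("x must be positive")
    # math.frexp gives x == fraction * 2**exponent with 0.5 <= fraction < 1.
    fraction, exponent = math.frexp(x)
    return fraction * 2.0, exponent - 1


m, e = normalize_base2(33.35)
print(e)  # 5
print(1.0 <= m < 2.0)  # True
```

The exponent comes out as 5, which is the number of places the radix point was moved in the hand calculation.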

Now, every floating point format has an upper limit on the numbers it can represent.  Since the biased exponent has 3 bits, the biased exponent that can be represented is from 0 to 7, which means the unbiased exponent that can be represented is from -3 to 4 (biasing by +3, and unbiasing by -3).  But we need to represent an unbiased exponent of 5, and the maximum unbiased exponent that can be represented is 4, so it cannot be done.  The number is larger than the largest one that can be represented.  If you put 32 ounces of water in a 24-ounce cup, we say that the water overflowed.  In this case, the number will overflow as it is more than the word can handle.

You can also see this in a different way (looking at a solution from a second angle always helps the brain and your long-term memory).

The largest number you can represent in the given 7-bit word is 0111111. The biased exponent bits 111 are 7 in base 10, so the unbiased exponent is 7-3=4 (the 3 is used for unbiasing the exponent), and the word translates to (1.111)2*2^4, which in base 10 is 1.875*2^4=30.  Hence, 33.35 would overflow, just like when you put 32 ounces of water in a 24-ounce cup.
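Here is a short Python sketch of the same check. The function name decode_biased_word and the constant BIAS are made up for this illustration, and the sketch assumes, as the reasoning above does, that every bit pattern is a normalized number (no pattern is reserved for zero, infinity, or NaN).

```python
BIAS = 3  # from the explanation above: unbiased exponent = biased exponent - 3


def decode_biased_word(bits: str) -> float:
    """Decode a 7-bit word laid out as:
    sign | 3 biased-exponent bits | 3 mantissa bits.
    Assumption: every pattern is a normalized number (implied leading 1)."""
    if len(bits) != 7 or set(bits) - {"0", "1"}:
        raise ValueError("expected a string of exactly 7 binary digits")
    sign = -1.0 if bits[0] == "1" else 1.0
    exponent = int(bits[1:4], 2) - BIAS
    # Implied leading 1; the three stored bits weigh 2^-1, 2^-2 and 2^-3.
    mantissa = 1.0 + int(bits[4:7], 2) / 8.0
    return sign * mantissa * 2.0 ** exponent


largest = decode_biased_word("0111111")
print(largest)          # 30.0
print(33.35 > largest)  # True, so 33.35 overflows this format
```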

_____________________________________________________

Machine epsilon – Question 3 of 5

In the previous blog posts, we answered

Here we answer the next question.
Future posts will answer these questions:
Question 4 of 5: What is the significance of machine epsilon for a student in an introductory course in numerical methods?

Question 5 of 5: What is the proof that the absolute relative true error in representing a number on a machine is always less than the machine epsilon?
________________________

A Wolfram demo on converting a decimal number to floating point binary representation

Here is another Wolfram demo. This one converts a decimal number to a floating point binary representation.  To play with the demo, download the free CDF player first.
 
The total number of bits used for the representation =
       one bit for the sign of the number +
       one bit for the sign of the exponent +
       number of bits for the exponent +
       number of bits for the mantissa.
As an example, how would 54.75 be represented in a 9-bit register where the first bit is used for the sign of the number, the second bit is used for the sign of the exponent, the next three bits are used for the exponent, and the last four bits are used for the mantissa?
In binary, 54.75 = (110110.11)2 = (1.1011011)2*2^5, and the exponent 5 is (101)2. Both the number and the exponent are positive.
As the number is normalized to lie between 1 and 2 (the interval being closed at the bottom and open at the top), the leading binary digit is always 1. So we do not actually use it in the representation of the mantissa. Keeping the first four bits after the radix point, the mantissa bits are 1011. Moreover, the exponent bits are 101, the sign of the number bit is 0, and the sign of the exponent bit is 0.
Therefore the representation is 001011011.
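The 54.75 example can be reproduced with a short Python sketch. The function name encode_9bit is made up for this illustration, and the sketch assumes the extra mantissa bits are chopped (dropped) rather than rounded; for 54.75 both choices give the same four bits.

```python
import math


def encode_9bit(x: float) -> str:
    """Encode x as a 9-bit word laid out as:
    sign | exponent sign | 3 exponent bits | 4 mantissa bits.
    Assumption: mantissa bits beyond the fourth are chopped, not rounded."""
    if x == 0:
        raise ValueError("zero has no normalized representation in this sketch")
    sign_bit = "1" if x < 0 else "0"
    # math.frexp gives abs(x) == fraction * 2**exp with 0.5 <= fraction < 1;
    # rescale so that the mantissa lies in [1, 2).
    fraction, exp = math.frexp(abs(x))
    mantissa, exponent = fraction * 2.0, exp - 1
    if abs(exponent) > 7:
        raise ValueError("exponent does not fit in 3 bits")
    exponent_sign_bit = "1" if exponent < 0 else "0"
    exponent_bits = format(abs(exponent), "03b")
    # Drop the implied leading 1 and keep the first four fractional bits.
    mantissa_bits = format(int((mantissa - 1.0) * 16), "04b")
    return sign_bit + exponent_sign_bit + exponent_bits + mantissa_bits


print(encode_9bit(54.75))  # 001011011
```

Reading the output left to right gives the sign of the number (0), the sign of the exponent (0), the exponent (101), and the mantissa (1011).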
 

Reference: Floating Point Representation

This post is brought to you by

Holistic Numerical Methods: Numerical Methods for the STEM undergraduate at http://nm.mathforcollege.com, the textbook on Numerical Methods with Applications available from the lulu storefront, the textbook on Introduction to Programming Concepts Using MATLAB, and the YouTube video lectures available at http://nm.mathforcollege.com/videos.  Subscribe to the blog via a reader or email to stay updated with this blog. Let the information follow you.

Converting large numbers into floating point format by hand
