Round off errors and the Patriot missile

Twenty-eight Americans were killed on February 25, 1991 when an Iraqi Scud hit the Army barracks in Dhahran, Saudi Arabia. The Patriot defense system had failed to track and intercept the Scud. What was the cause for this failure?

The Patriot defense system consists of an electronic detection device called the range gate. It calculates the area in the air space where it should look for the target such as a Scud. To find out where the Patriot missile should be next, it calculates its location based on the velocity of the Scud and the last time the radar detected the Scud.

In the Patriot missile, time was saved in a fixed point register that had a length of 24 bits. Since the internal clock of the system is measured every one-tenth of a second, 1/10 expressed in a 24 bit fixed point register is 0.0001100110011001100110011 (the exact value of the representatPatriot missileion 0.0001100110011001100110011 of 1/10 in the 24-fixed point register is 209715/2097152) . As we can see that this is not an exact representation of 1/10. It would take infinite numbers of bits to represent 1/10 exactly. So, the error in the representation is (1/10-209715/2097152) which is approximately 9.5E-8 seconds.

On the day of the mishap, the battery on the Patriot missile was left on for 100 consecutive hours, hence causing an inaccuracy of 9.5E-8x10x60x60x100=0.34 seconds (10 clock cycles in a second, 60 seconds in a minute, 60 minutes in an hour).

The shift calculated in the range gate due to the error of 0.342 seconds was calculated as 687m. For the Patriot missile defense system, the target is considered out of range if the shift is more than than 137m. The shift of larger than 137m resulted in the Scud not being targeted and hence killing 28 Americans in the barracks of Saudi Arabia.Scud-B missile

When I started looking at the Google search results of the problem, I found some very useful resources that would be of interest to the reader. These go beyond the above given simplistic explanation of the problem and tell the story behind the story. Here they are

  1. This reference is the full GAO report of the investigation that resulted after the accident. “Patriot Missile Defense – Software Problem Led to System Failure at Dhahran, Saudi Arabia”, GAO Report, General Accounting Office, Washington DC, February 4, 1992.
  2. It should be pointed out that the Patriot missile was originally designed to be a mobile system and not used as a anti-ballistic system. In mobile systems, the clocks are reset more often. As per the article Operations: I Did Not Say You Could Do That! by Bill Barnes and Duke McMillin, here are some important observations: “It turns out that the original use case for this system was to be mobile and to defend against aircraft that move much more slowly than ballistic missiles. Because the system was intended to be mobile, it was expected that the computer would be periodically rebooted. In this way, any clock-drift error would not be propagated over extended periods and would not cause significant errors in range calculation. Because the Patriot system was not intended to run for extended times, it was probably never tested under those conditions—explaining why the problem was not discovered until the war was in progress. The fact that the system was also designed as an antiaircraft system probably also enabled the inclusion of such a design flaw, because slower-moving airplanes would be easier to track and, therefore, less dependent upon a highly accurate clock value.”

A student asked me why we did not use a clock cycle that could be represented exactly in the 24 bit register. Close to 1/10 is a number 0.125 that can be represented exactly as 0.001000000000000000000000 in a 24-bit register, and where 8 clock cycles would be equal to 1 second. I do not have an answer to this question but I intend to find out from my computer science colleagues.

This post brought to you by Holistic Numerical Methods: Numerical Methods for the STEM undergraduate at http://nm.mathforcollege.com

Myth: Error caused by chopping a number is called truncation error

Round off error is the error caused by approximate representation of numbers.

When people talk about round off error, it is the error between the number and its representation. For example 200/3 would be represented as 66.6667 in a six significant digit computer that rounds off the last digit. The last digit has been rounded up from 6 to a 7. The difference between 200/3 and 66.6667, that is, 200/3-66.6667 is the round off error.

If a computer is chopping off as opposed to rounding the last digit, the error caused is still called the round off error (caused by chopping). If a computer is using chopping, then for example, 200/3 would be represented as 66.6666 in a six significant digit computer. The difference between 200/3 and 66.6666, that is, 200/3-66.6666 is the round off error.

Where does the myth come from? Because if one is chopping off the number, students think that we are truncating a number, and hence the resulting error should be truncation error. No! No! That is still round off error. As a side note, there is something called truncating a number – a number if truncated is just the integer part of the number (example: truncating 20.568 gives 20; truncating 20.03 gives 20).

So what then is truncation error?

Truncation error is error caused by truncating a mathematical procedure.

Examples of truncation error abound and include

  1. In exact differentiation, you need dx approaching zero; in numerical differentiation we can only choose dx=finite.
  2. In exact integration, one would need infinite number of trapezoids to find the integral; in numerical integration, we can only choose a finite number of trapezoids.
  3. In the Maclaurin series for transcendental and trigonometric functions, we need infinite number of terms for exact solution; in a numerical solution, we can only choose finite number of terms.

So let us get this straight – round off error is caused by representing numbers approximately; truncation error is caused by approximating mathematical procedures.

For more details, read the textbook chapter on Sources of Error in Numerical Methods at http://nm.mathforcollege.com

This post is brought to you by Holistic Numerical Methods: Numerical Methods for the STEM undergraduate at http://nm.mathforcollege.com and the textbook on Numerical Methods with Applications available from the lulu storefront.

Subscribe to the blog via a reader or email to stay updated with this blog. Let the information follow you.