
The Evils of Floating Point, and the Joys of Unum

It may come as a surprise to many that the way computers handle numbers is not very accurate. Indeed, it can be said that error is built into the very foundation of digital computers. While end users often do not see the results of these errors, they can be very problematic for programmers, scientists, engineers, and calculation-intensive fields such as money management and military operations.

At the recent Supercomputing Frontiers 2015 conference in Singapore, computer scientist John Gustafson outlined the problems with floating point arithmetic in his keynote and later in an interview. Given the complexity and severity of the problem, it’s worth taking a second, in-depth look at the issue.

The Problem

Developer Richard Harris, who wrote a series of articles on the dangers of floating point, said in one post, “The dragon of numerical error is not often roused from his slumber, but if incautiously approached he will occasionally inflict catastrophic damage upon the unwary programmer’s calculations. So much so that some programmers, having chanced upon him in the forests of IEEE 754 floating point arithmetic, advise their fellows against travelling in that fair land.”

Because computers, which are machines of precision and exactness, are often made to deal with numbers that cannot be written exactly in a finite number of digits (such as pi and other irrationals), methods must be devised to keep computational error small and the end result as close to the correct answer as possible. One such method, still in use today, is floating point. Floating point is similar to scientific notation: a number is represented by a sign bit, a fixed number of significant digits (the significand), and an exponent that determines where the binary point falls.
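To make the scientific-notation analogy concrete, here is a minimal Python sketch that splits a 64-bit double into its three fields and rebuilds the value from them. The helper names decompose and reconstruct are invented for this illustration, and the sketch assumes a normal (not zero, subnormal, infinite, or NaN) IEEE 754 double.

```python
import struct

def decompose(x: float):
    """Split a 64-bit IEEE 754 double into (sign, biased exponent, fraction)."""
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                     # 1 bit
    exponent = (bits >> 52) & 0x7FF       # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)     # 52 bits after the implicit leading 1
    return sign, exponent, fraction

def reconstruct(sign: int, exponent: int, fraction: int) -> float:
    """Rebuild a normal double: (-1)^sign * 1.fraction * 2^(exponent - 1023)."""
    significand = 1 + fraction / 2**52
    return (-1) ** sign * significand * 2.0 ** (exponent - 1023)

# -6.25 is -1.5625 * 2^2, so the stored exponent is 2 + 1023 = 1025.
sign, exponent, fraction = decompose(-6.25)
print(sign, exponent - 1023, 1 + fraction / 2**52)
```

Only the sign, the exponent, and a fixed number of significand bits are stored, so any value that needs more bits than that has to be rounded.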

Since the IEEE Standard for Floating-Point Arithmetic (IEEE 754) was published in 1985, it has come to dominate the mathematical methods used by hardware and software engineers for the basic operations computers perform whenever they run an application. Ideally, a one-size-fits-all standard such as this one would minimize error and promote uniformity of results across a broad spectrum of hardware.

Unfortunately, this has not been the practical result. Different processors and software packages designed to handle floating point operations often produce slightly different answers, due to rounding errors and differing orders of operation.

One way that programmers often compensate is to use as many digits as possible to represent a number. In modern computers, this means that 32 or 64 bits of data are almost always used to represent a single floating point number. Although modern processors calculate very quickly, all of those bits must be stored in and retrieved from memory, which adds significant latency to calculations.
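As a rough illustration of the trade-off between storage and accuracy, the following Python sketch rounds the same value to 32 bits and compares it with the 64-bit version. The helper to_float32 is a name made up for this example; it uses the standard struct module to force a round trip through a 4-byte float.

```python
import struct
from decimal import Decimal

def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

double = 0.1                 # Python floats are 64-bit doubles
single = to_float32(0.1)     # same value squeezed into 32 bits

# Storage cost in bytes for each format.
print(struct.calcsize("<f"), struct.calcsize("<d"))

# Decimal(...) shows the exact value actually stored; neither is exactly 0.1.
print(Decimal(single))
print(Decimal(double))

# The 32-bit version is further from the intended value.
error_single = abs(single - double)
print(error_single)
```

Doubling the width buys more correct digits, but neither format stores 0.1 exactly, and every extra byte still has to travel to and from memory.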

Furthermore, due to compounding rounding error, traditional properties of algebra, such as the associative and distributive properties, do not necessarily hold for floating point operations. In other words, (a + b) + c is not always equal to a + (b + c), nor is c * (a + b) always equal to c*a + c*b.

In floating point, evaluating the same expression in these different but algebraically equivalent ways often yields dissimilar results.
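A short Python sketch shows both failures with ordinary 64-bit doubles. The specific values are chosen for this illustration and do not come from Gustafson's talk; the second example relies on 2**-53 being exactly half the spacing between 1.0 and the next larger double.

```python
# Associativity: the grouping changes where rounding happens.
a, b, c = 0.1, 0.2, 0.3
left_grouped = (a + b) + c     # 0.6000000000000001
right_grouped = a + (b + c)    # 0.6
print(left_grouped == right_grouped)   # False

# Distributivity: x + y rounds back to 1.0 because y is too small to register,
# but k*x + k*y keeps enough of y to nudge the sum up by one unit in the last place.
x, y, k = 1.0, 2.0 ** -53, 3.0
factored = k * (x + y)         # 3.0
expanded = k * x + k * y       # 3.0000000000000004
print(factored == expanded)    # False
```

Each intermediate result is rounded to the nearest representable value, so the order and grouping of operations determine which rounding errors occur and how they accumulate.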