Equality Testing and the Halting Problem
It’s well-known that in programming, testing equality of floating point numbers is problematic. So, for example:
// ok
bool areEqual(int a, int b) {
    return a == b;
}

// bad
bool areEqual(float a, float b) {
    return a == b;
}
In the latter case we won’t really get the behavior we “expect”, since floating point representations are not exact for most numbers. For example in Python:
>>> print('{0:.24f}'.format(1./2.))
0.500000000000000000000000
>>> print('{0:.24f}'.format(1./3.))
0.333333333333333314829616
We can see that while ½ can be exactly represented, ⅓ cannot be. This error can accumulate as we do various arithmetic operations, and so we might get two numbers that mathematically “should” be the same, but in the end are not. This can happen in very simple cases:
>>> .1+.2==.3
False
We can see the problem if we print out more of the digits, as before:
>>> print('{0:.24f}'.format(.1+.2))
0.300000000000000044408921
>>> print('{0:.24f}'.format(.3))
0.299999999999999988897770
This is quite well-known, and although the proper way to deal with it is actually quite tricky, in most cases it’s sufficient to compare with a tolerance:
// good enough most of the time
bool areAlmostEqual(float a, float b) {
    return fabs(a - b) < FLT_EPSILON;
}
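In Python the standard library already offers such a comparison: math.isclose uses a relative tolerance by default, which holds up better than a fixed absolute epsilon when the numbers involved are large.

```python
import math

a = 0.1 + 0.2
b = 0.3

print(a == b)              # False: the exact comparison fails
print(math.isclose(a, b))  # True: equal within relative tolerance

# A fixed absolute epsilon breaks down at large magnitudes, where the gap
# between adjacent floats is bigger than the epsilon itself.
big_a = 1e16 + 2.0
big_b = 1e16
print(abs(big_a - big_b) < 1e-9)   # False, though the values are relatively close
print(math.isclose(big_a, big_b))  # True
```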
But one question you might have is: why is this the case? Does it really have to be that we can’t do exact comparison of floats? Is this some flaw in the IEEE 754 representation?
Solving the Halting Problem
Fundamentally, the problem is that there are only finitely many n-bit floating point numbers, but infinitely many real numbers. That means that for almost all real numbers we can only pick the closest possible floating point number, and this unfortunately happens to be true even for simple rational numbers like ⅓.
We could imagine other representations that let us exactly capture more real numbers. For example, we have an algorithm to compute 𝜋 to arbitrary precision. That means we could have a function that gives us the n-th digit of 𝜋. And we could consider this function the representation of 𝜋. For rational numbers this is even easier, since their decimal representations all repeat and so a function to give us the n-th digit is quite simple.
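For a rational number p/q, long division gives us the n-th decimal digit directly. A minimal sketch (digits counted from 1 after the decimal point, assuming 0 <= p < q):

```python
def nth_digit(p, q, n):
    """n-th decimal digit of the fraction p/q (0 <= p < q), for n >= 1."""
    # Shift the n-th digit into the ones place, then read it off.
    return (p * 10**n // q) % 10

# 1/3 = 0.333...: every digit is 3.
print([nth_digit(1, 3, n) for n in range(1, 6)])  # [3, 3, 3, 3, 3]

# 1/7 = 0.142857142857...: the digits repeat with period 6.
print([nth_digit(1, 7, n) for n in range(1, 8)])  # [1, 4, 2, 8, 5, 7, 1]
```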
Once we have this new representation (or any other), we probably want to be able to perform some operations on it, and in particular we’d really like to be able to check equality, and put to rest that thorny issue we saw above. So let’s say that we have a new function areEqual() which can exactly determine (in finite time!) whether two real numbers in this representation are equal. What problems could this cause?
Well, it turns out, quite a lot. Indeed it turns out that with such a function we can solve the famous Halting problem, which is known to be unsolvable. How can we do this? Let’s say we have a program H that we want to determine whether or not it halts. We now construct a new representation of a real number in our system, as follows:
int nthDigit(int n) {
    bool doesHalt = doesHaltInNSteps(H, n);
    if (doesHalt) {
        return 1;
    }
    return 0;
}
So we basically run H for n steps, and use that to determine the n-th digit of our number. This means we can now check areEqual(nthDigit, 0). By our assumption, this runs in finite time. But note that if it’s true then we know H doesn’t halt, and if it’s false then H does halt. So we’ve solved the halting problem!
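To make the shape of this reduction concrete, here is a toy sketch in Python. The step-bounded simulator stands in for doesHaltInNSteps, and the “programs” are just generators we can advance one step at a time; the impossible ingredient, a true areEqual on digit streams, is exactly what the argument rules out.

```python
def does_halt_in_n_steps(program, n):
    """Run a program (a generator) for up to n steps; True if it finished."""
    it = program()
    for _ in range(n):
        try:
            next(it)
        except StopIteration:
            return True
    return False

def nth_digit(program, n):
    # Digit n is 1 once the program has halted within n steps, else 0, so
    # the represented number equals 0 exactly when the program never halts.
    return 1 if does_halt_in_n_steps(program, n) else 0

def halts_after_five():
    for _ in range(5):
        yield

def loops_forever():
    while True:
        yield

print([nth_digit(halts_after_five, n) for n in range(1, 9)])  # [0, 0, 0, 0, 0, 1, 1, 1]
print([nth_digit(loops_forever, n) for n in range(1, 9)])     # [0, 0, 0, 0, 0, 0, 0, 0]
```

Any finite prefix of the digit stream is computable, as shown; what no program can do is decide whether the whole stream is all zeros.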
The method used here was specific to the representation I described, but a similar method works for any representation (e.g., another common one is as Cauchy sequences of rationals). Essentially the problem is that any program can only examine a finite number of digits, but two real numbers might need to have infinitely many digits examined before we can determine if they are equal (if they are not equal then of course we will know in finite time, but otherwise can never be sure that the next digit won’t be the one that differs).
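We can see this asymmetry directly with digit streams: inequality is witnessed by a single differing digit, but equality would require checking all of them. A sketch, comparing two streams up to a cutoff:

```python
def first_difference(digits_a, digits_b, max_digits):
    """Return the first position where two digit streams differ, or None if
    no difference appeared within max_digits -- which does NOT prove the
    numbers are equal, only that they agree so far."""
    for n in range(1, max_digits + 1):
        if digits_a(n) != digits_b(n):
            return n
    return None

third = lambda n: (10**n // 3) % 10    # digits of 1/3 = 0.333...
almost = lambda n: 3 if n < 50 else 4  # agrees with 1/3 for the first 49 digits

print(first_difference(third, third, 1000))   # None: still no proof of equality
print(first_difference(third, almost, 1000))  # 50: inequality is decisive
```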
This means that fundamentally we can’t check equality of real numbers in finite time, and this is true of every representation we may use, including IEEE 754 floating point. So if you thought that the annoyance of having to check ‘approximate equality’ was some kind of technical limitation, this may make you happy that we haven’t been suffering for nothing. Or it might make you sad that we’re basically doomed forever.