Programming

The trouble with `floor` and `ceil`

floor and ceil have the bad habit of producing unexpected results. They aren’t broken, but in light of floating point nasties can often result in a number that’s ±1 of the desired result.

Why it happens

By example, floor can take a value that is effectively 18 and convert it to 17. Consider this python example:

1
2
3
4
5
import math

a = 17.99999999999
print a
print math.floor(a)

On my computer this outputs:

1
2
18.0
17.0

In other languages, with different formatters, we can take off several 9’s from a and still get the same output. It depends on how the floating point formatter rounds the result. Unless we’ve given a format specifier that preserves the exact value (which is almost never the default), the above can happen.

Despite a value of 18.0 being printed floorstill lowers the value to 17. This is correct since technically a is less than 18, but rarely the behaviour we actually want.

The point here is not to say the formatting is wrong, but just to point out a potential trouble point. Consider in this case that while a ~= parse(format(a)), with a small tolerance, it is not true floor(a) ~= floor(parse(format(a)) — it’s off by a large amount. The result is very different depending how we’ve handled the input value.

Floating point precision always produces results that can vary slightly. Normally these variations aren’t relevant, but floor can magnify the slightest inaccuracy into a very large difference.

Always add a range

In Fuse this problem comes up in a few locations, one of which is the layout code. To provide visual crispness, locations and sizes are snapped to exact pixels. Positions are rounded to the nearest value and the size is always rounded up.

A lot of calculations are involved in calculating a size; the precise floating point value that comes out can vary up/down slightly. For most values this isn’t an issue, but for a size hovering around an integer it can be. Two items that are logically the same size might end up with differing sizes of 17.999 and 18.001 pixels. It would be somewhat distracting if one ended up a full pixel larger on the display.

To get a reasonable ceil behaviour we subtract an epsilon:

1
2
3
4
5
const float pixelEpsilon = 0.005f;

float SnapSize(float p) {
    return Math.Ceil(p - pixelEpilson);
}

This results in any value up to 18.005 being snapped to 18, preventing a value like 18.001 from snapping to 19.

The epsilon of 0.005 was determined to be a good value for this domain. It is intended only as a safeguard against floating point inaccuracy. Anything higher and we’d start breaking the layout rule that sizes are rounded up. In our clipping code, which decides what to draw to the screen, the epsilon is a bit larger. That’s based on the assumption that something covering only a tiny fraction of a pixel will have no actual visual effects.

floor can be modified in the same way by adding the epsilon value:

1
2
3
float SnapDown(float p) {
    return Math.Floor(p + pixelEpsilon);
}

Update: Round-trip idempotence

The above examples are just to show the core concept and I admit don’t completely illustrate the problem. In Fuse we store positions and sizes in “device independent pixels”, called “points”. To do pixel snapping I might first convert to pixel space, round the value, and then convert back. That gives me this function (with the epsilon):

1
2
3
4
5
float SnapSize( float pts ) {
    var px = ConvertPointsToPixels(pts);
    px = Math.Ceil( px - pixelEpsilon );
    return ConvertPixelsToPoints(px);
}

Without a pixelEpsilon this function is not idempotent: SnapSize(x) != SnapSize(SnapSize(x)). floor itself is idempotent, but the precision of the conversion functions introduces a slight inaccuracy. The pixel cannot be represented exactly in point space, and thus when I convert back to pixels I may get a value slightly more than a full integer. Calling Ceil on this would then bump the number up again.

For our code it was important that this function is idempotent as it is called at various times during layout, and during layout animation between frames.

Think before you floating point

As with all things floating point, studious attention to the algorithm is required. We can’t just blindly use the floor and ceil functions and get the desired results. Including an epsilon in the calculation ensures that close-to-integer values remain at that integer.

11 replies »

  1. This is particularly amplified when data-serialization is involved. For example, using the example above, consider a computer sending a float value to another in JSON format (which likely involves sprintf-like functions when dealing with floats). The receiver gets 18.0, floors it to 18. The sender floors the native-type variable so gets 17. The two computers have now diverged significantly.

  2. Your SnapSize function has the same problem that you describe with Ceil, just at a slightly different place. Two items that are logically the same size might end up with differing sizes of 18.004 and 18.006 pixels. I assume it would be equally distracting if one ended up a full pixel larger on the display.

    That assumes that there is no particular reason to be close to an integral pixel size. If there is, then you should be rounding to that size.

    I agree that working with floating point is hard, but I think the problem you are describing in this particular case is that your pixels are too large. The solution, then, is to use smaller pixels when you can.

    • One of the key purposes of the epsilon is to ensure stability of the `ceil` operation. To continue the example, in layout the values are often calculate in various different manners, and sometimes the calculation is reversed. This may happen to check limits on the size, or to calculate a size based on the limits.

      This slight instability essentially made `ceil` a non-idempotent operation. That is, `ceil(x) != ceil( inv_f(f(x)))` because it’s possible that `inv_f(f(x)) != x` exactly. This is the same reason why unit tests don’t check floating point equality, but instead use a small range.

      This is perhaps one of the most significant cases I was solving in this example. I had a value that was already rounded, and I needed it to stay rounded to the same integer if put through the same calculation.

      Slawomir indicates the other issue in his comment, about serialization. Note in my first example the printed value is `18.0`. This is basically the same inversion problem where `parse(format(x)) != x` thus a `floor` done before or after parsing will have different results.

  3. You may know this already, but Perl 6 and Clojure try to sidestep part of this problem by having an actual Rational builtin numeric type. I’m sure other languages have that too, but I don’t know which ones.

    • In Leaf all source code constants are rationals. I convert them only as required to the destination type. This prevents the classice `1/2` producing 0 as an integer — Leaf either converts to a float 0.5, or produces an error if you try to assign to an integer.

      I will likely also have rationals at runtime, but this is a ways off yet. I’d also like to have high precision floating point (like 4Kbit/configurable) for situations where precision is preferable to speed.

  4. Somehow I don’t see the problem where you see it — “Despite a value of 18.0 being printed floor still lowers the value to 17.”. My conclusion is the formatter for print is too forgiving, not that floor is inaccurate — if you shave 17.999 you should get 17. After all it is not 17.(9) which is equal to 18, and thus in such case floor(18) == 18. But it is not this case.

    Btw. “In Leaf all source code constants are rationals”, how do you keep such numbers as “pi” then?

    • I’ve added an update to help clarify more where the problem is. It’s like Slawomir’s comment about round-trip: because my value is undergoind conversion between domains I do no retain the exact value. Thus I have to be careful calling floor or ceil multiple times.

      In Leaf pi will simply be a very high precision approximation. The goal of having rationals is to ensure the initial run-time value is as precise as is possible. So if you do `var v = pi / 3` you know that `v` is as is precise as possible. If `pi` was simply stored as a 64-bit float you could already have some rounding issues in the initial value. This is reason why other languages have constants like PI_2.

      I might also downgrade to high precision float in some cases (should the rational become too large). Perhaps a 2K floating point. This should be precise enough so that even constants like `4 * pi / 180` will be represented correctly even in 128-bit runtime floats.

  5. You’re barking up the wrong tree. 17.99999999999 is not effectively 18. It is exactly 17.99999999999. This article should have been about being careful with print formatting. (lol, cheers.)

    • The issue Slawomir points out with serialization is very important here. Two different computers can compute a different value of `floor` because the common JSON serializaiton has rounded the value.

      Saying the value is “exactly” something is also misleading, there simply are no exacts in flaoting point. Sure, it has a definitive byte respresenation, but that is merely an approximation of the value I actually want. This is just how floating point works, you never get the exact value you want, only an approximation of a real number.

      I’ve added a section about idempotence how this is relevant.

  6. “floor and ceil have the bad habit of producing unexpected results”, I rather say floor and ceil are fine it is your assumptions that make unexpected results.

    the example of Slawomir is clear in that there is nothing wrong with floor and ceil. ‘format JSON’ the formatted is the key here. You get an apple, you chew on it and expect an apple again?

    You clearly point to the curplit and that is != and == which are weak in the face of discrete floats !

    Also your case needs round not hacks around ceil and floor. Now round is subject to convention, is round(0.5) 0 or 1 !??

    • Functions are never used in isolation. My article talks about the unexpecting results of programming with floor and ceil, not about how floor/ceil themselves are broken. It’s clearly the way I’m using floor/ceil that is causing the problem, and that is exactly what I’m trying to demonstrate.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s