
Probably the easiest way to increase floating point accuracy is to use decimal floating point. The input data usually is in decimal and needs only a small part of the accuracy that a decimal float format offers. Using it in later calculations will consume more of that accuracy, but the calculations will stay exact for much longer.

When converting decimal numbers to binary floating point you will often use the full accuracy of the float format right away, because most decimal fractions can't be represented exactly in binary.
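For example, in Python you can see that even a short decimal literal like 0.1 already fills the binary mantissa (the decimal module is used here only to print the exact value of the converted float):

    from decimal import Decimal

    # the exact value of the binary64 double nearest to 0.1
    print(Decimal(0.1))  # 0.1000000000000000055511151231257827021181583404541015625
    print((0.1).hex())   # 0x1.999999999999ap-4 -- the repeating pattern fills the 53-bit mantissa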




Decimal floating point just changes how cancellation/rounding appears; it doesn't do anything to increase accuracy. E.g. suppose you're using a decimal float with two significant digits of precision (i.e. it can have 1.0, 1.1, ... 9.9 as the mantissa), and compute 100 + 1 - 100 (i.e. 1.0e2 + 1.0e0 + -1.0e2): 100 + 1 is 101, which rounds to 100 (1.0e2), and 100 + -100 is of course 0. Complete catastrophic cancellation.
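You can reproduce this with Python's decimal module by shrinking the context precision to two significant digits (a sketch of the toy format above):

    from decimal import Decimal, getcontext

    getcontext().prec = 2            # two significant digits, as in the toy format

    x = Decimal(100) + Decimal(1)    # exact sum 101 rounds to 1.0E+2
    print(x)                         # 1.0E+2
    print(x - Decimal(100))          # 0 -- the +1 has vanished completely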

In fact, binary floats use their storage better than decimal floats, so a binary float of a given size will have slightly higher accuracy than a decimal one of the same size. For example, there are 90 mantissae to encode in the example above, requiring a minimum of 7 bits, with a machine epsilon of 0.01. Using those 7 bits for a binary mantissa gives a smaller epsilon: 1/128 ≈ 0.0078.
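Spelled out (just reproducing the numbers above):

    import math

    print(math.ceil(math.log2(90)))  # 7 bits needed to encode the 90 decimal mantissae
    print(1 / 2**7)                  # 0.0078125 -- the spacing the same 7 bits buy as a binary mantissa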

Of course, decimal floats do have the advantage of matching a common input/output format, and their rounding artifacts better match the intuition of us base-10 accustomed humans.


> In fact, binary floats use their storage better than decimal floats, so a binary float of a given size will have slightly higher accuracy than a decimal one of the same size.

Not necessarily. The encoding inefficiency is absorbed by spending fewer bits on the exponent field. 64-bit decimal floats actually have slightly more mantissa data than their 64-bit binary counterpart.
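Back-of-the-envelope, assuming IEEE 754 decimal64 (16-digit significand) against binary64 (53-bit significand):

    import math

    # 16 decimal digits expressed in bits, vs. binary64's 52 stored + 1 implicit bit
    print(math.log2(10**16))  # ~53.15
    print(52 + 1)             # 53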

> Decimal floating point just changes how cancellation/rounding appears; it doesn't do anything to increase accuracy. E.g. suppose you're using a decimal float with two significant digits of precision (i.e. it can have 1.0, 1.1, ... 9.9 as the mantissa), and compute 100 + 1 - 100 (i.e. 1.0e2 + 1.0e0 + -1.0e2): 100 + 1 is 101, which rounds to 100 (1.0e2), and 100 + -100 is of course 0. Complete catastrophic cancellation.

It also changes how often it occurs. With binary floats it already happens when you compute 1.0 + 0.1 (and I don't mean the rounding error of converting 0.1 to binary, but the additional rounding of the sum).

When you enter 1.0 + 0.1 - 1.0 in Python, for example, you get 0.10000000000000009 as the result, which is not equal to what you get when you enter 0.1. With decimal floats this doesn't happen.
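For example, with Python's decimal module (default 28-digit context) the same expression stays exact:

    from decimal import Decimal

    print(1.0 + 0.1 - 1.0)                                   # 0.10000000000000009
    print(Decimal('1.0') + Decimal('0.1') - Decimal('1.0'))  # 0.1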

With every calculation you do, the amount of precision needed can rise. With decimals converted to binary floats you use up all of the precision right from the start. With decimal floats you can often stay comfortably within the precision the data type offers.


> The input data usually is in decimal and needs only a small part of the accuracy that a decimal float format offers.

I would actually argue that the majority of the time, data is not in decimal. For example in computational science, which Herbie seems to be aimed at, you very rarely have decimal numbers. While input parameters to computations might be decimal numbers, everything apart from the initial conditions would be irrational numbers. A good example would be numerically solving the harmonic oscillator equation -- the initial conditions might very well be decimal, but the numerical solution is not (and neither would the analytic solution be).

> When converting decimal numbers to binary floating point you will often use the full accuracy of the float format right away, because most decimal fractions can't be represented exactly in binary.

Inability to exactly represent decimal numbers isn't really the problem in these cases. Summation of numbers with wildly varying magnitudes would be problematic for decimal floats as well.
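A sketch with Python's decimal module set to 16 digits (roughly decimal64's precision):

    from decimal import Decimal, getcontext

    getcontext().prec = 16          # about what decimal64 offers

    big = Decimal('1e20')
    print(big + Decimal(1) - big)   # 0E+5, i.e. zero -- the +1 was rounded away here too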


> I would actually argue that the majority of the time, data is not in decimal.

The majority of time data is not even floating point, and for most use cases floats don't make sense there (you usually have a fixed precision you want and no varying orders of magnitude).

Almost all operating systems and programming languages use metric (decimal) fractions of the second for time (https://en.wikipedia.org/wiki/System_time#Retrieving_system_...); the only common technology I know of that doesn't is NTP.
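For instance, in Python a timestamp is just an integer count of a decimal subdivision of the second:

    import time

    # nanoseconds since the Unix epoch, as a plain int -- fixed precision, no float
    print(time.time_ns())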

> in these cases

I was making a more general remark.



