Double precision on Linux using fpu_control.h


I am trying to port a piece of code from Solaris to Linux. In the process I found that the floating-point precision on Linux is different: it is extended precision, and we need to set it to double precision explicitly. To achieve this I found the fpu_control.h header and its _FPU_GETCW and _FPU_SETCW macros. But even after using them, the precision is not being set properly. The code snippet:

long double power = 1.0;
#ifdef __linux
    fpu_control_t mask;
    _FPU_GETCW(mask);
    mask &= ~(_FPU_EXTENDED & _FPU_SINGLE);
    mask |= _FPU_DOUBLE;
    _FPU_SETCW(mask);

    power *= 0.1;
#endif

When I print power, the value is power = 0.1000000000000000055511151231257827

However, I was expecting power to have the value 0.1. I also used -DDouble while compiling. Can someone point out what is going wrong?


There are 2 best solutions below

1
tzot

You specifically request a long double, while you supposedly want a plain double. If your hardware is an Intel x86/x86-64 CPU, calculations going through the x87 FPU are performed in 80-bit precision.

Otherwise, try a gcc flag such as -mfpmath=sse, which makes the compiler use SSE instead of the x87 FPU, so your double operations are performed in 64-bit (double) precision.
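
Separately from the choice of type, the masking in the question's snippet appears to leave the control word unchanged. In glibc's x86 fpu_control.h, _FPU_EXTENDED is 0x300, _FPU_DOUBLE is 0x200 and _FPU_SINGLE is 0x0, so _FPU_EXTENDED & _FPU_SINGLE is 0 and nothing gets cleared. The sketch below reproduces the arithmetic with local constants so that it compiles on any platform; the names PC_EXTENDED, PC_DOUBLE, PC_SINGLE, question_mask and fixed_mask are hypothetical, introduced only for illustration.

```c
/* Precision-control values mirroring glibc's x86 <fpu_control.h>
   (_FPU_EXTENDED, _FPU_DOUBLE, _FPU_SINGLE). The PC_* names are
   hypothetical stand-ins so this sketch does not need that header. */
enum { PC_EXTENDED = 0x300, PC_DOUBLE = 0x200, PC_SINGLE = 0x0 };

/* The masking as written in the question. */
static unsigned question_mask(unsigned cw) {
    cw &= ~(PC_EXTENDED & PC_SINGLE); /* 0x300 & 0x0 == 0, so ~0 clears nothing */
    cw |= PC_DOUBLE;                  /* bits 0x300 stay set if they already were */
    return cw;
}

/* Clear both precision-control bits first, then select double. */
static unsigned fixed_mask(unsigned cw) {
    cw &= ~(unsigned)PC_EXTENDED;
    cw |= PC_DOUBLE;
    return cw;
}
```

With the usual x87 default control word 0x037f, question_mask returns 0x037f unchanged, while fixed_mask returns 0x027f. In the real code the equivalent change would be mask &= ~_FPU_EXTENDED; before mask |= _FPU_DOUBLE;. Even with that change, 0.1 would still not be stored exactly.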

Note:

It is quite possible that even on Solaris you were getting an inexact representation of 0.1 (there isn't an exact one), but the way the value was output hid this inexactness by printing only up to a specified number of decimal digits.
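
As a small illustration of how the output format can hide the inexactness, the helper below formats the same double with a chosen number of significant digits. This is only a sketch, assuming IEEE 754 doubles; format_sig is a hypothetical name, not something from the original code.

```c
#include <stdio.h>

/* Format x into buf with the given number of significant digits,
   using printf's %.*g conversion. Returns snprintf's result. */
static int format_sig(char *buf, size_t n, double x, int digits) {
    return snprintf(buf, n, "%.*g", digits, x);
}
```

With 6 significant digits, 0.1 is rendered as 0.1; with 17 it is rendered as 0.10000000000000001, even though the stored value is identical in both cases.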

2
chux - Reinstate Monica On

I was expecting power to have an value 0.1

Not generally possible to fulfill OP's expectation.


double and long double cannot store every possible number.
double can exactly encode about 2^64 different numbers, as it usually uses 64 bits.
long double can exactly encode maybe 2^64, 2^80 or 2^128 different numbers, depending on the implementation.

With a typical double, 0.1 cannot be encoded exactly. It is not one of those 2^64 exact numbers. Instead, double x = 0.1 will initialize x with the closest alternative:

Exact value        0.1000000000000000055511151231257827021181583404541015625
OP's printed value 0.1000000000000000055511151231257827

The next closest alternative is

0.09999999999999999167332731531132594682276248931884765625
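
To see that these two values are adjacent doubles with nothing in between, one can step the bit pattern down by one. This is a minimal sketch, assuming IEEE 754 binary64 and a finite positive input; prev_double is a hypothetical helper name.

```c
#include <stdint.h>
#include <string.h>

/* Largest double strictly below x, for finite x > 0.
   Assumes IEEE 754 binary64, where adjacent positive doubles
   have adjacent bit patterns. */
static double prev_double(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits); /* reinterpret the double as an integer */
    bits -= 1;                      /* step down to the neighbouring encoding */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Under that assumption, 0.1 is stored as 0x1.999999999999ap-4 and prev_double(0.1) is 0x1.9999999999999p-4; the two differ by 2^-56. Printing both with enough digits, for example with printf("%.56f\n", x), should reproduce the two decimal values quoted above on a C library that prints exact decimal expansions, such as glibc.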

This is not a double vs long double issue.