Why explicit Integer conversion doesn't apply as expected


Please note that my question is not "Why do floating-point numbers lose precision?" I am aware that not every fractional value can be represented exactly in binary. What is actually stored is the closest value that fits in the available mantissa bits.

Given:

// This is just an example.
float f1 = 17.96f, f3 = 17.98f;
double d2 = 17.96, d4 = 17.98;
printf("1. 17.96 = %f (as int:%d)\n", f1, (int)(f1 * 100));
printf("2. 17.96 = %lf (as int:%d)\n", d2, (int)(d2 * 100));
printf("3. 17.98 = %f (as int:%d)\n", f3, (int)(f3 * 100));
printf("4. 17.98 = %lf (as int:%d)\n", d4, (int)(d4 * 100));

The output is:

1. 17.96 = 17.959999 (as int:1795)
2. 17.96 = 17.960000 (as int:1796)
3. 17.98 = 17.980000 (as int:1798)
4. 17.98 = 17.980000 (as int:1798)

I changed the print format to show 25 decimal places to see the "real number" stored in memory. The output becomes:

1. 17.96 = 17.9599990844726562500000000 (as int:1795)
2. 17.96 = 17.9600000000000008526512829 (as int:1796)
3. 17.98 = 17.9799995422363281250000000 (as int:1798)
4. 17.98 = 17.9800000000000004263256415 (as int:1798)

The first case makes sense to me: (int)(f1 * 100) results in 1795 instead of 1796 because the "real stored number" is 17.9599990844726562500000000.

But (int)(f3 * 100) results in 1798 instead of 1797, while the "real stored number" is 17.9799995422363281250000000. The "real" number * 100 equals 1797.99995422363281250000000, so after integer truncation it should be 1797, yet I got 1798. Why?

Best answer, by chux - Reinstate Monica

Common float and double values are each some integer times a power of 2. See Dyadic rational.

17.96 and 17.98 are not exactly representable that way.

Instead, a nearby representable number is used, one just above or just below the floating point constant written in the code.

printf("%.52e %.52e\n", 17.96f, 17.98f);
printf("%.52e %.52e\n", 17.96, 17.98);
1.7959999084472656250000000000000000000000000000000000e+01 1.7979999542236328125000000000000000000000000000000000e+01
1.7960000000000000852651282912120223045349121093750000e+01 1.7980000000000000426325641456060111522674560546875000e+01
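To make the "integer times a power of 2" form concrete, here is a minimal sketch that splits a float into those two parts. The helper names float_significand and print_decomposition are hypothetical, and the code assumes float is IEEE-754 binary32 with a 24-bit significand.

```c
#include <math.h>
#include <stdio.h>

/* Split a finite float so that x == significand * 2^exponent exactly.
   Assumption: float is IEEE-754 binary32 (24-bit significand). */
long float_significand(float x, int *exponent) {
    int e;
    float m = frexpf(x, &e);     /* x = m * 2^e with 0.5 <= |m| < 1 (or m == 0) */
    *exponent = e - 24;          /* move all 24 significand bits left of the point */
    return (long)ldexpf(m, 24);  /* m * 2^24 is an exact integer */
}

/* Print the decomposition of one value (hypothetical helper). */
void print_decomposition(float x) {
    int e;
    long sig = float_significand(x, &e);
    printf("%.20g = %ld * 2^%d\n", (double)x, sig, e);
}
```

Under those assumptions, print_decomposition(17.96f) should report the integer 9416212 with exponent -19, which is the stored value 17.95999908447265625 rather than 17.96.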

Scaling by 100 incurs additional rounding effects - perhaps up a tad or down a little. The products may be 1796, 1798 or just above or just below.

Recall that 17.96f * 100 is not the mathematical product 17.96 × 100. It is 17.95999908447265625f * 100.0f, whose exact value is 1795.999908447265625, and that product is then rounded to the nearest float: 1795.9998779296875.

printf("%.52e %.52e\n", 17.96f * 100, 17.98f * 100);
printf("%.52e %.52e\n", 17.96 * 100, 17.98 * 100);
1.7959998779296875000000000000000000000000000000000000e+03 1.7980000000000000000000000000000000000000000000000000e+03
1.7960000000000000000000000000000000000000000000000000e+03 1.7980000000000000000000000000000000000000000000000000e+03

[Edit]
OP added <"the" real number * 100 equals 1797.99995422363281250000000>
That is true mathematically, yet this is a float multiplication with a float product. As 1797.9999542236328125 is not representable as a float, the result was rounded to one of the two nearby floats: 1797.9998779296875 or 1798.0. With OP's FP rounding mode, it likely rounded to the nearest one: 1798.0. Converting that to an int gives 1798.


C23 optionally offers decimal floating point types, which are the better fit for decimal problems like this one.


Do not use crude casts like (int) on the results of floating point calculation without a clear understanding of the edge effects.

Applying an int cast is questionable coding here: the products are meant to be 1796 and 1798, yet a computed product may land at 1795.999... or 1797.999..., which the cast truncates to 1795 or 1797. Using lround() would make more sense.


To scale money held in a float well (using float for money is a poor idea), multiply by 100.0 (a double) and use llround() or lround().

To scale money held in a double into an integer well, multiply by 100.0L (a long double) and use llroundl() or lroundl().

The extra precision will help with edge cases.

printf("%ld %ld\n", lround(17.96f * 100.0), lround(17.98f * 100.0));
printf("%lld %lld\n", llroundl(17.96 * 100.0L), llroundl(17.98 * 100.0L));
1796 1798
1796 1798

Note 17.96f * 100 is like 17.96f * 100.0f - a float multiplication. That differs from 17.96f * 100.0 - a double multiplication.
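The effect of that distinction shows up clearly with 17.98f. The following is only a sketch with hypothetical function names, assuming IEEE-754 binary32 and binary64 with round-to-nearest.

```c
/* float * float: the product is rounded to the nearest float before the cast. */
int trunc_float_mul(float price) {
    volatile float p = price * 100.0f;  /* forced to float precision */
    return (int)p;
}

/* float * double: price is widened first, so the product keeps more digits. */
int trunc_double_mul(float price) {
    double p = price * 100.0;           /* double multiplication */
    return (int)p;
}
```

For 17.98f the float product rounds to exactly 1798.0, while the double product stays at 1797.9999542236328125, so the two casts give 1798 and 1797 respectively. The wider product is the more accurate one, which is why it should be paired with lround() rather than a cast.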