Floating point precision affected when converting DataFrame to list


So I'm trying to convert a float DataFrame to a list of lists by using row.values.tolist() (row was read from a CSV file). It mostly does the job, but for a few values the precision is affected: for instance, instead of 32.337 it outputs 32.336999999999996.

Since tolist() yields a list of lists and I need to work with flat lists, I decided to switch to list(row.values.flatten()), but that introduces precision issues for almost every value in the list, which just makes it worse.

I found a discussion on GitHub about this issue from almost 3 years ago, but I can't find anything more up to date and I have no idea how to overcome this.

I tried using pd.set_option('display.precision',4) since for most values I need a maximum of four significant digits, but this isn't working either (assuming I'm using it right).

Is there any workaround for this?


There are 3 best solutions below

0
dagrha On

As Padraic alluded, you are not modifying the correct option with display.precision. Instead, try:

pd.options.display.float_format = '{:,.3f}'.format
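One caveat worth checking: as far as I can tell, the display options only change how pandas prints floats, not the values that tolist() hands back. A minimal sketch, using 0.1 + 0.2 as a stand-in for the value from the question (the column name "a" is made up for illustration):

```python
import pandas as pd

# 0.1 + 0.2 is stored as 0.30000000000000004, a well-known rounding artifact.
df = pd.DataFrame({"a": [0.1 + 0.2, 1.5]})

# Apply the format only inside this block so the global option is left alone.
with pd.option_context("display.float_format", "{:,.3f}".format):
    printed = df.to_string()

# The underlying values are unchanged by the display option.
values = df["a"].tolist()

print(printed)
print(values)
```

So the printed table looks rounded, while the list still carries the full binary value. If the list itself has to be clean, the values need to be rounded, not just the display.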
0
SamB On

The "loss of precision" you're seeing is due to the fact that binary floating-point can't precisely represent decimal fractions, so there's some rounding error. If you really want to pass decimal values through unchanged, you'd get better results using an actual decimal representation...

Unfortunately, NumPy doesn't seem to provide any decimal datatypes for you to use.
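To illustrate the point, here is a rough sketch using Python's standard decimal module. Keeping Decimal objects in an object-dtype Series is my own assumption about how one might work around the missing NumPy dtype; it is not a built-in pandas feature and it gives up fast vectorized math.

```python
from decimal import Decimal

import pandas as pd

# Decimal(float) exposes the exact binary value the float really stores.
exact = Decimal(0.1)

# Decimal(str) keeps the decimal digits exactly as written.
kept = Decimal("32.337")
total = kept + Decimal("0.001")

# Assumed workaround: keep the values as strings, then map them to Decimal.
# The resulting Series has dtype object, not a numeric dtype.
s = pd.Series(["32.337", "1.5"]).map(Decimal)
as_list = s.tolist()

print(exact)
print(total)
print(as_list)
```

With a CSV file, the same idea would mean reading the column as text first (for example with dtype=str in read_csv) so the digits never pass through a binary float.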

0
Sylhare On

Even so, you might still have a problem with precision depending on the base values. You can still use round to specify the number of decimals you wish to keep.

import pandas as pd

df = pd.DataFrame([(.2132481, .399452), (.012311, .13267), (.613216, .01233), (.213211, .181235)])
df.values.round(3).tolist()

>> [[0.213, 0.399], [0.012, 0.133], [0.613, 0.012], [0.213, 0.181]]

The 3 in .round(3) is the number of decimal places to keep.
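Since the question asks for a flat list rather than a list of lists, the same idea can be combined with ravel(). This is only a sketch on a shortened version of the sample data above:

```python
import pandas as pd

df = pd.DataFrame([(.2132481, .399452), (.012311, .13267)])

# Round first, flatten the 2-D array to 1-D, then convert to Python floats.
flat = df.values.round(3).ravel().tolist()

print(flat)
```

Calling tolist() at the end, rather than wrapping the array in list(), gives plain Python floats instead of NumPy scalars, which is likely why the two approaches in the question print differently.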