I stumbled across an interesting optimization question while writing some mathematical derivative functions for a neural network library. It turns out that the expression `a / (b*c)` takes longer to compute than `a / b / c` for large values (see the `timeit` results below). But since the two expressions are mathematically equal:
- shouldn't Python optimize both in the same way under the hood?
- is there a case for `a / (b*c)`, given that it seems to be slower?
- or am I missing something, and the two are not always equal?
Thanks in advance :)
In [2]: timeit.timeit('1293579283509136012369019234623462346423623462346342635610 / (52346234623632464236234624362436234612830128521357*32189512234623462637501237)')
Out[2]: 0.2646541080002862
In [3]: timeit.timeit('1293579283509136012369019234623462346423623462346342635610 / 52346234623632464236234624362436234612830128521357 / 32189512234623462637501237')
Out[3]: 0.008390166000026511
**Why is `a/(b*c)` slower?**

`(b*c)` is multiplying two very big ints with unlimited precision. That is a more expensive operation than performing floating point division (which has limited precision).
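One way to check this yourself (a rough sketch, not from the original answer; it binds the question's numbers to variables so the literals aren't folded into constants at compile time) is to time the pieces separately:

```python
import timeit

# Numbers copied from the question, bound to names so each statement is
# actually executed at run time rather than folded into a constant.
a = 1293579283509136012369019234623462346423623462346342635610
b = 52346234623632464236234624362436234612830128521357
c = 32189512234623462637501237

print(timeit.timeit('b * c', globals=globals()))        # unbounded-precision int multiply
print(timeit.timeit('a / b / c', globals=globals()))    # two floating point divisions
print(timeit.timeit('a / (b * c)', globals=globals()))  # int multiply plus one division
```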
**Are the two calculations equivalent?**

In practice, `a/(b*c)` and `a/b/c` can give different results, because floating point calculations have inaccuracies, and doing the operations in a different order can produce a different result. For example:
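(The snippet below is an illustrative sketch with small, arbitrarily chosen values: `a/b/c` rounds twice, once after `a/b` and again after dividing by `c`, while in `a/(b*c)` the product is exact and only the final division rounds.)

```python
# Small values chosen purely for illustration.
a, b, c = 10, 3, 7

print(a / b / c)                  # rounds after a/b, then again after /c
print(a / (b * c))                # b*c is exact, so only one rounding occurs
print(a / b / c == a / (b * c))   # False: the two differ by one unit in the last place
```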
It boils down to how a computer deals with the numbers it uses.
**Why doesn't Python calculate `a/(b*c)` as `a/b/c`?**

That would give surprising results. The user ought to be able to expect that precomputing the product, say `x = b*c` followed by `a / x`, should have the same result as writing `a / (b*c)` directly, so it would be a source of very mysterious behaviour if `a / (b*c)` gave a different result because it was magically replaced by `a / b / c`.
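A sketch of that surprise, reusing the illustrative values from above and a hypothetical name `x` for the precomputed product:

```python
a, b, c = 10, 3, 7
x = b * c                  # the user precomputes the product

print(a / x)               # what the user explicitly wrote
print(a / (b * c))         # identical today, because b*c is evaluated first
print(a / b / c)           # what a silent rewrite of a/(b*c) would return instead
print(a / x == a / b / c)  # False: the rewrite would break the expected equality
```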