For disclosure, I'm rather new to Haskell and am figuring out the syntax. Let me illustrate my question with an example. Let's say you have a function which returns a tuple of several values or a data type with multiple fields.
let
tuple = functionWhichReturnsTuple
value1 = fst tuple
value2 = snd tuple
in
otherFunction value1 value2
Once the line in the in block is executed, would functionWhichReturnsTuple be called twice, once for value1 and once for value2? If so, what would be a better method of extracting multiple values at once without initiating multiple function calls?
This is indeed a subtle point. The key facts to remember here are:
From the point of view of the result, i.e. the semantics, it is irrelevant how many times
functionWhichReturnsTupleis called: even if thefunctionWhichReturnsTupleis called twice, it will always return the same result since Haskell is referentially transparent. This is the main feature of pure functional programming: functions always return the same result when called with the same arguments.From the point of view of performance, instead, it does matter. Calling a function more times means recomputing the same result, so it can slow down the program dramatically. If that happens at every step of a recursive function, the complexity might blow up from linear to exponential.
Now, the compiler tries to optimize performance, hence it will prefer avoiding recomputing results when possible. Except in a few obscure cases, it will compute definitions
x = ....only once. Let's consider your code:In Haskell, computation is triggered from outside, when a value is demanded: it might be because of pattern matching, because of some strict function (like
+onInts), because we are going to print a value on screen, etc. Let's sayvalue1is demanded. GHC then computesfst tuple, which in turn will demandtupleto be computed, forcing the evaluation offunctionWhichReturnsTuple. When that, say, returns(2,73)GHC effectively rewrites the definition to store the result:In this way, if later on
value2is demanded as well, we can access the already-computedtupleand get73. No recomputation of the function.(Nitpick: above I neglected the fact that, because of laziness, tuples might have only one component evaluated. But that's not really relevant for this question.)
Finally, above I mentioned "a few obscure cases". There are cases where the optimizer can not call the function only once. That might happen when the function is polymorphic ("generic"). Consider this code:
Here, types are polymorphic (
forall a . ...), and it's perfectly possible that later on we choose two distinct types forvalue1andvalue2!For instance in
print (value1 :: Int, value2 :: Double).In this case, we must compute
(1+2)both onInts and onDoubles. Optimization can not prevent that.dupalso has to be called twice.The problem here is that polymorphic values may look like plain values but are functions in disguise.
tupleabove acts like a function, expecting to be passed the typea, and returning a tuple inside that type. Above we saw that the definition oftuplewas rewritten, but that only happened because it was a non-function value: GHC does not attempt to rewrite the definition of functions, since next time they might be called with different arguments. Since polymorphic values are disguised functions, they do not get rewritten and they are recomputed at each use.Lastly, note that the Haskell definition does not mandate that recomputing happens or that it does not happen. The GHC optimizer is completely free to do anything, so what we discussed above is not completely set in stone. Usually, the optimizer does a great job. It always avoids to recompute non-polymorphic values. In some cases, it might even avoid recomputing polymorphic ones, but we should not rely on that too much.