I would like to understand how the memory-sharing mechanism works in Haskell. For example, one way to write a function that computes terms of the Fibonacci sequence is:
```haskell
fibo' n = f n
  where
    f 0 = 1
    f 1 = 1
    f n = fibs !! (n-1) + fibs !! (n-2)
    fibs = [f x | x <- [0..]]
```
For this to be an improvement, the `fibs` list must not be evaluated multiple times but shared between calls, but I'm not sure whether that assumption holds or how the sharing works.
As a first approximation, you can just use variable scopes to predict what will be shared.
All the places that use `fibs` are within the same scope, so each occurrence of that name resolves to the same object in memory. That means each time `f` is invoked within one top-level `fibo'` call, there is one single `fibs` list. It's shared across `f` calls.

However, `fibs` is defined in a `where` scope, attached to the function equation `fibo' n = f n`. That means the scope is "inside" each call of `fibo'` (on some particular `n`); all the names defined in the `where` clause refer to different objects in memory each time `fibo'` is called. (This is just like how local variables - defined inside a function - work in any mainstream programming language.)

What you have here is a function that builds a list of pre-computed Fibonacci numbers to save recomputation within one top-level call, but then throws that list away and starts from scratch on the next top-level call.
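If you want to observe this directly, one option (not part of the original code, just a rough sketch) is to make each element of `fibs` announce when it is actually computed, using `Debug.Trace.trace` from base:

```haskell
import Debug.Trace (trace)

fibo' :: Int -> Integer
fibo' n = f n
  where
    f 0 = 1
    f 1 = 1
    f k = fibs !! (k-1) + fibs !! (k-2)
    -- Each trace message fires at most once per element *per top-level call*:
    -- within a single fibo' call the elements are shared, but a second call
    -- to fibo' builds a fresh fibs and prints the messages all over again.
    fibs = [trace ("computing fibs !! " ++ show x) (f x) | x <- [0..]]
```

Calling `fibo' 10` twice in GHCi should print the "computing" messages both times, which is exactly the "throw it away and start from scratch" behaviour described above.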
That may be exactly what you want, so this isn't wrong per se. If `fibs` were defined in a scope outside the top-level application, then it would be shared across all calls, but that also means it would permanently occupy memory to remain available for the next call. As objects in Haskell can expand in memory as they are evaluated more deeply, this could be considered a memory leak. Throwing it away after each top-level call instead means multiple calls repeat some work, but you're only using the memory needed to speed up each call (rather than the memory needed for the largest call you have ever made), and only while it is running. So which way is "better" depends on what the rest of your program is doing with this function, not on this function itself.

Note that none of the definitions in your `where` clause actually use the argument of `fibo'`, and `fibo'` itself isn't really "doing anything"; it just forwards the call immediately to `f`. So the only "interesting" effect of putting your code in a `where` like this is creating a scope where you can define `fibs` inside the top-level call but outside the inner `f` calls. (I'm considering the access-control effects to be non-interesting.) If you didn't want that, and did want `fibs` to be shared across calls, it would be simpler to just define your code like this:
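(A sketch of that, with `f` and `fibs` moved to the top level; the name `fibo''` is just illustrative.)

```haskell
fibo'' :: Int -> Integer
fibo'' n = f n

f :: Int -> Integer
f 0 = 1
f 1 = 1
f n = fibs !! (n-1) + fibs !! (n-2)

-- fibs now lives at the top level, so there is one list for the whole
-- program run, shared by every call of fibo''.
fibs :: [Integer]
fibs = [f x | x <- [0..]]
```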
Now `fibs` is just a single object in the global scope, and will be shared across all calls (but will occupy memory across all calls too).

If you do care about the access control (nothing else can access `fibs` if it's defined in a local scope, but anything can if it's defined in a global scope), you can instead do this:
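(A sketch; the important shape, as explained next, is that the `where` hangs off the argument-less equation `fibo''' = f`.)

```haskell
fibo''' :: Int -> Integer
fibo''' = f
  where
    -- f and fibs are local to fibo''', but the equation binds no argument,
    -- so these definitions are set up once, not once per call.
    f 0 = 1
    f 1 = 1
    f n = fibs !! (n-1) + fibs !! (n-2)
    fibs = [f x | x <- [0..]]
```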
Here `fibs` (and `f`) are defined in a local scope. But the equation the `where` is attached to is just `fibo''' = f`. This local scope is still "outside" the top-level call, because the scope is "entered" before `fibo'''` receives its argument. Whereas the original `fibo'` has its `where` attached to an equation that has already brought the argument `n` into scope, so that `where` scope is "entered" after each top-level call is made.

Now, I did start this post with "as a first approximation". The compiler is allowed to reorganise your code to try to optimise it. For example, it could theoretically notice that `fibs` doesn't depend on the argument `n` and so move it outside the top-level call, making your `fibo'` behave like `fibo'''`. Or (again theoretically) it could decide to inline `fibs`, which would remove all sharing and make your `fibo'` behave like a naive recursive Fibonacci calculator.

In practice, however, it is extremely unlikely to do either of these things. It is "allowed" to make any transformation that doesn't change the end result of your code, but the transformations the GHC developers actually put into the compiler are designed to make your code better. So they try very hard not to increase or reduce sharing in a way that causes large memory leaks or slowdowns.
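For reference, the "no sharing at all" outcome mentioned above corresponds to something like this naive recursive calculator (the name `naiveFib` is just illustrative):

```haskell
-- Exponential-time version: nothing is shared, so both recursive calls
-- redo all of the work below them.
naiveFib :: Int -> Integer
naiveFib 0 = 1
naiveFib 1 = 1
naiveFib n = naiveFib (n-1) + naiveFib (n-2)
```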
So while the compiled and optimised code frequently is quite different from what you actually wrote, it's pretty safe to reason about your code under the assumption that "if you give something a name, all uses of that name within the same scope will be shared". You just have to be careful about when exactly local scopes are "entered" so that you know what counts as "within the same scope".