First up, despite this being an implementation of the Sieve of Eratosthenes this is not a homework question. I can find implementations of the Sieve in many intro books :).
The question I have is that around my confusion around referential transparency, which was one of the virtues of functional programming.
My understanding was that I could replace f(x) anywhere in my program with y, provided y = f(x) and I was in a pure function.
A naive implementation of the n-th prime using the Sieve "by hand" is
nums0 = [2..] -- allowed by lazy evaluation
nums1 = [n | n <- nums0, mod n (head nums0) /= 0] -- works, gives [3,5,7,9,11,13,15...]
nums2 = [n | n <- nums1, mod n (head nums1) /= 0] -- works, gives [5,7,11,13,...]
nums3 = [n | n <- nums2, mod n (head nums2) /= 0] -- works, gives [7,11,13,...]
It is clear that there is a recursive pattern here. Let's call ndnp m the function that returns the list of natural numbers greater than 1 that are not divisible by the first m primes i.e.
nums0 = ndnp 0 -- (i.e. all numbers starting at 2)
nums1 = ndnp 1 -- (not divisible by 2)
nums2 = ndnp 2 -- (not divisible by 2 or 3)
To find the Nth prime, we can simply do head $ ndnp (n-1).
The example of constructing the numsX array has a recursive structure:
-- ndnp: Not Divisible by first N Primes
ndnp :: Integer -> [Integer]
ndnp 0 = [2..]
ndnp n = [n | n <- (ndnp (n-1)), mod n (head $ ndnp (n-1)) /= 0]
In particular, ndnp 1 is built from ndnp 0 or nums0
When run take 10 $ ndnp 0 I get the answer I expect. When I run take 10 $ ndnp 1 I get a stackoverflow.
My question:
- I can assign just fine:
a = ndnp 1is legal. Evaluation is lazy, which I understand. - When I try
take, I get a stack overflow (which suggests it is trying to evaluate the whole structure). I could understand ifmodor something forced eager evaluation, but I know this isn't the case because ... take 10 nums1works just fine, and it is build off the same steps of looping through an infinite, lazily eval'ed list!- Since
nums0 = ndnp 0, why is the execution oftake 10 $ nums1fine (which usesnums0and iterates over it), buttake 10 $ ndnp 1broken (which uses the result ofndnp 0and iterates over it)? Referrential transparency suggests I should be able to swap one out for the other, but in one case I retain lazy evaluation and the other generates a stack overflow!
Note: There is a fix for this I found
ndnp :: Integer -> [Integer]
ndnp 0 = [2..]
ndnp n = [n | n <- arr, mod n (head $ arr) /= 0]
where arr = ndnp (n-1)
which makes my code work, but doesn't give me a good conceptual understanding on when I have referential transparency and when I don't.
If you have warnings turned on (which is good practice -- it's kind of a mystery to me that so many compilers, GHC included, default to not printing warnings), then you'll see something like:
which will clue you in on what's going wrong. The Haskell scoping rules mean that your original definition of
ndnpis equivalent to:That is, your new definition of
ndoesn't capture thenin the first occurrence ofndnp (n-1), but it does capture all the otherns in the comprehension.When you try to evaluate
ndnp 1you get:and the list comprehension starts with
m = 2which callsndnp (m-1) = ndnp 1within the guard, resulting in an infinite loop.You can fix the problem by renaming the newly introduced iteration variable in the list comprehension:
Observe that substituting
n = 1gives:and then substituting
ndnp 0 = nums0gives:which precisely matches your definition of
nums1up to variable substitution.So, there's no violation of referential transparency in this example.
The fix you found:
is scoped differently. The Haskell scoping rules mean that it's equivalent to:
See how the new variable
mcaptures every occurrence ofmin the list comprehension, but doesn't capture thenin the definition ofarr? This means that this is equivalent to:where neither
ndnp (n-1)is captured, making it equivalent to my "fixed" version above, up to variable renaming: