I'm trying to understand what is the effect that filter has in the shrink tree of a generator when using hedgehog integrated shrinking.
Consider the following function:
{-# LANGUAGE OverloadedStrings #-}
import Hedgehog
import qualified Hedgehog.Gen as Gen
aFilteredchar:: Gen Char
aFilteredchar =
Gen.filter (`elem` ("x" :: String)) (Gen.element "yx")
When a print the shrink tree:
>>> Gen.printTree aFilteredchar
I'd get shrink trees that look as follow:
'x'
└╼'x'
└╼'x'
└╼'x'
...
└╼<discard>
this is, a very deep tree containing only x's, and a discard at the end.
Why does the shrink function keeps on returning x's, instead of an empty list, which signals that there are no further shrinks possible?
Genis essentially a composition of a probability monad and a tree monad, and the behavior you observe mostly arises from the tree monad and the definition ofGen.filter.Basically,
Gen.filter p gis a simple monadic loop,try 0where:So to understand the tree you got, you must understand the tree monad under the
donotation here.The tree monad
The
Treetype in hedgehog that is internally used byGenlooks roughly like this (if you are looking at the linked implementation in hedgehog, setm ~ Maybe):There are many other
Tree-like types that are monads, and the monadic bind(>>=)generally takes the form of a tree substitution.Say you have a tree
t = Node x [t1, t2, ...] :: Tree a, and a continuation/substitutionk :: a -> Tree b, which replaces every node/variablex :: awith the treek x :: Tree b. We can describet >>= kin two steps,fmapthenjoin, as follows. First thefmapapplies the substitution on every node label. So we obtain a tree where every node is labeled by another tree. For concreteness, sayk x = Node y [u1, u2, ...]:Then the
joinstep flattens the nested tree structure, concatenating the children from inside the label with those outside:To complete the
Monadinstance, note that we havepure x = Node x [].The try loop
Now that we have some intuition for the tree monad we can turn to your particular generator. We want to evaluate
try kabove, wherep = (== 'x')andg = elements "yx". I'm waving my hands here, but you should imagine thatgevaluates randomly to either the treeNode 'y' [](generate'y'with no shrinkings), aka.pure 'y', orNode 'x' [Node 'y' []](generate'x'and shrink to'y'; indeed, "elementsshrinks to the left"), and that every occurence ofgis independent from others, so we get a different result when we retry.Let's examine each case separately. What happens if
g = pure 'y'? Assumek <= 100so we're in theelsebranch of the toplevelif, already simplified away below:So all the times where
gevaluates topure 'y'end up simplified away as the recursive termtry (k + 1), and we are left with the cases wheregevaluates to the other treeNode 'x' [Node 'y' []]:As illustrated in the previous section, the monadic bind is equivalent to the following, and we finish with some equational reasoning.
In summary, starting from
try 0, with half probabilitytry k = try (k + 1), and with the other halftry k = Node 'x' [try (k + 1)], finally we stop attry 100. This explains the tree you observe.(I believe this also provides at least a partial answer to your other question, since this shows how shrinking a
Gen.filteroften amounts to rerunning the generator from scratch.)