Me trying to grok The Little MLer again. TLMLer has this SML code
datatype a pizza = Bottom | Topping of (a * (a pizza))
datatype fish = Anchovy | Lox | Tuna
which I've translated as
data PizzaSh a = CrustSh | ToppingSh a (PizzaSh a)
data FishPSh = AnchovyPSh | LoxPSh | TunaPSh
and then an alternative closer to TLMLer perhaps
data PizzaSh2 a = CrustSh2 | ToppingSh2 (a, PizzaSh2 a)
And from each I create a pizza
fpizza1 = ToppingSh AnchovyPSh (ToppingSh TunaPSh (ToppingSh LoxPSh CrustSh))
fpizza2 = ToppingSh2 (AnchovyPSh, ToppingSh2 (LoxPSh, ToppingSh2 (TunaPSh, CrustSh2)))
respectively, which are of type PizzaSh FishPSh and PizzaSh2 FishPSh respectively.
But the second version (which is arguably the closer to the original ML version) seems "offbeat." It's as if I'm creating a 2-tuple when I "cons" toppings together where the second member recursively expands . May I assume the parametric data constructor "function" of PizzaSh2 doesn't literally build a tuple, it's just borrowing the tuple as a cons strategy, correct? Which is preferable in Haskell, PizzaSh or PizzaSh2? As I understand, a tuple (cartesian product) data type will have a single constructor, e.g., data Point a b = Pt a b, not the disjoint union of ored-together (|) constructors. In SML the "*" indicates product, i.e., tuple, but, again, is this just a "tuple-like thing," i.e., it's just a tuple-looking way to cons a pizza together?
In Haskell, we prefer this style:
There is no need in Haskell to use a tuple there, since a data constructor like
ToppingShcan take multiple arguments.Using an additional pair as in
creates a type which is almost isomorphic to the previous one, but is more cumbersome to handle since it requires to use more parentheses. E.g.
Further, the type is indeed only almost isomorphic. When using an additional pair, because of laziness, we have one more value which can be represented in the type: we have a correpondence
which breaks down in the case
That is,
ToppinggSh2can be applied to a non-terminating (or otherwise exceptional), pair-valued expression, and that constructs a value which can not be represented usingToppingSh.Operationally, to achieve that GHC uses a double indirection (roughly pointer-to-pointer, or thunk-returning-pair-of-thunks), which further slows down the code. Hence, it's also a bad choice from a performance point of view, if one cares about such micro-optimizations.