problem formulation
Informally speaking, I want to write a function which, taking as input a function that generates binary factorizations and an element (usually neutral), creates an arbitrary length factorization generator. To be more specific, let us first define the function nfoldr in Clojure.
(defn nfoldr [f e]
(fn rec [n]
(fn [s]
(if (zero? n)
(if (empty? s) e)
(if (seq s)
(if-some [x ((rec (dec n)) (rest s))]
(f (list (first s) x))))))))
Here nil is used with the meaning "undefined output, input not in function's domain". Additionally, let us view the inverse relation of a function f as a set-valued function defining inv(f)(y) = {x | f(x) = y}.
I want to define a function nunfoldr such that inv(nfoldr(f , e)(n)) = nunfoldr(inv(f) , e)(n) when for every element y inv(f)(y) is finite, for each binary function f, element e and natural number n.
Moreover, I want the factorizations to be generated as lazily as possible, in a 2-dimensional sense of laziness. My goal is that, when getting some part of a factorization for the first time, there does not happen (much) computation needed for next parts or next factorizations. Similarly, when getting one factorization for the first time, there does not happen computation needed for next ones, whereas all the previous ones get in effect fully realized.
In an alternative formulation we can use the following longer version of nfoldr, which is equivalent to the shorter one when e is a neutral element.
(defn nfoldr [f e]
(fn [n]
(fn [s]
(if (zero? n)
(if (empty? s) e)
((fn rec [n]
(fn [s]
(if (= 1 n)
(if (and (seq s) (empty? (rest s))) (first s))
(if (seq s)
(if-some [x ((rec (dec n)) (rest s))]
(f (list (first s) x)))))))
n)))))
a special case
This problem is a generalization of the problem of generating partitions described in that question. Let us see how the old problem can be reduced to the current one. We have for every natural number n:
npt(n) = inv(nconcat(n)) = inv(nfoldr(concat2 , ())(n)) = nunfoldr(inv(concat2) , ())(n) = nunfoldr(pt2 , ())(n)
where:
npt(n)generatesn-ary partitionsnconcat(n)computesn-ary concatenationconcat2computes binary concatenationpt2generates binary partitions
So the following definitions give a solution to that problem.
(defn generate [step start]
(fn [x] (take-while some? (iterate step (start x)))))
(defn pt2-step [[x y]]
(if (seq y) (list (concat x (list (first y))) (rest y))))
(def pt2-start (partial list ()))
(def pt2 (generate pt2-step pt2-start))
(def npt (nunfoldr pt2 ()))
I will summarize my story of solving this problem, using the old one to create example runs, and conclude with some observations and proposals for extension.
solution 0
At first, I refined/generalized the approach I took for solving the old problem. Here I write my own versions of
concatandmapmainly for a better presentation and, in the case ofconcat, for some added laziness. Of course we can use Clojure's versions ormapcatinstead.In an attempt to get inner laziness we could replace
(partial partial cons)with something like(comp (partial partial concat) list). Although this way we get innerLazySeqs, we do not gain any effective laziness because, beforeconsing, most of the computation required for fully realizing therestpart takes place, something that seems unavoidable within this general approach. Based on the longer version ofnfoldr, we can also define the following faster version.Here I added a
printlncall inside the main recursive function to get some visualization of eagerness. So let us demonstrate the outer laziness and inner eagerness.solution 1
Then I thought of a more promising approach, using the function:
To get the new solution we replace the previous argument in the
map'call with:Trying to get inner laziness we could replace
(partial apply cons)with#(cons (first %) (lazy-seq (second %)))but this is not enough. The problem lies in the(every? seq s)test insidetranspose, where checking a lazy sequence of factorizations for emptiness (as a stopping condition) results in realizing it.solution 2
A first way to tackle the previous problem that came to my mind was to use some additional knowledge about the number of
n-ary factorizations of an element. This way we canrepeata certain number of times and use only this sequence for the stopping condition oftranspose. So we will replace the test insidetransposewith(seq (first s)), add an inputcounttonunfoldrand replace the argument in themap'call with:Let us turn to the problem of partitions and define:
Now we can demonstrate outer and inner laziness.
However, the dependence on additional knowledge and the extra computational cost make this solution unacceptable.
solution 3
Finally, I thought that in some crucial places I should use a kind of lazy sequences "with a non-lazy end", in order to be able to check for emptiness without realizing. An empty such sequence is just a non-lazy empty list and overall they behave somewhat like the
lazy-conss of the early days of Clojure. Using the definitions given below we can reach an acceptable solution, which works under the assumption that always at least one of theconcat'ed sequences (when there is one) is non-empty, something that holds in particular when every element has at least one binary factorization and we are using the longer version ofnunfoldr.Note that in this approach the input function
fshould produce lazy sequences of the new kind and the resultingn-ary factorizer will also produce such sequences. To take care of the new input requirement, for the problem of partitions we define:Once again, let us demonstrate outer and inner laziness.
To make the input and output use standard lazy sequences (sacrificing a bit of laziness), we can add:
observations and extensions
While creating solution 3, I observed that the old mechanism for lazy sequences in Clojure might not be necessarily inferior to the current one. With the transition, we gained the ability to create lazy sequences without any substantial computation taking place but lost the ability to check for emptiness without doing the computation needed to get one more element. Because both of these abilities can be important in some cases, it would be nice if a new mechanism was introduced, which would combine the advantages of the previous ones. Such a mechanism could use again an outer
LazySeqthunk, which when forced would return an empty list or aConsor anotherLazySeqor a newLazyConsthunk. This new thunk when forced would return aConsor perhaps anotherLazyCons. Nowempty?would force onlyLazySeqthunks whilefirstandrestwould also forceLazyCons. In this settingmapcould look like this:I have also noticed that the approach taken from solution 1 onwards lends itself to further generalization. If inside the argument in the
map'call in the longernunfoldrwe replaceconswithconcat,transposewith some implementation of Cartesian product andrepeatwith another recursive call, we can then create versions that "split at different places". For example, using the following as the argument we can define anunfoldmfunction that "splits in the middle" and corresponds to an easy-to-imaginenfoldm. Note that all "splitting strategies" are equivalent whenfis associative.Another natural modification would allow for infinite factorizations. To achieve this, if
fgenerated infinite factorizations,nunfoldr(f , e)(n)should generate the factorizations in an order of type ω, so that each one of them could be produced in finite time.Other possible extensions include dropping the
nparameter, creating relational folds (in correspondence with the relational unfolds we consider here) and generically handling algebraic data structures other than sequences as input/output. This book, which I have just discovered, seems to contain valuable relevant information, given in a categorical/relational language.Finally, to be able to do this kind of programming more conveniently, we could transfer it into a point-free, algebraic setting. This would require constructing considerable "extra machinery", in fact almost making a new language. This paper demonstrates such a language.