I'm trying to follow Cormen - Algorithms, 3rd edition. Specifically, Chapter VII, 32 "String Matching". In general, I find this book extremely hard to follow, due to the abundance of math-speak and too theoretical, rather than practical, nature. That's why I only visit it selectively, at the algorithms I'm currently interested in.
I've been following the string chapter pretty well. It was not easy, but I managed to get most of the stuff in the Brute-force and Quick-hash algorithms... However, the second part, left me grasping for air, a bit. I have almost 0 background in automata theory, but I still managed to grasp the concept of the algorithm. However, one thing is unfathomable for me:
OK, so here everything seems to be pretty obvious. Automaton changes states depending on the previous state q and a character from text, by passing it into transition function small delta. But wait... What in the World is this transition function? Well, here it is:
OK, now, here, it is much less obvious of what is even going on. But my biggest problem is not in understanding how this particular piece of pseudocode works. I can't grasp the practicality of this process. How is it even possible to generate function at runtime in another function, and then return it like some kind of variable, so that calling function can execute it?
Can anybody give me directions of what basic things I need to understand in order to get the practical side of it? Maybe you can give some short and simple enough description of how this can be practically implemented?
My only guess, for now, is virtual machine... Theoretically, you can generate sequence of VM instructions inside a function, and then execute them, yes... But VMs have high overhead and this kind of defeats the whole purpose of fast string matching...
Another possible way, albeit, a bit crazy, is to actually generate native machine code instructions directly inside a function, during runtime. Then, write them to memory area allocated with PAGE_EXECUTE rights, and then just return the pointer to this function. Calling code can then just call the function, as any other hardcoded function. I have no idea if this is even possible or practical, though.


This is pseudocode. It does not need to be implementable in a "standard" programming language verbatim. If you look at the usage of the transition function, you will find that only three things are done with it:
That is, small delta is effectively just a map from Q x Sigma → Q. This is exactly what functions do in math. Pseudocode tends to use notation which is closer to math than programming languages.
In practice you would usually implement this as a 2-dimensional array (if you don't care about memory usage) or as a hash map (if you want to reduce memory usage at the expense of some CPU work).
A hardcoded map can be implemented using a bunch of nested switches just as well:
This tree of switches encodes delta(Q1, a) = Q1, delta(Q1, b) = Q2, delta(Q2, a) = Q2, delta(Q2, b) = Q2. It omits
defaultcases forcnot in {a, b} or q not in {Q1, Q2}.Parser generators will in practice often generate code like this - actual functions - for performance reasons. Note that this happens at compile time rather than run time.
If your function's code exists at "compile time" already, this is called a closure: The function's code doesn't change, only the context changes. These variables in the context - effectively local variables of parent scopes - are called upvalues. Creating a closure is effectively just allocating an implicit
structwhich holds these upvalues and a function pointer; roughly speaking, calling the closure calls the function pointer, passing thestructsuch that the closure can access its upvalues.Of course this closure, which is just a pointer to a struct or similar, can be passed around freely. Being able to treat functions as values - which enables fully dynamic dispatch - makes functions first class.
First-class closures are fundamental to lambda calculus and functional programming in general. Many scripting languages have also adopted first-class closures. Here is a standard "curried addition" example in Lua: