How to conceptualize a lexical scope tree when you can have hoisted nested functions?

57 Views Asked by At

I am working on a compiler in TypeScript and thinking a lot about lexical scope. In particular I'm wondering about how you handle the situation where you have hoisted functions, where variables can be undefined at one point, and then defined at another point. For example:

function a() {
  let d = 10
  b()
  return b
  let g = 40
  function b() {
    c()
    let f = 30
    return c
    function c() {
      console.log(d, e, f, g)
    }
  }
  let e = 20
}
const x = a()() // returns b, returns c
x() // log the variables

Here, what is the "lexical scope" inside the function c?

  • Is it all the variables it could possibly/potentially have access to (at some point) during possible code evaluation? (d, e, f, g)
  • Is it what it only will eventually be defined within its runtime context? (d, f)
  • Is it what won't be undefined and throw an error? (d)
  • Is it every combination of possible parents all at once (like a cartesian product sort of thing, times however many parents in the tree)?
  • etc.

At first, for a while (for many years), I thought of lexical scope as a simple tree. You had higher scopes and nested scopes, and each nested scope introduced new variables. Nice and clean, easy-peasy. But this, this throws it for a loop. Now it's like, focusing on c, there are steps of evaluation inside b, and at each step, a possible different parent scope may exist. Then a might have a suite of parent scopes too. So it is like a natural flower/tree that blossoms upward (which is hard to imagine in programming). There are more and more possible parents. What I'm thinking is, the lexical scope is the combination of every combination of parents on upward. But then how do you use this scope in a compiler?

Now I am confused. What am I supposed to use the lexical scope for in a compiler if I can't tell for a nested function what 1 scope definition it is? I assumed each nested function has 1 scope, but no, it's parents are vast and combinatory. You can only really tell what the scope is when it is bound ("binding", as opposed to "scope"). That is, at the specific step in code evaluation, what the value of the scope is.

So how do you use this information in a compiler? I can tell if a variable is in scope only seemingly if I evaluate/simulate the code running itself. Basically, how do I use this lexical scope in a compiler now with this confusion?

I was going to use it as the source of truth for what variables were defined in a nested lexical scope block. But now I can't, because I need to know at each step what the values of the variables are before I can know what the scope is. Or am I missing something?

Thinking of trying something like this:

type SiteContainerScopeType = {
  like: Site.ContainerScopeType
  parent?: SiteContainerScopeType
  children: Array<SiteContainerStepScopeType>
  declarations: Record<string, SitePropertyDeclarationType>
}

type SiteContainerStepScopeType = {
  like: Site.ContainerStepScope
  previous: SiteContainerStepScopeType
  context: SiteContainerScopeType
  declarations: Record<string, SitePropertyDeclarationType>
}
1

There are 1 best solutions below

0
coredump On

If I understand correctly the semantics described in Understanding Hoisting in JavaScript, the code can be reorganized as follows, ordering first declarations, then variable initialization and finally the rest of the expressions.

function a() {
  // DECLARATIONS
  let d
  let g
  let e

  function b() {
    let f

    function c() {
      console.log(d, e, f, g)
    }

    f = 30
    c()
    return c
  }

  // INITIALIZATION

  e = 20
  d = 10
  g = 40

  // EXPRESSIONS

  b()
  return b
}

When doing this, we find a more usual structure of having variables declared before being initialized and used.