Stacks and scope¶
We considered a minimally useful stack machine in the preceding quiz, which is reproduced below for reference -
#lang racket
(define (top stack) (first stack))
(define (push val stack) (cons val stack))
(define (pop stack) (rest stack))
(define (stack-machine program stack)
(if (empty? program)
stack
(let ([instr (first program)])
(stack-machine
(rest program)
(process-instruction instr stack)))))
(define (process-instruction instr stack)
(cond
[(equal? instr '+)
(push (+ (top stack) (top (pop stack))) (pop (pop stack)))]
[(equal? instr '-)
(push (- (top stack) (top (pop stack))) (pop (pop stack)))]
[(equal? instr '*)
(push (* (top stack) (top (pop stack))) (pop (pop stack)))]
[(equal? instr '/)
(push (/ (top stack) (top (pop stack))) (pop (pop stack)))]
[(equal? instr 'dup)
(push (top stack) stack)]
[(equal? instr 'rot2)
(push (top (pop stack))
(push (top stack)
(pop (pop stack))))]
[(equal? instr 'rot3)
(push (top (pop stack))
(push (top (pop (pop stack)))
(push (top stack) (pop (pop (pop stack))))))]
[(equal? instr 'rot4)
(push (top (pop stack))
(push (top (pop (pop stack)))
(push (top (pop (pop (pop stack))))
(push (top stack)
(pop (pop (pop (pop stack))))))))]
[(equal? instr 'drop)
(pop stack)]
[(equal? instr 'sqrt)
(push (sqrt (top stack)) (pop stack))]
[(number? instr)
(push instr stack)]
[#t
(raise-argument-error 'do-instruction "Valid instruction" instr)]))
This machine lets us perform simple arithmetic calculations. For example,
'(dup * rot2 dup * + sqrt) is a program that can be used to compute the
distance from the origin to a point (x y) where the coordinates are
given on the stack. For example,
(stack-machine '(dup * rot2 dup * + sqrt) '(3 4))
; Produces '(5) as the result stack
While this program is understandable without much effort, it is not obvious
that the program '(dup dup * rot3 rot2 dup dup * rot3 * dup + + +)
computes the algebraic expression \((a^2 + 2ab + b^2)\), so that we can
transform it into the equivalent program '(+ dup *) – i.e.
\((a+b)^2\). Maybe if we work with such expressions enough, we’ll build
sufficient algebraic prowess to see how the longer expression can be reduced to
the shorter one.
Given that the mechanisms we don’t have prior familiarity with are the
dup and rotN family which juggle elements on the stack in
preparation for future operations, it is easy to see that if we can simply
name the elements on top of the stack, the program can become more comprehensible.
For example, if a and b stood for the top two elements of the stack,
the longer program above could be written as '(a a * 2 a b * * b b * + +),
which is much better for the human reader. Similarly, the distance formula
also can be written as '(a a * b b * + sqrt).
This version of the distance formula should look familiar!
Consider the expression we would’ve written in Racket –
(define (distance a b)
(sqrt (+ (* a a) (* b b))))
; Focus on the "sqrt" expression
(sqrt (+ (* a a) (* b b)))
; Remove all the parentheses
sqrt + * a a * b b
This is just the stack machine program written from right to left
order! For this reason, programs like the ones we wrote for the
stack-machine are said to be “postfix notation” while LiSP’s notation is
also called “prefix notation”. LiSP’s notation admits variadic functions
(functions which can take any number of arguments such as +) whereas
with the postfix notation the “arity” of an operator, or “words” as operators
are called in such languages, is in general fixed. “Arity” refers to the number
of arguments to a function or procedure.
Aside
Apart from Forth being the canonical “postfix notation language”, you perhaps pretty much use a postfix language on a daily basis without knowing it – PostScript and PDF files! Adobe’s PostScript (“Post” is there in the name for a reason) is actually a programming language for drawing. PDF , while advertised as a “portable document format”, is a compressed version of the drawing commands produced by a PostScript program. Apple (well, NeXT) also adapted Postscript for use in the NeXT OS for controlling the display, called Display PostScript , which enabled the OS to capture scalable on-screen vector graphics as PostScript or PDF files easily for print. As we saw, Postfix languages are very easy to write interpeters for and these turn out to be low resource processors that can be used in devices like printers.
Adding names to the stack-machine¶
So we’d like to be able to “bind” symbols to values picked from the
stack so we can recall them whenever we need their values. For this, we need a
kind of “dictionary” in which we can lookup values associated with symbols.
There is a Scheme function assoc that’ll do this for us -
(define alist (list (list 'one "ek")
(list 'two "do")
(list 'three "teen")
(list 'three "theen")))
(display (assoc 'three alist))
; Prints out (three "teen")
; Notice that only the first occurrence is returned.
(display (assoc 'four alist)
; Prints out #f to indicate "not found".
So let’s augment our stack machine with such a “dictionary” and
interpret “symbols” we find in the instruction stream to mean “lookup this
symbol in the dictionary and push the value you find on the top of the stack”.
We’ll call this dictionary “bindings” because it is a list of symbols bound to
values. We’ll also add a new “compound instruction” for popping off a value
from the stack and binding it to a symbol - as (def <symbol>).
; Since our stack-machine now has to consume a stack
; and a bindings list and produce new versions of those
; as a result, we'll group them into a simple struct
; we'll call "State".
(struct State (stack bindings))
(define (stack-machine program state)
(if (empty? program)
state
(let ([instr (first program)])
(stack-machine
(rest program)
(process-instruction instr state)))))
(define (process-instruction instr state)
(match state
[(State stack bindings)
(cond
[(equal? instr '+)
(State (push (+ (top stack) (top (pop stack))) (pop (pop stack))) bindings)]
[(equal? instr '-)
(State (push (- (top stack) (top (pop stack))) (pop (pop stack))) bindings)]
[(equal? instr '*)
(State (push (* (top stack) (top (pop stack))) (pop (pop stack))) bindings)]
[(equal? instr '/)
(State (push (/ (top stack) (top (pop stack))) (pop (pop stack))) bindings)]
[(equal? instr 'dup)
(State (push (top stack) stack) bindings)]
[(equal? instr 'rot2)
(State (push (top (pop stack))
(push (top stack)
(pop (pop stack))))
bindings)]
[(equal? instr 'rot3)
(State (push (top (pop stack))
(push (top (pop (pop stack)))
(push (top stack) (pop (pop (pop stack))))))
bindings)]
[(equal? instr 'rot4)
(State (push (top (pop stack))
(push (top (pop (pop stack)))
(push (top (pop (pop (pop stack))))
(push (top stack)
(pop (pop (pop (pop stack))))))))
bindings)]
[(equal? instr 'drop)
(State (pop stack) bindings)]
[(equal? instr 'sqrt)
(State (push (sqrt (top stack)) (pop stack)) bindings)]
[(number? instr)
(State (push instr stack) bindings)]
; Handle (def <symbol>) instruction
[(and (list? instr)
(not (empty? instr))
(equal? (first instr) 'def)
(symbol? (second instr)))
(State (pop stack) (cons (list (second instr) (top stack)) bindings))]
; Handle symbols that occur that we don't already know about
; as a "lookup operation"
[(symbol? instr)
(match (assoc instr bindings)
[(list sym value)
(State (push value stack) bindings)]
[#f (raise-argument-error 'process-instruction
"Defined symbol expected"
instr)])]
[#t
(raise-argument-error 'process-instruction "Valid instruction" instr)])]))
With these two additions, we can now express our “Euclidean distance” function
as ((def x) (def y) x x * y y * + sqrt). Note how this closely
resembles (lambda (x) (lambda (y) (sqrt (+ (* x x) (* y y))))).
Exercise
To make the resemblance to lambda even closer, modify the
implementation of the (def <symbol>) instruction to support multiple
symbols. The idea is to pull one value off the stack for each symbol and
bind it to the corresponding symbol. So encountering a (def x y)
instruction will cause our machine to pull the top two values from the
stack and bind them to x and y.
Blocks¶
Though we’ve been able to bind symbols to values and use them, our stack-machine programming language does not have the ability to reuse such calculations. For example, we’ll have to repeat the whole distance calculation code whenever we need to do it.
We can invent another type of value – the “block” – which contains a
list of instructions (a “program”) that we can store bound to a symbol and
“invoke” whenever we need. Surprisingly, this requires only a small change to
our stack-machine. We’ll also have to add a do instruction that will pop
a block off the top of the stack and run its program on the stack.
(struct Block (program))
(define (process-instruction instr state)
(match state
[(State stack bindings)
(cond
; <common-operators>
; ...
; </common-operators>
; A "block" compound instruction is given like (block dup + sqrt)
[(and (list? instr)
(not (empty? instr))
(equal? (first instr) 'block))
(State (push (Block (rest instr)) stack) bindings)]
; A "do" instruction will pop a Block value off the top of the
; stack and "run" it.
[(equal? instr 'do)
(match (top stack)
[(Block program)
(stack-machine program (State (pop stack) bindings))]
[_ (raise-argument-error 'process-instruction
"Block value on stack"
stack)])]
; Handle (def <symbol>) instruction
[(and (list? instr)
(not (empty? instr))
(equal? (first instr) 'def)
(symbol? (second instr)))
(State (pop stack) (cons (list (second instr) (top stack)) bindings))]
; Handle symbols that occur that we don't already know about
; as a "lookup operation"
[(symbol? instr)
(match (assoc instr bindings)
[(list sym value)
(State (push value stack) bindings)]
[#f (raise-argument-error 'process-instruction
"Defined symbol expected"
instr)])]
[#t
(raise-argument-error 'process-instruction "Valid instruction" instr)])]))
Now, we’re actually equipped to define a “euclidean distance” function in our stack-machine language!
(define program '( (block (def x) (def y) x x * y y * + sqrt)
(def distance)
3 4 distance ) )
(display-state (stack-machine program (State '() '())))
; Prints out (5) as the result stack.
Which programs are valid blocks?¶
The way we’ve implemented block execution, the final value of a block’s bindings will be available after block execution. So the following program will actually produce a value with our stack-machine.
(stack-machine '((block (def x) (def y) x x * y y * + sqrt)
(def distance)
3 4 distance do
x y +)
(State '() '()))
; will produce (7 5) as the result stack.
What does this program really mean? Why should the x y + part
of the program care what variable names the block implementing the
distance calculation uses internally?
Our “block” defines a “region of code” that we wish to be self contained. In other words, we want “what happens within the block, stays within the block” to hold, except for the effect it has on the stack. In yet more words, we want to throw away all symbol bindings done within the block once the block is done. We don’t want all our variables to be “global” and interfere with each other.
Note
Why is this required? Think about it before reading on.
Let’s first fix the problem we noted above, assuming global variables are “a bad idea”.
(define (process-instruction instr state)
(match state
[(State stack bindings)
(cond
; <common-operators>
; ...
; </common-operators>
; A "block" compound instruction is given like (block dup + sqrt)
[(and (list? instr)
(not (empty? instr))
(equal? (first instr) 'block))
(State (push (Block (rest instr)) stack) bindings)]
; A "do" instruction will pop a Block value off the top of the
; stack and "run" it.
[(equal? instr 'do)
(match (top stack)
[(Block program)
; <<<---->>>>
; We've modified this expression to consider only the stack
; as part of the result state of executing a block. We discard
; the bindings it produces and retain the original bindings list.
(State (State-stack (stack-machine program (State (pop stack) bindings)))
bindings)]
[_ (raise-argument-error 'process-instruction
"Block value on stack"
stack)])]
; Handle (def <symbol>) instruction
[(and (list? instr)
(not (empty? instr))
(equal? (first instr) 'def)
(symbol? (second instr)))
(State (pop stack) (cons (list (second instr) (top stack)) bindings))]
; Handle symbols that occur that we don't already know about
; as a "lookup operation"
[(symbol? instr)
(match (assoc instr bindings)
[(list sym value)
(State (push value stack) bindings)]
[#f (raise-argument-error 'process-instruction
"Defined symbol expected"
instr)])]
[#t
(raise-argument-error 'process-instruction "Valid instruction" instr)])]))
This version of the stack-machine rightly rejects the program we considered to
be erroneous. However we have not solved the problem completely. While we’ve
eliminated the “global variables only” meaning in our program, our blocks can still
reference variables that are meaningless in certain ways.
Dynamic scoping¶
Consider the program below –
(stack-machine '((block (def x) x x * y y * + sqrt)
(def distance)
(block 4 (def y) 3 distance do)
do)
(State '() '()))
Note that in this program, we’ve removed the (def y) within the block,
so the distance definition will only pop one value off the stack and
name it as x. Our use of y within the block is meaningless at the
point at which the block is being defined, because there is no guarantee that
it will become defined later on, and that could happen within another block
which accidentally uses a “local variable” y, which would interfere with
the reference to y within our distance block.
So, we want this program to also be treated as erroneous and fail rather than be given a spurious meaning.
Programming languages which give meaning to such programs are said to have
“dynamic scoping”. The word “dynamic” here refers to the fact that as
the program is running, the symbol y takes on different values and the
meaning being attributed by the interpreter to the y within the first
block is “whatever value y happens to have right now”.
That global variables are a bad idea is quite easily argued – two different parts of a large program accidentally using the same symbol to refer to different kinds of values should not cause the whole program to become invalid. The reason dynamic scoping is also “a bad idea” is less obvious.
Note
Think about why, before you read on.
In general, when we encapsulate some computation as a function for purposes of reuse, we want to be able to reason about the behaviour of the function without having to consider anything apart from the arguments supplied to it. If we’re able to do that, the task of ensuring the correctness of a large program is tractable – since we only have to validate each function based on the constraints of the functions that it relies on. We do not want to have to check how a function behaves in every context it is being invoked.
Fixing dynamic scoping¶
To fix the “dynamic scoping bug”, we need to clarify what exactly is the problem in the first place.
Note
Think about it before reading on. Why is the stray variable y
in our last example taking on an actual value when we’re invoking the block?
The set of bindings in effect when evaluating a particular instruction is called its “environment”. For a block, we therefore need to distinguish between two such “environments”.
The bindings in effect at the point we’re creating the “block value” (or “block
object” if you want) is its “definition environment”. This “block value”
refers to the Block type value we’re placing on the stack and therefore
the “definition environment” is the environment in effect when we’re creating
this Block type value.
The bindings in effect at the point we’re invoking the block is called its
“evaluation environment”. This is the environment in effect when we
evaluate distance do. The problem we currently have is that we’re not
distinguishing between these two environments. More specifically, we’re letting
the evaluation environment affect the inside of the block where the definition
environment is the one that’s supposed to be in effect. This is because the
definition environment is what lends meaning to the value of the symbols used
within the block and we don’t want the evaluation environment to be responsible
for that.
Note
Think about why it is the definition environment that lends meaning to the inside of a block. What consequences does it have when building large programs as a collection of small pieces of functionality?
As with many problems, identifying the problem is the major part of fixing it. In this case, because we only have one notion of environment, we need to store away the definition environment along with the block when we’re creating it, so that we can refer to it later at evaluation time.
(struct Block (program definition-time-bindings))
(define (process-instruction instr state)
(match state
[(State stack bindings)
(cond
; <common-operators>
; ...
; </common-operators>
; A "block" compound instruction is given like (block dup + sqrt)
[(and (list? instr)
(not (empty? instr))
(equal? (first instr) 'block))
(State (push (Block (rest instr) bindings) stack) bindings)]
; ^---- Note that we added this
; to store the definition time
; bindings when we're making the block.
; A "do" instruction will pop a Block value off the top of the
; stack and "run" it.
[(equal? instr 'do)
(match (top stack)
[(Block program definition-time-bindings)
; ^---- Now we pick up what we stored away
; at definition time.
; <<<---->>>>
; We've modified this expression to consider only the stack
; as part of the result state of executing a block. We discard
; the bindings it produces and retain the original
; definition-time-bindings that were captured when the block
; was created.
(State (State-stack (stack-machine program
(State (pop stack) definition-time-bindings)))
; ^---- Note this.
bindings)]
; ^------------ These bindings are unaffected by what happens
; when running the block.
[_ (raise-argument-error 'process-instruction
"Block value on stack"
stack)])]
; ...
; THE REST OF (def <symbol>) and (symbol? x) etc.
; ...
[#t
(raise-argument-error 'process-instruction "Valid instruction" instr)])]))
With this, we’ve dealt the final blow to dynamic scoping in our interpreter.
Our interpreter now properly implements “lexical scoping” – i.e. the
meaning of a particular symbol used is taken to be available in the region of
program text where it’s used. That’s an informal way of saying it though. We
usually imply that there is some region of code (usually delimited by some form
of brackets or pair of keywords like begin and end) which is the
intended “region of program text”.
Exercise
While we’ve “fixed” dynamic scoping, think about whether we’ve lost any useful ability along the way.
Exercise
Btw we’ve also gained another ability when we implemented proper lexical
scoping in stack-machine. Can you spot it? What possible ways to use
blocks would you try to exhaust some of these possibilities?