Functions and scope (arithmetic track)¶
Thus far, we can construct basic arithmetic expressions in our language
and run them through our interpreter interp-v2
after desugaring
it using desugar-v2
as defined in the previous section.
Notice that instead of defining SubS
as a sugar term and expanding
it before we run it through our interpreter, we could’ve handled it within
our interpreter in terms, but in terms of our core terms like this -
(define (interp-v3 aexpr)
(match aexpr
[(SubC e1 e2)
(interp-v3 (AddC (interp-v3 e1) (MulC (NumC -1) (interp-v3 e2))))]
;...and so on for the others...
))
Here, we’re defining a new construct that has a pattern into which we’re substituting
the reductions of the corresponding “arguments” and reducing the result using our
other core expressions. In other words, we’re defining a function “(sub a b)” as a
“pattern” like this - (define (sub a b) (+ a (* -1 b)))
.
We’d obviously like to have this facility within our language as well! … so that the programmer can define their own functions in terms of core language features and reuse them.
Defining functions¶
Towards this, we’ll consider a function definition structure that captures the essence of a general enough function within our language.
(struct FunC (arg expr) #:transparent)
This structure captures what we need to specify a function. We’ll identify a function by its name, we’ll identify its argument (a.k.a. “formal parameter”) using a symbol and we’ll give an expression as the body of the function. In other words, we’re interested in functions that compute numeric results (via the interpreter).
In our language so far, we can only call our functions with numbers as values. However, we committed early on that all our “core expressions” will reduce to values. Which means we’ll need to define a value type for the values that “function terms” reduce to.
(struct FunV (argname expr) #:transparent)
This is no different from FunC
, but we’ll keep the struct different
so we can keep track of all the possible return values of our interpreter
functions. Here, argname
is expected to be a symbol and expr
is expected to be an expression.
Ok we have function expressions. Now we need to be able to apply them. So we need to capture that as an expression as well.
(struct ApplyC (fexpr vexpr) #:transparent)
“Application” is the process of taking a function value, associating its “formal parameters” with a value computed from a given value expression and evaluating the body of the function given this association.
Oh boy! We have a slew of notions at this point to capture. So let’s break that down.
First off, we need a way to reference the slots into which the actual argument value should be used within the function body. We’ll use the following for that –
(struct IdC (id) #:transparent)
… where the id
field is a symbol.
Let’s try to write our interpreter based on these. We’ll leave the desugaring as an exercise since it is all recursive processing of the abstract syntax tree.
(define (interp-v4 aexpr)
(match aexpr
[(FunC argname body)
(FunV argname body)]
[(ApplyC fexpr vexpr)
(let ([fval (interp-v4 fexpr)]
[vval (interp-v4 vexpr)])
; some how associate vval with
; the argname in the fval and
; call the interpreter on the body
)]
[(IdC id)
; Somehow lookup the current value of id
; and return the value that it is associated
; with
]
;...other terms...
))
So we see that we need a mechanism to associate ids with values that can be extended and passed through our interpreter as it is processing each term.
We call such an association an “environment” – i.e. an “environment” is (effectively) a set of associations between identifiers and values. Since an environment maps ids to values, we can model it using functions like this –
; The "empty environment" does not know about any identifier.
(define empty-env (λ (id)
(error 'env "Unknown identifier ~s" id)))
; Given an environment, we can lookup the value corresponding
; to an identifier by just calling it like a function.
(define (lookup env id) (env id))
; We an extend an environment to include an id by wrapping
; a given environment in an additional check for the new
; association.
(define (extend env id val)
(λ (id2)
(if (equal? id2 id)
val
(lookup env id2))))
With such an environment at hand, we can now define our unfinished interpreter like this -
(define (interp-v4 env aexpr)
(match aexpr
[(FunC argname body)
(FunV argname body)]
[(ApplyC fexpr vexpr)
(let ([fval (interp-v4 fexpr)]
[vval (interp-v4 vexpr)])
(interp-v4 (extend env (FunV-argname fval) vval)
(FunV-expr fval)))]
[(IdC id)
(lookup env id)]
[(AddC e1 e2)
(NumV (+ (NumV-n (interp-v4 env e1))
(NumV-n (interp-v4 env e2))))]
;...handle other terms...
))
Ok this is some language we’ve implemented certainly, but is it the one we want? – i.e. something that behaves like SMoL in this regard.
To understand what is lacking in this language, we need to understand what defines a valid versus invalid expression that may include functions. In order to produce a value as a result, the expression that we pass to our interpreter must not have any “free variables”. If it did, then when we get to those variables in the evaluation process, we’ll encounter (or at least we expect to encounter) an error.
Consider the following Racket expression -
((λ (f) (f (f 10)))
((λ (x) (λ (y) (+ y x)) 3)))
This expression has no free variables and is well formed according to lambda calculus. However, our interpreter will fail on it. Can you see why?
Lexical & dynamic environments¶
The env
argument in our interp-v4
function captures the
state of the environment at the point a term is being evaluated. To
evaluate some nested expressions, this environment may be extended,
such as when we’re “applying” a function to a value.
The value of the env
argument at the point of entry into the
interp
function is therefore called the “dynamic environment”,
because it is in the process of computing the final result, while the
computation is still not finished yet.
In particular when our interpreter is evaluating a FunC
term to produce
a FunV
value, the meaning of the identifiers used in the body of the
FunC
term are to be considered in relation to the dynamic environment in
which this term is being evaluated – i.e. at the point at which the
FunV
is being constructed. In this particular case, since the
meaning of the function is determined by the context in which it is being
created and not applied, this dynamic environment is also considered to
be the “lexical environment” of the function. Since the body of the function is
to be interpreted with only the additional fact of the binding for its
argument, we need to extend the lexical environment of a function when
computing an application, and not the dynamic environment at application time.
Without this “lexical environment”, the function cannot be interpreted correctly when applied. Therefore we need to keep this environment around.
The word “lexical” for this environment is used because this environment can be gleaned off the local (i.e. “lexical”) source code within which the function expression exists by considering all the identifiers defined in the enclosing expressions up to the top level … which is often a short way away from the point at which the function expression is given.
Since we need to capture this in our FunV
, we need to alter its
definition to be –
(struct FunV (lexenv argname expr) #:transparent)
… and based on that our interpreter needs to be modified to –
(define (interp-v5 env aexpr)
(match aexpr
[(FunC argname body)
; Note that we're storing away the dynamic environment
; at the point the function value is created, as its
; lexical environment.
(FunV env argname body)]
[(ApplyC fexpr vexpr)
(let ([fval (interp-v5 fexpr)]
[vval (interp-v5 vexpr)])
; Here, we have to extend the "lexical environment"
; of the function with the new binding and evaluate it,
; instead of extending the dynamic environment at the
; call point.
(match fval
[(FunV lexenv argname body)
(interp-v4 (extend lexenv argname vval) body)]
[_ (error "Not a function, so can't apply")]))]
[(IdC id)
(lookup env id)]
[(AddC e1 e2)
(NumV (+ (NumV-n (interp-v5 env e1))
(NumV-n (interp-v5 env e2))))]
;...handle other terms...
))
Exercise
Complete the interpreter and the corresponding desugar
function
and test the various cases you think might be problematic, to see
whether it performs correctly – i.e. works where it should and errors
out where it is faced with an invalid expression.
Exercise
Define a sugar expression (LetS id vexpr bodyexpr)
which binds the
given identifier to the value of the given expression within the bodyexpr.
You can define this in terms of FunC
and ApplyC
.
Exercise
Make the language more complete by adding support for boolean values,
logical operations and branching. Define the following new terms and their
behaviours in the interp
function and the desugar
function.
(struct BoolV (b) #:transparent) ; b = #t or #f
(struct TrueC () #:transparent)
(struct FalseC () #:transparent)
(struct AndC (e1 e2) #:transparent)
(struct OrC (e1 e2) #:transparent)
(struct NotC (e1) #:transparent)
(struct IfC (boolexpr thenexpr elseexpr) #:transparent)
The slots named e1
, e2
etc are expected to be
expressions (potential sugar terms as well), but which in this
context are valid only if they evaluate to boolean values.