« Back to James's CS61A page. CS61A: Spring 2000

Week 12 Discussion/Lab Notes


Administrivia


See also Brian Denny's notes on programming languages and his overview of the MCE.


The Big Picture

Up till now, we've been discussing the structure of computer programs. From this week onward we'll be dealing primarily with the interpretation of computer programs.

Keep in mind that just like everything else in this course, this is all about abstraction.


The Evaluator as a Black Box

What is a program? A program is essentially a description of some desired computational process.

We can think of an evaluator as a black box that takes this description and some data and produces some result.

This black box doesn't need to be physical; we can implement it as another program!

universal function machine The Evaluator:
a program that evaluates the input program when applied to the input data

This is the pinnacle of data-directed programming. To the evaluator, the input program is just more data!

Some things to note:

The language of the input program and the language used to implement the black box are independent. In our case, Scheme will be the language for both, and hence our evaluator is metacircular.

Unfortunately, because both languages are Scheme, this makes it very easy to confuse input code with the implementation code. This is probably the biggest cause for confusion. You must make an effort to keep them straight! If you think about the input program as data to the evaluator, distinguishing between them will be easier. (See Level Confusion, below.)


Why?

Why should we spend time studying and building evaluators? There are a number of reasons:


Terminology

Special Forms
  • special evaluation rules
  • not procedures
  • not bound in the environment
  • not first-class; they can't be used as arguments
  • hard-coded into the evaluator*
Primitive Procedures
  • ordinary procedures (all arguments must be evaluated)
  • created using the underlying language
  • applied "by magic"
  • hard-coded into the evaluator
Compound Procedures
  • ordinary procedures
  • created using the meta language
  • compositions of primitives and special forms
  • user-defined

(* Technically, not all special forms need to be hard-coded into the evaluator. Special forms that are derived expressions of other special forms can be user-defined with macros. You don't need to worry about this though.)


Level Confusion

Suppose we are running the metacircular evaluator and we enter the expression (* (+ 1 2) 3). What is this expression from the below-the-line perspective? What does the underlying Scheme interpreter treat this as?

From the below-the-line perspective, things typed into the metacircular evaluator are either just atomic symbols or are lists of symbols. (Technically, symbols, numbers, booleans, strings, etc.) It's data. For example:

(* (+ 1 2) 3) as a list of symbols (and numbers)

Remember this! This is the key to distinguishing meta-Scheme from the underlying Scheme. As another example, if we type (lambda (x) (* x x)) into the metacircular evaluator, we are typing in just a list whose car is the symbol lambda, whose caddr is the list (x), etc..

To emphasize this point further, consider how to invoke mc-eval directly from the underlying Scheme system:

   (mc-eval '(* (+ 1 2) 3) the-global-environment)

Furthermore, procedures in meta-Scheme are tagged lists to the underlying Scheme. The underlying Scheme system cannot invoke these "meta-procedures" directly!


Quick Review of Environment Diagrams

The first two evaluation rules are the most important:

(By modifying these rules, we can change the behavior of the evaluator. For example, the rule about evaluating all the arguments before the procedure is called is necessary for applicative-order evaluation. If we get rid of it, we'll end up with a normal-order evaluator! Similarly, the rule about extending a procedure's environment is necessary for lexical scoping. Changing these rules will give us dynamic scoping.)


Below-the-line Representation

So given the environment diagramming rules, when we construct a meta-procedure, what do we need to store? They must have:

We can package these things together in a list. Meta-procedures are lists to the underlying Scheme system.

How can we implement environments? What is an environment? Recall that an environment is a chain--a sequence--of frames. The list data structure is perfect for this! Environments are lists of frames.

How about frames? Frames must store bindings. We can either maintain a list of names and a list of values, or we can have one big list of name-value pairs. The textbook uses the first method.


The core: mc-eval, mc-apply, driver-loop

mc-eval evaluates expressions by looking up variables, figuring out how to handle special forms, and by calling mc-apply on procedures.

Here's a simplified version of mc-eval without the complications from data abstraction. A lot of other details have been stripped away to give a better sense of the general structure:

   (define (mc-eval exp env)
     (cond ((number? exp) exp)
           <other self-evaluating cases>
           ((symbol? exp) <look up variable in env>)
           ((eq? (car exp) 'if)
            <use special evaluation rule for if>)
           ((eq? (car exp) 'lambda) <create a procedure>)
           <cases for other special forms>
           (else <invoke meta-procedure with mc-apply>) ))

mc-apply handles invocation of all meta-procedures. How do we invoke procedures? For primitive procedures we apply them "by magic." (Since primitives are created in the underlying Scheme system, the underlying system knows how to deal with them.) For compound procedures, again recall the environment diagramming rules:

  1. Evaluate all the arguments before calling the procedure
  2. Extend the environment of the procedure with a new frame
  3. Bind values to the parameters of the procedure
  4. Evaluate the body of the procedure

Here's a version of mc-apply with the details stripped away:

   (define (mc-apply procedure arguments)
     (cond ((primitive-procedure? procedure)
            <apply by magic>)
           ((compound-procedure? procedure)
            <extend the environment and bind values to parameters>
            <evaluate body with mc-eval>)
           (else <error, not a procedure>) ))

(Note that mc-eval handles the step about evaluating all the arguments before invoking a procedure.)


driver-loop merely provides a convenient interface to the evaluator by going through a read-evaluate-print loop. For example, we could invoke the evaluator explicitly:

   (user-print (mc-eval '(lambda (x) (* x x)) the-global-environment))

(Why do we need user-print?)


Things to ponder:


Scoping: Lexical vs. Dynamic

In a lexically-scoped language, each procedure has some notion of its defining environment--the environment in which it was created. Furthermore, whenever a procedure is called, that procedure's defining environment is extended.

Lexical scoping often is referred to as static scoping. This means that code can be analyzed by the evaluator at compile-time--before it is actually evaluated. From this kind of analysis, we can determine what each variable refers to, and we can determine what the defining environment to each procedure is. Thus, when we invoke a procedure, we can know ahead of time which environment to extend. (Remember, in a lexically-scoped language, a given procedure, when invoked, always extends the same environment.)

In the long run, this means that lexical languages lead to faster code. The more things we can determine ahead of time, the fewer things we need to compute when the program is actually run, and the more optimizations we can make.

In contrast, in a dynamically-scoped language, whenever a procedure is called, the current environemnt is extended. Therefore we don't care what a procedure's defining environment is. (We can thus represent a procedure as a single "bubble" instead of as a bubble-pair.)

With dynamic scope, variable bindings are determined at run-time. This means that we have to explicitly lookup variables while we're running the program. Furthermore, since dynamic scope always extends the current frame for procedure calls, we can't garbage collect frames until the program terminates. For example, consider the following sequence of expressions:

   (define x 3)

   (define (foo x)
     (bar))

   (define (bar) x)

   (foo 4)

Using lexical scope, when we call (foo 4), we know that the resulting frame is no longer necessary and can be thrown away, because no procedure can ever access foo's local variable x.

In contrast, using dynamic scope, we must keep foo's frame, because bar can access it. Thus, dynamically-scoped languages use more memory than lexically-scoped ones.

The lecture notes discuss some more of the advantages and disadvantages of each scoping behavior.


What would we need to change in the metacircular evaluator for it to use dynamic scoping instead of lexical?

The fundamental change to the MCE is to mc-apply. Rather than calling extend-environment on the procedure's defining environment, we call it on the current environment instead. Of course, since mc-apply doesn't have a pointer to the current environment, we'll have to change mc-eval so that when it calls mc-apply, it supplies the current environment as an additional argument.

Here is the modified code for dynamic scoping: dyn-mce.scm.

(Those are the minimum changes required for the MCE to use dynamic scoping. We can note, however, that with dynamic scope a lot of the existing code is no longer necessary. For example, rather than passing the environment as an argument to mc-eval (and the other eval- procedures), we can maintain a global pointer to the current environment instead. (Why can't we do this with lexical scoping?))


« Week 11 Weeks 13/14 »
« Back to James's CS61A page. CS61A: Spring 2000

Last Modified: Tuesday, 30-Dec-2014 11:58:34 PST