« Back to James's CS61A page. CS61A: Spring 2000

Week 4 Discussion/Lab Notes


Administrivia


Data Abstraction

CS61A is all about abstraction. In the past few weeks we've learned to abstract procedures, treating them as things that can be used as arguments to or as return values of other procedures. This week we're going to take this idea even further by learning about data abstraction, one of the most important and powerful topics of the course.

Data abstraction is application of meaning to data. We take some chunk of data and transform it into a concept. For example, think about text on a piece of paper. From the low-level perspective, the text is merely ink on a page. As we step back, we can see the text at a higher level of abstraction. We recognize that the ink represents letters. As we step back even further, we see that the letters comprise words, that the words comprise sentences, and that sentences convey ideas. Abstraction is a very powerful concept!

Applying abstraction to programming, let's consider the list (1 2) as an example. This is some arbitrary piece of data. At a conceptual level, this might represent the rational number 1/2. Or it might represent the complex number 1+2i. There are an infinite number of possibilities!

And it is because there are an infinite number of possibilities that data abstraction is necessary and useful. How do we know if we're trying to deal with a rational number, a complex number, or something else entirely?

To answer these questions, we must deal with the concepts. By working at the conceptual level--the level of abstraction--we can concentrate on the main ideas of our programs without getting bogged down in details of implementation. We can use abstraction to simplify complicated programs.

To help us deal with abstract data, we use constructors and selectors to specify what we're doing. As their names imply, constructors create abstract data and selectors retrieve their components.


Implementing Compound Data

Our basic tool for creating compound data is cons. cons takes exactly two arguments and joins them together to create a pair structure.

There are other tools to create compound data such as list, but the list procedure really just chains together a bunch of cons's in a particular way.

(Question: There are predicates to test for pairs and lists: pair? and list?. Is every pair a list? is every list a pair? The answer to both is no. Why?)


Here's a simple example of creating some polygons using data abstraction:

   (define (make-point x y)        ;; a point constructor
     (list x y))

   (define (x-coord point)         ;; point selectors
     (car point))

   (define (y-coord point)
     (cadr point))

   (define make-polygon list)      ;; polygon constructor

   (define (get-point polygon n)   ;; polygon selector
     (list-ref polygon n))

   (define a-square (make-polygon (make-point 0 0)
                                  (make-point 1 0)
                                  (make-point 1 1)
                                  (make-point 0 1) ))

Compare that to:

   (define a-square (list (list 0 0)
                          (list 1 0)
                          (list 1 1)
                          (list 0 1) ))

Which makes more intuitive sense and is easier to read?

Suppose we wanted to change our representation of a point from a list to a pair. What would we need to change? As long as we respect the data abstraction, we should only need to modify the constructors and selectors!

Two principal reasons for using data abstraction:

Respect the data abstraction. Do not mix constructors and selectors of different data types! Doing so is a data abstraction violation (DAV). Even if it works, it's very bad code. If the implementations of the abstract data ever are changed, DAVs will break your code. If you respect the data abstraction, you'll write code with fewer errors.


Above-the-Line versus Below-the-Line

The above-the-line perspective is viewing something from an abstract point-of-view.

The below-the-line perspective is viewing something from an implementational point-of-view.

(See Figure 2.1 in SICP.)


Sentences versus Lists

So far we have seen a number of different data types. One is a sentence, another is a list. They look the same. What's the difference?

Sentences are implemented as lists, but they are nonetheless distinct data types. They don't have to be the same; we could change the representation of a sentence to be something else, if we so wanted.

Because they are distinct, we cannot use them interchangeably. We cannot use car and cdr on sentences, and we cannot use first, butfirst, etc. on lists. To do so would be a data abstraction violation!

list sentence
list
append
sentence
car first
cdr butfirst
cadr  
  last
  butlast
null? empty?

Equivalent procedures for lists and sentences.


The Contract

When writing and using procedures, you should think about what the contract of the procedure is. The contract consists of:

If you keep these in mind, you should be able to write code with fewer errors. For the parameters and return values, you should think about what types of values they are from the abstract point-of-view.


Box-and-Pointer Diagrams

Be sure to read the Lecture Notes on box-and-pointer diagrams.

Exercise: What are the box-and-pointer diagrams for the following?

   (1 2 3 4)
   (1 2 3 . 4)
   ((1 . 2) 3 4)
   ((1 . 2) 3 . 4)
   (())

Are the following possible? Why or why not?

   (1 2 . 3 4)
   ((1 . 2) . (3 . 4))

I'll try go through these examples during discussion.


Dotted-Tail Notation

Dotted-tail notation allows us to create procedures that take a variable number of arguments. (See exercise 2.20 for instructions.) Remember that when using dotted-tail notation with recursion, a helper procedure is almost always necessary! Consider the following procedure that adds all its arguments (yes, + already does this, but let's pretend it doesn't):

   (define (sum . args)
     (if (null? args)
         0
         (+ (car args)
            (sum (cdr args)) ) )) ;; WRONG

Remember that all the optional arguments first are encapsulated in a list, in this case args. The problem is that (cdr args) is already a list, and thus each recursive call produces nested lists. The correct way to do this is to use a helper procedure that explicitly takes a list as its argument:

   (define (sum . args)
     (define (sum-helper arg-list)
       (if (null? arg-list)
           0
           (+ (car arg-list)
              (sum-helper (cdr arg-list)) ) ))
     (sum-helper args) )

« Week 3 Week 5 »
« Back to James's CS61A page. CS61A: Spring 2000

Last Modified: Tuesday, 30-Dec-2014 11:58:34 PST