« Back to James's CS61A page. | CS61A: Spring 2000 |
CS61A is all about abstraction. In the past few weeks we've learned to abstract procedures, treating them as things that can be used as arguments to or as return values of other procedures. This week we're going to take this idea even further by learning about data abstraction, one of the most important and powerful topics of the course.
Data abstraction is application of meaning to data. We take some chunk of data and transform it into a concept. For example, think about text on a piece of paper. From the low-level perspective, the text is merely ink on a page. As we step back, we can see the text at a higher level of abstraction. We recognize that the ink represents letters. As we step back even further, we see that the letters comprise words, that the words comprise sentences, and that sentences convey ideas. Abstraction is a very powerful concept!
Applying abstraction to programming, let's consider the list
And it is because there are an infinite number of possibilities that data abstraction is necessary and useful. How do we know if we're trying to deal with a rational number, a complex number, or something else entirely?
To answer these questions, we must deal with the concepts. By working
at the conceptual level
To help us deal with abstract data, we use constructors and selectors to specify what we're doing. As their names imply, constructors create abstract data and selectors retrieve their components.
Our basic tool for creating compound data is cons. cons takes exactly two arguments and joins them together to create a pair structure.
There are other tools to create compound data such as list, but the list procedure really just chains together a bunch of cons's in a particular way.
(Question: There are predicates to test for pairs and lists: pair? and list?. Is every pair a list? is every list a pair? The answer to both is no. Why?)
Here's a simple example of creating some polygons using data abstraction:
(define (make-point x y) ;; a point constructor (list x y)) (define (x-coord point) ;; point selectors (car point)) (define (y-coord point) (cadr point)) (define make-polygon list) ;; polygon constructor (define (get-point polygon n) ;; polygon selector (list-ref polygon n)) (define a-square (make-polygon (make-point 0 0) (make-point 1 0) (make-point 1 1) (make-point 0 1) ))
Compare that to:
(define a-square (list (list 0 0) (list 1 0) (list 1 1) (list 0 1) ))
Which makes more intuitive sense and is easier to read?
Suppose we wanted to change our representation of a point from a list to a pair. What would we need to change? As long as we respect the data abstraction, we should only need to modify the constructors and selectors!
Two principal reasons for using data abstraction:
Respect the data abstraction. Do not mix constructors and selectors of different data types! Doing so is a data abstraction violation (DAV). Even if it works, it's very bad code. If the implementations of the abstract data ever are changed, DAVs will break your code. If you respect the data abstraction, you'll write code with fewer errors.
The above-the-line perspective is viewing something from an abstract point-of-view.
The below-the-line perspective is viewing something from an implementational point-of-view.
(See Figure 2.1 in SICP.)
So far we have seen a number of different data types. One is a sentence, another is a list. They look the same. What's the difference?
Sentences are implemented as lists, but they are nonetheless distinct data types. They don't have to be the same; we could change the representation of a sentence to be something else, if we so wanted.
Because they are distinct, we cannot use them interchangeably. We cannot use car and cdr on sentences, and we cannot use first, butfirst, etc. on lists. To do so would be a data abstraction violation!
|
Equivalent procedures for lists and sentences.
When writing and using procedures, you should think about what the contract of the procedure is. The contract consists of:
If you keep these in mind, you should be able to write code with fewer errors. For the parameters and return values, you should think about what types of values they are from the abstract point-of-view.
Be sure to read the Lecture Notes on box-and-pointer diagrams.
Exercise: What are the box-and-pointer diagrams for the following?
(1 2 3 4) (1 2 3 . 4) ((1 . 2) 3 4) ((1 . 2) 3 . 4) (())
Are the following possible? Why or why not?
(1 2 . 3 4) ((1 . 2) . (3 . 4))
I'll try go through these examples during discussion.
Dotted-tail notation allows us to create procedures that take a variable number of arguments. (See exercise 2.20 for instructions.) Remember that when using dotted-tail notation with recursion, a helper procedure is almost always necessary! Consider the following procedure that adds all its arguments (yes, + already does this, but let's pretend it doesn't):
(define (sum . args) (if (null? args) 0 (+ (car args) (sum (cdr args)) ) )) ;; WRONG
Remember that all the optional arguments first are encapsulated in a
list, in this case args. The problem is that
(define (sum . args) (define (sum-helper arg-list) (if (null? arg-list) 0 (+ (car arg-list) (sum-helper (cdr arg-list)) ) )) (sum-helper args) )
« Week 3 | Week 5 » |
« Back to James's CS61A page. | CS61A: Spring 2000 |
Last Modified: Tuesday, 30-Dec-2014 11:58:34 PST