Rethinking CS101:
Or, How Robots Revolutionize Introductory Computer Programming

Lynn Andrea Stein

Artificial Intelligence Laboratory and Department of EECS
Massachusetts Institute of Technology
Cambridge, MA 02139
las@ai.mit.edu (now las@olin.edu)

Abstract

Introductory computer science education is entrenched in an outdated computational model. Although it corresponds neither to our computing environments nor to our work, we insist on teaching our introductory students computation-as-calculation, a mathematical problem-solving view of the role of the computer program. We can dramatically improve this situation -- and, as a corollary, all of undergraduate computer science -- by focusing on the kind of dynamic, interactive, inherently parallel computation that occurs in spreadsheets and video games, web applications and robots.

1. How we teach Intro CS now (and what's wrong with it).

Traditionally, computer science education has begun from the perspective of von Neumann serial computing.[1] We teach people the following model of computation: Begin with a question. Describe the answer in terms of the question. Programming is the process of writing down the sequence of calculations required to get from a particular instance of the question to the corresponding instance of the answer. Computation is the process of executing those steps -- the algorithm -- to deduce the answer to a particular question.

But this model of computation doesn't really correspond to the way that computation exists in the world at large. Most computation these days is not algorithmic question-answering in desktop boxes. Instead, most computation takes place in automobiles and in toaster ovens. It is a parallel, distributed, embedded, continuous, condition-monitoring, event- driven, ongoing, interactive process. It is computation as a living, breathing thing that exists and coexists in a dynamic continuous parallel world. Even the computation that does occur in traditional computers is largely of this sort -- it is spreadsheets and word processors and network access protocols, distributed databases and graphical visualization tools and computer games, rather than mathematical problem-solving per se.

The difference here can be summed up with a simple example. Traditional computer science education begins with

                    print ``hello world'';

-- a simple program which performs its job and exits. This new approach to computation advocates a different starting point:

		    while (TRUE) {
			echo();
		    }

-- an ongoing process which samples its environment and responds to it. Of course, there's much more to the story of computation-as-interaction than a simple keyboard echo, just as there's much more to traditional computation-as-calculation than printing ``Hello, world!''. But this simple example captures the spirit of the difference, if not all of its implications.

2. How we'd like to teach Intro CS.

Actually, this second model of computation -- the model which starts with interactive computational processes -- makes a compelling first introduction to computer science and to computer programming. The fundamental difference between this approach to computation and the one traditionally taught to our introductory students is the following: in intro CS now, students are allowed to assume that there is a single thread of execution and that it is under their control.[2]

This course changes the terms of the debate simply by beginning from the premise that there are multiple parallel threads of execution.

By starting with interactive computational processes, we make it possible to talk -- and think -- about the computational world in ways that students now don't experience until late in their undergraduate careers (or beyond). If we teach this model in the first course, we begin our students with the mindset that their work is part of a dynamically interacting system; their goal is to build a piece (or pieces) which coexist and cooperate and collaborate. We can also expose them to a wide variety of topics which are now relegated to upper-level and graduate courses.

In fact, most researchers in computer science would probably agree that this model of computation is at the heart of most of the activity at the frontiers of computer science today. Of course, once students have learned this broader model, they can easily be introduced to von Neumann serial computation as a (significant) special case. But it turns out that most of traditional computer science education follows at least as naturally from this altered introduction.

3. Why it has been too hard with conventional technology.

If this is such a reasonable idea, it might seem surprising that introductory computer science is not already taught this way.[3] One explanation is simply the lag between the time that an idea hits the research frontier and the time at which it is ready to be integrated into the lowest levels of our curriculum. An equally significant issue, however, has been the non-availability of simple models of embedded embodied computation.

Actually, conventional computer science curricula do currently teach this model of computation: in an upper level course, typically one on operating systems. Unfortunately, an operating system, while a wonderful example of this type of computation, is also an extremely complicated program and notoriously difficult for students to understand. One reason for this may be their prior training in serial computation; but the inherent complexity of an operating system, which is a software system whose behavior is largely unobservable to the eye, is in itself sufficient explanation of its difficulty. In any case, an operating system is obviously inappropriate for a first-term programming course; it is simply beyond the reach of most introductory computer science students.

An operating system is only the most commonly taught example of embedded computation. Many other consequences and implications of this approach are also treated in existing curricula, though disjointedly: distributed algorithms in a theoretical computer science class, transaction processing in databases, parallel processing in operating systems, interprocess communication models in programming languages, event-driven computation in a class on interfaces, multi-agent interaction in distributed AI. One implication of this course is to tie together all of these ideas and to present their simpler manifestations early in the curriculum. Students who begin here are likely to see all of these issues as pieces of a larger whole, and to have better appreciation for their complexities when revisiting them in these upper level courses. The difficulty is in finding a way to package up these issues and make them tangible for students who have never programmed a computer before.

4. How robots make everything easier.

All of this is simplified with the availability of simple, cheap, and easy-to-use robotics.[4] Robots have all of the properties of embedded embodied continuous computation. They are responsive to the world around them, usually monitoring several sensors and internal processes at once. They operate continuously; their controllers are often distributed and/or parallel, but invariably multi-threaded. However, unlike traditional examples of multi- threaded programs, the behavior of (simple) robot control programs is exposed: it is readily observable. This makes them ideal devices for conveying the complex concepts underlying this approach to computational processes.

Even better, robot control programs behave differently from run to run, responding to subtle differences in the environment or initial conditions. This means that multiple runs of a robot control program are likely to produce different behaviors and to test different aspects or limitations of the program. Unlike traditional introductory CS programming assignments, which are too often run once on a fixed set of test data, robot control programs are by their very nature generally tested on multiple heterogeneous sets of inputs. This gives the student programmer exposure to the software lifecycle, in which most coding is modification and maintenance of existing code, without requiring that their work be solely or largely modification of code that they have not written.

5. What the new course would look like.

Like every introductory programming class, this class must begin with the mechanics of program-writing, introducing basic data types and programming constructs. Rather than detailing the common points between this course and a traditional introductory programming class, this section highlights the ways in which the idea of interactive computation can form the backbone of an introductory programming class.

5.1 The Interactive Control Loop

The class begins with the idea of a responsive loop. An example is a light-following robot which reads a differential photo-receptor (Is there more light to the right or to the left?) and uses that value to drive the motors towards the brighter side. This robot can be implemented using a simple control loop:

		    while (TRUE) {
			if (left_brighter) {
			    drive_left();
			} else {
			    drive_right();
			}
		    }

(while(TRUE){echo();} is a non-robotic analog.) This program introduces the idea of continuous computation and responsivity to an environment. Basics of state-machine and dynamical system vocabularies can be concretely grounded in this example.

Martin and Resnick (1990; also Hogg, Martin, and Resnick, 1991), following Braitenberg (1984), describes an interesting alternative to the simple control loop implementation of the light-follower of the previous paragraph. Martin et al. built a set of ``Braitenberg bricks'' which embed digital logic into LEGO(TM) bricks. Using these, the light-follower can be implemented directly in continuous logic. This gives the student an alternative model by which to measure the performance of the software implementation. Indeed, the interactive control loop model is a discrete approximation to the continuous and inherently parallel performance of a logic circuit. Students given the chance to explore both models can only strengthen their understanding of this kind of interactive computation.

5.2 Modes of Teaching

A brief aside: like many of the projects throughout this course, the example program in the previous section can be approached in several ways. The first is as a traditional programming assignment: Build a robot control program which yields the appropriate behavior. By using a physical robot (or a very rich simulator), students are also given the opportunity to test their programs under different circumstances (e.g., varying ambient light conditions) and to see the importance of robust coding first-hand. Odds are, it will also give them the opportunity to revisit the programming of their assignments, revising and bulletproofing their code. This alleviates some of the program-once, run on test data culture of standard introductory programming classes.

But beyond the initial programming and revision, there are also opportunities to explore computation from other standpoints. For example, the programmed robot can be used for an observation-based laboratory -- recording measurements of the robot's behavior in a laboratory notebook, as is traditional in many scientific disciplines. (E.g., what does your robot do if the light source is intermittent?) Alternately, an assignment can involve building -- or simply comparing -- multiple implementations of the same behavior. Martin et al.'s Braitenberg bricks are one example of an alternative implementation; but even with a single set of computational hardware, multiple implementations (ranging from parameter variations to multiple control algorithms) provide an opportunity for students to compare and contrast behavior under varying circumstances.[5]

The traditional approach to introductory programming gives students few opportunities to observe the behavior of their code in a context other than debugging. Understanding the interactions between program and behavior is critical in many modern applications. A secondary goal of this course is to give students an appreciation for this aspect of coding.

5.3 Expanding the Central Control Loop: Differential Responses, Parallel Processes, State

The basic interactive control loop can be used to teach a variety of conditional constructs by investigating differential responses to varying inputs. For example, a robot might treat different colored light sources as different commands, e.g., using a pattern of lights as directions to navigate through a maze; or a robot equipped with left and right front bumpers might follow corridors; or the simplest example of all, a termination condition. This introduces a conditional dispatch statement embedded in the central control loop, providing the opportunity to explore -- and discuss the relative merits of -- a variety of control constructs (e.g., switch vs. cascaded ifs vs. explicit table dispatch vs. higher order procedures).

The different responses invoked by the central control loop can, of course, become arbitrarily complex. Each response can be encapsulated as a separate procedure, introducing the language of subroutines. These subroutines can grow to be semi- independent, with their own threads of control: parallel processes. These processes will sometimes need to interact, introducing issues of synchronization, shared variables, message passing, and other topics in interprocess communication. The topics here grow arbitrarily rich depending on time allotted.

An orthogonal direction for expansion of the basic interactive control loop involves maintaining internal state. For example, the corridor-following robot might navigate the edges of open space by a) bumping the wall to determine its presence b) steering away from the wall to make forward progress c) angling back towards the wall after a fixed period to re-establish its presence, looping back to (a). Different sensory inputs in different contexts take on different meanings. Examples here, too, can get arbitrarily complex. This topic naturally motivates exploration of the underlying finite state machine model.

5.4 Beyond the Central Control Loop

As the programs surrounding the central control loop become more and more complex -- and as they occupy their own threads of control -- the need for centralized control becomes less apparent. A natural evolution is to a fully decentralized model of control: a robot with several complex behaviors which interact locally to provide globally coherent behaviors, such as a randomized-exploration obstacle-avoiding forager. This robot has some global ``goals'' such as locating ``food'' and bringing it ``home'', but is largely driven by local responses: picking up food when in its presence and not currently carrying, turning away from obstacles, etc. Each of these behaviors ``wakes up'' when the appropriate conditions are met; no central module drives the robot. An architecturally similar example from a non- robotic domain might deal with a multi-threaded interactive graphical simulation or video game.

This kind of architecture provides an opportunity to explore formal or informal tradeoffs between centralized and distributed control models. At the distributed end, it introduces the need for an arbitration scheme (global vs. local, peer-to-peer vs. hierarchical, ...). The centralized model leads naturally to traditional scheduling issues as well as the ideas behind event-driven and interrupt-driven programming. A contrasting problem, such as a distributed sorting algorithm, exemplifies differences between homogeneous and heterogeneous parallelism.

5.5 Implications: Advanced Topics

As these last few topics indicate, there are a whole host of advanced issues raised by this model of computation. Perhaps the most basic surround the questions of design and decomposition for programs built along this model. Traditional functional decomposition is no longer an adequate tool; now the interrelationships between object and thread come to the fore. Although there are no completely adequate solutions to this problem, an exploration of heuristics for encapsulation and even conceptualization is an running theme throughout the course; examples range from dynamical systems analysis (which ties in particularly well with a control-theoretic curriculum) to traditional object-oriented software engineering techniques to the finite-state-machine decomposition found in the formal analysis of distributed systems. A related issue, critical to any version of this course which relies heavily on student programming, is that of parallel debugging and monitoring techniques.

Alternately, a course might choose to devote further attention to issues covered only briefly here, such as the details of implementation of synchronization mechanisms and reentrant code, or the concept of transaction safety and techniques for anticipatory concurrency control or recovery. User interface design or multithreaded graphics or details of web- walking (including everything from html to an understanding of network protocols) are logical extensions of the curriculum described here.

In short, this approach to introductory computer science and computer programming provides a logical introduction to the diversity of topics that make up our computer science curriculum today. Like current introductory programming courses, in can (and should) be customized to fit into the curriculum of each institution. Unlike traditional introductory courses, it provides the student with a more sophisticated (and more accurate) underlying model of computation. In the next section, the relationship of this introductory course to the larger CS curriculum will be explored in greater depth.

6. How it fits into existing CS curricula.

Although this approach to introductory CS seems intriguing, it's hardly likely to see widespread acceptance if it require a reworking of the entire undergraduate curriculum. Fortunately, it turns out that the new introductory course slides nicely into place without requiring much change at all; the upper level courses can continue as they are, but are likely to find their task simplified somewhat by the new perspective that students bring to them.

After all, this course is really still an introductory programming course. Its thematic lesson concerns a model of computation as interaction, rather than calculation. But its pragmatic goals include most of the skills that are learned in existing versions of introductory CS. As a concrete example, sorting algorithms absolutely have a place in the revised course; it is now, however, equally appropriate to discuss parallel quicksort as bubble sort. The fundamental lesson of this course remains how to take a description and construct a program whose behavior implements that description; the differences are in the underlying assumptions, the kinds of descriptions that can therefore be considered, and the corresponding conceptualizations used to build the program. The computational constructs and modeling tools have changed; the problem remains the programming.

The remainder of the curriculum which begins with an introduction to computation on these terms may look much like the existing computer science undergraduate curriculum, but with subtle differences. Several important topics that are currently covered only in advanced undergraduate or graduate level classes can be introduced earlier in the curriculum. For example, topics in distributed algorithms and parallel complexity -- such as the time/processor tradeoff -- can be taught in the first course in computer science theory if the model of parallel computation is already familiar. Since modern algorithms increasingly makes use of such approaches, it seems only natural to expose our undergraduates to the fundamental ideas in these areas.

Other topics, already present at the undergraduate level, become much easier to explain when students come equipped with this world view. Much of operating systems becomes an exploration of different methods for implementing and ensuring appropriate behavior of multi-processing, rather than focusing on the concept of parallel execution itself. Students seeing these ideas for the second time, now in depth, are more likely to appreciate some of the subtleties of the problem rather than being confused by the many levels at which operating system code must operate. Synchronization and interprocess communication can be introduced along with scheduling. Transaction-safety, remote procedure call, and shared memory models similarly follow smoothly from this approach.

7. Past, present, future.

Perhaps the most direct precursors to this work are the many outgrowths of the Logo project. (See, e.g., (Papert, 1980, Abelson & diSessa, 1981).) From the first physically instantiated turtles to later work on LEGO/Logo (Resnick & Ocko, 1990), the connections between introductory programming and a visibly manipulable environment have been at the heart of that project. However, nearly all of the logo work took place within an essentially sequential universe.

A notable exception is Resnick's (1988, 1992) work. In the earlier of these, he describes a project to introduce a genuinely multi-threaded logo to a small group of elementary school students already familiar with the traditional language. Although these students had already been trained in the sequential approach -- and, indeed, he documents many of their resulting misconceptions -- Resnick's overall approach in that project is very much in the spirit of the introductory CS course I describe here. Rather than continuing this line of exploration using medium-grained parallelism and individually customized processes, Resnick's later work moved in the direction of massive parallelism; his *Logo work remains an exciting contrast to the coarser-grained parallelism described here.

A different kind of precursor is Karel the robot (Pattis, 1981; Pattis, Roberts, and Stehlik, 1995). Pattis designed Karel as a motivational application environment for introductory programming students. As the popularity of the text (now in its second edition) attests, the metaphor of the robot is a compelling one, and the directly observable nature of its behavior is a significant aid to pedagogy. Nonetheless, Karel is a very different beast from any proposed here. It is implemented strictly in a very high-level (and static) simulation -- a grid-world with a handful of move and sense actions -- and serves largely to illustrate the behavior of very simple Pascal programs.

In contrast, this course is inspired by the dynamics of physical robots interacting with dynamic worlds. There is a large research literature on this topic, and the state-of-the-art is improving rapidly, to the point that inexpensive robots (in the low hundreds of dollars) are becoming increasingly available. The many articles in this volume will attest to the influence and efficacy of robotics as an inspirational approach.

My own experiences with the potential for this approach have largely involved outgrowths of the MIT 6.270 student-run robot-building contest conducted during independent activities period each January.[6] Building on technology developed for that activity, several versions of 6.270 have been taught to a wide variety of audiences nationally and internationally (e.g., AAAI, 1993).

This in turn has lead to the successful adoption of 6.270-based technologies in the robotics or artificial intelligence curricula of at least a dozen universities. Nor has the use of robotics technologies in the classroom been limited to AI; as other articles in this volume will attest, many aspects of engineering design have been enhanced through the use of robotics technology. Virtually all of these uses, however, presume a familiarity on the part of the students with the basic principles of programming. Today, that familiarity is likely to begin with the sequentialist mindset.

Unfortunately, the recent advances in the state of robotic technology have not brought us quite to the $300 PC-peripheral commodity robot (available at computer superstores near you...). In the absence of such technology, some institutions will have the luxury of using either slightly more expensive or somewhat less reliable hardware. Others will rely more heavily on simulations, which are also increasingly available and suitable.

But in the absence of satisfactory robotic solutions, this class is both viable and even desirable in the world of dynamic software environments. The world-wide web is an unprecedented resource for this kind of exploration. It makes accessible in a purely software environment much of the dynamicism and real-time nature of robotic systems. Beyond the network itself, on-line interactive multi-user environments provide potentially inspirational playgrounds for this course. Further, the web enables software (and even physical) resources from one site to be shared by others, enabling a new kind of shared pedagogy and communal curriculum development.

8. Acknowledgments

Early versions of many of these ideas were developed jointly with Jim Hendler of the University of Maryland at College Park. Beyond credit for the ideas, he deserves tremendous thanks for support, encouragement, and a sustained commitment which has been invaluable in their (and my) development. Although not a part of the formal development of these ideas, Hal Abelson, Rod Brooks, Eric Grimson, Gill Pratt, and Gerry Sussman have all been critical to their growth. Each in his own way has also been both a facilitator and an inspiration. All of my students, but especially Mark Torrance, Mike Wessler, and Holly Yanco, have not only sat through innumerable recountings of this story, but have actually stayed awake long enough to contribute to it. Finally, I am grateful to the staff and students of the AAAI '93 Robot Building Lab, the Robot Building Cooperative at the MIT Edgerton Center, and those who will brave the summer and fall '96 versions of this course.

This material is based upon work supported by the National Science Foundation under Young Investigator Award No. 93-57761. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

9. References

Abelson, Harold, and Andrea diSessa. Turtle Geometry: The Computer as a Medium for Exploring Mathematics. MIT Press. Cambridge. 1981.

Abelson, Harold, and Gerald J. Sussman, with Julie Sussman. Structure and Interpretation of Computer Programs (MIT Press: Cambridge, 1985). Computing Curricula 1991: Report of the ACM/IEEE�CS Joint Curriculum Task Force. ACM Press. New York. 1991.

AAAI Robot Building Laboratory Notes. Based on the 6.270 Robot Builder's Guide. 6.270 Organizers. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Cambridge, MA. 1993.

Connor, Brook, David Niguidula, and Andries van Dam. Object Oriented Programming in Pascal: A Graphical Approach. Addison Wesley, Inc. New York. 1995. Braitenberg, Valentino. Vehicles: Experiments in Synthetic Psychology . The MIT Press. Cambridge, MA. 1984.

Joseph L., and Anita M. Flynn. Mobile Robots: Inspiration to Implementation. A.K. Peters. Wellesley, MA. 1993.

Hogg, David W, Fred Martin, and Mitchel Resnick. ``Braitenberg Creatures.'' Epistemology and Learning Memo #13. MIT Media Laboratory. Cambridge, MA. 1991.

Callaghan, Victor, and Robert McCartney, eds. Computer Science Education Special Issue on Robotics in Computer Science and Engineering Education. Ablex. Norwood, NJ. To appear.

Donnett, Jim, and Tim Smithers. ``Lego Vehicles: A Technology for Studying Intelligent Systems.'' In From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior. Jean-Arcady Meyer and Stewart W. Wilson, eds. The MIT Press. Cambridge. 1991.

Martin, Fred. Circuits to Control: Learning Engineering by Designing LEGO Robots. Ph.D. Dissertation. Media Laboratory. Massachusetts Institute of Technology. Cambridge, MA. 1994.

Martin, Fred, and Mitchel Resnick. ``LEGO/Logo and Electronic Bricks: Creating a Scienceland for Children.'' In Advanced Educational Technologies for Mathematics and Science. David L. Ferguson, ed. Springer-Verlag. Berlin. 1993.

Papert, Seymour. Mindstorms: Children, Computers, and Powerful Ideas. Basic Books. 1980.

Pattis, Richard E. Karel the Robot: A Gentle Introduction to the Art of Programming. John Wiley & Sons, Inc. New York. 1981.

Pattis, Richard E., Jim Roberts, and Mark Stehlik. Karel the Robot: A Gentle Introduction to the Art of Programming. Second Edition. John Wiley & Sons, Inc. New York. 1995.

Resnick, Mitchel. MultiLogo: A Study of Children and Concurrent Programming. Sc.M. Thesis. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Cambridge, MA. 1988.

Resnick, Mitchel. Beyond the Centralized Mindset: Explorations in Massively Parallel Microworlds Ph.D. Thesis. Department of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Cambridge, MA. 1992.

Resnick, Mitchel, and Steven Ocko. ``LEGO/Logo: Learning through and about design.'' Epistemology and Learning Memo #8. MIT Media Laboratory. Cambridge, MA. 1990.

Roberts, Eric. The Art and Science of C . Addison Wesley. New York. 1995.

Notes

[1] I am in fact lumping together a wide range of approaches to introductory computation. (See, for example, (Abelson, Sussman, and Sussman, 1985; ACM, 1991; Connor, Niguidula, and van Dam, 1995; Roberts, 1995) for a variety of different course curricula.) Although there are many variants, with significant pedagogic and curricular distinctions, the single-thread-of-control sequentialist paradigm is virtually universal.

[2] This has not in fact been true for decades, since the advent of the timesharing system, but until sometime in the 1980s most applications programs subscribed to the pretense of a single thread. Increasingly, however, students laboring under this illusion must first be retrained to deal with the realities of embedded multi-threaded computation before they can appreciate the worlds of applications programming or advanced computer science.

[3] To be perfectly fair, many existing introductory courses do make an effort to include some indication of the multi-threaded nature of the universe. When done, this is generally accomplished through the inclusion of a unit -- on user interface, simulation, or explicit parallelism -- which extends the course's central sequentialist metaphor.

[4] See, for example, (Donnett and Smithers, 1991; Jones and Flynn, 1993; Martin, 1994). Many of the articles in this volume (Callaghan and McCartney, to appear) exploit this recent trend.

[5] This approach is common, e.g., when exploring sorting in an algorithms class, but is equally for this introductory laboratory. The need to vary part of the implementation without requiring a complete rewriting of the code motivates the crucial idea of modularity. Together with the dynamics of the robot environment and the resulting opportunity to continually modify code, this is of necessity an introduction to the software lifecycle.

[6] (See Martin, 1994.) This class packs a hands-on introduction to all aspects of engineering -- ranging from mechanical design to electronics to computation -- into a three-week period. It is a superb example of the best kind of hands-on educational practice, and the students who have designed and run it over the past decade deserve tremendous credit.

Back to CS101 Home Page

Back to Lynn Andrea Stein's Home Page

las@olin.edu

Rethinking CS101: Or, How Robots Revolutionize Introductory Computer Programming