AI Publications

Last update Sun Mar 19 05:05:02 2006

AIM-2005-037

Author[s]: Charles C. Kemp and Aaron Edsinger

Visual Tool Tip Detection and Position Estimation for Robotic Manipulation of Unknown Human Tools

November 16, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-037.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-037.pdf

Robots that use human tools could more easily work with people, perform tasks that are important to people, and benefit from human strategies for accomplishing these tasks. For a wide variety of tools and tasks, control of the tool's endpoint is sufficient for its use. In this paper we present a straightforward method for rapidly detecting the endpoint of an unmodeled tool and estimating its position with respect to the robot's hand. The robot rotates the tool while using optical flow to detect the most rapidly moving image points, and then finds the 3D position with respect to its hand that best explains these noisy 2D detections. The resulting 3D position estimate allows the robot to control the position of the tool endpoint and predict its visual location. We show successful results for this method using a humanoid robot with a variety of traditional tools, including a pen, a hammer, and pliers, as well as more general tools such as a bottle and the robot's own finger.

AIM-2005-036

CBCL-259

Author[s]: T. Serre, M. Kouh, C. Cadieu, U. Knoblich, G. Kreiman, T. Poggio

A theory of object recognition: computations and circuits in the feedforward path of the ventral stream in primate visual cortex

December 19, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-036.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-036.pdf

We describe a quantitative theory to account for the computations performed by the feedforward path of the ventral stream of visual cortex and the local circuits implementing them. We show that a model instantiating the theory is capable of performing recognition on datasets of complex images at the level of human observers in rapid categorization tasks. We also show that the theory is consistent with (and in some cases has predicted) several properties of neurons in V1, V4, IT and PFC. The theory seems sufficiently comprehensive, detailed and satisfactory to represent an interesting challenge for physiologists and modelers: either disprove its basic features or propose alternative theories of equivalent scope. The theory suggests a number of open questions for visual physiology and psychophysics.

AIM-2005-035

CBCL-258

Author[s]: Yuri Ivanov, Thomas Serre and Jacob Bouvrie

Confidence weighted classifier combination for multi-modal human identification

December 14, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-035.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-035.pdf

In this paper we describe a classifier combination technique used in a human identification system. The system integrates all available features from multi-modal sources within a Bayesian framework. The framework allows representing a class of popular classifier combination rules and methods within a single formalism. It relies on a “per-class” measure of confidence derived from the performance of each classifier on training data, which is shown to improve performance on a synthetic data set. The method is especially relevant in an autonomous surveillance setting, where varying time scales and missing features are a common occurrence. We show an application of this technique to a real-world surveillance database of video and audio recordings of people collected over several weeks in an office setting.

AIM-2005-034

Author[s]: Leonid Taycher, Gregory Shakhnarovich, David Demirdjian, and Trevor Darrell

Conditional Random People: Tracking Humans with CRFs and Grid Filters

December 1, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-034.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-034.pdf

We describe a state-space tracking approach based on a Conditional Random Field (CRF) model, where the observation potentials are learned from data. We find functions that embed both state and observation into a space where similarity corresponds to $L_1$ distance, and define an observation potential based on distance in this space. This potential is extremely fast to compute and in conjunction with a grid-filtering framework can be used to reduce a continuous state estimation problem to a discrete one. We show how a state temporal prior in the grid-filter can be computed in a manner similar to a sparse HMM, resulting in real-time system performance. The resulting system is used for human pose tracking in video sequences.
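
The following is a minimal sketch of one grid-filter step of the kind the abstract describes, not the memo's implementation. It assumes that the observation potential has the form exp(-d), where d is the $L_1$ distance between embeddings, and that the temporal prior is a sparse transition table. The names `grid_filter_step`, `transitions`, and `state_embeddings` are hypothetical.

```python
import math


def l1_distance(a, b):
    """L1 distance between two equal-length embedding vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def grid_filter_step(belief, transitions, state_embeddings, obs_embedding):
    """One predict/update step over a discrete grid of states.

    belief:           {state: probability}
    transitions:      {state: {next_state: probability}} (sparse, HMM-like)
    state_embeddings: {state: embedding vector}
    obs_embedding:    embedding vector of the current observation
    """
    # Predict: push probability mass through the sparse temporal prior.
    predicted = {}
    for state, p in belief.items():
        for nxt, t in transitions.get(state, {}).items():
            predicted[nxt] = predicted.get(nxt, 0.0) + p * t
    # Update: weight each grid state by the assumed potential exp(-L1).
    weighted = {
        s: p * math.exp(-l1_distance(state_embeddings[s], obs_embedding))
        for s, p in predicted.items()
    }
    total = sum(weighted.values())
    return {s: w / total for s, w in weighted.items()}
```

Because only states reachable through the sparse transition table are touched, the cost of a step scales with the number of nonzero transitions rather than with the square of the grid size.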

AIM-2005-033

Author[s]: Sanjoy Dasgupta, Adam Tauman Kalai, Claire Monteleoni

Analysis of Perceptron-Based Active Learning

November 17, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-033.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-033.pdf

We start by showing that in an active learning setting, the Perceptron algorithm needs $\Omega(\frac{1}{\epsilon^2})$ labels to learn linear separators within generalization error $\epsilon$. We then present a simple selective sampling algorithm for this problem, which combines a modification of the perceptron update with an adaptive filtering rule for deciding which points to query. For data distributed uniformly over the unit sphere, we show that our algorithm reaches generalization error $\epsilon$ after asking for just $\tilde{O}(d \log \frac{1}{\epsilon})$ labels. This exponential improvement over the usual sample complexity of supervised learning has previously been demonstrated only for the computationally more complex query-by-committee algorithm.
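
The following is a minimal sketch of a selective sampling loop in the spirit of the abstract, not the memo's exact algorithm. It assumes that the modified update is the reflection v ← v − 2(v·x)x, applied on mistakes, and that the filtering rule queries only points whose margin falls below a threshold that is halved after a run of correct predictions. The `patience` parameter, the initial threshold, and the random initialization are illustrative assumptions.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def random_unit_vector(d, rng):
    """Draw a point uniformly from the unit sphere in d dimensions."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def reflect_update(v, x):
    """Assumed modified perceptron step: v - 2 (v . x) x.

    For unit-length x this keeps the norm of v unchanged.
    """
    c = 2.0 * dot(v, x)
    return [vi - c * xi for vi, xi in zip(v, x)]


def active_perceptron(stream, oracle, d, threshold=1.0, patience=8, rng=None):
    """Return the learned direction and the number of labels requested."""
    rng = rng or random.Random(0)
    v = random_unit_vector(d, rng)
    labels_used = 0
    streak = 0
    for x in stream:
        margin = dot(v, x)
        if abs(margin) > threshold:
            continue  # confident prediction: do not spend a label
        y = oracle(x)
        labels_used += 1
        predicted = 1 if margin >= 0 else -1
        if predicted != y:
            v = reflect_update(v, x)
            streak = 0
        else:
            streak += 1
            if streak >= patience:
                threshold /= 2.0  # adaptive filter: narrow the query band
                streak = 0
    return v, labels_used
```

The threshold shrinks only while the current hypothesis keeps predicting queried points correctly, so labels are increasingly spent near the decision boundary.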

AIM-2005-032

Author[s]: Claire Monteleoni, Tommi Jaakkola

Online Learning of Non-stationary Sequences

November 17, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-032.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-032.pdf

We consider an online learning scenario in which the learner can make predictions on the basis of a fixed set of experts. We derive upper and lower relative loss bounds for a class of universal learning algorithms involving a switching dynamics over the choice of the experts. On the basis of the performance bounds we provide the optimal a priori discretization of the switching-rate parameter that governs the switching dynamics. We demonstrate the algorithm in the context of wireless networks.
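
A well-known algorithm with switching dynamics over experts is the Fixed-Share update of Herbster and Warmuth. The sketch below shows one round of it as an illustration of what a switching-rate parameter does. Whether it matches the memo's algorithms in detail is an assumption. The names `fixed_share_step`, `alpha`, and `eta` are illustrative.

```python
import math


def fixed_share_step(weights, losses, alpha, eta=1.0):
    """One round: exponential loss update, then share mass across experts.

    weights: current distribution over experts (sums to 1)
    losses:  loss of each expert this round
    alpha:   switching rate (probability of moving to another expert)
    eta:     learning rate of the loss update
    """
    n = len(weights)
    # Loss update: down-weight each expert by exp(-eta * loss), renormalize.
    updated = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(updated)
    updated = [u / total for u in updated]
    if n == 1:
        return updated
    # Switching dynamics: stay with probability 1 - alpha, otherwise move
    # uniformly to one of the other n - 1 experts.
    return [(1.0 - alpha) * u + alpha * (1.0 - u) / (n - 1) for u in updated]
```

With `alpha = 0` this reduces to the static exponential-weights update. Larger values of `alpha` let the learner recover faster when the best expert changes.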

AIM-2005-031

Author[s]: Alexandr Andoni and Piotr Indyk

New LSH-based Algorithm for Approximate Nearest Neighbor

November 3, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-031.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-031.pdf

We present an algorithm for the c-approximate nearest neighbor problem in a d-dimensional Euclidean space, achieving query time of O(dn^{1/c^2+o(1)}) and space O(dn + n^{1+1/c^2+o(1)}).

AIM-2005-030

CBCL-257

Author[s]: Ross Lippert and Ryan Rifkin

Asymptotics of Gaussian Regularized Least-Squares

October 20, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-030.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-030.pdf

We consider regularized least-squares (RLS) with a Gaussian kernel. We prove that if we let the Gaussian bandwidth $\sigma \rightarrow \infty$ while letting the regularization parameter $\lambda \rightarrow 0$, the RLS solution tends to a polynomial whose order is controlled by the relative rates of decay of $\frac{1}{\sigma^2}$ and $\lambda$: if $\lambda = \sigma^{-(2k+1)}$, then, as $\sigma \rightarrow \infty$, the RLS solution tends to the $k$th order polynomial with minimal empirical error. We illustrate the result with an example.
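
For reference, the objects named in the abstract can be written out as follows under one common convention. The memo's own normalization of $\sigma$ and $\lambda$ may differ, so the constants here are assumptions.

```latex
% One common convention for Gaussian RLS; the memo's normalization may differ.
\[
  K_\sigma(x, x') = \exp\!\left(-\frac{\lVert x - x' \rVert^2}{2\sigma^2}\right),
  \qquad
  f_{\lambda,\sigma} = \arg\min_{f \in \mathcal{H}_{K_\sigma}}
    \frac{1}{n}\sum_{i=1}^{n} \bigl(f(x_i) - y_i\bigr)^2
    + \lambda \lVert f \rVert_{K_\sigma}^2 .
\]
% Representer theorem: the minimizer is a kernel expansion over the training points.
\[
  f_{\lambda,\sigma}(x) = \sum_{i=1}^{n} c_i K_\sigma(x, x_i),
  \qquad
  c = (\mathbf{K} + n\lambda I)^{-1} y,
  \quad
  \mathbf{K}_{ij} = K_\sigma(x_i, x_j).
\]
```

The abstract's result concerns the limit of this solution as $\sigma \rightarrow \infty$ and $\lambda \rightarrow 0$ at coupled rates.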

AIM-2005-029

CBCL-256

Author[s]: Gadi Geiger & Domenic G Amara

Towards the Prevention of Dyslexia

October 18, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-029.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-029.pdf

Previous studies have shown that dyslexic individuals who supplement windowed reading practice with intensive small-scale hand-eye coordination tasks exhibit marked improvement in their reading skills. Here we examine whether similar hand-eye coordination activities, in the form of artwork performed by children in kindergarten, first and second grades, could reduce the number of students at-risk for reading problems. Our results suggest that daily hand-eye coordination activities significantly reduce the number of students at-risk. We believe that the effectiveness of these activities derives from their ability to prepare the students perceptually for reading.

AIM-2005-028

CBCL-255

Author[s]: Sanmay Das

Learning to Trade with Insider Information

October 7, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-028.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-028.pdf

This paper introduces algorithms for learning how to trade using insider (superior) information in Kyle's model of financial markets. Prior results in finance theory relied on the insider having perfect knowledge of the structure and parameters of the market. I show here that it is possible to learn the equilibrium trading strategy when its form is known even without knowledge of the parameters governing trading in the model. However, the rate of convergence to equilibrium is slow, and an approximate algorithm that does not converge to the equilibrium strategy achieves better utility when the horizon is limited. I analyze this approximate algorithm from the perspective of reinforcement learning and discuss the importance of domain knowledge in designing a successful learning algorithm.

AIM-2005-027

Author[s]: Georgios Theocharous, Sridhar Mahadevan, Leslie Pack Kaelbling

Spatial and Temporal Abstractions in POMDPs Applied to Robot Navigation

September 27, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-027.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-027.pdf

Partially observable Markov decision processes (POMDPs) are a well-studied paradigm for programming autonomous robots, where the robot sequentially chooses actions to achieve long-term goals efficiently. Unfortunately, for real-world robots and other similar domains, the uncertain outcomes of the actions and the fact that the true world state may not be completely observable make learning of models of the world extremely difficult, and using them algorithmically infeasible. In this paper we show that learning POMDP models and planning with them can become significantly easier when we incorporate into our algorithms the notions of spatial and temporal abstraction. We demonstrate the superiority of our algorithms by comparing them with previous flat approaches for large-scale robot navigation.

AIM-2005-026

Author[s]: Chris Stauffer

Automated Audio-visual Activity Analysis

September 20, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-026.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-026.pdf

Current computer vision techniques can effectively monitor gross activities in sparse environments. Unfortunately, visual stimulus is often not sufficient for reliably discriminating between many types of activity. In many cases where the visual information required for a particular task is extremely subtle or non-existent, there is often audio stimulus that is extremely salient for a particular classification or anomaly detection task. Unfortunately, unlike visual events, independent sounds are often very ambiguous and not sufficient to define useful events themselves. Without an effective method of learning causally-linked temporal sequences of sound events that are coupled to the visual events, these sound events are generally only useful for independent anomalous sound detection, e.g., detecting a gunshot or breaking glass. This paper outlines a method for automatically detecting a set of audio events and visual events in a particular environment, for determining statistical anomalies, for automatically clustering these detected events into meaningful clusters, and for learning salient temporal relationships between the audio and visual events. This results in a compact description of the different types of compound audio-visual events in an environment.

AIM-2005-025

Author[s]: Bryan C. Russell, Antonio Torralba, Kevin P. Murphy, and William T. Freeman

LabelMe: a database and web-based tool for image annotation

September 8, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-025.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-025.pdf

Research in object detection and recognition in cluttered scenes requires large image collections with ground truth labels. The labels should provide information about the object classes present in each image, as well as their shape and locations, and possibly other attributes such as pose. Such data is useful for testing, as well as for supervised learning. This project provides a web-based annotation tool that makes it easy to annotate images, and to instantly share such annotations with the community. This tool, plus an initial set of 10,000 images (3000 of which have been labeled), can be found at http://www.csail.mit.edu/~brussell/research/LabelMe/intro.html

AIM-2005-024

Author[s]: Whitman Richards

Collective Choice with Uncertain Domain Models

August 16, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-024.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-024.pdf

When groups of individuals make choices among several alternatives, the most compelling social outcome is the Condorcet winner, namely the alternative beating all others in a pair-wise contest. Obviously the Condorcet winner cannot be overturned if one sub-group proposes another alternative it happens to favor. However, in some cases, and especially with haphazard voting, there will be no clear unique winner, with the outcome consisting of a triple of pair-wise winners that each beat different subsets of the alternatives (i.e., a “top-cycle”). We explore the sensitivity of Condorcet winners to various perturbations in the voting process that lead to top-cycles. Surprisingly, variations in the number of votes for each alternative are much less important than consistency in a voter’s view of how alternatives are related. As more and more voters’ preference orderings on alternatives depart from a shared model of the domain, unique Condorcet outcomes become increasingly unlikely.
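
To make the definition concrete, here is a minimal sketch that finds a Condorcet winner from ranked ballots, or reports that none exists (as happens with a cycle). It assumes every ballot is a complete best-first ranking of the same alternatives. The function names are illustrative.

```python
def pairwise_wins(ballots, a, b):
    """Number of ballots that rank a above b (ballots are best-first lists)."""
    return sum(1 for ballot in ballots if ballot.index(a) < ballot.index(b))


def condorcet_winner(ballots):
    """Return the alternative that beats every other head-to-head, else None."""
    alternatives = sorted(set(ballots[0]))
    for candidate in alternatives:
        if all(
            pairwise_wins(ballots, candidate, other)
            > pairwise_wins(ballots, other, candidate)
            for other in alternatives
            if other != candidate
        ):
            return candidate
    return None  # no alternative wins all pair-wise contests
```

For the three ballots A>B>C, B>C>A, and C>A>B, each alternative loses one pair-wise contest, so the function returns `None`.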

AIM-2005-023

CBCL-254

Author[s]: Jerry Jun Yokono and Tomaso Poggio

Boosting a Biologically Inspired Local Descriptor for Geometry-free Face and Full Multi-view 3D Object Recognition

July 7, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-023.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-023.pdf

Object recognition systems relying on local descriptors are increasingly used because of their perceived robustness with respect to occlusions and to global geometrical deformations. Descriptors of this type -- based on a set of oriented Gaussian derivative filters -- are used in our recognition system. In this paper, we explore a multi-view 3D object recognition system that does not use explicit geometrical information. The basic idea is to find discriminant features to describe an object across different views. A boosting procedure is used to select features out of a large feature pool of local features collected from the positive training examples. We describe experiments on face images with excellent recognition rate.

AIM-2005-022

CBCL-253

Author[s]: Chou Hung, Gabriel Kreiman, Tomaso Poggio, James J. DiCarlo

Ultra-fast Object Recognition from Few Spikes

July 6, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-022.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-022.pdf

Understanding the complex brain computations leading to object recognition requires quantitatively characterizing the information represented in inferior temporal cortex (IT), the highest stage of the primate visual stream. A read-out technique based on a trainable classifier is used to characterize the neural coding of selectivity and invariance at the population level. The activity of very small populations of independently recorded IT neurons (~100 randomly selected cells) over very short time intervals (as small as 12.5 ms) contains surprisingly accurate and robust information about both object ‘identity’ and ‘category’, which is furthermore highly invariant to object position and scale. Significantly, selectivity and invariance are present even for novel objects, indicating that these properties arise from the intrinsic circuitry and do not require object-specific learning. Within the limits of the technique, there is no detectable difference in the latency or temporal resolution of the IT information supporting so-called ‘categorization’ (a.k.a. basic level) and ‘identification’ (a.k.a. subordinate level) tasks. Furthermore, ‘where’ information, in particular information about stimulus location and scale, can also be read out from the same small population of IT neurons. These results show how it is possible to decode invariant object information rapidly, accurately and robustly from a small population in IT and provide insights into the nature of the neural code for different kinds of object-related information.

AIM-2005-021

Author[s]: Ali Rahimi, Ben Recht, Trevor Darrell

Nonlinear Latent Variable Models for Video Sequences

June 6, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-021.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-021.pdf

Many high-dimensional time-varying signals can be modeled as a sequence of noisy nonlinear observations of a low-dimensional dynamical process. Given high-dimensional observations and a distribution describing the dynamical process, we present a computationally inexpensive approximate algorithm for estimating the inverse of this mapping. Once this mapping is learned, we can invert it to construct a generative model for the signals. Our algorithm can be thought of as learning a manifold of images by taking into account the dynamics underlying the low-dimensional representation of these images. It also serves as a nonlinear system identification procedure that estimates the inverse of the observation function in a nonlinear dynamic system. Our algorithm reduces to a generalized eigenvalue problem, so it does not suffer from the computational or local minimum issues traditionally associated with nonlinear system identification, allowing us to apply it to the problem of learning generative models for video sequences.

AIM-2005-020

Author[s]: Florent Segonne, Jean-Philippe Pons, Bruce Fischl, and Eric Grimson

A Novel Active Contour Framework. Multi-component Level Set Evolution under Topology Control

June 1, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-020.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-020.pdf

We present a novel framework to exert a topology control over a level set evolution. Level set methods offer several advantages over parametric active contours, in particular automated topological changes. In some applications, where some a priori knowledge of the target topology is available, topological changes may not be desirable. A method, based on the concept of simple point borrowed from digital topology, was recently proposed to achieve a strict topology preservation during a level set evolution. However, topologically constrained evolutions often generate topological barriers that lead to large geometric inconsistencies. We introduce a topologically controlled level set framework that greatly alleviates this problem. Unlike existing work, our method allows connected components to merge, split or vanish under some specific conditions that ensure that no topological defects are generated. We demonstrate the strength of our method on a wide range of numerical experiments.

AIM-2005-019

CBCL-252

Author[s]: Andrea Caponnetto, Lorenzo Rosasco, Ernesto De Vito and Alessandro Verri

Empirical Effective Dimension and Optimal Rates for Regularized Least Squares Algorithm

May 27, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-019.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-019.pdf

This paper presents an approach to model selection for regularized least-squares on reproducing kernel Hilbert spaces in the semi-supervised setting. The role of effective dimension was recently shown to be crucial in the definition of a rule for the choice of the regularization parameter, attaining asymptotically optimal performance in a minimax sense. The main goal of the present paper is to show how the effective dimension can be replaced by an empirical counterpart while conserving optimality. The empirical effective dimension can be computed from independent unlabelled samples. This makes the approach particularly appealing in the semi-supervised setting.

AIM-2005-018

CBCL-250

Author[s]: Andrea Caponnetto and Alexander Rakhlin

Some Properties of Empirical Risk Minimization over Donsker Classes

May 17, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-018.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-018.pdf

We study properties of algorithms which minimize (or almost minimize) empirical error over a Donsker class of functions. We show that the L2-diameter of the set of almost-minimizers is converging to zero in probability. Therefore, as the number of samples grows, it is becoming unlikely that adding a point (or a number of points) to the training set will result in a large jump (in L2 distance) to a new hypothesis. We also show that under some conditions the expected errors of the almost-minimizers are becoming close with a rate faster than n^{-1/2}.

AIM-2005-017

Author[s]: Thade Nahnsen, Ozlem Uzuner, Boris Katz

Lexical Chains and Sliding Locality Windows in Content-based Text Similarity Detection

May 19, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-017.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-017.pdf

We present a system to determine content similarity of documents. More specifically, our goal is to identify book chapters that are translations of the same original chapter; this task requires identification of not only the different topics in the documents but also the particular flow of these topics. We experiment with different representations employing n-grams of lexical chains and test these representations on a corpus of approximately 1000 chapters gathered from books with multiple parallel translations. Our representations include the cosine similarity of attribute vectors of n-grams of lexical chains, the cosine similarity of tf*idf-weighted keywords, and the cosine similarity of unweighted lexical chains (unigrams of lexical chains) as well as multiplicative combinations of the similarity measures produced by these approaches. Our results identify fourgrams of unordered lexical chains as a particularly useful representation for text similarity evaluation.
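
The following is a minimal sketch of the cosine similarity between n-gram count vectors, one of the measures named above. It assumes each document has already been reduced to a sequence of lexical-chain identifiers. The string tokens used here are hypothetical stand-ins for those identifiers, and the chain extraction itself is not shown.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token sequence, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def cosine_similarity(items_a, items_b):
    """Cosine of the angle between the count vectors of two item lists."""
    counts_a, counts_b = Counter(items_a), Counter(items_b)
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[k] * counts_b[k] for k in shared)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical chain sequences for two chapters.
chapter_a = ["sea", "ship", "storm", "ship"]
chapter_b = ["sea", "ship", "storm", "land"]
similarity = cosine_similarity(ngrams(chapter_a, 2), ngrams(chapter_b, 2))
```

Using n-grams rather than single chains makes the measure sensitive to the order in which topics occur, which is the "flow" the abstract refers to.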

AIM-2005-016

Author[s]: Christopher Taylor, Ali Rahimi, Jonathan Bachrach and Howard Shrobe

Simultaneous Localization, Calibration, and Tracking in an ad Hoc Sensor Network

April 26, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-016.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-016.pdf

We introduce Simultaneous Localization and Tracking (SLAT), the problem of tracking a target in a sensor network while simultaneously localizing and calibrating the nodes of the network. Our proposed solution, LaSLAT, is a Bayesian filter providing on-line probabilistic estimates of sensor locations and target tracks. It does not require globally accessible beacon signals or accurate ranging between the nodes. When applied to a network of 27 sensor nodes, our algorithm can localize the nodes to within one or two centimeters.

AIM-2005-015

CBCL-249

Author[s]: Ernesto De Vito and Andrea Caponnetto

Risk Bounds for Regularized Least-squares Algorithm with Operator-valued kernels

May 16, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-015.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-015.pdf

We show that recent results in [3] on risk bounds for regularized least-squares on reproducing kernel Hilbert spaces can be straightforwardly extended to the vector-valued regression setting. We first briefly introduce central concepts on operator-valued kernels. Then we show how risk bounds can be expressed in terms of a generalization of effective dimension.

AIM-2005-014

Author[s]: Jacob Eisenstein and Randall Davis

Gestural Cues for Sentence Segmentation

April 19, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-014.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-014.pdf

In human-human dialogues, face-to-face meetings are often preferred over phone conversations. One explanation is that non-verbal modalities such as gesture provide additional information, making communication more efficient and accurate. If so, computer processing of natural language could improve by attending to non-verbal modalities as well. We consider the problem of sentence segmentation, using hand-annotated gesture features to improve recognition. We find that gesture features correlate well with sentence boundaries, but that these features improve the overall performance of a language-only system only marginally. This finding is in line with previous research on this topic. We provide a regression analysis, revealing that for sentence boundary detection, the gestural features are largely redundant with the language model and pause features. This suggests that gestural features can still be useful when speech recognition is inaccurate.

AIM-2005-013

CBCL-248

Author[s]: Andrea Caponnetto and Ernesto De Vito

Fast Rates for Regularized Least-squares Algorithm

April 14, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-013.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-013.pdf

We develop a theoretical analysis of generalization performances of regularized least-squares on reproducing kernel Hilbert spaces for supervised learning. We show that the concept of effective dimension of an integral operator plays a central role in the definition of a criterion for the choice of the regularization parameter as a function of the number of samples. In fact, a minimax analysis is performed which shows asymptotic optimality of the above-mentioned criterion.

AIM-2005-012

Author[s]: Jacob Beal

Learning From Snapshot Examples

April 13, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-012.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-012.pdf

Examples are a powerful tool for teaching both humans and computers. In order to learn from examples, however, a student must first extract the examples from its stream of perception. Snapshot learning is a general approach to this problem, in which relevant samples of perception are used as examples. Learning from these examples can in turn improve the judgement of the snapshot mechanism, improving the quality of future examples. One way to implement snapshot learning is the Top-Cliff heuristic, which identifies relevant samples using a generalized notion of peaks. I apply snapshot learning with the Top-Cliff heuristic to solve a distributed learning problem and show that the resulting system learns rapidly and robustly, and can hallucinate useful examples in a perceptual stream from a teacherless system.

AIM-2005-011

Author[s]: Justin Werfel, Yaneer Bar-Yam, Radhika Nagpal

Construction by robot swarms using extended stigmergy

April 8, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-011.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-011.pdf

We describe a system in which simple, identical, autonomous robots assemble two-dimensional structures out of identical building blocks. We show that, in a system divided in this way into mobile units and structural units, giving the blocks limited communication abilities enables robots to have sufficient global structural knowledge to rapidly build elaborate pre-designed structures. In this way we extend the principle of stigmergy (storing information in the environment) used by social insects, by increasing the capabilities of the blocks that represent that environmental information. As a result, arbitrary solid structures can be built using a few fixed, local behaviors, without requiring construction to be planned out in detail.

AIM-2005-010

Author[s]: Kilian M. Pohl, John Fisher, W. Eric L. Grimson, William M. Wells

An Expectation Maximization Approach for Integrated Registration, Segmentation, and Intensity Correction

April 1, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-010.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-010.pdf

This paper presents a statistical framework which combines the registration of an atlas with the segmentation of MR images. We use an Expectation Maximization-based algorithm to find a solution within the model, which simultaneously estimates image inhomogeneities, anatomical labelmap, and a mapping from the atlas to the image space. An example of the approach is given for a brain structure-dependent affine mapping approach. The algorithm produces high quality segmentations for brain tissues as well as their substructures. We demonstrate the approach on a set of 30 brain MR images. In addition, we show that the approach performs better than similar methods which separate the registration from the segmentation problem.

AIM-2005-009

CBCL-247

Author[s]: Lior Wolf & Stanley Bileschi

Combining Variable Selection with Dimensionality Reduction

March 30, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-009.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-009.pdf

This paper bridges the gap between variable selection methods (e.g., Pearson coefficients, KS test) and dimensionality reduction algorithms (e.g., PCA, LDA). Variable selection algorithms encounter difficulties dealing with highly correlated data, since many features are similar in quality. Dimensionality reduction algorithms tend to combine all variables and cannot select a subset of significant variables. Our approach combines both methodologies by applying variable selection followed by dimensionality reduction. This combination makes sense only when using the same utility function in both stages, which we do. The resulting algorithm benefits from complex features as variable selection algorithms do, and at the same time enjoys the benefits of dimensionality reduction.

AIM-2005-008

Author[s]: Leonid Taycher, John W. Fisher III, and Trevor Darrell

Combining Object and Feature Dynamics in Probabilistic Tracking

March 2, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-008.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-008.pdf

Objects can exhibit different dynamics at different scales, a property that is often exploited by visual tracking algorithms. A local dynamic model is typically used to extract image features that are then used as inputs to a system for tracking the entire object using a global dynamic model. Approximate local dynamics may be brittle---point trackers drift due to image noise and adaptive background models adapt to foreground objects that become stationary---but constraints from the global model can make them more robust. We propose a probabilistic framework for incorporating global dynamics knowledge into the local feature extraction processes. A global tracking algorithm can be formulated as a generative model and used to predict feature values that influence the observation process of the feature extractor. We combine such models in a multichain graphical model framework. We show the utility of our framework for improving feature tracking and thus shape and motion estimates in a batch factorization algorithm. We also propose an approximate filtering algorithm appropriate for online applications, and demonstrate its application to problems such as background subtraction, structure from motion and articulated body tracking.

AIM-2005-007

Author[s]: Kristen Grauman and Trevor Darrell

Pyramid Match Kernels: Discriminative Classification with Sets of Image Features

March 17, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-007.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-007.pdf

Discriminative learning is challenging when examples are sets of local image features, and the sets vary in cardinality and lack any sort of meaningful ordering. Kernel-based classification methods can learn complex decision boundaries, but a kernel similarity measure for unordered set inputs must somehow solve for correspondences -- generally a computationally expensive task that becomes impractical for large set sizes. We present a new fast kernel function which maps unordered feature sets to multi-resolution histograms and computes a weighted histogram intersection in this space. This "pyramid match" computation is linear in the number of features, and it implicitly finds correspondences based on the finest resolution histogram cell where a matched pair first appears. Since the kernel does not penalize the presence of extra features, it is robust to clutter. We show the kernel function is positive-definite, making it valid for use in learning algorithms whose optimal solutions are guaranteed only for Mercer kernels. We demonstrate our algorithm on object recognition tasks and show it to be dramatically faster than current approaches.
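
The weighted histogram intersection described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional features in a fixed range, bin counts that halve at each level, and a weight of 1/2^i for matches that first appear at level i. The function name pyramid_match and its parameters are hypothetical, chosen for illustration only.

```python
import numpy as np

def pyramid_match(x, y, levels=4, lo=0.0, hi=1.0):
    """Unnormalized pyramid-match-style score between two 1-D feature sets.

    Assumptions (illustrative only): features lie in [lo, hi], the finest
    level has 2**(levels-1) bins, and new matches at level i get weight 1/2**i.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    score = 0.0
    prev_intersection = 0.0
    # Walk from the finest histogram (many bins) to the coarsest (one bin).
    for i in range(levels):
        bins = 2 ** (levels - 1 - i)
        hx, _ = np.histogram(x, bins=bins, range=(lo, hi))
        hy, _ = np.histogram(y, bins=bins, range=(lo, hi))
        # Histogram intersection counts the points matched at this resolution.
        intersection = float(np.minimum(hx, hy).sum())
        # Only matches that first appear at this level are counted, and they
        # are down-weighted because coarser cells imply looser correspondences.
        score += (intersection - prev_intersection) / (2 ** i)
        prev_intersection = intersection
    return score

if __name__ == "__main__":
    a = [0.1, 0.4, 0.9]
    b = [0.12, 0.6]
    print(pyramid_match(a, b, levels=3))
```

Each level costs one pass over the features plus one pass over the bins, which is where the linear-time behaviour mentioned in the abstract comes from. The extra point in the larger set never lowers the score, which mirrors the robustness-to-clutter claim.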

AIM-2005-006

CBCL-246

Author[s]: Benjamin Balas, Pawan Sinha

Receptive field structures for recognition

March 1, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-006.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-006.pdf

Localized operators, like Gabor wavelets and difference-of-Gaussian filters, are considered to be useful tools for image representation. This is due to their ability to form a ‘sparse code’ that can serve as a basis set for high-fidelity reconstruction of natural images. However, for many visual tasks, the more appropriate criterion of representational efficacy is ‘recognition’, rather than ‘reconstruction’. It is unclear whether simple local features provide the stability necessary to subserve robust recognition of complex objects. In this paper, we search the space of two-lobed differential operators for those that constitute a good representational code under recognition/discrimination criteria. We find that a novel operator, which we call the ‘dissociated dipole’, displays useful properties in this regard. We describe simple computational experiments to assess the merits of such dipoles relative to the more traditional local operators. The results suggest that non-local operators constitute a vocabulary that is stable across a range of image transformations.

AIM-2005-005

Author[s]: Josef Sivic, Bryan C. Russell, Alexei A. Efros, Andrew Zisserman, William T. Freeman

Discovering object categories in image collections

February 25, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-005.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-005.pdf

Given a set of images containing multiple object categories, we seek to discover those categories and their image locations without supervision. We achieve this using generative models from the statistical text literature: probabilistic Latent Semantic Analysis (pLSA), and Latent Dirichlet Allocation (LDA). In text analysis these are used to discover topics in a corpus using the bag-of-words document representation. Here we discover topics as object categories, so that an image containing instances of several categories is modelled as a mixture of topics. The models are applied to images by using a visual analogue of a word, formed by vector quantizing SIFT-like region descriptors. We investigate a set of increasingly demanding scenarios, starting with image sets containing only two object categories through to sets containing multiple categories (including airplanes, cars, faces, motorbikes, spotted cats) and background clutter. The object categories sample both intra-class and scale variation, and both the categories and their approximate spatial layout are found without supervision. We also demonstrate classification of unseen images and images containing multiple objects. Performance of the proposed unsupervised method is compared to the semi-supervised approach of Fergus et al.

AITR-2005-004

Author[s]: Ozlem Uzuner

Identifying Expression Fingerprints using Linguistic Information

November 16, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AITR-2005-004.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AITR-2005-004.pdf

This thesis presents a technology to complement taxation-based policy proposals aimed at addressing the digital copyright problem. The approach presented facilitates identification of intellectual property using expression fingerprints. Copyright law protects expression of content. Recognizing literary works for copyright protection requires identification of the expression of their content. The expression fingerprints described in this thesis use a novel set of linguistic features that capture both the content presented in documents and the manner of expression used in conveying this content. These fingerprints consist of both syntactic and semantic elements of language. Examples of the syntactic elements of expression include structures of embedding and embedded verb phrases. The semantic elements of expression consist of high-level, broad semantic categories. Syntactic and semantic elements of expression enable generation of models that correctly identify books and their paraphrases 82% of the time, providing a significant (approximately 18%) improvement over models that use tfidf-weighted keywords. The performance of models built with these features is also better than models created with standard features used in stylometry (e.g., function words), which yield an accuracy of 62%. In the non-digital world, copyright holders collect revenues by controlling distribution of their works. Current approaches to the digital copyright problem attempt to provide copyright holders with the same kind of control over distribution by employing Digital Rights Management (DRM) systems. However, DRM systems also enable copyright holders to control and limit fair use, to inhibit others' speech, and to collect private information about individual users of digital works. 
Digital tracking technologies enable alternate solutions to the digital copyright problem; some of these solutions can protect creative incentives of copyright holders in the absence of control over distribution of works. Expression fingerprints facilitate digital tracking even when literary works are DRM- and watermark-free, and even when they are paraphrased. As such, they enable metering popularity of works and make practicable solutions that encourage large-scale dissemination and unrestricted use of digital works and that protect the revenues of copyright holders, for example through taxation-based revenue collection and distribution systems, without imposing limits on distribution.

AIM-2005-004

Author[s]: Reina Riemann, Keith Winstein

Improving 802.11 Range with Forward Error Correction

February 24, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-004.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-004.pdf

The ISO/IEC 8802-11:1999(E) specification uses a 32-bit CRC for error detection and whole-packet retransmissions for recovery. In long-distance or high-interference links where the probability of a bit error is high, this strategy results in excessive losses, because any erroneous bit causes an entire packet to be discarded. By ignoring the CRC and adding redundancy to 802.11 payloads in software, we achieved substantially reduced loss rates on indoor and outdoor long-distance links and extended line-of-sight range outdoors by 70 percent.
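
The reasoning that any single erroneous bit causes a whole packet to be discarded can be made concrete with a small calculation. The sketch below is illustrative only: it assumes independent bit errors at a fixed bit error rate, and it models added redundancy as a hypothetical block code that corrects up to t bit errors per packet. Neither assumption comes from the memo, and the function names are made up for this example.

```python
from math import comb

def p_packet_ok(n_bits, ber):
    """Probability that all n_bits arrive intact (CRC-only delivery).

    Assumes independent bit errors with probability `ber` each.
    """
    return (1.0 - ber) ** n_bits

def p_block_ok(n_bits, ber, t):
    """Probability that at most t of n_bits are in error.

    Models a hypothetical code that corrects up to t bit errors per packet.
    """
    return sum(
        comb(n_bits, k) * ber ** k * (1.0 - ber) ** (n_bits - k)
        for k in range(t + 1)
    )

if __name__ == "__main__":
    n = 12000    # a 1500-byte payload, in bits
    ber = 1e-4   # illustrative bit error rate, not a measured value
    print("CRC only:          ", p_packet_ok(n, ber))
    print("corrects 5 errors: ", p_block_ok(n, ber, 5))
```

Under these assumptions the delivery probability without correction falls off exponentially in the packet length, while even a modest correction capability recovers most packets. That is the qualitative effect the abstract reports, although the actual gains depend on the real error process on the link.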

AITR-2005-003

Author[s]: Christopher J. Taylor

Simultaneous Localization and Tracking in Wireless Ad-hoc Sensor Networks

May 31, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AITR-2005-003.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AITR-2005-003.pdf

In this thesis we present LaSLAT, a sensor network algorithm that simultaneously localizes sensors, calibrates sensing hardware, and tracks unconstrained moving targets using only range measurements between the sensors and the target. LaSLAT is based on a Bayesian filter, which updates a probability distribution over the quantities of interest as measurements arrive. The algorithm is distributable, and requires only a constant amount of space with respect to the number of measurements incorporated. LaSLAT is easy to adapt to new types of hardware and new physical environments due to its use of intuitive probability distributions: one adaptation demonstrated in this thesis uses a mixture measurement model to detect and compensate for bad acoustic range measurements due to echoes. We also present results from a centralized Java implementation of LaSLAT on both two- and three-dimensional sensor networks in which ranges are obtained using the Cricket ranging system. LaSLAT is able to localize sensors to within several centimeters of their ground truth positions while recovering a range measurement bias for each sensor and the complete trajectory of the mobile.

AIM-2005-003

Author[s]: Gerald Jay Sussman and Jack Wisdom

Functional Differential Geometry

February 2, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-003.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-003.pdf

Differential geometry is deceptively simple. It is surprisingly easy to get the right answer with unclear and informal symbol manipulation. To address this problem we use computer programs to communicate a precise understanding of the computations in differential geometry. Expressing the methods of differential geometry in a computer language forces them to be unambiguous and computationally effective. The task of formulating a method as a computer-executable program and debugging that program is a powerful exercise in the learning process. Also, once formalized procedurally, a mathematical idea becomes a tool that can be used directly to compute results.

AITR-2005-002

CBCL-251

Author[s]: Jia Jane Wu

Comparing Visual Features for Morphing Based Recognition

May 25, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AITR-2005-002.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AITR-2005-002.pdf

This thesis presents a method of object classification using the idea of deformable shape matching. Three types of visual features, geometric blur, C1 and SIFT, are used to generate feature descriptors. These feature descriptors are then used to find point correspondences between pairs of images. Various morphable models are created by small subsets of these correspondences using thin-plate spline. Given these morphs, a simple algorithm, least median of squares (LMEDS), is used to find the best morph. A scoring metric, using both LMEDS and distance transform, is used to classify test images based on a nearest neighbor algorithm. We perform the experiments on the Caltech 101 dataset [5]. To ease computation, for each test image, a shortlist is created containing 10 of the most likely candidates. We were unable to duplicate the performance of [1] in the shortlist stage because we did not use hand-segmentation to extract objects for our training images. However, our gain from the shortlist to correspondence stage is comparable to theirs. In our experiments, we improved from 21% to 28% (gain of 33%), while [1] improved from 41% to 48% (gain of 17%). We find that using a non-shape-based approach, C2 [14], the overall classification rate of 33.61% is higher than all of the shape-based methods tested in our experiments.

AIM-2005-002

CBCL-244

Author[s]: Benjamin Balas

Using computational models to study texture representations in the human visual system.

February 7, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-002.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-002.pdf

Traditionally, human texture perception has been studied using artificial textures made of random-dot patterns or abstract structured elements. At the same time, computer algorithms for the synthesis of natural textures have improved dramatically. The current study seeks to unify these two fields of research through a psychophysical assessment of a particular computational model, thus providing a sense of what image statistics are most vital for representing a range of natural textures. We employ Portilla and Simoncelli’s 2000 model of texture synthesis for this task (a parametric model of analysis and synthesis designed to mimic computations carried out by the human visual system). We find an intriguing interaction between texture type (periodic v. structured) and image statistics (autocorrelation function and filter magnitude correlations), suggesting different processing strategies may be employed for these two texture families under pre-attentive viewing.

AITR-2005-001

Author[s]: Attila Kondacs

Determining articulator configuration in voiced stop consonants by matching time-domain patterns in pitch periods

January 28, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AITR-2005-001.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AITR-2005-001.pdf

In this thesis I will be concerned with linking the observed speech signal to the configuration of articulators. Due to the potentially rapid motion of the articulators, the speech signal can be highly non-stationary. The typical linear analysis techniques that assume quasi-stationarity may not have sufficient time-frequency resolution to determine the place of articulation. I argue that the traditional low and high-level primitives of speech processing, frequency and phonemes, are inadequate and should be replaced by a representation with three layers: 1. short pitch period resonances and other spatio-temporal patterns 2. articulator configuration trajectories 3. syllables. The patterns indicate articulator configuration trajectories (how the tongue, jaws, etc. are moving), which are interpreted as syllables and words. My patterns are an alternative to frequency. I use short time-domain features of the sound waveform, which can be extracted from each vowel pitch period pattern, to identify the positions of the articulators with high reliability. These features are important because by capitalizing on detailed measurements within a single pitch period, the rapid articulator movements can be tracked. No linear signal processing approach can achieve the combination of sensitivity to short term changes and measurement accuracy resulting from these nonlinear techniques. The measurements I use are neurophysiologically plausible: the auditory system could be using similar methods. I have demonstrated this approach by constructing a robust technique for categorizing the English voiced stops as the consonants B, D, or G based on the vocalic portions of their releases. The classification recognizes 93.5%, 81.8% and 86.1% of the b, d and g to ae transitions with false positive rates 2.9%, 8.7% and 2.6% respectively.

AIM-2005-001

Author[s]: Jacob Beal, Gerald Sussman

Biologically-Inspired Robust Spatial Programming

January 18, 2005

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-001.ps

ftp://publications.ai.mit.edu/ai-publications/2005/AIM-2005-001.pdf

Inspired by the robustness and flexibility of biological systems, we are developing linguistic and programming tools to allow us to program spatial systems populated by vast numbers of unreliable components interconnected in unknown, irregular, and time-varying ways. We organize our computations around geometry, making the fact that our system is made up of discrete individuals implicit. Geometry allows us to specify requirements in terms of the behavior of the space occupied by the aggregate rather than the behavior of individuals, thereby decreasing complexity. So we describe the behavior of space explicitly, abstracting away the discrete nature of the components. As an example, we present the Amorphous Medium Language, which describes behavior in terms of homeostatic maintenance of constraints on nested regions of space.

AIM-2004-031

CBCL-245

Author[s]: Minjoon Kouh and Tomaso Poggio

A general mechanism for tuning: Gain control circuits and synapses underlie tuning of cortical neurons

December 31, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-031.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-031.pdf

Tuning to an optimal stimulus is a widespread property of neurons in cortex. We propose that such tuning is a consequence of normalization or gain control circuits. We also present a biologically plausible neural circuitry of tuning.

AIM-2004-030

Author[s]: Percy Liang and Nathan Srebro

Methods and Experiments With Bounded Tree-width Markov Networks

December 30, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-030.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-030.pdf

Markov trees generalize naturally to bounded tree-width Markov networks, on which exact computations can still be done efficiently. However, learning the maximum likelihood Markov network with tree-width greater than 1 is NP-hard, so we discuss a few algorithms for approximating the optimal Markov network. We present a set of methods for training a density estimator. Each method is specified by three arguments: tree-width, model scoring metric (maximum likelihood or minimum description length), and model representation (using one joint distribution or several class-conditional distributions). On these methods, we give empirical results on density estimation and classification tasks and explore the implications of these arguments.

AIM-2004-029

Author[s]: Whitman Richards & H. Sebastian Seung

Neural Voting Machines

December 31, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-029.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-029.pdf

“Winner-take-all” networks typically pick as winners that alternative with the largest excitatory input. This choice is far from optimal when there is uncertainty in the strength of the inputs, and when information is available about how alternatives may be related. In the Social Choice community, many other procedures will yield more robust winners. The Borda Count and the pair-wise Condorcet tally are among the most favored. Their implementations are simple modifications of classical recurrent networks.
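
The contrast between picking the alternative with the largest direct support and using a Borda Count can be shown with a short tally. This sketch only illustrates the voting rules themselves, not the recurrent network implementations discussed in the memo. The ballots and function names are invented for the example.

```python
from collections import Counter

def plurality_winner(ballots):
    """Winner-take-all analogue: the alternative ranked first most often."""
    return Counter(ballot[0] for ballot in ballots).most_common(1)[0][0]

def borda_scores(ballots):
    """Borda Count: with n alternatives, rank r (0-based) earns n - 1 - r points."""
    scores = Counter()
    for ballot in ballots:
        n = len(ballot)
        for rank, alternative in enumerate(ballot):
            scores[alternative] += n - 1 - rank
    return scores

def borda_winner(ballots):
    scores = borda_scores(ballots)
    return max(scores, key=scores.get)

if __name__ == "__main__":
    # Hypothetical preference orderings, most preferred first.
    ballots = (
        [["A", "B", "C"]] * 4
        + [["B", "C", "A"]] * 3
        + [["C", "B", "A"]] * 2
    )
    print("plurality:", plurality_winner(ballots))
    print("borda:    ", borda_winner(ballots), dict(borda_scores(ballots)))
```

In this example A has the most first-place support, yet B wins the Borda Count and also beats both A and C in pairwise comparisons. This illustrates how using information about the relations among alternatives can change the winner.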

AIM-2004-028

Author[s]: Luis Perez-Breva

Cascading Regularized Classifiers

April 21, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-028.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-028.pdf

Among the various methods to combine classifiers, Boosting was originally conceived as a stratagem to cascade pairs of classifiers through their disagreement. I recover the same idea from the work of Niyogi et al. to show how to loosen the requirement of weak learnability, central to Boosting, and introduce a new cascading stratagem. The paper concludes with an empirical study of an implementation of the cascade that, under assumptions that mirror the conditions imposed by Viola and Jones in [VJ01], preserves the generalization ability of boosting.

AIM-2004-027

Author[s]: Kristen Grauman and Trevor Darrell

Efficient Image Matching with Distributions of Local Invariant Features

November 22, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-027.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-027.pdf

Sets of local features that are invariant to common image transformations are an effective representation to use when comparing images; current methods typically judge feature sets' similarity via a voting scheme (which ignores co-occurrence statistics) or by comparing histograms over a set of prototypes (which must be found by clustering). We present a method for efficiently comparing images based on their discrete distributions (bags) of distinctive local invariant features, without clustering descriptors. Similarity between images is measured with an approximation of the Earth Mover's Distance (EMD), which quickly computes the minimal-cost correspondence between two bags of features. Each image's feature distribution is mapped into a normed space with a low-distortion embedding of EMD. Examples most similar to a novel query image are retrieved in time sublinear in the number of examples via approximate nearest neighbor search in the embedded space. We also show how the feature representation may be extended to encode the distribution of geometric constraints between the invariant features appearing in each image. We evaluate our technique with scene recognition and texture classification tasks.

AIM-2004-026

CBCL-243

Author[s]: Thomas Serre, Lior Wolf and Tomaso Poggio

A new biologically motivated framework for robust object recognition

November 14, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-026.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-026.pdf

In this paper, we introduce a novel set of features for robust object recognition, which exhibits outstanding performance on a variety of object categories while being capable of learning from only a few training examples. Each element of this set is a complex feature obtained by combining position- and scale-tolerant edge-detectors over neighboring positions and multiple orientations. Our system - motivated by a quantitative model of visual cortex - outperforms state-of-the-art systems on a variety of object image datasets from different groups. We also show that our system is able to learn from very few examples with no prior category knowledge. The success of the approach is also a suggestive plausibility proof for a class of feed-forward models of object recognition in cortex. Finally, we conjecture the existence of a universal overcomplete dictionary of features that could handle the recognition of all object categories.

AIM-2004-025

CBCL-242

Author[s]: Lior Wolf and Ian Martin

Regularization Through Feature Knock Out

November 12, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-025.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-025.pdf

In this paper, we present and analyze a novel regularization technique based on enhancing our dataset with corrupted copies of the original data. The motivation is that since the learning algorithm lacks information about which parts of the data are reliable, it has to produce more robust classification functions. We then demonstrate how this regularization leads to redundancy in the resulting classifiers, which is somewhat in contrast to the common interpretations of the Occam’s razor principle. Using this framework, we propose a simple addition to the gentle boosting algorithm which enables it to work with only a few examples. We test this new algorithm on a variety of datasets and show convincing results.
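
The idea of enhancing a dataset with corrupted copies of the original data can be sketched as follows. The abstract does not say how the copies are corrupted, so the scheme here is an assumption made purely for illustration: each copy overwrites one randomly chosen feature of an example with the value of that feature from another randomly chosen example. The function name knock_out_copies and its parameters are hypothetical.

```python
import random

def knock_out_copies(X, y, n_copies, seed=0):
    """Return (X_aug, y_aug): the original data plus n_copies corrupted rows.

    Assumed corruption (not taken from the memo): pick an example i, a donor
    example j and a feature f at random, then copy row i with feature f
    replaced by the donor's value. The copy keeps the label of example i.
    """
    rng = random.Random(seed)
    X_aug = [list(row) for row in X]
    y_aug = list(y)
    n_examples, n_features = len(X), len(X[0])
    for _ in range(n_copies):
        i = rng.randrange(n_examples)
        j = rng.randrange(n_examples)
        f = rng.randrange(n_features)
        row = list(X[i])
        row[f] = X[j][f]  # knock out one feature of the copy
        X_aug.append(row)
        y_aug.append(y[i])
    return X_aug, y_aug

if __name__ == "__main__":
    X = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]]
    y = [0, 1, 1]
    X_aug, y_aug = knock_out_copies(X, y, n_copies=4)
    print(len(X_aug), len(y_aug))
```

Because the learner cannot tell which feature of a copy was overwritten, a classifier trained on the augmented set is pushed away from relying on any single feature. This is one way to read the redundancy effect described in the abstract.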

AIM-2004-024

CBCL-241

Author[s]: Charles Cadieu, Minjoon Kouh, Maximilian Riesenhuber, and Tomaso Poggio

Shape Representation in V4: Investigating Position-Specific Tuning for Boundary Conformation with the Standard Model of Object Recognition

November 12, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-024.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-024.pdf

The computational processes in the intermediate stages of the ventral pathway responsible for visual object recognition are not well understood. A recent physiological study by A. Pasupathy and C. Connor in intermediate area V4, using contour stimuli, proposes that a population of V4 neurons displays object-centered, position-specific curvature tuning [18]. The “standard model” of object recognition, a recently developed model [23] to account for recognition properties of IT cells (extending classical suggestions by Hubel, Wiesel and others [9, 10, 19]), is used here to model the response of the V4 cells described in [18]. Our results show that a feedforward, network level mechanism can exhibit selectivity and invariance properties that correspond to the responses of the V4 cells described in [18]. These results suggest how object-centered, position-specific curvature tuning of V4 cells may arise from combinations of complex V1 cell responses. Furthermore, the model makes predictions about the responses of the same V4 cells studied by Pasupathy and Connor to novel gray level patterns, such as gratings and natural images. These predictions suggest specific experiments to further explore shape representation in V4.

AIM-2004-023

Author[s]: Kurt Steinkraus, Leslie Pack Kaelbling

Combining dynamic abstractions in large MDPs

October 21, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-023.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-023.pdf

One of the reasons that it is difficult to plan and act in real-world domains is that they are very large. Existing research generally deals with the large domain size using a static representation and exploiting a single type of domain structure. In this paper, we create a framework that encapsulates existing and new abstraction and approximation methods into modules, and combines arbitrary modules into a system that allows for dynamic representation changes. We show that the dynamic changes of representation allow our framework to solve larger and more interesting domains than were previously possible, and while there are no optimality guarantees, suitable module choices gain tractability at little cost to optimality.

AIM-2004-022

Author[s]: Gene Yeo, Eric Van Nostrand, Dirk Holste, Tomaso Poggio, Christopher Burge

Predictive identification of alternative events conserved in human and mouse

September 30, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-022.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-022.pdf

Alternative pre-messenger RNA splicing affects a majority of human genes and plays important roles in development and disease. Alternative splicing (AS) events conserved since the divergence of human and mouse are likely of primary biological importance, but relatively few such events are known. Here we describe sequence features that distinguish exons subject to evolutionarily conserved AS, which we call 'alternative-conserved exons' (ACEs), from other orthologous human/mouse exons, and integrate these features into an exon classification algorithm, ACEScan. Genome-wide analysis of annotated orthologous human-mouse exon pairs identified ~2,000 predicted ACEs. Alternative splicing was verified in both human and mouse tissues using an RT-PCR-sequencing protocol for 21 of 30 (70%) predicted ACEs tested, supporting the validity of a majority of ACEScan predictions. By contrast, AS was observed in mouse tissues for only 2 of 15 (13%) tested exons which had EST or cDNA evidence of AS in human but were not predicted ACEs, and was never observed for eleven negative control exons in human or mouse tissues. Predicted ACEs were much more likely to preserve reading frame, and less likely to disrupt protein domains than other AS events, and were enriched in genes expressed in the brain and in genes involved in transcriptional regulation, RNA processing and development. Our results also imply that the vast majority of AS events represented in the human EST databases are not conserved in mouse, and therefore may represent aberrant, disease- or allele-specific, or highly lineage-restricted splicing events.

AIM-2004-021

Author[s]: Michael R. Benjamin

The Interval Programming Model for Multi-objective Decision Making

September 27, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-021.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-021.pdf

The interval programming model (IvP) is a mathematical programming model for representing and solving multi-objective optimization problems. The central characteristic of the model is the use of piecewise linearly defined objective functions and a solution method that searches through the combination space of pieces rather than through the actual decision space. The piecewise functions typically represent an approximation of some underlying function, but this concession is balanced on the positive side by relative freedom from function form assumptions as well as the assurance of global optimality. In this paper the model and solution algorithms are described, and the applicability of IvP to certain applications is discussed.

AIM-2004-020

CBCL-240

Author[s]: Gabriel Kreiman, Chou Hung, Tomaso Poggio, James DiCarlo

Selectivity of Local Field Potentials in Macaque Inferior Temporal Cortex

September 21, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-020.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-020.pdf

While single neurons in inferior temporal (IT) cortex show differential responses to distinct complex stimuli, little is known about the responses of populations of neurons in IT. We recorded single electrode data, including multi-unit activity (MUA) and local field potentials (LFP), from 618 sites in the inferior temporal cortex of macaque monkeys while the animals passively viewed 78 different pictures of complex stimuli. The LFPs were obtained by low-pass filtering the extracellular electrophysiological signal with a corner frequency of 300 Hz. As reported previously, we observed that spike counts from MUA showed selectivity for some of the pictures. Strikingly, the LFP data, which is thought to constitute an average over large numbers of neurons, also showed significantly selective responses. The LFP responses were less selective than the MUA responses both in terms of the proportion of selective sites as well as in the selectivity of each site. We observed that there was little overlap between the selectivity of MUA and LFP recordings from the same electrode. To assess the spatial organization of selective responses, we compared the selectivity of nearby sites recorded along the same penetration and sites recorded from different penetrations. We observed that MUA selectivity was correlated on spatial scales up to 800 µm while the LFP selectivity was correlated over a larger spatial extent, with significant correlations between sites separated by several mm. Our data support the idea that there is some topographical arrangement to the organization of selectivity in inferior temporal cortex and that this organization may be relevant for the representation of object identity in IT.

AIM-2004-019

Author[s]: Charles Kemp, Thomas L. Griffiths and Joshua B. Tenenbaum

Discovering Latent Classes in Relational Data

July 22, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-019.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-019.pdf

We present a framework for learning abstract relational knowledge with the aim of explaining how people acquire intuitive theories of physical, biological, or social systems. Our approach is based on a generative relational model with latent classes, and simultaneously determines the kinds of entities that exist in a domain, the number of these latent classes, and the relations between classes that are possible or likely. This model goes beyond previous psychological models of category learning, which consider attributes associated with individual categories but not relationships between categories. We apply this domain-general framework to two specific problems: learning the structure of kinship systems and learning causal theories.

AIM-2004-018

Author[s]: Ozlem Uzuner

Distribution Volume Tracking on Privacy-Enhanced Wireless Grid

July 25, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-018.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-018.pdf

In this paper, we discuss a wireless grid in which users are highly mobile, and form ad-hoc and sometimes short-lived connections with other devices. As they roam through networks, the users may choose to employ privacy-enhancing technologies to address their privacy needs and benefit from the computational power of the grid for a variety of tasks, including sharing content. The high rate of mobility of the users on the wireless grid, when combined with privacy enhancing mechanisms and ad-hoc connections, makes it difficult to conclusively link devices and/or individuals with network activities and to hold them liable for particular downloads. Protecting intellectual property in this scenario requires a solution that can work in absence of knowledge about behavior of particular individuals. Building on previous work, we argue for a solution that ensures proper compensation to content owners without inhibiting use and dissemination of works. Our proposal is based on digital tracking for measuring distribution volume of content and compensation of authors based on this accounting information. The emphasis is on obtaining good estimates of rate of popularity of works, without keeping track of activities of individuals or devices. The contribution of this paper is a revenue protection mechanism, Distribution Volume Tracking, that does not invade the privacy of users in the wireless grid and works even in the presence of privacy-enhancing technologies they may employ.

AIM-2004-017

CBCL-239

Author[s]: Thomas Serre and Maximilian Riesenhuber

Realistic Modeling of Simple and Complex Cell Tuning in the HMAX Model, and Implications for Invariant Object Recognition in Cortex

July 27, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-017.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-017.pdf

Riesenhuber & Poggio recently proposed a model of object recognition in cortex which, beyond integrating general beliefs about the visual system in a quantitative framework, made testable predictions about visual processing. In particular, they showed that invariant object representation could be obtained with a selective pooling mechanism over properly chosen afferents through a MAX operation: for instance, at the complex cell level, pooling over a group of simple cells at the same preferred orientation and position in space but at slightly different spatial frequencies would provide scale tolerance, while pooling over a group of simple cells at the same preferred orientation and spatial frequency but at slightly different positions in space would provide position tolerance. Indirect support for such mechanisms in the visual system comes from the ability of the architecture at the top level to replicate shape tuning as well as shift and size invariance properties of "view-tuned cells" (VTUs) found in inferotemporal cortex (IT), the highest area in the ventral visual stream, thought to be crucial in mediating object recognition in cortex. There is also now good physiological evidence that a MAX operation is performed at various levels along the ventral stream. However, in the original paper by Riesenhuber & Poggio, tuning and pooling parameters of model units in early and intermediate areas were only qualitatively inspired by physiological data. In particular, many studies have investigated the tuning properties of simple and complex cells in primary visual cortex, V1. We show that units in the early levels of HMAX can be tuned to produce realistic simple and complex cell-like tuning, and that the earlier findings on the invariance properties of model VTUs still hold in this more realistic version of the model.
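
The position-tolerance argument above can be illustrated with a minimal sketch. It is not the HMAX implementation: the one-dimensional layout, the function name max_pool_positions, the pool size, and the response values are all assumptions made for illustration. Each "complex" unit takes the maximum over a group of "simple" unit responses that share a preferred orientation but sit at neighbouring positions.

```python
def max_pool_positions(simple_responses, pool_size):
    """Return one pooled response per non-overlapping group of positions.

    simple_responses: responses of simple-cell-like units tuned to the same
    orientation, ordered by position (hypothetical 1-D layout).
    pool_size: number of neighbouring positions pooled by each unit.
    """
    last_start = len(simple_responses) - pool_size
    return [
        max(simple_responses[start:start + pool_size])
        for start in range(0, last_start + 1, pool_size)
    ]


# A stimulus that drives the unit at position 2, and the same stimulus
# shifted by one position (illustrative values only).
bar = [0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0, 0.0]
shifted_bar = [0.0, 0.0, 0.0, 0.9, 0.0, 0.0, 0.0, 0.0]

pooled = max_pool_positions(bar, pool_size=4)
pooled_shifted = max_pool_positions(shifted_bar, pool_size=4)
```

In this toy case the pooled responses are identical for the original and the shifted stimulus, because the shift stays inside one pooling group. Pooling over spatial frequency instead of position would give scale tolerance in the same way.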

AIM-2004-016

Author[s]: Tevfik Metin Sezgin and Randall Davis

Early Sketch Processing with Application in HMM Based Sketch Recognition

July 28, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-016.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-016.pdf

Freehand sketching is a natural and crucial part of everyday human interaction, yet is almost totally unsupported by current user interfaces. With the increasing availability of tablet notebooks and pen based PDAs, sketch based interaction has gained attention as a natural interaction modality. We are working to combine the flexibility and ease of use of paper and pencil with the processing power of a computer, to produce a user interface for design that feels as natural as paper, yet is considerably smarter. One of the most basic tasks in accomplishing this is converting the original digitized pen strokes in a sketch into the intended geometric objects. In this paper we describe an implemented system that combines multiple sources of knowledge to provide robust early processing for freehand sketching. We also show how this early processing system can be used as part of a fast sketch recognition system with polynomial time segmentation and recognition algorithms.

AIM-2004-015

Author[s]: Mihai Badoiu, Piotr Indyk, Anastasios Sidiropoulos

A Constant-Factor Approximation Algorithm for Embedding Unweighted Graphs into Trees

July 5, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-015.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-015.pdf

We present a constant-factor approximation algorithm for computing an embedding of the shortest path metric of an unweighted graph into a tree, that minimizes the multiplicative distortion.

AIM-2004-014

Author[s]: Piotr Indyk and David Woodruff

Optimal Approximations of the Frequency Moments

July 2, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-014.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-014.pdf

We give a one-pass, O~(m^{1-2/k})-space algorithm for estimating the k-th frequency moment of a data stream for any real k>2. Together with known lower bounds, this resolves the main problem left open by Alon, Matias, Szegedy, STOC'96. Our algorithm enables deletions as well as insertions of stream elements.
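
For orientation, the quantity being estimated can be computed exactly with a simple baseline. The sketch below is not the one-pass algorithm of the memo: it stores one counter per distinct item, which is exactly the cost a small-space streaming algorithm avoids. The function name and the (item, delta) update format are assumptions; it uses the standard definition of the k-th frequency moment as the sum over items of the k-th power of the absolute item count.

```python
from collections import defaultdict


def frequency_moment(updates, k):
    """Exact k-th frequency moment of a stream of (item, delta) updates.

    A positive delta inserts copies of the item and a negative delta
    deletes them. Uses memory proportional to the number of distinct items.
    """
    counts = defaultdict(int)
    for item, delta in updates:
        counts[item] += delta
    return sum(abs(count) ** k for count in counts.values())


# Illustrative stream: "a" ends with count 3, "b" with 2, and "c" is
# inserted once and then deleted, so it ends with count 0.
updates = [("a", 1), ("b", 1), ("a", 1), ("a", 1),
           ("b", 1), ("c", 1), ("c", -1)]
third_moment = frequency_moment(updates, 3)
```

Here the third moment is 3^3 + 2^3 + 0 = 35. An approximation algorithm would return a value close to this without keeping the full table of counts.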

AIM-2004-013

Author[s]: Antonio Torralba, Kevin P. Murphy, William T. Freeman

Contextual models for object detection using boosted random fields

June 25, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-013.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-013.pdf

We seek to both detect and segment objects in images. To exploit both local image data and contextual information, we introduce Boosted Random Fields (BRFs), which use boosting to learn the graph structure and local evidence of a conditional random field (CRF). The graph structure is learned by assembling graph fragments in an additive model. The connections between individual pixels are not very informative, but by using dense graphs, we can pool information from large regions of the image; dense models also support efficient inference. We show how contextual information from other objects can improve detection performance, both in terms of accuracy and speed, by using a computational cascade. We apply our system to detect stuff and things in office and street scenes.

AIM-2004-012

Author[s]: Jaime Teevan

How People Re-find Information When the Web Changes

June 18, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-012.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-012.pdf

This paper investigates how people return to information in a dynamic information environment. For example, a person might want to return to Web content via a link encountered earlier on a Web page, only to learn that the link has since been removed. Changes can benefit users by providing new information, but they hinder returning to previously viewed information. The observational study presented here analyzed instances, collected via a Web search, where people expressed difficulty re-finding information because of changes to the information or its environment. A number of interesting observations arose from this analysis, including that the path originally taken to get to the information target appeared important in its re-retrieval, whereas, surprisingly, the temporal aspects of when the information was seen before were not. While people expressed frustration when problems arose, an explanation of why the change had occurred was often sufficient to allay that frustration, even in the absence of a solution. The implications of these observations for systems that support re-finding in dynamic environments are discussed.

AIM-2004-011

Author[s]: Lilla Zollei, John Fisher, William Wells

A Unified Statistical and Information Theoretic Framework for Multi-modal Image Registration

April 28, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-011.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-011.pdf

We formulate and interpret several multi-modal registration methods in the context of a unified statistical and information theoretic framework. A unified interpretation clarifies the implicit assumptions of each method yielding a better understanding of their relative strengths and weaknesses. Additionally, we discuss a generative statistical model from which we derive a novel analysis tool, the "auto-information function", as a means of assessing and exploiting the common spatial dependencies inherent in multi-modal imagery. We analytically derive useful properties of the "auto-information" as well as verify them empirically on multi-modal imagery. Among the useful aspects of the "auto-information function" is that it can be computed from imaging modalities independently and it allows one to decompose the search space of registration problems.

AIM-2004-010

CBCL-238

Author[s]: Jerry Jun Yokono and Tomaso Poggio

Rotation Invariant Object Recognition from One Training Example

April 27, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-010.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-010.pdf

Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. Such a descriptor, based on a set of oriented Gaussian derivative filters, is used in our recognition system. We report here an evaluation of several techniques for orientation estimation to achieve rotation invariance of the descriptor. We also describe feature selection based on a single training image. Virtual images are generated by rotating and rescaling the image and robust features are selected. The results confirm robust performance in cluttered scenes, in the presence of partial occlusions, and when the object is embedded in different backgrounds.

AITR-2004-009

Author[s]: Nathan Srebro

Learning with Matrix Factorizations

November 22, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-009.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-009.pdf

Matrices that can be factored into a product of two simpler matrices can serve as a useful and often natural model in the analysis of tabulated or high-dimensional data. Models based on matrix factorization (Factor Analysis, PCA) have been extensively used in statistical analysis and machine learning for over a century, with many new formulations and models suggested in recent years (Latent Semantic Indexing, Aspect Models, Probabilistic PCA, Exponential PCA, Non-Negative Matrix Factorization and others). In this thesis we address several issues related to learning with matrix factorizations: we study the asymptotic behavior and generalization ability of existing methods, suggest new optimization methods, and present a novel maximum-margin high-dimensional matrix factorization formulation.

AIM-2004-009

Author[s]: Antonio Torralba

Contextual Influences on Saliency

April 14, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-009.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-009.pdf

This article describes a model for including scene/context priors in attention guidance. In the proposed scheme, visual context information can be available early in the visual processing chain, in order to modulate the saliency of image regions and to provide an efficient short cut for object detection and recognition. The scene is represented by means of a low-dimensional global description obtained from low-level features. The global scene features are then used to predict the probability of presence of the target object in the scene, and its location and scale, before exploring the image. Scene information can then be used to modulate the saliency of image regions early during the visual processing in order to provide an efficient short cut for object detection and recognition.

AITR-2004-008

Author[s]: Justin Werfel

Neural Network Models for Zebra Finch Song Production and Reinforcement Learning

November 9, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-008.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-008.pdf

The zebra finch is a standard experimental system for studying learning and generation of temporally extended motor patterns. The first part of this project concerned the evaluation of simple models for the operation and structure of the network in the motor nucleus RA. A directed excitatory chain with a global inhibitory network, for which experimental evidence exists, was found to produce waves of activity similar to those observed in RA; this similarity included one particularly important feature of the measured activity, synchrony between the onset of bursting in one neuron and the offset of bursting in another. Other models, which were simpler and more analytically tractable, were also able to exhibit this feature, but not for parameter values quantitatively close to those observed. Another issue of interest concerns how these networks are initially learned by the bird during song acquisition. The second part of the project concerned the analysis of exemplars of REINFORCE algorithms, a general class of algorithms for reinforcement learning in neural networks, which are on several counts more biologically plausible than standard prescriptions such as backpropagation. The former compared favorably with backpropagation on tasks involving single input-output pairs, though a noise analysis suggested it should not perform so well. On tasks involving trajectory learning, REINFORCE algorithms meet with some success, though the analysis that predicts their success on input-output-pair tasks fails to explain it for trajectories.

AIM-2004-008

Author[s]: Antonio Torralba, Kevin P. Murphy, William T. Freeman

Sharing visual features for multiclass and multiview object detection

April 14, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-008.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-008.pdf

We consider the problem of detecting a large number of different classes of objects in cluttered scenes. Traditional approaches require applying a battery of different classifiers to the image, at multiple locations and scales. This can be slow and can require a lot of training data, since each classifier requires the computation of many different image features. In particular, for independently trained detectors, the (run-time) computational complexity, and the (training-time) sample complexity, scales linearly with the number of classes to be detected. It seems unlikely that such an approach will scale up to allow recognition of hundreds or thousands of objects. We present a multi-class boosting procedure (joint boosting) that reduces the computational and sample complexity, by finding common features that can be shared across the classes (and/or views). The detectors for each class are trained jointly, rather than independently. For a given performance level, the total number of features required, and therefore the computational cost, is observed to scale approximately logarithmically with the number of classes. The features selected jointly are closer to edges and generic features typical of many natural structures instead of finding specific object parts. Those generic features generalize better and reduce considerably the computational cost of an algorithm for multi-class object detection.

AITR-2004-007

Author[s]: Lisa Tucker-Kellogg

Systematic Conformational Search with Constraint Satisfaction

October 1, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-007.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-007.pdf

Throughout biological, chemical, and pharmaceutical research, conformational searches are used to explore the possible three-dimensional configurations of molecules. This thesis describes a new systematic method for conformational search, including an application of the method to determining the structure of a peptide via solid-state NMR spectroscopy. A separate portion of the thesis is about protein-DNA binding, with a three-dimensional macromolecular structure determined by x-ray crystallography. The search method in this thesis enumerates all conformations of a molecule (at a given level of torsion angle resolution) that satisfy a set of local geometric constraints, such as constraints derived from NMR experiments. Systematic searches, historically used for small molecules, generally now use some form of divide-and-conquer for application to larger molecules. Our method can achieve a significant improvement in runtime by making some major and counter-intuitive modifications to traditional divide-and-conquer: (1) OmniMerge divides a polymer into many alternative pairs of subchains and searches all the pairs, instead of simply cutting in half and searching two subchains. Although the extra searches may appear wasteful, the bottleneck stage of the overall search, which is to re-connect the conformations of the largest subchains, can be greatly accelerated by the availability of alternative pairs of sidechains. (2) Propagation of disqualified conformations across overlapping subchains can disqualify infeasible conformations very rapidly, which further offsets the cost of searching the extra subchains of OmniMerge. (3) The search may be run in two stages, once at low-resolution using a side-effect of OmniMerge to determine an optimal partitioning of the molecule into efficient subchains; then again at high-resolution while making use of the precomputed subchains. (4) An A* function prioritizes each subchain based on estimated future search costs. 
Subchains with sufficiently low priority can be omitted from the search, which improves efficiency. A common theme of these four ideas is to make good choices about how to break the large search problem into lower-dimensional subproblems. In addition, the search method uses heuristic local searches within the overall systematic framework, to maintain the systematic guarantee while providing the empirical efficiency of stochastic search. These novel algorithms were implemented and the effectiveness of each innovation is demonstrated on a highly constrained peptide with 40 degrees of freedom.

AIM-2004-007

CBCL-237

Author[s]: Jerry Jun Yokono and Tomaso Poggio

Evaluation of sets of oriented and non-oriented receptive fields as local descriptors

March 24, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-007.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-007.pdf

Local descriptors are increasingly used for the task of object recognition because of their perceived robustness with respect to occlusions and to global geometrical deformations. We propose a performance criterion for a local descriptor based on the tradeoff between selectivity and invariance. In this paper, we evaluate several local descriptors with respect to selectivity and invariance. The descriptors that we evaluated are Gaussian derivatives up to the third order, gray image patches, and Laplacian-based descriptors with either three scales or one scale filters. We compare selectivity and invariance to several affine changes such as rotation, scale, brightness, and viewpoint. Comparisons have been made keeping the dimensionality of the descriptors roughly constant. The overall results indicate a good performance by the descriptor based on a set of oriented Gaussian filters. It is interesting that oriented receptive fields similar to the Gaussian derivatives as well as receptive fields similar to the Laplacian are found in primate visual cortex.

AITR-2004-006

Author[s]: Artur Miguel Arsenio

Cognitive-Developmental Learning for a Humanoid Robot: A Caregiver's Gift

September 26, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-006.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-006.pdf

The goal of this work is to build a cognitive system for the humanoid robot, Cog, that exploits human caregivers as catalysts to perceive and learn about actions, objects, scenes, people, and the robot itself. This thesis addresses a broad spectrum of machine learning problems across several categorization levels. Actions by embodied agents are used to automatically generate training data for the learning mechanisms, so that the robot develops categorization autonomously. Taking inspiration from the human brain, a framework of algorithms and methodologies was implemented to emulate different cognitive capabilities on the humanoid robot Cog. This framework is effectively applied to a collection of AI, computer vision, and signal processing problems. Cognitive capabilities of the humanoid robot are developmentally created, starting from infant-like abilities for detecting, segmenting, and recognizing percepts over multiple sensing modalities. Human caregivers provide a helping hand for communicating such information to the robot. This is done by actions that create meaningful events (by changing the world in which the robot is situated) thus inducing the "compliant perception" of objects from these human-robot interactions. Self-exploration of the world extends the robot's knowledge concerning object properties. This thesis argues for enculturating humanoid robots using infant development as a metaphor for building a humanoid robot's cognitive abilities. A human caregiver redesigns a humanoid's brain by teaching the humanoid robot as she would teach a child, using children's learning aids such as books, drawing boards, or other cognitive artifacts. Multi-modal object properties are learned using these tools and inserted into several recognition schemes, which are then applied to developmentally acquire new object representations. The humanoid robot therefore sees the world through the caregiver's eyes. 
Building an artificial humanoid robot's brain, even at an infant's cognitive level, has been a long quest which still lies only in the realm of our imagination. Our efforts towards such a dimly imaginable task are developed according to two alternate and complementary views: cognitive and developmental.

AIM-2004-006

CBCL-236

Author[s]: Riesenhuber, Jarudi, Gilad, Sinha

Face processing in humans is compatible with a simple shape-based model of vision

March 5, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-006.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-006.pdf

Understanding how the human visual system recognizes objects is one of the key challenges in neuroscience. Inspired by a large body of physiological evidence (Felleman and Van Essen, 1991; Hubel and Wiesel, 1962; Livingstone and Hubel, 1988; Tso et al., 2001; Zeki, 1993), a general class of recognition models has emerged which is based on a hierarchical organization of visual processing, with succeeding stages being sensitive to image features of increasing complexity (Hummel and Biederman, 1992; Riesenhuber and Poggio, 1999; Selfridge, 1959). However, these models appear to be incompatible with some well-known psychophysical results. Prominent among these are experiments investigating recognition impairments caused by vertical inversion of images, especially those of faces. It has been reported that faces that differ “featurally” are much easier to distinguish when inverted than those that differ “configurally” (Freire et al., 2000; Le Grand et al., 2001; Mondloch et al., 2002) – a finding that is difficult to reconcile with the aforementioned models. Here we show that after controlling for subjects’ expectations, there is no difference between “featurally” and “configurally” transformed faces in terms of inversion effect. This result reinforces the plausibility of simple hierarchical models of object representation and recognition in cortex.

AIM-2004-005

Author[s]: Howard Shrobe and Robert Laddaga

New Architectural Models for Visibly Controllable Computing: The Relevance of Dynamic Object Oriented Architectures and Plan Based Computing Models

February 9, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-005.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-005.pdf

Traditionally, we've focussed on the question of how to make a system easy to code the first time, or perhaps on how to ease the system's continued evolution. But if we look at life cycle costs, then we must conclude that the important question is how to make a system easy to operate. To do this we need to make it easy for the operators to see what's going on and to then manipulate the system so that it does what it is supposed to. This is a radically different criterion for success. What makes a computer system visible and controllable? This is a difficult question, but it's clear that today's modern operating systems with nearly 50 million source lines of code are neither. Strikingly, the MIT Lisp Machine and its commercial successors provided almost the same functionality as today's mainstream systems, but with only 1 million lines of code. This paper is a retrospective examination of the features of the Lisp Machine hardware and software system. Our key claim is that by building the Object Abstraction into the lowest tiers of the system, great synergy and clarity were obtained. It is our hope that this is a lesson that can impact tomorrow's designs. We also speculate on how the spirit of the Lisp Machine could be extended to include a comprehensive access control model and how new layers of abstraction could further enrich this model.

AITR-2004-004

Author[s]: Robert A. Hearn

Building Grounded Abstractions for Artificial Intelligence Programming

June 16, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-004.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-004.pdf

Most Artificial Intelligence (AI) work can be characterized as either "high-level" (e.g., logical, symbolic) or "low-level" (e.g., connectionist networks, behavior-based robotics). Each approach suffers from particular drawbacks. High-level AI uses abstractions that often have no relation to the way real, biological brains work. Low-level AI, on the other hand, tends to lack the powerful abstractions that are needed to express complex structures and relationships. I have tried to combine the best features of both approaches, by building a set of programming abstractions defined in terms of simple, biologically plausible components. At the "ground level", I define a primitive, perceptron-like computational unit. I then show how more abstract computational units may be implemented in terms of the primitive units, and show the utility of the abstract units in sample networks. The new units make it possible to build networks using concepts such as long-term memories, short-term memories, and frames. As a demonstration of these abstractions, I have implemented a simulator for "creatures" controlled by a network of abstract units. The creatures exist in a simple 2D world, and exhibit behaviors such as catching mobile prey and sorting colored blocks into matching boxes. This program demonstrates that it is possible to build systems that can interact effectively with a dynamic physical environment, yet use symbolic representations to control aspects of their behavior.

AIM-2004-004

CBCL-235

Author[s]: Robert Schneider and Maximilian Riesenhuber

On the difficulty of feature-based attentional modulations in visual object recognition: A modeling study.

January 14, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-004.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-004.pdf

Numerous psychophysical experiments have shown an important role for attentional modulations in vision. Behaviorally, allocation of attention can improve performance in object detection and recognition tasks. At the neural level, attention increases firing rates of neurons in visual cortex whose preferred stimulus is currently attended to. However, it is not yet known how these two phenomena are linked, i.e., how the visual system could be "tuned" in a task-dependent fashion to improve task performance. To answer this question, we performed simulations with the HMAX model of object recognition in cortex [45]. We modulated firing rates of model neurons in accordance with experimental results about effects of feature-based attention on single neurons and measured changes in the model's performance in a variety of object recognition tasks. It turned out that recognition performance could only be improved under very limited circumstances and that attentional influences on the process of object recognition per se tend to display a lack of specificity or raise false alarm rates. These observations lead us to postulate a new role for the observed attention-related neural response modulations.

AITR-2004-003

Author[s]: Jonathan A. Goler

BioJADE: A Design and Simulation Tool for Synthetic Biological Systems

May 28, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-003.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-003.pdf

The next generations of both biological engineering and computer engineering demand that control be exerted at the molecular level. Creating, characterizing and controlling synthetic biological systems may provide us with the ability to build cells that are capable of a plethora of activities, from computation to synthesizing nanostructures. To develop these systems, we must have a set of tools not only for synthesizing systems, but also designing and simulating them. The BioJADE project provides a comprehensive, extensible design and simulation platform for synthetic biology. BioJADE is a graphical design tool built in Java, utilizing a database back end, and supports a range of simulations using an XML communication protocol. BioJADE currently supports a library of over 100 parts with which it can compile designs into actual DNA, and then generate synthesis instructions to build the physical parts. The BioJADE project contributes several tools to Synthetic Biology. BioJADE in itself is a powerful tool for synthetic biology designers. Additionally, we developed and now make use of a centralized BioBricks repository, which enables the sharing of BioBrick components between researchers, and vastly reduces the barriers to entry for aspiring Synthetic Biologists.

AIM-2004-003

Author[s]: Kristen Grauman, Gregory Shakhnarovich, Trevor Darrell

Virtual Visual Hulls: Example-Based 3D Shape Estimation from a Single Silhouette

January 28, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-003.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-003.pdf

Recovering a volumetric model of a person, car, or other object of interest from a single snapshot would be useful for many computer graphics applications. 3D model estimation in general is hard, and currently requires active sensors, multiple views, or integration over time. For a known object class, however, 3D shape can be successfully inferred from a single snapshot. We present a method for generating a "virtual visual hull", an estimate of the 3D shape of an object from a known class, given a single silhouette observed from an unknown viewpoint. For a given class, a large database of multi-view silhouette examples from calibrated, though possibly varied, camera rigs is collected. To infer a novel single view input silhouette's virtual visual hull, we search for 3D shapes in the database which are most consistent with the observed contour. The input is matched to component single views of the multi-view training examples. A set of viewpoint-aligned virtual views is generated from the visual hulls corresponding to these examples. The 3D shape estimate for the input is then found by interpolating between the contours of these aligned views. When the underlying shape is ambiguous given a single view silhouette, we produce multiple visual hull hypotheses; if a sequence of input images is available, a dynamic programming approach is applied to find the maximum likelihood path through the feasible hypotheses over time. We show results of our algorithm on real and synthetic images of people.
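
The dynamic programming step mentioned at the end of the abstract can be sketched in a generic, Viterbi-style form. This is not the authors' implementation: it assumes the same number of hypotheses at every frame, log-likelihood scores supplied by the caller, and a hypothetical transition_loglik function that rewards temporally consistent choices.

```python
def best_path(frame_loglik, transition_loglik):
    """Return the maximum-likelihood sequence of hypothesis indices.

    frame_loglik[t][h]: log-likelihood of hypothesis h at frame t.
    transition_loglik(a, b): log-likelihood of moving from hypothesis a
    at one frame to hypothesis b at the next (assumed model).
    """
    num_hyp = len(frame_loglik[0])
    score = list(frame_loglik[0])
    backpointers = []
    for t in range(1, len(frame_loglik)):
        new_score, pointers = [], []
        for h in range(num_hyp):
            # Best predecessor for hypothesis h at frame t.
            prev = max(range(num_hyp),
                       key=lambda p: score[p] + transition_loglik(p, h))
            pointers.append(prev)
            new_score.append(score[prev] + transition_loglik(prev, h)
                             + frame_loglik[t][h])
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final hypothesis.
    h = max(range(num_hyp), key=lambda i: score[i])
    path = [h]
    for pointers in reversed(backpointers):
        h = pointers[h]
        path.append(h)
    return path[::-1]


# Two hypotheses over three frames (illustrative numbers). Frame 1 taken
# alone slightly prefers hypothesis 1, but switching costs 1.0 in log space.
frame_scores = [[-0.1, -2.0], [-1.0, -0.9], [-0.2, -2.5]]
path = best_path(frame_scores, lambda a, b: 0.0 if a == b else -1.0)
```

With these numbers the path stays on hypothesis 0 for all three frames, whereas choosing the best hypothesis independently per frame would switch to hypothesis 1 at the middle frame.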

AITR-2004-002

Author[s]: Jonathan Kennell

Generative Temporal Planning with Complex Processes

May 18, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-002.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-002.pdf

Autonomous vehicles are increasingly being used in mission-critical applications, and robust methods are needed for controlling these inherently unreliable and complex systems. This thesis advocates the use of model-based programming, which allows mission designers to program autonomous missions at the level of a coach or wing commander. To support such a system, this thesis presents the Spock generative planner. To generate plans, Spock must be able to piece together vehicle commands and team tactics that have a complex behavior represented by concurrent processes. This is in contrast to traditional planners, whose operators represent simple atomic or durative actions. Spock represents operators using the RMPL language, which describes behaviors using parallel and sequential compositions of state and activity episodes. RMPL is useful for controlling mobile autonomous missions because it allows mission designers to quickly encode expressive activity models using object-oriented design methods and an intuitive set of activity combinators. Spock also is significant in that it uniformly represents operators and plan-space processes in terms of Temporal Plan Networks, which support temporal flexibility for robust plan execution. Finally, Spock is implemented as a forward progression optimal planner that walks monotonically forward through plan processes, closing any open conditions and resolving any conflicts. This thesis describes the Spock algorithm in detail, along with example problems and test results.

AIM-2004-002

CBCL-234

Author[s]: Lior Wolf, Amnon Shashua, and Sayan Mukherjee

Selecting Relevant Genes with a Spectral Approach

January 27, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-002.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-002.pdf

Array technologies have made it possible to record simultaneously the expression pattern of thousands of genes. A fundamental problem in the analysis of gene expression data is the identification of highly relevant genes that either discriminate between phenotypic labels or are important with respect to the cellular process studied in the experiment: for example cell cycle or heat shock in yeast experiments, chemical or genetic perturbations of mammalian cell lines, and genes involved in class discovery for human tumors. In this paper we focus on the task of unsupervised gene selection. The problem of selecting a small subset of genes is particularly challenging as the datasets involved are typically characterized by a very small sample size (on the order of a few tens of tissue samples) and by a very large feature space, as the number of genes tends to be in the high thousands. We propose a model-independent approach which scores candidate gene selections using spectral properties of the candidate affinity matrix. The algorithm is very straightforward to implement yet contains a number of remarkable properties which guarantee consistent sparse selections. To illustrate the value of our approach we applied our algorithm on five different datasets. The first consists of time course data from four well studied Hematopoietic cell lines (HL-60, Jurkat, NB4, and U937). The other four datasets include three well studied treatment outcomes (large cell lymphoma, childhood medulloblastomas, breast tumors) and one unpublished dataset (lymph status). We compared our approach both with other unsupervised methods (SOM, PCA, GS) and with supervised methods (SNR, RMB, RFE). The results clearly show that our approach considerably outperforms all the other unsupervised approaches in our study, is competitive with supervised methods and in some cases even outperforms supervised approaches.

AITR-2004-001

Author[s]: Oana L. Stamatoiu

Learning Commonsense Categorical Knowledge in a Thread Memory System

May 18, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-001.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AITR-2004-001.pdf

If we are to understand how we can build machines capable of broad purpose learning and reasoning, we must first aim to build systems that can represent, acquire, and reason about the kinds of commonsense knowledge that we humans have about the world. This endeavor suggests steps such as identifying the kinds of knowledge people commonly have about the world, constructing suitable knowledge representations, and exploring the mechanisms that people use to make judgments about the everyday world. In this work, I contribute to these goals by proposing an architecture for a system that can learn commonsense knowledge about the properties and behavior of objects in the world. The architecture described here augments previous machine learning systems in four ways: (1) it relies on a seven dimensional notion of context, built from information recently given to the system, to learn and reason about objects' properties; (2) it has multiple methods that it can use to reason about objects, so that when one method fails, it can fall back on others; (3) it illustrates the usefulness of reasoning about objects by thinking about their similarity to other, better known objects, and by inferring properties of objects from the categories that they belong to; and (4) it represents an attempt to build an autonomous learner and reasoner, that sets its own goals for learning about the world and deduces new facts by reflecting on its acquired knowledge. This thesis describes this architecture, as well as a first implementation, that can learn from sentences such as "A blue bird flew to the tree" and "The small bird flew to the cage" that birds can fly. One of the main contributions of this work lies in suggesting a further set of salient ideas about how we can build broader purpose commonsense artificial learners and reasoners.

AIM-2004-001

CBCL-233

Author[s]: Alexander Rakhlin, Dmitry Panchenko, Sayan Mukherjee

Risk Bounds for Mixture Density Estimation

January 27, 2004

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-001.ps

ftp://publications.ai.mit.edu/ai-publications/2004/AIM-2004-001.pdf

In this paper we focus on the problem of estimating a bounded density using a finite combination of densities from a given class. We consider the Maximum Likelihood Procedure (MLE) and the greedy procedure described by Li and Barron. Approximation and estimation bounds are given for the above methods. We extend and improve upon the estimation results of Li and Barron, and in particular prove an $O(\frac{1}{\sqrt{n}})$ bound on the estimation error which does not depend on the number of densities in the estimated combination.

AIM-2003-027

Author[s]: Jacob Beal and Seth Gilbert

RamboNodes for the Metropolitan Ad Hoc Network

December 17, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-027.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-027.pdf

We present an algorithm to store data robustly in a large, geographically distributed network by means of localized regions of data storage that move in response to changing conditions. For example, data might migrate away from failures or toward regions of high demand. The PersistentNode algorithm provides this service robustly, but with limited safety guarantees. We use the RAMBO framework to transform PersistentNode into RamboNode, an algorithm that guarantees atomic consistency in exchange for increased cost and decreased liveness. In addition, a half-life analysis of RamboNode shows that it is robust against continuous low-rate failures. Finally, we provide experimental simulations for the algorithm on 2000 nodes, demonstrating how it services requests and examining how it responds to failures.

AIM-2003-026

Author[s]: Kristen Grauman and Trevor Darrell

Fast Contour Matching Using Approximate Earth Mover's Distance

December 5, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-026.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-026.pdf

Weighted graph matching is a good way to align a pair of shapes represented by a set of descriptive local features; the set of correspondences produced by the minimum cost of matching features from one shape to the features of the other often reveals how similar the two shapes are. However, due to the complexity of computing the exact minimum cost matching, previous algorithms could only run efficiently when using a limited number of features per shape, and could not scale to perform retrievals from large databases. We present a contour matching algorithm that quickly computes the minimum weight matching between sets of descriptive local features using a recently introduced low-distortion embedding of the Earth Mover's Distance (EMD) into a normed space. Given a novel embedded contour, the nearest neighbors in a database of embedded contours are retrieved in sublinear time via approximate nearest neighbors search. We demonstrate our shape matching method on databases of 10,000 images of human figures and 60,000 images of handwritten digits.

AIM-2003-025

Author[s]: Yu-Han Chang, Tracey Ho, Leslie Pack Kaelbling

Mobilized ad-hoc networks: A reinforcement learning approach

December 4, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-025.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-025.pdf

Research in mobile ad-hoc networks has focused on situations in which nodes have no control over their movements. We investigate an important but overlooked domain in which nodes do have control over their movements. Reinforcement learning methods can be used to control both packet routing decisions and node mobility, dramatically improving the connectivity of the network. We first motivate the problem by presenting theoretical bounds for the connectivity improvement of partially mobile networks and then present superior empirical results under a variety of different scenarios in which the mobile nodes in our ad-hoc network are embedded with adaptive routing policies and learned movement policies.

AIM-2003-024

CBCL-232

Author[s]: Christian Morgenstern, Bernd Heisele

Component based recognition of objects in an office environment

November 28, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-024.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-024.pdf

We present a component-based approach for recognizing objects under large pose changes. From a set of training images of a given object we extract a large number of components which are clustered based on the similarity of their image features and their locations within the object image. The cluster centers build an initial set of component templates from which we select a subset for the final recognizer. In experiments we evaluate different sizes and types of components and three standard techniques for component selection. The component classifiers are finally compared to global classifiers on a database of four objects.

AIM-2003-023

Author[s]: Jacob Eisenstein

Evolving Robocode Tank Fighters

October 28, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-023.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-023.pdf

In this paper, I describe the application of genetic programming to evolve a controller for a robotic tank in a simulated environment. The purpose is to explore how genetic techniques can best be applied to produce controllers based on subsumption and behavior oriented languages such as REX. As part of my implementation, I developed TableRex, a modification of REX that can be expressed on a fixed-length genome. Using a fixed subsumption architecture of TableRex modules, I evolved robots that beat some of the most competitive hand-coded adversaries.

AIM-2003-022

Author[s]: Michael G. Ross and Leslie Pack Kaelbling

Learning object segmentation from video data

September 8, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-022.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-022.pdf

This memo describes the initial results of a project to create a self-supervised algorithm for learning object segmentation from video data. Developmental psychology and computational experience have demonstrated that the motion segmentation of objects is a simpler, more primitive process than the detection of object boundaries by static image cues. Therefore, motion information provides a plausible supervision signal for learning the static boundary detection task and for evaluating performance on a test set. A video camera and previously developed background subtraction algorithms can automatically produce a large database of motion-segmented images for minimal cost. The purpose of this work is to use the information in such a database to learn how to detect the object boundaries in novel images using static information, such as color, texture, and shape. This work was funded in part by the Office of Naval Research contract #N00014-00-1-0298, in part by the Singapore-MIT Alliance agreement of 11/6/98, and in part by a National Science Foundation Graduate Student Fellowship.

AIM-2003-021

CBCL-231

Author[s]: Minjoon Kouh and Maximilian Riesenhuber

Investigating shape representation in area V4 with HMAX: Orientation and Grating selectivities

September 8, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-021.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-021.pdf

The question of how shape is represented is of central interest to understanding visual processing in cortex. While tuning properties of the cells in the early part of the ventral visual stream, thought to be responsible for object recognition in the primate, are comparatively well understood, several different theories have been proposed regarding tuning in higher visual areas, such as V4. We used the model of object recognition in cortex presented by Riesenhuber and Poggio (1999), where more complex shape tuning in higher layers is the result of combining afferent inputs tuned to simpler features, and compared the tuning properties of model units in intermediate layers to those of V4 neurons from the literature. In particular, we investigated the issue of shape representation in visual areas V1 and V4 using oriented bars and various types of gratings (polar, hyperbolic, and Cartesian), as used in several physiology experiments. Our computational model was able to reproduce several physiological findings, such as the broadening distribution of the orientation bandwidths and the emergence of a bias toward non-Cartesian stimuli. Interestingly, the simulation results suggest that some V4 neurons receive input from afferents with spatially separated receptive fields, leading to experimentally testable predictions. However, the simulations also show that the stimulus set of Cartesian and non-Cartesian gratings is not sufficiently complex to probe shape tuning in higher areas, necessitating the use of more complex stimulus sets.

AIM-2003-020

CBCL-230

Author[s]: Hiroaki Shimizu and Tomaso Poggio

Direction Estimation of Pedestrian from Images

August 27, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-020.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-020.pdf

The capability of estimating the walking direction of people would be useful in many applications such as those involving autonomous cars and robots. We introduce an approach for estimating the walking direction of people from images, based on learning the correct classification of a still image by using SVMs. We find that the performance of the system can be improved by classifying each image of a walking sequence and combining the outputs of the classifier. Experiments were performed to evaluate our system and estimate the trade-off between number of images in walking sequences and performance.
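
The idea of combining per-image classifier outputs over a walking sequence can be illustrated with a short sketch. The abstract does not say which combination rule is used, so summing per-class scores across frames is an assumption here, as are the function name and the score dictionaries; the per-frame scores stand in for the outputs of SVM classifiers.

```python
def combine_sequence_scores(frame_scores):
    """Pick one direction for a whole sequence from per-frame class scores.

    frame_scores: list of dicts mapping a direction label to a classifier
    score for one frame. Summation is an assumed rule, not necessarily
    the one used in the memo.
    """
    totals = {}
    for scores in frame_scores:
        for direction, score in scores.items():
            totals[direction] = totals.get(direction, 0.0) + score
    # The label with the largest accumulated score wins.
    return max(totals, key=totals.get)


# Hypothetical scores for a four-frame sequence; the second frame disagrees.
example_frames = [
    {"left": 0.6, "right": 0.4},
    {"left": 0.2, "right": 0.8},
    {"left": 0.7, "right": 0.3},
    {"left": 0.55, "right": 0.45},
]
sequence_label = combine_sequence_scores(example_frames)
```

A single misclassified frame is outvoted by the rest of the sequence, which is the intuition behind the reported trade-off between sequence length and performance.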

AIM-2003-019

Author[s]: Sayan Mukherjee, Polina Golland and Dmitry Panchenko

Permutation Tests for Classification

August 28, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-019.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-019.pdf

We introduce and explore an approach to estimating statistical significance of classification accuracy, which is particularly useful in scientific applications of machine learning where high dimensionality of the data and the small number of training examples render most standard convergence bounds too loose to yield a meaningful guarantee of the generalization ability of the classifier. Instead, we estimate statistical significance of the observed classification accuracy, or the likelihood of observing such accuracy by chance due to spurious correlations of the high-dimensional data patterns with the class labels in the given training set. We adopt permutation testing, a non-parametric technique previously developed in classical statistics for hypothesis testing in the generative setting (i.e., comparing two probability distributions). We demonstrate the method on real examples from neuroimaging studies and DNA microarray analysis and suggest a theoretical analysis of the procedure that relates the asymptotic behavior of the test to the existing convergence bounds.
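
A minimal sketch of a permutation test for classification accuracy follows. It is an illustration under stated assumptions, not the procedure from the memo: the classifier (leave-one-out nearest centroid), the toy data, and all function names are hypothetical choices made for the example. The p-value is the fraction of label shufflings whose accuracy is at least as high as the observed one.

```python
import random


def nearest_centroid_accuracy(X, y):
    """Leave-one-out accuracy of a nearest-centroid classifier (assumed classifier)."""
    correct = 0
    for i in range(len(X)):
        # Build class centroids from every sample except the held-out one.
        sums, counts = {}, {}
        for j, (x, label) in enumerate(zip(X, y)):
            if j == i:
                continue
            acc = sums.setdefault(label, [0.0] * len(x))
            for k, value in enumerate(x):
                acc[k] += value
            counts[label] = counts.get(label, 0) + 1

        def squared_distance(label):
            return sum(
                (X[i][k] - sums[label][k] / counts[label]) ** 2
                for k in range(len(X[i]))
            )

        predicted = min(sums, key=squared_distance)
        correct += predicted == y[i]
    return correct / len(X)


def permutation_p_value(X, y, n_permutations=200, seed=0):
    """Estimate how often shuffled labels reach the observed accuracy."""
    rng = random.Random(seed)
    observed = nearest_centroid_accuracy(X, y)
    labels = list(y)
    at_least_as_good = 0
    for _ in range(n_permutations):
        rng.shuffle(labels)  # break any real link between data and labels
        if nearest_centroid_accuracy(X, labels) >= observed:
            at_least_as_good += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (at_least_as_good + 1) / (n_permutations + 1)


# Toy data: two well-separated one-dimensional clusters of ten points each.
X = [[0.1 * i] for i in range(10)] + [[5.0 + 0.1 * i] for i in range(10)]
y = [0] * 10 + [1] * 10
observed_accuracy, p_value = permutation_p_value(X, y)
```

With few samples and many features, the same loop applies unchanged; only the classifier inside `nearest_centroid_accuracy` would be swapped for the one under study.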

AIM-2003-018

CBCL-229

Author[s]: Benjamin J. Balas, Pawan Sinha

Dissociated Dipoles: Image representation via non-local comparisons

August 13, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-018.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-018.pdf

A fundamental question in visual neuroscience is how to represent image structure. The most common representational schemes rely on differential operators that compare adjacent image regions. While well-suited to encoding local relationships, such operators have significant drawbacks. Specifically, each filter’s span is confounded with the size of its sub-fields, making it difficult to compare small regions across large distances. We find that such long-distance comparisons are more tolerant to common image transformations than purely local ones, suggesting they may provide a useful vocabulary for image encoding. We introduce the “Dissociated Dipole,” or “Sticks” operator, for encoding non-local image relationships. This operator de-couples filter span from sub-field size, enabling parametric movement between edge and region-based representation modes. We report on the perceptual plausibility of the operator, and the computational advantages of non-local encoding. Our results suggest that non-local encoding may be an effective scheme for representing image structure.

AITR-2003-017

Author[s]: Austin Che

Fluorescence Assay for Polymerase Arrival Rates

August 31, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-017.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-017.pdf

Engineering complex synthetic biological systems will require modular design, assembly, and characterization strategies. The RNA polymerase arrival rate (PAR) is defined to be the rate at which RNA polymerases arrive at a specified location on the DNA. Designing and characterizing biological modules in terms of RNA polymerase arrival rates provides many advantages in the construction and modeling of biological systems. PARMESAN is an in vitro method for measuring polymerase arrival rates using pyrrolo-dC, a fluorescent DNA base that can substitute for cytosine. Pyrrolo-dC shows a detectable fluorescence difference when in single-stranded versus double-stranded DNA. During transcription, RNA polymerase separates the two strands of DNA, leading to a change in the fluorescence of pyrrolo-dC. By incorporating pyrrolo-dC at specific locations in the DNA, fluorescence changes can be taken as a direct measurement of the polymerase arrival rate.

AIM-2003-017

Author[s]: Jacob Beal

Near-Optimal Distributed Failure Circumscription

August 11, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-017.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-017.pdf

Small failures should only disrupt a small part of a network. One way to do this is by marking the surrounding area as untrustworthy --- circumscribing the failure. This can be done with a distributed algorithm using hierarchical clustering and neighbor relations, and the resulting circumscription is near-optimal for convex failures.

AIM-2003-016

Author[s]: Krzysztof Gajos and Howard Shrobe

Delegation, Arbitration and High-Level Service Discovery as Key Elements of a Software Infrastructure for Pervasive Computing

June 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-016.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-016.pdf

The dream of pervasive computing is slowly becoming a reality. A number of projects around the world are constantly contributing ideas and solutions that are bound to change the way we interact with our environments and with one another. An essential component of the future is a software infrastructure that is capable of supporting interactions on scales ranging from a single physical space to intercontinental collaborations. Such infrastructure must help applications adapt to very diverse environments and must protect people’s privacy and respect their personal preferences. In this paper we indicate a number of limitations present in the software infrastructures proposed so far (including our previous work). We then describe the framework for building an infrastructure that satisfies the abovementioned criteria. This framework hinges on the concepts of delegation, arbitration and high-level service discovery. Components of our own implementation of such an infrastructure are presented.

AITR-2003-016

Author[s]: Pedro F. Felzenszwalb

Representation and Detection of Shapes in Images

August 8, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-016.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-016.pdf

We present a set of techniques that can be used to represent and detect shapes in images. Our methods revolve around a particular shape representation based on the description of objects using triangulated polygons. This representation is similar to the medial axis transform and has important properties from a computational perspective. The first problem we consider is the detection of non-rigid objects in images using deformable models. We present an efficient algorithm to solve this problem in a wide range of situations, and show examples in both natural and medical images. We also consider the problem of learning an accurate non-rigid shape model for a class of objects from examples. We show how to learn good models while constraining them to the form required by the detection algorithm. Finally, we consider the problem of low-level image segmentation and grouping. We describe a stochastic grammar that generates arbitrary triangulated polygons while capturing Gestalt principles of shape regularity. This grammar is used as a prior model over random shapes in a low level algorithm that detects objects in images.

AIM-2003-015

Author[s]: Kimberle Koile, Konrad Tollmar, David Demirdjian, Howard Shrobe and Trevor Darrell

Activity Zones for Context-Aware Computing

June 10, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-015.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-015.pdf

Location is a primary cue in many context-aware computing systems, and is often represented as a global coordinate, room number, or Euclidean distance to various landmarks. A user's concept of location, however, is often defined in terms of regions in which common activities occur. We show how to partition a space into such regions based on patterns of observed user location and motion. These regions, which we call activity zones, represent regions of similar user activity, and can be used to trigger application actions, retrieve information based on previous context, and present information to users. We suggest that context-aware applications can benefit from a location representation learned from observing users. We describe an implementation of our system and present two example applications whose behavior is controlled by users' entry, exit, and presence in the zones.

AITR-2003-015

Author[s]: Samson Timoner

Compact Representations for Fast Nonrigid Registration of Medical Images

July 4, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-015.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-015.pdf

We develop efficient techniques for the non-rigid registration of medical images by using representations that adapt to the anatomy found in such images. Images of anatomical structures typically have uniform intensity interiors and smooth boundaries. We create methods to represent such regions compactly using tetrahedra. Unlike voxel-based representations, tetrahedra can accurately describe the expected smooth surfaces of medical objects. Furthermore, the interior of such objects can be represented using a small number of tetrahedra. Rather than describing a medical object using tens of thousands of voxels, our representations generally contain only a few thousand elements. Tetrahedra facilitate the creation of efficient non-rigid registration algorithms based on finite element methods (FEM). We create a fast, FEM-based method to non-rigidly register segmented anatomical structures from two subjects. Using our compact tetrahedral representations, this method generally requires less than one minute of processing time on a desktop PC. We also create a novel method for the non-rigid registration of gray scale images. To facilitate a fast method, we create a tetrahedral representation of a displacement field that automatically adapts to both the anatomy in an image and to the displacement field. The resulting algorithm has a computational cost that is dominated by the number of nodes in the mesh (about 10,000), rather than the number of voxels in an image (nearly 10,000,000). For many non-rigid registration problems, we can find a transformation from one image to another in five minutes. This speed is important as it allows use of the algorithm during surgery. We apply our algorithms to find correlations between the shape of anatomical structures and the presence of schizophrenia. We show that a study based on our representations outperforms studies based on other representations. 
We also use the results of our non-rigid registration algorithm as the basis of a segmentation algorithm. That algorithm also outperforms other methods in our tests, producing smoother segmentations and more accurately reproducing manual segmentations.

AIM-2003-014

Author[s]: Martin C. Martin

The Essential Dynamics Algorithm: Essential Results

May 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-014.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-014.pdf

This paper presents a novel algorithm for learning in a class of stochastic Markov decision processes (MDPs) with continuous state and action spaces that trades speed for accuracy. A transform of the stochastic MDP into a deterministic one is presented which captures the essence of the original dynamics, in a sense made precise. In this transformed MDP, the calculation of values is greatly simplified. The online algorithm estimates the model of the transformed MDP and simultaneously does policy search against it. Bounds on the error of this approximation are proven, and experimental results in a bicycle riding domain are presented. The algorithm learns near optimal policies in orders of magnitude fewer interactions with the stochastic MDP, using less domain knowledge. All code used in the experiments is available on the project’s web site.

AITR-2003-014

Author[s]: Lily Lee

Gait Analysis for Classification

June 26, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-014.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-014.pdf

This thesis describes a representation of gait appearance for the purpose of person identification and classification. This gait representation is based on simple localized image features such as moments extracted from orthogonal view video silhouettes of human walking motion. A suite of time-integration methods, spanning a range of coarseness of time aggregation and modeling of feature distributions, is applied to these image features to create a suite of gait sequence representations. Despite their simplicity, the resulting feature vectors contain enough information to perform well on human identification and gender classification tasks. We demonstrate the accuracy of recognition on gait video sequences collected over different days and times and under varying lighting environments. Each of the integration methods is investigated for its advantages and disadvantages. An improved gait representation is built based on our experiences with the initial set of gait representations. In addition, we show gender classification results using our gait appearance features, the effect of our heuristic feature selection method, and the significance of individual features.

AIM-2003-013

Author[s]: Lawrence Shih and David Karger

Learning Classes Correlated to a Hierarchy

May 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-013.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-013.pdf

Trees are a common way of organizing large amounts of information by placing items with similar characteristics near one another in the tree. We introduce a classification problem where a given tree structure gives us information on the best way to label nearby elements. We suggest there are many practical problems that fall under this domain. We propose a way to map the classification problem onto a standard Bayesian inference problem. We also give a fast, specialized inference algorithm that incrementally updates relevant probabilities. We apply this algorithm to web-classification problems and show that our algorithm empirically works well.

AITR-2003-013

Author[s]: Matthew J. Marjanovic

Teaching an Old Robot New Tricks: Learning Novel Tasks via Interaction with People and Things

June 20, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-013.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-013.pdf

As AI has begun to reach out beyond its symbolic, objectivist roots into the embodied, experientialist realm, many projects are exploring different aspects of creating machines which interact with and respond to the world as humans do. Techniques for visual processing, object recognition, emotional response, gesture production and recognition, etc., are necessary components of a complete humanoid robot. However, most projects invariably concentrate on developing a few of these individual components, neglecting the issue of how all of these pieces would eventually fit together. The focus of the work in this dissertation is on creating a framework into which such specific competencies can be embedded, in a way that they can interact with each other and build layers of new functionality. To be of any practical value, such a framework must satisfy the real-world constraints of functioning in real-time with noisy sensors and actuators. The humanoid robot Cog provides an unapologetically adequate platform from which to take on such a challenge. This work makes three contributions to embodied AI. First, it offers a general-purpose architecture for developing behavior-based systems distributed over networks of PC's. Second, it provides a motor-control system that simulates several biological features which impact the development of motor behavior. Third, it develops a framework for a system which enables a robot to learn new behaviors via interacting with itself and the outside world. A few basic functional modules are built into this framework, enough to demonstrate the robot learning some very simple behaviors taught by a human trainer. A primary motivation for this project is the notion that it is practically impossible to build an "intelligent" machine unless it is designed partly to build itself. This work is a proof-of-concept of such an approach to integrating multiple perceptual and motor systems into a complete learning agent.

AITR-2003-012

Author[s]: Andreas F. Wehowsky

Safe Distributed Coordination of Heterogeneous Robots through Dynamic Simple Temporal Networks

May 30, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-012.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-012.pdf

Research on autonomous intelligent systems has focused on how robots can robustly carry out missions in uncertain and harsh environments with very little or no human intervention. Robotic execution languages such as RAPs, ESL, and TDL improve robustness by managing functionally redundant procedures for achieving goals. The model-based programming approach extends this by guaranteeing correctness of execution through pre-planning of non-deterministic timed threads of activities. Executing model-based programs effectively on distributed autonomous platforms requires distributing this pre-planning process. This thesis presents a distributed planner for model-based programs whose planning and execution are distributed among agents with widely varying levels of processor power and memory resources. We make two key contributions. First, we reformulate a model-based program, which describes cooperative activities, into a hierarchical dynamic simple temporal network. This enables efficient distributed coordination of robots and supports deployment on heterogeneous robots. Second, we introduce a distributed temporal planner, called DTP, which solves hierarchical dynamic simple temporal networks with the assistance of the distributed Bellman-Ford shortest path algorithm. The implementation of DTP has been demonstrated successfully on a wide range of randomly generated examples and on a pursuer-evader challenge problem in simulation.
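
As a rough illustration of why a shortest-path algorithm is relevant here: a simple temporal network is consistent exactly when its distance graph has no negative cycle, which Bellman-Ford can detect. The sketch below is a centralized, single-process version written for this listing; it is not the distributed DTP planner from the thesis, and the function name and constraint format are assumptions.

```python
def stn_consistent(num_events, constraints):
    """Check a simple temporal network for consistency.

    constraints: list of (i, j, lo, hi), meaning lo <= t_j - t_i <= hi.
    Returns (True, schedule) if consistent, otherwise (False, None).
    """
    # Build the distance graph: an edge u -> v with weight w encodes t_v - t_u <= w.
    edges = []
    for i, j, lo, hi in constraints:
        edges.append((i, j, hi))    # t_j - t_i <= hi
        edges.append((j, i, -lo))   # t_i - t_j <= -lo

    # Starting every distance at 0 is equivalent to adding a virtual source
    # that reaches every event with a zero-weight edge.
    dist = [0.0] * num_events
    for _ in range(num_events):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            # Converged: the distances satisfy every constraint.
            return True, dist
    # Still relaxing after num_events passes: a negative cycle exists.
    return False, None
```

When the network is consistent, the returned distances can be read directly as one feasible assignment of times to events.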

AIM-2003-012

Author[s]: Jacob Beal

A Robust Amorphous Hierarchy from Persistent Nodes

May 1, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-012.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-012.pdf

For a very large network deployed in space with only nearby nodes able to talk to each other, we want to do tasks like robust routing and data storage. One way to organize the network is via a hierarchy, but hierarchies often have a few critical nodes whose death can disrupt organization over long distances. I address this with a system of distributed aggregates called Persistent Nodes, such that spatially local failures disrupt the hierarchy in an area proportional to the diameter of the failure. I describe and analyze this system, which has been implemented in simulation.

AITR-2003-011

Author[s]: Claire Monteleoni

Online Learning of Non-stationary Sequences

June 12, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-011.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-011.pdf

We consider an online learning scenario in which the learner can make predictions on the basis of a fixed set of experts. The performance of each expert may change over time in a manner unknown to the learner. We formulate a class of universal learning algorithms for this problem by expressing them as simple Bayesian algorithms operating on models analogous to Hidden Markov Models (HMMs). We derive a new performance bound for such algorithms which is considerably simpler than existing bounds. The bound provides the basis for learning the rate at which the identity of the optimal expert switches over time. We find an analytic expression for the a priori resolution at which we need to learn the rate parameter. We extend our scalar switching-rate result to models of the switching-rate that are governed by a matrix of parameters, i.e. arbitrary homogeneous HMMs. We apply and examine our algorithm in the context of the problem of energy management in wireless networks. We analyze the new results in the framework of Information Theory.
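
To make the switching-experts setting concrete, here is a minimal sketch of a fixed-share style weight update, which can be read as one Bayesian filtering step in an HMM whose hidden state is the identity of the current best expert. It is a generic textbook-style update and not necessarily the algorithm developed in the thesis; the function name, the exponential loss-to-likelihood mapping, and the uniform switching model governed by `alpha` are assumptions made for illustration.

```python
import math


def fixed_share_update(weights, losses, alpha):
    """One round of a fixed-share style update over a fixed set of experts.

    weights: current probabilities over experts (sums to 1).
    losses:  loss suffered by each expert this round.
    alpha:   assumed probability that the best expert switches each round.
    """
    n = len(weights)
    # Bayesian update: treat exp(-loss) as the likelihood of each expert.
    post = [w * math.exp(-l) for w, l in zip(weights, losses)]
    total = sum(post)
    post = [p / total for p in post]
    if n == 1:
        return post
    # Transition step: stay with probability 1 - alpha, otherwise move
    # uniformly to one of the other n - 1 experts.
    return [(1 - alpha) * p + alpha * (1 - p) / (n - 1) for p in post]
```

With `alpha = 0` this reduces to the static Bayesian mixture; a positive `alpha` keeps every expert's weight bounded away from zero, so the learner can recover quickly when the best expert changes.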

AIM-2003-011

Author[s]: Jacob Beal

Persistent Nodes for Reliable Memory in Geographically Local Networks

April 15, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-011.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-011.pdf

A Persistent Node is a redundant distributed mechanism for storing a key/value pair reliably in a geographically local network. In this paper, I develop a method of establishing Persistent Nodes in an amorphous matrix. I address issues of construction, usage, atomicity guarantees and reliability in the face of stopping failures. Applications include routing, congestion control, and data storage in gigascale networks.

AITR-2003-010

CBCL-228

Author[s]: Ezra Rosen

Face Representation in Cortex: Studies Using a Simple and Not So Special Model

June 5, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-010.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-010.pdf

The face inversion effect has been widely documented as an effect of the uniqueness of face processing. Using a computational model, we show that the face inversion effect is a byproduct of expertise with respect to the face object class. In simulations using HMAX, a hierarchical, shape based model, we show that the magnitude of the inversion effect is a function of the specificity of the representation. Using many, sharply tuned units, an "expert" has a large inversion effect. On the other hand, if fewer, broadly tuned units are used, the expertise is lost, and this "novice" has a small inversion effect. As the size of the inversion effect is a product of the representation, not the object class, given the right training we can create experts and novices in any object class. Using the same representations as with faces, we create experts and novices for cars. We also measure the feasibility of a view-based model for recognition of rotated objects using HMAX. Using faces, we show that transfer of learning to novel views is possible. Given only one training view, the view-based model can recognize a face at a new orientation via interpolation from the views to which it had been tuned. Although the model can generalize well to upright faces, inverted faces yield poor performance because the features change differently under rotation.

AIM-2003-010

Author[s]: Chris Mario Christoudias, Louis-Philippe Morency and Trevor Darrell

Light Field Morphable Models

April 18, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-010.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-010.pdf

Statistical shape and texture appearance models are powerful image representations, but previously had been restricted to 2D or simple 3D shapes. In this paper we present a novel 3D morphable model based on image-based rendering techniques, which can represent complex lighting conditions, structures, and surfaces. We describe how to construct a manifold of the multi-view appearance of an object class using light fields and show how to match a 2D image of an object to a point on this manifold. In turn we use the reconstructed light field to render novel views of the object. Our technique overcomes the limitations of polygon based appearance models and uses light fields that are acquired in real-time.

AITR-2003-009

CBCL-227

Author[s]: Jennifer Louie

A Biological Model of Object Recognition with Feature Learning

June 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-009.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-009.pdf

Previous biological models of object recognition in cortex have been evaluated using idealized scenes and have hard-coded features, such as the HMAX model by Riesenhuber and Poggio [10]. Because HMAX uses the same set of features for all object classes, it does not perform well in the task of detecting a target object in clutter. This thesis presents a new model that integrates learning of object-specific features with the HMAX. The new model performs better than the standard HMAX and comparably to a computer vision system on face detection. Results from experimenting with unsupervised learning of features and the use of a biologically-plausible classifier are presented.

AIM-2003-009

Author[s]: Gregory Shakhnarovich, Paul Viola and Trevor Darrell

Fast Pose Estimation with Parameter Sensitive Hashing

April 18, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-009.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-009.pdf

Example-based methods are effective for parameter estimation problems when the underlying system is simple or the dimensionality of the input is low. For complex and high-dimensional problems such as pose estimation, the number of required examples and the computational complexity rapidly become prohibitively high. We introduce a new algorithm that learns a set of hashing functions that efficiently index examples relevant to a particular estimation task. Our algorithm extends a recently developed method for locality-sensitive hashing, which finds approximate neighbors in time sublinear in the number of examples. This method depends critically on the choice of hash functions; we show how to find the set of hash functions that are optimally relevant to a particular estimation problem. Experiments demonstrate that the resulting algorithm, which we call Parameter-Sensitive Hashing, can rapidly and accurately estimate the articulated pose of human figures from a large database of example images.
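
For readers unfamiliar with the underlying idea, the following is a minimal sketch of plain locality-sensitive hashing with randomly chosen axis-aligned threshold functions. It illustrates only the indexing scheme that the abstract says is extended; the hash functions here are random rather than learned for a particular estimation task, so this is not Parameter-Sensitive Hashing itself. The class name and all parameters are hypothetical.

```python
import random
from collections import defaultdict


class AxisThresholdLSH:
    """Toy LSH index: each hash bit tests one coordinate against a threshold."""

    def __init__(self, dim, num_tables=8, bits_per_key=6, lo=0.0, hi=1.0, seed=0):
        rng = random.Random(seed)
        self.tables = []
        for _ in range(num_tables):
            # Each table gets its own random (dimension, threshold) tests.
            funcs = [(rng.randrange(dim), rng.uniform(lo, hi))
                     for _ in range(bits_per_key)]
            self.tables.append((funcs, defaultdict(list)))

    @staticmethod
    def _key(funcs, x):
        return tuple(x[d] > t for d, t in funcs)

    def add(self, x, label):
        """Store example x (a tuple of floats) with its label in every table."""
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, x)].append((x, label))

    def query(self, x):
        """Return the label of the closest candidate sharing a bucket with x."""
        candidates = {}
        for funcs, buckets in self.tables:
            for point, label in buckets.get(self._key(funcs, x), []):
                candidates[label] = point
        if not candidates:
            return None
        # Exact distances are computed only for the retrieved candidates.
        return min(candidates,
                   key=lambda lab: sum((a - b) ** 2
                                       for a, b in zip(candidates[lab], x)))
```

Only examples that collide with the query in at least one table are compared exactly, which is what keeps lookup cheap on large databases; nearby points that never share a bucket are missed, so the result is approximate.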

AITR-2003-008

Author[s]: Paul Fitzpatrick

From First Contact to Close Encounters: A Developmentally Deep Perceptual System for a Humanoid Robot

June 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-008.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-008.pdf

This thesis presents a perceptual system for a humanoid robot that integrates abilities such as object localization and recognition with the deeper developmental machinery required to forge those competences out of raw physical experiences. It shows that a robotic platform can build up and maintain a system for object localization, segmentation, and recognition, starting from very little. What the robot starts with is a direct solution to achieving figure/ground separation: it simply 'pokes around' in a region of visual ambiguity and watches what happens. If the arm passes through an area, that area is recognized as free space. If the arm collides with an object, causing it to move, the robot can use that motion to segment the object from the background. Once the robot can acquire reliable segmented views of objects, it learns from them, and from then on recognizes and segments those objects without further contact. Both low-level and high-level visual features can also be learned in this way, and examples are presented for both: orientation detection and affordance recognition, respectively. The motivation for this work is simple. Training on large corpora of annotated real-world data has proven crucial for creating robust solutions to perceptual problems such as speech recognition and face detection. But the powerful tools used during training of such systems are typically stripped away at deployment. Ideally they should remain, particularly for unstable tasks such as object detection, where the set of objects needed in a task tomorrow might be different from the set of objects needed today. The key limiting factor is access to training data, but as this thesis shows, that need not be a problem on a robotic platform that can actively probe its environment, and carry out experiments to resolve ambiguity. 
This work is an instance of a general approach to learning a new perceptual judgment: find special situations in which the perceptual judgment is easy and study these situations to find correlated features that can be observed more generally.

AIM-2003-008

Author[s]: Kristen Grauman, Gregory Shakhnarovich and Trevor Darrell

Inferring 3D Structure with a Statistical Image-Based Shape Model

April 17, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-008.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-008.pdf

We present an image-based approach to infer 3D structure parameters using a probabilistic "shape+structure" model. The 3D shape of a class of objects may be represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes can then be estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We augment the shape model to incorporate structural features of interest; novel examples with missing structure parameters may then be reconstructed to obtain estimates of these parameters. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a dataset of thousands of pedestrian images generated from a synthetic model, we can perform accurate inference of the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

AITR-2003-007

Author[s]: Kristen Grauman

A Statistical Image-Based Shape Model for Visual Hull Reconstruction and 3D Structure Inference

May 22, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-007.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-007.pdf

We present a statistical image-based “shape + structure” model for Bayesian visual hull reconstruction and 3D structure inference. The 3D shape of a class of objects is represented by sets of contours from silhouette views simultaneously observed from multiple calibrated cameras. Bayesian reconstructions of new shapes are then estimated using a prior density constructed with a mixture model and probabilistic principal components analysis. We show how the use of a class-specific prior in a visual hull reconstruction can reduce the effect of segmentation errors from the silhouette extraction process. The proposed method is applied to a data set of pedestrian images, and improvements in the approximate 3D models under various noise conditions are shown. We further augment the shape model to incorporate structural features of interest; unknown structural parameters for a novel set of contours are then inferred via the Bayesian reconstruction process. Model matching and parameter inference are done entirely in the image domain and require no explicit 3D construction. Our shape model enables accurate estimation of structure despite segmentation errors or missing views in the input silhouettes, and works even with only a single input view. Using a data set of thousands of pedestrian images generated from a synthetic model, we can accurately infer the 3D locations of 19 joints on the body based on observed silhouette contours from real images.

AIM-2003-007

Author[s]: Jacob Beal

Leveraging Learning and Language Via Communication Bootstrapping

March 17, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-007.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-007.pdf

In a Communication Bootstrapping system, peer components with different perceptual worlds invent symbols and syntax based on correlations between their percepts. I propose that Communication Bootstrapping can also be used to acquire functional definitions of words and causal reasoning knowledge. I illustrate this point with several examples, then sketch the architecture of a system in progress which attempts to execute this task.

AITR-2003-006

Author[s]: Louis-Philippe Morency

Stereo-Based Head Pose Tracking Using Iterative Closest Point and Normal Flow Constraint

May 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-006.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-006.pdf

In this text, we present two stereo-based head tracking techniques along with a fast 3D model acquisition system. The first tracking technique is a robust implementation of stereo-based head tracking designed for interactive environments with uncontrolled lighting. We integrate fast face detection and drift reduction algorithms with a gradient-based stereo rigid motion tracking technique. Our system can automatically segment and track a user's head under large rotation and illumination variations. Precision and usability of this approach are compared with previous tracking methods for cursor control and target selection in both desktop and interactive room environments. The second tracking technique is designed to improve the robustness of head pose tracking for fast movements. Our iterative hybrid tracker combines constraints from the ICP (Iterative Closest Point) algorithm and normal flow constraint. This new technique is more precise for small movements and noisy depth than ICP alone, and more robust for large movements than the normal flow constraint alone. We present experiments which test the accuracy of our approach on sequences of real and synthetic stereo images. The 3D model acquisition system we present quickly aligns intensity and depth images, and reconstructs a textured 3D mesh. 3D views are registered with shape alignment based on our iterative hybrid tracker. We reconstruct the 3D model using a new Cubic Ray Projection merging algorithm which takes advantage of a novel data structure: the linked voxel space. We present experiments to test the accuracy of our approach on 3D face modelling using real-time stereo images.

AIM-2003-006

Author[s]: Christine Alvarado, Jaime Teevan, Mark S. Ackerman and David Karger

Surviving the Information Explosion: How People Find Their Electronic Information

April 15, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-006.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-006.pdf

We report on a study of how people look for information within email, files, and the Web. When locating a document or searching for a specific answer, people relied on their contextual knowledge of their information target to help them find it, often associating the target with a specific document. They appeared to prefer to use this contextual information as a guide in navigating locally in small steps to the desired document rather than directly jumping to their target. We found this behavior was especially true for people with unstructured information organization. We discuss the implications of our findings for the design of personal information management tools.

AIM-2003-005

Author[s]: Antonio Torralba, Kevin P. Murphy, William T. Freeman and Mark A. Rubin

Context-Based Vision System for Place and Object Recognition

March 19, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-005.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-005.pdf

While navigating in an environment, a vision system has to be able to recognize where it is and what the main objects in the scene are. In this paper we present a context-based vision system for place and object recognition. The goal is to identify familiar locations (e.g., office 610, conference room 941, Main Street), to categorize new environments (office, corridor, street) and to use that information to provide contextual priors for object recognition (e.g., table, chair, car, computer). We present a low-dimensional global image representation that provides relevant information for place recognition and categorization, and show how such contextual information introduces strong priors that simplify object recognition. We have trained the system to recognize over 60 locations (indoors and outdoors) and to suggest the presence and locations of more than 20 different object types. The algorithm has been integrated into a mobile system that provides real-time feedback to the user.

AITR-2003-005

CBCL-226

Author[s]: Sanmay Das

Intelligent Market-Making in Artificial Financial Markets

June 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-005.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-005.pdf

This thesis describes and evaluates a market-making algorithm for setting prices in financial markets with asymmetric information, and analyzes the properties of artificial markets in which the algorithm is used. The core of our algorithm is a technique for maintaining an online probability density estimate of the underlying value of a stock. Previous theoretical work on market-making has led to price-setting equations for which solutions cannot be achieved in practice, whereas empirical work on algorithms for market-making has focused on sets of heuristics and rules that lack theoretical justification. The algorithm presented in this thesis is theoretically justified by results in finance, and at the same time flexible enough to be easily extended by incorporating modules for dealing with considerations like portfolio risk and competition from other market-makers. We analyze the performance of our algorithm experimentally in artificial markets with different parameter settings and find that many reasonable real-world properties emerge. For example, the spread increases in response to uncertainty about the true value of a stock, average spreads tend to be higher in more volatile markets, and market-makers with lower average spreads perform better in environments with multiple competitive market-makers. In addition, the time series data generated by simple markets populated with market-makers using our algorithm replicate properties of real-world financial time series, such as volatility clustering and the fat-tailed nature of return distributions, without the need to specify explicit models for opinion propagation and herd behavior in the trading crowd.

AIM-2003-004

CBCL-225

Author[s]: Izzat N. Jarudi and Pawan Sinha

Relative Contributions of Internal and External Features to Face Recognition

March 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-004.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-004.pdf

The central challenge in face recognition lies in understanding the role different facial features play in our judgments of identity. Notable in this regard are the relative contributions of the internal (eyes, nose and mouth) and external (hair and jaw-line) features. Past studies that have investigated this issue have typically used high-resolution images or good-quality line drawings as facial stimuli. The results obtained are therefore most relevant for understanding the identification of faces at close range. However, given that real-world viewing conditions are rarely optimal, it is also important to know how image degradations, such as loss of resolution caused by large viewing distances, influence our ability to use internal and external features. Here, we report experiments designed to address this issue. Our data characterize how the relative contributions of internal and external features change as a function of image resolution. While we replicated results of previous studies that have shown internal features of familiar faces to be more useful for recognition than external features at high resolution, we found that the two feature sets reverse in importance as resolution decreases. These results suggest that the visual system uses a highly non-linear cue-fusion strategy in combining internal and external features along the dimension of image resolution and that the configural cues that relate the two feature sets play an important role in judgments of facial identity.

AITR-2003-004

Author[s]: Aaron D. Adler

Segmentation and Alignment of Speech and Sketching in a Design Environment

February 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-004.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-004.pdf

Sketches are commonly used in the early stages of design. Our previous system allows users to sketch mechanical systems that the computer interprets. However, some parts of the mechanical system might be too hard or too complicated to express in the sketch. Adding speech recognition to create a multimodal system would move us toward our goal of creating a more natural user interface. This thesis examines the relationship between the verbal and sketch input, particularly how to segment and align the two inputs. Toward this end, subjects were recorded while they sketched and talked. These recordings were transcribed, and a set of rules to perform segmentation and alignment was created. These rules represent the knowledge that the computer needs to perform segmentation and alignment. The rules successfully interpreted the 24 data sets that they were given.

AIM-2003-003

CBCL-224

Author[s]: Gadi Geiger, Tony Ezzat and Tomaso Poggio

Perceptual Evaluation of Video-Realistic Speech

February 28, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-003.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-003.pdf

With many visual speech animation techniques now available, there is a clear need for systematic perceptual evaluation schemes. We describe here our scheme and its application to a new video-realistic (potentially indistinguishable from real recorded video) visual-speech animation system, called Mary 101. Two types of experiments were performed: a) distinguishing visually between real and synthetic image-sequences of the same utterances ("Turing tests"), and b) gauging visual speech recognition by comparing lip-reading performance on the real and synthetic image-sequences of the same utterances ("Intelligibility tests"). Subjects who were presented randomly with either real or synthetic image-sequences could not tell the synthetic from the real sequences above chance level. The same subjects, when asked to lip-read the utterances from the same image-sequences, recognized speech from real image-sequences significantly better than from synthetic ones. However, performance for both real and synthetic sequences was at levels suggested in the literature on lip-reading. We conclude from the two experiments that the animation of Mary 101 is adequate for providing a percept of a talking head. However, additional effort is required to improve the animation for lip-reading purposes like rehabilitation and language learning. In addition, these two tasks could be considered as explicit and implicit perceptual discrimination tasks. In the explicit task (a), each stimulus is classified directly as a synthetic or real image-sequence by detecting a possible difference between the synthetic and the real image-sequences. The implicit perceptual discrimination task (b) consists of a comparison between visual recognition of speech of real and synthetic image-sequences. Our results suggest that implicit perceptual discrimination is a more sensitive method for discrimination between synthetic and real image-sequences than explicit perceptual discrimination.

AITR-2003-003

Author[s]: Leonid Peshkin

Reinforcement Learning by Policy Search

February 14, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-003.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-003.pdf

One objective of artificial intelligence is to model the behavior of an intelligent agent interacting with its environment. The environment's transformations can be modeled as a Markov chain, whose state is partially observable to the agent and affected by its actions; such processes are known as partially observable Markov decision processes (POMDPs). While the environment's dynamics are assumed to obey certain rules, the agent does not know them and must learn. In this dissertation we focus on the agent's adaptation as captured by the reinforcement learning framework. This means learning a policy (a mapping of observations into actions) based on feedback from the environment. The learning can be viewed as browsing a set of policies while evaluating them by trial through interaction with the environment. The set of policies is constrained by the architecture of the agent's controller. POMDPs require a controller to have a memory. We investigate controllers with memory, including controllers with external memory, finite state controllers and distributed controllers for multi-agent systems. For these various controllers we work out the details of the algorithms which learn by ascending the gradient of expected cumulative reinforcement. Building on statistical learning theory and experiment design theory, a policy evaluation algorithm is developed for the case of experience re-use. We address the question of sufficient experience for uniform convergence of policy evaluation and obtain sample complexity bounds for various estimators. Finally, we demonstrate the performance of the proposed algorithms on several domains, the most complex of which is simulated adaptive packet routing in a telecommunication network.

AIM-2003-002

Author[s]: Harald Steck and Tommi S. Jaakkola

(Semi-)Predictive Discretization During Model Selection

February 25, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-002.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-002.pdf

In this paper, we present an approach to discretizing multivariate continuous data while learning the structure of a graphical model. We derive the joint scoring function from the principle of predictive accuracy, which inherently ensures the optimal trade-off between goodness of fit and model complexity (including the number of discretization levels). Using the so-called finest grid implied by the data, our scoring function depends only on the number of data points in the various discretization levels. Not only can it be computed efficiently, but it is also independent of the metric used in the continuous space. Our experiments with gene expression data show that discretization plays a crucial role regarding the resulting network structure.

AITR-2003-002

Author[s]: Timothy Chklovski

Using Analogy to Acquire Commonsense Knowledge from Human Contributors

February 12, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-002.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-002.pdf

The goal of the work reported here is to capture the commonsense knowledge of non-expert human contributors. Achieving this goal will enable more intelligent human-computer interfaces and pave the way for computers to reason about our world. In the domain of natural language processing, it will provide the world knowledge much needed for semantic processing of natural language. To acquire knowledge from contributors not trained in knowledge engineering, I take the following four steps: (i) develop a knowledge representation (KR) model for simple assertions in natural language, (ii) introduce cumulative analogy, a class of nearest-neighbor based analogical reasoning algorithms over this representation, (iii) argue that cumulative analogy is well suited for knowledge acquisition (KA) based on a theoretical analysis of the effectiveness of KA with this approach, and (iv) test the KR model and the effectiveness of the cumulative analogy algorithms empirically. To investigate the effectiveness of cumulative analogy for KA empirically, Learner, an open source system for KA by cumulative analogy, has been implemented, deployed, and evaluated. (The site, "1001 Questions," is available at http://teach-computers.org/learner.html.) Learner acquires assertion-level knowledge by constructing shallow semantic analogies between a KA topic and its nearest neighbors and posing these analogies as natural language questions to human contributors. Suppose, for example, that based on the knowledge about "newspapers" already present in the knowledge base, Learner judges "newspaper" to be similar to "book" and "magazine." Further suppose that assertions "books contain information" and "magazines contain information" are also already in the knowledge base. Then Learner will use cumulative analogy from the similar topics to ask humans whether "newspapers contain information."
Because similarity between topics is computed based on what is already known about them, Learner exhibits bootstrapping behavior: the quality of its questions improves as it gathers more knowledge. By summing evidence for and against posing any given question, Learner also exhibits noise tolerance, limiting the effect of incorrect similarities. The KA power of shallow semantic analogy from nearest neighbors is one of the main findings of this thesis. I perform an analysis of commonsense knowledge collected by another research effort that did not rely on analogical reasoning and demonstrate that there is indeed a sufficient amount of correlation in the knowledge base to motivate using cumulative analogy from nearest neighbors as a KA method. Empirically, the percentages of questions answered affirmatively, answered negatively, and judged to be nonsensical in the cumulative analogy case compare favorably with the baseline, no-similarity case that relies on random objects rather than nearest neighbors. Of the questions generated by cumulative analogy, contributors answered 45% affirmatively, 28% negatively and marked 13% as nonsensical; in the control, no-similarity case 8% of questions were answered affirmatively, 60% negatively and 26% were marked as nonsensical.
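
The newspaper/book/magazine example in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration, not Learner's actual implementation: the toy knowledge base, the function names `similar_topics` and `propose_questions`, and the use of the shared-property count as the similarity score. The sketch also sums only evidence for a question, whereas the abstract describes summing evidence both for and against.

```python
from collections import Counter

# Toy knowledge base (hypothetical): topic -> set of asserted properties.
kb = {
    "newspaper": {"has pages", "is printed"},
    "book": {"has pages", "is printed", "contains information"},
    "magazine": {"has pages", "is printed", "contains information", "has ads"},
    "rock": {"is hard"},
}

def similar_topics(kb, topic, k=2):
    """Return up to k nearest neighbors, scored by number of shared properties."""
    shared = {other: len(kb[topic] & props)
              for other, props in kb.items() if other != topic}
    ranked = sorted(shared.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(t, s) for t, s in ranked[:k] if s > 0]

def propose_questions(kb, topic, k=2):
    """Sum neighbor similarity as evidence for each property the topic lacks."""
    votes = Counter()
    for neighbor, weight in similar_topics(kb, topic, k):
        for prop in kb[neighbor] - kb[topic]:
            votes[prop] += weight
    # Strongest candidate questions first; ties broken alphabetically.
    return sorted(votes.items(), key=lambda kv: (-kv[1], kv[0]))

questions = propose_questions(kb, "newspaper")
# questions == [("contains information", 4), ("has ads", 2)]
```

Because both neighbors vote for "contains information", it outranks a property asserted by only one of them, which is the noise-limiting effect of summing evidence in miniature.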

AIM-2003-001

Author[s]: Nathan Srebro and Tommi Jaakkola

Generalized Low-Rank Approximations

January 15, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-001.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AIM-2003-001.pdf

We study the frequent problem of approximating a target matrix with a matrix of lower rank. We provide a simple and efficient (EM) algorithm for solving weighted low rank approximation problems, which, unlike simple matrix factorization problems, do not admit a closed form solution in general. We analyze, in addition, the nature of locally optimal solutions that arise in this context, demonstrate the utility of accommodating the weights in reconstructing the underlying low rank representation, and extend the formulation to non-Gaussian noise models such as classification (collaborative filtering).
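
One way to picture an EM-style iteration for weighted low-rank approximation is sketched below. It assumes weights in [0, 1], restricts itself to rank 1, and uses plain Python lists. Each step fills the down-weighted entries with the current estimate and then takes an ordinary (unweighted) rank-1 approximation. The function names and the power-iteration helper are illustrative choices; the memo itself should be consulted for the authors' exact algorithm.

```python
def rank1_approx(M, iters=50):
    """Best rank-1 approximation of M, via power iteration on M^T M."""
    rows, cols = len(M), len(M[0])
    v = [1.0] * cols
    for _ in range(iters):
        u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = M v
        v = [sum(M[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # v = M^T u
        norm = sum(x * x for x in v) ** 0.5
        if norm == 0.0:
            return [[0.0] * cols for _ in range(rows)]
        v = [x / norm for x in v]
    u = [sum(M[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return [[u[i] * v[j] for j in range(cols)] for i in range(rows)]

def weighted_rank1_em(A, W, em_iters=500):
    """EM-style weighted rank-1 fit: fill in by weight, then refit unweighted."""
    rows, cols = len(A), len(A[0])
    X = [[0.0] * cols for _ in range(rows)]
    for _ in range(em_iters):
        # E-step: blend observed targets with the current estimate by weight.
        filled = [[W[i][j] * A[i][j] + (1.0 - W[i][j]) * X[i][j]
                   for j in range(cols)] for i in range(rows)]
        # M-step: unweighted low-rank approximation of the filled-in matrix.
        X = rank1_approx(filled)
    return X

# Rank-1 target [1,2,3] x [1,2] with the last entry unobserved (weight 0).
A = [[1.0, 2.0], [2.0, 4.0], [3.0, 0.0]]
W = [[1.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
X = weighted_rank1_em(A, W)
```

In this toy case the zero-weight entry acts as a missing value, and the iteration settles on a rank-1 matrix that agrees with the observed entries and fills in the unobserved one.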

AITR-2003-001

Author[s]: Philip Mjong-Hyon Shin Kim

Understanding Subsystems in Biology through Dimensionality Reduction, Graph Partitioning and Analytical Modeling

February 5, 2003

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-001.ps

ftp://publications.ai.mit.edu/ai-publications/2003/AITR-2003-001.pdf

Biological systems exhibit rich and complex behavior through the orchestrated interplay of a large array of components. It is hypothesized that separable subsystems with some degree of functional autonomy exist; deciphering their independent behavior and functionality would greatly facilitate understanding the system as a whole. Discovering and analyzing such subsystems are hence pivotal problems in the quest to gain a quantitative understanding of complex biological systems. In this work, using approaches from machine learning, physics and graph theory, methods for the identification and analysis of such subsystems were developed. A novel methodology, based on a recent machine learning algorithm known as non-negative matrix factorization (NMF), was developed to discover such subsystems in a set of large-scale gene expression data. This set of subsystems was then used to predict functional relationships between genes, and this approach was shown to score significantly higher than conventional methods when benchmarking them against existing databases. Moreover, a mathematical treatment was developed to treat simple network subsystems based only on their topology (independent of particular parameter values). Application to a problem of experimental interest demonstrated the need for extensions to the conventional model to fully explain the experimental data. Finally, the notion of a subsystem was evaluated from a topological perspective. A number of different protein networks were examined to analyze their topological properties with respect to separability, seeking to find separable subsystems. These networks were shown to exhibit separability in a nonintuitive fashion, while the separable subsystems were of strong biological significance. It was demonstrated that the separability property found was not due to incomplete or biased data, but is likely to reflect biological structure.

AIM-2002-024

CBCL-223

Author[s]: Sayan Mukherjee, Partha Niyogi, Tomaso Poggio and Ryan Rifkin

Statistical Learning: Stability is Sufficient for Generalization and Necessary and Sufficient for Consistency of Empirical Risk Minimization

December 2002 (revised July 2003)

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-024.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-024.pdf

Solutions of learning problems by Empirical Risk Minimization (ERM) need to be consistent, so that they may be predictive. They also need to be well-posed, so that they can be used robustly. We show that a statistical form of well-posedness, defined in terms of the key property of L-stability, is necessary and sufficient for consistency of ERM.

AIM-2002-023

CBCL-222

Author[s]: Luis Perez-Breva and Osamu Yoshimi

Model Selection in Summary Evaluation

December 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-023.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-023.pdf

A difficulty in the design of automated text summarization algorithms is in the objective evaluation. Viewing summarization as a tradeoff between length and information content, we introduce a technique based on a hierarchy of classifiers to rank, through model selection, different summarization methods. This summary evaluation technique allows for broader comparison of summarization methods than the traditional techniques of summary evaluation. We present an empirical study of two simple, albeit widely used, summarization methods that shows the different usages of this automated task-based evaluation system and confirms the results obtained with human-based evaluation methods over smaller corpora.

AIM-2002-022

Author[s]: Jake V. Bouvrie

Multiple Resolution Image Classification

December 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-022.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-022.pdf

Binary image classification is a problem that has received much attention in recent years. In this paper we evaluate a selection of popular techniques in an effort to find a feature set/classifier combination which generalizes well to full resolution image data. We then apply that system to images at one-half through one-sixteenth resolution, and consider the corresponding error rates. In addition, we further observe generalization performance as it depends on the number of training images, and lastly, compare the system's best error rates to those of a human performing an identical classification task given the same set of test images.

AIM-2002-021

Author[s]: Jacob Beal

Leaderless Distributed Hierarchy Formation

December 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-021.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-021.pdf

I present a system for robust leaderless organization of an amorphous network into hierarchical clusters. This system, which assumes that nodes are spatially embedded and can only talk to neighbors within a given radius, scales to networks of arbitrary size and converges rapidly. The amount of data stored at each node is logarithmic in the diameter of the network, and the hierarchical structure produces an addressing scheme such that there is an invertible relation between distance and address for any pair of nodes. The system adapts automatically to stopping failures, network partition, and reorganization.

AIM-2002-020

Author[s]: Erik B. Sudderth, Alexander T. Ihler, William T. Freeman and Alan S. Willsky

Nonparametric Belief Propagation and Facial Appearance Estimation

December 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-020.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-020.pdf

In many applications of graphical models arising in computer vision, the hidden variables of interest are most naturally specified by continuous, non-Gaussian distributions. There exist inference algorithms for discrete approximations to these continuous distributions, but for the high-dimensional variables typically of interest, discrete inference becomes infeasible. Stochastic methods such as particle filters provide an appealing alternative. However, existing techniques fail to exploit the rich structure of the graphical models describing many vision problems. Drawing on ideas from regularized particle filters and belief propagation (BP), this paper develops a nonparametric belief propagation (NBP) algorithm applicable to general graphs. Each NBP iteration uses an efficient sampling procedure to update kernel-based approximations to the true, continuous likelihoods. The algorithm can accommodate an extremely broad class of potential functions, including nonparametric representations. Thus, NBP extends particle filtering methods to the more general vision problems that graphical models can describe. We apply the NBP algorithm to infer component interrelationships in a parts-based face model, allowing location and reconstruction of occluded features.

AIM-2002-019

Author[s]: Antonio Torralba and William T. Freeman

Properties and Applications of Shape Recipes

December 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-019.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-019.pdf

In low-level vision, representations of scene properties such as shape and albedo are very high dimensional, as they have to describe complicated structures. The approach proposed here is to let the image itself bear as much of the representational burden as possible. In many situations, scene and image are closely related and it is possible to find a functional relationship between them. The scene information can be represented in reference to the image, where the functional specifies how to translate the image into the associated scene. We illustrate the use of this representation for encoding shape information. We show how this representation has appealing properties such as locality and slow variation across space and scale. These properties provide a way of improving shape estimates coming from other sources of information like stereo.

AIM-2002-018

Author[s]: Gerald Jay Sussman and Jack Wisdom

The Role of Programming in the Formulation of Ideas

November 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-018.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-018.pdf

Classical mechanics is deceptively simple. It is surprisingly easy to get the right answer with fallacious reasoning or without real understanding. To address this problem we use computational techniques to communicate a deeper understanding of Classical Mechanics. Computational algorithms are used to express the methods used in the analysis of dynamical phenomena. Expressing the methods in a computer language forces them to be unambiguous and computationally effective. The task of formulating a method as a computer-executable program and debugging that program is a powerful exercise in the learning process. Also, once formalized procedurally, a mathematical idea becomes a tool that can be used directly to compute results.

AIM-2002-017

Author[s]: Jack Wisdom

Swimming in Space-Time

November 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-017.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-017.pdf

Cyclic changes in the shape of a quasi-rigid body on a curved manifold can lead to net translation and/or rotation of the body in the manifold. Presuming space-time is a curved manifold as portrayed by general relativity, translation in space can be accomplished simply by cyclic changes in the shape of a body, without any thrust or external forces.

AIM-2002-016

Author[s]: William T. Freeman and Antonio Torralba

Shape Recipes: Scene Representations that Refer to the Image

September 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-016.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-016.pdf

The goal of low-level vision is to estimate an underlying scene, given an observed image. Real-world scenes (e.g., albedos or shapes) can be very complex, conventionally requiring high dimensional representations which are hard to estimate and store. We propose a low-dimensional representation, called a scene recipe, that relies on the image itself to describe the complex scene configurations. Shape recipes are an example: these are the regression coefficients that predict the bandpassed shape from bandpassed image data. We describe the benefits of this representation, and show two uses illustrating their properties: (1) we improve stereo shape estimates by learning shape recipes at low resolution and applying them at full resolution; (2) shape recipes implicitly contain information about lighting and materials, and we use them for material segmentation.

AIM-2002-015

Author[s]: Marshall F. Tappen, William T. Freeman and Edward H. Adelson

Recovering Intrinsic Images from a Single Image

September 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-015.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-015.pdf

We present an algorithm that uses multiple cues to recover shading and reflectance intrinsic images from a single image. Using both color information and a classifier trained to recognize gray-scale patterns, each image derivative is classified as being caused by shading or a change in the surface's reflectance. Generalized Belief Propagation is then used to propagate information from areas where the correct classification is clear to areas where it is ambiguous. We also show results on real images.

AIM-2002-014

Author[s]: Harald Steck and Tommi S. Jaakkola

On the Dirichlet Prior and Bayesian Regularization

September 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-014.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-014.pdf

A common objective in learning a model from data is to recover its network structure, while the model parameters are of minor interest. For example, we may wish to recover regulatory networks from high-throughput data sources. In this paper we examine how Bayesian regularization using a Dirichlet prior over the model parameters affects the learned model structure in a domain with discrete variables. Surprisingly, a weak prior in the sense of smaller equivalent sample size leads to a strong regularization of the model structure (sparse graph) given a sufficiently large data set. In particular, the empty graph is obtained in the limit of a vanishing strength of prior belief. This is diametrically opposite to what one may expect in this limit, namely the complete graph from an (unregularized) maximum likelihood estimate. Since the prior affects the parameters as expected, the prior strength balances a "trade-off" between regularizing the parameters or the structure of the model. We demonstrate the benefits of optimizing this trade-off in the sense of predictive accuracy.
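
The effect described in this abstract can be reproduced on a two-variable toy problem. The sketch assumes the standard BDeu form of the Dirichlet marginal likelihood, in which the equivalent sample size is spread uniformly over parent configurations and child states. The counts, the function names, and the choice of that particular score are illustrative assumptions, not details taken from the memo. Only the local score of Y is compared, since the score of X is the same in both graphs.

```python
from math import lgamma

def bdeu_local_score(counts, ess):
    """Log marginal likelihood of one node under a BDeu-style Dirichlet prior.

    counts[j][k] = number of cases with parent configuration j and child state k;
    ess is the equivalent sample size (the strength of the prior).
    """
    q = len(counts)        # number of parent configurations
    r = len(counts[0])     # number of child states
    score = 0.0
    for row in counts:
        n_j = sum(row)
        score += lgamma(ess / q) - lgamma(ess / q + n_j)
        for n_jk in row:
            score += lgamma(ess / (q * r) + n_jk) - lgamma(ess / (q * r))
    return score

# Hypothetical data: 100 cases of binary X and Y with clear dependence.
with_edge = [[40, 10], [10, 40]]   # Y's counts split by parent X (rows X=0, X=1)
no_edge = [[50, 50]]               # Y's counts with no parent

def edge_preferred(ess):
    """True if the graph X -> Y scores higher than the empty graph."""
    return bdeu_local_score(with_edge, ess) > bdeu_local_score(no_edge, ess)
```

With these counts, `edge_preferred(1.0)` is true, while `edge_preferred(1e-30)` is false: as the prior strength vanishes, each extra free parameter costs roughly one additional factor of the equivalent sample size, so the sparser graph wins despite the dependence in the data.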

AIM-2002-013

CBCL-220

Author[s]: M.A. Giese and X. Xie

Exact Solution of the Nonlinear Dynamics of Recurrent Neural Mechanisms for Direction Selectivity

August 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-013.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-013.pdf

Different theoretical models have tried to investigate the feasibility of recurrent neural mechanisms for achieving direction selectivity in the visual cortex. The mathematical analysis of such models has been restricted so far to the case of purely linear networks. We present an exact analytical solution of the nonlinear dynamics of a class of direction selective recurrent neural models with threshold nonlinearity. Our mathematical analysis shows that such networks have form-stable stimulus-locked traveling pulse solutions that are appropriate for modeling the responses of direction selective cortical neurons. Our analysis shows also that the stability of such solutions can break down, giving rise to a different class of solutions ("lurching activity waves") that are characterized by a specific spatio-temporal periodicity. These solutions cannot arise in models for direction selectivity with purely linear spatio-temporal filtering.

AIM-2002-012

CBCL-219

Author[s]: Martin Alexander Giese and Tomaso Poggio

Biologically Plausible Neural Model for the Recognition of Biological Motion and Actions

August 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-012.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-012.pdf

The visual recognition of complex movements and actions is crucial for communication and survival in many species. Remarkable sensitivity and robustness of biological motion perception have been demonstrated in psychophysical experiments. In recent years, neurons and cortical areas involved in action recognition have been identified in neurophysiological and imaging studies. However, the detailed neural mechanisms that underlie the recognition of such complex movement patterns remain largely unknown. This paper reviews the experimental results and summarizes them in terms of a biologically plausible neural model. The model is based on the key assumption that action recognition is based on learned prototypical patterns and exploits information from the ventral and the dorsal pathway. The model makes specific predictions that motivate new experiments.

AITR-2002-011

Author[s]: J.P. Grossman

Design and Evaluation of the Hamal Parallel Computer

December 5, 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-011.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-011.pdf

Parallel shared-memory machines with hundreds or thousands of processor-memory nodes have been built; in the future we will see machines with millions or even billions of nodes. Associated with such large systems is a new set of design challenges. Many problems must be addressed by an architecture in order for it to be successful; of these, we focus on three in particular. First, a scalable memory system is required. Second, the network messaging protocol must be fault-tolerant. Third, the overheads of thread creation, thread management and synchronization must be extremely low. This thesis presents the complete system design for Hamal, a shared-memory architecture which addresses these concerns and is directly scalable to one million nodes. Virtual memory and distributed objects are implemented in a manner that requires neither inter-node synchronization nor the storage of globally coherent translations at each node. We develop a lightweight fault-tolerant messaging protocol that guarantees message delivery and idempotence across a discarding network. A number of hardware mechanisms provide efficient support for massive multithreading and fine-grained synchronization. Experiments are conducted in simulation, using a trace-driven network simulator to investigate the messaging protocol and a cycle-accurate simulator to evaluate the Hamal architecture. We determine implementation parameters for the messaging protocol which optimize performance. A discarding network is easier to design and can be clocked at a higher rate, and we find that with this protocol its performance can approach that of a non-discarding network. Our simulations of Hamal demonstrate the effectiveness of its thread management and synchronization primitives. In particular, we find register-based synchronization to be an extremely efficient mechanism which can be used to implement a software barrier with a latency of only 523 cycles on a 512 node machine.

AIM-2002-011

CBCL-218

Author[s]: Robert Schneider and Maximilian Riesenhuber

A Detailed Look at Scale and Translation Invariance in a Hierarchical Neural Model of Visual Object Recognition

August 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-011.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-011.pdf

The HMAX model has recently been proposed by Riesenhuber & Poggio as a hierarchical model of position- and size-invariant object recognition in visual cortex. It has also turned out to model successfully a number of other properties of the ventral visual stream (the visual pathway thought to be crucial for object recognition in cortex), and particularly of (view-tuned) neurons in macaque inferotemporal cortex, the brain area at the top of the ventral stream. The original modeling study only used "paperclip" stimuli, as in the corresponding physiology experiment, and did not explore systematically how model units' invariance properties depended on model parameters. In this study, we aimed at a deeper understanding of the inner workings of HMAX and its performance for various parameter settings and "natural" stimulus classes. We examined HMAX responses for different stimulus sizes and positions systematically and found a dependence of model units' responses on stimulus position for which a quantitative description is offered. Interestingly, we find that scale invariance properties of hierarchical neural models are not independent of stimulus class, as opposed to translation invariance, even though both are affine transformations within the image plane.

AITR-2002-010

Author[s]: John M. Van Eepoel

Achieving Real-Time Mode Estimation through Offline Compilation

October 22, 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-010.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-010.pdf

As exploration of our solar system and outer space moves into the future, spacecraft are being developed to venture on increasingly challenging missions with bold objectives. The spacecraft tasked with completing these missions are becoming progressively more complex. This increases the potential for mission failure due to hardware malfunctions and unexpected spacecraft behavior. A solution to this problem lies in the development of an advanced fault management system. Fault management enables a spacecraft to respond to failures and take repair actions so that it may continue its mission. The two main approaches developed for spacecraft fault management have been rule-based and model-based systems. Rules map sensor information to system behaviors, thus achieving fast response times, and making the actions of the fault management system explicit. These rules are developed by having a human reason through the interactions between spacecraft components. This process is limited by the number of interactions a human can reason about correctly. In the model-based approach, the human provides component models, and the fault management system reasons automatically about system wide interactions and complex fault combinations. This approach improves correctness, and makes explicit the underlying system models, whereas these are implicit in the rule-based approach. We propose a fault detection engine, Compiled Mode Estimation (CME), that unifies the strengths of the rule-based and model-based approaches. CME uses a compiled model to determine spacecraft behavior more accurately. Reasoning related to fault detection is compiled in an off-line process into a set of concurrent, localized diagnostic rules. These are then combined on-line along with sensor information to reconstruct the diagnosis of the system. These rules enable a human to inspect the diagnostic consequences of CME.
Additionally, CME is capable of reasoning through component interactions automatically while still providing fast and correct responses. The implementation of this engine has been tested against the NEAR spacecraft advanced rule-based system, resulting in detection of failures beyond that of the rules. This evolution in fault detection will enable future missions to explore the furthest reaches of the solar system without the burden of human intervention to repair failed components.

AIM-2002-010

Author[s]: Justin Werfel

Implementing Universal Computation in an Evolutionary System

July 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-010.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-010.pdf

Evolutionary algorithms are a common tool in engineering and in the study of natural evolution. Here we take their use in a new direction by showing how they can be made to implement a universal computer. We consider populations of individuals with genes whose values are the variables of interest. By allowing them to interact with one another in a specified environment with limited resources, we demonstrate the ability to construct any arbitrary logic circuit. We explore models based on the limits of small and large populations, and show examples of such a system in action, implementing a simple logic circuit.

AITR-2002-009

Author[s]: Ron O. Dror

Surface Reflectance Recognition and Real-World Illumination Statistics

October 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-009.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-009.pdf

Humans distinguish materials such as metal, plastic, and paper effortlessly at a glance. Traditional computer vision systems cannot solve this problem at all. Recognizing surface reflectance properties from a single photograph is difficult because the observed image depends heavily on the amount of light incident from every direction. A mirrored sphere, for example, produces a different image in every environment. To make matters worse, two surfaces with different reflectance properties could produce identical images. The mirrored sphere simply reflects its surroundings, so in the right artificial setting, it could mimic the appearance of a matte ping-pong ball. Yet, humans possess an intuitive sense of what materials typically "look like" in the real world. This thesis develops computational algorithms with a similar ability to recognize reflectance properties from photographs under unknown, real-world illumination conditions. Real-world illumination is complex, with light typically incident on a surface from every direction. We find, however, that real-world illumination patterns are not arbitrary. They exhibit highly predictable spatial structure, which we describe largely in the wavelet domain. Although they differ in several respects from typical photographs, illumination patterns share much of the regularity described in the natural image statistics literature. These properties of real-world illumination lead to predictable image statistics for a surface with given reflectance properties. We construct a system that classifies a surface according to its reflectance from a single photograph under unknown illumination. Our algorithm learns relationships between surface reflectance and certain statistics computed from the observed image. Like the human visual system, we solve the otherwise underconstrained inverse problem of reflectance estimation by taking advantage of the statistical regularity of illumination.
For surfaces with homogeneous reflectance properties and known geometry, our system rivals human performance.

AIM-2002-009

CBCL-217

Author[s]: Adlar J. Kim and Christian R. Shelton

Modeling Stock Order Flows and Learning Market-Making from Data

June 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-009.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-009.pdf

Stock markets employ specialized traders, market-makers, designed to provide liquidity and volume to the market by constantly providing both supply and demand. In this paper, we demonstrate a novel method for modeling the market as a dynamic system and a reinforcement learning algorithm that learns profitable market-making strategies when run on this model. The sequence of buys and sells for a particular stock, the order flow, we model as an Input-Output Hidden Markov Model fit to historical data. When combined with the dynamics of the order book, this creates a highly non-linear and difficult dynamic system. Our reinforcement learning algorithm, based on likelihood ratios, is run on this partially-observable environment. We demonstrate learning results for two separate real stocks.

AITR-2002-008

CBCL-221

Author[s]: Vinay P. Kumar

Towards Man-Machine Interfaces: Combining Top-down Constraints with Bottom-up Learning in Facial Analysis

September 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-008.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-008.pdf

This thesis proposes a methodology for the design of man-machine interfaces by combining top-down and bottom-up processes in vision. From a computational perspective, we propose that the scientific-cognitive question of combining top-down and bottom-up knowledge is similar to the engineering question of labeling a training set in a supervised learning problem. We investigate these questions in the realm of facial analysis. We propose the use of a linear morphable model (LMM) for representing top-down structure and use it to model various facial variations such as mouth shapes and expression, the pose of faces and visual speech (visemes). We apply a supervised learning method based on support vector machine (SVM) regression for estimating the parameters of LMMs directly from pixel-based representations of faces. We combine these methods for designing new, more self-contained systems for recognizing facial expressions, estimating facial pose and for recognizing visemes.

AIM-2002-008

Author[s]: Andrew "bunnie" Huang

Keeping Secrets in Hardware: the Microsoft Xbox(TM) Case Study

May 26, 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-008.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-008.pdf

This paper discusses the hardware foundations of the cryptosystem employed by the Xbox(TM) video game console from Microsoft. A secret boot block overlay is buried within a system ASIC. This secret boot block decrypts and verifies portions of an external FLASH-type ROM. The presence of the secret boot block is camouflaged by a decoy boot block in the external ROM. The code contained within the secret boot block is transferred to the CPU in the clear over a set of high-speed busses where it can be extracted using simple custom hardware. The paper concludes with recommendations for improving the Xbox security system. One lesson of this study is that the use of a high-performance bus alone is not a sufficient security measure, given the advent of inexpensive, fast rapid prototyping services and high-performance FPGAs.

AITR-2002-007

Author[s]: Carl Steinbach

A Reinforcement-Learning Approach to Power Management

May 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-007.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-007.pdf

We describe an adaptive, mid-level approach to the wireless device power management problem. Our approach is based on reinforcement learning, a machine learning framework for autonomous agents. We describe how our framework can be applied to the power management problem in both infrastructure and ad hoc wireless networks. From this thesis we conclude that mid-level power management policies can outperform low-level policies and are more convenient to implement than high-level policies. We also conclude that power management policies need to adapt to the user and network, and that a mid-level power management framework based on reinforcement learning fulfills these requirements.

AIM-2002-007

CBCL-216

Author[s]: Ulf Knoblich, David J. Freedman and Maximilian Riesenhuber

Categorization in IT and PFC: Model and Experiments

April 18, 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-007.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-007.pdf

In a recent experiment, Freedman et al. recorded from inferotemporal (IT) and prefrontal cortices (PFC) of monkeys performing a "cat/dog" categorization task (Freedman 2001 and Freedman, Riesenhuber, Poggio, Miller 2001). In this paper we analyze the tuning properties of view-tuned units in our HMAX model of object recognition in cortex (Riesenhuber 1999) using the same paradigm and stimuli as in the experiment. We then compare the simulation results to the monkey inferotemporal neuron population data. We find that view-tuned model IT units that were trained without any explicit category information can show category-related tuning as observed in the experiment. This suggests that the tuning properties of experimental IT neurons might primarily be shaped by bottom-up stimulus-space statistics, with little influence of top-down task-specific information. The population of experimental PFC neurons, on the other hand, shows tuning properties that cannot be explained just by stimulus tuning. These analyses are compatible with a model of object recognition in cortex (Riesenhuber 2000) in which a population of shape-tuned neurons provides a general basis for neurons tuned to different recognition tasks.

AITR-2002-006

Author[s]: Andrew "bunnie" Huang

ADAM: A Decentralized Parallel Computer Architecture Featuring Fast Thread and Data Migration and a Uniform Hardware Abstraction

June 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-006.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-006.pdf

The furious pace of Moore's Law is driving computer architecture into a realm where the speed of light is the dominant factor in system latencies. The number of clock cycles to span a chip is increasing, while the number of bits that can be accessed within a clock cycle is decreasing. Hence, it is becoming more difficult to hide latency. One alternative solution is to reduce latency by migrating threads and data, but the overhead of existing implementations has so far made migration an unserviceable solution. I present an architecture, implementation, and mechanisms that reduce the overhead of migration to the point where migration is a viable supplement to other latency hiding mechanisms, such as multithreading. The architecture is abstract, and presents programmers with a simple, uniform fine-grained multithreaded parallel programming model with implicit memory management. In other words, the spatial nature and implementation details (such as the number of processors) of a parallel machine are entirely hidden from the programmer. Compiler writers are encouraged to devise programming languages for the machine that guide a programmer to express their ideas in terms of objects, since objects exhibit an inherent physical locality of data and code. The machine implementation can then leverage this locality to automatically distribute data and threads across the physical machine by using a set of high performance migration mechanisms. An implementation of this architecture could migrate a null thread in 66 cycles -- over a factor of 1000 improvement over previous work. Performance also scales well; the time required to move a typical thread is only 4 to 5 times that of a null thread. Data migration performance is similar, and scales linearly with data block size.
Since the performance of the migration mechanism is on par with that of an L2 cache, the implementation simulated in my work has no data caches and relies instead on multithreading and the migration mechanism to hide and reduce access latencies.

AIM-2002-006

Author[s]: Sarah Finney, Natalia H. Gardiol, Leslie Pack Kaelbling and Tim Oates

Learning with Deictic Representation

April 10, 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-006.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-006.pdf

Most reinforcement learning methods operate on propositional representations of the world state. Such representations are often intractably large and generalize poorly. Deictic representations are believed to be a viable alternative: they promise generalization while allowing the use of existing reinforcement-learning methods. Yet, there are few experiments on learning with deictic representations reported in the literature. In this paper we explore the effectiveness of two forms of deictic representation and a naive propositional representation in a simple blocks-world domain. We find, empirically, that the deictic representations actually worsen performance. We conclude with a discussion of possible causes of these results and strategies for more effective learning in domains with objects.

AIM-2002-005

Author[s]: Gregory T. Sullivan

Advanced Programming Language Features for Executable Design Patterns "Better Patterns Through Reflection"

March 22, 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-005.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-005.pdf

The Design Patterns book [GOF95] presents 24 time-tested patterns that consistently appear in well-designed software systems. Each pattern is presented with a description of the design problem the pattern addresses, as well as sample implementation code and design considerations. This paper explores how the patterns from the "Gang of Four", or "GOF" book, as it is often called, appear when similar problems are addressed using a dynamic, higher-order, object-oriented programming language. Some of the patterns disappear -- that is, they are supported directly by language features, some patterns are simpler or have a different focus, and some are essentially unchanged.

AITR-2002-005

Author[s]: Jeremy Hanford Brown

Sparsely Faceted Arrays: A Mechanism Supporting Parallel Allocation, Communication, and Garbage Collection

June 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-005.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-005.pdf

Conventional parallel computer architectures do not provide support for non-uniformly distributed objects. In this thesis, I introduce sparsely faceted arrays (SFAs), a new low-level mechanism for naming regions of memory, or facets, on different processors in a distributed, shared memory parallel processing system. Sparsely faceted arrays address the disconnect between the global distributed arrays provided by conventional architectures (e.g. the Cray T3 series), and the requirements of high-level parallel programming methods that wish to use objects that are distributed over only a subset of processing elements. A sparsely faceted array names a virtual globally-distributed array, but actual facets are lazily allocated. By providing simple semantics and making efficient use of memory, SFAs enable efficient implementation of a variety of non-uniformly distributed data structures and related algorithms. I present example applications which use SFAs, and describe and evaluate simple hardware mechanisms for implementing SFAs. Keeping track of which nodes have allocated facets for a particular SFA is an important task that suggests the need for automatic memory management, including garbage collection. To address this need, I first argue that conventional tracing techniques such as mark/sweep and copying GC are inherently unscalable in parallel systems. I then present a parallel memory-management strategy, based on reference-counting, that is capable of garbage collecting sparsely faceted arrays. I also discuss opportunities for hardware support of this garbage collection strategy. I have implemented a high-level hardware/OS simulator featuring hardware support for sparsely faceted arrays and automatic garbage collection. I describe the simulator and outline a few of the numerous details associated with a "real" implementation of SFAs and SFA-aware garbage collection. Simulation results are used throughout this thesis in the evaluation of hardware support mechanisms.
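
The lazy allocation of facets described above can be modeled in a few lines. This is only a software sketch under stated assumptions: the class name, method names, and the use of a dictionary are hypothetical, and it does not represent the thesis's hardware mechanisms or its garbage collector.

```python
class SparselyFacetedArray:
    """Toy model: one name covers every node, but a facet exists only once touched."""

    def __init__(self, num_nodes, facet_size):
        self.num_nodes = num_nodes
        self.facet_size = facet_size
        self._facets = {}  # node id -> facet storage, filled lazily

    def facet(self, node):
        """Return the facet on `node`, allocating it on first access."""
        if not 0 <= node < self.num_nodes:
            raise IndexError("node id out of range")
        if node not in self._facets:
            self._facets[node] = [None] * self.facet_size
        return self._facets[node]

    def allocated_nodes(self):
        """Nodes that currently hold a facet (what a collector would need to track)."""
        return sorted(self._facets)
```

In this model, an array spanning 1024 nodes that is touched on only two of them consumes storage for two facets.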

AIM-2002-004

CBCL-215

Author[s]: Ulf Knoblich and Maximilian Riesenhuber

Stimulus Simplification and Object Representation: A Modeling Study

March 15, 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-004.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-004.pdf

Tsunoda et al. (2001) recently studied the nature of object representation in monkey inferotemporal cortex using a combination of optical imaging and extracellular recordings. In particular, they examined IT neuron responses to complex natural objects and "simplified" versions thereof. In that study, in 42% of the cases, optical imaging revealed a decrease in the number of activation patches in IT as stimuli were "simplified". However, in 58% of the cases, "simplification" of the stimuli actually led to the appearance of additional activation patches in IT. Based on these results, the authors propose a scheme in which an object is represented by combinations of active and inactive columns coding for individual features. We examine the patterns of activation caused by the same stimuli as used by Tsunoda et al. in our model of object recognition in cortex (Riesenhuber 99). We find that object-tuned units can show a pattern of appearance and disappearance of features identical to the experiment. Thus, the data of Tsunoda et al. appear to be in quantitative agreement with a simple object-based representation in which an object's identity is coded by its similarities to reference objects. Moreover, the agreement of simulations and experiment suggests that the simplification procedure used by Tsunoda (2001) is not necessarily an accurate method to determine neuronal tuning.

AITR-2002-004

Author[s]: Teodoro Arvizo III

A Virtual Machine for a Type-omega Denotational Proof Language

June 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-004.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-004.pdf

In this thesis, I designed and implemented a virtual machine (VM) for a monomorphic variant of Athena, a type-omega denotational proof language (DPL). This machine attempts to maintain the minimum state required to evaluate Athena phrases. This thesis also includes the design and implementation of a compiler for monomorphic Athena that compiles to the VM. Finally, it includes details on my implementation of a read-eval-print loop that glues together the VM core and the compiler to provide a full, user-accessible interface to monomorphic Athena. The Athena VM provides the same basis for DPLs that the SECD machine does for pure, functional programming and the Warren Abstract Machine does for Prolog.

AITR-2002-003

Author[s]: Joanna J. Bryson

Intelligence by Design: Principles of Modularity and Coordination for Engineerin

September 2001

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-003.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-003.pdf

All intelligence relies on search --- for example, the search for an intelligent agent's next action. Search is only likely to succeed in resource-bounded agents if they have already been biased towards finding the right answer. In artificial agents, the primary source of bias is engineering. This dissertation describes an approach, Behavior-Oriented Design (BOD), for engineering complex agents. A complex agent is one that must arbitrate between potentially conflicting goals or behaviors. Behavior-oriented design builds on work in behavior-based and hybrid architectures for agents, and the object oriented approach to software engineering. The primary contributions of this dissertation are: 1. The BOD architecture: a modular architecture with each module providing specialized representations to facilitate learning. This includes one pre-specified module and representation for action selection or behavior arbitration. The specialized representation underlying BOD action selection is Parallel-rooted, Ordered, Slip-stack Hierarchical (POSH) reactive plans. 2. The BOD development process: an iterative process that alternately scales the agent's capabilities then optimizes the agent for simplicity, exploiting tradeoffs between the component representations. This ongoing process for controlling complexity not only provides bias for the behaving agent, but also facilitates its maintenance and extendibility.
The secondary contributions of this dissertation include two implementations of POSH action selection, a procedure for identifying useful idioms in agent architectures and using them to distribute knowledge across agent paradigms, several examples of applying BOD idioms to established architectures, an analysis and comparison of the attributes and design trends of a large number of agent architectures, a comparison of biological (particularly mammalian) intelligence to artificial agent architectures, a novel model of primate transitive inference, and many other examples of BOD agents and BOD development.

AIM-2002-003

CBCL-214

Author[s]: Tomaso Poggio, Ryan Rifkin, Sayan Mukherjee and Alex Rakhlin

Bagging Regularizes

March 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-003.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-003.pdf

Intuitively, we expect that averaging --- or bagging --- different regressors with low correlation should smooth their behavior and be somewhat similar to regularization. In this note we make this intuition precise. Using an almost classical definition of stability, we prove that a certain form of averaging provides generalization bounds with a rate of convergence of the same order as Tikhonov regularization --- similar to fashionable RKHS-based learning algorithms.
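
The averaging idea can be illustrated with standard bootstrap aggregation. The sketch below is an assumption-laden illustration, not the specific form of averaging analyzed in the memo: it uses a 1-nearest-neighbor base regressor and bootstrap resampling, and all function names are hypothetical.

```python
import random


def nearest_neighbor_predict(train, x):
    """Base regressor: return the target of the training point closest to x."""
    return min(train, key=lambda point: abs(point[0] - x))[1]


def bagged_predict(train, x, n_models=50, seed=0):
    """Average the base regressor over bootstrap resamples of the training set."""
    rng = random.Random(seed)
    predictions = []
    for _ in range(n_models):
        # Bootstrap sample: draw len(train) points with replacement.
        sample = [rng.choice(train) for _ in train]
        predictions.append(nearest_neighbor_predict(sample, x))
    return sum(predictions) / len(predictions)
```

A single 1-nearest-neighbor regressor is piecewise constant and jumps between training targets; the bagged prediction is an average of such targets and therefore varies more smoothly with the query point.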

AITR-2002-002

Author[s]: Jacob Beal

Generating Communications Systems Through Shared Context

January 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-002.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-002.pdf

In a distributed model of intelligence, peer components need to communicate with one another. I present a system which enables two agents connected by a thick twisted bundle of wires to bootstrap a simple communication system from observations of a shared environment. The agents learn a large vocabulary of symbols, as well as inflections on those symbols which allow thematic role-frames to be transmitted. Language acquisition time is rapid and linear in the number of symbols and inflections. The final communication system is robust and performance degrades gradually in the face of problems.

AIM-2002-002

Author[s]: William T. Freeman and Hao Zhang

Shape-Time Photography

January 10, 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-002.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-002.pdf

We introduce a new method to describe, in a single image, changes in shape over time. We acquire both range and image information with a stationary stereo camera. From the pictures taken, we display a composite image consisting of the image data from the surface closest to the camera at every pixel. This reveals the 3-d relationships over time by easy-to-interpret occlusion relationships in the composite image. We call the composite a shape-time photograph. Small errors in depth measurements cause artifacts in the shape-time images. We correct most of these using a Markov network to estimate the most probable front surface, taking into account the depth measurements, their uncertainties, and layer continuity assumptions.
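
The compositing rule described above (keep, at every pixel, the image data from the surface closest to the camera) can be sketched as follows. This is a minimal illustration that assumes frames are given as aligned depth maps and images stored as nested lists; the function name is hypothetical, and the Markov network correction for depth errors is not modeled.

```python
def shape_time_composite(frames):
    """Build a composite from (depth_map, image) pairs of identical shape.

    At each pixel the output takes the image value from the frame whose
    measured depth is smallest, i.e. whose surface is closest to the camera.
    """
    first_depth, _ = frames[0]
    rows, cols = len(first_depth), len(first_depth[0])
    composite = [[None] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Select the frame with the nearest surface at pixel (r, c).
            _, nearest_image = min(frames, key=lambda frame: frame[0][r][c])
            composite[r][c] = nearest_image[r][c]
    return composite
```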

AIM-2002-001

Author[s]: Trevor Darrell, Neal Checka, Alice Oh and Louis-Philippe Morency

Exploring Vision-Based Interfaces: How to Use Your Head in Dual Pointing Tasks

January 2002

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-001.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AIM-2002-001.pdf

The utility of vision-based face tracking for dual pointing tasks is evaluated. We first describe a 3-D face tracking technique based on real-time parametric motion-stereo, which is non-invasive, robust, and self-initialized. The tracker provides a real-time estimate of a "frontal face ray" whose intersection with the display surface plane is used as a second stream of input for scrolling or pointing, in parallel with hand input. We evaluated the performance of combined head/hand input on a box selection and coloring task: users selected boxes with one pointer and colors with a second pointer, or performed both tasks with a single pointer. We found that performance with head and one hand was intermediate between single hand performance and dual hand performance. Our results are consistent with previously reported dual hand conflict in symmetric pointing tasks, and suggest that a head-based input stream should be used for asymmetric control.
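
Turning a face ray into a pointer position amounts to a ray-plane intersection. The sketch below shows that standard geometric step only; the function name, argument layout, and the choice to return None for degenerate cases are assumptions, not details taken from the memo.

```python
def ray_plane_intersection(origin, direction, plane_point, plane_normal, eps=1e-9):
    """Return the point where a ray meets a plane, or None if it never does.

    origin, direction: ray start (e.g. head position) and pointing direction.
    plane_point, plane_normal: any point on the display plane and its normal.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    denom = dot(direction, plane_normal)
    if abs(denom) < eps:
        return None  # ray is parallel to the plane
    offset = [p - o for p, o in zip(plane_point, origin)]
    t = dot(offset, plane_normal) / denom
    if t < 0:
        return None  # plane lies behind the ray origin
    return tuple(o + t * d for o, d in zip(origin, direction))
```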

AITR-2002-001

Author[s]: Lilla Zollei

2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images

August 2001

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-001.ps

ftp://publications.ai.mit.edu/ai-publications/2002/AITR-2002-001.pdf

The registration of pre-operative volumetric datasets to intra-operative two-dimensional images provides an improved way of verifying patient position and medical instrument location. In applications from orthopedics to neurosurgery, it has a great value in maintaining up-to-date information about changes due to intervention. We propose a mutual information-based registration algorithm to establish the proper alignment. For optimization purposes, we compare the performance of the non-gradient Powell method and two slightly different versions of a stochastic gradient ascent strategy: one using a sparsely sampled histogramming approach and the other Parzen windowing to carry out probability density approximation. Our main contribution lies in adopting the stochastic approximation scheme successfully applied in 3D-3D registration problems to the 2D-3D scenario, which obviates the need for the generation of full DRRs at each iteration of pose optimization. This facilitates a considerable savings in computation expense. We also introduce a new probability density estimator for image intensities via sparse histogramming, derive gradient estimates for the density measures required by the maximization procedure and introduce the framework for a multiresolution strategy to the problem. Registration results are presented on fluoroscopy and CT datasets of a plastic pelvis and a real skull, and on a high-resolution CT-derived simulated dataset of a real skull, a plastic skull, a plastic pelvis and a plastic lumbar spine segment.
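
As background for the alignment measure, mutual information between two images can be estimated from a joint intensity histogram. The sketch below uses a plain dense histogram over all supplied samples; it is not the sparse histogramming or Parzen windowing estimator from the thesis, and the function name, bin count, and intensity range are assumptions.

```python
import math
from collections import Counter


def mutual_information(a, b, bins=8, lo=0.0, hi=256.0):
    """Mutual information in bits between two equally long intensity sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    width = (hi - lo) / bins

    def bin_of(value):
        # Clamp so that values on or outside the range land in the edge bins.
        return min(bins - 1, max(0, int((value - lo) / width)))

    joint = Counter((bin_of(x), bin_of(y)) for x, y in zip(a, b))
    n = len(a)
    marginal_a, marginal_b = Counter(), Counter()
    for (i, j), count in joint.items():
        marginal_a[i] += count
        marginal_b[j] += count

    mi = 0.0
    for (i, j), count in joint.items():
        p_joint = count / n
        # p(i,j) / (p(i) p(j)) written with raw counts.
        ratio = count * n / (marginal_a[i] * marginal_b[j])
        mi += p_joint * math.log2(ratio)
    return mi
```

A registration loop would evaluate such a measure between the observed 2D image and a projection of the volume at each candidate pose and adjust the pose to increase it.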

AIM-2001-036

CBCL-213

Author[s]: Antonio Torralba and Aude Oliva

Global Depth Perception from Familiar Scene Structure

December 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-036.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-036.pdf

In the absence of cues for absolute depth measurements such as binocular disparity, motion, or defocus, the absolute distance between the observer and a scene cannot be measured. The interpretation of shading, edges and junctions may provide a 3D model of the scene but it will not inform about the actual "size" of the space. One possible source of information for absolute depth estimation is the image size of known objects. However, this is computationally complex due to the difficulty of the object recognition process. Here we propose a source of information for absolute depth estimation that does not rely on specific objects: we introduce a procedure for absolute depth estimation based on the recognition of the whole scene. The shape of the space of the scene and the structures present in the scene are strongly related to the scale of observation. We demonstrate that, by recognizing the properties of the structures present in the image, we can infer the scale of the scene, and therefore its absolute mean depth. We illustrate the interest in computing the mean depth of the scene with application to scene recognition and object detection.

AIM-2001-035

CBCL-212

Author[s]: Andrew Yip and Pawan Sinha

Role of color in face recognition

December 13, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-035.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-035.pdf

One of the key challenges in face perception lies in determining the contribution of different cues to face identification. In this study, we focus on the role of color cues. Although color appears to be a salient attribute of faces, past research has suggested that it confers little recognition advantage for identifying people. Here we report experimental results suggesting that color cues do play a role in face recognition and their contribution becomes evident when shape cues are degraded. Under such conditions, recognition performance with color images is significantly better than that with grayscale images. Our experimental results also indicate that the contribution of color may lie not so much in providing diagnostic cues to identity as in aiding low-level image-analysis processes such as segmentation.

AIM-2001-034

CBCL-211

Author[s]: Maximilian Riesenhuber

Generalization over contrast and mirror reversal, but not figure-ground reversal, in an "edge-based

December 10, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-034.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-034.pdf

Baylis & Driver (Nature Neuroscience, 2001) have recently presented data on the response of neurons in macaque inferotemporal cortex (IT) to various stimulus transformations. They report that neurons can generalize over contrast and mirror reversal, but not over figure-ground reversal. This finding is taken to demonstrate that "the selectivity of IT neurons is not determined simply by the distinctive contours in a display, contrary to simple edge-based models of shape recognition", citing our recently presented model of object recognition in cortex (Riesenhuber & Poggio, Nature Neuroscience, 1999). In this memo, I show that the main effects of the experiment can be obtained by performing the appropriate simulations in our simple feedforward model. This suggests for IT cell tuning that the possible contributions of explicit edge assignment processes postulated in (Baylis & Driver, 2001) might be smaller than expected.

AIM-2001-033

Author[s]: Ron O. Dror, Edward H. Adelson, and Alan S. Willsky

Recognition of Surface Reflectance Properties from a Single Image under Unknown Real-World Illumination

October 21, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-033.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-033.pdf

This paper describes a machine vision system that classifies reflectance properties of surfaces such as metal, plastic, or paper, under unknown real-world illumination. We demonstrate performance of our algorithm for surfaces of arbitrary geometry. Reflectance estimation under arbitrary omnidirectional illumination proves highly underconstrained. Our reflectance estimation algorithm succeeds by learning relationships between surface reflectance and certain statistics computed from an observed image, which depend on statistical regularities in the spatial structure of real-world illumination. Although the algorithm assumes known geometry, its statistical nature makes it robust to inaccurate geometry estimates.

AIM-2001-032

Author[s]: Roland W. Fleming, Ron O. Dror, Edward H. Adelson

How do Humans Determine Reflectance Properties under Unknown Illumination?

October 21, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-032.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-032.pdf

Under normal viewing conditions, humans find it easy to distinguish between objects made out of different materials such as plastic, metal, or paper. Untextured materials such as these have different surface reflectance properties, including lightness and gloss. With single isolated images and unknown illumination conditions, the task of estimating surface reflectance is highly underconstrained, because many combinations of reflection and illumination are consistent with a given image. In order to work out how humans estimate surface reflectance properties, we asked subjects to match the appearance of isolated spheres taken out of their original contexts. We found that subjects were able to perform the task accurately and reliably without contextual information to specify the illumination. The spheres were rendered under a variety of artificial illuminations, such as a single point light source, and a number of photographically-captured real-world illuminations from both indoor and outdoor scenes. Subjects performed more accurately for stimuli viewed under real-world patterns of illumination than under artificial illuminations, suggesting that subjects use stored assumptions about the regularities of real-world illuminations to solve the ill-posed problem.

AIM-2001-031

Author[s]: Konstantine Arkoudas

Simplifying transformations for type-alpha certificates

November 13, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-031.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-031.pdf

This paper presents an algorithm for simplifying NDL deductions. An array of simplifying transformations is rigorously defined. They are shown to be terminating, and to respect the formal semantics of the language. We also show that the transformations never increase the size or complexity of a deduction---in the worst case, they produce deductions of the same size and complexity as the original. We present several examples of proofs containing various types of "detours", and explain how our procedure eliminates them, resulting in smaller and cleaner deductions. All of the given transformations are fully implemented in SML-NJ. The complete code listing is presented, along with explanatory comments. Finally, although the transformations given here are defined for NDL, we point out that they can be applied to any type-alpha DPL that satisfies a few simple conditions.

AIM-2001-030

Author[s]: Adrian Corduneanu and Tommi Jaakkola

Stable Mixing of Complete and Incomplete Information

November 8, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-030.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-030.pdf

An increasing number of parameter estimation tasks involve the use of at least two information sources, one complete but limited, the other abundant but incomplete. Standard algorithms such as EM (or em) used in this context are unfortunately not stable in the sense that they can lead to a dramatic loss of accuracy with the inclusion of incomplete observations. We provide a more controlled solution to this problem through differential equations that govern the evolution of locally optimal solutions (fixed points) as a function of the source weighting. This approach permits us to explicitly identify any critical (bifurcation) points leading to choices unsupported by the available complete data. The approach readily applies to any graphical model in O(n^3) time where n is the number of parameters. We use the naive Bayes model to illustrate these ideas and demonstrate the effectiveness of our approach in the context of text classification problems.

AIM-2001-029

CBCL-209

Author[s]: Yuri Ostrovsky, Patrick Cavanagh and Pawan Sinha

Perceiving Illumination Inconsistencies in Scenes

November 5, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-029.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-029.pdf

The human visual system is adept at detecting and encoding statistical regularities in its spatio-temporal environment. Here we report an unexpected failure of this ability in the context of perceiving inconsistencies in illumination distributions across a scene. Contrary to predictions from previous studies [Enns and Rensink, 1990; Sun and Perona, 1996a, 1996b, 1997], we find that the visual system displays a remarkable lack of sensitivity to illumination inconsistencies, both in experimental stimuli and in images of real scenes. Our results allow us to draw inferences regarding how the visual system encodes illumination distributions across scenes. Specifically, they suggest that the visual system does not verify the global consistency of locally derived estimates of illumination direction.

AIM-2001-028

CBCL-208

Author[s]: Antonio Torralba and Pawan Sinha

Detecting Faces in Impoverished Images

November 5, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-028.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-028.pdf

The ability to detect faces in images is of critical ecological significance. It is a pre-requisite for other important face perception tasks such as person identification, gender classification and affect analysis. Here we address the question of how the visual system classifies images into face and non-face patterns. We focus on face detection in impoverished images, which allow us to explore information thresholds required for different levels of performance. Our experimental results provide lower bounds on image resolution needed for reliable discrimination between face and non-face patterns and help characterize the nature of facial representations used by the visual system under degraded viewing conditions. Specifically, they enable an evaluation of the contribution of luminance contrast, image orientation and local context on face-detection performance.

AIM-2001-027

Author[s]: Konstantine Arkoudas

Type-omega DPLs

October 16, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-027.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-027.pdf

Type-omega DPLs (Denotational Proof Languages) are languages for proof presentation and search that offer strong soundness guarantees. LCF-type systems such as HOL offer similar guarantees, but their soundness relies heavily on static type systems. By contrast, DPLs ensure soundness dynamically, through their evaluation semantics; no type system is necessary. This is possible owing to a novel two-tier syntax that separates deductions from computations, and to the abstraction of assumption bases, which is factored into the semantics of the language and allows for sound evaluation. Every type-omega DPL properly contains a type-alpha DPL, which can be used to present proofs in a lucid and detailed form, exclusively in terms of primitive inference rules. Derived inference rules are expressed as user-defined methods, which are "proof recipes" that take arguments and dynamically perform appropriate deductions. Methods arise naturally via parametric abstraction over type-alpha proofs. In that light, the evaluation of a method call can be viewed as a computation that carries out a type-alpha deduction. The type-alpha proof "unwound" by such a method call is called the "certificate" of the call. Certificates can be checked by exceptionally simple type-alpha interpreters, and thus they are useful whenever we wish to minimize our trusted base. Methods are statically closed over lexical environments, but dynamically scoped over assumption bases. They can take other methods as arguments, they can iterate, and they can branch conditionally. These capabilities, in tandem with the bifurcated syntax of type-omega DPLs and their dynamic assumption-base semantics, allow the user to define methods in a style that is disciplined enough to ensure soundness yet fluid enough to permit succinct and perspicuous expression of arbitrarily sophisticated derived inference rules. 
We demonstrate every major feature of type-omega DPLs by defining and studying NDL-omega, a higher-order, lexically scoped, call-by-value type-omega DPL for classical zero-order natural deduction---a simple choice that allows us to focus on type-omega syntax and semantics rather than on the subtleties of the underlying logic. We start by illustrating how type-alpha DPLs naturally lead to type-omega DPLs by way of abstraction; present the formal syntax and semantics of NDL-omega; prove several results about it, including soundness; give numerous examples of methods; point out connections to the lambda-phi calculus, a very general framework for type-omega DPLs; introduce a notion of computational and deductive cost; define several instrumented interpreters for computing such costs and for generating certificates; explore the use of type-omega DPLs as general programming languages; show that DPLs do not have to be type-less by formulating a static Hindley-Milner polymorphic type system for NDL-omega; discuss some idiosyncrasies of type-omega DPLs such as the potential divergence of proof checking; and compare type-omega DPLs to other approaches to proof presentation and discovery. Finally, a complete implementation of NDL-omega in SML-NJ is given for users who want to run the examples and experiment with the language.

AIM-2001-026

CBCL-210

Author[s]: Jason D. M. Rennie and Ryan Rifkin

Improving Multiclass Text Classification with the Support Vector Machine

October 16, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-026.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-026.pdf

We compare Naive Bayes and Support Vector Machines on the task of multiclass text classification. Using a variety of approaches to combine the underlying binary classifiers, we find that SVMs substantially outperform Naive Bayes. We present full multiclass results on two well-known text data sets, including the lowest error to date on both data sets. We develop a new indicator of binary performance to show that the SVM's lower multiclass error is a result of its improved binary performance. Furthermore, we demonstrate and explore the surprising result that one-vs-all classification performs favorably compared to other approaches even though it has no error-correcting properties.
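
The one-vs-all combination mentioned in this abstract can be illustrated with a minimal sketch. It assumes one real-valued binary scorer per class and picks the class whose scorer is most confident. The names `linear_scorer` and `one_vs_all_predict`, the class labels, and the weights are hypothetical and are not taken from the memo.

```python
from typing import Callable, Dict, Sequence

# Hypothetical type: a binary scorer maps a feature vector to a real-valued
# confidence that the input belongs to "its" class.
Scorer = Callable[[Sequence[float]], float]


def linear_scorer(weights: Sequence[float], bias: float) -> Scorer:
    """Build a toy linear scorer; stands in for a trained binary classifier."""
    def score(x: Sequence[float]) -> float:
        return sum(w * xi for w, xi in zip(weights, x)) + bias
    return score


def one_vs_all_predict(scorers: Dict[str, Scorer], x: Sequence[float]) -> str:
    """Return the label whose binary scorer gives the highest output."""
    return max(scorers, key=lambda label: scorers[label](x))


# Illustrative, hand-picked scorers (not learned from data).
scorers = {
    "sports": linear_scorer([1.0, -1.0], 0.0),
    "politics": linear_scorer([-1.0, 1.0], 0.0),
    "science": linear_scorer([0.2, 0.2], -0.5),
}

print(one_vs_all_predict(scorers, [2.0, 0.5]))
```

Each class is decided by a single binary problem, so there is no redundancy among the binary classifiers to correct an individual mistake; this is the sense in which the scheme has no error-correcting properties.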

AIM-2001-025

Author[s]: Konstantine Arkoudas

Type-alpha DPLs

October 5, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-025.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-025.pdf

This paper introduces Denotational Proof Languages (DPLs). DPLs are languages for presenting, discovering, and checking formal proofs. In particular, in this paper we discuss type-alpha DPLs---a simple class of DPLs for which termination is guaranteed and proof checking can be performed in time linear in the size of the proof. Type-alpha DPLs allow for lucid proof presentation and for efficient proof checking, but not for proof search. Type-omega DPLs allow for search as well as simple presentation and checking, but termination is no longer guaranteed and proof checking may diverge. We do not study type-omega DPLs here. We start by listing some common characteristics of DPLs. We then illustrate with a particularly simple example: a toy type-alpha DPL called PAR, for deducing parities. We present the abstract syntax of PAR, followed by two different kinds of formal semantics: evaluation and denotational. We then relate the two semantics and show how proof checking becomes tantamount to evaluation. We proceed to develop the proof theory of PAR, formulating and studying certain key notions such as observational equivalence that pervade all DPLs. We then present NDL, a type-alpha DPL for classical zero-order natural deduction. Our presentation of NDL mirrors that of PAR, showing how every basic concept that was introduced in PAR resurfaces in NDL. We present sample proofs of several well-known tautologies of propositional logic that demonstrate our thesis that DPL proofs are readable, writable, and concise. Next we contrast DPLs to typed logics based on the Curry-Howard isomorphism, and discuss the distinction between pure and augmented DPLs. Finally we consider the issue of implementing DPLs, presenting an implementation of PAR in SML and one in Athena, and end with some concluding remarks.

AIM-2001-024

Author[s]: Leonid Taycher and Trevor Darrell

Range Segmentation Using Visibility Constraints

September 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-024.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-024.pdf

Visibility constraints can aid the segmentation of foreground objects observed with multiple range images. In our approach, points are defined as foreground if they can be determined to occlude some empty space in the scene. We present an efficient algorithm to estimate foreground points in each range view using explicit epipolar search. In cases where the background pattern is stationary, we show how visibility constraints from other views can generate virtual background values at points with no valid depth in the primary view. We demonstrate the performance of both algorithms for detecting people in indoor office environments.

AIM-2001-023

Author[s]: Ron O. Dror, Edward H. Adelson and Alan S. Willsky

Surface Reflectance Estimation and Natural Illumination Statistics

September 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-023.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-023.pdf

Humans recognize optical reflectance properties of surfaces such as metal, plastic, or paper from a single image without knowledge of illumination. We develop a machine vision system to perform similar recognition tasks automatically. Reflectance estimation under unknown, arbitrary illumination proves highly underconstrained due to the variety of potential illumination distributions and surface reflectance properties. We have found that the spatial structure of real-world illumination possesses some of the statistical regularities observed in the natural image statistics literature. A human or computer vision system may be able to exploit this prior information to determine the most likely surface reflectance given an observed image. We develop an algorithm for reflectance classification under unknown real-world illumination, which learns relationships between surface reflectance and certain features (statistics) computed from a single observed image. We also develop an automatic feature selection method.

AIM-2001-022

CBCL-207

Author[s]: Angela J. Yu, Martin A. Giese and Tomaso A. Poggio

Biologically Plausible Neural Circuits for Realization of Maximum Operations

September 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-022.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-022.pdf

Object recognition in the visual cortex is based on a hierarchical architecture, in which specialized brain regions along the ventral pathway extract object features of increasing levels of complexity, accompanied by greater invariance in stimulus size, position, and orientation. Recent theoretical studies postulate that a non-linear pooling function, such as the maximum (MAX) operation, could be fundamental in achieving such invariance. In this paper, we are concerned with neurally plausible mechanisms that may be involved in realizing the MAX operation. Four canonical circuits are proposed, each based on neural mechanisms that have been previously discussed in the context of cortical processing. Through simulations and mathematical analysis, we examine the relative performance and robustness of these mechanisms. We derive experimentally verifiable predictions for each circuit and discuss their respective physiological considerations.

AIM-2001-021

Author[s]: Erik G. Miller, Kinh Tieu and Chris P. Stauffer

Learning Object-Independent Modes of Variation with Feature Flow Fields

September 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-021.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-021.pdf

We present a unifying framework in which "object-independent" modes of variation are learned from continuous-time data such as video sequences. These modes of variation can be used as "generators" to produce a manifold of images of a new object from a single example of that object. We develop the framework in the context of a well-known example: analyzing the modes of spatial deformations of a scene under camera movement. Our method learns a close approximation to the standard affine deformations that are expected from the geometry of the situation, and does so in a completely unsupervised (i.e. ignorant of the geometry of the situation) fashion. We stress that it is learning a "parameterization", not just the parameter values, of the data. We then demonstrate how we have used the same framework to derive a novel data-driven model of joint color change in images due to common lighting variations. The model is superior to previous models of color change in describing non-linear color changes due to lighting.

AIM-2001-020

CBCL-205

Author[s]: Antonio Torralba and Pawan Sinha

Contextual Priming for Object Detection

September 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-020.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-020.pdf

There is general consensus that context can be a rich source of information about an object's identity, location and scale. In fact, the structure of many real-world scenes is governed by strong configurational rules akin to those that apply to a single object. Here we introduce a simple probabilistic framework for modeling the relationship between context and object properties based on the correlation between the statistics of low-level features across the entire scene and the objects that it contains. The resulting scheme serves as an effective procedure for object priming, context driven focus of attention and automatic scale-selection on real-world scenes.

AIM-2001-019

Author[s]: Lily Lee

Gait Dynamics for Recognition and Classification

September 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-019.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-019.pdf

This paper describes a representation of the dynamics of human walking action for the purpose of person identification and classification by gait appearance. Our gait representation is based on simple features such as moments extracted from video silhouettes of human walking motion. We claim that our gait dynamics representation is rich enough for the task of recognition and classification. The use of our feature representation is demonstrated in the task of person recognition from video sequences of orthogonal views of people walking. We demonstrate the accuracy of recognition on gait video sequences collected over different days and times, and under varying lighting environments. In addition, preliminary results are shown on gender classification using our gait dynamics features.

AIM-2001-018

CBCL-206

Author[s]: Gene Yeo, Tomaso Poggio

Multiclass Classification of SRBCTs

August 25, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-018.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-018.pdf

A novel approach to multiclass tumor classification using Artificial Neural Networks (ANNs) was introduced in a recent paper [Khan2001]. The method successfully classified and diagnosed small, round blue cell tumors (SRBCTs) of childhood into four distinct categories, neuroblastoma (NB), rhabdomyosarcoma (RMS), non-Hodgkin lymphoma (NHL) and the Ewing family of tumors (EWS), using cDNA gene expression profiles of samples that included both tumor biopsy material and cell lines. We report that using an approach similar to the one reported by Yeang et al. [Yeang2001], i.e. multiclass classification by combining outputs of binary classifiers, we achieved equal accuracy with far fewer features. We report the performances of 3 binary classifiers (k-nearest neighbors (kNN), weighted-voting (WV), and support vector machines (SVM)) with 3 feature selection techniques (Golub's Signal to Noise (SN) ratios [Golub99], Fisher scores (FSc) and Mukherjee's SVM feature selection (SVMFS) [Sayan98]).

AIM-2001-017

CBCL-203

Author[s]: Pawan Sinha and Antonio Torralba

Role of Low-level Mechanisms in Brightness Perception

August 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-017.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-017.pdf

Brightness judgments are a key part of the primate brain’s visual analysis of the environment. There is general consensus that the perceived brightness of an image region is based not only on its actual luminance, but also on the photometric structure of its neighborhood. However, it is unclear precisely how a region’s context influences its perceived brightness. Recent research has suggested that brightness estimation may be based on a sophisticated analysis of scene layout in terms of transparency, illumination and shadows. This work has called into question the role of low-level mechanisms, such as lateral inhibition, as explanations for brightness phenomena. Here we describe experiments with displays for which low-level and high-level analyses make qualitatively different predictions, and with which we can quantitatively assess the trade-offs between low-level and high-level factors. We find that brightness percepts in these displays are governed by low-level stimulus properties, even when these percepts are inconsistent with higher-level interpretations of scene layout. These results point to the important role of low-level mechanisms in determining brightness percepts.

AIM-2001-016

Author[s]: Jacob Beal

An Algorithm for Bootstrapping Communications

August 13, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-016.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-016.pdf

I present an algorithm which allows two agents to generate a simple language based only on observations of a shared environment. Vocabulary and roles for the language are learned in linear time. Communication is robust and degrades gradually as complexity increases. Dissimilar modes of experience will lead to a shared kernel vocabulary.

AIM-2001-015

CBCL-202

Author[s]: Antonio Torralba, Pawan Sinha

Recognizing Indoor Scenes

July 25, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-015.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-015.pdf

We propose a scheme for indoor place identification based on the recognition of global scene views. Scene views are encoded using a holistic representation that provides low-resolution spatial and spectral information. The holistic nature of the representation dispenses with the need to rely on specific objects or local landmarks and also renders it robust against variations in object configurations. We demonstrate the scheme on the problem of recognizing scenes in video sequences captured while walking through an office environment. We develop a method for distinguishing between 'diagnostic' and 'generic' views and also evaluate changes in system performances as a function of the amount of training data available and the complexity of the representation.

AIM-2001-014

CBCL-201

Author[s]: Richard Russell and Pawan Sinha

Perceptually-based Comparison of Image Similarity Metrics

July 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-014.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-014.pdf

The image comparison operation – assessing how well one image matches another – forms a critical component of many image analysis systems and models of human visual processing. Two norms used commonly for this purpose are L1 and L2, which are specific instances of the Minkowski metric. However, there is often not a principled reason for selecting one norm over the other. One way to address this problem is by examining whether one metric better captures the perceptual notion of image similarity than the other. With this goal, we examined perceptual preferences for images retrieved on the basis of the L1 versus the L2 norm. These images were either small fragments without recognizable content, or larger patterns with recognizable content created via vector quantization. In both conditions the subjects showed a consistent preference for images matched using the L1 metric. These results suggest that, in the domain of natural images of the kind we have used, the L1 metric may better capture human notions of image similarity.
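
As a small illustration of why the choice of norm matters for retrieval, the sketch below computes the Minkowski distance for a given order p (p = 1 gives L1, p = 2 gives L2) and shows a toy case where the nearest match differs between the two. The function names and the two-element "images" are hypothetical and are not taken from the memo.

```python
from typing import Sequence


def minkowski_distance(a: Sequence[float], b: Sequence[float], p: float) -> float:
    """Minkowski distance of order p between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if p < 1:
        raise ValueError("p must be >= 1 for a valid metric")
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def nearest(query: Sequence[float], candidates: Sequence[Sequence[float]], p: float) -> int:
    """Index of the candidate closest to the query under the order-p metric."""
    return min(range(len(candidates)),
               key=lambda i: minkowski_distance(query, candidates[i], p))


# Toy pixel vectors: one candidate differs a little everywhere,
# the other differs a lot in a single position.
query = [0.0, 0.0]
candidates = [[3.0, 3.0], [0.0, 5.0]]

print(nearest(query, candidates, p=1))  # L1 prefers the concentrated difference
print(nearest(query, candidates, p=2))  # L2 prefers the spread-out difference
```

L2 squares each difference before summing, so it penalizes a single large deviation more heavily than L1 does; that is why the two norms can rank the same candidates differently.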

AIM-2001-013

CBCL-200

Author[s]: Nicholas T. Chan, Ely Dahan, Andrew W. Lo and Tomaso Poggio

Experimental Markets for Product Concepts

July 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-013.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-013.pdf

Market prices are well known to efficiently collect and aggregate diverse information regarding the value of commodities and assets. The role of markets has been particularly suitable to pricing financial securities. This article provides an alternative application of the pricing mechanism to marketing research - using pseudo-securities markets to measure preferences over new product concepts. Surveys, focus groups, concept tests and conjoint studies are methods traditionally used to measure individual and aggregate preferences. Unfortunately, these methods can be biased, costly and time-consuming to conduct. The present research is motivated by the desire to efficiently measure preferences and more accurately predict new product success, based on the efficiency and incentive-compatibility of security trading markets. The article describes a novel market research method, provides insight into why the method should work, and compares the results of several trading experiments against other methodologies such as concept testing and conjoint analysis.

AIM-2001-012

CBCL-199

Author[s]: Mariano Alvira, Jim Paris and Ryan Rifkin

The Audiomomma Music Recommendation System

July 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-012.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-012.pdf

We design and implement a system that recommends musicians to listeners. The basic idea is to keep track of what artists a user listens to, to find other users with similar tastes, and to recommend other artists that these similar listeners enjoy. The system utilizes a client-server architecture, a web-based interface, and an SQL database to store and process information. We describe Audiomomma-0.3, a proof-of-concept implementation of the above ideas.
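
The basic idea described here (track what each user listens to, find users with similar tastes, recommend what they enjoy) can be sketched in a few lines. This is only an illustration under stated assumptions: the Jaccard overlap used as the similarity measure, the function names, and the sample data are all hypothetical, and the memo's actual system stores and processes its data in an SQL database behind a web interface.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Overlap between two artist sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user: str, listens: Dict[str, Set[str]], k: int = 1, n: int = 3) -> List[str]:
    """Recommend up to n unseen artists from the k most similar users."""
    mine = listens[user]
    neighbors = sorted(
        (other for other in listens if other != user),
        key=lambda other: jaccard(mine, listens[other]),
        reverse=True,
    )[:k]
    scores: Dict[str, float] = {}
    for other in neighbors:
        sim = jaccard(mine, listens[other])
        for artist in listens[other] - mine:
            # Weight each candidate artist by how similar its listener is.
            scores[artist] = scores.get(artist, 0.0) + sim
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda a: (-scores[a], a))[:n]


# Hypothetical listening histories.
listens = {
    "ann": {"artist_a", "artist_b", "artist_c"},
    "bob": {"artist_a", "artist_b", "artist_d"},
    "cat": {"artist_x", "artist_y"},
}

print(recommend("ann", listens))
```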

AIM-2001-011

CBCL-198

Author[s]: T. Poggio, S. Mukherjee, R. Rifkin, A. Rakhlin, and A. Verri

July 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-011.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-011.pdf

In this note we characterize the role of b, which is the constant in the standard form of the solution provided by the Support Vector Machine technique, $f(x) = \sum_{i=1}^{\ell} \alpha_i K(x, x_i) + b$.
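
To make the role of the constant concrete, here is a minimal sketch that evaluates a kernel expansion of the usual SVM form, a weighted sum of kernel values against the training points plus an offset b. The linear kernel, the coefficient values, and the function names are assumptions chosen for illustration; they are not taken from the note.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def linear_kernel(x: Vector, z: Vector) -> float:
    """Plain dot product; any positive-definite kernel could be used instead."""
    return sum(a * b for a, b in zip(x, z))


def decision_function(x: Vector,
                      points: Sequence[Vector],
                      alphas: Sequence[float],
                      b: float,
                      kernel: Callable[[Vector, Vector], float]) -> float:
    """Evaluate sum_i alpha_i * K(x, x_i) + b for signed coefficients alpha_i."""
    return sum(a * kernel(x, xi) for a, xi in zip(alphas, points)) + b


# Hypothetical expansion points and coefficients (not a trained model).
points = [[1.0, 0.0], [0.0, 1.0]]
alphas = [1.0, -1.0]

print(decision_function([2.0, 0.0], points, alphas, b=0.5, kernel=linear_kernel))
print(decision_function([2.0, 0.0], points, alphas, b=0.0, kernel=linear_kernel))
```

In this sketch, changing b shifts every output by the same amount, which moves the decision threshold without changing the kernel expansion itself.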

AIM-2001-010

CBCL-197

Author[s]: Purdy Ho

Rotation Invariant Real-time Face Detection and Recognition System

May 31, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-010.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-010.pdf

In this report, a face recognition system that is capable of detecting and recognizing frontal and rotated faces was developed. Two face recognition methods focusing on the aspect of pose invariance are presented and evaluated - the whole face approach and the component-based approach. The main challenge of this project is to develop a system that is able to identify faces under different viewing angles in real time. The development of such a system will enhance the capability and robustness of current face recognition technology. The whole-face approach recognizes faces by classifying a single feature vector consisting of the gray values of the whole face image. The component-based approach first locates the facial components and extracts them. These components are normalized and combined into a single feature vector for classification. The Support Vector Machine (SVM) is used as the classifier for both approaches. Extensive tests with respect to the robustness against pose changes are performed on a database that includes faces rotated up to about 40 degrees in depth. The component-based approach clearly outperforms the whole-face approach on all tests. Although this approach is proven to be more reliable, it is still too slow for real-time applications. That is the reason why a real-time face recognition system using the whole-face approach is implemented to recognize people in color video sequences.

AIM-2001-009

Author[s]: D. Demirdjian and T. Darrell

Motion Estimation from Disparity Images

May 7, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-009.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-009.pdf

A new method for 3D rigid motion estimation from stereo is proposed in this paper. The appealing feature of this method is that it directly uses the disparity images obtained from stereo matching. We assume that the stereo rig has parallel cameras and show, in that case, the geometric and topological properties of the disparity images. Then we introduce a rigid transformation (called d-motion) that maps two disparity images of a rigidly moving object. We show how it is related to the Euclidean rigid motion and derive a motion estimation algorithm. We show with experiments that our approach is simple and more accurate than standard approaches.

AITR-2001-009

Author[s]: Tevfik Metin Sezgin

Feature Point Detection and Curve Approximation for Early Processing of Freehand Sketches

May 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-009.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-009.pdf

Freehand sketching is both a natural and crucial part of design, yet is unsupported by current design automation software. We are working to combine the flexibility and ease of use of paper and pencil with the processing power of a computer to produce a design environment that feels as natural as paper, yet is considerably smarter. One of the most basic steps in accomplishing this is converting the original digitized pen strokes in the sketch into the intended geometric objects using feature point detection and approximation. We demonstrate how multiple sources of information can be combined for feature detection in strokes and apply this technique using two approaches to signal processing, one using simple average based thresholding and a second using scale space.

AIM-2001-008

Author[s]: A. Rahimi, L.-P. Morency and T. Darrell

Reducing Drift in Parametric Motion Tracking

May 7, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-008.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-008.pdf

We develop a class of differential motion trackers that automatically stabilize when in finite domains. Most differential trackers compute motion only relative to one previous frame, accumulating errors indefinitely. We estimate pose changes between a set of past frames, and develop a probabilistic framework for integrating those estimates. We use an approximation to the posterior distribution of pose changes as an uncertainty model for parametric motion in order to help arbitrate the use of multiple base frames. We demonstrate this framework on a simple 2D translational tracker and a 3D, 6-degree-of-freedom tracker.

AITR-2001-008

Author[s]: Radhika Nagpal

Programmable Self-Assembly: Constructing Global Shape using Biologically-inspire

June 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-008.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-008.pdf

In this thesis I present a language for instructing a sheet of identically-programmed, flexible, autonomous agents ("cells") to assemble themselves into a predetermined global shape, using local interactions. The global shape is described as a folding construction on a continuous sheet, using a set of axioms from paper-folding (origami). I provide a means of automatically deriving the cell program, executed by all cells, from the global shape description. With this language, a wide variety of global shapes and patterns can be synthesized, using only local interactions between identically-programmed cells. Examples include flat layered shapes, all plane Euclidean constructions, and a variety of tessellation patterns. In contrast to approaches based on cellular automata or evolution, the cell program is directly derived from the global shape description and is composed from a small number of biologically-inspired primitives: gradients, neighborhood query, polarity inversion, cell-to-cell contact and flexible folding. The cell programs are robust, without relying on regular cell placement, global coordinates, or synchronous operation and can tolerate a small amount of random cell death. I show that an average cell neighborhood of 15 is sufficient to reliably self-assemble complex shapes and geometric patterns on randomly distributed cells. The language provides many insights into the relationship between local and global descriptions of behavior, such as the advantage of constructive languages, mechanisms for achieving global robustness, and mechanisms for achieving scale-independent shapes from a single cell program. The language suggests a mechanism by which many related shapes can be created by the same cell program, in the manner of D'Arcy Thompson's famous coordinate transformations. The thesis illuminates how complex morphology and pattern can emerge from local interactions, and how one can engineer robust self-assembly.

AITR-2001-007

Author[s]: Won Hong

Modeling, Estimation, and Control of Robot-Soil Interactions

September 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-007.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-007.pdf

This thesis presents the development of hardware, theory, and experimental methods to enable a robotic manipulator arm to interact with soils and estimate soil properties from interaction forces. Unlike the majority of robotic systems interacting with soil, our objective is parameter estimation, not excavation. To this end, we design our manipulator with a flat plate for easy modeling of interactions. By using a flat plate, we take advantage of the wealth of research on the similar problem of earth pressure on retaining walls. There are a number of existing earth pressure models. These models typically provide estimates of force which are in uncertain relation to the true force. A recent technique, known as numerical limit analysis, provides upper and lower bounds on the true force. Predictions from the numerical limit analysis technique are shown to be in good agreement with other accepted models. Experimental methods for plate insertion, soil-tool interface friction estimation, and control of applied forces on the soil are presented. In addition, a novel graphical technique for inverting the soil models is developed, which is an improvement over standard nonlinear optimization. This graphical technique utilizes the uncertainties associated with each set of force measurements to obtain all possible parameters which could have produced the measured forces. The system is tested on three cohesionless soils, two in a loose state and one in both a loose and a dense state. The results are compared with friction angles obtained from direct shear tests. The results highlight a number of key points. Common assumptions are made in soil modeling, most notably the Mohr-Coulomb failure law and perfectly plastic behavior. In the direct shear tests, a marked dependence of friction angle on the normal stress at low stresses is found. This has ramifications for any study of friction done at low stresses. In addition, gradual failures are often observed for vertical tools and tools inclined away from the direction of motion. After accounting for the change in friction angle at low stresses, the results show good agreement with the direct shear values.

AIM-2001-007

Author[s]: Konstantine Arkoudas

Certified Computation

April 30, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-007.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-007.pdf

This paper introduces the notion of certified computation. A certified computation does not only produce a result r, but also a correctness certificate, which is a formal proof that r is correct. This can greatly enhance the credibility of the result: if we trust the axioms and inference rules that are used in the certificate, then we can be assured that r is correct. In effect, we obtain a trust reduction: we no longer have to trust the entire computation; we only have to trust the certificate. Typically, the reasoning used in the certificate is much simpler and easier to trust than the entire computation. Certified computation has two main applications: as a software engineering discipline, it can be used to increase the reliability of our code; and as a framework for cooperative computation, it can be used whenever a code consumer executes an algorithm obtained from an untrusted agent and needs to be convinced that the generated results are correct. We propose DPLs (Denotational Proof Languages) as a uniform platform for certified computation. DPLs enforce a sharp separation between logic and control and offer versatile mechanisms for constructing certificates. We use Athena as a concrete DPL to illustrate our ideas, and we present two examples of certified computation, giving full working code in both cases.

AITR-2001-006

Author[s]: Aaron Mark Ucko

Predicate Dispatching in the Common Lisp Object System

May 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-006.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-006.pdf

I have added support for predicate dispatching, a powerful generalization of other dispatching mechanisms, to the Common Lisp Object System (CLOS). To demonstrate its utility, I used predicate dispatching to enhance Weyl, a computer algebra system which doubles as a CLOS library. My result is Dispatching-Enhanced Weyl (DEW), a computer algebra system that I have demonstrated to be well suited for both users and programmers.

AIM-2001-006

CBCL-196

Author[s]: Javid Sadr and Pawan Sinha

Exploring Object Perception with Random Image Structure Evolution

March 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-006.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-006.pdf

We have developed a technique called RISE (Random Image Structure Evolution), by which one may systematically sample continuous paths in a high-dimensional image space. A basic RISE sequence depicts the evolution of an object's image from a random field, along with the reverse sequence which depicts the transformation of this image back into randomness. The processing steps are designed to ensure that important low-level image attributes such as the frequency spectrum and luminance are held constant throughout a RISE sequence. Experiments based on the RISE paradigm can be used to address some key open issues in object perception. These include determining the neural substrates underlying object perception, the role of prior knowledge and expectation in object perception, and the developmental changes in object perception skills from infancy to adulthood.

AITR-2001-005

Author[s]: Jessica Banks

Design and Control of an Anthropomorphic Robotic Finger with Multi-point Tactile Sensation

May 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-005.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-005.pdf

The goal of this research is to develop the prototype of a tactile sensing platform for anthropomorphic manipulation research. We investigate this problem through the fabrication and simple control of a planar 2-DOF robotic finger inspired by anatomic consistency, self-containment, and adaptability. The robot is equipped with a tactile sensor array based on optical transducer technology whereby localized changes in light intensity within an illuminated foam substrate correspond to the distribution and magnitude of forces applied to the sensor surface plane. The integration of tactile perception is a key component in realizing robotic systems which organically interact with the world. Such natural behavior is characterized by compliant performance that can initiate internal, and respond to external, force application in a dynamic environment. However, most of the current manipulators that support some form of haptic feedback either solely derive proprioceptive sensation or only limit tactile sensors to the mechanical fingertips. These constraints are due to the technological challenges involved in high resolution, multi-point tactile perception. In this work, however, we take the opposite approach, emphasizing the role of full-finger tactile feedback in the refinement of manual capabilities. To this end, we propose and implement a control framework for sensorimotor coordination analogous to infant-level grasping and fixturing reflexes. This thesis details the mechanisms used to achieve these sensory, actuation, and control objectives, along with the design philosophies and biological influences behind them. The results of behavioral experiments with a simple tactilely-modulated control scheme are also described. The hope is to integrate the modular finger into an engineered analog of the human hand with a complete haptic system.

AIM-2001-005

CBCL-195

Author[s]: Nicholas Tung Chan and Christian Shelton

An Electronic Market-Maker

April 17, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-005.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-005.pdf

This paper presents an adaptive learning model for market-making under the reinforcement learning framework. Reinforcement learning is a learning technique in which agents aim to maximize the long-term accumulated rewards. No knowledge of the market environment, such as the order arrival or price process, is assumed. Instead, the agent learns from real-time market experience and develops explicit market-making strategies, achieving multiple objectives including the maximizing of profits and minimization of the bid-ask spread. The simulation results show initial success in bringing learning techniques to building market-making algorithms.

AITR-2001-004

Author[s]: Jason D. M. Rennie

Improving Multi-class Text Classification with Naive Bayes

September 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-004.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-004.pdf

There are numerous text documents available in electronic form. More and more are becoming available every day. Such documents represent a massive amount of information that is easily accessible. Seeking value in this huge collection requires organization; much of the work of organizing documents can be automated through text classification. The accuracy and our understanding of such systems greatly influence their usefulness. In this paper, we seek 1) to advance the understanding of commonly used text classification techniques, and 2) through that understanding, improve the tools that are available for text classification. We begin by clarifying the assumptions made in the derivation of Naive Bayes, noting basic properties and proposing ways for its extension and improvement. Next, we investigate the quality of Naive Bayes parameter estimates and their impact on classification. Our analysis leads to a theorem which gives an explanation for the improvements that can be found in multiclass classification with Naive Bayes using Error-Correcting Output Codes. We use experimental evidence on two commonly-used data sets to exhibit an application of the theorem. Finally, we show fundamental flaws in a commonly-used feature selection algorithm and develop a statistics-based framework for text feature selection. Greater understanding of Naive Bayes and the properties of text allows us to make better use of it in text classification.

AIM-2001-004

CBCL-193

Author[s]: Mariano Alvira and Ryan Rifkin

An Empirical Comparison of SNoW and SVMs for Face Detection

January 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-004.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-004.pdf

Impressive claims have been made for the performance of the SNoW algorithm on face detection tasks by Yang et al. [7]. In particular, by looking at both their results and those of Heisele et al. [3], one could infer that the SNoW system performed substantially better than an SVM-based system, even when the SVM used a polynomial kernel and the SNoW system used a particularly simplistic 'primitive' linear representation. We evaluated the two approaches in a controlled experiment, looking directly at performance on a simple, fixed-sized test set, isolating out 'infrastructure' issues related to detecting faces at various scales in large images. We found that SNoW performed about as well as linear SVMs, and substantially worse than polynomial SVMs.

AITR-2001-003

CBCL-204

Author[s]: Christian Robert Shelton

Importance Sampling for Reinforcement Learning with Multiple Objectives

August 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-003.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-003.pdf

This thesis considers three complications that arise from applying reinforcement learning to a real-world application. In the process of using reinforcement learning to build an adaptive electronic market-maker, we find the sparsity of data, the partial observability of the domain, and the multiple objectives of the agent to cause serious problems for existing reinforcement learning algorithms. We employ importance sampling (likelihood ratios) to achieve good performance in partially observable Markov decision processes with few data. Our importance sampling estimator requires no knowledge about the environment and places few restrictions on the method of collecting data. It can be used efficiently with reactive controllers, finite-state controllers, or policies with function approximation. We present theoretical analyses of the estimator and incorporate it into a reinforcement learning algorithm. Additionally, this method provides a complete return surface which can be used to balance multiple objectives dynamically. We demonstrate the need for multiple goals in a variety of applications and natural solutions based on our sampling method. The thesis concludes with example results from employing our algorithm to the domain of automated electronic market-making.

AIM-2001-003

Author[s]: Nicolas Meuleau, Leonid Peshkin and Kee-Eung Kim

Exploration in Gradient-Based Reinforcement Learning

April 3, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-003.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-003.pdf

Gradient-based policy search is an alternative to value-function-based methods for reinforcement learning in non-Markovian domains. One apparent drawback of policy search is its requirement that all actions be 'on-policy'; that is, that there be no explicit exploration. In this paper, we provide a method for using importance sampling to allow any well-behaved directed exploration policy during learning. We show both theoretically and experimentally that using this method can achieve dramatic performance improvements.

AITR-2001-002

Author[s]: Pedro F. Felzenszwalb

Object Recognition with Pictorial Structures

May 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-002.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-002.pdf

This thesis presents a statistical framework for object recognition. The framework is motivated by the pictorial structure models introduced by Fischler and Elschlager nearly 30 years ago. The basic idea is to model an object by a collection of parts arranged in a deformable configuration. The appearance of each part is modeled separately, and the deformable configuration is represented by spring-like connections between pairs of parts. These models allow for qualitative descriptions of visual appearance, and are suitable for generic recognition problems. The problem of detecting an object in an image and the problem of learning an object model using training examples are naturally formulated under a statistical approach. We present efficient algorithms to solve these problems in our framework. We demonstrate our techniques by training models to represent faces and human bodies. The models are then used to locate the corresponding objects in novel images.

AIM-2001-002

CBCL-194

Author[s]: Christian R. Shelton

Policy Improvement for POMDPs Using Normalized Importance Sampling

March 20, 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-002.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-002.pdf

We present a new method for estimating the expected return of a POMDP from experience. The estimator does not assume any knowledge of the POMDP and allows the experience to be gathered with an arbitrary set of policies. The return is estimated for any new policy of the POMDP. We motivate the estimator from function-approximation and importance sampling points-of-view and derive its theoretical properties. Although the estimator is biased, it has low variance and the bias is often irrelevant when the estimator is used for pair-wise comparisons. We conclude by extending the estimator to policies with memory and compare its performance in a greedy search algorithm to the REINFORCE algorithm showing an order of magnitude reduction in the number of trials required.
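
As a rough illustration of estimating a new policy's return from experience gathered under other policies, the sketch below implements a generic normalized (weighted) importance sampling estimator. It is not taken from the memo: the function name `normalized_is_return`, the trajectory tuple layout, and the idea of passing precomputed trajectory probabilities are all assumptions made for this example.

```python
def normalized_is_return(trajectories):
    """Estimate the expected return of a target policy.

    Each trajectory is a tuple (ret, p_target, p_behavior), where ret is
    the observed return and p_target / p_behavior are the probabilities
    of the trajectory's action choices under the target policy and under
    the policy that collected the data (hypothetical layout).
    """
    # Likelihood ratio of each trajectory: target over behavior.
    weights = [p_t / p_b for _, p_t, p_b in trajectories]
    total = sum(weights)
    if total == 0:
        raise ValueError("all importance weights are zero")
    # Normalizing by the sum of weights (not the sample count) gives the
    # weighted estimator.
    weighted = sum(w * ret for w, (ret, _, _) in zip(weights, trajectories))
    return weighted / total
```

Dividing by the sum of the weights rather than by the number of samples is what makes this kind of estimator biased but typically lower in variance, which is the trade-off the abstract mentions.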

AITR-2001-001

Author[s]: Kimberle Koile

The Architect's Collaborator: Toward Intelligent Tools for Conceptual Design

January 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-001.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AITR-2001-001.pdf

In early stages of architectural design, as in other design domains, the language used is often very abstract. In architectural design, for example, architects and their clients use experiential terms such as "private" or "open" to describe spaces. If we are to build programs that can help designers during this early-stage design, we must give those programs the capability to deal with concepts on the level of such abstractions. The work reported in this thesis sought to do that, focusing on two key questions: How are abstract terms such as "private" and "open" translated into physical form? How might one build a tool to assist designers with this process? The Architect's Collaborator (TAC) was built to explore these issues. It is a design assistant that supports iterative design refinement, and that represents and reasons about how experiential qualities are manifested in physical form. Given a starting design and a set of design goals, TAC explores the space of possible designs in search of solutions that satisfy the goals. It employs a strategy we've called dependency-directed redesign: it evaluates a design with respect to a set of goals, then uses an explanation of the evaluation to guide proposal and refinement of repair suggestions; it then carries out the repair suggestions to create new designs. A series of experiments was run to study TAC's behavior. Issues of control structure, goal set size, goal order, and modification operator capabilities were explored. In addition, TAC's use as a design assistant was studied in an experiment using a house in the process of being redesigned. TAC's use as an analysis tool was studied in an experiment using Frank Lloyd Wright's Prairie houses.

AIM-2001-001

Author[s]: T. Darrell, D. Demirdjian, N. Checka and P. Felzenszwalb

Plan-view Trajectory Estimation with Dense Stereo Background Models

February 2001

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-001.ps

ftp://publications.ai.mit.edu/ai-publications/2001/AIM-2001-001.pdf

In a known environment, objects may be tracked in multiple views using a set of background models. Stereo-based models can be illumination-invariant, but often have undefined values which inevitably lead to foreground classification errors. We derive dense stereo models for object tracking using long-term, extended dynamic-range imagery, and by detecting and interpolating uniform but unoccluded planar regions. Foreground points are detected quickly in new images using pruned disparity search. We adopt a 'late-segmentation' strategy, using an integrated plan-view density representation. Foreground points are segmented into object regions only when a trajectory is finally estimated, using a dynamic programming-based method. Object entry and exit are optimally determined and are not restricted to special spatial zones.

AIM-1697

CBCL-192

Author[s]: Thomas Serre, Bernd Heisele, Sayan Mukherjee and Tomaso Poggio

Feature Selection for Face Detection

September 2000

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1697.ps

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1697.pdf

We present a new method to select features for a face detection system using Support Vector Machines (SVMs). In the first step we reduce the dimensionality of the input space by projecting the data into a subset of eigenvectors. The dimension of the subset is determined by a classification criterion based on minimizing a bound on the expected error probability of an SVM. In the second step we select features from the SVM feature space by removing those that have low contributions to the decision function of the SVM.

AIM-1695

CBCL-190

Author[s]: Maximilian Riesenhuber and Tomaso Poggio

Computational Models of Object Recognition in Cortex: A Review

August 7, 2000

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1695.ps

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1695.pdf

Understanding how biological visual systems perform object recognition is one of the ultimate goals in computational neuroscience. Among the biological models of recognition the main distinctions are between feedforward and feedback and between object-centered and view-centered. From a computational viewpoint the different recognition tasks - for instance categorization and identification - are very similar, representing different trade-offs between specificity and invariance. Thus the different tasks do not strictly require different classes of models. The focus of the review is on feedforward, view-based models that are supported by psychophysical and physiological data.

AIM-1688

CBCL-188

Author[s]: Chikahito Nakajima, Massimiliano Pontil, Bernd Heisele and Tomaso Poggio

People Recognition in Image Sequences by Supervised Learning

June 2000

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1688.ps

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1688.pdf

We describe a system that learns from examples to recognize people in images taken indoors. Images of people are represented by color-based and shape-based features. Recognition is carried out through combinations of Support Vector Machine classifiers (SVMs). Different types of multiclass strategies based on SVMs are explored and compared to k-Nearest Neighbors classifiers (kNNs). The system works in real time and shows high performance rates for people recognition throughout one day.

AIM-1687

CBCL-187

Author[s]: Bernd Heisele, Tomaso Poggio and Massimiliano Pontil

Face Detection in Still Gray Images

May 2000

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1687.ps

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1687.pdf

We present a trainable system for detecting frontal and near-frontal views of faces in still gray images using Support Vector Machines (SVMs). We first consider the problem of detecting the whole face pattern by a single SVM classifier. In this context we compare different types of image features, present and evaluate a new method for reducing the number of features and discuss practical issues concerning the parameterization of SVMs and the selection of training data. The second part of the paper describes a component-based method for face detection consisting of a two-level hierarchy of SVM classifiers. On the first level, component classifiers independently detect components of a face, such as the eyes, the nose, and the mouth. On the second level, a single classifier checks if the geometrical configuration of the detected components in the image matches a geometrical model of a face.

AIM-1682

CBCL-185

Author[s]: Maximilian Riesenhuber and Tomaso Poggio

The Individual is Nothing, the Class Everything: Psychophysics and Modeling of Recognition in Object Classes

May 1, 2000

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1682.ps

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1682.pdf

Most psychophysical studies of object recognition have focussed on the recognition and representation of individual objects subjects had previously explicitly been trained on. Correspondingly, modeling studies have often employed a 'grandmother'-type representation where the objects to be recognized were represented by individual units. However, objects in the natural world are commonly members of a class containing a number of visually similar objects, such as faces, for which physiology studies have provided support for a representation based on a sparse population code, which permits generalization from the learned exemplars to novel objects of that class. In this paper, we present results from psychophysical and modeling studies intended to investigate object recognition in natural ('continuous') object classes. In two experiments, subjects were trained to perform subordinate level discrimination in a continuous object class - images of computer-rendered cars - created using a 3D morphing system. By comparing the recognition performance of trained and untrained subjects we could estimate the effects of viewpoint-specific training and infer properties of the object class-specific representation learned as a result of training. We then compared the experimental findings to simulations, building on our recently presented HMAX model of object recognition in cortex, to investigate the computational properties of a population-based object class representation as outlined above. We find experimental evidence, supported by modeling results, that training builds a viewpoint- and class-specific representation that supplements a pre-existing representation with lower shape discriminability but possibly greater viewpoint invariance.

AIM-1696

CBCL-191

Author[s]: Vinay Kumar and Tomaso Poggio

Learning-Based Approach to Estimation of Morphable Model Parameters

September 2000

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1696.ps

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1696.pdf

We describe the key role played by partial evaluation in the Supercomputing Toolkit, a parallel computing system for scientific applications that effectively exploits the vast amount of parallelism exposed by partial evaluation. The Supercomputing Toolkit parallel processor and its associated partial evaluation-based compiler have been used extensively by scientists at MIT, and have made possible recent results in astrophysics showing that the motion of the planets in our solar system is chaotically unstable.

AITR-1685

CBCL-186

Author[s]: Constantine P. Papageorgiou

A Trainable System for Object Detection in Images and Video Sequences

May 2000

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1685.ps

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1685.pdf

This thesis presents a general, trainable system for object detection in static images and video sequences. The core system finds a certain class of objects in static images of completely unconstrained, cluttered scenes without using motion, tracking, or handcrafted models and without making any assumptions on the scene structure or the number of objects in the scene. The system uses a set of training data of positive and negative example images as input, transforms the pixel images to a Haar wavelet representation, and uses a support vector machine classifier to learn the difference between in-class and out-of-class patterns. To detect objects in out-of-sample images, we do a brute force search over all the subwindows in the image. This system is applied to face, people, and car detection with excellent results. For our extensions to video sequences, we augment the core static detection system in several ways -- 1) extending the representation to five frames, 2) implementing an approximation to a Kalman filter, and 3) modeling detections in an image as a density and propagating this density through time according to measured features. In addition, we present a real-time version of the system that is currently running in a DaimlerChrysler experimental vehicle. As part of this thesis, we also present a system that, instead of detecting full patterns, uses a component-based approach. We find it to be more robust to occlusions, rotations in depth, and severe lighting conditions for people detection than the full body version. We also experiment with various other representations including pixels and principal components and show results that quantify how the number of features, color, and gray-level affect performance.

AIM-1681

CBCL-184

Author[s]: Theodoros Evgeniou and Massimiliano Pontil

A Note on the Generalization Performance of Kernel Classifiers with Margin

May 1, 2000

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1681.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1681.pdf

We present distribution independent bounds on the generalization misclassification performance of a family of kernel classifiers with margin. Support Vector Machine classifiers (SVM) stem out of this class of machines. The bounds are derived through computations of the $V_\gamma$ dimension of a family of loss functions to which the SVM loss belongs. Bounds that use functions of margin distributions (i.e. functions of the slack variables of SVM) are derived.

AIM-1679

CBCL-183

Author[s]: Maximilian Riesenhuber and Tomaso Poggio

A Note on Object Class Representation and Categorical Perception

December 17, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1679.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1679.pdf

We present a novel scheme ("Categorical Basis Functions", CBF) for object class representation in the brain and contrast it to the "Chorus of Prototypes" scheme recently proposed by Edelman. The power and flexibility of CBF is demonstrated in two examples. CBF is then applied to investigate the phenomenon of Categorical Perception, in particular the finding by Bulthoff et al. (1998) of categorization of faces by gender without corresponding Categorical Perception. Here, CBF makes predictions that can be tested in a psychophysical experiment. Finally, experiments are suggested to further test CBF.

AITR-1675

Author[s]: J. Kenneth Salisbury, Jr. and Mandayam A. Srinivasan (editors)

Proceedings of the Fourth PHANTOM Users Group Workshop

November 4, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1675.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1675.pdf

This report contains the proceedings of the Fourth PHANTOM Users Group Workshop: 17 papers presented October 9-12, 1999 at MIT Endicott House in Dedham, Massachusetts. The workshop included sessions on Tools for Programmers, Dynamic Environments, Perception and Cognition, Haptic Connections, Collision Detection / Collision Response, Medical and Seismic Applications, and Haptics Going Mainstream. The proceedings include papers that cover a variety of subjects in computer haptics including rendering, contact determination, development libraries, and applications in medicine, path planning, data interaction and training.

AITR-1674

Author[s]: J.P. Mellor

Automatically Recovering Geometry and Texture from Large Sets of Calibrated Images

October 22, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1674.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1674.pdf

Three-dimensional models which contain both geometry and texture have numerous applications such as urban planning, physical simulation, and virtual environments. A major focus of computer vision (and recently graphics) research is the automatic recovery of three-dimensional models from two-dimensional images. After many years of research this goal is yet to be achieved. Most practical modeling systems require substantial human input and unlike automatic systems are not scalable. This thesis presents a novel method for automatically recovering dense surface patches using large sets (1000's) of calibrated images taken from arbitrary positions within the scene. Physical instruments, such as Global Positioning System (GPS), inertial sensors, and inclinometers, are used to estimate the position and orientation of each image. Essentially, the problem is to find corresponding points in each of the images. Once a correspondence has been established, calculating its three-dimensional position is simply a matter of geometry. Long baseline images improve the accuracy. Short baseline images and the large number of images greatly simplify the correspondence problem. The initial stage of the algorithm is completely local and scales linearly with the number of images. Subsequent stages are global in nature, exploit geometric constraints, and scale quadratically with the complexity of the underlying scene. We describe techniques for: 1) detecting and localizing surface patches; 2) refining camera calibration estimates and rejecting false positive surfels; and 3) grouping surface patches into surfaces and growing the surface along a two-dimensional manifold. We also discuss a method for producing high quality, textured three-dimensional models from these surfaces.
Some of the most important characteristics of this approach are that it: 1) uses and refines noisy calibration estimates; 2) compensates for large variations in illumination; 3) tolerates significant soft occlusion (e.g. tree branches); and 4) associates, at a fundamental level, an estimated normal (i.e. no frontal-planar assumption) and texture with each surface patch.

AIM-1673

CBCL-180

Author[s]: Constantine P. Papageorgiou and Tomaso Poggio

A Trainable Object Detection System: Car Detection in Static Images

October 13, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1673.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1673.pdf

This paper describes a general, trainable architecture for object detection that has previously been applied to face and people detection with a new application to car detection in static images. Our technique is a learning based approach that uses a set of labeled training data from which an implicit model of an object class -- here, cars -- is learned. Instead of pixel representations that may be noisy and therefore not provide a compact representation for learning, our training images are transformed from pixel space to that of Haar wavelets that respond to local, oriented, multiscale intensity differences. These feature vectors are then used to train a support vector machine classifier. The detection of cars in images is an important step in applications such as traffic monitoring, driver assistance systems, and surveillance, among others. We show several examples of car detection on out-of-sample images and show an ROC curve that highlights the performance of our system.
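
To make "local, oriented intensity differences" concrete, the sketch below computes three unnormalized Haar-like responses on a single 2x2 pixel patch. It is only a single-scale toy under assumed conventions: the function name `haar_responses`, the patch layout, and the lack of normalization are choices made for this example, not details from the memo.

```python
def haar_responses(patch):
    """Return (vertical, horizontal, diagonal) responses for a 2x2 patch.

    The patch is given as [[a, b], [c, d]], with a, b on the top row.
    """
    (a, b), (c, d) = patch
    # Left column minus right column: large for a vertical edge.
    vertical = (a + c) - (b + d)
    # Top row minus bottom row: large for a horizontal edge.
    horizontal = (a + b) - (c + d)
    # Main diagonal minus anti-diagonal: large for a diagonal pattern.
    diagonal = (a + d) - (b + c)
    return vertical, horizontal, diagonal
```

A patch that is bright on the left and dark on the right, such as `[[10, 0], [10, 0]]`, produces a response only in the vertical-edge channel, while a uniform patch produces none. A multiscale representation would repeat the same differences over larger blocks.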

AIM-1672

CBCL-179

Author[s]: Vinay P. Kumar and Tomaso Poggio

Learning-Based Approach to Real Time Tracking and Analysis of Faces

September 23, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1672.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1672.pdf

This paper describes a trainable system capable of tracking faces and facial features like eyes and nostrils and estimating basic mouth features such as degrees of openness and smile in real time. In developing this system, we have addressed the twin issues of image representation and algorithms for learning. We have used the invariance properties of image representations based on Haar wavelets to robustly capture various facial features. Similarly, unlike previous approaches this system is entirely trained using examples and does not rely on a priori (hand-crafted) models of facial features based on optical flow or facial musculature. The system works in several stages that begin with face detection, followed by localization of facial features and estimation of mouth parameters. Each of these stages is formulated as a problem in supervised learning from examples. We apply the new and robust technique of support vector machines (SVM) for classification in the stage of skin segmentation, face detection and eye detection. Estimation of mouth parameters is modeled as a regression from a sparse subset of coefficients (basis functions) of an overcomplete dictionary of Haar wavelets.

AIM-1670

Author[s]: Mark M. Millonas and Erik M. Rauch

Trans-membrane Signal Transduction and Biochemical Turing Pattern Formation

September 28, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1670.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1670.pdf

The Turing mechanism for the production of a broken spatial symmetry in an initially homogeneous system of reacting and diffusing substances has attracted much interest as a potential model for certain aspects of morphogenesis such as pre-patterning in the embryo, and has also served as a model for self-organization in more generic systems. The two features necessary for the formation of Turing patterns are short-range autocatalysis and long-range inhibition which usually only occur when the diffusion rate of the inhibitor is significantly greater than that of the activator. This observation has sometimes been used to cast doubt on applicability of the Turing mechanism to cellular patterning since many messenger molecules that diffuse between cells do so at more-or-less similar rates. Here we show that stationary, symmetry-breaking Turing patterns can form in physiologically realistic systems even when the extracellular diffusion coefficients are equal; the kinetic properties of the 'receiver' and 'transmitter' proteins responsible for signal transduction will be primary factors governing this process.

AIM-1669

Author[s]: Kinh Tieu and Paul Viola

Boosting Image Database Retrieval

September 10, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1669.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1669.pdf

We present an approach for image database retrieval using a very large number of highly selective features and simple on-line learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual "causes" and that images which are visually similar share causes. We propose a mechanism for generating a large number of complex features which capture some aspects of this causal structure. Boosting is used to learn simple and efficient classifiers in this complex feature space. Finally we will describe a practical implementation of our retrieval system on a database of 3000 images.

AITR-1668

Author[s]: Tommi Jaakkola, Marina Meila and Tony Jebara

Maximum Entropy Discrimination

December 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1668.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1668.pdf

We present a general framework for discriminative estimation based on the maximum entropy principle and its extensions. All calculations involve distributions over structures and/or parameters rather than specific settings and reduce to relative entropy projections. This holds even when the data is not separable within the chosen parametric class, in the context of anomaly detection rather than classification, or when the labels in the training set are uncertain or incomplete. Support vector machines are naturally subsumed under this class and we provide several extensions. We are also able to estimate exactly and efficiently discriminative distributions over tree structures of class-conditional models within this framework. Preliminary experimental results are indicative of the potential in these techniques.

AIM-1666

Author[s]: Radhika Nagpal

Organizing a Global Coordinate System from Local Information on an Amorphous Computer

August 29, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1666.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1666.pdf

This paper demonstrates that it is possible to generate a reasonably accurate coordinate system on randomly distributed processors, using only local information and local communication. By coordinate system we mean that each element assigns itself a logical coordinate that maps to its global physical location, starting with no a priori knowledge of position or orientation. The algorithm presented is inspired by biological systems that use chemical gradients to determine the position of cells. Extensive analysis and simulation results are presented. Two key results are: there is a critical minimum average neighborhood size of 15 for good accuracy and there is a fundamental limit on the resolution of any coordinate system determined strictly from local communication. We also demonstrate that using this algorithm, random distributions of processors produce significantly better accuracy than regular processor grids - such as those used by cellular automata. This has implications for discrete models of biology as well as for building smart sensor arrays.
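
One simple way to picture position sensing from purely local communication is a hop-count "gradient" that spreads outward from a seed processor, with each hop standing in for roughly one communication radius of distance. The sketch below is an assumption-laden illustration of that idea, not the memo's algorithm: the function name `hop_counts`, the fixed communication `radius`, and the centralized breadth-first simulation are all hypothetical choices made for this example.

```python
from collections import deque


def hop_counts(positions, radius, seed):
    """Return {processor index: hops from seed} for reachable processors.

    positions is a list of (x, y) coordinates used only to decide which
    processors can hear each other; two processors are neighbors when
    they lie within `radius` of one another.
    """
    def neighbors(i):
        xi, yi = positions[i]
        return [
            j for j, (xj, yj) in enumerate(positions)
            if j != i and (xi - xj) ** 2 + (yi - yj) ** 2 <= radius ** 2
        ]

    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        i = queue.popleft()
        for j in neighbors(i):
            # The first message to arrive carries the smallest hop count.
            if j not in hops:
                hops[j] = hops[i] + 1
                queue.append(j)
    return hops
```

In this toy model a processor that is out of range of every other processor never receives a hop count, and hop counts from several seeds would be needed before each element could estimate a two-dimensional coordinate.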

AIM-1665

Author[s]: Harold Abelson, Don Allen, Daniel Coore, Chris Hanson, George Homsy, Thomas F. Knight, Jr., Radhika Nagpal, Erik Rauch, Gerald Jay Sussman and Ron Weiss

Amorphous Computing

August 29, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1665.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1665.pdf

Amorphous computing is the development of organizational principles and programming languages for obtaining coherent behaviors from the cooperation of myriads of unreliable parts that are interconnected in unknown, irregular, and time-varying ways. The impetus for amorphous computing comes from developments in microfabrication and fundamental biology, each of which is the basis of a kernel technology that makes it possible to build or grow huge numbers of almost-identical information-processing units at almost no cost. This paper sets out a research agenda for realizing the potential of amorphous computing and surveys some initial progress, both in programming and in fabrication. We describe some approaches to programming amorphous systems, which are inspired by metaphors from biology and physics. We also present the basic ideas of cellular computing, an approach to constructing digital-logic circuits within living cells by representing logic levels by concentrations of DNA-binding proteins.

AIM-1664

CBCL-178

Author[s]: Anuj Mohan

Object Detection in Images by Components

August 11, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1664.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1664.pdf

In this paper we present a component-based person detection system that is capable of detecting frontal, rear and near side views of people, and partially occluded persons in cluttered scenes. The framework that is described here for people is easily applied to other objects as well. The motivation for developing a component-based approach is twofold: first, to enhance the performance of person detection systems on frontal and rear views of people and second, to develop a framework that directly addresses the problem of detecting people who are partially occluded or whose body parts blend in with the background. The data classification is handled by several support vector machine classifiers arranged in two layers. This architecture is known as Adaptive Combination of Classifiers (ACC). The system performs very well and is capable of detecting people even when not all components of a person are found. The performance of the system is significantly better than a full body person detector designed along similar lines. This suggests that the improved performance is due to the component-based approach and the ACC data classification structure.

AIM-1663

Author[s]: Rajesh Kasturirangan

Multiple Scales in Small-World Networks

August 11, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1663.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1663.pdf

Small-world architectures may be implicated in a range of phenomena from networks of neurons in the cerebral cortex to social networks and propagation of viruses. Small-world networks are interpolations of regular and random networks that retain the advantages of both regular and random networks by being highly clustered like regular networks and having small average path length between nodes, like random networks. While most of the recent attention on small-world networks has focussed on the effect of introducing disorder/randomness into a regular network, we show that the fundamental mechanism behind the small-world phenomenon is not disorder/randomness, but the presence of connections of many different length scales. Consequently, in order to explain the small-world phenomenon, we introduce the concept of multiple scale networks and then state the multiple length scale hypothesis. We show that small-world behavior in randomly rewired networks is a consequence of features common to all multiple scale networks. To support the multiple length scale hypothesis, novel network architectures are introduced that need not be a result of random rewiring of a regular network. In each case it is shown that whenever the network exhibits small-world behavior, it also has connections of diverse length scales. We also show that the distribution of the length scales of the new connections is significantly more important than whether the new connections are long range, medium range or short range.

AIM-1662

Author[s]: Liana M. Lorigo, Olivier Faugeras, W.E.L. Grimson, Renaud Keriven, Ron Kikinis, Carl-Fredrik Westin

Co-dimension 2 Geodesic Active Contours for MRA Segmentation

August 11, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1662.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1662.pdf

Automatic and semi-automatic magnetic resonance angiography (MRA) segmentation techniques can potentially save radiologists large amounts of time required for manual segmentation and can facilitate further data analysis. The proposed MRA segmentation method uses a mathematical modeling technique which is well-suited to the complicated curve-like structure of blood vessels. We define the segmentation task as an energy minimization over all 3D curves and use a level set method to search for a solution. Our approach is an extension of previous level set segmentation techniques to higher co-dimension.

AIM-1661

CBCL-177

Author[s]: Ryan Rifkin, Massimiliano Pontil and Alessandro Verri

A Note on Support Vector Machines Degeneracy

August 11, 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1661.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1661.pdf

When training Support Vector Machines (SVMs) over non-separable data sets, one sets the threshold $b$ using any dual cost coefficient that is strictly between the bounds of $0$ and $C$. We show that there exist SVM training problems with dual optimal solutions with all coefficients at bounds, but that all such problems are degenerate in the sense that the "optimal separating hyperplane" is given by ${\bf w} = {\bf 0}$, and the resulting (degenerate) SVM will classify all future points identically (to the class that supplies more training data). We also derive necessary and sufficient conditions on the input data for this to occur. Finally, we show that an SVM training problem can always be made degenerate by the addition of a single data point belonging to a certain unbounded polyhedron, which we characterize in terms of its extreme points and rays.

AIM-1658

CBCL-173

Author[s]: Tony Ezzat and Tomaso Poggio

Visual Speech Synthesis by Morphing Visemes

May 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1658.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1658.pdf

We present MikeTalk, a text-to-audiovisual speech synthesizer which converts input text into an audiovisual speech stream. MikeTalk is built using visemes, which are a small set of images spanning a large range of mouth shapes. The visemes are acquired from a recorded visual corpus of a human subject which is specifically designed to elicit one instantiation of each viseme. Using optical flow methods, correspondence from every viseme to every other viseme is computed automatically. By morphing along this correspondence, a smooth transition between viseme images may be generated. A complete visual utterance is constructed by concatenating viseme transitions. Finally, phoneme and timing information extracted from a text-to-speech synthesizer is exploited to determine which viseme transitions to use, and the rate at which the morphing process should occur. In this manner, we are able to synchronize the visual speech stream with the audio speech stream, and hence give the impression of a photorealistic talking face.

AIM-1657

Author[s]: Hany Farid

Detecting Digital Forgeries Using Bispectral Analysis

December 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1657.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1657.pdf

With the rapid increase in low-cost and sophisticated digital technology the need for techniques to authenticate digital material will become more urgent. In this paper we address the problem of authenticating digital signals assuming no explicit prior knowledge of the original. The basic approach that we take is to assume that in the frequency domain a "natural" signal has weak higher-order statistical correlations. We then show that "un-natural" correlations are introduced if this signal is passed through a non-linearity (which would almost surely occur in the creation of a forgery). Techniques from polyspectral analysis are then used to detect the presence of these correlations. We review the basics of polyspectral analysis, show how and why these tools can be used in detecting forgeries and show their effectiveness in analyzing human speech.

AIM-1656

CBCL-172

Author[s]: Theodoros Evgeniou and Massimiliano Pontil

On the V(subscript gamma) Dimension for Regression in Reproducing Kernel Hilbert Spaces

May 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1656.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1656.pdf

This paper presents a computation of the $V_\gamma$ dimension for regression in bounded subspaces of Reproducing Kernel Hilbert Spaces (RKHS) for the Support Vector Machine (SVM) regression $\epsilon$-insensitive loss function, and general $L_p$ loss functions. Finiteness of the $V_\gamma$ dimension is shown, which also proves uniform convergence in probability for regression machines in RKHS subspaces that use the $L_\epsilon$ or general $L_p$ loss functions. This paper presents a novel proof of this result also for the case that a bias is added to the functions in the RKHS.

AIM-1655

Author[s]: Gideon P. Stein, Raquel Romano and Lily Lee

Monitoring Activities from Multiple Video Streams: Establishing a Common Coordinate Frame

April 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1655.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1655.pdf

Passive monitoring of large sites typically requires coordination between multiple cameras, which in turn requires methods for automatically relating events between distributed cameras. This paper tackles the problem of self-calibration of multiple cameras which are very far apart, using feature correspondences to determine the camera geometry. The key problem is finding such correspondences. Since the camera geometry and photometric characteristics vary considerably between images, one cannot use brightness and/or proximity constraints. Instead we apply planar geometric constraints to moving objects in the scene in order to align the scene's ground plane across multiple views. We do not assume synchronized cameras, and we show that enforcing geometric constraints enables us to align the tracking data in time. Once we have recovered the homography which aligns the planar structure in the scene, we can compute from the homography matrix the 3D position of the plane and the relative camera positions. This in turn enables us to recover a homography matrix which maps the images to an overhead view. We demonstrate this technique in two settings: a controlled lab setting where we test the effects of errors in internal camera calibration, and an uncontrolled, outdoor setting in which the full procedure is applied to external camera calibration and ground plane recovery. In spite of noise in the internal camera parameters and image data, the system successfully recovers both planar structure and relative camera positions in both settings.

AIM-1654

CBCL-171

Author[s]: Theodoros Evgeniou, Massimiliano Pontil and Tomaso Poggio

A Unified Framework for Regularization Networks and Support Vector Machines

March 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1654.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1654.pdf

Regularization Networks and Support Vector Machines are techniques for solving certain problems of learning from examples -- in particular the regression problem of approximating a multivariate function from sparse data. We present both formulations in a unified framework, namely in the context of Vapnik's theory of statistical learning which provides a general foundation for the learning problem, combining functional analysis and statistics.

AIM-1653

CBCL-170

Author[s]: Sayan Mukherjee and Vladimir Vapnik

Multivariate Density Estimation: An SVM Approach

April 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1653.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1653.pdf

We formulate density estimation as an inverse operator problem. We then use convergence results of empirical distribution functions to true distribution functions to develop an algorithm for multivariate density estimation. The algorithm is based upon a Support Vector Machine (SVM) approach to solving inverse operator problems. The algorithm is implemented and tested on simulated data from different distributions and different dimensionalities, Gaussians and Laplacians in $R^2$ and $R^{12}$. A comparison in performance is made with Gaussian Mixture Models (GMMs). Our algorithm does as well as or better than the GMMs for the simulations tested and has the added advantage of being automated with respect to parameters.

AIM-1652

Author[s]: Marina Meila

An Accelerated Chow and Liu Algorithm: Fitting Tree Distributions to High Dimensional Sparse Data

January 1999

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1652.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1652.pdf

Chow and Liu introduced an algorithm for fitting a multivariate distribution with a tree (i.e. a density model that assumes that there are only pairwise dependencies between variables and that the graph of these dependencies is a spanning tree). The original algorithm is quadratic in the dimension of the domain, and linear in the number of data points that define the target distribution $P$. This paper shows that for sparse, discrete data, fitting a tree distribution can be done in time and memory that is jointly subquadratic in the number of variables and the size of the data set. The new algorithm, called the acCL algorithm, takes advantage of the sparsity of the data to accelerate the computation of pairwise marginals and the sorting of the resulting mutual informations, achieving speed ups of up to 2-3 orders of magnitude in the experiments.
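
For orientation, the original (quadratic) Chow and Liu procedure mentioned above can be sketched as follows. This is only an illustrative sketch, not the accelerated acCL algorithm of the memo: it assumes the data is a list of equal-length tuples of discrete values, and the function names are hypothetical. It computes the empirical mutual information of every pair of variables and then builds a maximum-weight spanning tree with Kruskal's method.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(data)
    count_i = Counter(row[i] for row in data)
    count_j = Counter(row[j] for row in data)
    count_ij = Counter((row[i], row[j]) for row in data)
    mi = 0.0
    for (a, b), c in count_ij.items():
        # (c/n) * log( (c/n) / ((count_i[a]/n) * (count_j[b]/n)) )
        mi += (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
    return mi


def chow_liu_tree(data):
    """Return the edges of a maximum-weight spanning tree over pairwise MI."""
    d = len(data[0])
    # All d*(d-1)/2 pairs are scored: this is the quadratic step.
    edges = sorted(
        ((mutual_information(data, i, j), i, j)
         for i, j in combinations(range(d), 2)),
        reverse=True,
    )
    parent = list(range(d))  # union-find forest for Kruskal's method

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    tree = []
    for _, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:  # adding the edge keeps the graph acyclic
            parent[root_i] = root_j
            tree.append((i, j))
    return tree
```

For example, on four samples in which the first two variables are always equal and the third is independent of them, `chow_liu_tree([(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)])` links variables 0 and 1 first, since that pair carries all of the mutual information.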

AIM-1651

CBCL-168

Author[s]: Massimiliano Pontil, Sayan Mukherjee and Federico Girosi

On the Noise Model of Support Vector Machine Regression

October 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1651.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1651.pdf

Support Vector Machines Regression (SVMR) is a regression technique which has been recently introduced by V. Vapnik and his collaborators (Vapnik, 1995; Vapnik, Golowich and Smola, 1996). In SVMR the goodness of fit is measured not by the usual quadratic loss function (the mean square error), but by a different loss function called Vapnik's $\epsilon$-insensitive loss function, which is similar to the "robust" loss functions introduced by Huber (Huber, 1981). The quadratic loss function is well justified under the assumption of Gaussian additive noise. However, the noise model underlying the choice of Vapnik's loss function is less clear. In this paper the use of Vapnik's loss function is shown to be equivalent to a model of additive and Gaussian noise, where the variance and mean of the Gaussian are random variables. The probability distributions for the variance and mean will be stated explicitly. While this work is presented in the framework of SVMR, it can be extended to justify non-quadratic loss functions in any Maximum Likelihood or Maximum A Posteriori approach. It applies not only to Vapnik's loss function, but to a much broader class of loss functions.
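
As a reminder of the two loss functions contrasted above, here is a minimal sketch using the standard textbook definitions; the function names and the default value of epsilon are illustrative assumptions, not taken from the memo.

```python
def quadratic_loss(y_true, y_pred):
    """Usual squared-error loss."""
    return (y_true - y_pred) ** 2


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Vapnik's loss: zero inside a tube of half-width epsilon, linear outside."""
    return max(0.0, abs(y_true - y_pred) - epsilon)
```

Residuals smaller than epsilon cost nothing under the second loss, and larger residuals are penalized linearly rather than quadratically, which is why it resembles the robust losses of Huber.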

AITR-1650

CBCL-167

Author[s]: Christian R. Shelton

Three-Dimensional Correspondence

December 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1650.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1650.pdf

This paper describes the problem of three-dimensional object correspondence and presents an algorithm for matching two three-dimensional colored surfaces using polygon reduction and the minimization of an energy function. At the core of this algorithm is a novel data-dependent multi-resolution pyramid for polygonal surfaces. The algorithm is general to correspondence between any two manifolds of the same dimension embedded in a higher dimensional space. Results demonstrating correspondences between various objects are presented and a method for incorporating user input is also detailed.

AIM-1649

CBCL-166

Author[s]: Massimiliano Pontil, Ryan Rifkin and Theodoros Evgeniou

From Regression to Classification in Support Vector Machines

November 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1649.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1649.pdf

We study the relation between support vector machines (SVMs) for regression (SVMR) and SVM for classification (SVMC). We show that for a given SVMC solution there exists an SVMR solution which is equivalent for a certain choice of the parameters. In particular our result is that for $\epsilon$ sufficiently close to one, the optimal hyperplane and threshold for the SVMC problem with regularization parameter $C_c$ are equal to $(1-\epsilon)^{-1}$ times the optimal hyperplane and threshold for SVMR with regularization parameter $C_r = (1-\epsilon)C_c$. A direct consequence of this result is that SVMC can be seen as a special case of SVMR.

AIM-1648

CBCL-165

Author[s]: Marina Meila, Michael I. Jordan and Quaid Morris

Estimating Dependency Structure as a Hidden Variable

September 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1648.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1648.pdf

This paper introduces a probability model, the mixture of trees, that can account for sparse, dynamically changing dependence relationships. We present a family of efficient algorithms that use EM and the Minimum Spanning Tree algorithm to find the ML and MAP mixture of trees for a variety of priors, including the Dirichlet and the MDL priors. We also show that the single tree classifier acts like an implicit feature selector, thus making the classification performance insensitive to irrelevant attributes. Experimental results demonstrate the excellent performance of the new model both in density estimation and in classification.

AIM-1647

Author[s]: Hany Farid and Edward H. Adelson

Separating Reflections from Images Using Independent Components Analysis

September 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1647.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1647.pdf

The image of an object can vary dramatically depending on lighting, specularities/reflections and shadows. It is often advantageous to separate these incidental variations from the intrinsic aspects of an image. Along these lines this paper describes a method for photographing objects behind glass and digitally removing the reflections off the glass leaving the image of the objects behind the glass intact. We describe the details of this method which employs simple optical techniques and independent components analysis (ICA) and show its efficacy with several examples.

AIM-1646

CBCL-164

Author[s]: Nicholas Chan, Blake LeBaron, Andrew Lo and Tomaso Poggio

Information Dissemination and Aggregation in Asset Markets with Simple Intelligent Traders

September 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1646.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1646.pdf

Various studies of asset markets have shown that traders are capable of learning and transmitting information through prices in many situations. In this paper we replace human traders with intelligent software agents in a series of simulated markets. Using these simple learning agents, we are able to replicate several features of the experiments with human subjects, regarding (1) dissemination of information from informed to uninformed traders, and (2) aggregation of information spread over different traders.

AITR-1645

Author[s]: Deborah A. Wallach

A Hierarchical Cache Coherent Protocol

September 1992

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1645.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1645.pdf

As the number of processors in distributed-memory multiprocessors grows, efficiently supporting a shared-memory programming model becomes difficult. We have designed the Protocol for Hierarchical Directories (PHD) to allow shared-memory support for systems containing massive numbers of processors. PHD eliminates bandwidth problems by using a scalable network, decreases hot-spots by not relying on a single point to distribute blocks, and uses a scalable amount of space for its directories. PHD provides a shared-memory model by synthesizing a global shared memory from the local memories of processors. PHD supports sequentially consistent read, write, and test-and-set operations. This thesis also introduces a method of describing locality for hierarchical protocols and employs this method in the derivation of an abstract model of the protocol behavior. An embedded model, based on the work of Johnson[ISCA19], describes the protocol behavior when mapped to a k-ary n-cube. The thesis uses these two models to study the average height in the hierarchy that operations reach, the longest path messages travel, the number of messages that operations generate, the inter-transaction issue time, and the protocol overhead for different locality parameters, degrees of multithreading, and machine sizes. We determine that multithreading is only useful for approximately two to four threads; any additional interleaving does not decrease the overall latency. For small machines and high locality applications, this limitation is due mainly to the length of the running threads. For large machines with medium to low locality, this limitation is due mainly to the protocol overhead being too large. Our study using the embedded model shows that in situations where the run length between references to shared memory is at least an order of magnitude longer than the time to process a single state transition in the protocol, applications exhibit good performance. If separate controllers for processing protocol requests are included, the protocol scales to 32k processor machines as long as the application exhibits hierarchical locality: at least 22% of the global references must be able to be satisfied locally; at most 35% of the global references are allowed to reach the top level of the hierarchy.

AIM-1642

Author[s]: Parry Husbands, Charles Lee Isbell, Jr. and Alan Edelman

Interactive Supercomputing with MIT Matlab

July 28, 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1642.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1642.pdf

This paper describes MITMatlab, a system that enables users of supercomputers or networked PCs to work on large data sets within Matlab transparently. MITMatlab is based on the Parallel Problems Server (PPServer), a standalone 'linear algebra server' that provides a mechanism for running distributed memory algorithms on large data sets. The PPServer and MITMatlab enable high-performance interactive supercomputing. With such a tool, researchers can now use Matlab as more than a prototyping tool for experimenting with small problems. Instead, MITMatlab makes it possible to visualize and operate interactively on large data sets. This has implications not only in supercomputing, but for Artificial Intelligence applications such as Machine Learning, Information Retrieval and Image Processing.

AIM-1640

CBCL-163

Author[s]: Zhaoping Li

Pre-Attentive Segmentation in the Primary Visual Cortex

June 30, 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1640.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1640.pdf

Stimuli outside classical receptive fields have been shown to exert significant influence over the activities of neurons in primary visual cortex. We propose that contextual influences are used for pre-attentive visual segmentation, in a new framework called segmentation without classification. This means that segmentation of an image into regions occurs without classification of features within a region or comparison of features between regions. This segmentation framework is simpler than previous computational approaches, making it implementable by V1 mechanisms, though higher level visual mechanisms are needed to refine its output. However, it easily handles a class of segmentation problems that are tricky in conventional methods. The cortex computes global region boundaries by detecting the breakdown of homogeneity or translation invariance in the input, using local intra-cortical interactions mediated by the horizontal connections. The difference between contextual influences near and far from region boundaries makes neural activities near region boundaries higher than elsewhere, making boundaries more salient for perceptual pop-out. This proposal is implemented in a biologically based model of V1, and demonstrated using examples of texture segmentation and figure-ground segregation. The model performs segmentation in exactly the same neural circuit that solves the dual problem of the enhancement of contours, as is suggested by experimental observations. Its behavior is compared with psychophysical and physiological data on segmentation, contour enhancement, and contextual influences. We discuss the implications of segmentation without classification and the predictions of our V1 model, and relate it to other phenomena such as asymmetry in visual search.

AITR-1639

Author[s]: Oded Maron

Learning from Ambiguity

December 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1639.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1639.pdf

There are many learning problems for which the examples given by the teacher are ambiguously labeled. In this thesis, we will examine one framework of learning from ambiguous examples known as Multiple-Instance learning. Each example is a bag, consisting of any number of instances. A bag is labeled negative if all instances in it are negative. A bag is labeled positive if at least one instance in it is positive. Because the instances themselves are not labeled, each positive bag is an ambiguous example. We would like to learn a concept which will correctly classify unseen bags. We have developed a measure called Diverse Density and algorithms for learning from multiple-instance examples. We have applied these techniques to problems in drug design, stock prediction, and image database retrieval. These serve as examples of how to translate the ambiguity in the application domain into bags, as well as successful examples of applying Diverse Density techniques.
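
The bag-labeling rule stated above can be written down directly. The sketch below illustrates only that rule, not the Diverse Density measure; the function name and the example concept are hypothetical.

```python
def classify_bag(bag, instance_concept):
    """Label a bag positive iff at least one instance satisfies the concept.

    `bag` is any iterable of instances; `instance_concept` is a predicate
    on a single instance. An empty bag is labeled negative.
    """
    return any(instance_concept(instance) for instance in bag)


# Hypothetical concept: an instance is positive if it exceeds a threshold.
def is_large(x):
    return x > 10
```

With this concept, `classify_bag([1, 2, 50], is_large)` is positive because of the single instance 50, while `classify_bag([1, 2, 3], is_large)` is negative. The learner sees only the bag labels, so it cannot tell which instance made a positive bag positive.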

AIM-1638

Author[s]: Oded Maron and Tomas Lozano-Perez

Visible Decomposition: Real-Time Path Planning in Large Planar Environments

June, 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1638.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1638.pdf

We describe a method called Visible Decomposition for computing collision-free paths in real time through a planar environment with a large number of obstacles. This method divides space into local visibility graphs, ensuring that all operations are local. The search time is kept low since the number of regions is proved to be small. We analyze the computational demands of the algorithm and the quality of the paths it produces. In addition, we show test results on a large simulation testbed.

AIM-1637

Author[s]: Gina-Anne Levow

Corpus-Based Techniques for Word Sense Disambiguation

May 27, 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1637.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1637.pdf

The need for robust and easily extensible systems for word sense disambiguation coupled with successes in training systems for a variety of tasks using large on-line corpora has led to extensive research into corpus-based statistical approaches to this problem. Promising results have been achieved by vector space representations of context, clustering combined with a semantic knowledge base, and decision lists based on collocational relations. We evaluate these techniques with respect to three important criteria: how their definition of context affects their ability to incorporate different types of disambiguating information, how they define similarity among senses, and how easily they can generalize to new senses. The strengths and weaknesses of these systems provide guidance for future systems which must capture and model a variety of disambiguating information, both syntactic and semantic.

AIM-1636

Author[s]: Charles Isbell and Paul Viola

Restructuring Sparse High Dimensional Data for Effective Retrieval

May 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1636.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1636.pdf

The task in text retrieval is to find the subset of a collection of documents relevant to a user's information request, usually expressed as a set of words. Classically, documents and queries are represented as vectors of word counts. In its simplest form, relevance is defined to be the dot product between a document and a query vector--a measure of the number of common terms. A central difficulty in text retrieval is that the presence or absence of a word is not sufficient to determine relevance to a query. Linear dimensionality reduction has been proposed as a technique for extracting underlying structure from the document collection. In some domains (such as vision) dimensionality reduction reduces computational complexity. In text retrieval it is more often used to improve retrieval performance. We propose an alternative and novel technique that produces sparse representations constructed from sets of highly-related words. Documents and queries are represented by their distance to these sets, and relevance is measured by the number of common clusters. This technique significantly improves retrieval performance, is efficient to compute and shares properties with the optimal linear projection operator and the independent components of documents.
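
The classical baseline described above (word-count vectors compared by a dot product) can be sketched as follows. This illustrates only that baseline, not the clustering technique the memo proposes; the whitespace tokenization and the function names are simplifying assumptions.

```python
from collections import Counter


def word_counts(text):
    """Represent a text as a sparse vector of lowercase word counts."""
    return Counter(text.lower().split())


def relevance(document, query):
    """Dot product of the document and query word-count vectors."""
    doc_vec, query_vec = word_counts(document), word_counts(query)
    # Words absent from the document contribute zero (Counter returns 0).
    return sum(doc_vec[word] * query_vec[word] for word in query_vec)
```

For example, `relevance("the cat sat on the mat", "the cat")` is 3: "the" occurs twice in the document and once in the query, and "cat" occurs once in each. A document about felines that never uses the word "cat" would score 0 for the same query, which is the difficulty the abstract points to.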

AIM-1635

CBCL-162

Author[s]: Constantine P. Papageorgiou, Federico Girosi and Tomaso Poggio

Sparse Correlation Kernel Analysis and Reconstruction

May 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1635.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1635.pdf

This paper presents a new paradigm for signal reconstruction and superresolution, Correlation Kernel Analysis (CKA), that is based on the selection of a sparse set of bases from a large dictionary of class-specific basis functions. The basis functions that we use are the correlation functions of the class of signals we are analyzing. To choose the appropriate features from this large dictionary, we use Support Vector Machine (SVM) regression and compare this to traditional Principal Component Analysis (PCA) for the tasks of signal reconstruction, superresolution, and compression. The testbed we use in this paper is a set of images of pedestrians. This paper also presents results of experiments in which we use a dictionary of multiscale basis functions and then use Basis Pursuit De-Noising to obtain a sparse, multiscale approximation of a signal. The results are analyzed and we conclude that 1) when used with a sparse representation technique, the correlation function is an effective kernel for image reconstruction and superresolution, 2) for image compression, PCA and SVM have different tradeoffs, depending on the particular metric that is used to evaluate the results, 3) in sparse representation techniques, L_1 is not a good proxy for the true measure of sparsity, L_0, and 4) the L_epsilon norm may be a better error metric for image reconstruction and compression than the L_2 norm, though the exact psychophysical metric should take into account high order structure in images.

AITR-1634

Author[s]: Anita M. Flynn

Piezoelectric Ultrasonic Micromotors

June 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1634.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1634.pdf

This report describes development of microfabricated piezoelectric ultrasonic motors and bulk-ceramic piezoelectric ultrasonic motors. Ultrasonic motors offer the advantage of low speed, high torque operation without the need for gears. They can be made compact and lightweight and provide a holding torque in the absence of applied power, due to the traveling wave frictional coupling mechanism between the rotor and the stator. This report covers modeling, simulation, fabrication and testing of ultrasonic motors. Design of experiments methods were also utilized to find optimal motor parameters. A suite of 8 mm diameter x 3 mm tall motors was machined for these studies and maximum stall torques as large as 10^(-3) Nm, maximum no-load speeds of 1710 rpm and peak power outputs of 27 mW were realized. Additionally, this report describes the implementation of a microfabricated ultrasonic motor using thin-film lead zirconate titanate. In a joint project with the Pennsylvania State University Materials Research Laboratory and MIT Lincoln Laboratory, 2 mm and 5 mm diameter stator structures were fabricated on 1 micron thick silicon nitride membranes. Small glass lenses placed down on top spun at 100-300 rpm with 4 V excitation at 90 kHz. The large power densities and stall torques of these piezoelectric ultrasonic motors offer tremendous promise for integrated machines: complete intelligent, electro-mechanical autonomous systems mass-produced in a single fabrication process.

AIM-1633

Author[s]: Kenneth Yip and Gerald Jay Sussman

Sparse Representations for Fast, One-Shot Learning

November 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1633.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1633.pdf

Humans rapidly and reliably learn many kinds of regularities and generalizations. We propose a novel model of fast learning that exploits the properties of sparse representations and the constraints imposed by a plausible hardware mechanism. To demonstrate our approach we describe a computational model of acquisition in the domain of morphophonology. We encapsulate phonological information as bidirectional boolean constraint relations operating on the classical linguistic representations of speech sounds in terms of distinctive features. The performance model is described as a hardware mechanism that incrementally enforces the constraints. Phonological behavior arises from the action of this mechanism. Constraints are induced from a corpus of common English nouns and verbs. The induction algorithm compiles the corpus into increasingly sophisticated constraints. The algorithm yields one-shot learning from a few examples. Our model has been implemented as a computer program. The program exhibits phonological behavior similar to that of young children. As a bonus the constraints that are acquired can be interpreted as classical linguistic rules.

AIM-1632

CBCL-161

Author[s]: Tomaso Poggio and Federico Girosi

Notes on PCA, Regularization, Sparsity and Support Vector Machines

May 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1632.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1632.pdf

We derive a new representation for a function as a linear combination of local correlation kernels at optimal sparse locations and discuss its relation to PCA, regularization, sparsity principles and Support Vector Machines. We first review previous results for the approximation of a function from discrete data (Girosi, 1998) in the context of Vapnik's feature space and dual representation (Vapnik, 1995). We apply them to show 1) that a standard regularization functional with a stabilizer defined in terms of the correlation function induces a regression function in the span of the feature space of classical Principal Components and 2) that there exists a dual representation of the regression function in terms of a regularization network with a kernel equal to a generalized correlation function. We then describe the main observation of the paper: the dual representation in terms of the correlation function can be sparsified using the Support Vector Machines (Vapnik, 1982) technique and this operation is equivalent to sparsifying a large dictionary of basis functions adapted to the task, using a variation of Basis Pursuit De-Noising (Chen, Donoho and Saunders, 1995; see also related work by Donahue and Geiger, 1994; Olshausen and Field, 1995; Lewicki and Sejnowski, 1998). In addition to extending the close relations between regularization, Support Vector Machines and sparsity, our work also illuminates and formalizes the LFA concept of Penev and Atick (1996). We discuss the relation between our results, which are about regression, and the different problem of pattern classification.

AIM-1631

Author[s]: Kevin K. Lin

Coordinate-Independent Computations on Differential Equations

March 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1631.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1631.pdf

This project investigates the computational representation of differentiable manifolds, with the primary goal of solving partial differential equations using multiple coordinate systems on general n-dimensional spaces. In the process, this abstraction is used to perform accurate integrations of ordinary differential equations using multiple coordinate systems. In the case of linear partial differential equations, however, unexpected difficulties arise even with the simplest equations.

AIM-1630

Author[s]: Thomas Marill

Recovery of Three-Dimensional Objects from Single Perspective Images

March 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1630.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1630.pdf

Any three-dimensional wire-frame object constructed out of parallelograms can be recovered from a single perspective two-dimensional image. A procedure for performing the recovery is given.

AIM-1629

CBCL-160

Author[s]: Maximilian Riesenhuber and Tomaso Poggio

Modeling Invariances in Inferotemporal Cell Tuning

March 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1629.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1629.pdf

In macaque inferotemporal cortex (IT), neurons have been found to respond selectively to complex shapes while showing broad tuning ("invariance") with respect to stimulus transformations such as translation and scale changes and a limited tuning to rotation in depth. Training monkeys with novel, paperclip-like objects, Logothetis et al. could investigate whether these invariance properties are due to experience with exhaustively many transformed instances of an object or if there are mechanisms that allow the cells to show response invariance also to previously unseen instances of that object. They found object-selective cells in anterior IT which exhibited limited invariance to various transformations after training with single object views. While previous models accounted for the tuning of the cells for rotations in depth and for their selectivity to a specific object relative to a population of distractor objects, the model described here attempts to explain in a biologically plausible way the additional properties of translation and size invariance. Using the same stimuli as in the experiment, we find that model IT neurons exhibit invariance properties which closely parallel those of real neurons. Simulations show that the model is capable of unsupervised learning of view-tuned neurons. The model also allows experimentally testable predictions to be made regarding novel stimulus transformations and combinations of stimuli.

AIM-1628

Author[s]: Brian Scassellati

A Binocular, Foveated Active Vision System

March 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1628.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1628.pdf

This report documents the design and implementation of a binocular, foveated active vision system as part of the Cog project at the MIT Artificial Intelligence Laboratory. The active vision system features a three degree of freedom mechanical platform that supports four color cameras, a motion control system, and a parallel network of digital signal processors for image processing. To demonstrate the capabilities of the system, we present results from four sample visual-motor tasks.

AITR-1627

Author[s]: Alan Bawden

Implementing Distributed Systems Using Linear Naming

March 1993

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1627.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1627.pdf

Linear graph reduction is a simple computational model in which the cost of naming things is explicitly represented. The key idea is the notion of "linearity". A name is linear if it is only used once, so with linear naming you cannot create more than one outstanding reference to an entity. As a result, linear naming is cheap to support and easy to reason about. Programs can be translated into the linear graph reduction model such that linear names in the program are implemented directly as linear names in the model. Nonlinear names are supported by constructing them out of linear names. The translation thus exposes those places where the program uses names in expensive, nonlinear ways. Two applications demonstrate the utility of using linear graph reduction: First, in the area of distributed computing, linear naming makes it easy to support cheap cross-network references and highly portable data structures. Linear naming also facilitates demand driven migration of tasks and data around the network without requiring explicit guidance from the programmer. Second, linear graph reduction reveals a new characterization of the phenomenon of state. Systems in which state appears are those which depend on certain global system properties. State is not a localizable phenomenon, which suggests that our usual object-oriented metaphor for state is flawed.

AIM-1626

Author[s]: Radhika Nagpal and Daniel Coore

An Algorithm for Group Formation and Maximal Independent Set in an Amorphous Computer

February 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1626.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1626.pdf

Amorphous computing is the study of programming ultra-scale computing environments of smart sensors and actuators. The individual elements are identical, asynchronous, randomly placed, embedded and communicate locally via wireless broadcast. Aggregating the processors into groups is a useful paradigm for programming an amorphous computer because groups can be used for specialization, increased robustness, and efficient resource allocation. This paper presents a new algorithm, called the clubs algorithm, for efficiently aggregating processors into groups in an amorphous computer, in time proportional to the local density of processors. The clubs algorithm is well-suited to the unique characteristics of an amorphous computer. In addition, the algorithm derives two properties from the physical embedding of the amorphous computer: an upper bound on the number of groups formed and a constant upper bound on the density of groups. The clubs algorithm can also be extended to find the maximal independent set (MIS) and $\Delta + 1$ vertex coloring in an amorphous computer in $O(\log N)$ rounds, where $N$ is the total number of elements and $\Delta$ is the maximum degree.
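For readers unfamiliar with the term, the following sketch shows what a maximal independent set is, using a plain sequential greedy pass over a small graph. It is not the clubs algorithm, which is distributed and is not specified in this abstract; the function names and the example graph are assumptions chosen for illustration.

```python
def greedy_mis(adjacency):
    """Return a maximal independent set of an undirected graph.

    adjacency maps each node to the set of its neighbours.
    This is a sequential greedy pass, not a distributed algorithm.
    """
    chosen = set()
    blocked = set()
    for node in sorted(adjacency):
        if node not in blocked:
            chosen.add(node)
            # A chosen node excludes itself and all of its neighbours.
            blocked.add(node)
            blocked.update(adjacency[node])
    return chosen


def is_maximal_independent(adjacency, chosen):
    """Check that no two chosen nodes are adjacent and no node can be added."""
    independent = all(adjacency[u].isdisjoint(chosen) for u in chosen)
    maximal = all(u in chosen or adjacency[u] & chosen for u in adjacency)
    return independent and maximal


# Hypothetical example: a path of five processors, 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
leaders = greedy_mis(path)
```

In a grouping setting, each chosen node could act as a group leader and every other node has at least one leader among its neighbours, which is the property that makes an MIS useful for forming groups.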

AIM-1625

CBCL-159

Author[s]: Thomas Hofmann and Jan Puzicha

Statistical Models for Co-occurrence Data

February 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1625.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1625.pdf

Modeling and predicting co-occurrences of events is a fundamental problem of unsupervised learning. In this contribution we develop a statistical framework for analyzing co-occurrence data in a general setting where elementary observations are joint occurrences of pairs of abstract objects from two finite sets. The main challenge for statistical models in this context is to overcome the inherent data sparseness and to estimate the probabilities for pairs which were rarely observed or even unobserved in a given sample set. Moreover, it is often of considerable interest to extract grouping structure or to find a hierarchical data organization. A novel family of mixture models is proposed which explain the observed data by a finite number of shared aspects or clusters. This provides a common framework for statistical inference and structure discovery and also includes several recently proposed models as special cases. Adopting the maximum likelihood principle, EM algorithms are derived to fit the model parameters. We develop improved versions of EM which largely avoid overfitting problems and overcome the inherent locality of EM-based optimization. Among the broad variety of possible applications, e.g., in information retrieval, natural language processing, data mining, and computer vision, we have chosen document retrieval, the statistical analysis of noun/adjective co-occurrence and the unsupervised segmentation of textured images to test and evaluate the proposed algorithms.

AIM-1624

CBCL-158

Author[s]: Yair Weiss and Edward H. Adelson

Slow and Smooth: A Bayesian Theory for the Combination of Local Motion Signals in Human Vision

February 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1624.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1624.pdf

In order to estimate the motion of an object, the visual system needs to combine multiple local measurements, each of which carries some degree of ambiguity. We present a model of motion perception whereby measurements from different image regions are combined according to a Bayesian estimator: the estimated motion maximizes the posterior probability assuming a prior favoring slow and smooth velocities. In reviewing a large number of previously published phenomena we find that the Bayesian estimator predicts a wide range of psychophysical results. This suggests that the seemingly complex set of illusions arises from a single computational strategy that is optimal under reasonable assumptions.

AIM-1621

Author[s]: Gideon P. Stein and Amnon Shashua

Direct Estimation of Motion and Extended Scene Structure from a Moving Stereo Rig

December 1998

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1621.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1621.pdf

We describe a new method for motion estimation and 3D reconstruction from stereo image sequences obtained by a stereo rig moving through a rigid world. We show that given two stereo pairs one can compute the motion of the stereo rig directly from the image derivatives (spatial and temporal). Correspondences are not required. One can then use the images from both pairs combined to compute a dense depth map. The motion estimates between stereo pairs enable us to combine depth maps from all the pairs in the sequence to form an extended scene reconstruction and we show results from a real image sequence. The motion computation is a linear least squares computation using all the pixels in the image. Areas with little or no contrast are implicitly weighted less so one does not have to explicitly apply a confidence measure.

AIM-1620

CBCL-157

Author[s]: Gideon P. Stein and Amnon Shashua

On Degeneracy of Linear Reconstruction from Three Views: Linear Line Complex and Applications

December 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1620.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1620.pdf

This paper investigates the linear degeneracies of projective structure estimation from point and line features across three views. We show that the rank of the linear system of equations for recovering the trilinear tensor of three views reduces to 23 (instead of 26) in the case when the scene is a Linear Line Complex (set of lines in space intersecting at a common line) and is 21 when the scene is planar. The LLC situation is only linearly degenerate, and we show that one can obtain a unique solution when the admissibility constraints of the tensor are accounted for. The line configuration described by an LLC, rather than being some obscure case, is in fact quite typical. It includes, as a particular example, the case of a camera moving down a hallway in an office environment or down an urban street. Furthermore, an LLC situation may occur as an artifact such as in direct estimation from spatio-temporal derivatives of image brightness. Therefore, an investigation into degeneracies and their remedy is important also in practice.

AIM-1619

CBCL-156

Author[s]: Theodoros Evgeniou and Tomaso Poggio

Sparse Representations of Multiple Signals

September 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1619.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1619.pdf

We discuss the problem of finding sparse representations of a class of signals. We formalize the problem and prove it is NP-complete both in the case of a single signal and that of multiple ones. Next we develop a simple approximation method to the problem and we show experimental results using artificially generated signals. Furthermore, we use our approximation method to find sparse representations of classes of real signals, specifically of images of pedestrians. We discuss the relation between our formulation of the sparsity problem and the problem of finding representations of objects that are compact and appropriate for detection and classification.

AIM-1618

Author[s]: Dan Halperin and Christian R. Shelton

A Perturbation Scheme for Spherical Arrangements with Application to Molecular Modeling

December 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1618.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1618.pdf

We describe a software package for computing and manipulating the subdivision of a sphere by a collection of (not necessarily great) circles and for computing the boundary surface of the union of spheres. We present problems that arise in the implementation of the software and the solutions that we have found for them. At the core of the paper is a novel perturbation scheme to overcome degeneracies and precision problems in computing spherical arrangements while using floating point arithmetic. The scheme is relatively simple, it balances between the efficiency of computation and the magnitude of the perturbation, and it performs well in practice. In one O(n) time pass through the data, it perturbs the inputs necessary to ensure no potential degeneracies and then passes the perturbed inputs on to the geometric algorithm. We report and discuss experimental results. Our package is a major component in a larger package aimed to support geometric queries on molecular models; it is currently employed by chemists working in "rational drug design." The spherical subdivisions are used to construct a geometric model of a molecule where each sphere represents an atom. We also give an overview of the molecular modeling package and detail additional features and implementation issues.

AITR-1617

Author[s]: J. Kenneth Salisbury and Mandayam A. Srinivasan

Proceedings of the Second PHANToM User's Group Workshop

December 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1617.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1617.pdf

On October 19-22, 1997 the Second PHANToM Users Group Workshop was held at the MIT Endicott House in Dedham, Massachusetts. Designed as a forum for sharing results and insights, the workshop was attended by more than 60 participants from 7 countries. These proceedings report on workshop presentations in diverse areas including rigid and compliant rendering, tool kits, development environments, techniques for scientific data visualization, multi-modal issues and a programming tutorial.

AIM-1616

CBCL-155

Author[s]: Yair Weiss

Belief Propagation and Revision in Networks with Loops

November 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1616.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1616.pdf

Local belief propagation rules of the sort proposed by Pearl (1988) are guaranteed to converge to the optimal beliefs for singly connected networks. Recently, a number of researchers have empirically demonstrated good performance of these same algorithms on networks with loops, but a theoretical understanding of this performance has yet to be achieved. Here we lay the foundation for an understanding of belief propagation in networks with loops. For networks with a single loop, we derive an analytical relationship between the steady state beliefs in the loopy network and the true posterior probability. Using this relationship we show a category of networks for which the MAP estimate obtained by belief update and by belief revision can be proven to be optimal (although the beliefs will be incorrect). We show how nodes can use local information in the messages they receive in order to correct the steady state beliefs. Furthermore we prove that for all networks with a single loop, the MAP estimate obtained by belief revision at convergence is guaranteed to give the globally optimal sequence of states. The result is independent of the length of the cycle and the size of the state space. For networks with multiple loops, we introduce the concept of a "balanced network" and show simulati.
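The starting point of this abstract, that local message passing gives the correct beliefs on a singly connected network, can be checked on a small chain. The sketch below runs sum-product message passing on a three-node chain and compares the result with brute-force enumeration. It does not cover the loopy case the memo analyzes; the potentials, the function names, and the use of one shared pairwise table for every edge are assumptions made for the example.

```python
from itertools import product


def chain_marginals_bp(unary, pairwise):
    """Sum-product marginals on a chain that shares one pairwise potential."""
    n = len(unary)
    k = len(unary[0])
    # forward[i][s]: message arriving at node i from its left neighbour.
    forward = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        for s in range(k):
            forward[i][s] = sum(
                forward[i - 1][t] * unary[i - 1][t] * pairwise[t][s]
                for t in range(k)
            )
    # backward[i][s]: message arriving at node i from its right neighbour.
    backward = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for s in range(k):
            backward[i][s] = sum(
                backward[i + 1][t] * unary[i + 1][t] * pairwise[s][t]
                for t in range(k)
            )
    beliefs = []
    for i in range(n):
        raw = [forward[i][s] * unary[i][s] * backward[i][s] for s in range(k)]
        total = sum(raw)
        beliefs.append([value / total for value in raw])
    return beliefs


def chain_marginals_brute_force(unary, pairwise):
    """Exact marginals by summing over every joint state of the chain."""
    n = len(unary)
    k = len(unary[0])
    totals = [[0.0] * k for _ in range(n)]
    for states in product(range(k), repeat=n):
        weight = 1.0
        for i, s in enumerate(states):
            weight *= unary[i][s]
        for i in range(n - 1):
            weight *= pairwise[states[i]][states[i + 1]]
        for i, s in enumerate(states):
            totals[i][s] += weight
    z = sum(totals[0])
    return [[value / z for value in row] for row in totals]


# Made-up potentials for a binary three-node chain.
unary = [[0.7, 0.3], [0.5, 0.5], [0.2, 0.8]]
pairwise = [[0.9, 0.1], [0.1, 0.9]]
bp = chain_marginals_bp(unary, pairwise)
exact = chain_marginals_brute_force(unary, pairwise)
```

On this chain the two sets of marginals agree up to floating-point rounding. Closing the chain into a loop removes that guarantee, which is the situation the memo studies.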

AIM-1615

CBCL-154

Author[s]: Shimon Edelman and Sharon Duvdevani-Bar

Visual Recognition and Categorization on the Basis of Similarities to Multiple Class Prototypes

September 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1615.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1615.pdf

To recognize a previously seen object, the visual system must overcome the variability in the object's appearance caused by factors such as illumination and pose. Developments in computer vision suggest that it may be possible to counter the influence of these factors, by learning to interpolate between stored views of the target object, taken under representative combinations of viewing conditions. Daily life situations, however, typically require categorization, rather than recognition, of objects. Due to the open-ended character both of natural kinds and of artificial categories, categorization cannot rely on interpolation between stored examples. Nonetheless, knowledge of several representative members, or prototypes, of each of the categories of interest can still provide the necessary computational substrate for the categorization of new instances. The resulting representational scheme based on similarities to prototypes appears to be computationally viable, and is readily mapped onto the mechanisms of biological vision revealed by recent psychophysical and physiological studies.

AIM-1614

Author[s]: Daniel Coore, Radhika Nagpal and Ron Weiss

Paradigms for Structure in an Amorphous Computer

October 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1614.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1614.pdf

Recent developments in microfabrication and nanotechnology will enable the inexpensive manufacturing of massive numbers of tiny computing elements with sensors and actuators. New programming paradigms are required for obtaining organized and coherent behavior from the cooperation of large numbers of unreliable processing elements that are interconnected in unknown, irregular, and possibly time-varying ways. Amorphous computing is the study of developing and programming such ultrascale computing environments. This paper presents an approach to programming an amorphous computer by spontaneously organizing an unstructured collection of processing elements into cooperative groups and hierarchies. This paper introduces a structure called an AC Hierarchy, which logically organizes processors into groups at different levels of granularity. The AC hierarchy simplifies programming of an amorphous computer through new language abstractions, facilitates the design of efficient and robust algorithms, and simplifies the analysis of their performance. Several example applications are presented that greatly benefit from the AC hierarchy. This paper introduces three algorithms for constructing multiple levels of the hierarchy from an unstructured collection of processors.

AIM-1613

CBCL-153

Author[s]: Zhaoping Li

Visual Segmentation without Classification in a Model of the Primary Visual Cortex

August 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1613.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1613.pdf

Stimuli outside classical receptive fields significantly influence the neurons' activities in primary visual cortex. We propose that such contextual influences are used to segment regions by detecting the breakdown of homogeneity or translation invariance in the input, thus computing global region boundaries using local interactions. This is implemented in a biologically based model of V1, and demonstrated in examples of texture segmentation and figure-ground segregation. By contrast with traditional approaches, segmentation occurs without classification or comparison of features within or between regions and is performed by exactly the same neural circuit responsible for the dual problem of the grouping and enhancement of contours.

AIM-1612

CBCL-152

Author[s]: Massimiliano Pontil and Alessandro Verri

Properties of Support Vector Machines

August 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1612.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1612.pdf

Support Vector Machines (SVMs) perform pattern recognition between two point classes by finding a decision surface determined by certain points of the training set, termed Support Vectors (SV). This surface, which in some feature space of possibly infinite dimension can be regarded as a hyperplane, is obtained from the solution of a problem of quadratic programming that depends on a regularization parameter. In this paper we study some mathematical properties of support vectors and show that the decision surface can be written as the sum of two orthogonal terms, the first depending only on the margin vectors (which are SVs lying on the margin), the second proportional to the regularization parameter. For almost all values of the parameter, this enables us to predict how the decision surface varies for small parameter changes. In the special but important case of feature space of finite dimension m, we also show that there are at most m+1 margin vectors and observe that m+1 SVs are usually sufficient to fully determine the decision surface. For relatively small m this latter result leads to a consistent reduction of the SV number.

AIM-1611

CBCL-151

Author[s]: Marina Meila, Michael I. Jordan and Quaid Morris

Estimating Dependency Structure as a Hidden Variable

June 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1611.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1611.pdf

AIM-1610

CBCL-150

Author[s]: Marcus Dill and Shimon Edelman

Translation Invariance in Object Recognition, and Its Relation to Other Visual Transformations

June 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1610.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1610.pdf

Human object recognition is generally considered to tolerate changes of the stimulus position in the visual field. A number of recent studies, however, have cast doubt on the completeness of translation invariance. In a new series of experiments we tried to investigate whether positional specificity of short-term memory is a general property of visual perception. We tested same/different discrimination of computer graphics models that were displayed at the same or at different locations of the visual field, and found complete translation invariance, regardless of the similarity of the animals and irrespective of direction and size of the displacement (Exp. 1 and 2). Decisions were strongly biased towards same decisions if stimuli appeared at a constant location, while after translation subjects displayed a tendency towards different decisions. Even if the spatial order of animal limbs was randomized ("scrambled animals"), no deteriorating effect of shifts in the field of view could be detected (Exp. 3). However, if the influence of single features was reduced (Exp. 4 and 5) small but significant effects of translation could be obtained. Under conditions that do not reveal an influence of translation, rotation in depth strongly interferes with recognition (Exp. 6). Changes of stimulus size did not reduce performance (Exp. 7). Tolerance to these object transformations seems to rely on different brain mechanisms, with translation and scale invariance being achieved in principle, while rotation invariance is not.

AIM-1608

CBCL-148

Author[s]: Gad Geiger and Jerome Y. Lettvin

A View on Dyslexia

June 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1608.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1608.pdf

We describe here, briefly, a perceptual non-reading measure which reliably distinguishes between dyslexic persons and ordinary readers. More importantly, we describe a regimen of practice with which dyslexics learn a new perceptual strategy for reading. Two controlled experiments on dyslexic children demonstrate the regimen's efficiency.

AIM-1606

CBCL-147

Author[s]: Federico Girosi

An Equivalence Between Sparse Approximation and Support Vector Machines

May 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1606.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1606.pdf

In the first part of this paper we show a similarity between the Structural Risk Minimization Principle (SRM) (Vapnik, 1982) and the idea of Sparse Approximation, as defined in Chen, Donoho and Saunders (1995) and Olshausen and Field (1996). Then we focus on two specific (approximate) implementations of SRM and Sparse Approximation, which have been used to solve the problem of function approximation. For SRM we consider the Support Vector Machine technique proposed by V. Vapnik and his team at AT&T Bell Labs, and for Sparse Approximation we consider a modification of the Basis Pursuit De-Noising algorithm proposed by Chen, Donoho and Saunders (1995). We show that, under certain conditions, these two techniques are equivalent: they give the same solution and they require the solution of the same quadratic programming problem.

AIM-1605

CBCL-146

Author[s]: Marina Meila and Michael I. Jordan

Triangulation by Continuous Embedding

March 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1605.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1605.pdf

When triangulating a belief network we aim to obtain a junction tree of minimum state space. Searching for the optimal triangulation can be cast as a search over all the permutations of the network's variables. Our approach is to embed the discrete set of permutations in a convex continuous domain D. By suitably extending the cost function over D and solving the continuous nonlinear optimization task we hope to obtain a good triangulation with respect to the aforementioned cost. In this paper we introduce an upper bound to the total junction tree weight as the cost function. The appropriateness of this choice is discussed and explored by simulations. Then we present two ways of embedding the new objective function into continuous domains and show that they perform well compared to the best known heuristic.

AIM-1604

Author[s]: William J. Dally, Leonard McMillan, Gary Bishop and Henry Fuchs

The Delta Tree: An Object-Centered Approach to Image-Based Rendering

May 2, 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1604.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1604.pdf

This paper introduces the delta tree, a data structure that represents an object using a set of reference images. It also describes an algorithm for generating arbitrary reprojections of an object by traversing its delta tree. Delta trees are an efficient representation in terms of both storage and rendering performance. Each node of a delta tree stores an image taken from a point on a sampling sphere that encloses the object. Each image is compressed by discarding pixels that can be reconstructed by warping its ancestors' images to the node's viewpoint. The partial image stored at each node is divided into blocks and represented in the frequency domain. The rendering process generates an image at an arbitrary viewpoint by traversing the delta tree from a root node to one or more of its leaves. A subdivision algorithm selects only the required blocks from the nodes along the path. For each block, only the frequency components necessary to reconstruct the final image at an appropriate sampling density are used. This frequency selection mechanism handles both antialiasing and level-of-detail within a single framework. A complex scene is initially rendered by compositing images generated by traversing the delta trees of its components. Once the reference views of a scene have been rendered in this manner, the entire scene can be reprojected to an arbitrary viewpoint by traversing its own delta tree. Our approach is limited to generating views of an object from outside the object's convex hull. In practice we work around this problem by subdividing objects to render views from within the convex hull.

AIM-1603

CBCL-145

Author[s]: Shai Avidan, Theodoros Evgeniou, Amnon Shashua and Tomaso Poggio

Image-Based View Synthesis

January 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1603.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1603.pdf

We present a new method for rendering novel images of flexible 3D objects from a small number of example images in correspondence. The strength of the method is the ability to synthesize images whose viewing position is significantly far away from the viewing cone of the example images ("view extrapolation"), yet without ever modeling the 3D structure of the scene. The method relies on synthesizing a chain of "trilinear tensors" that governs the warping function from the example images to the novel image, together with a multi-dimensional interpolation function that synthesizes the non-rigid motions of the viewed object from the virtual camera position. We show that two closely spaced example images alone are sufficient in practice to synthesize a significant viewing cone, thus demonstrating the ability to represent an object by a relatively small number of model images, for the purpose of cheap and fast viewers that can run on standard hardware.

AIM-1602

CBCL-144

Author[s]: Edgar Osuna, Robert Freund and Federico Girosi

Support Vector Machines: Training and Applications

March 1997

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1602.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1602.pdf

The Support Vector Machine (SVM) is a new and very promising classification technique developed by Vapnik and his group at AT&T Bell Labs. This new learning algorithm can be seen as an alternative training technique for Polynomial, Radial Basis Function and Multi-Layer Perceptron classifiers. An interesting property of this approach is that it is an approximate implementation of the Structural Risk Minimization (SRM) induction principle. The derivation of Support Vector Machines, their relationship with SRM, and their geometrical insight are discussed in this paper. Training an SVM is equivalent to solving a quadratic programming problem with linear and box constraints in a number of variables equal to the number of data points. When the number of data points exceeds a few thousand the problem is very challenging, because the quadratic form is completely dense, so the memory needed to store the problem grows with the square of the number of data points. Therefore, training problems arising in some real applications with large data sets are impossible to load into memory, and cannot be solved using standard non-linear constrained optimization algorithms. We present a decomposition algorithm that can be used to train SVM's over large data sets. The main idea behind the decomposition is the iterative solution of sub-problems and the evaluation of optimality conditions, which are used both to generate improved iterative values and to establish the stopping criteria for the algorithm. We present previous approaches, as well as results and important details of our implementation of the algorithm using a second-order variant of the Reduced Gradient Method as the solver of the sub-problems. As an application of SVM's, we present preliminary results we obtained applying SVM to the problem of detecting frontal human faces in real images.
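
The quadratic memory growth mentioned above is easy to make concrete. The short Python sketch below estimates the storage for a dense n-by-n quadratic form; the function name and the choice of 8-byte (double precision) entries are assumptions made for illustration, not details taken from the memo.

```python
def dense_qp_matrix_bytes(num_points, bytes_per_entry=8):
    """Bytes needed to store a dense num_points x num_points matrix."""
    return num_points * num_points * bytes_per_entry


# Ten times more data points require one hundred times more memory.
small = dense_qp_matrix_bytes(5_000)    # 200,000,000 bytes (about 0.2 GB)
large = dense_qp_matrix_bytes(50_000)   # 20,000,000,000 bytes (about 20 GB)
```

This is why a decomposition into sub-problems that each touch only part of the matrix becomes attractive once the data set grows beyond a few thousand points.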

AIM-1600

CBCL-143

Author[s]: Thomas Vetter, Michael J. Jones and Tomaso Poggio

A Bootstrapping Algorithm for Learning Linear Models of Object Classes

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1600.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1600.pdf

Flexible models of object classes, based on linear combinations of prototypical images, are capable of matching novel images of the same class and have been shown to be a powerful tool to solve several fundamental vision tasks such as recognition, synthesis and correspondence. The key problem in creating a specific flexible model is the computation of pixelwise correspondence between the prototypes, a task done until now in a semiautomatic way. In this paper we describe an algorithm that automatically bootstraps the correspondence between the prototypes. The algorithm - which can be used for 2D images as well as for 3D models - is shown to synthesize successfully a flexible model of frontal face images and a flexible model of handwritten digits.

AIM-1599

CBCL-142

Author[s]: B. Schoelkopf, K. Sung, C. Burges, F. Girosi, P. Niyogi, T. Poggio and V. Vapnik

Comparing Support Vector Machines with Gaussian Kernels to Radial Basis Function Classifiers

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1599.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1599.pdf

The Support Vector (SV) machine is a novel type of learning machine, based on statistical learning theory, which contains polynomial classifiers, neural networks, and radial basis function (RBF) networks as special cases. In the RBF case, the SV algorithm automatically determines centers, weights and threshold so as to minimize an upper bound on the expected test error. The present study is devoted to an experimental comparison of these machines with a classical approach, where the centers are determined by k-means clustering and the weights are found using error backpropagation. We consider three machines, namely a classical RBF machine, an SV machine with Gaussian kernel, and a hybrid system with the centers determined by the SV method and the weights trained by error backpropagation. Our results show that on the US postal service database of handwritten digits, the SV machine achieves the highest test accuracy, followed by the hybrid approach. The SV approach is thus not only theoretically well-founded, but also superior in a practical application.

AIM-1598

CBCL-141

Author[s]: Joerg C. Lemm

Prior Information and Generalized Questions

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1598.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1598.pdf

In learning problems available information is usually divided into two categories: examples of function values (or training data) and prior information (e.g. a smoothness constraint). This paper 1.) studies aspects on which these two categories usually differ, like their relevance for generalization and their role in the loss function, 2.) presents a unifying formalism, where both types of information are identified with answers to generalized questions, 3.) shows what kind of generalized information is necessary to enable learning, 4.) aims to put usual training data and prior information on a more equal footing by discussing possibilities and variants of measurement and control for generalized questions, including the examples of smoothness and symmetries, 5.) briefly reviews the measurement of linguistic concepts based on fuzzy priors, and principles to combine preprocessors, 6.) uses a Bayesian decision theoretic framework, contrasting parallel and inverse decision problems, 7.) proposes, for problems with non-approximation aspects, a Bayesian two step approximation consisting of posterior maximization and a subsequent risk minimization, 8.) analyses empirical risk minimization under the aspect of nonlocal information, 9.) compares the Bayesian two step approximation with empirical risk minimization, including their interpretations of Occam's razor, 10.) formulates examples of stationarity conditions for the maximum posterior approximation with nonlocal and nonconvex priors, leading to inhomogeneous nonlinear equations, similar for example to equations in scattering theory in physics. In summary, this paper focuses on the dependencies between answers to different questions. Because it is not training examples alone but such dependencies that enable generalization, it emphasizes the need for their empirical measurement and control and for a more explicit treatment in theory.

AITR-1596

Author[s]: J. Kenneth Salisbury and Mandayam A. Srinivasan (editors)

The Proceedings of the First PHANToM User's Group Workshop

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1596.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1596.pdf

These proceedings summarize the results of the First PHANToM User's Group Workshop held September 27-30, 1996 at MIT. The goal of the workshop was to bring together a group of active users of the PHANToM Haptic Interface to discuss the scientific and engineering challenges involved in bringing haptics into widespread use, and to explore the future possibilities of this exciting technology. With over 50 attendees and 25 presentations the workshop provided the first large forum for users of a common haptic interface to share results and engage in collaborative discussions. Short papers from the presenters are contained herein and address the following topics: Research Effort Overviews, Displays and Effects, Applications in Teleoperation and Training, Tools for Simulated Worlds, and Data Visualization.

AIM-1595

Author[s]: Gideon P. Stein

Lens Distortion Calibration Using Point Correspondences

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1595.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1595.pdf

This paper describes a new method for lens distortion calibration using only point correspondences in multiple views, without the need to know either the 3D location of the points or the camera locations. The standard lens distortion model is a model of the deviations of a real camera from the ideal pinhole or projective camera model. Given multiple views of a set of corresponding points taken by ideal pinhole cameras there exist epipolar and trilinear constraints among pairs and triplets of these views. In practice, due to noise in the feature detection and due to lens distortion these constraints do not hold exactly and we get some error. The calibration is a search for the lens distortion parameters that minimize this error. Using simulation and experimental results with real images we explore the properties of this method. We describe the use of this method with the standard lens distortion model, radial and decentering, but it could also be used with any other parametric distortion model. Finally we demonstrate that lens distortion calibration improves the accuracy of 3D reconstruction.
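
For readers unfamiliar with the radial and decentering terms, here is a minimal Python sketch of a commonly used parametric form of that model, applied to normalized image coordinates. The function name, the two-coefficient truncation (k1, k2) and the ordering of the decentering coefficients (p1, p2) are assumptions chosen for this example; the memo may parameterize the model differently, and the sketch does not implement the calibration search itself.

```python
def distort(x, y, k1=0.0, k2=0.0, p1=0.0, p2=0.0):
    """Map an ideal pinhole point (x, y) to its distorted position.

    k1, k2: radial distortion coefficients
    p1, p2: decentering (tangential) distortion coefficients
    Coordinates are normalized, with the distortion center at (0, 0).
    """
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    x_d = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    y_d = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return x_d, y_d


# With all coefficients at zero the model reduces to the ideal pinhole camera.
ideal = distort(0.3, -0.2)
# A positive k1 pushes points away from the center.
pushed_out = distort(1.0, 0.0, k1=0.1)
```

A calibration in the spirit of the abstract would search over these coefficients for the values that make the undistorted correspondences best satisfy the multi-view constraints.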

AIM-1594

Author[s]: Gideon P. Stein and Amnon Shashua

Direct Methods for Estimation of Structure and Motion from Three Views

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1594.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1594.pdf

We describe a new direct method for estimating structure and motion from image intensities of multiple views. We extend the direct methods of Horn and Weldon to three views. Adding the third view enables us to solve for motion, and compute a dense depth map of the scene, directly from image spatio-temporal derivatives in a linear manner without first having to find point correspondences or compute optical flow. We describe the advantages and limitations of this method which are then verified through simulation and experiments with real images.

AIM-1593

Author[s]: J.P. Mellor, Seth Teller and Tomas Lozano-Perez

Dense Depth Maps from Epipolar Images

November 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1593.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1593.pdf

Recovering three-dimensional information from two-dimensional images is the fundamental goal of stereo techniques. The problem of recovering depth (three-dimensional information) from a set of images is essentially the correspondence problem: Given a point in one image, find the corresponding point in each of the other images. Finding potential correspondences usually involves matching some image property. If the images are from nearby positions, they will vary only slightly, simplifying the matching process. Once a correspondence is known, solving for the depth is simply a matter of geometry. Real images are composed of noisy, discrete samples, therefore the calculated depth will contain error. This error is a function of the baseline or distance between the images. Longer baselines result in more precise depths. This leads to a conflict: short baselines simplify the matching process, but produce imprecise results; long baselines produce precise results, but complicate the matching process. In this paper, we present a method for generating dense depth maps from large sets (1000's) of images taken from arbitrary positions. Long baseline images improve the accuracy. Short baseline images and the large number of images greatly simplify the correspondence problem, removing nearly all ambiguity. The algorithm presented is completely local and for each pixel generates an evidence versus depth and surface normal distribution. In many cases, the distribution contains a clear and distinct global maximum. The location of this peak determines the depth and its shape can be used to estimate the error. The distribution can also be used to perform a maximum likelihood fit of models directly to the images. We anticipate that the ability to perform maximum likelihood estimation from purely local calculations will prove extremely useful in constructing three dimensional models from large sets of images.
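
The baseline trade-off described above can be illustrated with the textbook geometry of a rectified two-camera pair, where depth equals focal length times baseline divided by disparity. The Python sketch below is an assumption-laden simplification (rectified pinhole cameras, a fixed disparity error in pixels, hypothetical function names); the memo itself handles images taken from arbitrary positions.

```python
def depth_from_disparity(focal_px, baseline, disparity_px):
    """Depth of a point seen by a rectified stereo pair: Z = f * B / d."""
    return focal_px * baseline / disparity_px


def depth_error(focal_px, baseline, depth, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~ Z^2 / (f * B) * dd."""
    return depth * depth / (focal_px * baseline) * disparity_error_px


# Same point at 5 m, same one-pixel matching error, two different baselines.
short_baseline_error = depth_error(focal_px=500.0, baseline=0.1, depth=5.0)
long_baseline_error = depth_error(focal_px=500.0, baseline=1.0, depth=5.0)
```

Under these assumptions a tenfold longer baseline gives a tenfold smaller depth error, at the price of a larger disparity range to search when matching.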

AIM-1592

CBCL-140

Author[s]: Theodoros Evgeniou

Image Based Rendering Using Algebraic Techniques

November 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1592.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1592.pdf

This paper presents an image-based rendering system using algebraic relations between different views of an object. The system uses pictures of an object taken from known positions. Given three such images it can generate "virtual" ones as the object would look from any position near the ones that the two input images were taken from. The extrapolation from the example images can be up to about 60 degrees of rotation. The system is based on the trilinear constraints that bind any three views of an object. As a side result, we propose two new methods for camera calibration. We developed and used one of them. We implemented the system and tested it on real images of objects and faces. We also show experimentally that even when only two images taken from unknown positions are given, the system can be used to render the object from other viewpoints as long as we have a good estimate of the internal parameters of the camera used and we are able to find good correspondence between the example images. In addition, we present the relation between these algebraic constraints and a factorization method for shape and motion estimation. As a result we propose a method for motion estimation in the special case of orthographic projection.

AIM-1591

Author[s]: Paul Viola

Complex Feature Recognition: A Bayesian Approach for Learning to Recognize Objects

November 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1591.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1591.pdf

We have developed a new Bayesian framework for visual object recognition which is based on the insight that images of objects can be modeled as a conjunction of local features. This framework can be used to derive both an object recognition algorithm and an algorithm for learning the features themselves. The overall approach, called complex feature recognition or CFR, is unique for several reasons: it is broadly applicable to a wide range of object types, it makes constructing object models easy, it is capable of identifying either the class or the identity of an object, and it is computationally efficient, requiring time proportional to the size of the image. Instead of a single simple feature such as an edge, CFR uses a large set of complex features that are learned from experience with model objects. The response of a single complex feature contains much more class information than does a single edge. This significantly reduces the number of possible correspondences between the model and the image. In addition, CFR takes advantage of a type of image processing called 'oriented energy'. Oriented energy is used to efficiently pre-process the image to eliminate some of the difficulties associated with changes in lighting and pose.

AITR-1590

Author[s]: Andrew A. Berlin

Towards Intelligent Structures: Active Control of Buckling

May 1994

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1590.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1590.pdf

The buckling of compressively-loaded members is one of the most important factors limiting the overall strength and stability of a structure. I have developed novel techniques for using active control to wiggle a structural element in such a way that buckling is prevented. I present the results of analysis, simulation, and experimentation to show that buckling can be prevented through computer-controlled adjustment of dynamical behavior. I have constructed a small-scale railroad-style truss bridge that contains compressive members that actively resist buckling through the use of piezo-electric actuators. I have also constructed a prototype actively controlled column in which the control forces are applied by tendons, as well as a composite steel column that incorporates piezo-ceramic actuators that are used to counteract buckling. Active control of buckling allows this composite column to support 5.6 times more load than would otherwise be possible. These techniques promise to lead to intelligent physical structures that are both stronger and lighter than would otherwise be possible.

AIM-1589

Author[s]: Andrew Justin Blumberg

General Purpose Parallel Computation on a DNA Substrate

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1589.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1589.pdf

In this paper I describe and extend a new DNA computing paradigm introduced in Blumberg for building massively parallel machines in the DNA-computing models described by Adleman, Cai et al., and Liu et al. Employing only DNA operations which have been reported as successfully performed, I present an implementation of a Connection Machine, a SIMD (single-instruction multiple-data) parallel computer, as an illustration of how to apply this approach to building computers in this domain (and as an implicit demonstration of PRAM equivalence). This is followed with a description of how to implement a MIMD (multiple-instruction multiple-data) parallel machine. The implementations described herein differ most from existing models in that they employ explicit communication between processing elements (and hence strands of DNA).

AIM-1588

Author[s]: Andrew Justin Blumberg

Parallel Function Application on a DNA Substrate

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1588.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1588.pdf

In this paper I present a new model that employs a biological (specifically DNA-based) substrate for performing computation. Specifically, I describe strategies for performing parallel function application in the DNA-computing models described by Adleman, Cai et al., and Liu et al. Employing only DNA operations which can presently be performed, I discuss some direct algorithms for computing a variety of useful mathematical functions on DNA, culminating in an algorithm for minimizing an arbitrary continuous function. In addition, computing genetic algorithms on a DNA substrate is briefly discussed.

AITR-1587

Author[s]: Partha Niyogi

The Informational Complexity of Learning from Examples

September 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1587.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1587.pdf

This thesis attempts to quantify the amount of information needed to learn certain tasks. The tasks chosen vary from learning functions in a Sobolev space using radial basis function networks to learning grammars in the principles and parameters framework of modern linguistic theory. These problems are analyzed from the perspective of computational learning theory and certain unifying perspectives emerge.

AITR-1586

Author[s]: Andre DeHon

Reconfigurable Architectures for General-Purpose Computing

September 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1586.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1586.pdf

General-purpose computing devices allow us to (1) customize computation after fabrication and (2) conserve area by reusing expensive active circuitry for different functions in time. We define RP-space, a restricted domain of the general-purpose architectural space focussed on reconfigurable computing architectures. Two dominant features differentiate reconfigurable from special-purpose architectures and account for most of the area overhead associated with RP devices: (1) instructions which tell the device how to behave, and (2) flexible interconnect which supports task dependent dataflow between operations. We can characterize RP-space by the allocation and structure of these resources and compare the efficiencies of architectural points across broad application characteristics. Conventional FPGAs fall at one extreme end of this space and their efficiency ranges over two orders of magnitude across the space of application characteristics. Understanding RP-space and its consequences allows us to pick the best architecture for a task and to search for more robust design points in the space. Our DPGA, a fine-grained computing device which adds small, on-chip instruction memories to FPGAs, is one such design point. For typical logic applications and finite-state machines, a DPGA can implement tasks in one-third the area of a traditional FPGA. TSFPGA, a variant of the DPGA which focuses on heavily time-switched interconnect, achieves circuit densities close to the DPGA, while reducing typical physical mapping times from hours to seconds. Rigid, fabrication-time organization of instruction resources significantly narrows the range of efficiency for conventional architectures. To avoid this performance brittleness, we developed MATRIX, the first architecture to defer the binding of instruction resources until run-time, allowing the application to organize resources according to its needs. Our focus MATRIX design point is based on an array of 8-bit ALU and register-file building blocks interconnected via a byte-wide network. With today's silicon, a single chip MATRIX array can deliver over 10 Gop/s (8-bit ops). On sample image processing tasks, we show that MATRIX yields 10-20x the computational density of conventional processors. Understanding the cost structure of RP-space helps us identify these intermediate architectural points and may provide useful insight more broadly in guiding our continual search for robust and efficient general-purpose computing structures.

AITR-1585

Author[s]: Miguel Hall

Prototype of a Configurable Web-Based Assessment System

June 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1585.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1585.pdf

The MIT Prototype Educational Assessment System provides subjects and courses at MIT with the ability to perform online assessment. The system includes policies to handle harassment and electronic "flaming" while protecting privacy. Within these frameworks, individual courses and subjects can make their own policy decisions about such matters as when assessments can occur, who can submit assessments, and how anonymous assessments are. By allowing assessment to take place continually and allowing both students and staff to participate, the system can provide a forum for the online discussion of subjects. Even in the case of scheduled assessments, the system can provide advantages over end-of-term assessment, since the scheduled assessments can occur several times during the semester, allowing subjects to identify and adjust those areas that could use improvement. Subjects can also develop customized questionnaires, perhaps in response to previous assessments, to suit their needs.

AIM-1584

Author[s]: Ujjaval Y. Desai, Marcelo M. Mizuki, Ichiro Masaki and Berthold K.P. Horn

Edge and Mean Based Image Compression

November 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1584.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1584.pdf

In this paper, we present a static image compression algorithm for very low bit rate applications. The algorithm reduces spatial redundancy present in images by extracting and encoding edge and mean information. Since the human visual system is highly sensitive to edges, an edge-based compression scheme can produce intelligible images at high compression ratios. We present good quality results for facial as well as textured, 256 x 256 color images at 0.1 to 0.3 bpp. The algorithm described in this paper was designed for high performance, keeping hardware implementation issues in mind. In the next phase of the project, which is currently underway, this algorithm will be implemented in hardware, and new edge-based color image sequence compression algorithms will be developed to achieve compression ratios of over 100, i.e., less than 0.12 bpp from 12 bpp. Potential applications include low power, portable video telephones.
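
The bit-rate figures quoted above can be checked with a few lines of arithmetic. The Python helpers below are hypothetical names introduced only for this example; they convert bits per pixel into a file size and into a compression ratio.

```python
def compressed_bytes(width, height, bits_per_pixel):
    """Size in bytes of an image stored at the given bits per pixel."""
    return width * height * bits_per_pixel / 8


def compression_ratio(source_bpp, target_bpp):
    """Ratio between the source and the compressed bit rates."""
    return source_bpp / target_bpp


# A 256 x 256 image at the two ends of the reported 0.1 to 0.3 bpp range.
low_rate_size = compressed_bytes(256, 256, 0.1)    # roughly 819 bytes
high_rate_size = compressed_bytes(256, 256, 0.3)   # roughly 2458 bytes
# Going from 12 bpp to 0.12 bpp corresponds to a ratio of about 100.
ratio = compression_ratio(12, 0.12)
```

So a single 256 x 256 frame at these rates occupies on the order of one to a few kilobytes.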

AIM-1583

CBCL-139

Author[s]: Michael J. Jones and Tomaso Poggio

Model-Based Matching by Linear Combinations of Prototypes

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1583.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1583.pdf

We describe a method for modeling object classes (such as faces) using 2D example images and an algorithm for matching a model to a novel image. The object class models are "learned" from example images that we call prototypes. In addition to the images, the pixelwise correspondences between a reference prototype and each of the other prototypes must also be provided. Thus a model consists of a linear combination of prototypical shapes and textures. A stochastic gradient descent algorithm is used to match a model to a novel image by minimizing the error between the model and the novel image. Example models are shown as well as example matches to novel images. The robustness of the matching algorithm is also evaluated. The technique can be used for a number of applications including the computation of correspondence between novel images of a certain known class, object recognition, image synthesis and image compression.
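
To give a feel for matching by stochastic gradient descent, the Python sketch below fits the coefficients of a linear combination of prototype intensity vectors to a target vector, updating from one randomly chosen pixel per step. It is a deliberately reduced toy: it ignores the shape (correspondence) component of the models described above, and the function name, step size and step count are assumptions for illustration only.

```python
import random


def match_coefficients(prototypes, target, steps=2000, learning_rate=0.5, seed=0):
    """Fit coefficients c so that sum_i c[i] * prototypes[i] approximates target.

    Each step samples one pixel and takes a gradient step on half the
    squared error at that pixel (the "stochastic" part of the descent).
    """
    rng = random.Random(seed)
    coeffs = [0.0] * len(prototypes)
    for _ in range(steps):
        j = rng.randrange(len(target))
        prediction = sum(c * p[j] for c, p in zip(coeffs, prototypes))
        residual = prediction - target[j]
        coeffs = [c - learning_rate * residual * p[j]
                  for c, p in zip(coeffs, prototypes)]
    return coeffs


# Two tiny four-pixel "prototypes" and a target built from known coefficients.
prototypes = [[1.0, 0.2, 0.0, 0.8], [0.1, 1.0, 0.9, 0.0]]
target = [0.3 * a + 0.7 * b for a, b in zip(*prototypes)]
recovered = match_coefficients(prototypes, target)
```

Because the target here lies exactly in the span of the prototypes, the recovered coefficients approach 0.3 and 0.7; with a real novel image the residual error would not vanish.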

AITR-1582

Author[s]: Ann L. Torres

Virtual Model Control of a Hexapod Walking Robot

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1582.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1582.pdf

Since robots are typically designed with an individual actuator at each joint, the control of these systems is often difficult and non-intuitive. This thesis explains a more intuitive control scheme called Virtual Model Control. This thesis also demonstrates the simplicity and ease of this control method by using it to control a simulated walking hexapod. Virtual Model Control uses imagined mechanical components to create virtual forces, which are applied through the joint torques of real actuators. This method produces a straightforward means of controlling joint torques to produce a desired robot behavior. Due to the intuitive nature of this control scheme, the design of a virtual model controller is similar to the design of a controller with basic mechanical components. The ease of this control scheme facilitates the use of a high level control system which can be used above the low level virtual model controllers to modulate the parameters of the imaginary mechanical components. In order to apply Virtual Model Control to parallel mechanisms, a solution to the force distribution problem is required. This thesis uses an extension of Gardner's Partitioned Force Control method which allows for the specification of constrained degrees of freedom. This virtual model control technique was applied to a simulated hexapod robot. Although the hexapod is a highly non-linear, parallel mechanism, the virtual models allowed textbook control solutions to be used while the robot was walking. Using a simple linear control law, the robot walked while simultaneously balancing a pendulum and tracking an object.

AITR-1581

Author[s]: Jerry E. Pratt

Virtual Model Control of a Biped Walking Robot

December 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1581.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1581.pdf

The transformation from high level task specification to low level motion control is a fundamental issue in sensorimotor control in animals and robots. This thesis develops a control scheme called virtual model control which addresses this issue. Virtual model control is a motion control language which uses simulations of imagined mechanical components to create forces, which are applied through joint torques, thereby creating the illusion that the components are connected to the robot. Due to the intuitive nature of this technique, designing a virtual model controller requires the same skills as designing the mechanism itself. A high level control system can be cascaded with the low level virtual model controller to modulate the parameters of the virtual mechanisms. Discrete commands from the high level controller would then result in fluid motion. An extension of Gardner's Partitioned Actuator Set Control method is developed. This method allows for the specification of constraints on the generalized forces which each serial path of a parallel mechanism can apply. Virtual model control has been applied to a bipedal walking robot. A simple algorithm utilizing a simple set of virtual components has successfully compelled the robot to walk eight consecutive steps.
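
The idea of a virtual component whose force is realized through joint torques can be sketched for a two-link planar arm. In the Python example below, a virtual spring-damper pulls the endpoint toward a target and the resulting force is mapped to joint torques with the Jacobian transpose. The arm geometry, gains and function names are assumptions made for this illustration; the thesis applies the technique to a bipedal robot with its own set of virtual components.

```python
import math


def jacobian(q1, q2, l1=1.0, l2=1.0):
    """Endpoint Jacobian of a two-link planar arm with joint angles q1, q2."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [[-l1 * s1 - l2 * s12, -l2 * s12],
            [l1 * c1 + l2 * c12, l2 * c12]]


def virtual_spring_damper_force(pos, vel, target, stiffness, damping):
    """Force of an imagined spring-damper attached between endpoint and target."""
    return [stiffness * (t - p) - damping * v
            for p, v, t in zip(pos, vel, target)]


def joint_torques(q1, q2, force, l1=1.0, l2=1.0):
    """Joint torques that produce `force` at the endpoint: tau = J^T F."""
    J = jacobian(q1, q2, l1, l2)
    return [J[0][0] * force[0] + J[1][0] * force[1],
            J[0][1] * force[0] + J[1][1] * force[1]]


# Arm stretched along the x axis, virtual component pushing the tip in +y.
force = virtual_spring_damper_force([2.0, 0.0], [0.0, 0.0], [2.0, 0.1], 10.0, 1.0)
torques = joint_torques(0.0, 0.0, force)
```

A higher-level controller could then change only the spring's target or stiffness, leaving the torque computation untouched, which matches the layering described in the abstract.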

AIM-1580

CBCL-138

Author[s]: Bruno A. Olshausen

Learning Linear, Sparse, Factorial Codes

December 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1580.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1580.pdf

In previous work (Olshausen & Field 1996), an algorithm was described for learning linear sparse codes which, when trained on natural images, produces a set of basis functions that are spatially localized, oriented, and bandpass (i.e., wavelet-like). This note shows how the algorithm may be interpreted within a maximum-likelihood framework. Several useful insights emerge from this connection: it makes explicit the relation to statistical independence (i.e., factorial coding), it shows a formal relationship to the algorithm of Bell and Sejnowski (1995), and it suggests how to adapt parameters that were previously fixed.

AITR-1579

Author[s]: Brian A. LaMacchia

Internet Fish

August 1, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1579.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1579.pdf

I have invented "Internet Fish," a novel class of resource-discovery tools designed to help users extract useful information from the Internet. Internet Fish (IFish) are semi-autonomous, persistent information brokers; users deploy individual IFish to gather and refine information related to a particular topic. An IFish will initiate research, continue to discover new sources of information, and keep tabs on new developments in that topic. As part of the information-gathering process the user interacts with his IFish to find out what it has learned, answer questions it has posed, and make suggestions for guidance. Internet Fish differ from other Internet resource discovery systems in that they are persistent, personal and dynamic. As part of the information-gathering process IFish conduct extended, long-term conversations with users as they explore. They incorporate deep structural knowledge of the organization and services of the net, and are also capable of on-the-fly reconfiguration, modification and expansion. Human users may dynamically change an IFish in response to changes in the environment, or an IFish may initiate such changes itself. An IFish maintains internal state, including models of its own structure, behavior, information environment and its user; these models permit an IFish to perform meta-level reasoning about its own structure. To facilitate rapid assembly of particular IFish I have created the Internet Fish Construction Kit. This system provides enabling technology for the entire class of Internet Fish tools; it facilitates both creation of new IFish as well as additions of new capabilities to existing ones. The Construction Kit includes a collection of encapsulated heuristic knowledge modules that may be combined in mix-and-match fashion to create a particular IFish; interfaces to new services written with the Construction Kit may be immediately added to "live" IFish.
Using the Construction Kit I have created a demonstration IFish specialized for finding World-Wide Web documents related to a given group of documents. This "Finder" IFish includes heuristics that describe how to interact with the Web in general, explain how to take advantage of various public indexes and classification schemes, and provide a method for discovering similarity relationships among documents.

AITR-1577

Author[s]: Ignacio Sean McQuirk

An Analog VLSI Chip for Estimating the Focus of Expansion

August 21, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1577.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1577.pdf

For applications involving the control of moving vehicles, the recovery of relative motion between a camera and its environment is of high utility. This thesis describes the design and testing of a real-time analog VLSI chip which estimates the focus of expansion (FOE) from measured time-varying images. Our approach assumes a camera moving through a fixed world with translational velocity; the FOE is the projection of the translation vector onto the image plane. This location is the point towards which the camera is moving, and from which other points appear to expand outward. By way of the camera imaging parameters, the location of the FOE gives the direction of 3-D translation. The algorithm we use for estimating the FOE minimizes the sum of squares of the differences at every pixel between the observed time variation of brightness and the predicted variation given the assumed position of the FOE. This minimization is not straightforward, because the relationship between the brightness derivatives depends on the unknown distance to the surface being imaged. However, image points where brightness is instantaneously constant play a critical role. Ideally, the FOE would be at the intersection of the tangents to the iso-brightness contours at these "stationary" points. In practice, brightness derivatives are hard to estimate accurately given that the image is quite noisy. Reliable results can nevertheless be obtained if the image contains many stationary points and the point is found that minimizes the sum of squares of the perpendicular distances from the tangents at the stationary points. The FOE chip calculates the gradient of this least-squares minimization sum, and the estimation is performed by closing a feedback loop around it. The chip has been implemented using an embedded CCD imager for image acquisition and a row-parallel processing scheme.
A 64 x 64 version was fabricated in a 2 µm CCD/BiCMOS process through MOSIS with a design goal of 200 mW of on-chip power, a top frame rate of 1000 frames/second, and a basic accuracy of 5%. A complete experimental system which estimates the FOE in real time using real motion and image scenes is demonstrated.
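
The least-squares step described above (find the point minimizing the sum of squared perpendicular distances to the tangent lines at the stationary points) has a closed-form batch solution, sketched below. This is an offline illustration only: the chip itself is described as computing the gradient of this sum inside an analog feedback loop, which the sketch does not model. It assumes each tangent line is given by a point on it and a normal vector (for an iso-brightness contour, the brightness gradient at that point); the function name is hypothetical.

```python
import numpy as np

def least_squares_line_intersection(points, normals):
    """Point minimizing the sum of squared perpendicular distances to lines.

    points:  (m, 2) array, one point on each line (e.g. a stationary point)
    normals: (m, 2) array, a normal to each line (e.g. the brightness gradient)
    """
    points = np.asarray(points, dtype=float)
    normals = np.asarray(normals, dtype=float)
    # Normalize so that n . (x - p) is a true perpendicular distance.
    normals = normals / np.linalg.norm(normals, axis=1, keepdims=True)
    offsets = np.einsum("ij,ij->i", normals, points)  # n_i . p_i for each line
    A = normals.T @ normals   # sum of n_i n_i^T
    b = normals.T @ offsets   # sum of n_i (n_i . p_i)
    # Normal equations of the least-squares problem; A is singular
    # when all lines are parallel.
    return np.linalg.solve(A, b)
```

With noisy tangents the lines no longer meet in a single point, and the returned estimate is the compromise location, which is why many stationary points help.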

AIM-1576

Author[s]: Olin Shivers

Supporting Dynamic Languages on the Java Virtual Machine

April 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1576.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1576.pdf

In this note, I propose two extensions to the Java virtual machine (or VM) to allow dynamic languages such as Dylan, Scheme and Smalltalk to be efficiently implemented on the VM. These extensions do not affect the performance of pure Java programs on the machine. The first extension allows for efficient encoding of dynamic data; the second allows for efficient encoding of language-specific computational elements.

AIM-1575

Author[s]: Kenneth Yip and Gerald Jay Sussman

A Computational Model for the Acquisition and Use of Phonological Knowledge

March 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1575.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1575.pdf

Does knowledge of language consist of symbolic rules? How do children learn and use their linguistic knowledge? To elucidate these questions, we present a computational model that acquires phonological knowledge from a corpus of common English nouns and verbs. In our model the phonological knowledge is encapsulated as boolean constraints operating on classical linguistic representations of speech sounds in terms of distinctive features. The learning algorithm compiles a corpus of words into increasingly sophisticated constraints. The algorithm is incremental, greedy, and fast. It yields one-shot learning of phonological constraints from a few examples. Our system exhibits behavior similar to that of young children learning phonological knowledge. As a bonus the constraints can be interpreted as classical linguistic rules. The computational model can be implemented by a surprisingly simple hardware mechanism. Our mechanism also sheds light on a fundamental AI question: How are signals related to symbols?

AITR-1574

Author[s]: David Beymer

Pose-Invariant Face Recognition Using Real and Virtual Views

March 28, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1574.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1574.pdf

The problem of automatic face recognition is to visually identify a person in an input image. This task is performed by matching the input face against the faces of known people in a database of faces. Most existing work in face recognition has limited the scope of the problem, however, by dealing primarily with frontal views, neutral expressions, and fixed lighting conditions. To help generalize existing face recognition systems, we look at the problem of recognizing faces under a range of viewpoints. In particular, we consider two cases of this problem: (i) many example views are available of each person, and (ii) only one view is available per person, perhaps a driver's license or passport photograph. Ideally, we would like to address these two cases using a simple view-based approach, where a person is represented in the database by using a number of views on the viewing sphere. While the view-based approach is consistent with case (i), for case (ii) we need to augment the single real view of each person with synthetic views from other viewpoints, views we call 'virtual views'. Virtual views are generated using prior knowledge of face rotation, knowledge that is 'learned' from images of prototype faces. This prior knowledge is used to effectively rotate in depth the single real view available of each person. In this thesis, I present the view-based face recognizer, techniques for synthesizing virtual views, and experimental results using real and virtual views in the recognizer.

AITR-1573

Author[s]: Thomas F. Stahovich

SketchIT: A Sketch Interpretation Tool for Conceptual Mechanical Design

March 13, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1573.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1573.pdf

We describe a program called SketchIT capable of producing multiple families of designs from a single sketch. The program is given a rough sketch (drawn using line segments for part faces and icons for springs and kinematic joints) and a description of the desired behavior. The sketch is "rough" in the sense that taken literally, it may not work. From this single, perhaps flawed sketch and the behavior description, the program produces an entire family of working designs. The program also produces design variants, each of which is itself a family of designs. SketchIT represents each family of designs with a "behavior ensuring parametric model" (BEP-Model), a parametric model augmented with a set of constraints that ensure the geometry provides the desired behavior. The construction of the BEP-Model from the sketch and behavior description is the primary task and source of difficulty in this undertaking. SketchIT begins by abstracting the sketch to produce a qualitative configuration space (qc-space) which it then uses as its primary representation of behavior. SketchIT modifies this initial qc-space until qualitative simulation verifies that it produces the desired behavior. SketchIT's task is then to find geometries that implement this qc-space. It does this using a library of qc-space fragments. Each fragment is a piece of parametric geometry with a set of constraints that ensure the geometry implements a specific kind of boundary (qcs-curve) in qc-space. SketchIT assembles the fragments to produce the BEP-Model. SketchIT produces design variants by mapping the qc-space to multiple implementations, and by transforming rotating parts to translating parts and vice versa.

AITR-1572

Author[s]: Kah-Kay Sung

Learning and Example Selection for Object and Pattern Detection

March 13, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1572.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1572.pdf

This thesis presents a learning-based approach for detecting classes of objects and patterns with variable image appearance but highly predictable image boundaries. It consists of two parts. In part one, we introduce our object and pattern detection approach using a concrete human face detection example. The approach first builds a distribution-based model of the target pattern class in an appropriate feature space to describe the target's variable image appearance. It then learns from examples a similarity measure for matching new patterns against the distribution-based target model. The approach makes few assumptions about the target pattern class and should therefore be fairly general, as long as the target class has predictable image boundaries. Because our object and pattern detection approach is very much learning-based, how well a system eventually performs depends heavily on the quality of training examples it receives. The second part of this thesis looks at how one can select high quality examples for function approximation learning tasks. We propose an active learning formulation for function approximation, and show for three specific approximation function classes, that the active example selection strategy learns its target with fewer data samples than random sampling. We then simplify the original active learning formulation, and show how it leads to a tractable example selection paradigm, suitable for use in many object and pattern detection problems.

AIM-1571

Author[s]: Tommi S. Jaakkola and Michael I. Jordan

Computing Upper and Lower Bounds on Likelihoods in Intractable Networks

March 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1571.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1571.pdf

We present techniques for computing upper and lower bounds on the likelihoods of partial instantiations of variables in sigmoid and noisy-OR networks. The bounds determine confidence intervals for the desired likelihoods and become useful when the size of the network (or clique size) precludes exact computations. We illustrate the tightness of the obtained bounds by numerical experiments.

AIM-1570

Author[s]: Lawrence K. Saul, Tommi Jaakkola and Michael I. Jordan

Mean Field Theory for Sigmoid Belief Networks

August 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1570.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1570.pdf

We develop a mean field theory for sigmoid belief networks based on ideas from statistical mechanics. Our mean field theory provides a tractable approximation to the true probability distribution in these networks; it also yields a lower bound on the likelihood of evidence. We demonstrate the utility of this framework on a benchmark problem in statistical pattern recognition -- the classification of handwritten digits.

AITR-1569

Author[s]: Deniz Yuret

From Genetic Algorithms to Efficient Organization

May 1994

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1569.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1569.pdf

The work described in this thesis began as an inquiry into the nature and use of optimization programs based on "genetic algorithms." That inquiry led, eventually, to three powerful heuristics that are broadly applicable in gradient-ascent programs: First, remember the locations of local maxima and restart the optimization program at a place distant from previously located local maxima. Second, adjust the size of probing steps to suit the local nature of the terrain, shrinking when probes do poorly and growing when probes do well. And third, keep track of the directions of recent successes, so as to probe preferentially in the direction of most rapid ascent. These algorithms lie at the core of a novel optimization program that illustrates the power to be had from deploying them together. The efficacy of this program is demonstrated on several test problems selected from a variety of fields, including De Jong's famous test-problem suite, the traveling salesman problem, the problem of coordinate registration for image guided surgery, the energy minimization problem for determining the shape of organic molecules, and the problem of assessing the structure of sedimentary deposits using seismic data.
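
The three heuristics can be illustrated with a small hill climber. The sketch below is a simplified reading of the abstract, not the optimization program from the thesis: it probes only along coordinate axes, doubles the step after a success and halves it after a failed round, tries the most recently successful direction first, and restarts from the random candidate farthest from the maxima already found. All function names, the growth and shrink factors, and the candidate count are assumptions.

```python
import math
import random

def climb(f, x0, step=1.0, min_step=1e-6):
    """Adaptive-step hill climbing from x0; returns (x, f(x))."""
    x, fx = list(x0), f(x0)
    n = len(x)
    # Unit probes along +/- each coordinate axis (a simplifying assumption).
    axes = [[sign if j == i else 0.0 for j in range(n)]
            for i in range(n) for sign in (1.0, -1.0)]
    last_dir = None  # direction of the most recent successful probe
    while step >= min_step:
        probes = ([last_dir] if last_dir is not None else []) + axes
        for d in probes:
            cand = [xi + step * di for xi, di in zip(x, d)]
            fc = f(cand)
            if fc > fx:              # success: move there and grow the step
                x, fx, last_dir = cand, fc, d
                step *= 2.0
                break
        else:                        # every probe failed: shrink the step
            step *= 0.5
    return x, fx

def distant_start(known_maxima, bounds, rng, n_candidates=20):
    """Pick the random candidate farthest from all previously found maxima."""
    candidates = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(n_candidates)]
    if not known_maxima:
        return candidates[0]
    return max(candidates,
               key=lambda c: min(math.dist(c, m) for m in known_maxima))

def optimize(f, bounds, restarts=5, seed=0):
    """Run several climbs, each started far from earlier maxima."""
    rng = random.Random(seed)
    found = []
    for _ in range(restarts):
        start = distant_start([x for x, _ in found], bounds, rng)
        found.append(climb(f, start))
    return max(found, key=lambda item: item[1])
```

For example, optimize(lambda p: -(p[0] - 1.0) ** 2 - (p[1] + 2.0) ** 2, [(-5.0, 5.0), (-5.0, 5.0)]) climbs to a point close to (1, -2). On multimodal functions the restart rule spreads the climbs over the search region instead of revisiting the same peak.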

AIM-1568

Author[s]: Philip N. Sabes and Michael I. Jordan

Reinforcement Learning by Probability Matching

January 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1568.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1568.pdf

We present a new algorithm for associative reinforcement learning. The algorithm is based upon the idea of matching a network's output probability with a probability distribution derived from the environment's reward signal. This Probability Matching algorithm is shown to perform faster and be less susceptible to local minima than previously existing algorithms. We use Probability Matching to train mixture of experts networks, an architecture for which other reinforcement learning rules fail to converge reliably on even simple problems. This architecture is particularly well suited for our algorithm as it can compute arbitrarily complex functions yet calculation of the output probability is simple.

AIM-1567

Author[s]: Marina Meila and Michael I. Jordan

Learning Fine Motion by Markov Mixtures of Experts

November 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1567.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1567.pdf

Compliant control is a standard method for performing fine manipulation tasks, like grasping and assembly, but it requires estimation of the state of contact between the robot arm and the objects involved. Here we present a method to learn a model of the movement from measured data. The method requires little or no prior knowledge and the resulting model explicitly estimates the state of contact. The current state of contact is viewed as the hidden state variable of a discrete HMM. The control dependent transition probabilities between states are modeled as parametrized functions of the measurement. We show that their parameters can be estimated from measurements concurrently with the estimation of the parameters of the movement in each state of contact. The learning algorithm is a variant of the EM procedure. The E step is computed exactly; solving the M step exactly would require solving a set of coupled nonlinear algebraic equations in the parameters. Instead, gradient ascent is used to produce an increase in likelihood.

AITR-1566

Author[s]: Tina Kapur

Segmentation of Brain Tissue from Magnetic Resonance Images

January 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1566.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1566.pdf

Segmentation of medical imagery is a challenging problem due to the complexity of the images, as well as to the absence of models of the anatomy that fully capture the possible deformations in each structure. Brain tissue is a particularly complex structure, and its segmentation is an important step for studies in temporal change detection of morphology, as well as for 3D visualization in surgical planning. In this paper, we present a method for segmentation of brain tissue from magnetic resonance images that is a combination of three existing techniques from the Computer Vision literature: EM segmentation, binary morphology, and active contour models. Each of these techniques has been customized for the problem of brain tissue segmentation in a way that the resultant method is more robust than its components. Finally, we present the results of a parallel implementation of this method on IBM's supercomputer Power Visualization System for a database of 20 brain scans each with 256x256x124 voxels and validate those against segmentations generated by neuroanatomy experts.

AIM-1565

CBCL-132

Author[s]: Padhraic Smyth, David Heckerman and Michael Jordan

Probabilistic Independence Networks for Hidden Markov Probability Models

March 13, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1565.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1565.pdf

Graphical techniques for modeling the dependencies of random variables have been explored in a variety of different areas including statistics, statistical physics, artificial intelligence, speech recognition, image processing, and genetics. Formalisms for manipulating these models have been developed relatively independently in these research communities. In this paper we explore hidden Markov models (HMMs) and related structures within the general framework of probabilistic independence networks (PINs). The paper contains a self-contained review of the basic principles of PINs. It is shown that the well-known forward-backward (F-B) and Viterbi algorithms for HMMs are special cases of more general inference algorithms for arbitrary PINs. Furthermore, the existence of inference and estimation algorithms for more general graphical models provides a set of analysis tools for HMM practitioners who wish to explore a richer class of HMM structures. Examples of relatively complex models to handle sensor fusion and coarticulation in speech recognition are introduced and treated within the graphical model framework to illustrate the advantages of the general approach.
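
For readers unfamiliar with the forward pass mentioned above, here is a minimal sketch of the standard forward recursion for a discrete HMM. It is the textbook algorithm, not the general PIN inference procedure developed in the paper. The parameter names (pi, A, B) follow common convention and are assumptions, and no scaling is applied, so long sequences would underflow.

```python
import numpy as np

def forward_likelihood(pi, A, B, obs):
    """Likelihood of an observation sequence under a discrete HMM.

    pi:  (n,) initial state probabilities
    A:   (n, n) transitions, A[i, j] = P(next state j | current state i)
    B:   (n, k) emissions, B[i, o] = P(observation o | state i)
    obs: sequence of observation indices
    """
    alpha = pi * B[:, obs[0]]              # joint of first observation and state
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]      # propagate one step, then weight by emission
    return alpha.sum()                     # marginalize over the final state
```

For example, with pi = [0.6, 0.4], A = [[0.7, 0.3], [0.4, 0.6]], B = [[0.5, 0.5], [0.1, 0.9]] and observations [0, 1], the recursion gives a likelihood of 0.2156.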

AIM-1564

Author[s]: Jonathan A. Rees

A Security Kernel Based on the Lambda-Calculus

March 13, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1564.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1564.pdf

Cooperation between independent agents depends upon establishing a degree of security. Each of the cooperating agents needs assurance that the cooperation will not endanger resources of value to that agent. In a computer system, a computational mechanism can assure safe cooperation among the system's users by mediating resource access according to desired security policy. Such a mechanism, which is called a security kernel, lies at the heart of many operating systems and programming environments. The report describes Scheme 48, a programming environment whose design is guided by established principles of operating system security. Scheme 48's security kernel is small, consisting of the call-by-value lambda-calculus with a few simple extensions to support abstract data types, object mutation, and access to hardware resources. Each agent (user or subsystem) has a separate evaluation environment that holds objects representing privileges granted to that agent. Because environments ultimately determine availability of object references, protection and sharing can be controlled largely by the way in which environments are constructed. I will describe experience with Scheme 48 that shows how it serves as a robust and flexible experimental platform. Two successful applications of Scheme 48 are the programming environment for the Cornell mobile robots, where Scheme 48 runs with no (other) operating system support; and a secure multi-user environment that runs on workstations.

AITR-1563

Author[s]: John Bryant Morrell

Parallel Coupled Micro-Macro Actuators

January 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1563.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1563.pdf

This thesis presents a new actuator system consisting of a micro-actuator and a macro-actuator coupled in parallel via a compliant transmission. The system is called the Parallel Coupled Micro-Macro Actuator, or PaCMMA. In this system, the micro-actuator is capable of high bandwidth force control due to its low mass and direct-drive connection to the output shaft. The compliant transmission of the macro-actuator reduces the impedance (stiffness) at the output shaft and increases the dynamic range of force. Performance improvement over single actuator systems was expected in force control, impedance control, force distortion and reduction of transient impact forces. A set of quantitative measures is proposed and the actuator system is evaluated against them: Force Control Bandwidth, Position Bandwidth, Dynamic Range, Impact Force, Impedance ("Backdriveability"), Force Distortion and Force Performance Space. Several theoretical performance limits are derived from the saturation limits of the system. A control law is proposed and control system performance is compared to the theoretical limits. A prototype testbed was built using permanent magnet motors and an experimental comparison was performed between this actuator concept and two single actuator systems. The following performance was observed: Force bandwidth of 56 Hz, Torque Dynamic Range of 800:1, Peak Torque of 1040 mNm, Minimum Torque of 1.3 mNm. Peak Impact Force was reduced by an order of magnitude. Distortion at small amplitudes was reduced substantially. Backdriven impedance was reduced by 2-3 orders of magnitude. This actuator system shows promise for manipulator design as well as psychophysical tests of human performance.

AIM-1562

CBCL-131

Author[s]: Michael I. Jordan and Christopher M. Bishop

Neural Networks

March 13, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1562.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1562.pdf

We present an overview of current research on artificial neural networks, emphasizing a statistical perspective. We view neural networks as parameterized graphs that make probabilistic assumptions about data, and view learning algorithms as methods for finding parameter values that look probable in the light of the data. We discuss basic issues in representation and learning, and treat some of the practical issues that arise in fitting networks to data. We also discuss links between neural networks and the general formalism of graphical models.

AIM-1561

CBCL-130

Author[s]: Zoubin Ghahramani and Michael I. Jordan

Factorial Hidden Markov Models

February 9, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1561.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1561.pdf

We present a framework for learning in hidden Markov models with distributed state representations. Within this framework, we derive a learning algorithm based on the Expectation-Maximization (EM) procedure for maximum likelihood estimation. Analogous to the standard Baum-Welch update rules, the M-step of our algorithm is exact and can be solved analytically. However, due to the combinatorial nature of the hidden state representation, the exact E-step is intractable. A simple and tractable mean field approximation is derived. Empirical results on a set of problems suggest that both the mean field approximation and Gibbs sampling are viable alternatives to the computationally expensive exact algorithm.

AIM-1560

CBCL-129

Author[s]: Tommi S. Jaakkola, Lawrence K. Saul and Michael I. Jordan

Fast Learning by Bounding Likelihoods in Sigmoid Type Belief Networks

February 9, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1560.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1560.pdf

Sigmoid type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e., the EM algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains. The complementary networks can be used for continuous density estimation as well.

AIM-1559

CBCL-128

Author[s]: Michael J. Jones, Tomaso Poggio

Model-Based Matching of Line Drawings by Linear Combinations of Prototypes

January 18, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1559.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1559.pdf

We describe a technique for finding pixelwise correspondences between two images by using models of objects of the same class to guide the search. The object models are 'learned' from example images (also called prototypes) of an object class. The models consist of a linear combination of prototypes. The flow fields giving pixelwise correspondences between a base prototype and each of the other prototypes must be given. A novel image of an object of the same class is matched to a model by minimizing an error between the novel image and the current guess for the closest model image. Currently, the algorithm applies to line drawings of objects. An extension to real grey level images is discussed.

AIM-1558

CBCL-129

Author[s]: Carl de Marcken

The Unsupervised Acquisition of a Lexicon from Continuous Speech

January 18, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1558.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1558.pdf

We present an unsupervised learning algorithm that acquires a natural-language lexicon from raw speech. The algorithm is based on the optimal encoding of symbol sequences in an MDL framework, and uses a hierarchical representation of language that overcomes many of the problems that have stymied previous grammar-induction procedures. The forward mapping from symbol sequences to the speech stream is modeled using features based on articulatory gestures. We present results on the acquisition of lexicons and language models from raw speech, text, and phonetic transcripts, and demonstrate that our algorithm compares very favorably to other reported results with respect to segmentation performance and statistical efficiency.

AIM-1556

CBCL-127

Author[s]: David C. Somers, Emanuel V. Todorov, Athanassios G. Siapas and Mriganka Sur

Vector-Based Integration of Local and Long-Range Information in Visual Cortex

January 18, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1556.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1556.pdf

Integration of inputs by cortical neurons provides the basis for the complex information processing performed in the cerebral cortex. Here, we propose a new analytic framework for understanding integration within cortical neuronal receptive fields. Based on the synaptic organization of cortex, we argue that neuronal integration is a systems-level process better studied in terms of local cortical circuitry than at the level of single neurons, and we present a method for constructing self-contained modules which capture (nonlinear) local circuit interactions. In this framework, receptive field elements naturally have dual (rather than the traditional unitary) influence since they drive both excitatory and inhibitory cortical neurons. This vector-based analysis, in contrast to scalar approaches, greatly simplifies integration by permitting linear summation of inputs from both "classical" and "extraclassical" receptive field regions. We illustrate this by explaining two complex visual cortical phenomena, which are incompatible with scalar notions of neuronal integration.

AIM-1555

Author[s]: Thomas Marill

The Three-Dimensional Interpretation of a Class of Simple Line-Drawings

October 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1555.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1555.pdf

We provide a theory of the three-dimensional interpretation of a class of line-drawings called p-images, which are interpreted by the human vision system as parallelepipeds ("boxes"). Despite their simplicity, p-images raise a number of interesting vision questions:
* Why are p-images seen as three-dimensional objects? Why not just as flat images?
* What are the dimensions and pose of the perceived objects?
* Why are some p-images interpreted as rectangular boxes, while others are seen as skewed, even though there is no obvious distinction between the images?
* When p-images are rotated in three dimensions, why are the image-sequences perceived as distorting objects, even though structure-from-motion would predict that rigid objects would be seen?
* Why are some three-dimensional parallelepipeds seen as radically different when viewed from different viewpoints?
We show that these and related questions can be answered with the help of a single mathematical result and an associated perceptual principle. An interesting special case arises when there are right angles in the p-image. This case represents a singularity in the equations and is mystifying from the vision point of view. It would seem that (at least in this case) the vision system does not follow the ordinary rules of geometry but operates in accordance with other (and as yet unknown) principles.

AIM-1554

Author[s]: D.A. Leopold, J.C. Fitzgibbons and N.K. Logothetis

The Role of Attention in Binocular Rivalry as Revealed Through Optokinetic Nystagmus

November 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1554.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1554.pdf

When stimuli presented to the two eyes differ considerably, stable binocular fusion fails, and the subjective percept alternates between the two monocular images, a phenomenon known as binocular rivalry. The influence of attention over this perceptual switching has long been studied, and although there is evidence that attention can affect the alternation rate, its role in the overall dynamics of the rivalry process remains unclear. The present study investigated the relationship between the attention paid to the rivalry stimulus, and the dynamics of the perceptual alternations. Specifically, the temporal course of binocular rivalry was studied as the subjects performed difficult nonvisual and visual concurrent tasks, directing their attention away from the rivalry stimulus. Periods of complete perceptual dominance were compared for the attended condition, where the subjects reported perceptual changes, and the unattended condition, where one of the simultaneous tasks was performed. During both the attended and unattended conditions, phases of rivalry dominance were obtained by analyzing the subject's optokinetic nystagmus recorded by an electrooculogram, where the polarity of the nystagmus served as an objective indicator of the perceived direction of motion. In all cases, the presence of a difficult concurrent task had little or no effect on the statistics of the alternations, as judged by two classic tests of rivalry, although the overall alternation rate showed a small but significant increase with the concurrent task. It is concluded that the statistical patterns of rivalry alternations are not governed by attentional shifts or decision-making on the part of the subject.

AIM-1553

Author[s]: N.K. Logothetis and D.A. Leopold

On the Physiology of Bistable Percepts

November 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1553.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1553.pdf

Binocular rivalry refers to the alternating perceptions experienced when two dissimilar patterns are stereoscopically viewed. To study the neural mechanism that underlies such competitive interactions, single cells were recorded in the visual areas V1, V2, and V4, while monkeys reported the perceived orientation of rivaling sinusoidal grating patterns. A number of neurons in all areas showed alternating periods of excitation and inhibition that correlated with the perceptual dominance and suppression of the cell's preferred orientation. The remaining population of cells was not influenced by whether or not the optimal stimulus orientation was perceptually suppressed. Response modulation during rivalry was not correlated with cell attributes such as monocularity, binocularity, or disparity tuning. These results suggest that the awareness of a visual pattern during binocular rivalry arises through interactions between neurons at different levels of visual pathways, and that the site of suppression is unlikely to correspond to a particular visual area, as often hypothesized on the basis of psychophysical observations. The cell-types of modulating neurons and their overwhelming preponderance in higher rather than in early visual areas also suggest -- together with earlier psychophysical evidence -- the possibility of a common mechanism underlying rivalry as well as other bistable percepts, such as those experienced with ambiguous figures.

AIM-1552

Author[s]: David A. Cohn

Minimizing Statistical Bias with Queries

September 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1552.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1552.pdf

I describe an exploration criterion that attempts to minimize the error of a learner by minimizing its estimated squared bias. I describe experiments with locally-weighted regression on two simple kinematics problems, and observe that this "bias-only" approach outperforms the more common "variance-only" exploration approach, even in the presence of noise.

AIM-1551

Author[s]: Jacob Katzenelson and Aharon Unikovski

A Network Charge-Oriented MOS Transistor Model

August 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1551.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1551.pdf

The MOS transistor physical model as described in [3] is presented here as a network model. The goal is to obtain an accurate model, suitable for simulation, free from certain problems reported in the literature [13], and conceptually as simple as possible. To achieve this goal the original model had to be extended and modified. The paper presents the derivation of the network model from physical equations, including the corrections which are required for simulation and which compensate for simplifications introduced in the original physical model. Our intrinsic MOS model consists of three nonlinear voltage-controlled capacitors and a dependent current source. The charges of the capacitors and the current of the current source are functions of the voltages $V_{gs}$, $V_{bs}$, and $V_{ds}$. The complete model consists of the intrinsic model plus the parasitics. The apparent simplicity of the model is a result of hiding information in the characteristics of the nonlinear components. The resulting network model has been checked by simulation and analysis. It is shown that the network model is suitable for simulation: it is defined for any value of the voltages; the functions involved are continuous and satisfy Lipschitz conditions with no jumps at region boundaries; derivatives have been computed symbolically and are available for use by the Newton-Raphson method. The model's functions can be measured from the terminals. It is also shown that small channel effects can be included in the model. Higher frequency effects can be modeled by using a network consisting of several sections of the basic lumped model. Future plans include a detailed comparison of the network model with models such as SPICE level 3 and a comparison of the multi-section higher frequency model with experiments.

AIM-1550

Author[s]: T.D. Alter and Ronen Basri

Extracting Salient Curves from Images: An Analysis of the Saliency Network

August 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1550.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1550.pdf

The Saliency Network proposed by Shashua and Ullman is a well-known approach to the problem of extracting salient curves from images while performing gap completion. This paper analyzes the Saliency Network. The Saliency Network is attractive for several reasons. First, the network generally prefers long and smooth curves over short or wiggly ones. While computing saliencies, the network also fills in gaps with smooth completions and tolerates noise. Finally, the network is locally connected, and its size is proportional to the size of the image. Nevertheless, our analysis reveals certain weaknesses with the method. In particular, we show cases in which the most salient element does not lie on the perceptually most salient curve. Furthermore, in some cases the saliency measure changes its preferences when curves are scaled uniformly. Also, we show that for certain fragmented curves the measure prefers large gaps over a few small gaps of the same total size. In addition, we analyze the time complexity required by the method. We show that the number of steps required for convergence in serial implementations is quadratic in the size of the network, and in parallel implementations is linear in the size of the network. We discuss problems due to coarse sampling of the range of possible orientations. We show that with proper sampling the complexity of the network becomes cubic in the size of the network. Finally, we consider the possibility of using the Saliency Network for grouping. We show that the Saliency Network recovers the most salient curve efficiently, but it has problems with identifying any salient curve other than the most salient one.

AIM-1549

Author[s]: Roberto Brunelli and Tomaso Poggio

Template Matching: Matched Spatial Filters and Beyond

October 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1549.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1549.pdf

Template matching by means of cross-correlation is common practice in pattern recognition. However, its sensitivity to deformations of the pattern and the broad and unsharp peaks it produces are significant drawbacks. This paper reviews some results on how these shortcomings can be removed. Several techniques (Matched Spatial Filters, Synthetic Discriminant Functions, Principal Components Projections and Reconstruction Residuals) are reviewed and compared on a common task: locating eyes in a database of faces. New variants are also proposed and compared: least squares Discriminant Functions and the combined use of projections on eigenfunctions and the corresponding reconstruction residuals. Finally, approximation networks are introduced in an attempt to improve filter design by the introduction of nonlinearity.
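
As a point of reference for the baseline this abstract starts from, the following is a minimal sketch of template matching by normalized cross-correlation. It is not taken from the memo: the function names `normalized_cross_correlation` and `best_match` are hypothetical, the brute-force loop is chosen for clarity rather than speed, and none of the reviewed filters (Matched Spatial Filters, Synthetic Discriminant Functions, and so on) are implemented here.

```python
import numpy as np

def normalized_cross_correlation(image, template):
    """Score every valid placement of `template` inside `image`.

    Each score is the correlation coefficient between the template and the
    image patch under it, so it lies in [-1, 1] and is unaffected by a gain
    or offset applied to the patch intensities.
    """
    image = np.asarray(image, dtype=float)
    template = np.asarray(template, dtype=float)
    th, tw = template.shape
    ih, iw = image.shape
    t = template - template.mean()
    t_norm = np.sqrt((t ** 2).sum())
    scores = np.zeros((ih - th + 1, iw - tw + 1))
    for r in range(scores.shape[0]):
        for c in range(scores.shape[1]):
            patch = image[r:r + th, c:c + tw]
            p = patch - patch.mean()
            denom = np.sqrt((p ** 2).sum()) * t_norm
            # A flat patch or flat template has no pattern to match; score it 0.
            scores[r, c] = (p * t).sum() / denom if denom > 0 else 0.0
    return scores

def best_match(image, template):
    """Return (row, col, score) of the highest-scoring placement."""
    scores = normalized_cross_correlation(image, template)
    r, c = np.unravel_index(np.argmax(scores), scores.shape)
    return int(r), int(c), float(scores[r, c])
```

Calling `best_match(face_image, eye_template)` on two 2D grey-level arrays would give the top-left corner of the best placement. The broad peaks and sensitivity to deformation that the abstract mentions show up as a slowly varying `scores` map around that corner.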

AITR-1548

Author[s]: Paul A. Viola

Alignment by Maximization of Mutual Information

March 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1548.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1548.pdf

A new information-theoretic approach is presented for finding the pose of an object in an image. The technique does not require information about the surface properties of the object, besides its shape, and is robust with respect to variations of illumination. In our derivation, few assumptions are made about the nature of the imaging process. As a result the algorithms are quite general and can foreseeably be used in a wide variety of imaging situations. Experiments are presented that demonstrate the approach registering magnetic resonance (MR) images with computed tomography (CT) images, aligning a complex 3D object model to real scenes including clutter and occlusion, tracking a human head in a video sequence and aligning a view-based 2D object model to real images. The method is based on a formulation of the mutual information between the model and the image called EMMA. As applied here the technique is intensity-based, rather than feature-based. It works well in domains where edge or gradient-magnitude based methods have difficulty, yet it is more robust than traditional correlation. Additionally, it has an efficient implementation that is based on stochastic approximation. Finally, we will describe a number of additional real-world applications that can be solved efficiently and reliably using EMMA. EMMA can be used in machine learning to find maximally informative projections of high-dimensional data. EMMA can also be used to detect and correct corruption in magnetic resonance images (MRI).
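
To illustrate the quantity being maximized, here is a minimal sketch that estimates the mutual information between two equally sized images from a joint intensity histogram. This is an assumption-laden stand-in and not EMMA: the report's estimator and its stochastic-approximation optimizer are not reproduced, and the function name `mutual_information` and the default of 16 bins are hypothetical choices.

```python
import numpy as np

def mutual_information(a, b, bins=16):
    """Histogram estimate (in nats) of the mutual information of two images.

    `a` and `b` must contain the same number of pixels; pixel i of `a` is
    paired with pixel i of `b`.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    joint, _, _ = np.histogram2d(a, b, bins=bins)
    p_ab = joint / joint.sum()                # joint distribution
    p_a = p_ab.sum(axis=1, keepdims=True)     # marginal of a, shape (bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)     # marginal of b, shape (1, bins)
    independent = p_a @ p_b                   # outer product of the marginals
    nz = p_ab > 0                             # skip empty cells (0 * log 0 = 0)
    return float((p_ab[nz] * np.log(p_ab[nz] / independent[nz])).sum())
```

The score stays high when one image is an intensity-inverted copy of the other, a case in which plain correlation would report a strong negative match. That is one way to read the abstract's claim of robustness to illumination and imaging modality. An alignment search would evaluate this score over candidate poses and keep the maximum.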

AIM-1547

Author[s]: Michael R. Blair, Natalya Cohen, David M. LaMacchia and Brian K. Zuzga

MIT SchMUSE: Class-Based Remote Delegation in a Capricious Distributed Environment

February 1993

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1547.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1547.pdf

MIT SchMUSE (pronounced "shmooz") is a concurrent, distributed, delegation-based object-oriented interactive environment with persistent storage. It is designed to run in a "capricious" network environment, where servers can migrate from site to site and can regularly become unavailable. Our design introduces a new form of unique identifiers called "globally unique tickets" that provide globally unique time/space stamps for objects and classes without being location specific. Object location is achieved by a distributed hierarchical lazy lookup mechanism that we call "realm resolution." We also introduce a novel mechanism called "message deferral" for enhanced reliability in the face of remote delegation. We conclude with a comparison to related work and a projection of future work on MIT SchMUSE.

AITR-1546

Author[s]: Yoky Matsuoka

Embodiment and Manipulation Learning Process for a Humanoid Hand

May 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1546.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1546.pdf

Babies are born with simple manipulation capabilities such as reflexes to perceived stimuli. Initial discoveries by babies are accidental until they become coordinated and curious enough to actively investigate their surroundings. This thesis explores the development of such primitive learning systems using an embodied light-weight hand with three fingers and a thumb. It is self-contained having four motors and 36 exteroceptor and proprioceptor sensors controlled by an on-palm microcontroller. Primitive manipulation is learned from sensory inputs using competitive learning, back-propagation algorithm and reinforcement learning strategies. This hand will be used for a humanoid being developed at the MIT Artificial Intelligence Laboratory.

AITR-1545

Author[s]: James A. Stuart Fiske

Thread Scheduling Mechanisms for Multiple-Context Parallel Processors

June 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1545.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1545.pdf

Scheduling tasks to efficiently use the available processor resources is crucial to minimizing the runtime of applications on shared-memory parallel processors. One factor that contributes to poor processor utilization is the idle time caused by long latency operations, such as remote memory references or processor synchronization operations. One way of tolerating this latency is to use a processor with multiple hardware contexts that can rapidly switch to executing another thread of computation whenever a long latency operation occurs, thus increasing processor utilization by overlapping computation with communication. Although multiple contexts are effective for tolerating latency, this effectiveness can be limited by memory and network bandwidth, by cache interference effects among the multiple contexts, and by critical tasks sharing processor resources with less critical tasks. This thesis presents techniques that increase the effectiveness of multiple contexts by intelligently scheduling threads to make more efficient use of processor pipeline, bandwidth, and cache resources. This thesis proposes thread prioritization as a fundamental mechanism for directing the thread schedule on a multiple-context processor. A priority is assigned to each thread either statically or dynamically and is used by the thread scheduler to decide which threads to load in the contexts, and to decide which context to switch to on a context switch. We develop a multiple-context model that integrates both cache and network effects, and shows how thread prioritization can both maintain high processor utilization, and limit increases in critical path runtime caused by multithreading. The model also shows that in order to be effective in bandwidth limited applications, thread prioritization must be extended to prioritize memory requests. 
We show how simple hardware can prioritize the running of threads in the multiple contexts, and the issuing of requests to both the local memory and the network. Simulation experiments show how thread prioritization is used in a variety of applications. Thread prioritization can improve the performance of synchronization primitives by minimizing the number of processor cycles wasted in spinning and devoting more cycles to critical threads. Thread prioritization can be used in combination with other techniques to improve cache performance and minimize cache interference between different working sets in the cache. For applications that are critical path limited, thread prioritization can improve performance by allowing processor resources to be devoted preferentially to critical threads. These experimental results show that thread prioritization is a mechanism that can be used to implement a wide range of scheduling policies.

AITR-1544

Author[s]: J.P. Mellor

Enhanced Reality Visualization in a Surgical Environment

January 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1544.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1544.pdf

Enhanced reality visualization is the process of enhancing an image by adding to it information which is not present in the original image. A wide variety of information can be added to an image ranging from hidden lines or surfaces to textual or iconic data about a particular part of the image. Enhanced reality visualization is particularly well suited to neurosurgery. By rendering brain structures which are not visible, at the correct location in an image of a patient's head, the surgeon is essentially provided with X-ray vision. He can visualize the spatial relationship between brain structures before he performs a craniotomy and during the surgery he can see what's under the next layer before he cuts through. Given a video image of the patient and a three dimensional model of the patient's brain the problem enhanced reality visualization faces is to render the model from the correct viewpoint and overlay it on the original image. The relationship between the coordinate frames of the patient, the patient's internal anatomy scans and the image plane of the camera observing the patient must be established. This problem is closely related to the camera calibration problem. This report presents a new approach to finding this relationship and develops a system for performing enhanced reality visualization in a surgical environment. Immediately prior to surgery a few circular fiducials are placed near the surgical site. An initial registration of video and internal data is performed using a laser scanner. Following this, our method is fully automatic, runs in nearly real-time, is accurate to within a pixel, allows both patient and camera motion, automatically corrects for changes to the internal camera parameters (focal length, focus, aperture, etc.) and requires only a single image.

AITR-1543

Author[s]: Brian Scott Eberman

Contact Sensing: A Sequential Decision Approach to Sensing Manipulation Contact

May 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1543.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1543.pdf

This paper describes a new statistical, model-based approach to building a contact state observer. The observer uses measurements of the contact force and position, and prior information about the task encoded in a graph, to determine the current location of the robot in the task configuration space. Each node represents what the measurements will look like in a small region of configuration space by storing a predictive, statistical, measurement model. This approach assumes that the measurements are statistically block independent conditioned on knowledge of the model, which is a fairly good model of the actual process. Arcs in the graph represent possible transitions between models. Beam Viterbi search is used to match measurement history against possible paths through the model graph in order to estimate the most likely path for the robot. The resulting approach provides a new decision process that can be used as an observer for event driven manipulation programming. The decision procedure is significantly more robust than simple threshold decisions because the measurement history is used to make decisions. The approach can be used to enhance the capabilities of autonomous assembly machines and in quality control applications.
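
The search described above can be sketched in a few lines. The version below is a simplified illustration under stated assumptions, not the report's observer: each node's measurement model is taken to be a scalar Gaussian `(mean, std)`, all arcs are treated as equally likely, and the names `beam_viterbi`, `models`, `arcs`, and `beam_width` are hypothetical. It expects at least one measurement and a graph in which every node has at least one outgoing arc (self-loops are listed explicitly).

```python
import math

def gaussian_log_likelihood(x, mean, std):
    """Log density of a scalar measurement x under a Gaussian model."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def beam_viterbi(measurements, models, arcs, start, beam_width=3):
    """Return the most likely node path through the model graph.

    models: node -> (mean, std) of that node's measurement model (assumed form)
    arcs:   node -> list of nodes reachable in one step
    """
    # Each hypothesis is keyed by its current node: node -> (log score, path).
    first = gaussian_log_likelihood(measurements[0], *models[start])
    beam = {start: (first, [start])}
    for x in measurements[1:]:
        candidates = {}
        for node, (score, path) in beam.items():
            for nxt in arcs[node]:
                new_score = score + gaussian_log_likelihood(x, *models[nxt])
                # Viterbi step: keep only the best path arriving at each node.
                if nxt not in candidates or new_score > candidates[nxt][0]:
                    candidates[nxt] = (new_score, path + [nxt])
        # Beam step: keep only the `beam_width` best-scoring hypotheses.
        ranked = sorted(candidates.items(), key=lambda kv: kv[1][0], reverse=True)
        beam = dict(ranked[:beam_width])
    best = max(beam, key=lambda n: beam[n][0])
    return beam[best][1]
```

Because the score of a path accumulates over the whole measurement history, a single noisy reading rarely flips the estimate, which is the contrast with threshold decisions drawn in the abstract.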

AIM-1542

Author[s]: D. McAllester, P. Van Hentenryck and T. Kapur

Three Cuts for Accelerated Interval Propagation

May 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1542.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1542.pdf

This paper addresses the problem of nonlinear multivariate root finding. In an earlier paper we described a system called Newton which finds roots of systems of nonlinear equations using refinements of interval methods. The refinements are inspired by AI constraint propagation techniques. Newton is competitive with continuation methods on most benchmarks and can handle a variety of cases that are infeasible for continuation methods. This paper presents three "cuts" which we believe capture the essential theoretical ideas behind the success of Newton. This paper describes the cuts in a concise and abstract manner which, we believe, makes the theoretical content of our work more apparent. Any implementation will need to adopt some heuristic control mechanism. Heuristic control of the cuts is only briefly discussed here.

AITR-1541

Author[s]: Elmer S. Hung

Parameter Estimation in Chaotic Systems

April 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1541.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1541.pdf

This report examines how to estimate the parameters of a chaotic system given noisy observations of the state behavior of the system. Investigating parameter estimation for chaotic systems is interesting because of possible applications for high-precision measurement and for use in other signal processing, communication, and control applications involving chaotic systems. In this report, we examine theoretical issues regarding parameter estimation in chaotic systems and develop an efficient algorithm to perform parameter estimation. We discover two properties that are helpful for performing parameter estimation on non-structurally stable systems. First, it turns out that most data in a time series of state observations contribute very little information about the underlying parameters of a system, while a few sections of data may be extraordinarily sensitive to parameter changes. Second, for one-parameter families of systems, we demonstrate that there is often a preferred direction in parameter space governing how easily trajectories of one system can "shadow" trajectories of nearby systems. This asymmetry of shadowing behavior in parameter space is proved for certain families of maps of the interval. Numerical evidence indicates that similar results may be true for a wide variety of other systems. Using the two properties cited above, we devise an algorithm for performing parameter estimation. Standard parameter estimation techniques such as the extended Kalman filter perform poorly on chaotic systems because of divergence problems. The proposed algorithm achieves accuracies several orders of magnitude better than the Kalman filter and has good convergence properties for large data sets.

AIM-1537

Author[s]: David Beymer

Vectorizing Face Images by Interleaving Shape and Texture Computations

September 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1537.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1537.pdf

The correspondence problem in computer vision is basically a matching task between two or more sets of features. In this paper, we introduce a vectorized image representation, which is a feature-based representation where correspondence has been established with respect to a reference image. This representation has two components: (1) shape, or (x, y) feature locations, and (2) texture, defined as the image grey levels mapped onto the standard reference image. This paper explores an automatic technique for "vectorizing" face images. Our face vectorizer alternates back and forth between computation steps for shape and texture, and a key idea is to structure the two computations so that each one uses the output of the other. A hierarchical coarse-to-fine implementation is discussed, and applications are presented to the problems of facial feature detection and registration of two arbitrary faces.

AIM-1536

Author[s]: David Beymer and Tomaso Poggio

Face Recognition from One Example View

September 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1536.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1536.pdf

If we are provided a face database with only one example view per person, is it possible to recognize new views of them under a variety of different poses, especially views rotated in depth from the original example view? We investigate using prior knowledge about faces plus each single example view to generate virtual views of each person, or views of the face as seen from different poses. Prior knowledge of faces is represented in an example-based way, using 2D views of a prototype face seen rotating in depth. The synthesized virtual views are evaluated as example views in a view-based approach to pose-invariant face recognition. They are shown to improve the recognition rate over the scenario where only the single real view is used.

AIM-1535

Author[s]: Panayotis Skordos and Gerald Jay Sussman

Comparison Between Subsonic Flow Simulation and Physical Measurements of Flue Pipes

April 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1535.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1535.pdf

Direct simulations of wind musical instruments using the compressible Navier-Stokes equations have recently become possible through the use of parallel computing and through developments in numerical methods. As a first demonstration, the flow of air and the generation of musical tones inside a soprano recorder are simulated numerically. In addition, physical measurements are made of the acoustic signal generated by the recorder at different blowing speeds. The comparison between simulated and physically measured behavior is encouraging and points towards ways of improving the simulations.

AIM-1534

Author[s]: Panayotis A. Skordos

Aeroacoustics on Non-Dedicated Workstations

April 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1534.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1534.pdf

The simulation of subsonic aeroacoustic problems such as the flow-generated sound of wind instruments is well suited for parallel computing on a cluster of non-dedicated workstations. Simulations are demonstrated which employ 20 non-dedicated Hewlett-Packard workstations (HP9000/715), and achieve performance on this problem comparable to that of a 64-node CM-5 dedicated supercomputer with vector units. The success of the present approach depends on the low communication requirements of the problem (low communication to computation ratio) which arise from the coarse-grain decomposition of the problem and the use of local-interaction methods. Many important problems may be suitable for this type of parallel computing including computer vision, circuit simulation, and other subsonic flow problems.

AIM-1533

Author[s]: N.K. Logothetis, J. Pauls and T. Poggio

Spatial Reference Frames for Object Recognition: Tuning for Rotations in Depth

March 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1533.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1533.pdf

The inferior temporal cortex (IT) of monkeys is thought to play an essential role in visual object recognition. Inferotemporal neurons are known to respond to complex visual stimuli, including patterns like faces, hands, or other body parts. What is the role of such neurons in object recognition? The present study examines this question in combined psychophysical and electrophysiological experiments, in which monkeys learned to classify and recognize novel visual 3D objects. A population of neurons in IT were found to respond selectively to such objects that the monkeys had recently learned to recognize. A large majority of these cells discharged maximally for one view of the object, while their response fell off gradually as the object was rotated away from the neuron's preferred view. Most neurons exhibited orientation-dependent responses also during view-plane rotations. Some neurons were found tuned around two views of the same object, while a very small number of cells responded in a view-invariant manner. For five different objects that were extensively used during the training of the animals, and for which behavioral performance became view-independent, multiple cells were found that were tuned around different views of the same object. No selective responses were ever encountered for views that the animal systematically failed to recognize. The results of our experiments suggest that neurons in this area can develop a complex receptive field organization as a consequence of extensive training in the discrimination and recognition of objects. Simple geometric features did not appear to account for the neurons' selective responses. These findings support the idea that a population of neurons -- each tuned to a different object aspect, and each showing a certain degree of invariance to image transformations -- may, as an assembly, encode complex 3D objects.
In such a system, several neurons may be active for any given vantage point, with a single unit acting like a blurred template for a limited neighborhood of a single view.

AIM-1532

Author[s]: Marco Fillo, Stephen W. Keckler, William J. Dally, Nicholas P. Carter, Andrew Chang, Yevgeny Gurevich and Whay S. Lee

The M-Machine Multicomputer

March 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1532.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1532.pdf

The M-Machine is an experimental multicomputer being developed to test architectural concepts motivated by the constraints of modern semiconductor technology and the demands of programming systems. The M-Machine computing nodes are connected with a 3-D mesh network; each node is a multithreaded processor incorporating 12 function units, on-chip cache, and local memory. The multiple function units are used to exploit both instruction-level and thread-level parallelism. A user accessible message passing system yields fast communication and synchronization between nodes. Rapid access to remote memory is provided transparently to the user with a combination of hardware and software mechanisms. This paper presents the architecture of the M-Machine and describes how its mechanisms maximize both single thread performance and overall system throughput.

AIM-1531

Author[s]: Thomas Vetter and Tomaso Poggio

Linear Object Classes and Image Synthesis from a Single Example Image

March 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1531.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1531.pdf

The need to generate new views of a 3D object from a single real image arises in several fields, including graphics and object recognition. While the traditional approach relies on the use of 3D models, we have recently introduced techniques that are applicable under restricted conditions but simpler. The approach exploits image transformations that are specific to the relevant object class and learnable from example views of other "prototypical" objects of the same class. In this paper, we introduce such a new technique by extending the notion of linear class first proposed by Poggio and Vetter. For linear object classes it is shown that linear transformations can be learned exactly from a basis set of 2D prototypical views. We demonstrate the approach on artificial objects and then show preliminary evidence that the technique can effectively "rotate" high-resolution face images from a single 2D view.

AIM-1530

Author[s]: Partha Niyogi and Robert C. Berwick

A Note on Zipf's Law, Natural Languages, and Noncoding DNA Regions

March 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1530.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1530.pdf

In Phys. Rev. Letters (73:2), Mantegna et al. conclude on the basis of Zipf rank frequency data that noncoding DNA sequence regions are more like natural languages than coding regions. We argue on the contrary that an empirical fit to Zipf's "law" cannot be used as a criterion for similarity to natural languages. Although DNA is presumably an "organized system of signs" in Mandelbrot's (1961) sense, an observation of statistical features of the sort presented in the Mantegna et al. paper does not shed light on the similarity between DNA's "grammar" and natural language grammars, just as the observation of exact Zipf-like behavior cannot distinguish between the underlying processes of tossing an M-sided die or a finite-state branching process.

AITR-1529

Author[s]: Aparna Lakshmi Ratan

The Role of Fixation and Visual Attention in Object Recognition

July 21, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1529.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1529.pdf

This research project is a study of the role of fixation and visual attention in object recognition. In this project, we build an active vision system which can recognize a target object in a cluttered scene efficiently and reliably. Our system integrates visual cues like color and stereo to perform figure/ground separation, yielding candidate regions on which to focus attention. Within each image region, we use stereo to extract features that lie within a narrow disparity range about the fixation position. These selected features are then used as input to an alignment-style recognition system. We show that visual attention and fixation significantly reduce the complexity and the false identifications in model-based recognition using Alignment methods. We also demonstrate that stereo can be used effectively as a figure/ground separator without the need for accurate camera calibration.

AITR-1527

Author[s]: Panayotis A. Skordos

Modeling Flue Pipes: Subsonic Flow, Lattice Boltzmann, and Parallel Distributed Computers

April 21, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1527.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1527.pdf

The problem of simulating the hydrodynamics and the acoustic waves inside wind musical instruments such as the recorder, the organ, and the flute is considered. The problem is attacked by developing suitable local-interaction algorithms and a parallel simulation system on a cluster of non-dedicated workstations. Physical measurements of the acoustic signal of various flue pipes show good agreement with the simulations. Previous attempts at this problem have been frustrated because the modeling of acoustic waves requires small integration time steps which make the simulation very compute-intensive. In addition, the simulation of subsonic viscous compressible flow at high Reynolds numbers is susceptible to slow-growing numerical instabilities which are triggered by high-frequency acoustic modes. The numerical instabilities are mitigated by employing suitable explicit algorithms: lattice Boltzmann method, compressible finite differences, and fourth-order artificial-viscosity filter. Further, a technique for accurate initial and boundary conditions for the lattice Boltzmann method is developed, and the second-order accuracy of the lattice Boltzmann method is demonstrated. The compute-intensive requirements are handled by developing a parallel simulation system on a cluster of non-dedicated workstations. The system achieves 80 percent parallel efficiency (speedup/processors) using 20 HP-Apollo workstations. The system is built on UNIX and TCP/IP communication routines, and includes automatic process migration from busy hosts to free hosts.
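
The efficiency figure quoted in this abstract can be unpacked with a line of arithmetic. The Python sketch below uses the abstract's own definition (speedup divided by number of processors); the function names are illustrative and do not come from the thesis.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Parallel efficiency as defined in the abstract: speedup / processors."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


def implied_speedup(efficiency: float, processors: int) -> float:
    """Invert the definition: the speedup implied by a given efficiency."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return efficiency * processors


# 80 percent efficiency on 20 workstations implies a speedup of about 16
# relative to a single workstation.
print(round(implied_speedup(0.80, 20), 6))
```

Under that definition, the reported 80 percent efficiency on 20 workstations corresponds to running roughly 16 times faster than on one workstation.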

AIM-1526

Author[s]: Kenji Nagao and Berthold Horn

Direct Object Recognition Using No Higher Than Second or Third Order Statistics of the Image

December 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1526.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1526.pdf

Novel algorithms for object recognition are described that directly recover the transformations relating the image to its model. Unlike methods fitting the typical conventional framework, these new methods do not require exhaustive search for each feature correspondence in order to solve for the transformation. Yet they allow simultaneous object identification and recovery of the transformation. Given hypothesized potentially corresponding regions in the model and data (2D views) --- which are from planar surfaces of the 3D objects --- these methods allow direct computation of the parameters of the transformation by which the data may be generated from the model. We propose two algorithms: one based on invariants derived from no higher than second and third order moments of the image, the other via a combination of the affine properties of geometrical and the differential attributes of the image. Empirical results on natural images demonstrate the effectiveness of the proposed algorithms. A sensitivity analysis of the algorithm is presented. We demonstrate in particular that the differential method is quite stable against perturbations --- although not without some error --- when compared with conventional methods. We also demonstrate mathematically that even a single point correspondence suffices, theoretically at least, to recover affine parameters via the differential method.

AITR-1524

Author[s]: Matthew M. Williamson

Series Elastic Actuators

September 7, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1524.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1524.pdf

This thesis presents the design, construction, control and evaluation of a novel force controlled actuator. Traditional force controlled actuators are designed from the premise that "Stiffer is better". This approach gives a high bandwidth system, prone to problems of contact instability, noise, and low power density. The actuator presented in this thesis is designed from the premise that "Stiffness isn't everything". The actuator, which incorporates a series elastic element, trades off achievable bandwidth for gains in stable, low noise force control, and protection against shock loads. This thesis reviews related work in robot force control, presents theoretical descriptions of the control and expected performance from a series elastic actuator, and describes the design of a test actuator constructed to gather performance data. Finally the performance of the system is evaluated by comparing the performance data to theoretical predictions.

AIM-1523

Author[s]: Kenji Nagao and Eric Grimson

Recognizing 3D Object Using Photometric Invariant

April 22, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1523.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1523.pdf

In this paper we describe a new efficient algorithm for recognizing 3D objects by combining photometric and geometric invariants. Some photometric properties are derived that are invariant to changes of illumination and to relative object motion with respect to the camera and/or the lighting source in 3D space. We argue that conventional color constancy algorithms cannot be used in the recognition of 3D objects. Further, we show that recognition does not require full constancy of colors; rather, it only needs something that remains unchanged under the varying light conditions and poses of the objects. Combining the derived color invariants and the spatial constraints on the object surfaces, we identify corresponding positions in the model and the data space coordinates, using centroid invariance of corresponding groups of feature positions. Tests are given to show the stability and efficiency of our approach to 3D object recognition.

AIM-1522

CBCL-110

Author[s]: David A. Cohn, Zoubin Ghahramani and Michael I. Jordan

Active Learning with Statistical Models

March 21, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1522.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1522.pdf

For many types of learners one can compute the statistically 'optimal' way to select data. We review how these techniques have been used with feedforward neural networks. We then show how the same principles may be used to select data for two alternative, statistically-based learning architectures: mixtures of Gaussians and locally weighted regression. While the techniques for neural networks are expensive and approximate, the techniques for mixtures of Gaussians and locally weighted regression are both efficient and accurate.

AIM-1521

CBCL-112

Author[s]: Kah Kay Sung and Tomaso Poggio

Example Based Learning for View-Based Human Face Detection

January 24, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1521.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1521.pdf

We present an example-based learning approach for locating vertical frontal views of human faces in complex scenes. The technique models the distribution of human face patterns by means of a few view-based "face" and "non-face" prototype clusters. At each image location, the local pattern is matched against the distribution-based model, and a trained classifier determines, based on the local difference measurements, whether or not a human face exists at the current image location. We provide an analysis that helps identify the critical components of our system.

AIM-1520

CBCL-111

Author[s]: Michael Jordan and Lei Xu

On Convergence Properties of the EM Algorithm for Gaussian Mixtures

April 21, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1520.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1520.pdf

"Expectation-Maximization'' (EM) algorithm and gradient-based approaches for maximum likelihood learning of finite Gaussian mixtures. We show that the EM step in parameter space is obtained from the gradient via a projection matrix $P$, and we provide an explicit expression for the matrix. We then analyze the convergence of EM in terms of special properties of $P$ and provide new results analyzing the effect that $P$ has on the likelihood surface. Based on these mathematical results, we present a comparative discussion of the advantages and disadvantages of EM and other algorithms for the learning of Gaussian mixture models.

AIM-1518

CBCL-106

Author[s]: Pawan Sinha and Tomaso Poggio

View-Based Strategies for 3D Object Recognition

April 21, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1518.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1518.pdf

A persistent issue of debate in the area of 3D object recognition concerns the nature of the experientially acquired object models in the primate visual system. One prominent proposal in this regard has expounded the use of object centered models, such as representations of the objects’ 3D structures in a coordinate frame independent of the viewing parameters [Marr and Nishihara, 1978]. In contrast to this is another proposal which suggests that the viewing parameters encountered during the learning phase might be inextricably linked to subsequent performance on a recognition task [Tarr and Pinker, 1989; Poggio and Edelman, 1990]. The ‘object model’, according to this idea, is simply a collection of the sample views encountered during training. Given that object centered recognition strategies have the attractive feature of leading to viewpoint independence, they have garnered much of the research effort in the field of computational vision. Furthermore, since human recognition performance seems remarkably robust in the face of imaging variations [Ellis et al., 1989], it has often been implicitly assumed that the visual system employs an object centered strategy. In the present study we examine this assumption more closely. Our experimental results with a class of novel 3D structures strongly suggest the use of a view-based strategy by the human visual system even when it has the opportunity of constructing and using object-centered models. In fact, for our chosen class of objects, the results seem to support a stronger claim: 3D object recognition is 2D view-based.

AIM-1517

CBCL-137

Author[s]: Douglas A. Jones, Robert C. Berwick, Franklin Cho, Zeeshan Khan, Karen T. Kohl, Naoyuki Nomura, Anand Radhakrishnan, Ulrich Sauerland and Brian Ulicny

Verb Classes and Alternations in Bangla, German, English, and Korean

May 6, 1996

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1517.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1517.pdf

In this report, we investigate the relationship between the semantic and syntactic properties of verbs. Our work is based on the English Verb Classes and Alternations of (Levin, 1993). We explore how these classes are manifested in other languages, in particular, in Bangla, German, and Korean. Our report includes a survey and classification of several hundred verbs from these languages into the cross-linguistic equivalents of Levin's classes. We also explore ways in which our findings may be used to enhance WordNet in two ways: making the English syntactic information of WordNet more fine-grained, and making WordNet multilingual.

AIM-1516

CBCL-115

Author[s]: Partha Niyogi and Robert Berwick

The Logical Problem of Language Change

December 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1516.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1516.pdf

This paper considers the problem of language change. Linguists must explain not only how languages are learned but also how and why they have evolved along certain trajectories and not others. While the language learning problem has focused on the behavior of individuals and how they acquire a particular grammar from a class of grammars ${\cal G}$, here we consider a population of such learners and investigate the emergent, global population characteristics of linguistic communities over several generations. We argue that language change follows logically from specific assumptions about grammatical theories and learning paradigms. In particular, we are able to transform parameterized theories and memoryless acquisition algorithms into grammatical dynamical systems, whose evolution depicts a population's evolving linguistic composition. We investigate the linguistic and computational consequences of this model, showing that the formalization allows one to ask questions about diachronic evolution that one otherwise could not ask, such as the effect of varying initial conditions on the resulting diachronic trajectories. From a more programmatic perspective, we give an example of how the dynamical system model for language change can serve as a way to distinguish among alternative grammatical theories, introducing a formal diachronic adequacy criterion for linguistic theories.

AIM-1515

CBCL-114

Author[s]: Partha Niyogi and Robert Berwick

A Dynamical Systems Model for Language Change

December 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1515.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1515.pdf

Formalizing linguists' intuitions of language change as a dynamical system, we quantify the time course of language change including sudden vs. gradual changes in languages. We apply the computer model to the historical loss of Verb Second from Old French to modern French, showing that otherwise adequate grammatical theories can fail our new evolutionary criterion.

AIM-1514

CBCL-113

Author[s]: Partha Niyogi

Sequential Optimal Recovery: A Paradigm for Active Learning

May 12, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1514.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1514.pdf

In most classical frameworks for learning from examples, it is assumed that examples are randomly drawn and presented to the learner. In this paper, we consider the possibility of a more active learner who is allowed to choose his/her own examples. Our investigations are carried out in a function approximation setting. In particular, using arguments from optimal recovery (Micchelli and Rivlin, 1976), we develop an adaptive sampling strategy (equivalent to adaptive approximation) for arbitrary approximation schemes. We provide a general formulation of the problem and show how it can be regarded as sequential optimal recovery. We demonstrate the application of this general formulation to two special cases of functions on the real line: 1) monotonically increasing functions and 2) functions with bounded derivative. An extensive investigation of the sample complexity of approximating these functions is conducted, yielding both theoretical and empirical results on test functions. Our theoretical results (stated in PAC-style), along with the simulations, demonstrate the superiority of our active scheme over both passive learning and classical optimal recovery. The analysis of active function approximation is conducted in a worst-case setting, in contrast with other Bayesian paradigms obtained from optimal design (Mackay, 1992).
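
As a rough illustration of what an active learner for monotonically increasing functions might look like, the Python sketch below repeatedly queries the midpoint of the interval in which the unknown function is least constrained. The uncertainty measure (the area of the box that a monotone function must stay inside between two known samples), the midpoint rule, and all names are assumptions made for this example; they are not taken from the memo.

```python
def active_sample_monotone(f, a, b, n_queries):
    """Adaptively sample a monotonically increasing function f on [a, b].

    Assumed strategy (illustrative only): between two neighbouring samples
    (x0, y0) and (x1, y1), a monotone function is confined to the box
    [x0, x1] x [y0, y1]; query the midpoint of the interval with the
    largest box area, where the function is least constrained.
    """
    xs = [a, b]
    ys = [f(a), f(b)]
    for _ in range(n_queries):
        areas = [
            (xs[i + 1] - xs[i]) * (ys[i + 1] - ys[i])
            for i in range(len(xs) - 1)
        ]
        # Index of the most uncertain interval (first one on ties).
        i = max(range(len(areas)), key=areas.__getitem__)
        x_new = 0.5 * (xs[i] + xs[i + 1])
        xs.insert(i + 1, x_new)
        ys.insert(i + 1, f(x_new))
    return xs, ys


def step(x):
    """A monotone test function with a jump at x = 0.7."""
    return 0.0 if x < 0.7 else 1.0


xs, ys = active_sample_monotone(step, 0.0, 1.0, 5)
print(xs)
```

On the step function, the queries cluster around the jump, whereas a passive learner drawing points at random would spend most of its samples on the flat regions.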

AITR-1513

Author[s]: Ruth Bergman

Learning World Models in Environments with Manifest Causal Structure

May 5, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1513.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1513.pdf

This thesis examines the problem of an autonomous agent learning a causal world model of its environment. Previous approaches to learning causal world models have concentrated on environments that are too "easy" (deterministic finite state machines) or too "hard" (containing much hidden state). We describe a new domain --- environments with manifest causal structure --- for learning. In such environments the agent has an abundance of perceptions of its environment. Specifically, it perceives almost all the relevant information it needs to understand the environment. Many environments of interest have manifest causal structure and we show that an agent can learn the manifest aspects of these environments quickly using straightforward learning techniques. We present a new algorithm to learn a rule-based causal world model from observations in the environment. The learning algorithm includes (1) a low level rule-learning algorithm that converges on a good set of specific rules, (2) a concept learning algorithm that learns concepts by finding completely correlated perceptions, and (3) an algorithm that learns general rules. In addition this thesis examines the problem of finding a good expert from a sequence of experts. Each expert has an "error rate"; we wish to find an expert with a low error rate. However, each expert's error rate and the distribution of error rates are unknown. A new expert-finding algorithm is presented and an upper bound on the expected error rate of the expert is derived.

AIM-1512

CBCL-103

Author[s]: M. Poggio and T. Poggio

Cooperative Physics of Fly Swarms: An Emergent Behavior

April 11, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1512.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1512.pdf

We have simulated the behavior of several artificial flies, interacting visually with each other. Each fly is described by a simple tracking system (Poggio and Reichardt, 1973; Land and Collett, 1974) which summarizes behavioral experiments in which individual flies fixate a target. Our main finding is that the interaction of these simple modules gives rise to a variety of relatively complex behaviors. In particular, we observe a swarm-like behavior of a group of many artificial flies for certain reasonable ranges of our tracking system parameters.

AITR-1511

Author[s]: Ian Horswill

Specialization of Perceptual Processes

April 22, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1511.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1511.pdf

In this report, I discuss the use of vision to support concrete, everyday activity. I will argue that a variety of interesting tasks can be solved using simple and inexpensive vision systems. I will provide a number of working examples in the form of a state-of-the-art mobile robot, Polly, which uses vision to give primitive tours of the seventh floor of the MIT AI Laboratory. By current standards, the robot has a broad behavioral repertoire and is both simple and inexpensive (the complete robot was built for less than $20,000 using commercial board-level components). The approach I will use will be to treat the structure of the agent's activity---its task and environment---as positive resources for the vision system designer. By performing a careful analysis of task and environment, the designer can determine a broad space of mechanisms which can perform the desired activity. My principal thesis is that for a broad range of activities, the space of applicable mechanisms will be broad enough to include a number of mechanisms which are simple and economical. The simplest mechanisms that solve a given problem will typically be quite specialized to that problem. One thus worries that building simple vision systems will require a great deal of ad hoc engineering that cannot be transferred to other problems. My second thesis is that specialized systems can be analyzed and understood in a principled manner, one that allows general lessons to be extracted from specialized systems. I will present a general approach to analyzing specialization through the use of transformations that provably improve performance. By demonstrating a sequence of transformations that derive a specialized system from a more general one, we can summarize the specialization of the former in a compact form that makes explicit the additional assumptions that it makes about its environment. The summary can be used to predict the performance of the system in novel environments. Individual transformations can be recycled in the design of future systems.

AIM-1510

CBCL-109

Author[s]: Margrit Betke and Nicholas Makris

Fast Object Recognition in Noisy Images Using Simulated Annealing

January 25, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1510.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1510.pdf

A fast simulated annealing algorithm is developed for automatic object recognition. The normalized correlation coefficient is used as a measure of the match between a hypothesized object and an image. Templates are generated on-line during the search by transforming model images. Simulated annealing reduces the search time by orders of magnitude with respect to an exhaustive search. The algorithm is applied to the problem of how landmarks, for example, traffic signs, can be recognized by an autonomous vehicle or a navigating robot. The algorithm works well in noisy, real-world images of complicated scenes for model images with high information content.
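
For readers unfamiliar with the match measure named in this abstract, the Python sketch below computes a normalized correlation coefficient between a template and an equally sized image window, both flattened to lists of pixel intensities. It uses the common mean-subtracted (Pearson-style) form; the memo's exact definition may differ, and the function name is illustrative.

```python
import math


def normalized_correlation(template, window):
    """Normalized correlation coefficient of two equal-length intensity lists.

    Returns a value in [-1, 1]; 1 means the window matches the template up
    to a positive linear change in brightness and contrast.
    """
    if len(template) != len(window) or not template:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(template)
    mean_t = sum(template) / n
    mean_w = sum(window) / n
    num = sum((t - mean_t) * (w - mean_w) for t, w in zip(template, window))
    norm_t = math.sqrt(sum((t - mean_t) ** 2 for t in template))
    norm_w = math.sqrt(sum((w - mean_w) ** 2 for w in window))
    if norm_t == 0.0 or norm_w == 0.0:
        # A constant patch carries no pattern to correlate with.
        return 0.0
    return num / (norm_t * norm_w)


template = [1.0, 2.0, 3.0, 4.0]
brighter = [2.0 * t + 10.0 for t in template]  # same pattern, new lighting
print(normalized_correlation(template, brighter))
```

Because the measure is unchanged by a positive linear rescaling of intensities, it tolerates overall lighting changes between the model image and the scene, which is one reason it is a common choice for template matching.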

AIM-1509

CBCL-108

Author[s]: Zoubin Ghahramani and Michael I. Jordan

Learning from Incomplete Data

January 24, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1509.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1509.pdf

Real-world learning tasks often involve high-dimensional data sets with complex patterns of missing features. In this paper we review the problem of learning from incomplete data from two statistical perspectives---the likelihood-based and the Bayesian. The goal is two-fold: to place current neural network approaches to missing data within a statistical framework, and to describe a set of algorithms, derived from the likelihood-based framework, that handle clustering, classification, and function approximation from incomplete data in a principled and efficient manner. These algorithms are based on mixture modeling and make two distinct appeals to the Expectation-Maximization (EM) principle (Dempster, Laird, and Rubin 1977)---both for the estimation of mixture components and for coping with the missing data.

AIM-1506

CBCL-104

Author[s]: Pawan Sinha

Reciprocal Interactions Between Motion and Form Perception

April 21, 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AIM-1506.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1506.pdf

The processes underlying the perceptual analysis of visual form are believed to have minimal interaction with those subserving the perception of visual motion (Livingstone and Hubel, 1987; Victor and Conte, 1990). Recent reports of functionally and anatomically segregated parallel streams in the primate visual cortex seem to support this hypothesis (Ungerleider and Mishkin, 1982; Van Essen and Maunsell, 1983; Shipp and Zeki, 1985; Zeki and Shipp, 1988; DeYoe et al., 1994). Here we present perceptual evidence that is at odds with this view and instead suggests strong symmetric interactions between the form and motion processes. In one direction, we show that the introduction of specific static figural elements, say 'F', in a simple motion sequence biases an observer to perceive a particular motion field, say 'M'. In the reverse direction, the imposition of the same motion field 'M' on the original sequence leads the observer to perceive illusory static figural elements 'F'. A specific implication of these findings concerns the possible existence of (what we call) motion end-stopped units in the primate visual system. Such units might constitute part of a mechanism for signalling subjective occluding contours based on motion-field discontinuities.

AITR-1504

Author[s]: Robert Playter

Passive Dynamics in the Control of Gymnastic Maneuvers

March 1995

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1504.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1504.pdf

The control of aerial gymnastic maneuvers is challenging because these maneuvers frequently involve complex rotational motion and because the performer has limited control of the maneuver during flight. A performer can influence a maneuver using a sequence of limb movements during flight. However, the same sequence may not produce reliable performances in the presence of off-nominal conditions. How do people compensate for variations in performance to reliably produce aerial maneuvers? In this report I explore the role that passive dynamic stability may play in making the performance of aerial maneuvers simple and reliable. I present a control strategy comprised of active and passive components for performing robot front somersaults in the laboratory. I show that passive dynamics can neutrally stabilize the layout somersault which involves an "inherently unstable" rotation about the intermediate principal axis. And I show that a strategy that uses open loop joint torques plus passive dynamics leads to more reliable 1 1/2 twisting front somersaults in simulation than a strategy that uses prescribed limb motion. Results are presented from laboratory experiments on gymnastic robots, from dynamic simulation of humans and robots, and from linear stability analyses of these systems.

AITR-1500

Author[s]: Saed G. Younis

Asymptotically Zero Energy Computing Using Split-Level Charge Recovery Logic

June 1994

ftp://publications.ai.mit.edu/ai-publications/1500-1999/AITR-1500.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1500.pdf

The dynamic power requirement of CMOS circuits is rapidly becoming a major concern in the design of personal information systems and large computers. In this work we present a number of new CMOS logic families, Charge Recovery Logic (CRL) as well as the much improved Split-Level Charge Recovery Logic (SCRL), within which the transfer of charge between the nodes occurs quasistatically. Operating quasistatically, these logic families have an energy dissipation that drops linearly with operating frequency, i.e., their power consumption drops quadratically with operating frequency as opposed to the linear drop of conventional CMOS. The circuit techniques in these new families rely on constructing an explicitly reversible pipelined logic gate, where the information necessary to recover the energy used to compute a value is provided by computing its logical inverse. Information necessary to uncompute the inverse is available from the subsequent inverse logic stage. We demonstrate the low energy operation of SCRL by presenting the results from the testing of the first fully quasistatic 8 x 8 multiplier chip (SCRL-1) employing SCRL circuit techniques.
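
The frequency scaling stated in this abstract can be spelled out with the standard first-order model of quasistatic (adiabatic) charging. The symbols below ($C$ for node capacitance, $R$ for the resistance of the charging path, $V$ for the voltage swing, $T = 1/f$ for the ramp time, with $T \gg RC$) are assumptions introduced for illustration and are not quoted from the thesis.

```latex
\begin{align*}
E_{\mathrm{conv}} &\approx \tfrac{1}{2} C V^{2},
  & P_{\mathrm{conv}} &= E_{\mathrm{conv}}\, f \;\propto\; f, \\
E_{\mathrm{quasi}} &\approx \frac{RC}{T}\, C V^{2} = R C^{2} V^{2} f,
  & P_{\mathrm{quasi}} &= E_{\mathrm{quasi}}\, f \;\propto\; f^{2}.
\end{align*}
```

In this model, halving the operating frequency halves the power of a conventional CMOS gate but cuts the power of a quasistatic gate to roughly a quarter, which is the linear-versus-quadratic contrast the abstract describes.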

AIM-1499

Author[s]: Roberto Brunelli

Estimation of Pose and Illuminant Direction for Face Processing

November 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1499.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1499.pdf

In this paper three problems related to the analysis of facial images are addressed: the estimation of the illuminant direction, the compensation of illumination effects and, finally, the recovery of the pose of the face, restricted to in-depth rotations. The solutions proposed for these problems rely on the use of computer graphics techniques to provide images of faces under different illumination and pose, starting from a database of frontal views under frontal illumination.

AITR-1498

Author[s]: Lisa Dron

Computing 3-D Motion in Custom Analog and Digital VLSI

Nov 28, 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1498.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1498.pdf

This thesis examines a complete design framework for a real-time, autonomous system with specialized VLSI hardware for computing 3-D camera motion. In the proposed architecture, the first step is to determine point correspondences between two images. Two processors, a CCD array edge detector and a mixed analog/digital binary block correlator, are proposed for this task. The report is divided into three parts. Part I covers the algorithmic analysis; part II describes the design and test of a 32$\times$32 CCD edge detector fabricated through MOSIS; and part III compares the design of the mixed analog/digital correlator to a fully digital implementation.

AITR-1495

Author[s]: Maja J. Mataric

Interaction and Intelligent Behavior

August 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1495.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1495.pdf

We introduce basic behaviors as primitives for control and learning in situated, embodied agents interacting in complex domains. We propose methods for selecting, formally specifying, algorithmically implementing, empirically evaluating, and combining behaviors from a basic set. We also introduce a general methodology for automatically constructing higher-level behaviors by learning to select from this set. Based on a formulation of reinforcement learning using conditions, behaviors, and shaped reinforcement, our approach makes behavior selection learnable in noisy, uncertain environments with stochastic dynamics. All described ideas are validated with groups of up to 20 mobile robots performing safe-wandering, following, aggregation, dispersion, homing, flocking, foraging, and learning to forage.

AIM-1494

Author[s]: Sebastian Toelg and Tomaso Poggio

Towards an Example-Based Image Compression Architecture for Video-Conferencing

June 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1494.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1494.pdf

This paper consists of two major parts. First, we present the outline of a simple approach to a very-low-bandwidth video-conferencing system relying on an example-based hierarchical image compression scheme. In particular, we discuss the use of example images as a model, the number of required examples, faces as a class of semi-rigid objects, a hierarchical model based on decomposition into different time-scales, and the decomposition of face images into patches of interest. In the second part, we present several algorithms for image processing and animation as well as experimental evaluations. Among the original contributions of this paper is an automatic algorithm for pose estimation and normalization. We also review and compare different algorithms for finding the nearest neighbors in a database for a new input as well as a generalized algorithm for blending patches of interest in order to synthesize new images. Finally, we outline the possible integration of several algorithms to illustrate a simple model-based video-conference system.

AITR-1493

Author[s]: Michael H. Coen

SodaBot: A Software Agent Environment and Construction System

Nov 2, 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1493.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1493.pdf

This thesis presents SodaBot, a general-purpose software agent user-environment and construction system. Its primary component is the basic software agent --- a computational framework for building agents which is essentially an agent operating system. We also present a new language for programming the basic software agent whose primitives are designed around human-level descriptions of agent activity. Via this programming language, users can easily implement a wide range of typical software agent applications, e.g. personal on-line assistants and meeting scheduling agents. The SodaBot system has been implemented and tested, and its description comprises the bulk of this thesis.

AITR-1492

Author[s]: John S. Keen

Logging and Recovery in a Highly Concurrent Database

June 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1492.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1492.pdf

This report addresses the problem of fault tolerance to system failures for database systems that are to run on highly concurrent computers. It assumes that, in general, an application may have a wide distribution in the lifetimes of its transactions. Logging remains the method of choice for ensuring fault tolerance. Generational garbage collection techniques manage the limited disk space reserved for log information; this technique does not require periodic checkpoints and is well suited for applications with a broad range of transaction lifetimes. An arbitrarily large collection of parallel log streams provides the necessary disk bandwidth.

AIM-1491

Author[s]: David A. Cohn

Neural Network Exploration Using Optimal Experiment Design

June 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1491.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1491.pdf

We consider the question "How should one act when the only goal is to learn as much as possible?" Building on the theoretical results of Fedorov [1972] and MacKay [1992], we apply techniques from Optimal Experiment Design (OED) to guide the query/action selection of a neural network learner. We demonstrate that these techniques allow the learner to minimize its generalization error by exploring its domain efficiently and completely. We conclude that, while not a panacea, OED-based query/action has much to offer, especially in domains where its high computational costs can be tolerated.

AIM-1489

Author[s]: Amnon Shashua and Nassir Navab

Relative Affine Structure: Canonical Model for 3D from 2D Geometry and Applications

June 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1489.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1489.pdf

We propose an affine framework for perspective views, captured by a single extremely simple equation based on a viewer-centered invariant we call "relative affine structure". Via a number of corollaries of our main results we show that our framework unifies previous work --- including Euclidean, projective and affine --- in a natural and simple way, and introduces new, extremely simple, algorithms for the tasks of reconstruction from multiple views, recognition by alignment, and certain image coding applications.

AITR-1488

Author[s]: Leon Wong

Automated Reasoning About Classical Mechanics

May 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1488.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1488.pdf

In recent years, researchers in artificial intelligence have become interested in replicating human physical reasoning talents in computers. One of the most important skills in this area is predicting how physical systems will behave. This thesis discusses an implemented program that generates algebraic descriptions of how systems of rigid bodies evolve over time. Discussion about the design of this program identifies a physical reasoning paradigm and knowledge representation approach based on mathematical model construction and algebraic reasoning. This paradigm offers several advantages over methods that have become popular in the field, and seems promising for reasoning about a wide variety of classical mechanics problems.

AIM-1487

Author[s]: Andrew Berlin and Rajeev Surati

Partial Evaluation for Scientific Computing: The Supercomputer Toolkit Experience

May 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1487.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1487.pdf

We describe the key role played by partial evaluation in the Supercomputer Toolkit, a parallel computing system for scientific applications that effectively exploits the vast amount of parallelism exposed by partial evaluation. The Supercomputer Toolkit parallel processor and its associated partial evaluation-based compiler have been used extensively by scientists at M.I.T., and have made possible recent results in astrophysics showing that the motion of the planets in our solar system is chaotically unstable.

AIM-1485

Author[s]: Panayotis A. Skordos

Parallel Simulation of Subsonic Fluid Dynamics on a Cluster of Workstations

December 1995

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1485.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1485.pdf

An effective approach for simulating fluid dynamics on a cluster of non-dedicated workstations is presented. The approach uses local interaction algorithms, small communication capacity, and automatic migration of parallel processes from busy hosts to free hosts. The approach is well-suited for simulating subsonic flow problems which involve both hydrodynamics and acoustic waves; for example, the flow of air inside wind musical instruments. Typical simulations achieve 80% parallel efficiency (speedup/processors) using 20 HP-Apollo workstations. Detailed measurements of the parallel efficiency of 2D and 3D simulations are presented, and a theoretical model of efficiency is developed which closely fits the measurements. Two numerical methods of fluid dynamics are tested: explicit finite differences, and the lattice Boltzmann method.

AIM-1479

CBCL-96

Author[s]: Heinrich H. Buelthoff, Shimon Y. Edelman and Michael J. Tarr

How are Three-Dimensional Objects Represented in the Brain?

April 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1479.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1479.pdf

We discuss a variety of object recognition experiments in which human subjects were presented with realistically rendered images of computer-generated three-dimensional objects, with tight control over stimulus shape, surface properties, illumination, and viewpoint, as well as subjects' prior exposure to the stimulus objects. In all experiments recognition performance was: (1) consistently viewpoint dependent; (2) only partially aided by binocular stereo and other depth information; (3) specific to viewpoints that were familiar; (4) systematically disrupted by rotation in depth more than by deforming the two-dimensional images of the stimuli. These results are consistent with recently advanced computational theories of recognition based on view interpolation.

AIM-1476

Author[s]: D.W. Jacobs and T.D. Alter

Uncertainty Propagation in Model-Based Recognition

February 1995

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1476.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1476.pdf

Building robust recognition systems requires a careful understanding of the effects of error in sensed features. Error in these image features results in a region of uncertainty in the possible image location of each additional model feature. We present an accurate, analytic approximation for this uncertainty region when model poses are based on matching three image and model points, for both Gaussian and bounded error in the detection of image points, and for both scaled-orthographic and perspective projection models. This result applies to objects that are fully three-dimensional, where past results considered only two-dimensional objects. Further, we introduce a linear programming algorithm to compute the uncertainty region when poses are based on any number of initial matches. Finally, we use these results to extend, from two-dimensional to three-dimensional objects, robust implementations of alignment, interpretation-tree search, and transformation clustering.

AIM-1474

Author[s]: Margrit Betke, Ronald L. Rivest and Mona Singh

Piecemeal Learning of an Unknown Environment

March 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1474.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1474.pdf

We introduce a new learning problem: learning a graph by piecemeal search, in which the learner must return every so often to its starting point (for refueling, say). We present two linear-time piecemeal-search algorithms for learning city-block graphs: grid graphs with rectangular obstacles.

AIM-1473

Author[s]: N.K. Logothetis, J. Pauls and T. Poggio

Viewer-Centered Object Recognition in Monkeys

April 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1473.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1473.pdf

How does the brain recognize three-dimensional objects? We trained monkeys to recognize computer-rendered objects presented from an arbitrarily chosen training view, and subsequently tested their ability to generalize recognition for other views. Our results provide additional evidence in favor of a recognition model that accomplishes view-invariant performance by storing a limited number of object views or templates together with the capacity to interpolate between the templates (Poggio and Edelman, 1990).

AIM-1472

Author[s]: Nikos K. Logothetis, Thomas Vetter, Anya Hurlbert and Tomaso Poggio

View-Based Models of 3D Object Recognition and Class-Specific Invariances

April 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1472.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1472.pdf

This paper describes the main features of a view-based model of object recognition. The model tries to capture general properties to be expected in a biological architecture for object recognition. The basic module is a regularization network in which each of the hidden units is broadly tuned to a specific view of the object to be recognized.

AIM-1471

Author[s]: James M. Hutchinson, Andrew Lo and Tomaso Poggio

A Nonparametric Approach to Pricing and Hedging Derivative Securities via Learning Networks

April 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1471.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1471.pdf

We propose a nonparametric method for estimating derivative financial asset pricing formulae using learning networks. To demonstrate feasibility, we first simulate Black-Scholes option prices and show that learning networks can recover the Black-Scholes formula from a two-year training set of daily options prices, and that the resulting network formula can be used successfully to both price and delta-hedge options out-of-sample. For comparison, we estimate models using four popular methods: ordinary least squares, radial basis functions, multilayer perceptrons, and projection pursuit. To illustrate practical relevance, we also apply our approach to S&P 500 futures options data from 1987 to 1991.

AITR-1469

Author[s]: Karen Beth Sarachik

An Analysis of the Effect of Gaussian Error in Object Recognition

February 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1469.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1469.pdf

Object recognition is complicated by clutter, occlusion, and sensor error. Since pose hypotheses are based on image feature locations, these effects can lead to false negatives and positives. In a typical recognition algorithm, pose hypotheses are tested against the image, and a score is assigned to each hypothesis. We use a statistical model to determine the score distribution associated with correct and incorrect pose hypotheses, and use binary hypothesis testing techniques to distinguish between them. Using this approach we can compare algorithms and noise models, and automatically choose values for internal system thresholds to minimize the probability of making a mistake.

AIM-1468

Author[s]: Whitman Richards and Jan J. Koenderink

Trajectory Mapping ("TM"): A New Non-Metric Scaling Technique

December 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1468.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1468.pdf

Trajectory Mapping ("TM") is a new scaling technique designed to recover the parameterizations, axes, and paths used to traverse a feature space. Unlike Multidimensional Scaling (MDS), there is no assumption that the space is homogeneous or metric. Although some metric ordering information is obtained with TM, the main output is the feature parameterizations that partition the given domain of object samples into different categories. Following an introductory example, the technique is further illustrated using first a set of colors and then a collection of textures taken from Brodatz (1966).

AIM-1467

Author[s]: Partha Niyogi and Federico Girosi

On the Relationship Between Generalization Error, Hypothesis Complexity, and Sample Complexity for Radial Basis Functions

February 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1467.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1467.pdf

In this paper, we bound the generalization error of a class of Radial Basis Function networks, for certain well-defined function learning tasks, in terms of the number of parameters and number of examples. We show that the total generalization error is partly due to the insufficient representational capacity of the network (because of its finite size) and partly due to insufficient information about the target function (because of the finite number of samples). We make several observations about generalization error which are valid irrespective of the approximation scheme. Our result also sheds light on ways to choose an appropriate network architecture for a particular problem.

AITR-1465

Author[s]: Lynne E. Parker

Heterogeneous Multi-Robot Cooperation

February 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1465.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1465.pdf

This report addresses the problem of achieving cooperation within small- to medium-sized teams of heterogeneous mobile robots. I describe a software architecture I have developed, called ALLIANCE, that facilitates robust, fault tolerant, reliable, and adaptive cooperative control. In addition, an extended version of ALLIANCE, called L-ALLIANCE, is described, which incorporates a dynamic parameter update mechanism that allows teams of mobile robots to improve the efficiency of their mission performance through learning. A number of experimental results of implementing these architectures on both physical and simulated mobile robot teams are described. In addition, this report presents the results of studies of a number of issues in mobile robot cooperation, including fault tolerant cooperative control, adaptive action selection, distributed control, robot awareness of team member actions, improving efficiency through learning, inter-robot communication, action recognition, and local versus global control.

AITR-1464

Author[s]: Nancy S. Pollard

Parallel Methods for Synthesizing Whole-Hand Grasps from Generalized Prototypes

January 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1464.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1464.pdf

This report addresses the problem of acquiring objects using articulated robotic hands. Standard grasps are used to make the problem tractable, and a technique is developed for generalizing these standard grasps to increase their flexibility to variations in the problem geometry. A generalized grasp description is applied to a new problem situation using a parallel search through hand configuration space, and the result of this operation is a global overview of the space of good solutions. The techniques presented in this report have been implemented, and the results are verified using the Salisbury three-finger robotic hand.

AIM-1463

Author[s]: Kanji Nagao and W. Eric L. Grimson

Object Recognition By Alignment Using Invariant Projections of Planar Surfaces

February 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1463.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1463.pdf

In order to recognize an object in an image, we must determine the best transformation from object model to the image. In this paper, we show that for features from coplanar surfaces which undergo linear transformations in space, there exist projections invariant to the surface motions up to rotations in the image field. To use this property, we propose a new alignment approach to object recognition based on centroid alignment of corresponding feature groups. This method uses only a single pair of 2D model and data. Experimental results show the robustness of the proposed method against perturbations of feature positions.

AIM-1462

Author[s]: James S. Miller and Guillermo J. Rozas

Garbage Collection is Fast, But a Stack is Faster

March 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1462.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1462.pdf

Prompted by claims that garbage collection can outperform stack allocation when sufficient physical memory is available, we present a careful analysis and set of cross-architecture measurements comparing these two approaches for the implementation of continuation (procedure call) frames. When the frames are allocated on a heap they require additional space, increase the amount of data transferred between memory and registers, and, on current architectures, require more instructions. We find that stack allocation of continuation frames outperforms heap allocation in some cases by almost a factor of three. Thus, stacks remain an important implementation technique for procedure calls, even in the presence of an efficient, compacting garbage collector and large amounts of memory.

AIM-1461

Author[s]: David J. Beymer

Face Recognition Under Varying Pose

December 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1461.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1461.pdf

While researchers in computer vision and pattern recognition have worked on automatic techniques for recognizing faces for the last 20 years, most systems specialize in frontal views of the face. We present a face recognizer that works under varying pose, the difficult part of which is to handle face rotations in depth. Building on successful template-based systems, our basic approach is to represent faces with templates from multiple model views that cover different poses from the viewing sphere. Our system has achieved a recognition rate of 98% on a database of 62 people containing 10 testing and 15 modelling views per person.

AITR-1459

Author[s]: Peter R. Nuth

The Named-State Register File

August 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1459.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1459.pdf

This thesis introduces the Named-State Register File, a fine-grain, fully-associative register file. The NSF allows fast context switching between concurrent threads as well as efficient sequential program performance. The NSF holds more live data than conventional register files, and requires less spill and reload traffic to switch between contexts. This thesis demonstrates an implementation of the Named-State Register File and estimates the access time and chip area required for different organizations. Architectural simulations of large sequential and parallel applications show that the NSF can reduce execution time by 9% to 17% compared to alternative register files.

AIM-1458

Author[s]: Michael I. Jordan and Lei Xu

Convergence Results for the EM Approach to Mixtures of Experts Architectures

November 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1458.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1458.pdf

The Expectation-Maximization (EM) algorithm is an iterative approach to maximum likelihood parameter estimation. Jordan and Jacobs (1993) recently proposed an EM algorithm for the mixture of experts architecture of Jacobs, Jordan, Nowlan and Hinton (1991) and the hierarchical mixture of experts architecture of Jordan and Jacobs (1992). They showed empirically that the EM algorithm for these architectures yields significantly faster convergence than gradient ascent. In the current paper we provide a theoretical analysis of this algorithm. We show that the algorithm can be regarded as a variable metric algorithm with its searching direction having a positive projection on the gradient of the log likelihood. We also analyze the convergence of the algorithm and provide an explicit expression for the convergence rate. In addition, we describe an acceleration technique that yields a significant speedup in simulation experiments.

AITR-1457

Author[s]: James M. Hutchinson

A Radial Basis Function Approach to Financial Time Series Analysis

December 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1457.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1457.pdf

Nonlinear multivariate statistical techniques on fast computers offer the potential to capture more of the dynamics of the high dimensional, noisy systems underlying financial markets than traditional models, while making fewer restrictive assumptions. This thesis presents a collection of practical techniques to address important estimation and confidence issues for Radial Basis Function networks arising from such a data driven approach, including efficient methods for parameter estimation and pruning, a pointwise prediction error estimator, and a methodology for controlling the "data mining" problem. Novel applications in the finance area are described, including customized, adaptive option pricing and stock price prediction.

AITR-1456

Author[s]: Jeffrey M. Siskind

Naive Physics, Event Perception, Lexical Semantics, and Language Acquisition

April 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1456.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1456.pdf

This thesis proposes a computational model of how children may come to learn the meanings of words in their native language. The proposed model is divided into two separate components. One component produces semantic descriptions of visually observed events while the other correlates those descriptions with co-occurring descriptions of those events in natural language. The first part of this thesis describes three implementations of the correlation process whereby representations of the meanings of whole utterances can be decomposed into fragments assigned as representations of the meanings of individual words. The second part of this thesis describes an implemented computer program that recognizes the occurrence of simple spatial motion events in simulated video input.

AITR-1455

Author[s]: Martha J. Hiller

The Role of Chemical Mechanisms in Neural Computation and Learning

May 23, 1995

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1455.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1455.pdf

Most computational models of neurons assume that their electrical characteristics are of paramount importance. However, all long-term changes in synaptic efficacy, as well as many short-term effects, are mediated by chemical mechanisms. This technical report explores the interaction between electrical and chemical mechanisms in neural learning and development. Two neural systems that exemplify this interaction are described and modelled. The first is the mechanisms underlying habituation, sensitization, and associative learning in the gill withdrawal reflex circuit in Aplysia, a marine snail. The second is the formation of retinotopic projections in the early visual pathway during embryonic development.

AITR-1453

Author[s]: Carl de Marcken

Methods for Parallelizing Search Paths in Parsing

January 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1453.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1453.pdf

Many search problems are commonly solved with combinatoric algorithms that unnecessarily duplicate and serialize work at considerable computational expense. There are techniques available that can eliminate redundant computations and perform remaining operations concurrently, effectively reducing the branching factors of these algorithms. This thesis applies these techniques to the problem of parsing natural language. The result is an efficient programming language that can reduce some of the expense associated with principle-based parsing and other search problems. The language is used to implement various natural language parsers, and the improvements are compared to those that result from implementing more deterministic theories of language processing.

AIM-1452

Author[s]: Amnon Shashua

Algebraic Functions For Recognition

January 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1452.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1452.pdf

In the general case, a trilinear relationship between three perspective views is shown to exist. The trilinearity result is shown to be of much practical use in visual recognition by alignment --- yielding a direct method that cuts through the computations of camera transformation, scene structure and epipolar geometry. The proof of the central result may be of further interest as it demonstrates certain regularities across homographies of the plane and introduces new view invariants. Experiments on simulated and real image data were conducted, including a comparative analysis with epipolar intersection and the linear combination methods, with results indicating a greater degree of robustness in practice and a higher level of performance in re-projection tasks.

AITR-1451

Author[s]: Matthew Birkholz

Emacs Lisp in Edwin Scheme

September 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1451.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1451.pdf

The MIT-Scheme program development environment includes a general-purpose text editor, Edwin, that has an extension language, Edwin Scheme. Edwin is very similar to another general-purpose text editor, GNU Emacs, which also has an extension language, Emacs Lisp. The popularity of GNU Emacs has led to a large library of tools written in Emacs Lisp. The goal of this thesis is to implement a useful subset of Emacs Lisp in Edwin Scheme. This subset was chosen to be sufficient for simple operation of the GNUS news reading program.

AITR-1450

Author[s]: Daniel M. Albro

AMAR: A Computational Model of Autosegmental Phonology

October 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1450.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1450.pdf

This report describes a computational system with which phonologists may describe a natural language in terms of autosegmental phonology, currently the most advanced theory pertaining to the sound systems of human languages. This system allows linguists to easily test autosegmental hypotheses against a large corpus of data. The system was designed primarily with tonal systems in mind, but also provides support for tree or feature matrix representation of phonemes (as in The Sound Pattern of English), as well as syllable structures and other aspects of phonological theory. Underspecification is allowed, and trees may be specified before, during, and after rule application. The association convention is automatically applied, and other principles such as the conjunctivity condition are supported. The method of representation was designed such that rules are designated in as close a fashion as possible to the existing conventions of autosegmental theory while adhering to a textual constraint for maximum portability.

AIM-1449

Author[s]: Partha Niyogi and Robert C. Berwick

Formalizing Triggers: A Learning Model for Finite Spaces

November 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1449.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1449.pdf

In a recent seminal paper, Gibson and Wexler (1993) take important steps toward formalizing the notion of language learning in a (finite) space whose grammars are characterized by a finite number of parameters. They introduce the Triggering Learning Algorithm (TLA) and show that even in finite space convergence may be a problem due to local maxima. In this paper we explicitly formalize learning in finite parameter space as a Markov structure whose states are parameter settings. We show that this captures the dynamics of TLA completely and allows us to explicitly compute the rates of convergence for TLA and other variants of TLA, e.g. random walk. Also included in the paper are a corrected version of GW's central convergence proof, a list of "problem states" in addition to local maxima, and batch and PAC-style learning bounds for the model.

AIM-1448

CBCL-85

Author[s]: Amnon Shashua and Sebastian Toelg

The Quadric Reference Surface: Theory and Applications

June 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1448.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1448.pdf

The conceptual component of this work is about "reference surfaces" which are the dual of reference frames often used for shape representation purposes. The theoretical component of this work involves the question of whether one can find a unique (and simple) mapping that aligns two arbitrary perspective views of an opaque textured quadric surface in 3D, given (i) few corresponding points in the two views, or (ii) the outline conic of the surface in one view (only) and few corresponding points in the two views. The practical component of this work is concerned with applying the theoretical results as tools for the task of achieving full correspondence between views of arbitrary objects.

AIM-1447

CBCL-101

Author[s]: Takaya Miyano and Federico Girosi

Forecasting Global Temperature Variations by Neural Networks

August 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1447.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1447.pdf

Global temperature variations between 1861 and 1984 are forecast using regularization networks, multilayer perceptrons and linear autoregression. The regularization network, optimized by stochastic gradient descent associated with colored noise, gives the best forecasts. For all the models, prediction errors noticeably increase after 1965. These results are consistent with the hypothesis that the climate dynamics is characterized by low-dimensional chaos and that it may have changed at some point after 1965, which is also consistent with the recent idea of climate change.

AITR-1445

Author[s]: Andre DeHon

Robust, High-Speed Network Design for Large-Scale Multiprocessing

September 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1445.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1445.pdf

As multiprocessor system size scales upward, two important aspects of multiprocessor systems will generally get worse rather than better: (1) interprocessor communication latency will increase and (2) the probability that some component in the system will fail will increase. These problems can prevent us from realizing the potential benefits of large-scale multiprocessing. In this report we consider the problem of designing networks which simultaneously minimize communication latency while maximizing fault tolerance. Using a synergy of techniques including connection topologies, routing protocols, signalling techniques, and packaging technologies we assemble integrated, system-level solutions to this network design problem.

AITR-1444

Author[s]: Michael de la Maza

Synthesizing Regularity Exposing Attributes in Large Protein Databases

May 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1444.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1444.pdf

This thesis describes a system that synthesizes regularity exposing attributes from large protein databases. After processing primary and secondary structure data, this system discovers an amino acid representation that captures what are thought to be the three most important amino acid characteristics (size, charge, and hydrophobicity) for tertiary structure prediction. A neural network trained using this 16 bit representation achieves a performance accuracy on the secondary structure prediction problem that is comparable to the one achieved by a neural network trained using the standard 24 bit amino acid representation. In addition, the thesis describes bounds on secondary structure prediction accuracy, derived using an optimal learning algorithm and the probably approximately correct (PAC) model.

AITR-1443

Author[s]: Cynthia Ferrell

Robust Agent Control of an Autonomous Robot with Many Sensors and Actuators

May 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1443.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1443.pdf

This thesis presents methods for implementing robust hexapod locomotion on an autonomous robot with many sensors and actuators. The controller is based on the Subsumption Architecture and is fully distributed over approximately 1500 simple, concurrent processes. The robot, Hannibal, weighs approximately 6 pounds and is equipped with over 100 physical sensors, 19 degrees of freedom, and 8 on board computers. We investigate the following topics in depth: distributed control of a complex robot, insect-inspired locomotion control for gait generation and rough terrain mobility, and fault tolerance. The controller was implemented, debugged, and tested on Hannibal. Through a series of experiments, we examined Hannibal's gait generation, rough terrain locomotion, and fault tolerance performance. These results demonstrate that Hannibal exhibits robust, flexible, real-time locomotion over a variety of terrain and tolerates a multitude of hardware failures.

AITR-1442

Author[s]: J. Brian Subirana-Vilanova

Mid-Level Vision and Recognition of Non-Rigid Objects

April 1995

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1442.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1442.pdf

We address mid-level vision for the recognition of non-rigid objects. We align model and image using frame curves - which are object or "figure/ground" skeletons. Frame curves are computed, without discontinuities, using Curved Inertia Frames, a provably global scheme implemented on the Connection Machine, based on: non-Cartesian networks; a definition of curved axis of inertia; and a ridge detector. I present evidence against frame alignment in human perception. This suggests: frame curves have a role in figure/ground segregation and in fuzzy boundaries; their outside/near/top/incoming regions are more salient; and that perception begins by setting a reference frame (prior to early vision), and proceeds by processing convex structures.

AIM-1441

CBCL-84

Author[s]: Tommi Jaakkola, Michael I. Jordan and Satinder P. Singh

On the Convergence of Stochastic Iterative Dynamic Programming Algorithms

August 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1441.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1441.pdf

Recent developments in the area of reinforcement learning have yielded a number of new algorithms for the prediction and control of Markovian environments. These algorithms, including the TD(lambda) algorithm of Sutton (1988) and the Q-learning algorithm of Watkins (1989), can be motivated heuristically as approximations to dynamic programming (DP). In this paper we provide a rigorous proof of convergence of these DP-based learning algorithms by relating them to the powerful techniques of stochastic approximation theory via a new convergence theorem. The theorem establishes a general class of convergent algorithms to which both TD(lambda) and Q-learning belong.
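
For readers unfamiliar with the algorithms named in this abstract, the following is a minimal sketch of tabular Q-learning viewed as a stochastic approximation to the DP backup. It is an illustration only, not code from the memo: the four-state chain environment, the function names, the uniformly random behaviour policy, and the step size and discount values are all assumptions chosen for the example.

```python
import random


def chain_step(state, action):
    """Hypothetical 4-state chain: action 1 moves right, action 0 moves left.

    Reaching state 3 yields reward 1.0 and ends the episode.
    """
    next_state = state + 1 if action == 1 else max(0, state - 1)
    if next_state == 3:
        return next_state, 1.0, True
    return next_state, 0.0, False


def q_learning(n_states, n_actions, step, episodes=2000,
               alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning with a uniformly random behaviour policy.

    Q-learning is off-policy, so the greedy values can be learned
    while the agent acts at random.
    """
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(n_actions)
            next_state, reward, done = step(state, action)
            # Sampled version of the DP backup r + gamma * max_a' Q(s', a').
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning(4, 2, chain_step)
```

In this deterministic toy problem the learned values approach the DP fixed point, for example a value near gamma squared (0.81) for moving right from the start state.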

AIM-1440

CBCL-83

Author[s]: Michael I. Jordan and Robert A. Jacobs

Hierarchical Mixtures of Experts and the EM Algorithm

August 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1440.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1440.pdf

We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM's). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.

AIM-1439

Author[s]: Rodney Brooks and Lynn A. Stein

Building Brains for Bodies

August 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1439.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1439.pdf

We describe a project to capitalize on newly available levels of computational resources in order to understand human cognition. We will build an integrated physical system including vision, sound input and output, and dextrous manipulation, all controlled by a continuously operating large scale parallel MIMD computer. The resulting system will learn to "think" by building on its bodily experiences to accomplish progressively more abstract tasks. Past experience suggests that in attempting to build such an integrated system we will have to fundamentally change the way artificial intelligence, cognitive science, linguistics, and philosophy think about the organization of intelligence. We expect to be able to better reconcile the theories that will be developed with current work in neuroscience.

AIM-1438

CBCL-116

Author[s]: Kah Kay Sung and Partha Niyogi

A Formulation for Active Learning with Applications to Object Detection

June 6, 1996

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1438.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1438.pdf

We discuss a formulation for active example selection for function learning problems. This formulation is obtained by adapting Fedorov's optimal experiment design to the learning problem. We specifically show how to analytically derive example selection algorithms for certain well defined function classes. We then explore the behavior and sample complexity of such active learning algorithms. Finally, we view object detection as a special case of function learning and show how our formulation reduces to a useful heuristic to choose examples to reduce the generalization error.

AIM-1437

CBCL-82

Author[s]: Reza Shadmehr and Ferdinando Mussa-Ivaldi

Geometric Structure of the Adaptive Controller of the Human Arm

July 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1437.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1437.pdf

The objects with which the hand interacts may significantly change the dynamics of the arm. How does the brain adapt control of arm movements to this new dynamic? We show that adaptation is via composition of a model of the task's dynamics. By exploring generalization capabilities of this adaptation we infer some of the properties of the computational elements with which the brain formed this model: the elements have broad receptive fields and encode the learned dynamics as a map structured in an intrinsic coordinate system closely related to the geometry of the skeletomusculature. The low-level nature of these elements suggests that they may represent a set of primitives with which a movement is represented in the CNS.

AIM-1435

Author[s]: W. Eric L. Grimson

Why Stereo Vision is Not Always About 3D Reconstruction

July 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1435.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1435.pdf

It is commonly assumed that the goal of stereovision is computing explicit 3D scene reconstructions. We show that very accurate camera calibration is needed to support this, and that such accurate calibration is difficult to achieve and maintain. We argue that for tasks like recognition, figure/ground separation is more important than 3D depth reconstruction, and demonstrate a stereo algorithm that supports figure/ground separation without 3D reconstruction.

AITR-1434

Author[s]: Ronald D. Chaney

Feature Extraction Without Edge Detection

September 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1434.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1434.pdf

Information representation is a critical issue in machine vision. The representation strategy in the primitive stages of a vision system has enormous implications for the performance in subsequent stages. Existing feature extraction paradigms, like edge detection, provide sparse and unreliable representations of the image information. In this thesis, we propose a novel feature extraction paradigm. The features consist of salient, simple parts of regions bounded by zero-crossings. The features are dense, stable, and robust. The primary advantage of the features is that they have abstract geometric attributes pertaining to their size and shape. To demonstrate the utility of the feature extraction paradigm, we apply it to passive navigation. We argue that the paradigm is applicable to other early vision problems.

AIM-1433

CBCL-91

Author[s]: Jose L. Marroquin

Measure Fields for Function Approximation

June 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1433.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1433.pdf

The computation of a piecewise smooth function that approximates a finite set of data points may be decomposed into two decoupled tasks: first, the computation of the locally smooth models, and hence, the segmentation of the data into classes that consist of the sets of points best approximated by each model, and second, the computation of the normalized discriminant functions for each induced class. The approximating function may then be computed as the optimal estimator with respect to this measure field. We give an efficient procedure for effecting both computations, and for the determination of the optimal number of components.

AIM-1432

CBCL-81

Author[s]: Philippe G. Schyns and Heinrich H. Bulthoff

Conditions for Viewpoint Dependent Face Recognition

August 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1432.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1432.pdf

Poggio and Vetter (1992) showed that learning one view of a bilaterally symmetric object could be sufficient for its recognition, if this view allows the computation of a symmetric, "virtual," view. Faces are roughly bilaterally symmetric objects. Learning a side view, which always has a symmetric view, should allow for better generalization performance than learning the frontal view. Two psychophysical experiments tested these predictions. Stimuli were views of shaded 3D models of laser-scanned faces. The first experiment tested whether a particular view of a face was canonical. The second experiment tested which single views of a face give rise to the best generalization performance. The results were compatible with the symmetry hypothesis: learning a side view allowed better generalization performance than learning the frontal view.

AIM-1431

CBCL-80

Author[s]: David Beymer, Amnon Shashua and Tomaso Poggio

Example Based Image Analysis and Synthesis

November 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1431.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1431.pdf

Image analysis and graphics synthesis can be achieved with learning techniques that use image examples directly, without physically-based 3D models. In our technique: -- the mapping from novel images to a vector of "pose" and "expression" parameters can be learned from a small set of example images using a function approximation technique that we call an analysis network; -- the inverse mapping from input "pose" and "expression" parameters to output images can be synthesized from a small set of example images and used to produce new images using a similar synthesis network. The techniques described here have several applications in computer graphics, special effects, interactive multimedia and very low bandwidth teleconferencing.

AIM-1430

CBCL-75

Author[s]: Federico Girosi, Michael Jones and Tomaso Poggio

Priors, Stabilizers and Basis Functions: From Regularization to Radial, Tensor and Additive Splines

June 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1430.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1430.pdf

We had previously shown that regularization principles lead to approximation schemes, such as Radial Basis Functions, which are equivalent to networks with one layer of hidden units, called Regularization Networks. In this paper we show that regularization networks encompass a much broader range of approximation schemes, including many of the popular general additive models, Breiman's hinge functions and some forms of Projection Pursuit Regression. In the probabilistic interpretation of regularization, the different classes of basis functions correspond to different classes of prior probabilities on the approximating function spaces, and therefore to different types of smoothness assumptions. In the final part of the paper, we also show a relation between activation functions of the Gaussian and sigmoidal type.

AITR-1429

Author[s]: Christine L. Tsien

Maygen: A Symbolic Debugger Generation System

May 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1429.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1429.pdf

With the development of high-level languages for new computer architectures comes the need for appropriate debugging tools as well. One method for meeting this need would be to develop, from scratch, a symbolic debugger with the introduction of each new language implementation for any given architecture. This, however, seems to require unnecessary duplication of effort among developers. This paper describes Maygen, a "debugger generation system," designed to efficiently provide the desired language-dependent and architecture-dependent debuggers. A prototype of the Maygen system has been implemented and is able to handle the semantically different languages of C and OPAL.

AITR-1427

Author[s]: Guillermo J. Rozas

Translucent Procedures, Abstraction without Opacity

October 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1427.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1427.pdf

This report introduces TRANSLUCENT PROCEDURES as a new mechanism for implementing behavioral abstractions. Like an ordinary procedure, a translucent procedure can be invoked, and thus provides an obvious way to capture a BEHAVIOR. Translucent procedures, like ordinary procedures, can be manipulated as first-class objects and combined using functional composition. But unlike ordinary procedures, translucent procedures have structure that can be examined in well-specified non-destructive ways, without invoking the procedure.

AITR-1426

Author[s]: Gideon P. Stein

Internal Camera Calibration Using Rotation and Geometric Shapes

February 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1426.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1426.pdf

This paper describes a simple method for internal camera calibration for computer vision. This method is based on tracking image features through a sequence of images while the camera undergoes pure rotation. The location of the features relative to the camera or to each other need not be known and therefore this method can be used both for laboratory calibration and for self calibration in autonomous robots working in unstructured environments. A second method of calibration is also presented. This method uses simple geometric objects such as spheres and straight lines to estimate the camera parameters. Calibration is performed using both methods and the results compared.

AITR-1425

Author[s]: Michael E. Caine

The Design of Shape from Motion Constraints

September 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1425.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1425.pdf

This report presents a set of representations, methodologies, and tools for the purpose of visualizing, analyzing and designing functional shapes in terms of constraints on motion. The core of the research is an interactive computational environment that provides an explicit visual representation of motion constraints produced by shape interactions, and a series of tools that allow for the manipulation of motion constraints and their underlying shapes for the purpose of design.

AITR-1424

Author[s]: Charles L. Isbell

Explorations of the Practical Issues of Learning Prediction-Control Tasks Using Temporal Difference Learning Methods

December 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1424.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1424.pdf

There has been recent interest in using temporal difference learning methods to attack problems of prediction and control. While these algorithms have been brought to bear on many problems, they remain poorly understood. It is the purpose of this thesis to further explore these algorithms, presenting a framework for viewing them and raising a number of practical issues and exploring those issues in the context of several case studies. This includes applying the TD(lambda) algorithm to: 1) learning to play tic-tac-toe from the outcome of self-play and of play against a perfectly-playing opponent and 2) learning simple one-dimensional segmentation tasks.

AIM-1423

Author[s]: Neil C. Singer and Warren P. Seering

A Simplified Method for Deriving Equations of Motion For Continuous Systems with Flexible Members

May 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1423.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1423.pdf

A method is proposed for deriving dynamical equations for systems with both rigid and flexible components. During the derivation, each flexible component of the system is represented by a "surrogate element" which captures the response characteristics of that component and is easy to mathematically manipulate. The derivation proceeds essentially as if each surrogate element were a rigid body. Application of an extended form of Lagrange's equation yields a set of simultaneous differential equations which can then be transformed to be the exact, partial differential equations for the original flexible system. This method's use facilitates equation generation either by an analyst or through application of software-based symbolic manipulation.

AIM-1422

Author[s]: Henry M. Wu

A Method for Eliminating Skew Introduced by Non-Uniform Buffer Delay and Wire Lengths in Clock Distribution Trees

April 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1422.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1422.pdf

The computation of a piecewise smooth function that approximates a finite set of data points is decomposed into two decoupled tasks: first, the computation of the locally smooth models, and hence, the segmentation of the data into classes that consist of the sets of points best approximated by each model, and second, the computation of the normalized discriminant functions for each induced class. The approximating function is then computed as the optimal estimator with respect to this measure field. Applications to image processing and time series prediction are presented as well.

AIM-1421

Author[s]: Brian Eberman and S. Kenneth Salisbury

Application of Change Detection to Dynamic Contact Sensing

March 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1421.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1421.pdf

The contact forces that arise during manipulation convey substantial information about the manipulation state. This paper addresses the fundamental problem of interpreting the force signals without any additional manipulation context. Techniques based on forms of the generalized sequential likelihood ratio test are used to segment individual strain signals into statistically equivalent pieces. We report on our experimental development of the segmentation algorithm and on its results for contact states. The sequential likelihood ratio test is reviewed and some of its special cases and optimal properties are discussed. Finally, we conclude by discussing extensions to the techniques and a contact interpretation framework.

AITR-1420

Author[s]: S. Tanveer F. Mahmood

Attentional Selection in Object Recognition

1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1420.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1420.pdf

A key problem in object recognition is selection, namely, the problem of identifying regions in an image within which to start the recognition process, ideally by isolating regions that are likely to come from a single object. Such a selection mechanism has been found to be crucial in reducing the combinatorial search involved in the matching stage of object recognition. Even though selection is of help in recognition, it has largely remained unsolved because of the difficulty in isolating regions belonging to objects under complex imaging conditions involving occlusions, changing illumination, and object appearances. This thesis presents a novel approach to the selection problem by proposing a computational model of visual attentional selection as a paradigm for selection in recognition. In particular, it proposes two modes of attentional selection, namely, attracted and pay attention modes as being appropriate for data and model-driven selection in recognition. An implementation of this model has led to new ways of extracting color, texture and line group information in images, and their subsequent use in isolating areas of the scene likely to contain the model object. Among the specific results in this thesis are: a method of specifying color by perceptual color categories for fast color region segmentation and color-based localization of objects, and a result showing that the recognition of texture patterns on model objects is possible under changes in orientation and occlusions without detailed segmentation. The thesis also presents an evaluation of the proposed model by integrating with a 3D from 2D object recognition system and recording the improvement in performance. These results indicate that attentional selection can significantly overcome the computational bottleneck in object recognition, both due to a reduction in the number of features, and due to a reduction in the number of matches during recognition using the information derived during selection. Finally, these studies have revealed a surprising use of selection, namely, in the partial solution of the pose of a 3D object.

AITR-1417

Author[s]: Patrick Sobalvarro

A Lifetime-based Garbage Collector for LISP Systems on General-Purpose Computers

February 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1417.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1417.pdf

Garbage collector performance in LISP systems on custom hardware has been substantially improved by the adoption of lifetime-based garbage collection techniques. To date, however, successful lifetime-based garbage collectors have required special-purpose hardware, or at least privileged access to data structures maintained by the virtual memory system. I present here a lifetime-based garbage collector requiring no special-purpose hardware or virtual memory system support, and discuss its performance.

AITR-1416

Author[s]: David W. Jacobs

Recognizing 3-D Objects Using 2-D Images

April 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1416.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1416.pdf

We discuss a strategy for visual recognition by forming groups of salient image features, and then using these groups to index into a data base to find all of the matching groups of model features. We discuss the most space efficient possible method of representing 3-D models for indexing from 2-D data, and show how to account for sensing error when indexing. We also present a convex grouping method that is robust and efficient, both theoretically and in practice. Finally, we combine these modules into a complete recognition system, and test its performance on many real images.

AIM-1415

Author[s]: Pawan Sinha

Pattern Motion Perception: Feature Tracking or Integration of Component Motions?

October 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1415.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1415.pdf

A key question regarding primate visual motion perception is whether the motion of 2D patterns is recovered by tracking distinctive localizable features [Lorenceau and Gorea, 1989; Rubin and Hochstein, 1992] or by integrating ambiguous local motion estimates [Adelson and Movshon, 1982; Wilson and Kim, 1992]. For a two-grating plaid pattern, this translates to either tracking the grating intersections or to appropriately combining the motion estimates for each grating. Since both component and feature information are simultaneously available in any plaid pattern made of contrast defined gratings, it is unclear how to determine which of the two schemes is actually used to recover the plaid's motion. To address this problem, we have designed a plaid pattern made with subjective, rather than contrast defined, gratings. The distinguishing characteristic of such a plaid pattern is that it contains no contrast defined intersections that may be tracked. We find that notwithstanding the absence of such features, observers can accurately recover the pattern velocity. Additionally we show that the hypothesis of tracking "illusory features" to estimate pattern motion does not stand up to experimental test. These results present direct evidence in support of the idea that calls for the integration of component motions over the one that mandates tracking localized features to recover 2D pattern motion. The localized features, we suggest, are used primarily as providers of grouping information - which component motion signals to integrate and which not to.

AIM-1414A

Author[s]: Rajeev Surati

Exploiting the Parallelism Exposed by Partial Evaluation

May 1994

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1414A.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1414A.pdf

AIM-1414

Author[s]: Andrew A. Berlin and Rajeev J. Surati

Exploiting the Parallelism Exposed by Partial Evaluation

April 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1414.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1414.pdf

We describe an approach to parallel compilation that seeks to harness the vast amount of fine-grain parallelism that is exposed through partial evaluation of numerically-intensive scientific programs. We have constructed a compiler for the Supercomputer Toolkit parallel processor that uses partial evaluation to break down data abstractions and program structure, producing huge basic blocks that contain large amounts of fine-grain parallelism. We show that this fine-grain parallelism can be effectively utilized even on coarse-grain parallel architectures by selectively grouping operations together so as to adjust the parallelism grain-size to match the inter-processor communication capabilities of the target architecture.

AITR-1412

Author[s]: Jonathan Amsterdam

Automatic Qualitative Modeling of Dynamic Physical Systems

January 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1412.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1412.pdf

This report describes MM, a computer program that can model a variety of mechanical and fluid systems. Given a system's structure and qualitative behavior, MM searches for models using an energy-based modeling framework. MM uses general facts about physical systems to relate behavioral and model properties. These facts enable a more focussed search for models than would be obtained by mere comparison of desired and predicted behaviors. When these facts do not apply, MM uses behavior-constrained qualitative simulation to verify candidate models efficiently. MM can also design experiments to distinguish among multiple candidate models.

AITR-1411

Author[s]: Clay Matthew Thompson

Robust Photo-topography by Fusing Shape-from-Shading and Stereo

February 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1411.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1411.pdf

Methods for fusing two computer vision methods are discussed and several example algorithms are presented to illustrate the variational method of fusing algorithms. The example algorithms seek to determine planet topography given two images taken from two different locations with two different lighting conditions. The algorithms each employ a single cost function that combines the computer vision methods of shape-from-shading and stereo in different ways. The algorithms are closely coupled and take into account all the constraints of the photo-topography problem. The algorithms are run on four synthetic test image sets of varying difficulty.

AITR-1410

Author[s]: Tao Daniel Alter

Robust and Efficient 3D Recognition by Alignment

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1410.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1410.pdf

Alignment is a prevalent approach for recognizing 3D objects in 2D images. A major problem with current implementations is how to robustly handle errors that propagate from uncertainties in the locations of image features. This thesis gives a technique for bounding these errors. The technique makes use of a new solution to the problem of recovering 3D pose from three matching point pairs under weak-perspective projection. Furthermore, the error bounds are used to demonstrate that using line segments for features instead of points significantly reduces the false positive rate, to the extent that alignment can remain reliable even in cluttered scenes.

AIM-1409

CBCL-76

Author[s]: Thomas Vetter, Tomaso Poggio and Heinrich Bülthoff

3D Object Recognition: Symmetry and Virtual Views

December 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1409.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1409.pdf

Many 3D objects in the world around us are strongly constrained. For instance, not only cultural artifacts but also many natural objects are bilaterally symmetric. Theoretical arguments suggest and psychophysical experiments confirm that humans may be better at recognizing symmetric objects. The hypothesis of symmetry-induced virtual views together with a network model that successfully accounts for human recognition of generic 3D objects leads to predictions that we have verified with psychophysical experiments.

AITR-1408

Author[s]: Philip Greenspun

Site Controller: A System for Computer-Aided Civil Engineering and Construction

February 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1408.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1408.pdf

A revolution in earthmoving, a $100 billion industry, can be achieved with three components: the GPS location system, sensors and computers in bulldozers, and SITE CONTROLLER, a central computer system that maintains design data and directs operations. The first two components are widely available; I built SITE CONTROLLER to complete the triangle and describe it here. SITE CONTROLLER assists civil engineers in the design, estimation, and construction of earthworks, including hazardous waste site remediation. The core of SITE CONTROLLER is a site modelling system that represents existing and prospective terrain shapes, roads, hydrology, etc. Around this core are analysis, simulation, and vehicle control tools. Integrating these modules into one program enables civil engineers and contractors to use a single interface and database throughout the life of a project.

AIM-1405

CBCL-78

Author[s]: Amnon Shashua

Geometric and Algebraic Aspects of 3D Affine and Projective Structures from Perspective 2D Views

July 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1405.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1405.pdf

We investigate the differences --- conceptually and algorithmically --- between affine and projective frameworks for the tasks of visual recognition and reconstruction from perspective views. It is shown that an affine invariant exists between any view and a fixed view chosen as a reference view. This implies that for tasks for which a reference view can be chosen, such as in alignment schemes for visual recognition, projective invariants are not really necessary. We then use the affine invariant to derive new algebraic connections between perspective views. It is shown that three perspective views of an object are connected by certain algebraic functions of image coordinates alone (no structure or camera geometry needs to be involved).

AIM-1404

CBCL-77

Author[s]: Tomaso Poggio and Anya Hurlbert

Observations on Cortical Mechanisms for Object Recognition and Learning

December 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1404.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1404.pdf

This paper sketches a hypothetical cortical architecture for visual 3D object recognition based on a recent computational model. The view-centered scheme relies on modules for learning from examples, such as Hyperbf-like networks. Such models capture a class of explanations we call Memory-Based Models (MBM) that contains sparse population coding, memory-based recognition, and codebooks of prototypes. Unlike the sigmoidal units of some artificial neural networks, the units of MBMs are consistent with the description of cortical neurons. We describe how an example of MBM may be realized in terms of cortical circuitry and biophysical mechanisms, consistent with psychophysical and physiological data.

AIM-1403

Author[s]: Gary C. Borchardt

Causal Reconstruction

February 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1403.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1403.pdf

Causal reconstruction is the task of reading a written causal description of a physical behavior, forming an internal model of the described activity, and demonstrating comprehension through question answering. This task is difficult because written descriptions often do not specify exactly how referenced events fit together. This article (1) characterizes the causal reconstruction problem, (2) presents a representation called transition space, which portrays events in terms of "transitions," or collections of changes expressible in everyday language, and (3) describes a program called PATHFINDER, which uses the transition space representation to perform causal reconstruction on simplified English descriptions of physical activity.

AIM-1402

Author[s]: Athanassios G. Siapas

A Global Approach to Parameter Estimation of Chaotic Dynamical Systems

December 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1402.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1402.pdf

We present a novel approach to parameter estimation of systems with complicated dynamics, as well as evidence for the existence of a universal power law that enables us to quantify the dependence of global geometry on small changes in the parameters of the system. This power law gives rise to what seems to be a new dynamical system invariant.

AITR-1401

Author[s]: Amnon Shashua

Geometry and Photometry in 3D Visual Recognition

November 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1401.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1401.pdf

The report addresses the problem of visual recognition under two sources of variability: geometric and photometric. The geometric deals with the relation between 3D objects and their views under orthographic and perspective projection. The photometric deals with the relation between 3D matte objects and their images under changing illumination conditions. Taken together, an alignment-based method is presented for recognizing objects viewed from arbitrary viewing positions and illuminated by arbitrary settings of light sources.

AIM-1399

Author[s]: S. Tanveer F. Mahmood

Data and Model-Driven Selection Using Parallel-Line Groups

May 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1399.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1399.pdf

A key problem in model-based object recognition is selection, namely, the problem of isolating regions in an image that are likely to come from a single object. This isolation can be either based solely on image data (data-driven) or can incorporate the knowledge of the model object (model-driven). In this paper we present an approach that exploits the property of closely-spaced parallelism between lines on objects to achieve data and model-driven selection. Specifically, we present a method of identifying groups of closely-spaced parallel lines in images that generates a linear number of small-sized and reliable groups, thus meeting several of the desirable requirements of a grouping scheme for recognition. The line groups generated form the basis for data and model-driven selection. Data-driven selection is achieved by selecting salient line groups as judged by a saliency measure that emphasizes the likelihood of the groups coming from single objects. The approach to model-driven selection, on the other hand, uses the description of closely-spaced parallel line groups on the model object to selectively generate line groups in the image that are likely to be the projections of the model groups under a set of allowable transformations, taking into account the effect of occlusions, illumination changes, and imaging errors. We then discuss the utility of line groups-based selection in the context of reducing the search involved in recognition, both as an independent selection mechanism, and when used in combination with other cues such as color. Finally, we present results that indicate a vast improvement in the performance of a recognition system that is integrated with parallel line groups-based selection.

AITR-1398

Author[s]: William M. Wells III

Statistical Object Recognition

January 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1398.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1398.pdf

Two formulations of model-based object recognition are described. MAP Model Matching evaluates joint hypotheses of match and pose, while Posterior Marginal Pose Estimation evaluates the pose only. Local search in pose space is carried out with the Expectation-Maximization (EM) algorithm. Recognition experiments are described where the EM algorithm is used to refine and evaluate pose hypotheses in 2D and 3D. Initial hypotheses for the 2D experiments were generated by a simple indexing method: Angle Pair Indexing. The Linear Combination of Views method of Ullman and Basri is employed as the projection model in the 3D experiments.

AIM-1397

Author[s]: Ronald Chaney

Complexity as a Scale-Space for the Medial Axis Transform

January 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1397.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1397.pdf

The medial axis skeleton is a thin line graph that preserves the topology of a region. The skeleton has often been cited as a useful representation for shape description, region interpretation, and object recognition. Unfortunately, the computation of the skeleton is extremely sensitive to variations in the bounding contour. In this paper, we describe a robust method for computing the medial axis skeleton across a variety of scales. The resulting scale-space is parametric with the complexity of the skeleton, where the complexity is defined as the number of branches in the skeleton.

AITR-1396

Author[s]: Michael J. Jones

Using Recurrent Networks for Dimensionality Reduction

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1396.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1396.pdf

This report explores how recurrent neural networks can be exploited for learning high-dimensional mappings. Since recurrent networks are as powerful as Turing machines, an interesting question is how recurrent networks can be used to simplify the problem of learning from examples. The main problem with learning high-dimensional functions is the curse of dimensionality, which roughly states that the number of examples needed to learn a function increases exponentially with input dimension. This thesis proposes a way of avoiding this problem by using a recurrent network to decompose a high-dimensional function into many lower dimensional functions connected in a feedback loop.
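
As a rough illustration of the curse of dimensionality mentioned above, the sketch below counts the samples in a regular grid as the input dimension grows, and compares one high-dimensional function with several low-dimensional ones. The function names and the "five 2D functions" split are assumptions chosen for illustration; they are not taken from the report.

```python
def grid_samples(points_per_axis: int, dim: int) -> int:
    """Number of samples in a regular grid with the given resolution per axis."""
    return points_per_axis ** dim


def decomposed_samples(points_per_axis: int, dim: int, parts: int) -> int:
    """Samples needed if the function splits into `parts` equal lower-dimensional pieces.

    Hypothetical accounting: each piece is sampled on its own grid.
    """
    assert dim % parts == 0, "dim must divide evenly into parts"
    return parts * grid_samples(points_per_axis, dim // parts)


if __name__ == "__main__":
    for dim in (1, 2, 5, 10):
        print(dim, grid_samples(10, dim))
    # One 10D function versus five 2D functions, at 10 points per axis.
    print(grid_samples(10, 10), decomposed_samples(10, 10, 5))
```

The grid count grows exponentially with dimension, while the decomposed count grows only with the dimension of the largest piece.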

AIM-1395

Author[s]: Karen B. Sarachik

Limitations of Geometric Hashing in the Presence of Gaussian Noise

October 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1395.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1395.pdf

This paper presents a detailed error analysis of geometric hashing for 2D object recognition. We analytically derive the probability of false positives and negatives as a function of the number of model and image features and occlusion, using a 2D Gaussian noise model. The results are presented in the form of ROC (receiver-operating characteristic) curves, which demonstrate that the 2D Gaussian error model always has better performance than that of the bounded uniform model. They also directly indicate the optimal performance that can be achieved for a given clutter and occlusion rate, and how to choose the thresholds to achieve these rates.

AIM-1392

Author[s]: Ronald D. Chaney

Analytical Representation of Contours

October 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1392.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1392.pdf

The interpretation and recognition of noisy contours, such as silhouettes, have proven to be difficult. One obstacle to the solution of these problems has been the lack of a robust representation for contours. The contour is represented by a set of pairwise tangent circular arcs. The advantage of such an approach is that mathematical properties such as orientation and curvature are explicitly represented. We introduce a smoothing criterion for the contour that optimizes the tradeoff between the complexity of the contour and proximity of the data points. The complexity measure is the number of extrema of curvature present in the contour. The smoothing criterion leads us to a true scale-space for contours. We describe the computation of the contour representation as well as the computation of relevant properties of the contour. We consider the potential application of the representation, the smoothing paradigm, and the scale-space to contour interpretation and recognition.

AIM-1391

Author[s]: Ronen Basri

Recognition by Prototypes

December 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1391.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1391.pdf

A scheme for recognizing 3D objects from single 2D images is introduced. The scheme proceeds in two stages. In the first stage, the categorization stage, the image is compared to prototype objects. For each prototype, the view that most resembles the image is recovered, and, if the view is found to be similar to the image, the class identity of the object is determined. In the second stage, the identification stage, the observed object is compared to the individual models of its class, where classes are expected to contain objects with relatively similar shapes. For each model, a view that matches the image is sought. If such a view is found, the object's specific identity is determined. The advantage of categorizing the object before it is identified is twofold. First, the image is compared to a smaller number of models, since only models that belong to the object's class need to be considered. Second, the cost of comparing the image to each model in a class is very low, because correspondence is computed once for the whole class. More specifically, the correspondence and object pose computed in the categorization stage to align the prototype with the image are reused in the identification stage to align the individual models with the image. As a result, identification is reduced to a series of simple template comparisons. The paper concludes with an algorithm for constructing optimal prototypes for classes of objects.

AIM-1390

Author[s]: Jose L. Marroquin and Federico Girosi

Some Extensions of the K-Means Algorithm for Image Segmentation and Pattern Classification

January 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1390.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1390.pdf

In this paper we present some extensions to the k-means algorithm for vector quantization that permit its efficient use in image segmentation and pattern classification tasks. It is shown that by introducing state variables that correspond to certain statistics of the dynamic behavior of the algorithm, it is possible to find the representative centers of the lower dimensional manifolds that define the boundaries between classes, for clouds of multi-dimensional, multi-class data; this permits one, for example, to find class boundaries directly from sparse data (e.g., in image segmentation tasks) or to efficiently place centers for pattern classification (e.g., with local Gaussian classifiers). The same state variables can be used to define algorithms for determining adaptively the optimal number of centers for clouds of data with space-varying density. Some examples of the application of these extensions are also given.
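
For orientation, here is a minimal sketch of the baseline k-means algorithm that the memo extends. It is plain batch k-means only; the state variables and boundary-finding extensions described in the abstract are not implemented. The function name, the parameters, and the fixed iteration count are assumptions for illustration.

```python
import random


def kmeans(points, k, iters=20, seed=0):
    """Plain batch k-means on a list of equal-length tuples; returns k centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize from k distinct data points
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster
            else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return centers


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(sorted(kmeans(data, 2)))
```

On two well-separated groups of points like the ones in the usage example, the centers settle on the two group means.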

AITR-1388

Author[s]: Elizabeth Bradley

Taming Chaotic Circuits

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1388.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1388.pdf

Control algorithms that exploit chaotic behavior can vastly improve the performance of many practical and useful systems. The program Perfect Moment is built around a collection of such techniques. It autonomously explores a dynamical system's behavior, using rules embodying theorems and definitions from nonlinear dynamics to zero in on interesting and useful parameter ranges and state-space regions. It then constructs a reference trajectory based on that information and causes the system to follow it. This program and its results are illustrated with several examples, among them the phase-locked loop, where sections of chaotic attractors are used to increase the capture range of the circuit.

AITR-1385

Author[s]: Feng Zhao

Automatic Analysis and Synthesis of Controllers for Dynamical Systems Based On P

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1385.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1385.pdf

I present a novel design methodology for the synthesis of automatic controllers, together with a computational environment---the Control Engineer's Workbench---integrating a suite of programs that automatically analyze and design controllers for high-performance, global control of nonlinear systems. This work demonstrates that difficult control synthesis tasks can be automated, using programs that actively exploit and efficiently represent knowledge of nonlinear dynamics and phase space and effectively use the representation to guide and perform the control design. The Control Engineer's Workbench combines powerful numerical and symbolic computations with artificial intelligence reasoning techniques. As a demonstration, the Workbench automatically designed a high-quality maglev controller that outperforms a previous linear design by a factor of 20.

AITR-1384

Author[s]: M. Ali Taalebinezhaad

Robot Motion Vision by Fixation

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1384.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1384.pdf

In many motion-vision scenarios, a camera (mounted on a moving vehicle) takes images of an environment to find the "motion" and shape. We introduce a direct method called fixation for solving this motion-vision problem in its general case. Fixation uses neither feature-correspondence nor optical-flow. Instead, spatio-temporal brightness gradients are used directly. In contrast to previous direct methods, fixation does not restrict the motion or the environment. Moreover, the fixation method neither requires tracked images as its input nor uses mechanical tracking for obtaining fixated images. The experimental results on real images are presented and the implementation issues and techniques are discussed.

AIM-1382

Author[s]: M. Ali Taalebinezhaad

Visual Tracking

October 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1382.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1382.pdf

A typical robot vision scenario might involve a vehicle moving with an unknown 3D motion (translation and rotation) while taking intensity images of an arbitrary environment. This paper describes the theory and implementation issues of tracking any desired point in the environment. This method is performed completely in software without any need to mechanically move the camera relative to the vehicle. This tracking technique is simple and inexpensive. Furthermore, it does not use either optical flow or feature correspondence. Instead, the spatio-temporal gradients of the input intensity images are used directly. The experimental results presented support the idea of tracking in software. The final result is a sequence of tracked images where the desired point is kept stationary in the images independent of the nature of the relative motion. Finally, the quality of these tracked images is examined using spatio-temporal gradient maps.

AIM-1378

Author[s]: T.D. Alter

3D Pose from Three Corresponding Points Under Weak-Perspective Projection

July 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1378.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1378.pdf

Model-based object recognition commonly involves using a minimal set of matched model and image points to compute the pose of the model in image coordinates. Furthermore, recognition systems often rely on the "weak-perspective" imaging model in place of the perspective imaging model. This paper discusses computing the pose of a model from three corresponding points under weak-perspective projection. A new solution to the problem is proposed which, like previous solutions, involves solving a biquadratic equation. Here the biquadratic is motivated geometrically and its solutions, comprised of an actual and a false solution, are interpreted graphically. The final equations take a new form, which leads to a simple expression for the image position of any unmatched model point.
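
The weak-perspective imaging model referred to above can be sketched as a rotation, an orthographic projection that drops depth, a uniform scale, and a 2D translation. The code below shows only this forward projection, not the memo's pose solution; the function names and the parameter layout are assumptions for illustration.

```python
import math


def rotation_z(theta):
    """3x3 rotation matrix about the z axis (theta in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def weak_perspective(point, rotation, scale, translation):
    """Project a 3D model point to 2D image coordinates under weak perspective.

    The point is rotated, its depth is discarded (orthographic projection),
    and the result is scaled uniformly and translated in the image plane.
    """
    x, y, z = point
    rx = rotation[0][0] * x + rotation[0][1] * y + rotation[0][2] * z
    ry = rotation[1][0] * x + rotation[1][1] * y + rotation[1][2] * z
    return (scale * rx + translation[0], scale * ry + translation[1])


if __name__ == "__main__":
    identity = rotation_z(0.0)
    # Two points that differ only in depth land on the same image position.
    print(weak_perspective((1.0, 2.0, 5.0), identity, 2.0, (1.0, 1.0)))
    print(weak_perspective((1.0, 2.0, 50.0), identity, 2.0, (1.0, 1.0)))
```

Because depth along the viewing direction is discarded, the model approximates true perspective well only when the object's depth range is small relative to its distance from the camera.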

AITR-1377

Author[s]: Rajeev Surati

A Parallelizing Compiler Based on Partial Evaluation

July 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1377.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1377.pdf

We constructed a parallelizing compiler that utilizes partial evaluation to achieve efficient parallel object code from very high-level data independent source programs. On several important scientific applications, the compiler attains parallel performance equivalent to or better than the best observed results from the manual restructuring of code. This is the first attempt to capitalize on partial evaluation's ability to expose low-level parallelism. New static scheduling techniques are used to utilize the fine-grained parallelism of the computations. The compiler maps the computation graph resulting from partial evaluation onto the Supercomputer Toolkit, an eight VLIW processor parallel computer.

AIM-1376

Author[s]: Ehud Rivlin and Ronen Basri

Localization and Positioning Using Combinations of Model Views

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1376.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1376.pdf

A method for localization and positioning in an indoor environment is presented. The method is based on representing the scene as a set of 2D views and predicting the appearances of novel views by linear combinations of the model views. The method is accurate under weak perspective projection. Analysis of this projection as well as experimental results demonstrate that in many cases it is sufficient to accurately describe the scene. When weak perspective approximation is invalid, an iterative solution to account for the perspective distortions can be employed. A simple algorithm for repositioning, the task of returning to a previously visited position defined by a single view, is derived from this method.

AIM-1375

Author[s]: Nicola Ancona and Tomaso Poggio

Optical Flow From 1D Correlation: Application to a Simple Time-To-Crash Detector

October 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1375.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1375.pdf

In the first part of this paper we show that a new technique exploiting 1D correlation of 2D or even 1D patches between successive frames may be sufficient to compute a satisfactory estimation of the optical flow field. The algorithm is well-suited to VLSI implementations. The sparse measurements provided by the technique can be used to compute qualitative properties of the flow for a number of different visual tasks. In particular, the second part of the paper shows how to combine our 1D correlation technique with a scheme for detecting expansion or rotation ([5]) in a simple algorithm which also suggests interesting biological implications. The algorithm provides a rough estimate of time-to-crash. It was tested on real image sequences. We show its performance and compare the results to previous approaches.
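
As a loose illustration of the idea of estimating motion by 1D correlation between successive frames, the sketch below finds the integer shift that maximizes the mean product of two 1D intensity patches. It is a generic correlation search, not the memo's algorithm; the function name, the normalization by overlap length, and the toy signals are assumptions.

```python
def best_shift(prev, curr, max_shift):
    """Return the integer shift d that maximizes the mean of prev[i] * curr[i + d].

    Only indices where both samples exist are used, and the score is divided
    by the overlap length so that short overlaps are not penalized.
    """
    n = len(prev)
    best_d, best_score = 0, float("-inf")
    for d in range(-max_shift, max_shift + 1):
        lo, hi = max(0, -d), min(n, n - d)
        if hi <= lo:
            continue  # no overlap at this shift
        score = sum(prev[i] * curr[i + d] for i in range(lo, hi)) / (hi - lo)
        if score > best_score:
            best_d, best_score = d, score
    return best_d


if __name__ == "__main__":
    frame_a = [0, 0, 1, 3, 1, 0, 0, 0]
    frame_b = [0, 0, 0, 0, 1, 3, 1, 0]  # same bump moved two samples to the right
    print(best_shift(frame_a, frame_b, 3))
```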

AITR-1374

Author[s]: Thomas M. Breuel

Geometric Aspects of Visual Object Recognition

May 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1374.pdf

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1374.pdf

This thesis presents three important results in visual object recognition based on shape. (1) A new algorithm (RAST; Recognition by Adaptive Subdivisions of Transformation space) is presented that has lower average-case complexity than any known recognition algorithm. (2) It is shown, both theoretically and empirically, that representing 3D objects as collections of 2D views (the "View-Based Approximation") is feasible and affects the reliability of 3D recognition systems no more than other commonly made approximations. (3) The problem of recognition in cluttered scenes is considered from a Bayesian perspective; the commonly-used "bounded-error measure" is demonstrated to correspond to an independence assumption. It is shown that by modeling the statistical properties of real scenes better, objects can be recognized more reliably.

AIM-1373

Author[s]: Ronen Basri and Daphna Weinshall

Distance Metric Between 3D Models and 2D Images for Recognition and Classification

July 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1373.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1373.pdf

Similarity measurements between 3D objects and 2D images are useful for the tasks of object recognition and classification. We distinguish between two types of similarity metrics: metrics computed in image-space (image metrics) and metrics computed in transformation-space (transformation metrics). Existing methods typically use image metrics, which compare the image and the nearest view of the object. An example of such a measure is the Euclidean distance between feature points in the image and corresponding points in the nearest view. (Computing this measure is equivalent to solving the exterior orientation calibration problem.) In this paper we introduce a different type of metric: transformation metrics. These metrics penalize for the deformations applied to the object to produce the observed image. We present a transformation metric that optimally penalizes for "affine deformations" under weak-perspective. A closed-form solution, together with the nearest view according to this metric, is derived. The metric is shown to be equivalent to the Euclidean image metric, in the sense that they bound each other from both above and below. For the Euclidean image metric we offer a sub-optimal closed-form solution and an iterative scheme to compute the exact solution.

AITR-1371

Author[s]: B. Whitney Rappole, Jr.

Minimizing Residual Vibrations in Flexible Systems

June 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1371.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1371.pdf

Residual vibrations degrade the performance of many systems. Due to the lightweight and flexible nature of space structures, controlling residual vibrations is especially difficult. Also, systems such as the Space Shuttle Remote Manipulator System have frequencies that vary significantly based upon configuration and loading. Recently, a technique of minimizing vibrations in flexible structures by command input shaping was developed. This document presents research completed in developing a simple, closed-form method of calculating input shaping sequences for two-mode systems and a system to adapt the command input shaping technique to known changes in system frequency about the workspace. The new techniques were tested on a three-link, flexible manipulator.
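
To make command input shaping concrete, the sketch below builds the standard two-impulse zero-vibration (ZV) shaper for a single damped mode and combines two such shapers by convolution to cover two modes. Convolution is a common textbook construction and is used here as a stand-in; it is not the report's direct closed-form two-mode method. The function names and the example frequencies and damping ratios are assumptions.

```python
import math


def zv_shaper(freq_hz, zeta):
    """Two-impulse ZV shaper as a list of (time, amplitude) pairs for one mode."""
    wn = 2.0 * math.pi * freq_hz
    wd = wn * math.sqrt(1.0 - zeta ** 2)  # damped natural frequency
    k = math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta ** 2))
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / wd, k / (1.0 + k))]


def convolve(shaper_a, shaper_b):
    """Convolve two impulse sequences: times add, amplitudes multiply."""
    return sorted((ta + tb, aa * ab) for ta, aa in shaper_a for tb, ab in shaper_b)


def residual_vibration(shaper, freq_hz, zeta):
    """Residual vibration amplitude of a mode, relative to a unit impulse."""
    wn = 2.0 * math.pi * freq_hz
    wd = wn * math.sqrt(1.0 - zeta ** 2)
    t_last = max(t for t, _ in shaper)
    c = sum(a * math.exp(zeta * wn * t) * math.cos(wd * t) for t, a in shaper)
    s = sum(a * math.exp(zeta * wn * t) * math.sin(wd * t) for t, a in shaper)
    return math.exp(-zeta * wn * t_last) * math.hypot(c, s)


if __name__ == "__main__":
    # Hypothetical two-mode system: 1.0 Hz and 2.7 Hz, lightly damped.
    two_mode = convolve(zv_shaper(1.0, 0.05), zv_shaper(2.7, 0.02))
    for t, a in two_mode:
        print(f"t = {t:.4f} s, amplitude = {a:.4f}")
    print(residual_vibration(two_mode, 1.0, 0.05))
    print(residual_vibration(two_mode, 2.7, 0.02))
```

The shaped command is obtained by convolving the desired command with these impulses. The impulse amplitudes sum to one, so the final setpoint is unchanged.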

AITR-1370

Author[s]: Vijay Balasubramanian

Equivalence and Reduction of Hidden Markov Models

January 1993

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1370.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1370.pdf

This report studies when and why two Hidden Markov Models (HMMs) may represent the same stochastic process. HMMs are characterized in terms of equivalence classes whose elements represent identical stochastic processes. This characterization yields polynomial time algorithms to detect equivalent HMMs. We also find fast algorithms to reduce HMMs to essentially unique and minimal canonical representations. The reduction to a canonical form leads to the definition of 'Generalized Markov Models' which are essentially HMMs without the positivity constraint on their parameters. We discuss how this generalization can yield more parsimonious representations of stochastic processes at the cost of the probabilistic interpretation of the model parameters.
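
A small example may help show what it means for two HMMs to represent the same stochastic process. The sketch below uses the standard forward algorithm to compute sequence probabilities, then compares a one-state model with a two-state model whose states emit identically. The models and names are assumptions chosen for illustration, and this is not the report's equivalence-testing algorithm.

```python
def sequence_probability(init, trans, emit, observations):
    """Forward algorithm: probability of an observation sequence under an HMM.

    init[s] is the initial state probability, trans[r][s] the transition
    probability from r to s, and emit[s][o] the probability of symbol o in s.
    """
    n = len(init)
    alpha = [init[s] * emit[s][observations[0]] for s in range(n)]
    for obs in observations[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs]
            for s in range(n)
        ]
    return sum(alpha)


# One-state model: symbols drawn independently with probabilities 0.3 and 0.7.
ONE_STATE = ([1.0], [[1.0]], [[0.3, 0.7]])

# Two-state model with different structure but identical emissions per state.
TWO_STATE = (
    [0.4, 0.6],
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.3, 0.7], [0.3, 0.7]],
)

if __name__ == "__main__":
    seq = [0, 1, 1, 0]
    print(sequence_probability(*ONE_STATE, seq))
    print(sequence_probability(*TWO_STATE, seq))
```

Both models assign the same probability to every sequence, so the two-state model reduces to the one-state model even though their parameters differ.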

AIM-1369

Author[s]: Michael D. Ernst (Editor)

Intellectual Property in Computing: (How) Should Software Be Protected? An Industry Perspective

May 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1369.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1369.pdf

The future of the software industry is today being shaped in the courtroom. Most discussions of intellectual property to date, however, have been framed as debates about how the existing law --- promulgated long before the computer revolution --- should be applied to software. This memo is a transcript of a panel discussion on what forms of legal protection should apply to software to best serve both the industry and society in general. After addressing that question we can consider what laws would bring this about.

AITR-1368

Author[s]: Kenneth W. Chang

Shaping Inputs to Reduce Vibration in Flexible Space Structures

June 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1368.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1368.pdf

Future NASA plans to launch large space structures call for effective vibration control schemes that can solve the unique problems associated with unwanted residual vibration in flexible spacecraft. In this work, a unique method of input command shaping called impulse shaping is examined. A theoretical background is presented along with some insight into the methods of calculating multiple mode sequences. The Middeck Active Control Experiment (MACE) is then described as the testbed for hardware experiments. These results are shown and some of the difficulties of dealing with nonlinearities are discussed. The paper closes with conclusions about calculating and implementing impulse shaping in complex nonlinear systems.

AIM-1366

Author[s]: Thomas Marill

Why Do We See Three-dimensional Objects?

June 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1366.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1366.pdf

When we look at certain line-drawings, we see three-dimensional objects. The question is why; why not just see two-dimensional images? We theorize that we see objects rather than images because the objects we see are, in a certain mathematical sense, less complex than the images; and that furthermore the particular objects we see will be the least complex of the available alternatives. Experimental data supporting the theory is reported. The work is based on ideas of Solomonoff, Kolmogorov, and the "minimum description length" concepts of Rissanen.

AITR-1365

Author[s]: Timothy D. Tuttle

Understanding and Modeling the Behavior of a Harmonic Drive Gear Transmission

May 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1365.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1365.pdf

In my research, I have performed an extensive experimental investigation of harmonic-drive properties such as stiffness, friction, and kinematic error. From my experimental results, I have found that these properties can be sharply non-linear and highly dependent on operating conditions. Due to the complex interaction of these poorly behaved transmission properties, dynamic response measurements showed surprisingly agitated behavior, especially around system resonance. Theoretical models developed to mimic the observed response illustrated that non-linear frictional effects cannot be ignored in any accurate harmonic-drive representation. Additionally, if behavior around system resonance must be replicated, kinematic error and transmission compliance as well as frictional dissipation from gear-tooth rubbing must all be incorporated into the model.

AITR-1364

Author[s]: Patrick G. Sobalvarro

Probabilistic Analysis of Multistage Interconnection Network Performance

April 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1364.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1364.pdf

We present methods of calculating the value of two performance parameters for multipath, multistage interconnection networks: the normalized throughput and the probability of successful message transmission. We develop a set of exact equations for the loading probability mass functions of network channels and a program for solving them exactly. We also develop a Monte Carlo method for approximate solution of the equations, and show that the resulting approximation method will always calculate the values of the performance parameters more quickly than direct simulation.

AIM-1363

Author[s]: Amnon Shashua

Projective Structure from Two Uncalibrated Images: Structure from Motion and Recognition

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1363.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1363.pdf

This paper addresses the problem of recovering relative structure, in the form of an invariant, referred to as projective structure, from two views of a 3D scene. The invariant structure is computed without any prior knowledge of camera geometry, or internal calibration, and with the property that perspective and orthographic projections are treated alike, namely, the system makes no assumption regarding the existence of perspective distortions in the input images.

AIM-1362

Author[s]: W. Eric Grimson, Daniel P. Huttenlocher and T. D. Alter

Recognizing 3D Objects of 2D Images: An Error Analysis

July 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1362.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1362.pdf

Many object recognition systems use a small number of pairings of data and model features to compute the 3D transformation from a model coordinate frame into the sensor coordinate system. With perfect image data, these systems work well. With uncertain image data, however, their performance is less clear. We examine the effects of 2D sensor uncertainty on the computation of 3D model transformations. We use this analysis to bound the uncertainty in the transformation parameters, and the uncertainty associated with transforming other model features into the image. We also examine the impact of such transformation uncertainty on recognition methods.

AIM-1359

Author[s]: Gerald J. Sussman and Jack Wisdom

Chaotic Evolution of the Solar System

March 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1359.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1359.pdf

The evolution of the entire planetary system has been numerically integrated for a time span of nearly 100 million years. This calculation confirms that the evolution of the solar system as a whole is chaotic, with a time scale of exponential divergence of about 4 million years. Additional numerical experiments indicate that the Jovian planet subsystem is chaotic, although some small variations in the model can yield quasiperiodic motion. The motion of Pluto is independently and robustly chaotic.

AITR-1358

Author[s]: Linda M. Wills

Automated Program Recognition by Graph Parsing

July 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1358.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1358.pdf

Recognizing standard computational structures (cliches) in a program can help an experienced programmer understand the program. We develop a graph parsing approach to automating program recognition in which programs and cliches are represented in an attributed graph grammar formalism and recognition is achieved by graph parsing. In studying this approach, we evaluate our representation's ability to suppress many common forms of variation which hinder recognition. We investigate the expressiveness of our graph grammar formalism for capturing programming cliches. We empirically and analytically study the computational cost of our recognition approach with respect to two medium-sized, real-world simulator programs.

AIM-1357

Author[s]: Lynne E. Parker

Local Versus Global Control Laws for Cooperative Agent Teams

March 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1357.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1357.pdf

The design of the control laws governing the behavior of individual agents is crucial for the successful development of cooperative agent teams. These control laws may utilize a combination of local and/or global knowledge to achieve the resulting group behavior. A key difficulty in this development is deciding the proper balance between local and global control required to achieve the desired emergent group behavior. This paper addresses this issue by presenting some general guidelines and principles for determining the appropriate level of global versus local control. These principles are illustrated and implemented in a "keep formation" cooperative task case study.

AIM-1356

Author[s]: W. Richards and A. Jepson

What Makes a Good Feature?

April 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1356.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1356.pdf

Using a Bayesian framework, we place bounds on just what features are worth computing if inferences about the world properties are to be made from image data. Previously others have proposed that useful features reflect "non-accidental" or "suspicious" configurations (such as parallel or colinear lines). We make these notions more precise and show them to be context sensitive.

AITR-1355

Author[s]: Stephen W. Keckler

A Coupled Multi-ALU Processing Node for a Highly Parallel Computer

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1355.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1355.pdf

This report describes Processor Coupling, a mechanism for controlling multiple ALUs on a single integrated circuit to exploit both instruction-level and inter-thread parallelism. A compiler statically schedules individual threads to discover available intra-thread instruction-level parallelism. The runtime scheduling mechanism interleaves threads, exploiting inter-thread parallelism to maintain high ALU utilization. ALUs are assigned to threads on a cycle-by-cycle basis, and several threads can be active concurrently. Simulation results show that Processor Coupling performs well both on single threaded and multi-threaded applications. The experiments address the effects of memory latencies, function unit latencies, and communication bandwidth between function units.

AIM-1354

Author[s]: Tomaso Poggio and Roberto Brunelli

A Novel Approach to Graphics

February 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1354.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1354.pdf

We show that we can optimally represent the set of 2D images produced by the point features of a rigid 3D model as two lines in two high-dimensional spaces. We then describe a working recognition system in which we represent these spaces discretely in a hash table. We can access this table at run time to find all the groups of model features that could match a group of image features, accounting for the effects of sensing error. We also use this representation of a model's images to demonstrate significant new limitations of two other approaches to recognition: invariants, and non-accidental properties.

AIM-1353

Author[s]: David W. Jacobs

Space Efficient 3D Model Indexing

February 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1353.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1353.pdf

We show that we can optimally represent the set of 2D images produced by the point features of a rigid 3D model as two lines in two high-dimensional spaces. We then describe a working recognition system in which we represent these spaces discretely in a hash table. We can access this table at run time to find all the groups of model features that could match a group of image features, accounting for the effects of sensing error. We also use this representation of a model's images to demonstrate significant new limitations of two other approaches to recognition: invariants, and non-accidental properties.

AITR-1350

Author[s]: Lyle J. Borg-Graham

On Directional Selectivity in Vertebrate Retina: An Experimental and Computational Study

January 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1350.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1350.pdf

This thesis describes an investigation of retinal directional selectivity. We show intracellular (whole-cell patch) recordings in turtle retina which indicate that this computation occurs prior to the ganglion cell, and we describe a pre-ganglionic circuit model to account for this and other findings which places the non-linear spatio-temporal filter at individual, oriented amacrine cell dendrites. The key non-linearity is provided by interactions between excitatory and inhibitory synaptic inputs onto the dendrites, and their distal tips provide directionally selective excitatory outputs onto ganglion cells. Detailed simulations of putative cells support this model, given reasonable parameter constraints. The performance of the model also suggests that this computational substructure may be relevant within the dendritic trees of CNS neurons in general.

AITR-1349

Author[s]: Kah-Kay Sung

A Vector Signal Processing Approach to Color

January 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1349.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1349.pdf

Surface (Lambertian) color is a useful visual cue for analyzing material composition of scenes. This thesis adopts a signal processing approach to color vision. It represents color images as fields of 3D vectors, from which we extract region and boundary information. The first problem we face is one of secondary imaging effects that make image color different from surface color. We demonstrate a simple but effective polarization based technique that corrects for these effects. We then propose a systematic approach to scalarizing color that allows us to augment classical image processing tools and concepts for multi-dimensional color signals.

AIM-1348

Author[s]: Shimon Edelman, Heinrich Bulthoff, and Erik Sklar

Task and Object Learning in Visual Recognition

January 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1348.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1348.pdf

Human performance in object recognition changes with practice, even in the absence of feedback to the subject. The nature of the change can reveal important properties of the process of recognition. We report an experiment designed to distinguish between non-specific task learning and object-specific practice effects. The results of the experiment support the notion that learning through modification of object representations can be separated from less interesting effects of practice, if appropriate response measures (specifically, the coefficient of variation of response time over views of an object) are used. Furthermore, the results, obtained with computer-generated amoeba-like objects, corroborate previous findings regarding the development of canonical views and related phenomena with practice.

AIM-1347

Author[s]: Tomaso Poggio and Thomas Vetter

Recognition and Structure from One 2D Model View: Observations on Prototypes, Object Classes and Symmetries

February 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1347.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1347.pdf

In this note we discuss how recognition can be achieved from a single 2D model view exploiting prior knowledge of an object's structure (e.g. symmetry). We prove that for any bilaterally symmetric 3D object one non-accidental 2D model view is sufficient for recognition. Symmetries of higher order allow the recovery of structure from one 2D view. Linear transformations can be learned exactly from a small set of examples in the case of "linear object classes" and used to produce new views of an object from a single view.

AIM-1346

Author[s]: P. A. Skordos

Time-Reversible Maxwell's Demon

September 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1346.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1346.pdf

A time-reversible Maxwell's demon is demonstrated which creates a density difference between two chambers initialized to have equal density. The density difference is estimated theoretically and confirmed by computer simulations. It is found that the reversible Maxwell's demon compresses phase space volume even though its dynamics are time reversible. The significance of phase space volume compression in operating a microscopic heat engine is also discussed.

AIM-1345

Author[s]: Michael de la Maza and Bruce Tidor

Boltzmann Weighted Selection Improves Performance of Genetic Algorithms

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1345.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1345.pdf

Modifiable Boltzmann selective pressure is investigated as a tool to control variability in optimizations using genetic algorithms. An implementation of variable selective pressure, modeled after the use of temperature as a parameter in simulated annealing approaches, is described. The convergence behavior of optimization runs is illustrated as a function of selective pressure; the method is compared to a genetic algorithm lacking this control feature and is shown to exhibit superior convergence properties on a small set of test problems. An analysis is presented that compares the selective pressure of this algorithm to a standard selection procedure.

AIM-1344

Author[s]: Robert Givan and David McAllester

Tractable Inference Relations

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1344.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1344.pdf

We consider the concept of local sets of inference rules. Locality is a syntactic condition on rule sets which guarantees that the inference relation defined by those rules is polynomial time decidable. Unfortunately, determining whether a given rule set is local can be difficult. In this paper we define inductive locality, a strengthening of locality. We also give a procedure which can automatically recognize the locality of any inductively local rule set. Inductive locality seems to be more useful than the earlier concept of strong locality. We show that locality, as a property of rule sets, is undecidable in general.

AIM-1343

Author[s]: David McAllester and Jeffrey Siskind

Lifting Transformations

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1343.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1343.pdf

Lifting is a well known technique in resolution theorem proving, logic programming, and term rewriting. In this paper we formulate lifting as an efficiency-motivated program transformation applicable to a wide variety of nondeterministic procedures. This formulation allows the immediate lifting of complex procedures, such as the Davis-Putnam algorithm, which are otherwise difficult to lift. We treat both classical lifting, which is based on unification, and various closely related program transformations which we also call lifting transformations. These nonclassical lifting transformations are closely related to constraint techniques in logic programming, resolution, and term rewriting.

AIM-1342

Author[s]: David McAllester

Grammar Rewriting

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1342.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1342.pdf

We present a term rewriting procedure based on congruence closure that can be used with arbitrary equational theories. This procedure is motivated by the pragmatic need to prove equations in equational theories where confluence can not be achieved. The procedure uses context free grammars to represent equivalence classes of terms. The procedure rewrites grammars rather than terms and uses congruence closure to maintain certain congruence properties of the grammar. Grammars provide concise representations of large term sets. Infinite term sets can be represented with finite grammars and exponentially large term sets can be represented with linear sized grammars.

AIM-1341

Author[s]: Robert Givan, David McAllester and Sameer Shalaby

Natural Language Based Inference Procedures Applied to Schubert's Steamroller

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1341.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1341.pdf

We have previously argued that the syntactic structure of natural language can be exploited to construct powerful polynomial time inference procedures. This paper supports the earlier arguments by demonstrating that a natural language based polynomial time procedure can solve Schubert's steamroller in a single step.

AIM-1340

Author[s]: David McAllester

Observations on Cognitive Judgments

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1340.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1340.pdf

It is obvious to anyone familiar with the rules of the game of chess that a king on an empty board can reach every square. It is true, but not obvious, that a knight can reach every square. Why is the first fact obvious but the second fact not? This paper presents an analytic theory of a class of obviousness judgments of this type. Whether or not the specifics of this analysis are correct, it seems that the study of obviousness judgments can be used to construct integrated theories of linguistics, knowledge representation, and inference.

AIM-1339

Author[s]: David McAllester and David Rosenblatt

Systematic Nonlinear Planning

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1339.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1339.pdf

This paper presents a simple, sound, complete, and systematic algorithm for domain independent STRIPS planning. Simplicity is achieved by starting with a ground procedure and then applying a general, independently verifiable lifting transformation. Previous planners have been designed directly as lifted procedures. Our ground procedure is a ground version of Tate's NONLIN procedure. In Tate's procedure one is not required to determine whether a prerequisite of a step in an unfinished plan is guaranteed to hold in all linearizations. This allows Tate's procedure to avoid the use of Chapman's modal truth criterion. Systematicity is the property that the same plan, or partial plan, is never examined more than once. Systematicity is achieved through a simple modification of Tate's procedure.

AIM-1338

Author[s]: Lynn Andrea Stein and Leora Morgenstern

Motivated Action Theory: A Formal Theory of Causal Reasoning

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1338.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1338.pdf

When we reason about change over time, causation provides an implicit preference: we prefer sequences of situations in which one situation leads causally to the next, rather than sequences in which one situation follows another at random and without causal connections. In this paper, we explore the problem of temporal reasoning --- reasoning about change over time --- and the crucial role that causation plays in our intuitions. We examine previous approaches to temporal reasoning, and their shortcomings, in light of this analysis. We propose a new system for causal reasoning, motivated action theory, which builds upon causation as a crucial preference criterion. Motivated action theory solves the traditional problems of both forward and backward reasoning, and additionally provides a basis for a new theory of explanation.

AIM-1337

Author[s]: Patrick G. Sobalvarro

Calculation of Blocking Probabilities in Multistage Interconnection Networks with Redundant Paths

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1337.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1337.pdf

The blocking probability of a network is a common measure of its performance. There exist means of quickly calculating the blocking probabilities of Banyan networks; however, because Banyan networks have no redundant paths, they are not inherently fault-tolerant, and so their use in large-scale multiprocessors is problematic. Unfortunately, the addition of multiple paths between message sources and sinks in a network complicates the calculation of blocking probabilities. A methodology for exact calculation of blocking probabilities for small networks with redundant paths is presented here, with some discussion of its potential use in approximating blocking probabilities for large networks with redundant paths.

AIM-1336

Author[s]: Tomaso Poggio, Manfred Fahle and Shimon Edelman

Fast Perceptual Learning in Visual Hyperacuity

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1336.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1336.pdf

In many different spatial discrimination tasks, such as in determining the sign of the offset in a vernier stimulus, the human visual system exhibits hyperacuity-level performance by evaluating spatial relations with the precision of a fraction of a photoreceptor's diameter. We propose that this impressive performance depends in part on a fast learning process that uses relatively few examples and occurs at an early processing stage in the visual pathway. We show that this hypothesis is plausible by demonstrating that it is possible to synthesize, from a small number of examples of a given task, a simple (HyperBF) network that attains the required performance level. We then verify with psychophysical experiments some of the key predictions of our conjecture. In particular, we show that fast stimulus-specific learning indeed takes place in the human visual system and that this learning does not transfer between two slightly different hyperacuity tasks.

AIM-1334

Author[s]: M. Ali Taalebinezhaad

Towards Autonomous Motion Vision

April 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1334.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1334.pdf

Earlier, we introduced a direct method called fixation for the recovery of shape and motion in the general case. The method uses neither feature correspondence nor optical flow. Instead, it directly employs the spatiotemporal gradients of image brightness. This work reports the experimental results of applying some of our fixation algorithms to a sequence of real images where the motion is a combination of translation and rotation. These results show that parameters such as the fixation patch size have crucial effects on the estimation of some motion parameters. Some of the critical issues involved in the implementation of our autonomous motion vision system are also discussed here. Among those are the criteria for automatic choice of an optimum size for the fixation patch, and an appropriate location for the fixation point which result in good estimates for important motion parameters. Finally, a calibration method is described for identifying the real location of the rotation axis in imaging systems.

AIM-1333

Author[s]: Ronen Basri

On The Uniqueness of Correspondence Under Orthographic and Perspective Projections

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1333.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1333.pdf

The task of shape recovery from a motion sequence requires the establishment of correspondence between image points. The two processes, the matching process and the shape recovery one, are traditionally viewed as independent. Yet, information obtained during the process of shape recovery can be used to guide the matching process. This paper discusses the mutual relationship between the two processes. The paper is divided into two parts. In the first part we review the constraints imposed on the correspondence by rigid transformations and extend them to objects that undergo general affine (non rigid) transformation (including stretch and shear), as well as to rigid objects with smooth surfaces. In all these cases corresponding points lie along epipolar lines, and these lines can be recovered from a small set of corresponding points. In the second part of the paper we discuss the potential use of epipolar lines in the matching process. We present an algorithm that recovers the correspondence from three contour images. The algorithm was implemented and used to construct object models for recognition. In addition we discuss how epipolar lines can be used to solve the aperture problem.

AIM-1332

Author[s]: Ronen Basri

The Alignment of Objects With Smooth Surfaces: Error Analysis of the Curvature Method

November 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1332.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1332.pdf

The recognition of objects with smooth bounding surfaces from their contour images is considerably more complicated than that of objects with sharp edges, since in the former case the set of object points that generates the silhouette contours changes from one view to another. The "curvature method", developed by Basri and Ullman [1988], provides a method to approximate the appearance of such objects from different viewpoints. In this paper we analyze the curvature method. We apply the method to ellipsoidal objects and compute analytically the error obtained for different rotations of the objects. The error depends on the exact shape of the ellipsoid (namely, the relative lengths of its axes), and it increases as the ellipsoid becomes "deep" (elongated in the Z-direction). We show that the errors are usually small, and that, in general, a small number of models is required to predict the appearance of an ellipsoid from all possible views. Finally, we show experimentally that the curvature method applies as well to objects with hyperbolic surface patches.

AIM-1331

Author[s]: David L. Brock

Dynamic Model and Control of an Artificial Muscle Based on Contractile Polymers

November 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1331.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1331.pdf

A dynamic model and control system of an artificial muscle is presented. The artificial muscle is based on a contractile polymer gel which undergoes abrupt volume changes in response to variations in external conditions. The device uses an acid-base reaction to directly convert chemical to mechanical energy. A nonlinear sliding mode control system is proposed to track desired joint trajectories of a single link controlled by two antagonist muscles. Both the model and controller were implemented and produced acceptable tracking performance at 2 Hz.

AIM-1330

Author[s]: David L. Brock

Review of Artificial Muscle Based on Contractile Polymers

November 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1330.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1330.pdf

An artificial muscle with strength and speed equal to that of a human muscle may soon be possible. Polymer gels exhibit abrupt volume changes in response to variations in their external conditions -- shrinking or swelling up to 1000 times their original volume. Through the conversion of chemical or electrical energy into mechanical work, a number of devices have already been constructed which produce forces up to 100 N/cm2 and contraction rates on the order of a second. Though the promise of an artificial muscle is real, many fundamental physical and engineering questions remain before the extent or limit of these devices is known.

AIM-1329

Author[s]: Harold Abelson, Andrew A. Berlin, Jacob Katzenelson, William H. McAllister, Guillermo J. Rozas, Gerald Jay Sussman and Jack Wisdom

The Supercomputer Toolkit: A General Framework for Special-purpose Computing

November 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1329.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1329.pdf

The Toolkit is a family of hardware modules (processors, memory, interconnect, and input-output devices) and a collection of software modules (compilers, simulators, scientific libraries, and high-level front ends) from which high-performance special-purpose computers can be easily configured and programmed. The hardware modules are intended to be standard, reusable parts. These are combined by means of a user-reconfigurable, static interconnect technology. The Toolkit's software support, based on novel compilation techniques, produces extremely high-performance numerical code from high-level language input, and will eventually automatically configure hardware modules for particular applications.

AIM-1328

Author[s]: Randall Davis

Intellectual Property and Software: The Assumptions are Broken

November 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1328.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1328.pdf

In March 1991 the World Intellectual Property Organization held an international symposium attended primarily by lawyers, to discuss the questions that artificial intelligence poses for intellectual property law (i.e., copyright and patents). This is an edited version of a talk presented there, which argues that AI poses few problems in the near term and that almost all the truly challenging issues arise instead from software in general. The talk was an attempt to bridge the gap between the legal community and the software community, to explain why existing concepts and categories in intellectual property law present such difficult problems for software, and why software as a technology breaks several important assumptions underlying intellectual property law.

AIM-1327

Author[s]: Amnon Shashua

Correspondence and Affine Shape from Two Orthographic Views: Motion and Recognition

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1327.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1327.pdf

The paper presents a simple model for recovering affine shape and correspondence from two orthographic views of a 3D object. It is shown that four corresponding points along two orthographic views, taken under similar illumination conditions, determine affine shape and correspondence for all other points. The scheme is useful for purposes of visual recognition by generating novel views of an object given two model views. It is also shown that the scheme can handle objects with smooth boundaries, to a good approximation, without introducing any modifications or additional model views.

AIM-1326

Author[s]: Ivan A. Bachelder

Contour Matching Using Local Affine Transformations

April 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1326.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1326.pdf

Partial constraints are often available in visual processing tasks requiring the matching of contours in two images. We propose a non-iterative scheme to determine contour matches using locally affine transformations. The method assumes that contours are approximated by the orthographic projection of planar patches within oriented neighborhoods of varying size. For degenerate cases, a minimal matching solution is chosen closest to the minimal pure translation. Performance on noisy synthetic and natural contour imagery is reported.

AIM-1325

Author[s]: Michael Eisenberg

Programmable Applications: Interpreter Meets Interface

October 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1325.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1325.pdf

Current fashion in "user-friendly" software design tends to place an overreliance on direct manipulation interfaces. To be truly expressive (and thus truly user-friendly), applications need both learnable interfaces and domain-enriched languages that are accessible to the user. This paper discusses some of the design issues that arise in the creation of such programmable applications. As an example, we present "SchemePaint", a graphics application that combines a MacPaint-like interface with an interpreter for (a "graphics-enriched") Scheme.

AIM-1323

Author[s]: Elizabeth Bradley

A Control Algorithm for Chaotic Physical Systems

October 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1323.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1323.pdf

Control algorithms which exploit the unique properties of chaos can vastly improve the design and performance of many practical and useful systems. The program Perfect Moment is built around such an algorithm. Given two points in the system's state space, it autonomously maps the space, chooses a set of trajectory segments from the maps, uses them to construct a composite path between the points, then causes the system to follow that path. This program is illustrated with two practical examples: the driven single pendulum and its electronic analog, the phase-locked loop. Strange attractor bridges, which alter the reachability of different state space points, can be used to increase the capture range of the circuit.

AIM-1322

Author[s]: Maja Mataric

A Comparative Analysis of Reinforcement Learning Methods

October 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1322.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1322.pdf

This paper analyzes the suitability of reinforcement learning (RL) for both programming and adapting situated agents. We discuss two RL algorithms: Q-learning and the Bucket Brigade. We introduce a special case of the Bucket Brigade, and analyze and compare its performance to Q in a number of experiments. Next we discuss the key problems of RL: time and space complexity, input generalization, sensitivity to parameter values, and selection of the reinforcement function. We address the tradeoffs between the built-in and learned knowledge and the number of training examples required by a learning algorithm. Finally, we suggest directions for future research.
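
For readers unfamiliar with Q-learning, one of the two algorithms named above, a minimal tabular sketch follows. It is not the memo's implementation: the five-state chain environment, the function names (`step`, `q_learning`) and all parameter values are assumptions chosen for illustration.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, reward on reaching state 4
ACTIONS = (-1, +1)    # index 0 moves left, index 1 moves right

def step(state, action):
    """Toy deterministic environment (an assumption, not taken from the memo)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    reward = 1.0 if done else 0.0
    return nxt, reward, done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration and random tie-breaking."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([i for i in range(2) if q[s][i] == best])
            s2, r, done = step(s, ACTIONS[a])
            # One-step update toward reward plus discounted best next value.
            target = r + (0.0 if done else gamma * max(q[s2]))
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
    return q
```

Calling `q_learning()` returns a table in which, for this toy chain, the "right" action should dominate in every non-terminal state. The sensitivity to `alpha`, `gamma` and `epsilon` is an instance of the parameter-sensitivity problem the abstract lists.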

AITR-1321

Author[s]: Waldemar Horwat

Concurrent Smalltalk on the Message-Driven Processor

September 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1321.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1321.pdf

Concurrent Smalltalk is the primary language used for programming the J-Machine, a MIMD message-passing computer containing thousands of 36-bit processors connected by a very low latency network. This thesis describes in detail Concurrent Smalltalk and its implementation on the J-Machine, including the Optimist II global optimizing compiler and Cosmos fine-grain parallel operating system. Quantitative and qualitative results are presented.

AIM-1320

Author[s]: Lisa Dron

The Multi-Scale Veto Model: A Two-Stage Analog Network for Edge Detection and Image Reconstruction

March 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1320.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1320.pdf

This paper presents the theory behind a model for a two-stage analog network for edge detection and image reconstruction to be implemented in VLSI. Edges are detected in the first stage using the multi-scale veto rule, which eliminates candidates that do not pass a threshold test at each of a set of different spatial scales. The image is reconstructed in the second stage from the brightness values adjacent to edge locations. The MSV rule allows good localization and efficient noise removal. Since the reconstructed images are visually similar to the originals, the possibility exists of achieving significant bandwidth compression.

AIM-1319

Author[s]: P.A. Skordos and W.H. Zurek

Maxwell's Demon, Rectifiers, and the Second Law: Computer Simulation of Smoluchowski's Trapdoor

September 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1319.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1319.pdf

We have simulated numerically an automated Maxwell's demon inspired by Smoluchowski's ideas of 1912. Two gas chambers of equal area are connected via an opening that is covered by a trapdoor. The trapdoor can open to the left but not to the right, and is intended to rectify naturally occurring variations in density between the two chambers. Our results confirm that though the trapdoor behaves as a rectifier when large density differences are imposed by external means, it cannot extract useful work from the thermal motion of the molecules when left on its own.

AIM-1318

Author[s]: J. Brian Subirana-Vilanova and Kah Kay Sung

Multi-Scale Vector-Ridge-Detection for Perceptual Organization Without Edges

December 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1318.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1318.pdf

We present a novel ridge detector that finds ridges on vector fields. It is designed to automatically find the right scale of a ridge even in the presence of noise, multiple steps and narrow valleys. One key feature of this ridge detector is that it has a zero response at discontinuities. The ridge detector can be applied to scalar and vector quantities such as color. We also present a parallel perceptual organization scheme based on this ridge detector that works without edges; in addition to perceptual groups, the scheme computes potential focus of attention points at which to direct future processing. The relation to human perception and several theoretical findings supporting the scheme are presented. We also show results of a Connection Machine implementation of the scheme for perceptual organization (without edges) using color.

AIM-1316

Author[s]: Lynn Andrea Stein

Resolving Ambiguity in Nonmonotonic Inheritance Hierarchies

August 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1316.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1316.pdf

This paper describes a theory of inheritance theories. We present an original theory of inheritance in nonmonotonic hierarchies. The structures on which this theory is based delineate a framework that subsumes most inheritance theories in the literature, providing a new foundation for inheritance. * Our path-based theory is sound and complete w.r.t. a direct model-theoretic semantics. * Both the credulous and the skeptical conclusions of this theory are polynomial-time computable. * We prove that true skeptical inheritance is not contained in the language of path-based inheritance. Because our techniques are modular w.r.t. the definition of specificity, they generalize to provide a unified framework for a broad class of inheritance theories. By describing multiple inheritance theories in the same "language" of credulous extensions, we make principled comparisons rather than the ad-hoc examination of specific examples that makes up most of the comparative inheritance work.

AITR-1315

Author[s]: Ellen Spertus

Why are There so Few Female Computer Scientists?

August 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1315.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1315.pdf

This report examines why women pursue careers in computer science and related fields far less frequently than men do. In 1990, only 13% of PhDs in computer science went to women, and only 7.8% of computer science professors were female. Causes include the different ways in which boys and girls are raised, the stereotypes of female engineers, subtle biases that females face, problems resulting from working in predominantly male environments, and sexual biases in language. A theme of the report is that women's underrepresentation is not primarily due to direct discrimination but to subconscious behavior that perpetuates the status quo.

AIM-1314

Author[s]: Ellen C. Hildreth, Hiroshi Ando, Richard Anderson and Stefan Treue

Recovering Three-Dimensional Structure from Motion with Surface Reconstruction

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1314.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1314.pdf

We address the computational role that the construction of a complete surface representation may play in the recovery of 3-D structure from motion. We present a model that combines a feature-based structure-from-motion algorithm with smooth surface interpolation. This model can represent multiple surfaces in a given viewing direction, incorporates surface constraints from object boundaries, and groups image features using their 2-D image motion. Computer simulations relate the model's behavior to perceptual observations. In a companion paper, we discuss further perceptual experiments regarding the role of surface reconstruction in the human recovery of 3-D structure from motion.

AIM-1312

Author[s]: Daphna Weinshall

The Matching of Doubly Ambiguous Stereograms

July 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1312.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1312.pdf

I have previously described psychophysical experiments that involved the perception of many transparent layers, corresponding to multiple matching, in doubly ambiguous random dot stereograms. Additional experiments are described in the first part of this paper. In one experiment, subjects were required to report the density of dots on each transparent layer. In another experiment, the minimal density of dots on each layer, which is required for the subjects to perceive it as a distinct transparent layer, was measured. The difficulties encountered by stereo matching algorithms, when applied to doubly ambiguous stereograms, are described in the second part of this paper. Algorithms that can be modified to perform consistently with human perception, and the constraints imposed on their parameters by human perception, are discussed.

AIM-1311

Author[s]: Shimon Ullman

Sequence-Seeking and Counter Streams: A Model for Information Processing in the Cortex

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1311.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1311.pdf

This paper presents a model for the general flow in the neocortex. The basic process, called "sequence-seeking," is a search for a sequence of mappings or transformations, linking source and target representations. The search is bi-directional, "bottom-up" as well as "top-down," and it explores in parallel a large number of alternative sequences. This operation is implemented in a structure termed "counter streams," in which multiple sequences are explored along two separate, complementary pathways that seek to meet. The first part of the paper discusses the general sequence-seeking scheme and a number of related processes, such as the learning of successful sequences, context effects, and the use of "express lines" and partial matches. The second part discusses biological implications of the model in terms of connections within and between cortical areas. The model is compared with existing data, and a number of new predictions are proposed.

AIM-1309

Author[s]: Pegor Papazian

Principles, Opportunism and Seeing in Design: A Computational Approach

June 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1309.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1309.pdf

This thesis introduces elements of a theory of design activity and a computational framework for developing design systems. The theory stresses the opportunistic nature of designing and the complementary roles of focus and distraction, the interdependence of evaluation and generation, the multiplicity of ways of seeing over the history of a design session versus the exclusivity of a given way of seeing over an arbitrarily short period, and the incommensurability of criteria used to evaluate a design. The thesis argues for a principle based rather than rule based approach to designing documents. The Discursive Generator is presented as a computational framework for implementing specific design systems, and a simple system for arranging blocks according to a set of formal principles is developed by way of illustration. Both shape grammars and constraint based systems are used to contrast current trends in design automation with the discursive approach advocated in the thesis. The Discursive Generator is shown to have some important properties lacking in other types of systems, such as dynamism, robustness and the ability to deal with partial designs. When studied in terms of a search metaphor, the Discursive Generator is shown to exhibit behavior which is radically different from some traditional search techniques, and to avoid some of the well-known difficulties associated with them.

AITR-1307

Author[s]: David T. Clemens

Region-Based Feature Interpretation for Recognizing 3D Models in 2D Images

June 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1307.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1307.pdf

In model-based vision, there are a huge number of possible ways to match model features to image features. In addition to model shape constraints, there are important match-independent constraints that can efficiently reduce the search without the combinatorics of matching. I demonstrate two specific modules in the context of a complete recognition system, Reggie. The first is a region-based grouping mechanism to find groups of image features that are likely to come from a single object. The second is an interpretive matching scheme to make explicit hypotheses about occlusion and instabilities in the image features.

AITR-1306

Author[s]: Michael A. Eisenberg

The Kineticist's Workbench: Combining Symbolic and Numerical Methods in the Simulation of Chemical Reaction

May 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1306.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1306.pdf

The Kineticist's Workbench is a program that simulates chemical reaction mechanisms by predicting, generating, and interpreting numerical data. Prior to simulation, it analyzes a given mechanism to predict that mechanism's behavior; it then simulates the mechanism numerically; and afterward, it interprets and summarizes the data it has generated. In performing these tasks, the Workbench uses a variety of techniques: graph-theoretic algorithms (for analyzing mechanisms), traditional numerical simulation methods, and algorithms that examine simulation results and reinterpret them in qualitative terms. The Workbench thus serves as a prototype for a new class of scientific computational tools: tools that provide symbiotic collaborations between qualitative and quantitative methods.

AIM-1303

Author[s]: Feng Zhao and Richard Thornton

Automatic Design of a Maglev Controller in State Space

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1303.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1303.pdf

We describe the automatic synthesis of a global nonlinear controller for stabilizing a magnetic levitation system. The synthesized control system can stabilize the maglev vehicle with large initial displacements from an equilibrium, and possesses a much larger operating region than the classical linear feedback design for the same system. The controller is automatically synthesized by a suite of computational tools. This work demonstrates that the difficult control synthesis task can be automated, using programs that actively exploit knowledge of nonlinear dynamics and state space and combine powerful numerical and symbolic computations with spatial-reasoning techniques.

AIM-1301

Author[s]: Yael Moses and Shimon Ullman

Limitations of Non Model-Based Recognition Schemes

May 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1301.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1301.pdf

Different approaches to visual object recognition can be divided into two general classes: model-based vs. non model-based schemes. In this paper we establish some limitations on the class of non model-based recognition schemes. We show that every function that is invariant to viewing position of all objects is the trivial (constant) function. It follows that every consistent recognition scheme for recognizing all 3-D objects must in general be model based. The result is extended to recognition schemes that are imperfect (allowed to make mistakes) or restricted to certain classes of objects.

AITR-1300

Author[s]: David M. Siegel

Pose Determination of a Grasped Object Using Limited Sensing

May 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1300.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1300.pdf

This report explores methods for determining the pose of a grasped object using only limited sensor information. The problem of pose determination is to find the position of an object relative to the hand. The information is useful when grasped objects are being manipulated. The problem is hard because of the large space of grasp configurations and the large amount of uncertainty inherent in dexterous hand control. By studying limited sensing approaches, the problem's inherent constraints can be better understood. This understanding helps to show how additional sensor data can be used to make recognition methods more effective and robust.

AIM-1297

Author[s]: Ellen C. Hildreth

Recovering Heading for Visually-Guided Navigation

June 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1297.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1297.pdf

We present a model for recovering the direction of heading of an observer who is moving relative to a scene that may contain self-moving objects. The model builds upon an algorithm proposed by Rieger and Lawton (1985), which is based on earlier work by Longuet-Higgins and Prazdny (1981). The algorithm uses velocity differences computed in regions of high depth variation to estimate the location of the focus of expansion, which indicates the observer’s heading direction. We relate the behavior of the proposed model to psychophysical observations regarding the ability of human observers to judge their heading direction, and show how the model can cope with self-moving objects in the environment. We also discuss this model in the broader context of a navigational system that performs tasks requiring rapid sensing and response through the interaction of simple task-specific routines.

AITR-1296

Author[s]: Joachim Heel

Temporal Surface Reconstruction

May 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1296.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1296.pdf

This thesis investigates the problem of estimating the three-dimensional structure of a scene from a sequence of images. Structure information is recovered from images continuously using shading, motion or other visual mechanisms. A Kalman filter represents structure in a dense depth map. With each new image, the filter first updates the current depth map by a minimum variance estimate that best fits the new image data and the previous estimate. Then the structure estimate is predicted for the next time step by a transformation that accounts for relative camera motion. Experimental evaluation shows the significant improvement in quality and computation time that can be achieved using this technique.
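
The update-then-predict cycle described above can be illustrated for a single pixel of the depth map. This is a sketch under simplifying assumptions, not the thesis's filter: each pixel is treated as an independent scalar state, and the prediction step only inflates the variance, whereas the thesis transforms the map to account for relative camera motion. The function names and default values are hypothetical.

```python
def kalman_update(depth, var, meas, meas_var):
    """Minimum-variance fusion of the current depth estimate with a new measurement."""
    gain = var / (var + meas_var)
    new_depth = depth + gain * (meas - depth)
    new_var = (1.0 - gain) * var
    return new_depth, new_var

def kalman_predict(depth, var, process_var):
    """Placeholder prediction: depth is carried over and uncertainty grows.
    (Assumption: no warping by camera motion is modeled here.)"""
    return depth, var + process_var

def filter_pixel(measurements, meas_var=1.0, process_var=0.0,
                 init_depth=0.0, init_var=1e6):
    """Run the update/predict cycle over a sequence of depth measurements."""
    depth, var = init_depth, init_var
    for z in measurements:
        depth, var = kalman_update(depth, var, z, meas_var)
        depth, var = kalman_predict(depth, var, process_var)
    return depth, var
```

With a very uncertain initial estimate and no process noise, `filter_pixel([2.0, 4.0])` ends close to the average of the two measurements with roughly half the measurement variance, which is the expected behavior of a minimum-variance estimator.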

AITR-1295

Author[s]: James M. Hyde

Multiple Mode Vibration Suppression in Controlled Flexible Systems

May 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1295.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1295.pdf

Prior research has led to the development of input command shapers that can reduce residual vibration in single- or multiple-mode flexible systems. We present a method for the development of multiple-mode shapers which are simpler to implement and produce smaller response delays than previous designs. An MIT / NASA experimental flexible structure, MACE, is employed as a test article for the validation of the new shaping method. We examine the results of tests conducted on simulations of MACE. The new shapers are shown to be effective in suppressing multiple-mode vibration, even in the presence of mild kinematic and dynamic non-linearities.
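
As background on input command shaping, the sketch below builds the classic two-impulse zero-vibration shaper for a single mode and checks its residual vibration. This is the well-known single-mode baseline, swapped in for illustration; it is not the new multiple-mode method of the thesis. The function names and the example mode (natural frequency and damping ratio) are assumptions.

```python
import math

def zv_shaper(omega_n, zeta):
    """Two-impulse zero-vibration shaper for one mode: [(time, amplitude), ...]."""
    root = math.sqrt(1.0 - zeta ** 2)
    omega_d = omega_n * root                      # damped natural frequency
    k = math.exp(-zeta * math.pi / root)
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / omega_d, k / (1.0 + k))]

def residual_vibration(impulses, omega_n, zeta):
    """Residual vibration amplitude of a second-order mode after the last impulse,
    relative to a single unit impulse."""
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)
    t_end = impulses[-1][0]
    c = sum(a * math.exp(zeta * omega_n * t) * math.cos(omega_d * t)
            for t, a in impulses)
    s = sum(a * math.exp(zeta * omega_n * t) * math.sin(omega_d * t)
            for t, a in impulses)
    return math.exp(-zeta * omega_n * t_end) * math.hypot(c, s)
```

Convolving a command with the impulses from `zv_shaper` delays it by half a damped period of the mode, which is the kind of response delay the abstract aims to reduce.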

AITR-1294

Author[s]: Larry R. Dennison

Reliable Interconnection Networks for Parallel Computers

October 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1294.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1294.pdf

This technical report describes a new protocol, the Unique Token Protocol, for reliable message communication. This protocol eliminates the need for end-to-end acknowledgments and minimizes the communication effort when no dynamic errors occur. Various properties of end-to-end protocols are presented. The Unique Token Protocol solves the associated problems. It eliminates source buffering by maintaining in the network at least two copies of a message. A token is used to decide if a message was delivered to the destination exactly once. This technical report also presents a possible implementation of the protocol in a worm-hole routed, 3-D mesh network.

AIM-1293

Author[s]: Rodney A. Brooks

Intelligence Without Reason

April 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1293.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1293.pdf

Computers and Thought are the two categories that together define Artificial Intelligence as a discipline. It is generally accepted that work in Artificial Intelligence over the last thirty years has had a strong influence on aspects of computer architectures. In this paper we also make the converse claim: that the state of computer architecture has been a strong influence on our models of thought. The Von Neumann model of computation has led Artificial Intelligence in particular directions. Intelligence in biological systems is completely different. Recent work in behavior-based Artificial Intelligence has produced new models of intelligence that are much closer in spirit to biological systems. The non-Von Neumann computational models they use share many characteristics with biological computation.

AIM-1291

Author[s]: Minoru Maruyama, Federico Girosi and Tomaso Poggio

A Connection Between GRBF and MLP

April 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1291.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1291.pdf

Both multilayer perceptrons (MLP) and Generalized Radial Basis Functions (GRBF) have good approximation properties, theoretically and experimentally. Are they related? The main point of this paper is to show that for normalized inputs, multilayer perceptron networks are radial function networks (albeit with a non-standard radial function). This provides an interpretation of the weights w as centers t of the radial function network, and therefore as equivalent to templates. This insight may be useful for practical applications, including better initialization procedures for MLP. In the remainder of the paper, we discuss the relation between the radial functions that correspond to the sigmoid for normalized inputs and well-behaved radial basis functions, such as the Gaussian. In particular, we observe that the radial function associated with the sigmoid is an activation function that is a good approximation to Gaussian basis functions for a range of values of the bias parameter. The implication is that an MLP network can always simulate a Gaussian GRBF network (with the same number of units but fewer parameters); the converse is true only for certain values of the bias parameter. Numerical experiments indicate that this constraint is not always satisfied in practice by MLP networks trained with backpropagation. Multiscale GRBF networks, on the other hand, can approximate MLP networks with a similar number of parameters.
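
The identity behind the main claim can be sketched in two lines. The notation is an assumption made for illustration: inputs x of unit norm, and a sigmoidal unit with weight vector w and bias θ.

```latex
\[
\|x - w\|^2 = \|x\|^2 + \|w\|^2 - 2\, w \cdot x = 1 + \|w\|^2 - 2\, w \cdot x
\quad\Longrightarrow\quad
w \cdot x = \tfrac{1}{2}\left(1 + \|w\|^2 - \|x - w\|^2\right),
\]
\[
\sigma(w \cdot x + \theta)
= \sigma\!\left(\theta + \tfrac{1}{2}\left(1 + \|w\|^2\right) - \tfrac{1}{2}\,\|x - w\|^2\right).
\]
```

For fixed w and θ, the right-hand side depends on x only through the distance between x and w, so the unit behaves as a radial function centered at w.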

AIM-1289

Author[s]: Tomaso Poggio, Alessandro Verri and Vincent Torre

Green Theorems and Qualitative Properties of the Optical Flow

April 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1289.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1289.pdf

How can one compute qualitative properties of the optical flow, such as expansion or rotation, in a way which is robust and invariant to the position of the focus of expansion or the center of rotation? We suggest a particularly simple algorithm, well-suited to VLSI implementations, that exploits well-known relations between the integral and differential properties of vector fields and their linear behaviour near singularities.
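
The relation between integral and differential properties that the abstract refers to can be illustrated numerically: by the divergence and Stokes (Green) theorems, the mean divergence and mean curl of a flow over a disc equal the boundary flux and circulation divided by the area. The sketch below is an illustration under stated assumptions, not the memo's algorithm; the function name, the circular contour and the test flows are hypothetical.

```python
import math

def boundary_integrals(flow, cx, cy, radius, n=360):
    """Return (mean divergence, mean curl) of `flow` over a disc,
    computed only from samples of the flow on the disc's boundary."""
    flux = circ = 0.0
    ds = 2.0 * math.pi * radius / n                # arc length per sample
    for k in range(n):
        t = 2.0 * math.pi * k / n
        nx, ny = math.cos(t), math.sin(t)          # outward unit normal
        tx, ty = -ny, nx                           # counter-clockwise unit tangent
        u, v = flow(cx + radius * nx, cy + radius * ny)
        flux += (u * nx + v * ny) * ds
        circ += (u * tx + v * ty) * ds
    area = math.pi * radius ** 2
    return flux / area, circ / area

# Hypothetical test flows: pure expansion about (3, -2), pure rotation about (5, 1).
expansion = lambda x, y: (0.5 * (x - 3.0), 0.5 * (y + 2.0))
rotation = lambda x, y: (-0.25 * (y - 1.0), 0.25 * (x - 5.0))
```

For these linear flows the result does not depend on where the contour is placed relative to the focus of expansion or center of rotation, which is the kind of invariance the abstract asks for.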

AIM-1288

Author[s]: Federico Girosi and Gabriele Anzellotti

Convergence Rates of Approximation by Translates

March 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1288.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1288.pdf

In this paper we consider the problem of approximating a function belonging to some function space Φ by a linear combination of n translates of a given function G. Using a lemma by Jones (1990) and Barron (1991) we show that it is possible to define function spaces and functions G for which the rate of convergence to zero of the error is O(1/n) in any number of dimensions. The apparent avoidance of the "curse of dimensionality" is due to the fact that these function spaces are more and more constrained as the dimension increases. Examples include spaces of the Sobolev type, in which the number of weak derivatives is required to be larger than the number of dimensions. We give results both for approximation in the L2 norm and in the L∞ norm. The interesting feature of these results is that, thanks to the constructive nature of Jones' and Barron's lemma, an iterative procedure is defined that can achieve this rate.

AIM-1287

Author[s]: Federico Girosi

Models of Noise and Robust Estimates

November 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1287.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1287.pdf

Given $n$ noisy observations $g_i$ of the same quantity $f$, it is common practice to estimate $f$ by minimizing the function $\sum_{i=1}^{n} (g_i - f)^2$. From a statistical point of view this corresponds to computing the maximum likelihood estimate under the assumption of Gaussian noise. However, it is well known that this choice leads to results that are very sensitive to the presence of outliers in the data. For this reason it has been proposed to minimize functions of the form $\sum_{i=1}^{n} V(g_i - f)$, where $V$ is a function that increases less rapidly than the square. Several choices for $V$ have been proposed and successfully used to obtain "robust" estimates. In this paper we show that, for a class of functions $V$, using these robust estimators corresponds to assuming that data are corrupted by Gaussian noise whose variance fluctuates according to some given probability distribution, which uniquely determines the shape of $V$.
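
The outlier sensitivity described above is easy to reproduce. The sketch below compares the least-squares estimate (the sample mean) with a robust estimate that uses the Huber function as one possible choice of V, minimized by iteratively reweighted least squares. The choice of the Huber function, the threshold `delta` and the function names are assumptions for illustration and are not taken from the memo.

```python
def least_squares_estimate(g):
    """Minimizer of the sum of squared residuals: the sample mean."""
    return sum(g) / len(g)

def huber_estimate(g, delta=1.0, iters=100):
    """Minimize the sum of Huber penalties V(g_i - f) by iteratively
    reweighted least squares (quadratic for |r| <= delta, linear beyond)."""
    f = sorted(g)[len(g) // 2]                     # start from a median-like value
    for _ in range(iters):
        # Residuals in the linear zone are down-weighted by delta / |r|.
        w = [1.0 if abs(gi - f) <= delta else delta / abs(gi - f) for gi in g]
        f = sum(wi * gi for wi, gi in zip(w, g)) / sum(w)
    return f

# Five observations near 1.0 and one gross outlier.
observations = [1.0, 1.1, 0.9, 1.05, 0.95, 100.0]
```

On `observations` the mean is pulled far from the cluster by the single outlier, while the Huber estimate stays near it. The per-observation weights play a role similar to inverse variances, which is in the spirit of the fluctuating-variance interpretation given in the abstract.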

AIM-1286

Author[s]: Feng Zhao

Phase Space Navigator: Towards Automating Control Synthesis in Phase Spaces for Nonlinear Control Systems

April 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1286.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1286.pdf

We develop a novel autonomous control synthesis strategy called Phase Space Navigator for the automatic synthesis of nonlinear control systems. The Phase Space Navigator generates global control laws by synthesizing flow shapes of dynamical systems and planning and navigating system trajectories in the phase spaces. Parsing phase spaces into trajectory flow pipes provides a way to efficiently reason about the phase space structures and search for global control paths. The strategy is particularly suitable for synthesizing high-performance control systems that do not lend themselves to traditional design and analysis techniques.

AIM-1285

Author[s]: Daniel Kersten and Heinrich Bulthoff

Apparent Opacity Affects Perception of Structure from Motion

January 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1285.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1285.pdf

The judgment of surface attributes such as transparency or opacity is often considered to be a higher-level visual process that would make use of low-level stereo or motion information to tease apart the transparent from the opaque parts. In this study, we describe a new illusion and some results that question the above view by showing that depth from transparency and opacity can override the rigidity bias in perceiving depth from motion. This provides support for the idea that the brain's computation of the surface material attribute of transparency may have to be done either before, or in parallel with the computation of structure from motion.

AITR-1284

Author[s]: Henry Minsky

A Parallel Crossbar Routing Chip for a Shared Memory Multiprocessor

March 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1284.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1284.pdf

This thesis describes the design and implementation of an integrated circuit and associated packaging to be used as the building block for the data routing network of a large scale shared memory multiprocessor system. A general purpose multiprocessor depends on high-bandwidth, low-latency communications between computing elements. This thesis describes the design and construction of RN1, a novel self-routing, enhanced crossbar switch as a CMOS VLSI chip. This chip provides the basic building block for a scalable pipelined routing network with byte-wide data channels. A series of RN1 chips can be cascaded with no additional internal network components to form a multistage fault-tolerant routing switch. The chip is designed to operate at clock frequencies up to 100 MHz using Hewlett-Packard's HP34 $1.2\mu$ process. This aggressive performance goal demands that special attention be paid to optimization of the logic architecture and circuit design.

AITR-1283

Author[s]: Brian A. LaMacchia

Basis Reduction Algorithms and Subset Sum Problems

June 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1283.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1283.pdf

This thesis investigates a new approach to lattice basis reduction suggested by M. Seysen. Seysen's algorithm attempts to globally reduce a lattice basis, whereas the Lenstra, Lenstra, Lovasz (LLL) family of reduction algorithms concentrates on local reductions. We show that Seysen's algorithm is well suited for reducing certain classes of lattice bases, and often requires much less time in practice than the LLL algorithm. We also demonstrate how Seysen's algorithm for basis reduction may be applied to subset sum problems. Seysen's technique, used in combination with the LLL algorithm, and other heuristics, enables us to solve a much larger class of subset sum problems than was previously possible.

AIM-1282

Author[s]: Thomas Knight and Henry M. Wu

A Method for Skew-free Distribution of Digital Signals Using Matched Variable Delay Lines

March 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1282.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1282.pdf

The ability to distribute signals everywhere in a circuit with controlled and known delays is essential in large, high-speed digital systems. We present a technique by which a signal driver can adjust the arrival time of the signal at the end of the wire using a pair of matched variable delay lines. We show an implementation of this idea requiring no extra wiring, and how it can be extended to distribute signals skew-free to receivers along the signal run. We demonstrate how this scheme fits into the boundary scan logic of a VLSI chip.

AITR-1281

Author[s]: Chris Hanson

MIT Scheme Reference Manual

January 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1281.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1281.pdf

MIT Scheme is an implementation of the Scheme programming language that runs on many popular workstations. The MIT Scheme Reference Manual describes the special forms, procedures, and datatypes provided by the implementation for use by application programmers.

AIM-1280

Author[s]: A. Lumsdaine, J.L. Wyatt, Jr. and I.M. Elfadel

Nonlinear Analog Networks for Image Smoothing and Segmentation

January 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1280.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1280.pdf

Image smoothing and segmentation algorithms are frequently formulated as optimization problems. Linear and nonlinear (reciprocal) resistive networks have solutions characterized by an extremum principle. Thus, appropriately designed networks can automatically solve certain smoothing and segmentation problems in robot vision. This paper considers switched linear resistive networks and nonlinear resistive networks for such tasks. The latter network type is derived from the former via an intermediate stochastic formulation, and a new result relating the solution sets of the two is given for the "zero temperature'' limit. We then present simulation studies of several continuation methods that can be gracefully implemented in analog VLSI and that seem to give "good'' results for these non-convex optimization problems.

AIM-1278

Author[s]: Elizabeth Bradley

Control Algorithms for Chaotic Systems

March 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1278.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1278.pdf

This paper presents techniques that actively exploit chaotic behavior to accomplish otherwise-impossible control tasks. The state space is mapped by numerical integration at different system parameter values and trajectory segments from several of these maps are automatically combined into a path between the desired system states. A fine-grained search and high computational accuracy are required to locate appropriate trajectory segments, piece them together and cause the system to follow this composite path. The sensitivity of a chaotic system's state-space topology to the parameters of its equations and of its trajectories to the initial conditions makes this approach rewarding in spite of its computational demands.

AIM-1277

Author[s]: Lynn Andrea Stein

Imagination and Situated Cognition

February 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1277.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1277.pdf

A subsumption-based mobile robot is extended to perform cognitive tasks. Following directions, the robot navigates directly to previously unexplored goals. This robot exploits a novel architecture based on the idea that cognition uses the underlying machinery of interaction, imagining sensations and actions.

AITR-1275

Author[s]: Anselm Spoerri

The Early Detection of Motion Boundaries

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1275.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1275.pdf

This thesis shows how to detect boundaries on the basis of motion information alone. The detection is performed in two stages: (i) the local estimation of motion discontinuities and of the visual flow field; (ii) the extraction of complete boundaries belonging to differently moving objects. For the first stage, three new methods are presented: the "Bimodality Tests,'' the "Bi-distribution Test,'' and the "Dynamic Occlusion Method.'' The second stage consists of applying the "Structural Saliency Method,'' by Sha'ashua and Ullman to extract complete and unique boundaries from the output of the first stage. The developed methods can successfully segment complex motion sequences.

AIM-1274

Author[s]: Feng Zhao

Extracting and Representing Qualitative Behaviors of Complex Systems in Phase Spaces

March 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1274.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1274.pdf

We develop a qualitative method for understanding and representing phase space structures of complex systems and demonstrate the method with a program, MAPS --- Modeler and Analyzer for Phase Spaces, using deep domain knowledge of dynamical system theory. Given a dynamical system, the program generates a complete, high level symbolic description of the phase space structure sensible to human beings and manipulable by other programs. Using the phase space descriptions, we are developing a novel control synthesis strategy to automatically synthesize a controller for a nonlinear system in the phase space to achieve desired properties.

AIM-1272

Author[s]: Ellen Spertus and William J. Dally

Experiments with Dataflow on a General-Purpose Parallel Computer

January 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1272.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1272.pdf

The MIT J-Machine, a massively-parallel computer, is an experiment in providing general-purpose mechanisms for communication, synchronization, and naming that will support a wide variety of parallel models of computation. We have developed two experimental dataflow programming systems for the J-Machine. For the first system, we adapted Papadopoulos' explicit token store to implement static and then dynamic dataflow. Our second system made use of Iannucci's hybrid execution model to combine several dataflow graph nodes into a single sequence, decreasing scheduling overhead. By combining the strengths of the two systems, it is possible to produce a system with competitive performance.

AIM-1271

Author[s]: Tomaso Poggio, Manfred Fahle and Shimon Edelman

Synthesis of Visual Modules from Examples: Learning Hyperacuity

January 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1271.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1271.pdf

Networks that solve specific visual tasks, such as the evaluation of spatial relations with hyperacuity precision, can be easily synthesized from a small set of examples. This may have significant implications for the interpretation of many psychophysical results in terms of neuronal models.

AIM-1270

Author[s]: Tanveer Fathima Syeda-Mahmood

Data and Model-Driven Selection Using Color Regions

February 1992

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1270.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1270.pdf

A key problem in model-based object recognition is selection, namely, the problem of determining which regions in the image are likely to come from a single object. In this paper we present an approach that extracts and uses color region information to perform selection either based solely on image data (data-driven), or based on the knowledge of the color description of the model (model-driven). The paper presents a method of perceptual color specification by color categories to extract perceptual color regions. It also discusses the utility of color-based selection in reducing the search involved in recognition.

AIM-1269

Author[s]: Anita M. Flynn, Lee S. Tavrow, Stephen F. Bart and Rodney A. Brooks

Piezoelectric Micromotors for Microrobots

February 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1269.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1269.pdf

By combining new robot control systems with piezoelectric motors and micromechanics, we propose creating micromechanical systems which are small, cheap and completely autonomous. We have fabricated small - a few millimeters in diameter - piezoelectric motors using ferroelectric thin films and consisting of two pieces: a stator and a rotor. The stationary stator includes a piezoelectric film in which we induce bending in the form of a traveling wave. Anything which sits atop the stator is propelled by the wave. A small glass lens placed upon the stator becomes the spinning rotor. Using thin films of PZT on silicon nitride membranes, various types of actuator structures have been fabricated.

AIM-1266

Author[s]: David J. Beymer

Finding Junctions Using the Image Gradient

December 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1266.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1266.pdf

Junctions are the intersection points of three or more intensity surfaces in an image. An analysis of zero crossings and the gradient near junctions demonstrates that gradient-based edge detection schemes fragment edges at junctions. This fragmentation is caused by the intrinsic pairing of zero crossings and a destructive interference of edge gradients at junctions. Using the previous gradient analysis, we propose a junction detector that finds junctions in edge maps by following gradient ridges and using the minimum direction of saddle points in the gradient. The junction detector is demonstrated on real imagery and previous approaches to junction detection are discussed.

AIM-1265

Author[s]: Joachim Dengler

Estimation of Discontinuous Displacement Vector Fields with the Minimum Description Length Criterion

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1265.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1265.pdf

A new noniterative approach to determine displacement vector fields with discontinuities is described. In order to overcome the limitations of current methods, the problem is regarded as a general modelling problem. Starting from a family of regularized estimates, the compatibility between different levels of regularization is determined by measuring the difference in description length. This gives local but noisy evidence of possible model boundaries at multiple scales. With the two constraints of continuous lines of discontinuities and the spatial coincidence assumption, consistent boundary evidence is found. Based on this combined evidence the model is updated, now describing homogeneous regions with sharp discontinuities.

AIM-1264

Author[s]: Daphna Weinshall

The Shape of Shading

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1264.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1264.pdf

This paper discusses the relationship between the shape of the shading, the surface whose depth at each point equals the brightness in the image, and the shape of the original surface. I suggest the shading as an initial local approximation to shape, and discuss the scope of this approximation and what it may be good for. In particular, qualitative surface features, such as the sign of the Gaussian curvature, can be computed in some cases directly from the shading. Finally, a method to compute the direction of the illuminant (assuming a single point light source) from shading on occluding contours is shown.

AIM-1263

Author[s]: Antonio Bicchi

A Criterion for the Optimal Design of Multiaxis Force Sensors

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1263.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1263.pdf

This paper deals with the design of multi-axis force (also known as force/torque) sensors, as considered within the framework of optimal design theory. The principal goal of this paper is to identify a mathematical objective function, whose minimization corresponds to the optimization of sensor accuracy. The methodology employed is derived from linear algebra and analysis of numerical stability. The problem of optimizing the number of basic transducers employed in a multi-component sensor is also addressed. Finally, applications of the proposed method to the design of a simple sensor as well as to the optimization of a novel, 6-axis miniaturized sensor are discussed.

AIM-1262

Author[s]: Antonio Bicchi, J. Kenneth Salisbury and David L. Brock

Contact Sensing from Force Measurements

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1262.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1262.pdf

This paper addresses contact sensing, i.e. the problem of resolving the location of a contact, the force at the interface and the moment about the contact normals. Called "intrinsic'' contact sensing for the use of internal force and torque measurements, this method allows for practical devices which provide simple, relevant contact information in practical robotic applications. Such sensors have been used in conjunction with robot hands to identify objects, determine surface friction, detect slip, augment grasp stability, measure object mass, probe surfaces, control collision and a variety of other useful tasks. This paper describes the theoretical basis for their operation and provides a framework for future device design.

AIM-1261

Author[s]: Claudio Melchiorri and J.K. Salisbury

Exploiting the Redundancy of a Hand-Arm Robotic System

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1261.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1261.pdf

In this report, a method for exploiting the redundancy of a hand-arm mechanical system for manipulation tasks is illustrated. The basic idea is to try to exploit the different intrinsic capabilities of the arm and hand subsystems. The Jacobian transpose technique is at the core of the method: different behaviors of the two subsystems are obtained by means of constraints in Null(J) generated by non-orthogonal projectors. Comments about the computation of the constraints are reported in the memo, as well as a description of some preliminary experiments on a robotic system at the A.I. Lab., M.I.T.

AITR-1260

Author[s]: Eric Sven Ristad

Computational Structure of Human Language

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1260.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1260.pdf

The central thesis of this report is that human language is NP-complete. That is, the process of comprehending and producing utterances is bounded above by the class NP, and below by NP-hardness. This constructive complexity thesis has two empirical consequences. The first is to predict that a linguistic theory outside NP is unnaturally powerful. The second is to predict that a linguistic theory easier than NP-hard is descriptively inadequate. To prove the lower bound, I show that the following three subproblems of language comprehension are all NP-hard: decide whether a given sound is a possible sound of a given language; disambiguate a sequence of words; and compute the antecedents of pronouns. The proofs are based directly on the empirical facts of the language user’s knowledge, under an appropriate idealization. Therefore, they are invariant across linguistic theories. (For this reason, no knowledge of linguistic theory is needed to understand the proofs, only knowledge of English.) To illustrate the usefulness of the upper bound, I show that two widely-accepted analyses of the language user’s knowledge (of syntactic ellipsis and phonological dependencies) lead to complexity outside of NP (PSPACE-hard and Undecidable, respectively). Next, guided by the complexity proofs, I construct alternate linguistic analyses that are strictly superior on descriptive grounds, as well as being less complex computationally (in NP). The report also presents a new framework for linguistic theorizing that resolves important puzzles in generative linguistics and guides the mathematical investigation of human language.

AIM-1259

Author[s]: Thomas M. Breuel

An Efficient Correspondence Based Algorithm for 2D and 3D Model Based Recognition

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1259.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1259.pdf

A polynomial time algorithm (pruned correspondence search, PCS) with good average case performance for solving a wide class of geometric maximal matching problems, including the problem of recognizing 3D objects from a single 2D image, is presented. Efficient verification algorithms, based on a linear representation of location constraints, are given for the case of affine transformations among vector spaces and for the case of rigid 2D and 3D transformations with scale. Some preliminary experiments suggest that PCS is a practical algorithm. Its similarity to existing correspondence based algorithms means that a number of existing techniques for speedup can be incorporated into PCS to improve its performance.

AITR-1257

Author[s]: Choon P. Goh

Model Selection for Solving Kinematics Problems

September 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1257.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1257.pdf

There has been much interest in the area of model-based reasoning within the Artificial Intelligence community, particularly in its application to diagnosis and troubleshooting. The core issue in this thesis, simply put, is, model-based reasoning is fine, but whence the model? Where do the models come from? How do we know we have the right models? What does the right model mean anyway? Our work has three major components. The first component deals with how we determine whether a piece of information is relevant to solving a problem. We have three ways of determining relevance: derivational, situational and an order-of-magnitude reasoning process. The second component deals with the defining and building of models for solving problems. We identify these models, determine what we need to know about them, and importantly, determine when they are appropriate. Currently, the system has a collection of four basic models and two hybrid models. This collection of models has been successfully tested on a set of fifteen simple kinematics problems. The third major component of our work deals with how the models are selected.

AIM-1256

Author[s]: Yang Meng Tan

Supporting Reuse and Evolution in Software Design

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1256.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1256.pdf

Program design is an area of programming that can benefit significantly from machine-mediated assistance. A proposed tool, called the Design Apprentice (DA), can assist a programmer in the detailed design of programs. The DA supports software reuse through a library of commonly-used algorithmic fragments, or cliches, that codifies standard programming. The cliche library enables the programmer to describe the design of a program concisely. The DA can detect some kinds of inconsistencies and incompleteness in program descriptions. It automates detailed design by automatically selecting appropriate algorithms and data structures. It supports the evolution of program designs by keeping explicit dependencies between the design decisions made. These capabilities of the DA are underlaid by a model of programming, called programming by successive elaboration, which mimics the way programmers interact. Programming by successive elaboration is characterized by the use of breadth-first exposition of layered program descriptions and the successive modifications of descriptions. A scenario is presented to illustrate the concept of the DA. Techniques for automating the detailed design process are described. A framework is given in which designs are incrementally augmented and modified by a succession of design steps. A library of cliches and a suite of design steps needed to support the scenario are presented.

AIM-1255

Author[s]: Brian Eberman and David L. Brock

Line Kinematics for Whole-Arm Manipulation

January 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1255.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1255.pdf

A Whole-Arm Manipulator uses every surface to both sense and interact with the environment. To facilitate the analysis and control of a Whole-Arm Manipulator, line geometry is used to describe the location and trajectory of the links. Applications of line kinematics are described and implemented on the MIT Whole-Arm Manipulator (WAM-1).

AIM-1254

Author[s]: Bruno Caprile and Federico Girosi

A Nondeterministic Minimization Algorithm

September 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1254.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1254.pdf

The problem of minimizing a multivariate function is recurrent in many disciplines such as Physics, Mathematics, Engineering and, of course, Computer Science. In this paper we describe a simple nondeterministic algorithm which is based on the idea of adaptive noise, and that proved to be particularly effective in the minimization of a class of multivariate, continuous valued, smooth functions, associated with some recent extension of regularization theory by Poggio and Girosi (1990). Results obtained by using this method and a more traditional gradient descent technique are also compared.
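
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one way "adaptive noise" could drive a nondeterministic minimizer. It is an assumption, not the memo's method: a random perturbation is kept when it lowers the function value, and the noise scale grows after a success and shrinks after a failure. The function name `adaptive_noise_minimize` and the parameters `sigma`, `grow`, and `shrink` are hypothetical.

```python
import random


def adaptive_noise_minimize(f, x0, sigma=1.0, grow=1.5, shrink=0.9,
                            iters=2000, seed=0):
    """Random-search minimizer with a success-driven noise scale.

    Assumed scheme (not taken from the memo): perturb the current point
    with Gaussian noise, keep the candidate only if it improves f, and
    adapt the noise scale according to whether the step succeeded.
    """
    rng = random.Random(seed)          # seeded for reproducible runs
    x = list(x0)
    fx = f(x)
    for _ in range(iters):
        cand = [xi + rng.gauss(0.0, sigma) for xi in x]
        fc = f(cand)
        if fc < fx:                    # improvement: accept, widen the search
            x, fx = cand, fc
            sigma *= grow
        else:                          # failure: reject, narrow the search
            sigma *= shrink
        sigma = max(sigma, 1e-12)      # keep the noise scale strictly positive
    return x, fx


def bowl(x):
    """Smooth test function with its minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


best_x, best_f = adaptive_noise_minimize(bowl, [5.0, 5.0])
print(best_x, best_f)
```

Because candidates are accepted only when they improve the objective, the returned value can never exceed the starting value; unlike gradient descent, the sketch needs no derivative information.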

AIM-1253

Author[s]: Tomaso Poggio

A Theory of How the Brain Might Work

December 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1253.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1253.pdf

I wish to propose a quite speculative new version of the grandmother cell theory to explain how the brain, or parts of it, may work. In particular, I discuss how the visual system may learn to recognize 3D objects. The model would apply directly to the cortical cells involved in visual face recognition. I will also outline the relation of our theory to existing models of the cerebellum and of motor control. Specific biophysical mechanisms can be readily suggested as part of a basic type of neural circuitry that can learn to approximate multidimensional input-output mappings from sets of examples and that is expected to be replicated in different regions of the brain and across modalities. The main points of the theory are:
- the brain uses modules for multivariate function approximation as basic components of several of its information processing subsystems.
- these modules are realized as HyperBF networks (Poggio and Girosi, 1990a,b).
- HyperBF networks can be implemented in terms of biologically plausible mechanisms and circuitry.
The theory predicts a specific type of population coding that represents an extension of schemes such as look-up tables. I will conclude with some speculations about the trade-off between memory and computation and the evolution of intelligence.

AIM-1252

Author[s]: Anita M. Flynn

The 1990 AI Fair

August 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1252.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1252.pdf

This year, as the finale to the Artificial Intelligence Laboratory’s annual Winter Olympics, the Lab staged an AI Fair – a night devoted to displaying the wide variety of talents and interests within the laboratory. The Fair provided an outlet for creativity and fun in a carnival-like atmosphere. Students organized events from robot boat races to face-recognition vision contests. Research groups came together to make posters and booths explaining their work. The robots rolled down out of the labs, networks were turned over to aerial combat computer games and walls were decorated with posters of zany ideas for the future. Everyone pitched in, and this photograph album is a pictorial account of the fun that night at the AI Fair.

AITR-1251

Author[s]: Robert Joseph Hall

Program Improvement by Automatic Redistribution of Intermediate Results

February 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1251.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1251.pdf

Introducing function sharing into designs allows eliminating costly structure by adapting existing structure to perform its function. This can eliminate many inefficiencies of reusing general components in specific contexts. "Redistribution of intermediate results'' focuses on instances where adaptation requires only addition/deletion of data flow and unused code removal. I show that this approach unifies and extends several well-known optimization classes. The system performs search and screening by deriving, using a novel explanation-based generalization technique, operational filtering predicates from input teleological information. The key advantage is to focus the system's effort on optimizations that are easier to prove safe.

AIM-1250

Author[s]: W. Eric L. Grimson, Daniel P. Huttenlocher and David W. Jacobs

Affine Matching with Bounded Sensor Error: A Study of Geometric Hashing and Alignment

August 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1250.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1250.pdf

Affine transformations are often used in recognition systems to approximate the effects of perspective projection. The underlying mathematics is for exact feature data, with no positional uncertainty. In practice, heuristics are added to handle uncertainty. We provide a precise analysis of affine point matching, obtaining an expression for the range of affine-invariant values consistent with bounded uncertainty. This analysis reveals that the range of affine-invariant values depends on the actual $x$-$y$ positions of the features, i.e. with uncertainty, affine representations are not invariant with respect to the Cartesian coordinate system. We analyze the effect of this on geometric hashing and alignment recognition methods.
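
As a small illustration of the exact-data case the abstract starts from, the sketch below computes the affine coordinates of a point relative to three basis points and checks that they are unchanged by an affine map. The helper names `affine_coords` and `apply_affine` are hypothetical, and the sketch does not reproduce the memo's error analysis; it only shows the invariant that bounded sensor error then perturbs.

```python
def affine_coords(p, o, e1, e2):
    """Return (alpha, beta) such that p = o + alpha*(e1 - o) + beta*(e2 - o)."""
    ux, uy = e1[0] - o[0], e1[1] - o[1]
    vx, vy = e2[0] - o[0], e2[1] - o[1]
    px, py = p[0] - o[0], p[1] - o[1]
    det = ux * vy - uy * vx
    if det == 0:
        raise ValueError("basis points are collinear")
    # Cramer's rule for the 2x2 system alpha*u + beta*v = p - o
    alpha = (px * vy - py * vx) / det
    beta = (ux * py - uy * px) / det
    return alpha, beta


def apply_affine(A, t, p):
    """Apply the affine map x -> A x + t to a 2D point."""
    return (A[0][0] * p[0] + A[0][1] * p[1] + t[0],
            A[1][0] * p[0] + A[1][1] * p[1] + t[1])


basis = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
point = (2.0, 3.0)
A, t = ((2.0, 1.0), (0.0, 3.0)), (5.0, -1.0)

moved_basis = tuple(apply_affine(A, t, q) for q in basis)
moved_point = apply_affine(A, t, point)

print(affine_coords(point, *basis))              # exact data, original frame
print(affine_coords(moved_point, *moved_basis))  # same values after the map

# Perturbing one basis point by a small sensor error changes the coordinates.
noisy_basis = (moved_basis[0], (moved_basis[1][0] + 0.1, moved_basis[1][1]),
               moved_basis[2])
print(affine_coords(moved_point, *noisy_basis))
```

With exact data both frames give the same pair of coordinates; once a basis point is perturbed, the computed values shift by an amount that depends on where the features lie, which is the effect the memo analyzes.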

AIM-1249

Author[s]: Harold Abelson, Andrew A. Berlin, Jacob Katzenelson, William H. McAllister, Guillermo J. Rozas and Gerald Jay Sussman

The Supercomputer Toolkit and Its Applications

July 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1249.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1249.pdf

The Supercomputer Toolkit is a proposed family of standard hardware and software components from which special-purpose machines can be easily configured. Using the Toolkit, a scientist or an engineer, starting with a suitable computational problem, will be able to readily configure a special purpose multiprocessor that attains supercomputer-class performance on that problem, at a fraction of the cost of a general purpose supercomputer. The Toolkit is currently being built as a joint project between Hewlett-Packard and MIT. The software and the applications are in various stages of development and research.

AITR-1248

Author[s]: Andrew Andai Chien

Concurrent Aggregates (CA): An Object-Oriented Language for Fine-Grained Message-Passing Machines

July 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1248.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1248.pdf

Fine-grained parallel machines have the potential for very high speed computation. To program massively-concurrent MIMD machines, programmers need tools for managing complexity. These tools should not restrict program concurrency. Concurrent Aggregates (CA) provides multiple-access data abstraction tools, Aggregates, which can be used to implement abstractions with virtually unlimited potential for concurrency. Such tools allow programmers to modularize programs without reducing concurrency. I describe the design, motivation, implementation and evaluation of Concurrent Aggregates. CA has been used to construct a number of application programs. Multi-access data abstractions are found to be useful in constructing highly concurrent programs.

AITR-1247

Author[s]: Bruce R. Thompson

The PHD: A Planar, Harmonic Drive Robot for Joint Torque Control

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1247.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1247.pdf

This thesis details the development of a model of a seven degree of freedom manipulator for position control. Then, it goes on to discuss the design and construction of the PHD, a robot built to serve two purposes: first, to perform research on joint torque control schemes, and second, to determine the important dynamic characteristics of the Harmonic Drive. The PHD is a planar, three degree of freedom arm with torque sensors integral to each joint. Preliminary testing has shown that a simple linear spring model of the Harmonic Drive's flexibility is suitable in many situations.

AITR-1245

Author[s]: Donald Scott Wills

Pi: A Parallel Architecture Interface for Multi-Model Execution

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1245.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1245.pdf

This thesis defines Pi, a parallel architecture interface that separates model and machine issues, allowing them to be addressed independently. This provides greater flexibility for both the model and machine builder. Pi addresses a set of common parallel model requirements including low latency communication, fast task switching, low cost synchronization, efficient storage management, the ability to exploit locality, and efficient support for sequential code. Since Pi provides generic parallel operations, it can efficiently support many parallel programming models including hybrids of existing models. Pi also forms a basis of comparison for architectural components.

AITR-1244

Author[s]: Michael Dean Levin

Design and Control of a Closed-Loop Brushless Torque Actuator

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1244.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1244.pdf

This report explores the design and control issues associated with a brushless actuator capable of achieving extremely high torque accuracy. Models of several different motor - sensor configurations were studied to determine dynamic characteristics. A reaction torque sensor fixed to the motor stator was implemented to decouple the transmission dynamics from the sensor. This resulted in a compact actuator with higher bandwidth and precision than could be obtained with an inline or joint sensor. Testing demonstrated that closed-loop torque accuracy was within 0.1%, and the mechanical bandwidth approached 300 Hz.

AIM-1242

Author[s]: Manfred Fahle

On the Shifter Hypothesis for the Elimination of Motion Blur

August 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1242.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1242.pdf

Moving objects may stimulate many retinal photoreceptors within the integration time of the receptors without motion blur being experienced. Anderson and Van Essen (1987) suggested that the neuronal representation of retinal images is shifted on its way to the cortex, in an opposite direction to the motion. Thus, the cortical representation of objects would be stationary. I have measured thresholds for two vernier stimuli, moving simultaneously in opposite directions over identical positions. Motion blur for these stimuli is not stronger than with a single moving stimulus, and thresholds can be below a photoreceptor diameter. This result cannot be easily reconciled with the hypothesis of "shifter circuits''.

AIM-1240

Author[s]: Manfred Fahle and Gunther Palm

A Model for Rivalry Between Cognitive Contours

June 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1240.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1240.pdf

The interactions between illusory and real contours have been investigated under monocular, binocular and dichoptic conditions. Results show that under all three presentation conditions, periodic alternations, generally called rivalry, occur during the perception of cognitive (or illusory) triangles, while earlier research had failed to find such rivalry (Bradley & Dumais, 1975). With line triangles, rivalry is experienced only under dichoptic conditions. A model is proposed to account for the observed phenomena, and the results of simulations are presented.

AIM-1239

Author[s]: Shimon Edelman and Heinrich H. Bulthoff

Viewpoint-Specific Representations in Three-Dimensional Object Recognition

August 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1239.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1239.pdf

We report a series of psychophysical experiments that explore different aspects of the problem of object representation and recognition in human vision. Contrary to the paradigmatic view which holds that the representations are three-dimensional and object-centered, the results consistently support the notion of view-specific representations that include at most partial depth information. In simulated experiments that involved the same stimuli shown to the human subjects, computational models built around two-dimensional multiple-view representations replicated our main psychophysical results, including patterns of generalization errors and the time course of perceptual learning.

AIM-1238

Author[s]: Gary C. Borchardt

Transition Space

November 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1238.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1238.pdf

Informal causal descriptions of physical systems abound in sources such as encyclopedias, reports and user's manuals. Yet these descriptions remain largely opaque to computer processing. This paper proposes a representational framework in which such descriptions are viewed as providing partial specifications of paths in a space of possible transitions, or transition space. In this framework, the task of comprehending informal causal descriptions emerges as one of completing the specifications of paths in transition space---filling causal gaps and relating accounts of activity varied by analogy and abstraction. The use of the representation and its operations is illustrated in the context of a simple description concerning rocket propulsion.

AITR-1237

Author[s]: Camille Z. Chammas

Analysis and Implementation of Robust Grasping Behaviors

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1237.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1237.pdf

This thesis addresses the problem of developing automatic grasping capabilities for robotic hands. Using a 2-jointed and a 4-jointed model of the hand, we establish the geometric conditions necessary for achieving form closure grasps of cylindrical objects. We then define and show how to construct the grasping pre-image for quasi-static (friction dominated) and zero-G (inertia dominated) motions for sensorless and sensor-driven grasps with and without arm motions. While the approach does not rely on detailed modeling, it is computationally inexpensive, reliable, and easy to implement. Example behaviors were successfully implemented on the Salisbury hand and on a planar 2-fingered, 4 degree-of-freedom hand.

AIM-1236

Author[s]: Jonathan Amsterdam

The Iterate Manual

October 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1236.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1236.pdf

This is the manual for version 1.1 of Iterate, a powerful iteration macro for Common Lisp. Iterate is similar to Loop but provides numerous additional features, is well integrated with Lisp, and is extensible.

AITR-1235

Author[s]: Helen Greiner

Passive and Active Grasping with a Prehensile Robot End-Effector

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1235.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1235.pdf

This report presents a design of a new type of robot end-effector with inherent mechanical grasping capabilities. Concentrating on designing an end-effector to grasp a simple class of objects, cylindrical, allowed a design with only one degree of actuation. The key features of this design are high bandwidth response to forces, passive grasping capabilities, ease of control, and ability to wrap around objects with simple geometries providing form closure. A prototype of this mechanism was built to evaluate these features.

AITR-1234

Author[s]: David J. Bennett

The Control of Human Arm Movement: Models and Mechanical Constraints

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1234.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1234.pdf

A serial-link manipulator may form a mobile closed kinematic chain when interacting with the environment, if it is redundant with respect to the task degrees of freedom (DOFs) at the endpoint. If the mobile closed chain assumes a number of configurations, then loop consistency equations permit the manipulator and task kinematics to be calibrated simultaneously using only the joint angle readings; endpoint sensing is not required. Example tasks include a fixed endpoint (0 DOF task), the opening of a door (1 DOF task), and point contact (3 DOF task). Identifiability conditions are derived for these various tasks.

AITR-1233

Author[s]: Ellen Spertus

Dataflow Computation for the J-Machine

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1233.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1233.pdf

The dataflow model of computation exposes and exploits parallelism in programs without requiring programmer annotation; however, instruction-level dataflow is too fine-grained to be efficient on general-purpose processors. A popular solution is to develop a "hybrid" model of computation where regions of dataflow graphs are combined into sequential blocks of code. I have implemented such a system to allow the J-Machine to run Id programs, leaving exposed a high amount of parallelism, such as among loop iterations. I describe this system and provide an analysis of its strengths and weaknesses and those of the J-Machine, along with ideas for improvement.

AITR-1232

Author[s]: Jeff F. Tabor

Noise Reduction Using Low Weight and Constant Weight Coding Techniques

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1232.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1232.pdf

Signalling off-chip requires significant current. As a result, a chip's power-supply current changes drastically during certain output-bus transitions. These current fluctuations cause a voltage drop between the chip and circuit board due to the parasitic inductance of the power-supply package leads. Digital designers often go to great lengths to reduce this "transmitted" noise. Cray, for instance, carefully balances output signals using a technique called differential signalling to guarantee a chip has constant output current. Transmitted-noise reduction costs Cray a factor of two in output pins and wires. Coding achieves similar results at smaller costs.
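
The pin-count comparison can be illustrated with a minimal sketch. It assumes one plausible reading of "constant weight coding": every codeword drives the same number of outputs high, so the total output current does not depend on the data. The function names and the choice of half-weight codewords are illustrative assumptions, not taken from the report.

```python
from itertools import combinations
from math import comb  # requires Python 3.8+


def min_constant_weight_pins(data_bits):
    """Smallest pin count n (with weight n // 2) whose constant-weight
    codewords can represent every data word of the given width."""
    n = data_bits
    while comb(n, n // 2) < 2 ** data_bits:
        n += 1
    return n, n // 2


def constant_weight_codebook(data_bits):
    """Assign each data word a distinct n-pin codeword with exactly
    n // 2 ones (hypothetical assignment: first codewords enumerated)."""
    n, weight = min_constant_weight_pins(data_bits)
    codebook = []
    for ones in combinations(range(n), weight):
        codebook.append(sum(1 << position for position in ones))
        if len(codebook) == 2 ** data_bits:
            break
    return codebook


if __name__ == "__main__":
    for bits in (4, 8):
        pins, weight = min_constant_weight_pins(bits)
        print(f"{bits} data bits: {pins} pins at weight {weight}, "
              f"versus {2 * bits} pins for differential signalling")
```

Under these assumptions an 8-bit bus needs 11 constant-weight pins rather than the 16 that differential signalling would use.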

AIM-1231

Author[s]: Patrick H. Winston and Satyajit Rao

Repairing Learned Knowledge Using Experience

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1231.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1231.pdf

Explanation-based learning occurs when something useful is retained from an explanation, usually an account of how some particular problem can be solved given a sound theory. Many real-world explanations are not based on sound theory, however, and wrong things may be learned accidentally, as subsequent failures will likely demonstrate. In this paper, we describe ways to isolate the facts that cause failures, ways to explain why those facts cause problems, and ways to repair learning mistakes. In particular, our program learns to distinguish pails from cups after making a few mistakes.

AIM-1230

Author[s]: Anita Flynn (editor)

Olympic Robot Building Manual

December 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1230.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1230.pdf

The 1989 AI Lab Winter Olympics will take a slightly different twist from previous Olympiads. Although there will still be a dozen or so athletic competitions, the annual talent show finale will now be a display not of human talent, but of robot talent. Spurred on by the question, "Why aren't there more robots running around the AI Lab?", Olympic Robot Building is an attempt to teach everyone how to build a robot and get them started. Robot kits will be given out the last week of classes before the Christmas break and teams have until the Robot Talent Show, January 27th, to build a machine that intelligently connects perception to action. There is no constraint on what can be built; participants are free to pick their own problems and solution implementations. As Olympic Robot Building is purposefully a talent show, there is no particular obstacle course to be traversed or specific feat to be demonstrated. The hope is that this format will promote creativity, freedom and imagination. This manual provides a guide to overcoming all the practical problems in building things. What follows are tutorials on the components supplied in the kits: a microprocessor circuit "brain", a variety of sensors and motors, a mechanical building block system, a complete software development environment, some example robots and a few tips on debugging and prototyping. Parts given out in the kits can be used, ignored or supplemented, as the kits are designed primarily to overcome the inertia of getting started. If all goes well, then come February, there should be all kinds of new members running around the AI Lab!

AITR-1229

Author[s]: David Jerome Braunegg

MARVEL: A System for Recognizing World Locations with Stereo Vision

June 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1229.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1229.pdf

To use a world model, a mobile robot must be able to determine its own position in the world. To support truly autonomous navigation, I present MARVEL, a system that builds and maintains its own models of world locations and uses these models to recognize its world position from stereo vision input. MARVEL is designed to be robust with respect to input errors and to respond to a gradually changing world by updating its world location models. I present results from real-world tests of the system that demonstrate its reliability. MARVEL fits into a world modeling system under development.

AITR-1228

Author[s]: Maja J. Mataric

A Distributed Model for Mobile Robot Environment-Learning and Navigation

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1228.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1228.pdf

A distributed method for mobile robot navigation, spatial learning, and path planning is presented. It is implemented on a sonar-based physical robot, Toto, consisting of three competence layers: 1) Low-level navigation: a collection of reflex-like rules resulting in emergent boundary-tracing. 2) Landmark detection: dynamically extracts landmarks from the robot's motion. 3) Map learning: constructs a distributed map of landmarks. The parallel implementation allows for localization in constant time. Spreading of activation computes both topological and physical shortest paths in linear time. The main issues addressed are: distributed, procedural, and qualitative representation and computation, emergent behaviors, dynamic landmarks, minimized communication.
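
The idea of finding a topological shortest path by spreading activation can be sketched as follows. This is a sequential simulation under stated assumptions, not Toto's parallel implementation: activation spreads outward from the goal landmark one link per step, each landmark records the step at which it was first reached, and the robot then moves to whichever neighbor was reached earliest. The landmark names and the map are hypothetical, and physical (metric) path lengths are not modeled.

```python
from collections import deque


def spread_activation(neighbors, goal):
    """Return the number of links from every reachable landmark to the goal.

    Each landmark and link is visited once, so the work is linear in the
    size of the map.
    """
    distance = {goal: 0}
    frontier = deque([goal])
    while frontier:
        node = frontier.popleft()
        for nxt in neighbors[node]:
            if nxt not in distance:  # first activation wins
                distance[nxt] = distance[node] + 1
                frontier.append(nxt)
    return distance


def path_to_goal(neighbors, start, goal):
    """Follow decreasing activation distance from start to goal."""
    distance = spread_activation(neighbors, goal)
    if start not in distance:
        return None  # goal is not reachable from start
    path = [start]
    while path[-1] != goal:
        reachable = [n for n in neighbors[path[-1]] if n in distance]
        path.append(min(reachable, key=distance.__getitem__))
    return path


if __name__ == "__main__":
    # Hypothetical undirected landmark map (each link listed both ways).
    landmark_map = {
        "wall_a": ["corridor_b", "corner_e"],
        "corridor_b": ["wall_a", "wall_c"],
        "wall_c": ["corridor_b", "door_d"],
        "corner_e": ["wall_a", "door_d"],
        "door_d": ["wall_c", "corner_e"],
    }
    print(path_to_goal(landmark_map, "wall_a", "door_d"))
```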

AIM-1227

Author[s]: Rodney A. Brooks

The Behavior Language; User's Guide

April 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1227.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1227.pdf

The Behavior Language is a rule-based real-time parallel robot programming language originally based on ideas from [Brooks 86], [Connell 89], and [Maes 89]. It compiles into a modified and extended version of the subsumption architecture [Brooks 86] and thus has backends for a number of processors including the Motorola 68000 and 68HC11, the Hitachi 6301, and Common Lisp. Behaviors are groups of rules which are activatable by a number of different schemes. There are no shared data structures across behaviors, but instead all communication is by explicit message passing. All rules are assumed to run in parallel and asynchronously. It includes the earlier notions of inhibition and suppression, along with a number of mechanisms for spreading of activation.

AIM-1226

Author[s]: W. Eric L. Grimson

The Effect of Indexing on the Complexity of Object Recognition

April 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1226.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1226.pdf

Many current recognition systems use constrained search to locate objects in cluttered environments. Previous formal analysis has shown that the expected amount of search is quadratic in the number of model and data features, if all the data is known to come from a single object, but is exponential when spurious data is included. If one can group the data into subsets likely to have come from a single object, then terminating the search once a "good enough" interpretation is found reduces the expected search to cubic. Without successful grouping, terminated search is still exponential. These results apply to finding instances of a known object in the data. In this paper, we turn to the problem of selecting models from a library, and examine the combinatorics of determining that a candidate object is not present in the data. We show that the expected search is again exponential, implying that naïve approaches to indexing are likely to carry an expensive overhead, since an exponential amount of work is needed to weed out each of the incorrect models. The analytic results are shown to be in agreement with empirical data for cluttered object recognition.

AIM-1225

Author[s]: Andre DeHon, Tom Knight and Marvin Minsky

Fault-Tolerant Design for Multistage Routing Networks

April 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1225.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1225.pdf

As the size of digital systems increases, the mean time between single component failures diminishes. To avoid component-related failures, large computers must be fault-tolerant. In this paper, we focus on methods for achieving a high degree of fault-tolerance in multistage routing networks. We describe a multipath scheme for providing end-to-end fault-tolerance on large networks. The scheme improves routing performance while keeping network latency low. We also describe the novel routing component, RN1, which implements this scheme, showing how it can be the basic building block for fault-tolerant multistage routing networks.

AITR-1224

Author[s]: Andre DeHon

Fat-Tree Routing for Transit

February 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1224.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1224.pdf

The Transit network provides high-speed, low-latency, fault-tolerant interconnect for high-performance, multiprocessor computers. The basic connection scheme for Transit uses bidelta style, multistage networks to support up to 256 processors. Scaling to larger machines by simply extending the bidelta network topology will result in a uniform degradation of network latency between all processors. By employing a fat-tree network structure in larger systems, the network provides locality and universality properties which can help minimize the impact of scaling on network latency. This report details the topology and construction issues associated with integrating Transit routing technology into fat-tree interconnect topologies.

AIM-1223

Author[s]: William Singhose

Shaping Inputs to Reduce Vibration: A Vector Diagram Approach

March 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1223.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1223.pdf

This paper describes a method for limiting vibration in flexible systems by shaping the system inputs. Unlike most previous attempts at input shaping, this method does not require an extensive system model or lengthy numerical computation; only knowledge of the system natural frequency and damping ratio is required. The effectiveness of this method when there are errors in the system model is explored and quantified. An algorithm is presented which, given an upper bound on acceptable residual vibration amplitude, determines a shaping strategy that is insensitive to errors in the estimated natural frequency. A procedure for shaping inputs to systems with input constraints is outlined. The shaping method is evaluated by dynamic simulations and hardware experiments.
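
To make concrete how little system knowledge input shaping needs, the following minimal sketch builds the standard two-impulse zero-vibration shaper from only a natural frequency and a damping ratio, then evaluates the residual vibration when the true frequency differs from the modeled one. It uses the textbook closed-form shaper for a single underdamped mode (damping ratio below 1) as an assumption; it is not the vector diagram construction or the insensitive shaping algorithm of the memo, and the function names are illustrative.

```python
import math


def zv_shaper(natural_freq_hz, damping_ratio):
    """Return [(time_s, amplitude), ...] for a two-impulse shaper.

    The command is convolved with these impulses; amplitudes sum to 1 so
    the final setpoint is unchanged.
    """
    omega = 2.0 * math.pi * natural_freq_hz
    root = math.sqrt(1.0 - damping_ratio ** 2)
    damped_omega = omega * root
    k = math.exp(-damping_ratio * math.pi / root)
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / damped_omega, k / (1.0 + k))]


def residual_vibration(impulses, actual_freq_hz, damping_ratio):
    """Vibration amplitude after the last impulse, relative to the
    response to a single unit impulse, for the actual system frequency."""
    omega = 2.0 * math.pi * actual_freq_hz
    damped_omega = omega * math.sqrt(1.0 - damping_ratio ** 2)
    last_time = impulses[-1][0]
    cos_sum = sum(a * math.exp(damping_ratio * omega * t) * math.cos(damped_omega * t)
                  for t, a in impulses)
    sin_sum = sum(a * math.exp(damping_ratio * omega * t) * math.sin(damped_omega * t)
                  for t, a in impulses)
    return math.exp(-damping_ratio * omega * last_time) * math.hypot(cos_sum, sin_sum)


if __name__ == "__main__":
    shaper = zv_shaper(natural_freq_hz=1.0, damping_ratio=0.05)
    for actual in (0.8, 1.0, 1.2):
        print(actual, residual_vibration(shaper, actual, 0.05))
```

The residual is essentially zero when the model is exact and grows as the actual frequency departs from the estimate, which is the sensitivity that the abstract's error analysis addresses.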

AIM-1220

Author[s]: Federico Girosi, Tomaso Poggio and Bruno Caprile

Extensions of a Theory of Networks for Approximation and Learning: Outliers and Negative Examples

July 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1220.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1220.pdf

Learning an input-output mapping from a set of examples can be regarded as synthesizing an approximation of a multi-dimensional function. From this point of view, this form of learning is closely related to regularization theory. In this note, we extend the theory by introducing ways of dealing with two aspects of learning: learning in the presence of unreliable examples and learning from positive and negative examples. The first extension corresponds to dealing with outliers among the sparse data. The second one corresponds to exploiting information about points or regions in the range of the function that are forbidden.

AIM-1218

Author[s]: J. Brian Subirana-Vilanova and Whitman Richards

Perceptual Organization, Figure-Ground, Attention and Saliency

August 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1218.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1218.pdf

Notions of figure-ground, inside-outside are difficult to define in a computational sense, yet seem intuitively meaningful. We propose that "figure" is an attention-directed region of visual information processing, and has a non-discrete boundary. Associated with "figure" is a coordinate frame and a "frame curve" which helps initiate the shape recognition process by selecting and grouping convex image chunks for later matching-to-model. We show that human perception is biased to see chunks outside the frame as more salient than those inside. Specific tasks, however, can reverse this bias. Near/far, top/bottom and expansion/contraction also behave similarly.

AIM-1216

Author[s]: Elizabeth Bradley

Causes and Effects of Chaos

December 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1216.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1216.pdf

Most of the recent literature on chaos and nonlinear dynamics is written either for popular science magazine readers or for advanced mathematicians. This paper gives a broad introduction to this interesting and rapidly growing field at a level that is between the two. The graphical and analytical tools used in the literature are explained and demonstrated, the rudiments of the current theory are outlined and that theory is discussed in the context of several examples: an electronic circuit, a chemical reaction and a system of satellites in the solar system.

AIM-1215

Author[s]: David McAllester

Automatic Recognition of Tractability in Inference Relations

February 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1215.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1215.pdf

A procedure is given for recognizing sets of inference rules that generate polynomial time decidable inference relations. The procedure can automatically recognize the tractability of the inference rules underlying congruence closure. The recognition of tractability for that particular rule set constitutes mechanical verification of a theorem originally proved independently by Kozen and Shostak. The procedure is algorithmic, rather than heuristic, and the class of automatically recognizable tractable rule sets can be precisely characterized. A series of examples of rule sets whose tractability is non-trivial, yet machine recognizable, is also given. The technical framework developed here is viewed as a first step toward a general theory of tractable inference relations.

AITR-1214

Author[s]: Nancy S. Pollard

The Grasping Problem: Toward Task-Level Programming for an Articulated Hand

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1214.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1214.pdf

This report presents a system for generating a stable, feasible, and reachable grasp of a polyhedral object. A set of contact points on the object is found that can result in a stable grasp; a feasible grasp is found in which the robot contacts the object at those contact points; and a path is constructed from the initial configuration of the robot to the stable, feasible final grasp configuration. The algorithm described in the report is designed for the Salisbury hand mounted on a Puma 560 arm, but a similar approach could be used to develop grasping systems for other robots.

AIM-1210

Author[s]: Manfred Fahle

Parallel Computation of Vernier Offsets, Curvature and Chevrons in Humans

December 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1210.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1210.pdf

A vernier offset is detected at once among straight lines, and reaction times are almost independent of the number of simultaneously presented stimuli (distractors), indicating parallel processing of vernier offsets. Reaction times for identifying a vernier offset to one side among verniers offset to the opposite side increase with the number of distractors, indicating serial processing. Even deviations below a photoreceptor diameter can be detected at once. The visual system thus attains positional accuracy below the photoreceptor diameter simultaneously at different positions. I conclude that deviation from straightness, or change of orientation, is detected in parallel over the visual field. Discontinuities or gradients in orientation may represent an elementary feature of vision.

AIM-1209

Author[s]: Manfred Fahle

Limits of Precision for Human Eye Motor Control

November 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1209.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1209.pdf

Dichoptic presentation of vernier stimuli, i.e., one segment to each eye, yielded three times higher thresholds than binocular presentation, mainly due to uncorrelated movements of both eyes. Thresholds allow one to calculate an upper estimate for the amplitudes of uncorrelated eye movements during fixation. This estimate matches the best results from direct eye position recording, with the calculated mean amplitude of eye tremor corresponding to roughly one photoreceptor diameter. The combined amplitude of both correlated and uncorrelated eye movements was also measured by delaying one segment of the vernier relative to its partner under monocular or dichoptic conditions.

AIM-1208

Author[s]: Manfred Fahle and Tom Troscianko

Computation of Texture and Stereoscopic Depth in Humans

October 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1208.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1208.pdf

The computation of texture and of stereoscopic depth is limited by a number of factors in the design of the optical front-end and subsequent processing stages in humans and machines. A number of limiting factors in the human visual system, such as resolution of the optics and opto-electronic interface, contrast, luminance, temporal resolution and eccentricity are reviewed and evaluated concerning their relevance for the recognition of texture and stereoscopic depth. The algorithms used by the human brain to discriminate between textures and to compute stereoscopic depth are very fast and efficient. Their study might be beneficial for the development of better algorithms in machine vision.

AITR-1205

Author[s]: Howard B. Reubenstein

Automated Acquisition of Evolving Informal Descriptions

June 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1205.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1205.pdf

The Listener is an automated system that unintrusively performs knowledge acquisition from informal input. The Listener develops a coherent internal representation of a description from an initial set of disorganized, imprecise, incomplete, ambiguous, and possibly inconsistent statements. The Listener can produce a summary document from its internal representation to facilitate communication, review, and validation. A special purpose Listener, called the Requirements Apprentice (RA), has been implemented in the software requirements acquisition domain. Unlike most other requirements analysis tools, which start from a formal description language, the focus of the RA is on the transition between informal and formal specifications.

AITR-1204

Author[s]: David Chapman

Vision, Instruction, and Action

April 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1204.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1204.pdf

This thesis describes Sonja, a system which uses instructions in the course of visually-guided activity. The thesis explores an integration of research in vision, activity, and natural language pragmatics. Sonja's visual system demonstrates the use of several intermediate visual processes, particularly visual search and routines, previously proposed on psychophysical grounds. The computations Sonja performs are compatible with the constraints imposed by neuroscientifically plausible hardware. Although Sonja can operate autonomously, it can also make flexible use of instructions provided by a human advisor. The system grounds its understanding of these instructions in perception and action.

AIM-1190

Author[s]: Joachim Heel

Direct Estimation of Structure and Motion from Multiple Frames

March 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1190.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1190.pdf

This paper presents a method for the estimation of scene structure and camera motion from a sequence of images. This approach is fundamentally new. No computation of optical flow or feature correspondences is required. The method processes image sequences of arbitrary length and exploits the redundancy for a significant reduction in error over time. No assumptions are made about camera motion or surface structure. Both quantities are fully recovered. Our method combines the "direct" motion vision approach with the theory of recursive estimation. Each step is illustrated and evaluated with results from real images.

AIM-1189

Author[s]: Feng Zhao

Machine Recognition as Representation and Search

December 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1189.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1189.pdf

Generality, representation, and control have been the central issues in machine recognition. Model-based recognition is the search for consistent matches of the model and image features. We present a comparative framework for the evaluation of different approaches, particularly those of ACRONYM, RAF, and Ikeuchi et al. The strengths and weaknesses of these approaches are discussed and compared and the remedies are suggested. Various tradeoffs made in the implementations are analyzed with respect to the systems' intended task-domains. The requirements for a versatile recognition system are motivated. Several directions for future research are pointed out.

AIM-1187

Author[s]: M. Ali Taalebinezhaad

Direct Recovery of Motion and Shape in the General Case by Fixation

March 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1187.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1187.pdf

This work introduces a direct method called FIXATION for solving the general motion vision problem. This Fixation method results in a constraint equation between translational and rotational velocities that in combination with the Brightness-Change Constraint Equation (BCCE) solves the general motion vision problem, arbitrary motion with respect to an arbitrary rigid environment. Neither Correspondence nor Optical Flow has been used here. Recently Direct Motion Vision methods have used the BCCE for solving the motion vision problem of special motions or environments. In contrast to those solutions, the Fixation method does not put such severe restrictions on the motion or the environment.

AIM-1186

Author[s]: David J. Braunegg

Location Recognition Using Stereo Vision

October 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1186.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1186.pdf

A mobile robot must be able to determine its own position in the world. To support truly autonomous navigation, we present a system that builds and maintains its own models of world locations and uses these models to recognize its world position from stereo vision input. The system is designed to be robust with respect to input errors and to respond to a gradually changing world by updating the world location models. We present results from tests of the system that demonstrate its reliability. The model builder and recognition system fit into a planned world modeling system that we describe.

AIM-1185

Author[s]: David J. Braunegg

An Alternative to Using the 3D Delaunay Tessellation for Representing Freespace

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1185.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1185.pdf

Representing the world in terms of visible surfaces and the freespace existing between these surfaces and the viewer is an important problem in robotics. Recently, researchers have proposed using the 3D Delaunay Tessellation for representing 3D stereo vision data and the freespace determined therefrom. We discuss problems with using the 3D Delaunay Tessellation as the basis of the representation and propose an alternative representation that we are currently investigating. This new representation is appropriate for planning mobile robot navigation and promises to be robust when using stereo data that has errors and uncertainty.

AIM-1184

Author[s]: David J. Braunegg

Stereo Feature Matching in Disparity Space

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1184.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1184.pdf

This paper describes a new method for matching, validating, and disambiguating features for stereo vision. It is based on the Marr-Poggio-Grimson stereo matching algorithm which uses zero-crossing contours in difference-of-Gaussian filtered images as features. The matched contours are represented in disparity space, which makes the information needed for matched contour validation and disambiguation easily accessible. The use of disparity space also makes the algorithm conceptually cleaner than previous implementations of the Marr-Poggio-Grimson algorithm and yields a more efficient matching process.

AIM-1183

Author[s]: Andrew Trice and Randall Davis

Consensus Knowledge Acquisition

December 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1183.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1183.pdf

We have developed a method and prototype program for assisting two experts in their attempts to construct a single, consensus knowledge base. We show that consensus building can be effectively facilitated by a debugging approach that identifies, explains, and resolves discrepancies in their knowledge. To implement this approach we identify and use recognition and repair procedures for a variety of discrepancies. Examples of this knowledge are illustrated with sample transcripts from CARTER, a system for reconciling two rule-based systems. Implications for resolving other kinds of knowledge representations are also examined.

AIM-1182

Author[s]: Rodney A. Brooks and Anita M. Flynn

Fast, Cheap and Out of Control

December 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1182.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1182.pdf

Spur-of-the-moment planetary exploration missions are within our reach. Complex systems and complex missions usually take years of planning and force launches to become incredibly expensive. We argue here for cheap, fast missions using large numbers of mass produced simple autonomous robots that are small by today's standards, perhaps 1 to 2kg. We suggest that within a few years it will be possible, at modest cost, to invade a planet with millions of tiny robots.

AIM-1181

Author[s]: Shimon Edelman and Tomaso Poggio

Bringing the Grandmother Back into the Picture: A Memory-Based View of Object Recognition

April 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1181.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1181.pdf

We describe experiments with a versatile pictorial prototype-based learning scheme for 3D object recognition. The GRBF scheme seems to be amenable to realization in biophysical hardware because the only kind of computation it involves can be effectively carried out by combining receptive fields. Furthermore, the scheme is computationally attractive because it brings together the old notion of a "grandmother" cell and the rigorous approximation methods of regularization and splines.

AIM-1180

Author[s]: Pattie Maes

How to Do the Right Thing

October 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1180.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1180.pdf

This paper presents a novel approach to the problem of action selection for an autonomous agent. An agent is viewed as a collection of competence modules. Action selection is modeled as an emergent property of an activation/inhibition dynamics among these modules. A concrete action selection algorithm is presented and a detailed account of the results is given. This algorithm combines characteristics of both traditional planners and reactive systems: it produces fast and robust activity in a tight interaction loop with the environment, while at the same time allowing for some prediction and planning to take place.

AITR-1179

Author[s]: Marc H. Raibert, H. Benjamin Brown, Jr., Michael Chepponis, Jeff Koechling, Jessica K. Hodgins, Diane Dustman, W. Kevin Brennan, David S. Barrett, Clay M. Thompson, John Daniell Hebert, Woojin Lee and Lance Borvansky

Dynamically Stable Legged Locomotion (September 1985-September 1989)

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1179.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1179.pdf

This report documents our work in exploring active balance for dynamic legged systems for the period from September 1985 through September 1989. The purpose of this research is to build a foundation of knowledge that can lead both to the construction of useful legged vehicles and to a better understanding of animal locomotion. In this report we focus on the control of biped locomotion, the use of terrain footholds, running at high speed, biped gymnastics, symmetry in running, and the mechanical design of articulated legs.

AIM-1178

Author[s]: Eric Sven Ristad and Robert C. Berwick

Computational Consequences of Agreement and Ambiguity in Natural Language

November 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1178.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1178.pdf

The computer science technique of computational complexity analysis can provide powerful insights into the algorithm-neutral analysis of information processing tasks. Here we show that a simple, theory-neutral linguistic model of syntactic agreement and ambiguity demonstrates that natural language parsing may be computationally intractable. Significantly, we show that it may be syntactic features rather than rules that can cause this difficulty. Informally, human languages and the computationally intractable Satisfiability (SAT) problem share two costly computational mechanisms: both enforce agreement among symbols across unbounded distances (Subject-Verb agreement) and both allow ambiguity (is a word a Noun or a Verb?).

AIM-1177

Author[s]: David W. Jacobs

Grouping For Recognition

November 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1177.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1177.pdf

This paper presents a new method of grouping edges in order to recognize objects. This grouping method succeeds on images of both two- and three-dimensional objects. So that the recognition system can consider first the collections of edges most likely to lead to the correct recognition of objects, we order groups of edges based on the likelihood that a single object produced them. The grouping module estimates this likelihood using the distance that separates edges and their relative orientation. This ordering greatly reduces the amount of computation required to locate objects and improves the system's robustness to error.

AIM-1176

Author[s]: David McAllester and Robert Givan

Natural Language Syntax and First Order Preference

October 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1176.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1176.pdf

We have argued elsewhere that first order inference can be made more efficient by using non-standard syntax for first order logic. In this paper we show how a fragment of English syntax under Montague semantics provides the foundation of a new inference procedure. This procedure seems more effective than corresponding procedures based on either classical syntax or our previously proposed taxonomic syntax. This observation may provide a functional explanation for some of the syntactic structure of English.

AIM-1175

Author[s]: Henrich Bulthoff and Manfred Fahle

Disparity Gradients and Depth Scaling

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1175.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1175.pdf

The binocular perception of shape and depth relations between objects can change considerably if the viewing direction is changed only by a small angle. We explored this effect psychophysically and found a strong depth reduction effect for large disparity gradients. The effect is found to be strongest for horizontally oriented stimuli, and stronger for line stimuli than for points. This depth scaling effect is discussed in a computational framework of stereo based on a Bayesian approach which allows integration of information from different types of matching primitives weighted according to their robustness.

AIM-1174

Author[s]: Harold Abelson

The Bifurcation Interpreter: A Step Towards the Automatic Analysis of Dynamical Systems

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1174.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1174.pdf

The Bifurcation Interpreter is a computer program that autonomously explores the steady-state orbits of one-parameter families of periodically-driven oscillators. To report its findings, the Interpreter generates schematic diagrams and English text descriptions similar to those appearing in the science and engineering research literature. Given a system of equations as input, the Interpreter uses symbolic algebra to automatically generate numerical procedures that simulate the system. The Interpreter incorporates knowledge about dynamical systems theory, which it uses to guide the simulations, to interpret the results, and to minimize the effects of numerical error.

AIM-1173

Author[s]: Ed Gamble

A Comparison of Hardware Implementations for Low-Level Vision Algorithms

November 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1173.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1173.pdf

Early and intermediate vision algorithms, such as smoothing and discontinuity detection, are often implemented on general-purpose serial, and more recently, parallel computers. Special-purpose hardware implementations of low-level vision algorithms may be needed to achieve real-time processing. This memo reviews and analyzes some hardware implementations of low-level vision algorithms. Two types of hardware implementations are considered: the digital signal processing chips of Ruetz (and Broderson) and the analog VLSI circuits of Carver Mead. The advantages and disadvantages of these two approaches for producing a general, real-time vision system are considered.

AIM-1171

Author[s]: Michael Eisenberg

Descriptive Simulation: Combining Symbolic and Numerical Methods in the Analysis of Chemical Reaction Mechanisms

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1171.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1171.pdf

The Kineticist's Workbench is a computer program currently under development whose purpose is to help chemists understand, analyze, and simplify complex chemical reaction mechanisms. This paper discusses one module of the program that numerically simulates mechanisms and constructs qualitative descriptions of the simulation results. These descriptions are given in terms that are meaningful to the working chemist (e.g., steady states, stable oscillations, and so on); and the descriptions (as well as the data structures used to construct them) are accessible as input to other programs.

AITR-1170

Author[s]: Eric Sven Ristad

Computational Structure of GPSG Models: Revised Generalized Phrase Structure Grammar

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1170.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1170.pdf

The primary goal of this report is to demonstrate how considerations from computational complexity theory can inform grammatical theorizing. To this end, generalized phrase structure grammar (GPSG) linguistic theory is revised so that its power more closely matches the limited ability of an ideal speaker-hearer: GPSG Recognition is EXP-POLY time hard, while Revised GPSG Recognition is NP-complete. A second goal is to provide a theoretical framework within which to better understand the wide range of existing GPSG models, embodied in formal definitions as well as in implemented computer programs. A grammar for English and an informal explanation of the GPSG/RGPSG syntactic features are included in appendices.

AIM-1168

Author[s]: Tomaso Poggio and Federico Girosi

Continuous Stochastic Cellular Automata that Have a Stationary Distribution and No Detailed Balance

December 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1168.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1168.pdf

Marroquin and Ramirez (1990) have recently discovered a class of discrete stochastic cellular automata with Gibbsian invariant measures that have a non-reversible dynamic behavior. Practical applications include more powerful algorithms than the Metropolis algorithm to compute MRF models. In this paper we describe a large class of stochastic dynamical systems that has a Gibbs asymptotic distribution but does not satisfy reversibility. We characterize sufficient properties of a sub-class of stochastic differential equations in terms of the associated Fokker-Planck equation for the existence of an asymptotic probability distribution in the system of coordinates which is given. Practical implications include VLSI analog circuits to compute coupled MRF models.

AIM-1167

Author[s]: Tomaso Poggio and Federico Girosi

Extensions of a Theory of Networks for Approximation and Learning: Dimensionality Reduction and Clustering

April 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1167.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1167.pdf

The theory developed in Poggio and Girosi (1989) shows the equivalence between regularization and a class of three-layer networks that we call regularization networks or Hyper Basis Functions. These networks are also closely related to the classical Radial Basis Functions used for interpolation tasks and to several pattern recognition and neural network algorithms. In this note, we extend the theory by defining a general form of these networks with two sets of modifiable parameters in addition to the coefficients $c_\alpha$: moving centers and adjustable norm-weight.

AIM-1166

Author[s]: Bonnie J. Dorr

Conceptual Basis of the Lexicon in Machine Translation

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1166.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1166.pdf

This report describes the organization and content of lexical information required for the task of machine translation. In particular, the lexical-conceptual basis for UNITRAN, an implemented machine translation system, will be described. UNITRAN uses an underlying form called lexical conceptual structure to perform lexical selection and syntactic realization. Lexical word entries have two levels of description: the first is an underlying lexical-semantic representation that is derived from hierarchically organized primitives, and the second is a mapping from this representation to a corresponding syntactic structure. The interaction of these two levels will be discussed and the lexical selection and syntactic realization processes will be described.

AIM-1165

Author[s]: Brian LaMacchia and Jason Nieh

The Standard Map Machine

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1165.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1165.pdf

We have designed the Standard Map Machine (SMM) as an answer to the intensive computational requirements involved in the study of chaotic behavior in nonlinear systems. The high-speed and high-precision performance of this computer is due to its simple architecture specialized to the numerical computations required of nonlinear systems. In this report, we discuss the design and implementation of this special-purpose machine.
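
For readers unfamiliar with the map behind the machine's name, the sketch below iterates the standard (Chirikov) map in its usual textbook form, p' = p + K sin(theta), theta' = theta + p', with both variables taken modulo 2*pi. The abstract does not state the map, so this form, the function names, and the choice of parameter values are assumptions for illustration only and say nothing about how the hardware organizes the computation.

```python
import math

TWO_PI = 2.0 * math.pi

def standard_map_step(theta, p, k):
    """One iteration of the textbook standard map with nonlinearity parameter k."""
    p_next = (p + k * math.sin(theta)) % TWO_PI
    theta_next = (theta + p_next) % TWO_PI
    return theta_next, p_next

def orbit(theta0, p0, k, steps):
    """Return the initial point followed by `steps` iterates."""
    points = [(theta0, p0)]
    theta, p = theta0, p0
    for _ in range(steps):
        theta, p = standard_map_step(theta, p, k)
        points.append((theta, p))
    return points

# With k = 0 the momentum is conserved and the angle advances uniformly.
trajectory = orbit(theta0=1.0, p0=0.5, k=0.0, steps=3)
```

Studies of chaotic behavior typically need very long orbits at high precision for many initial conditions, which is the kind of workload the abstract describes as motivating special-purpose hardware.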

AIM-1164

Author[s]: Federico Girosi and Tomaso Poggio

Networks and the Best Approximation Property

October 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1164.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1164.pdf

Networks can be considered as approximation schemes. Multilayer networks of the backpropagation type can approximate continuous functions arbitrarily well (Cybenko, 1989; Funahashi, 1989; Stinchcombe and White, 1989). We prove that networks derived from regularization theory, including Radial Basis Functions (Poggio and Girosi, 1989), have a similar property. From the point of view of approximation theory, however, the property of approximating continuous functions arbitrarily well is not sufficient for characterizing good approximation schemes. More critical is the property of best approximation. The main result of this paper is that multilayer networks, of the type used in backpropagation, do not have the best approximation property. For regularization networks (in particular Radial Basis Function networks) we prove existence and uniqueness of best approximation.

AITR-1163

Author[s]: Kenneth Man-Kam Yip

KAM: Automatic Planning and Interpretation of Numerical Experiments Using Geometrical Methods

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1163.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1163.pdf

KAM is a computer program that can automatically plan, monitor, and interpret numerical experiments with Hamiltonian systems with two degrees of freedom. The program has recently helped solve an open problem in hydrodynamics. Unlike other approaches to qualitative reasoning about physical system dynamics, KAM embodies a significant amount of knowledge about nonlinear dynamics. KAM's ability to control numerical experiments arises from the fact that it not only produces pictures for us to see, but also looks at (sic, in its mind's eye) the pictures it draws to guide its own actions. KAM is organized in three semantic levels: orbit recognition, phase space searching, and parameter space searching. Within each level spatial properties and relationships that are not explicitly represented in the initial representation are extracted by iteratively applying three operations: (1) aggregation, (2) partition, and (3) classification.

AITR-1162

Author[s]: Jean-Pierre Schott

Three-Dimensional Motion Estimation Using Shading Information in Multiple Frames

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1162.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1162.pdf

A new formulation for recovering the structure and motion parameters of a moving patch using both motion and shading information is presented. It is based on a new differential constraint equation (FICE) that links the spatiotemporal gradients of irradiance to the motion and structure parameters and the temporal variations of the surface shading. The FICE separates the contribution to the irradiance spatiotemporal gradients of the gradients due to texture from those due to shading and allows the FICE to be used for textured and textureless surface. The new approach, combining motion and shading information, leads directly to two different contributions: it can compensate for the effects of shading variations in recovering the shape and motion; and it can exploit the shading/illumination effects to recover motion and shape when they cannot be recovered without it. The FICE formulation is also extended to multiple frames.

AITR-1161

Author[s]: Lyle J. Borg-Graham

Modelling the Somatic Electrical Response of Hippocampal Pyramidal Neurons

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1161.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1161.pdf

A modeling study of hippocampal pyramidal neurons is described. This study is based on simulations using HIPPO, a program which simulates the somatic electrical activity of these cells. HIPPO is based on a) descriptions of eleven non-linear conductances that have been either reported for this class of cell in the literature or postulated in the present study, and b) an approximation of the electrotonic structure of the cell that is derived in this thesis, based on data for the linear properties of these cells. HIPPO is used a) to integrate empirical data from a variety of sources on the electrical characteristics of this type of cell, b) to investigate the functional significance of the various elements that underlie the electrical behavior, and c) to provide a tool for the electrophysiologist to supplement direct observation of these cells and provide a method of testing speculations regarding parameters that are not accessible.

AIM-1160

Author[s]: Bonnie J. Dorr

Lexical Conceptual Structure and Generation in Machine Translation

June 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1160.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1160.pdf

This report introduces an implemented scheme for generating target-language sentences using a compositional representation of meaning called lexical conceptual structure. Lexical conceptual structure facilitates two crucial operations associated with generation: lexical selection and syntactic realization. The compositional nature of the representation is particularly valuable for these two operations when semantically equivalent source-and-target-language words and phrases are structurally or thematically divergent. To determine the correct lexical items and syntactic realization associated with the surface form in such cases, the underlying lexical-semantic forms are systematically mapped to the target-language syntactic structures. The model described constitutes a lexical-semantic extension to UNITRAN.

AIM-1158

Author[s]: Shimon Edelman and Daphna Weinshall

Computational Vision: A Critical Review

October 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1158.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1158.pdf

We review the progress made in computational vision, as represented by Marr's approach, in the last fifteen years. First, we briefly outline computational theories developed for low, middle and high-level vision. We then discuss in more detail solutions proposed to three representative problems in vision, each dealing with a different level of visual processing. Finally, we discuss modifications to the currently established computational paradigm that appear to be dictated by the recent developments in vision.

AIM-1157

Author[s]: Thomas Marill

Recognizing Three-Dimensional Objects without the Use of Models

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1157.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1157.pdf

We present an approach to the problem of recognizing three-dimensional objects from line-drawings. In this approach there are no models. The system needs only to be given a single picture of an object; it can then recognize the object in arbitrary orientations.

AIM-1156

Author[s]: Sandiway Fong

Free Indexation: Combinatorial Analysis and a Compositional Algorithm

December 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1156.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1156.pdf

In the principles-and-parameters model of language, the principle known as "free indexation" plays an important part in determining the referential properties of elements such as anaphors and pronominals. This paper addresses two issues. (1) We investigate the combinatorics of free indexation. In particular, we show that free indexation must produce an exponential number of referentially distinct structures. (2) We introduce a compositional free indexation algorithm. We prove that the algorithm is "optimal." More precisely, by relating the compositional structure of the formulation to the combinatorial analysis, we show that the algorithm enumerates precisely all possible indexings, without duplicates.
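
To make the counting claim concrete, the sketch below enumerates every way of assigning indices to n noun phrases exactly once, under the assumption that two indexings are the same when they differ only by renaming indices (so each indexing corresponds to a partition of the noun phrases into coreference classes). This is a generic enumeration by restricted growth strings, not the compositional algorithm of the memo, and the function name is hypothetical.

```python
def indexings(n):
    """Yield each indexing of n noun phrases once, up to renaming of indices.

    Position i of a yielded tuple is the index given to noun phrase i. A new
    index may only be the smallest one not used so far, which rules out
    duplicates that differ merely in the names of the indices.
    """
    def extend(prefix, used):
        if len(prefix) == n:
            yield tuple(prefix)
            return
        for index in range(used + 1):
            prefix.append(index)
            yield from extend(prefix, max(used, index + 1))
            prefix.pop()
    yield from extend([], 0)

three = list(indexings(3))
counts = [sum(1 for _ in indexings(n)) for n in range(1, 7)]
```

Under this assumption the counts for one to six noun phrases are 1, 2, 5, 15, 52, and 203, which illustrates how quickly the number of referentially distinct structures grows.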

AITR-1155

Author[s]: Michael A. Erdmann

On Probabilistic Strategies for Robot Tasks

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1155.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1155.pdf

Robots must act purposefully and successfully in an uncertain world. Sensory information is inaccurate or noisy, actions may have a range of effects, and the robot’s environment is only partially and imprecisely modeled. This thesis introduces active randomization by a robot, both in selecting actions to execute and in focusing on sensory information to interpret, as a basic tool for overcoming uncertainty. An example of randomization is given by the strategy of shaking a bin containing a part in order to orient the part in a desired stable state with some high probability. Another example consists of first using reliable sensory information to bring two parts close together, then relying on short random motions to actually mate the two parts, once the part motions lie below the available sensing resolution. Further examples include tapping parts that are tightly wedged, twirling gears before trying to mesh them, and vibrating parts to facilitate a mating operation.

AITR-1154

Author[s]: Anya C. Hurlbert

The Computation of Color

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1154.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1154.pdf

This thesis takes an interdisciplinary approach to the study of color vision, focussing on the phenomenon of color constancy formulated as a computational problem. The primary contributions of the thesis are (1) the demonstration of a formal framework for lightness algorithms; (2) the derivation of a new lightness algorithm based on regularization theory; (3) the synthesis of an adaptive lightness algorithm using "learning" techniques; (4) the development of an image segmentation algorithm that uses luminance and color information to mark material boundaries; and (5) an experimental investigation into the cues that human observers use to judge the color of the illuminant. Other computational approaches to color are reviewed and some of their links to psychophysics and physiology are explored.

AITR-1153

Author[s]: Andrew Dean Christian

Design and Implementation of a Flexible Robot

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1153.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1153.pdf

This robot has low natural frequencies of vibration. Insights into the problems of designing joint and link flexibility are discussed. The robot has three flexible rotary actuators and two flexible, interchangeable links, and is controlled by three independent processors on a VMEbus. Results from experiments on the control of residual vibration for different types of robot motion are presented. Impulse prefiltering and slowly accelerating moves are compared and shown to be effective at reducing residual vibration.
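
One common form of impulse prefiltering is the two-impulse "zero vibration" shaper, in which the command is convolved with two impulses timed half a damped period apart so that the vibration excited by the first is cancelled by the second. The sketch below computes such a sequence for a single vibration mode and checks the cancellation. Whether the thesis uses exactly this sequence is an assumption; the function names and the example frequency and damping values are hypothetical.

```python
import math

def zv_shaper(natural_freq_hz, damping_ratio):
    """Amplitudes and times (seconds) of a two-impulse zero-vibration shaper."""
    omega = 2.0 * math.pi * natural_freq_hz
    root = math.sqrt(1.0 - damping_ratio ** 2)
    k = math.exp(-damping_ratio * math.pi / root)
    amplitudes = [1.0 / (1.0 + k), k / (1.0 + k)]
    times = [0.0, math.pi / (omega * root)]  # half the damped period
    return amplitudes, times

def residual_vibration(amplitudes, times, natural_freq_hz, damping_ratio):
    """Residual vibration amplitude of one mode, relative to a unit impulse."""
    omega = 2.0 * math.pi * natural_freq_hz
    omega_d = omega * math.sqrt(1.0 - damping_ratio ** 2)
    c = sum(a * math.exp(damping_ratio * omega * t) * math.cos(omega_d * t)
            for a, t in zip(amplitudes, times))
    s = sum(a * math.exp(damping_ratio * omega * t) * math.sin(omega_d * t)
            for a, t in zip(amplitudes, times))
    return math.exp(-damping_ratio * omega * times[-1]) * math.hypot(c, s)

# Hypothetical mode: 2 Hz natural frequency, 5 percent damping.
amps, ts = zv_shaper(2.0, 0.05)
shaped = residual_vibration(amps, ts, 2.0, 0.05)
unshaped = residual_vibration([1.0], [0.0], 2.0, 0.05)
```

The price of the cancellation is a delay of half a vibration period in the command, which matters more for a robot with low natural frequencies.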

AIM-1152

Author[s]: Shimon Ullman and Ronen Basri

Recognition by Linear Combinations of Models

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1152.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1152.pdf

Visual object recognition requires the matching of an image with a set of models stored in memory. In this paper we propose an approach to recognition in which a 3-D object is represented by the linear combination of 2-D images of the object. If $M = \{M_1, \ldots, M_k\}$ is the set of pictures representing a given object, and $P$ is the 2-D image of an object to be recognized, then $P$ is considered an instance of $M$ if $P = \sum_{i=1}^{k} a_i M_i$ for some constants $a_i$. We show that this approach correctly handles rigid 3-D transformations of objects with sharp as well as smooth boundaries, and can also handle non-rigid transformations. The paper is divided into two parts. In the first part we show that the variety of views depicting the same object under different transformations can often be expressed as the linear combinations of a small number of views. In the second part we suggest how this linear combination property may be used in the recognition process.
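
The membership test stated in this abstract can be sketched as a least-squares fit: find the coefficients that best express the image as a combination of the stored views, then accept the image if the residual is small. The code below is a minimal sketch under the assumption that each view is given as a flat list of numbers in corresponding order; the function names, the toy data, and the use of the normal equations are choices made for this example, not details taken from the memo.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_linear_combination(models, image):
    """Least-squares coefficients and residual norm for image ~ sum_i a_i * models[i]."""
    gram = [[dot(mi, mj) for mj in models] for mi in models]
    rhs = [dot(mi, image) for mi in models]
    coeffs = solve(gram, rhs)
    residual = [p - sum(a * m[k] for a, m in zip(coeffs, models))
                for k, p in enumerate(image)]
    return coeffs, dot(residual, residual) ** 0.5

# Toy data: two stored views and two candidate images (all values hypothetical).
models = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]
inside = [2.0, -0.5, 3.5, 0.5]   # equals 2 * models[0] - 0.5 * models[1]
outside = [1.0, 1.0, 0.0, 0.0]   # not in the span of the stored views
coeffs_in, err_in = fit_linear_combination(models, inside)
coeffs_out, err_out = fit_linear_combination(models, outside)
```

The first image is reproduced with a residual near zero, while the second leaves a clearly nonzero residual; a recognition system would compare that residual against a tolerance that accounts for image noise.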

AITR-1151

Author[s]: Jonathan Connell

A Colony Architecture for an Artificial Creature

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1151.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1151.pdf

This report describes a working autonomous mobile robot whose only goal is to collect and return empty soda cans. It operates in an unmodified office environment occupied by moving people. The robot is controlled by a collection of over 40 independent "behaviors" distributed over a loosely coupled network of 24 processors. Together this ensemble helps the robot locate cans with its laser rangefinder, collect them with its on-board manipulator, and bring them home using a compass and an array of proximity sensors. We discuss the advantages of using such a multi-agent control system and show how to decompose the required tasks into component activities. We also examine the benefits and limitations of spatially local, stateless, and independent computation by the agents.

AIM-1148

Author[s]: Anita M. Flynn and Rodney A. Brooks

Battling Reality

October 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1148.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1148.pdf

In the four years that the MIT Mobile Robot Project has been in existence, we have built ten robots that focus research in various areas concerned with building intelligent systems. Towards this end, we have embarked on trying to build useful autonomous creatures that live and work in the real world. Many of the preconceived notions entertained before we started building our robots turned out to be misguided. Some issues we thought would be hard have worked successfully from day one and subsystems we imagined to be trivial have become tremendous time sinks. Oddly enough, one of our biggest failures has led to some of our favorite successes. This paper describes the changing paths our research has taken due to the lessons learned from the practical realities of building robots.

AIM-1146

Author[s]: Shimon Edelman and Daphna Weinshall

A Self-Organizing Multiple-View Representation of 3D Objects

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1146.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1146.pdf

We explore representation of 3D objects in which several distinct 2D views are stored for each object. We demonstrate the ability of a two-layer network of thresholded summation units to support such representations. Using unsupervised Hebbian relaxation, we trained the network to recognise ten objects from different viewpoints. The training process led to the emergence of compact representations of the specific input views. When tested on novel views of the same objects, the network exhibited a substantial generalisation capability. In simulated psychophysical experiments, the network's behavior was qualitatively similar to that of human subjects.

AIM-1145

Author[s]: Andrew Berlin and Daniel Weise

Compiling Scientific Code Using Partial Evaluation

July 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1145.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1145.pdf

Scientists are faced with a dilemma: either they can write abstract programs that express their understanding of a problem, but which do not execute efficiently; or they can write programs that computers can execute efficiently, but which are difficult to write and difficult to understand. We have developed a compiler that uses partial evaluation and scheduling techniques to provide a solution to this dilemma.

AITR-1144

Author[s]: Andrew A. Berlin

A Compilation Strategy for Numerical Programs Based on Partial Evaluation

February 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1144.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1144.pdf

This work demonstrates how partial evaluation can be put to practical use in the domain of high-performance numerical computation. I have developed a technique for performing partial evaluation by using placeholders to propagate intermediate results. For an important class of numerical programs, a compiler based on this technique improves performance by an order of magnitude over conventional compilation techniques. I show that by eliminating inherently sequential data-structure references, partial evaluation exposes the low-level parallelism inherent in a computation. I have implemented several parallel scheduling and analysis programs that study the tradeoffs involved in the design of an architecture that can effectively utilize this parallelism. I present these results using the 9-body gravitational attraction problem as an example.
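
The idea of running a program on placeholders can be sketched in a few lines: operations on known values are computed immediately, while operations that touch a placeholder are recorded as straight-line code. The toy below specializes a dot product against a known vector. It only illustrates the general technique; the class and function names are hypothetical, and generating Python source with `exec` is an assumption of this example rather than anything the report describes.

```python
import itertools

class Placeholder:
    """Stands in for a value that is unknown until run time."""
    def __init__(self, name):
        self.name = name

class Tracer:
    """Computes on known values and records operations on placeholders."""
    def __init__(self):
        self.lines = []
        self._ids = itertools.count()

    @staticmethod
    def show(value):
        return value.name if isinstance(value, Placeholder) else repr(value)

    def _emit(self, op, x, y):
        if not isinstance(x, Placeholder) and not isinstance(y, Placeholder):
            return x + y if op == "+" else x * y  # both known: evaluate now
        result = Placeholder("t%d" % next(self._ids))
        self.lines.append("%s = %s %s %s"
                          % (result.name, self.show(x), op, self.show(y)))
        return result

    def add(self, x, y):
        return self._emit("+", x, y)

    def mul(self, x, y):
        return self._emit("*", x, y)

def dot(tracer, xs, ys):
    """General dot product written with a loop over data structures."""
    total = 0.0
    for x, y in zip(xs, ys):
        total = tracer.add(total, tracer.mul(x, y))
    return total

def specialize_dot(known):
    """Return a loop-free function of v equal to dot(known, v), plus its source."""
    tracer = Tracer()
    inputs = [Placeholder("v[%d]" % i) for i in range(len(known))]
    result = dot(tracer, known, inputs)
    body = "\n".join("    " + line for line in tracer.lines)
    source = "def residual(v):\n%s\n    return %s\n" % (body, Tracer.show(result))
    namespace = {}
    exec(source, namespace)
    return namespace["residual"], source

specialized, source = specialize_dot([2.0, 0.5, 3.0])
```

The generated `source` contains no loop and no list traversal of the known vector, only a sequence of arithmetic statements, which is the form in which independent operations become visible to a scheduler.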

AITR-1143

Author[s]: W. Kenneth Stewart, Jr.

Multisensor Modeling Underwater with Uncertain Information

July 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1143.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1143.pdf

This thesis develops an approach to the construction of multidimensional stochastic models for intelligent systems exploring an underwater environment. It describes methods for building models by a three-dimensional spatial decomposition of stochastic, multisensor feature vectors. New sensor information is incrementally incorporated into the model by stochastic backprojection. Error and ambiguity are explicitly accounted for by blurring a spatial projection of remote sensor data before incorporation. The stochastic models can be used to derive surface maps or other representations of the environment. The methods are demonstrated on data sets from multibeam bathymetric surveying, towed sidescan bathymetry, towed sidescan acoustic imagery, and high-resolution scanning sonar aboard a remotely operated vehicle.

AIM-1141

Author[s]: Ellen C. Hildreth, Norberto M. Grzywacz, Edward H. Adelson and Victor K. Inada

The Perceptual Buildup of Three-Dimensional Structure from Motion

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1141.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1141.pdf

We present psychophysical experiments that measure the accuracy of perceived 3D structure derived from relative image motion. The experiments are motivated by Ullman's incremental rigidity scheme, which builds up 3D structure incrementally over an extended time. Our main conclusions are: first, the human system derives an accurate model of the relative depths of moving points, even in the presence of noise; second, the accuracy of 3D structure improves with time, eventually reaching a plateau; and third, the 3D structure currently perceived depends on previous 3D models. Through computer simulations, we relate the psychophysical observations to the behavior of Ullman's model.

AIM-1140

Author[s]: Tomaso Poggio and Federico Girosi

A Theory of Networks for Approximation and Learning

July 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1140.ps.Z

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1140.pdf

Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multi-dimensional function, that is, solving the problem of hypersurface reconstruction. From this point of view, this form of learning is closely related to classical approximation techniques, such as generalized splines and regularization theory. This paper considers the problems of an exact representation and, in more detail, of the approximation of linear and nonlinear mappings in terms of simpler functions of fewer variables. Kolmogorov's theorem concerning the representation of functions of several variables in terms of functions of one variable turns out to be almost irrelevant in the context of networks for learning. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF), since they are mathematically related to the well-known Radial Basis Functions, mainly used for strict interpolation tasks. GRBF networks are not only equivalent to generalized splines, but are also closely related to pattern recognition methods such as Parzen windows and potential functions and to several neural network algorithms, such as Kanerva's associative memory, backpropagation and Kohonen's topology preserving map. They also have an interesting interpretation in terms of prototypes that are synthesized and optimally combined during the learning stage. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.

AITR-1139

Author[s]: Jason Nieh

Using Special-Purpose Computing to Examine Chaotic Behavior in Nonlinear Mappings

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1139.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1139.pdf

Studying chaotic behavior in nonlinear systems requires numerous computations in order to simulate the behavior of such systems. The Standard Map Machine was designed and implemented as a special computer for performing these intensive computations with high speed and high precision. Its impressive performance is due to its simple architecture specialized to the numerical computations required of nonlinear systems. This report discusses the design and implementation of the Standard Map Machine and its use in the study of nonlinear mappings; in particular, the study of the standard map.
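
For readers unfamiliar with the standard map named in the abstract above, the following is a minimal sketch of iterating it in software. It assumes the usual textbook (Chirikov) form of the map with both variables taken modulo 2*pi. The function names and the parameter `k` are illustrative only, and nothing here reflects the design of the Standard Map Machine itself.

```python
import math

TWO_PI = 2.0 * math.pi


def standard_map_step(theta, p, k):
    """Apply one iteration of the standard map (textbook form, assumed here).

    p'     = p + k * sin(theta)   (mod 2*pi)
    theta' = theta + p'           (mod 2*pi)
    """
    p_next = (p + k * math.sin(theta)) % TWO_PI
    theta_next = (theta + p_next) % TWO_PI
    return theta_next, p_next


def orbit(theta0, p0, k, steps):
    """Return the list of (theta, p) points visited, including the start."""
    points = [(theta0, p0)]
    theta, p = theta0, p0
    for _ in range(steps):
        theta, p = standard_map_step(theta, p, k)
        points.append((theta, p))
    return points
```

Calling `orbit(theta0, p0, k, steps)` for many initial conditions and plotting the points gives the familiar phase portrait. Long orbits are sensitive to rounding error, which is one reason high-precision arithmetic matters for this kind of study.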

AIM-1138

Author[s]: Shimon Edelman, Heinrich Bulthoff and Daphna Weinshall

Stimulus Familiarity Determines Recognition Strategy for Novel 3-D Objects

July 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1138.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1138.pdf

We describe a psychophysical investigation of the effects of object complexity and familiarity on the variation of recognition time and recognition accuracy over different views of novel 3D objects. Our findings indicate that with practice the response times for different views become more uniform and the initially orderly dependency of the response time on the distance to a "good" view disappears. One possible interpretation of our results is in terms of a tradeoff between memory needed for storing specific-view representations of objects and time spent in recognizing the objects.

AIM-1137

Author[s]: J. Brian Subirana-Vilanova

Curved Inertia Frames: Visual Attention and Perceptual Organization Using Convexity and Symmetry

October 1991

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1137.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1137.pdf

In this paper we present an approach to perceptual organization and attention based on Curved Inertia Frames (C.I.F.), a novel definition of "curved axis of inertia" tolerant to noisy and spurious data. The definition is useful because it can find frames that correspond to large, smooth, convex, symmetric and central parts. It is novel because it is global and can detect curved axes. We discuss briefly the relation to human perception, the recognition of non-rigid objects, shape description, and extensions to finding "features", inside/outside relations, and long-smooth ridges in arbitrary surfaces.

AIM-1136

Author[s]: Thomas Marill

Computer Perception of Three-Dimensional Objects

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1136.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1136.pdf

We first pose the following problem: to develop a program which takes line-drawings as input and constructs three-dimensional objects as output, such that the output objects are the same as the ones we see when we look at the input line-drawing. We then introduce the principle of minimum standard-deviation of angles (MSDA) and discuss a program based on MSDA. We present the results of testing this program with a variety of line-drawings and show that the program constitutes a solution to the stated problem over the range of line-drawings tested. Finally, we relate this work to its historical antecedents in the psychological and computer-vision literature.

AIM-1134

Author[s]: David McAllester and Robert Givan

Taxonomic Syntax for First-Order Inference

June 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1134.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1134.pdf

Most knowledge representation languages are based on classes and taxonomic relationships between classes. Taxonomic hierarchies without defaults or exceptions are semantically equivalent to a collection of formulas in first order predicate calculus. Although designers of knowledge representation languages often express an intuitive feeling that there must be some advantage to representing facts as taxonomic relationships rather than first order formulas, there are few, if any, technical results supporting this intuition. We attempt to remedy this situation by presenting a taxonomic syntax for first order predicate calculus and a series of theorems that support the claim that taxonomic syntax is superior to classical syntax.

AIM-1133

Author[s]: Todd Anthony Cass

Feature Matching for Object Localization in the Presence of Uncertainty

May 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1133.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1133.pdf

We consider the problem of matching model and sensory data features in the presence of geometric uncertainty, for the purpose of object localization and identification. The problem is to construct sets of model feature and sensory data feature pairs that are geometrically consistent given that there is uncertainty in the geometry of the sensory data features. If there is no geometric uncertainty, polynomial-time algorithms are possible for feature matching, yet these approaches can fail when there is uncertainty in the geometry of data features. Existing matching and recognition techniques which account for the geometric uncertainty in features either cannot guarantee finding a correct solution, or can construct geometrically consistent sets of feature pairs yet have worst case exponential complexity in terms of the number of features. The major new contribution of this work is to demonstrate a polynomial-time algorithm for constructing sets of geometrically consistent feature pairs given uncertainty in the geometry of the data features. We show that under a certain model of geometric uncertainty the feature matching problem in the presence of uncertainty is of polynomial complexity. This has important theoretical implications by demonstrating an upper bound on the complexity of the matching problem, and by offering insight into the nature of the matching problem itself. These insights prove useful in the solution to the matching problem in higher dimensional cases as well, such as matching three-dimensional models to either two or three-dimensional sensory data. The approach is based on an analysis of the space of feasible transformation parameters. This paper outlines the mathematical basis for the method, and describes the implementation of an algorithm for the procedure. Experiments demonstrating the method are reported.

AITR-1132

Author[s]: Todd A. Cass

Robust 2-D Model-Based Object Recognition

May 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1132.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1132.pdf

Techniques, suitable for parallel implementation, for robust 2D model-based object recognition in the presence of sensor error are studied. Models and scene data are represented as local geometric features and robust hypothesis of feature matchings and transformations is considered. Bounds on the error in the image feature geometry are assumed constraining possible matchings and transformations. Transformation sampling is introduced as a simple, robust, polynomial-time, and highly parallel method of searching the space of transformations to hypothesize feature matchings. Key to the approach is that error in image feature measurement is explicitly accounted for. A Connection Machine implementation and experiments on real images are presented.

AIM-1131

Author[s]: Daphna Weinshall

Direct Computation of 3D Shape Invariants and the Focus of Expansion

May 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1131.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1131.pdf

Structure from motion often refers to the computation of 3D structure from a matched sequence of images. However, a depth map of a surface is difficult to compute and may not be a good representation for storage and recognition. Given matched images, I will first show that the sign of the normal curvature in a given direction at a given point in the image can be computed from a simple difference of slopes of line-segments in one image. Using this result, local surface patches can be classified as convex, concave, parabolic (cylindrical), hyperbolic (saddle point) or planar. At the same time the translational component of the optical flow is obtained, from which the focus of expansion can be computed.

AITR-1128

Author[s]: Jeffrey Van Baalen

Toward a Theory of Representation Design

May 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1128.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1128.pdf

This research is concerned with designing representations for analytical reasoning problems (of the sort found on the GRE and LSAT). These problems test the ability to draw logical conclusions. A computer program was developed that takes as input a straightforward predicate calculus translation of a problem, requests additional information if necessary, decides what to represent and how, designs representations capturing the constraints of the problem, and creates and executes a LISP program that uses those representations to produce a solution. Even though these problems are typically difficult for theorem provers to solve, the LISP program that uses the designed representations is very efficient.

AIM-1126

Author[s]: Anita M. Flynn, Rodney A. Brooks and Lee S. Tavrow

Twilight Zones and Cornerstones: A Gnat Robot Double Feature

July 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1126.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1126.pdf

We want to build tiny gnat-sized robots, a millimeter or two in diameter. They will be cheap, disposable, totally self-contained autonomous agents able to do useful things in the world. This paper consists of two parts. The first describes why we want to build them. The second is a technical outline of how to go about it. Gnat robots are going to change the world.

AITR-1125

Author[s]: Michelle Kwok Lee

Summarizing Qualitative Behavior from Measurements of Nonlinear Circuits

May 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1125.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1125.pdf

This report describes a program which automatically characterizes the behavior of any driven, nonlinear, electrical circuit. To do this, the program autonomously selects interesting input parameters, drives the circuit, measures its response, performs a set of numeric computations on the measured data, interprets the results, and decomposes the circuit's parameter space into regions of qualitatively distinct behavior. The output is a two-dimensional portrait summarizing the high-level, qualitative behavior of the circuit for every point in the graph, an accompanying textual explanation describing any interesting patterns observed in the diagram, and a symbolic description of the circuit's behavior which can be passed on to other programs for further analysis.

AIM-1122

Author[s]: Michael R. Brent

Causal/Temporal Connectives: Syntax and Lexicon

September 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1122.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1122.pdf

This report elucidates the linguistic representation of temporal relations among events. This involves examining sentences that contain two clauses connected by words like once, by the time, when, and before. Specifically, the effect of the tenses of the connected clauses on the acceptability of sentences is examined. For example, Rachel disappeared once Jon had fallen asleep is fine, but *Rachel had disappeared once Jon fell asleep is unacceptable. A theory of acceptability is developed and its implications for interpretation discussed. Factoring of the linguistic knowledge into a general, syntactic component and a lexical component clarifies the interpretation problem. Finally, a computer model of the theory is demonstrated.

AIM-1120

Author[s]: Anita M. Flynn, Rodney A. Brooks, William M. Wells III and David S. Barrett

SQUIRT: The Prototypical Mobile Robot for Autonomous Graduate Students

July 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1120.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1120.pdf

This paper describes an exercise in building a complete robot aimed at being as small as possible but using off-the-shelf components exclusively. The result is an autonomous mobile robot slightly larger than one cubic inch which incorporates sensing, actuation, onboard computation, and onboard power supplies. Nicknamed Squirt, this robot acts as a 'bug', hiding in dark corners and venturing out in the direction of last heard noises, only moving after the noises are long gone.

AIM-1119

Author[s]: Henry M. Wu

A Multiprocessor Architecture Using Modular Arithmetic for Very High Precision Computation

April 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1119.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1119.pdf

We outline a multiprocessor architecture that uses modular arithmetic to implement numerical computation with 900 bits of intermediate precision. A proposed prototype, to be implemented with off-the-shelf parts, will perform high-precision arithmetic as fast as some workstations and minicomputers can perform IEEE double-precision arithmetic. We discuss how the structure of modular arithmetic conveniently maps into a simple, pipelined multiprocessor architecture. We present techniques we developed to overcome a few classical drawbacks of modular arithmetic. Our architecture is suitable to and essential for the study of chaotic dynamical systems.
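
The abstract above does not spell out how modular arithmetic yields wide-precision results. As a rough illustration of the general idea only, the sketch below uses a residue number system: an integer is held as its remainders modulo several pairwise coprime moduli, each remainder is updated independently (which is what makes the scheme easy to parallelize), and the full value is recovered with the Chinese Remainder Theorem. The moduli, function names, and word sizes are assumptions chosen for readability, not details from the memo.

```python
from math import prod

# Hypothetical small moduli, pairwise coprime:
# 251 is prime, 253 = 11 * 23, 255 = 3 * 5 * 17, 256 = 2 ** 8.
MODULI = (251, 253, 255, 256)


def to_residues(x, moduli=MODULI):
    """Represent a non-negative integer by its remainder in each modulus."""
    return tuple(x % m for m in moduli)


def add(a, b, moduli=MODULI):
    """Add two residue vectors; each channel is independent of the others."""
    return tuple((x + y) % m for x, y, m in zip(a, b, moduli))


def mul(a, b, moduli=MODULI):
    """Multiply two residue vectors channel by channel, with no carries."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_residues(residues, moduli=MODULI):
    """Recover the integer in [0, prod(moduli)) via the Chinese Remainder Theorem."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        x += r * partial * pow(partial, -1, m)
    return x % total
```

Results are exact as long as the true value stays below the product of the moduli; adding more moduli widens the representable range. Comparison, sign detection, and division are the classically awkward operations in such a representation.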

AIM-1118

Author[s]: Michael D. Monegan

An Object-Oriented Software Reuse Tool

April 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1118.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1118.pdf

The Object-oriented Reuse Tool (ORT) supports the reuse of object-oriented software by maintaining a library of reusable classes and recording information about their reusability as well as information associated with their design and verification. In the early design phases of object-oriented development, ORT facilitates reuse by providing a flexible way to navigate the library, thereby aiding in the process of refining a design to maximally reuse existing classes. A collection of extensions to ORT has also been identified. These extensions would compose the remainder of a system useful in increasing reuse in object-oriented software production.

AITR-1117

Author[s]: Bror V. H. Saxberg

A Modern Differential Geometric Approach to Shape from Shading

June 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1117.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1117.pdf

How the visual system extracts shape information from a single grey-level image can be approached by examining how the information about shape is contained in the image. This technical report considers the characteristic equations derived by Horn as a dynamical system. Certain image critical points generate dynamical system critical points. The stable and unstable manifolds of these critical points correspond to convex and concave solution surfaces, giving more general existence and uniqueness results. A new kind of highly parallel, robust shape from shading algorithm is suggested on neighborhoods of these critical points. The information at bounding contours in the image is also analyzed.

AIM-1115

Author[s]: Andrew Christian

Design Considerations for an Earth-Based Flexible Robotic System

March 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1115.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1115.pdf

This paper provides insights into the problems of designing a robot with joint and link flexibility. The deflection of the robot under gravity is correlated with the fundamental frequency of vibration. We consider different types of link geometry and evaluate the flexibility potential of different materials. Some general conclusions and guidelines for constructing a flexible robot are given.

AIM-1114

Author[s]: Davi Geiger and Federico Girosi

Parallel and Deterministic Algorithms for MRFs: Surface Reconstruction and Integration

May 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1114.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1114.pdf

In recent years many researchers have investigated the use of Markov random fields (MRFs) for computer vision. The computational complexity of the implementation has been a drawback of MRFs. In this paper we derive deterministic approximations to MRF models. All the theoretical results are obtained in the framework of the mean field theory from statistical mechanics. Because we use MRF models the mean field equations lead to parallel and iterative algorithms. One of the considered models for image reconstruction is shown to give in a natural way the graduated non-convexity algorithm proposed by Blake and Zisserman.

AITR-1113

Author[s]: Karen Beth Sarachik

Visual Navigation: Constructing and Utilizing Simple Maps of an Indoor Environment

March 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1113.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1113.pdf

The goal of this work is to navigate through an office environment using only visual information gathered from four cameras placed onboard a mobile robot. The method is insensitive to physical changes within the room it is inspecting, such as moving objects. Forward and rotational motion vision are used to find doors and rooms, and these can be used to build topological maps. The map is built without the use of odometry or trajectory integration. The long term goal of the project described here is for the robot to build simple maps of its environment and to localize itself within this framework.

AITR-1112

Author[s]: Richard P. Wildes

On Interpreting Stereo Disparity

February 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1112.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1112.pdf

The problems under consideration center around the interpretation of binocular stereo disparity. In particular, the goal is to establish a set of mappings from stereo disparity to corresponding three-dimensional scene geometry. An analysis has been developed that shows how disparity information can be interpreted in terms of three-dimensional scene properties, such as surface depth, discontinuities, and orientation. These theoretical developments have been embodied in a set of computer algorithms for the recovery of scene geometry from input stereo disparity. The results of applying these algorithms to several disparity maps are presented. Comparisons are made to the interpretation of stereo disparity by biological systems.

AIM-1111

Author[s]: W. Eric L. Grimson

The Combinatorics of Heuristic Search Termination for Object Recognition in Cluttered Environments

May 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1111.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1111.pdf

Many recognition systems use constrained search to locate objects in cluttered environments. Earlier analysis showed that the expected search is quadratic in the number of model and data features, if all the data comes from one object, but is exponential when spurious data is included. To overcome this, many methods terminate search once an interpretation that is "good enough" is found. We formally examine the combinatorics of this, showing that correct termination procedures dramatically reduce search. We provide conditions on the object model and the scene clutter such that the expected search is quartic. These results are shown to agree with empirical data for cluttered object recognition.

AIM-1110

Author[s]: W. Eric L. Grimson and Daniel P. Huttenlocher

On the Verification of Hypothesized Matches in Model-Based Recognition

May 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1110.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1110.pdf

In model-based recognition, ad hoc techniques are used to decide if a match of data to model is correct. Generally an empirically determined threshold is placed on the fraction of model features that must be matched. We rigorously derive conditions under which to accept a match, relating the probability of a random match to the fraction of model features accounted for, as a function of the number of model features, number of image features and the sensor noise. We analyze some existing recognition systems and show that our method yields results comparable with experimental data.

AITR-1109

Author[s]: Jeremy M. Wertheimer

Derivation of an Efficient Rule System Pattern Matcher

February 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1109.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1109.pdf

Formalizing algorithm derivations is a necessary prerequisite for developing automated algorithm design systems. This report describes a derivation of an algorithm for incrementally matching conjunctive patterns against a growing database. This algorithm, which is modeled on the Rete matcher used in the OPS5 production system, forms a basis for efficiently implementing a rule system. The highlights of this derivation are: (1) a formal specification for the rule system matching problem, (2) derivation of an algorithm for this task using a lattice-theoretic model of conjunctive and disjunctive variable substitutions, and (3) optimization of this algorithm, using finite differencing, for incrementally processing new data.

AIM-1108

Author[s]: Thomas M. Breuel

Indexing for Visual Recognition from a Large Model Base

August 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1108.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1108.pdf

This paper describes a new approach to the model base indexing stage of visual object recognition. Fast model base indexing of 3D objects is achieved by accessing a database of encoded 2D views of the objects using a fast 2D matching algorithm. The algorithm is specifically intended as a plausible solution for the problem of indexing into very large model bases that general purpose vision systems and robots will have to deal with in the future. Other properties that make the indexing algorithm attractive are that it can take advantage of most geometric and non-geometric properties of features without modification, and that it addresses the incremental model acquisition problem for 3D objects.

AIM-1106

Author[s]: Jeff Palmucci, Carl Waldspurger, David Duis and Paul Krause

Experience with Acore: Implementing GHC with Actors

August 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1106.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1106.pdf

This paper presents a concurrent interpreter for a general-purpose concurrent logic programming language, Guarded Horn Clauses (GHC). Unlike typical implementations of GHC in logic programming languages, the interpreter is implemented in the Actor language Acore. The primary motivation for this work was to probe the strengths and weaknesses of Acore as a platform for developing sophisticated programs. The GHC interpreter provided a rich testbed for exploring Actor programming methodology. The interpreter is a pedagogical investigation of the mapping of GHC constructs onto the Actor model. Since we opted for simplicity over optimization, the interpreter is somewhat inefficient.

AIM-1105

Author[s]: Berthold K.P. Horn

Height and Gradient from Shading

May 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1105.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1105.pdf

The method described here for recovering the shape of a surface from a shaded image can deal with complex, wrinkled surfaces. Integrability can be enforced easily because both surface height and gradient are represented. The robustness of the method stems in part from linearization of the reflectance map about the current estimate of the surface orientation at each picture cell. The new scheme can find an exact solution of a given shape-from-shading problem even though a regularizing term is included. This is a reflection of the fact that shape-from-shading problems are not ill-posed when boundary conditions are available or when the image contains singular points.

AITR-1103

Author[s]: Henry M. Wu

Performance Evaluation of the Scheme 86 and HP Precision Architecture

April 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1103.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1103.pdf

The Scheme86 and the HP Precision Architectures represent different trends in computer processor design. The former uses wide micro-instructions, parallel hardware, and a low latency memory interface. The latter encourages pipelined implementation and visible interlocks. To compare the merits of these approaches, algorithms frequently encountered in numerical and symbolic computation were hand-coded for each architecture. Timings were done in simulators and the results were evaluated to determine the speed of each design. Based on these measurements, conclusions were drawn as to which aspects of each architecture are suitable for a high-performance computer.

AIM-1102

Author[s]: Richard C. Waters

XP. A Common Lisp Pretty Printing System

March 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1102.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1102.pdf

XP provides efficient and flexible support for pretty printing in Common Lisp. Its single greatest advantage is that it allows the full benefits of pretty printing to be obtained when printing data structures, as well as when printing program code. XP is efficient, because it is based on a linear time algorithm that uses only a small fixed amount of storage. XP is flexible, because users can control the exact form of the output via a set of special format directives. XP can operate on arbitrary data structures, because facilities are provided for specifying pretty printing methods for any type of object. XP also modifies the way abbreviation based on length, nesting depth, and circularity is supported so that they automatically apply to user-defined functions that perform output – e.g., print functions for structures. In addition, a new abbreviation mechanism is introduced that can be used to limit the total number of lines printed.

AIM-1102A

Author[s]: Richard C. Waters

XP. A Common Lisp Pretty Printing System

August 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1102a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1102a.pdf

AIM-1101

Author[s]: Thomas F. Knight, Jr. and Patrick G. Sobalvarro

Routing Statistics for Unqueued Banyan Networks

September 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1101.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1101.pdf

Banyan networks comprise a large class of networks that have been used for interconnection in large-scale multiprocessors and telephone switching systems. Regular variants of Banyan networks, such as delta and butterfly networks, have been used in multiprocessors such as the IBM RP3 and the BBN Butterfly. Analysis of the performance of Banyan networks has typically focused on these regular variants. We present a methodology for performance analysis of unbuffered Banyan multistage interconnection networks. The methodology has two novel features: it allows analysis of networks where some inputs are more likely to be active than others, and allows analysis of Banyan networks of arbitrary topology.

AIM-1100

Author[s]: Charles Rich and Richard C. Waters

Intelligent Assistance for Program Recognition, Design, Optimization, and Debugging

January 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1100.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1100.pdf

A recognition assistant will help reconstruct the design of a program, given only its source code. A design assistant will assist a programmer by detecting errors and inconsistencies in his design choices and by automatically making many straightforward implementation decisions. An optimization assistant will help improve the performance of programs by identifying intermediate results that can be reused. A debugging assistant will aid in the detection, localization, and repair of errors in designs as well as completed programs.

AITR-1099

Author[s]: Mark Harper Shirley

Generating Circuit Tests by Exploiting Designed Behavior

December 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1099.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1099.pdf

This thesis describes two programs for generating tests for digital circuits that exploit several kinds of expert knowledge not used by previous approaches. First, many test generation problems can be solved efficiently using operation relations, a novel representation of circuit behavior that connects internal component operations with directly executable circuit operations. Operation relations can be computed efficiently by searching traces of simulated circuit behavior. Second, experts write test programs rather than test vectors because programs are more readable and compact. Test programs can be constructed automatically by merging program fragments using expert-supplied goal-refinement rules and domain-independent planning techniques.

AIM-1096

Author[s]: Boris Katz

Using English For Indexing and Retrieving

October 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1096.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1096.pdf

This paper describes a natural language system START. The system analyzes English text and automatically transforms it into an appropriate representation, the knowledge base, which incorporates the information found in the text. The user gains access to information stored in the knowledge base by querying it in English. The system analyzes the query and decides through a matching process what information in the knowledge base is relevant to the question. Then it retrieves this information and formulates its response also in English.

AIM-1094

Author[s]: Harold Abelson, Michael Eisenberg, Matthew Halfant, Jacob Katzenelson, Elisha Sacks, Gerald Jay Sussman, Jack Wisdom and Ken Yip

Intelligence in Scientific Computing

November 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1094.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1094.pdf

Combining numerical techniques with ideas from symbolic computation and with methods incorporating knowledge of science and mathematics leads to a new category of intelligent computational tools for scientists and engineers. These tools autonomously prepare simulation experiments from high-level specifications of physical models. For computationally intensive experiments, they automatically design special-purpose numerical engines optimized to perform the necessary computations. They actively monitor numerical and physical experiments. They interpret experimental data and formulate numerical results in qualitative terms. They enable their human users to control computational experiments in terms of high-level behavioral descriptions.

AITR-1092

Author[s]: Eric Saund

The Role of Knowledge in Visual Shape Representation

October 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1092.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1092.pdf

This report shows how knowledge about the visual world can be built into a shape representation in the form of a descriptive vocabulary making explicit the important geometrical relationships comprising objects' shapes. Two computational tools are offered: (1) Shape tokens are placed on a Scale-Space Blackboard, (2) Dimensionality-reduction captures deformation classes in configurations of tokens. Knowledge lies in the token types and deformation classes tailored to the constraints and regularities of particular shape worlds. A hierarchical shape vocabulary has been implemented supporting several later visual tasks in the two-dimensional shape domain of the dorsal fins of fishes.

AIM-1091

Author[s]: Rodney A. Brooks

A Robot that Walks: Emergent Behaviors from a Carefully Evolved Network

February 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1091.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1091.pdf

Most animals have significant behavioral expertise built in without having to explicitly learn it all from scratch. This expertise is a product of evolution of the organism; it can be viewed as a very long term form of learning which provides a structured system within which individuals might learn more specialized skills or abilities. This paper suggests one possible mechanism for analogous robot evolution by describing a carefully designed series of networks, each one being a strict augmentation of the previous one, which control a six-legged walking machine capable of walking over rough terrain and following a person passively sensed in the infrared spectrum. As the completely decentralized networks are augmented, the robot's performance and behavior repertoire demonstrably improve. The rationale for such demonstrations is that they may provide a hint as to the requirements for automatically building massive networks to carry out complex sensory-motor tasks. The experiments with an actual robot ensure that an essence of reality is maintained and that no critical problems have been ignored.

AITR-1089

Author[s]: Allen C. Ward

A Theory of Quantitative Inference Applied to a Mechanical Design Compiler

January 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1089.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1089.pdf

This thesis presents the ideas underlying a computer program that takes as input a schematic of a mechanical or hydraulic power transmission system, plus specifications and a utility function, and returns catalog numbers from predefined catalogs for the optimal selection of components implementing the design. Unlike programs for designing single components or systems, the program provides the designer with a high level "language" in which to compose new designs. It then performs some of the detailed design process. The process of "compilation" is based on a formalization of quantitative inferences about hierarchically organized sets of artifacts and operating conditions. This allows the design compilation without the exhaustive enumeration of alternatives.

AITR-1086

Author[s]: Terence D. Sanger

Optimal Unsupervised Learning in Feedforward Neural Networks

January 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1086.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1086.pdf

We investigate the properties of feedforward neural networks trained with Hebbian learning algorithms. A new unsupervised algorithm is proposed which produces statistically uncorrelated outputs. The algorithm causes the weights of the network to converge to the eigenvectors of the input correlation with largest eigenvalues. The algorithm is closely related to the technique of Self-supervised Backpropagation, as well as other algorithms for unsupervised learning. Applications of the algorithm to texture processing, image coding, and stereo depth edge detection are given. We show that the algorithm can lead to the development of filters qualitatively similar to those found in primate visual cortex.
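
A Hebbian rule of the kind this abstract describes is often written as the generalized Hebbian algorithm (commonly called Sanger's rule). The sketch below is a plain-Python illustration on synthetic two-dimensional data. The learning rate, data distribution, initial weights, and function names are all assumptions chosen for the example, and the thesis itself may formulate the update differently.

```python
import math
import random


def gha_step(weights, x, lr):
    """Apply one generalized-Hebbian update to `weights` in place.

    weights[i] is the weight vector of output i. Output i learns from the part
    of x not yet reconstructed by outputs 0..i.
    """
    y = [sum(w_j * x_j for w_j, x_j in zip(w, x)) for w in weights]
    recon = [0.0] * len(x)
    for i, w in enumerate(weights):
        # Accumulate the reconstruction using the pre-update weights of output i.
        recon = [r + y[i] * w_j for r, w_j in zip(recon, w)]
        for j in range(len(x)):
            w[j] += lr * y[i] * (x[j] - recon[j])


def sample(rng):
    """Zero-mean 2D sample with most variance along the (1, 1) direction."""
    a = rng.gauss(0.0, 2.0)   # component along (1, 1) / sqrt(2)
    b = rng.gauss(0.0, 0.3)   # component along (1, -1) / sqrt(2)
    s = math.sqrt(0.5)
    return [s * (a + b), s * (a - b)]


def train(n_samples=5000, lr=0.002, seed=0):
    """Run the update over a stream of samples and return the learned weights."""
    rng = random.Random(seed)
    weights = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n_samples):
        gha_step(weights, sample(rng), lr)
    return weights
```

With this data the first weight vector should line up, up to sign, with the (1, 1) direction, which is the leading eigenvector of the input correlation matrix in this toy setup.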

AITR-1085

Author[s]: Philip E. Agre

The Dynamic Structures of Everyday Life

October 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1085.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1085.pdf

Computational theories of action have generally understood the organized nature of human activity through the construction and execution of plans. By consigning the phenomena of contingency and improvisation to peripheral roles, this view has led to impractical technical proposals. As an alternative, I suggest that contingency is a central feature of everyday activity and that improvisation is the central kind of human activity. I also offer a computational model of certain aspects of everyday routine activity based on an account of improvised activity called running arguments and an account of representation for situated agents called deictic representation.

AIM-1084

Author[s]: Allen C. Ward and Warren Seering

The Performance of a Mechanical Design 'Compiler'

January 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1084.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1084.pdf

A mechanical design "compiler" has been developed which, given an appropriate schematic, specifications, and utility function for a mechanical design, returns catalog numbers for an optimal implementation. The compiler has been successfully tested on a variety of mechanical and hydraulic power transmission designs and a few temperature sensing designs. Times required have been at worst proportional to the logarithm of the number of possible combinations of catalog numbers.

AIM-1083

Author[s]: Richard C. Waters

Optimization of Series Expressions: Part II: Overview of the Theory and Implementation

January 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1083.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1083.pdf

The benefits of programming in a functional style are well known. In particular, algorithms that are expressed as compositions of functions operating on series/vectors/streams of data elements are much easier to understand and modify than equivalent algorithms expressed as loops. Unfortunately, many programmers hesitate to use series expressions, because they are typically implemented very inefficiently, the prime source of inefficiency being the creation of intermediate series objects. A restricted class of series expressions, obviously synchronizable series expressions, is defined which can be evaluated very efficiently. At the cost of introducing restrictions which place modest limits on the series expressions which can be written, the restrictions guarantee that the creation of intermediate series objects is never necessary. This makes it possible to automatically convert obviously synchronizable series expressions into highly efficient loops using straightforward algorithms.
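
The contrast drawn in this abstract can be illustrated with a small sketch. The example is written in Python rather than the Common Lisp of the memo, and it does not use the memo's macro package or its syntax. The function names are hypothetical, and the second function is simply the kind of fused loop such a conversion aims for, written by hand.

```python
def sum_odd_squares_series(values):
    """Series style: each stage builds a complete intermediate list."""
    odd = [v for v in values if v % 2 == 1]   # intermediate series 1
    squares = [v * v for v in odd]            # intermediate series 2
    return sum(squares)


def sum_odd_squares_loop(values):
    """Loop style: the same computation with no intermediate series."""
    total = 0
    for v in values:
        if v % 2 == 1:
            total += v * v
    return total
```

Both functions compute the same result; the first reads as a composition of stages, while the second avoids allocating the intermediate lists.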

AIM-1082

Author[s]: Richard C. Waters

Optimization of Series Expressions: Part I: User's Manual for the Series Macro Package

January 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1082.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1082.pdf

The benefits of programming in a functional style are well known. In particular, algorithms that are expressed as compositions of functions operating on series/vectors/streams of data elements are much easier to understand and modify than equivalent algorithms expressed as loops. Unfortunately, many programmers hesitate to use series expressions, because they are typically implemented very inefficiently. A Common Lisp macro package (OSS) has been implemented which supports a restricted class of series expressions, obviously synchronizable series expressions, which can be evaluated very efficiently by automatically converting them into loops. Using this macro package, programmers can obtain the advantages of expressing computations as series expressions without incurring any run-time overhead.

AITR-1081

Author[s]: Benjamin J. Paul

A Systems Approach to the Torque Control of a Permanent Magnet Brushless Motor

August 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1081.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1081.pdf

Many approaches to force control have assumed the ability to command torques accurately. Concurrently, much research has been devoted to developing accurate torque actuation schemes. Often, torque sensors have been utilized to close a feedback loop around output torque. In this paper, the torque control of a brushless motor is investigated through: the design, construction, and utilization of a joint torque sensor for feedback control; and the development and implementation of techniques for phase current based feedforward torque control. It is concluded that simply closing a torque loop is no longer necessarily the best alternative since reasonably accurate current based torque control is achievable.

AITR-1080

Author[s]: Waldemar Horwat

A Concurrent Smalltalk Compiler for the Message-Driven Processor

May 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1080.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1080.pdf

This thesis describes Optimist, an optimizing compiler for the Concurrent Smalltalk language developed by the Concurrent VLSI Architecture Group. Optimist compiles Concurrent Smalltalk to the assembly language of the Message-Driven Processor (MDP). The compiler includes numerous optimization techniques such as dead code elimination, dataflow analysis, constant folding, move elimination, concurrency analysis, duplicate code merging, tail forwarding, use of register variables, as well as various MDP-specific optimizations in the code generator. The MDP presents some unique challenges and opportunities for compilation. Due to the MDP's small memory size, it is critical that the size of the generated code be as small as possible. The MDP is an inherently concurrent processor with efficient mechanisms for sending and receiving messages; the compiler takes advantage of these mechanisms. The MDP's tagged architecture allows very efficient support of object-oriented languages such as Concurrent Smalltalk. The initial goals for the MDP were to have the MDP execute about twenty instructions per method and contain 4096 words of memory. This compiler shows that these goals are too optimistic -- most methods are longer, both in terms of code size and running time. Thus, the memory size of the MDP should be increased.

AITR-1079

Author[s]: Eric W. Aboaf

Task-Level Robot Learning

August 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1079.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1079.pdf

We are investigating how to program robots so that they learn from experience. Our goal is to develop principled methods of learning that can improve a robot's performance of a wide range of dynamic tasks. We have developed task-level learning that successfully improves a robot's performance of two complex tasks, ball-throwing and juggling. With task-level learning, a robot practices a task, monitors its own performance, and uses that experience to adjust its task-level commands. This learning method serves to complement other approaches, such as model calibration, for improving robot performance.

AIM-1078

Author[s]: Davi Geiger and Tomaso Poggio

An Optimal Scale for Edge Detection

September 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1078.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1078.pdf

Many problems in early vision are ill posed. Edge detection is a typical example. This paper applies regularization techniques to the problem of edge detection. We derive an optimal filter for edge detection with a size controlled by the regularization parameter $\lambda$ and compare it to the Gaussian filter. A formula relating the signal-to-noise ratio to the parameter $\lambda$ is derived from regularization analysis for the case of small values of $\lambda$. We also discuss the method of Generalized Cross Validation for obtaining the optimal filter scale. Finally, we use our framework to explain two perceptual phenomena: coarsely quantized images becoming recognizable by either blurring or adding noise.

AITR-1074

Author[s]: Walter Charles Hamscher

Model-Based Troubleshooting of Digital Systems

August 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1074.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1074.pdf

This thesis describes a methodology, a representation, and an implemented program for troubleshooting digital circuit boards at roughly the level of expertise one might expect in a human novice. Existing methods for model-based troubleshooting have not scaled up to deal with complex circuits, in part because traditional circuit models do not explicitly represent aspects of the device that troubleshooters would consider important. For complex devices the model of the target device should be constructed with the goal of troubleshooting explicitly in mind. Given that methodology, the principal contributions of the thesis are ways of representing complex circuits to help make troubleshooting feasible. Temporally coarse behavior descriptions are a particularly powerful simplification. Instantiating this idea for the circuit domain produces a vocabulary for describing digital signals. The vocabulary has a level of temporal detail sufficient to make useful predictions about the response of the circuit while it remains coarse enough to make those predictions computationally tractable. Other contributions are principles for using these representations. Although not embodied in a program, these principles are sufficiently concrete that models can be constructed manually from existing circuit descriptions such as schematics, part specifications, and state diagrams. One such principle is that if there are components with particularly likely failure modes or failure modes in which their behavior is drastically simplified, this knowledge should be incorporated into the model. Further contributions include the solution of technical problems resulting from the use of explicit temporal representations and design descriptions with tangled hierarchies.

AIM-1073

Author[s]: Daphna Weinshall

Seeing 'Ghost' Solutions in Stereo Vision

September 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1073.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1073.pdf

A unique matching is a stated objective of most computational theories of stereo vision. This report describes situations where humans perceive a small number of surfaces carried by non-unique matching of random dot patterns, although a unique solution exists and is observed unambiguously in the perception of isolated features. We find both cases where non-unique matchings compete and suppress each other and cases where they are all perceived as transparent surfaces. The circumstances under which each behavior occurs are discussed and a possible explanation is sketched. It appears that matching reduces many false targets to a few, but may still yield multiple solutions in some cases through a (possibly different) process of surface interpolation.

AITR-1072

Author[s]: Steven D. Eppinger

Modeling Robot Dynamic Performance for Endpoint Force Control

September 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1072.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1072.pdf

This research aims to understand the fundamental dynamic behavior of servo-controlled machinery in response to various types of sensory feedback. As an example of such a system, we study robot force control, a scheme which promises to greatly expand the capabilities of industrial robots by allowing manipulators to interact with uncertain and dynamic tasks. Dynamic models are developed which allow the effects of actuator dynamics, structural flexibility, and workpiece interaction to be explored in the frequency and time domains. The models are used first to explain the causes of robot force control instability, and then to find methods of improving this performance.

AIM-1071

Author[s]: Berthold K.P. Horn

Parallel Networks for Machine Vision

December 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1071.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1071.pdf

The amount of computation required to solve many early vision problems is prodigious, and so it has long been thought that systems that operate in a reasonable amount of time will only become feasible when parallel systems become available. Such systems now exist in digital form, but most are large and expensive. These machines constitute an invaluable test-bed for the development of new algorithms, but they can probably not be scaled down rapidly in both physical size and cost, despite continued advances in semiconductor technology and machine architecture. Simple analog networks can perform interesting computations, as has been known for a long time. We have reached the point where it is feasible to experiment with implementation of these ideas in VLSI form, particularly if we focus on networks composed of locally interconnected passive elements, linear amplifiers, and simple nonlinear components. While there have been excursions into the development of ideas in this area since the very beginnings of work on machine vision, much work remains to be done. Progress will depend on careful attention to matching of the capabilities of simple networks to the needs of early vision. Note that this is not at all intended to be anything like a review of the field, but merely a collection of some ideas that seem to be interesting.

AIM-1070

Author[s]: Brian K. Totty

An Operating Environment for the Jellybean Machine

May 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1070.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1070.pdf

The Jellybean Machine is a scalable MIMD concurrent processor consisting of special purpose RISC processors loosely coupled into a low latency network. I have developed an operating system to provide the supportive environment required to efficiently coordinate the collective power of the distributed processing elements. The system services are developed in detail, and may be of interest to other designers of fine grain, distributed memory processing networks.

AIM-1069

Author[s]: William Dally, Andrew Chien, Stuart Fiske, Waldemar Horwat, John Keen, Peter Nuth, Jerry Larivee and Brian Totty

Message-Driven Processor Architecture: Version 11

August 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1069.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1069.pdf

The Message-Driven Processor is a node of a large-scale multiprocessor being developed by the Concurrent VLSI Architecture Group. It is intended to support fine-grained, message passing, parallel computation. It contains several novel architectural features, such as a low-latency network interface, extensive type-checking hardware, and on-chip memory that can be used as an associative lookup table. This document is a programmer's guide to the MDP. It describes the processor's register architecture, instruction set, and the data types supported by the processor. It also details the MDP's message sending and exception handling facilities.

AIM-1068

Author[s]: Hsien-Che Lee

Estimating the Illuminant Color from the Shading of a Smooth Surface

August 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1068.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1068.pdf

A uniform wall illuminated by a spot light often gives a strong impression of the illuminant color. How can it be possible to know if it is a white wall illuminated by yellow light or a yellow wall illuminated by white light? If the wall is a Lambertian reflector, it would not be possible to tell the difference. However, in the real world, some amount of specular reflection is almost always present. In this memo, it is shown that the computation is possible in most practical cases.

AIM-1066

Author[s]: James J. Little and Alessandro Verri

Analysis of Differential and Matching Methods for Optical Flow

August 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1066.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1066.pdf

Several algorithms for optical flow are studied theoretically and experimentally. Differential and matching methods are examined; these two methods have differing domains of application: differential methods are best when displacements in the image are small (<2 pixels) while matching methods work well for moderate displacements but do not handle sub-pixel motions. Both types of optical flow algorithm can use either local or global constraints, such as spatial smoothness. Local matching and differential techniques and global differential techniques will be examined. Most algorithms for optical flow utilize weak assumptions on the local variation of the flow and on the variation of image brightness. Strengthening these assumptions improves the flow computation. The computational consequence of this is a need for larger spatial and temporal support. Global differential approaches can be extended to local (patchwise) differential methods and local differential methods using higher derivatives. Using larger support is valid when constraints on the local shape of the flow are satisfied. We show that a simple constraint on the local shape of the optical flow, that there is slow spatial variation in the image plane, is often satisfied. We show how local differential methods imply the constraints for related methods using higher derivatives. Experiments show the behavior of these optical flow methods on velocity fields which do not obey the assumptions. Implementation of these methods highlights the importance of numerical differentiation. Numerical approximation of derivatives requires care, in two respects: first, it is important that the temporal and spatial derivatives be matched, because of the significant scale differences in space and time, and, second, the derivative estimates improve with larger support.
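
To make the idea of a differential method concrete, here is a minimal one-dimensional sketch. It assumes brightness constancy and a single constant shift over the whole signal, and solves for that shift by least squares over the spatial and temporal derivatives. This is a generic gradient-based estimate, not necessarily any of the algorithms studied in the memo; the sinusoidal test signal and the function names are assumptions made for the example.

```python
import math


def estimate_shift(frame0, frame1):
    """Least-squares estimate of a constant 1D shift (in pixels) between frames.

    Linearizing frame1[x] = frame0[x - v] gives It = -v * Ix, so
    v = -sum(Ix * It) / sum(Ix * Ix).
    """
    num = 0.0
    den = 0.0
    for x in range(1, len(frame0) - 1):
        ix = (frame0[x + 1] - frame0[x - 1]) / 2.0   # central spatial difference
        it = frame1[x] - frame0[x]                   # temporal difference
        num += ix * it
        den += ix * ix
    return -num / den


def make_frame(shift, n=200):
    """Smooth synthetic signal displaced by `shift` pixels."""
    return [math.sin(0.2 * (x - shift)) for x in range(n)]
```

Calling `estimate_shift(make_frame(0.0), make_frame(0.5))` recovers a sub-pixel shift close to 0.5. For shifts of many pixels the linearization no longer holds and the estimate degrades, in line with the small-displacement regime the abstract describes for differential methods.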

AITR-1065

Author[s]: Margaret Morrison Fleck

Boundaries and Topological Algorithms

September 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1065.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1065.pdf

This thesis develops a model for the topological structure of situations. In this model, the topological structure of space is altered by the presence or absence of boundaries, such as those at the edges of objects. This allows the intuitive meaning of topological concepts such as region connectivity, function continuity, and preservation of topological structure to be modeled using the standard mathematical definitions. The thesis shows that these concepts are important in a wide range of artificial intelligence problems, including low-level vision, high-level vision, natural language semantics, and high-level reasoning.

AIM-1062

Author[s]: Allen C. Ward and Warren Seering

Quantitative Inference in a Mechanical Design Compiler

January 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1062.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1062.pdf

This paper presents the ideas underlying a program that takes as input a schematic of a mechanical or hydraulic power transmission system, plus specifications and a utility function, and returns catalog numbers from predefined catalogs for the optimal selection of components implementing the design. It thus provides the designer with a high level "language" in which to compose new designs, then performs some of the detailed design process for him. The program is based on a formalization of quantitative inferences about hierarchically organized sets of artifacts and operating conditions, which allows design compilation without the exhaustive enumeration of alternatives.

AIM-1061

Author[s]: Shimon Ullman and Amnon Sha'ashua

Structural Saliency: The Detection of Globally Salient Structures Using a Locally Connected Network

July 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1061.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1061.pdf

Certain salient structures in images attract our immediate attention without requiring a systematic scan. We present a method for computing saliency by a simple iterative scheme, using a uniform network of locally connected processing elements. The network uses an optimization approach to produce a "saliency map," a representation of the image emphasizing salient locations. The main properties of the network are: (i) the computations are simple and local, (ii) globally salient structures emerge with a small number of iterations, and (iii) as a by-product of the computations, contours are smoothed and gaps are filled in.

AIM-1060

Author[s]: Shimon Ullman and Ronen Basri

The Alignment of Objects with Smooth Surfaces

July 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1060.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1060.pdf

This paper examines the recognition of rigid objects bounded by smooth surfaces using an alignment approach. The projected image of such an object changes during rotation in a manner that is difficult to predict. A method to approach this problem is suggested, using the 3D surface curvature at the points along the silhouette. The curvature information requires a single number for each point along the object's silhouette, the magnitude of the curvature vector at the point. We have implemented and tested this method on images of complex 3D objects; it was found to give accurate predictions of the objects' appearances for large transformations. A small number of models can be used to predict the new appearance of an object from any viewpoint.

AIM-1059

Author[s]: Randall Davis and Walter C. Hamscher

Model-Based Reasoning: Troubleshooting

July 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1059.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1059.pdf

To determine why something has stopped working, it is useful to know how it was supposed to work in the first place. That simple observation underlies some of the considerable interest generated in recent years on the topic of model-based reasoning, particularly its application to diagnosis and troubleshooting. This paper surveys the current state of the art, reviewing areas that are well understood and exploring areas that present challenging research topics. It views the fundamental paradigm as the interaction of prediction and observation, and explores it by examining three fundamental subproblems: Generating hypotheses by reasoning from a symptom to a collection of components whose misbehavior may plausibly have caused that symptom; testing each hypothesis to see whether it can account for all available observations of device behavior; then discriminating among the ones that survive testing. We analyze each of these independently at the knowledge level, i.e., attempting to understand what reasoning capabilities arise from the different varieties of knowledge available to the program. We find that while a wide range of apparently diverse model-based systems have been built for diagnosis and troubleshooting, they are for the most part variations on the central theme outlined here. Their diversity lies primarily in the varying amounts and kinds of knowledge they bring to bear at each stage of the process; the underlying paradigm is fundamentally the same. Our survey of this familiar territory leads to a second major conclusion of the paper: Diagnostic reasoning from a model is reasonably understood. Given a model of behavior and structure, we know how to use it in a variety of ways to produce a diagnosis. There is, by contrast, a rich supply of open research issues in the modeling process itself. In a sense we know how to do model-based reasoning; we do not know how to model the behavior of complex devices, how to create models, and how to select the "right" model for the task at hand.

AITR-1056

Author[s]: Sundar Narasimhan

Dexterous Robotic Hands: Kinematics and Control

November 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1056.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1056.pdf

This report presents issues relating to the kinematics and control of dexterous robotic hands using the Utah-MIT hand as an illustrative example. The emphasis throughout is on the actual implementation and testing of the theoretical concepts presented. The kinematics of such hands is interesting and complicated owing to the large number of degrees of freedom involved. The implementation of position and force control algorithms on such tendon driven hands has previously suffered from inefficient formulations and a lack of sophisticated computer hardware. Both these problems are addressed in this report. A multiprocessor architecture has been built with high performance microcomputers on which real-time algorithms can be efficiently implemented. A large software library has also been built to facilitate flexible software development on this architecture. The position and force control algorithms described herein have been implemented and tested on this hardware.

AITR-1055

Author[s]: Panayotis S. Skordos

Multistep Methods for Integrating the Solar System

July 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1055.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1055.pdf

High order multistep methods, run at constant stepsize, are very effective for integrating the Newtonian solar system for extended periods of time. I have studied the stability and error growth of these methods when applied to harmonic oscillators and two-body systems like the Sun-Jupiter pair. I have also tried to design better multistep integrators than the traditional Stormer and Cowell methods, and I have found a few interesting ones.
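
For orientation, the lowest-order member of the Stormer family can be sketched in a few lines and run at constant stepsize on a harmonic oscillator. The thesis is concerned with high-order methods, so this is only an illustration of the general form; the function names, the step count, and the use of the exact solution to start the recurrence are assumptions made for the example.

```python
import math


def stormer_integrate(accel, x0, x1, h, steps):
    """Integrate x'' = accel(x) with the second-order Stormer recurrence.

    x0 and x1 are the positions at t = 0 and t = h. Returns the position at
    t = steps * h using x[n+1] = 2*x[n] - x[n-1] + h*h*accel(x[n]).
    """
    prev, cur = x0, x1
    for _ in range(steps - 1):
        prev, cur = cur, 2.0 * cur - prev + h * h * accel(cur)
    return cur


def oscillator_after_one_period(steps=1000):
    """Position of x'' = -x, x(0) = 1, x'(0) = 0 after one period (exactly 1)."""
    h = 2.0 * math.pi / steps
    # The second starting value is taken from the exact solution cos(t).
    return stormer_integrate(lambda x: -x, 1.0, math.cos(h), h, steps)
```

Comparing the returned value with the exact answer of 1 for different step counts gives a quick feel for how the error of a constant-stepsize multistep method depends on the stepsize.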

AITR-1054

Author[s]: William T. Townsend

The Effect of Transmission Design on Force-Controlled Manipulator Performance

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1054.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1054.pdf

Previous research in force control has focused on the choice of appropriate servo implementation without corresponding regard to the choice of mechanical hardware. This report analyzes the effect of mechanical properties such as contact compliance, actuator-to-joint compliance, torque ripple, and highly nonlinear dry friction in the transmission mechanisms of a manipulator. A set of requisites for high performance then guides the development of mechanical-design and servo strategies for improved performance. A single-degree-of-freedom transmission testbed was constructed that confirms the predicted effect of Coulomb friction on robustness; design and construction of a cable-driven, four-degree-of-freedom, "whole-arm" manipulator illustrates the recommended design strategies.

AITR-1053

Author[s]: Ron I. Kuper

Dependency-Directed Localization of Software Bugs

May 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1053.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1053.pdf

Software bugs are violated specifications. Debugging is the process that culminates in repairing a program so that it satisfies its specification. An important part of debugging is localization, whereby the smallest region of the program that manifests the bug is found. The Debugging Assistant (DEBUSSI) localizes bugs by reasoning about logical dependencies. DEBUSSI manipulates the assumptions that underlie a bug manifestation, eventually localizing the bug to one particular assumption. At the same time, DEBUSSI acquires specification information, thereby extending its understanding of the buggy program. The techniques used for debugging fully implemented code are also appropriate for validating partial designs.

AITR-1052

Author[s]: Paul Resnick

Generalizing on Multiple Grounds: Performance Learning in Model-Based Technology

February 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1052.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1052.pdf

This thesis explores ways to augment a model-based diagnostic program with a learning component, so that it speeds up as it solves problems. Several learning components are proposed, each exploiting a different kind of similarity between diagnostic examples. Through analysis and experiments, we explore the effect each learning component has on the performance of a model-based diagnostic program. We also analyze more abstractly the performance effects of Explanation-Based Generalization, a technology that is used in several of the proposed learning components.

AITR-1051

Author[s]: Peng Wu

Test Generation Guided Design for Testability

July 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1051.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1051.pdf

This thesis presents a new approach to building a design for testability (DFT) system. The system takes a digital circuit description, finds the problems in testing it, and suggests circuit modifications to correct those problems. The key contributions of the thesis research are (1) setting design for testability in the context of test generation (TG), (2) using failures during TG to focus on testability problems, and (3) relating circuit modifications directly to the failures. A natural functionality set is used to represent the maximum functionalities that a component can have. The current implementation has only primitive domain knowledge and needs other work as well. However, armed with the knowledge of TG, it has already demonstrated its ability and produced some interesting results on a simple microprocessor.

AIM-1050A

Author[s]: Philip E. Agre and David Chapman

What Are Plans For?

October 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1050A.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1050A.pdf

What plans are like depends on how they're used. We contrast two views of plan use. On the plan-as-program view, plan use is the execution of an effective procedure. On the plan-as-communication view, plan use is like following natural language instructions. We have begun work on computational models of plans-as-communication, building on our previous work on improvised activity and on ideas from sociology.

AIM-1049

Author[s]: Alan Bawden and Jonathan Rees

Syntactic Closures

June 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1049.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1049.pdf

In this paper we describe syntactic closures. Syntactic closures address the scoping problems that arise when writing macros. We discuss some issues raised by introducing syntactic closures into the macro expansion interface, and we compare syntactic closures with other approaches. Included is a complete implementation.

AITR-1048

Author[s]: Reid G. Simmons

Combining Associational and Causal Reasoning to Solve Interpretation and Planning Problems

August 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1048.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1048.pdf

This report describes a paradigm for combining associational and causal reasoning to achieve efficient and robust problem-solving behavior. The Generate, Test and Debug (GTD) paradigm generates initial hypotheses using associational (heuristic) rules. The tester verifies hypotheses, supplying the debugger with causal explanations for bugs found if the test fails. The debugger uses domain-independent causal reasoning techniques to repair hypotheses, analyzing domain models and the causal explanations produced by the tester to determine how to replace faulty assumptions made by the generator. We analyze the strengths and weaknesses of associational and causal reasoning techniques, and present a theory of debugging plans and interpretations. The GTD paradigm has been implemented and tested in the domains of geologic interpretation, the blocks world, and Tower of Hanoi problems.

AITR-1047

Author[s]: Richard James Doyle

Hypothesizing Device Mechanisms: Opening Up the Black Box

June 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1047.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1047.pdf

I describe an approach to forming hypotheses about hidden mechanism configurations within devices given external observations and a vocabulary of primitive mechanisms. An implemented causal modelling system called JACK constructs explanations for why a second piece of toast comes out lighter, why the slide in a tire gauge does not slip back inside when the gauge is removed from the tire, and how in a refrigerator a single substance can serve as a heat sink for the interior and a heat source for the exterior. I report the number of hypotheses admitted for each device example, and provide empirical results which isolate the pruning power due to different constraint sources.

AIM-1046

Author[s]: Steven D. Eppinger and Warren P. Seering

Modeling Robot Flexibility for Endpoint Force Control

May 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1046.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1046.pdf

Dynamic models have been developed in an attempt to match the response of a robot arm. The experimental data show rigid-body and five resonant modes. The frequency response and pole-zero arrays for various models of structural flexibility are compared with the data to evaluate the characteristics of the models, and to provide insight into the nature of the flexibility in the robot. Certain models are better able to depict transmission flexibility while others describe types of structural flexibility.

AITR-1045

Author[s]: Daniel Peter Huttenlocher

Three-Dimensional Recognition of Solid Objects from a Two-Dimensional Image

October 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1045.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1045.pdf

This thesis addresses the problem of recognizing solid objects in the three-dimensional world, using two-dimensional shape information extracted from a single image. Objects can be partly occluded and can occur in cluttered scenes. A model-based approach is taken, where stored models are matched to an image. The matching problem is separated into two stages, which employ different representations of objects. The first stage uses the smallest possible number of local features to find transformations from a model to an image. This minimizes the amount of search required in recognition. The second stage uses the entire edge contour of an object to verify each transformation. This reduces the chance of finding false matches.

AIM-1044

Author[s]: W. Eric L. Grimson and David Huttenlocher

On the Sensitivity of the Hough Transform for Object Recognition

May 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1044.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1044.pdf

A common method for finding an object's pose is the generalized Hough transform, which accumulates evidence for possible coordinate transformations in a parameter space and takes large clusters of similar transformations as evidence of a correct solution. We analyze this approach by deriving theoretical bounds on the set of transformations consistent with each data-model feature pairing, and by deriving bounds on the likelihood of false peaks in the parameter space, as a function of noise, occlusion, and tessellation effects. We argue that blithely applying such methods to complex recognition tasks is a risky proposition, as the probability of false positives can be very high.
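
The voting scheme described above can be sketched for the simplest case, a pure 2D translation. This is an illustrative toy, not the analysis in the memo: the function name, the bin size, and the point sets are assumptions, and real systems vote over richer transformation spaces.

```python
from collections import Counter

def hough_translation(model_pts, image_pts, bin_size=1.0):
    """Vote for the 2D translation that maps model points onto image points."""
    votes = Counter()
    for mx, my in model_pts:
        for ix, iy in image_pts:
            # Each data-model pairing is consistent with one translation;
            # quantize it into an accumulator bin (the tessellation).
            key = (round((ix - mx) / bin_size), round((iy - my) / bin_size))
            votes[key] += 1
    (bx, by), count = votes.most_common(1)[0]
    return (bx * bin_size, by * bin_size), count

if __name__ == "__main__":
    model = [(0, 0), (2, 0), (2, 1), (0, 3)]
    # The model shifted by (5, 4), plus two clutter points.
    image = [(5, 4), (7, 4), (7, 5), (5, 7), (1, 9), (8, 0)]
    print(hough_translation(model, image))
```

Every incorrect pairing also casts a vote somewhere in the accumulator, so as clutter grows, spurious bins can collect enough votes to rival the true peak; this is the false-peak effect the memo bounds.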

AITR-1043

Author[s]: Karl T. Ulrich

Computation and Pre-Parametric Design

September 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1043.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1043.pdf

My work is broadly concerned with the question "How can designs be synthesized computationally?" The project deals primarily with mechanical devices and focuses on pre-parametric design: design at the level of detail of a blackboard sketch rather than at the level of detail of an engineering drawing. I explore the project ideas in the domain of single-input single-output dynamic systems, like pressure gauges, accelerometers, and pneumatic cylinders. The problem solution consists of two steps: 1) generate a schematic description of the device in terms of idealized functional elements, and then 2) from the schematic description generate a physical description.

AIM-1042

Author[s]: Jacob Katzenelson

Computational Structure of the N-body Problem

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1042.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1042.pdf

This work considers the organization and performance of computations on parallel computers of tree algorithms for the N-body problem where the number of particles is on the order of a million. The N-body problem is formulated as a set of recursive equations based on a few elementary functions, which leads to a computational structure in the form of a pyramid-like graph, where each vertex is a process, and each arc a communication link. The pyramid is mapped to three different processor configurations: (1) A pyramid of processors corresponding to the processes pyramid graph; (2) A hypercube of processors, e.g., a connection-machine like architecture; (3) A rather small array, e.g., $2 \times 2 \times 2$, of processors faster than the ones considered in (1) and (2) above. Simulations of this size can be performed on any of the three architectures in reasonable time.

AIM-1041

Author[s]: Boris Katz and Beth Levin

Exploiting Lexical Regularities in Designing Natural Language Systems

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1041.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1041.pdf

This paper presents the lexical component of the START Question Answering system developed at the MIT Artificial Intelligence Laboratory. START is able to interpret correctly a wide range of semantic relationships associated with alternate expressions of the arguments of verbs. The design of the system takes advantage of the results of recent linguistic research into the structure of the lexicon, allowing START to attain a broader range of coverage than many existing systems.

AIM-1040

Author[s]: Andrew A. Berlin and Henry M. Wu

Scheme86: A System for Interpreting Scheme

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1040.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1040.pdf

Scheme86 is a computer system designed to interpret programs written in the Scheme dialect of Lisp. A specialized architecture, coupled with new techniques for optimizing register management in the interpreter, allows Scheme86 to execute interpreted Scheme at a speed comparable to that of compiled Lisp on conventional workstations.

AIM-1039

Author[s]: Gerald Jay Sussman and Jack Wisdom

Numerical Evidence that the Motion of Pluto is Chaotic

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1039.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1039.pdf

The Digital Orrery has been used to perform an integration of the motion of the outer planets for 845 million years. This integration indicates that the long-term motion of the planet Pluto is chaotic. Nearby trajectories diverge exponentially with an e-folding time of only about 20 million years.
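
To make the e-folding time concrete: if the separation between nearby trajectories grows as d(t) = d0 * exp(t / tau), then tau = 20 million years implies growth by a factor of e every 20 million years. The short sketch below evaluates that factor; the function name is illustrative, and the pure exponential model is an idealization of the reported behavior.

```python
import math

def growth_factor(elapsed_years, e_folding_years):
    """Factor by which an initial separation grows under d(t) = d0 * exp(t / tau)."""
    return math.exp(elapsed_years / e_folding_years)

if __name__ == "__main__":
    tau = 20e6  # e-folding time from the abstract, in years
    for t in (20e6, 100e6, 845e6):
        print(f"{t:.3g} years: separation grows by a factor of {growth_factor(t, tau):.3g}")
```

Over the full 845-million-year span the idealized factor exceeds 10^18, which is why long-term prediction of the trajectory is impractical even with very accurate initial conditions.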

AIM-1038

Author[s]: Ellen C. Hildreth and Shimon Ullman

The Computational Study of Vision

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1038.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1038.pdf

The computational approach to the study of vision inquires directly into the sort of information processing needed to extract important information from the changing visual image---information such as the three-dimensional structure and movement of objects in the scene, or the color and texture of object surfaces. An important contribution that computational studies have made is to show how difficult vision is to perform, and how complex are the processes needed to perform visual tasks successfully. This article reviews some computational studies of vision, focusing on edge detection, binocular stereo, motion analysis, intermediate vision, and object recognition.

AIM-1037

Author[s]: Joachim Heel

Dynamical Systems and Motion Vision

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1037.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1037.pdf

In this paper we show how the theory of dynamical systems can be employed to solve problems in motion vision. In particular we develop algorithms for the recovery of dense depth maps and motion parameters using state space observers or filters. Four different dynamical models of the imaging situation are investigated and corresponding filters/observers derived. The most powerful of these algorithms recovers depth and motion of general nature using a brightness change constraint assumption. No feature-matching preprocessor is required.

AITR-1036

Author[s]: Kenneth Alan Pasch

Heuristics for Job-Shop Scheduling

January 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1036.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1036.pdf

Two methods of obtaining approximate solutions to the classic General Job-shop Scheduling Problem are investigated. The first method is iterative. A sampling of the solution space is used to decide which of a collection of space pruning constraints are consistent with "good" schedules. The selected space pruning constraints are then used to reduce the search space and the sampling is repeated. This approach can be used either to verify whether some set of space pruning constraints can prune with discrimination or to generate solutions directly. Schedules can be represented as trajectories through a Cartesian space. Under the objective criterion of Minimum Maximum Lateness, the family of "good" schedules (trajectories) are geometric neighbors (reside within some "tube") in this space. The second method of generating solutions takes advantage of this adjacency by pruning the space from the outside in, thus converging gradually upon this "tube." On average this method significantly outperforms an array of the Priority Dispatch rules when the objective criterion is that of Minimum Maximum Lateness. It also compares favorably with a recent relaxation procedure.

AITR-1035

Author[s]: Daniel S. Weld

Theories of Comparative Analysis

May 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1035.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1035.pdf

Comparative analysis is the problem of predicting how a system will react to perturbations in its parameters, and why. For example, comparative analysis could be asked to explain why the period of an oscillating spring/block system would increase if the mass of the block were larger. This thesis formalizes the task of comparative analysis and presents two solution techniques: differential qualitative (DQ) analysis and exaggeration. Both techniques solve many comparative analysis problems, providing explanations suitable for use by design systems, automated diagnosis, intelligent tutoring systems, and explanation based generalization. This thesis explains the theoretical basis for each technique, describes how they are implemented, and discusses the difference between the two. DQ analysis is sound; it never generates an incorrect answer to a comparative analysis question. Although exaggeration does occasionally produce misleading answers, it solves a larger class of problems than DQ analysis and frequently results in simpler explanations.
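
The spring/block example can be checked numerically for an ideal, frictionless oscillator, whose period is T = 2*pi*sqrt(m/k). The sketch below is only a numeric confirmation of the direction of change; it is not DQ analysis or exaggeration, and the function name and parameter values are assumptions.

```python
import math

def spring_period(mass_kg, stiffness_n_per_m):
    """Period of an ideal spring/block oscillator: T = 2*pi*sqrt(m/k)."""
    return 2.0 * math.pi * math.sqrt(mass_kg / stiffness_n_per_m)

if __name__ == "__main__":
    k = 10.0  # spring constant in N/m, held fixed
    for m in (1.0, 2.0, 4.0):
        print(f"mass {m} kg -> period {spring_period(m, k):.4f} s")
```

A comparative analysis system aims to reach the same conclusion (period increases with mass) qualitatively, with an explanation, and without needing the closed-form solution.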

AIM-1032

Author[s]: Steven J. Gordon and Warren P. Seering

Real-Time Part Position Sensing

May 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1032.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1032.pdf

A light stripe vision system is used to measure the location of polyhedral features of parts from a single frame of video camera output. Issues such as accuracy in locating the line segments of intersection in the image and combining redundant information from multiple measurements and multiple sources are addressed. In 2.5 seconds, a prototype sensor was capable of locating a two inch cube to an accuracy (one standard deviation) of .002 inches (.055 mm) in translation and .1 degrees (.0015 radians) in rotation. When integrated with a manipulator, the system was capable of performing high precision assembly tasks.

AITR-1031

Author[s]: Elisha Sacks

Automatic Qualitative Analysis of Ordinary Differential Equations Using Piecewise Linear Approximations

March 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1031.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1031.pdf

This paper explores automating the qualitative analysis of physical systems. It describes a program, called PLR, that takes parameterized ordinary differential equations as input and produces a qualitative description of the solutions for all initial values. PLR approximates intractable nonlinear systems with piecewise linear ones, analyzes the approximations, and draws conclusions about the original systems. It chooses approximations that are accurate enough to reproduce the essential properties of their nonlinear prototypes, yet simple enough to be analyzed completely and efficiently. It derives additional properties, such as boundedness or periodicity, by theoretical methods. I demonstrate PLR on several common nonlinear systems and on published examples from mechanical engineering.

AITR-1030

Author[s]: Neil C. Singer

Residual Vibration Reduction in Computer Controlled Machines

February 1989

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1030.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1030.pdf

Control of machines that exhibit flexibility becomes important when designers attempt to push the state of the art with faster, lighter machines. Three steps are necessary for the control of a flexible plant. First, a good model of the plant must exist. Second, a good controller must be designed. Third, inputs to the controller must be constructed using knowledge of the system dynamic response. There is a great deal of literature pertaining to modeling and control but little dealing with the shaping of system inputs. Chapter 2 examines two input shaping techniques based on frequency domain analysis. The first involves the use of the first derivative of a gaussian exponential as a driving function template. The second, acausal filtering, involves removal of energy from the driving functions at the resonant frequencies of the system. Chapter 3 presents a linear programming technique for generating vibration-reducing driving functions for systems. Chapter 4 extends the results of the previous chapter by developing a direct solution to the new class of driving functions. A detailed analysis of the new technique is presented from five different perspectives and several extensions are presented. Chapter 5 verifies the theories of the previous two chapters with hardware experiments. Because the new technique resembles common signal filtering, Chapter 6 compares the new approach to eleven standard filters. The new technique will be shown to result in less residual vibration, have better robustness to system parameter uncertainty, and require less computation than other currently used shaping techniques.

AITR-1029

Author[s]: Stephen L. Chiu

Generating Compliant Motion of Objects with an Articulated Hand

June 1985

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1029.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1029.pdf

The flexibility of the robot is the key to its success as a viable aid to production. Flexibility of a robot can be expanded in two directions. The first is to increase the physical generality of the robot such that it can be easily reconfigured to handle a wide variety of tasks. The second direction is to increase the ability of the robot to interact with its environment such that tasks can still be successfully completed in the presence of uncertainties. Articulated hands are capable of adapting to a wide variety of grasp shapes, hence reducing the need for special tooling. The availability of low mass, high bandwidth points close to the manipulated object also offers significant improvements in the control of fine motions. This thesis provides a framework for using articulated hands to perform local manipulation of objects. In particular, it addresses the issues in effecting compliant motions of objects in Cartesian space. The Stanford/JPL hand is used as an example to illustrate a number of concepts. The examples provide a unified methodology for controlling articulated hands grasping with point contacts. We also present a high-level hand programming system based on the methodologies developed in this thesis. Compliant motion of grasped objects and dexterous manipulations can be easily described in the LISP-based hand programming language.

AIM-1028

Author[s]: Eric Saund

Symbolic Construction of a 2D Scale-Space Image

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1028.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1028.pdf

The shapes of naturally occurring objects characteristically involve spatial events occurring at many scales. This paper offers a symbolic approach to constructing a primitive shape description across scales for 2D binary (silhouette) shape images: grouping operations are performed over collections of tokens residing on a Scale-Space Blackboard. Two types of grouping operations are identified that, respectively: (1) aggregate edge primitives at one scale into edge primitives at a coarser scale and (2) group edge primitives into partial-region assertions, including curved-contours, primitive-corners, and bars. This approach avoids several drawbacks of numerical smoothing methods.

AIM-1027

Author[s]: Neil Singer and Warren Seering

Preshaping Command Inputs to Reduce System Vibration

January 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1027.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1027.pdf

A method is presented for generating shaped command inputs which significantly reduce or eliminate endpoint vibration. Desired system inputs are altered so that the system completes the requested move without residual vibration. A short move time penalty is incurred (on the order of one period of the first mode of vibration). The preshaping technique is robust under system parameter uncertainty and may be applied to both open and closed loop systems. The Draper Laboratory's Space Shuttle Remote Manipulator System simulator (DRS) is used to evaluate the method. Results show a factor of 25 reduction in endpoint residual vibration for typical moves of the DRS.
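
A minimal sketch of command preshaping, assuming a single vibration mode with known natural frequency and damping ratio. It builds a commonly used three-impulse sequence spanning one damped period (often called a ZVD shaper) and evaluates the residual vibration it leaves; whether this matches the memo's exact formulation is an assumption, and all names and numbers are illustrative.

```python
import math

def zvd_shaper(freq_hz, zeta):
    """Return (times, amplitudes) of a three-impulse shaper for one mode."""
    omega = 2.0 * math.pi * freq_hz
    omega_d = omega * math.sqrt(1.0 - zeta * zeta)  # damped frequency
    period = 2.0 * math.pi / omega_d                 # damped period
    k = math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta * zeta))
    norm = (1.0 + k) ** 2
    times = [0.0, 0.5 * period, period]
    amps = [1.0 / norm, 2.0 * k / norm, k * k / norm]  # amplitudes sum to 1
    return times, amps

def residual_vibration(times, amps, freq_hz, zeta):
    """Residual amplitude left by the impulse sequence, relative to a unit impulse."""
    omega = 2.0 * math.pi * freq_hz
    omega_d = omega * math.sqrt(1.0 - zeta * zeta)
    c = sum(a * math.exp(zeta * omega * t) * math.cos(omega_d * t)
            for t, a in zip(times, amps))
    s = sum(a * math.exp(zeta * omega * t) * math.sin(omega_d * t)
            for t, a in zip(times, amps))
    return math.exp(-zeta * omega * times[-1]) * math.hypot(c, s)

if __name__ == "__main__":
    times, amps = zvd_shaper(freq_hz=1.0, zeta=0.05)
    print("impulse times:", times)
    print("impulse amplitudes:", amps)
    # Evaluate at the modeled frequency and with a 10% modeling error.
    for f in (1.0, 1.1):
        print(f, "Hz residual:", residual_vibration(times, amps, f, 0.05))
```

Convolving the desired command with these impulses delays the end of the move by the shaper length, here one damped period of the modeled mode, which is consistent with the move time penalty the abstract mentions.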

AIM-1026A

Author[s]: Gary L. Drescher

Demystifying Quantum Mechanics: A Simple Universe with Quantum Uncertainty

December 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1026a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1026a.pdf

An artificial universe is defined that has entirely deterministic laws with exclusively local interactions, and that exhibits the fundamental quantum uncertainty phenomenon: superposed states mutually interfere, but only to the extent that no observation distinguishes among them. Showing how such a universe could be elucidates interpretational issues of actual quantum mechanics. The artificial universe is a much-simplified version of Everett's real-world model (the so-called multiple-worlds formulation). In the artificial world, as in Everett's model, the tradeoff between interference and observation is deducible from the universe formalism. Artificial world examples analogous to the quantum double-slit experiment and the EPR experiment are presented.

AIM-1025

Author[s]: Jonathan H. Connell

A Behavior-Based Arm Controller

June 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1025.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1025.pdf

In this paper we describe a working, implemented controller for a real, physical mobile robot arm. The controller is composed of a collection of 15 independent behaviors which run, in real time, on a set of 8 loosely coupled on-board 8-bit microprocessors. We describe how these behaviors cooperate to actually seek out and retrieve objects using local sensory data. We also discuss the methodology used to decompose this collection task and the types of spatial representation and reasoning used by the system.

AIM-1024

Author[s]: Christopher G. Atkeson, Eric W. Aboaf, Joseph McIntyre and David J. Reinkensmeyer

Model-Based Robot Learning

April 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1024.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1024.pdf

Models play an important role in learning from practice. Models of a controlled system can be used as learning operators to refine commands on the basis of performance errors. The examples used to demonstrate this include positioning a limb at a visual target and following a defined trajectory. Better models lead to faster correction of command errors, requiring less practice to attain a given level of performance. The benefits of accurate modeling are improved performance in all aspects of control, while the risks of inadequate modeling are poor learning performance, or even degradation of performance with practice.

AITR-1023

Author[s]: David W. Jacobs

The Use of Grouping in Visual Object Recognition.

January 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1023.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1023.pdf

The report describes a recognition system called GROPER, which performs grouping by using distance and relative orientation constraints that estimate the likelihood of different edges in an image coming from the same object. The thesis presents both a theoretical analysis of the grouping problem and a practical implementation of a grouping system. GROPER also uses an indexing module to allow it to make use of knowledge of different objects, any of which might appear in an image. We test GROPER by comparing it to a similar recognition system that does not use grouping.

AITR-1022

Author[s]: Pyung H. Chang

Analysis and Control of Robot Manipulators with Kinematic Redundancy

May 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1022.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1022.pdf

A closed-form solution formula for the kinematic control of manipulators with redundancy is derived, using the Lagrangian multiplier method. A differential relationship equivalent to the Resolved Motion Method has also been derived. The proposed method is proved to provide the exact equilibrium state for the Resolved Motion Method. This exactness in the proposed method fixes the repeatability problem in the Resolved Motion Method, and establishes a fixed transformation from workspace to the joint space. Also the method, owing to the exactness, is demonstrated to give more accurate trajectories than the Resolved Motion Method. In addition, a new performance measure for redundancy control has been developed. This measure, if used with kinematic control methods, helps achieve dexterous movements including singularity avoidance. Compared to other measures such as the manipulability measure and the condition number, this measure tends to give superior performance in terms of preserving the repeatability property and providing smoother joint velocity trajectories. Using the fixed transformation property, Taylor's Bounded Deviation Paths Algorithm has been extended to redundant manipulators.

AIM-1021

Author[s]: Pyung H. Chang

A Dexterity Measure for the Kinematic Control of Robot Manipulator with Redundancy

February 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1021.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1021.pdf

We have derived a new performance measure, product of minors of the Jacobian matrix, that tells how far kinematically redundant manipulators are from singularity. It was demonstrated that previously used performance measures, namely the condition number and the manipulability measure, allowed the arm to change configurations, causing repeatability problems and discontinuity effects. The new measure, on the other hand, assures that the arm solution remains in the same configuration, thus effectively preventing these problems.
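
The measures named above can be compared on a toy example. The sketch below uses a planar three-link arm (a 2 x 3 position Jacobian) and computes the standard manipulability measure sqrt(det(J J^T)) alongside a product of the 2 x 2 minors. Reading "product of minors" as the product of all maximal minors is an assumption, as are the arm geometry and function names.

```python
import math
from itertools import combinations

def planar_jacobian(lengths, angles):
    """2 x n position Jacobian of a planar serial arm with revolute joints."""
    cum, total = [], 0.0
    for q in angles:
        total += q
        cum.append(total)  # absolute orientation of each link
    n = len(lengths)
    row_x = [-sum(lengths[i] * math.sin(cum[i]) for i in range(j, n)) for j in range(n)]
    row_y = [sum(lengths[i] * math.cos(cum[i]) for i in range(j, n)) for j in range(n)]
    return [row_x, row_y]

def minors_2x2(jac):
    """All 2 x 2 minors of a 2 x n matrix, one per pair of columns."""
    n = len(jac[0])
    return [jac[0][a] * jac[1][b] - jac[0][b] * jac[1][a]
            for a, b in combinations(range(n), 2)]

def manipulability(jac):
    """sqrt(det(J J^T)); by Cauchy-Binet this is the root sum of squared minors."""
    return math.sqrt(sum(m * m for m in minors_2x2(jac)))

def product_of_minors(jac):
    """Product of all maximal minors (an assumed reading of the memo's measure)."""
    return math.prod(minors_2x2(jac))

if __name__ == "__main__":
    for angles in ([0.3, 0.8, -0.5], [0.0, 0.0, 0.0]):
        jac = planar_jacobian([1.0, 1.0, 1.0], angles)
        print(angles, manipulability(jac), product_of_minors(jac))
```

The manipulability measure vanishes only when every minor is zero, while the product vanishes as soon as any single minor does, so keeping the product nonzero constrains the arm more tightly. That is one plausible way to see how such a measure could hold the arm in a single configuration.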

AIM-1020

Author[s]: Richard C. Waters

System Validation via Constraint Modeling

February 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1020.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1020.pdf

Constraint modeling could be a very important system validation method, because its abilities are complementary to both testing and code inspection. In particular, even though the ability of constraint modeling to find errors is limited by the simplifications which are introduced when making a constraint model, constraint modeling can locate important classes of errors which are caused by non-local faults (i.e., are hard to find with code inspection) and manifest themselves as failures only in unusual situations (i.e., are hard to find with testing).

AIM-1019

Author[s]: W. Eric L. Grimson

The Combinatorics of Object Recognition in Cluttered Environments Using Constrained Search

February 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1019.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1019.pdf

When clustering techniques such as the Hough transform are used to isolate likely subspaces of the search space, empirical performance in cluttered scenes improves considerably. In this paper we establish formal bounds on the combinatorics of this approach. Under some simple assumptions, we show that the expected complexity of recognizing isolated objects is quadratic in the number of model and sensory fragments, but that the expected complexity of recognizing objects in cluttered environments is exponential in the size of the correct interpretation. We also provide formal bounds on the efficacy of using the Hough transform to preselect likely subspaces, showing that the problem remains exponential, but that in practical terms, the size of the problem is significantly decreased.

AITR-1018

Author[s]: Peter Heinrich Meckl

Control of Vibration in Mechanical Systems Using Shaped Reference Inputs

January 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1018.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1018.pdf

Dynamic systems which undergo rapid motion can excite natural frequencies that lead to residual vibration at the end of motion. This work presents a method to shape force profiles that reduce excitation energy at the natural frequencies in order to reduce residual vibration for fast moves. Such profiles are developed using a ramped sinusoid function and its harmonics, choosing coefficients to reduce spectral energy at the natural frequencies of the system. To improve robustness with respect to parameter uncertainty, spectral energy is reduced for a range of frequencies surrounding the nominal natural frequency. An additional set of versine profiles is also constructed to permit motion at constant speed for velocity-limited systems. These shaped force profiles are incorporated into a simple closed-loop system with position and velocity feedback. The force input is doubly integrated to generate a shaped position reference for the controller to follow. This control scheme is evaluated on the MIT Cartesian Robot. The shaped inputs generate motions with minimum residual vibration when actuator saturation is avoided. Feedback control compensates for the effect of friction. Using only a knowledge of the natural frequencies of the system to shape the force inputs, vibration can also be attenuated in modes which vibrate in directions other than the motion direction. When moving several axes, the use of shaped inputs allows minimum residual vibration even when the natural frequencies are dynamically changing by a limited amount.

AIM-1017

Author[s]: Yishai A. Feldman and Charles Rich

Pattern-Directed Invocation with Changing Equations

May 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1017.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1017.pdf

The interaction of pattern-directed invocation with equality in an automated reasoning system gives rise to a completeness problem. In such systems, a demon needs to be invoked not only when its pattern exactly matches a term in the reasoning data base, but also when it is possible to create a variant that matches. An incremental algorithm has been developed which solves this problem without generating all possible variants of terms in the database. The algorithm is shown to be complete for a class of demons, called transparent demons, in which there is a well-behaved logical relationship between the pattern and the body of the demon.

AIM-1016

Author[s]: Rodney A. Brooks, Jonathan Connell and Peter Ning

Herbert: A Second Generation Mobile Robot

January 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1016.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1016.pdf

In mobile robot research we believe the structure of the platform, its capabilities, the choice of sensors, their capabilities, and the choice of processors, both onboard and offboard, greatly constrain the direction of research activity centered on the platform. We examine the design and tradeoffs in a low cost mobile platform we have built while paying careful attention to issues of sensing, manipulation, onboard processing and debuggability of the total system. The robot, named Herbert, is a completely autonomous mobile robot with an onboard parallel processor and special hardware support for the subsumption architecture [Brooks (1986)], an onboard manipulator and a laser range scanner. All processors are simple low speed 8-bit micro-processors. The robot is capable of real time three dimensional vision, while simultaneously carrying out manipulator and navigation tasks.

AIM-1015

Author[s]: Bonnie J. Dorr

A Lexical Conceptual Approach to Generation for Machine Translation

January 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1015.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1015.pdf

Current approaches to generation for machine translation make use of direct-replacement templates, large grammars, and knowledge-based inferencing techniques. Not only are rules language-specific, but they are too simplistic to handle sentences that exhibit more complex phenomena. Furthermore, these systems are not easily extendable to other languages because the rules that map the internal representation to the surface form are entirely dependent on both the domain of the system and the language being generated. Finally, an adequate interlingual representation has not yet been discovered; thus, knowledge-based inferencing is necessary and syntactic cross-linguistic generalization cannot be exploited. This report introduces a plan for the development of a theoretically based computational scheme of natural language generation for a translation system. The emphasis of the project is the mapping from the lexical conceptual structure of sentences to an underlying or "base" syntactic structure called deep structure. This approach tackles the problems of thematic and structural divergence, i.e., it allows generation of target language sentences that are not thematically or structurally equivalent to their conceptually equivalent source language counterparts. Two other more secondary tasks, construction of a dictionary and mapping from deep structure to surface structure, will also be discussed. The generator operates on a constrained grammatical theory rather than on a set of surface level transformations. If the endeavor succeeds, there will no longer be a need for large, detailed grammars; general knowledge-based inferencing will not be necessary; lexical selection and syntactic realization will be facilitated; and the model will be general enough for extension to other languages.

AIM-1014

Author[s]: Kenneth Haase

Soft Objects: A Paradigm for Object Oriented Programming

March 1990

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1014.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1014.pdf

This paper introduces soft objects, a new paradigm for object oriented programming. This paradigm replaces the traditional notion of object classes with the specification of transforming procedures which transform simpler objects into more complicated objects. These transforming procedures incrementally construct new objects by adding new state or providing handlers for new messages. Unlike other incremental approaches (e.g. the inherited exist handlers of Object Logo [Drescher, 1987]), transforming procedures are strict functions which always return new objects; rather than conflating objects and object abstractions (classes), soft objects distinctly separates objects and their abstractions. The composition of these transforming procedures replaces the inheritance schemes of class oriented approaches; the order of composition of transforming procedures makes explicit the inheritance indeterminacies introduced by multiple super classes. Issues regarding semantics, efficiency, and security are discussed in the context of several alternative implementation models, and the code of a complete implementation is provided in an appendix.

AIM-1013

Author[s]: Neil C. Singer and Warren P. Seering

Utilizing Dynamic Stability to Orient Parts

February 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1013.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1013.pdf

The intent of this research is to study the dynamic behavior of a solid body resting on a moving surface. Results of the study are then used to propose methods for controlling the orientation of parts in preparation for automatic assembly. Two dynamic models are discussed in detail. The first examines the impacts required to cause reorientation of a part. The second investigates the use of oscillatory motion to selectively reorient parts. This study demonstrates that the dynamic behaviors of solid bodies, under the conditions mentioned above, vary considerably with small changes in geometry or orientation.

AIM-1011

Author[s]: Jintae Lee

Knowledge Base Integration: What Can We Learn from Database Integration Research?

January 1988

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1011.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1011.pdf

This paper examines the issues and the solutions that have been studied in database (DB) integration research and tries to draw lessons from them for knowledge base (KB) integration.

AITR-1009

Author[s]: Alfonso Garcia Reynoso

Structural Dynamics Model of a Cartesian Robot

October 1985

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1009.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1009.pdf

Methods are developed for predicting vibration response characteristics of systems which change configuration during operation. A cartesian robot, an example of such a position-dependent system, served as a test case for these methods and was studied in detail. The chosen system model was formulated using the technique of Component Mode Synthesis (CMS). The model assumes that the system is slowly varying, and connects the carriages to each other and to the robot structure at the slowly varying connection points. The modal data required for each component is obtained experimentally in order to get a realistic model. The analysis results in prediction of vibrations that are produced by the inertia forces as well as gravity and friction forces which arise when the robot carriages move with some prescribed motion. Computer simulations and experimental determinations are conducted in order to calculate the vibrations at the robot end-effector. Comparisons are shown to validate the model in two ways: for fixed configuration the mode shapes and natural frequencies are examined, and then for changing configuration the residual vibration at the end of the mode is evaluated. A preliminary study was done on a geometrically nonlinear system which also has position-dependency. The system consisted of a flexible four-bar linkage with elastic input and output shafts. The behavior of the rocker-beam is analyzed for different boundary conditions to show how some limiting cases are obtained. A dimensional analysis leads to an evaluation of the consequences of dynamic similarity on the resulting vibration.

AIM-1008

Author[s]: Jacob Katzenelson and Richard Zippel

Software Structuring Principles for VLSI CAD

December 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1008.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1008.pdf

A frustrating aspect of the frequent changes to large VLSI CAD systems is that so little of the old available programs can be reused. It takes too much time and effort to find the reusable pieces and recast them for the new use. Our thesis is that such systems can be designed for reusability by designing the software as layers of problem oriented languages, which are implemented by suitably extending a "base" language. We illustrate this methodology with respect to VLSI CAD programs and a particular language layer: a language for handling networks. We present two different implementations. The first uses UNIX and Enhanced C. The second approach uses Common Lisp on a Lisp machine.

AIM-1007

Author[s]: Daphna Weinshall

Qualitative Depth and Shape from Stereo, in Agreement with Psychophysical Evidence

December 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1007.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1007.pdf

Obtaining exact depth from binocular disparities is hard if camera calibration is needed. We will show that qualitative depth information can be obtained from stereo disparities with almost no computations and with no prior knowledge (or computation) of camera parameters. We derive two expressions that order all matched points in the images in two distinct depth-consistent ways from image coordinates only. One is a tilt-related order $\lambda$, the other is a depth-related order $\chi$. Using $\lambda$ demonstrates some anomalies and unusual characteristics that have been observed in psychophysical experiments. The same approach is applied to qualitatively estimate changes in the curvature of a contour on the surface of an object, with either the $x$- or the $y$-coordinate fixed.

AIM-1006

Author[s]: Eric W. Aboaf, Christopher G. Atkeson and David J. Reinkensmeyer

Task-Level Robot Learning: Ball Throwing

December 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1006.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1006.pdf

We are investigating how to program robots so that they learn tasks from practice. One method, task-level learning, provides advantages over simply perfecting models of the robot's lower level systems. Task-level learning can compensate for the structural modeling errors of the robot's lower level control systems and can speed up the learning process by reducing the degrees of freedom of the models to be learned. We demonstrate two general learning procedures, fixed-model learning and refined-model learning, on a ball-throwing robot system.

AIM-1005

Author[s]: Charles Rich

Inspection Methods in Programming: Cliches and Plans

December 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1005.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1005.pdf

Inspection methods are a kind of engineering problem solving based on the recognition and use of standard forms or cliches. Examples are given of program analysis, program synthesis and program validation by inspection. A formalism, called the Plan Calculus, is defined and used to represent programming cliches in a convenient, canonical, and programming-language independent fashion.

AIM-1004

Author[s]: Charles Rich and Richard C. Waters

The Programmer's Apprentice Project: A Research Overview

November 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AIM-1004.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-1004.pdf

The goal of the Programmer's Apprentice project is to develop a theory of how expert programmers analyze, synthesize, modify, explain, specify, verify, and document programs. This research goal overlaps both artificial intelligence and software engineering. From the viewpoint of artificial intelligence, we have chosen programming as a domain in which to study fundamental issues of knowledge representation and reasoning. From the viewpoint of software engineering, we seek to automate the programming process by applying techniques from artificial intelligence.

AITR-1001

Author[s]: Aaron F. Bobick

Natural Object Categorization

November 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1001.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1001.pdf

This thesis addresses the problem of categorizing natural objects. To provide a criterion for categorization we propose that the purpose of a categorization is to support the inference of unobserved properties of objects from the observed properties. Because no such set of categories can be constructed in an arbitrary world, we present the Principle of Natural Modes as a claim about the structure of the world. We first define an evaluation function that measures how well a set of categories supports the inference goals of the observer. Entropy measures for property uncertainty and category uncertainty are combined through a free parameter that reflects the goals of the observer. Natural categorizations are shown to be those that are stable with respect to this free parameter. The evaluation function is tested in the domain of leaves and is found to be sensitive to the structure of the natural categories corresponding to the different species. We next develop a categorization paradigm that utilizes the categorization evaluation function in recovering natural categories. A statistical hypothesis generation algorithm is presented that is shown to be an effective categorization procedure. Examples drawn from several natural domains are presented, including data known to be a difficult test case for numerical categorization techniques. We next extend the categorization paradigm such that multiple levels of natural categories are recovered; by means of recursively invoking the categorization procedure both the genera and species are recovered in a population of anaerobic bacteria. Finally, a method is presented for evaluating the utility of features in recovering natural categories. This method also provides a mechanism for determining which features are constrained by the different processes present in a multiple modal world.

AITR-1000

Author[s]: Bonnie Jean Dorr

UNITRAN: A Principle-Based Approach to Machine Translation

December 1987

ftp://publications.ai.mit.edu/ai-publications/1000-1499/AITR-1000.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-1000.pdf

Machine translation has been a particularly difficult problem in the area of Natural Language Processing for over two decades. Early approaches to translation failed since interaction effects of complex phenomena in part made translation appear to be unmanageable. Later approaches to the problem have succeeded (although only bilingually), but are based on many language-specific rules of a context-free nature. This report presents an alternative approach to natural language translation that relies on principle-based descriptions of grammar rather than rule-oriented descriptions. The model that has been constructed is based on abstract principles as developed by Chomsky (1981) and several other researchers working within the “Government and Binding” (GB) framework. Thus, the grammar is viewed as a modular system of principles rather than a large set of ad hoc language-specific rules.

AIM-999

Author[s]: Gerald Roylance

Expressing Mathematical Subroutines Constructively

November 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-999.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-999.pdf

The typical subroutines that compute $\sin(x)$ and $\exp(x)$ bear little resemblance to our mathematical knowledge of these functions: they are composed of concrete arithmetic expressions that include many mysterious numerical constants. Instead of programming these subroutines conventionally, we can express their construction using symbolic ideas such as periodicity and Taylor series. Such an approach has many advantages: the code is closer to the mathematical basis of the function, less vulnerable to errors, and is trivially adaptable to various precisions.
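
To make the contrast concrete, here is a minimal Python sketch of a sine routine built only from the two ideas the abstract mentions, periodicity and the Taylor series. It is an illustration under stated assumptions, not the memo's construction: the name `sin_taylor` and the `terms` parameter are hypothetical, and precision is adjusted simply by changing the number of series terms.

```python
import math


def sin_taylor(x, terms=12):
    """Approximate sin(x) using periodicity plus a truncated Taylor series."""
    # Periodicity: sin(x) = sin(x - 2*pi*k), so reduce x to [-pi, pi].
    x = math.remainder(x, 2.0 * math.pi)
    # Taylor series: x - x^3/3! + x^5/5! - ...
    # Each term is obtained from the previous one by a simple ratio.
    term = x
    total = x
    for k in range(1, terms):
        term *= -x * x / ((2 * k) * (2 * k + 1))
        total += term
    return total
```

Nothing in the routine is a tuned numerical constant; the only knob is `terms`, which trades accuracy for work.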

AIM-998

Author[s]: Bonnie Jean Dorr

UNITRAN: An Interlingual Machine Translation System

December 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-998.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-998.pdf

This report describes the UNITRAN (UNIversal TRANslator) system, an implementation of a principle-based approach to natural language translation. The system is "interlingual", i.e., the model is based on universal principles that hold across all languages; the distinctions among languages are handled by settings of parameters associated with the universal principles. Interaction effects of linguistic principles are handled by the system so that the programmer does not need to specifically spell out the details of rule applications. Only a small set of principles covers all languages; thus, the unmanageable grammar size of alternative approaches is no longer a problem.

AIM-997

Author[s]: Matthew Halfant and Gerald Jay Sussman

Abstraction in Numerical Methods

October 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-997.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-997.pdf

We illustrate how the liberal use of high-order procedural abstractions and infinite streams helps us to express some of the vocabulary and methods of numerical analysis. We develop a software toolbox encapsulating the technique of Richardson extrapolation, and we apply these tools to the problems of numerical integration and differentiation. By separating the idea of Richardson extrapolation from its use in particular circumstances, we indicate how numerical programs can be written that exhibit the structure of the ideas from which they are formed.
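
The separation described above can be sketched in a few lines of Python, used here only as a stand-in and not as the memo's own code, with higher-order functions in place of streams. The helper `richardson` knows nothing about derivatives; it only assumes that the approximation's leading error term scales as h^order. All names and default values are hypothetical.

```python
import math


def central_difference(f, x, h):
    """Second-order accurate estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def richardson(approx, h, order=2, ratio=2.0):
    """One Richardson extrapolation step.

    `approx(h)` is any approximation whose leading error behaves like
    C * h**order; combining two step sizes cancels that leading term.
    """
    coarse = approx(h)
    fine = approx(h / ratio)
    factor = ratio ** order
    return (factor * fine - coarse) / (factor - 1.0)


def derivative(f, x, h=0.1):
    """Apply the generic extrapolation step to the central difference."""
    return richardson(lambda step: central_difference(f, x, step), h)
```

The same `richardson` helper could be applied to a trapezoid-rule integrator by passing a different `approx`, which is the kind of reuse the abstract argues for.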

AIM-996

Author[s]: Rado Jasinschi and Alan Yuille

Non-Rigid Motion and Regge Calculus

November 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-996.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-996.pdf

We study the problem of recovering the structure from motion of figures which are allowed to perform a controlled non-rigid motion. We use Regge Calculus to approximate a general surface by a net of triangles. The non-rigid flexing motion we deal with corresponds to keeping the triangles rigid and allowing bending only at the joins between triangles. We show that depth information can be obtained by using a modified version of the Incremental Rigidity Scheme devised by Ullman (1984). We modify this scheme to allow for flexing motion and call our version the Incremental Semirigidity Scheme.

AITR-995

Author[s]: Feng Zhao

An O(N) Algorithm for Three-Dimensional N-Body Simulations

October 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-995.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-995.pdf

We develop an algorithm that computes the gravitational potentials and forces on N point-masses interacting in three-dimensional space. The algorithm, based on analytical techniques developed by Rokhlin and Greengard, runs in order N time. In contrast to other fast N-body methods such as tree codes, which only approximate the interaction potentials and forces, this method is exact – it computes the potentials and forces to within any prespecified tolerance up to machine precision. We present an implementation of the algorithm for a sequential machine. We numerically verify the algorithm, and compare its speed with that of an O(N^2) direct force computation. We also describe a parallel version of the algorithm that runs on the Connection Machine in order O(log N) time. We compare experimental results with those of the sequential implementation and discuss how to minimize communication overhead on the parallel machine.
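
For reference, the quadratic-time baseline that the abstract compares against can be sketched as follows. This is only the direct pairwise summation, not the O(N) algorithm of the report; the function name, the argument layout, and the default gravitational constant `G=1.0` are assumptions made for illustration.

```python
import math


def direct_potentials(positions, masses, G=1.0):
    """Gravitational potential at each point mass by direct pairwise summation.

    Visits every unordered pair once, so the cost grows as N*(N-1)/2.
    `positions` is a sequence of (x, y, z) tuples; points must be distinct.
    """
    n = len(positions)
    potentials = [0.0] * n
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            potentials[i] -= G * masses[j] / r
            potentials[j] -= G * masses[i] / r
    return potentials
```

A routine like this is the usual correctness check for any faster method: both should agree to within the tolerance requested from the fast algorithm.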

AIM-994

Author[s]: Berthold K.P. Horn

Relative Orientation

September 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-994.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-994.pdf

Before corresponding points in images taken with two cameras can be used to recover distances to objects in a scene, one has to determine the position and orientation of one camera relative to the other. This is the classic photogrammetric problem of relative orientation, central to the interpretation of binocular stereo information. Described here is a particularly simple iterative scheme for recovering relative orientation that, unlike existing methods, does not require a good initial guess for the baseline and the rotation.

AITR-993

Author[s]: Michael B. Kashket

A Government-Binding Based Parser for Warlpiri, a Free-Word Order Language

January 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-993.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-993.pdf

Free-word order languages have long posed significant problems for standard parsing algorithms. This thesis presents an implemented parser, based on Government-Binding (GB) theory, for a particular free-word order language, Warlpiri, an aboriginal language of central Australia. The words in a sentence of a free-word order language may swap about relatively freely with little effect on meaning: the permutations of a sentence mean essentially the same thing. It is assumed that this similarity in meaning is directly reflected in the syntax. The parser presented here properly processes free word order because it assigns the same syntactic structure to the permutations of a single sentence. The parser also handles fixed word order, as well as other phenomena. On the view presented here, there is no such thing as a “configurational” or “non-configurational” language. Rather, there is a spectrum of languages that are more or less ordered. The operation of this parsing system is quite different in character from that of more traditional rule-based parsing systems, e.g., context-free parsers. In this system, parsing is carried out via the construction of two different structures, one encoding precedence information and one encoding hierarchical information. This bipartite representation is the key to handling both free- and fixed-order phenomena. This thesis first presents an overview of the portion of Warlpiri that can be parsed. Following this is a description of the linguistic theory on which the parser is based. The chapter after that describes the representations and algorithms of the parser. In conclusion, the parser is compared to related work. The appendix contains a substantial list of test cases – both grammatical and ungrammatical – that the parser has actually processed.

AITR-992

Author[s]: David L. Brock

Enhancing the Dexterity of a Robot Hand Using Controlled Slip

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-992.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-992.pdf

Humans can effortlessly manipulate objects in their hands, dexterously sliding and twisting them within their grasp. Robots, however, have none of these capabilities; they simply grasp objects rigidly in their end effectors. To investigate this common form of human manipulation, an analysis of controlled slipping of a grasped object within a robot hand was performed. The Salisbury robot hand demonstrated many of these controlled slipping techniques, illustrating many results of this analysis. First, the possible slipping motions were found as a function of the location, orientation, and types of contact between the hand and object. Second, for a given grasp, the contact types were determined as a function of the grasping force and the external forces on the object. Finally, by changing the grasping force, the robot modified the constraints on the object and effected controlled slipping motions.

AIM-989

Author[s]: Alan Yuille and Shimon Ullman

Rigidity and Smoothness of Motion

November 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-989.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-989.pdf

Many theories of structure from motion divide the process into two parts which are solved using different assumptions. Smoothness of the velocity field is often assumed to solve the motion correspondence problem, and then rigidity is used to recover the 3D structure. We prove results showing that, in a statistical sense, smoothness of the velocity field follows from rigidity of the motion.

AITR-988

Author[s]: Kenneth W. Haase, Jr.

TYPICAL: A Knowledge Representation System for Automated Discovery and Inference

August 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-988.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-988.pdf

TYPICAL is a package for describing and making automatic inferences about a broad class of SCHEME predicate functions. These functions, called types following popular usage, delineate classes of primitive SCHEME objects, composite data structures, and abstract descriptions. TYPICAL types are generated by an extensible combinator language from either existing types or primitive terminals. These generated types are located in a lattice of predicate subsumption which captures necessary entailment between types; if satisfaction of one type necessarily entails satisfaction of another, the first type is below the second in the lattice. The inferences made by TYPICAL compute the position of the new definition within the lattice and establish it there. This information is then accessible to both later inferences and other programs (reasoning systems, code analyzers, etc.) which may need the information for their own purposes. TYPICAL was developed as a representation language for the discovery program Cyrano; particular examples are given of TYPICAL’s application in the Cyrano program.

AIM-987

Author[s]: Alan Yuille

Energy Functions for Early Vision and Analog Networks

November 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-987.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-987.pdf

This paper describes attempts to model the modules of early vision in terms of minimizing energy functions, in particular energy functions allowing discontinuities in the solution. It examines the success of using Hopfield-style analog networks for solving such problems. Finally it discusses the limitations of the energy function approach.

AIM-986

Author[s]: Harold Abelson and Gerald Jay Sussman

Lisp: A Language for Stratified Design

August 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-986.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-986.pdf

We exhibit programs that illustrate the power of Lisp as a language for expressing the design and organization of computational systems. The examples are chosen to highlight the importance of abstraction in program design and to draw attention to the use of procedures to express abstractions.

AIM-985

Author[s]: W. Eric L. Grimson

On the Recognition of Parameterized Objects

October 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-985.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-985.pdf

Determining the identity and pose of occluded objects from noisy data is a critical step in interacting intelligently with an unstructured environment. Previous work has shown that local measurements of position and surface orientation may be used in a constrained search process to solve this problem, for the case of rigid objects, either two-dimensional or three-dimensional. This paper considers the more general problem of recognizing and locating objects that can vary in parameterized ways. We consider objects with rotational, translational, or scaling degrees of freedom, and objects that undergo stretching transformations. We show that the constrained search method can be extended to handle the recognition and localization of such generalized classes of object families.

AIM-984

Author[s]: Rodney A. Brooks, Anita M. Flynn and Thomas Marill

Self Calibration of Motion and Stereo Vision for Mobile Robot Navigation

August 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-984.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-984.pdf

We report on experiments with a mobile robot using one vision process (forward motion vision) to calibrate another (stereo vision) without resorting to any external units of measurement. Both are calibrated to a velocity dependent coordinate system which is natural to the task of obstacle avoidance. The foundations of these algorithms, in a world of perfect measurement, are quite elementary. The contribution of this work is to make them noise tolerant while remaining simple computationally. Both the algorithms and the calibration procedure are easy to implement and have shallow computational depth, making them (1) run at reasonable speed on moderate uni-processors, (2) appear practical to run continuously, maintaining an up-to-the-second calibration on a mobile robot, and (3) appear to be good candidates for massively parallel implementations.

AIM-983

Author[s]: W. Eric L. Grimson

On the Recognition of Curved Objects

July 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-983.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-983.pdf

Determining the identity and pose of occluded objects from noisy data is a critical part of a system's intelligent interaction with an unstructured environment. Previous work has shown that local measurements of the position and surface orientation of small patches of an object's surface may be used in a constrained search process to solve this problem for the case of rigid polygonal objects using two-dimensional sensory data, or rigid polyhedral objects using three-dimensional data. This note extends the recognition system to deal with the problem of recognizing and locating curved objects. The extension is done in two dimensions, and applies to the recognition of two-dimensional objects from two-dimensional data, or to the recognition of three-dimensional objects in stable positions from two-dimensional data.

AITR-982

Author[s]: Bruce Randall Donald

Error Detection and Recovery for Robot Motion Planning with Uncertainty

July 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-982.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-982.pdf

Robots must plan and execute tasks in the presence of uncertainty. Uncertainty arises from sensing errors, control errors, and uncertainty in the geometry of the environment. The last, which is called model error, has received little previous attention. We present a framework for computing motion strategies that are guaranteed to succeed in the presence of all three kinds of uncertainty. The motion strategies comprise sensor-based gross motions, compliant motions, and simple pushing motions.

AITR-980

Author[s]: James V. Mahoney

Image Chunking: Defining Spatial Building Blocks for Scene Analysis

August 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-980.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-980.pdf

Rapid judgments about the properties and spatial relations of objects are the crux of visually guided interaction with the world. Vision begins, however, with essentially pointwise representations of the scene, such as arrays of pixels or small edge fragments. For adequate time-performance in recognition, manipulation, navigation, and reasoning, the processes that extract meaningful entities from the pointwise representations must exploit parallelism. This report develops a framework for the fast extraction of scene entities, based on a simple, local model of parallel computation. An image chunk is a subset of an image that can act as a unit in the course of spatial analysis. A parallel preprocessing stage constructs a variety of simple chunks uniformly over the visual array. On the basis of these chunks, subsequent serial processes locate relevant scene components and assemble detailed descriptions of them rapidly. This thesis defines image chunks that facilitate the most potentially time-consuming operations of spatial analysis---boundary tracing, area coloring, and the selection of locations at which to apply detailed analysis. Fast parallel processes for computing these chunks from images, and chunk-based formulations of indexing, tracing, and coloring, are presented. These processes have been simulated and evaluated on the Lisp Machine and the Connection Machine.

AITR-979

Author[s]: David Allen McAllester

ONTIC: A Knowledge Representation System for Mathematics

July 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-979.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-979.pdf

Ontic is an interactive system for developing and verifying mathematics. Ontic’s verification mechanism is capable of automatically finding and applying information from a library containing hundreds of mathematical facts. Starting with only the axioms of Zermelo-Fraenkel set theory, the Ontic system has been used to build a data base of definitions and lemmas leading to a proof of the Stone representation theorem for Boolean lattices. The Ontic system has been used to explore issues in knowledge representation, automated deduction, and the automatic use of large data bases.

AITR-978

Author[s]: Daniel Wayne Weise

Formal Multilevel Hierarchical Verification of Synchronous MOS Circuits

June 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-978.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-978.pdf

I have designed and implemented a system for the multilevel verification of synchronous MOS VLSI circuits. The system, called Silica Pithecus, accepts the schematic of an MOS circuit and a specification of the circuit’s intended digital behavior. Silica Pithecus determines if the circuit meets its specification. If the circuit fails to meet its specification, Silica Pithecus returns to the designer the reason for the failure. Unlike earlier verifiers which modelled primitives (e.g., transistors) as unidirectional digital devices, Silica Pithecus models primitives more realistically. Transistors are modelled as bidirectional devices of varying resistances, and nodes are modelled as capacitors. Silica Pithecus operates hierarchically, interactively, and incrementally. Major contributions of this research include a formal understanding of the relationship between different behavioral descriptions (e.g., signal, boolean, and arithmetic descriptions) of the same device, and a formalization of the relationship between the structure, behavior, and context of a device. Given these formal structures, my methods find sufficient conditions on the inputs of circuits which guarantee the correct operation of the circuit in the desired descriptive domain. These methods are algorithmic and complete. They also handle complex phenomena such as races and charge sharing. Informal notions such as races and hazards are shown to be derivable from the correctness conditions used by my methods.

AIM-977

Author[s]: Sundar Narasimhan, David M. Siegel and John M. Hollerbach

A Standard Architecture for Controlling Robots

July 1988

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-977.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-977.pdf

This paper describes a fully implemented computational architecture that controls the Utah-MIT dextrous hand and other complex robots. Robots like the Utah-MIT hand are characterized by large numbers of actuators and sensors, and require high servo rates. Consequently, powerful and flexible computer architectures are needed to control them. The architecture described in this paper derives its power from the highly efficient real-time environment provided for its control processors, coupled with a development host that enables flexible program development. By mapping the memory of a dedicated group of processors into the address space of a host computer, efficient sharing of system resources between them is possible. The software is characterized by a few simple design concepts but provides the facilities out of which more powerful utilities like a multi-processor pseudoterminal emulator, a transparent and fast file server, and a flexible symbolic debugger could be constructed.

AIM-975

Author[s]: Michael A. Gennert and Shahriar Negahdaripour

Relaxing the Brightness Constancy Assumption in Computing Optical Flow

June 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-975.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-975.pdf

Optical flow is the apparent (or perceived) motion of image brightness patterns arising from relative motion of objects and observer. Estimation of the optical flow requires the application of two kinds of constraint: the flow field smoothness constraint and the brightness constancy constraint. The brightness constancy constraint permits one to match image brightness values across images, but is very restrictive. We propose replacing this constraint with a more general constraint, which permits a linear transformation between image brightness values. The transformation parameters are allowed to vary smoothly so that inexact matching is allowed. We describe the implementation on a highly parallel computer and present sample results.
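As a sketch of the constraint described above, the two forms can be written side by side. The notation here is an assumption for illustration and may differ from the memo's: E(x, y, t) is image brightness, (u, v) is the flow, and M and C are the multiplicative and additive parts of the linear brightness transformation.

```latex
% Brightness constancy: brightness is unchanged along the motion.
E(x + \delta x,\, y + \delta y,\, t + \delta t) = E(x, y, t)
\;\Longrightarrow\;
E_x u + E_y v + E_t = 0 .

% Relaxed form: allow a linear change of brightness,
% with M = 1 + \delta m and C = \delta c.
E(x + \delta x,\, y + \delta y,\, t + \delta t) = M\,E(x, y, t) + C
\;\Longrightarrow\;
E_x u + E_y v + E_t - E\,m_t - c_t = 0 ,
\qquad
m_t = \lim_{\delta t \to 0} \frac{\delta m}{\delta t},\quad
c_t = \lim_{\delta t \to 0} \frac{\delta c}{\delta t}.
```

Setting m_t = c_t = 0 recovers the usual brightness constancy equation, so the relaxed constraint contains the original as a special case.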

AITR-974

Author[s]: Michael D. Riley

Time-Frequency Representations for Speech Signals

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-974.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-974.pdf

This work addresses two related questions. The first question is what joint time-frequency energy representations are most appropriate for auditory signals, in particular, for speech signals in sonorant regions. The quadratic transforms of the signal are examined, a large class that includes, for example, the spectrograms and the Wigner distribution. Quasi-stationarity is not assumed, since this would neglect dynamic regions. A set of desired properties is proposed for the representation: (1) shift-invariance, (2) positivity, (3) superposition, (4) locality, and (5) smoothness. Several relations among these properties are proved: shift-invariance and positivity imply the transform is a superposition of spectrograms; positivity and superposition are equivalent conditions when the transform is real; positivity limits the simultaneous time and frequency resolution (locality) possible for the transform, defining an uncertainty relation for joint time-frequency energy representations; and locality and smoothness trade off by the 2-D generalization of the classical uncertainty relation. The transform that best meets these criteria is derived, which consists of two-dimensionally smoothed Wigner distributions with (possibly oriented) 2-D Gaussian kernels. These transforms are then related to time-frequency filtering, a method for estimating the time-varying ‘transfer function’ of the vocal tract, which is somewhat analogous to cepstral filtering generalized to the time-varying case. Natural speech examples are provided. The second question addressed is how to obtain a rich, symbolic description of the phonetically relevant features in these time-frequency energy surfaces, the so-called schematic spectrogram. Time-frequency ridges, the 2-D analog of spectral peaks, are one feature that is proposed. If non-oriented kernels are used for the energy representation, then the ridge tops can be identified with zero-crossings in the inner product of the gradient vector and the direction of greatest downward curvature. If oriented kernels are used, the method can be generalized to give better orientation selectivity (e.g., at intersecting ridges) at the cost of poorer time-frequency locality. Many speech examples are given showing the performance for some traditionally difficult cases: semi-vowels and glides, nasalized vowels, consonant-vowel transitions, female speech, and imperfect transmission channels.

AIM-973

Author[s]: T. Poggio and C. Koch

Synapses That Compute Motion

June 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-973.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-973.pdf

Biophysics of computation is a new field that attempts to characterize the role in information processing of the several biophysical mechanisms in neurons, synapses, and membranes that have been uncovered in recent years. In this article, we review a synaptic mechanism, based on the interaction between excitation and silent inhibition, that implements a veto-like operation. Synapses of this type may underlie selectivity to direction of motion in the vertebrate retina.

AITR-972

Author[s]: Robert C. Berwick

Principle-Based Parsing

June 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-972.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-972.pdf

During the past few years, there has been much discussion of a shift from rule-based systems to principle-based systems for natural language processing. This paper outlines the major computational advantages of principle-based parsing and its differences from the usual rule-based approach, and surveys several existing principle-based parsing systems used for handling languages as diverse as Warlpiri, English, and Spanish, as well as language translation.

AIM-970

Author[s]: Ed Gamble and Tomaso Poggio

Visual Integration and Detection of Discontinuities: The Key Role of Intensity Edges

October 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-970.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-970.pdf

Integration of several vision modules is likely to be one of the keys to the power and robustness of the human visual system. The problem of integrating early vision cues is also emerging as a central problem in current computer vision research. In this paper we suggest that integration is best performed at the location of discontinuities in early processes, such as discontinuities in image brightness, depth, motion, texture and color. Coupled Markov Random Field models, based on Bayes estimation techniques, can be used to combine vision modalities with their discontinuities. These models generate algorithms that map naturally onto parallel fine-grained architectures such as the Connection Machine. We derive a scheme to integrate intensity edges with stereo depth and motion field information and show results on synthetic and natural images. The use of intensity edges to integrate other visual cues and to help discover discontinuities emerges as a general and powerful principle.

AITR-968

Author[s]: Harry Voorhees

Finding Texture Boundaries in Images

June 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-968.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-968.pdf

Texture provides one cue for identifying the physical cause of an intensity edge, such as occlusion, shadow, surface orientation or reflectance change. Marr, Julesz, and others have proposed that texture is represented by small lines or blobs, called 'textons' by Julesz [1981a], together with their attributes, such as orientation, elongation, and intensity. Psychophysical studies suggest that texture boundaries are perceived where distributions of attributes over neighborhoods of textons differ significantly. However, these studies, which deal with synthetic images, neglect to consider two important questions: How can these textons be extracted from images of natural scenes? And how, exactly, are texture boundaries then found? This thesis proposes answers to these questions by presenting an algorithm for computing blobs from natural images and a statistic for measuring the difference between two sample distributions of blob attributes. As part of the blob detection algorithm, methods for estimating image noise are presented, which are applicable to edge detection as well.

AIM-967

Author[s]: Robert J. Hall, Richard H. Lathrop and Robert S. Kirk

A Multiple Representation Approach to Understanding the Time Behavior of Digital Circuits

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-967.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-967.pdf

We put forth a multiple representation approach to deriving the behavioral model of a digital circuit automatically from its structure and the behavioral simulation models of its components. One representation supports temporal reasoning for composition and amplification, another supports simulation and a third helps to partition the translation problem. A working prototype, FUNSTRUX, is described.

AIM-966

Author[s]: Robert J. Hall

A Fully Abstract Semantics for Event-Based Simulation

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-966.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-966.pdf

This paper shows that, provided circuits contain no zero-delay loops, a tight relationship, full abstraction, exists between a natural event-based operational semantics for circuits and a natural denotational semantics for circuits based on causal functions on value timelines. The paper also discusses what goes wrong if zero-delay loops are allowed, and illustrates the application of this semantic relationship to modeling questions.

AIM-965

Author[s]: Heinrich H. Bulthoff and Hanspeter A. Mallot

Interaction of Different Modules in Depth Perception: Stereo and Shading

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-965.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-965.pdf

A method has been developed to measure the perceived depth of computer generated images of simple solid objects. Computer graphic techniques allow for independent control of different depth cues (stereo, shading, and texture) and thereby enable the investigator to study psychophysically the interaction of modules for depth perception. Accumulation of information from shading and stereo and vetoing of depth from shading by edge information have been found. Cooperativity and other types of interactions are discussed. If intensity edges are missing, as in a smooth-shaded surface, the image intensities themselves could be used for stereo matching. The results are compared with computer vision algorithms for both single modules and their integration for 3D vision.

AIM-964

Author[s]: Eric Sven Ristad

Complexity of Human Language Comprehension

December 1988

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-964.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-964.ps

The goal of this article is to reveal the computational structure of modern principle-and-parameter (Chomskian) linguistic theories: what computational problems do these informal theories pose, and what is the underlying structure of those computations? To do this, I analyze the computational complexity of human language comprehension: what linguistic representation is assigned to a given sound? This problem is factored into smaller, interrelated (but independently statable) problems. For example, in order to understand a given sound, the listener must assign a phonetic form to the sound; determine the morphemes that compose the words in the sound; and calculate the linguistic antecedent of every pronoun in the utterance. I prove that these and other subproblems are all NP-hard, and that language comprehension is itself PSPACE-hard.

AITR-963

Author[s]: Gil J. Ettinger

Hierarchical Object Recognition Using Libraries of Parameterized Model Sub-Parts

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-963.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-963.pdf

This thesis describes the development of a model-based vision system that exploits hierarchies of both object structure and object scale. The focus of the research is to use these hierarchies to achieve robust recognition based on effective organization and indexing schemes for model libraries. The goal of the system is to recognize parameterized instances of non-rigid model objects contained in a large knowledge base despite the presence of noise and occlusion. Robustness is achieved by developing a system that can recognize viewed objects that are scaled or mirror-image instances of the known models or that contain component sub-parts with different relative scaling, rotation, or translation than in the models. The approach taken in this thesis is to develop an object shape representation that incorporates a component sub-part hierarchy, to allow for efficient and correct indexing into an automatically generated model library as well as for relative parameterization among sub-parts, and a scale hierarchy, to allow for a general to specific recognition procedure. After analysis of the issues and inherent tradeoffs in the recognition process, a system is implemented using a representation based on significant contour curvature changes and a recognition engine based on geometric constraints of feature properties. Examples of the system’s performance are given, followed by an analysis of the results. In conclusion, the system’s benefits and limitations are presented.

AIM-962

Author[s]: Thomas R. Kennedy III

Using Program Transformation to Improve Program Translation

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-962.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-962.pdf

Direct, construct by construct translation from one high level language to another often produces convoluted, unnatural, and unreadable results, particularly when the source and target languages support different models of programming. A more readable and natural translation can be obtained by augmenting the translator with a program transformation system.

AIM-959

Author[s]: Richard C. Waters

Obviously Synchronizable Series Expressions: Part II: Overview of the Theory and Implementation

November 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-959.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-959.pdf

The benefits of programming in a functional style are well known. In particular, algorithms that are expressed as compositions of functions operating on series/vectors/streams of data elements are much easier to understand and modify than equivalent algorithms expressed as loops. Unfortunately, many programmers hesitate to use series expressions, because they are typically implemented very inefficiently, the prime source of inefficiency being the creation of intermediate series objects. A restricted class of series expressions, obviously synchronizable series expressions, is defined which can be evaluated very efficiently. At the cost of introducing restrictions which place modest limits on the series expressions which can be written, the restrictions guarantee that the creation of intermediate series objects is never necessary. This makes it possible to automatically convert obviously synchronizable series expressions into highly efficient loops using straightforward algorithms.
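The contrast drawn in this abstract can be illustrated with a small sketch. It uses Python rather than Lisp, the function names are hypothetical, and it is not the OSS package's interface; it only shows a series-style composition that builds intermediate series next to the single loop that such an expression could be converted into.

```python
def sum_squares_of_positives_series(xs):
    # Series style: each stage materializes an intermediate list.
    positives = [x for x in xs if x > 0]
    squares = [x * x for x in positives]
    return sum(squares)


def sum_squares_of_positives_loop(xs):
    # Equivalent single loop: no intermediate series is ever built.
    total = 0
    for x in xs:
        if x > 0:
            total += x * x
    return total


data = [3, -1, 4, -1, 5]
print(sum_squares_of_positives_series(data))
print(sum_squares_of_positives_loop(data))
```

Both functions compute the same value; the second is the kind of loop a programmer would otherwise write by hand to avoid the intermediate lists.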

AIM-959A

Author[s]: Richard C. Waters

Obviously Synchronizable Series Expressions: Part II: Overview of the Theory and Implementation

March 1988

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-959a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-959a.pdf

The benefits of programming in a functional style are well known. In particular, algorithms that are expressed as compositions of functions operating on series/vectors/streams of data elements are much easier to understand and modify than equivalent algorithms expressed as loops. Unfortunately, many programmers hesitate to use series expressions, because they are typically implemented very inefficiently, the prime source of inefficiency being the creation of intermediate series objects. A restricted class of series expressions, obviously synchronizable series expressions, is defined which can be evaluated very efficiently. At the cost of introducing restrictions which place modest limits on the series expressions which can be written, the restrictions guarantee that the creation of intermediate series objects is never necessary. This makes it possible to automatically convert obviously synchronizable series expressions into highly efficient loops using straightforward algorithms.

AIM-958

Author[s]: Richard C. Waters

Obviously Synchronizable Series Expressions: Part I: User's Manual for the OSS Macro Package

October 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-958.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-958.pdf

The benefits of programming in a functional style are well known. In particular, algorithms that are expressed as compositions of functions operating on series/vectors/streams of data elements are much easier to understand and modify than equivalent algorithms expressed as loops. Unfortunately, many programmers hesitate to use series expressions, because they are typically implemented very inefficiently. A Common Lisp macro package (OSS) has been implemented which supports a restricted class of series expressions, obviously synchronizable series expressions, which can be evaluated very efficiently by automatically converting them into loops. Using this macro package, programmers can obtain the advantages of expressing computations as series expressions without incurring any run-time overhead.

AIM-958A

Author[s]: Richard C. Waters

Obviously Synchronizable Series Expressions: Part I: User's Manual for the OSS Macro Package

March 1988

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-958a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-958a.pdf

The benefits of programming in a functional style are well known. In particular, algorithms that are expressed as compositions of functions operating on series/vectors/streams of data elements are much easier to understand and modify than equivalent algorithms expressed as loops. Unfortunately, many programmers hesitate to use series expressions, because they are typically implemented very inefficiently. A Common Lisp macro package (OSS) has been implemented which supports a restricted class of series expressions, obviously synchronizable series expressions, which can be evaluated very efficiently by automatically converting them into loops. Using this macro package, programmers can obtain the advantages of expressing computations as series expressions without incurring any run-time overhead.

AIM-957

Author[s]: John Canny and Bruce Donald

Simplified Voronoi Diagrams

April 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-957.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-957.pdf

The Voronoi diagram has proved to be a useful tool in a variety of contexts in computational geometry. Our interest here is in using the diagram to simplify the planning of collision-free paths for a robot among obstacles, the so-called generalized movers' problem. The Voronoi diagram, as usually defined, is a strong deformation retract of free space so that free space can be continuously deformed onto the diagram. In particular, any path in free space can be continuously deformed onto the diagram. This means that the diagram is complete for path planning, i.e., searching the original space for paths can be reduced to a search on the diagram. Reducing the dimension of the set to be searched usually reduces the time complexity of the search. Secondly, the diagram leads to robust paths, i.e., paths that are maximally clear of obstacles.

AIM-955

Author[s]: Harold Abelson and Gerald Jay Sussman

The Dynamicist's Workbench: I Automatic Preparation of Numerical Experiments

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-955.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-955.pdf

The dynamicist's workbench is a system for automating some of the work of experimental dynamics. We describe a portion of our system that deals with the setting up and execution of numerical simulations. This part of the workbench includes a spectrum of computational tools---numerical methods, symbolic algebra, and semantic constraints. These tools are designed so that combined methods, tailored to particular problems, can be constructed.

AIM-954

Author[s]: Charles Rich and Richard C. Waters

Formalizing Reusable Software Components in the Programmer's Apprentice

February 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-954.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-954.pdf

There has been a long-standing desire in computer science for a way of collecting and using libraries of standard software components. The limited success in actually doing this stems not from any resistance to the idea, nor from any lack of trying, but rather from the difficulty of choosing an appropriate formalism for representing components. For a formalism to be maximally useful, it must satisfy five key desiderata: expressiveness, convenient combinability, semantic soundness, machine manipulability, and programming language independence. The Plan Calculus formalism developed as part of the Programmer's Apprentice project satisfies each of these desiderata quite well. It does this by combining the ideas from flowchart schemas, data abstraction, logical formalisms, and program transformations. The efficacy of the Plan Calculus has been demonstrated in part by a prototype program editor called the Knowledge-based Editor in Emacs. This editor makes it possible for a programmer to construct a program rapidly and reliably by combining components represented as plans.

AIM-953

Author[s]: Henry M. Wu

Scheme 86: An Architecture for Microcoding a Scheme Interpreter

August 1988

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-953.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-953.pdf

I describe the design and implementation plans for a computer that is optimized as a microcoded interpreter for Scheme. The computer executes SCode, a typed-pointer representation. The memory system has low latency as well as high throughput. Multiple execution units in the processor complete complex operations in less than one memory cycle, allowing efficient use of memory bandwidth. The processor provides hardware support for tagged data objects and runtime type checking. I will discuss the motivation for this machine, its architecture, why it can interpret Scheme efficiently, and the computer-aided design tools developed for building this computer.

AIM-952

Author[s]: Guy E. Blelloch and James J. Little

Parallel Solutions to Geometric Problems on the Scan Model of Computation

February 1988

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-952.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-952.pdf

This paper describes several parallel algorithms that solve geometric problems. The algorithms are based on a vector model of computation---the scan-model. The purpose of this paper is both to show how the model can be used and to show a set of interesting algorithms, most of which have been implemented on the Connection Machine, a highly parallel single instruction multiple data (SIMD) computer.
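For readers unfamiliar with scans, the following is a minimal sequential sketch of the primitive and of one common use, packing selected elements into a dense vector. It runs step by step in Python, whereas on a SIMD machine each stage would be applied to all elements at once. The function names are hypothetical and the example is a generic building block, not one of the memo's geometric algorithms.

```python
from itertools import accumulate


def exclusive_plus_scan(values):
    """Exclusive prefix sum: result[i] is the sum of values[0:i]."""
    if not values:
        return []
    return [0] + list(accumulate(values))[:-1]


def pack(items, flags):
    """Keep the items whose flag is true, using a scan to find destinations."""
    dest = exclusive_plus_scan([1 if f else 0 for f in flags])
    kept = sum(1 for f in flags if f)
    out = [None] * kept
    for item, flag, d in zip(items, flags, dest):
        if flag:
            out[d] = item  # each flagged item writes to its own slot
    return out


print(exclusive_plus_scan([3, 1, 4, 1]))
print(pack(list("abcde"), [True, False, True, True, False]))
```

The scan gives every flagged element a distinct output index, so the final writes are independent of one another.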

AIM-951

Author[s]: Daniel S. Weld

Comparative Analysis

November 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-951.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-951.pdf

Comparative analysis is the problem of predicting how a system will react to perturbations in its parameters, and why. For example, comparative analysis could be asked to explain why the period of an oscillating spring/block system would increase if the mass of the block were larger. This paper formalizes the problem of comparative analysis and presents a technique, differential qualitative (DQ) analysis, which solves the task, providing explanations suitable for use by design systems, automated diagnosis, intelligent tutoring systems, and explanation-based generalization. DQ analysis uses inference rules to deduce qualitative information about the relative change of system parameters. Multiple perspectives are used to represent relative change values over intervals of time. Differential analysis has been implemented, tested on a dozen examples, and proven sound. Unfortunately, the technique is incomplete; it always terminates, but does not always return an answer.
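The spring/block example can be checked quantitatively under the assumption of an ideal, frictionless linear spring with constant stiffness k. This closed form is only a sanity check on the expected answer; DQ analysis as described above reasons qualitatively and does not rely on it.

```latex
T = 2\pi \sqrt{\frac{m}{k}},
\qquad
\frac{\partial T}{\partial m} = \frac{\pi}{\sqrt{m k}} > 0
\quad \text{for } m, k > 0 ,
```

so with k held fixed, a larger mass gives a longer period.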

AIM-950

Author[s]: Kenneth Man-Kam Yip

Extracting Qualitative Dynamics from Numerical Experiments

March 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-950.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-950.pdf

The Phase Space is a powerful tool for representing and reasoning about the qualitative behavior of nonlinear dynamical systems. Significant physical phenomena of the dynamical system---periodicity, recurrence, stability and the like---are reflected by outstanding geometric features of the trajectories in the phase space. This paper presents an approach for the automatic reconstruction of the full dynamical behavior from the numerical results by exploiting knowledge of Dynamical Systems Theory and techniques from computational geometry and computer vision. The approach is applied to an important class of dynamical systems, the area-preserving maps, which often arise from the study of Hamiltonian systems.
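As an illustration of the kind of numerical experiment involved, the sketch below iterates the Chirikov standard map, a familiar area-preserving map. The abstract does not name the maps studied, so this choice, the parameter values, and the function names are assumptions made for illustration only.

```python
import math

TWO_PI = 2 * math.pi


def standard_map(theta, p, k):
    """One step of the Chirikov standard map on the torus [0, 2*pi)^2."""
    p_next = (p + k * math.sin(theta)) % TWO_PI
    theta_next = (theta + p_next) % TWO_PI
    return theta_next, p_next


def orbit(theta0, p0, k, steps):
    """Return the initial point followed by `steps` iterates of the map."""
    points = [(theta0, p0)]
    for _ in range(steps):
        theta, p = points[-1]
        points.append(standard_map(theta, p, k))
    return points


def jacobian_determinant(theta, k):
    """Determinant of the map's Jacobian; equal to 1 up to rounding."""
    c = k * math.cos(theta)
    # Jacobian rows: [1 + c, 1] and [c, 1]
    return (1 + c) * 1 - 1 * c


trajectory = orbit(1.0, 0.5, 0.9, 100)
print(len(trajectory), jacobian_determinant(0.3, 0.9))
```

Plotting many such orbits from different initial points gives the phase-space portrait whose geometric features (closed curves, island chains, scattered regions) reflect the qualitative behavior of the system.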

AIM-949

Author[s]: Richard C. Waters

Program Translation via Abstraction and Reimplementation

December 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-949.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-949.pdf

Essentially all program translators (both source-to-source translators and compilers) operate via transliteration and refinement. This approach is fundamentally limited in the quality of the output it can produce. In particular, it tends to be insufficiently sensitive to global features of the source program and too sensitive to irrelevant local details. This paper presents the alternate translation paradigm of abstraction and reimplementation, which is one of the goals of the Programmer's Apprentice project. A translator has been constructed which translates Cobol programs into Hibol (a very high level, business data processing language). A compiler has been designed which generates extremely efficient PDP-11 object code for Pascal programs.

AIM-948

Author[s]: Steven D. Eppinger and Warren P. Seering

Understanding Bandwidth Limitations in Robot Force Control

August 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-948.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-948.pdf

This paper provides an analytical overview of the dynamics involved in force control. Models are developed which demonstrate, for the one-axis explicit force control case, the effects on system closed-loop bandwidth of: a) robot system dynamics that are not usually considered in the controller design; b) drive-train and task nonlinearities; and c) actuator and controller dynamics. The merits and limitations of conventional solutions are weighed, and some new solutions are proposed. Conclusions are drawn which give insights into the relative importance of the effects discussed.

AIM-947

Author[s]: Bonnie J. Dorr

Principle-Based Parsing for Machine Translation

December 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-947.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-947.pdf

Many syntactic parsing strategies for machine translation systems are based entirely on context-free grammars. These parsers require an overwhelming number of rules; thus, translation systems using rule-based parsers either have limited linguistic coverage, or they have poor performance due to formidable grammar size. This report shows how a principle-based parser with a 'co-routine' design improves parsing for translation. The parser consists of a skeletal structure-building mechanism that operates in conjunction with a linguistically based constraint module, passing control back and forth until a set of underspecified skeletal phrase-structures is converted into a fully instantiated parse tree. The modularity of the parsing design accommodates linguistic generalization, reduces the grammar size, allows extension to other languages, and is compatible with studies of human language processing.

AIM-946

Author[s]: Alan Bawden

Reification without Evaluation

June 1988

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-946.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-946.pdf

Constructing self-referential systems, such as Brian Smith's 3-Lisp language, is actually more straightforward than you think. Anyone can build an infinite tower of processors (where each processor implements the processor at the next level below) by employing some common sense and one simple trick. In particular, it is not necessary to re-design quotation, take a stand on the relative merits of evaluation vs. normalization, or treat continuations as meta-level objects. This paper presents a simple programming language interpreter that illustrates how this can be done. By keeping its expression evaluator entirely separate from the mechanisms that implement its infinite tower, this interpreter avoids many troublesome aspects of previous self-referential programming languages. Given these basically straightforward techniques, processor towers might be easily constructed for a wide variety of systems to enable them to manipulate and reason about themselves.

AIM-945

Author[s]: Carl E. Hewitt

Offices are Open Systems

February 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-945.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-945.pdf

This paper takes a prescriptive stance on how to establish the information-processing foundations for taking action and making decisions in office work from an open system perspective. We propose due process as a central activity in organizational information processing.

AITR-942

Author[s]: Christopher Granger Atkeson

Roles of Knowledge in Motor Learning

February 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-942.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-942.pdf

The goal of this thesis is to apply the computational approach to motor learning, i.e., describe the constraints that enable performance improvement with experience and also the constraints that must be satisfied by a motor learning system, describe what is being computed in order to achieve learning, and why it is being computed. The particular tasks used to assess motor learning are loaded and unloaded free arm movement, and the thesis includes work on rigid body load estimation, arm model estimation, optimal filtering for model parameter estimation, and trajectory learning from practice. Learning algorithms have been developed and implemented in the context of robot arm control. The thesis demonstrates some of the roles of knowledge in learning. Powerful generalizations can be made on the basis of knowledge of system structure, as is demonstrated in the load and arm model estimation algorithms. Improving the performance of parameter estimation algorithms used in learning involves knowledge of the measurement noise characteristics, as is shown in the derivation of optimal filters. Using trajectory errors to correct commands requires knowledge of how command errors are transformed into performance errors, i.e., an accurate model of the dynamics of the controlled system, as is demonstrated in the trajectory learning work. The performance demonstrated by the algorithms developed in this thesis should be compared with algorithms that use less knowledge, such as table based schemes to learn arm dynamics, previous single trajectory learning algorithms, and much of traditional adaptive control.

AIM-941

Author[s]: Eric Saund

Dimensionality-Reduction Using Connectionist Networks

January 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-941.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-941.pdf

This paper presents a method for using the self-organizing properties of connectionist networks of simple computing elements to discover a particular type of constraint in multidimensional data. The method performs dimensionality-reduction in a wide class of situations for which an assumption of linearity need not be made about the underlying constraint surface. We present a scheme for representing the values of continuous (scalar) variables in subsets of units. The backpropagation weight updating method for training connectionist networks is extended by the use of auxiliary pressure in order to coax hidden units into the prescribed representation for scalar-valued variables.

AIM-940

Author[s]: Shahriar Negahdaripour

Ambiguities of a Motion Field

January 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-940.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-940.pdf

We study the conditions under which a perspective motion field can have multiple interpretations. Furthermore, we show that in most cases, the ambiguity in the interpretation of a motion field can be resolved by imposing the physical constraint that depth is positive over the image region onto which the surface projects.

AIM-939

Author[s]: Shahriar Negahdaripour and Berthold K.P. Horn

A Direct Method for Locating the Focus of Expansion

January 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-939.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-939.pdf

We address the problem of recovering the motion of a monocular observer relative to a rigid scene. We do not make any assumptions about the shapes of the surfaces in the scene, nor do we use estimates of the optical flow or point correspondences. Instead, we exploit the spatial gradient and the time rate of change of brightness over the whole image and explicitly impose the constraint that the surface of an object in the scene must be in front of the camera for it to be imaged.

AIM-937

Author[s]: Daniel P. Huttenlocher and Shimon Ullman

Recognizing Rigid Objects by Aligning Them with an Image

January 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-937.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-937.pdf

This paper presents an approach to recognition where an object is first aligned with an image using a small number of pairs of model and image features, and then the aligned model is compared directly against the image. To demonstrate the method, we present some examples of recognizing flat rigid objects with arbitrary three-dimensional position, orientation, and scale, from a single two-scale-space segmentation of edge contours. The method is extended to the domain of non-flat objects as well.

AITR-936

Author[s]: Stephen J. Buckley

Planning and Teaching Compliant Motion Strategies

January 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-936.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-936.pdf

This thesis presents a new high level robot programming system. The programming system can be used to construct strategies consisting of compliant motions, in which a moving robot slides along obstacles in its environment. The programming system is referred to as high level because the user is spared many robot-level details, such as the specification of conditional tests, motion termination conditions, and compliance parameters. Instead, the user specifies task-level information, including a geometric model of the robot and its environment. The user may also have to specify some suggested motions. There are two main system components. The first component is an interactive teaching system which accepts motion commands from a user and attempts to build a compliant motion strategy using the specified motions as building blocks. The second component is an autonomous compliant motion planner, which is intended to spare the user from dealing with “simple” problems. The planner simplifies the representation of the environment by decomposing the configuration space of the robot into a finite state space, whose states are vertices, edges, faces, and combinations thereof. States are linked to each other by arcs, which represent reliable compliant motions. Using best first search, states are expanded until a strategy is found from the start state to a goal state. This component represents one of the first implemented compliant motion planners. The programming system has been implemented on a Symbolics 3600 computer, and tested on several examples. One of the resulting compliant motion strategies was successfully executed on an IBM 7565 robot manipulator.
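The planner's search over a finite state space can be sketched as a generic greedy best-first search. This is a minimal illustration, not the thesis's implementation: the state names, the arc table, and the cost estimates below are hypothetical, and the real planner derives its states and arcs from a geometric model.

```python
import heapq

def best_first_search(start, target, neighbors, estimate):
    """Greedy best-first search; returns a list of states or None."""
    frontier = [(estimate(start), 0, start)]
    came_from = {start: None}
    counter = 1  # tie-breaker so the heap never compares states directly
    while frontier:
        _, _, state = heapq.heappop(frontier)
        if state == target:
            path = []
            while state is not None:
                path.append(state)
                state = came_from[state]
            return path[::-1]
        for nxt in neighbors(state):
            if nxt not in came_from:
                came_from[nxt] = state
                heapq.heappush(frontier, (estimate(nxt), counter, nxt))
                counter += 1
    return None

# Hypothetical contact states; each arc stands for one reliable motion.
ARCS = {
    "free": ["face_A", "face_B"],
    "face_A": ["edge_AB"],
    "face_B": ["edge_AB", "vertex_C"],
    "edge_AB": ["vertex_C"],
    "vertex_C": [],
}
# Hypothetical estimate of remaining effort for each state.
ESTIMATE = {"free": 3, "face_A": 2, "face_B": 1, "edge_AB": 1, "vertex_C": 0}

plan = best_first_search("free", "vertex_C",
                         lambda s: ARCS[s], lambda s: ESTIMATE[s])
```

Here `plan` is the sequence of states visited from the start to the target; in a planner of this kind each consecutive pair would correspond to one compliant motion. Greedy best-first search expands whichever state looks closest to the target, so it finds a strategy quickly but does not guarantee the shortest one.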

AIM-933A

Author[s]: Charles Rich and Richard C. Waters

The Programmer's Apprentice: A Program Design Scenario

November 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-933a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-933a.pdf

A scenario is used to illustrate the capabilities of a proposed Design Apprentice, focussing on the area of detailed, low-level design. Given a specification, the Design Apprentice will be able to make many of the design decisions needed to synthesize the required program. The Design Apprentice will also be able to detect various kinds of contradictions and omissions in a specification.

AITR-932

Author[s]: Steven Jeffrey Gordon

Automated Assembly Using Feature Localization

December 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-932.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-932.pdf

Automated assembly of mechanical devices is studied by researching methods of operating assembly equipment in a variable manner; that is, systems which may be configured to perform many different assembly operations are studied. The general parts assembly operation involves the removal of alignment errors within some tolerance and without damaging the parts. Two methods for eliminating alignment errors are discussed: a priori suppression and measurement and removal. Both methods are studied with the more novel measurement and removal technique being studied in greater detail. During the study of this technique, a fast and accurate six degree-of-freedom position sensor based on a light-stripe vision technique was developed. Specifications for the sensor were derived from an assembly-system error analysis. Studies on extracting accurate information from the sensor by optimally reducing redundant information, filtering quantization noise, and careful calibration procedures were performed. Prototype assembly systems for both error elimination techniques were implemented and used to assemble several products. The assembly system based on the a priori suppression technique uses a number of mechanical assembly tools and software systems which extend the capabilities of industrial robots. The need for the tools was determined through an assembly task analysis of several consumer and automotive products. The assembly system based on the measurement and removal technique used the six degree-of-freedom position sensor to measure part misalignments. Robot commands for aligning the parts were automatically calculated based on the sensor data and executed.

AIM-931

Author[s]: Shimon Ullman

An Approach To Object Recognition: Aligning Pictorial Descriptions

December 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-931.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-931.pdf

This paper examines the problem of shape-based object recognition and proposes a new approach, the alignment of pictorial descriptions. The first part of the paper reviews general approaches to visual object recognition and divides these approaches into three broad classes: invariant properties methods, object decomposition methods, and alignment methods. The second part presents the alignment method. In this approach the recognition process is divided into two stages. The first determines the transformation in space that is necessary to bring the viewed object into alignment with possible object-models. The second stage determines the model that best matches the viewed object. The proposed alignment method also uses abstract description, but unlike structural description methods, it uses them pictorially, rather than in symbolic structural descriptions.

AIM-930

Author[s]: J.R. Quinlan

Simplifying Decision Trees

December 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-930.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-930.pdf

Many systems have been developed for constructing decision trees from collections of examples. Although the decision trees generated by these methods are accurate and efficient, they often suffer the disadvantage of excessive complexity that can render them incomprehensible to experts. It is questionable whether opaque structures of this kind can be described as knowledge, no matter how well they function. This paper discusses techniques for simplifying decision trees without compromising their accuracy. Four methods are described, illustrated, and compared on a test-bed of decision trees from a variety of domains.

AIM-928

Author[s]: James J. Little

Parallel Algorithms for Computer Vision on the Connection Machine

November 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-928.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-928.pdf

The Connection Machine is a fine-grained parallel computer having up to 64K processors. It supports both local communication among the processors, which are situated in a two-dimensional mesh, and high-bandwidth communication among processors at arbitrary locations, using a message-passing network. We present solutions to a set of Image Understanding problems for the Connection Machine. These problems were proposed by DARPA to evaluate architectures for Image Understanding systems, and are intended to comprise a representative sample of fundamental procedures to be used in Image Understanding. The solutions on the Connection Machine embody general methods for filtering images, determining connectivity among image elements, determining spatial relations of image elements, and computing graph properties, such as matchings and shortest paths.

AIM-927

Author[s]: Davi Geiger and Alan Yuille

Stereo and Eye Movement

January 1988

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-927.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-927.pdf

We describe a method to solve the stereo correspondence using controlled eye (or camera) movements. These eye movements essentially supply additional image frames which can be used to constrain the stereo matching. Because the eye movements are small, traditional methods of stereo with multiple frames will not work. We develop an alternative approach using a systematic analysis to define a probability distribution for the errors. Our matching strategy then matches the most probable points first, thereby reducing the ambiguity for the remaining matches. We demonstrate this algorithm with several examples.

AITR-925

Author[s]: Guillermo Juan Rozas

A Computational Model for Observation in Quantum Mechanics

March 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-925.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-925.pdf

A computational model of observation in quantum mechanics is presented. The model provides a clean and simple computational paradigm which can be used to illustrate and possibly explain some of the unintuitive and unexpected behavior of some quantum mechanical systems. As examples, the model is used to simulate three seminal quantum mechanical experiments. The results obtained agree with the predictions of quantum mechanics (and physical measurements), yet the model is perfectly deterministic and maintains a notion of locality.

AIM-924

Author[s]: Mario Bertero, Tomaso Poggio and Vincent Torre

Ill-Posed Problems in Early Vision

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-924.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-924.pdf

The first processing stage in computational vision, also called early vision, consists in decoding 2D images in terms of properties of 3D surfaces. Early vision includes problems such as the recovery of motion and optical flow, shape from shading, surface interpolation, and edge detection. These are inverse problems, which are often ill-posed or ill-conditioned. We review here the relevant mathematical results on ill-posed and ill-conditioned problems and introduce the formal aspects of regularization theory in the linear and non-linear case. More general stochastic regularization methods are also introduced. Specific topics in early vision and their regularization are then analyzed rigorously, characterizing existence, uniqueness, and stability of solutions.

AIM-919

Author[s]: Ellen C. Hildreth and Christof Koch

The Analysis of Visual Motion: From Computational Theory to Neuronal Mechanisms

December 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-919.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-919.pdf

This paper reviews a number of aspects of visual motion analysis in biological systems from a computational perspective. We illustrate the kinds of insights that have been gained through computational studies and how these observations can be integrated with experimental studies from psychology and the neurosciences to understand the particular computations used by biological systems to analyze motion. The particular areas of motion analysis that we discuss include early motion detection and measurement, the optical flow computation, motion correspondence, the detection of motion discontinuities, and the recovery of three-dimensional structure from motion.

AITR-918

Author[s]: Guy Blelloch

AFL-1: A Programming Language for Massively Concurrent Computers

November 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-918.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-918.pdf

Computational models are arising in which programs are constructed by specifying large networks of very simple computational devices. Although such models can potentially make use of a massive amount of concurrency, their usefulness as a programming model for the design of complex systems will ultimately be decided by the ease with which such networks can be programmed (constructed). This thesis outlines a language for specifying computational networks. The language (AFL-1) consists of a set of primitives, and a mechanism to group these elements into higher level structures. An implementation of this language runs on the Thinking Machines Corporation Connection Machine. Two significant examples were programmed in the language, an expert system (CIS), and a planning system (AFPLAN). These systems are explained and analyzed in terms of how they compare with similar systems written in conventional languages.

AIM-917

Author[s]: Alessandro Verri and Tomaso Poggio

Motion Field and Optical Flow: Qualitative Properties

December 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-917.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-917.pdf

In this paper we show that the optical flow, a 2D field that can be associated with the variation of the image brightness pattern, and the 2D motion field, the projection on the image plane of the 3D velocity field of a moving scene, are in general different, unless very special conditions are satisfied. The optical flow, therefore, is ill-suited for computing structure from motion and for reconstructing the 3D velocity field, problems that require an accurate estimate of the 2D motion field. We then suggest a different use of the optical flow. We argue that stable qualitative properties of the 2D motion field give useful information about the 3D velocity field and the 3D structure of the scene, and that they can usually be obtained from the optical flow. To support this approach we show how the (smoothed) optical flow and 2D motion field, interpreted as vector fields tangent to flows of planar dynamical systems, may have the same qualitative properties from the point of view of the theory of structural stability of dynamical systems.

AIM-916

Author[s]: Alessandro Verri and Tomaso Poggio

Regularization Theory and Shape Constraints

September 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-916.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-916.pdf

Many problems of early vision are ill-posed; to recover unique stable solutions, regularization techniques can be used. These techniques lead to meaningful results, provided that solutions belong to suitable compact sets. Often some additional constraints on the shape or the behavior of the possible solutions are available. This note discusses which of these constraints can be embedded in the classic theory of regularization and how, in order to improve the quality of the recovered solution. Connections with mathematical programming techniques are also discussed. As a conclusion, regularization of early vision problems may be improved by the use of some constraints on the shape of the solution (such as monotonicity and upper and lower bounds), when available.
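One common way to write such a constrained problem, given here as an illustrative sketch in the standard Tikhonov form rather than the note's own notation, is shown below. The symbols are assumptions: A is the imaging operator, y the data, P a stabilizing operator, λ the regularization parameter, and C the set of admissible solutions defined by the shape constraints.

```latex
\min_{z \in C} \; \|Az - y\|^{2} + \lambda \|Pz\|^{2},
\qquad
C = \{\, z : a \le z(x) \le b,\ z'(x) \ge 0 \,\}
```

With C taken as the whole solution space this reduces to unconstrained Tikhonov regularization. Bounds and monotonicity turn it into a constrained quadratic minimization, which is one natural point of contact with mathematical programming.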

AIM-915

Author[s]: Anya Hurlbert and Tomaso Poggio

Visual Attention in Brains and Computers

September 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-915.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-915.pdf

Existing computer programs designed to perform visual recognition of objects suffer from a basic weakness: the inability to spotlight regions in the image that potentially correspond to objects of interest. The brain’s mechanisms of visual attention, elucidated by psychophysicists and neurophysiologists, may suggest a solution to the computer’s problem of object recognition.

AIM-914

Author[s]: Christof Koch, Tomaso Poggio and Vincent Torre

Computations in the Vertebrate Retina: Gain Enhancement, Differentiation and Motion Discrimination

September 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-914.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-914.pdf

The vertebrate retina, which provides the visual input to the brain and its main interface with the outside world, is a very attractive model system for approaching the question of the information processing role of biological mechanisms of nerve cells. It is as yet impossible to provide a complete circuit diagram of the retina, but it is now possible to identify a few simple computations that the retina performs and to relate them to specific biophysical mechanisms and circuit elements. In this paper we consider three operations carried out by most retinae: amplification, temporal differentiation, and computation of the direction of motion of visual patterns.

AITR-912

Author[s]: Chae Hun An

Trajectory and Force Control of a Direct Drive Arm

September 1986

ftp://publications.ai.mit.edu/ai-publications/500-912/AITR-912.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-912.pdf

Using the MIT Serial Link Direct Drive Arm as the main experimental device, various issues in trajectory and force control of manipulators were studied in this thesis. Since accurate modeling is important for any controller, issues of estimating the dynamic model of a manipulator and its load were addressed first. Practical and effective algorithms were developed from the Newton-Euler equations to estimate the inertial parameters of manipulator rigid-body loads and links. Load estimation was implemented both on a PUMA 600 robot and on the MIT Serial Link Direct Drive Arm. With the link estimation algorithm, the inertial parameters of the direct drive arm were obtained. For both load and link estimation results, the estimated parameters are good models of the actual system for control purposes since torques and forces can be predicted accurately from these estimated parameters. The estimated model of the direct drive arm was then used to evaluate trajectory following performance by feedforward and computed torque control algorithms. The experimental evaluations showed that the dynamic compensation can greatly improve trajectory following accuracy. Various stability issues of force control were studied next. It was determined that there are two types of instability in force control. Dynamic instability, present in all of the previous force control algorithms discussed in this thesis, is caused by the interaction of a manipulator with a stiff environment. Kinematic instability is present only in the hybrid control algorithm of Raibert and Craig, and is caused by the interaction of the inertia matrix with the Jacobian inverse coordinate transformation in the feedback path. Several methods were suggested and demonstrated experimentally to solve these stability problems. The results of the stability analyses were then incorporated in implementing a stable force/position controller on the direct drive arm by the modified resolved acceleration method using both joint torque and wrist force sensor feedback.

AIM-911

Author[s]: David McAllester and Ramin Zabih

Boolean Classes

September 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-911.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-911.pdf

Object-oriented programming languages all involve the notions of class and object. We extend the notion of class so that any Boolean combination of classes is also a class. Boolean classes allow greater precision and conciseness in naming the class of objects governed by a particular method. A class can be viewed as a predicate which is either true or false of any given object. Unlike predicates, however, classes have an inheritance hierarchy which is known at compile time. Boolean classes extend the notion of class, making classes more like predicates, while preserving the compile-time computable inheritance hierarchy.
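The predicate view of classes can be illustrated with a small runtime sketch. All names here (`BoolClass`, `of`, and the example classes) are hypothetical, and the sketch deliberately omits the paper's central feature, an inheritance hierarchy over Boolean combinations computed at compile time; it only shows how combinations of classes can themselves be tested like classes.

```python
class BoolClass:
    """A class-like predicate that is closed under and, or, and not."""

    def __init__(self, name, test):
        self.name = name
        self.test = test

    def __call__(self, obj):
        return self.test(obj)

    def __and__(self, other):
        return BoolClass(f"({self.name} & {other.name})",
                         lambda o: self(o) and other(o))

    def __or__(self, other):
        return BoolClass(f"({self.name} | {other.name})",
                         lambda o: self(o) or other(o))

    def __invert__(self):
        return BoolClass(f"~{self.name}", lambda o: not self(o))


def of(cls):
    """Lift an ordinary Python class into a BoolClass membership test."""
    return BoolClass(cls.__name__, lambda o: isinstance(o, cls))


class Bird: pass
class Flyer: pass
class Penguin(Bird): pass
class Sparrow(Bird, Flyer): pass

# A Boolean class naming exactly the birds that are not flyers.
flightless_bird = of(Bird) & ~of(Flyer)
```

A method could then be attached to `flightless_bird` directly instead of to a hand-written subclass, which is the gain in precision and conciseness the abstract describes. Calling `flightless_bird(Penguin())` evaluates the combined predicate at run time, whereas the paper's proposal keeps such relationships decidable before the program runs.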

AIM-910

Author[s]: Steven D. Eppinger and Warren P. Seering

On Dynamic Models of Robot Force Control

July 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-910.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-910.pdf

For precise robot control, endpoint compliance strategies utilize feedback from a force sensor located near the tool/workpiece interface. Such endpoint force control systems have been observed in the laboratory to be limited to unsatisfactory closed-loop performance. This paper discusses the particular dynamic properties of robot systems which can lead to instability and limit performance. A series of lumped-parameter models is developed in an effort to predict the closed-loop dynamics of a force-controlled single axis arm. The models include some effects of robot structural dynamics, sensor compliance, and workpiece dynamics. The qualitative analysis shows that the robot dynamics contribute to force-controlled instability. Recommendations are made for models to be used in control system design.

AIM-909

Author[s]: Anya Hurlbert and Tomaso Poggio

Learning a Color Algorithm from Examples

June 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-909.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-909.pdf

We show that a color algorithm capable of separating illumination from reflectance in a Mondrian world can be learned from a set of examples. The learned algorithm is equivalent to filtering the image data---in which reflectance and illumination are mixed---through a center-surround receptive field in individual chromatic channels. The operation resembles the "retinex" algorithm recently proposed by Edwin Land. This result is a specific instance of our earlier results that a standard regularization algorithm can be learned from examples. It illustrates that the natural constraints needed to solve a problem in inverse optics can be extracted directly from a sufficient set of input data and the corresponding solutions. The learning procedure has been implemented as a parallel algorithm on the Connection Machine System.

AITR-908

Author[s]: John G. Harris

The Coupled Depth/Slope Approach to Surface Reconstruction

June 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-908.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-908.pdf

Reconstructing a surface from sparse sensory data is a well known problem in computer vision. Early vision modules typically supply sparse depth, orientation and discontinuity information. The surface reconstruction module incorporates these sparse and possibly conflicting measurements of a surface into a consistent, dense depth map. The coupled depth/slope model developed here provides a novel computational solution to the surface reconstruction problem. This method explicitly computes dense slope representations as well as dense depth representations. This marked change from previous surface reconstruction algorithms allows a natural integration of orientation constraints into the surface description, a feature not easily incorporated into earlier algorithms. In addition, the coupled depth/slope model generalizes to allow for varying amounts of smoothness at different locations on the surface. This computational model helps conceptualize the problem and leads to two possible implementations: analog and digital. The model can be implemented as an electrical or biological analog network since the only computations required at each locally connected node are averages, additions and subtractions. A parallel digital algorithm can be derived by using finite difference approximations. The resulting system of coupled equations can be solved iteratively on a mesh-of-processors computer, such as the Connection Machine. Furthermore, concurrent multi-grid methods are designed to speed the convergence of this digital algorithm.

AIM-907

Author[s]: Charles Rich and Richard C. Waters

Toward a Requirements Apprentice: On the Boundary Between Informal and Formal Specifications

July 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-907.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-907.pdf

Requirements acquisition is one of the most important and least well supported parts of the software development process. The Requirements Apprentice (RA) will assist a human analyst in the creation and modification of software requirements. Unlike current requirements analysis tools, which assume a formal description language, the focus of the RA is on the boundary between informal and formal specifications. The RA is intended to support the earliest phases of creating a requirement, in which incompleteness, ambiguity, and contradiction are inevitable features. From an artificial intelligence perspective, the central problem the RA faces is one of knowledge acquisition. It has to develop a coherent internal representation from an initial set of disorganized statements. To do so, the RA will rely on a variety of techniques, including dependency-directed reasoning, hybrid knowledge representation, and the reuse of common forms (clichés). The Requirements Apprentice is being developed in the context of the Programmer’s Apprentice project, whose overall goal is the creation of an intelligent assistant for all aspects of software development.

AITR-906

Author[s]: Robert Joseph Hall

Learning by Failing to Explain

May 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-906.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-906.pdf

Explanation-based Generalization requires that the learner obtain an explanation of why a precedent exemplifies a concept. It is, therefore, useless if the system fails to find this explanation. However, it is not necessary to give up and resort to purely empirical generalization methods. In fact, the system may already know almost everything it needs to explain the precedent. Learning by Failing to Explain is a method which is able to exploit current knowledge to prune complex precedents, isolating the mysterious parts of the precedent. The idea has two parts: the notion of partially analyzing a precedent to get rid of the parts which are already explainable, and the notion of re-analyzing old rules in terms of new ones, so that more general rules are obtained.

AITR-905

Author[s]: Van-Duc Nguyen

The Synthesis of Stable Force-Closure Grasps

July 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-905.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-905.pdf

This thesis addresses the problem of synthesizing grasps that are force-closure and stable. The synthesis of force-closure grasps constructs independent regions of contact for the fingertips, such that the motion of the grasped object is totally constrained. The synthesis of stable grasps constructs virtual springs at the contacts, such that the grasped object is stable, and has a desired stiffness matrix about its stable equilibrium. A grasp on an object is force-closure if and only if we can exert, through the set of contacts, arbitrary forces and moments on the object. So force-closure implies that equilibrium exists, because zero force and moment are spanned. In the reverse direction, we prove that a non-marginal equilibrium grasp is also a force-closure grasp, if it has at least two point contacts with friction in 2D, or two soft-finger contacts or three hard-finger contacts in 3D. Next, we prove that all force-closure grasps can be made stable, by using either active or passive springs at the contacts. The thesis develops a simple relation between the stability and stiffness of the grasp and the spatial configuration of the virtual springs at the contacts. The stiffness of the grasp depends also on whether the points of contact stick, or slide without friction on straight or curved surfaces of the object. The thesis presents fast and simple algorithms for directly constructing stable force-closure grasps based on the shape of the grasped object. The formal framework of force-closure and stable grasps provides a partial explanation of why we stably grasp objects so easily, and of why our fingers are better soft than hard.
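
As a small illustration of the 2D case with two point contacts with friction, the sketch below applies a commonly cited geometric test: the segment joining the two contacts must lie strictly inside both friction cones. The function name, the argument layout, and the unit-square example are assumptions made for illustration and are not taken from the thesis; normals are assumed to be unit vectors pointing into the object.

```python
import math

def two_contact_force_closure(p1, n1, p2, n2, mu):
    """Check a 2D grasp with two frictional point contacts (illustrative sketch).

    p1, p2: contact points (x, y); n1, n2: inward unit normals at the contacts;
    mu: Coulomb friction coefficient. Returns True when the line joining the
    contacts lies strictly inside both friction cones.
    """
    half_angle = math.atan(mu)  # half-angle of each friction cone
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return False  # coincident contacts cannot resist moments
    u = (dx / dist, dy / dist)  # unit vector from contact 1 to contact 2

    def angle_between(a, b):
        dot = max(-1.0, min(1.0, a[0] * b[0] + a[1] * b[1]))
        return math.acos(dot)

    a1 = angle_between(n1, u)                # cone 1 versus direction to contact 2
    a2 = angle_between(n2, (-u[0], -u[1]))   # cone 2 versus direction to contact 1
    return a1 < half_angle and a2 < half_angle

# Opposite faces of a unit square: the joining segment is along both normals.
opposed = two_contact_force_closure((0.0, 0.5), (1.0, 0.0), (1.0, 0.5), (-1.0, 0.0), 0.5)
# Adjacent faces: the joining segment is 45 degrees off each normal.
adjacent = two_contact_force_closure((0.0, 0.5), (1.0, 0.0), (0.5, 1.0), (0.0, -1.0), 0.5)
```

With a friction coefficient of 0.5 the cone half-angle is about 26.6 degrees, so the opposed grasp passes the test and the adjacent-face grasp does not.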

AITR-904

Author[s]: Linda M. Wills

Automated Program Recognition

February 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-904.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-904.pdf

The key to understanding a program is recognizing familiar algorithmic fragments and data structures in it. Automating this recognition process will make it easier to perform many tasks which require program understanding, e.g., maintenance, modification, and debugging. This report describes a recognition system, called the Recognizer, which automatically identifies occurrences of stereotyped computational fragments and data structures in programs. The Recognizer is able to identify these familiar fragments and structures, even though they may be expressed in a wide range of syntactic forms. It does so systematically and efficiently by using a parsing technique. Two important advances have made this possible. The first is a language-independent graphical representation for programs and programming structures which canonicalizes many syntactic features of programs. The second is an efficient graph parsing algorithm.

AIM-903

Author[s]: Richard H. Lathrop, Robert J. Hall, and Robert S. Kirk

Functional Abstraction From Structure in VLSI Simulation Models

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-903.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-903.pdf

High-level functional (or behavioral) simulation models are difficult, time-consuming, and expensive to develop. We report on a method for automatically generating the program code for a high-level functional simulation model. The high-level model is produced directly from the program code for the circuit components’ functional models and a netlist description of their connectivity. A prototype has been implemented in LISP for the SIMMER functional simulator.

AIM-902

Author[s]: Richard H. Lathrop, Teresa A. Webster and Temple F. Smith

ARIADNE: Pattern-Directed Inference and Hierarchical Abstraction in Protein Structure Recognition

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-902.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-902.pdf

There are many situations in which a very detailed low-level description encodes, through a hierarchical organization, a recognizable higher-order pattern. The macromolecular structural conformations of proteins exhibit higher order regularities whose recognition is complicated by many factors. ARIADNE searches for similarities between structural descriptors and hypothesized protein structure at levels more abstract than the primary sequence, based on differential similarity to rule antecedents and the controlled use of tentative higher-order structural hypotheses. Inference is grounded solely in knowledge derivable from the primary sequence, and exploits secondary structure predictions. A novel proposed alignment and functional domain identification of the aminoacyl-tRNA synthetases was found using this system.

AITR-901

Author[s]: Kenneth W. Haase, Jr.

ARLO: Another Representation Language Offer

October 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-901.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-901.pdf

This paper describes ARLO, a representation language loosely modelled after Greiner and Lenat’s RLL-1. ARLO is a structure-based representation language for describing structure-based representation languages, including itself. A given representation language is specified in ARLO by a collection of structures describing how its descriptions are interpreted, defaulted, and verified. This high level description is compiled into Lisp code and ARLO structures whose interpretation fulfills the specified semantics of the representation. In addition, ARLO itself, as a representation language for expressing and compiling partial and complete language specifications, is described and interpreted in the same manner as the language it describes and implements. This self-description can be extended or modified to expand or alter the expressive power of ARLO’s initial configuration. Languages which describe themselves, like ARLO, provide powerful mediums for systems which perform automatic self-modification, optimization, debugging, or documentation. AI systems implemented in such a self-descriptive language can reflect on their own capabilities and limitations, applying general learning and problem solving strategies to enlarge or alleviate them.

AITR-900

Author[s]: David Mark Siegel

Contact Sensors for Dexterous Robotic Hands

June 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-900.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-900.pdf

This thesis examines a tactile sensor and a thermal sensor for use with the Utah-MIT dexterous four fingered hand. Sensory feedback is critical for full utilization of its advanced manipulatory capabilities. The hand itself provides tendon tension and joint angle information. However, planned control algorithms require more information than these sources can provide. The tactile sensor utilizes capacitive transduction with a novel design based entirely on silicone elastomers. It provides an 8 x 8 array of force cells with 1.9 mm center-to-center spacing. A pressure resolution of 8 significant bits is available over a 0 to 200 grams per square mm range. The thermal sensor measures a material’s heat conductivity by radiating heat into an object and measuring the resulting temperature variations. This sensor has a 4 x 4 array of temperature cells with 3.5 mm center-to-center spacing. Experiments show that the thermal sensor can discriminate among materials by detecting differences in their thermal conduction properties. Both sensors meet the stringent mounting requirements posed by the Utah-MIT hand. Combining them to form a sensor with both tactile and thermal capabilities will ultimately be possible. The computational requirements for controlling a sensor equipped dexterous hand are severe. Conventional single processor computers do not provide adequate performance. To overcome these difficulties, a computational architecture based on interconnecting high performance microcomputers and a set of software primitives tailored for sensor driven control has been proposed. The system has been implemented and tested on the Utah-MIT hand. The hand, equipped with tactile and thermal sensors and controlled by its computational architecture, is one of the most advanced robotic manipulatory devices available worldwide. Other ongoing projects will exploit these tools and allow the hand to perform tasks that exceed the capabilities of current generation robots.

AIM-899

Author[s]: Rodney A. Brooks

Achieving Artificial Intelligence through Building Robots

May 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-899.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-899.pdf

We argue that generally accepted methodologies of Artificial Intelligence research are limited in the proportion of human level intelligence they can be expected to emulate. We argue that the currently accepted decompositions and static representations used in such research are wrong. We argue for a shift to a process based model, with a decomposition based on task achieving behaviors as the organizational principle. In particular we advocate building robotic insects.

AIM-898

Author[s]: Kenneth W. Haase, Jr.

Discovery Systems

April 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-898.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-898.pdf

Cyrano is a thoughtful reimplementation of Lenat's controversial Eurisko program, designed to perform automated discovery and concept formation in a variety of technical fields. The 'thought' in the reimplementation has come from several directions: an appeal to basic principles, which led to identifying constraints of modularity and consistency on the design of discovery systems; an appeal to transparency, which led to collapsing more and more of the control structure into the representation; and an appeal to accountability, which led to the explicit specification of dependencies in the concept formation process. The process of reimplementing Lenat's work has already revealed several insights into the nature of Eurisko-like systems in general; these insights are incorporated into the design of Cyrano. Foremost among these new insights is the characterization of Eurisko-like systems (which I call inquisitive systems) as search processes which dynamically reconfigure their search space by the formation of new concepts and representations. This insight reveals requirements for modularity and 'consistency' in the definition of new concepts and representations.

AIM-897

Author[s]: J. Marroquin, S. Mitter and T. Poggio

Probabilistic Solution of Ill-Posed Problems in Computational Vision

March 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-897.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-897.pdf

We formulate several problems in early vision as inverse problems. Among the solution methods we review standard regularization theory, discuss its limitations, and present new stochastic (in particular, Bayesian) techniques based on Markov Random Field models for their solution. We derive efficient algorithms and describe parallel implementations on digital parallel SIMD architectures, as well as a new class of parallel hybrid computers that mix digital with analog components.

AIM-896

Author[s]: Tomas Lozano-Perez

A Simple Motion Planning Algorithm for General Robot Manipulators

June 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-896.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-896.pdf

This paper presents a simple and efficient algorithm, using configuration space, to plan collision-free motions for general manipulators. We describe an implementation of the algorithm for manipulators made up of revolute joints. The configuration-space obstacles for an n degree-of-freedom manipulator are approximated by sets of n-1 dimensional slices, recursively built up from one dimensional slices. This obstacle representation leads to an efficient approximation of the free space outside of the configuration-space obstacles.

AIM-895

Author[s]: Eric Sven Ristad

Defining Natural Language Grammars in GPSG

April 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-895.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-895.pdf

This paper is a formal analysis of whether generalized phrase structure grammar’s (GPSG) weak context-free generative power will allow it to achieve three of its central goals: (1) to characterize all and only the natural language grammars, (2) to algorithmically determine membership and generative power consequences of GPSGs, and (3) to embody the universalism of natural language entirely in the formal system. I prove that “=Σ*?” is undecidable for GPSGs and, on the basis of this result and the unnaturalness of Σ*, I argue that GPSG’s three goals and its weak context-free generative power conflict with each other: there is no algorithmic way of knowing whether any given GPSG generates a natural language or an unnatural one. The paper concludes with a diagnosis of the result and suggests that the problem might be met by abandoning the weak context-free framework and assuming substantive constraints.

AIM-894

Author[s]: Eric Sven Ristad

Computational Complexity of Current GPSG Theory

April 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-894.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-894.pdf

An important goal of computational linguistics has been to use linguistic theory to guide the construction of computationally efficient real-world natural language processing systems. At first glance, the entirely new generalized phrase structure grammar (GPSG) theory of Gazdar, Klein, Pullum, and Sag (1985) appears to be a blessing on two counts. First, their precise formal system and the broad empirical coverage of their published English grammar might be a direct guide for a transparent parser design and implementation. Second, since GPSG has weak context-free generative power and context-free languages can be parsed in O(n^3) by a wide range of algorithms, GPSG parsers would appear to run in polynomial time. This widely-assumed GPSG “efficient parsability” result is misleading: here we prove that the universal recognition problem for the new GPSG theory is exponential-polynomial time hard, and assuredly intractable. The paper pinpoints sources of intractability (e.g. metarules and syntactic features) in the GPSG formal system and concludes with some linguistically and computationally motivated restrictions on GPSG.
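
The cubic bound for context-free parsing mentioned above can be illustrated with a CYK recognizer. The sketch below is a generic textbook algorithm for grammars in Chomsky normal form, not anything from the memo, and the toy lexicon and rules are hypothetical. It says nothing about GPSG itself, whose universal recognition problem the memo shows to be intractable.

```python
def cyk_recognize(words, lexical, binary, start="S"):
    """CYK recognition for a context-free grammar in Chomsky normal form.

    lexical: dict mapping a word to the set of nonterminals A with rule A -> word
    binary:  list of (A, B, C) triples for rules A -> B C
    """
    n = len(words)
    if n == 0:
        return False
    # table[i][j] holds the nonterminals that derive words[i..j] inclusive.
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        table[i][i] = set(lexical.get(word, ()))
    # Three nested loops over span length, start, and split point: O(n^3).
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):
                left, right = table[i][k], table[k + 1][j]
                for a, b, c in binary:
                    if b in left and c in right:
                        table[i][j].add(a)
    return start in table[0][n - 1]

# Hypothetical toy grammar, for illustration only.
LEXICAL = {"she": {"NP"}, "eats": {"V"}, "fish": {"NP"}}
BINARY = [("S", "NP", "VP"), ("VP", "V", "NP")]

accepted = cyk_recognize(["she", "eats", "fish"], LEXICAL, BINARY)
rejected = cyk_recognize(["eats", "she", "fish"], LEXICAL, BINARY)
```

The running time also grows with the number of rules, which is why a polynomial bound in sentence length alone does not settle the cost of parsing when the grammar is part of the input.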

AIM-893

Author[s]: Walter Hamscher and Randall Davis

Issues in Model Based Troubleshooting

March 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-893.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-893.pdf

To determine why something has stopped working, it's helpful to know how it was supposed to work in the first place. This simple fact underlies recent work on a number of systems that do diagnosis from knowledge about the internal structure or behavior of components of the malfunctioning device. Recently much work has been done in this vein in many domains with an apparent diversity of techniques. But the variety of domains and the variety of computational mechanisms used to implement these systems tend to obscure two important facts. First, existing programs have similar mechanisms for generating and testing fault hypotheses. Second, most of these systems have similar built-in assumptions about both the devices being diagnosed and their failure modes; these assumptions in turn limit the generality of the programs. The purpose of this paper is to identify the problems and non-problems in model based troubleshooting. The non-problems are in generating and testing fault hypotheses about misbehaving components in simple static devices; a small core of largely equivalent techniques covers the apparent profusion of existing approaches. The problems occur with devices that aren't static, aren't simple and whose components fail in ways current programs don't hypothesize and hence can't diagnose.

AIM-890

Author[s]: Gary L. Drescher

Genetic AI: Translating Piaget into Lisp

February 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-890.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-890.pdf

This paper presents a constructivist model of human cognitive development during infancy. According to constructivism, the elements of mental representation -- even such basic elements as the concept of physical object -- are constructed afresh by each individual, rather than being innately supplied. Here I propose a (partially specified, not yet implemented) mechanism, the Schema Mechanism; this mechanism is intended to achieve a series of cognitive constructions characteristic of infants' sensorimotor-stage development, primarily as described by Piaget. In reference to Piaget's 'genetic epistemology', I call this approach genetic AI -- 'genetic' not in the sense of genes, but in the sense of genesis: development from the point of origin.

AIM-888

Author[s]: Norberto Grzywacz and Alan Yuille

Massively Parallel Implementations of Theories for Apparent Motion

June 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-888.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-888.pdf

We investigate two ways of solving the correspondence problem for motion using the assumptions of minimal mapping and rigidity. Massively parallel analog networks are designed to implement these theories. Their effectiveness is demonstrated with mathematical proofs and computer simulations. We discuss relevant psychophysical experiments.

AIM-887

Author[s]: Chae H. An, Christopher G. Atkeson and John M. Hollerbach

Estimation of Inertial Parameters of Rigid Body Links of Manipulators

February 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-887.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-887.pdf

A method of estimating the mass, the location of center of mass, and the moments of inertia of each rigid body link of a robot during general manipulator movement is presented. The algorithm is derived from the Newton-Euler equations, and uses measurements of the joint torques as well as the measurement and calculation of the kinematics of the manipulator while it is moving. The identification equations are linear in the desired unknown parameters, and a modified least squares algorithm is used to obtain estimates of these parameters. Some of the parameters, however, are not identifiable due to restricted motion of proximal links and the lack of full force/torque sensing. The algorithm was implemented on the MIT Serial Link Direct Drive Arm. A good match was obtained between joint torques predicted from the estimated parameters and the joint torques computed from motor currents.
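
The point that the identification equations are linear in the unknown parameters can be shown on a much smaller problem. The sketch below assumes a hypothetical one-link pendulum with torque model tau = I*qdd + k*sin(q) and fits (I, k) by ordinary least squares; it is not the memo's modified least squares algorithm or its multi-link formulation, and all names are illustrative.

```python
import math

def estimate_pendulum_parameters(samples):
    """Ordinary least squares for tau = I*qdd + k*sin(q), unknowns (I, k).

    samples: iterable of (q, qdd, tau) measurements. This is an assumed toy
    model; each sample contributes one regressor row [qdd, sin(q)].
    """
    s11 = s12 = s22 = b1 = b2 = 0.0
    for q, qdd, tau in samples:
        r1, r2 = qdd, math.sin(q)
        # Accumulate the 2x2 normal equations (R^T R) p = R^T tau.
        s11 += r1 * r1
        s12 += r1 * r2
        s22 += r2 * r2
        b1 += r1 * tau
        b2 += r2 * tau
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        # Restricted motion: the data cannot separate the two parameters.
        raise ValueError("parameters are not identifiable from this motion")
    inertia = (b1 * s22 - b2 * s12) / det
    gravity_term = (s11 * b2 - s12 * b1) / det
    return inertia, gravity_term

# Synthetic, noise-free data generated from assumed values I = 0.3, k = 1.2.
motion = [(0.1, 1.0), (0.5, -2.0), (1.0, 0.5), (-0.7, 3.0)]
samples = [(q, qdd, 0.3 * qdd + 1.2 * math.sin(q)) for q, qdd in motion]
inertia_hat, gravity_hat = estimate_pendulum_parameters(samples)
```

If every sample has q = 0, the gravity column of the regressor vanishes and the normal equations become singular, a one-link analogue of parameters that are not identifiable under restricted motion.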

AIM-885

Author[s]: A.L. Yuille

Shape from Shading, Occlusion and Texture

May 1987

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-885.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-885.pdf

Shape from Shading, Occlusion and Texture are three important sources of depth information. We review and summarize work done on these modules.

AIM-883

Author[s]: Michael Erdmann and Tomas Lozano-Perez

On Multiple Moving Objects

May 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-883.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-883.pdf

This paper explores the motion planning problem for multiple moving objects. The approach taken consists of assigning priorities to the objects, then planning motions one object at a time. For each moving object, the planner constructs a configuration space-time that represents the time-varying constraints imposed on the moving object by the other moving and stationary objects. The planner represents this space-time approximately, using two-dimensional slices. The space-time is then searched for a collision-free path. The paper demonstrates this approach in two domains. One domain consists of translating planar objects; the other domain consists of two-link planar articulated arms.
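
The prioritized scheme described above can be sketched on a discrete grid. The example below is an illustrative simplification: it replaces the memo's approximate slice representation of configuration space-time with unit grid cells and unit time steps, treats each object as a point, and assumes distinct start cells. All function names and the corridor example are hypothetical.

```python
from collections import deque

def position(path, t):
    """Cell occupied at time t; an object stays at its last cell after arriving."""
    return path[min(t, len(path) - 1)]

def conflicts(higher, cell, nxt, t):
    """True if moving cell -> nxt during step t -> t+1 hits a higher-priority object."""
    for path in higher:
        if position(path, t + 1) == nxt:
            return True  # both objects in the same cell at the same time
        if position(path, t) == nxt and position(path, t + 1) == cell:
            return True  # the two objects would swap cells
    return False

def plan_in_space_time(start, goal, width, height, higher, horizon):
    """Breadth-first search over (cell, time) states, avoiding planned paths."""
    moves = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # waiting is allowed
    first = (start, 0)
    parent = {first: None}
    queue = deque([first])
    while queue:
        cell, t = queue.popleft()
        # Accept the goal only if no higher-priority object visits it later.
        if cell == goal and all(position(p, s) != cell
                                for p in higher for s in range(t, horizon + 1)):
            path, node = [], (cell, t)
            while node is not None:
                path.append(node[0])
                node = parent[node]
            return path[::-1]
        if t == horizon:
            continue
        for dx, dy in moves:
            nxt = (cell[0] + dx, cell[1] + dy)
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            if (nxt, t + 1) in parent or conflicts(higher, cell, nxt, t):
                continue
            parent[(nxt, t + 1)] = (cell, t)
            queue.append((nxt, t + 1))
    return None

def prioritized_plan(tasks, width, height, horizon):
    """Plan one object at a time; earlier tasks have higher priority."""
    paths = []
    for start, goal in tasks:
        path = plan_in_space_time(start, goal, width, height, paths, horizon)
        if path is None:
            return None
        paths.append(path)
    return paths

# Two objects exchanging ends of a 3 x 2 grid; the second must detour.
paths = prioritized_plan([((0, 0), (2, 0)), ((2, 0), (0, 0))], 3, 2, horizon=8)
```

Because each object is planned against fixed higher-priority paths, the scheme is not complete: a different priority order can succeed where this one returns None.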

AIM-882

Author[s]: John M. Hollerbach and Ki C. Suh

Redundancy Resolution of Manipulators through Torque Optimization

January 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-882.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-882.pdf

Methods for resolving kinematic redundancies of manipulators by the effect on joint torque are examined. When the generalized inverse is formulated in terms of accelerations and incorporated into the dynamics, the effect of redundancy resolution on joint torque can be directly reflected. One method chooses the joint acceleration null-space vector to minimize joint torque in a least squares sense; when the least squares is weighted by allowable torque range, the joint torques tend to be kept within their limits. Contrasting methods employing only the pseudoinverse with and without weighting by the inertia matrix are presented. The results show an unexpected stability problem during long trajectories for the null-space methods and for the inertia-weighted pseudoinverse method, but rarely for the unweighted pseudoinverse method. Evidently a whiplash action develops over time that thrusts the endpoint off the intended path, and extremely high torques are required to overcome these natural movement dynamics.

AIM-879

Author[s]: Aaron Bobick and Whitman Richards

Classifying Objects from Visual Information

June 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-879.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-879.pdf

Consider a world of 'objects.' Our goal is to place these objects into categories that are useful to the observer using sensory data. One criterion for utility is that the categories allow the observer to infer the object's potential behaviors, which are often non-observable. Under what conditions can such useful categories be created? We propose a solution which requires (1) that modes or clusters of natural structures are present in the world, and (2) that the physical properties of these structures are reflected in the sensory data used by the observer for classification. Given these two constraints, we explore the type of additional knowledge sufficient for the observer to generate an internal representation that makes explicit the natural modes. Finally we develop a formal expression of the object classification problem.

AIM-877

Author[s]: James H. Applegate, Michael R. Douglas, Yekta Gursel, Gerald Jay Sussman and Jack Wisdom

The Outer Solar System for 200 Million Years

February 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-877.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-877.pdf

We used a special purpose computer to integrate the orbits of the outer five planets for 100 Myr into the future and 100 Myr into the past. The strongest features in the Fourier transforms of the orbital elements of the Jovian planets can be identified with the frequencies predicted by linear secular theory. Many of the weaker features in the Fourier spectra are identified as linear combinations of the basic frequencies. We note serious differences between our measurements and the predictions of Bretagnon (1974). The amplitude of the 3.796 Myr period libration of Pluto’s longitude of perihelion is modulated with a period of 34 Myr. Very long periods, on the order of 137 million years, are also seen.

AIM-876

Author[s]: Shahriar Negahdaripour and Alan Yuille

Direct Passive Navigation: Analytical Solution for Quadratic Patches

March 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-876.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-876.pdf

In this paper, we solve the problem of recovering the motion of an observer relative to a surface which can be locally approximated by a quadratic patch directly from image brightness values. We do not compute the optical flow as an intermediate step. We use the coefficients of the Taylor series expansion of the intensity function in two frames to determine 15 intermediate parameters, termed the essential parameters, from a set of linear equations. We then solve analytically for the motion and structure parameters from a set of nonlinear equations in terms of these intermediate parameters. We show that the solution is always unique, unlike some earlier results that reported two-fold ambiguities in some special cases.

AIM-875

Author[s]: Raul E. Valdes-Perez

Spatio-Temporal Reasoning and Linear Inequalities

May 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-875.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-875.pdf

Time and space are sufficiently similar to warrant in certain cases a common representation in AI problem-solving systems. What is represented is often the constraints that hold between objects, and a concern is the overall consistency of a set of constraints. This paper scrutinizes two current approaches to spatio-temporal reasoning. The suitableness of Allen’s temporal algebra for constraint networks is influenced directly by the mathematical properties of the algebra. These properties are extracted by a formulation as a network of set-theoretic relations, such that some previous theorems due to Montanari apply. Some new theorems concerning consistency of these temporal constraint networks are also presented.

AITR-874

Author[s]: Richard Elliot Robbins

BUILD: A Tool for Maintaining Consistency in Modular Systems

November 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-874.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-874.pdf

BUILD is a tool for keeping modular systems in a consistent state by managing the construction tasks (e.g. compilation and linking) associated with such systems. It employs a user supplied system model and a procedural description of a task to be performed in order to perform the task. This differs from existing tools which do not explicitly separate knowledge about systems from knowledge about how systems are manipulated. BUILD provides a static framework for modeling systems and handling construction requests that makes use of programming environment specific definitions. By altering the set of definitions, BUILD can be extended to work with new programming environments to perform new tasks.

AIM-873

Author[s]: Fanya S. Montalvo

Diagram Understanding: The Intersection of Computer Vision and Graphics

November 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-873.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-873.pdf

A problem common to Computer Vision and Computer Graphics is identified. It is the problem of representing, acquiring and validating symbolic descriptions of visual properties. The intersection of Computer Vision and Computer Graphics provides a basis for diagrammatic conversations between users and systems. I call this problem domain Diagram Understanding because of its analogy with Natural Language Understanding. The recognition and generation of visual objects from symbolic descriptions are two sides of the same coin. A paradigm for the discovery and validation of higher-level visual properties is introduced. The paradigm involves two aspects. One is the notion of denotation: the map between symbolic descriptions and visual properties. The denotation map can be validated by focusing on the conversation between users and a system. The second aspect involves a method for discovering a natural rich set of visual primitives. The notion of visual property is expanded, and the paradigm is further illustrated with a traditional business graphics example.

AIM-871

Author[s]: John C. Mallery, Roger Hurwitz and Gavan Duffy

Hermeneutics: From Textual Explication to Computer Understanding?

May 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-871.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-871.pdf

Hermeneutics, a branch of continental European philosophy concerned with human understanding and the interpretation of written texts, offers insights that may contribute to the understanding of meaning, translation, architectures for natural language understanding, and even to the methods suitable for scientific inquiry in AI. After briefly reviewing the historical development of hermeneutics as a method of interpretation, this article examines the contributions of hermeneutics to the human sciences. This background provides perspective for a review of recent hermeneutically-oriented AI research, including the Alker, Lehnert and Schneider computer-assisted techniques for coding the affective structure of narratives, the earlier positive proposal by Winograd and Bateman, the later pessimism of Winograd and Flores on the possibility of AI, as well as the system-building efforts of Duffy and Mallery.

AIM-870

Author[s]: Shimon Ullman

The Optical Flow of Planar Surfaces

December 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-870.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-870.pdf

The human visual system can recover the 3D shape of moving objects on the basis of motion information alone. Computational studies of this capacity have considered primarily non-planar rigid objects. With respect to moving planar surfaces, previous studies by Hay (1966), Tsai and Huang (1981), Longuet-Higgins (1984), have shown that the planar velocity field has in general a two-fold ambiguity: there are two different planes engaged in different motions that can induce the same velocity field. The current analysis extends the analysis of the planar velocity field in four directions: (1) the use of flow parameters of the type suggested by Koenderink and van Doorn (1975), (2) the exclusion of confusable non-planar solutions, (3) a new proof and a new method for computing the 3D motion and surface orientation, and (4) a comparison with the information available in orthographic velocity fields, which is important for determining the stability of the 3D recovery process.

AIM-869

Author[s]: John Batali

A Vision Chip

May 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-869.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-869.pdf

Some well understood and well justified algorithms for early visual processing must be implemented in hardware for later visual processing to be studied. This paper describes the design and hardware implementation of a particular operator of visual processing. I constructed an NMOS VLSI circuit that computes the gradient, and detects zero-crossings, in a digital video image in real time. The algorithms employed by the chip, the design process that led to it, and its capabilities and limitations are discussed. For hardware to be a useful tool for AI, designing it must be as much like programming as possible. This paper concludes with some discussion of how such a goal can be met.

AIM-868

Author[s]: Brian C. Williams

Circumscribing Circumscription: A Guide to Relevance and Incompleteness

October 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-868.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-868.pdf

Intelligent agents in the physical world must work from incomplete information due to partial knowledge and limited resources. An agent copes with these limitations by applying rules of conjecture to make reasonable assumptions about what is known. Circumscription, proposed by McCarthy, is the formalization of a particularly important rule of conjecture likened to Occam’s razor. That is, the set of all objects satisfying a certain property is the smallest set of objects that is consistent with what is known. This paper examines closely the properties and the semantics underlying circumscription, considering both its expressive power and limitations. In addition we study circumscription’s relationship to several related formalisms, such as negation by failure, the closed world assumption, default reasoning and Planner’s THNOT. In the discussion a number of extensions to circumscription are proposed, allowing one to tightly focus its scope of applicability. In addition, several new rules of conjecture are proposed based on the notions of relevance and minimality. Finally a synthesis between the approaches of McCarthy and Konolige is used to extend circumscription, as well as several other rules of conjecture, to account for resource limitations.

AIM-867

Author[s]: Daniel P. Huttenlocher

Exploiting Sequential Phonetic Constraints in Recognizing Spoken Words

October 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-867.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-867.pdf

Machine recognition of spoken language requires developing more robust recognition algorithms. The current paper extends the work of Shipman and Zue by investigating the power of partial phonetic descriptions. First we demonstrate that sequences of manner of articulation classes are more reliable and provide more constraint than other classes. Alone these are of limited utility, due to the high degree of variability in natural speech. This variability is not uniform, however, as most modifications and deletions occur in unstressed syllables. The stressed syllables provide substantially more constraint. This indicates that recognition algorithms can be made more robust by exploiting the manner of articulation information in stressed syllables.

AIM-865

Author[s]: Gul Agha and Carl Hewitt

Concurrent Programming Using Actors: Exploiting Large-Scale Parallelism

October 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-865.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-865.pdf

We argue that the ability to model shared objects with changing local states, dynamic reconfigurability, and inherent parallelism are desirable properties of any model of concurrency. The actor model addresses these issues in a uniform framework. This paper briefly describes the concurrent programming language Act3 and the principles that have guided its development. Act3 advances the state of the art in programming languages by combining the advantages of object-oriented programming with those of functional programming. We also discuss considerations relevant to large-scale parallelism in the context of open systems, and define an abstract model which establishes the equivalence of systems defined by actor programs.

AIM-864

Author[s]: Rodney A. Brooks

A Robust Layered Control System for a Mobile Robot

September 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-864.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-864.pdf

We describe a new architecture for controlling mobile robots. Layers of control system are built to let the robot operate at increasing levels of competence. Layers are made up of asynchronous modules which communicate over low bandwidth channels. Each module is an instance of a fairly simple computational machine. Higher level layers can subsume the roles of lower levels by suppressing their outputs. However, lower levels continue to function as higher levels are added. The result is a robust and flexible robot control system. The system is intended to control a robot that wanders the office areas of our laboratory, building maps of its surroundings. In this paper we demonstrate the system controlling a detailed simulation of the robot.
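
The suppression idea in the abstract above can be illustrated with a minimal sketch. The behaviours `cruise` and `avoid`, the 0.5 m threshold, and the `arbitrate` function are hypothetical names chosen for illustration; they are not the modules or the layer decomposition of the memo, and the sketch is synchronous whereas the memo describes asynchronous modules.

```python
from typing import Callable, List, Optional

# A layer maps a sensor reading (distance to the nearest obstacle, in metres)
# to a motor command, or returns None when it has nothing to contribute.
Layer = Callable[[float], Optional[str]]


def cruise(distance: float) -> Optional[str]:
    """Hypothetical lowest layer: always drive forward."""
    return "forward"


def avoid(distance: float) -> Optional[str]:
    """Hypothetical higher layer: turn away when an obstacle is close."""
    return "turn" if distance < 0.5 else None


def arbitrate(layers: List[Layer], distance: float) -> str:
    """Layers are ordered lowest first. A higher layer that produces an
    output suppresses the output of every layer below it."""
    command = "stop"  # default when no layer produces an output
    for layer in layers:
        output = layer(distance)
        if output is not None:
            command = output  # a later (higher) layer overrides lower ones
    return command
```

With only `cruise` installed the robot still produces commands, and adding `avoid` changes the result only when the higher layer chooses to speak, which mirrors the claim that lower levels keep functioning as higher levels are added.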

AIM-863

Author[s]: Shahriar Negahdaripour

Direct Passive Navigation: Analytical Solution for Planes

August 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-863.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-863.pdf

In this paper, we derive a closed form solution for recovering the motion of an observer relative to a planar surface directly from image brightness derivatives. We do not compute the optical flow as an intermediate step, only the spatial and temporal intensity gradients at a minimum of 8 points. We solve a linear matrix equation for the elements of a 3x3 matrix. The eigenvalue decomposition of its symmetric part is then used to compute the motion parameters and the plane orientation.

AIM-862

Author[s]: Van-Duc Nguyen

The Synthesis of Stable Grasps in the Plane

October 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-862.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-862.pdf

This paper addresses the problem of synthesizing stable grasps on arbitrary planar polygons. Each finger is a virtual spring whose stiffness and compression can be programmed. The contacts between the finger tips and the object are point contacts without friction. We prove that all force-closure grasps can be made stable, and it costs O(n) time to synthesize a set of n virtual springs such that a given force closure grasp is stable. We can also choose the compliance center and the stiffness matrix of the grasp, and so choose the compliant behavior of the grasped object about its equilibrium. The planning and execution of grasps and assembly operations become easier and less sensitive to errors.

AIM-861

Author[s]: Van-Duc Nguyen

The Synthesis of Force-Closure Grasps

September 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-861.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-861.pdf

This paper addresses the problem of synthesizing planar grasps that have force closure. A grasp on an object is a force closure grasp if and only if we can exert, through the set of contacts, arbitrary force and moment on this object. Equivalently, any motion of the object is resisted by a contact force, that is, the object cannot break contact with the finger tips without some non-zero external work. The force closure constraint is addressed from three different points of view: mathematics, physics, and computational geometry. The last formulation results in fast and simple polynomial time algorithms for directly constructing force closure grasps. We can also find grasps where each finger has an independent region of contact on the set of edges.
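
The force closure definition above can be checked numerically in a simple special case. The following is a minimal sketch, assuming frictionless point contacts in the plane, each given as a contact point and an inward unit normal; it tests whether the contact wrenches (force x, force y, torque) positively span three-dimensional wrench space. The function names and the tolerance `eps` are assumptions of this sketch, and it is a brute-force test, not the construction algorithm of the memo.

```python
from itertools import combinations


def contact_wrench(point, normal):
    """Wrench (fx, fy, torque about the origin) of a unit force along the normal."""
    (x, y), (nx, ny) = point, normal
    return (nx, ny, x * ny - y * nx)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def is_force_closure(contacts, eps=1e-9):
    """contacts: list of ((x, y), (nx, ny)) pairs for frictionless point contacts."""
    wrenches = [contact_wrench(p, n) for p, n in contacts]
    # The wrenches must span all three dimensions: some triple is independent.
    if not any(abs(dot(cross(a, b), c)) > eps
               for a, b, c in combinations(wrenches, 3)):
        return False
    # They positively span the space only if no plane through two of them
    # leaves every wrench on a single closed side.
    for a, b in combinations(wrenches, 2):
        n = cross(a, b)
        if dot(n, n) <= eps:
            continue  # parallel pair defines no plane
        sides = [dot(n, w) for w in wrenches]
        if not (any(s > eps for s in sides) and any(s < -eps for s in sides)):
            return False
    return True


# Four fingers on the square [-1, 1] x [-1, 1], offset from the edge midpoints.
offset_grasp = [((0.5, -1.0), (0.0, 1.0)), ((-0.5, 1.0), (0.0, -1.0)),
                ((-1.0, 0.5), (1.0, 0.0)), ((1.0, -0.5), (-1.0, 0.0))]
# The same fingers placed exactly at the edge midpoints exert no torque.
centered_grasp = [((0.0, -1.0), (0.0, 1.0)), ((0.0, 1.0), (0.0, -1.0)),
                  ((-1.0, 0.0), (1.0, 0.0)), ((1.0, 0.0), (-1.0, 0.0))]
```

In this example the offset grasp passes the test, while the centered grasp fails because none of its contacts can resist a pure torque.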

AITR-860

Author[s]: Jose Luis Marroquin

Probabilistic Solution of Inverse Problems

September 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-860.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-860.pdf

In this thesis we study the general problem of reconstructing a function, defined on a finite lattice, from a set of incomplete, noisy and/or ambiguous observations. The goal of this work is to demonstrate the generality and practical value of a probabilistic (in particular, Bayesian) approach to this problem, particularly in the context of Computer Vision. In this approach, the prior knowledge about the solution is expressed in the form of a Gibbsian probability distribution on the space of all possible functions, so that the reconstruction task is formulated as an estimation problem. Our main contributions are the following: (1) We introduce the use of specific error criteria for the design of the optimal Bayesian estimators for several classes of problems, and propose a general (Monte Carlo) procedure for approximating them. This new approach leads to a substantial improvement over the existing schemes, both regarding the quality of the results (particularly for low signal to noise ratios) and the computational efficiency. (2) We apply the Bayesian approach to the solution of several problems, some of which are formulated and solved in these terms for the first time. Specifically, these applications are: the reconstruction of piecewise constant surfaces from sparse and noisy observations; the reconstruction of depth from stereoscopic pairs of images; and the formation of perceptual clusters. (3) For each one of these applications, we develop fast, deterministic algorithms that approximate the optimal estimators, and illustrate their performance on both synthetic and real data. (4) We propose a new method, based on the analysis of the residual process, for estimating the parameters of the probabilistic models directly from the noisy observations. This scheme leads to an algorithm, which has no free parameters, for the restoration of piecewise uniform images. (5) We analyze the implementation of the algorithms that we develop in non-conventional hardware, such as massively parallel digital machines, and analog and hybrid networks.

AITR-859

Author[s]: Anita M. Flynn

Redundant Sensors for Mobile Robot Navigation

September 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-859.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-859.pdf

Redundant sensors are needed on a mobile robot so that the accuracy with which it perceives its surroundings can be increased. Sonar and infrared sensors are used here in tandem, each compensating for deficiencies in the other. The robot combines the data from both sensors to build a representation which is more accurate than if either sensor were used alone. Another representation, the curvature primal sketch, is extracted from this perceived workspace and is used as the input to two path planning programs: one based on configuration space and one based on a generalized cone formulation of free space.

AIM-858

Author[s]: Ellen C. Hildreth

Edge Detection

September 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-858.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-858.pdf

The goal of vision is to recover physical properties of objects in a scene, such as the location of object boundaries and the structure, color, and texture of object surfaces, from the two-dimensional image that is projected onto the eye or camera. The first clues about the physical properties of the scene are provided by the changes of intensity in the image. The importance of intensity changes and edges in early visual processing has led to extensive research on their detection, description, and use, both in computer and biological vision systems. This article reviews some of the theory that underlies the detection of edges and the methods used to carry out this analysis.

AIM-857

Author[s]: Ryszard S. Michalski and Patrick H. Winston

Variable Precision Logic

August 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-857.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-857.pdf

Variable precision logic is concerned with problems of reasoning with incomplete information and under time constraints. It offers mechanisms for handling trade-offs between the precision of inferences and the computational efficiency of deriving them. Of the two aspects of precision, the specificity of conclusions and the certainty of belief in them, we address here primarily the latter, and employ censored production rules as an underlying representational and computational mechanism. Such rules are created by augmenting ordinary production rules with an exception condition and are written in the form if A then B unless C, where C is the exception condition. From a control viewpoint, censored production rules are intended for situations in which the implication A → B holds frequently and the assertion C holds rarely. Systems using censored production rules are free to ignore the exception conditions when time is at a premium. Given more time, the exception conditions are examined, lending credibility to initial, high-speed answers, or changing them. Such logical systems therefore exhibit variable certainty of conclusions, reflecting variable investments of computational resources in conducting reasoning. From a logical viewpoint, the unless operator between B and C acts as the exclusive-or operator. From an expository viewpoint, the if A then B part of the censored production rule expresses important information (e.g., a causal relationship), while the unless C part acts only as a switch that changes the polarity of B to –B when C holds. Expositive properties are captured quantitatively by augmenting censored rules with two parameters that indicate the certainty of the implication if A then B. Parameter δ is the certainty when the truth value of C is unknown, and γ is the certainty when C is known to be false.
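
A minimal sketch of how a single censored production rule might be evaluated follows. The function name, the argument names `certainty_c_unknown` and `certainty_c_false` (standing for the two certainty parameters mentioned at the end of the abstract), and the choice to return full certainty when the censor C is known to hold are assumptions of this sketch rather than details taken from the memo.

```python
from typing import Optional, Tuple


def apply_censored_rule(a: bool,
                        c: Optional[bool],
                        certainty_c_unknown: float,
                        certainty_c_false: float) -> Optional[Tuple[bool, float]]:
    """Evaluate 'if A then B unless C'.

    a -- truth value of the premise A
    c -- truth value of the censor C, or None if it was not examined
         (for example because time was at a premium)
    Returns (value of B, certainty), or None when the rule does not fire.
    """
    if not a:
        return None  # premise fails: the rule says nothing about B
    if c is None:
        # Fast path: ignore the exception condition and accept lower certainty.
        return (True, certainty_c_unknown)
    if c:
        # The censor holds, so the polarity of B is switched (exclusive-or
        # reading). Full certainty here is an assumption of this sketch.
        return (False, 1.0)
    # The censor was examined and found false: B with higher certainty.
    return (True, certainty_c_false)
```

Calling the function first with `c=None` and later with the examined value of C reproduces the behaviour described above: a quick answer of limited certainty that further computation either strengthens or reverses.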

AIM-856

Author[s]: G. Edward Barton, Jr.

The Computational Complexity of Two-Level Morphology

November 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-856.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-856.pdf

Morphological analysis requires knowledge of the stems, affixes, combinatory patterns, and spelling-change processes of a language. The computational difficulty of the task can be clarified by investigating the computational characteristics of specific models of morphological processing. The use of finite-state machinery in the “two-level” model by Kimmo Koskenniemi does not guarantee efficient processing. Reductions of the satisfiability problem show that finding the proper lexical–surface correspondence in a two-level generation or recognition problem can be computationally difficult. However, another source of complexity in the existing algorithms can be sharply reduced by changing the implementation of the dictionary component. A merged dictionary with bit-vectors reduces the number of choices among alternative dictionary subdivisions by allowing several subdivisions to be searched at once.

AIM-855

Author[s]: W. Eric L. Grimson

Sensing Strategies for Disambiguating Among Multiple Objects in Known Poses

August 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-855.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-855.pdf

The need for intelligent interaction of a robot with its environment frequently requires sensing of the environment. Further, the need for rapid execution requires that the interaction between sensing and action take place using as little sensory data as possible, while still being reliable. Previous work has developed a technique for rapidly determining the feasible poses of an object from sparse, noisy, occluded sensory data. In this paper, we examine techniques for acquiring position and surface orientation data about points on the surfaces of objects, with the intent of selecting sensory points that will force a unique interpretation of the pose of the object with as few data points as possible. Under some simple assumptions about the sensing geometry, we derive a technique for predicting optimal sensing positions. The technique has been implemented and tested. To fully specify the algorithm, we need estimates of the error in estimating the position and orientation of the object, and we derive analytic expressions for such error for the case of one particular approach to object recognition.

AIM-854

Author[s]: Pyung H. Chang

A Closed Form Solution for Inverse Kinematics of Robot Manipulator with Redundancy

March 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-854.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-854.pdf

A closed form equation for inverse kinematics of manipulators with redundancy is derived, using the Lagrangian multiplier method. The proposed equation is proved to provide the exact equilibrium state for the resolved motion method, and is shown to be a general expression that yields the extended Jacobian method. The repeatability problem in the resolved motion method does not exist in the proposed equation. The equation is demonstrated to give more accurate trajectories than the resolved motion method.

AITR-853

Author[s]: Jonathan Hudson Connell

Learning Shape Descriptions: Generating and Generalizing Models of Visual Objects

September 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-853.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-853.pdf

We present the results of an implemented system for learning structural prototypes from grey-scale images. We show how to divide an object into subparts and how to encode the properties of these subparts and the relations between them. We discuss the importance of hierarchy and grouping in representing objects and show how a notion of visual similarities can be embedded in the description language. Finally we exhibit a learning algorithm that forms class models from the descriptions produced and uses these models to recognize new members of the class.

AITR-852

Author[s]: Margaret Morrison Fleck

Local Rotational Symmetries

August 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-852.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-852.pdf

This thesis describes a new representation for two-dimensional round regions called Local Rotational Symmetries. Local Rotational Symmetries are intended as a companion to Brady’s Smoothed Local Symmetry Representation for elongated shapes. An algorithm for computing Local Rotational Symmetry representations at multiple scales of resolution has been implemented and results of this implementation are presented. These results suggest that Local Rotational Symmetries provide a more robustly computable and perceptually accurate description of round regions than previously proposed representations. In the course of developing this representation, it has been necessary to modify the way both Smoothed Local Symmetries and Local Rotational Symmetries are computed. First, grey-scale image smoothing proves to be better than boundary smoothing for creating representations at multiple scales of resolution, because it is more robust and it allows qualitative changes in representations between scales. Secondly, it is proposed that shape representations at different scales of resolution be explicitly related, so that information can be passed between scales and computation at each scale can be kept local. Such a model for multi-scale computation is desirable both to allow efficient computation and to accurately model human perceptions.

AIM-849

Author[s]: John M. Hollerbach and Christopher G. Atkeson

Characterization of Joint-Interpolated Arm Movements

June 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-849.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-849.pdf

Two possible sets of planning variables for human arm movement are joint angles and hand position. Although one might expect these possibilities to be mutually exclusive, recently an apparently contradictory set of data has appeared that indicated straight-line trajectories in both hand space and joint space at the same time. To assist in distinguishing between these viewpoints applied to the same data, we have theoretically characterized the set of trajectories derivable from a joint based planning strategy and have compared them to experimental measurements. We conclude that the apparent straight lines in joint space happen to be artifacts of movement kinematics near the workspace boundary.
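
To see why the two planning strategies are normally expected to be mutually exclusive, the following minimal sketch moves a two-link planar arm along a straight line in joint space and measures how far the resulting hand path departs from a straight line in hand space. The arm model, the link lengths of 0.3 m, and all function names are illustrative assumptions, not the model or data used in the memo.

```python
import math


def hand_position(shoulder, elbow, l1=0.3, l2=0.3):
    """Forward kinematics of a two-link planar arm (angles in radians)."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y


def joint_interpolated_path(start, end, steps=50):
    """Hand positions while both joint angles are interpolated linearly."""
    path = []
    for i in range(steps + 1):
        t = i / steps
        shoulder = start[0] + t * (end[0] - start[0])
        elbow = start[1] + t * (end[1] - start[1])
        path.append(hand_position(shoulder, elbow))
    return path


def max_deviation_from_line(path):
    """Largest perpendicular distance from the chord joining the path's ends.

    Assumes the first and last points differ."""
    (x0, y0), (x1, y1) = path[0], path[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    return max(abs((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) / length
               for x, y in path)


# Straight line in joint space from (0.2, 1.5) to (1.2, 0.5) radians.
path = joint_interpolated_path((0.2, 1.5), (1.2, 0.5))
deviation = max_deviation_from_line(path)
```

For this movement the hand travels along a circular arc and departs from the straight chord by a few centimetres, so a straight line in joint space is visibly curved in hand space.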

AIM-848A

Author[s]: Jonathan Rees and William Clinger (editors)

Revised^3 Report on the Algorithmic Language Scheme

September 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-848a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-848a.pdf

Data and procedures and the values they amass, Higher-order functions to combine and mix and match, Objects with their local state, the messages they pass, A property, a package, the control point for a catch- In the Lambda Order they are all first-class. One thing to name them all, one thing to define them, one thing to place them in environments and bind them, in the Lambda Order they are all first-class. Keywords: Scheme, Lisp, functional programming, computer languages.

AIM-848B

Author[s]: William Clinger and Jonathan Rees (editors)

Revised^4 Report on the Algorithmic Language Scheme

November 1991

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-848b.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-848b.pdf

AIM-848

Author[s]: William Clinger (editor)

The Revised Revised Report on Scheme or An Uncommon Lisp

August 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-848.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-848.pdf

Data and procedures and the values they amass, Higher-order functions to combine and mix and match, Objects with their local state, the messages they pass, A property, a package, the control point for a catch- In the Lambda Order they are all first-class. One thing to name them all, one thing to define them, one thing to place them in environments and bind them, in the Lambda Order they are all first-class. Keywords: SCHEME, LISP, functional programming, computer languages.

AIM-846

Author[s]: Ellen C. Hildreth and John M. Hollerbach

The Computational Approach to Vision and Motor Control

August 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-846.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-846.pdf

Over the past decade it has become increasingly clear that to understand the brain, we must study not only its biochemical and biophysical mechanisms and its outward perceptual and physical behavior; we must also study the brain at a theoretical level that investigates the computations that are necessary to perform its functions. The control of movements such as reaching, grasping and manipulating objects requires complex mechanisms that elaborate information from many sensors and control the forces generated by a large number of muscles. The act of seeing, which intuitively seems so simple and effortless, requires information processing whose complexity we are just beginning to grasp. This paper discusses a particular view of the computational approach to the study of vision and motor tasks and its relevance to experimental neuroscience.

AIM-845

Author[s]: Norberto M. Grzywacz and Ellen C. Hildreth

The Incremental Rigidity Scheme for Recovering Structure from Motion: Position vs. Velocity Based Formulations

October 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-845.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-845.pdf

Perceptual studies suggest that the visual system uses the “rigidity” assumption to recover three-dimensional structure from motion. Ullman (1984) recently proposed a computational scheme, the incremental rigidity scheme, which uses the rigidity assumption to recover the structure of rigid and non-rigid objects in motion. The scheme assumes the input to be discrete positions of elements in motion, under orthographic projection. We present formulations of Ullman’s method that use velocity information and perspective projection in the recovery of structure. Theoretical and computer analyses show that the velocity based formulations provide a rough estimate of structure quickly, but are not robust over an extended time period. The stable long term recovery of structure requires disparate views of moving objects. Our analysis raises interesting questions regarding the recovery of structure from motion in the human visual system.

AITR-844

Author[s]: Gul Abdulnabi Agha

ACTORS: A Model of Concurrent Computation in Distributed Systems

June 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-844.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-844.pdf

A foundational model of concurrency is developed in this thesis. We examine issues in the design of parallel systems and show why the actor model is suitable for exploiting large-scale parallelism. Concurrency in actors is constrained only by the availability of hardware resources and by the logical dependence inherent in the computation. Unlike dataflow and functional programming, however, actors are dynamically reconfigurable and can model shared resources with changing local state. Concurrency is spawned in actors using asynchronous message-passing, pipelining, and the dynamic creation of actors. This thesis deals with some central issues in distributed computing. Specifically, problems of divergence and deadlock are addressed. For example, actors permit dynamic deadlock detection and removal. The problem of divergence is contained because independent transactions can execute concurrently and potentially infinite processes are nevertheless available for interaction.

AITR-843

Author[s]: Peter J. Sterpe

TEMPEST: A Template Editor for Structured Text

June 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-843.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-843.pdf

TEMPEST is a full-screen text editor that incorporates a structural paradigm in addition to the more traditional textual paradigm provided by most editors. While the textual paradigm treats the text as a sequence of characters, the structural paradigm treats it as a collection of named blocks which the user can define, group, and manipulate. Blocks can be defined to correspond to the structural features of the text, thereby providing more meaningful objects to operate on than characters or lines. The structural representation of the text is kept in the background, giving TEMPEST the appearance of a typical text editor. The structural and textual interfaces coexist equally, however, so one can always operate on the text from either point of view. TEMPEST’s representation scheme provides no semantic understanding of structure. This approach sacrifices depth, but affords a broad range of applicability and requires very little computational overhead. A prototype has been implemented to illustrate the feasibility and potential areas of application of the central ideas. It was developed and runs on an IBM Personal Computer.

AIM-842

Author[s]: Tomas Lozano-Perez and Rodney A. Brooks

An Approach to Automatic Robot Programming

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-842.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-842.pdf

In this paper we propose an architecture for a new task-level system, which we call TWAIN. Task-level programming attempts to simplify the robot programming process by requiring that the user specify only goals for the physical relationships among objects, rather than the motions needed to achieve those goals. A task-level specification is meant to be completely robot independent; no positions or paths that depend on the robot geometry or kinematics are specified by the user. We have two goals for this paper. The first is to present a more unified treatment of some individual pieces of research in task planning, whose relationship has not previously been described. The second is to provide a new framework for further research in task planning. This is a slightly modified version of a paper that appeared in Proceedings of Solid Modeling by Computers: from Theory to Applications, Research Laboratories Symposium Series, sponsored by General Motors, Warren, Michigan, September 1983.

AIM-841

Author[s]: W. Eric L. Grimson and Tomas Lozano-Perez

Recognition and Localization of Overlapping Parts from Sparse Data

June 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-841.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-841.pdf

This paper discusses how sparse local measurements of positions and surface normals may be used to identify and locate overlapping objects. The objects are modeled as polyhedra (or polygons) having up to six degrees of positional freedom relative to the sensors. The approach operates by examining all hypotheses about pairings between sensed data and object surfaces and efficiently discarding inconsistent ones by using local constraints on: distances between faces, angles between face normals, and angles (relative to the surface normals) of vectors between sensed points. The method described here is an extension of a method for recognition and localization of non-overlapping parts previously described in [Grimson and Lozano-Perez 84] and [Gaston and Lozano-Perez 84].

AIM-840

Author[s]: Whitman Richards, Jan J. Koenderink and D.D. Hoffman

Inferring 3D Shapes from 2D Codons

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-840.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-840.pdf

All plane curves can be described at an abstract level by a sequence of five primitive elemental shapes, called “codons”, which capture the sequential relations between the singular points of curvature. The codon description provides a basis for enumerating all smooth 2D curves. Let each of these smooth plane curves be considered as the silhouette of an opaque 3D object. Clearly an infinity of 3D objects can generate any one of our “codon” silhouettes. How then can we predict which 3D object corresponds to a given 2D silhouette? To restrict the infinity of choices, we impose three mathematical properties of smooth surfaces plus one simple viewing constraint. The constraint is an extension of the notion of general position, and seems to drive our preferred inferences of 3D shapes, given only the 2D contour.

AIM-839

Author[s]: Jose L. Marroquin

Optimal Bayesian Estimators for Image Segmentation and Surface Reconstruction

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-839.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-839.pdf

A very fruitful approach to the solution of image segmentation and surface reconstruction tasks is their formulation as estimation problems via the use of Markov random field models and Bayes theory. However, the Maximum a Posteriori (MAP) estimate, which is the one most frequently used, is suboptimal in these cases. We show that for segmentation problems the optimal Bayesian estimator is the maximizer of the posterior marginals, while for reconstruction tasks, the threshold posterior mean has the best possible performance. We present efficient distributed algorithms for approximating these estimates in the general case. Based on these results, we develop a maximum likelihood that leads to a parameter-free distributed algorithm for restoring piecewise constant images. To illustrate these ideas, the reconstruction of binary patterns is discussed in detail.
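
The difference between the MAP estimate and the maximizer of the posterior marginals can be seen on a toy posterior. The following minimal sketch enumerates a made-up distribution over two binary sites; the probabilities and function names are illustrative assumptions, and exhaustive enumeration stands in for the distributed approximation algorithms of the memo.

```python
def map_estimate(posterior):
    """Configuration with the highest joint posterior probability."""
    return max(posterior, key=posterior.get)


def mpm_estimate(posterior):
    """Maximizer of the posterior marginals: choose, site by site, the label
    whose marginal posterior probability is largest."""
    n_sites = len(next(iter(posterior)))
    labels = sorted({label for config in posterior for label in config})
    estimate = []
    for site in range(n_sites):
        marginal = {label: sum(p for config, p in posterior.items()
                               if config[site] == label)
                    for label in labels}
        estimate.append(max(marginal, key=marginal.get))
    return tuple(estimate)


def expected_errors(estimate, posterior):
    """Expected number of sites at which the estimate is wrong."""
    return sum(p * sum(e != t for e, t in zip(estimate, config))
               for config, p in posterior.items())


# Hypothetical posterior over two binary sites (made-up numbers).
posterior = {(0, 0): 0.4, (1, 0): 0.3, (1, 1): 0.3}
map_config = map_estimate(posterior)
mpm_config = mpm_estimate(posterior)
```

Here the MAP configuration is (0, 0), while the marginals favour (1, 0), and the latter has the smaller expected number of mislabelled sites, which is the error criterion relevant to segmentation.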

AIM-838

Author[s]: Jean Ponce

Prism Trees: An Efficient Representation for Manipulating and Displaying Polyhedra with Many Faces

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-838.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-838.pdf

Computing surface and/or object intersections is a cornerstone of many algorithms in Geometric Modeling and Computer Graphics, for example Set Operations between solids, or surface Ray Casting display. We present an object centered, information preserving, hierarchical representation for polyhedra called Prism Tree. We use the representation to decompose the intersection algorithms into two steps: the localization of intersections, and their processing. When dealing with polyhedra with many faces (typically more than one thousand), the first step is by far the most expensive. The Prism Tree structure is used to compute efficiently this localization step. A preliminary implementation of the Set Operations and Ray casting algorithms has been constructed.

AIM-837

Author[s]: Eric Sven Ristad

GPSG-Recognition is NP-Hard

March 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-837.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-837.pdf

Proponents of generalized phrase structure grammar (GPSG) cite its weak context-free generative power as proof of the computational tractability of GPSG-Recognition. Since context-free languages (CFLs) can be parsed in time proportional to the cube of the sentence length, and GPSGs only generate CFLs, it seems plausible that GPSGs can also be parsed in cubic time. This longstanding, widely assumed GPSG “efficient parsability” result is misleading: parsing the sentences of an arbitrary GPSG is likely to be intractable, because a reduction from 3SAT proves that the universal recognition problem for the GPSGs of Gazdar (1981) is NP-hard. Crucially, the time to parse a sentence of a CFL can be the product of sentence length cubed and context-free grammar size squared, and the GPSG grammar can result in an exponentially large set of derived context-free rules. A central object in the 1981 GPSG theory, the metarule, inherently results in an intractable parsing problem, even when severely constrained. The implications for linguistics and natural language parsing are discussed.

AIM-836

Author[s]: Robert C. Berwick and Amy S. Weinberg

Parsing and Linguistic Explanation

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-836.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-836.pdf

This article summarizes and extends recent results linking deterministic parsing to observed “locality principles” in syntax. It also argues that grammatical theories based on explicit phrase structure rules are unlikely to provide comparable explanations of why natural languages are built the way they are.

AIM-835

Author[s]: John M. Rubin and W.A. Richards

Boundaries of Visual Motion

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-835.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-835.pdf

A representation of visual motion convenient for recognition should make prominent the qualitative differences among simple motions. We argue that the first stage in such a motion representation is to make explicit boundaries that we define as starts, stops, and force discontinuities. When one of these boundaries occurs in motion, human observers have the subjective impression that some fleeting, significant event has occurred. We go farther and hypothesize that one of the subjective motion boundaries is seen if and only if one of our defined boundaries occurs. We enumerate all possible motion boundaries and provide evidence that they are psychologically real.

AITR-834

Author[s]: Peter Merrett Andreae

Justified Generalization: Acquiring Procedures from Examples

January 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-834.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-834.pdf

This thesis describes an implemented system called NODDY for acquiring procedures from examples presented by a teacher. Acquiring procedures from examples involves several different generalization tasks. Generalization is an underconstrained task, and the main issue of machine learning is how to deal with this underconstraint. The thesis presents two principles for constraining generalization on which NODDY is based. The first principle is to exploit domain based constraints. NODDY demonstrates how such constraints can be used both to reduce the space of possible generalizations to manageable size, and to generate negative examples out of positive examples to further constrain the generalization. The second principle is to avoid spurious generalizations by requiring justification before adopting a generalization. NODDY demonstrates several different ways of justifying a generalization and proposes a way of ordering and searching a space of candidate generalizations based on how much evidence would be required to justify each generalization. Acquiring procedures also involves three types of constructive generalizations: inferring loops (a kind of group), inferring complex relations and state variables, and inferring predicates. NODDY demonstrates three constructive generalization methods for these kinds of generalization.

AIM-833

Author[s]: Tomaso Poggio, Harry Voorhees and Alan Yuille

A Regularized Solution to Edge Detection

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-833.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-833.pdf

We consider edge detection as the problem of measuring and localizing changes of light intensity in the image. As discussed by Torre and Poggio (1984), edge detection, when defined in this way, is an ill-posed problem in the sense of Hadamard. The regularized solution that arises is then the solution to a variational principle. In the case of exact data, one of the standard regularization methods (see Poggio and Torre, 1984) leads to cubic spline interpolation before differentiation. We show that in the case of regularly-spaced data this solution corresponds to a convolution filter (to be applied to the signal before differentiation) which is a cubic spline. In the case of non-exact data, we use another regularization method that leads to a different variational principle. We prove (1) that this variational principle leads to a convolution filter for the problem of one-dimensional edge detection, (2) that the form of this filter is very similar to the Gaussian filter, and (3) that the regularizing parameter $\lambda$ in the variational principle effectively controls the scale of the filter.
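
The "filter, then differentiate" structure described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the memo's derived filter: a sampled Gaussian stands in for the regularizing filter (the abstract says only that the two are very similar), `sigma` stands in for the scale that $\lambda$ controls, and all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Sampled, normalized Gaussian; sigma plays the role of the filter scale."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [
        math.exp(-(k * k) / (2.0 * sigma * sigma))
        for k in range(-radius, radius + 1)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Convolve with the Gaussian, replicating the end samples at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def edge_location(signal, sigma):
    """Index i where the smoothed signal's difference |s[i+1] - s[i]| peaks."""
    s = smooth(signal, sigma)
    diffs = [abs(s[i + 1] - s[i]) for i in range(len(s) - 1)]
    return max(range(len(diffs)), key=diffs.__getitem__)


# A clean step from 0 to 1 between samples 19 and 20.
step = [0.0] * 20 + [1.0] * 20
edge = edge_location(step, sigma=2.0)
```

A larger `sigma` suppresses more noise but blurs nearby edges together, which is the scale trade-off the regularizing parameter governs.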

AIM-832

Author[s]: Alessandro Verri and Alan Yuille

Perspective Projection Invariants

February 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-832.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-832.pdf

An important part of stereo vision consists of finding and matching points in two images which correspond to the same physical element in the scene. We show that zeros of curvature of curves are perspective projection invariants and can therefore be used to find corresponding points. They can be used to help solve the registration problem (Longuet-Higgins, 1982) and to obtain the correct depth when a curve enters the forbidden zone (Krol and van de Grind, 1982). They are also relevant to theories for representing image curves. We consider the stability of these zeros of curvature.

AIM-829

Author[s]: Kent M. Pitman

CREF: An Editing Facility for Managing Structured Text

February 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-829.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-829.pdf

This paper reports work in progress on an experimental text editor called CREF, the Cross Referenced Editing Facility. CREF deals with chunks of text, called segments, which may have associated features such as keywords or various kinds of links to other segments. Text in CREF is organized into linear collections for normal browsing. The use of summary and cross-reference links in CREF allows the imposition of an auxiliary network structure upon the text which can be useful for “zooming in and out” or “non-local transitions.” Although it was designed as a tool for use in complex protocol analysis by a “Knowledge Engineer’s Assistant,” CREF has many interesting features which should make it suitable for a wide variety of applications, including browsing, program editing, document preparation, and mail reading.

AIM-828

Author[s]: Philip E. Agre

Routines

May 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-828.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-828.pdf

Regularities in the world give rise to regularities in the way in which we deal with the world. That is to say, we fall into routines. I have been studying the phenomenon of routinization, the process by which institutionalized patterns of interaction with the world arise and evolve in everyday life. Underlying this evolution is a dialectical process of internalization. First you build a model of some previously unarticulated emergent aspect of an existing routine. Armed with an incrementally more global view of interaction, you can often formulate an incrementally better informed plan of attack. A routine is not a plan in the sense of the classical planning literature, except in the theoretical limit of this process. I am implementing this theory using running arguments, a technique for writing rule-based programs for intelligent agents. Because a running argument is compiled into TMS networks as it proceeds, incremental changes in the world require only incremental recomputation of the reasoning about what actions to take next. The system supports a style of programming, dialectical argumentation, that has many important properties that recommend it as a substrate for large AI systems. One of these might be called additivity: an agent can modify its reasoning in a class of situations by adducing arguments as to why its previous arguments were incorrect in those cases. Because no side-effects are ever required, reflexive systems based on dialectical argumentation ought to be less fragile than intuition and experience suggest. I outline the remaining implementation problems.

AIM-826

Author[s]: Michael Drumheller

Mobile Robot Localization Using Sonar

January 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-826.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-826.pdf

This paper describes a method by which range data from a sonar or other type of rangefinder can be used to determine the 2-dimensional position and orientation of a mobile robot inside a room. The plan of the room is modeled as a list of segments indicating the positions of walls. The method works by extracting straight segments from the range data and examining all hypotheses about pairings between the segments and walls in the model of the room. Inconsistent pairings are discarded efficiently by using local constraints based on distances between walls, angles between walls, and ranges between walls along their normal vectors. These constraints are used to obtain a small set of possible positions, which is further pruned using a test for physical consistency. The approach is extremely tolerant of noise and clutter. Transient objects such as furniture and people need not be included in the room model, and very noisy, low-resolution sensors can be used. The algorithm’s performance is demonstrated using the Polaroid Ultrasonic Rangefinder, which is a low-resolution, high-noise sensor.

AIM-825

Author[s]: S. Murray Sherman and Christof Koch

The Anatomy and Physiology of Gating Retinal Signals in the Mammalian Lateral Geniculate Nucleus

June 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-825.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-825.pdf

In the mammalian visual system, the lateral geniculate nucleus is commonly thought to act merely as a relay for the transmission of visual information from the retina to the visual cortex, a relay without significant elaboration in receptive field properties or signal strength. However, many morphological and electrophysiological observations are at odds with this view. In this paper, we will review the different anatomical pathways and biophysical mechanisms possibly implementing a selective gating of visual information flow from the retina to the visual cortex. We will argue that the lateral geniculate nucleus in mammals is one of the earliest sites where selective, visual attention operates and where general changes in neuronal excitability as a function of the behavioral states of the animal, for instance, sleep, paradoxical sleep, arousal, etc., occur.

AIM-824

Author[s]: Jean Ponce and Michael Brady

Toward a Surface Primal Sketch

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-824.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-824.pdf

This paper reports progress toward the development of a representation of significant surface changes in dense depth maps. We call the representation the Surface Primal Sketch by analogy with representation of intensity changes, image structure, and changes in curvature of planar curves. We describe an implemented program that detects, localizes, and symbolically describes: steps, where the surface height function is discontinuous; roofs, where the surface is continuous but the surface normal is discontinuous; smooth joins, where the surface normal is continuous but a principal curvature is discontinuous and changes sign; and shoulders, which consist of two roofs and correspond to a step viewed obliquely. We illustrate the performance of the program on range maps of objects of varying complexity.

AIM-823

Author[s]: Jonathan H. Connell and Michael Brady

Generating and Generalizing Models of Visual Objects

July 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-823.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-823.pdf

We report on initial experiments with an implemented learning system whose inputs are images of two-dimensional shapes. The system first builds semantic network descriptions of shapes based on Brady’s smoothed local symmetry representation. It learns shape models from them using a substantially modified version of Winston’s ANALOGY program. A generalization of Gray coding enables the representation to be extended and also allows a single operation, called ablation, to achieve the effects of many standard induction heuristics. The program can learn disjunctions, and can learn concepts using only positive examples. We discuss learnability and the pervasive importance of representational hierarchies.

AIM-822

Author[s]: Michael Brady, Jean Ponce, Alan Yuille and Haruo Asada

Describing Surfaces

January 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-822.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-822.pdf

This paper continues our work on visual representations of three-dimensional surfaces [Brady and Yuille 1984b]. The theoretical component of our work is a study of classes of surface curves as a source of constraint on the surface on which they lie, and as a basis for describing it. We analyze bounding contours, surface intersections, lines of curvature, and asymptotes. Our experimental work investigates whether the information suggested by our theoretical study can be computed reliably and efficiently. We demonstrate algorithms that compute lines of curvature of a (Gaussian smoothed) surface; determine planar patches and umbilic regions; extract axes of surfaces of revolution and tube surfaces. We report preliminary results on adapting the curvature primal sketch algorithms of Asada and Brady [1984] to detect and describe surface intersections.

AIM-821

Author[s]: Shahriar Negahdaripour and Berthold K.P. Horn

Direct Passive Navigation

February 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-821.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-821.pdf

In this paper, we show how to recover the motion of an observer relative to a planar surface directly from image brightness derivatives. We do not compute the optical flow as an intermediate step. We derive a set of nine non-linear equations using a least-squares formulation. A simple iterative scheme allows us to find either of two possible solutions of these equations. An initial pass over the relevant image region is used to accumulate a number of moments of the image brightness derivatives. All of the quantities used in the iteration can be efficiently computed from these totals, without the need to refer back to the image. A new, compact notation allows us to show easily that there are at most two planar solutions. Key words: Passive Navigation, Optical flow, Structure and Motion, Least Squares, Planar surface, Non-linear Equations, Dual Solution, Planar Motion Field Equation.

AIM-820

Author[s]: Michael J. Brooks and Berthold K.P. Horn

Shape and Source from Shading

January 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-820.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-820.pdf

Well-known methods for solving the shape-from-shading problem require knowledge of the reflectance map. Here we show how the shape-from-shading problem can be solved when the reflectance map is not available, but is known to have a given form with some unknown parameters. This happens, for example, when the surface is known to be Lambertian, but the direction to the light source is not known. We give an iterative algorithm that alternately estimates the surface shape and the light source direction. Use of the unit normal in parameterizing the reflectance map, rather than the gradient or stereographic coordinates, simplifies the analysis. Our approach also leads to an iterative scheme for computing shape from shading that adjusts the current estimates of the local normals toward or away from the direction of the light source. The amount of adjustment is proportional to the current difference between the predicted and the observed brightness. We also develop generalizations to less constrained forms of reflectance maps.
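
One half of such an alternating scheme, estimating the light source when the surface normals are held fixed, reduces to linear least squares in the Lambertian case. The sketch below is an assumption-laden illustration, not the memo's algorithm: it assumes brightness E = n · s with albedo folded into s, no shadowed points, and uses hypothetical helper names and made-up test data.

```python
def solve3(a, b):
    """Solve the 3x3 system a x = b by Gaussian elimination with pivoting."""
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0] * 3
    for r in (2, 1, 0):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x


def estimate_source(normals, brightness):
    """Least-squares s minimizing sum_i (n_i . s - E_i)^2."""
    # Normal equations: (sum_i n_i n_i^T) s = sum_i E_i n_i
    ata = [[sum(n[i] * n[j] for n in normals) for j in range(3)]
           for i in range(3)]
    atb = [sum(n[i] * e for n, e in zip(normals, brightness))
           for i in range(3)]
    return solve3(ata, atb)


# Made-up unit normals and a made-up source, used to synthesize brightness.
normals = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (0.6, 0.0, 0.8)]
true_source = (0.2, 0.3, 0.9)
brightness = [sum(a * b for a, b in zip(n, true_source)) for n in normals]
estimated = estimate_source(normals, brightness)
```

In a full alternating scheme this step would be interleaved with an update of the normals; that second step is not shown here.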

AIM-817

Author[s]: A. Hurlbert and T. Poggio

Spotlight on Attention

April 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-817.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-817.pdf

We review some recent psychophysical, psychological and anatomical data which highlight the important role of attention in visual information processing, and discuss the evidence for a serial spotlight of attention. We point out the connections between the questions raised by the spotlight model and computational results on the intrinsic parallelism of several tasks in vision.

AIM-816

Author[s]: Richard C. Waters

PP: A LISP Pretty Printing System

December 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-816.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-816.pdf

The PP system provides an efficient implementation of the Common Lisp pretty printing function PPRINT. In addition, PP goes beyond ordinary pretty printers by providing mechanisms which allow the user to control the exact form of pretty printed output. This is done by extending LISP in two ways. First, several new FORMAT directives are provided which support dynamic decisions about the placement of newlines based on the line width available for output. Second, the concept of print-self methods is extended so that it can be applied to lists as well as to objects which can receive messages. Together, these extensions support pretty printing of both programs and data structures. The PP system also modifies the way that the Lisp printer handles the abbreviation of output. The traditional mechanisms for abbreviating lists based on nesting depth and length are extended so that they automatically apply to every kind of structure without the user having to take any explicit action when writing print-self methods. A new abbreviation mechanism is introduced which can be used to limit the total number of lines printed.

AIM-815

Author[s]: Kenneth Man-Kam Yip

Tense, Aspect and the Cognitive Representation of Time

December 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-815.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-815.pdf

This paper explores the relationships between a computational theory of temporal representation (as developed by James Allen) and a formal linguistic theory of tense (as developed by Norbert Hornstein) and aspect. It aims to provide explicit answers to four fundamental questions: (1) what is the computational justification for the primitives of a linguistic theory; (2) what is the computational explanation of the formal grammatical constraints; (3) what are the processing constraints imposed on the learnability and markedness of these theoretical constructs; and (4) what are the constraints that a linguistic theory imposes on representations. We show that one can effectively exploit the interface between the language faculty and the cognitive faculties by using linguistic constraints to determine restrictions on the cognitive representation and vice versa. Three main results are obtained: (1) We derive an explanation of an observed grammatical constraint on tense (the Linear Order Constraint) from the information monotonicity property of the constraint propagation algorithm of Allen’s temporal system; (2) We formulate a principle of markedness for the basic tense structures based on the computational efficiency of the temporal representations; and (3) We show Allen’s interval-based temporal system is not arbitrary, but it can be used to explain independently motivated linguistic constraints on tense and aspect interpretations. We also claim that the methodology of research developed in this study (“cross-level” investigation of independently motivated formal grammatical theory and computational models) is a powerful paradigm with which to attack representational problems in basic cognitive domains, e.g., space, time, causality, etc.

AIM-813

Author[s]: Berthold K.P. Horn

The Variational Approach to Shape from Shading

March 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-813.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-813.pdf

We develop a systematic approach to the discovery of parallel iterative schemes for solving the shape-from-shading problem on a grid. A standard procedure for finding such schemes is outlined, and subsequently used to derive several new ones. The shape-from-shading problem is known to be mathematically equivalent to a non-linear first-order partial differential equation in surface elevation. To avoid the problems inherent in methods used to solve such equations, we follow previous work in reformulating the problem as one of finding a surface orientation field that minimizes the integral of the brightness error. The calculus of variations is then employed to derive the appropriate Euler equations on which iterative schemes can be based. The problem of minimizing the integral of the brightness error term is ill posed, since it has an infinite number of solutions in terms of surface orientation fields. A previous method used a regularization technique to overcome this difficulty. An extra term was added to the integral to obtain an approximation to a solution that was as smooth as possible. We point out here that surface orientation has to obey an integrability constraint if it is to correspond to an underlying smooth surface. Regularization methods do not guarantee that the surface orientation recovered satisfies this constraint. Consequently, we attempt to develop a method that enforces integrability, but fail to find a convergent iterative scheme based on the resulting Euler equations. We show, however, that such a scheme can be derived if, instead of strictly enforcing the constraint, a penalty term derived from the constraint is adopted. This new scheme, while it can be expressed simply and elegantly using the surface gradient, unfortunately cannot deal with constraints imposed by occluding boundaries. These constraints are crucial if ambiguities in the solution of the shape-from-shading problem are to be avoided. Different schemes result if one uses different parameters to describe surface orientation. We derive two new schemes, using unit surface normals, that facilitate the incorporation of the occluding boundary information. These schemes, while more complex, have several advantages over previous ones.

AIM-812

Author[s]: G. Edward Barton, Jr.

On the Complexity of ID/LP Parsing

December 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-812.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-812.pdf

Recent linguistic theories cast surface complexity as the result of interacting subsystems of constraints. For instance, the ID/LP grammar formalism separates constraints on immediate dominance from those on linear order. Shieber (1983) has shown how to carry out direct parsing of ID/LP grammars. His algorithm uses ID and LP constraints directly in language processing, without expanding them into a context-free “object grammar.” This report examines the computational difficulty of ID/LP parsing. Shieber’s purported O(G^2 n^3) runtime bound underestimated the difficulty of ID/LP parsing; the worst-case runtime of his algorithm is exponential in size. A reduction of the vertex-cover problem proves that ID/LP parsing is NP-complete. The growth of the internal data structures is the source of difficulty in Shieber’s algorithm. The computational and linguistic implications of these results are discussed. Despite the potential for combinatorial explosion, Shieber’s algorithm remains better than the alternative of parsing an expanded object grammar.
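
To make the "expanded object grammar" concrete, the sketch below expands a single unordered immediate-dominance (ID) rule into every ordered context-free rule that the linear-precedence (LP) constraints allow. It is an illustration only, not Shieber's algorithm; the rule encoding, the function name, and the assumption that daughter categories are distinct are all hypothetical.

```python
from itertools import permutations


def expand_id_rule(lhs, rhs, lp):
    """Expand one unordered ID rule into ordered context-free rules.

    rhs is a list of daughter categories (assumed distinct here);
    lp is a set of (a, b) pairs meaning "a must precede b".
    """
    rules = []
    for order in permutations(rhs):
        position = {cat: i for i, cat in enumerate(order)}
        consistent = all(
            position[a] < position[b]
            for a, b in lp
            if a in position and b in position
        )
        if consistent:
            rules.append((lhs, order))
    return rules


# One ID rule S -> {a, b, c} with the single LP constraint a < b.
constrained = expand_id_rule("S", ["a", "b", "c"], {("a", "b")})
# With no LP constraints every ordering survives: k daughters give k! rules.
unconstrained = expand_id_rule("S", ["a", "b", "c", "d", "e"], set())
```

The factorial growth of the unconstrained case is the kind of blow-up that makes parsing the expanded grammar unattractive.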

AIM-811

Author[s]: Richard J. Doyle

Hypothesizing and Refining Causal Models

December 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-811.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-811.pdf

An important common sense competence is the ability to hypothesize causal relations. This paper presents a set of constraints which make the problem of formulating causal hypotheses about simple physical systems a tractable one. The constraints include: (1) a temporal and physical proximity requirement, (2) a set of abstract causal explanations for changes in physical systems in terms of dependences between quantities, and (3) a teleological assumption that dependences in designed physical systems are functions. These constraints were embedded in a learning system which was tested in two domains: a sink and a toaster. The learning system successfully generated and refined naïve causal models of these simple physical systems. The causal models which emerge from the learning process support causal reasoning: explanation, prediction, and planning. Inaccurate predictions and failed plans in turn indicate deficiencies in the causal models and the need to re-hypothesize. Thus learning supports reasoning which leads to further learning. The learning system makes use of standard inductive rules of inference as well as the constraints on causal hypotheses to generalize its causal models. Finally, a simple example involving an analogy illustrates another way to repair incomplete causal models.

AITR-810

Author[s]: Michael Andreas Erdmann

On Motion Planning with Uncertainty

August 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-810.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-810.pdf

Robots must successfully plan and execute tasks in the presence of uncertainty. Uncertainty arises from errors in modeling, sensing, and control. Planning in the presence of uncertainty constitutes one facet of the general motion planning problem in robotics. This problem is concerned with the automatic synthesis of motion strategies from high level task specification and geometric models of environments. In order to develop successful motion strategies, it is necessary to understand the effect of uncertainty on the geometry of object interactions. Object interactions, both static and dynamic, may be represented in geometrical terms. This thesis investigates geometrical tools for modeling and overcoming uncertainty. The thesis describes an algorithm for computing backprojections of desired task configurations. Task goals and motion states are specified in terms of a moving object’s configuration space. Backprojections specify regions in configuration space from which particular motions are guaranteed to accomplish a desired task. The backprojection algorithm considers surfaces in configuration space that facilitate sliding towards the goal, while avoiding surfaces on which motions may prematurely halt. In executing a motion from a backprojection region, a plan executor must be able to recognize that a desired task has been accomplished. Since sensors are subject to uncertainty, recognition of task success is not always possible. The thesis considers the structure of backprojection regions and of task goals that ensures goal recognizability. The thesis also develops a representation of friction in configuration space, in terms of a friction cone analogous to the real space friction cone. The friction cone provides the backprojection algorithm with a geometrical tool for determining points at which motions may halt.

AIM-809

Author[s]: Ronald S. Fearing

Simplified Grasping and Manipulation with Dextrous Robot Hands

November 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-809.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-809.pdf

A method is presented for stably grasping 2-dimensional polygonal objects with a dextrous hand when object models are not available. Basic constraints on object vertex angles are found for feasible grasping with two fingers. Local tactile information can be used to determine the finger motion that will reach feasible grasping locations. With an appropriate choice of finger stiffness, a hand can automatically grasp these objects with two fingers. The bounded slip of a part in a hand is shown to be valuable for adapting the fingers and object to a stable situation. Examples are given to show the ability of this grasping method to accommodate disturbance forces and to perform simple part reorientations and regrasping operations.

AITR-807

Author[s]: Andrew Lewis Ressler

A Circuit Grammar For Operational Amplifier Design

January 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-807.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-807.pdf

Electrical circuit designers seldom create really new topologies or use old ones in a novel way. Most designs are known combinations of common configurations tailored for the particular problem at hand. In this thesis I show that much of the behavior of a designer engaged in such ordinary design can be modelled by a clearly defined computational mechanism executing a set of stylized rules. Each of my rules embodies a particular piece of the designer’s knowledge. A circuit is represented as a hierarchy of abstract objects, each of which is composed of other objects. The leaves of this tree represent the physical devices from which physical circuits are fabricated. By analogy with context-free languages, a class of circuits is generated by a phrase-structure grammar of which each rule describes how one type of abstract object can be expanded into a combination of more concrete parts. Circuits are designed by first postulating an abstract object which meets the particular design requirements. This object is then expanded into a concrete circuit by successive refinement using rules of my grammar. There are in general many rules which can be used to expand a given abstract component. Analysis must be done at each level of the expansion to constrain the search to a reasonable set. Thus the rules of my circuit grammar provide constraints which allow the approximate qualitative analysis of partially instantiated circuits. Later, more careful analysis in terms of more concrete components may lead to the rejection of a line of expansion which at first looked promising. I provide special failure rules to direct the repair in this case.

AIM-806

Author[s]: John Canny

Collision Detection for Moving Polyhedra

October 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-806.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-806.pdf

We consider the problem of moving a three dimensional solid object among polyhedral obstacles. The traditional formulation of configuration space for this problem uses three translational parameters and three angles (typically Euler angles), and the constraints between the object and obstacles involve transcendental functions. We show that a quaternion representation of rotation yields constraints which are purely algebraic in a higher-dimensional space. By simple manipulation, the constraints may be projected down into a six dimensional space with no increase in complexity. Using this formulation, we derive an efficient exact intersection test for an object which is translating and rotating among obstacles.
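
The contrast between transcendental and algebraic constraints can be seen in the standard quaternion rotation formula. The sketch below is not the memo's intersection test; it only shows that, when the configuration is a unit quaternion (w, x, y, z), every entry of the rotation matrix is a quadratic polynomial in those four numbers, with no sines or cosines. The function names are illustrative assumptions.

```python
import math


def quat_to_matrix(w, x, y, z):
    """Rotation matrix of a unit quaternion (w, x, y, z).

    Every entry is a quadratic polynomial in the quaternion components,
    so constraints built from rotated points stay algebraic.
    """
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]


def rotate(q, v):
    """Rotate the 3-vector v by the unit quaternion q."""
    m = quat_to_matrix(*q)
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


# A quarter turn about the z axis: w = cos(45 deg), z = sin(45 deg).
# Trigonometry appears only in building this example value, not in the
# rotation formula itself.
half = math.sqrt(0.5)
q = (half, 0.0, 0.0, half)
rotated = rotate(q, [1.0, 0.0, 0.0])
```

The unit-norm condition w^2 + x^2 + y^2 + z^2 = 1 is itself a polynomial equation, which is why the four quaternion components can serve as rotation coordinates without leaving the algebraic setting.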

AIM-805

Author[s]: Michael A. Gennert

Any Dimensional Reconstruction from Hyperplanar Projections

October 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-805.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-805.pdf

In this paper we examine the reconstruction of functions of any dimension from hyperplanar projections. This is a generalization of a problem that has generated much interest recently, especially in the field of medical imaging. Computed Axial Tomography (CAT) and Nuclear Magnetic Resonance (NMR) are two medical techniques that fall in this framework. CAT scans measure the hydrogen density along planes through the body. Here we will examine reconstruction methods that involve backprojecting the projection data and summing this over the entire region of interest. There are two methods for doing this. One method is to filter the projection data first, and then backproject this filtered data and sum over all projection directions. The other method is to backproject and sum the projection data first, and then filter. The two methods are mathematically equivalent, producing very similar equations. We will derive the reconstruction formulas for both methods for any number of dimensions. We will examine the cases of two and three dimensions, since these are the only ones encountered in practice. The equations are very different for these cases. In general, the equations are very different for even and odd dimensionality. We will discuss why this is so, and show that the equations for even and odd dimensionality are related by the Hilbert Transform.

AIM-804

Author[s]: Gideon Sahar and John M. Hollerbach

Planning of Minimum-Time Trajectories for Robot Arms

November 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-804.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-804.pdf

The minimum-time path for a robot arm has been a longstanding and unsolved problem of considerable interest. We present a general solution to this problem that involves joint-space tessellation, a dynamic time-scaling algorithm, and graph search. The solution incorporates full dynamics of movement and actuator constraints, and can be easily extended for joint limits and work space obstacles, but is subject to the particular tessellation scheme used. The results presented show that, in general, the optimal paths are not straight lines, but rather curves in joint-space that utilize the dynamics of the arm and gravity to help in moving the arm faster to its destination. Implementation difficulties due to the tessellation and to combinatorial proliferation of paths are discussed.

AIM-803

Author[s]: Demetri Terzopoulos

Multigrid Relaxation Methods and the Analysis of Lightness, Shading and Flow

October 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-803.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-803.pdf

Image analysis problems, posed mathematically as variational principles or as partial differential equations, are amenable to numerical solution by relaxation algorithms that are local, iterative, and often parallel. Although they are well suited structurally for implementation on massively parallel, locally-interconnected computational architectures, such distributed algorithms are seriously handicapped by an inherent inefficiency at propagating constraints between widely separated processing elements. Hence, they converge extremely slowly when confronted by the large representations necessary for low-level vision. Application of multigrid methods can overcome this drawback, as we established in previous work on 3-D surface reconstruction. In this paper, we develop efficient multiresolution iterative algorithms for computing lightness, shape-from-shading, and optical flow, and we evaluate the performance of these algorithms on synthetic images. The multigrid methodology that we describe is broadly applicable in low-level vision. Notably, it is an appealing strategy to use in conjunction with regularization analysis for the efficient solution of a wide range of ill-posed visual reconstruction problems.

AITR-802

Author[s]: David Chapman

Planning for Conjunctive Goals

November 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-802.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-802.pdf

The problem of achieving conjunctive goals has been central to domain-independent planning research; the nonlinear constraint-posting approach has been most successful. Previous planners of this type have been complicated, heuristic, and ill-defined. I have combined and distilled the state of the art into a simple, precise, implemented algorithm (TWEAK) which I have proved correct and complete. I analyze previous work on domain-independent conjunctive planning; in retrospect it becomes clear that all conjunctive planners, linear and nonlinear, work the same way. The efficiency of these planners depends on the traditional add/delete-list representation for actions, which drastically limits their usefulness. I present theorems that suggest that efficient general purpose planning with more expressive action representations is impossible, and suggest ways to avoid this problem.

AIM-801

Author[s]: Kent Pitman

The Description of Large Systems

September 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-801.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-801.pdf

In this paper we discuss the problems associated with the description and manipulation of large systems when their sources are not maintained as single files. We show why and how tools that address these issues, such as Unix MAKE and Lisp Machine DEFSYSTEM, have evolved. Existing formalisms suffer from the problem that their syntax is not easily separable from their functionality. In programming languages, standard “calling conventions” exist to insulate the caller of a function from the syntactic details of how that function was defined, but until now no such conventions have existed to hide consumers of program systems from the details of how those systems were specified. We propose a low-level data abstraction which can support notations such as those used by MAKE and DEFSYSTEM without requiring that the introduction of a new notation be accompanied by a completely different set of tools for instantiating or otherwise manipulating the resulting system. Lisp is used for presentation, but the issues are not idiosyncratic to Lisp.

AIM-800

Author[s]: Demetri Terzopoulos

Computing Visible-Surface Representations

March 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-800.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-800.pdf

The low-level interpretation of images provides constraints on 3D surface shape at multiple resolutions, but typically only at scattered locations over the visual field. Subsequent visual processing can be facilitated substantially if the scattered shape constraints are immediately transformed into visible-surface representations that unambiguously specify surface shape at every image point. The required transformation is shown to lead to an ill-posed surface reconstruction problem. A well-posed variational principle formulation is obtained by invoking 'controlled continuity,' a physically nonrestrictive (generic) assumption about surfaces which is nonetheless strong enough to guarantee unique solutions. The variational principle, which admits an appealing physical interpretation, is locally discretized by applying the finite element method to a piecewise, finite element representation of surfaces. This forms the mathematical basis of a unified and general framework for computing visible-surface representations. The computational framework unifies formal solutions to the key problems of (i) integrating multiscale constraints on surface depth and orientation from multiple visual sources, (ii) interpolating these scattered constraints into dense, piecewise smooth surfaces, (iii) discovering surface depth and orientation discontinuities and allowing them to restrict interpolation appropriately, and (iv) overcoming the immense computational burden of fine resolution surface reconstruction. An efficient surface reconstruction algorithm is developed. It exploits multiresolution hierarchies of cooperative relaxation processes and is suitable for implementation on massively parallel networks of simple, locally interconnected processors. The algorithm is evaluated empirically in a diversity of applications.

AIM-798

Author[s]: Hormoz Mansour

The Use of Censors for Nonmonotonic Reasoning and Analogy in Medical Decision-Making

November 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-798.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-798.pdf

A patient rarely has a single, isolated disease. The situation is usually much more complex since the different parts of the human organism and metabolism interact with each other and follow several feedback patterns. These interactions and feedback patterns become more important with the addition of the external environment. When a disease is present, the first steps of the medical diagnosis should be to research and to determine whether another disease interacts with (“Censors”) or changes the significant symptoms, syndromes, or results of the laboratory tests of the first disease. Understanding of this interaction and the appropriate reasoning is based on a type of non-monotonic logic. We will try, within this paper, to see the effect of two diseases on each other. One important part of the effect of two diseases on each other is the entrancing effect of what we call “Censors.” In addition, causal reasoning, reasoning by analogy, and learning from precedents are important and necessary for a human-like expert in medicine. Some aspects of their application to thyroid diseases, with an implemented system, are considered in this paper.

AIM-796

Author[s]: Alan Bawden and Philip E. Agre

What a Parallel Programming Language Has to Let You Say

September 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-796.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-796.pdf

We have implemented in simulation a prototype language for the Connection Machine called CL1. CL1 is an extrapolation of serial machine programming language technology: in CL1 one programs the individual processors to perform local computations and talk to the communications network. We present details of the largest of our experiments with CL1, an interpreter for Scheme (a dialect of Lisp) that allows a large number of different Scheme programs to be run in parallel on the otherwise SIMD Connection Machine. Our aim was not to propose Scheme as a language for Connection Machine programming, but to gain experience using CL1 to implement an interesting and familiar algorithm. Consideration of the difficulties we encountered led us to the conclusion that CL1 programs do not capture enough of the causal structure of the processes they describe. Starting from this observation, we have designed a successor language called CGL (for Connection Graph Language).

AIM-795

Author[s]: Christof Koch and Tomaso Poggio

Biophysics of Computation: Neurons, Synapses and Membranes

October 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-795.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-795.pdf

Synapses, membranes and neurotransmitters play an important role in processing information in the nervous system. We do not know, however, what biophysical mechanisms are critical for neuronal computations, what elementary information processing operations they implement, and which sensory or motor computations they underlie. In this paper, we outline an approach to these problems. We will review a number of different biophysical mechanisms such as synaptic interactions between excitation and inhibition, dendritic spines, non-impulse generating membrane nonlinearities and transmitter-regulated voltage channels. For each one, we discuss the information processing operations that may be implemented. All of these mechanisms act either within a few milliseconds, such as the action potential or synaptic transmission, or over several hundred milliseconds or even seconds, modulating some property of the circuit. In some cases we will suggest specific examples where a biophysical mechanism underlies a given computation. In particular, we will discuss the neuronal operations, and their implementation, underlying direction selectivity in the vertebrate retina.

AITR-794

Author[s]: Eugene C. Ciccarelli, IV

Presentation Based User Interface

August 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-794.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-794.pdf

A prototype presentation system base is described. It offers mechanisms, tools, and ready-made parts for building user interfaces. A general user interface model underlies the base, organized around the concept of a presentation: a visible text or graphic for conveying information. The base and model emphasize domain independence and style independence, to apply to the widest possible range of interfaces. The primitive presentation system model treats the interface as a system of processes maintaining a semantic relation between an application data base and a presentation data base, the symbolic screen description containing presentations. A presenter continually updates the presentation data base from the application data base. The user manipulates presentations with a presentation editor. A recognizer translates the user’s presentation manipulation into application data base commands. The primitive presentation system can be extended to model more complex systems by attaching additional presentation systems. In order to illustrate the model’s generality and descriptive capabilities, extended model structures for several existing user interfaces are discussed. The base provides support for building the application and presentation data bases, linked together into a single, uniform network, including descriptions of classes of objects as well as the objects themselves. The base provides an initial presentation data base network, graphics to continually display it, and editing functions. A variety of tools and mechanisms help create and control presenters and recognizers. To demonstrate the base’s utility, three interfaces to an operating system were constructed, embodying different styles: icons, menu, and graphical annotation.

AITR-793

Author[s]: Daniel Sabey Weld

Switching Between Discrete and Continuous Process Models to Predict Molecular Genetic Activity

May 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-793.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-793.pdf

Two kinds of process models have been used in programs that reason about change: discrete and continuous models. We describe the design and implementation of a qualitative simulator, PEPTIDE, which uses both kinds of process models to predict the behavior of molecular genetic systems. The program uses a discrete process model to simulate both situations involving abrupt changes in quantities and the actions of small numbers of molecules. It uses a continuous process model to predict gradual changes in quantities. A novel technique, called aggregation, allows the simulator to switch between these models through the recognition and summary of cycles. The flexibility of PEPTIDE’s aggregator allows the program to detect cycles within cycles and predict the behavior of complex situations.

AIM-792

Author[s]: Marroquin, J.L.

Surface Reconstruction Preserving Discontinuities

August 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-792.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-792.pdf

Well-known methods for solving the shape-from-shading problem require knowledge of the reflectance map. Here we show how the shape-from-shading problem can be solved when the reflectance map is not available, but is known to have a given form with some unknown parameters. This happens, for example, when the surface is known to be Lambertian, but the direction to the light source is not known. We give an iterative algorithm that alternately estimates the surface shape and the light source direction. Use of the unit normal in parameterizing the reflectance map, rather than the gradient or stereographic coordinates, simplifies the analysis. Our approach also leads to an iterative scheme for computing shape from shading that adjusts the current estimates of the local normals toward or away from the direction of the light source. The amount of adjustment is proportional to the current difference between the predicted and the observed brightness. We also develop generalizations to less constrained forms of reflectance maps.

AITR-791

Author[s]: Bruce R. Donald

Motion Planning with Six Degrees of Freedom

May 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-791.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-791.pdf

The motion planning problem is of central importance to the fields of robotics, spatial planning, and automated design. In robotics we are interested in the automatic synthesis of robot motions, given high-level specifications of tasks and geometric models of the robot and obstacles. The Mover’s problem is to find a continuous, collision-free path for a moving object through an environment containing obstacles. We present an implemented algorithm for the classical formulation of the three-dimensional Mover’s problem: given an arbitrary rigid polyhedral moving object P with three translational and three rotational degrees of freedom, find a continuous, collision-free path taking P from some initial configuration to a desired goal configuration. This thesis describes the first known implementation of a complete algorithm (at a given resolution) for the full six degree of freedom Mover’s problem. The algorithm transforms the six degree of freedom planning problem into a point navigation problem in a six-dimensional configuration space (called C-Space). The C-Space obstacles, which characterize the physically unachievable configurations, are directly represented by six-dimensional manifolds whose boundaries are five-dimensional C-surfaces. By characterizing these surfaces and their intersections, collision-free paths may be found by the closure of three operators which (i) slide along 5-dimensional intersections of level C-Space obstacles; (ii) slide along 1- to 4-dimensional intersections of level C-surfaces; and (iii) jump between 6-dimensional obstacles. Implementing the point navigation operators requires solving fundamental representational and algorithmic questions: we will derive new structural properties of the C-Space constraints and show how to construct and represent C-surfaces and their intersection manifolds. A definition and new theoretical results are presented for a six-dimensional C-Space extension of the generalized Voronoi diagram, called the C-Voronoi diagram, whose structure we relate to the C-surface intersection manifolds. The representations and algorithms we develop impact many geometric planning problems, and extend to Cartesian manipulators with six degrees of freedom.

AIM-790

Author[s]: Christopher G. Atkeson and John M. Hollerbach

Kinematic Features of Unrestrained Arm Movements

July 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-790.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-790.pdf

Unrestrained human arm trajectories between point targets have been investigated using a three dimensional tracking apparatus, the Selspot system. Movements were executed between different points in a vertical plane under varying conditions of speed and hand-held load. In contrast to past results which emphasized the straightness of hand paths, movement regions were discovered in which the hand paths were curved. All movements, whether curved or straight, showed an invariant tangential velocity profile when normalized for speed and distance. The velocity profile invariance with speed and load is interpreted in terms of simplification of the underlying arm dynamics, extending the results of Hollerbach and Flash (1982).

AITR-789

Author[s]: Kenneth D. Forbus

Qualitative Process Theory

July 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-789.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-789.pdf

Objects move, collide, flow, bend, heat up, cool down, stretch, compress and boil. These and other things that cause changes in objects over time are intuitively characterized as processes. To understand common sense physical reasoning and make programs that interact with the physical world as well as people do, we must understand qualitative reasoning about processes, when they will occur, their effects, and when they will stop. Qualitative Process theory defines a simple notion of physical process that appears useful as a language in which to write dynamical theories. Reasoning about processes also motivates a new qualitative representation for quantity in terms of inequalities, called quantity space. This report describes the basic concepts of Qualitative Process theory, several different kinds of reasoning that can be performed with them, and discusses its impact on other issues in common sense reasoning about the physical world, such as causal reasoning and measurement interpretation. Several extended examples illustrate the utility of the theory, including figuring out that a boiler can blow up, that an oscillator with friction will eventually stop, and how to say that you can pull with a string but not push with it. This report also describes GIZMO, an implemented computer program which uses Qualitative Process theory to make predictions and interpret simple measurements. The representations and algorithms used in GIZMO are described in detail, and illustrated using several examples.

AIM-788

Author[s]: G. Edward Barton, Jr.

Toward a Principle-Based Parser

July 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-788.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-788.pdf

Parser design lags behind linguistic theory. While modern transformational grammar has largely abandoned complex, language-specific rule systems in favor of modular subsystems of principles and parameters, the rule systems that underlie existing natural-language parsers are still large, detailed, and complicated. The shift to modular theories in linguistics took place because of the scientific disadvantages of such rule systems. Those scientific ills translate into engineering maladies that make building natural-language systems difficult. The cure for these problems should be the same in parser design as it was in linguistic theory. The shift to modular theories of syntax should be replicated in parsing practice; a parser should base its actions on interacting modules of principles and parameters rather than a complex, monolithic rule system. If it can be successfully carried out, the shift will make it easier to build natural-language systems because it will shorten and simplify the language descriptions that are needed for parsing. It will also allow parser design to track new developments in linguistic theory.

AIM-787

Author[s]: Christof Koch

A Theoretical Analysis of the Electrical Properties of a X-Cell in the Cat's LGN

March 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-787.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-787.pdf

Electron microscope studies of relay cells in the lateral geniculate nucleus of the cat have shown that the retinal input of X-cells is associated with a special synaptic circuitry, termed the spine-triad complex. The retinal afferents make an asymmetrical synapse with both a dendritic appendage of the X-cell and a geniculate interneuron. The interneuron contacts in turn the same dendritic appendage with a symmetrical synaptic profile. The retinal input to geniculate Y-cells is predominantly found on dendritic shafts without any triadic arrangement. We explore the integrative properties of X- and Y-cells resulting from this striking dichotomy in synaptic architecture. The basis of our analysis is the solution of the cable equation for a branched dendritic tree with a known somatic input resistance. Under the assumption that the geniculate interneuron mediates a shunting inhibition, activation of the interneuron reduces very efficiently the excitatory post-synaptic potential induced by the retinal afferent without affecting the electrical activity in the rest of the cell. Therefore, the spine-triad circuit implements the analog of an AND-NOT gate, unique to the X-system. Functionally, this corresponds to a presynaptic, feed-forward type of inhibition of the optic tract terminal. Since Y-cells lack this structure, inhibition acts globally, reducing the general electrical activity of the cell. We propose that geniculate interneurons gate the flow of visual information into the X-system as a function of the behavioral state of the animal, enhancing the center-surround antagonism and possibly mediating reciprocal lateral inhibition, eye-movement related suppression and selective visual attention.

AIM-786

Author[s]: Tamar Flash and Neville Hogan

The Coordination of Arm Movements: An Experimentally Confirmed Mathematical Model

November 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-786.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-786.pdf

This paper presents studies of the coordination of voluntary human arm movements. A mathematical model is formulated which is shown to predict both the qualitative features and the quantitative details observed experimentally in planar, multi-joint arm movements. Coordination is modelled mathematically by defining an objective function, a measure of performance for any possible movement. The unique trajectory which yields the best performance is determined using dynamic optimization theory. In the work presented here the objective function is the square of the magnitude of jerk (rate of change of acceleration) of the hand integrated over the entire movement. This is equivalent to assuming that a major goal of motor coordination is the production of the smoothest possible movement of the hand. The theoretical analysis is based solely on the kinematics of movement independent of the dynamics of the musculoskeletal system, and is successful only when formulated in terms of the motion of the hand in extracorporal space. The implications with respect to movement organization are discussed.
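
For a point-to-point movement, minimizing integrated squared jerk has a well-known closed-form solution when the hand starts and ends at rest. The sketch below evaluates that quintic polynomial along one coordinate. The rest-to-rest boundary conditions (zero velocity and acceleration at both ends) are an assumption, since the abstract does not state them, and the function names are illustrative.

```python
def minimum_jerk_position(x0, xf, duration, t):
    """Position at time t of a rest-to-rest minimum-jerk movement.

    Assumes zero velocity and acceleration at both endpoints; t is
    clamped to the interval [0, duration].
    """
    tau = min(max(t / duration, 0.0), 1.0)
    shape = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return x0 + (xf - x0) * shape


def minimum_jerk_velocity(x0, xf, duration, t):
    """Velocity of the same movement: the time derivative of the position."""
    tau = min(max(t / duration, 0.0), 1.0)
    shape_rate = (30 * tau**2 - 60 * tau**3 + 30 * tau**4) / duration
    return (xf - x0) * shape_rate


# Example: a 0.2 m reach lasting 0.5 s, sampled at its midpoint.
mid_position = minimum_jerk_position(0.0, 0.2, 0.5, 0.25)
mid_velocity = minimum_jerk_velocity(0.0, 0.2, 0.5, 0.25)
```

Applying the same profile to each Cartesian coordinate of the hand gives a straight path with a symmetric, bell-shaped speed profile whose peak is 1.875 times the average speed.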

AIM-783

Author[s]: Tomaso Poggio and Christof Koch

An Analog Model of Computation for the Ill-Posed Problems of Early Vision

May 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-783.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-783.pdf

A large gap exists at present between computational theories of vision and their possible implementation in neural hardware. The model of computation provided by the digital computer is clearly unsatisfactory for the neurobiologist, given the increasing evidence that neurons are complex devices, very different from simple digital switches. It is especially difficult to imagine how networks of neurons may solve the equations involved in vision algorithms in a way similar to digital computers. In this paper, we suggest an analog model of computation in electrical or chemical networks for a large class of vision problems, one that maps more easily onto biologically plausible mechanisms. Poggio and Torre (1984) have recently recognized that early vision problems such as motion analysis (Horn and Schunck, 1981; Hildreth, 1984a,b), edge detection (Torre and Poggio, 1984), surface interpolation (Grimson, 1981; Terzopoulos 1984), shape-from-shading (Ikeuchi and Horn, 1981) and stereomatching can be characterized as mathematically ill-posed problems in the sense of Hadamard (1923). Ill-posed problems can be “solved”, according to regularization theories, by variational principles of a specific type. Electrical, chemical or neuronal networks are a natural way of implementing variational problems. We present specific networks for solving several low-level vision problems, such as the computation of visual motion and edge detection.

AIM-781

Author[s]: Carl Hewitt, Tom Reinhardt, Gul Agha and Giuseppe Attardi

Linguistic Support of Receptionists for Shared Resources

September 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-781.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-781.pdf

This paper addresses linguistic issues that arise in providing support for shared resources in large scale concurrent systems. Our work is based on the Actor Model of computation which unifies the lambda calculus, the sequential stored-program and the object-oriented models of computation. We show how receptionists can be used to regulate the use of shared resources by scheduling their access and providing protection against unauthorized or accidental access. A shared financial account is an example of the kind of resource that needs a receptionist. Issues involved in the implementation of scheduling policies for shared resources are also addressed. The modularity problems involved in implementing servers which multiplex the use of physical devices illustrate how delegation aids in the implementation of parallel problem solving systems for communities of actors.

AIM-780

Author[s]: H.K. Nishihara

PRISM: A Practical Real-Time Imaging Stereo Matcher

May 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-780.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-780.pdf

A binocular-stereo-matching algorithm for making rapid visual range measurements in noisy images is described. This technique is developed for application to problems in robotics where noise tolerance, reliability, and speed are predominant issues. A high speed pipelined convolver for preprocessing images and an unstructured light technique for improving signal quality are introduced to help enhance performance to meet the demands of this task domain. These optimizations, however, are not sufficient. A closer examination of the problems encountered suggests that broader interpretations of both the objective of binocular stereo and of the zero-crossing theory of Marr and Poggio are required. In this paper, we restrict ourselves to the problem of making a single primitive surface measurement: for example, determining whether or not a specified volume of space is occupied, measuring the range to a surface at an indicated image location, or determining the elevation gradient at that position. In this framework we make a subtle but important shift from the explicit use of zero-crossing contours (in band-pass filtered images) as the elements matched between left and right images, to use of the signs between zero-crossings. With this change, we obtain a simpler algorithm with a reduced sensitivity to noise and a more predictable behavior. The PRISM system incorporates this algorithm with the unstructured light technique and a high speed digital convolver. It has been used successfully by others as a sensor in a path planning system and a bin picking system.

AIM-779

Author[s]: Hugh Robinson and Christof Koch

An Information Storage Mechanism: Calcium and Spines

April 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-779.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-779.pdf

This proposal addresses some of the biophysical events possibly underlying fast activity-dependent changes in synaptic efficiency. Dendritic spines in the cortex have attracted increased attention over the last years as a possible locus of cellular plasticity given the large number of studies reporting a close correlation between presynaptic activity (or lack thereof) and changes in spine shape. This is highlighted by recent reports showing that the spine cytoplasm contains high levels of actin. Moreover, it has been demonstrated that a high level of intracellular free calcium, Ca2+, is a prerequisite for various forms of synaptic potentiation. We propose a series of plausible steps, linking presynaptic electrical activity at dendritic spines with a short lasting change in spine geometry. Specifically, we conjecture that the spike-induced excitatory postsynaptic potential triggers an influx of Ca2+ into the spine, where it will rapidly bind to intracellular calcium buffers such as calmodulin and calcineurin. However, for prolonged or intense presynaptic electrical activity, these buffers will saturate; the free Ca2+ will then activate the actin/myosin network in the spine neck, reversibly shortening the length of the neck and increasing its diameter. This change in the geometry of the spine will lead to an increase in the synaptic efficiency of the synapse. We will discuss the implication of our proposal for the control of cellular plasticity and its relation to generalized attention and arousal.

AIM-777

Author[s]: A.L. Yuille and T. Poggio

A Generalized Ordering Constraint for Stereo Correspondence

May 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-777.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-777.pdf

The ordering constraint along epipolar lines is a powerful constraint that has been exploited by some recent stereomatching algorithms. We formulate a generalized ordering constraint, not restricted to epipolar lines. We prove several properties of the generalized ordering constraint and of the “forbidden zone”, the set of matches that would violate the constraint. We consider both the orthographic and the perspective projection case, the latter for a simplified but standard stereo geometry. The disparity gradient limit found in the human stereo system may be related to a form of the ordering constraint. To illustrate our analysis we outline a simple algorithm that exploits the generalized ordering constraint for matching contours of wireframe objects. We also show that the use of the generalized ordering constraint implies several other stereo matching constraints: a) the ordering constraint along epipolar lines, b) figural continuity, c) Binford’s cross-product constraint, d) Mayhew and Frisby’s figural continuity constraint. We finally discuss ways of extending the algorithm to arbitrary 3-D objects.

AIM-776

Author[s]: Tomaso Poggio

Vision by Man and Machine

March 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-776.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-776.pdf

The development of increasingly sophisticated and powerful computers in the last few decades has frequently stimulated comparisons between them and the human brain. Such comparisons will become more earnest as computers are applied more and more to tasks formerly associated with essentially human activities and capabilities. The expectation of a coming generation of “intelligent” computers and robots with sensory, motor and even “intellectual” skills comparable in quality to (and quantitatively surpassing) our own is becoming more widespread and is, I believe, leading to a new and potentially productive analytical science of “information processing”. In no field has this new approach been so precisely formulated and so thoroughly exemplified as in the field of vision. As the dominant sensory modality of man, vision is one of the major keys to our mastery of the environment, to our understanding and control of the objects which surround us. If we wish to create robots capable of performing complex manipulative tasks in a changing environment, we must surely endow them with (among other things) adequate visual powers. How can we set about designing such flexible and adaptive robots? In designing them, can we make use of our rapidly growing knowledge of the human brain, and if so, how? At the same time, can our experiences in designing artificial vision systems help us to understand how the brain analyzes visual information?

AIM-774

Author[s]: Gerald Roylance

Some Scientific Subroutines in LISP

September 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-774.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-774.pdf

Here's a LISP library of mathematical functions that calculate hyperbolic and inverse hyperbolic functions, Bessel functions, elliptic integrals, the gamma and beta functions, and the incomplete gamma and beta functions. There are probability density functions, cumulative distributions, and random number generators for the normal, Poisson, chi-square, Student's T, and Snedecor's F distributions. There are also routines for integration, root finding, and convergence, and code to factor numbers and to perform the Solovay-Strassen probabilistic prime test.
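
For readers unfamiliar with the last item, here is a minimal Python sketch of the Solovay-Strassen test. It is not the memo's LISP code; the function names `jacobi` and `is_probable_prime` and the default of 20 rounds are assumptions chosen for this illustration.

```python
import random


def jacobi(a, n):
    """Jacobi symbol (a/n) for a positive odd integer n."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be a positive odd integer")
    a %= n
    result = 1
    while a != 0:
        # Remove factors of two, applying the rule for (2/n).
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        # Quadratic reciprocity: swap, flipping the sign if both are 3 mod 4.
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def is_probable_prime(n, rounds=20, rng=random):
    """Solovay-Strassen: False means composite, True means probably prime."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    for _ in range(rounds):
        a = rng.randrange(2, n - 1)
        j = jacobi(a, n)
        # Euler's criterion must hold for every base when n is prime.
        if j == 0 or pow(a, (n - 1) // 2, n) != j % n:
            return False
    return True
```

A prime always passes; a composite survives each round with probability at most one half, so the chance of a wrong "probably prime" answer shrinks geometrically with `rounds`.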

AIM-773

Author[s]: Tomaso Poggio and Vincent Torre

Ill-Posed Problems and Regularization Analysis in Early Vision

April 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-773.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-773.pdf

One of the best definitions of early vision is that it is inverse optics --- a set of computational problems that both machines and biological organisms have to solve. While in classical optics the problem is to determine the images of physical objects, vision is confronted with the inverse problem of recovering three-dimensional shape from the light distribution in the image. Most processes of early vision such as stereomatching, computation of motion and the "structure from" processes can be regarded as solutions to inverse problems. This common characteristic of early vision can be formalized: most early vision problems are "ill-posed problems" in the sense of Hadamard. We will show that a mathematical theory developed for regularizing ill-posed problems leads in a natural way to the solution of the early vision problems in terms of variational principles of a certain class. This is a new theoretical framework for some of the variational solutions already obtained in the analysis of early vision processes. It also shows how several other problems in early vision can be approached and solved.

AIM-772

Author[s]: Katsushi Ikeuchi, Keith H. Nishihara, Berthold K.P. Horn, Patrick Sobalvarro and Shigemi Nagata

Determining Grasp Points Using Photometric Stereo and the PRISM Binocular Stereo System

August 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-772.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-772.pdf

This paper describes a system which locates and grasps doughnut-shaped parts from a pile. The system uses photometric stereo and binocular stereo as vision input tools. Photometric stereo is used to make surface orientation measurements. With this information the camera field is segmented into isolated regions of continuous smooth surface. One of these regions is then selected as the target region. The attitude of the physical object associated with the target region is determined by histogramming surface orientations over that region and comparing with stored histograms obtained from prototypical objects. Range information, not available from photometric stereo, is obtained by the PRISM binocular stereo system. A collision-free grasp configuration and approach trajectory are computed and executed using the attitude and range data.

AIM-771

Author[s]: Ronald S. Fearing and John M. Hollerbach

Basic Solid Mechanics for Tactile Sensing

March 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-771.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-771.pdf

In order to stably grasp objects without using object models, tactile feedback from the fingers is sometimes necessary. This feedback can be used to adjust grasping forces to prevent a part from slipping from a hand. If the angle of force at the object-finger contact can be determined, slip can be prevented by the proper adjustment of finger forces. Another important tactile sensing task is finding the edges and corners of an object, since they are usually feasible grasping locations. This paper describes how this information can be extracted from the finger-object contact using strain sensors beneath a compliant skin. For determining contact forces, strain measurements are easier to use than the surface deformation profile. The finger is modelled as an infinite linear elastic half plane to predict the measured strain for several contact types and forces. The number of sensors required is less than has been proposed for other tactile recognition tasks. A rough upper bound on sensor density requirements for a specific depth is presented that is based on the frequency response of the elastic medium. The effects of different sensor stiffness on sensor performance are discussed.

AIM-770

Author[s]: Christof Koch and Shimon Ullman

Selecting One Among the Many: A Simple Network Implementing Shifts in Selective Visual Attention

January 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-770.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-770.pdf

This study addresses the question of how simple networks can account for a variety of phenomena associated with the shift of a specialized processing focus across the visual scene. We address in particular aspects of the dichotomy between the preattentive-parallel and the attentive-serial modes of visual perception and their hypothetical neuronal implementations. Specifically we propose the following: 1.) A number of elementary features, such as color, orientation, direction of movement, disparity etc. are represented in parallel in different topographical maps, called the early representation. 2.) There exists a selective mapping from this early representation into a more central representation, such that at any instant the central representation contains the properties of only a single location in the visual scene, the selected location. 3.) We discuss some selection rules that determine which location will be mapped into the central representation. The major rule, using the saliency or conspicuity of locations in the early representation, is implemented using a so-called Winner-Take-All network. A hierarchical pyramid-like architecture is proposed for this network. We suggest possible implementations in neuronal hardware, including a possible role for the extensive back-projection from the cortex to the LGN.
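
The pyramid-like Winner-Take-All selection can be pictured with a small Python sketch. This is only a software analogy under simplifying assumptions (a one-dimensional saliency list, pairwise comparisons per level); the function name `winner_take_all` is invented here and nothing below models the neuronal implementation the memo discusses.

```python
def winner_take_all(saliency):
    """Return (index, value) of the most salient location.

    Each pass compares neighbouring pairs and keeps the larger one,
    so n locations are reduced in about log2(n) levels.
    """
    if not saliency:
        raise ValueError("saliency map must not be empty")
    level = list(enumerate(saliency))
    while len(level) > 1:
        next_level = []
        for k in range(0, len(level), 2):
            pair = level[k:k + 2]  # the last group may hold a single item
            next_level.append(max(pair, key=lambda item: item[1]))
        level = next_level
    return level[0]
```

For example, `winner_take_all([0.1, 0.7, 0.3, 0.9, 0.2])` selects location 3, the entry with saliency 0.9.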

AIM-769

Author[s]: Whitman Richards and Donald D. Hoffman

Codon Constraints on Closed 2D Shapes

May 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-769.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-769.pdf

Codons are simple primitives for describing plane curves. They thus are primarily image-based descriptors. Yet they have the power to capture important information about the 3-D world, such as making part boundaries explicit. The codon description is highly redundant (useful for error-correction). This redundancy can be viewed as a constraint on the number of possible codon strings. For smooth closed strings that represent the bounding contour (silhouette) of many smooth 3D objects, the constraints are so strong that sequences containing 6 elements yield only 33 generic shapes as compared with a possible number of 15,625 combinations.

AIM-768

Author[s]: V. Torre and T. Poggio

On Edge Detection

August 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-768.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-768.pdf

Edge detection is the process that attempts to characterize the intensity changes in the image in terms of the physical processes that have originated them. A critical, intermediate goal of edge detection is the detection and characterization of significant intensity changes. This paper discusses this part of the edge detection problem. To characterize the types of intensity changes, derivatives of different types, and possibly different scales, are needed. Thus we consider this part of edge detection as a problem in numerical differentiation. We show that numerical differentiation of images is an ill-posed problem in the sense of Hadamard. Differentiation needs to be regularized by a regularizing filtering operation before differentiation. This shows that this part of edge detection consists of two steps, a filtering step and a differentiation step.
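
The two-step view (filter, then differentiate) can be sketched on a one-dimensional signal. The Gaussian filter, the clamped borders, the central-difference scheme, and all function names below are assumptions made for this illustration; the memo itself does not prescribe this particular filter or discretization.

```python
import math


def gaussian_kernel(sigma):
    """Normalized Gaussian weights out to about three standard deviations."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(signal, sigma):
    """Filtering step: convolve with a Gaussian, clamping at the borders."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def derivative(signal):
    """Differentiation step: central differences (only approximate at the ends)."""
    n = len(signal)
    return [(signal[min(i + 1, n - 1)] - signal[max(i - 1, 0)]) / 2.0
            for i in range(n)]


def regularized_derivative(signal, sigma):
    """Filter first, then differentiate."""
    return derivative(smooth(signal, sigma))
```

Applied to a noisy step, `regularized_derivative` yields a single smooth peak near the step, whereas `derivative` alone amplifies the noise at every sample.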

AITR-767

Author[s]: Brian C. Williams

Qualitative Analysis of MOS Circuits

July 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-767.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-767.pdf

With the push towards sub-micron technology, transistor models have become increasingly complex. The number of components in integrated circuits has forced designers’ efforts and skills towards higher levels of design. This has created a gap between design expertise and the performance demands increasingly imposed by the technology. To alleviate this problem, software tools must be developed that provide the designer with expert advice on circuit performance and design. This requires a theory that links the intuitions of an expert circuit analyst with the corresponding principles of formal theory (i.e. algebra, calculus, feedback analysis, network theory, and electrodynamics), and that makes each underlying assumption explicit.

AIM-764

Author[s]: John M. Rubin and W.A. Richards

Color Vision: Representing Material Categories

May 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-764.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-764.pdf

We argue that one of the early goals of color vision is to distinguish one kind of material from another. Accordingly, we show that when a pair of image regions is such that one region has greater intensity at one wavelength than at another wavelength, and the second region has the opposite property, then the two regions are likely to have arisen from distinct materials in the scene. We call this material change circumstance the 'opposite slope sign condition.' With this criterion as a foundation, we construct a representation of spectral information that facilitates the recognition of material changes. Our theory has implications for both psychology and neurophysiology. In particular, Hering's notion of opponent colors and psychologically unique primaries, and Land's results in two-color projection can be interpreted as different aspects of the visual system's goal of categorizing materials. Also, the theory provides two basic interpretations of the function of double-opponent color cells described by neurophysiologists.
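
The opposite slope sign condition reduces to a small predicate. In the sketch below each region is summarized by its intensity at two wavelengths; the tuple layout and the function name `opposite_slope_sign` are assumptions made for illustration, not notation from the memo.

```python
def opposite_slope_sign(region_a, region_b):
    """Each region is (intensity at wavelength 1, intensity at wavelength 2).

    Returns True when one region is brighter at the first wavelength and
    the other is brighter at the second, i.e. the spectral slopes differ
    in sign.
    """
    slope_a = region_a[1] - region_a[0]
    slope_b = region_b[1] - region_b[0]
    return slope_a * slope_b < 0
```

Under this reading, `opposite_slope_sign((0.8, 0.3), (0.2, 0.6))` is True (a likely material change), while two regions that differ only by a uniform scaling of intensity, such as `(0.8, 0.3)` and `(0.4, 0.15)`, give False.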

AIM-763A

Author[s]: W. Eric L. Grimson

The Combinatorics of Local Constraints in Model-Based Recognition and Localization from Sparse Data

March 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-763a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-763a.pdf

The problem of recognizing what objects are where in the workspace of a robot can be cast as one of searching for a consistent matching between sensory data elements and equivalent model elements. In principle, this search space is enormous and to control the potential combinatorial explosion, constraints between the data and model elements are needed. We derive a set of constraints for sparse sensory data that are applicable to a wide variety of sensors and examine their characteristics. We then use known bounds on the complexity of constraint satisfaction problems together with explicit estimates of the effectiveness of the constraints derived for the case of sparse, noisy three-dimensional sensory data to obtain general theoretical bounds on the number of interpretations expected to be consistent with the data. We show that these bounds are consistent with empirical results reported previously. The results are used to demonstrate the graceful degradation of the recognition technique with the presence of noise in the data, and to predict the number of data points needed in general to uniquely determine the object being sensed.

AIM-762

Author[s]: W. Eric L. Grimson

Computational Experiments with a Feature Based Stereo Algorithm

January 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-762.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-762.pdf

Computational models of the human stereo system can provide insight into general information processing constraints that apply to any stereo system, either artificial or biological. In 1977, Marr and Poggio proposed one such computational model, characterized as matching certain feature points in difference-of-Gaussian filtered images and using the information obtained by matching coarser resolution representations to restrict the search space for matching finer resolution representations. An implementation of the algorithm and its testing on a range of images was reported in 1980. Since then a number of psychophysical experiments have suggested possible refinements to the model and modifications to the algorithm. As well, recent computational experiments applying the algorithm to a variety of natural images, especially aerial photographs, have led to a number of modifications. In this article, we present a version of the Marr-Poggio-Grimson algorithm that embodies these modifications and illustrate its performance on a series of natural images.

AIM-761

Author[s]: Ellen C. Hildreth

Computations Underlying the Measurement of Visual Motion

March 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-761.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-761.pdf

The organization of movement in a changing image provides a valuable source of information for analyzing the environment in terms of objects, their motion in space, and their three-dimensional structure. This movement may be represented by a two-dimensional velocity field that assigns a direction and magnitude of velocity to elements in the image. This paper presents a method for computing the velocity field, with three main components. First, initial measurements of motion in the image take place at the location of significant changes, which give rise to zero-crossings in the output of the convolution of the image with a ∇²G (Laplacian-of-Gaussian) operator. The initial motion measurements provide the component of velocity in the direction perpendicular to the local orientation of the zero-crossing contours. Second, these initial measurements are integrated along contours to compute the two-dimensional velocity field. Third, an additional constraint of smoothness of the velocity field, based on the physical constraint that surfaces are generally smooth, allows the computation of a unique velocity field. The details of an algorithm are presented, with results of the algorithm applied to artificial and natural image sequences.
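
The statement that local measurements provide only the velocity component perpendicular to the contour can be illustrated with a short projection. The function name `normal_component` and the convention that the contour orientation is given as a tangent angle in radians are assumptions of this sketch.

```python
import math


def normal_component(velocity, contour_angle):
    """Project a 2-D velocity onto the normal of a contour.

    velocity is (vx, vy); contour_angle is the orientation of the contour
    tangent in radians. Motion along the contour is invisible locally, so
    only this projection can be measured at a single contour point.
    """
    nx, ny = -math.sin(contour_angle), math.cos(contour_angle)
    speed = velocity[0] * nx + velocity[1] * ny
    return (speed * nx, speed * ny)
```

For a horizontal contour (`contour_angle = 0`), a true velocity of `(1, 1)` is measured as `(0, 1)`, and a purely tangential velocity of `(1, 0)` is measured as zero, which is why the measurements must be integrated along contours.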

AIM-760

Author[s]: Van-Duc Nguyen

The Find-Path Problem in the Plane

February 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-760.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-760.pdf

This paper presents a fast heuristic algorithm for planning collision-free paths of a moving robot in a cluttered planar workspace. The algorithm is based on describing the free space between the obstacles as a network of linked cones. Cones capture the freeways and the bottle-necks between the obstacles. Links capture the connectivity of the free space. Paths are computed by intersecting the valid configuration volumes of the moving robot inside these cones and inside the regions described by the links.

AIM-759

Author[s]: Tomas Lozano-Perez, Matthew T. Mason and Russell H. Taylor

Automatic Synthesis of Fine-Motion Strategies for Robots

December 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-759.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-759.pdf

The use of active compliance enables robots to carry out tasks in the presence of significant sensing and control errors. Compliant motions are quite difficult for humans to specify, however. Furthermore, robot programs are quite sensitive to details of geometry and to error characteristics and must, therefore, be constructed anew for each task. These factors motivate the need for automatic synthesis tools for robot programming, especially for compliant motion. This paper describes a formal approach to the synthesis of compliant motion strategies from geometric descriptions of assembly operations and explicit estimates of errors in sensing and control. A key aspect of the approach is that it provides correctness criteria for compliant motion strategies.

AIM-758

Author[s]: Haruo Asada and Michael Brady

The Curvature Primal Sketch

February 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-758.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-758.pdf

In this paper we introduce a novel representation of the significant changes in curvature along the bounding contour of a planar shape. We call the representation the curvature primal sketch. We describe an implemented algorithm that computes the curvature primal sketch and illustrate its performance on a set of tool shapes. The curvature primal sketch derives its name from the close analogy to the primal sketch representation advocated by Marr for describing significant intensity changes. We define a set of primitive parameterized curvature discontinuities, and derive expressions for their convolutions with the first and second derivatives of a Gaussian. The convolved primitives, sorted according to the scale at which they are detected, provide us with a multi-scaled interpretation of the contour of a shape.

AIM-757

Author[s]: Michael Brady and Haruo Asada

Smoothed Local Symmetries and Their Implementation

February 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-757.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-757.pdf

We introduce a novel representation of two-dimensional shape that we call smoothed local symmetries (SLS). Smoothed local symmetries represent both the bounding contour of a shape fragment and the region that it occupies. In this paper we develop the main features of the SLS representation and describe an implemented algorithm that computes it. The performance of the algorithm is illustrated for a set of tools. We conclude by sketching a method for determining the articulation of a shape into subshapes.

AIM-756

Author[s]: Michael Brady

Artificial Intelligence and Robotics

February 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-756.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-756.pdf

Since Robotics is the field concerned with the connection of perception to action, Artificial Intelligence must have a central role in Robotics if the connection is to be intelligent. Artificial Intelligence addresses the crucial questions of: what knowledge is required in any aspect of thinking; how that knowledge should be represented; and how that knowledge should be used. Robotics challenges AI by forcing it to deal with real objects in the real world. Techniques and representations developed for purely cognitive problems, often in toy domains, do not necessarily extend to meet the challenge. Robots combine mechanical effectors, sensors, and computers. AI has made significant contributions to each component. We review AI contributions to perception and object-oriented reasoning. Object-oriented reasoning includes reasoning about space, path-planning, uncertainty, and compliance. We conclude with three examples that illustrate the kinds of reasoning or problem solving abilities we would like to endow robots with and that we believe are worthy goals of both Robotics and Artificial Intelligence, being within reach of both.

AIM-755

Author[s]: Douglas Hofstadter

The Copycat Project: An Experiment in Nondeterminism and Creative Analogies

January 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-755.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-755.pdf

A micro-world is described, in which many analogies involving strikingly different concepts and levels of subtlety can be made. The question “What differentiates the good ones from the bad ones?” is discussed, and then the problem of how to implement a computational model of the human ability to come up with such analogies (and to have a sense for their quality) is considered. A key part of the proposed system, now under development, is its dependence on statistically emergent properties of stochastically interacting “codelets” (small pieces of ready-to-run code created by the system, and selected at random to run with probability proportional to heuristically assigned “urgencies”). Another key element is a network of linked concepts of varying levels of “semanticity”, in which activation spreads and indirectly controls the urgencies of new codelets. There is pressure in the system toward maximizing the degree of “semanticity” or “intensionality” of descriptions of structures, but many such pressures, often conflicting, must interact with one another, and compromises must be made. The shifting of (1) perceived boundaries inside structures, (2) descriptive concepts chosen to apply to structures, and (3) features perceived as “salient” or not, is called “slippage”. What can slip, and how, are emergent consequences of the interaction of (1) the temporary (“cytoplasmic”) structures involved in the analogy with (2) the permanent (“Platonic”) concepts and links in the conceptual proximity network, or “slippability network”. The architecture of this system is postulated as a general architecture suitable for dealing not only with fluid analogies, but also with other types of abstract perception and categorization tasks, such as musical perception, scientific theorizing, Bongard problems and others.
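
Selecting a codelet "at random with probability proportional to urgency" is ordinary weighted sampling. The sketch below is not Copycat code; the function name `pick_codelet`, the dictionary of urgencies, and the codelet names in the usage note are hypothetical.

```python
import random


def pick_codelet(urgencies, rng=random):
    """Choose one codelet name with probability proportional to its urgency.

    urgencies maps a codelet name to a non-negative weight; at least one
    weight must be positive.
    """
    names = list(urgencies)
    weights = [urgencies[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

With `{"find-bond": 3.0, "label-group": 1.0}`, the first codelet is chosen about three times out of four, so low-urgency codelets still run occasionally rather than being starved.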

AITR-754

Author[s]: Richard D. Lathrop

Parallelism in Manipulator Dynamics

December 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-754.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-754.pdf

This paper addresses the problem of efficiently computing the motor torques required to drive a lower-pair kinematic chain (e.g., a typical manipulator arm in free motion, or a mechanical leg in the swing phase) given the desired trajectory; i.e., the Inverse Dynamics problem. It investigates the high degree of parallelism inherent in the computations, and presents two “mathematically exact” formulations especially suited to high-speed, highly parallel implementations using special-purpose hardware or VLSI devices. In principle, the formulations should permit the calculations to run at a speed bounded only by I/O. The first presented is a parallel version of the recent linear Newton-Euler recursive algorithm. The time cost is also linear in the number of joints, but the real-time coefficients are reduced by almost two orders of magnitude. The second formulation reports a new parallel algorithm which shows that it is possible to improve upon the linear time dependency. The real time required to perform the calculations increases only as the log2 of the number of joints. Either formulation is susceptible to a systolic pipelined architecture in which complete sets of joint torques emerge at successive intervals of four floating-point operations. Hardware requirements necessary to support the algorithm are considered and found not to be excessive, and a VLSI implementation architecture is suggested. We indicate possible applications to incorporating dynamical considerations into trajectory planning, e.g. it may be possible to build an on-line trajectory optimizer.

AITR-753

Author[s]: Richard C. Waters

KBEmacs: A Step Toward the Programmer's Apprentice

May 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-753.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-753.pdf

The Knowledge-Based Editor in Emacs (KBEmacs) is the current demonstration system implemented as part of the Programmer’s Apprentice project. KBEmacs is capable of acting as a semi-expert assistant to a person who is writing a program – taking over some parts of the programming task. Using KBEmacs, it is possible to construct a program by issuing a series of high level commands. This series of commands can be as much as an order of magnitude shorter than the program it describes. KBEmacs is capable of operating on Ada and Lisp programs of realistic size and complexity. Although KBEmacs is neither fast enough nor robust enough to be considered a true prototype, both of these problems could be overcome if the system were to be reimplemented.

AIM-752

Author[s]: A. Yuille

A Method for Computing Spectral Reflectance

December 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-752.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-752.pdf

Psychophysical experiments show that the perceived colour of an object is relatively independent of the spectrum of the incident illumination and depends only on the surface reflectance. We demonstrate a possible solution to this undetermined problem by expanding the illumination and surface reflectance in terms of a finite number of basis functions. This yields a number of nonlinear equations for each colour patch. We show that given a sufficient number of surface patches with the same illumination it is possible to solve these equations up to an overall scaling factor. Generalizations to the spatial dependent situation are discussed. We define a method for detecting material changes and illustrate a way of detecting the colour of a material at its boundaries and propagating it inwards.

AIM-751

Author[s]: Christof Koch, Jose Marroquin and Alan Yuille

Analog "Neuronal" Networks in Early Vision

June 1985

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-751.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-751.pdf

Many problems in early vision can be formulated in terms of minimizing an energy or cost function. Examples are shape-from-shading, edge detection, motion analysis, structure from motion and surface interpolation (Poggio, Torre and Koch, 1985). It has been shown that all quadratic variational problems, an important subset of early vision tasks, can be “solved” by linear, analog electrical or chemical networks (Poggio and Koch, 1985). In a variety of situations the cost function is non-quadratic, however, for instance in the presence of discontinuities. The use of non-quadratic cost functions raises the question of designing efficient algorithms for computing the optimal solution. Recently, Hopfield and Tank (1985) have shown that networks of nonlinear analog “neurons” can be effective in computing the solution of optimization problems. In this paper, we show how these networks can be generalized to solve the non-convex energy functionals of early vision. We illustrate this approach by implementing a specific network solving the problem of reconstructing a smooth surface while preserving its discontinuities from sparsely sampled data (Geman and Geman, 1984; Marroquin 1984; Terzopoulos 1984). These results suggest a novel computational strategy for solving such problems for both biological and artificial vision systems.
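
To make the energy-minimization framing concrete, the sketch below handles only the simpler quadratic case: one-dimensional interpolation from sparse samples, with no discontinuities and no nonlinear "neurons". The energy, the Gauss-Seidel relaxation, and the names `interpolate`, `lam`, and `sweeps` are assumptions chosen for illustration and are not taken from the memo.

```python
def interpolate(samples, n, lam=1.0, sweeps=2000):
    """Minimize  sum over sampled i of (f[i] - d[i])**2
                 + lam * sum over i of (f[i+1] - f[i])**2
    by Gauss-Seidel relaxation.

    samples maps an index in range(n) to a measured value d[i]; n >= 2.
    """
    if n < 2:
        raise ValueError("need at least two grid points")
    f = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            num, den = 0.0, 0.0
            if i in samples:          # data term pulls f[i] toward d[i]
                num += samples[i]
                den += 1.0
            if i > 0:                 # smoothness term, left neighbour
                num += lam * f[i - 1]
                den += lam
            if i < n - 1:             # smoothness term, right neighbour
                num += lam * f[i + 1]
                den += lam
            f[i] = num / den          # local minimizer with neighbours fixed
    return f
```

Each update replaces a value by a weighted average of its data and its neighbours, which is the kind of local, repeated operation a linear analog network performs in parallel; handling discontinuities requires a non-quadratic energy and is outside this sketch.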

AIM-750

Author[s]: Carl Hewitt and Henry Lieberman

Design Issues in Parallel Architecture for Artificial Intelligence

November 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-750.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-750.pdf

Development of highly intelligent computers requires a conceptual foundation that will overcome the limitations of the von Neumann architecture. Architectures for such a foundation should meet the following design goals:
* Address the fundamental organizational issues of large-scale parallelism and sharing in a fully integrated way. This means attention to organizational principles, as well as hardware and software.
* Serve as an experimental apparatus for testing large-scale artificial intelligence systems.
* Explore the feasibility of an architecture based on abstractions, which serve as natural computational primitives for parallel processing. Such abstractions should be logically independent of their software and hardware host implementations.
In this paper we lay out some of the fundamental design issues in parallel architectures for Artificial Intelligence, delineate limitations of previous parallel architectures, and outline a new approach that we are pursuing.

AITR-749

Author[s]: Reid Gordon Simmons

Representing and Reasoning About Change in Geologic Interpretation

December 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-749.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-749.pdf

Geologic interpretation is the task of inferring a sequence of events to explain how a given geologic region could have been formed. This report describes the design and implementation of one part of a geologic interpretation problem solver -- a system which uses a simulation technique called imagining to check the validity of a candidate sequence of events. Imagining uses a combination of qualitative and quantitative simulations to reason about the changes which occurred to the geologic region. The spatial changes which occur are simulated by constructing a sequence of diagrams. The quantitative simulation needs numeric parameters which are determined by using the qualitative simulation to establish the cumulative changes to an object and by using a description of the current geologic region to make quantitative measurements. The diversity of reasoning skills used in imagining has necessitated the development of multiple representations, each specialized for a different task. Representations to facilitate doing temporal, spatial and numeric reasoning are described in detail. We have also found it useful to explicitly represent processes. Both the qualitative and quantitative simulations use a discrete 'layer cake' model of geologic processes, but each uses a separate representation, specialized to support the type of simulation. These multiple representations have enabled us to develop a powerful, yet modular, system for reasoning about change.

AIM-747

Author[s]: Hormoz Mansour

A Structural Approach to Analogy

November 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-747.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-747.pdf

There are multiple sorts of reasoning by analogy between two domains; the one with which we are concerned is a type of contextual analogy. The purpose of this paper is to see whether two domains that look analogous would be analogous in all aspects and contexts. To perform this, we analyse the domain according to different particularities. For each particularity or context we continue the analysis and search for another one within the same domain. In this way we create a kind of structure for the different domains. This sort of analysis is represented by frames and frames which are nested within each other. This paper describes this concept and an implemented system, “MULTI_ANALOG”, a limited example of knowledge-acquisition, problem solving, and automatic-acquisition based on this particular form of analogy, namely structural analogy.

AIM-746

Author[s]: Berthold K.P. Horn and Katsushi Ikeuchi

Picking Parts out of a Bin

October 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-746.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-746.pdf

One of the remaining obstacles to the widespread application of industrial robots is their inability to deal with parts that are not precisely positioned. In the case of manual assembly, components are often presented in bins. Current automated systems, on the other hand, require separate feeders which present the parts with carefully controlled position and attitude. Here we show how results in machine vision provide techniques for automatically directing a mechanical manipulator to pick one object at a time out of a pile. The attitude of the object to be picked up is determined using a histogram of the orientations of visible surface patches. Surface orientation, in turn, is determined using photometric stereo applied to multiple images. These images are taken with the same camera but differing lighting. The resulting needle map, giving the orientations of surface patches, is used to create an orientation histogram which is a discrete approximation to the extended Gaussian image. This can be matched against a synthetic orientation histogram obtained from prototypical models of the objects to be manipulated. Such models may be obtained from computer aided design (CAD) databases. The method thus requires that the shape of the objects be described, but it is not restricted to particular types of objects.
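The photometric stereo step mentioned above can be illustrated with a minimal sketch. It assumes a Lambertian surface and three known, distant light sources, neither of which is stated in the abstract, so it should be read as the textbook linear form of photometric stereo rather than the method of the memo. The function names are hypothetical.

```python
import math

def solve3(m, b):
    """Solve the 3x3 linear system m x = b using Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    solution = []
    for col in range(3):
        # Replace one column of m by b and take the ratio of determinants.
        replaced = [[b[r] if c == col else m[r][c] for c in range(3)]
                    for r in range(3)]
        solution.append(det(replaced) / d)
    return solution

def normal_from_intensities(lights, intensities):
    """Recover (unit normal, albedo) at one pixel.

    Assumed model: intensity_k = albedo * dot(lights[k], normal),
    with each lights[k] a unit vector pointing towards light source k.
    A pixel with zero albedo (all intensities zero) is not handled here.
    """
    g = solve3(lights, intensities)           # g = albedo * normal
    albedo = math.sqrt(sum(v * v for v in g))
    return [v / albedo for v in g], albedo

def render(lights, normal, albedo):
    """Forward model used only to generate test intensities."""
    return [albedo * sum(l * n for l, n in zip(light, normal))
            for light in lights]
```

Applying `normal_from_intensities` at every pixel of three registered images would give a needle map of the kind the abstract describes.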

AIM-744

Author[s]: Katsushi Ikeuchi

Constructing a Depth Map from Images

August 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-744.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-744.pdf

This paper describes two methods for constructing a depth map from images. Each method has two stages. First, one or more needle maps are determined using a pair of images. This process employs either the Marr-Poggio-Grimson stereo and shape-from-shading, or, instead, photometric stereo. Secondly, a depth map is constructed from the needle map or needle maps computed by the first stage. Both methods make use of an iterative relaxation method to obtain the final depth map.
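As a rough illustration of the second stage, the sketch below recovers depth from a needle map by iterative relaxation. It assumes the needle map is given as gradients p = dz/dx and q = dz/dy on a unit-spaced grid and uses a plain Gauss-Seidel least-squares update; the memo's actual relaxation scheme may differ, and the function name is hypothetical.

```python
def depth_from_gradients(p, q, sweeps=500):
    """Estimate depth z on a grid from slopes p = dz/dx and q = dz/dy.

    p[i][j] and q[i][j] are the slopes at row i (y) and column j (x).
    Depth is only determined up to an additive constant, so the result
    is shifted to make z[0][0] equal to zero.
    """
    rows, cols = len(p), len(p[0])
    z = [[0.0] * cols for _ in range(rows)]
    for _ in range(sweeps):
        for i in range(rows):
            for j in range(cols):
                total, count = 0.0, 0
                # Each neighbour predicts z[i][j] from its own depth and
                # the average slope along the edge joining the two cells.
                if j > 0:
                    total += z[i][j - 1] + 0.5 * (p[i][j - 1] + p[i][j])
                    count += 1
                if j < cols - 1:
                    total += z[i][j + 1] - 0.5 * (p[i][j + 1] + p[i][j])
                    count += 1
                if i > 0:
                    total += z[i - 1][j] + 0.5 * (q[i - 1][j] + q[i][j])
                    count += 1
                if i < rows - 1:
                    total += z[i + 1][j] - 0.5 * (q[i + 1][j] + q[i][j])
                    count += 1
                z[i][j] = total / count   # relaxation update
    base = z[0][0]
    return [[value - base for value in row] for row in z]
```

For a planar needle map (constant p and q) the relaxation settles on the corresponding plane, which makes a convenient sanity check.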

AIM-743

Author[s]: K.R.K. Nielsen and T. Poggio

Vertical Image Registration in Stereopsis

October 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-743.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-743.pdf

Most computational theories of stereopsis require a registration stage prior to stereo matching to reduce the matching to a one-dimensional search. Even after registration, it is critical that the stereo matching process tolerate some degree of residual misalignment. In this paper, we study with psychophysical techniques the tolerance to vertical disparity in situations in which false targets abound – as in random dot stereograms – and eye movements are eliminated. Our results show that small amounts of vertical disparity significantly impair depth discrimination in a forced-choice task. Our main results are: a) vertical disparity of only the central “figure” part of a random dot stereogram can be tolerated up to about 3.5’, b) vertical disparity of the “figure + ground” is tolerated up to about 6.5’, and c) the performance of the Grimson implementation of the Marr-Poggio stereo matching algorithm for the stereograms of experiment (a) is consistent with the psychophysical results. The algorithm’s tolerance to vertical disparity is due exclusively to the spatial averaging of the underlying filters. The algorithm cannot account by itself for the results of experiment (b). Eye movements, which are the principal registration mechanism for human stereopsis, are accurate to within about 7’. Our data suggest that tolerance to this residual vertical disparity is attained by two non-motor mechanisms: 1) the spatial average performed by the receptive fields that filter the two images prior to stereo matching, and 2) a non-motor shift mechanism that may be driven at least in part by monocular cues.

AIM-740

Author[s]: Berthold K.P. Horn

Extended Gaussian Images

July 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-740.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-740.pdf

This is a primer on extended Gaussian Images. Extended Gaussian Images are useful for representing the shapes of surfaces. They can be computed easily from: 1. Needle maps obtained using photometric stereo, or 2. Depth maps generated by ranging devices or stereo. Importantly, they can also be determined simply from geometric models of the objects. Extended Gaussian images can be of use in at least two of the tasks facing a machine vision system. 1. Recognition, and 2. Determining the attitude in space of an object. Here, the extended Gaussian image is defined and some of its properties discussed. An elaboration for non-convex objects is presented and several examples are shown.
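For a polyhedron, a discrete extended Gaussian image can be sketched as a set of point masses on the unit sphere: each face contributes its area at the position given by its outward unit normal. The code below is an illustrative sketch under that convention, not code from the memo; the function names and the cube example are hypothetical.

```python
import math

def face_normal_and_area(vertices):
    """Unit normal and area of a planar polygon (Newell's method).

    Vertices must be listed counter-clockwise as seen from outside the
    object, so that the normal points outward.
    """
    nx = ny = nz = 0.0
    count = len(vertices)
    for i in range(count):
        x1, y1, z1 = vertices[i]
        x2, y2, z2 = vertices[(i + 1) % count]
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length), 0.5 * length

def extended_gaussian_image(faces):
    """Map each distinct outward normal to the total face area with that normal."""
    egi = {}
    for face in faces:
        normal, area = face_normal_and_area(face)
        # Round so faces with (numerically) equal normals share a key;
        # adding 0.0 turns -0.0 into 0.0.
        key = tuple(round(c, 6) + 0.0 for c in normal)
        egi[key] = egi.get(key, 0.0) + area
    return egi

# Unit cube, each face ordered counter-clockwise when viewed from outside.
CUBE_FACES = [
    [(0, 0, 0), (0, 1, 0), (1, 1, 0), (1, 0, 0)],  # bottom, normal -z
    [(0, 0, 1), (1, 0, 1), (1, 1, 1), (0, 1, 1)],  # top, normal +z
    [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)],  # front, normal -y
    [(0, 1, 0), (0, 1, 1), (1, 1, 1), (1, 1, 0)],  # back, normal +y
    [(0, 0, 0), (0, 0, 1), (0, 1, 1), (0, 1, 0)],  # left, normal -x
    [(1, 0, 0), (1, 1, 0), (1, 1, 1), (1, 0, 1)],  # right, normal +x
]
```

For the cube this gives six unit masses at the axis directions. For any closed surface the area-weighted normals sum to the zero vector, which is a useful check on a computed image.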

AIM-739

Author[s]: Randall Davis

Diagnostic Reasoning Based on Structure and Behavior

June 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-739.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-739.pdf

We describe a system that reasons from first principles, i.e., using knowledge of structure and behavior. The system has been implemented and tested on several examples in the domain of troubleshooting digital electronic circuits. We give an example of the system in operation, illustrating that this approach provides several advantages, including a significant degree of device independence, the ability to constrain the hypotheses it considers at the outset, yet deal with a progressively wider range of problems, and the ability to deal with situations that are novel in the sense that their outward manifestations may not have been encountered previously. As background we review our basic approach to describing structure and behavior, then explore some of the technologies used previously in troubleshooting. Difficulties encountered there lead us to a number of new contributions, four of which make up the central focus of this paper. We describe a technique we call constraint suspension that provides a powerful tool for troubleshooting. We point out the importance of making explicit the assumptions underlying reasoning and describe a technique that helps enumerate assumptions methodically. The result is an overall strategy for troubleshooting based on the progressive relaxation of underlying assumptions. The system can focus its efforts initially, yet will methodically expand its focus to include a broad range of faults. Finally, abstracting from our examples, we find that the concept of adjacency proves to be useful in understanding why some faults are especially difficult and why multiple different representations are useful.

AIM-738

Author[s]: W. Eric L. Grimson and Tomas Lozano-Perez

Model-Based Recognition and Localization from Sparse Range or Tactile Data

August 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-738.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-738.pdf

This paper discusses how local measurements of three-dimensional positions and surface normals (recorded by a set of tactile sensors, or by three-dimensional range sensors), may be used to identify and locate objects, from among a set of known objects. The objects are modeled as polyhedra having up to six degrees of freedom relative to the sensors. We show that inconsistent hypotheses about pairings between sensed points and object surfaces can be discarded efficiently by using local constraints on: distances between faces, angles between face normals, and angles (relative to the surface normals) of vectors between sensed points. We show by simulation and by mathematical bounds that the number of hypotheses consistent with these constraints is small. We also show how to recover the position and orientation of the object from the sense data. The algorithm’s performance on data obtained from a triangulation range sensor is illustrated.
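The pruning idea can be sketched in a much simplified setting. The code below uses a 2D polygon instead of a polyhedron and only the distance constraint (no normal or direction constraints), so it illustrates the flavour of the approach rather than the algorithm of the memo. All names and the rectangle model are hypothetical.

```python
import math

def point_segment_distance(pt, seg):
    """Shortest distance from a point to a (non-degenerate) segment."""
    (ax, ay), (bx, by) = seg
    px, py = pt
    dx, dy = bx - ax, by - ay
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def distance_range(seg_a, seg_b):
    """(min, max) distance between points on two non-crossing segments."""
    dmax = max(math.hypot(u[0] - v[0], u[1] - v[1])
               for u in seg_a for v in seg_b)
    dmin = min([point_segment_distance(u, seg_b) for u in seg_a]
               + [point_segment_distance(v, seg_a) for v in seg_b])
    return dmin, dmax

def consistent_interpretations(points, edges, tol=1e-9):
    """Enumerate assignments of sensed points to model edges that satisfy
    every pairwise distance constraint, pruning depth-first."""
    n = len(edges)
    ranges = [[distance_range(edges[a], edges[b]) for b in range(n)]
              for a in range(n)]
    results = []

    def extend(assignment):
        k = len(assignment)
        if k == len(points):
            results.append(tuple(assignment))
            return
        for edge in range(n):
            ok = True
            for j, other in enumerate(assignment):
                d = math.hypot(points[k][0] - points[j][0],
                               points[k][1] - points[j][1])
                lo, hi = ranges[edge][other]
                if not (lo - tol <= d <= hi + tol):
                    ok = False   # this pairing is geometrically impossible
                    break
            if ok:
                extend(assignment + [edge])

    extend([])
    return results

# Hypothetical model: a 4 x 1 rectangle (bottom, right, top, left edges).
RECT_EDGES = [((0, 0), (4, 0)), ((4, 0), (4, 1)),
              ((4, 1), (0, 1)), ((0, 1), (0, 0))]
```

Because only inter-point distances are used, the sensed points may be expressed in any sensor frame; two points 4.1 units apart, for example, cannot both lie on the same edge of this rectangle.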

AIM-737

Author[s]: H.K. Nishihara and T. Poggio

Hidden Clues in Random Line Stereograms

August 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-737.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-737.pdf

Successful fusion of random-line stereograms with breaks in the vernier acuity range has been previously interpreted to suggest that the interpolation process underlying hyperacuity is parallel and preliminary to stereomatching. In this paper (a) we demonstrate with computer experiments that vernier cues are not needed to solve the stereomatching problem posed by these stereograms and (b) we provide psychophysical evidence that human stereopsis probably does not use vernier cues alone to achieve fusion of these random-line stereograms.

AIM-736

Author[s]: Bruce R. Donald

Hypothesizing Channels through Free-Space in Solving the Findpath Problem

June 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-736.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-736.pdf

Given a polyhedral environment, a technique is presented for hypothesizing a channel volume through the free space containing a class of successful collision-free paths. A set of geometric constructions between obstacle faces is proposed, and we define a mapping from a field of view analysis to a direct local construction of free space. The algorithm has the control structure of a search which propagates construction of a connected channel towards a goal along a frontier of exterior free faces. Thus a channel volume starts out by surrounding the moving object in the initial configuration and “grows” towards the goal. Finally, we show techniques for analyzing the channel decomposition of free space and suggesting a path.

AITR-735

Author[s]: Harold Abelson and Gerald Jay Sussman

Structure and Interpretation of Computer Programs

July 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-735.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-735.pdf

“The Structure and Interpretation of Computer Programs” is the entry-level subject in Computer Science at the Massachusetts Institute of Technology. It is required of all students at MIT who major in Electrical Engineering or in Computer Science, as one fourth of the “common core curriculum,” which also includes two subjects on circuits and linear systems and a subject on the design of digital systems. We have been involved in the development of this subject since 1978, and we have taught this material in its present form since the fall of 1980 to approximately 600 students each year. Most of these students have had little or no prior formal training in computation, although most have played with computers a bit and a few have had extensive programming or hardware design experience. Our design of this introductory Computer Science subject reflects two major concerns. First we want to establish the idea that a computer language is not just a way of getting a computer to perform operations, but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute. Secondly, we believe that the essential material to be addressed by a subject at this level is not the syntax of particular programming language constructs, nor clever algorithms for computing particular functions efficiently, nor even the mathematical analysis of algorithms and the foundations of computing, but rather the techniques used to control the intellectual complexity of large software systems.

AIM-734

Author[s]: Ellen C. Hildreth

The Computation of the Velocity Field

September 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-734.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-734.pdf

The organization of movement in the changing retinal image provides a valuable source of information for analyzing the environment in terms of objects, their motion in space and their three-dimensional structure. A description of this movement is not provided to our visual system directly, however; it must be inferred from the pattern of changing intensity that reaches the eye. This paper examines the problem of motion measurement, which we formulate as the computation of an instantaneous two-dimensional velocity field from the changing image. Initial measurements of motion take place at the location of significant intensity changes, as suggested by Marr and Ullman (1981). These measurements provide only one component of local velocity, and must be integrated to compute the two-dimensional velocity field. A fundamental problem for this integration stage is that the velocity field is not determined uniquely from information available in the changing image. We formulate an additional constraint of smoothness of the velocity field, based on the physical assumption that surfaces are generally smooth, which allows the computation of a unique velocity field. A theoretical analysis of the conditions under which this computation yields the correct velocity field suggests that the solution is physically plausible. Empirical studies show the predictions of this computation to be consistent with human motion perception.

AIM-732

Author[s]: D.D. Hoffman and Whitman Richards

Parts of Recognition

December 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-732.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-732.pdf

A complete theory of object recognition is an impossibility – not simply because of the multiplicity of visual cues we exploit in elegant coordination to identify an object, but primarily because recognition involves fixation of belief, and anything one knows may be relevant. We finesse this obstacle with two moves. The first restricts attention to one visual cue, the shapes of objects; the second restricts attention to one problem, the initial guess at the identity of an object. We propose that the visual system decomposes a shape into parts, that it does so using a rule defining part boundaries rather than part shapes, that the rule exploits a uniformity of nature – transversality, and that parts with their descriptions and spatial relations provide a first index into a memory of shapes. These rules lead to a more comprehensive explanation of several visual illusions. The role of inductive inference is stressed in our theory. We conclude with a précis of unsolved problems.

AIM-731

Author[s]: Whitman Richards

Structure from Stereo and Motion

September 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-731.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-731.pdf

Stereopsis and motion parallax are two methods for recovering three dimensional shape. Theoretical analyses of each method show that neither alone can recover rigid 3D shapes correctly unless other information, such as perspective, is included. The solutions for recovering rigid structure from motion have a reflection ambiguity; the depth scale of the stereoscopic solution will not be known unless the fixation distance is specified in units of interpupil separation. (Hence the configuration will appear distorted.) However, the correct configuration and the disposition of a rigid 3D shape can be recovered if stereopsis and motion are integrated, for then a unique solution follows from a set of linear equations. The correct interpretation requires only three points and two stereo views.

AIM-730

Author[s]: A.L. Yuille and T. Poggio

Fingerprints Theorems for Zero-Crossings

October 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-730.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-730.pdf

We prove that the scale map of the zero-crossings of almost all signals filtered by the second derivative of a gaussian of variable size determines the signal uniquely, up to a constant scaling and a harmonic function. Our proof provides a method for reconstructing almost all signals from knowledge of how the zero-crossing contours of the signal, filtered by a gaussian filter, change with the size of the filter. The proof assumes that the filtered signal can be represented as a polynomial of finite, albeit possibly very high, order. An argument suggests that this restriction is not essential. Stability of the reconstruction scheme is briefly discussed. The result applies to zero- and level-crossings of linear differential operators of gaussian filters. The theorem is extended to two dimensions, that is to images. These results are reminiscent of Logan’s theorem. They imply that extrema of derivatives at different scales are a complete representation of a signal.

AITR-728

Author[s]: Daniel G. Theriault

Issues in the Design and Implementation of Act 2

June 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-728.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-728.pdf

Act2 is a highly concurrent programming language designed to exploit the processing power available from parallel computer architectures. The language supports advanced concepts in software engineering, providing high-level constructs suitable for implementing artificially-intelligent applications. Act2 is based on the Actor model of computation, consisting of virtual computational agents which communicate by message-passing. Act2 serves as a framework in which to integrate an actor language, a description and reasoning system, and a problem-solving and resource management system. This document describes issues in Act2’s design and the implementation of an interpreter for the language.

AIM-727

Author[s]: Carl Hewitt and Peter de Jong

Analyzing the Roles of Descriptions and Actions in Open Systems

April 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-727.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-727.pdf

This paper analyzes relationships between the roles of descriptions and actions in large scale, open ended, geographically distributed, concurrent systems. Rather than attempt to deal with the complexities and ambiguities of currently implemented descriptive languages, we concentrate our analysis on what can be expressed in the underlying frameworks such as the lambda calculus and first order logic. By this means we conclude that descriptions and actions complement one another: neither being sufficient unto itself. This paper provides a basis to begin the analysis of the very subtle relationships that hold between descriptions and actions in Open Systems.

AIM-726

Author[s]: Katsushi Ikeuchi, Berthold K.P. Horn, Shigemi Nagata, Tom Callahan and Oded Fein

Picking Up an Object from a Pile of Objects

May 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-726.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-726.pdf

This paper describes a hand-eye system we developed to perform the bin-picking task. Two basic tools are employed: the photometric stereo method and the extended Gaussian image. The photometric stereo method generates the surface normal distribution of a scene. The extended Gaussian image allows us to determine the attitude of the object based on the normal distribution. Visual analysis of an image consists of two stages. The first stage segments the image into regions and determines the target region. The photometric stereo system provides the surface normal distribution of the scene. The system segments the scene into isolated regions using the surface normal distribution rather than the brightness distribution. The second stage determines object attitude and position by comparing the surface normal distribution with the extended Gaussian image. Fingers with an LED sensor, mounted on the PUMA arm, can successfully pick an object from a pile based on the information from the vision part.

AIM-725

Author[s]: Rodney A. Brooks

Planning Collision Free Motions for Pick and Place Operations

May 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-725.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-725.pdf

An efficient algorithm which finds collision free paths for a manipulator with 5 or 6 revolute joints is described. It solves the problem for four degree of freedom pick and place operations. Examples are given of paths found by the algorithm in tightly cluttered workspaces. The algorithm first describes free space in two ways: as freeways for the hand and payload ensemble and as freeways for the upperarm. Freeways match volumes swept out by manipulator motions and can be “inverted” to find a class of topologically equivalent path segments. The two freeway spaces are searched concurrently under projection of constraints determined by motion of the forearm.

AIM-724

Author[s]: A.L. Yuille

The Smoothest Velocity Field and Token Matching

August 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-724.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-724.pdf

This paper presents some mathematical results concerning the measurement of motion of contours. A fundamental problem of motion measurement in general is that the velocity field is not determined uniquely from the changing intensity patterns. Recently Hildreth & Ullman have studied a solution to this problem based on an Extremum Principle [Hildreth (1983), Ullman & Hildreth (1983)]. That is, they formulate the measurement of motion as the computation of the smoothest velocity field consistent with the changing contour. We analyse this Extremum principle and prove that it is closely related to a matching scheme for motion measurement which matches points on the moving contour that have similar tangent vectors. We then derive necessary and sufficient conditions for the principle to yield the correct velocity field. These results have possible implications for the design of computer vision systems, and for the study of human vision.

AIM-723

Author[s]: Shimon Ullman

Visual Routines

June 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-723.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-723.pdf

This paper examines the processing of visual information beyond the creation of the early representations. A fundamental requirement at this level is the capacity to establish visually abstract shape properties and spatial relations. This capacity plays a major role in object recognition, visually guided manipulation, and more abstract visual thinking. For the human visual system, the perception of spatial properties and relations that are complex from a computational standpoint, nevertheless often appears immediate and effortless. This apparent immediateness and ease of perceiving spatial relations is, however, deceiving. It conceals in fact a complex array of processes highly specialized for the task. The proficiency of the human system in analyzing spatial information far surpasses the capacities of current artificial systems. The study of the computations that underlie this competence may therefore lead to the development of new more efficient processors for the spatial analysis of visual information. It is suggested that the perception of spatial relations is achieved by the application to the base representations of visual routines that are composed of sequences of elemental operations. Routines for different properties and relations share elemental operations. Using a fixed set of basic operations, the visual system can assemble different routines to extract an unbounded variety of shape properties and spatial relations. At a more detailed level, a number of plausible basic operations are suggested, based primarily on their potential usefulness, and supported in part by empirical evidence. The operations discussed include shifting of the processing focus, indexing to an odd-man-out location, bounded activation, boundary tracing, and marking. The problem of assembling such elemental operations into meaningful visual routines is discussed briefly.

AIM-722

Author[s]: A.L. Yuille and T. Poggio

Scaling Theorems for Zero-Crossings

June 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-722.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-722.pdf

We characterize some properties of the zero-crossings of the laplacian of signals - in particular images - filtered with linear filters, as a function of the scale of the filter (following recent work by A. Witkin, 1983). We prove that in any dimension the only filter that does not create zero-crossings as the scale increases is the gaussian. This result can be generalized to apply to level-crossings of any linear differential operator: it applies in particular to ridges and ravines in the image density. In the case of the second derivative along the gradient we prove that there is no filter that avoids creation of zero-crossings.

AIM-721

Author[s]: Shimon Ullman

Maximizing Rigidity: The Incremental Recovery of 3-D Structure from Rigid and Rubbery Motion

June 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-721.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-721.pdf

The human visual system can extract 3-D shape information of unfamiliar moving objects from their projected transformations. Computational studies of this capacity have established that 3-D shape can be extracted correctly from a brief presentation, provided that the moving objects are rigid. The human visual system requires a longer temporal extension, but it can cope, however, with considerable deviations from rigidity. It is shown how the 3-D structure of rigid and non-rigid objects can be recovered by maintaining an internal model of the viewed object and modifying it at each instant by the minimal non-rigid change that is sufficient to account for the observed transformation. The results of applying this incremental rigidity scheme to rigid and non-rigid objects in motion are described and compared with human perceptions.

AITR-720

Author[s]: John Francis Canny

Finding Edges and Lines in Images

June 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-720.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-720.pdf

The problem of detecting intensity changes in images is canonical in vision. Edge detection operators are typically designed to optimally estimate first or second derivative over some (usually small) support. Other criteria such as output signal to noise ratio or bandwidth have also been argued for. This thesis is an attempt to formulate a set of edge detection criteria that capture as directly as possible the desirable properties of an edge operator. Variational techniques are used to find a solution over the space of all linear shift invariant operators. The first criterion is that the detector have low probability of error, i.e., failing to mark edges or falsely marking non-edges. The second is that the marked points should be as close as possible to the centre of the true edge. The third criterion is that there should be low probability of more than one response to a single edge. The technique is used to find optimal operators for step edges and for extended impulse profiles (ridges or valleys in two dimensions). The extension of the one dimensional operators to two dimensions is then discussed. The result is a set of operators of varying width, length and orientation. The problem of combining these outputs into a single description is discussed, and a set of heuristics for the integration are given.
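The three criteria can be made concrete with a small one-dimensional sketch. It smooths the signal with a Gaussian, differentiates, and keeps only thresholded local maxima of the response, so that a single step yields a single, well-localized mark. The Gaussian-derivative filter, the threshold, and the function names are illustrative assumptions; this is not the optimal operator derived in the thesis.

```python
import math

def gaussian_kernel(sigma):
    """Normalized, sampled Gaussian truncated at three standard deviations."""
    radius = int(math.ceil(3 * sigma))
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def smooth(signal, sigma):
    """Gaussian smoothing with replicated border samples."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def detect_edges(signal, sigma=2.0, threshold=0.05):
    """Return indices i such that an edge lies between samples i and i + 1."""
    s = smooth(signal, sigma)
    # Forward difference of the smoothed signal approximates filtering
    # with the first derivative of a Gaussian.
    response = [s[i + 1] - s[i] for i in range(len(s) - 1)]
    edges = []
    for i in range(1, len(response) - 1):
        mag = abs(response[i])
        # Threshold (few false marks) plus non-maximum suppression
        # (one response per edge, placed at the response peak).
        if (mag > threshold and mag >= abs(response[i - 1])
                and mag > abs(response[i + 1])):
            edges.append(i)
    return edges
```

On a noise-free step between samples 49 and 50, the response is a single Gaussian-shaped bump, so only index 49 survives suppression. Raising `sigma` trades localization for noise suppression, which is the tension the criteria formalize.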

AIM-719

Author[s]: Gerald Barber, Peter de Jong and Carl Hewitt

Semantic Support for Work in Organizations

April 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-719.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-719.pdf

Present day computer systems cannot implement much of the work carried out in organizations such as: planning, decision making, analysis, and dealing with unanticipated situations. Such organizational activities have traditionally been considered too unstructured to be suitable for automation by computer. We are working on the development of computer technology to overcome these limitations. Our goal is the development of a computer system which is capable of the following: describing the semantics of applications as well as the structure of the organization carrying out the work, aiding workers in carrying out the applications using these descriptions, and acquiring these capabilities in the course of the daily work through a process which is analogous to apprenticeship.

AIM-718

Author[s]: A. Yuille

Zero-Crossings on Lines of Curvature

December 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-718.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-718.pdf

We investigate the relations between the structure of the image and events in the geometry of the underlying surface. We introduce some elementary differential geometry and use it to define a coordinate system on the object based on the lines of curvature. Using this coordinate system we can prove results connecting the extrema, ridges and zero-crossings in the image to geometrical features of the object. We show that extrema of the image typically correspond to points on the surface with zero Gaussian curvature and that parabolic lines often give rise to ridges, or valleys, in the image intensity. We show that directional zero-crossings of the image along the lines of curvature generally correspond to extrema of curvature along such lines.

AIM-717

Author[s]: John M. Hollerbach and Gideon Sahar

Wrist-Partitioned Inverse Kinematic Accelerations and Manipulator Dynamics

April 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-717.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-717.pdf

An efficient algorithm is presented for the calculation of the inverse kinematic accelerations for a 6 degree-of-freedom manipulator with a spherical wrist. The inverse kinematic calculation is shown to work synergistically with the inverse dynamic calculation, producing kinematic parameters needed in the recursive Newton-Euler dynamics formulation. Additional savings in the dynamics computation are noted for a class of kinematically well-structured manipulators such as spherical-wrist arms and for manipulators with simply-structured inertial parameters.

AIM-716

Author[s]: Graziella Tonfoni and Richard J. Doyle

Understanding Text through Summarization and Analogy

April 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-716.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-716.pdf

Understanding a text exactly in the way that the Text Producer meant the text to be understood is highly unlikely unless the text interpretation process is constrained. Specific understanding-directing criteria are given in the form of a Premise which is a configuration of plot-units. After performing a Premise-directed text summarization, the Text Receiver will have understood the text as the Text Producer intended and will then be able to replace missing relations within the exercises and produce new texts by applying analogy.

AITR-715

Author[s]: George Edward Barton, Jr.

A Multiple-Context Equality-Based Reasoning System

April 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-715.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-715.pdf

Expert systems are too slow. This work attacks that problem by speeding up a useful system component that remembers facts and tracks down simple consequences. The redesigned component can assimilate new facts more quickly because it uses a compact, grammar-based internal representation to deal with whole classes of equivalent expressions at once. It can support faster hypothetical reasoning because it remembers the consequences of several assumption sets at once. The new design is targeted for situations in which many of the stored facts are equalities. The deductive machinery considered here supplements stored premises with simple new conclusions. The stored premises include permanently asserted facts and temporarily adopted assumptions. The new conclusions are derived by substituting equals for equals and using the properties of the logical connectives AND, OR, and NOT. The deductive system provides supporting premises for its derived conclusions. Reasoning that involves quantifiers is beyond the scope of its limited and automatic operation. The expert system of which the reasoning system is a component is expected to be responsible for overall control of reasoning.

AIM-714

Author[s]: Ikeuchi, Katsushi

Determining Attitude of Object from Needle Map Using Extended Gaussian Image

April 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-714.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-714.pdf

An extended Gaussian image (EGI) is constructed by mapping the surface normals of an object onto the Gaussian sphere. The attitude of an object is greatly constrained by the global distribution of EGI mass over the visible Gaussian hemisphere. Constraints on the viewer direction are derived from the position of the EGI mass center, and from the direction of the EGI inertia axis. The algorithm embodying these constraints and the EGI mass distribution are implemented using a lookup table. A function for matching an observed EGI with the prototypical EGIs is also proposed. The algorithm determines the attitude of an object successfully both from a synthesized needle map and a real needle map.

AIM-713

Author[s]: C. Koch and T. Poggio

A Theoretical Analysis of Electrical Properties of Spines

April 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-713.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-713.pdf

The electrical properties of a cortical (spiny) pyramidal cell were analyzed on the basis of passive cable theory from measurements made on histological material (Koch, Poggio & Torre 1982). The basis of this analysis is the solution of the cable equation for an arbitrary branched dendritic tree. We determined the potential at the soma as a function of the synaptic input (transient conductance changes) and as a function of the spine neck dimensions. From our investigation four major points emerge: 1. Spines may effectively compress the effect of each single excitatory synapse on the soma, mapping a wide range of inputs onto a limited range of outputs (nonlinear saturation). This is also true for very fast transient inputs, in sharp contrast with the case of a synapse on a dendrite. 2. The somatic depolarization due to an excitatory synapse on a spine is a very sensitive function of the spine neck length and diameter. Thus the spine can effectively control the resulting saturation curve. This might be the basic mechanism underlying ultra-short memory, long-term potentiation in the hippocampus or learning in the cerebellum. 3. Spines with shunting inhibitory synapses on them are ineffective in reducing the somatic depolarization due to excitatory inputs on the dendritic shaft or on other spines. Thus isolated inhibitory synapses on a spine are not expected to occur. 4. The conjunction of an excitatory synapse with a shunting inhibitory synapse on the same spine may result in a time-discrimination circuit with a temporal resolution of around 100 μsec.

AIM-712

Author[s]: C. Koch and T. Poggio

Information Processing in Dendritic Spines

March 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-712.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-712.pdf

Dendritic spines are small twigs on the dendrites of a very large class of neurons in the central nervous system. There are between 10^3 and 10^5 spines per neuron, each one including at least one synapse, i.e. a connection with other neurons. Thus, spines are usually associated with an important feature of neurons – their high degree of connectivity – one of the most obvious differences between present computers and brains. We have analysed the electrical properties of a cortical (spiny) pyramidal cell on the basis of passive cable theory, from measurements made on histological material, using the solution of the cable equation for an arbitrary branched dendritic tree. As postulated by Rall, we found that the somatic potential induced by a firing synapse on a spine is a very sensitive function of the dimension of the spine. This observation leads to several hypotheses concerning the electrical functions of spines, especially with respect to their role in memory.

AIM-711

Author[s]: Michael Brady and Alan Yuille

An Extremum Principle for Shape from Contour

June 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-711.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-711.pdf

An extremum principle is developed that determines three-dimensional surface orientation from a two-dimensional contour. The principle maximizes the ratio of the area to the square of the perimeter, a measure of the compactness or symmetry of the three-dimensional surface. The principle interprets regular figures correctly and it interprets skew symmetries as oriented real symmetries. The maximum likelihood method approximates the principle on irregular figures, but we show that it consistently overestimates the slant of an ellipse.
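
The compactness measure named in this abstract, area divided by perimeter squared, can be illustrated with a small sketch. It is not the memo's algorithm: it assumes orthographic projection, a polygonal contour, and a tilt direction fixed along the image x axis, and it searches only over whole-degree slants. The function names are hypothetical.

```python
import math

def compactness(points):
    """Area / perimeter**2 of a simple polygon given as (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0          # shoelace term
        perimeter += math.hypot(x1 - x0, y1 - y0)
    return abs(twice_area) / 2.0 / perimeter ** 2

def back_project(points, slant_deg):
    """Undo foreshortening for a plane rotated about the x axis (assumed tilt)."""
    stretch = 1.0 / math.cos(math.radians(slant_deg))
    return [(x, y * stretch) for x, y in points]

def best_slant(points, candidates=range(0, 81)):
    """Pick the candidate slant whose back-projected contour is most compact."""
    return max(candidates, key=lambda s: compactness(back_project(points, s)))

# Image of a unit square slanted by 60 degrees: its height is halved.
image_contour = [(0.0, 0.0), (1.0, 0.0), (1.0, 0.5), (0.0, 0.5)]
print(best_slant(image_contour))
```

For this contour the search settles on the slant that restores the square, the most compact rectangle, which matches the abstract's remark that regular figures are interpreted correctly.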

AIM-710

Author[s]: David Allen McAllester

Symmetric Set Theory: A General Theory of Isomorphism, Abstraction, and Representation

August 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-710.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-710.pdf

It is possible to represent a finite set of points (atoms) by a finite sequence of points. However, a finite set of points has no distinguished member and therefore it is impossible to define a function which takes a finite set of points and returns a “first” point in that set. Thus it is impossible to represent a finite sequence of points by a finite set of points. The theory of symmetric sets provides a framework in which the observation about sets and sequences can be proven. The theory of symmetric sets is similar to classical (Zermelo-Fraenkel) set theory with the exception that the universe of symmetric sets includes points (ur-elements). Points provide a basis for general notions of isomorphism and symmetry. The general notions of isomorphism and symmetry in turn provide a basis for natural, simple, and universal definitions of abstractness, essential properties and functions, canonicality, and representations. It is expected that these notions will play an important role in the theory of data structures and in the construction of general techniques for reasoning about data structures.

AIM-708

Author[s]: David Allen McAllester

Solving Uninterpreted Equations with Context Free Expression Grammars

May 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-708.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-708.pdf

It is shown here that the equivalence class of an expression under the congruence closure of any finite set of equations between ground terms is a context free expression language. An expression is either a symbol or an n-tuple of expressions; the difference between expressions and strings is that expressions have inherent phrase structure. The Downey, Sethi and Tarjan algorithm for computing congruence closures can be used to convert a finite set of equations E to a context free expression grammar G such that for any expression u the equivalence class of u under E is precisely the language generated by an expression form I’(u) under grammar G. The fact that context free expression languages are closed under intersection is used to derive an algorithm for computing a grammar for the equivalence class of a given expression under any finite disjunction of finite sets of equations between ground expressions. This algorithm can also be used to derive a grammar representing the equivalence class of conditional expressions of the form if P then u else v. The description of an equivalence class by a context free expression grammar can also be used to simplify expressions under “well behaved” simplicity orders. Specifically, if G is a context free expression grammar which generates an equivalence class of expressions then for any well behaved simplicity order there is a subset G’ of the productions of G such that the expressions generated by G’ are exactly those expressions of the equivalence class which are simplicity bounds and whose subterms are also simplicity bounds. Furthermore, G’ can be computed from G in order n log(n) time plus the time required to do order n log(n) comparisons between expressions, where n is the size of G.
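
The congruence closure that this abstract builds on can be sketched naively as follows. This is a quadratic fixed-point loop over a union-find structure, not the Downey, Sethi and Tarjan algorithm, and it does not construct the expression grammar. Following the abstract's definition, an expression is a symbol (here a Python string) or a tuple of expressions; the function name and the `extra_terms` parameter are hypothetical.

```python
from itertools import combinations

def congruence_closure(equations, extra_terms=()):
    """Return find(term), mapping each known term to its class representative."""
    terms = set()

    def collect(term):
        # Register a term together with all of its sub-expressions.
        terms.add(term)
        if isinstance(term, tuple):
            for part in term:
                collect(part)

    for lhs, rhs in equations:
        collect(lhs)
        collect(rhs)
    for term in extra_terms:
        collect(term)

    parent = {term: term for term in terms}

    def find(term):
        while parent[term] != term:
            parent[term] = parent[parent[term]]   # path halving
            term = parent[term]
        return term

    def union(a, b):
        parent[find(a)] = find(b)

    for lhs, rhs in equations:
        union(lhs, rhs)

    tuples = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:                                # repeat until a fixed point
        changed = False
        for s, t in combinations(tuples, 2):
            if (len(s) == len(t) and find(s) != find(t)
                    and all(find(a) == find(b) for a, b in zip(s, t))):
                union(s, t)                       # congruent tuples are merged
                changed = True
    return find

# From a = b and (f a) = c it follows that (f b) = c.
find = congruence_closure([("a", "b"), (("f", "a"), "c")],
                          extra_terms=[("f", "b")])
print(find(("f", "b")) == find("c"))
```

Only terms that occur in the equations or in `extra_terms` are tracked, so the closure is computed over a finite set of expressions rather than the full (possibly infinite) equivalence class that the grammar describes.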

AITR-707

Author[s]: Walter Hamscher

Using Structural and Functional Information in Diagnostic Design

June 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-707.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-707.pdf

We wish to design a diagnostic for a device from knowledge of its structure and function. The diagnostic should achieve coverage of the faults that can occur in the device and should strive for specificity in its diagnosis when it detects a fault. A system is described that uses a simple model of hardware structure and function, representing the device in terms of its internal primitive functions and connections. The system designs a diagnostic in three steps. First, an extension of path sensitization is used to design a test for each of the connections in the device. Next, the resulting tests are improved by increasing their specificity. Finally the tests are ordered so that each relies on the fewest possible connections. We describe an implementation of this system and show examples of the results for some simple devices.

AIM-706

Author[s]: Shimon Ullman

Computational Studies in the Interpretation of Structure and Motion: Summary and Extension

March 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-706.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-706.pdf

Computational studies of the interpretation of structure from motion examine the conditions under which three-dimensional structure can be recovered from motion in the image. The first part of this paper summarizes the main results obtained to date in these studies. The second part examines two issues: the robustness of the 3-D interpretation of perspective velocity fields, and the 3-D information contained in orthographic velocity fields. The two are related because, under local analysis, limitations on the interpretation of orthographic velocity fields also apply to perspective projection. The following results are established: When the interpretation is applied locally, the 3-D interpretation of the perspective velocity field is unstable. The orthographic velocity field determines the structure of the inducing object exactly up to a depth-scaling. For planar objects, the orthographic velocity field always admits two distinct solutions up to depth-scaling. The 3-D structure is determined uniquely by a “view and a half” of the orthographic velocity field.

AIM-705

Author[s]: Peter C. Gaston and Tomas Lozano-Perez

Tactile Recognition and Localization Using Object Models: The Case of Polyhedra on a Plane

March 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-705.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-705.pdf

This paper discusses how data from multiple tactile sensors may be used to identify and locate one object, from among a set of known objects. We use only local information from sensors: (1) the position of contact points, and (2) ranges of surface normals at the contact points. The recognition and localization process is structured as the development and pruning of a tree of consistent hypotheses about pairings between contact points and object surfaces. In this paper, we deal with polyhedral objects constrained to lie on a known plane, i.e., having three degrees of positioning freedom relative to the sensors.

AITR-704

Author[s]: Daniel Carl Brotsky

An Algorithm for Parsing Flow Graphs

March 1984

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-704.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-704.pdf

This report describes research about flow graphs - labeled, directed, acyclic graphs which abstract representations used in a variety of Artificial Intelligence applications. Flow graphs may be derived from flow grammars much as strings may be derived from string grammars; this derivation process forms a useful model for the stepwise refinement processes used in programming and other engineering domains. The central result of this report is a parsing algorithm for flow graphs. Given a flow grammar and a flow graph, the algorithm determines whether the grammar generates the graph and, if so, finds all possible derivations for it. The author has implemented the algorithm in LISP. The intent of this report is to make flow-graph parsing available as an analytic tool for researchers in Artificial Intelligence. The report explores the intuitions behind the parsing algorithm, contains numerous, extensive examples of its behavior, and provides some guidance for those who wish to customize the algorithm to their own uses.

AITR-703

Author[s]: Gerald Roylance

A Simple Model of Circuit Design

May 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-703.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-703.pdf

A simple analog circuit designer has been implemented as a rule based system. The system can design voltage followers, Miller integrators, and bootstrap ramp generators from functional descriptions of what these circuits do. While the designer works in a simple domain where all components are ideal, it demonstrates the abilities of skilled designers. While the domain is electronics, the design ideas are useful in many other engineering domains, such as mechanical engineering, chemical engineering, and numerical programming. Most circuit design systems are given the circuit schematic and use arithmetic constraints to select component values. This circuit designer is different because it designs the schematic. The designer uses a unidirectional CONTROL relation to find the schematic. The circuit designs are built around this relation; it restricts the search space, assigns purposes to components and finds design bugs.

AIM-702

Author[s]: Reid G. Simmons and Randall Davis

Representations for Reasoning About Change

April 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-702.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-702.pdf

This paper explores representations used to reason about objects which change over time and the processes which cause changes. Specifically, we are interested in solving a problem known as geologic interpretation. To help solve this problem, we have developed a simulation technique, which we call imagining. Imagining takes a sequence of events and simulates them by drawing diagrams. In order to do this imagining, we have developed two representations of objects, one involving histories and the other involving diagrams, and two corresponding representations of physical processes, each suited to reasoning about one of the object representations. These representations facilitate both spatial and temporal reasoning.

AIM-701

Author[s]: John Batali

Computational Introspection

February 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-701.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-701.pdf

Introspection is the process of thinking about one’s own thoughts and feelings. In this paper, I discuss recent attempts to make computational systems that exhibit introspective behavior: [Smith, 1982], [Weyhrauch, 1978], and [Doyle, 1980]. Each presents a system capable of manipulating representations of its own program and current context. I argue that introspective ability is crucial for intelligent systems – without it an agent cannot represent certain problems that it must be able to solve. A theory of intelligent action would describe how and why certain actions intelligently achieve an agent’s goals. The agent would both embody and represent this theory; it would be implemented as the program for the agent; and the importance of introspection suggests that the agent represent its theory of action to itself.

AIM-700

Author[s]: John M. Hollerbach

Dynamic Scaling of Manipulator Trajectories

January 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-700.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-700.pdf

A fundamental time-scaling property of manipulator dynamics has been identified that allows modification of movement speed without complete dynamics recalculation. By exploiting this property, it can be determined whether a planned trajectory is dynamically realizable given actuator torque limits, and if not, how to modify the trajectory to bring it within dynamic and actuating constraints.
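
One way a time-scaling property of this kind can be seen is on a single rigid link, where the torque is I·θ'' + m·g·l·cos θ. Replaying the trajectory c times faster multiplies θ'' by c² (chain rule) and leaves the gravity term unchanged, so the new torque can be assembled from terms already computed for the planned motion. The sketch below assumes that one-link model; the parameter values, trajectory, and function names are hypothetical and are not taken from the memo.

```python
import math

# Hypothetical single-link arm parameters (not from the memo).
INERTIA = 0.5                      # kg m^2 about the joint
GRAVITY_GAIN = 2.0 * 9.81 * 0.4    # mass * g * distance to centre of mass

def theta(t):
    """Planned joint angle in radians."""
    return 0.5 * math.sin(t)

def theta_ddot(t):
    """Analytic second derivative of the planned angle."""
    return -0.5 * math.sin(t)

def torque_parts(t):
    """Planned torque split into acceleration-dependent and gravity parts."""
    return INERTIA * theta_ddot(t), GRAVITY_GAIN * math.cos(theta(t))

def scaled_torque(t, c):
    """Torque for the sped-up trajectory theta(c * t), reusing planned terms."""
    accel_part, gravity_part = torque_parts(c * t)
    return c * c * accel_part + gravity_part

def direct_torque(t, c, h=1e-4):
    """Reference value: differentiate theta(c * t) numerically and recompute."""
    f = lambda s: theta(c * s)
    accel = (f(t + h) - 2.0 * f(t) + f(t - h)) / (h * h)
    return INERTIA * accel + GRAVITY_GAIN * math.cos(f(t))

def within_limit(c, limit, duration=2.0 * math.pi, steps=200):
    """Check a torque limit along the whole time-scaled trajectory."""
    return all(abs(scaled_torque(k * duration / (c * steps), c)) <= limit
               for k in range(steps + 1))

print(abs(scaled_torque(0.3, 2.0) - direct_torque(0.3, 2.0)) < 1e-5)
print(within_limit(1.0, 8.0), within_limit(3.0, 8.0))
```

With these made-up numbers the planned motion stays under an 8 N·m limit while the threefold speed-up does not, which is the kind of feasibility question the abstract describes answering without recomputing the dynamics.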

AIM-699

Author[s]: Ellen C. Hildreth and Shimon Ullman

The Measurement of Visual Motion

December 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-699.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-699.pdf

The analysis of visual motion divides naturally into two stages: the first is the measurement of motion, for example, the assignment of direction and magnitude of velocity to elements in the image, on the basis of the changing intensity pattern; the second is the use of motion measurements, for example, to separate the scene into distinct objects, and infer their three-dimensional structure. In this paper, we present a computational study of the measurement of motion. Similar to other visual processes, the motion of elements is not determined uniquely by information in the changing image; additional constraint is required to compute a unique velocity field. Given this global ambiguity of motion, local measurements from the changing image, such as those provided by directionally-selective simple cells in primate visual cortex, cannot possibly specify a unique local velocity vector, and in fact, specify only one component of velocity. Computation of the full two-dimensional velocity field requires the integration of local motion measurements, either over an area, or along contours in the image. We will examine possible algorithms for computing motion, based on a range of additional constraints. Finally, we will present implications for the biological computation of motion.

AIM-698A

Author[s]: Tomas Lozano-Perez

Robot Programming

December 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-698a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-698a.pdf

The industrial robot’s principal advantage over traditional automation is programmability. Robots can perform arbitrary sequences of pre-stored motions or of motions computed as functions of sensory input. This paper reviews requirements for and developments in robot programming systems. The key requirements for robot programming systems examined in the paper are in the areas of sensing, world modeling, motion specification, flow of control, and programming support. Existing and proposed robot programming systems fall into three broad categories: guiding systems in which the user leads a robot through the motions to be performed, robot-level programming systems in which the user writes a computer program specifying motion and sensing, and task-level programming systems in which the user specifies operations by their desired effect on objects. A representative sample of systems in each of these categories is surveyed in the paper.

AIM-698

Author[s]: Tomas Lozano-Perez

Robot Programming

December 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-698.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-698.pdf

AIM-697

Author[s]: W.E.L. Grimson

Binocular Shading and Visual Surface Reconstruction

August 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-697.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-697.pdf

Zero-crossing or feature-point based stereo algorithms can, by definition, determine explicit depth information only at particular points on the image. To compute a complete surface description, this sparse depth map must be interpolated. A computational theory of this interpolation or reconstruction process, based on a surface consistency constraint, has previously been proposed. In order to provide stronger boundary conditions for the interpolation process, other visual cues to surface shape are examined in this paper. In particular, it is shown that, in principle, shading information from the two views can be used to determine the orientation of the surface normal along the feature-point contours, as well as the parameters of the reflective properties of the surface material. The numerical stability of the resulting equations is also examined.

AIM-692

Author[s]: C.J. Barter

Policy-Protocol Interaction in Composite Processes

September 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-692.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-692.pdf

Message policy is defined to be the description of the disposition of messages of a single type, when received by a group of processes. Group policy applies to all the processes of a group, but for a single message type. It is proposed that group policy be specified in an expression which is separate from the code of the processes of the group, and in a separate notation. As a result, it is possible to write policy expressions which are independent of process state variables, and as well use a simpler control notation based on regular expressions. Input protocol, on the other hand, applies to single processes or a group as a whole for all message types. Encapsulation of processes is presented with an unusual emphasis on the transactions and resources which associate with an encapsulated process rather than the state space of the process environment. This is due to the notion of encapsulation without shared variables, and to the association between group policies, message sequences and transactions.

AIM-691

Author[s]: Carl Hewitt and Peter de Jong

Open Systems

December 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-691.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-691.pdf

This paper describes some problems and opportunities associated with conceptual modeling for the kind of “open systems” we foresee must and will be increasingly recognized as a central line of computer system development. Computer applications will be based on communication between sub-systems which will have been developed separately and independently. Some of the reasons for independent development are the following: competition, different goals and responsibilities, economics, and geographical distribution. We must deal with all the problems that arise from this conceptual disparity of sub-systems which have been independently developed. Sub-systems will be open-ended and incremental – undergoing continual evolution. There are no global objects. The only thing that all the various sub-systems hold in common is the ability to communicate with each other. In this paper we study Open Systems from the viewpoint of Message Passing Semantics, a research programme to explore issues in the semantics of communication in parallel systems such as negotiation, transaction management, problem solving, change, and self-knowledge.

AITR-690

Author[s]: Matthew Thomas Mason

Manipulator Grasping and Pushing Operations

June 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-690.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-690.pdf

The primary goal of this research is to develop theoretical tools for analysis, synthesis, and application of primitive manipulator operations. The primary method is to extend and apply traditional tools of classical mechanics. The results are of such a general nature that they address many different aspects of industrial robotics, including effector and sensor design, planning and programming tools and design of auxiliary equipment. Some of the manipulator operations studied are: (1) Grasping an object. The object will usually slide and rotate during the period between first contact and prehension. (2) Placing an object. The object may slip slightly in the fingers upon contact with the table as the base aligns with the table. (3) Pushing. Often the final stage of mating two parts involves pushing one object into the other.

AITR-688

Author[s]: Robert W. Sjoberg

Atmospheric Effects in Satellite Imaging of Mountainous Terrain

September 1982

http://ncstrl.mit.edu

AIM-687

Author[s]: Tomaso Poggio and B.L. Rosser

The Computational Problem of Motor Control

May 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-687.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-687.pdf

We review some computational aspects of motor control. The problem of trajectory control is phrased in terms of an efficient representation of the operator connecting joint angles to joint torques. Efficient look-up table solutions of the inverse dynamics are related to some results on the decomposition of functions of many variables. In a biological perspective, we emphasize the importance of the constraints coming from the properties of the biological hardware for determining the solution to the inverse dynamic problem.

AIM-686

Author[s]: John M. Hollerbach

Computers, Brains, and the Control of Movement

June 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-686.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-686.pdf

Many of the problems associated with the planning and execution of human arm trajectories are illuminated by planning and control strategies which have been developed for robotic manipulators. This comparison may provide explanations for the predominance of straight line trajectories in human reaching and pointing movements, the role of feedback during arm movement, as well as plausible compensatory mechanisms for arm dynamics.

AIM-685

Author[s]: Rodney A. Brooks

Symbolic Error Analysis and Robot Planning

September 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-685.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-685.pdf

A program to control a robot manipulator for industrial assembly operations must take into account possible errors in parts placement and tolerances of the parts themselves. Previous approaches to this problem have been to (1) engineer the situation so that the errors are small or (2) let the programmer analyze the errors and take explicit account of them. This paper gives the mathematical underpinnings for building programs (plan checkers) to carry out approach (2) automatically. The plan checker uses a geometric CAD-type database to infer the effects of actions and the propagation of errors. It does this symbolically rather than numerically, so that computations can be reversed and desired resultant tolerances can be used to infer required initial tolerances or the necessity for sensing. The checker modifies plans to include sensing and adds constraints to the plan which ensure that it will succeed. An implemented system is described and results of its execution are presented. The plan checker could be used as part of an automatic planning system or as an aid to a human robot programmer.

AIM-684

Author[s]: Rodney A. Brooks and Tomas Lozano-Perez

A Subdivision Algorithm in Configuration Space for Findpath with Rotation

December 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-684.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-684.pdf

A hierarchical representation for configuration space is presented, along with an algorithm for searching that space for collision-free paths. The details of the algorithm are presented for polygonal obstacles and a moving object with two translational and one rotational degrees of freedom.

AIM-683

Author[s]: Tomaso Poggio

Visual Algorithms

May 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-683.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-683.pdf

Nonlinear, local and highly parallel algorithms can perform several simple but important visual computations. Specific classes of algorithms can be considered in an abstract way. I study here the class of polynomial algorithms to exemplify some of the important issues for visual processing like linear vs. nonlinear and local vs. global. Polynomial algorithms are a natural extension of Perceptrons to time dependent grey level images. Although they share most of the limitations of Perceptrons, they are powerful parallel computational devices. Several of their properties are characterized and especially (a) their equivalence with Perceptrons for geometrical figures and (b) the synthesis of non-linear algorithms (mappings) via associative learning. Finally, the paper considers how algorithms of this type could be implemented in nervous hardware, in terms of synaptic interactions strategically located in a dendritic tree.

AIM-681

Author[s]: Gerald Barber

Supporting Organizational Problem Solving with a Workstation

July 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-681.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-681.pdf

This paper describes an approach to supporting work in the office. Using and extending ideas from the field of Artificial Intelligence (AI) we describe office work as a problem solving activity. A knowledge embedding language called Omega is used to embed knowledge of the organization into an office worker’s workstation in order to support the office worker in his or her problem solving. A particular approach to reasoning about change and contradiction is discussed. This approach uses Omega’s viewpoint mechanism. Omega’s viewpoint mechanism is a general contradiction handling facility. Unlike other Knowledge Representation systems, when a contradiction is reached the reasons for the contradiction can be analyzed by the deduction mechanism without having to resort to a backtracking mechanism. The Viewpoint mechanism is the heart of the Problem Solving Support Paradigm. This paradigm supplements the classical AI view of problem solving. Office workers are supported using the Problem Solving Support Paradigm. An example is presented where Omega’s facilities are used to support an office worker’s problem solving activities. The example illustrates the use of viewpoints and of Omega’s capabilities to reason about its own reasoning process.

AIM-680A

Author[s]: Richard C. Waters

LetS: An Expressional Loop Notation

February 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-680a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-680a.pdf

Many loops can be more easily understood and manipulated if they are viewed as being built up out of operations on sequences of values. A notation is introduced which makes this viewpoint explicit. Using it, loops can be represented as compositions of functions operating on sequences of values. A library of standard sequence functions is provided along with facilities for defining additional ones. The notation is not intended to be applicable to every kind of loop. Rather, it has been simplified wherever possible so that straightforward loops can be represented extremely easily. The expressional form of the notation makes it possible to construct and modify such loops rapidly and accurately. The implementation of the notation does not actually use sequences but rather compiles loop expressions into iterative loop code. As a result, using the notation leads to no reduction in run time efficiency.
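
As a rough illustration of the idea in this abstract, the sketch below writes one loop as a composition of functions over sequences and then as the iterative loop it is equivalent to. It uses Python generators rather than the LetS notation itself, and every function name in it (elements, keep_odd, square, and so on) is a hypothetical name chosen for this example, not part of LetS.

```python
from typing import Iterable, Iterator


def elements(xs: Iterable[int]) -> Iterator[int]:
    """Enumerate the values of a collection as a sequence."""
    yield from xs


def keep_odd(seq: Iterable[int]) -> Iterator[int]:
    """Filter a sequence, keeping only the odd values."""
    for x in seq:
        if x % 2 == 1:
            yield x


def square(seq: Iterable[int]) -> Iterator[int]:
    """Map each value of a sequence to its square."""
    for x in seq:
        yield x * x


def sum_squares_of_odds(xs: Iterable[int]) -> int:
    """Expressional form: a composition of sequence functions."""
    return sum(square(keep_odd(elements(xs))))


def sum_squares_of_odds_loop(xs: Iterable[int]) -> int:
    """The same computation written as a single explicit loop."""
    total = 0
    for x in xs:
        if x % 2 == 1:
            total += x * x
    return total
```

Because generators are lazy, the composed form also never builds an intermediate list, which loosely mirrors the abstract's point that the notation need not materialize sequences.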

AIM-679

Author[s]: Patrick H. Winston, Thomas O. Binford, Boris Katz and Michael Lowry

Learning Physical Descriptions from Functional Definitions, Examples, and Precedents

November 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-679.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-679.pdf

It is too hard to tell vision systems what things look like. It is easier to talk about purpose and what things are for. Consequently, we want vision systems to use functional descriptions to identify things when necessary, and we want them to learn physical descriptions for themselves, when possible. This paper describes a theory that explains how to make such systems work. The theory is a synthesis of two sets of ideas: ideas about learning from precedents and exercises developed at MIT and ideas about physical description developed at Stanford. The strength of the synthesis is illustrated by way of representative experiments. All of these experiments have been performed with an implemented system.

AIM-678

Author[s]: Patrick H. Winston

Learning by Augmenting Rules and Accumulating Censors

May 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-678.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-678.pdf

This paper is a synthesis of several sets of ideas: ideas about learning from precedents and exercises, ideas about learning using near misses, ideas about generalizing if-then rules, and ideas about using censors to prevent procedure misapplication. The synthesis enables two extensions to an implemented system that solves problems involving precedents and exercises and that generates if-then rules as a byproduct. These extensions are as follows: If-then rules are augmented by unless conditions, creating augmented if-then rules. An augmented if-then rule is blocked whenever facts in hand directly demonstrate the truth of an unless condition. When an augmented if-then rule is used to demonstrate the truth of an unless condition, the rule is called a censor. Like ordinary augmented if-then rules, censors can be learned. Definition rules are introduced that facilitate graceful refinement. The definition rules are also augmented if-then rules. They work by virtue of unless entries that capture certain nuances of meaning different from those expressible by necessary conditions. Like ordinary augmented if-then rules, definition rules can be learned. The strength of the ideas is illustrated by way of representative experiments. All of these experiments have been performed with an implemented system.
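
The blocking behaviour of unless conditions can be sketched in a few lines. This is a minimal illustration under assumptions, not the memo's implemented system: the Rule class, the fires and is_censor_for helpers, and the example facts are all hypothetical, and the sketch checks only the facts in hand, with no chaining or learning.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """An augmented if-then rule (hypothetical representation)."""
    if_all: frozenset               # conditions that must all hold
    then: str                       # conclusion
    unless: frozenset = frozenset() # any one of these blocks the rule


def fires(rule: Rule, facts: set) -> bool:
    """A rule fires if its if-conditions hold and no unless condition is known true."""
    return rule.if_all <= facts and not (rule.unless & facts)


def is_censor_for(candidate: Rule, target: Rule) -> bool:
    """A rule acts as a censor for another when it concludes one of its unless conditions."""
    return candidate.then in target.unless


# Hypothetical example rules.
can_fly = Rule(frozenset({"is-bird"}), "can-fly", frozenset({"is-penguin"}))
penguin = Rule(frozenset({"is-bird", "swims-not-flies"}), "is-penguin")
```

With the facts {"is-bird"} the can_fly rule fires; once "is-penguin" is among the facts in hand, the same rule is blocked.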

AIM-677

Author[s]: Boris Katz and Patrick H. Winston

Parsing and Generating English Using Commutative Transformations

May 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-677.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-677.pdf

This paper is about an implemented natural language interface that translates from English into semantic net relations and from semantic net relations back into English. The parser and companion generator were implemented for two reasons: (a) to enable experimental work in support of a theory of learning by analogy; (b) to demonstrate the viability of a theory of parsing and generation built on commutative transformations. The learning theory was shaped to a great degree by experiments that would have been extraordinarily tedious to perform without the English interface with which the experimental data base was prepared, revised, and revised again. Inasmuch as current work on the learning theory is moving toward a tenfold increase in data-base size, the English interface is moving from a facilitating role to an enabling one. The parsing and generation theory has two particularly important features: (a) the same grammar is used for both parsing and generation; (b) the transformations of the grammar are commutative. The language generation procedure converts a semantic network fragment into kernel frames, chooses the set of transformations that should be performed upon each frame, executes the specified transformations, combines the altered kernels into a sentence, performs a pronominalization process, and finally produces the appropriate English word string. Parsing is essentially the reverse of generation. The first step in the parsing process is splitting a given sentence into a set of kernel clauses along with a description of how those clauses are hierarchically related to each other. The clauses are used to produce a matrix of embedded kernel frames, which in turn supply arguments to relation-creating functions. The evaluation of the relation-creating functions results in the construction of the semantic net fragments.

AIM-676

Author[s]: Kent A. Stevens

Implementation of a Theory for Inferring Surface Shape from Contours

August 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-676.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-676.pdf

Human vision is adept at inferring the shape of a surface from the image of curves lying across the surface. The strongest impression of 3-D shape derives from parallel (but not necessarily equally spaced) contours. In [Stevens 1981a] the computational problem of inferring 3-D shape from image configurations is examined, and a theory is given for how the visual system constrains the problem by certain assumptions. The assumptions are three: that neither the viewpoint nor the placement of the physical curves on the surface is misleading, and that the physical curves are lines of curvature across the surface. These assumptions imply that parallel image contours correspond to parallel curves lying across an approximately cylindrical surface. Moreover, lines of curvature on a cylinder are geodesic and planar. These properties provide strong constraint on the local surface orientation. We describe a computational method embodying these geometric constraints that is able to determine the surface orientation even in places where locally it is very weakly constrained, by extrapolating from places where it is strongly constrained. This computation has been implemented, and predicts local surface orientation that closely matches the apparent orientation. Experiments with the implementation support the theory that our visual interpretation of surface shape from contour assumes the image contours correspond to lines of curvature.

AIM-675

Author[s]: Tomaso Poggio, Kenneth Nielsen and Keith Nishihara

Zero-Crossings and Spatiotemporal Interpretation in Vision

May 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-675.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-675.pdf

We will briefly outline a computational theory of the first stages of human vision according to which (a) the retinal image is filtered by a set of centre-surround receptive fields (of about 5 different spatial sizes) which are approximately bandpass in spatial frequency and (b) zero-crossings are detected independently in the output of each of these channels. Zero-crossings in each channel are then a set of discrete symbols which may be used for later processing such as contour extraction and stereopsis. A formulation of Logan’s zero-crossing results is proved for the case of Fourier polynomials and an extension of Logan’s theorem to 2-dimensional functions is also proved. Within this framework, we shall describe an experimental and theoretical approach (developed by one of us with M. Fahle) to the problem of visual acuity and hyperacuity of human vision. The positional accuracy achieved, for instance, in reading a vernier is astonishingly high, corresponding to a fraction of the spacing between adjacent photoreceptors in the fovea. Stroboscopic presentation of a moving object can be interpolated by our visual system into the perception of continuous motion; and this “spatio-temporal” interpolation also can be very accurate. It is suggested that the known spatiotemporal properties of the channels envisaged by the theory of visual processing outlined above implement an interpolation scheme which can explain human vernier acuity for moving targets.
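
Steps (a) and (b) of the theory can be illustrated in one dimension. The sketch below is only a toy under stated assumptions: it uses a difference-of-Gaussians kernel as a stand-in for a centre-surround, approximately bandpass channel, and the function names, the 1.6 width ratio, the kernel radius, and the edge-replication boundary rule are all choices made for this example rather than details taken from the memo.

```python
import math


def dog_kernel(sigma: float, ratio: float = 1.6) -> list:
    """Difference of two Gaussians: narrow centre minus wider surround (assumed form)."""
    radius = int(math.ceil(4 * sigma * ratio))

    def gauss(x: float, s: float) -> float:
        return math.exp(-x * x / (2 * s * s)) / (s * math.sqrt(2 * math.pi))

    return [gauss(x, sigma) - gauss(x, sigma * ratio)
            for x in range(-radius, radius + 1)]


def filter_signal(signal: list, kernel: list) -> list:
    """Apply the (symmetric) kernel, replicating the end samples at the borders."""
    r = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - r, 0), n - 1)
            acc += w * signal[j]
        out.append(acc)
    return out


def zero_crossings(values: list, eps: float = 1e-9) -> list:
    """Indices i where the filtered output changes sign between i and i + 1."""
    return [i for i in range(len(values) - 1)
            if (values[i] > eps and values[i + 1] < -eps)
            or (values[i] < -eps and values[i + 1] > eps)]


# One channel applied to a step edge between samples 19 and 20.
step = [0.0] * 20 + [1.0] * 20
crossings = zero_crossings(filter_signal(step, dog_kernel(sigma=2.0)))
```

Running several channels would amount to repeating the last line with different sigma values and keeping each channel's crossings separate, as the abstract describes.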

AIM-674

Author[s]: Rodney A. Brooks

Solving the Find-Path Problem by Representing Free Space as Generalized Cones

May 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-674.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-674.pdf

Free space is represented as a union of (possibly overlapping) generalized cones. An algorithm is presented which efficiently finds good collision free paths for convex polygonal bodies through space littered with obstacle polygons. The paths are good in the sense that the distance of closest approach to an obstacle over the path is usually far from minimal over the class of topologically equivalent collision free paths. The algorithm is based on characterizing the volume swept by a body as it is translated and rotated as a generalized cone and determining under what conditions one generalized cone is a subset of another.

AIM-673

Author[s]: Ken Haase

CAULDRONS: An Abstraction for Concurrent Problem Solving

September 1986

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-673.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-673.pdf

This research extends a tradition of distributed theories of mind into the implementation of a distributed problem solver. In this problem solver a number of ideas from Minsky's Society of Mind are implemented and are found to provide powerful abstractions for the programming of distributed systems. These abstractions are the cauldron, a mechanism for instantiating reasoning contexts, the frame, a way of modularly describing those contexts and the goal-node, a mechanism for bringing a particular context to bear on a specific task. The implementation of both these abstractions and the distributed problem solver in which they run is described, accompanied by examples of their application to various domains.

AIM-672

Author[s]: Daniel G. Theriault

A Primer for the Act-1 Language

April 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-672.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-672.pdf

This paper describes the current design for the Act-1 computer programming language and describes the Actor computational model, which the language was designed to support. It provides a perspective from which to view the language, with respect to existing computer language systems and to the computer system and environment under development for support of the language. The language is informally introduced in a tutorial fashion and demonstrated through examples. A programming strategy for the language is described, further illustrating its use.

AIM-671

Author[s]: Demetri Terzopoulos

Multi-Level Reconstruction of Visual Surfaces: Variational Principles and Finite Element Representations

April 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-671.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-671.pdf

Computational modules early in the human vision system typically generate sparse information about the shapes of visible surfaces in the scene. Moreover, visual processes such as stereopsis can provide such information at a number of levels spanning a range of resolutions. In this paper, we extend this multi-level structure to encompass the subsequent task of reconstructing full surface descriptions from the sparse information. The mathematical development proceeds in three steps. First, the surface most consistent with the sparse constraints is characterized as the solution to an equilibrium state of a thin flexible plate. Second, local, finite element representations of surfaces are introduced and, by applying the finite element method, the continuous variational principle is transformed into a discrete problem in the form of a large system of linear algebraic equations whose solution is computable by local-support, cooperative mechanisms. Third, to exploit the information available at each level of resolution, a hierarchy of discrete problems is formulated and a highly efficient multi-level algorithm, involving both intra-level relaxation processes and bi-directional inter-level local interpolation processes, is applied to their simultaneous solution. Examples of the generation of hierarchies of surface representations from stereo constraints are given. Finally, the basic surface approximation problem is revisited in a broader mathematical context whose implications are of relevance to vision.

AIM-670

Author[s]: Steven W. Zucker, Kent A. Stevens and Peter T. Sander

The Relation Between Proximity and Brightness Similarity in Dot Patterns

May 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-670.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-670.pdf

The Gestalt studies demonstrated the tendency to visually organize dots on the basis of similarity, proximity, and global properties such as closure, good continuation, and symmetry. The particular organization imposed on a collection of dots is thus determined by many factors, some local, some global. We discuss computational reasons for expecting the initial stages of grouping to be achieved by processes with purely local support. In the case of dot patterns, the expectation is that neighboring dots are grouped on the basis of proximity and similarity of contrast, by processes that are independent of the overall organization and the various global factors. We describe experiments that suggest a purely local relationship between proximity and brightness similarity in perceptual grouping.

AIM-668

Author[s]: W. Richards, H.K. Nishihara and B. Dawson

CARTOON: A Biologically Motivated Edge Detection Algorithm

June 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-668.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-668.pdf

Caricatures demonstrate that only a few significant “edges” need to be captured to convey the meaning of a complex pattern of image intensities. The most important of these “edges” are image intensity changes arising from surface discontinuities or occluding boundaries. The CARTOON algorithm is an attempt to locate these special intensity changes using a modification of the zero-crossing coincidence scheme suggested by Marr and Hildreth (1980).

AIM-667

Author[s]: David Allen McAllester

Reasoning Utility Package User's Manual, Version One

April 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-667.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-667.pdf

RUP (Reasoning Utility Package) is a collection of procedures for performing various computations relevant to automated reasoning. RUP contains a truth maintenance system (TMS) which can be used to perform simple propositional deduction (unit clause resolution), to record justifications, to track down underlying assumptions and to perform incremental modifications when premises are changed. This TMS can be used with an automatic premise controller which automatically retracts “assumptions” before “solid facts” when contradictions arise and searches for the most solid proof of an assertion. RUP also contains a procedure for efficiently computing all the relevant consequences of any set of equalities between ground terms. A related utility computes “substitution simplifications” of terms under an arbitrary set of unquantified equalities and a user defined simplicity order. RUP also contains demon writing macros which allow one to write PLANNER-like demons that trigger on various types of events in the data base. Finally there is a utility for reasoning about partial orders and arbitrary transitive relations. In writing all of these utilities an attempt has been made to provide a maximally flexible environment for automated reasoning.
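
Unit clause resolution with recorded justifications can be sketched as follows. This is a generic illustration, not RUP's interface: the function name, the integer encoding of literals (a positive integer for a variable, its negation for the negated literal), and the return convention are assumptions made for this example, and none of RUP's premise control or incremental retraction is modelled.

```python
from typing import Dict, List, Optional, Tuple


def unit_propagate(
    clauses: List[List[int]],
) -> Optional[Tuple[Dict[int, bool], Dict[int, Tuple[int, ...]]]]:
    """Repeatedly assign the last open literal of any clause whose other literals are false.

    Returns (assignment, justification), where justification maps each
    derived variable to the clause that forced it, or None on contradiction.
    """
    assignment: Dict[int, bool] = {}
    justification: Dict[int, Tuple[int, ...]] = {}
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            unassigned = []
            satisfied = False
            for lit in clause:
                var, wanted = abs(lit), lit > 0
                if var not in assignment:
                    unassigned.append(lit)
                elif assignment[var] == wanted:
                    satisfied = True
                    break
            if satisfied:
                continue
            if not unassigned:
                return None  # every literal is false: contradiction
            if len(unassigned) == 1:
                lit = unassigned[0]
                assignment[abs(lit)] = lit > 0
                justification[abs(lit)] = tuple(clause)
                changed = True
    return assignment, justification
```

For example, the clauses [[1], [-1, 2], [-2, 3]] force variables 1, 2 and 3 to be true, and the justification for variable 3 is the clause (-2, 3); following justifications backwards is one simple way to track a conclusion to the premises beneath it.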

AIM-666

Author[s]: Michael Brady and W. Eric L. Grimson

The Perception of Subjective Surfaces

November 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-666.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-666.pdf

It is proposed that subjective contours are an artifact of the perception of natural three-dimensional surfaces. A recent theory of surface interpolation implies that “subjective surfaces” are constructed in the visual system by interpolation between three-dimensional values arising from interpretation of a variety of surface cues. We show that subjective surfaces can take any form, including singly and doubly curved surfaces, as well as the commonly discussed fronto-parallel planes. In addition, it is necessary in the context of computational vision to make explicit the discontinuities, both in depth and in surface orientation, in the surfaces constructed by interpolation. It is proposed that subjective surfaces and subjective contours are demonstrated. The role played by figure completion and enhanced brightness contrast in the determination of subjective surfaces is discussed. All considerations of surface perception apply equally to subjective surfaces.

AIM-665

Author[s]: Randall Davis

Expert Systems: Where Are We? And Where Do We Go from Here?

June 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-665.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-665.pdf

Work on Expert Systems has received extensive attention recently, prompting growing interest in a range of environments. Much has been made of the basic concept and the rule-based system approach typically used to construct the programs. Perhaps this is a good time then to review what we know, assess the current prospects, and suggest directions appropriate for the next steps of basic research. I’d like to do that today and propose to do it by taking you on a journey of sorts, a metaphorical trip through the State of the Art of Expert Systems. We’ll wander about the landscape, ranging from the familiar territory of the Land of Accepted Wisdom, to the vast unknowns at the Frontiers of Knowledge. I guarantee we’ll all return safely, so come along…

AIM-664A

Author[s]: Kenneth D. Forbus

Qualitative Process Theory

May 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-664a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-664a.pdf

Things move, collide, flow, bend, heat up, cool down, stretch, break and boil. These and other things that happen to cause changes in objects over time are intuitively characterized as processes. To understand common sense physical reasoning and make machines that interact significantly with the physical world we must understand the qualitative reasoning about processes, their effects, and their limits. Qualitative Process theory defines a simple notion of physical process that appears quite useful as a language in which to write physical theories. Reasoning about processes also motivates a new qualitative representation for quantity, the Quantity Space. This paper includes the basic definitions of Qualitative Process theory, describes several different kinds of reasoning that can be performed with them, and discusses its implications for causal reasoning. The use of the theory is illustrated by several examples, including figuring out that a boiler can blow up, that an oscillator with friction will eventually stop, and how to say that you can pull with a string, but not push with it.

AIM-663

Author[s]: W.E.L. Grimson

The Implicit Constraints of the Primal Sketch

October 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-663.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-663.pdf

Computational theories of structure-from-motion and stereo vision only specify the computation of three-dimensional surface information at points in the image at which the irradiance changes. Yet, the visual perception is clearly of complete surfaces, and this perception is consistent for different observers. Since mathematically the class of surfaces which could pass through the known boundary points provided by the stereo system is infinite and contains widely varying surfaces, the visual system must incorporate some additional constraints besides the known points in order to compute the complete surface. Using the image irradiance equation, we derive the surface consistency constraint, informally referred to as no news is good news. The constraint implies that the surface must agree with the information from stereo or motion correspondence, and not vary radically between these points. An explicit form of this surface consistency constraint is derived, by relating the probability of a zero-crossing in a region of the image to the variation in the local surface orientation of the surface, provided that the surface albedo and the illumination are roughly constant. The surface consistency constraint can be used to derive an algorithm for reconstructing the surface that “best” fits the surface information provided by stereo or motion correspondence.

AIM-662

Author[s]: Anna R. Bruss and Berthold K.P. Horn

Passive Navigation

November 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-662.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-662.pdf

A method is proposed for determining the motion of a body relative to a fixed environment using the changing image seen by a camera attached to the body. The optical flow in the image plane is the input, while the instantaneous rotation and translation of the body are the output. If optical flow could be determined precisely, it would only have to be known at a few places to compute the parameters of the motion. In practice, however, the measured optical flow will be somewhat inaccurate. It is therefore advantageous to consider methods which use as much of the available information as possible. We employ a least-squares approach which minimizes some measure of the discrepancy between the measured flow and that predicted from the computed motion parameters. Several different error norms are investigated. In general, our algorithm leads to a system of nonlinear equations from which the motion parameters may be computed numerically. However, in the special cases where the motion of the camera is purely translational or purely rotational, use of the appropriate norm leads to a system of equations from which these parameters can be determined in closed form.
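
For the purely rotational case, a least-squares estimate can be written down directly because the flow is linear in the rotation parameters. The sketch below is an illustration under assumptions rather than the memo's derivation: it assumes a unit focal length, one common sign convention for the rotational flow field, a plain sum-of-squares error, and hypothetical function names. With a different convention some signs would change, but the structure of the normal equations would not.

```python
def rotational_flow(x, y, omega):
    """Image flow (u, v) at image point (x, y) for rotation omega = (a, b, c); assumed convention."""
    a, b, c = omega
    u = a * x * y - b * (x * x + 1.0) + c * y
    v = a * (y * y + 1.0) - b * x * y - c * x
    return u, v


def solve3(m, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    aug = [row[:] + [r] for row, r in zip(m, rhs)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            f = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= f * aug[col][c]
    sol = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = aug[r][3] - sum(aug[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / aug[r][r]
    return sol


def estimate_rotation(samples):
    """Least-squares rotation from samples (x, y, u, v) via the normal equations."""
    ata = [[0.0] * 3 for _ in range(3)]
    atb = [0.0] * 3
    for x, y, u, v in samples:
        equations = (
            ((x * y, -(x * x + 1.0), y), u),   # row for the u component
            ((y * y + 1.0, -x * y, -x), v),    # row for the v component
        )
        for coeffs, observed in equations:
            for i in range(3):
                atb[i] += coeffs[i] * observed
                for j in range(3):
                    ata[i][j] += coeffs[i] * coeffs[j]
    return solve3(ata, atb)


# Synthetic check: generate flow from a known rotation and recover it.
true_omega = (0.01, -0.02, 0.03)
grid = [(x, y) for x in (-0.5, 0.0, 0.5) for y in (-0.5, 0.0, 0.5)]
samples = [(x, y) + rotational_flow(x, y, true_omega) for x, y in grid]
estimate = estimate_rotation(samples)
```

With noisy flow the same code returns the least-squares compromise over all samples, which is the reason the abstract gives for using as much of the flow field as possible.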

AIM-661

Author[s]: John M. Hollerbach

Workshop on the Design and Control of Dextrous Hands

April 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-661.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-661.pdf

The Workshop for the Design and Control of Dexterous Hands was held at the MIT Artificial Intelligence Laboratory on November 5-6, 1981. Outside experts were brought together to discuss four topics: kinematics of hands, actuation and materials, touch sensing and control. This report summarizes the discussions of the participants and attempts to identify a consensus on applications, mechanical design, and control.

AIM-660

Author[s]: Whitman Richards

How to Play Twenty Questions with Nature and Win

December 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-660.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-660.pdf

The 20 Questions Game played by children has an impressive record of rapidly guessing an arbitrarily selected object with rather few, well-chosen questions. This same strategy can be used to drive the perceptual process, likewise beginning the search with the intent of deciding whether the object is Animal-Vegetable-or-Mineral. For a perceptual system, however, several simple questions are required even to make this first judgment as to the Kingdom to which the object belongs. Nevertheless, the answers to these first simple questions, or their modular outputs, provide a rich data base which can serve to classify objects or events in much more detail than one might expect, thanks to constraints and laws imposed upon natural processes and things. The questions, then, suggest a useful set of primitive modules for initializing perception.

AIM-657

Author[s]: T. Poggio and C. Koch

Nonlinear Interactions in a Dendritic Tree: Localization, Timing and Role in Information Processing

September 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-657.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-657.pdf

In a dendritic tree transient synaptic inputs activating ionic conductances with an equilibrium potential near the resting potential can veto very effectively other excitatory inputs. Analog operations of this type can be very specific with respect to relative locations of the inputs and their timing. We examine with computer experiments the precise conditions underlying this effect in the case of b-like cat retinal ganglion cell. The critical condition required for strong and specific interactions is that the peak inhibitory conductance change must be sufficiently large almost independently of other electrical parameters. In this case, a passive dendritic tree may perform hundreds of independent analog operations on its synaptic inputs, without requiring any threshold mechanism.

AIM-656

Author[s]: Henry Lieberman

Seeing What Your Programs Are Doing

February 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-656.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-656.pdf

An important skill in programming is being able to visualize the operation of procedures, both for constructing programs and debugging them. Tinker is a programming environment for Lisp that enables the programmer to “see what the program is doing” while the program is being constructed, by displaying the result of each step in the program on representative examples. To help the reader visualize the operation of Tinker itself, an example is presented of how he or she might use Tinker to construct an alpha-beta tree search program.

AIM-654

Author[s]: Michael Brady and Berthold K.P. Horn

Rotationally Symmetric Operators for Surface Interpolation

November 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-654.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-654.pdf

The use of rotationally symmetric operators in vision is reviewed and conditions for rotational symmetry are derived for linear and quadratic forms in the first and second partial directional derivatives of a function f(x,y). Surface interpolation is considered to be the process of computing the most conservative solution consistent with boundary conditions. The “most conservative” solution is modeled using the calculus of variations to find the minimum function that satisfies a given performance index. To guarantee the existence of a minimum function, Grimson has recently suggested that the performance index should be a semi-norm. It is shown that all quadratic forms in the second partial derivatives of the surface satisfy this criterion. The seminorms that are, in addition, rotationally symmetric form a vector space whose basis is the square Laplacian and the quadratic variation. Whereas both seminorms give rise to the same Euler condition in the interior, the quadratic variation offers the tighter constraint at the boundary and is to be preferred for surface interpolation.
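
For reference, the two performance indices named in this abstract are usually written as below, in standard notation that is assumed here rather than quoted from the memo. Their integrands differ only by a term that can be written as a divergence, which is one way to see why both lead to the same Euler condition in the interior while differing at the boundary.

```latex
% Square Laplacian
\iint \left( f_{xx} + f_{yy} \right)^2 \, dx \, dy

% Quadratic variation
\iint \left( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \right) dx \, dy

% Difference of the two integrands
\left( f_{xx} + f_{yy} \right)^2 - \left( f_{xx}^2 + 2 f_{xy}^2 + f_{yy}^2 \right)
  = 2 \left( f_{xx} f_{yy} - f_{xy}^2 \right)

% Shared Euler condition in the interior (biharmonic equation)
\nabla^4 f = f_{xxxx} + 2 f_{xxyy} + f_{yyyy} = 0
```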

AIM-653

Author[s]: Michael Brady

Computational Approaches to Image Understanding

October 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-653.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-653.pdf

Recent theoretical developments in Image Understanding are surveyed. Among the issues discussed are: edge finding, region finding, texture, shape from shading, shape from texture, shape from contour, and the representations of surfaces and objects. Much of the work described was developed in the DARPA Image Understanding project.

AIM-652

Author[s]: Robert Lawler

Some Powerful Ideas

December 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-652.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-652.pdf

Here is a set of problem solving ideas (absorbed by and developed through the MIT Logo project over many years) presented in such a way as to be useful to someone with a Logo computer. With the ideas on unbound, single sheets, you can easily pick out those you like and set aside the others. The ideas vary in sophistication and accessibility: no threshold, no ceiling.

AIM-651

Author[s]: David Chapman

A Program Testing Assistant

November 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-651.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-651.pdf

This paper describes the design and implementation of a program testing assistant which aids a programmer in the definition, execution, and modification of test cases during incremental program development. The testing assistant helps in the interactive definition of test cases and executes them automatically when appropriate. It modifies test cases to preserve their usefulness when the program they test undergoes certain types of design changes. The testing assistant acts as a fully integrated part of the programming environment and cooperates with existing programming tools, including a display editor, compiler, interpreter, and debugger.

AIM-650

Author[s]: T. Poggio and V. Torre

Microelectronics In Nerve Cells: Dendritic Morphology and Information Processing

October 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-650.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-650.pdf

The electrical properties of the different anatomical types of retinal ganglion cells in the cat were calculated on the basis of passive cable theory from measurements made on histological material provided by Boycott and Wassle (1974). The interactions between excitation and inhibition when the inhibitory battery is near the resting potential can be strongly nonlinear in these cells. We analyse some of the integrative properties of an arbitrary passive dendritic tree and we then derive the functional properties which are characteristic for the various types of ganglion cells. In particular, we derive several general results concerning the spatial specificity of shunting inhibition in “vetoing” an excitatory input (the “on path” property) and its dependence on the geometrical and electric properties of the dendritic tree. Our main conclusion is that specific branching patterns coupled with a suitable distribution of synapses are able to support complex information processing operations on the incoming signals. Thus, a neuron seems likely to resemble an (analog) LSI circuit with thousands of elementary processing units – the synapses – rather than a single logical gate. A dendritic tree would be near to the ultimate in microelectronics with little patches of postsynaptic membrane representing the fundamental units for several elementary computations.

AITR-649

Author[s]: Michael Dennis Riley

The Representation of Image Texture

September 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-649.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-649.pdf

This thesis explores how to represent image texture in order to obtain information about the geometry and structure of surfaces, with particular emphasis on locating surface discontinuities. Theoretical and psychophysical results lead to the following conclusions for the representation of image texture: (1) A texture edge primitive is needed to identify texture change contours, which are formed by an abrupt change in the 2-D organization of similar items in an image. The texture edge can be used for locating discontinuities in surface structure and surface geometry and for establishing motion correspondence. (2) Abrupt changes in attributes that vary with changing surface geometry – orientation, density, length, and width – should be used to identify discontinuities in surface geometry and surface structure. (3) Texture tokens are needed to separate the effects of different physical processes operating on a surface. They represent the local structure of the image texture. Their spatial variation can be used in the detection of texture discontinuities and texture gradients, and their temporal variation may be used for establishing motion correspondence. What precisely constitutes the texture tokens is unknown; it appears, however, that the intensity changes alone will not suffice, but local groupings of them may. (4) The above primitives need to be assigned rapidly over a large range in an image.

AIM-648

Author[s]: W.A. Richards

A Lightness Scale from Image Intensity Distributions

August 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-648.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-648.pdf

A lightness scale is derived from a theoretical estimate of the probability distribution of image intensities for natural scenes. The derived image intensity distribution considers three factors: reflectance, surface orientation and illumination, and surface texture (or roughness). The convolution of the effects of these three factors yields the theoretical probability distribution of image intensities. A useful lightness scale should be the integral of this probability density function for then equal intervals along the scale are equally probable and carry equal information. The result is a scale similar to that used in photography, or by the nervous system as its transfer function.
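As a rough illustration of the integral relationship in the abstract above, the following Python sketch builds a scale as the cumulative distribution of a set of intensities, so that equal steps along the scale cover equally probable intensity ranges. The function name `lightness_scale` and the sample values are illustrative assumptions, not taken from the memo, and an empirical distribution stands in for the memo's theoretical one.

```python
from bisect import bisect_right

def lightness_scale(samples):
    """Return a function mapping an intensity to the fraction of samples <= it.

    This is the empirical cumulative distribution, i.e. the integral of the
    (empirical) probability density of the intensities.
    """
    ordered = sorted(samples)
    n = len(ordered)

    def scale(intensity):
        # Count how many samples are at or below this intensity.
        return bisect_right(ordered, intensity) / n

    return scale

# Hypothetical intensity samples, chosen only for illustration.
samples = [1, 2, 2, 3, 5, 8, 13, 21]
scale = lightness_scale(samples)
```

With these assumed samples, `scale(3)` is the fraction of samples at or below 3, and the scale runs from 0 below the darkest sample to 1 at the brightest.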

AIM-647

Author[s]: Marvin Minsky

Nature Abhors an Empty Vacuum

August 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-647.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-647.pdf

Imagine a crystalline world of tiny, discrete “cells”, each knowing only what its nearest neighbors do. Each volume of space contains only a finite amount of information, because space and time come in discrete units. In such a universe, we’ll construct analogs of particles and fields – and ask what it would mean for these to satisfy constraints like conservation of momentum. In each case classical mechanics will break down – on scales both small and large, and strange phenomena emerge: a maximal velocity, a slowing of internal clocks, a bound on simultaneous measurement, and quantum-like effects in very weak or intense fields. This fantasy about conservation in cellular arrays was inspired by this first conference on computation and physics, a subject destined to produce profound and powerful theories. I wish this essay could include one such; alas, it only portrays images of what such theories might be like. The “cellular array” idea is popular already in such forms as Ising models, renormalization theories, the “Game of Life” and Von Neumann’s work on self-reproducing machines. This essay exploits many unpublished ideas I got from Edward Fredkin. The ideas about field and particle are original; Richard Feynman persuaded me to consider fields instead of forces, but is not responsible for my compromise on potential surfaces. I also thank Danny Hillis and Richard Stallman for other ideas.

AIM-646

Author[s]: W. Daniel Hillis

The Connection Machine

September 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-646.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-646.pdf

This paper describes the connection memory, a machine for concurrently manipulating knowledge stored in semantic networks. We need the connection memory because conventional serial computers cannot move through such networks fast enough. The connection memory sidesteps the problem by providing processing power proportional to the size of the network. Each node and link in the network has its own simple processor. These connect to form a uniform locally-connected network of perhaps a million processor/memory cells.

AIM-645

Author[s]: Tomaso Poggio

Marr's Approach to Vision

August 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-645.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-645.pdf

In the last seven years a new computational approach has led to promising advances in the understanding of biological visual perception. The foundations of the approach are largely due to the work of a single man, David Marr at M.I.T. Now, after his death in Boston on November 17th 1980, research in vision will not be the same for the growing number of those who are following his lead.

AIM-644

Author[s]: Richard M. Stallman

The SUPDUP Protocol

July 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-644.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-644.pdf

The SUPDUP protocol provides for login to a remote system over a network with terminal-independent output, so that only the local system need know how to handle the user’s terminal. It offers facilities for graphics and for local assistance to remote text editors. This memo contains a complete description of the SUPDUP protocol in fullest possible detail.

AIM-643

Author[s]: Richard M. Stallman

A Local Front End for Remote Editing

February 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-643.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-643.pdf

The Local Editing Protocol allows a local programmable terminal to execute the most common editing commands on behalf of an extensible text editor on a remote system, thus greatly improving speed of response without reducing flexibility. The Line Saving Protocol allows the local system to save text which is not displayed, and display it again later when it is needed, under the control of the remote editor. Both protocols are substantially system and editor independent.

AIM-642

Author[s]: Giuseppe Attardi and Maria Simi

Semantics of Inheritance and Attributions in the Description System Omega

August 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-642.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-642.pdf

Omega is a description system for knowledge embedding which incorporates some of the attractive modes of expression in common sense reasoning such as descriptions, inheritance, quantification, negation, attributions and multiple viewpoints. A formalization of Omega is developed as a framework for investigations on the foundations of knowledge representation. As a logic, Omega achieves the goal of an intuitively sound and consistent theory of classes which permits unrestricted abstraction within a powerful logic system. Description abstraction is the construct provided in Omega corresponding to set abstraction. Attributions and inheritance are the basic mechanisms for knowledge structuring. To achieve flexibility and incrementality, the language allows descriptions with an arbitrary number of attributions, rather than predicates with a fixed number of arguments as in predicate logic. This requires a peculiar interpretation for instance descriptions, which in turn provides insights into the use and meaning of several kinds of attributions. The formal treatment consists in presenting semantic models for Omega, deriving an axiomatization and establishing the consistency and completeness of the logic.

AIM-641

Author[s]: William A. Kornfeld and Carl Hewitt

The Scientific Community Metaphor

January 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-641.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-641.pdf

Scientific communities have proven to be extremely successful at solving problems. They are inherently parallel systems and their macroscopic nature makes them amenable to careful study. In this paper the character of scientific research is examined drawing on sources in the philosophy and history of science. We maintain that the success of scientific research depends critically on its concurrency and pluralism. A variant of the language Ether is developed that embodies notions of concurrency necessary to emulate some of the problem solving behavior of scientific communities. Capabilities of scientific communities are discussed in parallel with simplified models of these capabilities in this language.

AIM-640

Author[s]: Laurence Miller

Natural Learning

October 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-640.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-640.pdf

This memo reports the results of a case study into how children learn in the absence of explicit teaching. The three subjects, an eight year old, a ten year old and a thirteen year old, were observed in both of two experimental micro-worlds. The first of these micro-worlds, called the Chemicals World, included a large table, a collection of laboratory and household chemicals, and apparatus for conducting experiments with chemicals; the second, called the Mork and Mindy World, included a collection of video taped episodes of the television series Mork and Mindy, a video-tape machine and an experimenter with whom the subjects could discuss the episodes. The main result of the study is a theory of how children’s interests interact with knowledge embodied in their environment causing them to learn new powerful ideas. An early version of this theory is presented in chapter five.

AIM-638

Author[s]: Daniel G. Shapiro

Sniffer: A System that Understands Bugs

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-638.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-638.pdf

This paper presents a bug understanding system, called sniffer, which applies inspection methods to generate a deep understanding of a narrow class of errors. Sniffer is an interactive debugging aid. It can locate and identify error-containing implementations of typical programming clichés, and it can describe them using the terminology employed by expert programmers.

AIM-637

Author[s]: Kent A. Stevens

Evidence Relating Subjective Contours and Interpretations Involving Occlusion

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-637.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-637.pdf

Subjective contours, according to one theory, outline surfaces that are apparently interposed between the viewer and background (because of the disruption of background figures, sudden termination of lines, and other occlusion “cues”) but are not explicitly outlined by intensity discontinuities. This theory predicts that if occlusion cues are not interpreted as evidence of occlusion, no intervening surface need be postulated, hence no subjective contours would be seen. This prediction, however, is difficult to test because observers normally interpret the cues as occlusion evidence and normally see the subjective contours. This article describes a patient with visual agnosia who is both unable to make the usual occlusion interpretations and is unable to see subjective contours. He has, however, normal ability to interpret standard visual illusions, stereograms, and in particular, stereogram versions of the standard subjective contour figures, which elicit in him strong subjective edges in depth (corresponding to the subjective contours viewed in the monocular versions of the figures).

AITR-636

Author[s]: Barbara Sue Kerne Steele

An Accountable Source-To-Source Transformation System

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-636.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-636.pdf

Though one is led to believe that program transformation systems which perform source-to-source transformations enable the user to understand and appreciate the resulting source program, this is not always the case. Transformations are capable of behaving and/or interacting in unexpected ways. The user who is interested in understanding the whats, whys, wheres, and hows of the transformation process is left without tools for discovering them. I provide an initial step towards the solution of this problem in the form of an accountable source-to-source transformation system. It carefully records the information necessary to answer such questions, and provides mechanisms for the retrieval of this information. It is observed that though this accountable system allows the user access to relevant facts from which he may draw conclusions, further study is necessary to make the system capable of analyzing these facts itself.

AIM-635

Author[s]: John M. Hollerbach and Tamar Flash

Dynamic Interactions Between Limb Segments During Planar Arm Movement

November 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-635.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-635.pdf

Movement of multiple segment limbs requires generation of appropriate joint torques which include terms arising from dynamic interactions among the moving segments as well as from such external forces as gravity. The interaction torques, arising from inertial, centripetal, and Coriolis forces, are not present for single joint movements. The significance of the individual interaction forces during reaching movements in a horizontal plane involving only the shoulder and elbow joints has been assessed for different movement paths and movement speeds. Trajectory formation strategies which simplify the dynamics computation are presented.

AIM-634

Author[s]: Charles Rich and Richard C. Waters

Abstraction, Inspection and Debugging in Programming

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-634.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-634.pdf

We believe that software engineering has much to learn from other mature engineering disciplines, such as electrical engineering, and that the problem solving behaviors of engineers in different disciplines have many similarities. Three key ideas in current artificial intelligence theories of engineering problem solving are: Abstraction – using a simplified view of the problem to guide the problem solving process. Inspection – problem solving by recognizing the form (“plan”) of a solution. Debugging – incremental modification of an almost satisfactory solution to a more satisfactory one. These three techniques are typically used together in a paradigm which we call AID (for Abstraction, Inspection, Debugging): First an abstract model of the problem is constructed in which some important details are intentionally omitted. In this simplified view inspection methods are more likely to succeed, yielding the initial form of a solution. Further details of the problem are then added one at a time with corresponding incremental modifications to the solution. This paper states the goals and milestones of the remaining three years of a five year research project to study the fundamental principles underlying the design and construction of large software systems and to demonstrate the feasibility of a computer aided design tool for this purpose, called the programmer’s apprentice.

AITR-633

Author[s]: William Douglas Clinger

Foundations of Actor Semantics

May 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-633.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-633.pdf

The actor message-passing model of concurrent computation has inspired new ideas in the areas of knowledge-based systems, programming languages and their semantics, and computer systems architecture. The model itself grew out of computer languages such as Planner, Smalltalk, and Simula, and out of the use of continuations to interpret imperative constructs within λ-calculus. The mathematical content of the model has been developed by Carl Hewitt, Irene Greif, Henry Baker, and Giuseppe Attardi. This thesis extends and unifies their work through the following observations. The ordering laws postulated by Hewitt and Baker can be proved using a notion of global time. The most general ordering laws are in fact equivalent to an axiom of realizability in global time. Independence results suggest that some notion of global time is essential to any model of concurrent computation. Since nondeterministic concurrency is more fundamental than deterministic sequential computation, there may be no need to take fixed points in the underlying domain of a power domain. Power domains built from incomplete domains can solve the problem of providing a fixed point semantics for a class of nondeterministic programming languages in which a fair merge can be written. The event diagrams of Greif's behavioral semantics, augmented by Baker's pending events, form an incomplete domain. Its power domain is the semantic domain in which programs written in actor-based languages are assigned meanings. This denotational semantics is compatible with behavioral semantics. The locality laws postulated by Hewitt and Baker may be proved for the semantics of an actor-based language. Altering the semantics slightly can falsify the locality laws. The locality laws thus constrain what counts as an actor semantics.

AIM-632

Author[s]: Patrick H. Winston

Learning New Principles from Precedents and Exercises: The Details

November 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-632.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-632.pdf

Much learning is done by way of studying precedents and exercises. A teacher supplies a story, gives a problem, and expects a student both to solve a problem and to discover a principle. The student must find the correspondence between the story and the problem, apply the knowledge in the story to solve the problem, generalize to form a principle, and index the principle so that it can be retrieved when appropriate. This sort of learning pervades management, political science, economics, law, and medicine as well as the development of common-sense knowledge about life in general. This paper presents a theory of how it is possible to learn by precedents and exercises and describes an implemented system that exploits the theory. The theory holds that causal relations identify the regularities that can be exploited from past experience, given a satisfactory representation for situations. The representation used stresses actors and objects which are taken from English-like input and arranged into a kind of semantic network. Principles emerge in the form of production rules which are expressed in the same way situations are.

AIM-631

Author[s]: John M. Rubin and W.A. Richards

Color Vision and Image Intensities: When Are Changes Material?

May 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-631.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-631.pdf

Marr has emphasized the difficulty in understanding a biological system or its components without some idea of its goals. In this paper, a preliminary goal for color vision is proposed and analyzed. That goal is to determine where changes of material occur in a scene (using only spectral information). This goal is challenging for two reasons. First, the effects of many processes (shadowing, shading from surface orientation changes, highlights, variations in pigment density) are confounded with the effects of material changes in the available image intensities. Second, material changes are essentially arbitrary. We are consequently led to a strategy of rejecting the presence of such confounding processes. We show there is a unique condition, the spectral crosspoint, that allows rejection of the hypothesis that measured image intensities arise from one of the confounding processes. (If plots are made of image intensity versus wavelength from two image regions, and the plots intersect, we say that there is a spectral crosspoint.) We restrict our attention to image intensities measured from regions on opposite sides of an edge because material changes almost always cause edges. Also, by restricting our attention to luminance discontinuities, we can avoid peculiar conspiracies of confounding processes that might mimic a material change. Our crosspoint conjecture is that biological visual systems interpret spectral crosspoints across edges as material changes. A circularly symmetric operator is designed to detect crosspoints: it turns out to resemble the double-opponent cell which is commonplace in biological color vision systems.
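The crosspoint condition defined in the abstract above (two intensity-versus-wavelength plots that intersect) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both regions are sampled at the same wavelengths, a sign change in their difference is taken to mean the plots cross, and the function name `has_spectral_crosspoint` and the sample values are hypothetical rather than taken from the memo.

```python
def has_spectral_crosspoint(region_a, region_b):
    """Return True if the two intensity-versus-wavelength plots cross.

    region_a, region_b: intensities sampled at the same wavelengths,
    listed in order of increasing wavelength.
    """
    diffs = [a - b for a, b in zip(region_a, region_b)]
    # Ignore wavelengths where the two plots merely touch.
    signs = [d > 0 for d in diffs if d != 0]
    # A change of sign between consecutive samples means the plots intersect.
    return any(s != t for s, t in zip(signs, signs[1:]))

# Hypothetical measurements at three wavelengths.
crossing = has_spectral_crosspoint([10, 20, 30], [20, 20, 20])
scaled = has_spectral_crosspoint([10, 20, 30], [5, 10, 15])
```

In the second call one region is a uniformly dimmer copy of the other, as uniform shading might produce, so no crosspoint is reported.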

AIM-629

Author[s]: William Daniel Hillis

Active Touch Sensing

April 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-629.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-629.pdf

The mechanical hand of the future will roll a screw between its fingers and sense, by touch, which end is which. This paper describes a step toward such a manipulator – a robot finger that is used to recognize small objects by touch. The device incorporates a novel imaging tactile sensor – an artificial skin with hundreds of pressure sensors in a space the size of a finger tip. The sensor is mounted on a tendon-actuated mechanical finger, similar in size and range of motion to a human index finger. A program controls the finger, using it to press and probe the object placed in front of it. Based on how the object feels, the program guesses its shape and orientation and then uses the finger to test and refine the hypothesis. The device is programmed to recognize commonly used fastening devices – nuts, bolts, flat washers, lock washers, dowel pins, cotter pins and set screws.

AIM-628

Author[s]: David A. Moon

Chaosnet

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-628.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-628.pdf

Chaosnet is a local network, that is, a system for communication among a group of computers located within about 1000 meters of each other. Originally developed by the Artificial Intelligence Laboratory as the internal communications medium of the Lisp Machine system, it has since come to be used to link a variety of machines around MIT and elsewhere.

AIM-627

Author[s]: William A. Kornfeld

The Use of Parallelism to Implement a Heuristic Search

March 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-627.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-627.pdf

The role of parallel processing in heuristic search is examined by means of an example (cryptarithmetic addition). A problem solver is constructed that combines the metaphors of constraint propagation and hypothesize-and-test. The system is capable of working on many incompatible hypotheses at one time. Furthermore, it is capable of allocating different amounts of processing power to running activities and changing these allocations as computation proceeds. It is empirically found that the parallel algorithm is, on the average, more efficient than a corresponding sequential one. Implications of this for problem solving in general are discussed.

AIM-626

Author[s]: Henry Lieberman

Thinking About Lots of Things at Once without Getting Confused: Parallelism in Act 1

May 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-626.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-626.pdf

As advances in computer architecture and changing economics make feasible machines with large-scale parallelism, Artificial Intelligence will require new ways of thinking about computation that can exploit parallelism effectively. We present the actor model of computation as being appropriate for parallel systems, since it organizes knowledge as active objects acting independently, and communicating by message passing. We describe the parallel constructs in our experimental actor interpreter Act 1. Futures create concurrency, by dynamically allocating processing resources much as Lisp dynamically allocates passive storage. Serializers restrict concurrency by constraining the order in which events take place, and have changeable local state. Using the actor model allows parallelism and synchronization to be implemented transparently, so that parallel or synchronized resources can be used as easily as their serial counterparts.

AIM-625

Author[s]: Henry Lieberman

A Preview of Act 1

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-625.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-625.pdf

The next generation of artificial intelligence programs will require the ability to organize knowledge as groups of active objects. Each object should have only its own local expertise, the ability to operate in parallel with other objects, and the ability to communicate with other objects. Artificial Intelligence programs will also require a great deal of flexibility, including the ability to support multiple representations of objects, and to incrementally and transparently replace objects with new, upward-compatible versions. To realize this, we propose a model of computation based on the notion of an actor, an active object that communicates by message passing. Actors blur the conventional distinction between data and procedures. The actor philosophy is illustrated by a description of our prototype actor interpreter Act 1.

AIM-624

Author[s]: Randall Davis and Reid G. Smith

Negotiation as a Metaphor for Distributed Problem Solving

May 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-624.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-624.pdf

We describe the concept of distributed problem solving and define it as the cooperative solution of problems by a decentralized and loosely coupled collection of problem solvers. This approach to problem solving offers the promise of increased performance and provides a useful medium for exploring and developing new problem-solving techniques. We present a framework called the contract net that specifies communication and control in a distributed problem solver. Task distribution is viewed as an interactive process, a discussion carried on between a node with a task to be executed and a group of nodes that may be able to execute the task. We describe the kinds of information that must be passed between nodes during the discussion in order to obtain effective problem-solving behavior. This discussion is the origin of the negotiation metaphor: Task distribution is viewed as a form of contract negotiation. We emphasize that protocols for distributed problem solving should help determine the content of the information transmitted, rather than simply provide a means of sending bits from one node to another. The use of the contract net framework is demonstrated in the solution of a simulated problem in area surveillance, of the sort encountered in ship or air traffic control. We discuss the mode of operation of a distributed sensing system, a network of nodes extending throughout a relatively large geographic area, whose primary aim is the formation of a dynamic map of traffic in the area. From the results of this preliminary study we abstract features of the framework applicable to problem solving in general, examining in particular transfer of control. Comparisons with PLANNER, CONNIVER, HEARSAY-II, and PUP6 are used to demonstrate that negotiation – the two-way transfer of information – is a natural extension to the transfer of control mechanisms used in earlier problem-solving systems.
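The task-distribution discussion described in the abstract above, between a node with a task and nodes that may be able to execute it, can be loosely sketched as a single announce, bid, and award round. This Python sketch is not the memo's protocol: the function name `negotiate`, the numeric cost bids, and the lowest-cost award rule are all assumptions made for illustration.

```python
def negotiate(task, nodes):
    """Run one round: announce `task`, collect bids, award to the best bidder.

    nodes: mapping of node name -> bid function, which returns a cost for
    the task or None to decline. Returns the winning node name, or None
    if no node bids.
    """
    bids = {}
    for name, bid in nodes.items():
        offer = bid(task)  # each node evaluates the announcement itself
        if offer is not None:
            bids[name] = offer
    if not bids:
        return None
    # Assumed award rule: the lowest-cost bid wins.
    return min(bids, key=bids.get)

# Hypothetical nodes: two bid, one declines.
nodes = {
    "node_a": lambda task: 5,
    "node_b": lambda task: 2,
    "node_c": lambda task: None,
}
winner = negotiate("track-vehicle", nodes)
```

Information flows both ways in this sketch: the announcement goes out to every node, and each node's own evaluation comes back as a bid before control is transferred.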

AITR-623

Author[s]: Anna R. Bruss

The Image Irradiance Equation: Its Solution and Application

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-623.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-623.pdf

How much information about the shape of an object can be inferred from its image? In particular, can the shape of an object be reconstructed by measuring the light it reflects from points on its surface? These questions were raised by Horn [HO70] who formulated a set of conditions such that the image formation can be described in terms of a first order partial differential equation, the image irradiance equation. In general, an image irradiance equation has infinitely many solutions. Thus constraints necessary to find a unique solution need to be identified. First we study the continuous image irradiance equation. It is demonstrated when and how the knowledge of the position of edges on a surface can be used to reconstruct the surface. Furthermore we show how much about the shape of a surface can be deduced from so-called singular points. At these points the surface orientation is uniquely determined by the measured brightness. Then we investigate images in which certain types of silhouettes, which we call b-silhouettes, can be detected. In particular we answer the following question in the affirmative: Is there a set of constraints which assure that if an image irradiance equation has a solution, it is unique? To this end we postulate three constraints upon the image irradiance equation and prove that they are sufficient to uniquely reconstruct the surface from its image. Furthermore it is shown that any two of these constraints are insufficient to assure a unique solution to an image irradiance equation. Examples are given which illustrate the different issues. Finally, an overview of known numerical methods for computing solutions to an image irradiance equation is presented.

AIM-622

Author[s]: William M. Silver

On the Representation of Angular Velocity and Its Effect on the Efficiency of Manipulator Dynamics Computation

March 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-622.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-622.pdf

Recently there has been considerable interest in efficient formulations of manipulator dynamics, mostly due to the desirability of real-time control or analysis of physical devices using modest computers. The inefficiency of the classical Lagrangian formulation is well known, and this has led researchers to seek alternative methods. Several authors have developed a highly efficient formulation of manipulator dynamics based on the Newton-Euler equations, and there may be some confusion as to the source of this efficiency. This paper shows that there is in fact no fundamental difference in computational efficiency between Lagrangian and Newton-Euler formulations. The efficiency of the above-mentioned Newton-Euler formulation is due to two factors: the recursive structure of the computation and the representation chosen of the rotational dynamics. Both of these factors can be achieved in the Lagrangian formulation, resulting in an algorithm identical to the Newton-Euler formulation. Recursive Lagrangian dynamics has been discussed previously by Hollerbach. This paper takes the final step by comparing in detail the representations that have been used for rotational dynamics and showing that with a proper choice of representation the Lagrangian formulation is indeed equivalent to the Newton-Euler formulation.

AIM-620

Author[s]: Gerald R. Barber

Record of the Workshop on Research in Office Semantics

February 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-620.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-620.pdf

This paper is a compendium of the ideas and issues presented at the Chatham Bars Workshop on Office Semantics. The intent of the workshop was to examine the state of the art in office systems and to elucidate the issues system designers were concerned with in developing next generation office systems. The workshop involved a cross-section of people from government, industry and academia. Presentations in the form of talks and video tapes were made of prototypical systems.

AITR-619

Author[s]: Barbara Y. White

Designing Computer Games to Facilitate Learning

February 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-619.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-619.pdf

The aim of this thesis was to explore the design of interactive computer learning environments. The particular learning domain selected was Newtonian dynamics. Newtonian dynamics was chosen because it is an important area of physics with which many students have difficulty and because controlling Newtonian motion takes advantage of the computer’s graphics and interactive capabilities. The learning environment involved games which simulated the motion of a spaceship on a display screen. The purpose of the games was to focus the students’ attention on various aspects of the implications of Newton’s laws.

AIM-617

Author[s]: Kuk Huang Lim

Control of a Tendon Arm

February 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-617.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-617.pdf

The dynamics and control of a tendon-driven three-degree-of-freedom shoulder joint are studied. A control scheme consisting of two phases has been developed. In the first phase, an approximation of the time-optimal control trajectory was applied open loop to the system. In the second phase, a closed-loop linear feedback law was employed to bring the system to the desired final state and to maintain it there.

AIM-616

Author[s]: Marvin Minsky

Music, Mind and Meaning

February 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-616.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-616.pdf

Speculating about cognitive aspects of listening to music, this essay discusses: how metric regularity and thematic repetition might involve representation frames and memory structures, how the result of listening might resemble space-models, how phrasing and expression might evoke innate responses and, finally, why we like music – or rather, what is the nature of liking itself.

AITR-615

Author[s]: Kenneth D. Forbus

A Study of Qualitative and Geometric Knowledge in Reasoning about Motion

February 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-615.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-615.pdf

Reasoning about motion is an important part of our commonsense knowledge, involving fluent spatial reasoning. This work studies the qualitative and geometric knowledge required to reason in a world that consists of balls moving through space constrained by collisions with surfaces, including dissipative forces and multiple moving objects. An analog geometry representation serves the program as a diagram, allowing many spatial questions to be answered by numeric calculation. It also provides the foundation for the construction and use of place vocabulary, the symbolic descriptions of space required to do qualitative reasoning about motion in the domain. The actual motion of a ball is described as a network consisting of descriptions of qualitatively distinct types of motion. Implementing the elements of these networks in a constraint language allows the same elements to be used for both analysis and simulation of motion. A qualitative description of the actual motion is also used to check the consistency of assumptions about motion. A process of qualitative simulation is used to describe the kinds of motion possible from some state. The ambiguity inherent in such a description can be reduced by assumptions about physical properties of the ball or assumptions about its motion. Each assumption directly rules out some kinds of motion, but other knowledge is required to determine the indirect consequences of making these assumptions. Some of this knowledge is domain dependent and relies heavily on spatial descriptions.

AIM-614

Author[s]: W.A. Richards, J.M. Rubin and D.D. Hoffman

Equation Counting and the Interpretation of Sensory Data

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-614.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-614.pdf

Many problems in biological information processing require the solution to a complex system of equations in many unknown variables. An equation-counting procedure is described for determining whether such a system of equations will indeed have a unique solution, and under what conditions the solution should be interpreted as “correct”. Three examples of the procedure are given for illustration, one for auditory signal processing and two from vision.

AIM-613

Author[s]: W.E.L. Grimson

A Computational Theory of Visual Surface Interpolation

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-613.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-613.pdf

Computational theories of structure from motion [Ullman, 1979] and stereo vision [Marr and Poggio, 1979] only specify the computation of three-dimensional surface information at special points in the image. Yet visual perception is clearly of complete surfaces. In order to account for this, a computational theory of the interpolation of surfaces from visual information is presented.

AIM-612

Author[s]: B.K.P. Horn

The Curve of Least Energy

January 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-612.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-612.pdf

Here we search for the curve which has the smallest integral of the square of curvature, while passing through two given points with given orientation. This is the true shape of a spline used in lofting. In computer-aided design, curves have been sought which maximize “smoothness”. The curve discussed here is the one arising in this way from a commonly used measure of smoothness. The human visual system may use such a curve when it constructs a subjective contour.

AIM-611A

Author[s]: Richard C. Waters

GPRINT: A LISP Pretty Printer Providing Extensive User Format Control Mechanism

September 1982

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-611a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-611a.pdf

A Lisp pretty printer is presented which makes it easy for a user to control the format of the output produced. The printer can be used as a general mechanism for printing data structures as well as programs. It is divided into two parts: a set of formatting functions and an output routine. The user specifies how a particular type of object should be formatted by creating a formatting function for the type. When passed an object of that type, the formatting function creates a sequence of directions which specify how the object should be printed if it can fit on one line and how it should be printed if it must be broken up across multiple lines. A simple template language makes it easy to specify these directions. Based on the line length available, the output routine decides what structures have to be broken up across multiple lines and produces the actual output following the directions created by the formatting functions. The paper concludes with a discussion of how the pretty printing method presented could be applied to languages other than Lisp.
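
The fit-on-one-line-or-break decision described above can be sketched in a few lines. This is a simplified illustration only, not GPRINT itself: it models S-expressions as nested Python lists, and the names `flat` and `pretty` are hypothetical. It has no formatting functions or template language, and the only "direction" it knows is to put each element on its own line when the one-line form does not fit.

```python
def flat(obj):
    """Render a nested list as a one-line S-expression."""
    if isinstance(obj, list):
        return "(" + " ".join(flat(x) for x in obj) + ")"
    return str(obj)


def pretty(obj, width, indent=0):
    """Print on one line if it fits within `width`, else break across lines."""
    one_line = flat(obj)
    if not isinstance(obj, list) or indent + len(one_line) <= width:
        return one_line
    # Does not fit: place each element on its own line, indented one column
    # past the opening parenthesis, and recurse so sub-lists decide for themselves.
    pad = "\n" + " " * (indent + 1)
    parts = [pretty(x, width, indent + 1) for x in obj]
    return "(" + pad.join(parts) + ")"


form = ["defun", "f", ["x"], ["+", "x", 1]]
print(pretty(form, 80))   # fits: printed on one line
print(pretty(form, 10))   # too long: broken up, short sub-lists stay intact
```

The line-length test is made by the output step rather than by the caller, which mirrors the division of labor the abstract describes between formatting directions and the output routine.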

AIM-611

Author[s]: Richard C. Waters

GPRINT - A LISP Pretty Printer Providing Extensive User Format-Control Mechanism

October 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-611.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-611.pdf

A pretty printer is presented which makes it easy for a user to control the format of the output produced. The printer can be used as a general mechanism for printing data structures as well as programs. It is divided into two parts: a set of formatting functions, and an output routine. Each formatting function creates a sequence of directions which specify how an object is to be formatted if it can fit on one line and how it is to be formatted if it must be broken up across multiple lines. Based on the line length available, the output routine decides what structures have to be broken up across multiple lines and produces the actual output following the directions created by the formatting functions. The directions passed from the formatting functions to the output routine form a well defined interface: a language for specifying formatting options. Three levels of user format-control are provided. A simple template mechanism makes it easy for a user to control certain aspects of the format produced. A user can exercise much more complete control over how a particular type of object is formatted by writing a special formatting function for it. He can make global changes in format by modifying the formatting process as a whole.

AITR-610

Author[s]: Richard Brown

Coherent Behavior from Incoherent Knowledge Sources in the Automatic Synthesis of Numerical Computer Programs

January 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-610.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-610.pdf

A fundamental problem in artificial intelligence is obtaining coherent behavior in rule-based problem solving systems. A good quantitative measure of coherence is time behavior; a system that never, in retrospect, applied a rule needlessly is certainly coherent; a system suffering from combinatorial blowup is certainly behaving incoherently. This report describes a rule-based problem solving system for automatically writing and improving numerical computer programs from specifications. The specifications are in terms of “constraints” among inputs and outputs. The system has solved program synthesis problems involving systems of equations, determining that methods of successive approximation converge, transforming recursion to iteration, and manipulating power series (using differing organizations, control structures, and argument-passing techniques).

AIM-609

Author[s]: Barbara S. Kerns

Towards a Better Definition of Transactions

December 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-609.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-609.pdf

This paper builds on a technical report written by Carl Hewitt and Henry Baker called “Actors and Continuous Functionals”. What is called a “goal-oriented activity” in that paper will be referred to in this paper as a “transaction”. The word “transaction” brings to mind an object closer in function to what we wish to present than does the word “activity”. This memo, therefore, presents the definitions of a reply and a transaction as given in Hewitt and Baker’s paper and points out some discrepancies in their definitions. Specifically, the properties of transactions and replies as they were defined did not correspond with our intuitions, and thus the definitions should be changed. The issues of what should constitute a transaction are discussed, and a new definition is presented which eliminates the discrepancies caused by the original definitions. Some properties of the newly defined transactions are discussed, and it is shown that the results of Hewitt and Baker’s paper still hold given the new definitions.

AIM-608

Author[s]: D.D. Hoffman and B.E. Flinchbaugh

The Interpretation of Biological Motion

December 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-608.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-608.pdf

The term biological motion has been coined by G. Johansson (1973) to refer to the ambulatory patterns of terrestrial bipeds and quadrupeds. In this paper a computational theory of the visual perception of biological motion is proposed. The specific problem addressed is how the three dimensional structure and motions of animal limbs may be computed from the two dimensional motions of their projected images. It is noted that the limbs of animals typically do not move arbitrarily during ambulation. Rather, for anatomical reasons, they typically move in single planes for extended periods of time. This simple anatomical constraint is exploited as the basis for utilizing a “planarity assumption” in the interpretation of biological motion. The analysis proposed is: (1) divide the image into groups of two or three elements each; (2) test each group for pairwise-rigid planar motion; (3) combine the results from (2). Fundamental to the analysis are two ‘structure from planar motion’ propositions. The first states that the structure and motion of two points rigidly linked and rotating in a plane is recoverable from three orthographic projections. The second states that the structure and motion of three points forming two hinged rods constrained to move in a plane is recoverable from two orthographic projections. The psychological relevance of the analysis and possible interactions with top down recognition processes are discussed.

AIM-606

Author[s]: Tomas Lozano-Perez

Automatic Planning of Manipulator Transfer Movements

December 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-606.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-606.pdf

This paper deals with the class of problems that involve finding where to place or how to move a solid object in the presence of obstacles. The solution to this class of problems is essential to the automatic planning of manipulator transfer movements, i.e. the motions to grasp a part and place it at some destination. This paper presents algorithms for planning manipulator paths that avoid collisions with objects in the workspace and for choosing safe grasp points on objects. These algorithms allow planning transfer movements for Cartesian manipulators. The approach is based on a method of computing an explicit representation of the manipulator configurations that would bring about a collision.

AIM-605

Author[s]: Tomas Lozano-Perez

Spatial Planning: A Configuration Space Approach

December 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-605.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-605.pdf

This paper presents algorithms for computing constraints on the position of an object due to the presence of obstacles. This problem arises in applications which require choosing how to arrange or move objects among other objects. The basis of the approach presented here is to characterize the position and orientation of the object of interest as a single point in a Configuration Space, in which each coordinate represents a degree of freedom in the position and/or orientation of the object. The configurations forbidden to this object, due to the presence of obstacles, can then be characterized as regions in the Configuration Space. The paper presents algorithms for computing these Configuration Space obstacles when the objects and obstacles are polygons or polyhedra. An approximation technique for high-dimensional Configuration Space obstacles, based on projections of obstacle slices, is described.
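
As a minimal sketch of the idea, consider the simplest case only: a convex polygon that translates without rotating among convex polygonal obstacles. Under those assumptions (which are this sketch's, and much narrower than the memo's scope), the Configuration Space obstacle is the convex hull of all differences between obstacle vertices and object vertices, measured relative to the object's reference point. The function names below are hypothetical.

```python
def cross(o, p, q):
    """Z-component of (p - o) x (q - o); positive for a left turn."""
    return (p[0] - o[0]) * (q[1] - o[1]) - (p[1] - o[1]) * (q[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def cspace_obstacle(moving, obstacle):
    """C-space obstacle for a translating convex polygon.

    `moving` lists the object's vertices relative to its reference point;
    `obstacle` lists the obstacle's vertices in world coordinates.
    """
    diffs = [(bx - ax, by - ay) for (ax, ay) in moving for (bx, by) in obstacle]
    return convex_hull(diffs)


def collides(cobs, ref):
    """True if placing the reference point at `ref` overlaps the obstacle's interior."""
    k = len(cobs)
    return all(cross(cobs[i], cobs[(i + 1) % k], ref) > 0 for i in range(k))


square = [(0, 0), (1, 0), (1, 1), (0, 1)]          # reference point at (0, 0)
block = [(2, 2), (3, 2), (3, 3), (2, 3)]
cobs = cspace_obstacle(square, block)
print(cobs)                    # [(1, 1), (3, 1), (3, 3), (1, 3)]
print(collides(cobs, (2, 2)))  # True
print(collides(cobs, (0, 0)))  # False
```

Once the obstacle has been grown in this way, the moving object can be treated as a single point, which is the reduction the abstract describes.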

AITR-604

Author[s]: Charles Rich

Inspection Methods in Programming

June 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-604.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-604.pdf

The work reported here lies in the area of overlap between artificial intelligence and software engineering. As research in artificial intelligence, it is a step towards a model of problem solving in the domain of programming. In particular, this work focuses on the routine aspects of programming which involve the application of previous experience with similar programs. I call this programming by inspection. Programming is viewed here as a kind of engineering activity. Analysis and synthesis by inspection are a prominent part of expert problem solving in many other engineering disciplines, such as electrical and mechanical engineering. The notion of inspection methods in programming developed in this work is motivated by similar notions in other areas of engineering. This work is also motivated by current practical concerns in the area of software engineering. The inadequacy of current programming technology is universally recognized. Part of the solution to this problem will be to increase the level of automation in programming. I believe that the next major step in the evolution of more automated programming will be interactive systems which provide a mixture of partially automated program analysis, synthesis and verification. One such system being developed at MIT, called the programmer’s apprentice, is the immediate intended application of this work. This report concentrates on the knowledge base of the programmer’s apprentice, which is in the form of a taxonomy of commonly used algorithms and data structures. To the extent that a programmer is able to construct and manipulate programs in terms of the forms in such a taxonomy, he may relieve himself of many details and generally raise the conceptual level of his interaction with the system, as compared with present day programming environments. Also, since it is practical to expend a great deal of effort pre-analyzing the entries in a library, the difficulty of verifying the correctness of programs constructed this way is correspondingly reduced. The feasibility of this approach is demonstrated by the design of an initial library of common techniques for manipulating symbolic data. This document also reports on the further development of a formalism called the plan calculus for specifying computations in a programming language independent manner. This formalism combines both data and control abstraction in a uniform framework that has facilities for representing multiple points of view and side effects.

AIM-603

Author[s]: Marvin Minsky

Jokes and the Logic of the Cognitive Unconscious

November 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-603.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-603.pdf

Freud’s theory of jokes explains how they overcome the mental “censors” that make it hard for us to think “forbidden” thoughts. But his theory did not work so well for humorous nonsense as for other comical subjects. In this essay I argue that the different forms of humor can be seen as much more similar, once we recognize the importance of knowledge about knowledge and, particularly, aspects of thinking concerned with recognizing and suppressing bugs – ineffective or destructive thought processes. When seen in this light, much humor that at first seems pointless, or mysterious, becomes more understandable.

AIM-602

Author[s]: Daniel Weinreb and David Moon

Flavors: Message Passing in the Lisp Machine

November 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-602.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-602.pdf

The object oriented programming style used in the Smalltalk and Actor languages is available in Lisp Machine Lisp, and used by the Lisp Machine software system. It is used to perform generic operations on objects. Part of its implementation is simply a convention in procedure calling style; part is a powerful language feature, called Flavors, for defining abstract objects. This chapter attempts to explain what programming with objects and with message passing means, the various means of implementing these in Lisp Machine Lisp, and when you should use them. It assumes no prior knowledge of any other languages.

AIM-601

Author[s]: James L. Stansfield

Conclusions from the Commodity Expert Project

November 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-601.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-601.pdf

The goal of the commodity expert project was to develop a prototype program that would act as an intelligent assistant to a commodity market analyst. Since expert analysis must deal with very large, yet incomplete, data bases of unreliable facts about a complex world, the project would stringently test the applicability of Artificial Intelligence techniques. After a significant effort, however, I am forced to the conclusion that an intelligent, real-world system of the kind envisioned is currently out of reach. Some of the difficulties were due to the size and complexity of the domain. As its true scale became evident, the available resources progressively appeared less adequate. The representation and reasoning problems that arose were persistently difficult and fundamental work is needed before the tools will be sufficient to engineer truly intelligent assistants. Despite these difficulties, perhaps even because of them, much can be learned from the project. To assist future applications projects, I explain in this report some of the reasons for the negative result, and also describe some positive ideas that were gained along the way. In doing so, I hope to convey the respect I have developed for the complexity of real-world domains, and the difficulty of describing the ways experts deal with them.

AIM-599

Author[s]: Boris Katz

A Three-Step Procedure for Language Generation

December 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-599.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-599.pdf

This paper outlines a three-step plan for generating English text from any semantic representation by applying a set of syntactic transformations to a collection of kernel sentences. The paper focuses on describing a program which realizes the third step of this plan. Step One separates the given representation into groups and generates from each group a set of kernel sentences. Step Two must decide, based upon both syntactic and thematic considerations, the set of transformations that should be performed upon each set of kernels. The output of the first two steps provides the “TASK” for Step Three. Each element of the TASK corresponds to the generation of one English sentence, and in turn may be defined as a triple consisting of: (a) a list of kernel phrase markers; (b) a list of transformations to be performed upon the list of kernels; (c) a “syntactic separator” to separate or connect generated sentences. Step Three takes as input the results of Step One and Step Two. The program which implements Step Three “reads” the TASK, executes the transformations indicated there, combines the altered kernels of each set into a sentence, performs a pronominalization process, and finally produces the appropriate English word string. This approach subdivides a hard problem into three more manageable and relatively independent pieces. It uses linguistically motivated theories at Step Two and Step Three. As implemented so far, Step Three is small and highly efficient. The system is flexible; all the transformations can be applied in any order. The system is general; it can be adapted easily to many domains.

AIM-598

Author[s]: John Batali and Anne Hartheimer

The Design Procedure Language Manual

September 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-598.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-598.pdf

This manual describes the Design Procedure Language (DPL) for LSI design. DPL creates and maintains a representation of a design in a hierarchically organized, object-oriented LISP data-base. Designing in DPL involves writing programs (Design Procedures) which construct and manipulate descriptions of a project. The programs use a call-by-keyword syntax and may be entered interactively or written by other programs. DPL is the layout language for the LISP-based Integrated Circuit design system (LISPIC) being developed at the Artificial Intelligence Laboratory at MIT. The LISPIC design environment will combine a large set of design tools that interact through a common data-base. This manual is for prospective users of the DPL and covers the information necessary to design a project with the language. The philosophy and goals of the LISPIC system as well as some details of the DPL data-base are also discussed.

AIM-597

Author[s]: David Marr and Lucia Vaina

Representation and Recognition of the Movement of Shapes

October 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-597.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-597.pdf

The problems posed by the representation and recognition of the movements of 3-D shapes are analyzed. A representation is proposed for the movements of shapes that lie within the scope of Marr & Nishihara’s (1978) 3-D model representation of static shapes. The basic problem is how to segment a stream of movement into pieces, each of which can be described separately. The representation proposed here is based upon segmenting a movement at moments when a component axis, e.g. an arm, starts to move relative to its local coordinate frame (here, the torso). For example, walking is divided into a sequence of the stationary states between each swing of the arms and legs, and the actual motions between the stationary points (relative to the torso, not the ground). This representation is called the state-motion-state (SMS) moving shape representation, and several examples of its application are given.

AIM-596

Author[s]: Koji Fukumori

Fundamental Scheme for Train Scheduling

September 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-596.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-596.pdf

Traditionally, the compilation of long-term timetables for high-density rail service with multiple classes of trains on the same track is a job for expert people, not computers. We propose an algorithm that uses the range-constriction search technique to schedule the timing and pass-through relations of trains smoothly and efficiently. The program determines how the timing of certain trains constrains the timing of others, finds possible time regions and pass-through relations and then evaluates the efficiency of train movement for each pass-through relation.

AITR-595

Author[s]: Guy Lewis Steele Jr.

The Definition and Implementation of a Computer Programming Language Based on Constraints

August 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-595.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-595.pdf

The constraint paradigm is a model of computation in which values are deduced whenever possible, under the limitation that deductions be local in a certain sense. One may visualize a constraint 'program' as a network of devices connected by wires. Data values may flow along the wires, and computation is performed by the devices. A device computes using only locally available information (with a few exceptions), and places newly derived values on other, locally attached wires. In this way computed values are propagated. An advantage of the constraint paradigm (not unique to it) is that a single relationship can be used in more than one direction. The connections to a device are not labelled as inputs and outputs; a device will compute with whatever values are available, and produce as many new values as it can. General theorem provers are capable of such behavior, but tend to suffer from combinatorial explosion; it is not usually useful to derive all the possible consequences of a set of hypotheses. The constraint paradigm places a certain kind of limitation on the deduction process. The limitations imposed by the constraint paradigm are not the only ones possible. It is argued, however, that they are restrictive enough to forestall combinatorial explosion in many interesting computational situations, yet permissive enough to allow useful computations in practical situations. Moreover, the paradigm is intuitive: It is easy to visualize the computational effects of these particular limitations, and the paradigm is a natural way of expressing programs for certain applications, in particular relationships arising in computer-aided design. A number of implementations of constraint-based programming languages are presented. A progression of ever more powerful languages is described, complete implementations are presented and design difficulties and alternatives are discussed. The goal approached, though not quite reached, is a complete programming system which will implicitly support the constraint paradigm to the same extent that LISP, say, supports automatic storage management.
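
The picture of devices connected by wires can be illustrated with a small sketch. This is not the language defined in the report; the `Cell` and `Adder` classes are hypothetical names chosen for illustration, and only one kind of device (a three-terminal adder) is shown. The point is that the same device deduces whichever terminal is missing, using only the values on its own wires.

```python
class Cell:
    """A wire: holds at most one value and notifies attached devices."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.devices = []

    def set(self, value):
        if self.value is None:
            self.value = value
            for device in self.devices:
                device.propagate()
        elif abs(self.value - value) > 1e-9:
            raise ValueError(f"contradiction on {self.name}")


class Adder:
    """A device enforcing a + b = c, with no fixed inputs or outputs."""

    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c
        for cell in (a, b, c):
            cell.devices.append(self)

    def propagate(self):
        a, b, c = self.a.value, self.b.value, self.c.value
        # Deduce the third terminal as soon as any two are known.
        if a is not None and b is not None:
            self.c.set(a + b)
        elif a is not None and c is not None:
            self.b.set(c - a)
        elif b is not None and c is not None:
            self.a.set(c - b)


# Network: a + b = total, total + c = grand
a, b, c = Cell("a"), Cell("b"), Cell("c")
total, grand = Cell("total"), Cell("grand")
Adder(a, b, total)
Adder(total, c, grand)

grand.set(10)       # nothing can be deduced yet
c.set(3)            # second adder runs "backwards": total = 7
a.set(2)            # first adder runs "backwards": b = 5
print(total.value, b.value)   # 7 5
```

Each deduction here is local to one device, which is the limitation the abstract credits with forestalling combinatorial explosion; the sketch makes no attempt at retraction or dependency tracking.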

AIM-593

Author[s]: Mike Brady

Toward a Computational Theory of Early Visual Processing In Reading

September 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-593.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-593.pdf

This paper is the first of a series aimed at developing a theory of early visual processing in reading. We suggest that there has been a close parallel in the development of theories of reading and theories of vision in Artificial Intelligence. We propose to exploit and extend recent results in Computer Vision to develop an improved model of early processing in reading. This first paper considers the problem of isolating words in text based on the information which Marr and Hildreth’s (1980) theory asserts is available in the parafovea. We show in particular that the findings of Fisher (1975) on reading transformed texts can be accounted for without postulating the need for complex interactions between early processing and downloading information as he suggests. The paper concludes with a brief discussion of the problem of integrating information over successive saccades and relates the earlier analysis to the empirical findings of Rayner.

AIM-592

Author[s]: D.D. Hoffman

Inferring Shape from Motion Fields

December 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-592.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-592.pdf

The human visual system has the ability to utilize motion information to infer the shapes of surfaces. More specifically, we are able to derive descriptions of rigidly rotating smooth surfaces entirely from the orthographic projection of the motions of their surface markings. A computational analysis of this ability is proposed based on a “shape from motion” proposition. This proposition states that given the first spatial derivatives of the orthographically projected velocity and the acceleration fields of a rigidly rotating regular surface, then the angular velocity and the surface normal at each visible point on that surface are uniquely determined up to a reflection.

AIM-591

Author[s]: Shimon Ullman

Interfacing the One-Dimensional Scanning of an Image with the Applications of Two-Dimensional Operators

April 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-591.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-591.pdf

To interface between the one-dimensional scanning of an image, and the applications of a two-dimensional operator, an intermediate storage is required. For a square image of size n^2, and a square operator of size m^2, the minimum intermediate storage is shown to be n(m-1). An interface of this size can be conveniently realized by using a serpentine delay line. New kinds of imagers would be required to reduce the size of the intermediate storage below n(m-1).
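
The delay-line idea can be illustrated with a minimal sketch, assuming a raster-scanned n x n image delivered one pixel at a time and a simple summing operator standing in for the two-dimensional operator. The function name and the buffer layout are hypothetical and are not taken from the memo; for simplicity the buffer holds n(m-1) + m samples, slightly more than the n(m-1) bound stated above.

```python
from collections import deque


def stream_window_sums(pixels, n, m):
    """Yield (row, col, window_sum) for every m x m window of an n x n image.

    `pixels` is the image in raster order, consumed one sample at a time.
    Only the most recent n*(m-1) + m samples are kept (illustrative choice).
    """
    size = n * (m - 1) + m
    line = deque(maxlen=size)  # acts as the delay line
    for i, p in enumerate(pixels):
        line.append(p)
        r, c = divmod(i, n)
        if r >= m - 1 and c >= m - 1:
            # line[-1] is pixel (r, c); pixel (r - dr, c - dc) lies
            # dr*n + dc samples further back in the delay line.
            total = sum(
                line[-1 - (dr * n + dc)]
                for dr in range(m)
                for dc in range(m)
            )
            yield r - m + 1, c - m + 1, total
```

For a 3 x 3 image holding the values 1 to 9 and a 2 x 2 operator, the sketch emits one sum per window position, in scan order, without ever storing the whole image.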

AIM-590

Author[s]: Robert W. Lawler

Extending a Powerful Idea

July 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-590.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-590.pdf

Mathematics is much more than the manipulation of numbers. At its best, it involves simple, clear examples of thought so apt to the world we live in that those examples provide guidance for our thinking about problems we meet subsequently. We call such examples, capable of heuristic use, POWERFUL IDEAS, after Papert (1980). This article documents a child’s introduction to a specific powerful idea in a computer environment. We trace his extensions of that idea to other problem areas, the first similar to his initial experience and the second more remote from it.

AITR-589

Author[s]: Andrew P. Witkin

Shape from Contour

November 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-589.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-589.pdf

The problem of using image contours to infer the shapes and orientations of surfaces is treated as a problem of statistical estimation. The basis for solving this problem lies in an understanding of the geometry of contour formation, coupled with simple statistical models of the contour generating process. This approach is first applied to the special case of surfaces known to be planar. The distortion of contour shape imposed by projection is treated as a signal to be estimated, and variations of non-projective origin are treated as noise. The resulting method is then extended to the estimation of curved surfaces, and applied successfully to natural images. Next, the geometric treatment is further extended by relating contour curvature to surface curvature, using cast shadows as a model for contour generation. This geometric relation, combined with a statistical model, provides a measure of goodness-of-fit between a surface and an image contour. The goodness-of-fit measure is applied to the problem of establishing registration between an image and a surface model. Finally, the statistical estimation strategy is experimentally compared to human perception of orientation: human observers' judgements of tilt correspond closely to the estimates produced by the planar strategy.

AIM-587

Author[s]: Guy L. Steele, Jr.

Destructive Reordering of CDR-Coded Lists

August 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-587.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-587.pdf

Linked list structures can be compactly represented by encoding the CDR (“next”) pointer in a two-bit field and linearizing list structures as much as possible. This “CDR-coding” technique can save up to 50% on storage for linked lists. The RPLACD (alter CDR pointer) operation can be accommodated under such a scheme by using indirect pointers. Standard destructive reordering algorithms, such as REVERSE and SORT, use RPLACD quite heavily. If these algorithms are used on CDR-coded lists, the result is a proliferation of indirect pointers. We present here algorithms for destructive reversal and sorting of CDR-coded lists which avoid creation of indirect pointers. The essential idea is to note that a general list can be viewed as a linked list of array-like “chunks”. The algorithm applied to such “chunky lists” is a fusion of separate array- and list-specific algorithms; intuitively, the array-specific algorithm is applied to each chunk, and the list algorithm to the list with each chunk considered as a single element.
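
The "chunky list" intuition for reversal can be sketched as follows. This is only an illustration, assuming Python lists stand in for both the array-like chunks and the outer linked list; the function name is hypothetical, and the sketch does not model CDR codes or indirect pointers as the memo's algorithms do.

```python
def reverse_chunky(chunks):
    """Destructively reverse a list stored as a sequence of array-like chunks."""
    for chunk in chunks:
        chunk.reverse()  # array-specific step: reverse each chunk in place
    chunks.reverse()     # list-specific step: reverse the order of the chunks
    return chunks


def flatten(chunks):
    """Return the elements of a chunky list in order (helper for inspection)."""
    return [x for chunk in chunks for x in chunk]
```

Reversing each chunk and then the sequence of chunks yields the same element order as reversing the flattened list, while every chunk stays contiguous.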

AIM-586

Author[s]: Robert W. Lawler

The Progressive Construction of Mind

June 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-586.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-586.pdf

We propose a vision of the structure of knowledge and processes of learning based upon the particularity of experience. Highly specific cognitive structures are constructed through activities in limited domains of experience. For new domains, new cognitive structures develop from and call upon the knowledge of prior structures. Applying this vision of disparate cognitive structures to a detailed case study, we present an interpretation of addition-related matter from the corpus and trace the interplay of specific experiences with the interactions of ascribed, disparate structures. The interpretive focus is on learning processes through which a broadly applicable skill emerges from the interaction and integration of knowledge based on specific, particular experiences.

AIM-585

Author[s]: Judi Jones

Primer for R users

September 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-585.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-585.pdf

R is a text formatter. The information in this primer is meant to explain, in simple English, the basic commands needed to use R. Input for R is prepared on computer systems using a text editor. Which editor is employed depends on which computer system you use, and your personal preference. Almost every characteristic of a document can be controlled or changed if necessary.

AITR-581

Author[s]: Jon Doyle

A Model for Deliberation, Action, and Introspection

May 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-581.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-581.pdf

This thesis investigates the problem of controlling or directing the reasoning and actions of a computer program. The basic approach explored is to view reasoning as a species of action, so that a program might apply its reasoning powers to the task of deciding what inferences to make as well as deciding what other actions to take. A design for the architecture of reasoning programs is proposed. This architecture involves self-consciousness, intentional actions, deliberate adaptations, and a form of decision-making based on dialectical argumentation. A program based on this architecture inspects itself, describes aspects of itself, and uses this self-reference and these self-descriptions in making decisions and taking actions. The program’s mental life includes awareness of its own concepts, beliefs, desires, intentions, inferences, actions, and skills. All of these are represented by self-descriptions in a single sort of language, so that the program has access to all of these aspects of itself, and can reason about them in the same terms.

AIM-580

Author[s]: K.F. Prazdny and Mike Brady

Extra-Retinal Signals Influence Induced Motion: A New Kinetic Illusion

May 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-580.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-580.pdf

When a moving dot, which is tracked by the eyes and enclosed in a moving framework, suddenly stops while the enclosing framework continues its motion, the dot is seen to describe a curved path. This illusion can be explained only by assuming that extra-retinal signals are taken into account in interpreting retinal information. The form of the illusion, and the fact that the phenomenal path cannot be explained on the basis of positional information alone, suggests that the perceived path is computed by integrating (instantaneous) velocity information over time. A vector addition model embodying a number of simplifying assumptions is found to qualitatively fit the experimental data. A number of follow-up studies are suggested.

AITR-579

Author[s]: Ellen C. Hildreth

Implementation of a Theory of Edge Detection

April 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-579.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-579.pdf

This report describes the implementation of a theory of edge detection, proposed by Marr and Hildreth (1979). According to this theory, the image is first processed independently through a set of different size filters, whose shape is the Laplacian of a Gaussian, $\nabla^{2}G$. Zero-crossings in the output of these filters mark the positions of intensity changes at different resolutions. Information about these zero-crossings is then used for deriving a full symbolic description of changes in intensity in the image, called the raw primal sketch. The theory is closely tied with early processing in the human visual system. In this report, we first examine the critical properties of the initial filters used in the edge detection process, both from a theoretical and practical standpoint. The implementation is then used as a test bed for exploring aspects of the human visual system; in particular, acuity and hyperacuity. Finally, we present some preliminary results concerning the relationship between zero-crossings detected at different resolutions, and some observations relevant to the process by which the human visual system integrates descriptions of intensity changes obtained at different resolutions.
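
The filter-then-find-zero-crossings step can be illustrated with a one-dimensional analogue, in which the Laplacian of a Gaussian reduces to the second derivative of a Gaussian. This is a minimal sketch under stated assumptions, not the report's implementation: the function names, the kernel truncation at 4 sigma, the edge padding, and the tolerance `tol` are all illustrative choices.

```python
import numpy as np


def log_kernel_1d(sigma):
    """Second derivative of a Gaussian, sampled and truncated at 4*sigma."""
    half = int(np.ceil(4 * sigma))
    x = np.arange(-half, half + 1, dtype=float)
    g = np.exp(-x**2 / (2 * sigma**2))
    k = (x**2 / sigma**4 - 1 / sigma**2) * g
    return k - k.mean()  # force zero sum so constant regions respond with ~0


def zero_crossings(signal, sigma, tol=1e-9):
    """Return indices i where the filtered signal changes sign between i and i+1."""
    k = log_kernel_1d(sigma)
    half = len(k) // 2
    padded = np.pad(np.asarray(signal, dtype=float), half, mode="edge")
    resp = np.convolve(padded, k, mode="valid")  # same length as the input
    a, b = resp[:-1], resp[1:]
    # Require both neighbours to be clearly nonzero to ignore rounding noise.
    mask = (a * b < 0) & (np.abs(a) > tol) & (np.abs(b) > tol)
    return np.flatnonzero(mask).tolist()
```

Applied to a step from 0 to 1, the filtered signal is positive just before the step and negative just after it, so the single zero-crossing marks the position of the intensity change. Running the same function with several values of `sigma` gives the multi-resolution descriptions the abstract refers to.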

AIM-577

Author[s]: Henry Lieberman and Carl Hewitt

A Session with TINKER: Interleaving Program Testing with Program Design

September 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-577.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-577.pdf

Tinker is an experimental interactive programming system which integrates program testing with program design. New procedures are created by working out the steps of the procedure in concrete situations. Tinker displays the results of each step as it is performed, and constructs a procedure for the general case from sample calculations. The user communicates with Tinker mostly by selecting operations from menus on an interactive graphic display rather than by typing commands. This paper presents a demonstration of our current implementation of Tinker.

AIM-576

Author[s]: Randall Davis

Meta-Rules: Reasoning About Control

March 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-576.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-576.pdf

How can we insure that knowledge embedded in a program is applied effectively? Traditionally the answer to this question has been sought in different problem solving paradigms and in different approaches to encoding and indexing knowledge. Each of these is useful with a certain variety of problem, but they all share a common problem: they become ineffective in the face of a sufficiently large knowledge base. How then can we make it possible for a system to continue to function in the face of a very large number of plausibly useful chunks of knowledge? In response to this question we propose a framework for viewing issues of knowledge indexing and retrieval, a framework that includes what appears to be a useful perspective on the concept of a strategy. We view strategies as a means of controlling invocation in situations where traditional selection mechanisms become ineffective. We examine ways to effect such control, and describe meta-rules, a means of specifying strategies which offers a number of advantages. We consider at some length how and when it is useful to reason about control, and explore the advantages meta-rules offer for doing this.

AIM-575

Author[s]: R.W. Lawler

One Child's Learning: Introducing Writing with a Computer

March 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-575.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-575.pdf

This is a case study of how one child learned to write in a computer-rich setting. Although computer access did affect her learning significantly, the details presented here go beyond supporting that claim. They provide a simple example of what a computer-based introduction to writing might be like for other children. We conclude with a short discussion of issues raised by the study.

AIM-574

Author[s]: S. Ullman

Against Direct Perception

March 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-574.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-574.pdf

Central to contemporary cognitive science is the notion that mental processes involve computations defined over internal representations. This notion stands in sharp contrast with another prevailing view – the direct theory of perception whose most prominent proponent has been J.J. Gibson. The publication of his recent book (The Ecological Approach to Visual Perception – Boston, Houghton Mifflin Company, 1979) offers an opportunity to examine the theory of direct perception and to contrast it with the computational/representational view. In this paper the notion of direct perception is examined primarily from a theoretical standpoint, and various objections are raised against it. An attempt is made to place the theory of direct perception in perspective by embedding it in a more comprehensive framework.

AIM-573A

Author[s]: J. Richter and S. Ullman

A Model for the Spatio-Temporal Organization of X- and Y-Type Ganglion Cells in the Primate Retina

April 1980 (Updated October 1981)

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-573a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-573a.pdf

A model is proposed for the spatial and temporal characteristics of X- and Y-type responses of ganglion cells in the primate retina. The model is related to a theory of directional selectivity proposed by Marr & Ullman (1981). The X- and Y-type responses predicted by the model to a variety of stimuli are examined and compared with electrophysiological recordings. A number of implications and predictions are discussed.

AIM-572

Author[s]: Berthold K.P. Horn and Brian G. Schunck

Determining Optical Flow

April 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-572.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-572.pdf

Optical flow cannot be computed locally, since only one independent measurement is available from the image sequence at a point, while the flow velocity has two components. A second constraint is needed. A method for finding the optical flow pattern is presented which assumes that the apparent velocity of the brightness pattern varies smoothly almost everywhere in the image. An iterative implementation is shown which successfully computes the optical flow for a number of synthetic image sequences. The algorithm is robust in that it can handle image sequences that are quantized rather coarsely in space and time. It is also insensitive to quantization of brightness levels and additive noise. Examples are included where the assumption of smoothness is violated at singular points or along lines in the image.
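
The iterative scheme can be sketched as follows: the brightness-constancy constraint Ix*u + Iy*v + It = 0 supplies the one local measurement, and a smoothness weight alpha supplies the second constraint. This is a minimal sketch, not the memo's implementation; the derivative estimates (central differences via `np.gradient`), the edge-replicated averaging, and the parameter defaults are assumptions made for illustration.

```python
import numpy as np


def local_average(f):
    """Weighted average of the 8 neighbours (edge pixels replicated)."""
    p = np.pad(f, 1, mode="edge")
    return ((p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 6.0
            + (p[:-2, :-2] + p[:-2, 2:] + p[2:, :-2] + p[2:, 2:]) / 12.0)


def horn_schunck(im1, im2, alpha=0.1, n_iter=100):
    """Estimate flow (u, v) from im1 to im2 by iterating the smoothness update."""
    im1 = np.asarray(im1, dtype=float)
    im2 = np.asarray(im2, dtype=float)
    mean_im = (im1 + im2) / 2.0
    Ix = np.gradient(mean_im, axis=1)  # spatial derivative along columns (x)
    Iy = np.gradient(mean_im, axis=0)  # spatial derivative along rows (y)
    It = im2 - im1                     # temporal derivative
    u = np.zeros_like(im1)
    v = np.zeros_like(im1)
    denom = alpha**2 + Ix**2 + Iy**2
    for _ in range(n_iter):
        u_bar, v_bar = local_average(u), local_average(v)
        # Move the local average toward the brightness-constancy line.
        residual = (Ix * u_bar + Iy * v_bar + It) / denom
        u = u_bar - Ix * residual
        v = v_bar - Iy * residual
    return u, v
```

On a synthetic sinusoidal pattern translated by one pixel along x, the recovered u is close to 1 and v stays at 0, since the pattern has no gradient along y.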

AIM-570

Author[s]: Sylvia Weir

The Evaluation and Cultivation of Spatial and Linguistic Abilities in Individuals with Cerebral Palsy

October 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-570.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-570.pdf

The work of the Cerebral Palsy project (members: Seymour Papert, Sylvia Weir, Jose Valente and Gary Drescher) over the past eighteen months is summarized, and the next phase of activity is outlined. The issues to be addressed by the proposed research are as follows: 1. An investigation of computer-based techniques to maximize the acquisition of spatial and linguistic skills in severely Cerebral Palsied children, to serve the educational and therapeutic needs of this population. 2. Developing a set of computer-based diagnostic tools for use with physically handicapped persons which could contribute to the provision of a functional specification of subcategories of Cerebral Palsy. 3. Investigating the ways in which findings on Cerebral Palsy subjects can inform our theories of cognitive development and the adult functioning of normal individuals.

AIM-569A

Author[s]: Henry Lieberman and Carl Hewitt

A Real Time Garbage Collector Based on the Lifetimes of Objects

October 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-569a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-569a.pdf

In previous heap storage systems, the cost of creating objects and garbage collection is independent of the lifetime of the object. Since objects with short lifetimes account for a large portion of storage use, it’s worth optimizing a garbage collector to reclaim storage for these objects more quickly. The garbage collector should spend proportionately less effort reclaiming objects with longer lifetimes. We present a garbage collection algorithm which: Makes storage for short-lived objects cheaper than storage for long-lived objects. Operates in real time – object creation and access times are bounded. Increases locality of reference, for better virtual memory performance. Works well with multiple processors and a large address space.

AIM-568

Author[s]: Jon Doyle and Philip London

A Selected Descriptor-Indexed Bibliography to the Literature on Belief Revision

February 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-568.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-568.pdf

This article presents an overview of research in an area loosely called belief revision. Belief revision concentrates on the issue of revising systems of beliefs to reflect perceived changes in the environment or acquisition of new information. The paper includes both an essay surveying the literature and a descriptor-indexed bibliography of over 200 papers and books.

AIM-567

Author[s]: Katsushi Ikeuchi

Shape from Regular Patterns: An Example of Constraint Propagation in Vision

March 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-567.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-567.pdf

An algorithm is proposed for obtaining local surface orientation from the apparent distortion of surface patterns in an image. A spherical projection is used for imaging. A mapping is defined from points on this image sphere to a locus of points on the Gaussian sphere which corresponds to possible surface orientations. This mapping is based on the measurement of the local distortions of a repeated known texture pattern due to the imaging projection. This locus of possible surface orientations can be reduced to a unique orientation at each point on the image sphere using 3 vantage points and taking the intersection of the loci of possible orientations derived from each vantage. It is also possible to derive a unique surface orientation at each image point through the use of an iterative constraint propagation technique along with the orientation information available at occluding boundaries. Both methods are demonstrated for real images.

AIM-566

Author[s]: Katsushi Ikeuchi

Numerical Shape from Shading and Occluding Contours in a Single View

November 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-566.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-566.pdf

An iterative method of using occluding boundary information is proposed to compute surface slope from shading. We use a stereographic space rather than the more commonly used gradient space in order to express occluding boundary information. Further, we use “average” smoothness constraints rather than the more obvious “closed loop” smoothness constraints. We develop alternate constraints from the definition of surface smoothness, since the closed loop constraints do not work in stereographic space. We solve the image irradiance equation iteratively using a Gauss-Seidel method applied to the constraints and boundary information. Numerical experiments show that the method is effective. Finally, we analyze SEM (Scanning Electron Microscope) pictures using this method. Other applications are also proposed.

AIM-565

Author[s]: W.E.L. Grimson

A Computer Implementation of a Theory of Human Stereo Vision

January 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-565.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-565.pdf

Recently, Marr and Poggio (1979) presented a theory of human stereo vision. An implementation of that theory is presented and consists of five steps: (1) The left and right images are each filtered with masks of four sizes that increase with eccentricity; the shape of these masks is given by $\nabla^{2}G$, the Laplacian of a Gaussian function. (2) Zero-crossings in the filtered images are found along horizontal scan lines. (3) For each mask size, matching takes place between zero-crossings of the same sign and roughly the same orientation in the two images, for a range of disparities up to about the width of the mask's central region. Within this disparity range, Marr and Poggio showed that false targets pose only a simple problem. (4) The output of the wide masks can control vergence movements, thus causing small masks to come into correspondence. In this way, the matching process gradually moves from dealing with large disparities at a low resolution to dealing with small disparities at a high resolution. (5) When a correspondence is achieved, it is stored in a dynamic buffer, called the 2 1/2 dimensional sketch. To support the sufficiency of the Marr-Poggio model of human stereo vision, the implementation was tested on a wide range of stereograms from the human stereopsis literature. The performance of the implementation is illustrated and compared with human perception. As well, statistical assumptions made by Marr and Poggio are supported by comparison with statistics found in practice. Finally, the process of implementing the theory has led to the clarification and refinement of a number of details within the theory; these are discussed in detail.

AIM-564

Author[s]: Lucia M. Vaina

Towards a Computational Theory of Semantic Memory

February 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-564.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-564.pdf

Research in memory has been a frustrating task not least because of the intimate familiarity with what we are trying to understand, and partly also because the human cognitive system has developed as an interactive whole; it is difficult to isolate its component modules – a necessary prerequisite for their thorough elucidation. Memory cannot be studied in isolation since it is essentially only an adjunct to the proper execution of our ordinary information processing tasks. In order to try to formulate specifically some of the basic requirements of memory we must therefore examine the structure of the processing tasks for which it is used.

AIM-561

Author[s]: William A. Kornfeld

Using Parallel Processing for Problem Solving

December 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-561.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-561.pdf

Parallel processing as a conceptual aid in the design of programs for problem solving applications is developed. A pattern directed invocation language known as Ether is introduced. Ether embodies two notions in language design: activities and viewpoints. Activities are the basic parallel processing primitive. Different goals of the system can be pursued in parallel by placing them in separate activities. Language primitives are provided for manipulating running activities. Viewpoints are a generalization of context mechanisms and serve as a device for representing multiple world models. A number of problem solving schemes are developed making use of viewpoints and activities. It will be demonstrated that many kinds of heuristic search that are commonly implemented using backtracking can be reformulated to use parallel processing with advantage in control over the problem solving behavior. The semantics of Ether are such that such things as deadlock and race conditions that plague many languages for parallel processing cannot occur. The programs presented are quite simple to understand.

AIM-559

Author[s]: Jack Holloway, Guy L. Steele Jr., Gerald Jay Sussman and Alan Bell

The SCHEME-79 Chip

January 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-559.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-559.pdf

We have designed and implemented a single-chip microcomputer (which we call SCHEME-79) which directly interprets a typed pointer variant of SCHEME, a dialect of the language LISP. To support this interpreter the chip implements an automatic storage allocation system for heap-allocated data and an interrupt facility for user interrupt routines implemented in SCHEME. We describe how the machine architecture is tailored to support the language, and the design methodology by which the hardware was synthesized. We develop an interpreter for SCHEME written in LISP which may be viewed as a microcode specification. This is converted by successive compilation passes into actual hardware structures on the chip. We develop a language embedded in LISP for describing layout artwork so we can procedurally define generators for generalized macro components. The generators accept parameters to produce the specialized instances used in a particular design. We discuss the performance of the current design and directions for improvement, both in the circuit performance and in the algorithms implemented by the chip. A complete annotated listing of the microcode embodied by the chip is included.

AIM-558

Author[s]: David C. Marr and Tomaso Poggio

Some Comments on a Recent Theory of Stereopsis

July 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-558.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-558.pdf

A number of developments have taken place since the formulation of Marr and Poggio’s theory of human stereo vision. In particular, these concern the shape of the underlying receptive fields, the control of eye movements and the role of neuronal pools in the so-called pulling effect. These and other connected matters are briefly discussed.

AIM-557

Author[s]: Francis H.C. Crick, David C. Marr and Tomaso Poggio

An Information Processing Approach to Understanding the Visual Cortex

April 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-557.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-557.pdf

An outline description is given of the experimental work on the visual acuity and hyperacuity of human beings. The very high resolution achieved in hyperacuity corresponds to a fraction of the spacing between adjacent cones in the fovea. We briefly outline a computational theory of early vision, according to which (a) the retinal image is filtered through a set of approximately bandpass, spatial filters and (b) zero-crossings may contain sufficient information for much of the subsequent processing. Consideration of the optimum filter leads to one which is equivalent to a cell with a particular center-surround type of response. An “edge” in the visual field then corresponds to a line of zero-crossings in the filtered image. The mathematics of sampling and of Logan’s zero-crossing theorem are briefly explained.

AIM-556

Author[s]: Richard M. Stallman

Phantom Stacks: If You Look Too Hard, They Aren't There

July 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-556.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-556.pdf

A Stack is a very efficient way of allocating and deallocating memory, but it works only with a restricted pattern of usage. Garbage collection is completely flexible but comparatively costly. The implementation of powerful control structures naturally uses memory which usually fits in with stack allocation but must have the flexibility to do otherwise from time to time. How can we manage memory which only once in a while violates stack restrictions, without paying a price the rest of the time? This paper provides an extremely simple way of doing so, in which only the part of the system which actually uses the stack needs to know anything about the stack. We call them Phantom Stacks because they are liable to vanish if subjected to close scrutiny. Phantom Stacks will be used in the next version of the Artificial Intelligence Lab’s Scheme microprocessor chip.

AIM-555

Author[s]: Richard M. Stallman

EMACS Manual for TWENEX Users

March 1983

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-555.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-555.pdf

A reference manual for the extensible, customizable, self-documenting real-time display editor. This manual corresponds to EMACS version 162.

AIM-554

Author[s]: Richard M. Stallman

EMACS Manual for ITS Users

October 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-554.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-554.pdf

A reference manual for the extensible, customizable, self-documenting real-time display editor. This manual corresponds to EMACS version 162.

AIM-552

Author[s]: Beth C. Levin

Instrumental With and the Control Relation in English

November 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-552.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-552.pdf

This paper explores the nature of the underlying representation of a sentence, that representation formulated to make explicit the semantic structure of a sentence as a description of an event. It argues that the typical conception of an underlying representation as a predicate-argument representation, exemplified in systems of case and thematic relations, must be modified. An underlying representation must include semantic relations between noun phrases as well as the predicate-argument relations of noun phrases to a verb. An examination of instrumental with will be used to motivate and justify this revision. In particular, an account of instrumental with requires the introduction of the control relation, a relation between two noun phrases.

AIM-551

Author[s]: David A. McAllester

An Outlook on Truth Maintenance

August 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-551.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-551.pdf

Truth maintenance systems have been used in several recent problem solving systems to record justifications for deduced assertions, to track down the assumptions which underlie contradictions when they arise, and to incrementally modify assertional data structures when assumptions are retracted. A TMS algorithm is described here that is substantially different from previous systems. This algorithm performs deduction in traditional propositional logic in such a way that the premise set from which deduction is being done can be easily manipulated. A novel approach is also taken to the role of a TMS in larger deductive systems. In this approach the TMS performs all propositional deduction in a uniform manner while the larger system is responsible for controlling the instantiation of universally quantified formulae and axiom schemas.

AITR-550

Author[s]: David Allen McAllester

The Use of Equality in Deduction and Knowledge Representation

January 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-550.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-550.pdf

This report describes a system which maintains canonical expressions for designators under a set of equalities. Substitution is used to maintain all knowledge in terms of these canonical expressions. A partial order on designators, termed the better-name relation, is used in the choice of canonical expressions. It is shown that with an appropriate better-name relation an important engineering reasoning technique, propagation of constraints, can be implemented as a special case of this substitution process. Special purpose algebraic simplification procedures are embedded such that they interact effectively with the equality system. An electrical circuit analysis system is developed which relies upon constraint propagation and algebraic simplification as primary reasoning techniques. The reasoning is guided by a better-name relation in which referentially transparent terms are preferred to referentially opaque ones. Multiple descriptions of subcircuits are shown to interact strongly with the reasoning mechanism.
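
The notion of a canonical expression chosen by a better-name relation can be sketched with a small union-find style structure. This is an illustration only, not the report's system: the class and method names are hypothetical, designators are plain strings, and the default ordering (shorter names first, then alphabetical) is an assumed stand-in for the better-name relation, which in the report is a partial order.

```python
class Equalities:
    """Track equalities between designators and pick a canonical name for each class."""

    def __init__(self, better=lambda a, b: (len(a), a) < (len(b), b)):
        self.parent = {}      # maps each designator to a better-named equal
        self.better = better  # better(a, b) is True when a is the better name

    def canonical(self, x):
        """Return the canonical designator of the class containing x."""
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            x = self.parent[x]
        return x

    def assert_equal(self, a, b):
        """Record a = b; the better-named canonical form represents the merged class."""
        ra, rb = self.canonical(a), self.canonical(b)
        if ra == rb:
            return
        if self.better(ra, rb):
            self.parent[rb] = ra
        else:
            self.parent[ra] = rb
```

After asserting that "voltage-at-node-1" equals "v1" and that "v1" equals "5", every member of the class maps to "5", so knowledge stated in terms of any of the three names can be rewritten in terms of one canonical expression.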

AIM-549

Author[s]: Richard C. Waters

Mechanical Arm Control

October 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-549.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-549.pdf

This paper discusses three main problems associated with the control of the motion of a mechanical arm. 1) Transformation between different coordinate systems associated with the arm. 2) Calculation of detailed trajectories for the arm to follow. 3) Calculation of the forces which must be applied to the joints of the arm in order to make it move along a specified path. Each of the above problems is amenable to exact solution. However, the resulting equations are, in general, quite complex and difficult to compute. This paper investigates several methods for speeding up this calculation, and for getting approximate solutions to the equations.

AIM-548

Author[s]: Glenn A. Iba

Learning Disjunctive Concepts From Examples

September 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-548.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-548.pdf

This work proposes a theory for machine learning of disjunctive concepts. The paradigm followed is one of teaching and testing, where the teaching is accomplished by presenting a sequence of positive and negative examples of the target concept. The core of the theory has been implemented and tested as computer programs. The theory addresses the problem of deciding when it is appropriate to merge descriptions and when it is appropriate to form a disjunctive split. The approach outlined has the advantage that it allows recovery from over generalizations. It is observed that negative examples play an important role in the decision making process, as well as in detecting over generalizations and instigating recovery. Because of the ability to recover from over generalizations when they occur, the system is less sensitive to the ordering of the training sequence than other systems. The theory is presented in a domain and representation independent format. A few conditions are presented, which abstract the assumptions made about any representation scheme that is to be employed within the theory. The work is illustrated in several different domains, illustrating the generality and flexibility of the theory.

AIM-546

Author[s]: Seymour Papert, Daniel Watt, Andrea diSessa and Sylvia Weir

Final Report of the Brookline LOGO Project. Part III: Profiles of Individual Student's Work

September 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-546.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-546.pdf

During the school year 1977/78 four computers equipped with LOGO and Turtle Graphics were installed in an elementary school in Brookline, Mass. All sixth grade students in the school had between 20 and 40 hours of hands-on experience with the computers. The work of 16 students was documented in detail.

AIM-545

Author[s]: Seymour Papert, Daniel Watt, Andrea diSessa and Sylvia Weir

Final Report of the Brookline LOGO Project. Part II: Project Summary and Data

September 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-545.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-545.pdf

AIM-544

Author[s]: Marvin Minsky

Toward a Remotely-Manned Energy and Production Economy

September 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-544.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-544.pdf

We can solve many problems of Energy, Health, Productivity, and Environmental Quality by improving the technology of remote control. This will produce Nuclear Safety and Security, Advances in Mining, Increases in Productivity, Economies in Transportation, New Industries and Markets. By creating “mechanical hands” that are versatile and economical enough, we shape a new world of health, energy and security. It will take 10 to 20 years, and cost about a billion dollars.

AIM-543

Author[s]: Luc Steels

Procedural Attachment

August 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-543.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-543.pdf

A frame-based reasoning system is extended to deal with procedural attachment. Arguments are given why procedural attachment is needed in a symbolic reasoner. The notion of an infinitary concept is introduced. Conventions for representing procedures and a control structure regulating their execution are discussed. Examples from electrical engineering and music illustrate arithmetic constraints and constraints over properties of strings and sequences.

AITR-542

Author[s]: Luc Steels

Reasoning Modeled as a Society of Communicating Experts

June 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-542.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-542.pdf

This report describes a domain independent reasoning system. The system uses a frame-based knowledge representation language and various reasoning techniques including constraint propagation, progressive refinement, natural deduction and explicit control of reasoning. A computational architecture based on active objects which operate by exchanging messages is developed and it is shown how this architecture supports reasoning activity. The user interacts with the system by specifying frames and by giving descriptions defining the problem situation. The system uses its reasoning capacity to build up a model of the problem situation from which a solution can interactively be extracted. Examples are discussed from a variety of domains, including electronic circuits, mechanical devices and music. The main thesis is that a reasoning system is best viewed as a parallel system whose control and data are distributed over a large network of processors that interact by exchanging messages. Such a system will be metaphorically described as a society of communicating experts.

AIM-541

Author[s]: D. Marr, E. Hildreth and T. Poggio

Evidence for a Fifth, Smaller Channel in Early Human Vision

August 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-541.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-541.pdf

Recent studies in psychophysics and neurophysiology suggest that the human visual system utilizes a range of different size or spatial frequency tuned mechanisms in its processing of visual information. It has been proposed that there exist four such mechanisms, operating everywhere in the visual field, with the smallest mechanism having a central excitatory width of 3’ of arc in the central fovea. This note argues that there exists indirect evidence for the existence of a fifth, smaller channel, with a central width in the fovea of 1.5’.

AITR-540

Author[s]: Kenneth Michael Kahn

Creation of Computer Animation from Story Descriptions

August 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-540.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-540.pdf

This report describes a computer system that creates simple computer animation in response to high-level, vague, and incomplete descriptions of films. It makes its films by collecting and evaluating suggestions from several different bodies of knowledge. The order in which it makes its choices is influenced by the focus of the film. Difficult choices are postponed to be resumed when more of the film has been determined. The system was implemented in an object-oriented language based upon computational entities called “actors”. The goal behind the construction of the system is that, whenever faced with a choice, it should sensibly choose between alternatives based upon the description of the film and as much general knowledge as possible. The system is presented as a computational model of creativity and aesthetics.

AIM-539

Author[s]: Katsushi Ikeuchi and Berthold K.P. Horn

An Application of the Photometric Stereo Method

August 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-539.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-539.pdf

The orientation of patches on the surface of an object can be determined from multiple images taken with different illuminations, but from the same viewing position. This method, referred to as photometric stereo, can be implemented using table lookup based on numerical inversion of experimentally determined reflectance maps. Here we concentrate on objects with specularly reflecting surfaces, since these are of importance in industrial applications. Previous methods, intended for diffusely reflecting surfaces, employed point source illumination, which is quite unsuitable in this case. Instead, we use a distributed light source obtained by uneven illumination of a diffusely reflecting planar surface. Experimental results are shown to verify analytic expressions obtained for a method employing three light source distributions.

AITR-537

Author[s]: Candace Lee Sidner

Towards a Computational Theory of Definite Anaphora Comprehension in English Discourse

June 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-537.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-537.pdf

This report investigates the process of focussing as a description and explanation of the comprehension of certain anaphoric expressions in English discourse. The investigation centers on the interpretation of definite anaphora, that is, on the personal pronouns, and noun phrases used with a definite article the, this or that. Focussing is formalized as a process in which a speaker centers attention on a particular aspect of the discourse. An algorithmic description specifies what the speaker can focus on and how the speaker may change the focus of the discourse as the discourse unfolds. The algorithm allows for a simple focussing mechanism to be constructed: an element in focus, an ordered collection of alternate foci, and a stack of old foci. The data structure for the element in focus is a representation which encodes a limited set of associations between it and other elements from the discourse as well as from general knowledge.

AIM-536

Author[s]: Berthold K.P. Horn

SEQUINS and QUILLS: Representations for Surface Topography

May 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-536.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-536.pdf

The shape of a continuous surface can be represented by a collection of surface normals. These normals are like a porcupine’s quills. Equivalently, one can use the surface patches on which these normals rest. These in turn are like sequins sewn on a costume. These and other representations for information which can be obtained from images and used in the recognition and description of objects in a scene will be briefly described.

AITR-534

Author[s]: John Hollerbach

Theory of Handwriting

March 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-534.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-534.pdf

Handwriting production is viewed as a constrained modulation of an underlying oscillatory process. Coupled oscillations in horizontal and vertical directions produce letter forms, and when superimposed on a rightward constant velocity horizontal sweep result in spatially separated letters. Modulation of the vertical oscillation is responsible for control of letter height, either through altering the frequency or altering the acceleration amplitude. Modulation of the horizontal oscillation is responsible for control of corner shape through altering phase or amplitude. The vertical velocity zero crossing in the velocity space diagram is important from the standpoint of control. Changing the horizontal velocity value at this zero crossing controls corner shape, and such changes can be effected through modifying the horizontal oscillation amplitude and phase. Changing the slope at this zero crossing controls writing slant; this slope depends on the horizontal and vertical velocity zero amplitudes and on the relative phase difference. Letter height modulation is also best applied at the vertical velocity zero crossing to preserve an even baseline. The corner shape and slant constraints completely determine the amplitude and phase relations between the two oscillations. Under these constraints interletter separation is not an independent parameter. This theory applies generally to a number of acceleration oscillation patterns such as sinusoidal, rectangular and trapezoidal oscillations. The oscillation theory also provides an explanation for how handwriting might degenerate with speed. An implementation of the theory in the context of the spring muscle model is developed. Here sinusoidal oscillations arise from purely mechanical sources; orthogonal antagonistic spring pairs generate particular cycloids depending on the initial conditions. Modulating between cycloids can be achieved by changing the spring zero settings at the appropriate times. Frequency can be modulated either by shifting between coactivation and alternating activation of the antagonistic springs or by presuming variable spring constant springs. An acceleration and position measuring apparatus was developed for measurements of human handwriting. Measurements of human writing are consistent with the oscillation theory. It is shown that the minimum energy movement for the spring muscle is bang-coast-bang. For certain parameter values a singular arc solution can be shown to be minimizing. Experimental measurements however indicate that handwriting is not a minimum energy movement.
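The core picture in this abstract, two coupled oscillations superimposed on a constant rightward sweep, can be sketched numerically. The following is a minimal illustration only, assuming sinusoidal velocity oscillations; the function names, parameter names, and default values are hypothetical and are not taken from the report.

```python
import math


def pen_position(t, sweep=1.0, a_x=1.0, a_y=1.0, omega=2 * math.pi, phase=math.pi / 2):
    """Pen position at time t, starting from the origin at t = 0.

    Assumed velocities (illustrative, not the report's equations):
        dx/dt = sweep + a_x * sin(omega * t + phase)
        dy/dt = a_y * sin(omega * t)
    The positions below are the closed-form integrals of those velocities.
    """
    x = sweep * t - (a_x / omega) * (math.cos(omega * t + phase) - math.cos(phase))
    y = -(a_y / omega) * (math.cos(omega * t) - 1.0)
    return x, y


def trace(duration, steps, **params):
    """Sample the pen trajectory at steps + 1 evenly spaced times."""
    return [pen_position(duration * i / steps, **params) for i in range(steps + 1)]


# Two oscillation periods; changing a_y scales letter height, while changing
# a_x or phase alters the shape of the corners.
points = trace(2.0, 100)
print(points[-1])
```

Over one full period the oscillatory terms cancel, so the pen advances by exactly the sweep distance and returns to the baseline. In this simplified model, that is why the sweep velocity and period together set the spacing between successive letter forms.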

AIM-533

Author[s]: John M. Hollerbach

A Recursive Lagrangian Formulation of Manipulator Dynamics

June 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-533.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-533.pdf

An efficient Lagrangian formulation of manipulator dynamics has been developed. The efficiency derives from recurrence relations for the velocities, accelerations, and generalized forces. The number of additions and multiplications varies linearly with the number of joints, as opposed to past Lagrangian dynamics formulations with an n^4 dependence. With this formulation it should be possible in principle to compute the Lagrangian dynamics in real time. The computational complexities of this and other dynamics formulations including recent Newton-Euler formulations and tabular formulations are compared. It is concluded that recursive formulations based either on the Lagrangian or Newton-Euler dynamics offer the best method of dynamics calculation.

AIM-531

Author[s]: Mitchell P. Marcus

An Overview of a Theory of Syntactic Recognition for Natural Language

July 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-531.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-531.pdf

Assume that the syntax of natural language can be parsed by a left-to-right deterministic mechanism without facilities for parallelism or backup. It will be shown that this “determinism” hypothesis, explored within the context of the grammar of English, leads to a simple mechanism, a grammar interpreter, having the following properties: (a) Simple rules of grammar can be written for this interpreter which capture the generalizations behind various linguistic phenomena, despite the seeming difficulty of capturing such generalizations in the framework of a processing model for recognition. (b) The structure of the grammar rules cannot parse sentences which violate either of two constraints which Chomsky claims are linguistic universals. This result depends in part upon the computational use of Chomsky’s notion of Annotated Surface Structure. (c) The grammar interpreter provides a simple explanation for the difficulty caused by “garden path” sentences, such as “The cotton clothing is made of grows in Mississippi”. To the extent that these properties, all of which reflect deep properties of natural language, follow from the original hypothesis, they provide indirect evidence for the truth of this assumption. This memo is an abridged form of several topics discussed at length in [Marcus 77]; it does not discuss the mechanism used to parse noun phrases nor the kinds of interaction between syntax and semantics discussed in that work.

AIM-530

Author[s]: David A. Smith

Using Enhanced Spherical Images for Object Representation

May 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-530.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-530.pdf

The processes involved in vision, manipulation, and spatial reasoning depend greatly on the particular representation of three-dimensional objects used. A novel representation, based on concepts of differential geometry, is explored. Special attention is given to properties of the enhanced spherical image model, reconstruction of objects from their representation, and recognition of similarity with prototypes. Difficulties associated with representing smooth and non-convex bodies are also discussed.

AITR-529

Author[s]: Johan de Kleer

Causal and Teleological Reasoning in Circuit Recognition

September 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-529.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-529.pdf

This thesis presents a theory of human-like reasoning in the general domain of designed physical systems, and in particular, electronic circuits. One aspect of the theory, causal analysis, describes how the behavior of individual components can be combined to explain the behavior of composite systems. Another aspect of the theory, teleological analysis, describes how the notion that the system has a purpose can be used to aid this causal analysis. The theory is implemented as a computer program, which, given a circuit topology, can construct by qualitative causal analysis a mechanism graph describing the functional topology of the system. This functional topology is then parsed by a grammar for common circuit functions. Ambiguities are introduced into the analysis by the approximate qualitative nature of the analysis. For example, there are often several possible mechanisms which might describe the circuit's function. These are disambiguated by teleological analysis. The requirement that each component be assigned an appropriate purpose in the functional topology imposes a severe constraint which eliminates all the ambiguities. Since both analyses are based on heuristics, the chosen mechanism is a rationalization of how the circuit functions, and does not guarantee that the circuit actually does function. This type of coarse understanding of circuits is useful for analysis, design and troubleshooting.

AIM-528

Author[s]: Thomas F. Knight, Jr., David A. Moon, Jack Holloway and Guy L. Steele, Jr.

CADR

May 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-528.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-528.pdf

The CADR machine, a revised version of the CONS machine, is a general-purpose, 32-bit microprogrammable processor which is the basis of the Lisp-machine system, a new computer system being developed by the Laboratory as a high-performance, economical implementation of Lisp. This paper describes the CADR processor and some of the associated hardware and low-level software.

AIM-527

Author[s]: Guy Lewis Steele, Jr. and Gerald Jay Sussman

The Dream of a Lifetime: A Lazy Scoping Mechanism

November 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-527.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-527.pdf

We define a “rack”, a data abstraction hybrid of a register and a stack. It is used for encapsulating the behavior of the kind of register whose contents may have an extent which requires that it be saved during the execution of an unknown piece of code. A rack can be implemented cleverly to achieve performance benefits over the usual implementation of a stack discipline. The basic idea is that we interpose a state machine controller between the rack abstraction and its stack/registers. This controller can act as an on-the-fly run-time peephole optimizer, eliding unnecessary stack operations. We demonstrate the sorts of savings one might expect by using cleverly implemented racks in the context of a particular caller-saves implementation of an interpreter for the SCHEME dialect of LISP. For sample problems we can expect that only one out of every four pushes that would be done by a conventional machine will be done by a clever version.
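To make the rack idea concrete, here is a minimal sketch of a register/stack hybrid that defers pushes until they are actually needed. It is not the memo's controller: it assumes a very simple policy in which a save is only recorded as pending, and the push happens only if the register is overwritten before the matching restore. The class name `LazyRack` and its method names are hypothetical.

```python
class LazyRack:
    """A register with save/restore whose pushes are deferred until needed."""

    def __init__(self, value=None):
        self.value = value        # current register contents
        self.stack = []           # backing stack, used only when unavoidable
        self.pending_saves = 0    # saves of the current value not yet pushed
        self.pushes = 0           # number of real stack pushes performed

    def fetch(self):
        return self.value

    def save(self):
        # Do not push yet: just remember that the current value was saved.
        self.pending_saves += 1

    def assign(self, new_value):
        # The old value is about to be lost, so pending saves must become real.
        while self.pending_saves:
            self.stack.append(self.value)
            self.pushes += 1
            self.pending_saves -= 1
        self.value = new_value

    def restore(self):
        if self.pending_saves:
            # The register was never overwritten: the save and restore cancel.
            self.pending_saves -= 1
        else:
            self.value = self.stack.pop()


rack = LazyRack("env-0")
rack.save()               # caller saves before running unknown code
rack.restore()            # the code never touched the register
print(rack.pushes)        # no real push was needed

rack.save()
rack.assign("env-1")      # now the saved value must really be pushed
rack.restore()
print(rack.fetch(), rack.pushes)
```

A conventional stack discipline would push on every save; in this sketch a push is paid for only when an assignment intervenes between a save and its restore.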

AIM-526

Author[s]: Gerald Jay Sussman, Jack Holloway and Thomas F. Knight, Jr.

Computer Aided Evolutionary Design for Digital Integrated Systems

May 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-526.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-526.pdf

We propose to develop a computer aided design tool which can help an engineer deal with system evolution from the initial phases of design right through the testing and maintenance phases. We imagine a design system which can function as a junior assistant. It provides a total conversational and graphical environment. It remembers the reasons for design choices and can retrieve and do simple deductions with them. Such a system can provide a designer with information relevant to a proposed modification and can help him understand the consequences of simple modifications by pointing out the structures and functions which will be affected by modifications. The designer’s assistant will maintain a vast amount of such annotation on the structure and function of the system being evolved and will be able to retrieve the appropriate annotation and remind the designer about the features which he installed too long ago to remember, or which were installed by other designers who work with him. We will develop the fundamental principles behind such a designer’s assistant and we will construct a prototype system which meets many of these desiderata.

AIM-524

Author[s]: D. Marr and S. Ullman

Directional Selectivity and Its Use in Early Visual Processing

June 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-524.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-524.pdf

The construction of directionally selective units and their use in the processing of visual motion are considered.

AIM-523

Author[s]: Jeanne Bamberger

Logo Music Projects: Experiments in Musical Perception and Design

May 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-523.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-523.pdf

This memo gives a series of experiments which one can use to get a better understanding of how music works and how music is apprehended by an active and knowing listener. It does so by using the children’s computer language, LOGO, and capitalizes on the use of procedural thinking and other programming concepts (for example, the use of variables) in the designing and analysis of melody and rhythm.

AIM-522

Author[s]: Kent A. Stevens

Constraints on the Visual Interpretation of Surface Contours

March 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-522.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-522.pdf

This article examines the computational problems underlying the 3-D interpretation of surface contours. A surface contour is the image of a curve across a physical surface, such as the edge of a shadow cast across a surface, a gloss contour, wrinkle, seam, or pigmentation marking. Surface contours by and large are not as restricted as occluding contours and therefore pose a more difficult interpretation problem. Nonetheless, we are adept at perceiving a definite 3-D surface from even simple line drawings (e.g. graphical depictions of continuous functions of two variables). The solution of a specific surface shape comes by assuming that the physical curves are particularly restricted in their geometric relationship to the underlying surface. These geometric restrictions are examined.

AIM-521

Author[s]: Jon Doyle

A Truth Maintenance System

June 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-521.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-521.pdf

To choose their actions, reasoning programs must be able to make assumptions and subsequently revise their beliefs when discoveries contradict these assumptions. The Truth Maintenance System (TMS) is a problem solver subsystem for performing these functions by recording and maintaining the reasons for program beliefs. Such recorded reasons are useful in constructing explanations of program actions and in guiding the course of action of a problem solver. This paper describes (1) the representations and structure of the TMS, (2) the mechanisms used to revise the current set of beliefs, (3) how dependency-directed backtracking changes the current set of assumptions, (4) techniques for summarizing explanations of beliefs, (5) how to organize problem solvers into “dialectically arguing” modules, (6) how to revise models of the belief systems of others, and (7) methods for embedding control structures in patterns of assumptions. We stress the need of problem solvers to choose between alternative systems of beliefs, and outline a mechanism by which a problem solver can employ rules guiding choices of what to believe, what to want, and what to do.
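The basic bookkeeping described here, recording reasons for beliefs and revising beliefs when an assumption is withdrawn, can be illustrated with a deliberately simplified sketch. It assumes purely monotonic justifications (no out-lists, no dependency-directed backtracking), so it is far weaker than the system in the memo. The class `TinyTMS` and its method names are hypothetical.

```python
class TinyTMS:
    """Records justifications and recomputes beliefs from current assumptions."""

    def __init__(self):
        self.assumptions = set()
        self.justifications = []   # list of (consequent, antecedents) pairs

    def assume(self, node):
        self.assumptions.add(node)

    def retract(self, node):
        self.assumptions.discard(node)

    def justify(self, consequent, antecedents):
        """Record that consequent holds whenever all antecedents are believed."""
        self.justifications.append((consequent, tuple(antecedents)))

    def beliefs(self):
        """Map each believed node to the antecedents that support it."""
        support = {a: () for a in self.assumptions}
        changed = True
        while changed:
            changed = False
            for consequent, antecedents in self.justifications:
                if consequent not in support and all(a in support for a in antecedents):
                    support[consequent] = antecedents
                    changed = True
        return support

    def explain(self, node):
        """Return the assumptions underlying node, or None if it is not believed."""
        support = self.beliefs()
        if node not in support:
            return None
        found, todo = set(), [node]
        while todo:
            current = todo.pop()
            if current in self.assumptions:
                found.add(current)
            else:
                todo.extend(support[current])
        return found


tms = TinyTMS()
tms.assume("battery-ok")
tms.assume("switch-closed")
tms.justify("current-flows", ["battery-ok", "switch-closed"])
tms.justify("lamp-lit", ["current-flows"])
print(sorted(tms.explain("lamp-lit")))
tms.retract("switch-closed")
print(tms.explain("lamp-lit"))
```

Because support is assigned only from nodes that are already believed, every belief in this sketch traces back to current assumptions without circular reasoning; retracting an assumption therefore removes exactly the conclusions that depended on it.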

AIM-520

Author[s]: Patrick H. Winston

Learning and Reasoning by Analogy: The Details

April 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-520.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-520.pdf

We use analogy when we say something is a Cinderella story and when we learn about resistors by thinking about water pipes. We also use analogy when we learn subjects like Economics, Medicine and Law. This paper presents a theory of analogy and describes an implemented system that embodies the theory. The specific competence to be understood is that of using analogies to do certain kinds of learning and reasoning. Learning takes place when analogy is used to generate a constraint description in one domain, given a constraint description in another, as when we learn Ohm’s law by way of knowledge about water pipes. Reasoning takes place when analogy is used to answer questions about one situation, given another situation that is supposed to be a precedent, as when we answer questions about Hamlet by way of knowledge about Macbeth. The input language used and the treatment of words implying CAUSE have been improved. AIM 632, “Learning New Principles from Precedents and Exercises,” describes these improvements and subsequent work. It is, at this writing, in publication in the Artificial Intelligence Journal.

AIM-519A

Author[s]: Richard M. Stallman

EMACS: The Extensible, Customizable, Self-Documenting Display Editor

March 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-519A.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-519A.pdf

EMACS is a display editor which is implemented in an interpreted high level language. This allows users to extend the editor by replacing parts of it, to experiment with alternative command languages, and to share extensions which are generally useful. The ease of extension has contributed to the growth of a large set of useful features. This paper describes the organization of the EMACS system, emphasizing the way in which extensibility is achieved and used.

AIM-518

Author[s]: D. Marr and E. Hildreth

Theory of Edge Detection

April 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-518.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-518.pdf

A theory of edge detection is presented.

AIM-517

Author[s]: Anna R. Bruss

Some Properties of Discontinuities in the Image Irradiance Equation

April 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-517.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-517.pdf

The image irradiance equation is a first order partial differential equation. Part of this paper is a “comprehensive” guide to solving this kind of equation. The special structure of the image irradiance equation is explored in order to understand the relation of discontinuities in the surface properties and in the image intensities.

AIM-516

Author[s]: Marvin Minsky

K-Lines: A Theory of Memory

June 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-516.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-516.pdf

Most theories of memory suggest that when we learn or memorize something, some “representation” of that something is constructed, stored and later retrieved. This raises questions like: How is information represented? How is it stored? How is it retrieved? Then, how is it used? This paper tries to deal with all these at once. When you get an idea and want to “remember” it, you create a “K-line” for it. When later activated, the K-line induces a partial mental state resembling the one that created it. A “partial mental state” is a subset of those mental agencies operating at one moment. This view leads to many ideas about the development, structure and physiology of Memory, and about how to implement frame-like representations in a distributed processor.

AITR-515

Author[s]: Matthew Thomas Mason

Compliance and Force Control for Computer Controlled Manipulators

April 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-515.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-515.pdf

Compliant motion occurs when the manipulator position is constrained by the task geometry. Compliant motion may be produced either by a passive mechanical compliance built in to the manipulator, or by an active compliance implemented in the control servo loop. The second method, called force control, is the subject of this report. In particular, this report presents a theory of force control based on formal models of the manipulator, and the task geometry. The ideal effector is used to model the manipulator, and the task geometry is modeled by the ideal surface, which is the locus of all positions accessible to the ideal effector. Models are also defined for the goal trajectory, position control, and force control.

AIM-514

Author[s]: Guy Lewis Steele, Jr. and Gerald Jay Sussman

Design of LISP-based Processors, or SCHEME: A Dielectric LISP, or Finite Memories Considered Harmful, or LAMBDA: The Ultimate Opcode

March 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-514.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-514.pdf

We present a design for a class of computers whose 'instruction sets' are based on LISP. LISP, like traditional stored-program machine languages and unlike most high-level languages, conceptually stores programs and data in the same way and explicitly allows programs to be manipulated as data. LISP is therefore a suitable language around which to design a stored-program computer architecture. LISP differs from traditional machine languages in that the program/data storage is conceptually an unordered set of linked record structures of various sizes, rather than an ordered, indexable vector of integers or bit fields of fixed size. The record structures can be organized into trees or graphs. An instruction set can be designed for programs expressed as such trees. A processor can interpret these trees in a recursive fashion, and provide automatic storage management for the record structures. We describe here the basic ideas behind the architecture, and for concreteness give a specific instruction set (on which variations are certainly possible). We also discuss the similarities and differences between these ideas and those of traditional architectures. A prototype VLSI microprocessor has been designed and fabricated for testing. It is a small-scale version of the ideas presented here, containing a sufficiently complete instruction interpreter to execute small programs, and a rudimentary storage allocator. We intend to design and fabricate a full-scale VLSI version of this architecture in 1979.

AIM-513

Author[s]: Kenneth M. Kahn

Making Aesthetic Choices

March 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-513.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-513.pdf

A framework is presented for making choices that are primarily constrained by aesthetic, as opposed to pragmatic, considerations. An example of the application of this framework is a computer system called “Ani”, capable of making simple computer animation in response to high-level incomplete story descriptions. Aesthetic choice is presented as a parallel computation in which each choice point gathers together and evaluates suggestions. When faced with difficulties these choices can be postponed. The order in which inter-dependent choices are made is strongly influenced by the focus of the problem.

AITR-512

Author[s]: Kent A. Stevens

Surface Perception from Local Analysis of Texture and Contour

February 1980

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-512.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-512.pdf

The visual analysis of surface shape from texture and surface contour is treated within a computational framework. The aim of this study is to determine valid constraints that are sufficient to allow surface orientation and distance (up to a multiplicative constant) to be computed from the image of surface texture and of surface contours.

AIM-510

Author[s]: W.E.L. Grimson

Differential Geometry, Surface Patches and Convergence Methods

February 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-510.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-510.pdf

The problem of constructing a surface from the information provided by the Marr-Poggio theory of human stereo vision is investigated. It is argued that not only does this theory provide explicit boundary conditions at certain points in the image, but that the imaging process also provides implicit conditions on all other points in the image. This argument is used to derive conditions on possible algorithms for computing the surface. Additional constraining principles are applied to the problem; specifically that the process be performable by a local-support parallel network. Some mathematical tools, differential geometry, Coons surface patches and iterative methods of convergence, relevant to the problem of constructing the surface are outlined. Specific methods for actually computing the surface are examined.

AIM-507

Author[s]: Howard E. Shrobe, Richard C. Waters and Gerald J. Sussman

A Hypothetical Monologue Illustrating the Knowledge Underlying Program Analysis

January 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-507.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-507.pdf

Automated Program Analysis is the process of discovering decompositions of a system into sub-units such that the behavior of the whole program can be inferred from the behavior of its parts. Analysis can be employed to increase the explanatory power of a program understanding system. We identify several techniques which are useful for automated program analysis. Chief among these is the identification and classification of the macro-scale units of programming knowledge which are characteristic of the problem domain. We call these plans. This paper presents a summary of how plans can be used in program analysis in the form of a hypothetical monologue. We also show a small catalogue of plans which are characteristic of AI programming. Finally, we present some techniques which facilitate plan recognition.

AIM-506

Author[s]: Charles Rich, Howard E. Shrobe and Richard C. Waters

Computer Aided Evolutionary Design for Software Engineering

January 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-506.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-506.pdf

We report on a partially implemented interactive computer aided design tool for software engineering. A distinguishing characteristic of our project is its concern for the evolutionary character of software systems. Our project draws a distinction between algorithms and systems, centering its attention on support for the system designer. Although verification has played a large role in recent research, our perspective suggests that the complexity and evolutionary nature of software systems require a number of additional techniques, which are described in this paper.

AIM-505

Author[s]: Carl Hewitt, Giuseppe Attardi and Henry Lieberman

Specifying and Proving Properties of Guardians for Distributed Systems

June 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-505.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-505.pdf

In a distributed system where many processors are connected by a network and communicate using message passing, many users can be allowed to access the same facilities. A public utility is usually an expensive or limited resource whose use has to be regulated. A GUARDIAN is an abstraction that can be used to regulate the use of resources by scheduling their access, providing protection, and implementing recovery from hardware failures. We present a language construct called a PRIMITIVE SERIALIZER which can be used to express efficient implementations of guardians in a modular fashion. We have developed a proof methodology for proving strong properties of network utilities, e.g. that the utility is guaranteed to respond to each request which it is sent. This proof methodology is illustrated by proving properties of a guardian which manages two hardcopy printing devices.

AITR-503

Author[s]: Howard Elliot Shrobe

Dependency Directed Reasoning for Complex Program Understanding

April 1979

ftp://publications.ai.mit.edu/ai-publications/500-999/AITR-503.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-503.pdf

Artificial Intelligence research involves the creation of extremely complex programs which must possess the capability to introspect, learn, and improve their expertise. Any truly intelligent program must be able to create procedures and to modify them as it gathers information from its experience. [Sussman, 1975] produced such a system for a 'mini-world'; but truly intelligent programs must be considerably more complex. A crucial stepping stone in AI research is the development of a system which can understand complex programs well enough to modify them. There is also a complexity barrier in the world of commercial software which is making the cost of software production and maintenance prohibitive. Here too a system which is capable of understanding complex programs is a necessary step. The Programmer's Apprentice Project [Rich and Shrobe, 76] is attempting to develop an interactive programming tool which will help expert programmers deal with the complexity involved in engineering a large software system. This report describes REASON, the deductive component of the programmer's apprentice. REASON is intended to help expert programmers in the process of evolutionary program design. REASON utilizes the engineering techniques of modelling, decomposition, and analysis by inspection to determine how modules interact to achieve the desired overall behavior of a program. REASON coordinates its various sources of knowledge by using a dependency-directed structure which records the justification for each deduction it makes. Once a program has been analyzed these justifications can be summarized into a teleological structure called a plan which helps the system understand the impact of a proposed program modification.

AIM-502

Author[s]: Guy Lewis Steele, Jr. and Gerald Jay Sussman

Constraints

November 1978

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-502.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-502.pdf

AIM-502A

Author[s]: Gerald Jay Sussman and Guy Lewis Steele, Jr.

Constraints: A Language for Expressing Almost-Hierarchical Descriptions

August 1981

ftp://publications.ai.mit.edu/ai-publications/500-999/AIM-502a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-502a.pdf

We present an interactive system organized around networks of constraints rather than the programs which manipulate them. We describe a language of hierarchical constraint networks. We describe one method of deriving useful consequences of a set of constraints which we call propagation. Dependency analysis is used to spot and track down inconsistent subsets of a constraint set. Propagation of constraints is most flexible and useful when coupled with the ability to perform symbolic manipulations on algebraic expressions. Such manipulations are in turn best expressed as alterations or augmentations of the constraint network. Almost-Hierarchical Constraint Networks can be constructed to represent the multiple viewpoints used by engineers in the synthesis and analysis of electrical networks. These multiple viewpoints are used in terminal equivalence and power arguments to reduce the apparent synergy in a circuit so that it can be attacked algebraically.
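
The propagation idea described in this abstract can be illustrated with a small sketch. This is not the language from the memo: the `Cell` and `Adder` classes and the current-summing example are hypothetical names chosen for illustration, and only a single constraint type is shown. A value set on any cell is pushed through every constraint attached to it, and a conflicting value is reported as a contradiction.

```python
class Cell:
    """A named quantity whose value may be unknown (None). Hypothetical helper."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.constraints = []

    def set(self, value):
        if self.value is None:
            # New information: record it and wake up every attached constraint.
            self.value = value
            for constraint in self.constraints:
                constraint.propagate()
        elif abs(self.value - value) > 1e-9:
            raise ValueError(
                f"contradiction at {self.name}: {self.value} vs {value}"
            )


class Adder:
    """Constraint a + b = total; deduces any one cell from the other two."""

    def __init__(self, a, b, total):
        self.a, self.b, self.total = a, b, total
        for cell in (a, b, total):
            cell.constraints.append(self)

    def propagate(self):
        a, b, t = self.a.value, self.b.value, self.total.value
        if a is not None and b is not None:
            self.total.set(a + b)
        elif a is not None and t is not None:
            self.b.set(t - a)
        elif b is not None and t is not None:
            self.a.set(t - b)


# Two branch currents summing to a third (illustrative values only).
i1, i2, i3 = Cell("i1"), Cell("i2"), Cell("i3")
Adder(i1, i2, i3)
i3.set(10.0)
i1.set(3.0)
print(i2.value)
```

Because the constraint has no fixed input or output, the same network deduces `i2` from `i1` and `i3` or `i3` from `i1` and `i2`, depending on which values happen to be known.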

AIM-499

Author[s]: Johan de Kleer

Causal Reasoning and Rationalization in Electronics

September 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-499.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-499.pdf

This research attempts to formalize the type of causal arguments engineers employ to understand circuit behavior. A causal argument consists of a sequence of changes to circuit quantities (called events), each of which is caused by previous events. The set of events that an individual event can directly cause is largely an artifact of the point of view taken to analyze the circuit. A particular causal argument does not rule out other possibly conflicting causal arguments for the same circuit. If the actual behavior of the circuit is known or determined by measurements, the correct argument can be identified. The selected argument is a rationalization for the observed behavior since it explains but does not guarantee the observed behavior. A causal analysis program QUAL has been implemented which determines the response of a circuit to changes in input signals. It operates with a simple four-valued arithmetic of unknown, unchanging, increasing and decreasing. This program is used to illustrate the applicability of causal reasoning to circuit recognition, algebraic analysis, troubleshooting and design.
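
A four-valued arithmetic of this kind can be sketched as follows. The abstract names only the four values; the combination rules below are the usual sign-algebra rules and are an assumption, not a transcription of QUAL. The function names `q_add` and `q_neg` are hypothetical.

```python
# The four qualitative values named in the abstract.
UNKNOWN, UNCHANGING, INCREASING, DECREASING = "?", "0", "+", "-"


def q_add(x, y):
    """Qualitative sum of two changes (assumed sign-algebra rules)."""
    if x == UNCHANGING:
        return y
    if y == UNCHANGING:
        return x
    if x == UNKNOWN or y == UNKNOWN:
        return UNKNOWN
    # Same direction reinforces; opposite directions are ambiguous.
    return x if x == y else UNKNOWN


def q_neg(x):
    """Qualitative negation: swaps increasing and decreasing."""
    return {INCREASING: DECREASING, DECREASING: INCREASING}.get(x, x)


# A rising quantity minus a falling one is still rising.
print(q_add(INCREASING, q_neg(DECREASING)))
# Two opposing changes cannot be resolved without magnitudes.
print(q_add(INCREASING, DECREASING))
```

The ambiguous case is where competing causal arguments arise: the arithmetic alone cannot say which change dominates.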

AIM-498

Author[s]: Berthold K.P. Horn and Robert W. Sjoberg

Calculating the Reflectance Map

October 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-498.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-498.pdf

It appears that the development of machine vision may benefit from a detailed understanding of the imaging process. The reflectance map, showing scene radiance as a function of surface gradient, has proved to be helpful in this endeavor. The reflectance map depends both on the nature of the surface layers of the objects being imaged and the distribution of light sources. Recently, a unified approach to the specification of surface reflectance in terms of both incident and reflected beam geometry has been proposed: the bidirectional reflectance-distribution function (BRDF). Here we derive the reflectance map in terms of the BRDF and the distribution of source radiance. A number of special cases of practical importance are developed in detail. The significance of this approach to the understanding of image formation is briefly indicated.

AIM-496

Author[s]: Seymour A. Papert and Sylvia Weir

Information Prosthetics for the Handicapped

September 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-496.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-496.pdf

In this proposal we describe a technological step towards the realization of INFORMATION PROSTHETICS. Our primary focus is on using rather than making the technology. Specifically, our goal is to transpose for the use of cerebral-palsied children a computer-based learning environment we have developed, and to study in this environment a series of issues in developmental psychology, in the psychology of learning, in psycho-diagnostic techniques and in methods of instruction.

AIM-495

Author[s]: Ira Goldstein

Developing a Computational Representation for Problem Solving Skills

October 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-495.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-495.pdf

This paper describes the evolution of a problem solving model over several generations of computer coaches. Computer coaching is a type of computer assisted instruction in which the coaching program observes the performance of a student engaged in some intellectual game. The coach’s function is to intervene occasionally in student generated situations to discuss appropriate skills that might improve the student’s play. Coaching is a natural context in which to investigate the teaching and learning processes, but it is a demanding task. The computer must be able to analyze the student’s performance in terms of a model of the underlying problem solving skills. This model must represent not only expertise for the task but also intermediate stages of problem solving skill and typical difficulties encountered by the learner. Implementing several generations of computer coaches to meet these demands has resulted in a model that represents problem solving skills as an evolving set of rules for a domain acting on an evolving representation of the problem and executed by a resource-limited problem solver. This paper describes this evolution from its starting point as a simple rule-based approach to its current form.

AIM-493

Author[s]: Brian Cantwell Smith

A Proposal for a Computational Model of Anatomical and Physiological Reasoning

November 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-493.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-493.pdf

The studies of anatomy and physiology are fundamental ingredients of medical education. This paper identifies six ways in which such functional knowledge serves as the underpinnings for general medical reasoning, and outlines the design of a computational model of common sense reasoning about human physiology. The design of the proposed model is grounded in a set of declarative representational ideas sometimes called “frame theory”: representational structures constructed from multiple-perspective, potentially redundant, descriptions, organized into structured collections, and associated with the objects and classes being described.

AITR-492

Author[s]: Richard C. Waters

Automatic Analysis of the Logical Structure of Programs

December 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-492.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-492.pdf

This report presents a method for viewing complex programs as built up out of simpler ones. The central idea is that typical programs are built up in a small number of stereotyped ways. The method is designed to make it easier for an automatic system to work with programs. It focuses on how the primitive operations performed by a program are combined together in order to produce the actions of the program as a whole. It does not address the issue of how complex data structures are built up from simpler ones, nor the relationships between data structures and the operations performed on them.

AIM-491

Author[s]: D. Marr, T. Poggio and S. Ullman

Bandpass Channels, Zero-Crossings, and Early Visual Information Processing

September 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-491.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-491.pdf

A recent advance by B.F. Logan in the theory of one octave bandpass signals may throw new light on spatial-frequency-tuned channels in early visual information processing.

AIM-490

Author[s]: Berthold K.P. Horn, Robert J. Woodham and William M. Silver

Determining Shape and Reflectance Using Multiple Images

August 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-490.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-490.pdf

Distributions of surface orientation and reflectance factor on the surface of an object can be determined from scene radiances observed by a fixed sensor under varying lighting conditions. Such techniques have potential application to the automatic inspection of industrial parts, the determination of the attitude of a rigid body in space and the analysis of images returned from planetary explorers. A comparison is made of this method with techniques based on images obtained from different viewpoints with fixed lighting.

AIM-488

Author[s]: Edwina Rissland Michener

Understanding Understanding Mathematics

August 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-488.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-488.pdf

In this paper we look at some of the ingredients and processes involved in the understanding of mathematics. We analyze elements of mathematical knowledge, organize them in a coherent way and take note of certain classes of items that share noteworthy roles in understanding. We thus build a conceptual framework in which to talk about mathematical knowledge. We then use this representation to describe the acquisition of understanding. We also report on classroom experience with these ideas.

AIM-486A

Author[s]: Drew McDermott and Jon Doyle

Non-Monotonic Logic I

January 1979

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-486a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-486a.pdf

“Non-monotonic” logical systems are logics in which the introduction of new axioms can invalidate old theorems. Such logics are very important in modeling the beliefs of active processes which, acting in the presence of incomplete information, must make and subsequently revise predictions in light of new observations. We present the motivation and history of such logics. We develop model and proof theories, a proof procedure, and applications for one important non-monotonic logic. In particular, we prove the completeness of the non-monotonic predicate calculus and the decidability of the non-monotonic sentential calculus. We also discuss characteristic properties of this logic and its relationship to stronger logics, logics of incomplete information, and truth maintenance systems.

AIM-486

Author[s]: Drew McDermott and Jon Doyle

Non-Monotonic Logic I

August 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-486.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-486.pdf

AIM-485

Author[s]: Johan de Kleer and Gerald Jay Sussman

Propagation of Constraints Applied to Circuit Synthesis

September 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-485.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-485.pdf

A major component in the process of design is synthesis, the determination of the parameters of the parts of a network given desiderata for the behavior of the network as a whole. Traditional automated synthesis techniques are either restricted to small, precisely defined classes of circuit functions for which exact mathematical methods exist or they depend upon numerical optimization methods in which it is difficult to determine the basis for any of the answers generated and their relations to the design desiderata and constraints. We are developing a symbolic computer-aided design tool, SYN, which can be of assistance to an engineer in the synthesis of a large class of circuits. The symbolic methods produce solutions which are clear and insightful. The dependence of each parameter on the individual design desiderata and circuit constraints can be easily traced.

AIM-484

Author[s]: Members of the LOGO Project

Interim Report of the LOGO Project in the Brookline Public Schools

June 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-484.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-484.pdf

The LOGO activities of a group of 16 sixth-grade students, representing a full spectrum of ability, are being documented with a view to developing ways of capturing the learning possibilities of such an environment. The first group of eight subjects has completed 25 closely observed hours, extending over 7 weeks, in a LOGO classroom situated in a Brookline school. This is an interim report on these observations, designed to exhibit the content of what has been learned and to offer insights into both the variety of cognitive styles of the pupils and the variety of learning situations available to a teacher with which to respond to different pupil styles and abilities. We have a large amount of data available for analysis, and we are interested in looking at this material from several points of view. The current state of our various analyses is presented here, without any effort to prune the considerable redundancy which has been generated in the process of doing this multiple-cut exercise.

AITR-483

Author[s]: Kurt A. Vanlehn

Determining the Scope of English Quantifiers

June 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-483.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-483.pdf

How can one represent the meaning of English sentences in a formal logical notation such that the translation of English into this logical form is simple and general? This report answers this question for a particular kind of meaning, namely quantifier scope, and for a particular part of the translation, namely the syntactic influence on the translation. Rules are presented which predict, for example, that the sentence "Everyone in this room speaks at least two languages" has the quantifier scope AE in standard predicate calculus, while the sentence "At least two languages are spoken by everyone in this room" has the quantifier scope EA. Three different logical forms are presented, and their translation rules are examined. One of the logical forms is predicate calculus. The translation rules for it were developed by Robert May (May 1977). The other two logical forms are Skolem form and a simple computer programming language. The translation rules for these two logical forms are new. All three sets of translation rules are shown to be general, in the sense that the same rules express the constraints that syntax imposes on certain other linguistic phenomena. For example, the rules that constrain the translation into Skolem form are shown to constrain definite NP anaphora as well. A large body of carefully collected data is presented, and used to assess the empirical accuracy of each of the theories. None of the three theories is vastly superior to the others. However, the report concludes by suggesting that a combination of the two newer theories would have the greatest generality and the highest empirical accuracy.
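
The difference between the AE and EA scopes can be made concrete by evaluating both readings over a small finite model. The sketch below is illustrative only: the people, the languages, and the function names `scope_ae` and `scope_ea` are invented for the example and do not come from the report.

```python
# Hypothetical model: who speaks which languages.
speaks = {
    "Ann": {"English", "French"},
    "Bob": {"English", "German"},
    "Cy": {"French", "German"},
}


def scope_ae(model):
    """AE reading: for every person there are at least two languages they speak."""
    return all(len(langs) >= 2 for langs in model.values())


def scope_ea(model):
    """EA reading: there are at least two languages that every person speaks."""
    languages = set().union(*model.values())
    shared = [y for y in languages if all(y in langs for langs in model.values())]
    return len(shared) >= 2


print(scope_ae(speaks))
print(scope_ea(speaks))
```

In this model each person speaks two languages, but no single language is shared by all three, so the two readings receive different truth values.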

AIM-482

Author[s]: Kenneth M. Kahn

Director Guide

June 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-482.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-482.pdf

Director is a programming language designed for dynamic graphics, artificial intelligence, and naïve users. It is based upon the actor or object oriented approach to programming and resembles Act 1 and SmallTalk. Director extends MacLisp by adding a small set of primitive actors and the ability to create new ones. Its graphical features include an interface to the TV turtle, pseudo-parallelism, many animation primitives, and a primitive actor for making and recording “movies”. For artificial intelligence programming Director provides a pattern-directed data base associated with each actor, an inheritance hierarchy, pseudo-parallelism, and a means of conveniently creating non-standard control structures. For use by relatively naïve programmers Director is appropriate because of its stress upon very powerful, yet conceptually simple primitives and its verbose, simple syntax based upon pattern matching. Director code can be turned into optimized Lisp which in turn can be compiled into machine code.

AIM-482B

Author[s]: Kenneth M. Kahn

Director Guide

December 1979

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-482b.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-482b.pdf

Director is a programming language designed for dynamic graphics, artificial intelligence, and use by computer-naïve people. It is based upon the actor or object oriented approach to programming and resembles Act 1 and SmallTalk. Director extends MacLisp by adding a small set of primitive actors and the ability to create new ones. Its graphical features include an interface to the TV turtle, quasi-parallelism, many animation primitives, a parts/whole hierarchy and a primitive actor for making and recording “movies”. For artificial intelligence programming Director provides a pattern-directed data base associated with each actor, an inheritance hierarchy, and a means of conveniently creating non-standard control structures. For use by naïve programmers Director is appropriate because of its stress upon very powerful, yet conceptually simple primitives and its verbose, simple syntax based upon pattern matching. Director code can be turned into optimized Lisp which in turn can be compiled into machine code.

AIM-480

Author[s]: Kenneth M. Kahn and Carl Hewitt

Dynamic Graphics Using Quasi Parallelism

June 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-480.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-480.pdf

Dynamic computer graphics is best represented as several processes operating in parallel. Full parallel processing, however, entails much complex mechanism making it difficult to write simple, intuitive programs for generating computer animation. What is presented in this paper is a simple means of attaining the appearance of parallelism and the ability to program the graphics in a conceptually parallel fashion without the complexity of a more general parallel mechanism. Each entity on the display screen can be independently programmed to move, turn, change size, color or shape and to interact with other entities.

AIM-479

Author[s]: Robert J. Woodham

Photometric Stereo

June 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-479.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-479.pdf

Traditional stereo techniques determine range by relating two images of an object viewed from different directions. If the correspondence between picture elements is known, then distance to the object can be calculated by triangulation. Unfortunately, it is difficult to determine this correspondence. This paper introduces a novel technique called photometric stereo. The idea of photometric stereo is to vary the direction of the incident illumination between successive views while holding the viewing direction constant. This provides enough information to determine surface orientation at each picture element. Since the imaging geometry does not change, the correspondence between picture elements is known a priori. This stereo technique is photometric because it uses the intensity values recorded in a single picture element, in successive views, rather than the relative positions of features.
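
A minimal sketch of the idea for one picture element follows. It makes assumptions that the abstract does not state: a Lambertian (perfectly diffuse) surface, three distant light sources with known unit directions, and no shadows. Under those assumptions each intensity is albedo times the dot product of the light direction and the surface normal, so three views give a 3x3 linear system. The function names are hypothetical.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(m, rhs):
    """Solve the 3x3 system m @ x = rhs by Cramer's rule."""
    d = det3(m)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    solution = []
    for col in range(3):
        replaced = [
            [rhs[r] if c == col else m[r][c] for c in range(3)] for r in range(3)
        ]
        solution.append(det3(replaced) / d)
    return solution


def normal_from_intensities(lights, intensities):
    """Recover (unit normal, albedo) at one pixel under the Lambertian assumption."""
    g = solve3(lights, intensities)  # g = albedo * normal
    albedo = math.sqrt(sum(v * v for v in g))
    if albedo == 0.0:
        raise ValueError("pixel is dark in all three views")
    return [v / albedo for v in g], albedo


# Three illumination directions (rows), same viewing direction for every image.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
# Intensities recorded at a single picture element in the three views.
normal, albedo = normal_from_intensities(lights, [0.5, 0.4, 0.4])
print(normal, albedo)
```

No correspondence search appears anywhere in the computation: the same picture element is read in each view, which is the point made in the abstract.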

AIM-478

Author[s]: Berthold K.P. Horn, Ken-Ichi Hirokawa and Vijay Vazirani

Dynamics of a Three Degree of Freedom Kinematic Chain

October 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-478.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-478.pdf

In order to be able to design a control system for high-speed control of mechanical manipulators, it is necessary to understand properly their dynamics. Here we present an analysis of a detailed model of a three-link device which may be viewed as either a “leg” in a locomotory system, or the first three degrees of freedom of an “arm” providing for its gross motions. The equations of motion are shown to be non-trivial, yet manageable.

AIM-477

Author[s]: David Wayne Ihrie

Analysis of Synthetic Students as a Model of Human Behavior

May 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-477.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-477.pdf

The research described in this report is an attempt to evaluate the educational effects of a computer game known as Wumpus. A set of five synthetic computer students was taken as a model of the progress of real students playing a sequence of twenty Wumpus “warrens”. Using a combination of observations made of the students, representations drawn by the students and protocols kept by the computer of each session, it was found that the synthetic students are a reasonable static model of real students, but miss completely many of the important dynamic factors which affect a student’s play. In spite of this, the Wumpus game was found to be an effective educational tool.

AIM-476

Author[s]: S. Ullman

The Interpretation of Structure From Motion

October 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-476.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-476.pdf

The interpretation of structure from motion is examined from a computational point of view. The question addressed is how the 3-D structure and motion of objects can be inferred from the 2-D transformations of their projected images when no 3-D information is conveyed by the individual projections.

AIM-475

Author[s]: Steven Rosenberg

Understanding in Incomplete Worlds

May 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-475.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-475.pdf

Most real world domains differ from the micro-worlds traditionally used in A.I. in that they have an incomplete factual database which changes over time. Understanding in these domains can be thought of as the generation of plausible inferences which are able to use the facts available, and respond to changes in them. A traditional rule interpreter such as Planner can be extended to construct plausible inferences in these domains by A) allowing assumptions to be made in applying rules, resulting in simplifications of rules which can be used in an incomplete database; B) monitoring the antecedents and consequents of a rule so that inferences can be maintained over a changing database. The resulting chains of inference can provide a dynamic description of an event. This allows general reasoning processes to be used to understand in domains for which large numbers of Schema-like templates have been proposed as the best model.

AITR-474

Author[s]: Guy Lewis Steele, Jr.

RABBIT: A Compiler for SCHEME

May 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-474.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-474.pdf

We have developed a compiler for the lexically-scoped dialect of LISP known as SCHEME. The compiler knows relatively little about specific data manipulation primitives such as arithmetic operators, but concentrates on general issues of environment and control. Rather than having specialized knowledge about a large variety of control and environment constructs, the compiler handles only a small basis set which reflects the semantics of lambda-calculus. All of the traditional imperative constructs, such as sequencing, assignment, looping, GOTO, as well as many standard LISP constructs such as AND, OR, and COND, are expressed in macros in terms of the applicative basis set. A small number of optimization techniques, coupled with the treatment of function calls as GOTO statements, serve to produce code as good as that produced by more traditional compilers. The macro approach enables speedy implementation of new constructs as desired without sacrificing efficiency in the generated code. A fair amount of analysis is devoted to determining whether environments may be stack-allocated or must be heap-allocated. Heap-allocated environments are necessary in general because SCHEME (unlike Algol 60 and Algol 68, for example) allows procedures with free lexically scoped variables to be returned as the values of other procedures; the Algol stack-allocation environment strategy does not suffice. The methods used here indicate that a heap-allocating generalization of the "display" technique leads to an efficient implementation of such "upward funargs". Moreover, compile-time optimization and analysis can eliminate many "funargs" entirely, and so far fewer environment structures need be allocated at run time than might be expected. A subset of SCHEME (rather than triples, for example) serves as the representation intermediate between the optimized SCHEME code and the final output code; code is expressed in this subset in the so-called continuation-passing style. As a subset of SCHEME, it enjoys the same theoretical properties; one could even apply the same optimizer used on the input code to the intermediate code. However, the subset is so chosen that all temporary quantities are made manifest as variables, and no control stack is needed to evaluate it. As a result, this apparently applicative representation admits an imperative interpretation which permits easy transcription to final imperative machine code. These qualities suggest that an applicative language like SCHEME is a better candidate for an UNCOL than the more imperative candidates proposed to date.

AIM-473

Author[s]: David A. McAllester

A Three Valued Truth Maintenance System

May 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-473.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-473.pdf

Truth maintenance systems have been used in recently developed problem solving systems. A truth maintenance system (TMS) is designed to be used by deductive systems to maintain the logical relations among the beliefs which those systems manipulate. These relations are used to incrementally modify the belief structure when premises are changed, giving a more flexible context mechanism than has been present in earlier artificial intelligence systems. The relations among beliefs can also be used to directly trace the source of contradictions or failures, resulting in far more efficient backtracking.

AITR-472

Author[s]: Edwina Rissland Michener

The Structure of Mathematical Knowledge

August 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-472.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-472.pdf

This report develops a conceptual framework in which to talk about mathematical knowledge. There are several broad categories of mathematical knowledge: results which contain the traditional logical aspects of mathematics; examples which contain illustrative material; and concepts which include formal and informal ideas, that is, definitions and heuristics.

AIM-468

Author[s]: Candace Sidner

A Progress Report on the Discourse and Reference Components of PAL

April 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-468.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-468.pdf

This paper reports on research being conducted on a computer assistant, called PAL. PAL is being designed to arrange various kinds of events with concern for the who, what, when, where and why of that event. The goal for PAL is to permit a speaker to interact with it in English and to use extended discourse to state the speaker’s requirements. The portion of the language system discussed in this report disambiguates references from discourse and interprets the purpose of sentences of the discourse. PAL uses the focus of discourse to direct its attention to a portion of the discourse and to the database to which the discourse refers. The focus makes it possible to disambiguate references with minimal search. Focus and a frames representation of the discourse make it possible to interpret discourse purposes. The focus and representation of the discourse are explained, and the computational components of PAL which implement reference disambiguation and discourse interpretation are presented in detail.

AIM-467

Author[s]: B.K.P. Horn and R.J. Woodham

Destriping Satellite Images

March 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-467.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-467.pdf

Before satellite images obtained with multiple image sensors can be used in image analysis, corrections must be introduced for the differences in the transfer functions of these sensors. Methods are presented here for obtaining the required information directly from the statistics of the sensor outputs. The assumption is made that the probability distribution of the scene radiance seen by each image sensor is the same. Successful destriping of LANDSAT images is demonstrated.
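As a rough illustration of the idea in this abstract, the sketch below equalizes sensors by matching each sensor's mean and standard deviation to the statistics pooled over the whole image. It assumes a linear (gain and offset) transfer function per sensor and that image row i comes from sensor i mod n; neither assumption is stated in the abstract, and the function name `destripe` is hypothetical.

```python
from statistics import mean, pstdev

def destripe(rows, n_sensors):
    """Equalize sensor statistics; row i is assumed to come from sensor i % n_sensors."""
    # Gather all pixel values recorded by each sensor.
    per_sensor = [[] for _ in range(n_sensors)]
    for i, row in enumerate(rows):
        per_sensor[i % n_sensors].extend(row)

    # Target statistics are pooled over the whole image.
    all_pixels = [v for row in rows for v in row]
    target_mean, target_std = mean(all_pixels), pstdev(all_pixels)

    corrected = []
    for i, row in enumerate(rows):
        values = per_sensor[i % n_sensors]
        m, s = mean(values), pstdev(values)
        # Assumed linear model: rescale so every sensor shares the pooled mean and spread.
        gain = target_std / s if s > 0 else 1.0
        corrected.append([(v - m) * gain + target_mean for v in row])
    return corrected
```

If two sensors view statistically identical scene content and differ only by gain and offset, their corrected rows should agree after this step.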

AIM-466

Author[s]: Steven Rosenberg and Herbert A. Simon

Modeling Semantic Memory: Effects of Presenting Semantic Information in Different Modalities

April 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-466.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-466.pdf

How is semantic information from different modalities integrated and stored? If related ideas are encountered in French and English, or in pictures and sentences, is the result a single representation in memory or two modality-dependent ones? Subjects were presented with items in different modalities, then were asked whether or not subsequently presented items were identical with the former ones. Subjects frequently accepted translations and items semantically consistent with those presented earlier as identical, although not as often as they accepted items actually seen previously. The same pattern of results was found when the items were French and English sentences, and when they were pictures and sentences. The results can be explained by the hypothesis that subjects integrate information across modalities into a single underlying semantic representation. A computer model, embodying this hypothesis, made predictions in close agreement with the data.

AIM-465

Author[s]: Berthold K.P. Horn and Robert J. Woodham

LANDSAT MSS Coordinate Transformations

February 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-465.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-465.pdf

A number of image analysis tasks require the registration of a surface model with an image. In the case of satellite images, the surface model may be a map or digital terrain model in the form of surface elevations on a grid of points. We develop here an affine transformation between coordinates of Multi-Spectral Scanner (MSS) images produced by the LANDSAT satellites, and coordinates of a system lying in a plane tangent to the earth’s surface near the sub-satellite (Nadir) point.

AIM-464

Author[s]: Michael S. Paterson and Carl E. Hewitt

Comparative Schematology

May 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-464.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-464.pdf

While we may have the intuitive idea of one programming language having greater power than another, or of some subset of a language being an adequate “core” for that language, we find when we try to formalize this notion that there is a serious theoretical difficulty. This lies in the fact that even quite rudimentary languages are nevertheless “universal” in the following sense. If the language allows us to program with simple arithmetic or list processing functions, then any effective control structure can be simulated, traditionally by encoding a Turing machine computation in some way. In particular, a simple language with some basic arithmetic can express programs for any partial recursive function. Such an encoding is usually quite unnatural and impossibly inefficient. Thus in order to carry on a practical study of the comparative power of different languages we are led to banish explicit functions and deal instead with abstract, uninterpreted programs, or schemas. What follows is a brief report on some preliminary exploration in this area.

AIM-463

Author[s]: Thomas M. Strat

Shaded Perspective Images of Terrain

March 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-463.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-463.pdf

In order to perform image analysis, one must have a thorough understanding of how images are formed. This memo presents an algorithm that produces shaded perspective images of terrain as a vehicle to understanding the fundamentals of image formation. The image is constructed using standard projection equations along with an efficient hidden-surface removal technique. The image intensity is calculated using the reflectance map, a convenient way of describing the surface reflection as a function of surface gradient. Aside from its use as a tool toward understanding image analysis, the algorithm has several applications of its own, including providing video input to a flight simulator.
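The reflectance-map step mentioned above can be sketched as follows. This is a minimal illustration that assumes a Lambertian (ideal diffuse) surface and unit grid spacing, neither of which the abstract specifies; it omits perspective projection and hidden-surface removal, and the function names are hypothetical.

```python
import math

def lambertian_reflectance(p, q, ps, qs):
    """Reflectance map R(p, q) for a Lambertian surface lit from gradient-space direction (ps, qs)."""
    # The surface normal is proportional to (-p, -q, 1); the source direction to (-ps, -qs, 1).
    cos_incident = (1 + p * ps + q * qs) / (
        math.sqrt(1 + p * p + q * q) * math.sqrt(1 + ps * ps + qs * qs)
    )
    # Facets turned away from the source receive no light.
    return max(0.0, cos_incident)

def shade_terrain(z, ps, qs):
    """Shade each cell of elevation grid z, estimating gradients by forward differences."""
    shaded = []
    for i in range(len(z) - 1):
        row = []
        for j in range(len(z[0]) - 1):
            p = z[i][j + 1] - z[i][j]  # slope along a grid row
            q = z[i + 1][j] - z[i][j]  # slope along a grid column
            row.append(lambertian_reflectance(p, q, ps, qs))
        shaded.append(row)
    return shaded
```

Under these assumptions, flat terrain lit from directly overhead (ps = qs = 0) has uniform brightness, and brightness falls off as the surface tilts away from the source.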

AIM-462

Author[s]: William R. Swartout

A Comparison of PARSIFAL with Augmented Transition Networks

March 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-462.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-462.pdf

This paper compares Marcus’ parser, PARSIFAL, with Woods’ Augmented Transition Network (ATN) parser. In particular, the paper examines the two parsers in light of Marcus’ Determinism Hypothesis. An overview of each parser is presented. Following that, the Determinism Hypothesis is examined in detail. A method for transforming the PARSIFAL grammar rules into the ATN formalism is outlined. This transformation shows some of the fundamental differences between PARSIFAL and ATN parsers, and the nature of the hypotheses used in PARSIFAL. Finally, the principle of least commitment is proposed as an alternative to the Determinism Hypothesis.

AIM-461A

Author[s]: Jon Doyle

A Glimpse of Truth Maintenance

November 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-461a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-461a.pdf

To choose their actions, reasoning programs must be able to draw conclusions from limited information and subsequently revise their beliefs when discoveries invalidate previous assumptions. A truth maintenance system is a problem solver subsystem for performing these functions by recording and maintaining the reasons for program beliefs. These recorded reasons are useful in constructing explanations of program actions in “responsible” programs, and in guiding the course of action of a problem solver. This paper describes the structure of a truth maintenance system, methods for encoding control structures in patterns of reasons for beliefs, and the method of dependency-directed backtracking.

AIM-461

Author[s]: Jon Doyle

A Glimpse of Truth Maintenance

February 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-461.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-461.pdf

Many procedurally-oriented problem solving systems can be viewed as performing a mixture of computation and deduction, with much of the computation serving to decide what deductions should be made. This results in bits and pieces of deductions being strewn throughout the program text and execution. This paper describes a problem solver subsystem called a truth maintenance system which collects and maintains these bits of deductions. Automatic functions of the truth maintenance system then use these pieces of “proofs” to consistently update a data base of program beliefs and to perform a powerful form of backtracking called dependency-directed backtracking.

AIM-460

Author[s]: Seymour Papert and Daniel H. Watt

Assessment and Documentation of a Children's Computer Laboratory

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-460.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-460.pdf

This research will thoroughly document the experiences of a small number of 5th grade children in an elementary school computer laboratory, using LOGO, an advanced computer language designed for children. Four groups of four children will be taught a 10-week LOGO course. Detailed anecdotal records will be kept, and observers will note the development of the children’s computer programming skills, and the acquisition of knowledge in the areas of mathematics, science, and language, and of cognitive strategies and attitudinal changes which transfer beyond the specific subject matter studied.

AIM-459

Author[s]: Charles Rich, Howard E. Shrobe, Richard C. Waters, Gerald J. Sussman and Carl E. Hewitt

Programming Viewed as an Engineering Activity

January 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-459.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-459.pdf

It is profitable to view the process of writing programs as an engineering activity. A program is a deliberately contrived mechanism constructed from parts whose behaviors are combined to produce the behavior of the whole. We propose to develop a notion of understanding a program which is analogous to similar notions in other engineering subjects. Understanding is a rich notion in engineering domains. It includes the ability to identify the parts of a mechanism and assign a purpose to each part. Understanding also entails being able to explain to someone how a mechanism works and rationalize its behavior under unusual circumstances.

AIM-458

Author[s]: Berthold K.P. Horn and Marc H. Raibert

Configuration Space Control

December 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-458.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-458.pdf

Complicated systems with non-linear time-varying behavior are difficult to control using classical linear feedback methods applied separately to individual degrees of freedom. At present, mechanical manipulators, for example, are limited in their rate of movement by the inability of traditional feedback systems to deal with time-varying inertia, torque coupling effects between links and Coriolis forces. Analysis of the dynamics of such systems, however, provides the basic information needed to achieve adequate control.

AITR-457

Author[s]: Robert J. Woodham

Reflectance Map Techniques for Analyzing Surface Defects in Metal Castings

June 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-457.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-457.pdf

This report explores the relation between image intensity and object shape. It is shown that image intensity is related to surface orientation and that a variation in image intensity is related to surface curvature. Computational methods are developed which use the measured intensity variation across surfaces of smooth objects to determine surface orientation. In general, surface orientation is not determined locally by the intensity value recorded at each image point. Tools are needed to explore the problem of determining surface orientation from image intensity. The notion of gradient space, popularized by Huffman and Mackworth, is used to represent surface orientation. The notion of a reflectance map, originated by Horn, is used to represent the relation between surface orientation and image intensity. The image Hessian is defined and used to represent surface curvature. Properties of surface curvature are expressed as constraints on possible surface orientations corresponding to a given image point. Methods are presented which embed assumptions about surface curvature in algorithms for determining surface orientation from the intensities recorded in a single view. If additional images of the same object are obtained by varying the direction of incident illumination, then surface orientation is determined locally by the intensity values recorded at each image point. This fact is exploited in a new technique called photometric stereo. The visual inspection of surface defects in metal castings is considered. Two casting applications are discussed. The first is the precision investment casting of turbine blades and vanes for aircraft jet engines. In this application, grain size is an important process variable. The existing industry standard for estimating the average grain size of metals is implemented and demonstrated on a sample turbine vane.
Grain size can be computed from the measurements obtained in an image, once the foreshortening effects of surface curvature are accounted for. The second is the green sand mold casting of shuttle eyes for textile looms. Here, physical constraints inherent to the casting process translate into constraints on object shape; to exploit these constraints, it is necessary to interpret features of intensity as features of object shape. Both applications demonstrate that successful visual inspection requires the ability to interpret observed changes in intensity in the context of surface topography. The theoretical tools developed in this report provide a framework for this interpretation.

AIM-456

Author[s]: Howard E. Shrobe

Floyd-Hoare Verifiers “Considered Harmful”

January 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-456.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-456.pdf

The Floyd-Hoare methodology completely dominates the field of program verification and has contributed much to our understanding of how programs might be analyzed. Useful but limited verifiers have been developed using Floyd-Hoare techniques. However, it has long been known that it is difficult to handle side effects on shared data structures within the Floyd-Hoare framework. Most examples of successful Floyd-Hoare axioms for assignment to complex data structures, similar statements have been used by London. This paper demonstrates an error in these formalizations and suggests a different style of verification.

AIM-454

Author[s]: Henry G. Baker and Carl Hewitt

The Incremental Garbage Collection of Processes

December 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-454.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-454.pdf

This paper investigates some problems associated with an expression evaluation order that we call "future" order, which is different from call-by-name, call-by-value, and call-by-need. In future order evaluation, an object called a "future" is created to serve as the value of each expression that is to be evaluated, and a separate process is dedicated to its evaluation. This mechanism allows the fully parallel evaluation of the expressions in a programming language. We discuss an approach to a problem that arises in this context: futures which were thought to be relevant when they were created become irrelevant through not being needed later in computation. The problem of irrelevant processes also appears in multiprocessing problem-solving systems which start several processors working on the same problem but with different methods, and return with the solution which finishes first. This parallel method strategy has the drawback that the processes which are investigating the losing methods must be identified, cleanly stopped, and the processors they are using reassigned to more useful tasks. The solution we propose is that of incremental garbage collection. The goal structure of the solution plan should be explicitly represented in memory as part of the graph memory (like Lisp's heap) so that a garbage collection algorithm can discover which processes are performing useful work, and which can be recycled for a new task. An incremental algorithm for the unified garbage collection of storage and processes is described.
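A loose illustration of future order evaluation, using Python's standard `concurrent.futures` module as a stand-in: each argument expression is submitted as its own future and the function is applied once the values are available. The thread pool, the helper name `future_order_apply`, and the use of zero-argument lambdas to delay the expressions are assumptions of this sketch; it does not attempt the incremental garbage collection of irrelevant processes that the memo describes.

```python
from concurrent.futures import ThreadPoolExecutor

def future_order_apply(executor, func, *arg_thunks):
    """Evaluate each argument expression as a future, then apply func to the results."""
    # Each thunk is started immediately and may run in parallel with the others.
    futures = [executor.submit(thunk) for thunk in arg_thunks]
    # result() blocks only when a value is actually needed.
    return func(*(f.result() for f in futures))

with ThreadPoolExecutor(max_workers=4) as pool:
    total = future_order_apply(
        pool,
        lambda a, b: a + b,
        lambda: 6 * 7,            # first argument expression
        lambda: sum(range(10)),   # second argument expression
    )
```

In this sketch every future is eventually consumed; the problem the memo addresses arises when some futures turn out never to be needed.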

AIM-453

Author[s]: Guy Lewis Steele Jr. and Gerald Jay Sussman

The Art of the Interpreter or, The Modularity Complex (Parts Zero, One, and Two)

May 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-453.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-453.pdf

We examine the effects of various language design decisions on the programming styles available to a user of the language, with particular emphasis on the ability to incrementally construct modular systems. At each step we exhibit an interactive meta-circular interpreter for the language under consideration. Each new interpreter is the result of an incremental change to a previous interpreter. We explore the consequences of various variable binding disciplines and the introduction of side effects. We find that dynamic scoping is unsuitable for constructing procedural abstractions, but has another role as agent of modularity, being a structured form of side effect. More general side effects are also found to be necessary to promote modular style. We find that the notion of side effect and the notion of equality (object identity) are mutually constraining; to define one is to define the other. The interpreters we exhibit are all written in a simple dialect of LISP, and all implement LISP-like languages. A subset of these interpreters constitute a partial historical reconstruction of the actual evolution of LISP.

AIM-452

Author[s]: Guy Lewis Steele, Jr. and Gerald Jay Sussman

The Revised Report on SCHEME: A Dialect of LISP

January 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-452.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-452.pdf

SCHEME is a dialect of LISP. It is an expression-oriented, applicative order, interpreter-based language which allows one to manipulate programs as data. It differs from most current dialects of LISP in that it closes all lambda-expressions in the environment of their definition or declaration, rather than in the execution environment. This has the consequence that variables are normally lexically scoped, as in ALGOL. However, in contrast with ALGOL, SCHEME treats procedures as a first-class data type. They can be the values of variables, the returned values of procedures, and components of data structures. Another difference from LISP is that SCHEME is implemented in such a way that tail-recursions execute without net growth of the interpreter stack. The effect of this is that a procedure call behaves like a GOTO and thus procedure calls can be used to implement iterations, as in PLASMA.
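The tail-recursion property can be imitated in a language that lacks it. Python does not eliminate tail calls, so the sketch below uses a trampoline, a substitute technique rather than SCHEME's own mechanism: a function returns its pending tail call as a thunk, and a driver loop runs thunks until a plain value appears. The names `trampoline` and `count_down` are hypothetical, and the sketch assumes the final result is never itself callable.

```python
def trampoline(func, *args):
    """Run func, looping while it returns a thunk that represents a pending tail call."""
    result = func(*args)
    while callable(result):
        result = result()
    return result

def count_down(n, acc=0):
    # Return the tail call as a thunk instead of calling it directly,
    # so the Python stack does not grow with n.
    if n == 0:
        return acc
    return lambda: count_down(n - 1, acc + n)
```

Called as `trampoline(count_down, 100000)`, the loop iterates far beyond Python's default recursion limit, which mirrors the abstract's point that a tail call can behave like a GOTO and implement iteration.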

AIM-451

Author[s]: D. Marr and T. Poggio

A Theory of Human Stereo Vision

November 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-451.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-451.pdf

An algorithm is proposed for solving the stereoscopic matching problem. The algorithm consists of five steps: 1.) Each image is filtered with bar masks of four sizes that vary with eccentricity; the equivalent filters are about one octave wide. 2.) Zero-crossings of the mask values are localized, and positions that correspond to terminations are found. 3.) For each mask size, matching takes place between pairs of zero crossings or terminations of the same sign in the two images, for a range of disparities up to about the width of the mask's central region. 4.) Wide masks can control vergence movements, thus causing small masks to come into correspondence. 5.) When a correspondence is achieved, it is written into a dynamic buffer, called the 2-1/2-D sketch. It is shown that this proposal provides a theoretical framework for most existing psychophysical and neurophysiological data about stereopsis. Several critical experimental predictions are also made, for instance about the size of Panum's area under various conditions. The results of such experiments would tell us whether, for example, cooperativity is necessary for the fusion process.

AITR-450

Author[s]: Scott E. Fahlman

A System for Representing and Using Real-World Knowledge

December 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-450.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-450.pdf

This report describes a knowledge-base system in which the information is stored in a network of small parallel processing elements – node and link units – which are controlled by an external serial computer. This network is similar to the semantic network system of Quillian, but is much more tightly controlled. Such a network can perform certain critical deductions and searches very quickly; it avoids many of the problems of current systems, which must use complex heuristics to limit and guide their searches. It is argued (with examples) that the key operation in a knowledge-base system is the intersection of large explicit and semi-explicit sets. The parallel network system does this in a small, essentially constant number of cycles; a serial machine takes time proportional to the size of the sets, except in special cases.

AIM-449

Author[s]: Ira P. Goldstein

The Genetic Epistemology of Rule Systems

January 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-449.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-449.pdf

I shall describe a model of the evolution of the rule-structured knowledge that serves as a cornerstone of our development of computer-based coaches. The key idea is a graph structure whose nodes represent rules, and whose links represent various evolutionary relationships such as generalization, correction, and refinement. This graph guides both student modelling and tutoring as follows: the coach models the student in terms of nodes in this graph, and selects tutoring strategies for a given rule on the basis of its genetic links. It also suggests a framework for a theory of learning in which the graph serves as a memory structure constructed by the student by means of processes corresponding to the various links. Given this framework, a learning complexity measure can be defined in terms of the topology of the graph.

AIM-448

Author[s]: Berthold K. P. Horn

Fan-beam Reconstruction Methods

November 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-448.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-448.pdf

In a previous paper a technique was developed for finding reconstruction algorithms for arbitrary ray-sampling schemes. The resulting algorithms use a general linear operator, the kernel of which depends on the details of the scanning geometry. Here this method is applied to the problem of reconstructing density distributions from arbitrary fan-beam data. The general fan-beam method is then specialized to a number of scanning geometries of practical importance. Included are two cases where the kernel of the general linear operator can be factored and rewritten as a function of the difference of coordinates only and the superposition integral consequently simplifies into a convolution integral. Algorithms for these special cases of the fan-beam problem have been developed previously by others. In the general case, however, Fourier transforms and convolutions do not apply, and linear space-variant operators must be used. As a demonstration, details of a fan-beam method for data obtained with uniform ray-sampling density are developed.

AIM-447

Author[s]: Eugene Ciccarelli

An Introduction to the EMACS Editor

January 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-447.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-447.pdf

EMACS is a real-time editor primarily intended for display terminals. The intent of this memo is to describe EMACS in enough detail to allow a user to edit comfortably in most circumstances, knowing how to get more information if needed. Basic commands described cover buffer editing, file handling, and getting help. Two sections cover commands especially useful for editing LISP code, and text (word- and paragraph- commands). A brief “cultural interest” section describes the environment that supports EMACS commands.

AIM-446

Author[s]: D. Marr, G. Palm, and T. Poggio

Analysis of a Cooperative Stereo Algorithm

October 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-446.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-446.pdf

Marr & Poggio (1976) recently described a cooperative algorithm that solves the correspondence problem for stereopsis. This article uses a probabilistic technique to analyze the convergence of that algorithm, and derives the conditions governing the stability of the solution state. The actual results of applying the algorithm to random-dot stereograms are compared with the probabilistic analysis. A satisfactory mathematical analysis of the asymptotic behaviour of the algorithm is possible for a suitable choice of the parameter values and loading rules, and again the actual performance of the algorithm under these conditions is compared with the theoretical predictions. Finally, some problems raised by the analysis of this type of “cooperative” algorithm are briefly discussed.

AIM-445

Author[s]: Stephen C. Purcell

Understanding Hand-Printed Algebra for Computer Tutoring

February 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-445.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-445.pdf

This thesis demonstrates how the use of a global context can improve the power of a local character recognizer. The global context considered is a computer tutor of high school algebra that observes a student working algebra problems on a graphics tablet. The tutoring system is integrated with a character recognizer to understand the pen strokes of an algebra tutoring system is designed and implemented. This thesis joins together two uses of a computer, intelligent tutoring and tablet communication. Natural communication with computers has been pursued through speech understanding, English text understanding, special purpose languages, hand printing and graphics. This work extends the power of hand-printing understanders by using more varied and higher level sources of knowledge than have been used previously.

AIM-444

Author[s]: Alan Bawden, Richard Greenblatt, Jack Holloway, Thomas Knight, David Moon and Daniel Weinreb

LISP Machine Progress Report

August 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-444.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-444.pdf

This informal paper introduces the LISP Machine, describes the goals and current status of the project, and explicates some of the key ideas. It covers the LISP machine implementation, LISP as a system language, input/output, representation of data, representation of programs, control structures, storage organization, garbage collection, the editor, and the current status of the work.

AIM-443

Author[s]: Guy Lewis Steele, Jr.

Debunking the 'Expensive Procedure Call' Myth, or, Procedure Call Implementations Considered Harmful, or, Lambda: The Ultimate GOTO

October 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-443.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-443.pdf

Folklore states that GOTO statements are 'cheap', while procedure calls are 'expensive'. This myth is largely a result of poorly designed language implementations. The historical growth of this myth is considered. Both theoretical ideas and an existing implementation are discussed which debunk this myth. It is shown that the unrestricted use of procedure calls permits great stylistic freedom. In particular, any flowchart can be written as a 'structured' program without introducing extra variables. The difficulty with the GOTO statement and the procedure call is characterized as a conflict between abstract programming concepts and concrete language constructs.

AIM-442

Author[s]: Harold Abelson

Towards a Theory of Local and Global in Computation

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-442.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-442.pdf

We formulate the rudiments of a method for assessing the difficulty of dividing a computational problem into “independent simpler parts.” This work illustrates measures of complexity which attempt to capture the distinction between “local” and “global” computational problems. One such measure is the covering multiplicity, or average number of partial computations which take account of a given piece of data. Another measure reflects the intuitive notion of a “highly interconnected” computational problem, for which subsets of the data cannot be processed “in isolation.” These ideas are applied in the setting of computational geometry to show that the connectivity predicate has unbounded covering multiplicity and is highly interconnected; and in the setting of numerical computations to measure the complexity of evaluating polynomials and solving systems of linear equations.

AIM-441

Author[s]: Andrea A. diSessa

On "Learnable" Representations of Knowledge: A Meaning for the Computational Metaphor

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-441.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-441.pdf

The computational metaphor which proposes the comparison of processes of mind to realizable or imaginable computer activities suggests a number of educational concerns. This paper discusses some of those concerns including procedural modes of knowledge representation and control knowledge – knowing what to do. I develop a collection of heuristics for education researchers and curriculum developers which are intended to address the issues raised. Finally, an extensive section of examples is given to concretize those heuristics.

AIM-440

Author[s]: Berthold K.P. Horn

Density Reconstruction Using Arbitrary Ray Sampling Schemes

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-440.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-440.pdf

Methods for calculating the distribution of absorption densities in a cross section through an object from density integrals along rays in the plane of the cross section are well known, but are restricted to particular geometries of data collection. So-called convolutional-backprojection-summation methods, used now for parallel ray data, have recently been extended to special cases of the fan-beam reconstruction problem by the addition of pre- and post-multiplication steps. In this paper, I present a technique for deriving reconstruction algorithms for arbitrary ray-sampling schemes: the resulting algorithms entail the use of a general linear operator, but require little more computation than the convolutional methods, which represent special cases.

AITR-439

Author[s]: Marc H. Raibert

Motor Control and Learning by the State Space Model

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-439.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-439.pdf

A model is presented that deals with problems of motor control, motor learning, and sensorimotor integration. The equations of motion for a limb are parameterized and used in conjunction with a quantized, multi-dimensional memory organized by state variables. Descriptions of desired trajectories are translated into motor commands which will replicate the specified motions. The initial specification of a movement is free of information regarding the mechanics of the effector system. Learning occurs without the use of error correction when practice data are collected and analyzed.

AIM-438

Author[s]: Russell Atkinson and Carl Hewitt

Specification and Proof Techniques for Serializers

August 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-438.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-438.pdf

This paper presents an implementation mechanism, specification language, and proof techniques for problems involving the arbitration of concurrent requests to shared resources. This mechanism is the serializer, which may be described as a kind of protection mechanism, in that it prevents improper orders of access to a protected resource. Serializers are a generalization and improvement of the monitor mechanism of Brinch-Hansen and Hoare.

AIM-437

Author[s]: Berthold K.P. Horn and Brett L. Bachman

Using Synthetic Images to Register Real Images with Surface Models

August 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-437.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-437.pdf

A number of image analysis tasks can benefit from registration of the image with a model of the surface being imaged. Automatic navigation using visible light or radar images requires exact alignment of such images with digital terrain models. In addition, automatic classification of terrain, using satellite imagery, requires such alignment to deal correctly with the effects of varying sun angle and surface slope. Even inspection techniques for certain industrial parts may be improved by this means.

AIM-436A

Author[s]: Carl Hewitt and Henry Baker

Actors and Continuous Functionals

July 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-436a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-436a.pdf

This paper presents precise versions of some “laws” that must be satisfied by computations involving communicating parallel processes. The laws take the form of stating plausible restrictions on the histories of computations that are physically realizable. The laws are very general in that they are obeyed by parallel processes executing on a time varying number of distributed physical processors.

AIM-435

Author[s]: Johan de Kleer, Jon Doyle, Charles Rich, Guy L. Steele, Jr. and Gerald Jay Sussman

AMORD: A Deductive Procedure System

January 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-435.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-435.pdf

We have implemented an interpreter for a rule-based system, AMORD, based on a non-chronological control structure and a system of automatically maintained data-dependencies. The purpose of this paper is to serve as a reference manual and as an implementation tutorial. We wish to illustrate: (1) The discipline of explicit control and dependencies, (2) How to use AMORD, and (3) One way to implement the mechanisms provided by AMORD. This paper is organized into sections. The first section is a short “reference manual” describing the major features of AMORD. Next, we present some examples which illustrate the style of expression encouraged by AMORD. This style makes control information explicit in a rule-manipulable form, and depends on an understanding of the use of non-chronological justifications for program beliefs as a means for determining the current set of beliefs. The third section is a brief description of the Truth Maintenance System employed by AMORD for maintaining these justifications and program beliefs. The fourth section presents a complete annotated interpreter for AMORD, written in MacLISP.

AIM-433

Author[s]: Gerald Jay Sussman

SLICES: At the Boundary Between Analysis and Synthesis

July 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-433.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-433.pdf

The algebraic difficulty of determining the component values in a circuit of known topology and specifications is large. Expert circuit designers use terminal equivalence and power arguments to reduce the apparent synergy in a circuit so that their computational power can be focussed. A new descriptive mechanism, called slices, is introduced. Slices combine the notion of equivalence with identification of parameters. Armed with appropriate slices, an automatic analysis procedure, Analysis by Propagation of Constraints, can be used to assign the component values in a circuit. Techniques of formation, notation, and use of slices are described. The origin of slices in the topological design process is indicated. Slices are shown to be of wider interest in scientific thought than just in circuit analysis.

AIM-432

Author[s]: Hal Abelson and Paul Goldenberg

Teacher's Guide for Computational Models of Animal Behavior

April 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-432.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-432.pdf

This is an experimental curriculum unit which suggests how the computational perspective can be integrated into a subject such as elementary school biology. In order to illustrate the interplay of computer and non-computer activities, we have prepared the unit as a companion to the Elementary School Science Study “Teacher’s Guide to Behavior of Mealworms.” This material is based on use of the Logo computer language.

AIM-431

Author[s]: Steven T. Rosenberg

Frame-based Text Processing

November 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-431.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-431.pdf

This paper presents an overview of a theory of discourse structure, and discusses a model for assimilating text into a frame-based data structure. The model has been applied to the analysis of news articles. The theory assumes sentences contain links to the database which are relatively easy to compute. These links point to prior themes which contain expectations and procedural knowledge. This knowledge is used to assimilate new sentences to these themes. At any given time, only procedural knowledge from the indicated theme is active in processing new sentences.

AIM-430

Author[s]: Marvin Minsky

Plain Talk About Neurodevelopmental Epistemology

June 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-430.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-430.pdf

This paper is based on a theory being developed in collaboration with Seymour Papert in which we view the mind as an organized society of intercommunicating “agents”. Each such agent is, by itself, very simple. The subject of this paper is how that simplicity affects communication between different parts of a single mind and, indirectly, how it may affect inter-personal communications.

AIM-429

Author[s]: S.D. Litvintchouk and V.R. Pratt

A Proof-Checker for Dynamic Logic

June 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-429.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-429.pdf

We consider the problem of getting a computer to follow reasoning conducted in dynamic logic. This is a recently developed logic of programs that subsumes most existing first-order logics of programs that manipulate their environment, including Floyd’s and Hoare’s logics of partial correctness and Manna and Waldinger’s logic of total correctness. Dynamic logic is more closely related to classical first-order logic than any other proposed logic of programs. This simplifies the design of a proof-checker for dynamic logic. Work in progress on the implementation of such a program is reported on, and an example machine-checked proof is exhibited.

AIM-428

Author[s]: Akinori Yonezawa and Carl Hewitt

Modelling Distributed Systems

June 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-428.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-428.pdf

Distributed systems are multi-processor information processing systems which do not rely on a central shared memory for communication. This paper presents ideas and techniques in modelling distributed systems and their application to Artificial Intelligence. In Sections 2 and 3, we discuss a model of distributed systems and its specification and verification techniques. We introduce a simple example of an airline reservation system in Section 4 and illustrate our specification and verification techniques for this example in the subsequent sections. Then we discuss our further work.

AIM-427

Author[s]: Johan de Kleer, Jon Doyle, Guy L. Steele, Jr. and Gerald Jay Sussman

Explicit Control of Reasoning

June 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-427.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-427.pdf

The construction of expert problem-solving systems requires the development of techniques for using modular representations of knowledge without encountering combinatorial explosions in the solution effort. This report describes an approach to dealing with this problem based on making some knowledge which is usually implicitly part of an expert problem solver explicit, thus allowing this knowledge about control to be manipulated and reasoned about. The basic components of this approach involve using explicit representations of the control structure of the problem solver, and linking this and other knowledge manipulated by the expert by means of explicit data dependencies.

AIM-426

Author[s]: Bruce R. Schatz

The Computation of Immediate Texture Discrimination

August 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-426.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-426.pdf

The computation of immediate texture discrimination involves finding boundaries between regions of differing texture. Various textures are examined to investigate the factors determining discrimination in the limited domain of line-and-point images. Two operators embodying necessary properties are proposed: length and orientation of actual lines and of local virtual lines between terminators. It is conjectured that these are sufficient as well. Relations between this theory and those of Julesz and of Marr are discussed. Supporting psychological evidence is introduced and an implementation strategy outlined.

AIM-425

Author[s]: Gerald Jay Sussman

Electrical Design: A Problem for Artificial Intelligence Research

June 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-425.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-425.pdf

This report outlines the problem of intelligent failure recovery in a problem-solver for electrical design. We want our problem solver to learn as much as it can from its mistakes. Thus we cast the engineering design process in terms of Problem Solving by Debugging Almost-Right Plans, a paradigm for automatic problem solving based on the belief that creation and removal of “bugs” is an unavoidable part of the process of solving a complex problem. The process of localization and removal of bugs called for by the PSBDARP theory requires an approach to engineering analysis in which every result has a justification which describes the exact set of assumptions it depends upon. We have developed a program based on Analysis by Propagation of Constraints which can explain the basis of its deductions. In addition to being useful to a PSBDARP designer, these justifications are used in Dependency-Directed Backtracking to limit the combinatorial search in the analysis routines. Although the research we will describe is explicitly about electrical circuits, we believe that similar principles and methods are employed by other kinds of engineers, including computer programmers.

AIM-424

Author[s]: John M. Hollerbach

The Minimum Energy Movement for a Spring Muscle Model

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-424.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-424.pdf

There are many ways of programming an actuator or effector for movement between the same two points. In the interest of efficiency it is sometimes desirable to program that trajectory which requires the least amount of energy. This paper considers the minimum energy movement for a spring-like actuator abstracted from muscle mechanics and energetics. It is proved that for this actuator a bang-coast-bang actuation pattern minimizes the energy expenditure. For some parameter values this pattern is modified by a singular arc at the first switching point. A surprising limitation on the duration of coast is demonstrated. Some relaxations of the restrictions underlying the spring model are shown to preserve the bang-coast-bang solution.

AIM-423

Author[s]: James L. Stansfield

COMEX: A Support System for a Commodities Expert

August 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-423.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-423.pdf

The intelligent support system project is developing a program (COMEX) to assist a commodities expert in tasks such as interpreting data, predicting trends and intelligent noticing. Large amounts of qualitative and quantitative information about factors such as weather, trade and crop condition need to be managed. This memo presents COMEX-O, a prototype system written in FRL, a frame-based language (Goldstein & Roberts, 1977). COMEX-O has a complaint handling system, frame structure matching and simple reasoning. By conversing with a user, it builds groupings of frame structures to represent events. These are called CLUSTERS and are proposed as a new representation method. New CLUSTERS are built from previously defined ones using INSTANTIATION and AGGREGATION, two methods which combine with frame inheritance and constraints to make up a general event representation mechanism. CLUSTERS capture the idea of generic patterns of relationships between frames and raise an issue named the GENERIC CONSTRAINT PROBLEM concerning constraints between the parts of a cluster. The final section presents plans for future work on qualitative reasoning within COMEX and includes a hypothetical scenario.

AIM-422

Author[s]: K. Forbus

Light Source Effects

May 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-422.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-422.pdf

The perception of surface luster in achromatic single view images seems to depend on the existence of regions with source-like properties. These regions are due to the interaction of the specular component of the surface’s reflectance and the illumination. Light source effects are broken down into three categories according to gross aspects of the physical situation in which they occur, and criteria for detecting the regions they cause are suggested.

AIM-421

Author[s]: Guy Lewis Steele, Jr.

Fast Arithmetic in MACLISP

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-421.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-421.pdf

MacLISP provides a compiler which produces numerical code competitive in speed with some FORTRAN implementations and yet compatible with the rest of the MacLISP system. All numerical programs can be run under the MacLISP interpreter. Additional declarations to the compiler specify type information which allows the generation of optimized numerical code which generally does not require the garbage collection of temporary numerical results. Array accesses are almost as fast as in FORTRAN, and permit the use of dynamically allocated arrays of varying dimensions. Here we discuss the implementation decisions regarding user interface, data representations, and interfacing conventions which allow the generation of fast numerical LISP code.

AIM-420

Author[s]: Guy Lewis Steele, Jr.

Data Representations in PDP-10 MACLISP

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-420.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-420.pdf

The internal representations of the various MacLISP data types are presented and discussed. Certain implementation tradeoffs are considered. The ultimate decisions on these tradeoffs are discussed in the light of MacLISP’s prime objective of being an efficient high-level language for the implementation of large systems such as MACSYMA. The basic strategy of garbage collection is outlined, with reference to the specific representations involved. Certain “clever tricks” are explained and justified. The “address space crunch” is explained and some alternative solutions explored.

AITR-419

Author[s]: Jon Doyle

Truth Maintenance Systems for Problem Solving

January 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-419.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-419.pdf

The thesis developed here is that reasoning programs which take care to record the logical justifications for program beliefs can apply several powerful, but simple, domain-independent algorithms to (1) maintain the consistency of program beliefs, (2) realize substantial search efficiencies, and (3) automatically summarize explanations of program beliefs. These algorithms use the recorded justifications to maintain the consistency and well-founded basis of the set of beliefs. The set of beliefs can be efficiently updated in an incremental manner when hypotheses are retracted and when new information is discovered. The recorded justifications also enable the pinpointing of exactly those assumptions which support any particular belief. The ability to pinpoint the underlying assumptions is the basis for an extremely powerful domain-independent backtracking method. This method, called Dependency-Directed Backtracking, offers vastly improved performance over traditional backtracking algorithms.
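
As a rough illustration of what recording justifications makes possible, the following minimal Python sketch walks recorded justifications back to the assumptions a belief rests on. It is not the system described in the report: the data layout (one justification per belief, an empty justification marking an assumption) and the names `underlying_assumptions` and `beliefs_to_retract` are assumptions made here for illustration only.

```python
from typing import Dict, List, Set

# Hypothetical representation: each belief maps to the list of beliefs
# that justify it. A belief with an empty justification is an assumption.
Justifications = Dict[str, List[str]]


def underlying_assumptions(belief: str, just: Justifications) -> Set[str]:
    """Return the assumptions that the given belief ultimately rests on."""
    found: Set[str] = set()
    seen: Set[str] = set()
    stack = [belief]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        antecedents = just.get(current, [])
        if not antecedents:
            found.add(current)  # no recorded justification: an assumption
        else:
            stack.extend(antecedents)
    return found


def beliefs_to_retract(assumption: str, just: Justifications) -> Set[str]:
    """Beliefs that lose their support when the assumption is retracted."""
    return {b for b in just if assumption in underlying_assumptions(b, just)}


# Illustrative data: A, B, C are assumptions; D, E, F are derived beliefs.
example: Justifications = {
    "A": [], "B": [], "C": [],
    "D": ["A", "B"],
    "E": ["D", "C"],
    "F": ["C"],
}
```

With this data, asking for the assumptions under `"E"` reaches A, B and C, while retracting `"A"` affects only A, D and E and leaves F untouched, which is the kind of selectivity that chronological backtracking lacks.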

AITR-418

Author[s]: Benjamin J. Kuipers

Representing Knowledge of Large-Scale Space

July 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-418.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-418.pdf

This dissertation presents a model of the knowledge a person has about the spatial structure of a large-scale environment: the "cognitive map". The functions of the cognitive map are to assimilate new information about the environment, to represent the current position, and to answer route-finding and relative-position problems. This model (called the TOUR model) analyzes the cognitive map in terms of symbolic descriptions of the environment and operations on those descriptions. Knowledge about a particular environment is represented in terms of route descriptions, a topological network of paths and places, multiple frames of reference for relative positions, dividing boundaries, and a structure of containing regions. The current position is described by the "You Are Here" pointer, which acts as a working memory and a focus of attention. Operations on the cognitive map are performed by inference rules which act to transfer information among different descriptions and the "You Are Here" pointer. The TOUR model shows how the particular descriptions chosen to represent spatial knowledge support assimilation of new information from local observations into the cognitive map, and how the cognitive map solves route-finding and relative-position problems. A central theme of this research is that the states of partial knowledge supported by a representation are responsible for its ability to function with limited information or computational resources. The representations in the TOUR model provide a rich collection of states of partial knowledge, and therefore exhibit flexible, "common-sense" behavior.

AIM-417

Author[s]: Brian Carr

Wusor II: A Computer Aided Instruction Program with Student Modelling Capabilities

May 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-417.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-417.pdf

Wusor II is the second program that has been developed to tutor students in the game of Wumpus. From the earlier efforts with Wusor I it was possible to produce a rule-based expert which possessed a relatively complete mastery of the game. Wusor II endeavors to teach the knowledge embodied in the rules used by the Expert. The Student Model represents Wusor’s estimation of the student’s knowledge of said rules, and this estimation is based primarily on analyses of the player’s moves. The Student Model allows Wusor to personalize its explanations to the student according to the student’s current knowledge of the game. The result is a system which, according to preliminary results, is highly effective at tutoring students of varied abilities.

AIM-416

Author[s]: D. Marr and H.K. Nishihara

Representation and Recognition of the Spatial Organization of Three Dimensional Shapes

May 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-416.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-416.pdf

The human visual process can be studied by examining the computational problems associated with deriving useful information from retinal images. In this paper, we apply this approach to the problem of representing three-dimensional shapes for the purpose of recognition.

AIM-415

Author[s]: D. Marr

Representing Visual Information

May 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-415.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-415.pdf

Vision is the construction of efficient symbolic descriptions from images of the world. An important aspect of vision is the choice of representations for the different kinds of information in a visual scene. In the early stages of the analysis of an image, the representations used depend more on what it is possible to compute from an image than on what is ultimately desirable, but later representations can be more sensitive to the specific needs of recognition. This essay surveys recent work in vision at M.I.T. from a perspective in which the representational problems assume a primary importance. An overall framework is suggested for visual information processing, in which the analysis proceeds through three representations: (1) the primal sketch, which makes explicit the intensity changes and local two-dimensional geometry of an image; (2) the 2 1/2-D sketch, which is a viewer-centered representation of the depth, orientation and discontinuities of the visible surfaces; and (3) the 3-D model representation, which allows an object-centered description of the three-dimensional structure and organization of a viewed shape. Recent results concerning processes for constructing and maintaining these representations are summarized and discussed.

AIM-414A

Author[s]: Patrick H. Winston

Learning by Creating and Justifying Transfer Frames

January 1978

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-414a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-414a.pdf

In the particular kind of learning discussed in this paper, the teacher names a destination and a source. In the sentence, “Robbie is like a fox,” Robbie is the destination and fox is the source. The student, on analyzing the teacher’s instruction, computes a filter called a transfer frame. The transfer frame stands between the source and the destination and determines what information is allowed to pass from one to the other.

AIM-414

Author[s]: Patrick H. Winston

Learning by Creating and Justifying Transfer Frames

January 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-414.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-414.pdf

Learning is defined to be the computation done by a student when there is a transfer of information to him from a teacher. In the particular kind of learning discussed, the teacher names a source and destination. In the sentence, "Robbie is like a fox," fox is the source and Robbie is the destination. The student, on analyzing the teacher's instruction, computes a kind of filter called a transfer frame. It stands between the source and the destination and determines what information is allowed to pass from one to the other.

AIM-413

Author[s]: Candace Bullwinkle

Levels of Complexity in Discourse for Reference Disambiguation and Speech Act Interpretation

May 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-413.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-413.pdf

This paper presents a discussion of means of describing the discourse and its components which makes speech act interpretation and reference disambiguation possible with minimal search of the knowledge in the database. A portion of this paper will consider how a frames representation of sentences and common sense knowledge provides a mechanism for representing the postulated discourse components. Finally some discussion of the use of the discourse model and of frames in a discourse understanding program for a personal assistant will be presented.

AIM-412

Author[s]: Marc Raibert

Control and Learning by the State Space Model: Experimental Findings

April 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-412.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-412.pdf

This is the second of a two part presentation of a model for motor control and learning. The model was implemented using a small computer and the MIT-Scheinman manipulator. Experiments were conducted which demonstrate the controller's ability to learn new movements, adapt to mechanical changes caused by inertial and elastic loading, and generalize its behavior among similar movements. A second generation model, based on improvements suggested by these experiments, is proposed.

AIM-410

Author[s]: Carl Hewitt

Viewing Control Structures as Patterns of Passing Messages

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-410.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-410.pdf

The purpose of this paper is to discuss some organizational aspects of programs using the actor model of computation. In this paper we present an approach to modelling intelligence in terms of a society of communicating knowledge-based problem-solving experts. In turn each of the experts can be viewed as a society that can be further decomposed in the same way until the primitive actors of the system are reached. We are investigating the nature of the communication mechanisms needed for effective problem-solving by a society of experts and the conventions of discourse that make this possible. In this way we hope eventually to develop a framework adequate for the discussion of the central issues of problem-solving involving parallel versus serial processing and centralization versus decentralization of control and information storage.

AIM-409

Author[s]: R. Bruce Roberts and Ira P. Goldstein

The FRL Manual

September 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-409.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-409.pdf

The Frame Representation Language (FRL) is described. FRL is an adjunct to LISP which implements several representation techniques suggested by Minsky's [75] concept of a frame: defaults, constraints, inheritance, procedural attachment and annotation.

AIM-408

Author[s]: R. Bruce Roberts and Ira P. Goldstein

The FRL Primer

July 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-408.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-408.pdf

The Frame Representation Language (FRL) is an experimental language written to explore the use of frames as a knowledge representation technique. The term 'frame' as used in FRL was inspired by Minsky's [75] development of frame theory. FRL extends the traditional Property List representation scheme by allowing properties to have comments, defaults and constraints, to inherit information from abstract forms of the same type, and to have attached procedures triggered by adding or deleting values, or if a value is needed. We introduce FRL with the aid of a simple example: WHOSIS, a database of AI persons' names, addresses, interests and publications. A second section contains an abridged manual describing FRL's most-used commands and conventions.
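
The features listed in this abstract (defaults, inheritance from more abstract frames, and procedures triggered when a value is added) can be sketched roughly as follows. This is a hypothetical Python analogue, not FRL's actual syntax or semantics: the `Frame` class, its method names, the lookup order (explicit values before defaults) and the sample data are all assumptions made for illustration.

```python
from typing import Any, Callable, Dict, List, Optional


class Frame:
    """Hypothetical frame: slots with values, defaults, and if-added hooks."""

    def __init__(self, name: str, parent: Optional["Frame"] = None) -> None:
        self.name = name
        self.parent = parent  # more abstract frame to inherit from
        self.values: Dict[str, Any] = {}
        self.defaults: Dict[str, Any] = {}
        self.if_added: Dict[str, List[Callable[["Frame", Any], None]]] = {}

    def _hooks(self, slot: str) -> List[Callable[["Frame", Any], None]]:
        # Collect attached procedures from this frame and its ancestors.
        own = self.if_added.get(slot, [])
        inherited = self.parent._hooks(slot) if self.parent else []
        return own + inherited

    def put(self, slot: str, value: Any) -> None:
        self.values[slot] = value
        for hook in self._hooks(slot):
            hook(self, value)  # procedure triggered by adding a value

    def get(self, slot: str) -> Any:
        # Assumed order: explicit values up the chain, then defaults.
        frame: Optional[Frame] = self
        while frame is not None:
            if slot in frame.values:
                return frame.values[slot]
            frame = frame.parent
        frame = self
        while frame is not None:
            if slot in frame.defaults:
                return frame.defaults[slot]
            frame = frame.parent
        return None


# Illustrative data in the spirit of a people database (names are made up).
log: List[str] = []
person = Frame("person")
person.defaults["affiliation"] = "MIT AI Lab"
person.if_added["interest"] = [lambda f, v: log.append(f"{f.name}: {v}")]

ira = Frame("ira", parent=person)
ira.put("interest", "frames")
```

Here `ira` has no affiliation of its own, so a lookup falls back to the default inherited from `person`, and adding an interest runs the procedure attached to the abstract frame.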

AIM-407

Author[s]: Ira P. Goldstein and Eric Grimson

Annotated Production Systems: A Model for Skill Acquisition

February 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-407.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-407.pdf

Annotated Production Systems provide a procedural model for skill acquisition by augmenting a production model of the skill with formal commentary describing plans, bugs, and interrelationships between various productions. This commentary supports processes of efficient interpretation, self-debugging and self-improvement. The theory of annotated productions is developed by analyzing the skill of attitude instrument flying. An annotated production interpreter has been written that executes skill models which control a flight simulator. Preliminary evidence indicates that annotated productions effectively model certain bugs and certain learning behaviors characteristic of student pilots.

AIM-406

Author[s]: Brian P. Carr and Ira P. Goldstein

Overlays: A Theory of Modelling for Computer Aided Instruction

February 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-406.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-406.pdf

Overlay modelling is a technique for describing a student's problem solving skills in terms of a modular program designed to be an expert for the given domain. The model is an overlay on the expert program in that it consists of a set of hypotheses regarding the student's familiarity with the skills employed by the expert. The modelling is performed by a set of P rules that are triggered by different sources of evidence, and whose effect is to modify these hypotheses. A P critic monitors these rules to detect discontinuities and inconsistencies in their predictions. A first implementation of overlay modelling exists as a component of WUSOR-II, a CAI program based on artificial intelligence techniques. WUSOR-II coaches a student in the logical and probability skills required to play the computer game WUMPUS. Preliminary evidence indicates that overlay modelling significantly improves the appropriateness of the tutoring program's explanations.

AIM-405

Author[s]: Ira P. Goldstein and R. Bruce Roberts

NUDGE, A Knowledge-Based Scheduling Program

February 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-405.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-405.pdf

Traditional scheduling algorithms (using the techniques of PERT charts, decision analysis or operations research) require well-defined quantitative, complete sets of constraints. They are insufficient for scheduling situations where the problem description is ill-defined, involving incomplete, possibly inconsistent and generally qualitative constraints. The NUDGE program uses an extensive knowledge base to debug scheduling requests by supplying missing details and resolving minor inconsistencies. The result is that an informal request is converted to a complete description suitable for a traditional scheduler.

AITR-403

Author[s]: Richard Brown

Use of Analogy to Achieve New Expertise

April 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-403.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-403.pdf

We will take the view that the end result of problem solving in some world should be increased expertness. In the context of computers, increasing expertness means writing programs. This thesis is about a process, reasoning by analogy, that writes programs. Analogy relates one problem world to another. We will call the world in which we have an expert problem solver the IMAGE world, and the other world the DOMAIN world. Analogy will construct an expert problem solver in the domain world using the image world expert for inspiration.

AITR-402

Author[s]: Drew Vincent McDermott

Flexibility and Efficiency in a Computer Program for Designing Circuits

June 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-402.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-402.pdf

This report is concerned with the problem of achieving flexibility (additivity, modularity) and efficiency (performance, expertise) simultaneously in one AI program. It deals with the domain of elementary electronic circuit design. The proposed solution is to provide a deduction-driven problem solver with built-in-control-structure concepts. This problem solver and its knowledge base in the application areas of design and electronics are described. The program embodying it is being used to explore the solution of some modest problems in circuit design. It is concluded that shallow reasoning about problem-solver plans is necessary for flexibility, and can be implemented with reasonable efficiency.

AIM-401

Author[s]: Jeanne Bamberger

Development of Musical Intelligence II: Children's Representation of Pitch Relations

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-401.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-401.pdf

The work reported here is an outgrowth of studies in the development of musical intelligence and learning that have been underway for about four years. Beginning as one of the activities in the LOGO Lab (a part of the MIT Artificial Intelligence Laboratory) the research has expanded to include more theoretical work in the MIT Division for Study and Research in Education.

AIM-400

Author[s]: Vaughan R. Pratt

The Competence/Performance Dichotomy in Programming

January 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-400.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-400.pdf

We consider the problem of automating some of the duties of programmers. We take as our point of departure the claim that data management has been automated to the point where the programmer concerned only about the correctness (as opposed to the efficiency) of his program need not involve himself in any aspect of the storage allocation problem. We focus on what we feel is a sensible next step, the problem of automating aspects of control. To accomplish this we propose a definition of control based on a fact/heuristic dichotomy, a variation of Chomsky's competence/performance dichotomy. The dichotomy formalizes an idea originating with McCarthy and developed by Green, Hewitt, McDermott, Sussman, Hayes, Kowalski and others. It allows one to operate arbitrarily on the control component of a program without affecting the program's correctness, which is entirely the responsibility of the fact component. The immediate objectives of our research are to learn how to program keeping fact and control separate, and to identify those aspects of control amenable to automation.

AIM-399

Author[s]: Akinori Yonezawa and Carl Hewitt

Symbolic Evaluation Using Conceptual Representations for Programs with Side-Effects

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-399.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-399.pdf

Symbolic evaluation is a process which abstractly evaluates a program on abstract data. A formalism based on conceptual representations is proposed as a specification language for programs with side-effects. Relations between algebraic specifications and specifications based on conceptual representations are discussed and limitations of the current algebraic specification techniques are pointed out. Symbolic evaluation is carried out with explicit use of a notion of situations. Uses of situational tags in assertions make it possible to state relations about properties of objects in different situations. The proposed formalism can deal with problems of side-effects which have been beyond the scope of Floyd-Hoare proof rules and give a solution to McCarthy’s frame problem.

AIM-398

Author[s]: Jeanne Bamberger

Capturing Intuitive Knowledge in Procedural Description

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-398.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-398.pdf

Trying to capture intuitive knowledge is a little like trying to capture the moment between what just happened and what is about to happen. Or to quote a famous philosopher, “You can’t put your foot in the same river once.” The problem is that you can only “capture” what stands still. Intuitive knowledge is not a static structure, but rather a continuing process of constructing coherence and meaning out of the sensory phenomena that come at you. To capture intuitive knowledge, then, means: Given some phenomena, what are your spontaneous ways of selecting significant features or for choosing what constitutes an element; how do you determine what is the same and what is different; how do you aggregate or chunk the sensory data before you?

AITR-397

Author[s]: Tomas Lozano-Perez

The Design of a Mechanical Assembly System

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-397.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-397.pdf

This thesis describes a mechanical assembly system called LAMA (Language for Automatic Mechanical Assembly). The goal of the work was to create a mechanical assembly system that transforms a high-level description of an automatic assembly operation into a program for execution by a computer controlled manipulator. This system allows the initial description of the assembly to be in terms of the desired effects on the parts being assembled. Languages such as WAVE [Bolles & Paul] and MINI [Silver] fail to meet this goal by requiring the assembly operation to be described in terms of manipulator motions. This research concentrates on the spatial complexity of mechanical assembly operations. The assembly problem is seen as the problem of achieving a certain set of geometrical constraints between basic objects while avoiding unwanted collisions. The thesis explores how these two facets, desired constraints and unwanted collisions, affect the primitive operations of the domain.

AIM-396

Author[s]: Cynthia J. Solomon

Teaching the Computer to Add: An Example of Problem-Solving in an Anthropomorphic Computer Culture

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-396.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-396.pdf

Computers open up new ways to think about knowledge and learning. Learning computer science should draw upon and feed these new approaches. In a previous paper called “Leading a Child to a Computer Culture” I discuss some ways to do so in a very elementary context. This paper is a contribution to extending such thinking to a more advanced project.

AIM-395

Author[s]: Robert Lawler

Pre-Readers' Concepts of the English Word

November 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-395.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-395.pdf

Pre-Readers exhibit concepts of the English word different from those of literate adults. The inclusive word concept is primary: A word is what we call an utterance and any of its parts. Pre-Readers suffer confusion between homophones at the syllabic level, e.g., the sound of the suffix in “PUPPY” is confused with the name of the letter. Conflict between implicit judgments of wordhood (inferred from the child’s counting of the number of words in an utterance) and explicit judgments (responses to questions about whether an item is a word) varies from high, for pre-readers, to low, for beginning readers. The justifications pre-readers offer to support their judgments of wordhood are notable for not including any arguments based on immediate verbal context. A concept development theory is offered to interpret these data and their relation to learning to read.

AIM-394

Author[s]: Johan de Kleer

Local Methods for Localizing Faults in Electronic Circuits

November 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-394.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-394.pdf

The work described in this paper is part of an investigation of the issues involved in making expert problem solving programs for engineering design and for maintenance of engineered systems. In particular, the paper focuses on the troubleshooting of electronic circuits. Only the individual properties of the components are used, and not the collective properties of groups of components. The concept of propagation is introduced which uses the voltage-current properties of components to determine additional information from given measurements. Two propagated values can be discovered for the same point. This is called a coincidence. In a faulted circuit, the assumptions made about components in the coinciding propagations can then be used to determine information about the faultiness of these components. In order for the program to deal with actual circuits, it handles errors in measurement readings and tolerances in component parameters. This is done by propagating ranges of numbers instead of single numbers. Unfortunately, the comparing of ranges introduces many complexities into the theory of coincidences. In conclusion, we show how such local deductions can be used as the basis for qualitative reasoning and troubleshooting.
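
As an illustration of the range propagation described in this abstract, the following minimal Python sketch propagates a voltage range through a resistor with a tolerance and then checks a coincidence between the propagated current and a measured one. The Range class, its method names and the component values are hypothetical and are not taken from the memo; the sketch only handles positive ranges.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    """A closed interval [lo, hi] standing in for an uncertain quantity (hypothetical helper)."""
    lo: float
    hi: float

    def __post_init__(self):
        if self.lo > self.hi:
            raise ValueError("lo must not exceed hi")

    def divided_by(self, other: "Range") -> "Range":
        # Interval division, restricted to positive ranges to keep the sketch short.
        if self.lo < 0 or other.lo <= 0:
            raise ValueError("this sketch only handles positive ranges")
        return Range(self.lo / other.hi, self.hi / other.lo)

    def overlaps(self, other: "Range") -> bool:
        # Two ranges for the same point are compatible if they share any value.
        return self.lo <= other.hi and other.lo <= self.hi


# Assumed example values: a 5.0 V reading with 0.1 V error across a 100 ohm, 5% resistor.
voltage = Range(4.9, 5.1)
resistance = Range(95.0, 105.0)

# Propagation: Ohm's law applied to ranges instead of single numbers.
propagated_current = voltage.divided_by(resistance)

# Coincidence: a second value for the same current, here from a direct measurement.
measured_ok = Range(0.048, 0.052)
measured_bad = Range(0.070, 0.075)

print(propagated_current.overlaps(measured_ok))
print(propagated_current.overlaps(measured_bad))
```

When the two ranges do not overlap, at least one assumption behind the propagations (for example, that the resistor is within tolerance) must be wrong, which is the kind of information the abstract says coincidences provide.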

AIM-393

Author[s]: Harold Abelson and Andy diSessa

Student Science Training Program in Mathematics, Physics and Computer Science

September 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-393.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-393.pdf

During the summer of 1976, the Massachusetts Institute of Technology Artificial Intelligence Laboratory sponsored a Student Science Training Program in Mathematics, Physics and Computer Science for high ability secondary school students. This report describes, in some detail, the style of the program, the curriculum and the projects the students undertook.

AIM-392

Author[s]: Kent A. Stevens

Computation of Locally Parallel Structure

March 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-392.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-392.pdf

A Moire-like effect can be observed in dot patterns consisting of two superimposed copies of a random dot pattern where one copy has been expanded, translated, or rotated. One perceives in these patterns a structure that is locally parallel. Our ability to perceive this structure is shown by experiment to be limited by the local geometry of the pattern, independent of the overall structure or the dot density. A simple representation of locally parallel structure is proposed, and it is found to be computable by a non-iterative, parallel algorithm. An implementation of this algorithm is demonstrated. Its performance parallels that observed experimentally, providing a potential explanation for human performance. Advantages are discussed for the early description of locally parallel structure in the course of visual processing.

AIM-391

Author[s]: Neil Rowe

Grammar as a Programming Language

October 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-391.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-391.pdf

This paper discusses some student projects involving generative grammars. While grammars are usually associated with linguistics, their usefulness goes far beyond just “language” to many different domains. Their application is general enough to make grammars a sort of programming language in their own right.

AIM-389

Author[s]: Ira Goldstein

The Computer as Coach: An Athletic Paradigm for Intellectual Education

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-389.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-389.pdf

Over the next five years, computer games will find their way into a vast number of American homes, creating a unique educational opportunity: the development of “computer coaches” for the serious intellectual skills required by some of these games. From the player’s perspective, the coach will provide advice regarding strategy and tactics for better play. But, from the perspective of the coach, the request for help is an opportunity to tutor basic mathematical, scientific or other kinds of knowledge that the game exercises.

AIM-388

Author[s]: Mark L. Miller and Ira P. Goldstein

PAZATN: A Linguistic Approach to Automatic Analysis of Elementary Programming Protocols

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-388.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-388.pdf

PATN is a design for a machine problem solver which uses an augmented transition network (ATN) to represent planning knowledge. In order to explore PATN’s potential as a theory of human problem solving, a linguistic approach to protocol analysis is presented. An interpretation of a protocol is taken to be a parse tree supplemented by semantic and pragmatic annotation attached to various nodes. This paradigm has implications for constructing a cognitive model of the individual and designing computerized tutors.

AIM-387

Author[s]: Ira P. Goldstein and Mark L. Miller

Structured Planning and Debugging: A Linguistic Theory of Design

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-387.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-387.pdf

A unified theory of planning and debugging is explored by designing a problem solving program called PATN. PATN uses an augmented transition network (ATN) to represent a broad range of planning techniques, including identification, decomposition, and reformulation. (The ATN [Woods 1970] is a simple yet powerful formalism which has been effectively utilized in computational linguistics.)

AIM-386

Author[s]: Mark L. Miller and Ira P. Goldstein

SPADE: A Grammar Based Editor for Planning and Debugging Programs

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-386.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-386.pdf

A grammar of plans is developed from a taxonomy of basic planning techniques. This grammar serves as the basis for the design of a new kind of interactive programming environment (SPADE), in which programs are generated by explicitly articulating planning decisions. The utility of this approach to program definition is that a record of these decisions, called the plan derivation, provides guidance for subsequent modification or debugging of the program.

AIM-385

Author[s]: Mark L. Miller and Ira P. Goldstein

Parsing Protocols Using Problem Solving Grammars

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-385.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-385.pdf

A theory of the planning and debugging of programs is formalized as a context free grammar. The grammar is used to reveal the constituent structure of problem solving episodes, by parsing protocols in which programs are written, tested and debugged. This is illustrated by the detailed analysis of an actual session with a beginning student. The virtues and limitations of the context free formalism are considered.

AIM-384

Author[s]: Ira P. Goldstein and Mark L. Miller

AI Based Personal Learning Environments: Directions for Long Term Research

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-384.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-384.pdf

The application of artificial intelligence (AI) techniques to the design of personal learning environments is an enterprise of both theoretical and practical interest. In the short term, the process of developing and testing intelligent tutoring programs serves as a new experimental vehicle for exploring alternative cognitive and pedagogical theories. In the long term, such programs should supplement the educational supervision and guidance provided by human teachers. This paper illustrates our long term perspective by a scenario with a hypothetical tutoring system for elementary graphics programming.

AIM-383A

Author[s]: Mark L. Miller and Ira P. Goldstein

Overview of a Linguistic Theory of Design

February 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-383a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-383a.pdf

The SPADE theory uses linguistic formalisms to model the program planning and debugging processes. The theory has been applied to constructing a grammar-based editor in which programs are written in a structured fashion, designing an automatic programming system based on an Augmented Transition Network, and parsing protocols of programming episodes.

AIM-383

Author[s]: Mark L. Miller and Ira P. Goldstein

Overview of a Linguistic Theory of Design

February 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-383.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-383.pdf

SPADE is a theory of the design of computer programs in terms of complementary planning and debugging processes. An overview of the authors’ recent research on this theory is provided. SPADE borrows tools from computational linguistics – grammars, augmented transition networks (ATN’s), chart-based parsers – to formalize planning and debugging. The theory has been applied to parsing protocols of programming episodes, constructing a grammar-based editor in which programs are written in a structured fashion, and designing an automatic programming system based on the ATN formalism.

AIM-382

Author[s]: Steven T. Rosenberg

Dual Coding and the Representation of Letter Strings

July 1977

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-382.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-382.pdf

Sub-strings derived from four-letter strings (e.g. ABCD) were presented to subjects using a variation on Bransford and Franks’ (1971) paradigm. Each string was in either upper or lower case. Subjects were then tested for recognition of the strings, false recognition of translations of the strings into the other case, and false recognitions of new but legal strings. Subjects accepted previously seen strings most frequently, followed by translations, with new strings accepted least often. This replicated Rosenberg and Simon’s (in press) findings with sentences and pictures that express the same concept. However, in the present experiment the two forms of a string were unbiased with respect to verbal or pictorial encoding. The forms in which a string could appear (upper or lower case) were not confounded with the two types of encoding (verbal and pictorial) hypothesized by a dual coding theory. The results supported the view that the previously reported difference between the original form and a translation is best explained by a model which uses a single representation that preserves some form distinctions.

AIM-381

Author[s]: James L. Stansfield, Brian P. Carr and Ira P. Goldstein

Wumpus Advisor 1: A First Implementation Program that Tutors Logical and Probabilistic Reasoning Skills

October 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-381.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-381.pdf

The Wumpus Advisor program offers advice to a player involved in choosing the best move in a game for which competence in dealing with incomplete and uncertain knowledge is required. The design and implementation of the advisor explores a new paradigm in Computer Assisted Instruction, in which the performance of computer-based tutors is greatly improved through the application of Artificial Intelligence techniques. This report describes the design of the Advisor and outlines directions for further work. Our experience with the tutor is informal and psychological experimentation remains to be done.

AIM-380

Author[s]: Richard M. Stallman and Gerald Jay Sussman

Forward Reasoning and Dependency-Directed Backtracking in a System for Computer-Aided Circuit Analysis

September 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-380.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-380.pdf

We present a rule-based system for computer-aided circuit analysis. The set of rules, called EL, is written in a rule language called ARS. Rules are implemented by ARS as pattern-directed invocation demons monitoring an associative data base. Deductions are performed in an antecedent manner, giving EL’s analysis a catch-as-catch-can flavor suggestive of the behavior of expert circuit analyzers. We call this style of circuit analysis propagation of constraints. The system threads deduced facts with justifications which mention the antecedent facts and the rule used. These justifications may be examined by the user to gain insight into the operation of the set of rules as they apply to a problem. The same justifications are used by the system to determine the currently active data-base context for reasoning in hypothetical situations. They are also used by the system in the analysis of failures to reduce the search space. This leads to effective control of combinatorial search which we call dependency-directed backtracking.

AIM-379

Author[s]: Guy Lewis Steele Jr.

LAMBDA: The Ultimate Declarative

November 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-379.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-379.pdf

In this paper, a sequel to "LAMBDA: The Ultimate Imperative", a new view of LAMBDA as a renaming operator is presented and contrasted with the usual functional view taken by LISP. This view, combined with the view of function invocation as a kind of generalized GOTO, leads to several new insights into the nature of the LISP evaluation mechanism and the symmetry between form and function, evaluation and application, and control and environment. It also complements Hewitt's actors theory nicely, explaining the intent of environment manipulation as cleanly, generally, and intuitively as the actors theory explains control structures. The relationship between functional and continuation-passing styles of programming is also clarified. This view of LAMBDA leads directly to a number of specific techniques for use by an optimizing compiler: 1.) Temporary locations and user-declared variables may be allocated in a uniform manner. 2.) Procedurally defined data structures may compile into code as good as would be expected for data defined by the more usual declarative means. 3.) Lambda-calculus-theoretic models of such constructs as GOTO, DO loops, call-by-name, etc. may be used directly as macros, the expansion of which may then compile into code as good as that produced by compilers which are designed especially to handle GOTO, DO, etc. The necessary characteristics of such a compiler designed according to this philosophy are discussed. Such a compiler is to be built in the near future as a testing ground for these ideas.
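
To make the contrast between functional and continuation-passing styles concrete, here is a small Python sketch (the memo itself works in LISP; the function names below are hypothetical). In the continuation-passing version every call passes along a function that says what to do with the result, so a call behaves like a jump that carries its arguments rather than something that must return.

```python
def factorial_direct(n):
    """Functional (direct) style: each call returns a value to its caller."""
    return 1 if n == 0 else n * factorial_direct(n - 1)


def factorial_cps(n, k):
    """Continuation-passing style: the continuation k receives the result."""
    if n == 0:
        return k(1)
    # The pending multiplication is packaged into a new continuation
    # instead of being left on the caller's stack.
    return factorial_cps(n - 1, lambda value: k(n * value))


print(factorial_direct(5))
print(factorial_cps(5, lambda value: value))
```

Python does not eliminate tail calls, so this sketch only illustrates the shape of the transformation, not the compilation strategy discussed in the abstract.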

AIM-378

Author[s]: Guy L. Steele Jr.

Arithmetic Shifting Considered Harmful

September 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-378.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-378.pdf

For more than a decade there has been great confusion over the semantics of the standard "arithmetic right shift" instruction. This confusion particularly afflicts authors of computer reference handbooks and of optimizing compilers. The fact that shifting is not always equivalent to division has been rediscovered over and over again over the years, but has never been publicized. This paper quotes a large number of sources to prove the widespread extent of this confusion, and then proceeds to a short discussion of the problem itself and what to do about it.
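
The discrepancy can be seen in a few lines of Python, whose >> operator on integers acts as an arithmetic (sign-preserving) shift. This is an illustrative sketch rather than an example from the memo; the helper names are hypothetical, and truncating_divide models the round-toward-zero division that many machines and languages provide.

```python
def arithmetic_shift_right(n: int, k: int) -> int:
    """Arithmetic right shift; on negative values it rounds toward minus infinity."""
    return n >> k


def truncating_divide(n: int, d: int) -> int:
    """Integer division that rounds toward zero."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


# Compare shifting right by one bit with dividing by two.
for n in (7, -7, -8, -1):
    print(n, arithmetic_shift_right(n, 1), truncating_divide(n, 2))
```

The two operations agree for non-negative values and for negative values that are exact multiples of the divisor, and differ by one for other negative values, for instance -7 shifted right by one gives -4 while -7 divided by 2 with truncation gives -3.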

AIM-377

Author[s]: D. Marr and H.K. Nishihara

Representation and Recognition of the Spatial Organization of Three-Dimensional

August 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-377.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-377.pdf

A method is given for representing 3-D shapes. It is based on a hierarchy of stick figures (called 3-D models), where each stick corresponds to an axis in the shape’s generalized cone representation. Although the representation of a complete shape may contain many stick figures at different levels of detail, only one stick figure is examined at a time while the representation is being used to interpret an image. By thus balancing scope of description against detail, the complexity of the computations needed to support the representation is minimized. The method requires (a) a database of stored stick figures; (b) a simple device called the image-space processor for moving between object-centered and viewer-centered coordinate frames; and (c) a process for “relaxing” a stored model onto the image during recognition. The relation of the theory to “mental rotation” phenomena is discussed, and some critical experimental predictions are made.

AIM-376

Author[s]: Harold Abelson

Computational Geometry of Linear Threshold Functions

July 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-376.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-376.pdf

Linear threshold machines are defined to be those whose computations are based on the outputs of a set of linear threshold decision elements. The number of such elements is called the rank of the machine. An analysis of the computational geometry of finite-rank linear threshold machines, analogous to the analysis of finite-order perceptrons given by Minsky and Papert, reveals that the use of such machines as “general purpose pattern recognition systems” is severely limited. For example, these machines cannot recognize any topological invariant, nor can they recognize non-trivial figures “in context”.

AIM-375

Author[s]: Cynthia J. Solomon and Seymour Papert

A Case Study of a Young Child Doing Turtle Graphics in LOGO

July 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-375.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-375.pdf

This paper explores some important issues with regard to using computers in education. It probes into the question of what programming ideas and projects will engage young children. In particular, a seven year old child’s involvement in turtle graphics is presented as a case study.

AIM-374

Author[s]: Glen Speckert

A Computerized Look at Cat Locomotion or One Way to Scan a Cat

July 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-374.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-374.pdf

This paper describes a three phase project concerning the watching, analyzing, and describing of motions of a cat in various gaits. All data is based on two 16mm films of an actual cat moving on a treadmill. In phase I, the low level issues of tracking key points on the cat from frame to frame are discussed. Phase II deals with building and using a graphics tool to analyze the data of phase I. Phase III is a high level discussion of cat locomotion based on the trajectories and movements explored by phase II.

AIM-373

Author[s]: Seymour A. Papert

Some Poetic and Social Criteria for Education Design

June 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-373.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-373.pdf

Ten years is in some ways a challenging and in some ways a very awkward period for predicting the impact of computers in education. If you asked me whether the practice of education will have undergone a fundamental change through the impact of computers in either five years or in twenty-five years, I could answer with complete confidence “NO” to the first question and “YES” to the second. But what happens in the ten years depends very sensitively on how hard we try; on when the people with the requisite financial, intellectual and moral resources recognize the opportunity and the urgency of action. If we act smartly it is still possible that by 1985 the existence of model schools and learning centers will have changed the ball-park in which society sets the sights of its educational ambitions.

AIM-372

Author[s]: D. Marr

Analysis of Occluding Contour

October 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-372.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-372.pdf

Almost nothing can be deduced about a general 3-D surface given only its occluding contours in an image, yet contour information is easily and effectively used by us to infer the shape of a surface. Therefore, implicit in the perceptual analysis of occluding contour must lie various assumptions about the viewed surfaces. The assumptions that seem most natural are (a) that the distinction between convex and concave segments reflects real properties of the viewed surface; and (b) that contiguous portions of contour arise from contiguous parts of the viewed surface – i.e. there are no invisible obscuring edges. It is proved that, for smooth surfaces, these assumptions are essentially equivalent to assuming that the viewed surface is a generalized cone. Methods are defined for finding the axis of such a cone, and for segmenting a surface constructed of several cones into its components, whose axes can then be found separately. These methods, together with the algorithms for implementing them devised by Vatan & Marr (1977), provide one link between an uninterpreted figure extracted from an image, and the 3-D representation theory of Marr and Nishihara (1977).

AIM-371

Author[s]: Seymour A. Papert

Proposal to NSF: An Evaluative Study of Modern Technology in Education

June 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-371.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-371.pdf

This proposal to the NSF describes a new phase of research planned in LOGO. Previous phases have concentrated on developing a conceptual superstructure (theories and teaching methods) and a material infrastructure (hardware and software) for a new style of using computers in education. We now want to test, to prove and to disseminate the results of our work, which will, of course, continue along the lines of the early phases. Part 1 is an overview of where we are and what we have to do next in the historical framework of the uses of computers for education. Parts 2 and 3 focus more on the specific content of the work planned for the next three years (1976-79).

AIM-370

Author[s]: Eugene C. Freuder

Synthesizing Constraint Expressions

July 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-370.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-370.pdf

An algorithm is presented for determining the values which simultaneously satisfy a set of relations, or constraints, involving different subsets of n variables. The relations are represented in a series of constraint networks, which ultimately contain a node for every subset of the n variables. Constraints may be propagated through such networks in (potentially) parallel fashion to determine the values which simultaneously satisfy all the constraints. The iterated constraint propagation serves to mitigate combinatorial explosion. Applications in scene analysis, graph theory, and backtrack search are provided.
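
As a rough illustration of the constraint propagation idea mentioned in this abstract, the following Python sketch prunes variable domains using pairwise constraints only. It is not the memo's synthesis algorithm, which builds networks containing nodes for larger subsets of variables; the function name `propagate`, the data layout, and the ordering example are assumptions made for illustration.

```python
from collections import deque

def propagate(domains, constraints):
    """Prune values that cannot take part in any pairwise-consistent assignment.

    domains: dict mapping variable -> set of candidate values
    constraints: dict mapping (x, y) -> predicate(value_of_x, value_of_y)
    Returns a new dict of pruned domains; the inputs are not modified.
    """
    domains = {v: set(vals) for v, vals in domains.items()}
    # Make every constraint usable in both directions.
    arcs = {}
    for (x, y), pred in constraints.items():
        arcs[(x, y)] = pred
        arcs[(y, x)] = lambda a, b, p=pred: p(b, a)
    queue = deque(arcs)
    while queue:
        x, y = queue.popleft()
        pred = arcs[(x, y)]
        # Keep only the values of x supported by at least one value of y.
        supported = {a for a in domains[x] if any(pred(a, b) for b in domains[y])}
        if supported != domains[x]:
            domains[x] = supported
            # x's domain shrank, so arcs that point at x must be re-checked.
            queue.extend((z, w) for (z, w) in arcs if w == x and z != y)
    return domains

# Hypothetical example: three variables constrained by A < B < C.
result = propagate(
    {"A": {1, 2, 3}, "B": {1, 2, 3}, "C": {1, 2, 3}},
    {("A", "B"): lambda a, b: a < b, ("B", "C"): lambda b, c: b < c},
)
```

In this small example, repeated local pruning alone leaves a single value per variable, so no search is needed; in general, propagation only narrows the domains and a search step may still be required.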

AIM-369

Author[s]: David Taenzer

Physiology and Psychology of Color Vision -- A Review

August 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-369.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-369.pdf

This paper is a review of the anatomy, physiology, and psychology of human color vision.

AIM-368

Author[s]: Richard C. Waters

A System for Understanding Mathematical FORTRAN Programs

August 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-368.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-368.pdf

This paper proposes a system which, when implemented, will be able to understand mathematical FORTRAN programs such as those in the IBM Scientific Subroutine Package. The system takes, as input, a program and annotation of the program. In order to understand the program, the system develops a “plan” for it. The “plan” specifies the purpose of each feature of the program, and how these features cooperate in order to create the behavior exhibited by the program. The system can use its understanding of the program to answer questions about it including questions about the ramifications of a proposed modification. It is also able to aid in debugging the program by detecting errors in it, and by locating the features of the program which are responsible for an error. The system should be of significant assistance to a person who is writing a program.

AIM-367

Author[s]: Shimon Ullman

Filling in the Gaps: The Shape of Subjective Contours and a Model for Their Generation

October 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-367.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-367.pdf

The properties of isotropy, smoothness, minimum curvature and locality suggest the shape of filled-in contours between two boundary edges. The contours are composed of the arcs of two circles tangent to the given edges, meeting smoothly, and minimizing the total curvature. It is shown that shapes meeting all the above requirements can be generated by a network which performs simple, local computations. It is suggested that the filling-in process plays an important role in the early processing of visual information.

AIM-366

Author[s]: Patrick H. Winston

Proposal to the Advanced Research Projects Agency

May 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-366.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-366.pdf

This is the substance of a proposal submitted in June, 1975, for research in the areas of large data bases and intelligent terminals, applications of machine vision and manipulation, basic studies in Artificial Intelligence, and LISP machine development.

AIM-365

Author[s]: Berthold K.P. Horn and Patrick H. Winston

A Laboratory Environment for Applications Oriented Vision and Manipulation

May 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-365.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-365.pdf

This report is a brief summary guide to work done in the M.I.T. Artificial Intelligence Laboratory directed at the production of tools for productivity technology research. For detailed coverage of the work, readers should use this summary as an introduction to the reports and papers listed in the bibliography.

AIM-364

Author[s]: D. Marr and T. Poggio

Cooperative Computation of Stereo Disparity

June 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-364.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-364.pdf

The extraction of stereo disparity information from two images depends upon establishing a correspondence between them. This article analyzes the nature of the correspondence computation, and derives a cooperative algorithm that implements it. We show that this algorithm successfully extracts information from random-dot stereograms, and its implications for the psychophysics and neurophysiology of the visual system are briefly discussed.

AIM-363

Author[s]: Kent A. Stevens

Occlusion Clues and Subjective Contours

June 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-363.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-363.pdf

The paper describes some experiments with a visual agnosia patient who has lost the ability to perceive subjective contours. The patient’s interpretations of simple examples of occlusion indicate that he fails to notice monocular occlusion clues, as well. The findings support the hypothesis that subjective contours are constructions that account for occluded figures, in the absence of objective edges. The patient’s ability to perceive contours by stereopsis demonstrates that stereopsis independently gives rise to disparity contours. Furthermore, the overall results strongly suggest that the detection of occlusion is modularized, and that the module for detecting

AITR-362

Author[s]: Allen Brown

Qualitative Knowledge, Causal Reasoning and the Localization of Failures

November 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-362.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-362.pdf

This report investigates some techniques appropriate to representing the knowledge necessary for understanding a class of electronic machines -- radio receivers. A computational performance model - WATSON - is presented. WATSON's task is to isolate failures in radio receivers whose principles of operation have been appropriately described in his knowledge base. The thesis of the report is that hierarchically organized representational structures are essential to the understanding of complex mechanisms. Such structures lead not only to descriptions of machine operation at many levels of detail, but also offer a powerful means of organizing "specialist" knowledge for the repair of machines when they are broken.

AIM-361

Author[s]: Henry Lieberman

The TV Turtle: A Logo Graphics System for Raster Displays

June 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-361.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-361.pdf

Until recently, most computer graphics systems have been oriented toward the display of line drawings, continually refreshing the screen from a display list of vectors. Developments such as plasma panel displays and rapidly declining memory prices have now made feasible raster graphics systems, which instead associate some memory with each point on the screen, and display points according to the contents of the memory. This paper discusses the advantages and limitations of such systems. Raster systems permit operations which are not feasible on vector displays, such as reading directly from the screen as well as writing it, and manipulating two dimensional areas as well as vectors. Conceptual differences between programming for raster and vector systems are illustrated with a description of the author’s TV Turtle, a graphics system for raster scan video display terminals. This system is embedded in Logo, a Lisp-like interactive programming language designed for use by kids, and is based on Logo’s turtle geometry approach to graphics. Logo provides powerful ideas for using graphics which are easy for kids to learn, yet generalize naturally when advanced capabilities such as primitives for animation and color are added to the system.

AIM-360

Author[s]: Radia Perlman

Using Computer Technology to Provide a Creative Learning Environment for Preschool Children

May 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-360.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-360.pdf

TORTIS is a system of special terminals together with software which is designed to provide programming capability and be accessible for use by very young children. The system is designed to add capabilities in small increments so that the child is never overwhelmed by too much to learn at one time, and maintains a feeling of control over the environment. This system facilitates learning of various concepts such as relative size of numbers, frames of reference, procedures, conditionals, and recursion, but more importantly it teaches good problem solving techniques and a healthy approach to learning.

AIM-359

Author[s]: Benjamin Kuipers

Spatial Knowledge

June 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-359.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-359.pdf

This paper introduces a model of spatial cognition to describe the states of partial knowledge that people have about the spatial structure of a large-scale environment. Spatial knowledge has several different representations, each of which captures one aspect of the geography. With knowledge stored in multiple representations, we must examine the procedures for assimilating new information, for solving problems, and for communicating information between representations. The model centers on an abstract machine called the TOUR machine, which executes a description of the route to drive the “You Are Here” pointer (a small working memory) through a map that describes the geography. Representations for local and global spatial knowledge are discussed in detail. The model is compared with a survey of the psychological literature. Finally, the directions of necessary and desirable future research are outlined.

AIM-358

Author[s]: Joseph D. Cohen

The Text-Justifier TJ6

May 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-358.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-358.pdf

This memo, intended as both a reference and user’s manual, describes the text-justifying program TJ6, which compiles a neat output document from a sloppy input manuscript. TJ6 can justify and fill text; automatically number pages and figures; control page format and indentation; underline, superscript, and subscript; print a table of contents; etc.
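
For readers unfamiliar with the terms, "filling" packs as many words as fit onto each line and "justifying" pads the gaps so that lines end flush with the right margin. The Python sketch below illustrates one simple greedy way to do both. It is not TJ6's implementation; the function name `fill_and_justify` and the rule of giving extra spaces to the leftmost gaps are assumptions chosen for illustration.

```python
def fill_and_justify(text, width):
    """Greedily fill words into lines, then pad gaps so lines reach `width`."""
    lines, current = [], []
    for word in text.split():
        # Start a new line when adding the next word would overflow the width.
        if current and len(" ".join(current + [word])) > width:
            lines.append(current)
            current = []
        current.append(word)
    if current:
        lines.append(current)

    out = []
    for idx, line in enumerate(lines):
        is_last = idx == len(lines) - 1
        if is_last or len(line) == 1:
            # The last line and single-word lines are left-aligned, not padded.
            out.append(" ".join(line))
            continue
        gaps = len(line) - 1
        spaces = width - sum(len(w) for w in line)
        base, extra = divmod(spaces, gaps)
        # Every gap gets `base` spaces; the leftmost `extra` gaps get one more.
        pieces = [w + " " * (base + (1 if i < extra else 0))
                  for i, w in enumerate(line[:-1])]
        pieces.append(line[-1])
        out.append("".join(pieces))
    return out

justified = fill_and_justify("the quick brown fox jumps over the lazy dog", 15)
```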

AIM-357

Author[s]: D. Marr and T. Poggio

From Understanding Computation to Understanding Neural Circuitry

May 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-357.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-357.pdf

The CNS needs to be understood at four nearly independent levels of description: (1) that at which the nature of computation is expressed; (2) that at which the algorithms that implement a computation are characterized; (3) that at which an algorithm is committed to particular mechanisms; and (4) that at which the mechanisms are realized in hardware. In general, the nature of a computation is determined by the problem to be solved, the mechanisms that are used depend upon the available hardware, and the particular algorithms chosen depend on the problem and on the available mechanisms. Examples are given of theories at each level.

AIM-356

Author[s]: H. Abelson, J. Bamberger, I. Goldstein, and S. Papert

Logo Progress Report 1973-1975

September 1975 (Revised March 1976)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-356.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-356.pdf

Over the past two years, the Logo Project has grown along many dimensions. This document provides an overview in outline form of the main activities and accomplishments of the past as well as the major goals guiding our current research. Research on the design of learning environments, the corresponding development of a theory of learning and the exploration of teaching activities in these environments is presented.

AIM-355

Author[s]: David Marr

Artificial Intelligence -- A Personal View

March 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-355.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-355.pdf

The goal of A.I. is to identify and solve useful information processing problems. In so doing, two types of theory arise. Here, they are labelled Types 1 and 2, and their characteristics are outlined. This discussion creates a more than usually rigorous perspective of the subject, from which past work and future prospects are briefly reviewed.

AITR-354

Author[s]: Charles Rich and Howard E. Shrobe

Initial Report on a LISP Programmer's Apprentice

December 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-354.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-354.pdf

This is an initial report on the design and partial implementation of a LISP programmer's apprentice, an interactive programming system to be used by an expert programmer in the design, coding, and maintenance of large, complex programs.

AIM-353

Author[s]: Guy Lewis Steele, Jr. and Gerald Jay Sussman

Lambda: The Ultimate Imperative

March 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-353.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-353.pdf

We demonstrate how to model the following common programming constructs in terms of an applicative order language similar to LISP: Simple Recursion, Iteration, Compound Statements and Expressions, GO TO and Assignment, Continuation-Passing, Escape Expressions, Fluid Variables, Call by Name, Call by Need, and Call by Reference. The models require only (possibly self-referent) lambda application, conditionals, and (rarely) assignment. No complex data structures such as stacks are used. The models are transparent, involving only local syntactic transformations. This paper is partly tutorial in intent, gathering all the models together for purposes of context.
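
To give a flavor of the modeling style this abstract describes, here is a small Python sketch of three of the listed constructs (iteration, compound statements, and continuation-passing) written using only function application and conditionals. These are illustrative analogues, not the memo's own models, which are given in a LISP-like language; all function names below are assumptions. Python does not eliminate tail calls, so the sketch is suitable only for small inputs.

```python
# Iteration modeled as a self-referent function call: the loop state is
# carried in the arguments rather than updated by assignment.
def factorial_iter(n):
    loop = lambda i, acc: acc if i == 0 else loop(i - 1, acc * i)
    return loop(n, 1)

# Compound statement "a; b" modeled as applying a lambda that ignores the
# value of a. Applicative order evaluates the argument a_thunk() first.
def sequence(a_thunk, b_thunk):
    return (lambda _ignored: b_thunk())(a_thunk())

# Continuation-passing: each call receives an extra argument k, the
# "rest of the computation", and passes its result to k.
def factorial_cps(n, k):
    if n == 0:
        return k(1)
    return factorial_cps(n - 1, lambda value: k(n * value))

log = []
last_value = sequence(lambda: log.append("first"), lambda: log.append("second") or "done")
```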

AITR-352

Author[s]: Johan De Kleer

Qualitative and Quantitative Knowledge in Classical Mechanics

December 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-352.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-352.pdf

This thesis investigates what knowledge is necessary to solve mechanics problems. A program NEWTON is described which understands and solves problems in a mechanics mini-world of objects moving on surfaces. Facts and equations such as those given in mechanics text need to be represented. However, this is far from sufficient to solve problems. Human problem solvers rely on "common sense" and "qualitative" knowledge which the physics text tacitly assumes to be present. A mechanics problem solver must embody such knowledge. Quantitative knowledge given by equations and more qualitative common sense knowledge are the major research points exposited in this thesis. The major issue in solving problems is planning. Planning involves tentatively outlining a possible path to the solution without actually solving the problem. Such a plan needs to be constructed and debugged in the process of solving the problem. Envisionment, or qualitative simulation of the event, plays a central role in this planning process.

AIM-351

Author[s]: Marc Raibert

A State Space Model for Sensorimotor Control and Learning

January 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-351.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-351.pdf

This is the first of a two-part presentation which deals with certain computer controlled manipulator problems. This first part discusses a model which is designed to address problems of motor control, motor learning, adaptation, and sensorimotor integration. In this section the problems are outlined and a solution is given which makes use of a state space memory and a piecewise linearization of the equations of motion. A forthcoming companion article will present the results of tests performed on an implementation of the model.

AIM-349

Author[s]: Gerald J. Sussman and Guy L. Steele, Jr.

An Interpreter for Extended Lambda Calculus

December 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-349.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-349.pdf

Inspired by ACTORS [Greif and Hewitt] [Smith and Hewitt], we have implemented an interpreter for a LISP-like language, SCHEME, based on the lambda calculus [Church], but extended for side effects, multiprocessing, and process synchronization. The purpose of this implementation is tutorial. We wish to: (1) alleviate the confusion caused by Micro-PLANNER, CONNIVER, etc. by clarifying the embedding of non-recursive control structures in a recursive host language like LISP. (2) explain how to use these control structures, independent of such issues as pattern matching and data base manipulation. (3) have a simple concrete experimental domain for certain issues of programming semantics and style.

AIM-348

Author[s]: Andy diSessa

Turtle Escapes the Plane: Some Advanced Turtle Geometry

December 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-348.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-348.pdf

Since the LOGO Turtle took his first step he has been mathematically confined to running around on flat surfaces. Fortunately the physically intuitive, procedurally oriented nature of the Turtle which makes him a powerful explorer in the plane is equally, if not more, apparent when he is liberated to tread curved surfaces. This paper is aimed roughly at the High School level. Yet because it is built on intuition and physical action rather than formalism, it can reach such “graduate school” mathematical ideas as geodesics, Gaussian Curvature, and topological invariants as expressed in the Gauss-Bonnet Theorem.
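
A short worked example may help connect the turtle's "total turning" to the Gauss-Bonnet Theorem named in this abstract. The example is standard textbook material and is not taken from the memo; the choice of the unit sphere and of the octant triangle is an assumption made for illustration. A turtle walks along geodesics around a triangle covering one eighth of a unit sphere, making three left turns of a quarter turn each.

```latex
% Gauss-Bonnet for a geodesic polygon P (the turtle walks straight between turns):
%   total turning = 2\pi - \iint_P K \, dA
% Unit sphere: K = 1. Octant triangle: area = 4\pi / 8 = \pi/2.
\begin{align*}
\text{total turning} &= 3 \cdot \frac{\pi}{2} = \frac{3\pi}{2}, \\
2\pi - \iint_P K \, dA &= 2\pi - 1 \cdot \frac{\pi}{2} = \frac{3\pi}{2}.
\end{align*}
```

On a flat surface the same closed trip would require a total turning of exactly $2\pi$; the shortfall of $\pi/2$ measures the curvature enclosed by the path.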

AITR-347

Author[s]: Robert Carter Moore

Reasoning from Incomplete Knowledge in a Procedural Deduction System

December 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-347.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-347.pdf

One very useful idea in AI research has been the notion of an explicit model of a problem situation. Procedural deduction languages, such as PLANNER, have been valuable tools for building these models. But PLANNER and its relatives are very limited in their ability to describe situations which are only partially specified. This thesis explores methods of increasing the ability of procedural deduction systems to deal with incomplete knowledge. The thesis examines in detail problems involving negation, implication, disjunction, quantification, and equality. Control structure issues and the problem of modelling change under incomplete knowledge are also considered. Extensive comparisons are also made with systems for mechanical theorem proving.

AITR-346

Author[s]: John M. Hollerbach

Hierarchical Shape Description of Objects by Selection and Modification of Prototypes

November 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-346.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-346.pdf

An approach towards shape description, based on prototype modification and generalized cylinders, has been developed and applied to the object domains pottery and polyhedra: (1) A program describes and identifies pottery from vase outlines entered as lists of points. The descriptions have been modeled after descriptions by archeologists, with the result that identifications made by the program are remarkably consistent with those of the archeologists. It has been possible to quantify their shape descriptors, which are everyday terms in our language applied to many sorts of objects besides pottery, so that the resulting descriptions seem very natural. (2) New parsing strategies for polyhedra overcome some limitations of previous work. A special feature is that the processes of parsing and identification are carried out simultaneously.

AITR-345

Author[s]: Eugene C. Freuder

Computer System for Visual Recognition Using Active Knowledge

June 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-345.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-345.pdf

A system for visual recognition is described, with implications for the general problem of representation of knowledge to assist control. The immediate objective is a computer system that will recognize objects in a visual scene, specifically hammers. The computer receives an array of light intensities from a device like a television camera. It is to locate and identify the hammer if one is present. The computer must produce from the numerical “sensory data” a symbolic description that constitutes its perception of the scene. Of primary concern is the control of the recognition process. Control decisions should be guided by the partial results obtained on the scene. If a hammer handle is observed this should suggest that the handle is part of a hammer and advise where to look for the hammer head. The particular knowledge that a handle has been found combines with general knowledge about hammers to influence the recognition process. This use of knowledge to direct control is denoted here by the term “active knowledge”. A descriptive formalism is presented for visual knowledge which identifies the relationships relevant to the active use of the knowledge. A control structure is provided which can apply knowledge organized in this fashion actively to the processing of a given scene.

AIM-344

Author[s]: Murray Elias Danofsky

How Near is Near?

February 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-344.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-344.pdf

This paper presents a system for understanding the concept of near and far, weighing such factors as purpose of the judgement, dimensions of the objects, absolute size of the distance, and size of the distance relative to other objects, ranges, and standards. A further section discusses the meaning of phrases such as very near, much nearer than, and as near as. Although we will speak of near as a judgement about physical distance, most of the ideas developed will be applicable to any continuous measurable parameter, such as size or time. An adaptation for rows (discrete spaces) is made as well.

AIM-343

Author[s]: Cynthia J. Solomon

Leading a Child to a Computer Culture

December 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-343.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-343.pdf

“LOGO” is sometimes used as the name of a programming language. It is also used as the name of…what shall I call it?… an environment, a culture, a way of thinking about computers and about learning and about putting the two together. I shall try to convey to you how I bring a child into this environment.

AIM-342

Author[s]: Jeanne Bamberger

The Development of Musical Intelligence I: Strategies for Representing Simple Rhythms

November 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-342.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-342.pdf

This paper is the first in a series of monographs which will describe various aspects of the development of musical intelligence.

AIM-341

Author[s]: D. Marr and H.K. Nishihara

Spatial Disposition of Axes in a Generalized Cylinder Representation of Objects

December 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-341.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-341.pdf

It is proposed that the 3-D representation of an object is based primarily on a stick-figure configuration, where each stick represents one or more axes in the object’s generalized cylinder representation. The loosely hierarchical description of a stick figure is interpreted by a special-purpose processor, able to maintain two vectors and the gravitational vertical relative to a Cartesian space-frame. It delivers information about the appearance of these vectors, which helps the system to rotate its model into the correct 3-D orientation relative to the viewer during recognition.

AIM-340

Author[s]: D. Marr

Early Processing of Visual Information

December 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-340.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-340.pdf

The article describes a symbolic approach to visual information processing, and sets out four principles that appear to govern the design of complex symbolic information processing systems. A computational theory of early visual information processing is presented, which extends to about the level of figure-ground separation. It includes a process-oriented theory of texture vision. Most of the theory has been implemented, and examples are shown of the analysis of several natural images. This replaces Memos 324 and 334.

AIM-339

Author[s]: Drew V. McDermott

Very Large Planner-Type Data Bases

September 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-339.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-339.pdf

This paper describes the implementation of a typical data-base manager for an A.I. language like Planner, Conniver, or QA4, and some proposed extensions for applications involving greater quantities of data than usual. The extensions are concerned with data bases involving several active and potentially active sub-data-bases, or “contexts”. The major mechanisms discussed are the use of contexts as packets of data with free variables; and indexing data according to the contexts they appear in. The paper also defends the Planner approach to data representations against some more recent proposals.

AIM-338

Author[s]: Harvey A. Cohen

The Art of Snaring Dragons

November 1974 (Revised May 1975)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-338.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-338.pdf

DRAGONs are formidable problems in elementary mechanics not amenable to solution by naïve formula cranking. What is the intellectual weaponry one needs to snare a Dragon? To snare a Dragon one brings to mind an heuristic frame – a specifically structured association of problem solving ideas. Data on the anatomy of heuristic frames – just how and what ideas are linked together – has been obtained from the protocols of many attacks on Dragons by students and physicists. In this paper various heuristic frames are delineated by detailing how they motivate attacks on two particular Dragons, Milko and Jugglo, from the writer’s compilation. This model of the evolution of problem solving skills has also been applied to the interpretation of the intellectual growth of children, and in an Appendix we use it to give a cogent interpretation for the protocols of Piagetian “Conservation” experiments. The model provides a sorely needed theoretical framework to discuss teaching stratagems calculated to promote problem solving skills.

AIM-337

Author[s]: Ira Goldstein and Seymour Papert

Artificial Intelligence, Language and the Study of Knowledge

July 1975 (Revised March 1976)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-337.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-337.pdf

This paper studies the relationship of Artificial Intelligence to the study of language and the representation of the underlying knowledge which supports the comprehension process. It develops the view that intelligence is based on the ability to use large amounts of diverse kinds of knowledge in procedural ways, rather than on the possession of a few general and uniform principles. The paper also provides a unifying thread to a variety of recent approaches to natural language comprehension. We conclude with a brief discussion of how Artificial Intelligence may have a radical impact on education if the principles which it utilizes to explore the representation and use of knowledge are made available to the student to use in his own learning experiences. This paper is a revised version of an earlier document written with Marvin Minsky. Many of the ideas in this paper owe much to Minsky’s thoughtful critique; the authors, however, take responsibility for the organization and wording of this document.

AIM-336

Author[s]: Howard Austin

Teaching Teachers LOGO: The Lesley Experiments

April 1976

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-336.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-336.pdf

This research is concerned with the question of whether or not teachers who lack specialized backgrounds can adapt to and become proficient in the technically complex, philosophically sophisticated LOGO learning environment. Excellent results were obtained and are illustrated through a series of examples of student work. The report then gives some brief observations about the thought styles observed and concludes with suggestions for further work.

AIM-335

Author[s]: Berthold K.P. Horn

Image Intensity Understanding

August 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-335.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-335.pdf

Image intensities have been processed traditionally without much regard to how they arise. Typically they are used only to segment an image into regions or to find edge fragments. Image intensities do carry a great deal of useful information about three-dimensional aspects of objects and some initial attempts are made here to exploit this. An understanding of how images are formed and what determines the amount of light reflected from a point on an object to the viewer is vital to such a development. The gradient-space, popularized by Huffman and Mackworth, is a helpful tool in this regard.

AIM-334

Author[s]: D. Marr

Analyzing Natural Images: A Computational Theory of Texture Vision

June 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-334.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-334.pdf

A theory of early and intermediate visual information processing is given, which extends to about the level of figure-ground separation. Its core is a computational theory of texture vision. Evidence obtained from perceptual and from computational experiments is adduced in its support. A consequence of the theory is that high-level knowledge about the world influences visual processing later and in a different way from that currently practiced in machine vision.

AIM-333

Author[s]: Shimon Ullman

On Visual Detection of Light Sources

May 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-333.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-333.pdf

The paper addresses the following problem: Given an array of light intensities obtained from some scene, find the light sources in the original scene. The following factors are discussed from the point of view of their relevance to light source detection: The highest intensity in the scene, absolute intensity value, local and global contrast, comparison with the average intensity, and lightness computation. They are shown to be insufficient for explaining humans’ ability to identify light sources in their visual field. Finally, a method for accomplishing the source detection task in the Mondrian world is presented.

AIM-332

Author[s]: Erik Sandewall

Ideas About Management of LISP Data Bases

May 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-332.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-332.pdf

The paper advocates the need for systems which support maintenance of LISP-type data bases, and describes an experimental system of this kind, called DABA. In this system, a description of the data base’s structure is kept in the data base itself. A number of utility programs use the description for operations on the data base. The description must minimally include syntactic information reminiscent of data structure declarations in more conventional programming languages, and can be extended by the user. Two reasons for such systems are seen: (1) As A.I. programs develop from toy domains using toy data bases, to more realistic exercises, the management of the knowledge base becomes non-trivial and requires program support. (2) A powerful way to organize LISP programs is to make them data-driven, whereby pieces of program are distributed throughout a data base. A data base management system facilitates the use of this programming style. The paper describes and discusses the basic ideas in the DABA system as well as the technique of data driven programs.

AIM-331

Author[s]: Scott E. Fahlman

Thesis Progress Report: A System for Representing and Using Real-World Knowledge

May 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-331.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-331.pdf

This paper describes progress to date in the development of a system for representing various forms of real-world knowledge. The knowledge is stored in the form of a net of simple parallel processing elements, which allow certain types of deduction and set-intersection to be performed very quickly and easily. It is claimed that this approach offers definite advantages for recognition and many other data-accessing tasks. Suggestions are included for the application of this system as a tool in vision, natural-language processing, speech recognition, and other problem domains.

AIM-330

Author[s]: Howard Austin

A Computational View of the Skill of Juggling

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-330.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-330.pdf

This research has as its basic premise the belief that physical and mental skills are highly similar, enough so in fact that computation paradigms such as the ones used in Artificial Intelligence research about predominantly mental skills can be usefully extended to include physical skills. This thesis is pursued experimentally by categorization of “juggling bugs” via detailed video observations. A descriptive language for juggling movements is developed and a taxonomy of bugs is presented. The remainder of the paper is concerned with an empirical determination of the characteristics of an ultimate theory of juggling movements. The data presented is relevant to the computational issues of control structure, naming, addressing and subprocedurization.

AIM-329

Author[s]: Tomas Lozano-Perez

Parsing Intensity Profiles

May 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-329.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-329.pdf

Much low-level vision work in AI deals with one-dimensional intensity profiles. This paper describes PROPAR, a system that allows a convenient and uniform mechanism for recognizing such profiles. PROPAR is a modified Augmented Transition Network parser. The grammar used by the parser serves to describe and label the set of acceptable profiles. The inputs to the parser are descriptions of segments of a piecewise linear approximation to an intensity profile. A sample grammar is presented and the results discussed.

AIM-328

Author[s]: Gerald Jay Sussman and Richard Matthew Stallman

Heuristic Techniques in Computer Aided Circuit Analysis

March 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-328.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-328.pdf

We present EL, a new kind of circuit analysis program. Whereas other circuit analysis systems rely on classical, formal analysis techniques, EL employs heuristic “inspection” methods to solve rather complex DC bias circuits. These techniques also give EL the ability to explain any result in terms of its own qualitative reasoning processes. EL’s reasoning is based on the concept of a “local one-step deduction” augmented by various “teleological” principles and by the concept of a “macro-element”. We present several annotated examples of EL in operation and an explanation of how it works. We also show how EL can be extended in several directions, including sinusoidal steady state analysis. Finally, we touch on possible implications for engineering education. We feel that EL is significant not only as a novel approach to circuit analysis but also as an application of Artificial Intelligence techniques to a new and interesting domain.

AIM-327

Author[s]: David Marr

A Note on the Computation of Binocular Disparity in a Symbolic, Low-Level Visual Processor

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-327.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-327.pdf

The goals of the computation that extracts disparity from pairs of pictures of a scene are defined, and the constraints imposed upon that computation by the three-dimensional structure of the world are determined. Expressing the computation as a grey-level correlation is shown to be inadequate. A precise expression of the goals of the computation is possible in a low-level symbolic visual processor: the constraints translate in this environment to pre-requisites on the binding of disparity values to low-level symbols. The outline of a method based on this is given.

AIM-326

Author[s]: David Marr

The Recognition of Sharp, Closely Spaced Edges

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-326.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-326.pdf

The recognition of sharp edges from edge- and bar-mask convolutions with an image is studied for the special case where the separation of the edges is of the order of the masks' panel-widths. Desmearing techniques are employed to separate the items in the image. Attention is also given to parsing de-smeared mask convolutions into edges and bars; to detecting edge and bar terminations; and to the detection of small blobs.

AIM-325

Author[s]: David Marr

The Low-level Symbolic Representation of Intensity Changes in an Image

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-325.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-325.pdf

A family of symbols is defined by which much of the useful information in an image may be represented, and its choice is justified. The family includes symbols for the various commonly occurring intensity profiles that are associated with the edges of objects, and symbols for the gradual luminance changes that provide clues about a surface's shape. It is shown that these descriptors may readily be computed from measurements similar to those made by simple cells in the visual cortex of the cat. The methods that are described have been implemented, and examples are shown of their application to natural images.

AIM-324

Author[s]: David Marr

On the Purpose of Low-level Vision

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-324.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-324.pdf

This article advances the thesis that the purpose of low-level vision is to encode symbolically all of the useful information contained in an intensity array, using a vocabulary of very low-level symbols: subsequent processes should have access only to this symbolic description. The reason is one of computational expediency: it allows the low-level processes to run almost autonomously, and it greatly simplifies the application of criteria to an image, whose representation in terms of conditions on the initial intensities, or on simple measurements made from them, is very cumbersome. The implications of this thesis for physiological and for computational approaches to vision are discussed. A list is given of several computational problems in low-level vision: some of these are dealt with in the accompanying articles.

AIM-323

Author[s]: Berthold K. P. Horn

Orienting Silicon Integrated Circuit Chips for Lead Bonding

January 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-323.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-323.pdf

Will computers that see and understand what they see revolutionize industry by automating the part orientation and part inspection processes? There are two obstacles: the expense of computing and our feeble understanding of images. We believe these obstacles are fast ending. To illustrate what can be done we describe a working program that visually determines the position and orientation of silicon chips used in integrated circuits.

AIM-322

Author[s]: Benjamin J. Kuipers

A Frame for Frames: Representing Knowledge for Recognition

March 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-322.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-322.pdf

This paper presents a version of frames suitable for representing knowledge for a class of recognition problems. An initial section gives an intuitive model of frames, and illustrates a number of desirable features of such a representation. A more technical example describes a small recognition program for the Blocks World which implements some of these features. The final section discusses the more general significance of the representation and the recognition process used in the example.

AIM-321

Author[s]: Shimon Ullman

Model-Driven Geometry Theorem Prover

May 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-321.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-321.pdf

This paper describes a new Geometry Theorem Prover, which was implemented to illuminate some issues related to the use of models in theorem proving. The paper is divided into three parts: Part 1 describes G.T.P. and presents the ideas embedded in it. It concentrates on the forward search method, and gives two examples of proofs produced that way. Part 2 describes the backward search mechanism and presents proofs of a sequence of successively harder problems. The last section of the work addresses the notion of similarity in a problem, defines a notion of semantic symmetry, and compares it to Gelernter’s concept of syntactic symmetry.

AIM-320

Author[s]: Harold Abelson, Andrea diSessa and Lee Rudolph

Velocity Space and the Geometry of Planetary Orbits

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-320.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-320.pdf

We develop a theory of orbits for the inverse-square central force law which differs considerably from the usual deductive approach. In particular, we make no explicit use of calculus. By beginning with qualitative aspects of solutions, we are led to a number of geometrically realizable physical invariants of the orbits. Consequently most of our theorems rely only on simple geometrical relationships. Despite its simplicity, our planetary geometry is powerful enough to treat a wide range of perturbations with relative ease. Furthermore, without introducing any more machinery, we obtain full quantitative results. The paper concludes with suggestions for further research into the geometry of planetary orbits.

AIM-319

Author[s]: Gerald Jay Sussman and Allen L. Brown

Localization of Failures in Radio Circuits: A Study in Causal and Teleological Reasoning

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-319.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-319.pdf

This paper examines some methodologies for diagnosing correctly designed radio circuits which are failing to perform in the intended way because of some faulty component. Particular emphasis is placed on the utility and necessity of good teleological descriptions in successfully executing the task of isolating failing components.

AITR-316

Author[s]: Ann D. Rubin

Hypothesis Formation and Evaluation in Medical Diagnosis

January 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-316.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-316.pdf

This thesis describes some aspects of a computer system for doing medical diagnosis in the specialized field of kidney disease. Because such a system faces the spectre of combinatorial explosion, this discussion concentrates on heuristics which control the number of concurrent hypotheses and efficient "compiled" representations of medical knowledge. In particular, the differential diagnosis of hematuria (blood in the urine) is discussed in detail. A protocol of a simulated doctor/patient interaction is presented and analyzed to determine the crucial structures and processes involved in the diagnosis procedure. The data structure proposed for representing medical information revolves around elementary hypotheses which are activated when certain disposing of findings, activating hypotheses, evaluating hypotheses locally and combining hypotheses globally is examined for its heuristic implications. The thesis attempts to fit the problem of medical diagnosis into the framework of other Artificial Intelligence problems and paradigms and in particular explores the notions of pure search vs. heuristic methods, linearity and interaction, local vs. global knowledge and the structure of hypotheses within the world of kidney disease.

AIM-315A

Author[s]: E. Paul Goldenberg

A Glossary of PDP11 LOGO Primitives

March 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-315a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-315a.pdf

This glossary was written for the purpose of providing a quick and concise yet accurate description of the primitives and special words and characters of the March 18, 1975 PDP 11 implementation of the LOGO language. Many entries include references to other related words and/or examples of the use of the primitive being described, but this is not intended to replace the functions of a good manual. For a more detailed and comprehensive description of the language, see the LOGO MANUAL, LOGO MEMO 7. The description of each LOGO word includes the word itself, any arguments that the word may require, the “type” of word it is, abbreviated and alternate forms of the word, if any, and a definition correct as of the date of this glossary. Word type is described on the first page and an example of the format of the entries is given below. In the appendix to this glossary are sections about 1) LOGO words that take a variable number of inputs, 2) infix operators, 3) editing characters, 4) special characters, 5) special names, 6) decimal ascii code and corresponding characters.

AIM-315

Author[s]: Hal Abelson and Jim Adams

A Glossary of LOGO Primitives

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-315.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-315.pdf

This is a brief description of the primitives in PDP 11 LOGO. It is intended to provide a quick reference for users who are already familiar with LOGO basics. For a more detailed and comprehensive description of LOGO, consult the LOGO Manual (A.I. Memo 313, LOGO Memo 7).

AIM-314

Author[s]: Jeanne Bamberger

What's in a Tune

November 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-314.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-314.pdf

The work reported here began with two fundamental assumptions: 1) The perception of music is an active process; it involves the individual in selecting, sorting, and grouping the features of the phenomena before her. 2) Individual differences in response to a potentially sensible melody rest heavily on just which features the individual has access to or is able to focus on.

AIM-313

Author[s]: Hal Abelson, Nat Goodman and Lee Rudolph

LOGO Manual

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-313.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-313.pdf

This document describes the LOGO system implemented for the PDP 11/45 at the M.I.T. Artificial Intelligence Laboratory. The “system” includes not only the LOGO evaluator, but also a dedicated time-sharing system which services about a dozen users. There are also various special devices such as robot turtles, tone generators, and CRT displays.

AIM-312

Author[s]: Jeanne Bamberger

The Luxury of Necessity

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-312.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-312.pdf

This paper was originally written as an address to a conference of the National Association of Schools of Music on “The Music Consumer”. Posing a series of questions which point to fundamental issues underlying the LOGO music project, the paper goes on to describe some of the specific projects with which students have been working in an effort to probe these issues. Emphasis is placed on “modes of representation” as a significant realm of enquiry: just how does an individual represent a tune to himself, what are the differences between formal and informal modes of representation – what features and relations of a melody does a representation capture, what does it leave out? What is the influence of such modes of “perception”, how do they affect strategies of problem solving, notions of “same” and “different” or even influence musical “taste”? Finally, there are some hints at what might constitute “sufficiently powerful representations” of musical design with examples from both simple and complex pieces of music as well as a probe into what might distinguish “simple” from “complex” musical designs.

AIM-311

Author[s]: Radia Perlman

TORTIS: Toddler's Own Recursive Turtle Interpreter System

December 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-311.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-311.pdf

TORTIS is a device for preschool children to communicate with and program the turtle. It consists of several boxes (currently 3 button boxes and two blox boxes) designed so that only a few new concepts are introduced at a time but more can be added when the child becomes familiar with what he has. Hopefully transitions are gradual enough so that the child never thinks talking to the turtle is too hard or that he is “too dumb”. And hopefully playing with the system should teach such concepts as numbers, breaking large problems into small solvable steps, writing and debugging procedures, recursion, variables, and conditionals. Most important of all, it should teach that learning is fun.

AITR-310

Author[s]: Patrick H. Winston

New Progress in Artificial Intelligence

September 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-310.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-310.pdf

This report concentrates on progress during the last two years at the M.I.T. Artificial Intelligence Laboratory. Topics covered include the representation of knowledge, understanding English, learning and debugging, understanding vision and productivity technology. It is stressed that these various areas are tied closely together through certain fundamental issues and problems.

AIM-309

Author[s]: James R. Geiser

Commenting Proofs

May 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-309.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-309.pdf

This paper constitutes a summary of a seminar entitled “Commenting Proofs” given at the Artificial Intelligence Laboratory during the spring of 1974. The work is concerned with new syntactic structures in formal proofs which derive from their pragmatic and semantic aspects. It is a synthesis of elements from Yessenin-Volpin’s foundational studies and developments in Artificial Intelligence concerned with commenting programs and the use of this idea in automatic debugging procedures.

AIM-308

Author[s]: Hirochika Inoue

Force Feedback in Precise Assembly Tasks

August 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-308.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-308.pdf

This paper describes the execution of precise assembly tasks by a robot. The level of performance of the experimental system allows such basic actions as putting a peg into a hole, screwing a nut on a bolt, and picking up a thin piece from a flat table. The tolerance achieved in experiments was 0.001 inch. The experiments proved that force feedback enabled a reliable assembly of a bearing complex consisting of eight parts with close tolerances. A movie of the demonstration is available.

AIM-307A

Author[s]: Ira Goldstein, Henry Lieberman, Harry Bochner and Mark Miller

LLOGO: An Implementation of LOGO in LISP

March 1975

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-307a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-307a.pdf

This paper describes LLOGO, an implementation of the LOGO language written in MACLISP for the ITS, TEN50 and TENEX PDP-10 systems, and MULTICS. The relative merits of LOGO and LISP as educational languages are discussed. Design decisions in the LISP implementation of LOGO are contrasted with those of two other implementations: CLOGO for the PDP-10 and 11LOGO for the PDP-11, both written in assembler language. LLOGO’s special facilities for character-oriented display terminals, graphic display ‘turtles’, and music generation are also described.

AIM-307

Author[s]: Ira Goldstein, Henry Lieberman, Harry Bochner and Mark Miller

LLOGO: An Implementation of LOGO in LISP

June 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-307.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-307.pdf

AIM-306

Author[s]: Marvin Minsky

A Framework for Representing Knowledge

June 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-306.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-306.pdf

This is a partial theory of thinking, combining a number of classical and modern concepts from psychology, linguistics, and AI. Whenever one encounters a new situation (or makes a substantial change in one's viewpoint) he selects from memory a structure called a frame, a remembered framework to be adapted to fit reality by changing details as necessary. A frame is a data-structure for representing a stereotyped situation, like being in a certain kind of living room, or going to a child's birthday party. Attached to each frame are several kinds of information. Some of this information is about how to use the frame. Some is about what one can expect to happen next. Some is about what to do if these expectations are not confirmed. The "top levels" of a frame are fixed, and represent things that are always true about the supposed situation. The lower levels have many "slots" that must be filled by specific instances or data. Collections of related frames are linked together into frame-systems. The effects of important actions are mirrored by transformations between the frames of a system. These are used to make certain kinds of calculations economical, to represent changes of emphasis and attention and to account for effectiveness of "imagery". In Vision, the different frames of a system describe the scene from different viewpoints, and the transformations between one frame and another represent the effects of moving from place to place. Other kinds of frame-systems can represent actions, cause-effect relations, or changes in conceptual viewpoint. The paper applies the frame-system idea also to problems of linguistic understanding: memory, acquisition and retrieval of knowledge, and a variety of ways to reason by analogy and jump to conclusions based on partial similarity matching.
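
The frame described in this abstract can be pictured as a small data structure. The following is only a minimal sketch, not the paper's own notation: the class name `Frame`, its fields, and the birthday-party values are all hypothetical choices made for illustration. It shows fixed "top levels" that cannot be overwritten and lower-level slots that are filled with specific data.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """Hypothetical frame: fixed top-level facts plus fillable slots."""
    name: str
    fixed: dict = field(default_factory=dict)  # always true of the situation
    slots: dict = field(default_factory=dict)  # filled by instances or data

    def fill(self, slot, value):
        # Top-level facts are fixed, so refuse to overwrite them.
        if slot in self.fixed:
            raise ValueError(f"{slot!r} is fixed in frame {self.name!r}")
        self.slots[slot] = value

    def get(self, slot):
        # Fixed facts take precedence over slot fillers.
        if slot in self.fixed:
            return self.fixed[slot]
        return self.slots[slot]


# Illustrative values only; the abstract names the situation, not its contents.
party = Frame("birthday-party", fixed={"occasion": "birthday"})
party.fill("food", "ice cream")
```

Keeping the fixed facts and the slot fillers in separate dictionaries makes the distinction between the two levels explicit; transformations between frames in a frame-system are left out of this sketch.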

AIM-305

Author[s]: Ira P. Goldstein

Summary of MYCROFT: A System for Understanding Simple Picture Programs

May 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-305.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-305.pdf

A collection of powerful ideas that are fundamental to debugging skill is explored: description, plans, linearity, insertions, global knowledge and imperative semantics. To make these concepts precise, a computer monitor called MYCROFT is described that can debug elementary programs for drawing pictures. The programs are those written for LOGO turtles.

AIM-304

Author[s]: R.W. Gosper

Acceleration of Series

March 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-304.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-304.pdf

The rate of convergence of infinite series can be accelerated by a suitable splitting of each term into two parts and then combining the second part of the n-th term with the first part of the (n+1)-th term to get a new series and leaving the first part of the first term as an "orphan". If this process is repeated an infinite number of times, the series will often approach zero, and we obtain the series of orphans, which may converge faster than the original series. Heuristics for determining the splits are given. Various mathematical constants, originally defined as series having a term ratio which approaches 1, are accelerated into series having a term ratio less than 1. This is done with the constants of Euler and Catalan. The series for pi/4 = arctan 1 is transformed into a variety of series, among which is one having a term ratio of 1/27 and another having a term ratio of 54/3125. A series for 1/pi is found which has a term ratio of 1/64 and each term of which is an integer divided by a power of 2, thus making it easy to evaluate the sum in binary arithmetic. We express zeta(3) in terms of pi^3 and a series having a term ratio of 1/16. Various hypergeometric function identities are found, as well as a series for (arcsin y)^2 curiously related to a series for y arcsin y. Convergence can also be accelerated for finite sums, as is shown for the harmonic numbers. The sum of the reciprocals of the Fibonacci numbers has been expressed as a series having the convergence rate of a theta function. Finally, it is shown that a series whose n-th term ratio is (n+p)(n+q)/(n+r)(n+s), where p, q, r, s are integers, is equal to c + d pi^2, where c and d are rational.
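
To illustrate why a term ratio below 1 matters, the sketch below compares two series for pi/4 = arctan 1. It does not implement the splitting method of this memo: the faster series is the classical Euler-transformed one, pi/2 = sum of n!/(2n+1)!!, whose term ratio (n+1)/(2n+3) approaches 1/2. The function names are hypothetical.

```python
import math


def leibniz_partial(n_terms):
    """Partial sum of 1 - 1/3 + 1/5 - ..., whose term ratio approaches 1."""
    return sum((-1) ** k / (2 * k + 1) for k in range(n_terms))


def euler_partial(n_terms):
    """Partial sum of the Euler-transformed series, scaled to approximate pi/4."""
    total, term = 0.0, 1.0
    for n in range(n_terms):
        total += term
        term *= (n + 1) / (2 * n + 3)  # ratio of successive terms, tends to 1/2
    return total / 2  # the series sums to pi/2


slow_error = abs(leibniz_partial(30) - math.pi / 4)
fast_error = abs(euler_partial(30) - math.pi / 4)
```

With the same 30 terms, the error of the first series shrinks only like 1/n, while the error of the second shrinks geometrically.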

AIM-303

Author[s]: Arthur J. Nevins

Plane Geometry Theorem Proving Using Forward Chaining

January 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-303.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-303.pdf

A computer program is described which operates on a subset of plane geometry. Its performance not only compares favorably with previous computer programs, but within its limited problem domain (e.g. no curved lines nor introduction of new points), it also invites comparison with the best human theorem provers. The program employs a combination of forward and backward chaining with the forward component playing the more important role. This, together with a deeper use of diagrammatic information, allows the program to dispense with the diagram filter in contrast with its central role in previous programs. An important aspect of human problem solving may be the ability to structure a problem space so that forward chaining techniques can be used effectively.

AIM-302

Author[s]: Arthur J. Nevins

A Relaxation Approach to Splitting in an Automatic Theorem Prover

January 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-302.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-302.pdf

The splitting of a problem into subproblems often involves the same variable appearing in more than one of the subproblems. This makes these subproblems dependent upon one another since a solution to one may not qualify as a solution to another. A two stage method of splitting is described which first obtains solutions by relaxing the dependency requirement and then attempts to reconcile solutions to different subproblems. The method has been realized as part of an automatic theorem prover programmed in LISP which takes advantage of the procedural power that LISP provides. The program has had success with cryptarithmetic problems, problems from the blocks world, and has been used as a subroutine in a plane geometry theorem prover.

AIM-301

Author[s]: Richard C. Waters

A Mechanical Arm Control System

January 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-301.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-301.pdf

This paper describes a proposed mechanical arm control system and some of the lines of thought which led to this design. In particular, the paper discusses the basic system required in order for the arm to control its environment, and deal with error situations which arise. In addition the paper discusses the system needed to control the motion of the arm using the computed torque drive method, and force feedback.

AIM-300

Author[s]: Carl R. Flatau

Design Outline for Mini-Arms Based on Manipulator Technology

May 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-300.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-300.pdf

The design of small manipulators is an art requiring proficiency in diverse disciplines. This paper documents some of the general ideas illustrated by a particular design for an arm roughly one quarter human size. The material is divided into the following sections: A. General design constraints. B. Features of existing manipulator technology. C. Scaling relationships for major arm components. D. Design of a particular small manipulator. E. Comments on future possibilities.

AIM-299

Author[s]: P. Winston, B.K.P. Horn, G.J. Sussman, et al.

Proposal to ARPA for Research on Intelligent Automata and Micro-Automation

September 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-299.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-299.pdf

The results of a decade of work in Artificial Intelligence have brought us to the threshold of a new phase of knowledge-based programming -- in which we can design computer systems that (1) react reasonably to significantly complicated situations and (2) perhaps more important for the future -- interact intelligently with their operators when they encounter limitations, bugs or insufficient information. This proposal lays out programmes for bringing several such systems near to the point of useful application. These include: A physical "micro-automation" system for maintenance and repair of electronic circuits. A related "expert" problem-solving program for diagnosis and modification of electronic circuits. A set of advanced "Automatic Programming" techniques and systems for aid in developing and debugging large computer programs. Some Advanced Natural Language application methods and systems for use with these and other interactive projects. A series of specific "expert" problem solvers, including Chess analysis. Steps toward a new generation of more intelligent Information Retrieval and Management Assistance systems.

AIM-298

Author[s]: Seymour Papert

Uses of Technology to Enhance Education

June 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-298.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-298.pdf

Section 1: Schematic outline of project and what we want. Hardly any intellectual content. Section 2: Statement of our goals in general terms. This statement is intended to have serious intellectual content but lacks meaty examples. Readers who find it too abstract for comfort might like to read at least part of #3 first. Section 3: A series of extended examples intended to give more concrete substance to the generalities in #2. Section 4: This is the real "proposal". It sets out specifically a list of concrete "goals" on which we want to work in the immediate future. Appendix: Papers by Jeanne Bamberger, Marvin Minsky, Seymour Papert and Cynthia Solomon.

AITR-297

Author[s]: Gerald J. Sussman

A Computational Model of Skill Acquisition

August 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-297.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-297.pdf

This thesis confronts the nature of the process of learning an intellectual skill, the ability to solve problems efficiently in a particular domain of discourse. The investigation is synthetic; a computational performance model, HACKER, is displayed. HACKER is a computer problem-solving system whose performance improves with practice. HACKER maintains performance knowledge as a library of procedures indexed by descriptions of the problem types for which the procedures are appropriate. When applied to a problem, HACKER tries to use a procedure from this “Answer Library”. If no procedure is found to be applicable, HACKER writes one using more general knowledge of the problem domain and of programming techniques. This new program may be generalized and added to the Answer Library.

AIM-296

Author[s]: David Marr

An Essay on the Primate Retina

January 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-296.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-296.pdf

This essay is considerably longer than the published version of the same theory and is designed for readers who have only elementary knowledge of the retina. It is organized into four parts. The first is a review that consists of four sections: retinal anatomy, physiology, psychophysics, and the retinex theory. The main exposition starts with Part II, which deals with the operation of the retina in conditions of moderate ambient illumination. The account is limited to an analysis of a single cone channel -- like the red or the green one -- the red channel being referred to frequently during the account. Part III considers various interesting properties of retinal signals, including those from the fully dark-adapted retina; and finally the thorny problem of bleaching adaptation is dealt with in Part IV. The general flow of the account will be from the receptors to the ganglion cells, and an analysis of each of the retinal cells and synapses is given in the appropriate place.

AIM-295

Author[s]: Berthold K.P. Horn

On Lightness

October 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-295.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-295.pdf

The intensity at a point in an image is the product of the reflectance at the corresponding object point and the intensity of illumination at that point. We are able to perceive lightness, a quantity closely correlated with reflectance. How then do we eliminate the component due to illumination from the image on our retina? The two components of image intensity differ in their spatial distribution. A method is presented here which takes advantage of this to compute lightness from image intensity in a layered, parallel fashion.
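
The separation described in this abstract can be illustrated with a minimal one-dimensional sketch. It is not the memo's layered, parallel method, which works on images; it only assumes that illumination varies smoothly while reflectance changes in sharp steps. The function name `lightness_1d` and the threshold value are hypothetical choices made for illustration.

```python
import math

def lightness_1d(intensity, threshold=0.1):
    """Recover relative reflectance from a 1D intensity profile (illustrative only)."""
    log_i = [math.log(v) for v in intensity]
    # Differences of log intensity: large at reflectance edges,
    # small where only the smooth illumination changes.
    diffs = [b - a for a, b in zip(log_i, log_i[1:])]
    # Discard small differences, attributed here to the illumination gradient.
    kept = [d if abs(d) > threshold else 0.0 for d in diffs]
    # Re-integrate; the result is defined only up to a constant factor.
    log_l = [0.0]
    for d in kept:
        log_l.append(log_l[-1] + d)
    return [math.exp(v) for v in log_l]

# Two patches (reflectance 0.2 and 0.8) under slowly rising illumination.
reflectance = [0.2] * 5 + [0.8] * 5
illumination = [1.0 + 0.02 * i for i in range(10)]
intensity = [r * e for r, e in zip(reflectance, illumination)]
lightness = lightness_1d(intensity)
```

In this toy profile the recovered values are constant within each patch, and the ratio between the patches is close to the true reflectance ratio of 4, because the gradual illumination change falls below the threshold.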

AITR-294

Author[s]: Ira P. Goldstein

Understanding Simple Picture Programs

April 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-294.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-294.pdf

What are the characteristics of the process by which an intent is transformed into a plan and then a program? How is a program debugged? This paper analyzes these questions in the context of understanding simple turtle programs. To understand and debug a program, a description of its intent is required. For turtle programs, this is a model of the desired geometric picture. A picture language is provided for this purpose. Annotation is necessary for documenting the performance of a program in such a way that the system can examine the procedure's behavior as well as consider hypothetical lines of development due to tentative debugging edits. A descriptive framework representing both causality and teleology is developed. To understand the relation between program and model, the plan must be known. The plan is a description of the methodology for accomplishing the model. Concepts are explicated for translating the global intent of a declarative model into the local imperative code of a program. Given the plan, model and program, the system can interpret the picture and recognize inconsistencies. The description of the discrepancies between the picture actually produced by the program and the intended scene is the input to a debugging system. Repair of the program is based on a combination of general debugging techniques and specific fixing knowledge associated with the geometric model primitives. In both the plan and repairing the bugs, the system exhibits an interesting style of analysis. It is capable of debugging itself and reformulating its analysis of a plan or bug in response to self-criticism. In this fashion, it can qualitatively reformulate its theory of the program or error to account for surprises or anomalies.

AIM-292

Author[s]: Donald E. Eastlake

U.T.: Telnet Reference Manual

April 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-292.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-292.pdf

UT is a user telnet program designed to run under the ITS time sharing system. It implements the relatively recent ARPA network negotiating protocol for telnet connections.

AITR-291

Author[s]: Drew V. McDermott

Assimilation of New Information by a Natural Language Understanding System

February 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-291.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-291.pdf

This work describes a program, called TOPLE, which uses a procedural model of the world to understand simple declarative sentences. It accepts sentences in a modified predicate calculus symbolism, and uses plausible reasoning to visualize scenes, resolve ambiguous pronoun and noun phrase references, explain events, and make conditional predications. Because it does plausible deduction, with tentative conclusions, it must contain a formalism for describing its reasons for its conclusions and what the alternatives are. When an inconsistency is detected in its world model, it uses its recorded information to resolve it, one way or another. It uses simulation techniques to make deductions about creatures' motivation and behavior, assuming they are goal-directed beings like itself.

AIM-290

Author[s]: Michael Beeler

Paterson's Worm

June 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-290.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-290.pdf

A description of a mathematical idealization of the feeding pattern of a kind of worm is given.

AIM-289

Author[s]: Daniel W. Corwin

Visual Position Extraction using Stereo Eye Systems with a Relative Rotational Motion Capability

March 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-289.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-289.pdf

This paper discusses the problem of context-free position estimation using a stereo vision system with moveable eyes. Exact and approximate equations are developed linking position to measurable quantities of the image-space, and an algorithm for finding these quantities is suggested in rough form. An estimate of errors and resolution limits is provided.

AIM-287

Author[s]: Tim Finin

Finding the Skeleton of a Brick

March 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-287.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-287.pdf

TC-SKELETON's duty is to help find the dimensions of brick-shaped objects by searching for sets of three complete edges, one for each dimension. The program was originally written by Patrick Winston, and then was refined and improved by Tim Finin.

AIM-286

Author[s]: Gerald J. Sussman

The FINDSPACE Problem

March 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-286.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-286.pdf

The FINDSPACE problem is that of establishing a volume in space where an object of specified dimensions will fit. The problem seems to have two subproblems: the hypothesis generation problem of finding a likely spot to try, and the verification problem of testing that spot for occupancy by other objects. This paper treats primarily the verification problem.
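
As a rough illustration of the verification subproblem, the sketch below tests whether a candidate spot is free of other objects. It assumes every object is an axis-aligned box, which the abstract does not state; the names `Box`, `overlaps`, and `spot_is_free` are hypothetical.

```python
from typing import NamedTuple, Sequence, Tuple

class Box(NamedTuple):
    lo: Tuple[float, float, float]  # minimum (x, y, z) corner
    hi: Tuple[float, float, float]  # maximum (x, y, z) corner

def overlaps(a: Box, b: Box) -> bool:
    """True if the interiors of two axis-aligned boxes intersect.

    Boxes that merely touch along a face do not count as overlapping.
    """
    return all(a.lo[i] < b.hi[i] and b.lo[i] < a.hi[i] for i in range(3))

def spot_is_free(candidate: Box, obstacles: Sequence[Box]) -> bool:
    """Verification step: the candidate volume must overlap no obstacle."""
    return not any(overlaps(candidate, o) for o in obstacles)

obstacles = [Box((0, 0, 0), (2, 2, 2))]
beside = Box((2, 0, 0), (3, 1, 1))      # touches the obstacle's face
intruding = Box((1, 1, 0), (3, 3, 1))   # cuts into the obstacle
```

Here `spot_is_free(beside, obstacles)` holds while `spot_is_free(intruding, obstacles)` does not. Generating likely candidates, the other subproblem, is left out of this sketch.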

AIM-285

Author[s]: Berthold K.P. Horn

The Binford-Horn LINE-FINDER

December 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-285.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-285.pdf

This paper briefly describes the processing performed in the course of producing a line drawing from an image obtained through an image dissector camera. The edge-marking phase uses a non-linear parallel line-follower. Complicated statistical measures are not used. The line and vertex generating phases use a number of heuristics to guide the transition from edge-fragments to cleaned up line-drawing. Higher-level understanding of the blocks-world is not used. Sample line-drawings produced by the program are included.

AIM-284

Author[s]: Marvin Minsky and Seymour Papert

Proposal to ARPA for Continued Research on A.I. for 1973

June 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-284.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-284.pdf

The Artificial Intelligence Laboratory proposes to continue its work on a group of closely interconnected projects, all bearing on questions about how to make computers able to use more sophisticated kinds of knowledge to solve difficult problems. This proposal explains what we expect to come of this work, and why it seems to us the most profitable direction for research at this time. The core of this proposal is about well-defined specific tasks such as extending the computer's ability to understand information presented as visual scenes, or in natural, human language. Although these specific goals are important enough in themselves, we see their pursuit also as tightly bound to the development of a general theory of the computations needed to produce intelligent processes. Obviously, a certain amount of theory is needed to achieve progress in this and we maintain that the steps toward a comprehensive theory in this domain must include thorough analysis of very specific phenomena. Our confidence in this strategy is based both on past successes and on our current theory of knowledge structure. Our proposed solutions are still evolving, but they all seem to revolve around new methods of programming and new ways to represent knowledge about programming.

AITR-283

Author[s]: Scott E. Fahlman

A Planning System for Robot Construction Tasks

May 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-283.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-283.pdf

This paper describes BUILD, a computer program which generates plans for building specified structures out of simple objects such as toy blocks. A powerful heuristic control structure enables BUILD to use a number of sophisticated construction techniques in its plans. Among these are the incorporation of pre-existing structure into the final design, pre-assembly of movable sub-structures on the table, and use of the extra blocks as temporary supports and counterweights in the course of construction. BUILD does its planning in a modeled 3-space in which blocks of various shapes and sizes can be represented in any orientation and location. The modeling system can maintain several world models at once, and contains modules for displaying states, testing them for inter-object contact and collision, and for checking the stability of complex structures involving frictional forces. Various alternative approaches are discussed, and suggestions are included for the extension of BUILD-like systems to other domains. Also discussed are the merits of BUILD's implementation language, CONNIVER, for this type of problem solving.

AIM-282

Author[s]: Andee Rubin

Grammar for the People: Flowcharts of SHRDLU's Grammar

March 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-282.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-282.pdf

The grammar which SHRDLU uses to parse sentences is outlined in a series of flowcharts which attempt to modularize and illuminate its structure. In addition, a short discussion of systemic grammar is included.

AITR-281

Author[s]: Patrick H. Winston (Editor)

Progress in Vision and Robotics

May 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-281.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-281.pdf

The Vision Flashes are informal working papers intended primarily to stimulate internal interaction among participants in the A.I. Laboratory’s Vision and Robotics group. Many of them report highly tentative conclusions or incomplete work. Others deal with highly detailed accounts of local equipment and programs that lack general interest. Still others are of great importance, but lack the polish and elaborate attention to proper referencing that characterizes the more formal literature. Nevertheless, the Vision Flashes collectively represent the only documentation of an important fraction of the work done in machine vision and robotics. The purpose of this report is to make the findings more readily available, but since they are not revised as presented here, readers should keep in mind the original purpose of the papers!

AIM-280

Author[s]: Ira Goldstein

Elementary Geometry Theorem Proving

April 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-280.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-280.pdf

An elementary theorem prover for a small part of plane Euclidean geometry is presented. The purpose is to illustrate important problem solving concepts that naturally arise in building procedural models for mathematics.

AIM-279

Author[s]: Ira Goldstein

Pretty-Printing, Converting List to Linear Structure

February 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-279.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-279.pdf

Pretty-printing is the conversion of the list structure to a readable format. This paper outlines the computational problems encountered in such a task and documents the current algorithm in use.
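
To make the task concrete, here is a small sketch of one naive strategy: print a list on a single line if it fits within a given width, otherwise put each element on its own indented line. This is not the algorithm documented in the memo; the function names and the use of Python lists to stand in for Lisp list structure are assumptions made for illustration.

```python
def flat(x):
    """Render a nested list on a single line, Lisp style."""
    if isinstance(x, list):
        return "(" + " ".join(flat(e) for e in x) + ")"
    return str(x)

def pretty(x, width=30, indent=0):
    """Render x within `width` columns where possible, breaking lists that do not fit."""
    pad = " " * indent
    text = flat(x)
    # Atoms, empty lists, and anything that fits are printed flat.
    if not isinstance(x, list) or not x or indent + len(text) <= width:
        return pad + text
    # Otherwise each element goes on its own line, one column deeper.
    lines = [pretty(e, width, indent + 1) for e in x]
    lines[0] = pad + "(" + lines[0].lstrip()
    lines[-1] += ")"
    return "\n".join(lines)

expr = ["defun", "square", ["x"], ["times", "x", "x"]]
narrow = pretty(expr, width=20)
wide = pretty(expr, width=80)
```

With a width of 80 the expression stays on one line; with a width of 20 it is split into four lines, one per top-level element. Even this toy version shows one computational problem of the task: the flat rendering is recomputed at every level of nesting.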

AIM-278

Author[s]: Robert C. Moore

D-SCRIPT: A Computational Theory of Descriptions

February 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-278.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-278.pdf

This paper describes D-SCRIPT, a language for representing knowledge in artificial intelligence programs. D-SCRIPT contains a powerful formalism for descriptions, which permits the representation of statements that are problematical for other systems. Particular attention is paid to problems of opaque contexts, time contexts, knowledge about knowledge. The design of a theorem prover for this language is also considered.

AIM-277

Author[s]: Vaughan R. Pratt

A Linguistics Oriented Programming Language

February 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-277.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-277.pdf

A programming language for natural language processing programs is described. Examples of the output of programs written using it are given. The reasons for various design decisions are discussed. An actual session with the system is presented, in which a small fragment of an English-to-French translator is developed. Some of the limitations of the system are discussed, along with plans for further development.

AIM-276

Author[s]: Michael Beeler

The Making of the Film, SOLAR CORONA

February 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-276.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-276.pdf

The film SOLAR CORONA was made from data taken from August 14, 1969 through May 7, 1970, by OSO-VI, one of the Orbiting Satellite Observatories. One of the experiments on board scanned across and up and down the image of the sun, as we read a printed page. Each line of the scan was broken up into several distinct measurement points, similar to our eyes fixating as we read a line of text.

AIM-275

Author[s]: Martin Brooks and Jerrold Ginsparg

Differential Perceptrons

January 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-275.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-275.pdf

As originally proposed, perceptrons were machines that scanned a discrete retina and combined the data gathered in a linear fashion to make decisions about the figure presented on the retina. This paper considers differential perceptrons, which view a continuous retina. Thus, instead of summing the results of predicates, we must now integrate. This involves setting up a predicate space which transforms the typical perceptron sum, $\sum_{\varphi} \alpha(\varphi)\,\varphi(f)$, into $\int \alpha(p)\,f(p)\,dp$, where f is the figure on the retina, i.e. in the differential case, the figure is viewed as a function on the predicate space. We show that differential perceptrons are equivalent to perceptrons on the class of figures that fit exactly onto a sufficiently small square grid. By investigating predicates of various geometric transformations, we discover that translation and symmetry can be computed in finite order using finite coefficients in both continuous and discrete cases. We also note that in the perceptron scheme, combining data linearly implies the ability to combine data in a polynomial fashion.

AIM-274

Author[s]: Marvin Minsky and Seymour Papert

Proposal to ARPA for Continuation of Micro-Automation Development

January 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-274.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-274.pdf

This proposal discusses practical aspects of our project to produce a replicable research tool for development of real-world computer-controlled hand-eye systems. If this proposal is read out of context, it will not seem very sophisticated because it is concerned mainly with the practical aspects of putting together an engineering system. The theoretical and conceptual context is described more thoroughly in the memo, supplementary to our main ARPA contract proposal, that describes in detail robotics research at the MIT A.I. Laboratory.

AIM-273

Author[s]: David Silver

The Little Robot System

January 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-273.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-273.pdf

The Little Robot System provides for the I.T.S. user a medium size four degree of freedom six axis robot which is controlled by the PDP-6 computer through the programming language Lisp. The robot includes eight force feedback channels which when interpreted by the PDP-6 are read by Lisp as the signed force applied to the end of the fingers. The first six forces are the X, Y, and Z forces and the torques around X, Y, and Z. The other two forces are the grippers and the vice grippers. The three X, Y, and Z forces and three torques are computed from six numbers read in from six L.V.D.Ts (Linear Variable Differential Transformers) arranged three in the vertical and three in the horizontal plane within a stress strain spring loaded wrist. The grip is read in from a strain gauge mounted on the stationary reference finger. The relative position between the motor shaft and the vice shaft is determined through means of two potentiometers to measure the vice force. The two shafts are coupled by a spring.

AIM-272

Author[s]: Michael Speciner

How the GAS Program Works with a Note on Simulating Turtles with Touch Sensors

December 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-272.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-272.pdf

The GAS program is a display simulation of a two-dimensional ideal gas. Barriers, or walls, are line segments, and molecules, alias particles or balls, are circles. Collisions occur between balls and other balls as well as between balls and walls. All collisions are elastic. Global gravitational, electric, and magnetic fields can be imposed to act on the particles. The following is a description of some of the inner workings of the program.
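
An elastic collision between a ball and a wall can be sketched as a reflection of the ball's velocity about the wall's normal. This is the standard textbook formula, not code taken from the GAS program, and the function name `reflect` is hypothetical. Collision detection and the ball's radius are ignored here.

```python
import math

def reflect(velocity, wall_start, wall_end):
    """Reflect a 2D velocity off a wall given by its two endpoints.

    The component along the wall is kept and the component along the
    wall's normal is reversed, so the speed is unchanged (elastic).
    """
    vx, vy = velocity
    wx = wall_end[0] - wall_start[0]
    wy = wall_end[1] - wall_start[1]
    length = math.hypot(wx, wy)
    # Unit normal to the wall (either orientation gives the same result).
    nx, ny = -wy / length, wx / length
    dot = vx * nx + vy * ny
    return (vx - 2 * dot * nx, vy - 2 * dot * ny)

# A ball moving down and to the right hits a horizontal floor.
bounced = reflect((1.0, -1.0), (0.0, 0.0), (1.0, 0.0))
# A ball moving right hits a wall tilted at 45 degrees.
deflected = reflect((1.0, 0.0), (0.0, 0.0), (1.0, 1.0))
```

In the first case only the vertical component changes sign; in the second the ball is turned to move straight up, with its speed preserved in both cases.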

AITR-271

Author[s]: David L. Waltz

Generating Semantic Descriptions From Drawings of Scenes With Shadows

November 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-271.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-271.pdf

The research reported here concerns the principles used to automatically generate three-dimensional representations from line drawings of scenes. The computer programs involved look at scenes which consist of polyhedra and which may contain shadows and various kinds of coincidentally aligned scene features. Each generated description includes information about edge shape (convex, concave, occluding, shadow, etc.), about the type of illumination for each region (illuminated, projected shadow, or oriented away from the light source), and about the spatial orientation of regions. The methods used are based on the labeling schemes of Huffman and Clowes; this research provides a considerable extension to their work and also gives theoretical explanations to the heuristic scene analysis work of Guzman, Winston, and others.

AIM-270

Author[s]: Gerald Jay Sussman

Teaching of Procedures-Progress Report

October 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-270.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-270.pdf

The idea of building a programmer is very seductive in that it holds the promise of massive bootstrapping and thus ties in with many ideas about learning and teaching. I will avoid going into those issues here. It is necessary, however, to explain what I am not working on. I am not interested in developing new and better languages for expressing algorithms. When FORTRAN was invented, it was touted as an automatic programmer, and indeed it was, as it relieved the user of complete specification of the details of implementation. Newer programming languages are just elaborations (usually better) of that basic idea. I am, however, interested in the problem of implementation of a partially specified algorithm rather than a complete algorithm and a partially specified implementation. This problem is truly in the domain of Artificial Intelligence because the system which "solves" this problem needs a great deal of knowledge about the problem domain for which the algorithm is being constructed in order to "reasonably" complete the specification. Indeed, a programmer is not told exactly the algorithm to be implemented, he is told the problem which his program is expected to solve.

AIM-269

Author[s]: Marvin Minsky

Proposal to ARPA for Continued Research on A.I.

October 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-269.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-269.pdf

The Artificial Intelligence Laboratory proposes to continue its work on a group of closely interconnected projects, all bearing on questions about how to make computers able to use more sophisticated kinds of knowledge to solve difficult problems. This proposal explains what we expect to come of this work, and why it seems to us the most profitable direction for research at this time. The core of this proposal is about well-defined specific tasks such as extending the computer's ability to understand information presented as visual scenes--or in natural, human language. Although these specific goals are important enough in themselves, we see their pursuit also as tightly bound to the development of a general theory of the computations needed to produce intelligent processes. Obviously, a certain amount of theory is needed to achieve progress in this and we maintain that the steps toward a deep theory in this domain must include thorough analysis of very specific phenomena. Our confidence in this strategy is based both on past successes and on our current theory of knowledge structure. These bases are discussed both below and in the appendices.

AIM-268

Author[s]: Arthur J. Nevins

A Human Oriented Logic for Automatic Theorem Proving

October 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-268.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-268.pdf

The automation of first order logic has received comparatively little attention from researchers intent upon synthesizing the theorem proving mechanism used by humans. The dominant point of view [15], [18] has been that theorem proving on the computer should be oriented to the capabilities of the computer rather than to the human mind and therefore one should not be afraid to provide the computer with a logic that humans might find strange and uncomfortable. The preeminence of this point of view is not hard to explain since until now the most successful theorem proving programs have been machine oriented. Nevertheless, there are at least two reasons for being dissatisfied with the machine oriented approach. First, a mathematician often is interested more in understanding the proof of a proposition than in being told that the proposition is true, for the insight gained from an understanding of the proof can lead to the proof of additional propositions and the development of new mathematical concepts. However, machine oriented proofs can appear very unnatural to a human mathematician thereby providing him with little if any insight. Second, the machine oriented approach has failed to produce a computer program which even comes close to equaling a good human mathematician in theorem proving ability; this leads one to suspect that perhaps the logic being supplied to the machine is not as efficient as the logic used by humans. The approach taken in this paper has been to develop a theorem proving program as a vehicle for gaining a better understanding of how humans actually prove theorems. The computer program which has emerged from this study is based upon a logic which appears more "natural" to a human (i.e., more human oriented). While the program is not yet the equal of a top flight human mathematician, it already has given indication (evidence of which is presented in section 9) that it can outperform the best machine oriented theorem provers.

AIM-267A

Author[s]: Marvin Minsky

Manipulator Design Vignettes

October 1981

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-267a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-267a.pdf

This memo is about mechanical arms. The literature on robotics seems to be deficient in such discussions, perhaps because not enough sharp theoretical problems have been formulated to attract interest. I'm sure many of these matters have been discussed in other literatures-- prosthetics, orthopedics, mechanical engineering, etc., and references to such discussions would be welcome. We raise these issues in the context of designing the "mini-robot" system in the A.I. Laboratory in 1972-1973. But we would like to attract the interests of the general heuristic programming community to such questions.

AIM-267

Author[s]: Marvin Minsky

Manipulator Design Vignettes

October 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-267.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-267.pdf

This memo is about mechanical arms. The literature on robotics seems to be deficient in such discussions, perhaps because not enough sharp theoretical problems have been formulated to attract interest. I’m sure many of these matters have been discussed in other literatures – prosthetics, orthopedics, mechanical engineering, etc., and references to such discussions would be welcome. We raise these issues in the context of designing the “mini-robot” system in the A.I. Laboratory in 1972-1973. But we would like to attract the interest of the general heuristic programming community to such questions.

AITR-266

Author[s]: Eugene Charniak

Toward A Model Of Children's Story Comprehension

December 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-266.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-266.pdf

How does a person answer questions about children's stories? For example, consider "Janet wanted Jack's paints. She looked at the picture he was painting and said 'Those paints make your picture look funny.'" The question to ask is "Why did Janet say that?" We propose a model which answers such questions by relating the story to background real world knowledge. The model tries to generate and answer important questions about the story as it goes along. Within this model we examine two problems: how to organize this real world knowledge, and how it enters into more traditional linguistic questions such as deciding noun phrase reference.

AIM-265

Author[s]: Garry S. Meyer

Infants in Children Stories - Toward a Model of Natural Language Comprehension

August 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-265.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-265.pdf

How can we construct a program that will understand stories that children would understand? By understand we mean the ability to answer questions about the story. We are interested here in understanding natural language in a very broad area. In particular, how does one understand stories about infants? We propose a system which answers such questions by relating the story to background real world knowledge. We make use of the general model proposed by Eugene Charniak in his Ph.D. thesis (Charniak 72). The model sets up expectations which can be used to help answer questions about the story. There is a set of routines called BASE-routines that correspond to our "real world knowledge" and routines that are "put-in" which are called DEMONs that correspond to contextual information. Context can help to assign a particular meaning to an ambiguous word, or pronoun.

AIM-264

Author[s]: Jeanne Bamberger

Developing a Musical Ear: A New Experiment

July 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-264.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-264.pdf

I would like to report on some ideas we have been developing at M.I.T. for self-paced, independent music study. The aim of our approach is to nurture in students that enigmatic quality called, "musical"-- be it a "musical ear" or an individual's capacity to give a "musical performance". While all of us cherish these qualities, rarely do we come to grips with them directly in teaching. More often we rely on our magical or mystical faith in the inspiration of music, itself, and its great artists, to do the teaching. And for some (maybe ultimately all) this is the best course. But what about the others to whom we teach only the techniques of playing instruments or some "facts" about music--its forms, its history and its apparent elements? How often do we have or take the time to examine the assumptions underlying these "facts" we teach, or to question the relation between what we teach and what we do as musicians?

AIM-263

Author[s]: Yoshiaki Shirai

A Heterarchical Program for Recognition of Polyhedra

June 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-263.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-263.pdf

Recognition of polyhedra by a heterarchical program is presented. The program is based on the strategy of recognizing objects step by step, at each time making use of the previous results. At each stage, the most obvious and simple assumption is made and the assumption is tested. To find a line segment, a range of search is proposed. Once a line segment is found, more of the line is determined by tracking along it. Whenever a new fact is found, the program tries to reinterpret the scene taking the obtained information into consideration. Results of the experiment using an image dissector are satisfactory for scenes containing a few blocks and wedges. Some limitations of the present program and proposals for future development are described.

AIM-262

Author[s]: Mitchell Wand

A Concrete Approach to Abstract Recursive Definitions

June 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-262.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-262.pdf

We introduce a non-categorical alternative to Wagner's Abstract Recursive Definitions [Wg-1,2] using a generalization of the notion of clone called a u-clone. Our more concrete approach yields two new theorems: 1.) the free u-clone generated by a ranked set is isomorphic to the set of loop-representable flow diagrams with function symbols in the set, 2.) For every element of a u-clone there is an expression analogous to a regular expression. Several well-known theorems of language and automata theory are drawn as special cases of this theorem.

AIM-261

Author[s]: Donald E. Eastlake

PEEK

May 1973

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-261.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-261.pdf

PEEK is a utility program designed to operate under the ITS time sharing system. It enables a user to monitor a variety of aspects of the time sharing system by providing periodically updated display output or periodic printer output to teletype or line printer. Just what information is being presented to the user is controlled by PEEK's information mode. The available modes are listed in section 3 below. Section 5 describes how PEEK determines which device to output on. Section 2 describes, in general, how the user can input commands to PEEK.

AIM-261A

Author[s]: Donald E. Eastlake

PEEK

February 1974

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-261a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-261a.pdf

PEEK is a utility program designed to operate under the ITS time sharing system. It enables a user to monitor a variety of aspects of the time sharing system by providing, to the user, various periodically updated displays.

AIM-260

Author[s]: Donald E. Eastlake

Lock

June 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-260.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-260.pdf

LOCK is a miscellaneous utility program operating under the ITS system. It allows the user to easily and conveniently perform a variety of infrequently required tasks. Most of these relate to console input-output or the operation of the ITS system.

AIM-259A

Author[s]: Drew V. McDermott and Gerald Jay Sussman

The Conniver Reference Manual

May 1972 (Updated January 1974)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-259a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-259a.pdf

This manual is an introduction and reference to the latest version of the Conniver programming language, an AI language with general control and data-base structures.

AIM-259

Author[s]: Drew V. McDermott and Gerald Jay Sussman

The Conniver Reference Manual

May 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-259.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-259.pdf

This manual is intended to be a guide to the philosophy and use of the programming language CONNIVER, which is "complete," and running at the AI Lab now. It assumes good knowledge of LISP, but no knowledge of Micro-Planner, in whose implementation many design decisions were made that are not to be expected to have consequences in CONNIVER. Those not familiar with LISP should consult Weissman's (1967) Primer, the LISP 1.5 Programmer's Manual (McCarthy et al., 1962), or Jon L. White's (1970) and others' (PDP-6, 1967) excellent memos here at our own lab.

AITR-258

Author[s]: Carl Hewitt

Description and Theoretical Analysis (Using Schemata) of Planner: A Language for Proving Theorems and Manipulating Models in a Robot

April 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-258.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-258.pdf

Planner is a formalism for proving theorems and manipulating models in a robot. The formalism is built out of a number of problem-solving primitives together with a hierarchical multiprocess backtrack control structure. Statements can be asserted and perhaps later withdrawn as the state of the world changes. Under BACKTRACK control structure, the hierarchy of activations of functions previously executed is maintained so that it is possible to revert to any previous state. Thus programs can easily manipulate elaborate hypothetical tentative states. In addition PLANNER uses multiprocessing so that there can be multiple loci of changes in state. Goals can be established and dismissed when they are satisfied. The deductive system of PLANNER is subordinate to the hierarchical control structure in order to maintain the desired degree of control. The use of a general-purpose matching language as the basis of the deductive system increases the flexibility of the system. Instead of explicitly naming procedures in calls, procedures can be invoked implicitly by patterns of what the procedure is supposed to accomplish. The language is being applied to solve problems faced by a robot, to write special purpose routines from goal oriented language, to express and prove properties of procedures, to abstract procedures from protocols of their actions, and as a semantic base for English.

AIM-257

Author[s]: Rich Schroeppel

A Two Counter Machine Cannot Calculate 2^N

May 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-257.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-257.pdf

This note proves that a two counter machine cannot calculate 2^N.

AIM-256

Author[s]: Michael J. Fischer

Efficiency of Equivalence Algorithms

April 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-256.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-256.pdf

This paper was first presented at the Symposium on Complexity of Computer Computations, IBM Thomas J. Watson Research Center, Yorktown Heights, New York, on March 22, 1972. The equivalence problem is to determine the finest partition on a set that is consistent with a sequence of assertions of the form "x == y". A strategy for doing this on a computer processes the assertions serially, maintaining always in storage a representation of the partition defined by the assertions so far encountered. To process the command "x == y", the equivalence classes of x and y are determined. If they are the same, nothing further is done; otherwise the two classes are merged together. Galler and Fischer (1964A) give an algorithm for solving this problem based on tree structures, and it also appears in Knuth (1968A). The items in each equivalence class are arranged in a tree, and each item except for the root contains a pointer to its father. The root contains a flag indicating that it is a root, and it may also contain other information relevant to the equivalence class as a whole.
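
The tree representation described in this abstract can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the algorithm as analyzed in the memo: the class name `EquivalenceForest`, the method names, and the use of `None` as the root flag are all hypothetical, and no balancing or path shortening is attempted.

```python
class EquivalenceForest:
    """Equivalence classes stored as trees of father pointers (illustrative names)."""

    def __init__(self):
        # Maps each item to its father; None plays the role of the root flag.
        self.father = {}

    def find(self, x):
        # An item seen for the first time starts as the root of its own class.
        if x not in self.father:
            self.father[x] = None
        # Follow father pointers until the root of the tree is reached.
        while self.father[x] is not None:
            x = self.father[x]
        return x

    def assert_equivalent(self, x, y):
        # Process the command "x == y": determine both classes, merge if they differ.
        root_x, root_y = self.find(x), self.find(y)
        if root_x != root_y:
            self.father[root_y] = root_x


forest = EquivalenceForest()
for left, right in [("a", "b"), ("c", "d"), ("b", "d")]:
    forest.assert_equivalent(left, right)
```

After the three assertions, `forest.find("a")` and `forest.find("c")` return the same root, while an item never mentioned in any assertion remains in a class of its own.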

AIM-255

Author[s]: Gerald Jay Sussman

Why Conniving is Better than Planning

February 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-255.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-255.pdf

A higher level language derives its great power from the fact that it tends to impose structure on the problem solving behavior for the user. Besides providing a library of useful subroutines with a uniform calling sequence, the author of a higher level language imposes his theory of problem solving on the user. By choosing what primitive data structures, control structures, and operators he presents to the user, he makes the implementation of some algorithms more difficult than others, thus discouraging some techniques and encouraging others. So, to be "good", a higher level language must not only simplify the job of programming, by providing features which package programming structures commonly found in the domain for which the language was designed, it must also do its best to discourage the use of structures which lead to "bad" algorithms.

AIM-255A

Author[s]: Gerald Jay Sussman and Drew Vincent McDermott

Why Conniving is Better than Planning

April 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-255a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-255a.pdf

This paper is a critique of a computer programming language, Carl Hewitt's PLANNER, a formalism designed especially to cope with the problems that Artificial Intelligence encounters. It is our contention that the backtrack control structure that is the backbone of PLANNER is more of a hindrance in the solution of problems than a help. In particular, automatic backtracking encourages inefficient algorithms, conceals what is happening from the user, and misleads him with primitives having powerful names whose power is only superficial. An alternative, a programming language called CONNIVER which avoids these problems, is presented from the point of view of this critique.

AIM-254

Author[s]: Seymour Papert and Cynthia Solomon

NIM: A Game-Playing Program

January 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-254.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-254.pdf

This note illustrates some ideas about how to initiate beginning students into the art of planning and writing a program complex enough to be considered a project rather than an exercise on using the language or simple programming ideas. The project is to write a program to play a simple game ("one-pile NIM" or "21") as invincibly as possible. We developed the project for a class of seventh-grade children we taught in 1968-69 at the Muzzey Junior High School in Lexington, Massachusetts. This was the longest programming project these children had encountered, and our intention was to give them a model of how to go about working under these conditions.

AIM-253

Author[s]: Matthew J. Hillsman, R. Wade Williams and John S. Roe

The Computer-Controlled Oculometer: A Prototype Interactive Eye Movement Tracking System

September 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-253.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-253.pdf

One kind of eye movement tracking device which has great potential is the digital computer-controlled Oculometer, an instrument which non-invasively measures point of regard of the subject, as well as pupil diameter and blink occurrence. In conjunction with a computer-generated display which can change in real time as a function of the subject's eye motions, the computer-controlled Oculometer makes possible a variety of interactive measurement and control systems. Practical applications of such schemes have had to await the development of an instrument design which does not inconvenience the subject, and which conveniently interfaces with a digital computer (see ref. 1). This report describes an Oculometer subsystem and an eye-tracking/control program designed for use with the PDP-6 computer of the MIT Project MAC Artificial Intelligence Group. The oculometer electro-optic subsystem utilizes near-infrared light reflected specularly off the front surface of the subject's cornea and diffusely off the retina, producing a bright pupil with an overriding corneal highlight. An electro-optic scanning aperture vidissector within the unit, driven by a digital eye-tracking algorithm programmed into the PDP-6 computer, detects and tracks the centers of the corneal highlight and the bright pupil to give eye movement measurements. A computer-controlled, moving mirror head motion tracker directly coupled to the vidissector tracker permits the subject reasonable freedom of movement. Various applications of this system, which are suggested by the work reported here, include: (a) using the eye as a control device, (b) recording eye fixation and exploring patterns, (c) game playing, (d) training machines, and (e) psychophysiological testing and recording.

AIM-252

Author[s]: Marvin Minsky and Seymour Papert

Artificial Intelligence Progress Report

January 1, 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-252.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-252.pdf

Research at the Laboratory in vision, language, and other problems of intelligence. This report is an attempt to combine a technical progress report with an exposition of our point of view about certain problems in the Theory of Intelligence.

AIM-251

Author[s]: Marvin Minsky

Mini-Robot Proposal to ARPA

January 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-251.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-251.pdf

During the next decade it will become practical to use more and more sophisticated techniques of automation--we shall call this "robotics"--both in established industries and in new areas. The rate at which these techniques become available will depend very much on the way research programs are organized to pursue them. The issues involved are rather large and touch not only on technical matters but also on aspects of national economic policy and attitudes toward world trade positions. The project herein proposed is concerned with the development of two particular aspects of Robotics, namely: 1.) Development of a miniature hand-eye system 2.) Development of remote, ARPA-NETWORK style operation of robotic systems, in which simple jobs are handled locally while more complex computations are done on a larger scale.

AIM-250

Author[s]: Carl Hewitt

Planner Implementation Proposal to ARPA 1972-1973

December 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-250.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-250.pdf

The task objective is the generalization and implementation of the full power of the problem solving formalism PLANNER in the next two years. We will show how problem solving knowledge can be effectively incorporated into the formalism. Several domains will be explored to demonstrate how PLANNER enhances problem solving.

AIM-249

Author[s]: Seymour Papert

Teaching Children to be Mathematicians vs. Teaching About Mathematics

July 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-249.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-249.pdf

Being a mathematician is no more definable as 'knowing' a set of mathematical facts than being a poet is definable as knowing a set of linguistic facts. Some modern math ed reformers will give this statement a too easy assent with the comment: 'Yes, they must understand, not merely know.' But this misses the capital point that being a mathematician, again like being a poet, or a composer or an engineer, means doing, rather than knowing or understanding. This essay is an attempt to explore some ways in which one might be able to put children in a better position to do mathematics rather than merely to learn about it.

AIM-248

Author[s]: Seymour Papert and Cynthia Solomon

Twenty Things To Do With A Computer

June 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-248.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-248.pdf

When people talk about computers in education they do not all have the same image in mind. Some think of using the computer to program the kid; others think of using the kid to program the computer. But most of them have at least this in common: the transaction between the computer and the kid will be some kind of "conversation" or "questions and answers" in words or numbers.

AIM-247

Author[s]: Seymour Papert

Teaching Children Thinking

October 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-247.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-247.pdf

This paper is dedicated to the hope that someone with power to act will one day see that contemporary research on education is like the following experiment by a nineteenth century engineer who worked to demonstrate that engines were better than horses. This he did by hitching a 1/8 HP motor in parallel with his team of four strong stallions. After a year of statistical research he announced a significant difference. However, it was generally thought that there was a Hawthorne effect on the horses.

AIM-246

Author[s]: Seymour Papert

A Computer Laboratory for Elementary Schools

October 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-246.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-246.pdf

This is a research project on elementary education whose immediate objective is the development of new methods and materials for teaching in an environment of computers and computer-controlled devices. Longer term objectives are related to theories of cognitive processes and to conjectures about the possibility of producing much larger changes than are usually thought possible in the expected intellectual achievement of children. This proposal is formulated in terms of the self-sufficient immediate objectives.

AIM-245

Author[s]: Marvin Minsky and Seymour Papert

Proposal to ARPA for Research on Artificial Intelligence at M.I.T., 1971-1972

October 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-245.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-245.pdf

The activities of the Artificial Intelligence laboratory can be viewed under three main aspects: (1) Artificial Intelligence: understanding the principles of making intelligent machines along the lines discussed in previous proposals, and elaborated below. (2) Natural Intelligence: As we understand intelligence better we see fewer differences between the problems of understanding human and machine intelligence. We have been increasingly able to translate our ideas about programming machines into ideas about educating children, and are currently developing systematic methods in elementary education. And conversely, we attribute to our observations and experience in the latter activities much of what we believe are important new conceptions of how to organize knowledge for programs that really understand. (3) Mathematical theories: This aspect is relevant not because we often need to solve specific mathematical problems but especially because we are firmly committed to maintaining a mathematical style in the laboratory. In many centers we have seen decline and deterioration following apparently successful "experiments" in artificial intelligence because the principles behind the performance were not understood, hence the limitations unseen.

AIM-243

Author[s]: Stephen W. Smoliar

Using the EUTERPE Music System

October 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-243.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-243.pdf

This memo describes the practical implementation of programs written in the language EUTERPE. Details of this language are given in the author's thesis (A Parallel Processing Model of Musical Structures) and will not be treated here. We shall only be concerned with the preparation and processing of a EUTERPE source program. Sample programs are given in their entirety in the thesis or may be read off the author's file directory (SWS;). Notational conventions are those of Dowson's guide to the AI lab timesharing system (AI Memo No 215).

AITR-242

Author[s]: Stephen W. Smoliar

A Parallel Processing Model of Musical Structures

September 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-242.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-242.pdf

Euterpe is a real-time computer system for the modeling of musical structures. It provides a formalism wherein familiar concepts of musical analysis may be readily expressed. This is verified by its application to the analysis of a wide variety of conventional forms of music: Gregorian chant, Mediaeval polyphony, Bach counterpoint, and sonata form. It may be of further assistance in the real-time experiments in various techniques of thematic development. Finally, the system is endowed with sound-synthesis apparatus with which the user may prepare tapes for musical performances.

AIM-241

Author[s]: Terry Winograd

An AI Approach to English Morphemic Analysis

February 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-241.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-241.pdf

This paper illustrates an approach toward understanding natural language through the techniques of artificial intelligence. It explores the structure of English word-endings both morpho-graphemically and semantically. It illustrates the use of procedures and semantic representations in relating the broad range of knowledge a language user brings to bear on understanding an utterance.

AIM-240A

Author[s]: Donald E. Eastlake

LLSIM Reference Manual

February 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-240a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-240a.pdf

A program that simulates a Digital Equipment Corporation PDP-11 computer and many of its peripherals on the AI Laboratory Time Sharing System (ITS) is described from a user's reference point of view. This simulator has a built in DDT-like command level which provides the user with the normal range of DDT facilities but also with several special debugging features built into the simulator. The DDT command language was implemented by Richard M. Stallman while the simulator was written by the author of this memo.

AIM-240

Author[s]: Donald Eastlake

LLSIM Reference Manual

December 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-240.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-240.pdf

AIM-239

Author[s]: M. Beeler, R.W. Gosper and R. Schroeppel

HAKMEM

February 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-239.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-239.pdf

Here is some little known data which may be of interest to computer hackers. The items and examples are so sketchy that to decipher them may require more sincerity and curiosity than a non-hacker can muster. Doubtless, little of this is new, but nowadays it's hard to tell. So we must be content to give you an insight, or save you some cycles, and to welcome further contributions of items, new or used.

AIM-238

Author[s]: Donald E. Eastlake

ITS Status Report

April 1972

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-238.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-238.pdf

ITS is a time-shared operating system designed for the Artificial Intelligence Laboratory DEC PDP-10/PDP-6 installation and tailored to its special requirements. This status report describes the design philosophy behind the ITS system, the hardware and software facilities of the system implemented with this philosophy, and some information on work currently in progress or desirable in the near future.

AIM-237

Author[s]: Patrick E. O'Neil

An Inquiry into Algorithmic Complexity

September 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-237.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-237.pdf

This is the first section in a proposed monograph on algorithmic complexity theory. Future sections shall include: Information Theory as a Proof Technique; Algorithms Using Linear Form Inequalities; Some Probabilistic Analyses of Algorithms, etc. Comments, suggestions and corrections are welcomed. Please let me know what you think. This is not a limited distribution document, although I may wish to publish it later. Anyone who develops an idea based on this work to a more advanced state is welcome to publish first. I would be very eager to see any such result as soon as possible.

AITR-235

Author[s]: Terry Winograd

Procedures as a Representation for Data in a Computer Program for Understanding Natural Language

January 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-235.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-235.pdf

This paper describes a system for the computer understanding of English. The system answers questions, executes commands, and accepts information in normal English dialog. It uses semantic information and context to understand discourse and to disambiguate sentences. It combines a complete syntactic analysis of each sentence with a “heuristic understander” which uses different kinds of information about a sentence, other parts of the discourse, and general information about the world in deciding what the sentence means. It is based on the belief that a computer cannot deal reasonably with language unless it can “understand” the subject it is discussing. The program is given a detailed model of the knowledge needed by a simple robot having only a hand and an eye. We can give it instructions to manipulate toy objects, interrogate it about the scene, and give it information it will use in deduction. In addition to knowing the properties of toy objects, the program has a simple model of its own mentality. It can remember and discuss its plans and actions as well as carry them out. It enters into a dialog with a person, responding to English sentences with actions and English replies, and asking for clarification when its heuristic programs cannot understand a sentence through use of context and physical knowledge.

AITR-234

Author[s]: Lawrence W. Krakauer

Computer Analysis of Visual Properties of Curved Objects

May 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-234.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-234.pdf

A method is presented for the visual analysis of objects by computer. It is particularly well suited for opaque objects with smoothly curved surfaces. The method extracts information about the object’s surface properties, including measures of its specularity, texture, and regularity. It also aids in determining the object’s shape. The application of this method to a simple recognition task – the recognition of fruit – is discussed. The results on a more complex smoothly curved object, a human face, are also considered.

AITR-233

Author[s]: Edwin Roger Banks

Information Processing and Transmission in Cellular Automata

January 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-233.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-233.pdf

A cellular automaton is an iterative array of very simple identical information processing machines called cells. Each cell can communicate with neighboring cells. At discrete moments of time the cells can change from one state to another as a function of the states of the cell and its neighbors. Thus on a global basis, the collection of cells is characterized by some type of behavior. The goal of this investigation was to determine just how simple the individual cells could be while the global behavior achieved some specified criterion of complexity – usually the ability to perform a computation or to reproduce some pattern. The chief result described in this thesis is that an array of identical square cells (in two dimensions), each cell of which communicates directly with only its four nearest edge neighbors and each of which can exist in only two states, can perform any computation. This computation proceeds in a straightforward way. A configuration is a specification of the states of all the cells in some area of the iterative array. Another result described in this thesis is the existence of a self-reproducing configuration in an array of four-state cells, a reduction of four states from the previously known eight-state case. The technique of information processing in cellular arrays involves the synthesis of some basic components. Then the desired behaviors are obtained by the interconnection of these components. A chapter on components describes some sets of basic components. Possible applications of the results of this investigation, descriptions of some interesting phenomena (for vanishingly small cells), and suggestions for further study are given later.
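
The update scheme this abstract describes (two-state square cells, each reading only its four edge neighbors) can be sketched as follows. This is an illustrative sketch under assumptions that are not in the abstract: the function names are hypothetical, cells outside the finite array are treated as state 0, and `parity_rule` is a stand-in transition function chosen for brevity, not the rule constructed in the thesis.

```python
def step(grid, rule):
    """Advance a finite 2D array of two-state cells by one discrete time step.

    rule(center, north, east, south, west) returns the cell's next state (0 or 1).
    Cells outside the array are read as 0, which is an assumption of this sketch.
    """
    rows, cols = len(grid), len(grid[0])

    def at(r, c):
        return grid[r][c] if 0 <= r < rows and 0 <= c < cols else 0

    # Every cell is updated from the same old grid, so the step is synchronous.
    return [
        [rule(at(r, c), at(r - 1, c), at(r, c + 1), at(r + 1, c), at(r, c - 1))
         for c in range(cols)]
        for r in range(rows)
    ]


def parity_rule(center, north, east, south, west):
    # Hypothetical example rule: next state is the parity of the four neighbors.
    return (north + east + south + west) % 2


grid = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
next_grid = step(grid, parity_rule)
```

Any other function of the five arguments can be passed as `rule`; the point is only that the global behavior arises from one identical local function applied everywhere at once.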

AITR-232

Author[s]: Berthold K. P. Horn

Shape from Shading: A Method for Obtaining the Shape of a Smooth Opaque Object from One View

November 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-232.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-232.pdf

A method will be described for finding the shape of a smooth opaque object from a monocular image, given a knowledge of the surface photometry, the position of the light source and certain auxiliary information to resolve ambiguities. This method is complementary to the use of stereoscopy which relies on matching up sharp detail and will fail on smooth objects. Until now the image processing of single views has been restricted to objects which can meaningfully be considered two-dimensional or bounded by plane surfaces. It is possible to derive a first-order non-linear partial differential equation in two unknowns relating the intensity at the image points to the shape of the objects. This equation can be solved by means of an equivalent set of five ordinary differential equations. A curve traced out by solving this set of equations for one set of starting values is called a characteristic strip. Starting one of these strips from each point on some initial curve will produce the whole solution surface. The initial curves can usually be constructed around so-called singular points. A number of applications of this method will be discussed including one to lunar topography and one to the scanning electron microscope. In both of these cases great simplifications occur in the equations. A note on polyhedra follows and a quantitative theory of facial make-up is touched upon. An implementation of some of these ideas on the PDP-6 computer with its attached image-dissector camera at the Artificial Intelligence Laboratory will be described, and also a nose-recognition program.
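
For orientation, the partial differential equation and the five ordinary differential equations mentioned in this abstract can be written out in the notation that later became common for this problem. The symbols are assumptions introduced here and do not appear in the abstract: z(x, y) is the surface height, p and q are its partial derivatives in x and y, E(x, y) is the measured image intensity, R(p, q) is the reflectance function determined by the surface photometry and light source position, and s is a parameter along the strip. The thesis itself may use different notation and scaling.

```latex
% Image intensity as a function of surface gradient (first-order, non-linear PDE in z)
E(x, y) = R(p, q), \qquad p = \frac{\partial z}{\partial x}, \quad q = \frac{\partial z}{\partial y}

% Equivalent set of five ordinary differential equations along a characteristic strip
\frac{dx}{ds} = R_p, \qquad
\frac{dy}{ds} = R_q, \qquad
\frac{dz}{ds} = p\,R_p + q\,R_q, \qquad
\frac{dp}{ds} = E_x, \qquad
\frac{dq}{ds} = E_y
```

Here subscripts denote partial derivatives. Integrating these five equations from one set of starting values (x, y, z, p, q) traces out a single characteristic strip.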

AITR-231

Author[s]: Patrick H. Winston

Learning Structural Descriptions from Examples

September 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-231.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-231.pdf

The research here described centers on how a machine can recognize concepts and learn concepts to be recognized. Explanations are found in computer programs that build and manipulate abstract descriptions of scenes such as those children construct from toy blocks. One program uses sample scenes to create models of simple configurations like the three-brick arch. Another uses the resulting models in making identifications. Throughout emphasis is given to the importance of using good descriptions when exploring how machines can come to perceive and understand the visual environment.

AITR-230

Author[s]: Arnold K. Griffith

Computer Recognition of Prismatic Solids

August 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-230.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-230.pdf

An investigation is made into the problem of constructing a model of the appearance to an optical input device of scenes consisting of plane-faced geometric solids. The goal is to study algorithms which find the real straight edges in the scenes, taking into account smooth variations in intensity over faces of the solids, blurring of edges and noise. A general mathematical analysis is made of optimal methods for identifying the edge lines in figures, given a raster of intensities covering the entire field of view. There is given in addition a suboptimal statistical decision procedure, based on the model, for the identification of a line within a narrow band on the field of view given an array of intensities from within the band. A computer program has been written and extensively tested which implements this procedure and extracts lines from real scenes. Other programs were written which judge the completeness of extracted sets of lines, and propose and test for additional lines which had escaped initial detection. The performance of these programs is discussed in relation to the theory derived from the model, and with regard to their use of global information in detecting and proposing lines.

AITR-229

Author[s]: Wendel Terry Beyer

Recognition of Topological Invariants by Iterative Arrays

October 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-229.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-229.pdf

A study is made of the recognition and transformation of figures by iterative arrays of finite state automata. A figure is a finite rectangular two-dimensional array of symbols. The iterative arrays considered are also finite, rectangular, and two-dimensional. The automata comprising any given array are called cells and are assumed to be isomorphic and to operate synchronously with the state of a cell at time t+1 being a function of the states of it and its four nearest neighbors at time t. At time t=0 each cell is placed in one of a fixed number of initial states. The pattern of initial states thus introduced represents the figure to be processed. The resulting sequence of array states represents a computation based on the input figure. If one waits for a specially designated cell to indicate acceptance or rejection of the figure, the array is said to be working on a recognition problem. If one waits for the array to come to a stable configuration representing an output figure, the array is said to be working on a transformation problem.

AITR-228

Author[s]: Adolfo Guzman-Arenas

Computer Recognition of Three-Dimensional Objects in a Visual Scene

December 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-228.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-228.pdf

Methods are presented (1) to partition or decompose a visual scene into the bodies forming it; (2) to position these bodies in three-dimensional space, by combining two scenes that make a stereoscopic pair; (3) to find the regions or zones of a visual scene that belong to its background; (4) to carry out the isolation of objects in (1) when the input has inaccuracies. Running computer programs implement the methods, and many examples illustrate their behavior. The input is a two-dimensional line-drawing of the scene, assumed to contain three-dimensional bodies possessing flat faces (polyhedra); some of them may be partially occluded. Suggestions are made for extending the work to curved objects. Some comparisons are made with human visual perception. The main conclusion is that it is possible to separate a picture or scene into the constituent objects exclusively on the basis of monocular geometric properties (on the basis of pure form); in fact, successful methods are shown.

AITR-227

Author[s]: Eugene Charniak

CARPS: A Program which Solves Calculus Word Problems

July 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-227.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-227.pdf

A program was written to solve calculus word problems. The program, CARPS (CALculus Rate Problem Solver), is restricted to rate problems. The overall plan of the program is similar to Bobrow’s STUDENT, the primary difference being the introduction of “structures” as the internal model in CARPS. Structures are stored internally as trees. Each structure is designed to hold the information gathered about one object. A description of CARPS is given by working through two problems, one in great detail. Also included is a critical analysis of STUDENT.

AITR-226

Author[s]: Joel Moses

Symbolic Integration

September 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-226.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-226.pdf

SIN and SOLDIER are heuristic programs in LISP which solve symbolic integration problems. SIN (Symbolic INtegrator) solves indefinite integration problems at a difficulty approaching that of the larger integral tables. SIN contains several more methods than are used in the previous symbolic integration program SAINT, and solves most of the problems attempted by SAINT in less than one second. SOLDIER (SOLution of Ordinary Differential Equations Routine) solves first order, first degree ordinary differential equations at the level of a good college sophomore and at an average of about five seconds per problem attempted. The differences in philosophy and operation between SAINT and SIN are described, and suggestions for extending the work presented are made.

AITR-225

Author[s]: Allen Forte

Syntax-Based Analytic Reading of Musical Scores

April 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-225.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-225.pdf

As part of a larger research project in musical structure, a program has been written which “reads” scores encoded in an input language isomorphic to music notation. The program is believed to be the first of its kind. From a small number of parsing rules the program derives complex configurations, each of which is associated with a set of reference points in a numerical representation of a time-continuum. The logical structure of the program is such that all and only the defined classes of events are represented in the output. Because the basis of the program is syntactic (in the sense that parsing operations are performed on formal structures in the input string), many extensions and refinements can be made without excessive difficulty. The program can be applied to any music which can be represented in the input language. At present, however, it constitutes the first stage in the development of a set of analytic tools for the study of so-called atonal music, the revolutionary and little understood music which has exerted a decisive influence upon contemporary practice of the art. The program and the approach to automatic data-structuring may be of interest to linguists and scholars in other fields concerned with basic studies of complex structures produced by human beings.

AITR-224

Author[s]: Adolfo Guzman-Arenas

Some Aspects of Pattern Recognition by Computer

February 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-224.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-224.pdf

A computer may gather a lot of information from its environment in an optical or graphical manner. A scene, as seen for instance from a TV camera or a picture, can be transformed into a symbolic description of points and lines or surfaces. This thesis describes several programs, written in the language CONVERT, for the analysis of such descriptions in order to recognize, differentiate and identify desired objects or classes of objects in the scene. Examples are given in each case. Although the recognition might be in terms of projections of 2-dim and 3-dim objects, we do not deal with stereoscopic information. One of our programs (Polybrick) identifies parallelepipeds in a scene which may contain partially hidden bodies and non-parallelepipedic objects. The program TD works mainly with 2-dimensional figures, although under certain conditions it successfully identifies 3-dim objects. Overlapping objects are identified when they are transparent. A third program, DT, works with 3-dim and 2-dim objects, and does not identify objects which are not completely seen. Important restrictions and suppositions are: (a) the input is assumed perfect (noiseless), and in a symbolic format; (b) no perspective deformation is considered. A portion of this thesis is devoted to the study of models (symbolic representations) of the objects we want to identify; different schemes, some of them already in use, are discussed. Focusing our attention on the more general problem of identification of general objects when they substantially overlap, we propose some schemes for their recognition, and also analyze some problems that are met.

AITR-223

Author[s]: William A. Martin

Symbolic Mathematical Laboratory

January 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-223.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-223.pdf

A large computer program has been developed to aid applied mathematicians in the solution of problems in non-numerical analysis which involve tedious manipulations of mathematical expressions. The mathematician uses typed commands and a light pen to direct the computer in the application of mathematical transformations; the intermediate results are displayed in standard text-book format so that the system user can decide the next step in the problem solution. Three problems selected from the literature have been solved to illustrate the use of the system. A detailed analysis of the problems of input, transformation, and display of mathematical expressions is also presented.

AITR-222

Author[s]: Lewis Mark Norton

ADEPT: A Heuristic Program for Proving Theorems of Group Theory

September 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-222.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-222.pdf

A computer program, named ADEPT (A Distinctly Empirical Prover of Theorems), has been written which proves theorems taken from the abstract theory of groups. Its operation is basically heuristic, incorporating many of the techniques of the human mathematician in a “natural” way. This program has proved almost 100 theorems, as well as serving as a vehicle for testing and evaluating special-purpose heuristics. A detailed description of the program is supplemented by accounts of its performance on a number of theorems, thus providing many insights into the particular problems inherent in the design of a procedure capable of proving a variety of theorems from this domain. Suggestions have been formulated for further efforts along these lines, and comparisons with related work previously reported in the literature have been made.

AITR-221

Author[s]: Warren Teitelman

PILOT: A Step Toward Man-Computer Symbiosis

September 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-221.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-221.pdf

PILOT is a programming system constructed in LISP. It is designed to facilitate the development of programs by easing the familiar sequence: write some code, run the program, make some changes, write some more code, run the program again, etc. As a program becomes more complex, making these changes becomes harder and harder because the implications of changes are harder to anticipate. In the PILOT system, the computer plays an active role in this evolutionary process by providing the means whereby changes can be effected immediately, and in ways that seem natural to the user. The user of PILOT feels that he is giving advice, or making suggestions, to the computer about the operation of his programs, and that the system then performs the work necessary. The PILOT system is thus an interface between the user and his program, monitoring both the requests of the user and the operation of his program. The user may easily modify the PILOT system itself by giving it advice about its own operation. This allows him to develop his own language and to shift gradually onto PILOT the burden of performing routine but increasingly complicated tasks. In this way, he can concentrate on the conceptual difficulties in the original problem, rather than on the niggling tasks of editing, rewriting, or adding to his programs. Two detailed examples are presented. PILOT is a first step toward computer systems that will help man to formulate problems in the same way they now help him to solve them. Experience with it supports the claim that such “symbiotic systems” allow the programmer to attack and solve more difficult problems.

AITR-220

Author[s]: Bertram Raphael

SIR: A Computer Program for Semantic Information Retrieval

June 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-220.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-220.pdf

SIR is a computer system, programmed in the LISP language, which accepts information and answers questions expressed in a restricted form of English. This system demonstrates what can reasonably be called an ability to “understand” semantic information. SIR’s semantic and deductive ability is based on the construction of an internal model, which uses word associations and property lists, for the relational information normally conveyed in conversational statements. A format-matching procedure extracts semantic content from English sentences. If an input sentence is declarative, the system adds appropriate information to the model. If an input sentence is a question, the system searches the model until it either finds the answer or determines why it cannot find the answer. In all cases SIR reports its conclusions. The system has some capacity to recognize exceptions to general rules, resolve certain semantic ambiguities, and modify its model structure in order to save computer memory space. Judging from its conversational ability, SIR is a first step toward intelligent man-machine communication. The author proposes a next step by describing how to construct a more general system which is less complex and yet more powerful than SIR. This proposed system contains a generalized version of the SIR model, a formal logical system called SIR1, and a computer program for testing the truth of SIR1 statements with respect to the generalized model by using partial proof procedures in the predicate calculus. The thesis also describes the formal properties of SIR1 and how they relate to the logical structure of SIR.

AITR-219

Author[s]: Daniel G. Bobrow

Natural Language Input for a Computer Problem Solving System

September 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AITR-219.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AITR-219.pdf

The STUDENT problem solving system, programmed in LISP, accepts as input a comfortable but restricted subset of English which can express a wide variety of algebra story problems. STUDENT finds the solution to a large class of these problems. STUDENT can utilize a store of global information not specific to any one problem, and may make assumptions about the interpretation of ambiguities in the wording of the problem being solved. If it uses such information or makes any assumptions, STUDENT communicates this fact to the user. The thesis includes a summary of other English language question-answering systems. All these systems, and STUDENT, are evaluated according to four standard criteria. The linguistic analysis in STUDENT is a first approximation to the analytic portion of a semantic theory of discourse outlined in the thesis. STUDENT finds the set of kernel sentences which are the base of the input discourse, and transforms this sequence of kernel sentences into a set of simultaneous equations which form the semantic base of the STUDENT system. STUDENT then tries to solve this set of equations for the values of requested unknowns. If it is successful it gives the answers in English. If not, STUDENT asks the user for more information, and indicates the nature of the desired information. The STUDENT system is a first step toward natural language communication with computers. Further work on the semantic theory proposed should result in much more sophisticated systems.

AIM-218

Author[s]: Michael Beeler

Information Theory and the Game of JOTTO

August 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-218.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-218.pdf

The word game, JOTTO, has attracted the interest of several computer programmers over the years, not to mention countless devoted players. Rules are: 1.) Each of 2 players selects a 5-letter English word, or a proper noun, as his "secret word." 2.) Play consists of alternate turns of naming a "test word," whose constraints are the same as on the secret words, and the opponent answering how close the test word is to his secret word. 3.) Closeness is measured in jots; each jot is a one-to-one letter match, and independent of which word is the test word. GLASS versus SMILE or SISSY is 2 jots. 4.) The first player to guess his opponent's secret word wins. Constraints on a JOTTO program are: First, it must have a dictionary of all possible words at the outset of each game. (The modification of adding newly experienced words to its dictionary is trivial in practice and not worth the programming efforts, especially since one wants to avoid adding word-like typing errors, etc.) The (unacceptable) alternative is to have a letter-deducing algorithm and then a "word-proposer" to order the 5 factorial = 120 combinations (perhaps based on digram frequencies and vowel constraints) once all 5 letters are found. Second, the most use the program can make of the jots from a given test word is to eliminate from its list of "possible secret words of opponent" all those which do not have that number of jots against that test word. Hence, each test word should be chosen to maximize the expected information derived.

AIM-217

Author[s]: W.W. Bledsoe, Robert S. Boyer and William H. Henneman

Computer Proofs of Limit Theorems

June 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-217.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-217.pdf

In this paper we describe some relatively simple changes that have been made to an existing automatic theorem proving program to enable it to prove efficiently a number of the limit theorems of elementary calculus. These changes include subroutines of a general nature which apply to all areas of analysis, and a special "limit-heuristic" designed for the limit theorems of calculus.

AIM-216

Author[s]: Mitchell Wand

Theories, Pre-Theories and Finite State Transformations on Trees

May 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-216.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-216.pdf

The closure of an algebra is defined as a generalization of the semigroup of a finite automaton. Pretheories are defined as a subclass of the closed algebras, and the relationship between pretheories and the algebraic theories of Lawvere [1963] is explored. Finally, pretheories are applied to the characterization problem of finite state transformations on trees, solving an open problem of Thatcher [1969].

AIM-215A

Author[s]: Mark Dowson

Instant TJ6. How to Get the System to Type Your Papers

September 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-215a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-215a.pdf

TJ6 is a program that takes disk files of text and arranges them so that they can be printed out neatly on 8 1/2 by 11 paper, lines justified, pages numbered, and so on. So that TJ6 will know what to do, you must insert instructions to it in your file. AI Memo No 164 A fully describes TJ6 and lists all the instructions available. This note describes a useful subset of the instructions to get you started.

AIM-215

Author[s]: Mark Dowson

How to Get onto the System

April 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-215.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-215.pdf

This memo is intended to get very new users started on the MAC AI system. It presents some simple rituals for making and editing files, getting print outs, making microtapes, and so on. Most of the rituals given are not the only ways of doing something or even necessarily the simplest, but they do work. Some sources of more detailed documentation are referenced at the end of this memo; read them when you want to know more. If you don't understand something or need any kind of help, ask. No one minds; they all know how you feel.

AIM-214

Author[s]: Peter Samson

Linking Loader for MIDAS

March 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-214.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-214.pdf

This memo was originally printed as MAC Memo 268, January 31, 1966. The MIDAS Linking Loader is a PDP-6 program to load relocatable format output from the MIDAS assembler, with facilities to handle symbolic cross-reference between independently assembled programs. Although it is arranged primarily to load from DECtape, the loader is able to load paper-tape relocatable programs.

AIM-213

Author[s]: Gordon Mumma and Stephen Smoliar

The Computer as a Performing Instrument

February 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-213.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-213.pdf

This memo was originally presented as a Project MAC seminar on February 20, 1970. From the outset, the computer has established two potential roles in the musical arts: the one as a sound synthesizer and the other as a composer (or composer's assistant). The most important developments in synthesis have been due to Max Mathews at the Bell Telephone Laboratories [7]. His MUSIC V system endows a computer with most of the capabilities of the standard hardware of electronic music. Its primary advantage is that the user may specify arbitrarily complex sound sequences and achieve them with a minimum of editing effort. Its primary disadvantage is that it is not on-line, so that the user loses that critical sense of immediacy which he, as a composer, may deem valuable.

AIM-211

Author[s]: Michael Stewart Paterson

Equivalence Problems in a Model of Computation

August 1967 (Issued November 1970)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-211.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-211.pdf

A central problem in the mathematical theory of computers and computation is to find a suitable framework for expressing the execution of a computer program by a computer. Within the framework we want to be able to provide answers to such questions as: (1) Does a certain program perform a certain task? (2) Are two programs equivalent, i.e., do they perform the same task? (3) Under what conditions, if at all, will a program fail to halt? (4) How can a given program be simplified, in some sense, or made more efficient? These kinds of questions are customarily answered by experienced intuition, for simple programs, supplemented by trial and, often, error for more complicated ones. We should like to replace such methods by a formalizable procedure, capable of being carried out by a computer program.

AIM-210

Author[s]: Jeffrey P. Golden

A User's Guide to the A.I. Group LISCOM LISP Compiler: Interim Report

December 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-210.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-210.pdf

The LISCOM version of the AI group PDP/6 LISP compiler is a descendant of the original Greenblatt-Nelson compiler, and is a friendly sibling to the COMPLR version maintained by Jon L. White. The compiler operates in two passes to translate LISP code into LAP code. The first pass performs a general study of the S-expression function definition which is to be compiled, producing as output a modified S-expression and various tables attached to free variables. The second pass does the actual compilation (generation of assembly code), making use of the transformations performed and the information gathered by the first pass. The LISCOM version of the compiler is being used as a vehicle for the implementation of "fast arithmetic" in LISP. This work is being done under the auspices of the MATHLAB project of the AI Laboratory. The early stages of the compiler implementation were handled by W. Diffie, and the work has been continued by the present author.

AIM-208

Author[s]: Carl Hewitt

Teaching Procedures in Humans and Robots

September 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-208.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-208.pdf

Analysis of the structure of procedures is central to the foundations of problem solving. In this paper we explore three principal means for teaching procedures: telling, canned loops, and procedural abstraction. The most straightforward way to teach a procedure is by telling how to accomplish it in a high level goal-oriented language. In the method of canned loops the control structure that is needed for the procedure is supposed and the procedure is deduced. In the method of procedural abstraction the procedure is abstracted from protocols of the procedure on examples.

AIM-207

Author[s]: Carl E. Hewitt

More Comparative Schematology

August 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-207.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-207.pdf

Schemas are programs in which some of the function symbols are uninterpreted. In this paper we compare classes of schemas in which various kinds of constraints are imposed on some of the function symbols. Among the classes of schemas compared are program, recursive, hierarchical and parallel.

AIM-206

Author[s]: Thomas O. Binford

The Vision Laboratory: Part One

July 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-206.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-206.pdf

Some of the facilities for vision programming are discussed in the format of a user's manual.

AIM-205

Author[s]: David S. Johnson

Look-Ahead Strategies in One Person Games with Randomly Generated Game Trees

July 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-205.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-205.pdf

A random method for generating binary trees is presented, and two forms of a class of one-person games called "Tree Solitaire", which have such trees as their game trees, are defined. After a discussion of what "look-ahead strategy" means in terms of such games, a theorem on the most efficient use of unlimited look-ahead is proved, and a collection of strategies involving 0, 1, or 2 look-ahead per move is introduced. A method involving diagrams is presented for calculating the probability of winning under the various strategies over a restricted class of games. The superiority of one of the 1 look-ahead strategies over the other is proved for games of the first form on this restricted class. For games of the second form of this class, all the introduced strategies have their chances of winning calculated, and these results are compared among themselves, with the result for the first form of the game, and with the results of Monte Carlo estimation of the chance of winning in a particular case. An approximate method for evaluating strategies from any given position is introduced, used to explain some of the previous results, and used to suggest modifications of strategies already defined, which are then evaluated by Monte Carlo methods. Finally, variants on Tree Solitaire are suggested, their general implications are discussed, and using the methods already developed one of the most suggestive variants is studied; the results show a significant reversal from those of the original game, which is explained by the difference in the games in one particular.

AIM-204

Author[s]: Martin Rattner

Extending Guzman's SEE Program

July 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-204.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-204.pdf

Adolfo Guzman's SEE program groups the regions of a two-dimensional scene into bodies, using local evidence in the scene to link regions together. This paper discusses an extended version of the SEE procedure that makes extensive use of evidence in the scene which indicates that two regions should be split into separate bodies. The new procedure is better in several ways: 1) it correctly analyzes many scenes for which SEE makes mistakes; 2) it can interact with a higher-level object-recognizing program; 3) it can provide alternative solutions on demand.

AIM-203

Author[s]: Gerald Sussman and Terry Winograd

Micro-Planner Reference Manual

July 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-203.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-203.pdf

Micro-Planner is an implementation of a subset of Carl Hewitt's language, PLANNER, by Gerald Jay Sussman, Terry Winograd, and Eugene Charniak on the AI group computer in LISP. Micro-Planner is now a publicly accessible systems program in the AI group system ITS. The current version of Micro-Planner, embedded in an allocated LISP, may be obtained by incanting ':PLNR' or 'PLNR' to DDT. Micro-Planner is also available as EXPR code or LAP code. All questions, suggestions, or comments about Micro-Planner should be directed to Gerald Jay Sussman (login name GJS) who will maintain the program.

AIM-203A

Author[s]: Gerald Jay Sussman, Terry Winograd and Eugene Charniak

Micro-Planner Reference Manual (Update)

December 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-203a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-203a.pdf

This is a manual for the use of the Micro-Planner interpreter, which implements a subset of Carl Hewitt's language, PLANNER, and is now available for use by the Artificial Intelligence Group.

AIM-202

Author[s]: Michael Beeler

Peter Samson's Music Processor, BIG

July 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-202.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-202.pdf

The contents of this memo are: commands which create a name, commands which create music, playing commands, plotting commands, general utility commands, debugging commands (in relation to relics of the past, features you might hope to live to see), error comments and a final appendix--MUSCOM.

AIM-201

Author[s]: Michael S. Paterson and Carl E. Hewitt

Comparative Schematology

November 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-201.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-201.pdf

While we may have the intuitive idea of one programming language having greater power than another, or of some subset of a language being an adequate 'core' for that language, we find when we try to formalize this notion that there is a serious theoretical difficulty. This lies in the fact that even quite rudimentary languages are nevertheless 'universal' in the following sense. If the language allows us to program with simple arithmetic or list-processing functions then any effective control structure can be simulated, traditionally by encoding a Turing machine computation in some way. In particular, a simple language with some basic arithmetic can express programs for any partial recursive function. Such an encoding is usually quite unnatural and impossibly inefficient. Thus, in order to carry on a practical study of the comparative power of different languages we are led to banish explicit functions and deal instead with abstract, uninterpreted programs or schemas. What follows is a brief report on some preliminary exploration in this area.

AIM-200

Author[s]: Marvin Minsky and Seymour Papert

1968-1969 Progress Report

1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-200.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-200.pdf

This report mainly summarizes the Project MAC A.I. Group work between July 1968 and June 1969 but covers some work up to February 1970. The work on computer vision is described in detail. This summary should be read in conjunction with last year’s A.I. Group Report which is included at the end of this Memo.

AIM-199

Author[s]: Joel Moses

The Function of FUNCTION in LISP, or Why the FUNARG Problem Should be Called the Environment Problem

June 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-199.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-199.pdf

A problem common to many powerful programming languages arises when one has to determine what values to assign to free variables in functions. Different implementational approaches which attempt to solve the problem are considered. The discussion concentrates on LISP implementations and points out why most current LISP systems are not as general as the original LISP 1.5 system. Readers not familiar with LISP should be able to read this paper without difficulty since we have tried to couch the argument in ALGOL-like terms as much as possible.

AIM-198

Author[s]: Edwin Roger Banks

Cellular Automata

June 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-198.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-198.pdf

This paper presents in order 1) a brief description of the results, 2) a definition of cellular automata, 3) discussion of previous work in this area by Von Neumann and Codd, and 4) details of how the prescribed behaviors are achieved (with computer simulations included in the appendices). The results include showing that a two state cell with five neighbors is sufficient for universality.

AIM-197

Author[s]: Terry Winograd

A Simple Algorithm for Self-Replication

May 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-197.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-197.pdf

A recurrent topic of interest in the theory of automata has been the possibility of self-reproducing automata, particularly those which could reproduce globally through an application of a local algorithm. In such a device, the "growth" at any point would depend at any time only on the local environment, but the overall effect would be the reproduction of complex structures. This paper gives an algorithm of this type (an extension of an algorithm brought to my attention by Professor Fredkin) and examines the conditions under which such replication will occur. The system on which it operates will be defined, and the main theorem on its operation will follow several lemmas.
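One well-known local rule with this replicating behavior is the parity rule, in which each cell becomes the sum modulo 2 of its four nearest neighbors. The sketch below uses that rule purely as an illustration; whether it coincides with the algorithm in the memo is an assumption, and the grid size, wrap-around boundary, and seed pattern are arbitrary choices.

```python
def parity_step(grid):
    """One synchronous step: each cell becomes the XOR of its four neighbors.

    The grid wraps around at the edges (a torus), an assumption made here
    only to keep the example finite.
    """
    rows, cols = len(grid), len(grid[0])
    return [
        [grid[(r - 1) % rows][c] ^ grid[(r + 1) % rows][c]
         ^ grid[r][(c - 1) % cols] ^ grid[r][(c + 1) % cols]
         for c in range(cols)]
        for r in range(rows)
    ]


def make_grid(size, cells):
    """Square grid of zeros with the given (row, col) cells set to 1."""
    grid = [[0] * size for _ in range(size)]
    for r, c in cells:
        grid[r][c] = 1
    return grid


def live_cells(grid):
    return {(r, c) for r, row in enumerate(grid) for c, v in enumerate(row) if v}


SIZE, STEPS = 16, 4  # STEPS is a power of two
seed = {(7, 7), (7, 8), (8, 7)}

grid = make_grid(SIZE, seed)
for _ in range(STEPS):
    grid = parity_step(grid)

# Four copies of the seed, displaced by STEPS cells up, down, left and right.
copies = {((r + dr) % SIZE, (c + dc) % SIZE)
          for r, c in seed
          for dr, dc in [(-STEPS, 0), (STEPS, 0), (0, -STEPS), (0, STEPS)]}
replicated = live_cells(grid) == copies
```

The reason is that the update is linear over arithmetic modulo 2: squaring a sum of commuting shift operators leaves only the squared shifts, so after 2^k steps the configuration is the sum of four copies of the start, each shifted 2^k cells. The copies are clean as long as they do not overlap.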

AIM-196

Author[s]: Lewis Wilson

Hypergeometric Functions in MATHLAB

June 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-196.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-196.pdf

This memo describes some of the important properties and manipulations of Hypergeometric Functions which may be useful in MATHLAB. A convention for representing the function is adopted which is readily adaptable to LISP operation. The most general type of HGF with which we will be concerned is a function of a single variable, x, and is parameterized by an "A" list, of length p, and a "B" list, of length q. The latter consists, in general, of atoms; the argument is usually x, but may also be a simple function of x.
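A function parameterized by two lists in this way can be evaluated numerically from the standard generalized hypergeometric series, in which the n-th term is the product of the rising factorials of the "A" parameters, divided by that of the "B" parameters, times x^n / n!. The sketch below assumes that standard definition. The abstract does not spell out the memo's own representation, so the function name, argument order, and fixed term count are illustrative choices, and the truncated sum is only meaningful where the series converges.

```python
from math import exp, isclose, log


def hypergeometric(a_list, b_list, x, terms=60):
    """Partial sum of the generalized hypergeometric series pFq(A; B; x).

    Each term is obtained from the previous one by the ratio
        prod(a + n for a in A) / prod(b + n for b in B) * x / (n + 1),
    which avoids computing factorials directly.
    """
    total = 0.0
    term = 1.0  # the n = 0 term
    for n in range(terms):
        total += term
        numerator = 1.0
        for a in a_list:
            numerator *= a + n
        denominator = float(n + 1)
        for b in b_list:
            denominator *= b + n
        term *= numerator / denominator * x
    return total


# Familiar special cases, used here as sanity checks:
e_approx = hypergeometric([], [], 1.0)                # 0F0(;;x) = exp(x)
binomial = hypergeometric([2.0], [], 0.5)             # 1F0(a;;x) = (1 - x)**(-a)
log_case = hypergeometric([1.0, 1.0], [2.0], 0.5)     # 2F1(1,1;2;x) = -ln(1-x)/x
```

Passing the two parameter lists as ordinary sequences mirrors the list-based convention the abstract mentions, with p and q implied by the list lengths.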

AIM-195

Author[s]: Thomas L. Jones

INSIM1: A Computer Model of Simple Forms of Learning

April 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-195.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-195.pdf

INSIM1 is a computer program, written in LISP, which models simple forms of learning analogous to the learning of a human infant during the first few weeks of his life, such as learning to suck the thumb and learning to perform elementary hand-eye coordination. The program operates by discovering cause-effect relationships and arranging them in a goal tree. For example, if A causes B, and the program wants B, it will set up A as a subgoal, working backward along the chain of causation until it reaches a subgoal which can be reached directly; i.e. a muscle pull. Various stages of the simulated infant's learning are described.
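The backward search along a chain of causation can be sketched as below. This is a deliberate simplification rather than INSIM1's mechanism: each effect is assumed to have a single known cause, so the goal tree degenerates to a chain, and the event names and the `backward_chain` helper are hypothetical.

```python
def backward_chain(goal, causes, primitives):
    """Work backward from goal to a directly achievable subgoal.

    causes maps an effect to the event known to cause it ("A causes B" is
    stored as causes["B"] = "A"); primitives are events that can be produced
    directly, such as a muscle pull. Returns the events in the order they
    should occur, or None if no directly achievable subgoal is reached.
    """
    chain = [goal]
    seen = {goal}
    while chain[-1] not in primitives:
        cause = causes.get(chain[-1])
        if cause is None or cause in seen:  # unknown cause, or a causal loop
            return None
        chain.append(cause)
        seen.add(cause)
    return list(reversed(chain))


# Hypothetical cause-effect relationships for the thumb-sucking example.
causes = {
    "thumb in mouth": "hand near mouth",
    "hand near mouth": "flex arm",
}
primitives = {"flex arm"}

plan = backward_chain("thumb in mouth", causes, primitives)
```

Allowing several alternative causes per effect would turn the chain into the tree the abstract describes, with the search branching at each subgoal.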

AIM-194

Author[s]: Michael Beeler

Movie Memo

April 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-194.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-194.pdf

This is intended as a brief explanation of how to use the Kodak movie camera in sync with a display.

AIM-192

Author[s]: Richard Orban

Removing Shadows in a Scene

August 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-192.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-192.pdf

This paper describes a LISP function, ERASER, to be used in the process of recognizing objects by a computer. It is a pre-processor to a program called SEE which finds whole bodies in a scene. A short description of SEE and the required data-form for a scene is given. SEE is simulated for five different scenes to demonstrate the effects of shadows on its operation. The function ERASER is explained through its sequence of operations, the heuristics used, and detailed results for test cases. Finally, a "how to use it" section describes the data required to be on the property lists of the vertices in the scene, and the cruft that ERASER puts on these p-lists as it operates.

AIM-190

Author[s]: John L. White

An Interim LISP User's Guide

March 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-190.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-190.pdf

The substance of this memo is to initiate the naïve LISP user into the intricacies of the system at the Project MAC A.I. Lab. It is composed, at this time, of a Progress Report on the development of the LISP system and a few appendices but as such should be nearly adequate to start out a person who understands the basic ideas of LISP, and has understood a minimal part of the LISP 1.5 Primer.

AIM-189

Author[s]: Edwin Roger Banks

Construction of Decision Trees

February 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-189.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-189.pdf

The construction of optimal decision trees for the problem stated within can be accomplished by an exhaustive enumeration. This paper discusses two approaches. The section on heuristic methods gives mostly negative results (e.g., there is no merit factor that will always yield the optimal tests, etc.), but most of these methods do give good results. The section entitled "Exhaustive Enumeration Revisited" indicates some powerful shortcuts that can be applied to an exhaustive enumeration, extending the range of this method.

AIM-188

Author[s]: Manuel Blum, Arnold Griffith and Bernard Neumann

A Stability Test for Configurations of Blocks

February 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-188.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-188.pdf

This work is based on notes provided by Manuel Blum, which are paraphrased in section I, and which contain the examples used in the appendix. The main portion of this report was written by Bernard Neumann, who generalized Blum's results to situations involving friction. The program performing the relevant computation, which appears in the appendix, was written by Arnold Griffith, who compiled this memo.

AIM-187

Author[s]: Marvin Minsky

Form and Content in Computer Science

December 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-187.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-187.pdf

The trouble with computer science today is an obsessive concern with form instead of content. This essay has three parts, suggesting form-content displacements in Theory of Computation, in Programming Languages, and in Education.

AIM-185

Author[s]: Marvin Minsky and Seymour Papert

Proposal to ARPA for Research on Artificial Intelligence at MIT, 1970-1971

December 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-185.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-185.pdf

The MIT Artificial Intelligence Project has a variety of goals all bound together by the search for principles of intelligent behavior. Among our immediate goals are to develop systems with practical applications for: visually-controlled automatic manipulation and physical world problem-solving, machine understanding of natural language text and narrative, and advanced applied mathematics. The long-range goals are concerned with simplifying, unifying and extending the techniques of heuristic programming. We expect the results of our work to: make it easier to write and debug large heuristic programs, develop packaged collections of knowledge about many different kinds of things, leading to programs with more resourcefulness, understanding and "common sense", and identify and sharpen certain principles for programming intelligence.

AIM-184

Author[s]: William Martin

Parsing Key Word Grammars

March 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-184.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-184.pdf

Key word grammars are defined to be the same as context free grammars, except that a production may specify a string of arbitrary symbols. These grammars define languages similar to those used in the programs CARPS and ELIZA. We show a method of implementing the LR(k) parsing algorithm for context free grammars which can be modified slightly in order to parse key word grammars. When this is done, the algorithm can use many of the parsing techniques used in ELIZA. Therefore, the algorithm helps to show the relation between the classical parsers and key word parsers.

AIM-183

Author[s]: Annette Herskovits and Thomas O. Binford

On Boundary Detection

July 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-183.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-183.pdf

A description is given of how edges of prismatic objects appear through a television camera serving as visual input to a computer. Two types of edge-finding predicates are proposed and compared, one linear in intensity, the other non-linear. A statistical analysis of both is carried out, assuming input data distorted by Gaussian noise. Both predicates have been implemented as edge-verifying procedures, i.e., procedures aiming at high sensitivity and limited to looking for edges when approximate location and directions are given. Both procedures have been tried on actual scenes. Of the two procedures the non-linear one emerged as a satisfactory solution to line-verification because it performs well in spite of surface irregularities.

AIM-182

Author[s]: Thomas O. Binford

Display Functions in LISP

March 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-182.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-182.pdf

This note describes a system which compiles various forms of LISP lists and arrays into display commands for the DEC 340 display, and provides supporting functions for scaling, for moving elements in a display, for pot control of certain displays, and for adding elements to and removing elements from the display.

AIM-181

Author[s]: Terry Winograd

PROGRAMMER: A Language for Writing Grammars

November 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-181.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-181.pdf

This memo describes PROGRAMMER, a parser for natural language. It consists of a language for writing grammars in the form of programs, and an interpreter which can use these grammars to parse sentences. PROGRAMMER is one part of an integrated system being written for the computer comprehension of natural language. The system will carry on a discourse in English, accepting data statements, answering questions, and carrying out commands. It has a verbally integrated structure, to perform parsing, semantic analysis, and deduction concurrently, and to use the results of each to guide the course of the entire process. This interaction is possible because all three aspects are written in the form of programs. This will allow the system to make full use of its "intelligence" (including non-linguistic knowledge about the subject being discussed) in interpreting the meaning of sentences.

AIM-180

Author[s]: Joel Moses

The Integration of a Class of Special Functions with the Risch Algorithm

September 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-180.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-180.pdf

We indicate how to extend the Risch algorithm to handle a class of special functions defined in terms of integrals. Most of the integration machinery for this class of functions is similar to the machinery in the algorithm which handles logarithms. A program embodying much of the extended integration algorithm has been written. It was used to check a table of integrals and it succeeded in finding some misprints in it.

AIM-179

Author[s]: B.K.P. Horn

The Arithmetic-Statement Pseudo-Ops: .I and .F

August 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-179.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-179.pdf

This is a feature of MIDAS which facilitates the rapid writing and debugging of programs involving much numerical calculation. The statements used are ALGOL-like and easy to interpret.

AIM-178

Author[s]: B.K.P. Horn

The Image Dissector "Eyes"

August 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-178.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-178.pdf

This is a collection of data on the construction, operation and performance of the two image dissector cameras. Some of this data is useful in deciding whether certain shortcomings are significant for a given application and, if so, how to compensate for them.

AIM-177

Author[s]: H.N. Mahabala

Preprocessor for Programs which Recognize Scenes

August 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-177.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-177.pdf

A visual scene is transformed from a very simple and convenient format, to an internal format which describes the same scene, but is more akin to complex manipulations. This format is compatible with programs like "SEE". The entire analysis is done using a basic primitive which gives the orientation of a point with respect to a directed line. A novel handling of inaccuracies in the scene is achieved by considering the lines to be stripes of small but negligible width. The criterion is very general and easy to modify.
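
A primitive of the kind mentioned here, giving the orientation of a point with respect to a directed line, is commonly computed from the sign of a 2D cross product. The sketch below is one such formulation, not the memo's own code; treating a line as a thin stripe is modeled by a `half_width` tolerance, whose name and default value are assumptions.

```python
import math


def orientation(p, a, b, half_width=0.0):
    """Classify point p relative to the directed line from a to b.

    Returns +1 if p lies to the left, -1 if it lies to the right, and 0 if it
    falls within `half_width` of the line (the line is treated as a stripe).
    Points are (x, y) tuples.
    """
    dx, dy = b[0] - a[0], b[1] - a[1]
    length = math.hypot(dx, dy)
    if length == 0.0:
        raise ValueError("a and b must be distinct points")
    # Signed perpendicular distance of p from the line through a and b.
    distance = (dx * (p[1] - a[1]) - dy * (p[0] - a[0])) / length
    if distance > half_width:
        return 1
    if distance < -half_width:
        return -1
    return 0
```

For example, with the line running from (0, 0) to (1, 0), a point at (0, 1) is on the left, a point at (0, -1) is on the right, and a point at (0.5, 0.01) counts as on the line when `half_width` is 0.05.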

AIM-176

Author[s]: Patrick Winston

Discovering Good Regions for Teitelman's Character Recognition Scheme

May 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-176.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-176.pdf

Warren Teitelman presented a novel scheme for real time character recognition in his master's thesis submitted in June of 1963. A rectangle, in which a character is to be drawn, is divided into two parts, one shaded and the other unshaded. Using this division a computer converts characters into ternary vectors in the following way. If a pen enters the shaded region, a 1 is added to the vector. When the unshaded region is entered, a 0 is appended. Figure 1 illustrates the basic idea he used. Thus, with the shading shown, the character V is converted to 1 0 x 1 0.* A V drawn without lifting the pen would yield a 1 0 1. A t gives 1 0 w 1, and so on. Notice that each character may yield several vectors, depending upon the style of the user as well as the division of the rectangle into shaded and unshaded regions. In order to conserve storage space and reduce search time, the character vectors of Teitelman's scheme are stored in a tree-like structure like that shown in figure 2.
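
The encoding described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Teitelman's or Winston's program: strokes are taken to be lists of sampled (x, y) pen positions, the symbol "x" is assumed to mark a pen lift between strokes (inferred from the two V examples), and the `is_shaded` predicate and the sample coordinates are invented.

```python
def encode_character(strokes, is_shaded):
    """Convert pen strokes into a ternary vector of "1", "0" and "x".

    strokes:   list of strokes, each a list of sampled (x, y) pen positions
    is_shaded: function telling whether a point lies in the shaded region
    A "1" is appended when the pen enters the shaded region, a "0" when it
    enters the unshaded region, and an "x" (assumed) for each pen lift.
    """
    vector = []
    for index, stroke in enumerate(strokes):
        if index > 0:
            vector.append("x")  # pen was lifted between strokes
        previous = None
        for point in stroke:
            region = "1" if is_shaded(point) else "0"
            if region != previous:  # the pen has entered a new region
                vector.append(region)
                previous = region
    return vector


def top_half(point):
    """Hypothetical division: the upper half of a unit square is shaded."""
    return point[1] > 0.5


two_stroke_v = [[(0.0, 0.9), (0.25, 0.4), (0.5, 0.1)],
                [(1.0, 0.9), (0.75, 0.4), (0.5, 0.1)]]
one_stroke_v = [[(0.0, 0.9), (0.5, 0.1), (1.0, 0.9)]]
```

With the upper half shaded, `encode_character(two_stroke_v, top_half)` gives 1 0 x 1 0 and `encode_character(one_stroke_v, top_half)` gives 1 0 1, matching the two V examples in the abstract.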

AIM-175

Author[s]: C.K. Chow

On Optimum Recognition Error and Reject Tradeoff

April 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-175.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-175.pdf

The performance of a pattern recognition system is characterized by its error and reject tradeoff. This paper describes an optimum rejection rule and presents a general relation between the error and reject probabilities and some simple properties of the tradeoff in the optimum recognition system. The error rate can be directly evaluated from the reject function. Some practical implications of the results are discussed. Examples in normal distributions and uniform distributions are given.
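
The abstract does not spell out the rejection rule. A commonly cited form, usually called Chow's rule, accepts the most probable class only when its posterior probability is at least 1 - t for a chosen threshold t, and rejects otherwise. The Python sketch below assumes that form; the function name and the dictionary representation of the posteriors are illustrative choices.

```python
def classify_with_reject(posteriors, threshold):
    """Return the most probable class, or None to reject the pattern.

    posteriors: dict mapping class label to posterior probability
    threshold:  rejection threshold t; the pattern is rejected when the
                largest posterior is below 1 - t
    """
    best = max(posteriors, key=posteriors.get)
    if posteriors[best] < 1.0 - threshold:
        return None  # too ambiguous: reject rather than risk an error
    return best
```

Raising the threshold rejects fewer patterns but admits more errors, which is the tradeoff the memo studies.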

AIM-174

Author[s]: Richard D. Greenblatt, Donald E. Eastlake III and Stephen D. Crocker

The Greenblatt Chess Program

April 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-174.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-174.pdf

Since mid-November 1966 a chess program has been under development at the Artificial Intelligence Laboratory of Project MAC at M.I.T. This paper describes the state of the program as of August 1967 and gives some of the details of the heuristics and algorithms employed.

AIM-173

Author[s]: Patrick Winston

A Heuristic Program that Constructs Decision Trees

March 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-173.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-173.pdf

Suppose there is a set of objects, {A, B,…E} and a set of tests, {T1, T2,…TN}. When a test is applied to an object, the result is either T or F. Assume the tests may vary in cost and the objects may vary in probability of occurrence. One then hopes that an unknown object may be identified by applying a sequence of tests. The appropriate test at any point in the sequence in general should depend on the results of previous tests. The problem is to construct a good test scheme using the test cost, the probabilities of occurrence, and a table of test outcomes.
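
To make the problem statement concrete, the following Python sketch computes the minimum expected testing cost by exhaustive search over test orderings. It is not the heuristic program of this memo; the function name, the data layout, and the example numbers are assumptions chosen for illustration.

```python
from functools import lru_cache


def best_expected_cost(outcomes, costs, probs):
    """Minimum expected total test cost needed to identify the unknown object.

    outcomes: outcomes[test][obj] is True or False (the table of test outcomes)
    costs:    costs[test] is the cost of applying that test
    probs:    probs[obj] is the probability of occurrence (summing to 1)
    Returns float("inf") if some objects cannot be told apart by any test.
    """
    tests = tuple(costs)

    @lru_cache(maxsize=None)
    def solve(candidates):
        if len(candidates) <= 1:
            return 0.0  # the object is identified, no further tests needed
        mass = sum(probs[obj] for obj in candidates)
        best = float("inf")
        for test in tests:
            yes = frozenset(obj for obj in candidates if outcomes[test][obj])
            no = candidates - yes
            if not yes or not no:
                continue  # this test does not split the remaining candidates
            # The test is paid for whenever the object is among the candidates.
            best = min(best, costs[test] * mass + solve(yes) + solve(no))
        return best

    return solve(frozenset(probs))


outcomes = {
    "T1": {"A": True, "B": False, "C": False},
    "T2": {"A": False, "B": True, "C": False},
    "T3": {"A": False, "B": False, "C": True},
}
costs = {"T1": 1.0, "T2": 1.0, "T3": 1.0}
probs = {"A": 0.5, "B": 0.25, "C": 0.25}
expected = best_expected_cost(outcomes, costs, probs)
```

In this example the best scheme applies T1 first, since it isolates the most probable object, and needs a second test only half the time, for an expected cost of 1.5.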

AIM-172

Author[s]: Stewart Nelson and Michael Levitt

Robot Utility Functions

February 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-172.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-172.pdf

This document describes a set of routines which have been provided at both the monitor and user level to facilitate the following operations: 1) Vidissector input; 2) Pot Box input; 3) Arm motion; and 4) Display list generation. This program was developed under contract with Systems Concepts, Incorporated.

AIM-171

Author[s]: Adolfo Guzman

Decomposition of a Visual Scene into Three-Dimensional Bodies

January 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-171.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-171.pdf

The program described here takes as its input a collection of lines, vertices and surfaces describing a scene, and analyzes the scene into a composition of three-dimensional objects. The program does not need to know the form (model, or pattern) of the objects which are likely to appear: the scene is not searched for cubes, wedges, or houses, with an a-priori knowledge of the form of these objects; rather, the program pays attention to configurations of surfaces and lines which would make plausible three-dimensional solids, and in this way "bodies" are identified. Partially occluded bodies are handled correctly. The program is restricted to scenes formed by straight lines, where no shadows or noise are present. It has been tested on rather complicated scenes composed of rather simple objects. Examples are given.

AIM-170

Author[s]: John Holloway

WIRElist

January 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-170.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-170.pdf

This memo describes a design aid used for the automatic production of wirelists for machine or hand wiring of wire-cards.

AIM-169

Author[s]: Donald Eastlake III

PEEK and LOCK

November 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-169.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-169.pdf

This memo describes two small utility programs that are of assistance in using the ITS 1.4 (see A.I. 161, MAC-M-377) time sharing system. LOCK performs miscellaneous utility functions while PEEK displays, with periodic updates, various aspects of the time sharing system's status.

AIM-168

Author[s]: Carl Hewitt

PLANNER: A Language for Manipulating Models and Proving Theorems in a Robot

August 1970 (Revised)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-168.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-168.pdf

PLANNER is a language for proving theorems and manipulating models in a robot. The language is built out of a number of problem-solving primitives together with a hierarchical control structure. Statements can be asserted and perhaps later withdrawn as the state of the world changes. Conclusions can be drawn from these various changes in state. Goals can be established and dismissed when they are satisfied. The deductive system of PLANNER is subordinate to the hierarchical control structure in order to make the language efficient. The use of a general-purpose matching language makes the deductive system more powerful. The language is being applied to solve problems faced by a robot and as a semantic base for English.

AIM-167

Author[s]: Marvin Minsky and Seymour Papert

Linear Separation and Learning

October 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-167.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-167.pdf

This is a reprint of page proofs of Chapter 12 of Perceptrons, M. Minsky and S. Papert, MIT Press 1968, (we hope). It replaces A.I. Memo No. 156 dated March 1968. The perceptron and convergence theorems of Chapter 11 are related to many other procedures that are studied in an extensive and disorderly literature under such titles as LEARNING MACHINES, MODELS OF LEARNING, INFORMATION RETRIEVAL, STATISTICAL DECISION THEORY, PATTERN RECOGNITION and many more. In this chapter we will study a few of these to indicate points of contact with the perceptron and to reveal deep differences. We can give neither a fully rigorous account nor a unifying theory of these topics: this would go as far beyond our knowledge as beyond the scope of this book. The chapter is written more in the spirit of inciting students to research than of offering solutions to problems.

AIM-166

Author[s]: Terry Beyer

Recognition of Topological Invariants by Modular Arrays

September 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-166.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-166.pdf

In this paper we study recognition of topological invariant properties of patterns by use of finite, rectangular 2-dimensional, interactive arrays of finite state automata (hereafter called modular arrays). The use of modular arrays as pattern recognition devices has been studied by Atrubin [1] and by Unger [2]. Our aim is to show that modular arrays can not only recognize a large variety of topological invariants, but can do so in times that are almost minimal for a certain class of machines. We begin by describing our model of the modular array as a pattern recognition device. Next, we introduce a fundamental transformation of patterns and prove several interesting properties of the transformation. Finally, we apply the transformation to modular arrays to obtain fast methods of recognizing a wide variety of topological invariants.

AIM-165

Author[s]: Jean-Yves Gresser

Description and Control of Manipulation by Computer-Controlled Arm

September 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-165.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-165.pdf

The immediate purpose of the research on Intelligent Automata is to have an autonomous machine able to understand uncomplicated commands and to manipulate simple objects without human intervention. This thesis is concerned with the programming of a special output device of the present machine existing at Project MAC: an arm with eight degrees of freedom, made of four identical segments. Classical approaches through hill-climbing and optimal control techniques are discussed. However, a new method is proposed to decompose the problem, in an eight-dimensional space, into a sequence of subproblems in spaces with fewer dimensions. Each subproblem can then be solved with simple analytical geometry. A simulation program, which applies this method, is able to propose several configurations for a given goal (expressed as a point in a five-dimensional space).

AIM-164

Author[s]: Larry Krakauer

Producing Memos, Using TJ6, TECO and the Type 37 Teletype

September 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-164.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-164.pdf

This memo describes the TJ6 type justifying program, which can be used in the production of memos, such as this one. In addition, sections III and IV of this memo contain related information about TECO and the type 37 teletype, thus gathering most of the information needed for producing write ups into one location. A sample of input to TJ6 is given in section V, and is in fact the very input used to produce this page of output. The output from TJ6 may be either justified text, with the right margin exactly aligned, as in this introduction, or it may be "filled" text, with the right margin only approximately aligned. Since I do not personally like the appearance of justified text, the remainder of this memo will not be justified, but this decision, of course, rests with each particular user. The sections of this report are: Introduction, using TJ6, Inserting lower case letters into the TECO buffer, How to use a type 37 teletype, and Sample TJ6 input.

AIM-164A

Author[s]: R. Greenblatt, B.K.P. Horn and L.J. Krakauer

The Text-Justifier TJ6

June 1970

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-164a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-164a.pdf

This memo describes the TJ6 type justifying program, which can be used in the production of memos, such as this one. In addition, Appendices 1, 2, and 3 of this memo contain related information about TECO, the "Selectric" and the type 37 teletype, thus gathering most of the information needed for producing write ups into one location. A sample of input to TJ6 is given in section IV and is in fact the very input used to produce this page of output. The output from TJ6 may be either justified text, with the right margin exactly aligned, as in this introduction, or it may be "filled" text, as in this introduction, with the right margin only approximately aligned. The remainder of this memo will be justified. The sections of this memo are: Introduction, Using TJ6, Console operation of TJ6 and Sample TJ6 input. Appendix 1 relates to inserting lower case letters into the TECO buffer, Appendix 2 relates to the "Selectric" output device, and Appendix 3 is how to use a type 37 Teletype.

AIM-163

Author[s]: Patrick H. Winston

Holes

August 1968 (revised April 1970)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-163.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-163.pdf

This memo originally had two parts. The first dealt with certain deficiencies in an early version of Guzman's program, SEE. The problems have been fixed, and the corresponding discussion has been dropped from this memo. The part remaining deals with line drawings of objects with holes.

AIM-162

Author[s]: Marvin Minsky

Remarks on Visual Display and Console Systems

June 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-162.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-162.pdf

This serves as a preliminary draft of Deluxe Picture Maintenance System, June, 1963. It is Technical Memorandum No. 1.

AIM-161

Author[s]: D. Eastlake, R. Greenblatt, J. Holloway, T. Knight and S. Nelson

ITS 1.5 Reference Manual

June 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-161.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-161.pdf

This reference manual consists of two parts. The first (sections 1 through 6) is intended for those who are either interested in the ITS 1.5 time sharing monitor for its own sake or who wish to write machine language programs to run under it. Some knowledge of PDP-6 (or PDP-10) machine language is useful in reading this part. The second part (sections 7, 8, and 9) describes three programs that run under ITS. The first program (DDT) is a modified machine language debugging program that also replaces the "monitor command" level (where the user is typing directly at the monitor) present in most time-sharing systems. The remaining two (PEEK and LOCK) are a status display and a miscellaneous utility program. It should be remembered that the McCulloch Laboratory PDP-6 and PDP-10 installation is undergoing continuous software and hardware development which may rapidly outdate this manual.

AIM-161A

Author[s]: D. Eastlake, R. Greenblatt, J. Holloway, T. Knight and S. Nelson

ITS 1.5 Reference Manual

July 1969

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-161A.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-161A.pdf

This reference manual consists of two parts. The first (sections 1 through 6) is intended for those who are either interested in the ITS 1.5 time sharing monitor for its own sake or who wish to write machine language programs to run under it. Some knowledge of PDP-6 (or PDP-10) machine language is useful in reading this part. The second part (sections 7, 8, and 9) describes three programs that run under ITS. The first program (DDT) is a modified machine language debugging program that also replaces the "monitor command" level (where the user is typing directly at the monitor) present in most time-sharing systems. The remaining two (PEEK and LOCK) are a status display and a miscellaneous utility program. It should be remembered that the McCulloch Laboratory PDP-6 and PDP-10 installation is undergoing continuous software and hardware development which may rapidly outdate this manual.

AIM-160

Author[s]: B.K.P. Horn

Focusing

May 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-160.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-160.pdf

This memo describes a method of automatically focusing the new vidisector (TVC). The same method can be used for distance measuring. Included are instructions describing the use of a special LISP and the required LISP-functions. The use of the vidisectors, as well as estimates of their physical characteristics, is also included, since a collection of such data has not previously been available.

AIM-159

Author[s]: Jayant M. Shah

Numerical Solution of Elliptic Boundary Value Problems by Spline Functions

April 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-159.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-159.pdf

A numerical method for solving linear, two-dimensional elliptic boundary value problems is presented. The method is essentially the Ritz procedure which uses polynomial spline functions to approximate the exact solution. The spline functions are constructed by defining a polynomial function over each of a set of disjoint subdomains and imposing certain compatibility conditions along common boundaries between subdomains. The main advantage of the method is that it does not even require the continuity of the spline functions across the boundaries between subdomains. Therefore it is easy to construct classes of spline functions which will produce any specified rate of convergence.

AIM-158

Author[s]: Joel Moses

SARGE: A Program for Drilling Students in Freshman Calculus Integration Problems

March 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-158.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-158.pdf

The SARGE program is a prototype of a program which is intended to be used as an adjunct to regular classroom work in freshman calculus. Using SARGE, students can type their step-by-step solution to an indefinite integration problem, and can have the correctness of their solution determined by the system. The syntax for these steps comes quite close to normal mathematical notation, given the limitations of typewriter input. The method of solution is pretty much unrestricted as long as no mistakes are made along the way. If a mistake is made, SARGE will catch it and yield an error message. The student may modify the incorrect step, or he may ask the program for advice on how the mistake arose by typing "help". At present the program is weak in generating explanations for mistakes. Sometimes the "help" mechanism will just yield a response which will indicate the way in which the erroneous step can be corrected. In order to improve the explanation mechanism one would need a sophisticated analysis of students' solutions to homework or quiz problems. Experience with the behavior of students with SARGE, which is nil at present, should also help in accomplishing this goal. SARGE is available as SARGE SAVED in T302 2517.

AIM-157

Author[s]: John White

Time-Sharing LISP for the PDP-6

March 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-157.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-157.pdf

This memo, written in the style and convention of A.I. Memo No. 116A, may be considered an addendum thereto. It should prove to be a welcome updating on the LISP system.

AIM-156

Author[s]: Marvin L. Minsky

Linear Decision and Learning Models

March 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-156.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-156.pdf

This memorandum is a first draft of an essay on the simplest "learning" process. Comments are invited. Subsequent sections will treat, among other things: the "stimulus-sampling" model of Estes, relations between Perceptron-type error, reinforcement and Bayesian-type correlation reinforcement and some other statistical methods viewed in the same way.

AIM-155

Author[s]: William A. Martin

A Left to Right then Right to Left Parsing Algorithm

February 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-155.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-155.pdf

Determination of the minimum resources required to parse a language generated by a given context free grammar is an intriguing and yet unsolved problem. It seems plausible that any unambiguous context free grammar could be parsed in time proportional to the length, n, of each input string. Earley (2) has presented an algorithm which parses "many" grammars in time proportional to n, but requires n^2 on some. His work is an extension of Knuth’s method. Knuth’s method fails when more than one alternative must be examined by a push-down automaton making a left to right scan of the input string. Earley’s extension takes all possible alternatives simultaneously without duplication of effort at any given one step. The method presented here continues through the string in order to gain pass, which is made on the symbols accumulated on the stack of the automaton. The algorithm is probably more efficient than Earley’s on certain grammars; it will fail completely on others. The essential idea may be interesting to those attacking the general problem.

AIM-154

Author[s]: Seymour Papert

The Artificial Intelligence of Hubert L. Dreyfus: A Budget of Fallacies

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-154.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-154.pdf

In December 1965 a paper by Hubert Dreyfus revived the old game of generating curious arguments for and against Artificial Intelligence. Dreyfus hit top form in September 1967 with an explanation in the Review of Metaphysics of the philosophically interesting difficulties encountered in constructing robots. The best of these is that a mechanical arm controlled by a digital computer could not reasonably be expected to move fast enough to play ping-pong.

AIM-153

Author[s]: Harold V. McIntosh

REEX: A CONVERT Program to Realize the McNaughton-Yamada Analysis Algorithm

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-153.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-153.pdf

REEX is a CONVERT program, realized in the CTSS-LISP of Project MAC, for carrying out the McNaughton-Yamada analysis algorithm, whereby a regular expression is found describing the words accepted by a finite state machine whose transition table is given. Unmodified, the algorithm will produce 4^n terms representing an n-state machine. This number could be reduced by eliminating duplicate calculations and rejecting on a high level expressions corresponding to no possible path in the same state diagram. The remaining expressions present a serious simplification problem, since empty expressions and null words are generated liberally by the algorithm. REEX treats only the third of these problems, and at that makes simplifications mainly oriented toward removing null words, empty expressions, and expressions of the form XUX*, AuB*A, and others closely similar. REEX is primarily useful to understand the algorithm, but hardly usable for machines with six or more states.

AIM-152

Author[s]: John L. White

PDP-6 LAP

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-152.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-152.pdf

LAP is a LISP FEXPR (or FSUBR when compiled) which is executed primarily for its side effect, namely assembling a symbolic listing into core as a machine language subroutine. As such, it is about the most convenient and rapid way for a LISP user to add machine language primitives to the LISP system, especially if the functions in question are in a developmental stage and are reasonably small (e.g. 1-500 instructions). Also, the LISP compiler currently gives its results as a file of LAP code, which may then be loaded into core by LAP. Virtually any function definition, whether by DEFPROP, LABEL, or LAP is an extension of LISP’s primitives; and as in any actual programming language, the side-effects and global interactions are often of primary importance. Because of this, and because of the inherently broader range of machine instructions and data formats, a function quite easily described and written in PDP-6 machine language may accomplish what is only most painfully and artificially written in LISP. One must, then, consider the total amount of code in each language to accomplish a given task, the amount of commentary necessary to clarify the intent of the task given the program (in this sense, LISP code rates very high; a major benefit of the confines of LISP is that a good program serves as its own comment, and usually needs no further elucidations), and other considerations of programming convenience.

AIM-151

Author[s]: Carl Hewitt

Functional Abstraction in LISP and PLANNER

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-151.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-151.pdf

Presented here is part of the graduate work that I am doing in the much broader area of protocol analysis (see A.I. memo 137). The goal of functional abstraction is to find a procedure that satisfies a given set of fragmentary protocols. Thus functional abstraction is the inverse operation to taking a set of protocols of a routine. The basic technique in functional abstraction (which we shall call IMAGE) is to find a minimal homomorphic image of a set of fragmentary protocols. It is interesting to note that the technique of finding a minimal homomorphic image is the same one used to compute the schematized goal tree in A.I. memo 137. We define (a less than b) to mean that a is erased and b is written in its place. We shall use (a:b) to mean that the value of b is a.

AIM-150

Author[s]: Harold V. McIntosh

CGRU and CONG: CONVERT and LISP Programs to Find the Congruence Relations of a Finite State Machine

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-150.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-150.pdf

CGRU is a CONVERT program, CONG its literal transcription into LISP, realized in the CTSS LISP of Project MAC, for finding all the congruence relations of a finite state machine whose transition table is given as an argument. Central to both programs is the hull construction, which forms the smallest congruence relation containing a given relation. This is done by examining all pairs of equivalent elements to see if their images are equivalent. If they are not, the image classes are joined and the calculation repeated. With the hull program, one starts with the identity relation and proceeds by joining pairs of congruence classes in previously found partitions, and forming the hull in order to see whether a new partition is produced. The process terminates when all such extensions have been tried without producing any new relations.
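
The hull construction and the search built on it can be sketched in Python as follows. This is only an illustration of the procedure described in the abstract, not a transcription of either program: the function names `hull` and `congruences`, the dictionary form of the transition table, and the union-find bookkeeping are all assumptions made for the example.

```python
from itertools import combinations


def hull(delta, pairs):
    """Smallest congruence on the states of `delta` that contains `pairs`.

    `delta` maps each state to a dict {input symbol: next state} and is
    assumed to be complete and deterministic (an assumption of this sketch).
    The result is a partition: a frozenset of frozensets of states.
    """
    parent = {s: s for s in delta}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        parent[ra] = rb
        return True

    for a, b in pairs:
        union(a, b)

    # Examine all pairs of equivalent states; if their images are not
    # equivalent, join the image classes and repeat the calculation.
    changed = True
    while changed:
        changed = False
        for a, b in combinations(delta, 2):
            if find(a) != find(b):
                continue
            for sym in delta[a]:
                if union(delta[a][sym], delta[b][sym]):
                    changed = True

    classes = {}
    for s in delta:
        classes.setdefault(find(s), set()).add(s)
    return frozenset(frozenset(c) for c in classes.values())


def congruences(delta):
    """All congruences reachable by joining two classes and taking the hull."""
    identity = frozenset(frozenset([s]) for s in delta)
    found, todo = {identity}, [identity]
    while todo:
        partition = todo.pop()
        # Pairs that reproduce the partition already found.
        seed = [(next(iter(c)), s) for c in partition for s in c]
        for c1, c2 in combinations(partition, 2):
            joined = seed + [(next(iter(c1)), next(iter(c2)))]
            candidate = hull(delta, joined)
            if candidate not in found:
                found.add(candidate)
                todo.append(candidate)
    return found


# Hypothetical example: one input symbol cycling 0 -> 1 -> 2 -> 3 -> 0.
cycle = {s: {"a": (s + 1) % 4} for s in range(4)}
```

For the four-state cycle, joining states 0 and 2 forces 1 and 3 together as well, which is the kind of propagation the hull step performs.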

AIM-149

Author[s]: Harold V. McIntosh

REC/8: A CONVERT Compiler of REC for the PDP-8

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-149.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-149.pdf

REC/8 is a CONVERT program, realized in the CTSS LISP of Project MAC, for compiling REC expressions into the machine language of the PDP-8 computer. Since the compilation consists in its majority of subroutine calls (to be compiled, after removal of LISP parentheses by MACRO-8) the technique is applicable with trivial modification to any other computer having the subroutine jump and indirect transfer instructions. The purpose of the program is both to compile REC expressions and to illustrate the workings of the REC language, and accordingly a description of this language is given. It contains operators and predicates; flow of control is achieved by parentheses which define subexpressions, colon which implies iteration, and semicolon which terminates the execution of an expression. Predicates pass control to the position following the next colon or semicolon, allowing the execution of alternative expression strings.

AIM-148

Author[s]: Harold V. McIntosh

SUBM: A CONVERT Program for Constructing the Subset Machine Defined by a Transition System

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-148.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-148.pdf

SUBM is a CONVERT program, realized in the CTSS LISP of Project MAC, for constructing the subset machine with the same behaviour as a given transition system. The program interactively collects the six items defining a transition system: its state set, alphabet, transition function, initial states, accepting states and spontaneous transitions. It then computes the subset machine, producing its state set, transition function, initial state and accepting states.
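
The computation described above is the familiar subset construction, and a minimal Python sketch may make the six inputs and four outputs concrete. It is not derived from the SUBM source: the function name `subset_machine`, the argument layout, and the representation of spontaneous transitions as a dictionary are assumptions chosen for illustration (the state set is implied by the other arguments here).

```python
def subset_machine(alphabet, delta, initial, accepting, spontaneous):
    """Build the subset machine of a transition system.

    delta:       dict mapping (state, symbol) -> set of next states
    spontaneous: dict mapping state -> set of states reachable without input
    Returns (states, table, start, accept), where each subset-machine state
    is a frozenset of states of the original system.
    """
    def closure(states):
        # Follow spontaneous transitions until nothing new is reached.
        seen, stack = set(states), list(states)
        while stack:
            s = stack.pop()
            for t in spontaneous.get(s, ()):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
        return frozenset(seen)

    start = closure(initial)
    states, todo, table = {start}, [start], {}
    while todo:
        current = todo.pop()
        for sym in alphabet:
            moved = set()
            for s in current:
                moved |= set(delta.get((s, sym), ()))
            target = closure(moved)
            table[(current, sym)] = target
            if target not in states:
                states.add(target)
                todo.append(target)

    accept = {S for S in states if S & set(accepting)}
    return states, table, start, accept


def accepts(machine, word):
    """Run the deterministic subset machine on a word."""
    _, table, state, accept = machine
    for sym in word:
        state = table[(state, sym)]
    return state in accept


# Hypothetical transition system accepting words over {a, b} that end in "ab".
ends_in_ab = subset_machine(
    alphabet="ab",
    delta={(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}},
    initial={0},
    accepting={2},
    spontaneous={},
)
```

Only subsets reachable from the initial subset are generated, so the result is usually much smaller than the full power set.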

AIM-147A

Author[s]: Eric Osman

DDT Reference Manual

September 1971

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-147a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-147a.pdf

This memo describes the version of DDT used as the command level of the A.I. Laboratory Time Sharing System (ITS). Besides the usual program control, examination, and modification features, this DDT provides many special utility commands. It also has the capability to control several programs for a user and to a single instruction continue mode and interrupt on read or write reference to a given memory location. This memo was prepared with the assistance of Donald E. Eastlake and many others.

AIM-147

Author[s]: Thomas Knight

A Multiple Procedure DDT

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-147.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-147.pdf

This memo describes a version of DDT used as the command level of the A.I. Group PDP-6 Time Sharing System (ITS). Special features include capability to handle multiple jobs, ability to stop upon read or write references to a given location, and the ability of system programs to return command strings to be executed by the DDT.

AIM-146

Author[s]: Roland Silver

PICPAC: A PDP-6 Picture Package

October 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-146.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-146.pdf

PICPAC is a program to be used for manipulating pictures of real-world scenes. It operates under ITS (the Incompatible Time-Sharing System) under control of a simple on-line command language. It includes facilities for reading pictures from either vidissector, for reading and writing them on disk or microtape, and for displaying or plotting them. It also includes focusing and control functions.

AIM-145

Author[s]: William A. Martin

A Fast Parsing Scheme for Hand-Printed Mathematical Expressions

October 19, 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-145.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-145.pdf

A set of one-line text-book-style mathematical expressions is defined by a context free grammar. This grammar generates strings which describe the expressions in terms of mathematical symbols and some simple positional operators, such as vertical concatenation. The grammar rules are processed to abstract information used to drive the parsing scheme. This has been called syntax-controlled as opposed to syntax-directed analysis. The parsing scheme consists of two operations. First, the X-Y plane is searched in such a way that the mathematical characters are picked up in a unique order. Then, the resulting character string is parsed using a precedence algorithm with certain modifications for special cases. The search of the X-Y plane is directed by the particular characters encountered.

AIM-144

Author[s]: Michael Beeler

I/O Test

October 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-144.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-144.pdf

IO TEST is intended as a hardware testing and debugging aid for use with the PDP-6 and its associated input multiplexer (analog to digital converter) and output multiplexer (digital to analog converter). While all characters typed are echoed, only the following have any effect on the program’s operations: F, Y, W, V, B, E, D, S, nT, P A.

AIM-143

Author[s]: Marvin Minsky

Stereo and Perspective Calculations

September 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-143.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-143.pdf

A brief introduction to the use of projective coordinates for hand-eye position computations. Some standard theorems. Appendix A reproduces parts of Roberts’ thesis concerning homogeneous coordinates and matching of perspectively transformed objects. Appendix B by Arnold Griffith derives the stereo calibration formulae using just the invariance of cross-ratios on projections of lines, and he describes a program that uses this.

AIM-142

Author[s]: Peter Samson

STRING

September 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-142.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-142.pdf

This document describes the STRING programming language which has been implemented on the MAC Artificial Intelligence Group's PDP-6 computer. In the STRING system, all objects--constants, variables, functions and programs--are stored and processed in the form of strings of characters. The STRING language is unusually concise, yet at the same time unusually rich in commands, including a strong arithmetic facility.

AIM-141

Author[s]: Stephen Smoliar

EUTERPE-LISP: A LISP System with Music Output

September 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-141.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-141.pdf

EUTERPE (AI Memo no. 129) was designed as a "real-time music program" which would interpret music described as "voice-programs" in DDT. These voice-programs consisted of note words, description of tones to be sounded, and control words which determined the parameters of pitch, tempo, articulation and wave form and allowed for a subroutine feature and transfer within the voice-program. It had been hoped that complex musical forms could be described in terms of a few collections of note words and sequences of control words. However, musical variation and development is more subtle than the developmental power of these control words. Any transformation of musical material may be expressed as a LISP function; therefore, the control words were abandoned and EUTERPE was linked to LISP. The voice-programs would be written and loaded by LISP and played by EUTERPE. The principal function in the system is LOAD which takes two arguments: 1) an absolute location in core and 2) a list of note words. The note words are translated into EUTERPE-readable code and loaded into the proper voice program. The addresses of the first location of each of the six voice programs are SETQed by the system with the names VOICE1, …, VOICE6. The value of LOAD is the next file word in core, so a series of lists may be loaded by bootstrapping.

AIM-140

Author[s]: Marvin Minsky and Seymour Papert

Linearly Unrecognizable Patterns

1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-140.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-140.pdf

The central theme of this study is the classification of certain geometrical properties according to the type of computation necessary to determine whether a given figure has them.

AIM-139

Author[s]: Adolfo Guzman

Decomposition of a Visual Scene into Bodies

September 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-139.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-139.pdf

This memorandum describes a program which finds bodies in a scene, presumably formed by 3-dimensional objects, with some of them perhaps not completely visible.

AIM-138

Author[s]: Michael Speciner

The Calcomp Plotter as an Output Device

July 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-138.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-138.pdf

(1) CHAR PLOT (see AI Memo 125) has been modified for TS. [It may be found on MS4 with the non-TS version]. The following changes should be noted: CRKBRK (now called PLTBRK in the non-TS CHAR PLOT), SUBPLT (which is not needed since PLOTC can be called recursively), PP (ditto), LBUFF and LWBUFF (as the TS system does the buffering) do not exist in the TS version. CRKCHN, now called PLTCHN (in both TS and non-TS versions) does exist. The command 1110 … (go to the effective address at process time) still exists, but in TS return is with "POPJ P," rather than "JRST 12,@PLTBRK". The character codes 0 and 200 (lower case 0) respectively OPEN and CLOSE the plotter. (2) CHARPL SCOPE may soon be also so modified for TS. (3) SCOPE PLOT is unchanged. (4) None of the above TS routines can be used easily at present due to the lack of TS STINK.

AIM-137

Author[s]: Carl Hewitt

PLANNER: A Language for Proving Theorems

July 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-137.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-137.pdf

The following is a description of SCHEMATISE, a proposal for a program that proves very elementary theorems through the use of planning. The method is most easily explained through an example due to Black.

AIM-136

Author[s]: John L. White

Matrix Inversion in LISP

July 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-136.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-136.pdf

Very shortly there will appear on the vision library tape a file named @IAS which is a collection of compiled SUBR's for performing general matrix row reduction and inversions. For an array A a call (IAS A NEW N M) performs Gaussian row reduction on the first N rows of the array A (and in fact operates on only the first M columns); so that if M>N then the N+1st through the Mth columns of the output array contain the solutions to the implicit M-N+1 systems of NxN simultaneous linear equations, while the first N columns contain the inverse matrix of A11 ANN. If NEW is "T" then a new array of size NxM is declared; otherwise the answers are stored directly over the input array and no new array declarations are done. Currently, maximization of pivotal elements is not done; thus IAS will give wrong answers on certain numerically ill-conditioned matrices even though they be non-singular. It is possible to remedy this problem, at some expense, if necessary. IAS also uses a portion of binary program space for temporary storage and may give an error message if not enough space is available.
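
Row reduction without pivot maximization can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the IAS routine: the name `row_reduce`, the list-of-lists array, and the use of exact `Fraction` arithmetic are all choices made for the example, and unlike IAS the sketch returns a new array and does not overwrite the coefficient columns with the inverse (augmenting the input with identity columns yields the inverse instead).

```python
from fractions import Fraction


def row_reduce(a, n, m):
    """Gauss-Jordan reduction of the first n rows and first m columns of `a`.

    The pivot is always taken from the diagonal with no search for a larger
    element, so a zero on the diagonal stops the reduction even when the
    matrix is non-singular.
    """
    a = [[Fraction(x) for x in row[:m]] for row in a[:n]]
    for k in range(n):
        pivot = a[k][k]
        if pivot == 0:
            raise ZeroDivisionError("zero pivot: a row interchange is needed")
        a[k] = [x / pivot for x in a[k]]
        for i in range(n):
            if i != k and a[i][k] != 0:
                factor = a[i][k]
                a[i] = [x - factor * y for x, y in zip(a[i], a[k])]
    return a


# Hypothetical system: 2x + y = 3 and x + 3y = 5, with the right-hand side
# stored as an extra column (n = 2, m = 3).
reduced = row_reduce([[2, 1, 3], [1, 3, 5]], 2, 3)
solution = [row[2] for row in reduced]
```

With exact fractions the missing pivot search shows up only as an outright zero pivot; in floating point the same omission produces the inaccurate answers on ill-conditioned matrices that the abstract warns about.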

AIM-135

Author[s]: M. Blum and C. Hewitt

Automata On a 2-Dimensional Tape

June 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-135.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-135.pdf

This paper explains our approach to the problem of pattern recognition by serial computer. The rudimentary theory of vision presented here lies within the framework of automata theory. Our goal is to classify the types of patterns that can be recognized by an automaton that scans a finite 2-dimensional tape. For example, we would like to know if an automaton can decide whether or not a given pattern on a tape forms a connected region. This paper should be viewed as a Progress Report on work done to date. Our goal now is to generalize the theory presented here and make it applicable to a wide variety of pattern-recognizing machines.

AIM-134

Author[s]: Jim Bowring

PSEG: Standardization of Data

June 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-134.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-134.pdf

PSEG is a function of one argument--a region name which comes from REGIONLIST, as created by TOPOLOGIST. When it is done, the following data structure exists. * indicates that the data was already stored correctly when PSEG got it. REGIONLIST is a list of region names created by TOPOLOGIST. On the property list of each region are the following indicators: TYPE, OUTERBOUNDARY, NUCLEUS, HOLES, holes, NEIGHBORS, SHAPE, VERTIS, and SEGS. VERTEXLIST and SEGMENTLISTs are also discussed.

AIM-133

Author[s]: Russ Abbott

A Glossary of Vision Terms

June 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-133.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-133.pdf

Underlined terms are included in the glossary.

AIM-132

Author[s]: John L. White

Additions to LAP

July 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-132.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-132.pdf

In addition to the description on page 13 of AI Memo 116A, LAP has the following features: Current Assembly Location Reference, Assembly Time Arithmetic, Constants, Multiple Entry Routines, and Defined Machine Operations in LAP. The atom "*" has as its SYM value during assembly an integer which is the current cell address being assembled into. Thus (JRST 0 *) is a well known infinite loop equivalent to A (JRST 0 A). When LAP encounters a non-atomic argument in the position normally occupied by the address part of an instruction, and it is not one of the recognizable forms (QUOTE atom), (E function), or (C constant), then the assembly time values of the members of the list are summed and this is the quantity assigned as address. Thus (JRST 0 (* 1)) is a do-little instruction roughly equivalent to TRA *+1 in FAP.

AIM-131

Author[s]: Arnold K. Griffith

POLYSEG

April 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-131.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-131.pdf

POLYSEG takes as input a list of dotted pairs of numbers. These pairs are assumed to be the co-ordinates of adjacent points along a single closed line. It is further assumed that the x and y co-ordinates of successive points differ by 1, 0, or -1. The output of POLYSEG is a list of dotted pairs of numbers, representing vertices of a polygonal approximation to the figure whose boundary was input. The scale is increased by a factor of four over that of the input, and the output is in fixed or floating point mode, according to the input.

AIM-130

Author[s]: Harold V. McIntosh and Adolfo Guzman

A Miscellany of Convert Programming

April 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-130.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-130.pdf

CONVERT shares with other programming languages the circumstance that it is easier to evaluate the language and to learn its uses if it is possible to scrutinize a representative sample of programs which effect typical but simple and easily understood calculations. Consequently we have gathered together several examples of varying degrees of difficulty in order to show CONVERT in action. In each case the CONVERT program, written as a LISP function ready for execution in CTSS, is shown, together with the results of its application to a small variety of arguments, and a general explanation of the program, its intent, form of its arguments, and method of operation. When the notation CLOCK (()) … CLOCK (T) appears, the time of execution has been determined, and is shown, in tenths of seconds, immediately after the result has been printed. Since there is no particular organization to the selection of examples, we here give a brief catalogue of them.

AIM-129

Author[s]: Stephen Smoliar

EUTERPE: A Computer Language for the Expression of Musical Ideas

April 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-129.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-129.pdf

The electronic medium has vastly increased the amount of material available to the contemporary composer. The various pieces of electronic equipment available today allow one to produce any conceivable sound; yet because of the complex nature of their output, these devices are generally difficult to control and the composer of electronic music may take several hours to prepare but a few minutes of his creation. EUTERPE was designed during the summer of 1966 by Marvin Minsky as a "real-time music program" to be used at a teletype which was a direct link with a digital computer. The program is an interpreter and compiler, basically a translation device to convert symbolic input into internal machine language of a computer. The symbolic input consists of up to six "voice-programs" which are strings of words.

AIM-128

Author[s]: Michael Beeler

Hardware and Program Memo About SERVO

March 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-128.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-128.pdf

SERVO is intended as an engineering and programming analyzing and debugging aid for use with devices connected through the input and output multiplexers to the PDP-6. Channel numbers and values to output, as well as some other numeric arguments, are in octal. Only the frequency of K, N, Q & W, the duration of I & U, and the argument of Z are decimal. Commands are single letters, as follows.

AIM-127A

Author[s]: Roland Silver

LISP Linkage Feature: Incorporating MIDAS into PDP-6 LISP

October 1967 (Revised)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-127a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-127a.pdf

Some PDP-6 LISP users have felt a need for a way to incorporate MIDAS subroutines into LISP. LISP has been changed to let you do this, using files found on the LISP SYSTEM microtape. You write a routine for LISP in much the same way that you write any other MIDAS relocatable subroutine. You must, however, observe the constraints imposed by LISP’s allocation and use of accumulators, and its methods of handling input, output, and interrupts. In addition, you require linkage to LISP before your routine can operate properly: the entry point(s) of the subroutine must be put on the property list(s) of the appropriate atom(s), and the address fields of the instructions pointing to other routines, to list structure, or to other LISP data structures must be set properly. This is done when LISP begins operation, after allocation but before going into its listen loop. We provide eight macros to ease the job of creating such linkages: SUBR, FSUBR, LSUBR, MACRO, QUOTE, E, SPECIAL, and SYM. If you write "SUBR name" at a location a in your routine, LISP will subsequently ascribe the property SUBR to the atom name, with entry location a. Similar remarks apply to the use of FSUBR, LSUBR, and MACRO. The significance and use of the other four macros is perhaps best communicated through examples: 1. An instruction like "MOVEI A,QUOTE(X Y Z)" will be assembled as "MOVEI A,0". At link time, however, LISP will insert the location of the list (X Y Z) into the address field of the instruction. 2. Suppose that the atom FOO has the properties shown in Figure 1. Then the instructions "MOVEI A,QUOTE FOO", "MOVEM B,SPECIAL FOO", "PUSHJ P,SYM FOO", and "CALL E FOO" will each be assembled with a zero address field, which will be modified at link time to be b, c, 106, and 101, respectively.

AIM-127

Author[s]: Roland Silver

Incorporating MIDAS Routines into PDP-6 LISP

March 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-127.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-127.pdf

AIM-126

Author[s]: Joel Moses

A Quick Fail-Safe Procedure for Determining Whether the GCD of 2 Polynomials is 1

March 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-126.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-126.pdf

One of the most widely used routines in an algebraic manipulation system is a polynomial manipulation package (1,2,3). The crucial operation in such routines is the extraction of the Greatest Common Divisor (GCD) of two polynomials. This operation is crucial because of its frequent use and because it is an expensive operation in regard to time and space. Experiments by Collins (1) have shown that given two polynomials chosen at random, the GCD has a high probability of being 1. Taking into account this probability and the cost of obtaining a GCD (some GCDs of polynomials of degree 5 in two or three variables can take on the order of a minute on the 7094 (1)), it appears that a quick method of determining whether the GCD is exactly 1 would be profitable. While no such complete method is known to exist, a fail-safe procedure has been found and is described here. A fail-safe procedure is characterized by the fact that when it comes to a decision (in this case that the GCD is 1), then the decision is correct. However, the conclusion (i.e. that the GCD is 1) may be true, and the procedure need not arrive at a decision regarding it. It is believed that the fail-safe procedure presented here (and its extension to the linear case) will arrive at a decision quite frequently when the GCD is actually 1.
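
The abstract does not say how the memo's procedure works, so the following Python sketch illustrates only the fail-safe idea, using a different and well-known test: reduction modulo a prime. For univariate integer polynomials, if a prime divides neither leading coefficient and the two polynomials have no common factor modulo that prime, they have no non-constant common factor over the integers; if the test finds a common factor modulo the prime, nothing can be concluded. The function names, the coefficient-list representation (highest degree first), and the choice of primes are assumptions of this example.

```python
def gcd_degree_mod_p(f, g, p):
    """Degree of gcd(f, g) over the integers modulo the prime p.

    f and g are lists of integer coefficients, highest degree first, and at
    least one of them is assumed to be non-zero modulo p.
    """
    def trim(a):
        a = [c % p for c in a]
        while a and a[0] == 0:
            a.pop(0)
        return a

    def remainder(a, b):
        a = a[:]
        inverse = pow(b[0], -1, p)  # modular inverse, Python 3.8 or later
        while len(a) >= len(b):
            q = a[0] * inverse % p
            for i in range(len(b)):
                a[i] = (a[i] - q * b[i]) % p
            a.pop(0)  # the leading coefficient is now zero
        return trim(a)

    a, b = trim(f), trim(g)
    while b:
        a, b = b, remainder(a, b)
    return len(a) - 1


def certainly_no_common_factor(f, g, primes=(101, 103, 107)):
    """Return True only when f and g provably share no non-constant factor.

    Returns None when no decision is reached, which is the fail-safe case.
    """
    for p in primes:
        if f[0] % p == 0 or g[0] % p == 0:
            continue  # this prime could hide a drop in degree; skip it
        if gcd_degree_mod_p(f, g, p) == 0:
            return True
    return None
```

Here `certainly_no_common_factor([1, 0, 1], [1, 3])` decides that x^2 + 1 and x + 3 are relatively prime, while for x^2 - 1 and x^2 + x - 2, which share the factor x - 1, it returns no decision.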

AIM-125

Author[s]: Donald Sordillo

CHAR PLOT

March 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-125.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-125.pdf

CHAR PLOT is a routine which enables one to use the CalComp plotter as a versatile output device. It is presently available as CHPLOT BIN (English CHAR PLOT) on tape MS 3. The program CHAR PLOT is normally called by a PUSHJ P,PLOTC with a code representing a command or character (as defined in Appendix I) in accumulator C. Upon calling, the routine will either plot a character or line, or perform an internal control function. A 0 code initializes the routine, erasing any unexecuted (buffered) commands.

AIM-123

Author[s]: Marvin Minsky and Seymour Papert

Computer Tracking of Eye Motions

March 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-123.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-123.pdf

This memo is to explain why the Artificial Intelligence group of Project MAC is developing methods for on-line tracking of human eye movements. It also gives a brief resume of results to date and the next steps.

AIM-122

Author[s]: Marvin Minsky

Remarks on Correlation Tracking

March 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-122.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-122.pdf

The problem is to track the motion of part of a field of view. Let us assume that the scene is a two-dimensional picture in a plane perpendicular to the roll axis. (These simplifying assumptions, of course, are a main problem in estimating how the system works in real life.) So we can think of the picture as a function $f(x,y)$ in some plane. Now suppose that at time $t_0$ the scene is $f_0(x,y)$ and at some time later it has moved, and is $f_t(x,y)$. Suppose also that the scene has not changed, but has only been moved rigidly in the plane. Then an elegant mathematical way to estimate this motion is to compute the cross-correlation of the original and current picture. First let us review a basic simple mathematical fact. Given any function $f(x)$ and any displacement $\Delta$, it is true that $\int f(x) f(x)\,dx \ge \int f(x) f(x+\Delta)\,dx$.
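
The inequality says that a picture correlates with itself most strongly at zero displacement, so the displacement that maximizes the cross-correlation of the original and current pictures estimates the motion. A minimal one-dimensional Python sketch follows; the discrete sampling, the wrap-around (circular) shift, and the names `correlation` and `estimate_shift` are simplifying assumptions of the example, not part of the memo.

```python
def correlation(f, g, shift):
    """Sum of products of f with g displaced by `shift` samples (circular)."""
    n = len(f)
    return sum(f[i] * g[(i + shift) % n] for i in range(n))


def estimate_shift(original, current):
    """Displacement at which the two pictures correlate most strongly."""
    return max(range(len(original)),
               key=lambda d: correlation(original, current, d))


# Hypothetical scan line, and the same line moved rigidly by three samples.
f0 = [0, 1, 4, 9, 2, 0, 0, 0]
ft = [f0[(i - 3) % len(f0)] for i in range(len(f0))]
```

Here `estimate_shift(f0, ft)` recovers the three-sample displacement, because at that shift the sum reduces to the self-correlation of `f0`, which the inequality shows to be the largest possible value.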

AIM-121

Author[s]: Marvin Minsky

Estimating Stereo Disparities

February 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-121.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-121.pdf

An interesting practical and theoretical problem is putting bounds on how much computation one needs to find the stereo-disparity between two narrow-angle stereo scenes. By narrow angle I mean situations wherein the angle subtended by the eyes is a very few degrees: the kind of correlation-disparity method discussed here probably isn’t applicable to the wide-angle stereo we’ll usually use for scene-analysis in the Project. The method we consider is to find the local maximum of local correlation between the left and right scenes, over a range of displacements along the eye-eye axis. Obviously this is a simple-minded method that will fail in certain situations: here we are not interested in bad cases so much as in getting estimates of the minimal computation in the favorable situations. A correlation can be considered as a properly-normalized sum of pairwise products of intensities (or other surface functions). The correlation, for each disparity d, is obtained by using pairs that are d units apart in visual angle, referred to a standard azimuth scale in each eye. One can imagine a scheme in which the pairs are all different in the retinas.

AIM-120

Author[s]: Marvin Minsky

Vision Memo

February 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-120.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-120.pdf

This memo proposes a set of systems programs for vision work. Please comment immediately as we should start on it at once. Values stored outside an array range should have no effect, but set an overflow flag; values read outside a range are zero and also should set a flag. Coordinates normally occur as a dotted pair (x . y) in half words. For display purposes, normally the 10 most significant bits are used, but higher resolution options will be available. To specify a sub-array we have to state its size, location and mesh. All sub-arrays will be square. (Generalizing to rectangles is unwise because the natural generalization for later systems will be projective.)
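
The proposed out-of-range behaviour is easy to state as code. The Python sketch below is only an illustration of the rule as worded in the abstract; the class name `FlaggedArray`, its method names, and the use of a single shared flag for both reads and stores are assumptions of the example.

```python
class FlaggedArray:
    """Square picture array whose out-of-range accesses are harmless.

    Stores outside the range have no effect and reads outside the range
    return zero; either event sets the overflow flag.
    """

    def __init__(self, size):
        self.size = size
        self.cells = [[0] * size for _ in range(size)]
        self.overflow = False

    def _inside(self, x, y):
        return 0 <= x < self.size and 0 <= y < self.size

    def store(self, x, y, value):
        if self._inside(x, y):
            self.cells[y][x] = value
        else:
            self.overflow = True  # no effect on the picture

    def read(self, x, y):
        if self._inside(x, y):
            return self.cells[y][x]
        self.overflow = True
        return 0
```

A caller can therefore scan a window that hangs over the edge of the picture without bounds checks of its own and inspect the flag afterwards.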

AIM-119

Author[s]: Adolfo Guzman

A Primitive Recognizer of Figures in a Scene

January 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-119.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-119.pdf

Given a scene, as seen for instance from a T.V. camera or a picture, it is desired to analyze it to organize, differentiate and identify desired objects or classes of objects (i.e., patterns) in it. The present report describes a program, written in CONVERT, which partially achieves this goal. Two inputs to the program determine its behavior and response: 1. The scene to be analyzed, which is entered in a symbolic format (it may contain 3-dimensional and curved objects). 2. A symbolic description, called the model, of the class of objects we want to identify in the scene (1). Given a set of models for the objects we want to locate, and a scene or picture, the program will identify in it all those objects or figures which are similar to one of the models, provided they appear complete in the picture (i.e., no partial occlusion or hidden parts). Recognition is independent of position, orientation, size etc.; it strongly depends on the topology of the model. Important restrictions and suppositions are: (a) the input is assumed perfect (noiseless) and highly organized; (b) more than one model is, in general, required for the description of one object; and (c) only objects which appear unobstructed are recognized. Work is continuing in order to drop restriction (c) and to improve (a).

AIM-118

Author[s]: Donald E. Eastlake III

PDP-6 Software Update

January 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-118.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-118.pdf

Conventions of this memo: Most numbers written in Arabic numerals are octal while all those written out in English are decimal. Underlining a character and immediately preceding it with a vertical bar indicates the character produced by holding down the control key while striking that character except in the case of 1$ which represents an ALT MODE. Characters not indicatable with the character set used in this memo, or the control of such a character, are described between angle brackets. The string from the open to the close angle bracket should be considered as one character which may be controlled by underlining and preceding with a vertical bar. Lower case letters in a command string usually indicate a possibly optional variable while capital letters or special characters are constant. Note the special conventions involving [cents] in the MACDMP section. Organization of PDP-6 Software: MACDMP is normally used to load system and user machine language programs. If when one approaches the PDP-6 it is not in MACDMP (which is usually displaying a file directory) one should first try starting at location 177400, which is MACDMP's starting address. If this fails be sure a system tape is mounted on drive number one and try reading in at location 0 (see appendix). If that loses try locations 1 and 2. If still unsuccessful try placing a paper tape of MACDMP in the paper tape reader, turning it on, and starting at location 20 (appendix). If all else fails you can conclude that most of memory is clobbered and load a paper tape of MACDMP according to the instructions on the inside of the left door of the first bay of the PDP-6 to the left of the console.

AIM-117

Author[s]: Russell Noftsker

Hardware Memo - Input Multiplexer Status

October 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-117.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-117.pdf

Note: Computer control of Input Multiplexer and Output Sample and Hold is available when clock and test switches on the I/O box are in "Computer Input" and "Computer Output" positions, respectively. Manual operation of the Input Multiplexer and Output Sample and Hold is available when the same switches are in "Clock Mode" and "Test Mode" respectively. In "Test Mode," output commands are derived from input channels 154 through 177 as noted in the current INPUT MULTIPLEXER STATUS. These channels are potentiometer readings from either Joy Stick Console where Pot No. 1 is at the top and No. 10 is consecutively at bottom. See OUTPUT SAMPLE AND HOLD for Output Channel numbers.

AIM-116A

Author[s]: None listed

PDP-6 LISP (LISP 1.5) Revised

April 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-116a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-116a.pdf

This is a mosaic description of PDP-6 LISP, intended for readers familiar with the LISP 1.5 Programmer’s Manual or who have used LISP on some other computer. Many of the features, such as the display, are subject to change. Thus, consult a PDP-6 system programmer for any differences which may exist between LISP of Oct. 14, 1966 and present LISP on the system tape.

AIM-115

Author[s]: Peter Samson

Program Memo about EYE

December 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-115.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-115.pdf

EYE is a program (on the Vision System tape with the name EYE BALL) which displays on the 340 the field of view of the vidisector. The program is controlled by the light pen, which selects various modes and options, and by the box with four pots, to locate the exact area examined.

AIM-114

Author[s]: William Martin

A Step by Step Computer Solution of Three Problems in Non-Numerical Analysis

July 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-114.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-114.pdf

This memo describes the step by step solution of three problems from different fields of applied mathematics. These problems are solved by typing a series of computer commands for the manipulation of symbolic mathematical expressions. These commands are best typed at the PDP-6 console, so that the Type 30 display and the wider range of keyboard symbols can be used. The syntax of commands typed at the PDP-6 will be described. These commands are translated into a string of symbols which are sent to CTSS, where they are parsed into a LISP expression, which is then evaluated. The mathematical operators which are available in the system will be described and then the step by step solution of each of the problems will be given.

AIM-113

Author[s]: Larry Krakauer

A Description of the CNTOUR Program

November 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-113.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-113.pdf

The CNTOUR program plots an intensity relief map of an image which is read from the vidisector camera (TV-B). It may be used as a general purpose aiming, monitoring and focusing program, especially for high-contrast images, for which it produces something like a line drawing.

AIM-113A

Author[s]: Larry Krakauer

CNTOUR

January 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-113a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-113a.pdf

The CNTOUR program plots an intensity relief map of an image which is read from tape, disc, or from either vidisector camera. It is used to examine vidisector images. It may also be used as a general purpose aiming, monitoring and focusing program, especially for high-contrast images, for which it produces something like a line drawing. The program is available both in a time sharing and a non time sharing version.

AIM-112

Author[s]: Donald Sordillo

CHAR PLOT

October 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-112.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-112.pdf

CHAR PLOT is a routine which enables one to use the Calcomp plotter as an output typewriter. This program is stored as CHPLOT BIN [English CHAR PLOT]. In use, a code representing a character or command, as defined in Appendix I, is placed into accumulator C. Upon calling the routine, the plotter will either print a character or set itself into one of several modes. The input to the routine is a word whose 8 low order bits contain a code and whose sign bit must be 0. The routine is entered by MOVE C, [WORD], PUSHJ P, PLOTC. A word=0 stops everything and initiates the system. Note: The program starts off in lower case mode. While it is in this mode any attempt to issue a lower-case code causes the computer to hang up. It is suggested that the first call be used to set the routine to upper case and the 8th bit in the code used to shift between upper and lower cases. The symbols P, C and CRKCHN are global and user-defined. Other symbols are PLOTC (normal entry point), UCTAB (beginning of upper case table), LCTAB (beginning of lower case table), and CLNGTH (routine which returns the length of the character which was its argument in Acc. C).

AIM-111

Author[s]: Leslie Lamport

Summer Vision Programs

October 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-111.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-111.pdf

We assume that we are given a square array that describes a scene. The name of the array will be "array." The number of points representing the side length of the array will be called "pts." (I.e., (pts)^2 is the total number of entries in the array.)

AIM-110

Author[s]: John White

Figure Boundary Description Routines for the PDP-6 Vision Project

September 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-110.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-110.pdf

As a step in the direction of "computer vision," several programs have been written which transform the output of a vidisector into some mathematical descriptions of the boundaries enclosing the objects in the field of view. Most of the discussion concerns the techniques used to transform a sequence of points, presumably representing a curve in the two-dimensional plane of view, into the best-fit conic-curve segment, or best-fit straight line. The resultant output of this stage is a list of such segments, one list for each boundary found.

AIM-109

Author[s]: Donald Sordillo

SCPLOT BIN

October 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-109.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-109.pdf

This program will take a list of display instructions and cause it to be plotted. For further or more detailed information consult with Michael Speciner.

AIM-108

Author[s]: Donald E. Eastlake III

A Primitive Control P Feature

October 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-108.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-108.pdf

A program, some TECO macros, and some small modifications to existing systems software have been written, called PRO, whose purpose is to reduce the large number of control languages and system programs it has been necessary to know about and the large amount of redundant typing it has been necessary to do to effectively use the MAC PDP-6 system. PRO allows a user knowing the command languages of only TECO, DOT, and PRO to effectively edit and debug small absolute programs with a minimum of command typing overhead (systems of this sort are called control P features for historic reasons). The remainder of this memo, which describes PRO and its use in detail, assumes some knowledge of TECO, DOT, and the MAC PDP-6 system. (In this memo the symbol $ always stands for the character ALT MODE.)

AIM-107

Author[s]: Donald Sordillo

Music Playing on the PDP-6

August 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-107.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-107.pdf

This memo describes a process of converting coded music into auditory stimuli on the PDP-6. Attached is a copy of the original specifications for the coding (a PDP-1 memo by Peter Samson).

AIM-106

Author[s]: D. Eastlake

An Input Macro for TECO

September 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-106.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-106.pdf

A macro has been written for TECO that enables one to insert characters into the buffer as they are typed with the entire current page (if not greater than the display screen's height in length) always being displayed. This macro now exists on the MACDMP system tape as a file entitled "CTLP INP".

AIM-105

Author[s]: Tom Knight

Modifications to PDP-6 Teletype Logic

August 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-105.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-105.pdf

The existing teletype logic for the PDP-6 has been modified to accommodate up to four additional teletypes. These were added with a minimum of change to the existing logic, and are easily removable by taking out the cable in 4M2 and replacing the cable in 4M1 with the jumper module.

AIM-104

Author[s]: Jack Holloway

Output to the PDP-6 Calcomp Plotter

August 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-104.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-104.pdf

The plotter on the console of the PDP-6 is currently attached to device number 774, and accepts stepping pulses given under control of a CONO to that device. Its normal mode of operation is to CONO the desired bits on, wait an instruction, and CONO a zero.

AIM-103

Author[s]: John White

Additions to Vision Library

August 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-103.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-103.pdf

Modified LAP: Additions have been made to LAP as described in the PDP-6 write-up.

AIM-102

Author[s]: Gerald Jay Sussman and Adolfo Guzman

Summer Vision Group: A Quick Look at Some of Our Programs

July 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-102.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-102.pdf

no abstract

AIM-101

Author[s]: Richard Greenblatt and Donald A. Sordillo

Sides 21

August 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-101.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-101.pdf

SIDES 21 produces a graph consisting of the locations of lines which comprise the sides of either a geometric solid or a plane figure. The representation is in floating point mode, suitable for subsequent processing. The input is a picture intensity-function.

AIM-100

Author[s]: Seymour Papert

The Summer Vision Project

July 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-100.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-100.pdf

The summer vision project is an attempt to use our summer workers effectively in the construction of a significant part of a visual system. The particular task was chosen partly because it can be segmented into sub-problems which allow individuals to work independently and yet participate in the construction of a system complex enough to be a real landmark in the development of "pattern recognition". The basic structure is fixed for the first phase of work extending to some point in July. Everyone is invited to contribute to the discussion of the second phase. Sussman is coordinator of "Vision Project" meetings and should be consulted by anyone who wishes to participate. The primary goal of the project is to construct a system of programs which will divide a vidisector picture into regions such as likely objects, likely background areas and chaos. We shall call this part of its operation FIGURE-GROUND analysis. It will be impossible to do this without considerable analysis of shape and surface properties, so FIGURE-GROUND analysis is really inseparable in practice from the second goal which is REGION DESCRIPTION. The final goal is OBJECT IDENTIFICATION which will actually name objects by matching them with a vocabulary of known objects.

AIM-99

Author[s]: Adolfo Guzman and Harold McIntosh

CONVERT

June 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-099.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-099.pdf

A programming language is described which is applicable to problems conveniently described by transformation rules. By this we mean that patterns may be prescribed, each being associated with a skeleton, so that a series of such pairs may be searched until a pattern is found which matches an expression to be transformed. The conditions for a match are governed by a code which allows sub-expressions to be identified and eventually substituted into the corresponding skeleton. The primitive patterns and primitive skeletons are described, as well as the principles which allow their elaboration into more complicated patterns and skeletons. The advantages of the language are that it allows one to apply transformation rules to lists and arrays as easily as strings, that both patterns and skeletons may be defined recursively, and that as a consequence programs may be stated quite concisely.

AIM-98

Author[s]: Peter Samson

PDP-6 LISP

June 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-098.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-098.pdf

This is a mosaic description of PDP-6 LISP, intended for readers familiar with the LISP 1.5 Programmer’s Manual or who have used LISP on some other computer. Some of the newer features (e.g. the display) are experimental and subject to change; in such respects this should not be regarded as a final document. Some distinctive characteristics: Top-level type in is to EVAL. There is no EVALQUOTE. EQUAL will not correctly compare fixed-point numbers to floating-point. Also (ZEROP 0.0) is NIL. T and NIL evaluate to T and NIL. There are no *T* and F. Interpreted variables, and variables used free in compiled functions, are automatically SPECIAL and may be used without restriction to communicate values. Also any PROG and LAMBDA variables in a compiled function may be declared SPECIAL, and will be bound and restored correctly. COMMON does not exist. Flags are not allowed; elements on a property list of an atom are expected to be paired. MAP, MAPCAR, etc. assume the first argument is the function, and the second is the list. Defining of functions is usually done with DEFPROP.

AIM-97A

Author[s]: Joel Moses

Symbolic Integration II

October 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-097a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-097a.pdf

In this memo we describe the current state of the integration program originally described in AI Memo 97 (MAC-M-310). Familiarity with Memo 97 is assumed. Some of the algorithms described in that memo have been extended. Certain new algorithms and a simple integration by parts routine have been added. The current program can integrate all the problems which were solved by SAINT and also the two problems which were solved. Due to the addition of a decision procedure the program is capable of identifying certain integrands (such as e^(x^2) or e^x/x) as not integrable in closed form.

AIM-97

Author[s]: Joel Moses

Symbolic Integration

June 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-097.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-097.pdf

A program has been written which is capable of integrating all but two of the problems solved by Slagle's symbolic integration program SAINT. In contrast to SAINT, it is a purely algorithmic program and it has achieved running times two to three orders of magnitude faster than SAINT. This program and some of the basic routines which it uses are described. A heuristic for integration, called the Edge heuristic, is presented. It is claimed that this heuristic with the aid of a few algorithms is capable of solving all the problems solved by the algorithmic program and many others as well.

AIM-96

Author[s]: Adolfo Guzman

POLYBRICK: Adventures in the Domain of Parallelepipeds

May 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-096.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-096.pdf

A collection of programs tries to recognize, each one more successfully than its predecessor, 3-dimensional parallelepipeds (solids limited by 6 planes, parallel two-by-two), using as data 2-dimensional idealized projections. Special attention is given to the last of those programs; the method used is discussed in some detail and, in the light of its successes and failures, a more general one is proposed.

AIM-95

Author[s]: Adolfo Guzman and Harold McIntosh

A Program Feature for CONVERT

April 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-095.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-095.pdf

A program feature has been constructed for CONVERT, closely modeled after the similar facility found in many versions of LISP. Since it is functional or operational in nature, it has been included as a skeleton form, together with a number of related operator skeletons. This Memo describes them, and also the RUL mode, which allows the user to specify arbitrary components of a pattern as the result of a computation performed while the matching process is taking place.

AIM-94

Author[s]: Arnold Griffith

A New Machine-Learning Technique Applied to the Game of Checkers

March 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-094.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-094.pdf

This paper describes a recent refinement of the machine-learning process employed by Samuel (1) in connection with his development of a checker playing program. Samuel's checker player operates in much the same way a human player does: by looking ahead, and by making a qualitative judgment of the strength of the board positions it encounters. A machine learning process is applied to the development of an accurate procedure for making this strength evaluation of board positions. Before discussing my modifications to Samuel's learning process, I should like to describe briefly Samuel's strength evaluation procedure, and the associated learning process.

AIM-93

Author[s]: Robert R. Fenichel and Joel Moses

A New Version of CTSS LISP

February 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-093.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-093.pdf

A new version of the CTSS LISP is now available. The new system provides additional data storage and several new functions and constants. The I/O capabilities, EXCISE, the error comments, and several routines have been improved. Much irrelevant code and many bugs have all been removed. FAP source decks and BOD listings are available. The decks are organized so as to ease the job of assembling private LISP systems in which unneeded features are absent. Without reassembling, the user can create a private LISP system in which the data storage space has been arbitrarily allocated among binary program space, the push-down list, full word space, and free storage.

AIM-92

Author[s]: Michael Levin

Topics in Model Theory

January 1966

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-092.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-092.pdf

The concept of free as in "free group" is generalized to any first order theory. An interesting class of homomorphisms between models is discussed. Relations between model theory and abelian categories are discussed speculatively. This paper represents an incomplete study and may contain serious errors. A knowledge of model theory, and of MIT course 18.892 in particular, is assumed.

AIM-91

Author[s]: Timothy Hart

A Useful Algebraic Property of Robinson's Unification Algorithm

November 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-091.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-091.pdf

This memo presupposes some acquaintance with "A Machine Oriented Logic Based on the Resolution Principle", J.A. Robinson, JACM Jan65. The reader unfamiliar with this paper should be able to get a general idea of the theorem if he knows that OA is a post operator indicating a minimal set of substitutions (most general substitution) necessary to transform all elements of the set of formulae, A, into the same element (to "unify" A), so that when OA exists AOA is a set with one element (a "unit"). Example: A = {f(x), y, f(g(u)), f(g(z))}; OA = {g(u)/x, f(g(u))/y, u/z}; AOA = {f(g(u))}. Another most general unifier of A is {g(z)/x, f(g(z))/y, z/u}.
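
The example in the abstract above can be checked mechanically. The sketch below only verifies that a given substitution unifies the set; it is not Robinson's unification algorithm and not code from the memo. The term encoding (variables as strings, applications as tuples) and the names `substitute`, `apply_to_set`, and `is_unifier` are assumptions chosen for illustration. A pair written t/v in the abstract is read here as "replace variable v by term t".

```python
# Variables are strings; a function application is a tuple (name, arg1, ...).
def substitute(term, sigma):
    """Apply substitution sigma (dict: variable -> term) to a term, in one pass."""
    if isinstance(term, str):
        return sigma.get(term, term)
    name, *args = term
    return (name, *(substitute(a, sigma) for a in args))


def apply_to_set(terms, sigma):
    """Return the set of distinct terms obtained by applying sigma to every term."""
    return {substitute(t, sigma) for t in terms}


def is_unifier(terms, sigma):
    """A substitution unifies the set if the result is a unit (one element)."""
    return len(apply_to_set(terms, sigma)) == 1


# A = {f(x), y, f(g(u)), f(g(z))}
A = [("f", "x"), "y", ("f", ("g", "u")), ("f", ("g", "z"))]

# {g(u)/x, f(g(u))/y, u/z}
SIGMA_1 = {"x": ("g", "u"), "y": ("f", ("g", "u")), "z": "u"}

# {g(z)/x, f(g(z))/y, z/u}
SIGMA_2 = {"x": ("g", "z"), "y": ("f", ("g", "z")), "u": "z"}
```

Both substitutions reduce A to a single term, f(g(u)) for the first and f(g(z)) for the second, and the two results differ only by renaming a variable.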

AIM-90

Author[s]: Peter Samson

MIDAS

October 1968

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-090.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-090.pdf

The MIDAS linking loader is a PDP-6 program to load relocatable-format output from the MIDAS assemblers, with facilities to handle symbolic cross-reference between independently assembled programs. Although it is arranged primarily to load from DECtape, the loader is able also to load paper-tape relocatable programs. To use the loader, load it off the MACDMP SYSTEM tape as the file STINK. (A file STINK NEW may exist, repairing old bugs or introducing new features.) Then the loader expects commands to be typed in on the on-line Teletype; two successive ALT MODE characters terminate the string of commands. The commands in a string are not performed until the string is thus terminated. While a command in a string has not been terminated, RUBOUT will erase the last typed-in character (and type it out again as a reminder). A command string may contain any number of commands, and the effect is the same whether the commands are together in one string or are in successively typed-in strings each delimited by two ALT MODEs.

AIM-89

Author[s]: Ward Douglas Maurer

A Theory of Computer Instructions

September 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-089.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-089.pdf

This paper has arisen from an attempt to determine the nature of computer instructions from a viewpoint of general function and set theory. Mathematical machines, however the term is understood, are not adequate models for the computers of today; this is true whether we are talking about Turing machines, sequential machines, push-down automata, generalized sequential machines, or any of the other numerous machine models that have been formulated in the last fifteen years. Most of these models are either not general enough, as the sequential or Turing machines with their single input and output devices; or capable of accurately reproducing only one important programming feature; or in a sense too general (see discussion of sequential machines in Chapter 10 below). On the other hand, modern computers, whether they are binary, decimal, or mixed, whether they have one or two instructions per word, or one instruction covering several words, have several important common features. All of their instructions have input, output, and affected regions (in the sense of Definitions B and K below). The study of the input and output regions and the structure of affected regions of all the instructions on a given computer can provide a key to its logical efficiency.

AIM-89A

Author[s]: W.D. Maurer

Computer Experiments in Finite Algebra

June 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-089a.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-089a.pdf

The experiments described here concern an initial design for a computer system specifically for the handling of finite groups, rings, fields, semigroups, and vector spaces. The usefulness of such a system was discussed in (1). The system has been coded in MAD, with certain subroutines in FAP, for the IBM 7094, and is designed to operate in a time-sharing environment.

AIM-89B

Author[s]: W.D. Maurer

Computer Experiments in Finite Algebra-II

December 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-089b.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-089b.pdf

In a previous memo (Computer Experiments in Finite Algebra, MAC-M-245) we described a computer system for the handling of finite groups, semigroups, subsets, finite maps, and constants. This system has been extended to read and write disk files; a mechanical procedure has been developed for extending the system; and a program (the inferential Compiler) has been written which accepts a source language consisting of mathematical statements in a standard format and compiles code which verifies these statements over a file or files of special cases (including possible counterexamples). Three limitations of the system were mentioned in the previous memo. Of these, (1) and (3) have been effectively eliminated in the current system. Limitation (2) still exists and will be overcome only in ALGEBRA III, which is briefly described in section 4.

AIM-88

Author[s]: Peter Samson

MACTAP: A PDP-6 DECtape Handling Package

September 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-088.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-088.pdf

MACTAP is a set of PDP-6 subroutines to read and write DECtape in the MAC file format (see MAC-M-249). Programmers can call these subroutines for input or output of ASCII data, which will be compatible with TECO files; or for binary (36-bit word) data. They were extracted mainly from PDP-6 TECO and arranged and checked out in their present form by Jack Holloway.

AIM-87

Author[s]: Warren Teitelman

FLIP - A Format List Processor

July 1967

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-087.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-087.pdf

This memo describes a notion of programming language for expressing, from within a LISP system, string manipulations such as those performed in COMIT. The COMIT formalism has been extended in several ways: the patterns (the left-half constituents of COMIT terminology) can be variable names of the results of computation; predicates can be associated with these elementary patterns allowing more precise specifications of the segments they match; the names of elementary patterns themselves may be variable or the results of computation; it is no longer necessary to restrict the operations to a linear string of characters (or words) since elementary patterns can themselves match structures; etc. Similar generalizations exist for formats, i.e. what corresponds to the right-half of the COMIT rule.

AIM-86

Author[s]: Marvin Minsky

Design of the Hand

August 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-086.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-086.pdf

The following scheme for designing a general-purpose manipulator organ has many theoretical attractions. The basic idea is perhaps best conceived as a theoretical, or mathematical, idea. While it is unlikely that the actual system will be very much like it, it may have value as a sort of ideal against whose elegance we can match engineering and practical compromise.

AIM-85

Author[s]: William Martin

Syntax and Display of Mathematical Expressions

July 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-085.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-085.pdf

A LISP program converts a mathematical expression, stored in list structure form, into a text-book style visual display. Doing this requires the selection and positioning of the individual symbols which make up the expression, using a combination of global and local information. This memo describes a table-driven picture-compiler which makes the necessary information available. Syntax rules have been written for a large class of mathematical expressions. These rules are simplified by the introduction of concepts concerning the relative position of symbols. In addition to the symbols and their coordinates the program sends a parsing of the symbols to the display. This program is a refinement of the system proposed by M.L. Minsky in Artificial Intelligence Memo 61, 'Mathscope: Part I'.

AIM-84

Author[s]: Warren Teitelman

EDIT and BREAK Functions for LISP

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-084.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-084.pdf

This memo describes some LISP functions which have been found to be extremely useful in easing the often painful process of converting the initial versions of LISP programs into final debugged code. They are part of a much larger system currently being developed but may be used as two independent packages. The break package contains a more sophisticated break function than that in the current CTSS version of LISP, which includes facilities for breaking on undefined functions as well as SUBRS and FEXPS, plus a selective TRACE feature.

AIM-83

Author[s]: Peter Samson

Use of MACDMP

July 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-083.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-083.pdf

MACDMP is a PDP-6 program which can load from DECtape to core memory, dump core onto DECtape, or verify a previously dumped file against memory. Normally, just before it loads, it clears all of memory to 0 (except itself and locations 0 through 37); and, in general, it does not dump locations containing 0. (It also does not dump itself, or locations 0 through 37.) In this way, a short program uses only a few blocks on tape. MACDMP uses the MAC PDP-6 file structure and directory scheme, and writes files in mode 1.

AIM-82

Author[s]: Peter Samson

MAC PDP-6 DECtape File Structure

July 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-082.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-082.pdf

The MAC system programs, MACDMP, TECO, and MIDAS, assume a certain data structure on DECtapes which they handle. Each DECtape has 1100 blocks of 200 words, numbered 0 through 1077. Block 0 and blocks 1070 through 1077 are not used by the MAC system. Block 100 of each tape contains the File Directory: a 200-word table describing the current contents of blocks 1 through 1067. The data on the tape is organized into files, each file consisting of one or more blocks. Each file has a name and a mode: the name is composed of 2 six-character subnames, and the mode is a two-bit number. The File Directory has space for 27 files.
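
The block layout in the abstract above can be written out as a small sketch. It assumes the figures are octal, as is conventional in the PDP-6 memos of this series; the abstract itself does not state the radix. The constant names and the `block_role` helper are hypothetical and exist only to make the arithmetic concrete.

```python
# All figures are taken from the abstract and ASSUMED to be octal.
TOTAL_BLOCKS = 0o1100      # blocks numbered 0 through 1077 (octal)
WORDS_PER_BLOCK = 0o200    # words in each block
DIRECTORY_BLOCK = 0o100    # block holding the File Directory
FIRST_FILE_BLOCK = 0o1     # directory describes blocks 1 ...
LAST_FILE_BLOCK = 0o1067   # ... through 1067 (octal)
MAX_FILES = 0o27           # directory capacity


def block_role(n):
    """Classify block n as 'directory', 'file', or 'unused' (hypothetical helper)."""
    if not 0 <= n < TOTAL_BLOCKS:
        raise ValueError("block number out of range")
    if n == DIRECTORY_BLOCK:
        return "directory"
    if FIRST_FILE_BLOCK <= n <= LAST_FILE_BLOCK:
        return "file"
    return "unused"  # block 0 and blocks 1070 through 1077 (octal)


# Count how many blocks fall into each role.
role_counts = {}
for block in range(TOTAL_BLOCKS):
    role = block_role(block)
    role_counts[role] = role_counts.get(role, 0) + 1
```

Under the octal reading, a tape holds 576 blocks of 128 words in decimal, of which nine (block 0 and the last eight) are left unused by the MAC system.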

AIM-81

Author[s]: Peter Samson

PDP-6 TECO

July 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-081.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-081.pdf

TECO is a scope-keyboard text editor. It uses an on-line command language (which permits macro-definitions, conditionals, etc.) as well as text operations. The macro language permits the most sophisticated search, match, and substitution operations as well as simple typographical corrections to text.

AIM-80

Author[s]: William Martin

PDP-6 LISP Input-Output for the Display

June 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-080.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-080.pdf

An intermediate level language for display programming has been embedded in LISP 1.5. The language is intended as a basis for higher analysis of display information. Through the construction of a hierarchy of LISP functions it will be possible to assign a complicated meaning to a series of simple light pen motions, or to construct a complex picture. The intermediate level of language should abstract from the light pen trajectory the information which these LISP functions require and provide basis for time, and programming effort. The first section of this memo discusses the system and gives programming examples. The details of the examples can be understood by reading the second section, which discusses the implementation and the LISP functions available.

AIM-79

Author[s]: William A. Martin

PDP-6 LISP Input-Output for the Dataphone

June 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-079.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-079.pdf

A version of LISP 1.5 for the PDP-6 Computer has been extended to include IO through the dataphone. This makes possible communication between programs running in Project MAC time sharing and LISP programs running on the PDP-6. The method of handling input-output for the dataphone is similar to that for the typewriter, paper tape punch, and paper tape reader. Three useful LISP functions are presented as examples of dataphone programming.

AIM-78

Author[s]: Michael Levin

Topics in Model Theory

May 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-078.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-078.pdf

The concept of "free" as in free group and free semi-group is extended to arbitrary first order theories. Every consistent theory has free models. Some problems of obtaining a categorical theory of models are discussed.

AIM-77

Author[s]: Marvin Minsky

Matter, Mind and Models

March 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-077.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-077.pdf

This paper attempts to explain why people become confused by questions about the relation between mental and physical events. When a question leads to confused, inconsistent answers, this may be (1) because the question is ultimately meaningless or at least unanswerable, but it may also be (2) because an adequate answer requires a powerful analytical apparatus. My view is that many important questions about the relation between mind and brain are of this latter kind, and that some of the necessary technical and conceptual tools are becoming available as a result of work on the problems of making computer programs behave intelligently. In this paper we suggest a theory of why introspection does not give clear answers to these questions. The paper does not go very far toward finding technical solutions to the questions, but there is probably some value in finding at least a clear explanation of why we are confused.

AIM-76

Author[s]: Daniel G. Bobrow

The COMIT Feature in LISP II

February 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-076.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-076.pdf

The purpose of the COMIT feature is to facilitate certain types of list manipulations in LISP II. This feature is a syntactic convenience, rather than an extension of the semantics of LISP. It permits the programmer to test directly whether a piece of list structure matches a certain pattern, and if so, to construct another structure utilizing subsegments of the original structure which matched parts of the given pattern.

AIM-75

Author[s]: Marvin Minsky

Television Camera-To-Computer Adapter: PDP-6 Device 770

January 1965

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-075.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-075.pdf

The TVA (Television Adaptor) is a data-input device just completed. Any standard Closed-Circuit Television Camera can be connected to the PDP-6, without modification, by a single BNC connector. Then a simple program can make a digitized image of selected size and position appear in core memory. Operation is automatically controlled by the PDP-6 priority-interrupt system so that, to the programmer, the core-image is automatically read-in and maintained. This is an open invitation to come in and discuss applications. We are particularly interested in (i) projects leading to a working page-reader system, first for teletype character sets and later to include recognition of larger alphabets and hand-written corrections, and (ii) projects leading to recognition functions that will be useful in coordination with the mechanical hand system.

AIM-74

Author[s]: T. Hart

CTSS LISP Notice-Supplement to A.I. Memo No. 67

December 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-074.ps

ftp://publications.ai.mit.edu/ai-publication/pdf/AIM-074.pdf

The LISP system (command version) has been updated. Bugs corrected include: 1. out of pushdown list in compiled function will not transfer to 77777. 2. with compiler printing turned off by comprint, it is truly off. 3. "ERROR54A/" when running compiled program no longer occurs. 5. CSET and CSETQ have their proper values. 6. the public versions of PRINT DATA and EDIT DATA have been improved. In particular, the function DEFINELIST has been removed from PRINT; EDIT has had a minor bug in filelistadd corrected, and the functions filelistdelete [1; x; y] and extract [1; n; m] added. The former deletes the function on the list 1, from file n m and writes a new file n EDIT with these changes made. The latter extracts the function 1 from the file n DATA and adds them to the file m DATA, updating the disc by writing appropriate EDIT class files.

AIM-73

Author[s]: Marvin Minsky and Seymour Papert

Unrecognizable Sets of Numbers

November 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-073.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-073.pdf

When is a set A of positive integers, represented as binary numbers, "regular" in the sense that it is a set of sequences that can be recognized by a finite-state machine? Let π_A(n) be the number of members of A less than the integer n. It is shown that the asymptotic behavior of π_A(n) is subject to severe constraints if A is regular. These constraints are violated by many important natural numerical sets whose distribution functions can be calculated, at least asymptotically. These include the set P of prime numbers, for which π_P(n) ~ n/log n for large n, the set A(k) of integers of the form n^k, for which π_A(k)(n) ~ n^(1/k), and many others. The technique cannot, however, yield a decision procedure for regularity, since for every infinite regular set A there is a nonregular set Z for which |π_Z(n) − π_A(n)| ≤ 1, so that the asymptotic behaviors of the two distribution functions are essentially identical.
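The counting function in this abstract (the number of members of a set less than n) can be tabulated directly for the two example sets, the primes and the k-th powers, and compared with the asymptotic forms n/log n and n^(1/k). The following Python sketch only illustrates those definitions; the function names are hypothetical and nothing here reproduces the memo's argument about finite-state machines.

```python
import math


def pi_primes(n):
    """Number of primes strictly less than n (sieve of Eratosthenes)."""
    if n < 3:
        return 0
    is_prime = [True] * n
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(n - 1) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n, p):
                is_prime[multiple] = False
    return sum(is_prime)


def pi_powers(n, k):
    """Number of positive integers m with m**k strictly less than n."""
    count = 0
    m = 1
    while m ** k < n:
        count += 1
        m += 1
    return count


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, pi_primes(n), round(n / math.log(n)),
              pi_powers(n, 2), round(n ** 0.5))
```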

AIM-72

Author[s]: Michael Levin

Proposed Instructions on the GE 635 for List Processing and Push Down Stacks

September 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-072.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-072.pdf

The instructions that transmit data between the index registers and the memory work only on the left half (address) portion of memory. These instructions are LDXn (load index n from address of storage word) and STXn (store the contents of index n in address of storage word). The effective address of both of these instructions includes modification by index registers. A corresponding set of instructions for transmitting data to or from the right half of memory would facilitate list structure operations. The present order code makes it impossible to do list-chaining operations (car or cdr) without disturbing the A or Q registers.

AIM-71

Author[s]: Daniel G. Bobrow

String Manipulation in the New Language

July 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-071.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-071.pdf

String manipulation can be made convenient within the *** language by implementing two functions: 1) match [workspace; pattern] and 2) construct [format; pmatch]. In this memo I describe how I think these two functions can be implemented, and how they might be used to express operations now conveniently denoted in string manipulation languages such as COMIT, SNOBOL, and METEOR.
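To make the division of labor between a matching function and a constructing function concrete, here is a rough Python sketch. It is not the memo's design: the pattern conventions (a "$" element matching any run of items, integers in the format referring back to matched segments) are assumptions modelled loosely on COMIT-style notation, and the whole workspace is required to match.

```python
def match(workspace, pattern):
    """Return one matched segment per pattern element, or None if no match.

    Assumed conventions: "$" matches any run of items (shortest first);
    any other element must equal the next workspace item.
    """
    if not pattern:
        return [] if not workspace else None
    head, rest = pattern[0], pattern[1:]
    if head == "$":
        for cut in range(len(workspace) + 1):
            tail = match(workspace[cut:], rest)
            if tail is not None:
                return [workspace[:cut]] + tail
        return None
    if workspace and workspace[0] == head:
        tail = match(workspace[1:], rest)
        if tail is not None:
            return [[head]] + tail
    return None


def construct(fmt, pmatch):
    """Build a new list: integers are 1-based references into pmatch,
    everything else is copied literally."""
    out = []
    for item in fmt:
        if isinstance(item, int):
            out.extend(pmatch[item - 1])
        else:
            out.append(item)
    return out


if __name__ == "__main__":
    segments = match(["the", "black", "cat", "sat"], ["$", "black", "$"])
    print(segments)
    print(construct([3, "black", 1], segments))
```

Keeping the match result as a plain list of segments lets the constructing step stay independent of how the match was found.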

AIM-70

Author[s]: William A. Martin

Hash-Coding Functions of a Complex Variable

June 25, 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-070.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-070.pdf

A common operation in non-numerical analysis is the comparison of symbolic mathematical expressions. Often equivalence under the algebraic and trigonometric relations can be determined with high probability by hash-coding the expressions using finite field arithmetic and then comparing the resulting hash-code numbers. The use of this scheme in a program for algebraic simplification is discussed.
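The general idea of comparing expressions by hash code can be sketched by evaluating each expression at random points with arithmetic modulo a prime and comparing the results. The Python below is only an illustration for polynomial expressions: the modulus, the number of trial points, and the function names are assumptions, and the memo's treatment of trigonometric relations is not reproduced.

```python
import random

P = 2**61 - 1  # a Mersenne prime; the choice of modulus is an assumption


def hash_code(expr, points):
    """Evaluate expr at each sample point and reduce the result modulo P."""
    return tuple(expr(x, y) % P for x, y in points)


def probably_equal(f, g, trials=8, seed=0):
    """Equal hash codes at random points suggest, but do not prove, equivalence."""
    rng = random.Random(seed)
    points = [(rng.randrange(P), rng.randrange(P)) for _ in range(trials)]
    return hash_code(f, points) == hash_code(g, points)


def square_of_sum(x, y):
    return (x + y) ** 2


def expanded(x, y):
    return x * x + 2 * x * y + y * y


def sum_of_squares(x, y):
    return x * x + y * y


if __name__ == "__main__":
    print(probably_equal(square_of_sum, expanded))
    print(probably_equal(square_of_sum, sum_of_squares))
```

Differing codes prove the expressions differ; matching codes leave a small chance of a false positive, which shrinks as more points are tried.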

AIM-69

Author[s]: Michael Levin

New Language Storage Conventions

May 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-069.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-069.pdf

These conventions are for the implementation of the new language on a large computer on which time-sharing is the standard mode of operation. Each user is at any time assigned a certain amount of primary storage. This can be the entire memory of the machine for non time-shared operation. When this quota is filled, then it is necessary either to extend it, or to have the reclaimer routine compact the user's storage. This decision can be made at run time and may be based on the user's storage requirements, and on the cost of primary memory at that particular instant. This may in turn depend on the degree of saturation of the system.

AIM-68

Author[s]: Michael Levin

Syntax of the New Language

May 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-068.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-068.pdf

This is a definition of the syntax of the *** language. It consists of modifications and extensions of the "Revised Report on the Algorithmic Language ALGOL 60" which is printed in the "Communications of the ACM", January 1963. The paragraph numbering of that report is used in this paper. The corrections and additions are made partially in Backus normal form, and partially in English, and the choice has been made on the basis of convenience. For example, the use of the weak separator is described readily in a few sentences, whereas the modification to incorporate this into the syntax as described in Backus normal form would have been extensive.

AIM-67

Author[s]: William Martin and Timothy Hart

REVISED USER'S VERSION - Time Sharing LISP

April 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-067.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-067.pdf

This memo describes changes to the LISP system by several people. The changes reduce printout and give the user more control over it. They also make it possible for LISP to communicate with the teletype and the disk. The last sections describe programs available in the public files which are useful for printing, editing, and debugging LISP functions.

AIM-66

Author[s]: Daniel G. Bobrow

Natural Language Input for a Computer Problem Solving System

March 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-066.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-066.pdf

This paper describes a computer program which accepts and "understands" a comfortable, but restricted subset of one natural language, English. Certain difficulties are inherent in this problem of making a machine "understand" English. Within the limited framework of the subject matter understood by the program, many of these problems are solved or circumvented. I shall describe these problems and my solutions, and point out those solutions which I feel have general applicability. I will also indicate which must be replaced by more general methods to be really useful, and give my ideas about what general solutions to these particular problems might entail.

AIM-65

Author[s]: Marvin Minsky

The Graphical Typewriter: A Versatile Remote Console Idea

January 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-065.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-065.pdf

It would be useful to develop a combination typewriter-plotter along the lines described below. The device could be coupled to a telephone line with a reasonably small amount of electronics -- mostly relays.

AIM-64

Author[s]: Timothy P. Hart and Michael Levin

LISP Exercises

January 1964

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-064.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-064.pdf

The following exercises are carefully graded to mesh with the sections in Chapter I, "The LISP Language", in the LISP 1.5 Programmer's Manual. Each exercise should be worked immediately after reading the manual section indicated.

AIM-63

Author[s]: Daniel J. Edwards

Secondary Storage in LISP

December 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-063.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-063.pdf

A principal limitation of LISP processors in many computations is that of inadequate primary random-access storage. This paper explores several methods of using a secondary storage medium (such as drums, disk files or magnetic tape) to augment primary storage capacity and points out some limitations of these methods.

AIM-62

Author[s]: Marvin Minsky

DERIVATOR I: A Program for Visual Inspection of Solutions to First-Order Non-Linear Differential Equations

December 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-062.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-062.pdf

Derivator is a PDP-1 program for examining the solutions to differential equations by inspection of a visual display of trajectories. Because fixed-point arithmetic is used (in order to maintain visual display speeds), Derivator must be regarded as a qualitative tool. It is subject to truncation error in the trajectory-following program, and round-off error due to 'underflow' in the function-definition programs for dy and dx. Still it appears to be very suitable for studying topology of solutions around singularities, etc. The display shows the solution curves ('characteristics') in the x-y plane. They are generated parametrically.

AIM-61

Author[s]: Marvin Minsky

MATHSCOPE Part I: A Proposal for a Mathematical Manipulation-Display System

November 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-061.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-061.pdf

Mathscope: A compiler for two-dimensional mathematical picture syntax. Mathscope is a proposed program for displaying publication-quality mathematical expressions given symbolic (list-structure) representations of the expressions. The goal is to produce 'portraits' of expressions that are sufficiently close to conventional typographic conventions that mathematicians will be able to work with them without much effort -- so that they do not have to learn much in the way of a new language, so far as the representation of mathematical formulae is concerned.

AIM-60

Author[s]: D.J. Edwards and M.L. Minsky

Recent Improvements in DDT

November 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-060.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-060.pdf

This paper will report new developments and recent improvements to DDT. "Window DDT" now will remember undefined symbols and define them on a later command. Using sequence breaks, it can change the contents of memory while a program is running, and the contents of memory can be displayed in symbolic form on the scope.

AIM-59

Author[s]: Bertram Raphael

Operation of a Semantic Question-Answering System

November 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-059.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-059.pdf

A computer program has been written in the LISP programming language which accepts information and answers questions presented to it in a restricted form of natural English language. The program achieves its effects by automatically creating, adding to, and searching a relational model for factual information. The purpose of this memo is to describe and explain the behavior of the program. The remainder of this section briefly describes the structure of the model. Section II presents sample conversations illustrating various features of the program, and describes the implementation of those features. Section III is a brief survey of conclusions drawn from this research. It is assumed throughout that the reader is at least somewhat familiar with the LISP programming system (and its meta-language notation), the concept of property (description) lists, and the usual notations of Mathematical Logic.

AIM-58

Author[s]: M.L. Minsky

A LISP Garbage Collector Algorithm Using Serial Secondary Storage

December 27, 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-058.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-058.pdf

This paper presents an algorithm for reclaiming unused free storage memory cells in LISP. It depends on availability of a fast secondary storage device, or a large block of available temporary storage. For this price, we get: 1.) Packing of free-storage into a solidly packed block. 2.) Smooth packing of arbitrary linear blocks and arrays. 3.) The collector will handle arbitrarily complex re-entrant list structure with no introduction of spurious copies. 4.) The algorithm is quite efficient; the marking pass visits words at most twice and usually once, and the loading pass is linear. 5.) The system is easily modified to allow for increase in size of already fixed consecutive blocks, provided one can afford to initiate a collection pass or use a modified array while waiting for such a pass to occur.

AIM-57

Author[s]: Timothy P. Hart

MACRO Definitions for LISP

October 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-057.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-057.pdf

In LISP 1.5 special forms are used for three logically separate purposes: a) to reach the alist, b) to allow functions to have an indefinite number of arguments, and c) to keep arguments from being evaluated. New LISP interpreters can easily satisfy need (a) by making the alist a SPECIAL-type or APVAL-type entity. Uses (b) and (c) can be replaced by incorporating a MACRO instruction expander in define. I am proposing such an expander.

AIM-56

Author[s]: Timothy P. Hart

A Proposal for a Geometry Theorem Proving Program

September 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-056.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-056.pdf

During the last half of the nineteenth century the need for formal methods of proof became evident to mathematicians who were making such confidence-shaking discoveries as non-Euclidean geometry. "The demand is not to be denied; every jump must be barred from our deductions. That it is hard to satisfy must be set down to the tediousness of proceeding step by step. Every proof which is even a little complicated threatens to become inordinately long." [M1] (G. Frege, 1884) This general desire for rigor has persisted since that time, and a great deal has been learned about formal methods. But, for the reason noted by Frege, very little of real mathematics has been done with full formal treatment. Our present hope is to use computers to take the drudgery out of formal demonstrations, just as they are taking it out of accounting. Toward this end, several programs are under way. They vary in purpose; the Proofchecker [H8, H9] is to be capable of filling the gaps of a proof; the work of Mott et al. [H10] aims to achieve the equivalent of a desk calculator ability as an aid to a mathematician doing formal proofs. The most intriguing prospect, however, is that computers can eventually be made to both devise and prove interesting non-trivial theorems wholly on their own. The first of these desires, the devising of interesting conjectures, has not even been attempted. I believe, however, that we are on the verge of achieving the second of these ends, the mechanical proof of non-trivial theorems, a belief which I hope I can justify in the sequel.

AIM-55

Author[s]: Michael Levin

Primitive Recursion

July 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-055.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-055.pdf

This is one of a series of memos concerning a logical system for proof-checking. It is not self-contained, but belongs with future memos which will describe a complete formal system with its intended interpretation and application. This memo also assumes familiarity with LISP and with "A Basis for a Mathematical Theory of Computation" by John McCarthy.

AIM-54

Author[s]: Joel Winett

Proposal for a FAP Language Debugging Program

June 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-054.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-054.pdf

A time-sharing system for the 7090 computer is being developed at the M.I.T. Computation Center whereby many users can communicate simultaneously with the computer through individual consoles. In the time-sharing system a time-sharing supervisor (TSS) program directs the running of each user’s program in such a manner that each user’s program is run in short bursts of computation. The effect is that the user sitting at his console has complete control over his program with unrestricted use of a large computing machine. Through the use of commands in the time-sharing system a user who writes a program in the FAP language can assemble his program, load it into core, and start the program. In order to make the most use of the time-sharing facility the user during the debugging stages of his program will want to dynamically monitor his running program and make changes as necessary. The proposed FAP language debugging program gives the user the facility to communicate with his program using the symbols defined within his program.

AIM-53

Author[s]: Warren Teitelman

ARGUS

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-053.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-053.pdf

This report describes the use of ARGUS, a program that recognizes hand-drawn characters in real time. The program is a result of research reported in "New Methods for Real-Time Recognition of Hand-Drawn Characters", submitted in partial fulfillment of the requirements for the degree of Master of Science. The report does not assume any previous knowledge of the theory behind ARGUS, but some of the discussion may be more meaningful if the reader refers to the thesis mentioned above.

AIM-52

Author[s]: John Cocke and Marvin Minsky

Universality of TAG Systems with P=2

April 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-052.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-052.pdf

In the following sections we show, by a simple direct construction, that computations done by Turing machines can be duplicated by a very simple symbol manipulation process. The process is described by a simple form of Post canonical system with some very strong restrictions. First, the system is monogenic; each formula (string of symbols) of the system can be affected by one and only one production (rule of inference) to yield a unique result. Accordingly, if we begin with a single axiom (initial string) the system generates a simply ordered sequence of formulas, and this operation of a monogenic system brings to mind the idea of a machine. The Post canonical system is further restricted to be of the "Tag" variety, described briefly below. It was shown in [1] that Tag systems are equivalent to Turing machines. The proof in [1] is very complicated and uses lemmas concerned with a variety of two-tape non-writing Turing machines. Our proof here avoids these otherwise interesting machines and strengthens the main result, obtaining the theorem with a best possible "deletion number" P = 2. Also, the representation of the Turing machine in the present system has a lower degree of exponentiation, which may be of significance in applications. These systems seem to be of value in establishing unsolvability of combinatorial problems.
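A Tag system with a deletion number of 2 can be simulated in a few lines: at each step the production selected by the first symbol is appended to the string and the first two symbols are deleted, so exactly one rule ever applies (the monogenic property). The Python sketch below uses a small, widely cited three-rule example whose runs mimic the "3n+1" iteration; that example is not taken from this memo, and the function names are illustrative.

```python
# Example rule set (not from the memo): a word of n a's stands for the number n.
RULES = {"a": "bc", "b": "a", "c": "aaa"}


def tag_step(word, rules, p=2):
    """Apply one step: append the production for the first symbol, delete p symbols.

    Returns None when the system halts (word shorter than p, or no rule applies).
    """
    if len(word) < p or word[0] not in rules:
        return None
    return word[p:] + rules[word[0]]


def run(word, rules, p=2, max_steps=1000):
    """Return the sequence of words generated from the initial word."""
    trace = [word]
    for _ in range(max_steps):
        nxt = tag_step(trace[-1], rules, p)
        if nxt is None:
            break
        trace.append(nxt)
    return trace


if __name__ == "__main__":
    for w in run("aaa", RULES):
        print(w)
```

Because the next word depends only on the first symbol and the deletion count, the trace is a single ordered sequence, which is what gives such systems their machine-like character.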

AIM-51

Author[s]: Daniel G. Bobrow

METEOR: A LISP Interpreter for String Transformations

April 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-051.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-051.pdf

Conditional expressions, composition and recursion are the basic operations used in LISP to define functions on list structures. Any computable function of arbitrarily complex list structures may be described using these operations, but certain simple transformations of linear lists (strings) are awkward to define in this notation. Such transformations may be characterized (and caricaturized) by the following instructions for a transformation: "Take that substring there, and that other one starting with 'Black', which has the substring mentioned third as the first; then insert the second substring mentioned; omit the first and leave the unmentioned parts of the original string unchanged."

AIM-50

Author[s]: Richard A. Robnett

Suggested Conventions for LISP Time-Sharing System

April 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-050.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-050.pdf

Below is a list of suggested Conventions and De-bugging aids for LISP time-sharing. Any and all suggestions are encouraged and should be submitted in writing to R. A. Robnett in a hurry.

AIM-49

Author[s]: Bertram Raphael

Computer Representation of Semantic Information

April 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-049.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-049.pdf

A major obstacle in the development of learning machines, mechanical translation, advanced information retrieval systems, and other areas of artificial intelligence, has been the problem of defining, encoding, and representing within a computer the "meaning" of the text data being processed. Various devices have been used to avoid this problem, but very little work has been done toward solving it. The purpose of this memo (and the thesis research with which it is associated) is to describe one possible solution, and report on a computer program which demonstrates its feasibility.

AIM-48

Author[s]: Marvin Minsky

Neural Nets and Theories of Memory

March 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-048.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-048.pdf

A number of models developed in work often called "neural-net" research may be of interest to physiologists working on the problem of memory. From this work comes a variety of ideas on how networks of neuron-like elements can be made to act as learning machines. Some of these may suggest ways in which memory may be stored in nervous systems. It is important, perhaps, to recognize that these models were not founded at all on physiological ideas; they really stem from psychological and introspective notions. They all involve some form of alteration of synaptic transmission properties contingent on the pre- and post-synaptic activity during and after the relevant behavior. This notion is suggested not so much by actual observation of synapses as by the introspective simile of wearing down a path -- the "ingraining" of a frequently-traveled route. Below we shall argue that this idea is useful and suggestive, but not sufficient. These models can be made to account for learning connections between stimuli and responses on a low level, but do not seem to account for higher, symbolic behavior. We will argue that the latter suggests a return to the search for localization of memory, a topic that has been unpopular for many years.

AIM-47

Author[s]: Burton H. Bloom

A Proposal to Investigate the Application of a Heuristic Theory of Tree Searching to a Chess Playing Program

February 1963

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-047.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-047.pdf

The problem of devising a mechanical procedure for playing chess is fundamentally the problem of searching the very large move-tree associated with a chess position. This tree-searching problem is representative of a large class of problems. Consequently, we will first present briefly a general theory of tree-searching problems. This theory will be useful in clarifying the intention of our proposed research.

AIM-46

Author[s]: T.G. Evans

A Heuristic Program to Solve Geometric Analogy Problems

October 1962

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-046.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-046.pdf

A program to solve a wide class of intelligence-test problems of the "geometric-analogy" type ("figure A is to figure B as figure C is to which of the following figures?") is being constructed. The program, which is written in LISP, uses heuristic methods to (a) calculate, from relatively primitive input descriptions, "articular" (cf. Minsky, Steps Toward Artificial Intelligence) descriptions of the figures, then (b) utilize these descriptions in finding an appropriate transformation rule and applying it, modifying it as necessary, to arrive at an answer. The current version has solved a number of geometric-analogy problems and is now being modified in several ways and run on further test cases.

AIM-45

Author[s]: Daniel G. Bobrow

A Question-Answerer for Algebra Word Problems

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-045.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-045.pdf

This is a proposal to write a program which, starting from input statements of problems in a restricted English, will be able to formulate problems symbolically and then solve problems from elementary algebra.

AIM-44

Author[s]: Marvin Minsky

A Simple Direct Proof of Post's Normal Form Theorem

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-044.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-044.pdf

The theorem proved in this note is the Normal Form Theorem proved in Post's 1943 paper, "Formal Reductions of the General Combinatorial Decision Problem". We have long felt that this result is one of the most beautiful in mathematics: the fact that any formal system can be reduced to a Post canonical system with a single axiom and productions of the restricted form.

AIM-43

Author[s]: Bert Raphael

Proposal for a General Learning Machine

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-043.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-043.pdf

This memo proposes the development of a computer system which is capable of learning certain facts about arbitrary subject matter with an arbitrary vocabulary. It is believed by most researchers in this field that some sort of general learning machine is essential for the ultimate solution of the "Artificial Intelligence Problem." I believe that the system described below, which will be programmed to construct internal models based on the concepts indicated by the syntactic structure of the input text (but not on the specific subject area), will constitute a significant step toward such a machine.

AIM-41

Author[s]: A. Kotok

A Chess Playing Program

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-041.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-041.pdf

This paper covers the development of a chess playing program. The preliminary planning led to the decision to use a variable depth search, terminating at either an arbitrary maximum, or at a stable position. Two schemes of controlling material balance are discussed. Of major significance is the use of the "alpha-beta" heuristic, a method of pruning the tree of moves. This heuristic makes use of values obtained at previous branches in the tree to eliminate the necessity to search obviously worse branches later. The program has played four long game fragments in which it played chess comparable to an amateur with about 100 games experience.

AIM-40

Author[s]: Donald Dawson

A Note on the Possibility of Application of the Davis Putnam Proof Procedure to Elementary Number Theory

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-040.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-040.pdf

In ref. 1 Davis and Putnam present a computational proof procedure for quantification theory which they suggest might be applied to obtain proofs in mathematical domains. In ref. 2 they give a finite axiom system for elementary number theory with the aim of applying the computational proof procedure to it. In ref. 3 Wang points out that as it stands this procedure would be far too inefficient to prove nontrivial theorems and discusses how it might be made more efficient. In this note we will indicate that even the type of modification that Wang considered would not be sufficient to enable the system to prove nontrivial theorems.

AIM-39

Author[s]: T. Hart and M. Levin

The New Compiler

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-039.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-039.pdf

This memo introduces the brand new LISP 1.5 Compiler designed and programmed by Tim Hart and Mike Levin. It is written entirely in LISP and is the first compiler that has ever compiled itself by being executed interpretively.

AIM-38

Author[s]: Bert Raphael

Machine Understanding of Linguistic Information: A Survey and Proposal

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-038.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-038.pdf

For the past few months I have been studying the problem of how to make a computer understand linguistic information (in some generally accepted sense of "understand"). I have listened to courses on Linguistic Structure (Dr. Chomsky) and Mechanical Translation (Dr. Yngve), and read the works of various linguists, ranging through semantics, information retrieval, and mechanical translation. The remainder of this paper is divided into two parts: a survey of various ideas and results appearing in the current literature (with some editorial comment); and a proposal for future work to include a computer system for storing and extracting semantic information.

AIM-37

Author[s]: Lewis M. Norton

Some Identities Concerning the Function Subst [x; y; z]

January 1962 (Revised March 1962)

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-037.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-037.pdf

The purpose of this paper is twofold: 1) to explore the use of recursion induction in proving theorems about functions of symbolic expressions, and in particular 2) to investigate thoroughly the algebraic properties of the LISP function subst [x; y; z] by this method. The main result is embodied in Theorem 8.
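
As an illustration only, the substitution function can be sketched in Python, assuming the usual LISP reading of subst [x; y; z] as "substitute x for every occurrence of the atom y in z". Nested lists stand in for S-expressions and strings for atoms; both are assumptions of this sketch, and the identity checked in the usage note is an example of the kind of property one might prove, not a statement of the memo's theorems.

```python
def subst(x, y, z):
    """Return a copy of z in which every occurrence of the atom y is replaced by x.

    Atoms are modelled as strings and compound expressions as Python lists
    (an assumption of this sketch, not the memo's representation).
    """
    if isinstance(z, list):
        # Compound expression: substitute recursively in each component.
        return [subst(x, y, item) for item in z]
    # Atom: replace it only if it is the atom being substituted for.
    return x if z == y else z
```

For example, subst(["A", "B"], "X", ["X", ["C", "X"]]) replaces both occurrences of X, and subst(y, y, z) returns an expression equal to z for any z.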

AIM-36

Author[s]: Michael Levin, Marvin Minsky and Roland Silver

On the Effective Definition of "Random Sequence"

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-036.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-036.pdf

Mathematicians have always had difficulty in coming to agreement over what is meant by "randomness". In order to agree on a formal model for a "random process", we have to agree on what intuitive aspects of the matter we want to build into our system. The most prominent point of agreement is that the process should be unpredictable, but this in itself is a very small beginning. The solution that has become conventional in modern mathematics is based on the notion of "random variable", a highly technical notion in which the basic process is represented as a certain kind of infinite function-space. This space contains all possible observed behavior sequences together with a "measure" structure which enables one to calculate the relative frequency of certain ("measurable") complex events. "Event" here usually refers to a whole class of behaviors

AIM-35

Author[s]: None listed

LAP (LISP Assembly program)

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-035.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-035.pdf

LAP is an internal two pass assembler for LISP 1.5. It is a pseudo-function with two arguments called the listing and the symbol table.

AIM-34

Author[s]: John McCarthy

A New Eval Function

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-034.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-034.pdf

The actual working definition of eval describes how the LISP system determines what, if anything, is denoted by a given S-expression. As things now stand, there are two versions of eval: the theoretical version, given in RFSE, and the system version. Neither of these behaves in the most desirable way; and there exist S-expressions which will be handled correctly by the theoretical version but not by the system version, and conversely. The chief defect of the system eval lies in its handling of functional arguments; the chief defect of the RFSE eval lies in its ignorance of property lists. If we wish to have a theory about how LISP really works, then it is necessary to have a version of eval which is satisfactory both theoretically and practically. I will propose a definition for eval, and then illustrate how this eval differs from the existing system and RFSE definitions by means of examples.

AIM-33

Author[s]: Marvin L. Minsky

Universality of (p=2) Tag Systems and a 4 Symbol 7 State Universal Turing Machine

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-033.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-033.pdf

This report describes (1) an improvement and great simplification of the proof that the "Tag" systems of Post can represent any computable process, and (2) a Universal Turing machine with just four symbols and seven states -- the smallest yet reported.
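
For readers unfamiliar with tag systems, the following Python sketch shows how a (p=2) tag system runs: at each step the production for the first symbol is appended to the word and the first two symbols are deleted. The function name, the halting convention, and the three-rule example are assumptions chosen for illustration; they are not taken from the memo.

```python
def run_tag_system(word, rules, deletion=2, max_steps=1000):
    """Run a tag system and return the list of words visited.

    rules maps a symbol to the string appended when that symbol is read.
    The run stops when the word is shorter than the deletion number, when
    the leading symbol has no rule, or after max_steps steps (conventions
    assumed for this sketch).
    """
    history = [word]
    for _ in range(max_steps):
        if len(word) < deletion or word[0] not in rules:
            break
        # Append the production of the first symbol, then drop `deletion` symbols.
        word = word[deletion:] + rules[word[0]]
        history.append(word)
    return history


# A small illustrative rule set (hypothetical, not from the memo).
EXAMPLE_RULES = {"a": "bc", "b": "a", "c": "aaa"}
```

Calling run_tag_system("aaa", EXAMPLE_RULES, max_steps=4) traces the first four rewriting steps from the word aaa.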

AIM-32

Author[s]: John McCarthy

On Efficient Ways of Evaluating Certain Recursive Functions

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-032.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-032.pdf

The purpose of this memorandum is to illustrate a method for evaluating a recursive function when the same subexpression may occur many times during the evaluation and should be evaluated only once.
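
The general idea of evaluating a repeated subexpression only once can be illustrated with a modern memo table. The sketch below uses Python's functools.lru_cache on the Fibonacci recursion and counts calls to show the saving; this is a swapped-in, present-day technique chosen for illustration and is not claimed to be the method of the memorandum.

```python
from functools import lru_cache

# Call counters, used only to compare the amount of work done.
calls = {"naive": 0, "memo": 0}


def fib_naive(n):
    """Plain recursion: the same subproblems are recomputed many times."""
    calls["naive"] += 1
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_memo(n):
    """Same recursion, but each distinct argument is evaluated only once."""
    calls["memo"] += 1
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)
```

Evaluating both functions at the same argument gives the same value, while the memoized version performs one call per distinct argument instead of an exponentially growing number.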

AIM-31

Author[s]: John McCarthy

A Basis for a Mathematical Theory of Computation

January 1962

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-031.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-031.pdf

This paper is a corrected version of the paper of the same title given at the Western Joint Computer Conference, May 1961. A tenth section discussing the relations between mathematical logic and computation has been added. Programs that learn to modify their own behaviors require a way of representing algorithms so that interesting properties and interesting transformations of algorithms are simply represented. Theories of computability have been based on Turing machines, recursive functions of integers and computer programs. Each of these has artificialities which make it difficult to manipulate algorithms or to prove things about them. The present paper presents a formalism based on conditional forms and recursive functions whereby the functions computable in terms of certain base functions can be simply expressed. We also describe some of the formal properties of conditional forms and a method called recursion induction for proving facts about algorithms. A final section on the relations between computation and mathematical logic is included.

AIM-30

Author[s]: D.J. Richards and T.P. Hart

The Alpha-Beta Heuristic

December 1961

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-030.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-030.pdf

The Alpha-Beta heuristic is a method for pruning unneeded branches from the move tree of a game. The algorithm makes use of information gained about part of the tree to reject those branches which will not affect the principal variation.
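
A minimal sketch of alpha-beta pruning in its now-standard minimax formulation is given below in Python. The game tree is modelled as nested lists with numeric leaves, and the optional visited list records which leaves were actually evaluated; the representation and the names are assumptions of this sketch rather than details from the memo.

```python
import math


def alphabeta(node, alpha=-math.inf, beta=math.inf, maximizing=True, visited=None):
    """Return the minimax value of node, skipping branches that cannot matter.

    A node is either a number (leaf value) or a list of child nodes.
    """
    if not isinstance(node, list):
        if visited is not None:
            visited.append(node)  # record leaves that were really examined
        return node
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, alpha, beta, False, visited))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player already has a better option elsewhere
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, alpha, beta, True, visited))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player already has a better option elsewhere
    return value
```

On the tree [[3, 5], [2, 9], [6, 1]] with the maximizing player to move, the leaf 9 is never examined: once the second branch is known to be worth at most 2, it cannot beat the value 3 already secured by the first branch.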

AIM-29

Author[s]: Bertram Raphael

Introduction to the Calculus of Knowledge

November 1961

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-029.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-029.pdf

This paper deals with the "Calculus of Knowledge", an extension of the propositional calculus in which one may reason about what other people know. Semantic and Syntactic systems are developed, certain theorems are proven, and a formal solution in the system of a well-known reasoning problem is presented.

AIM-28

Author[s]: Michael Levin

Information About the LISP 1.5 Programmer's Manual

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-028.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-028.pdf

This memo is to be issued simultaneously with the new LISP 1.5 Programmer's Manual. At the present time, the manual is without a general index and without any appendices. Appendix A is planned as a complete list of functions within the LISP system, and will be issued shortly. Other appendices will contain detailed information about the interpreter, input-output, and system operation. The manual is intended to apply to a version of LISP 1.5 called "LISP 1.5 Export A" which has not yet been issued. LISP 1.5 systems preceding this version differ in certain details.

AIM-27

Author[s]: Timothy Hart

Simplify

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-027.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-027.pdf

Simplify is a compilable set of 45 S-expression-defined functions which simplify algebraic expressions. The expressions which are appropriate for simplify are defined recursively as follows: P = all atoms, fixed and floating point numbers; Q = all expressions of the form (PLUS, s1, s2, …, sn), (PRDCT, s1, s2, …, sn), (MINUS . s), (RECIP . s), (DIVIDE, s1, s2), (POWER, s1, s2), (SUBT, s1, s2), where s, s1, s2, …, sn ∈ P ∪ Q. Simplify is a function, not a pseudo-function; that is, the list structure of an expression is not modified by simplify. Simplify takes 6000 words of free storage when stored as S-expressions and about 9000 words when compiled. It takes about 5 minutes to read all the functions into the 709 using the online card reader, and about 4 minutes from tape.

AIM-26

Author[s]: Michael Levin

Errorset

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-026.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-026.pdf

Errorset is a function available to the interpreter and compiler for making a graceful retreat from an error condition encountered during a subroutine.
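
The behaviour can be sketched loosely in Python, assuming the LISP 1.5 convention that errorset yields a one-element list holding the value when evaluation succeeds and NIL when an error occurs. Passing the computation as a zero-argument function and using None for NIL are assumptions of this sketch; the additional arguments of the LISP function are omitted.

```python
def errorset(thunk):
    """Evaluate thunk(); return [value] on success and None if an error occurs.

    Wrapping the value in a list lets the caller tell a successful result
    that happens to be None or empty apart from a failure.
    """
    try:
        return [thunk()]
    except Exception:
        # Retreat gracefully instead of letting the error stop the program.
        return None
```

With this convention, errorset(lambda: 2 + 3) gives a list containing the result, while errorset(lambda: 1 / 0) gives None and the caller can continue.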

AIM-25

Author[s]: No author

LISP Error Stops as of May 10, 1961

May 1961

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-025.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-025.pdf

no abstract

AIM-24

Author[s]: Michael Levin

Arithmetic in LISP 1.5

April 1961

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-024.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-024.pdf

As of the present, the following parts of LISP 1.5 are working. This is an excerpt from the forthcoming LISP 1.5 Programmer’s Manual.

AIM-23

Author[s]: Robert Brayton

Trace Printing for Compiled Programs

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-023.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-023.pdf

The compiler now has a tracing feature which is equivalent to the TRACLIS feature of the interpreter. COMPILE MODE is a function of one argument which must be either TRACE or NORMAL.

AIM-22

Author[s]: Paul Abrahams

Character-Handling Facilities in the LISP System

January 1961

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-022.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-022.pdf

Because of the new read program, a number of facilities are being added to the LISP system to permit manipulation of single characters and print names. Machine-language functions have been provided for breaking print names down into a list of their characters, for forming a list of characters into a print name, for creating a numerical object from a list of its characters, for reading in characters one by one from an input medium, and for testing characters to see whether they are letters, numbers, operation characters, etc. A number of auxiliary objects and sub-routines are also described in this memo.

AIM-21

Author[s]: Paul Abrahams

The Proofchecker

January 1961

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-021.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-021.pdf

The Proofchecker is a heuristically oriented computer program for checking mathematical proofs, with the checking of textbook proofs as its ultimate goal. It constructs, from each proof step given to it, a corresponding sequence of formal steps, if possible. It records the current state of the proof in the form of what it is sufficient to prove. There are two logical rules of inference: modus ponens and insertion (if it is sufficient to prove B, and A is the theorem, then it is sufficient to prove A implies B). The permissible formal steps include these rules of inference as well as provision for handling definitions, lemmas, calculations, and reversion to previous states. As of now, most of the formalisms are programmed and partially debugged, but the heuristic aspects have yet to be programmed.

AIM-20

Author[s]: John McCarthy

Puzzle Solving Program in LISP

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-020.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-020.pdf

In this note we give as an example of LISP programming a function for solving a class of puzzles in a recent prize contest.

AIM-19

Author[s]: Daniel J. Edwards

LISP II Garbage Collector

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-019.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-019.pdf

The present LISP free storage control program, the garbage collector, has a severe limitation in that it can handle well only list structure. LISP II will be able to handle arrays, binary programs and other quantities; therefore the garbage collector will have to be able to recognize these quantities and control free storage accordingly. Since arrays and binary programs require blocks of contiguous free storage, the garbage collector must be able to relocate items to be saved in order to coalesce the isolated blocks of items discarded into one contiguous block.

AIM-18

Author[s]: Louis Hodes

Some Results from a Pattern Recognition Program Using LISP

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-018.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-018.pdf

This paper describes some aspects of an elaborate pattern recognition system being programmed by the author under the supervision of Marvin Minsky. A more detailed discussion is forthcoming as a Lincoln Laboratory group report.

AIM-17

Author[s]: John McCarthy

Programs with Common Sense

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-017.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-017.pdf

Interesting work is being done in programming computers to solve problems which require a high degree of intelligence in humans. However, certain elementary verbal reasoning processes so simple that they can be carried out by any non-feeble-minded human have yet to be simulated by machine programs. This paper will discuss programs to manipulate in a suitable formal language (most likely a part of the predicate calculus) common instrumental statements. The basic program will draw immediate conclusions from a list of premises. These conclusions will be either declarative or imperative sentences. When an imperative sentence is deduced the program takes a corresponding action. These actions may include printing sentences, moving sentences on lists, and reinitiating the basic deduction process on these lists. Facilities will be provided for communication with humans in the system via manual intervention and display devices connected to the computer.

AIM-16

Author[s]: Anthony Valliant Phillips

A Question-Answering Routine

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-016.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-016.pdf

A program has been written in the LISP programming language to answer English-language questions by consulting an English-language text. The program can handle questions about the subject, verb, place and time of simple sentences. The program proceeds in two steps. In the first, the machine analyzes the question and the sentences of the text, and puts them into a form in which they can be compared. For this analysis the machine must have as input a dictionary of part-of-speech tags, and a set of rules, analogous to phrase-structure rules, according to which it will organize the sentences. This analysis organizes the sentences into noun-phrases, verbs, and prepositional phrases. The machine then picks from the sentence a subject, a verb, an object, and prepositional phrases relating to place and time. This is the "canonical form" of the sentence. The next part of the program compares the question with each of the sentences in the text. Those that match, i.e. contain the information the question is asking for, are stored and the answer is made up from them. If none are found, an appropriate negative answer is given.

AIM-15

Author[s]: John McCarthy

SML-Examples of Proofs by Recursion Induction

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-015.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-015.pdf

Recursion induction has turned out to have certain bugs and some restrictions have to be imposed. The proofs given in the sections of my notes reproduced below probably will turn out to satisfy whatever restrictions have to be imposed.

AIM-14

Author[s]: J. McCarthy

The Wang Algorithm for the Propositional Calculus Programmed in LISP

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-014.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-014.pdf

This memorandum describes a LISP program for deciding whether an expression in the propositional calculus is a tautology according to Wang’s algorithm. Wang’s algorithm is an excellent example of the kind of algorithm which is conveniently programmed in LISP, and the main purpose of this memorandum is to help would-be users of LISP see how to use it.
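
To give a feel for the method, here is a Python sketch of a Wang-style decision procedure: a sequent (a list of formulas on the left and on the right) is reduced by removing one connective at a time until only atoms remain, and it holds if some atom appears on both sides. Atoms are strings and compound formulas are tuples such as ("implies", a, b); this encoding and the function names are assumptions of the sketch and do not reproduce the memorandum's LISP program.

```python
def prove(left, right):
    """Return True if the sequent left -> right is valid.

    left and right are lists of formulas. A formula is an atom (str) or a
    tuple: ("not", a), ("and", a, b), ("or", a, b), ("implies", a, b).
    """
    # Remove a connective from the first compound formula on the left.
    for i, f in enumerate(left):
        if isinstance(f, tuple):
            rest = left[:i] + left[i + 1:]
            op = f[0]
            if op == "not":
                return prove(rest, right + [f[1]])
            if op == "and":
                return prove(rest + [f[1], f[2]], right)
            if op == "or":
                return prove(rest + [f[1]], right) and prove(rest + [f[2]], right)
            if op == "implies":
                return prove(rest + [f[2]], right) and prove(rest, right + [f[1]])
            raise ValueError(f"unknown connective: {op!r}")
    # Otherwise remove a connective from the first compound formula on the right.
    for i, f in enumerate(right):
        if isinstance(f, tuple):
            rest = right[:i] + right[i + 1:]
            op = f[0]
            if op == "not":
                return prove(left + [f[1]], rest)
            if op == "and":
                return prove(left, rest + [f[1]]) and prove(left, rest + [f[2]])
            if op == "or":
                return prove(left, rest + [f[1], f[2]])
            if op == "implies":
                return prove(left + [f[1]], rest + [f[2]])
            raise ValueError(f"unknown connective: {op!r}")
    # Only atoms remain: the sequent holds iff the two sides share an atom.
    return any(atom in right for atom in left)


def is_tautology(formula):
    """A formula is a tautology iff the sequent with it alone on the right is valid."""
    return prove([], [formula])
```

For instance, is_tautology(("or", "p", ("not", "p"))) succeeds, whereas is_tautology(("implies", "p", "q")) fails because the reduced sequent p -> q has no atom in common.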

AIM-13

Author[s]: K. Maling

Symbol Manipulating Language - The Maling-Silver Read Program

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-013.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-013.pdf

Three types of expressions can be read: (1) M-expressions, (2) S-expressions, (3) algebraic expressions. The program uses RDB, which means that single embedded blanks may be part of a print name; that a left parenthesis followed by, or a right parenthesis preceded by, any combination of periods and commas is treated as a special parenthesis; and that any combination of + - = * 1 is an "operation" group.

AIM-12

Author[s]: John McCarthy

Programs in LISP

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-012.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-012.pdf

This memorandum depends only on the RLE QPR No. 53 discussion of LISP. Its objective is to add to the system of that report a program feature. This takes the form of allowing functions to be defined by programs including sequences of Fortran-like statements, e.g. Y= cons[ff[subst[A;y;z]]; (A,B)]. Such a feature was included in the informal version of LISP from which we hand-compiled into SAP and is also available in the latest version of the apply operator. The version in the present apply operator is added merely as a convenience and does not have the mathematical elegance that we require. In the present memorandum, I will try to add a program feature to the system in a systematic way. It may be some time before this version is available in the programming system.

AIM-11

Author[s]: J. McCarthy

Recursive Functions of Symbolic Expressions and Their Computation

March 30, 1959

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-011.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-011.pdf

This memorandum is a continuation of Memo 8.

AIM-10

Author[s]: K. Maling

The LISP Differentiation Demonstration Program

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-010.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-010.pdf

This program is a byproduct of the machine language which is being developed for the Artificial Intelligence project. It was written because the process of differentiation, and to some extent that of simplification, turned out to be very conveniently expressible in LISP. There are two main reasons for this: one is the fact that algebraic expressions are most easily represented in a computer by means of a list language and the other is the ability of LISP to describe recursive processes.
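
The two points made above, list-like representation and recursion, can be illustrated with a small Python sketch of symbolic differentiation over sums and products. Expressions are numbers, variable names (strings), or tuples ("+", a, b) and ("*", a, b); this encoding, the function names, and the very limited simplifier are assumptions of the sketch and are not drawn from the demonstration program itself.

```python
def simplify(expr):
    """Apply a few local simplifications to a binary "+" or "*" expression."""
    op, a, b = expr
    both_numbers = isinstance(a, (int, float)) and isinstance(b, (int, float))
    if op == "+":
        if both_numbers:
            return a + b
        if a == 0:
            return b
        if b == 0:
            return a
    if op == "*":
        if both_numbers:
            return a * b
        if a == 0 or b == 0:
            return 0
        if a == 1:
            return b
        if b == 1:
            return a
    return expr


def diff(expr, var):
    """Differentiate expr with respect to the variable named var."""
    if isinstance(expr, (int, float)):
        return 0  # derivative of a constant
    if isinstance(expr, str):
        return 1 if expr == var else 0  # derivative of a variable
    op, a, b = expr
    if op == "+":
        # Sum rule: (a + b)' = a' + b'
        return simplify(("+", diff(a, var), diff(b, var)))
    if op == "*":
        # Product rule: (a * b)' = a' * b + a * b'
        return simplify(("+", simplify(("*", diff(a, var), b)),
                              simplify(("*", a, diff(b, var)))))
    raise ValueError(f"unknown operator: {op!r}")
```

Differentiating ("+", ("*", "x", "x"), ("*", 3, "x")) with respect to "x" yields an expression equivalent to x + x + 3; the simplifier only removes zeros and ones, so it does not collect x + x into 2x.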

AIM-9

Author[s]: S. R. Russell

Explanation of Big "P" as of March 20, 1959

March 1959

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-009.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-009.pdf

ERROR is a routine to provide a common location for all routines. Its calling sequence is: SXD SERROR,4 TSX SERROR+1,4. The above is normally followed immediately by up to 20 registers of BCD remarks terminated by a word of 1’s. This may be left out, however. ERROR prints out the remark, if any, the location of the TSX that entered error, restores the console except for the AC overflow, and transfers to the user’s error routine specified by the calling sequence of SETUP.

AIM-8

Author[s]: J. McCarthy

Recursive Functions of Symbolic Expressions and Their Computation by Machine

March 13, 1959

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-008.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-008.pdf

The attached paper is a description of the LISP system starting with the machine-independent system of recursive functions of symbolic expressions. This seems to be a better point of view for looking at the system than the original programming approach. After revision, the paper will be submitted for publication in a logic or computing journal. This memorandum contains only the machine independent parts of the system. The representation of S-expressions in the computer and the system for representing S-functions by computer subroutines will be added.

AIM-7

Author[s]: J. McCarthy

Notes on the Compiler

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-007.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-007.pdf

We will start with a very modest compiler. Our first major goal is a compiler that will compile recursive function definitions. Its input will be LISP statements in restricted notation and its output will be a SAP tape. However we will start with an even simpler compiler that will only compile programs to evaluate expressions and at first we will print rather than punch them.

AIM-6

Author[s]: S. Russell

Writing and Debugging Programs

no date listed

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-006.ps

ftp://publications.ai.mit.edu/ai-publications/pdf/AIM-006.pdf

A subroutine is a fixed set of instructions that is used many times. The kind most often used explicitly are closed subroutines such as MAPLIST which are set up so that they may be used by any part of a program.