The select_alternatives pass performs several tasks: it checks for the existence of a raw profiling data file and, if one is found, processes it; it decides what experiment to conduct next, or whether it is ready to generate a non-experimental version; finally, it creates the actual executable corresponding to its decision.
There are many sources of noise in the profiling data, including short-term variations in usage patterns and temporary variations in machine load (which create noise for the real-time, but not the CPU-time, profiling mechanism). Summing data over repeated program runs can help filter out much of this noise; hence, the current implementation of the clever compiler generates executables which re-invoke the compiler after four runs that successfully save profiling data. Four is merely an arbitrarily chosen default, and can be overridden by the RUNS environment variable.
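The mechanism is not spelled out here, but a minimal sketch of how a generated executable could keep a persistent run count and re-invoke the compiler might look like the following; the counter file name and the recompilation command are hypothetical, not part of the actual implementation.
\begin{verbatim}
/* Sketch only: count successful profiled runs across executions
 * and re-invoke the compiler after the limit is reached. */
#include <stdio.h>
#include <stdlib.h>

#define DEFAULT_RUNS 4

void maybe_recompile(void)
{
    int limit = DEFAULT_RUNS, count = 0;
    const char *env = getenv("RUNS");    /* user override of the default */
    if (env != NULL && atoi(env) > 0)
        limit = atoi(env);

    FILE *f = fopen("prof.runcount", "r+");  /* hypothetical counter file */
    if (f == NULL)
        f = fopen("prof.runcount", "w+");
    if (f == NULL)
        return;
    fscanf(f, "%d", &count);
    count++;                             /* this run saved profiling data */

    if (count >= limit) {
        system("clever-cc --reselect");  /* hypothetical recompile command */
        count = 0;                       /* start a fresh accumulation */
    }
    rewind(f);
    fprintf(f, "%d\n", count);
    fclose(f);
}
\end{verbatim}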
It would be preferable for the default number of runs before recompilation to be adaptive. For example, if select_alternatives notices that the profiled parts of the program collectively took less than a second of real time to execute in the last four runs, it could adjust the number of runs upward to reduce the frequency of recompilation: there is not much benefit to optimizing such a program, and recompiling less often helps amortize the cost of recompilation. Another factor in how many runs should occur before recompilation is whether the clever compiler is still in an experimental phase, when most of the program version space is still unexplored, or has reached an equilibrium phase, when much of the space has been adequately explored and the clever compiler therefore has a good idea which version is optimal for current usage. Even in the equilibrium phase, the clever compiler should still monitor program performance, but recompilations can occur at lengthier intervals.
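Such a policy might look like the following sketch, where the one-second threshold comes from the example above but the scaling factors are purely illustrative assumptions.
\begin{verbatim}
/* Sketch of an adaptive run-count policy; thresholds and factors
 * are illustrative, not part of the implementation. */
int adjust_run_limit(int current_limit, double profiled_seconds,
                     int in_equilibrium)
{
    if (profiled_seconds < 1.0)   /* profiled code barely ran: little to gain */
        return current_limit * 2; /* so recompile less often */
    if (in_equilibrium)           /* space well explored: just monitor */
        return current_limit * 4;
    return current_limit;         /* experimental phase: keep probing */
}
\end{verbatim}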
Given the formulation presented earlier of the problem faced by the clever compiler as searching a space of program versions, any of a large number of generic techniques for space-searching and function minimization can be applied. Because of time considerations, I chose to implement the most naive possible procedure, a combination of the generate-and-test principle and the British Museum procedure: namely, select_alternatives generates all possible program versions and tries them all in the generated order. Even this procedure is currently implemented incompletely: a complete implementation would automatically choose the program version which produced the shortest run time, whereas the current one merely emits a warning to the user when it runs out of program versions to try.
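To make the procedure concrete, the following sketch performs a generate-and-test/British Museum search over a three-dimensional version space like the one used as an example below. Here timed_run is a hypothetical stand-in for generating one version, running it, and reading back its profiled run time; the real select_alternatives tries one version per recompilation cycle rather than looping in a single process.
\begin{verbatim}
/* Sketch of exhaustive generate-and-test over a version space. */
#include <stdio.h>

extern double timed_run(int a, int b, int c);  /* hypothetical: build + run + profile */

void british_museum_search(void)
{
    double best = 1e30;
    int best_a = 0, best_b = 0, best_c = 0;

    for (int a = 0; a < 10; a++)            /* first dimension  */
        for (int b = 0; b < 4; b++)         /* second dimension */
            for (int c = 0; c < 20; c++) {  /* third dimension  */
                double t = timed_run(a, b, c);
                if (t < best) {
                    best = t; best_a = a; best_b = b; best_c = c;
                }
            }
    printf("best version: a=%d b=%d c=%d (%g s)\n",
           best_a, best_b, best_c, best);
}
\end{verbatim}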
Note that some function minimization techniques, such as hill-climbing (alternately, gradient descent), explicitly require functions of nearly continuously variable parameters which have calculable derivative or gradient functions. However, other techniques, such as the downhill simplex method and Powell's method (direction-set methods) [], do not require the derivative or gradient. Still other techniques, like simulated annealing, are designed to solve combinatorial problems, where the dimensions of the space are discrete rather than continuous, and it is explicitly recognized that in such a configuration space the notion of ``continuing downhill in a favorable direction'' may be meaningless. The latter set of techniques therefore seems to offer the best fit for the clever compiler's discrete version space.
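For illustration, here is a minimal simulated-annealing sketch over the same kind of discrete version space; the cost function, the single-dimension move rule, and the geometric cooling schedule are illustrative assumptions, not part of the clever compiler.
\begin{verbatim}
/* Sketch of simulated annealing over a discrete configuration space. */
#include <stdlib.h>
#include <math.h>

extern double cost(const int v[3]);      /* hypothetical: measured run time */

void anneal(int v[3], const int dims[3]) /* dims = size of each dimension */
{
    double T = 1.0, cur = cost(v);
    while (T > 0.001) {
        int n[3] = { v[0], v[1], v[2] };
        int d = rand() % 3;              /* perturb one discrete dimension */
        n[d] = rand() % dims[d];
        double c = cost(n);
        /* accept improvements always; uphill moves with prob. e^(-dc/T) */
        if (c < cur || exp((cur - c) / T) > (double)rand() / RAND_MAX) {
            v[0] = n[0]; v[1] = n[1]; v[2] = n[2];
            cur = c;
        }
        T *= 0.95;                       /* cool the temperature */
    }
}
\end{verbatim}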
Given that the clever compiler is searching in a very specific kind of domain, additional heuristics may be applied to improve the efficiency and the effectiveness of a combinatorial space search. For example, non-interacting sets of dimensions can be factored apart. Suppose $a$ represents the first dimension (a set of 10 quasistatic variables which are always used together in the same qif statements), $b$ represents the second dimension (a quasistatic parameter which can take on any of 4 different values), and $c$ represents the third dimension (a quasistatic parameter which can take on any of 20 different values); then we are seeking to minimize a run-time function $t(a,b,c)$ in a space of 800 possible program versions. However, if $a$'s effect on the performance of the program and $b$'s appear to be dependent on each other because there is a qinfluence statement for $b$ which encloses one of the $a$ qif statements, but $c$'s appears to be ``completely independent'', then we can rewrite what we are trying to minimize as $t(a,b,c) = t_1(a,b) + t_2(c)$, in which case $t_1(a,b)$ and $t_2(c)$ can be independently minimized, effectively collapsing the search space.
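Concretely, the decomposition pays off because the two subproblems can be exhausted separately:
\[
\underbrace{10 \times 4 \times 20 = 800}_{\mbox{versions for } t(a,b,c)}
\quad\longrightarrow\quad
\underbrace{10 \times 4}_{\mbox{for } t_1(a,b)} + \underbrace{20}_{\mbox{for } t_2(c)} = 60 .
\]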
Also, as a first approximation, the clever compiler's search engine might assume that $t(a,b,c_2)$ lies between $t(a,b,c_1)$ and $t(a,b,c_3)$ if $c_1$ and $c_3$ are relatively close to each other in value and it is known that $c_1 < c_2 < c_3$; this is a weak belief that $t$ is close to continuous when moving only along the dimension corresponding to a quasistatic parameter. Finally, if profiling indicates that the parts of the program whose performance $a$ and $b$ influence take up a negligible percentage of the total run time, then the search engine can, to a first approximation, ignore the $a$ and $b$ dimensions altogether.
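One way the search engine might act on the weak-continuity belief is to skip measuring an intermediate parameter value when the measurements at the two surrounding values already agree. In this sketch, timed_run is the same hypothetical stand-in as before, and the 5% agreement threshold is an arbitrary illustration.
\begin{verbatim}
/* Sketch: interpolate t(a,b,c2) instead of measuring it when the
 * neighboring measurements at c1 < c2 < c3 are close. */
#include <math.h>

extern double timed_run(int a, int b, int c);   /* hypothetical, as before */

double estimate(int a, int b, int c1, int c2, int c3)
{
    double t1 = timed_run(a, b, c1);
    double t3 = timed_run(a, b, c3);
    /* endpoints agree to within 5%: assume continuity and interpolate */
    if (fabs(t1 - t3) < 0.05 * t1)
        return (t1 + t3) / 2.0;
    return timed_run(a, b, c2);                 /* otherwise, measure */
}
\end{verbatim}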
A decision determines exactly one version of the program; it consists of the set of quasistatic variables which have been selected for, together with a selected value for each quasistatic parameter. Implementing a decision merely requires creating definitions in the SUIF file with the appropriate initializing value for each quasistatic variable (the integer value 1 if selected for, 0 if not) and for each quasistatic parameter. Then the create_executable script is invoked; it calls the SUIF s2c pass and then gcc to compile and link the program against the profiling library.
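For instance, implementing one decision might amount to emitting definitions equivalent to the following C into the SUIF file; the identifier names and values here are hypothetical illustrations.
\begin{verbatim}
/* Hypothetical definitions emitted to implement one decision. */
int qs_use_unrolled_loop = 1;   /* quasistatic variable: selected for     */
int qs_use_inline_table  = 0;   /* quasistatic variable: not selected for */
int qs_block_size        = 64;  /* quasistatic parameter: chosen value    */
\end{verbatim}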