Informatik-Vollmer
The goal of this dissertation was to develop methods that improve the runtime of parallel programs through compiler optimizations. To this end, 1. the theoretical data-flow analysis background for parallel programs was derived; 2. based on this, practical and efficient algorithms for analysing parallel programs were developed; and 3. it was shown how well-known optimizing algorithms can be implemented in a compiler for parallel programming languages.
The three main results of the work were:
- the data-flow analysis theory for parallel programs (pDFA)
- data structures (pCFG, pSSA) and algorithms to perform the data-flow analysis
- implementation of two very effective optimization techniques for parallel programs: elimination of common subexpressions and lazy code motion.
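The bitvector framework behind these results can be sketched in a few lines: each analysed property contributes one bit, and every node's effect is the classic gen/kill transfer function. The following is a minimal illustration (names and bit layout are our own, not taken from the dissertation):

```python
# Illustrative sketch of a gen/kill bitvector transfer function.
# Bits are packed into a Python int; bit i stands for expression i.

def transfer(in_bits: int, gen: int, kill: int) -> int:
    """Classic unidirectional bitvector step: out = gen | (in & ~kill)."""
    return gen | (in_bits & ~kill)

# Example: bits 0b101 hold on entry; the node kills expression 0
# (bit 0) and generates expression 1 (bit 1).
out = transfer(0b101, gen=0b010, kill=0b001)
print(bin(out))  # 0b110
```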
We consider parallel programs with shared memory and interleaving semantics, for which we show how to construct optimal analysis algorithms for unidirectional bitvector problems that are as efficient as their purely sequential counterparts and can easily be implemented. Whereas the complexity result is rather obvious, our optimality result is a consequence of a new Kam/Ullman-style Coincidence Theorem. Thus, using our method, the standard algorithms for sequential programs computing liveness, availability, very busyness, reaching definitions, or definition-use chains, as well as the analyses for performing code motion, assignment motion, partial dead-code elimination or strength reduction, can easily be transferred to the parallel setting at almost no cost.
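As a reminder of what such a standard sequential algorithm looks like, here is a minimal liveness computation over a three-node straight-line CFG, iterated to a fixed point (the node names and the round-robin formulation are our own, chosen for brevity):

```python
# Minimal backward liveness analysis on a tiny CFG, as a bitvector
# (set) problem: live_in(n) = use(n) | (live_out(n) - def(n)).

cfg = {            # successor lists: n1 -> n2 -> n3
    "n1": ["n2"],
    "n2": ["n3"],
    "n3": [],
}
use = {"n1": {"a"}, "n2": {"a", "b"}, "n3": {"b"}}
defs = {"n1": {"b"}, "n2": {"c"}, "n3": set()}

live_in = {n: set() for n in cfg}
changed = True
while changed:                      # iterate until a fixed point
    changed = False
    for n in cfg:
        succs = cfg[n]
        live_out = set().union(*(live_in[s] for s in succs)) if succs else set()
        new = use[n] | (live_out - defs[n])
        if new != live_in[n]:
            live_in[n], changed = new, True

print(sorted(live_in["n1"]))  # ['a']  (b is defined in n1 before its use)
```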
Data flow analysis is the prerequisite for performing optimizations such as code motion of partially redundant expressions on imperative sequential programs. To apply these transformations to parallel imperative programs, the notion of data flow must be extended to concurrent programs. The additional parallel source-language features are: shared memory and nested parallel statements (PAR). The underlying interleaving semantics of the concurrently executed processes results in the so-called state space explosion which, at first sight, prevents the computation of the meet-over-all-paths solution needed for data flow analysis. For the class of bitvector data flow problems we can show that not all interleavings are needed to compute the meet-over-all-paths solution. Based on this, we can give simple data flow equations representing the data flow effects of the PAR statement. The definition of a parallel control flow graph leads to an efficient extension of Kildall's algorithm that computes the data flow of a concurrent program. The time complexity is the same as for analysing a "comparable" sequential program.
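The state-space explosion mentioned above is easy to quantify: two independent processes with n and m atomic steps already have C(n+m, n) distinct interleavings. A quick check (our own illustration, not part of the original abstract):

```python
# Counting interleavings of two independent instruction sequences:
# the number of ways to merge n steps with m steps is choose(n+m, n),
# which grows combinatorially -- the "state space explosion".
from math import comb

def interleavings(n: int, m: int) -> int:
    """Number of interleavings of an n-step and an m-step process."""
    return comb(n + m, n)

print(interleavings(3, 3))    # 20
print(interleavings(10, 10))  # 184756
```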
Code motion is well known as a powerful technique for the optimization of sequential programs. It improves runtime efficiency by avoiding unnecessary recomputations of values, and it is even possible to obtain computationally optimal results, i.e., results where no program path can be improved any further by means of semantics-preserving code motion. In this paper we present a code motion algorithm that, for the first time, achieves this optimality result for parallel programs. Fundamental to it is the framework of [Knoop, Steffen, Vollmer: Parallelism for Free: Efficient and Optimal Bitvector Analyses for Parallel Programs, TOPLAS, May 1996], which shows how to perform optimal bitvector analyses for parallel programs as easily and as efficiently as for sequential ones. Moreover, the analyses can easily be adapted from their sequential counterparts. This is demonstrated here by constructing a computationally optimal code motion algorithm for parallel programs by systematically extending its sequential counterpart, the busy code motion transformation of [Knoop, Rüthing, Steffen: Lazy Code Motion, PLDI 1992; Code Motion: Theory and Practice, TOPLAS 16(4), 1994].
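To make the effect of code motion concrete, here is a schematic before/after pair, with plain Python standing in for an intermediate representation. The transformation shown is ordinary hoisting of a loop-invariant, redundantly recomputed expression, meant only to illustrate the idea, not the paper's algorithm:

```python
# Schematic effect of code motion: a + b never changes inside the
# loop, yet the "before" version recomputes it on every iteration.

def before(a, b, xs):
    out = []
    for x in xs:
        out.append(x * (a + b))   # a + b recomputed each iteration
    return out

def after(a, b, xs):
    t = a + b                     # hoisted: computed once, reused
    out = []
    for x in xs:
        out.append(x * t)
    return out

print(before(2, 3, [1, 2]) == after(2, 3, [1, 2]))  # True
```

Semantics preservation means exactly this: both versions compute the same values on every path, but the optimized one performs fewer evaluations of `a + b`.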
In this paper we show how to construct optimal bitvector analysis algorithms for parallel programs with shared memory that are as efficient as their purely sequential counterparts, and which can easily be implemented. Whereas the complexity result is rather obvious, our optimality result is a consequence of a new Kam/Ullman-style Coincidence Theorem. Thus, using our method, the standard algorithms for sequential programs computing liveness, availability, very busyness, reaching definitions, or definition-use chains, or performing partially redundant expression and assignment elimination, partial dead code elimination or strength reduction, can be transferred straightforwardly to the parallel setting at almost no cost.
Data flow analysis is the prerequisite for performing optimizations such as common subexpression elimination or code motion of partially redundant expressions on imperative sequential programs. To apply these transformations to parallel imperative programs, the notion of data flow must be extended to concurrent programs. The additional source-language features are: common address space (shared memory), nested parallel statements (PAR), or-parallelism, critical regions and message passing. The underlying interleaving semantics of the concurrently executed processes results in the so-called state space explosion which, at first sight, prevents the computation of the meet-over-all-paths solution needed for data flow analysis. For the class of one-bit data flow problems (also known as bitvector problems) we can show that not all interleavings are needed to compute the meet-over-all-paths solution. Based on that, we can give simple data flow equations representing the data flow effects of the PAR statement. The definition of a parallel control flow graph leads to an efficient extension of Kildall's algorithm to compute the data flow of a concurrent program. The time complexity is the same as for analyzing a "comparable" sequential program.
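The key point, that a PAR statement's data flow effect can be computed from the branch effects alone, can be sketched for a must-problem such as availability. The sketch below deliberately collapses each branch into a single gen/kill pair, which is our own simplification for illustration, not the paper's node-level equations:

```python
# Simplified illustration: combining the gen/kill effects of two PAR
# branches for a must-problem (e.g. availability) without enumerating
# interleavings.  Each branch is collapsed to one gen/kill pair here.

def par_effect(in_bits, gen1, kill1, gen2, kill2):
    # A bit holds after PAR only if no interleaving can destroy it:
    # either one branch generates it and the other never kills it,
    # or it held on entry and neither branch kills it.
    survive = in_bits & ~kill1 & ~kill2
    from_b1 = gen1 & ~kill2
    from_b2 = gen2 & ~kill1
    return survive | from_b1 | from_b2

# Bit 0 is generated by branch 1 but killed by branch 2, so some
# interleaving destroys it; bit 1 held on entry and nobody kills it.
print(bin(par_effect(0b010, gen1=0b001, kill1=0, gen2=0, kill2=0b001)))  # 0b010
```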
In this paper we show how to construct optimal bitvector analysis algorithms for parallel programs with shared memory that are as efficient as their purely sequential counterparts, and which can easily be implemented. Whereas the complexity result is rather obvious, our optimality result is a consequence of a new Kam/Ullman-style Coincidence Theorem. Thus, the important merits of sequential bitvector analyses survive the introduction of parallel statements.
Traditional data flow analysis methods are designed for sequential programs and may therefore fail when applied to control-flow-parallel imperative programs that share memory and are based on the MIMD computer model. Current approaches mostly use a copy-in/copy-out semantics when entering or exiting a process to model shared memory. To avoid this restriction, this paper extends the notion of program execution paths. Selecting specific paths out of the set of all possible paths allows us to give simple data flow equations which are proved to be equal to the meet-over-all-paths solution. Since these data flow equations are extensions of the sequential ones, they fit in very well with traditional optimization methods. An example shows that the code generator of a compiler, as well as a reordering assembler, needs this kind of data flow analysis to avoid unnecessary memory barrier instructions and to produce correct instruction reorderings.
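A hedged sketch of the kind of situation meant here: a purely sequential optimizer would treat a second read of the same variable as redundant and reuse the first value, but with a concurrently running writer the two reads may legitimately differ. The code below is our own illustration, using a dict as stand-in "shared memory" and a function call as stand-in for an interleaved process step:

```python
# Illustration: why sequential redundancy elimination is unsound for
# shared memory.  shared["x"] is read twice; a sequential optimizer
# would reuse the first value, but another process may write between
# the two loads.

shared = {"x": 1}

def writer():
    shared["x"] = 2        # stands in for a step of a concurrent process

def demo():
    shared["x"] = 1
    a = shared["x"]        # first load of the shared variable
    writer()               # interleaved write by the "other process"
    b = shared["x"]        # "redundant" second load -- but x changed
    return a, b

print(demo())  # (1, 2): reusing a for b would be a miscompilation
```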