Design and Implementation of pSystem for Distributed Memory Architectures
Hervé Miguel Cordeiro Paulino
MSc in Computer Science
Department of Computer Science
Faculty of Science, University of Porto
October 1998
Abstract
Recently, new parallel machines based on commodity components have
been proposed and built. These are usually distributed memory
architectures composed of clusters of multiprocessors interconnected
by very fast networks, such as Myrinet or ATM. The success of these
architectures, however, depends on the programming environments
they offer to users. Parallel interfaces such as MPI or PVM are a
great help for programming distributed architectures, but they still
require users to have detailed knowledge of the underlying
architecture and to control communication explicitly. An
alternative approach is to use a software layer, such as TreadMarks,
to provide virtual shared memory, thus allowing a programming model
closer to that of shared memory.
This thesis presents the design and implementation of the di_pSystem,
a programming environment for distributed memory architectures. It is
an extension of the pSystem, a programming environment for shared
memory that allows the annotation of C programs for parallel
execution. Our view is that the new system should retain, as much as
possible, the ease of programming of the pSystem as well as its
architectural modularity. Like the pSystem, the di_pSystem
dynamically schedules and load-balances all parallel work that becomes
available during parallel execution. Communication is hidden from the
programmer and is handled automatically by the environment whenever
necessary. Programmers are only required to identify the C functions
that are candidates for parallel execution; communication for work
distribution is managed implicitly by the system.
The communication module within the di_pSystem uses the MPI interface to
support distributed computations. This is the only system-dependent
module, but it can easily be adapted to support other systems. The
system can also serve as a test bed for studying the performance of
scheduling algorithms, and it already includes implementations of four
dynamic algorithms. We have also implemented a small number of applications
that have served as benchmarks for an initial study of the performance
of the system.