An Experimental Study of Methods for Parallel Preconditioned Krylov Methods

Abstract
High-performance multiprocessor architectures differ both in the number of processors and in the delay costs for synchronization and communication. To obtain good performance on a given architecture for a given problem, adequate parallelization, good load balance, and an appropriate choice of granularity are essential. This document discusses the implementation of a parallel version of PCGPAK for both shared-memory architectures and hypercubes. The authors' parallel implementation is efficient enough to solve their test problems on 16 processors of the Encore Multimax/320 in a small multiple of the time required by a single head of a Cray X/MP, even though the peak performance of the Multimax processors is not even close to the supercomputer range. The authors illustrate the effectiveness of their approach on a number of model problems from reservoir engineering and mathematics.
