Performance Analysis
It's easy to write bad parallel programs and hard to write good ones. Performance analysis and tuning aim to maximize the utilization of computing resources. As programs grow more complex, with widespread parallelism and heterogeneous architectures, analysis must become more sophisticated. Our most recent work in this area analyzes the performance characteristics of parallel programs using message-passing traces of their execution on a set of processors in distributed-memory environments. With this methodology, one can investigate how perturbations in both single-processor performance and the messaging layer affect the performance of the traced run. The analysis provides a quantitative description of an application's sensitivity to a variety of performance parameters, which helps identify the range of systems on which the application can be expected to perform well. These parameters include operating system interference and variability in message latencies within the interconnection network.
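As a rough illustration of the idea of replaying a message-passing trace under perturbation, the sketch below walks a tiny hand-written trace and recomputes each rank's finish time after adding a fixed delay to every compute phase or message. It is a minimal sketch and not our analysis tool: the event format, the `replay` function, and the `compute_noise` and `latency_noise` parameters are all hypothetical, sends are assumed to be non-blocking and free, and the perturbation is a constant rather than a sampled distribution.

```python
from collections import deque


def replay(trace, latency, compute_noise=0.0, latency_noise=0.0):
    """Replay a message-passing trace and return each rank's finish time.

    trace maps rank -> ordered list of events (hypothetical format):
      ("compute", duration), ("send", dest, tag), ("recv", src, tag)
    compute_noise is added to every compute event; latency_noise is added
    to every message's latency.
    """
    clock = {rank: 0.0 for rank in trace}
    cursor = {rank: 0 for rank in trace}
    in_flight = {}  # (src, dest, tag) -> queue of message arrival times

    progressed = True
    while progressed:
        progressed = False
        for rank, events in trace.items():
            while cursor[rank] < len(events):
                event = events[cursor[rank]]
                if event[0] == "compute":
                    clock[rank] += event[1] + compute_noise
                elif event[0] == "send":
                    key = (rank, event[1], event[2])
                    arrival = clock[rank] + latency + latency_noise
                    in_flight.setdefault(key, deque()).append(arrival)
                else:  # "recv"
                    key = (event[1], rank, event[2])
                    queue = in_flight.get(key)
                    if not queue:
                        break  # matching send has not been replayed yet
                    # The receiver waits until the message has arrived.
                    clock[rank] = max(clock[rank], queue.popleft())
                cursor[rank] += 1
                progressed = True

    if any(cursor[rank] < len(trace[rank]) for rank in trace):
        raise ValueError("trace contains a receive with no matching send")
    return clock


# Two ranks: rank 0 computes and sends, rank 1 receives and then computes.
example_trace = {
    0: [("compute", 1.0), ("send", 1, 0)],
    1: [("recv", 0, 0), ("compute", 2.0)],
}

baseline = replay(example_trace, latency=0.5)
noisy_cpu = replay(example_trace, latency=0.5, compute_noise=0.1)
noisy_net = replay(example_trace, latency=0.5, latency_noise=0.25)
```

Comparing `baseline` with `noisy_cpu` and `noisy_net` shows how much of each perturbation reaches the end of the run on each rank, which is the kind of sensitivity question the trace-based analysis is meant to answer.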
Our techniques for quantifying operating system interference as observed by applications have found widespread use throughout the Department of Energy complex, and have been used in tuning the fastest machines in existence.
The following are links to our work in this area:
- Performance analysis of parallel programs via message-passing graph traversal
- Performance Technology for Parallel and Distributed Component Software
- Analysis of microbenchmarks for performance tuning of clusters
- FTQ: Microbenchmark for operating system interference quantification
- supermon: Low-impact cluster monitoring
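The fixed-time-quantum idea behind a microbenchmark such as FTQ can be sketched in a few lines: do trivial units of work for a series of equal-length time slices and record how many units complete in each slice. Dips in the counts suggest that something else took the processor during that slice. The code below is only an illustrative sketch and not the FTQ tool itself: the function name `fixed_quantum_counts` and its parameters are hypothetical, and running it in Python adds interpreter noise that a careful measurement would avoid.

```python
import time


def fixed_quantum_counts(num_quanta=200, quantum_ns=1_000_000):
    """Count units of trivial work completed in each fixed time quantum."""
    counts = []
    start = time.perf_counter_ns()
    for i in range(num_quanta):
        # Deadlines are fixed relative to the start so quanta do not drift.
        deadline = start + (i + 1) * quantum_ns
        count = 0
        while time.perf_counter_ns() < deadline:
            count += 1  # one unit of work
        counts.append(count)
    return counts


counts = fixed_quantum_counts(num_quanta=50, quantum_ns=500_000)
print("min:", min(counts), "max:", max(counts))
```

A large gap between the minimum and maximum counts hints at interference during some quanta. A count of zero means an entire quantum had already elapsed before the loop reached it.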