ABSTRACT
The HPC Challenge (HPCC) benchmark suite is increasingly being used to evaluate the performance of supercomputers. It augments the traditional LINPACK benchmark by adding six more benchmarks, each designed to measure a specific aspect of system performance. In this paper, we analyze the HPCC RandomAccess benchmark, which is designed to measure the performance of random memory updates. We show that, on many systems, the bisection bandwidth of the network may be the performance bottleneck of this benchmark. We suggest a technique based on aggregation and software routing that may be used to optimize this benchmark. We report the performance results obtained using this technique on the Blue Gene/L supercomputer.
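As a rough illustration of the two ideas named in the abstract, the sketch below simulates random XOR updates to a table split across several ranks, first applied directly and then aggregated into one bucket per destination rank before being applied. It is a single-process sketch under stated assumptions, not the paper's implementation: the function names, the seed value, and the in-memory "ranks" are hypothetical, and the software routing step across the network is not modelled at all.

```python
POLY = 0x7                 # feedback polynomial of the 64-bit shift register
MASK64 = (1 << 64) - 1     # keep values within 64 bits


def next_random(ran):
    """Advance the 64-bit shift-register pseudo-random sequence by one step."""
    high_bit_set = ran >> 63
    return ((ran << 1) & MASK64) ^ (POLY if high_bit_set else 0)


def direct_updates(table_size, num_updates, seed=1):
    """Apply each random XOR update to a flat table as soon as it is generated."""
    table = list(range(table_size))          # table[i] starts out as i
    ran = seed                               # assumption: arbitrary nonzero seed
    for _ in range(num_updates):
        ran = next_random(ran)
        table[ran & (table_size - 1)] ^= ran
    return table


def aggregated_updates(num_ranks, local_size, num_updates, seed=1):
    """Bucket updates by owning rank, then apply each bucket in one pass.

    num_ranks and local_size are assumed to be powers of two.
    """
    table_size = num_ranks * local_size
    tables = [[r * local_size + i for i in range(local_size)]
              for r in range(num_ranks)]
    buckets = [[] for _ in range(num_ranks)]  # one aggregation buffer per rank
    ran = seed
    for _ in range(num_updates):
        ran = next_random(ran)
        owner = (ran & (table_size - 1)) // local_size
        buckets[owner].append(ran)            # stands in for one batched message
    for rank, bucket in enumerate(buckets):
        for value in bucket:
            tables[rank][value & (local_size - 1)] ^= value
    return tables


if __name__ == "__main__":
    flat = direct_updates(4 * 8, 100)
    parts = aggregated_updates(4, 8, 100)
    print(sum(parts, []) == flat)
```

Because XOR updates commute, reordering them into per-rank buckets leaves the final table unchanged, which is what makes batching many small updates into fewer, larger messages possible in principle.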