Extension of ParJava model for HPC clusters with multicore nodes.
In the early 2000s, each node of a distributed-memory high-performance cluster contained a single-core processor, and each MPI process of a parallel application used all the resources of its node. Today vendors offer microprocessors with multiple cores per chip and boards that carry several processors. Running multiple threads on a node with modern multicore processors can increase the performance of a parallel application through the use of shared memory and lower overhead. The model for parallel SPMD programs has been extended to support Java threads, which allow a program to make better use of the resources of a multicore processor. The extended model makes it possible to estimate the execution time of a parallel program with explicit calls to the MPI library in which each process may use parallel Java threads. However, using threads in the Java environment raises a number of problems. This paper gives recommendations for tuning the performance of multiprocess-multithreaded programs with respect to JVM memory management, garbage collector configuration, management of local buffers, and so on. The Java version of the FT (Fast Fourier Transform) parallel application from NPB has been adapted to the multiprocess-multithreaded environment. Tests of the adapted application show a 9-14% performance improvement. A model of the multiprocess-multithreaded application has been developed, and performance prediction for multiprocess-multithreaded FT shows a prediction error of 3-7%.
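As a rough illustration of the thread-level part of such a multiprocess-multithreaded program, the sketch below splits the data owned by one process across Java threads and combines the partial results. It is a minimal sketch under stated assumptions, not the paper's implementation: the class name `NodeLocalThreads`, the methods `sumSlice` and `parallelSum`, and the summation workload are all hypothetical, and the MPI communication between processes is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; stands in for the work of ONE MPI process.
public class NodeLocalThreads {

    // Sums one contiguous slice of the process-local data. Each worker
    // only reads its own slice, so no locking is needed.
    static double sumSlice(double[] data, int from, int to) {
        double s = 0.0;
        for (int i = from; i < to; i++) {
            s += data[i];
        }
        return s;
    }

    // Splits the process-local array into nearly equal slices, one per
    // thread, and combines the partial sums. In a hybrid program, a
    // reduction across MPI processes would follow this step (not shown).
    public static double parallelSum(double[] data, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Double>> parts = new ArrayList<>();
            int chunk = (data.length + threads - 1) / threads; // ceiling division
            for (int t = 0; t < threads; t++) {
                final int from = Math.min(data.length, t * chunk);
                final int to = Math.min(data.length, from + chunk);
                Callable<Double> task = () -> sumSlice(data, from, to);
                parts.add(pool.submit(task));
            }
            double total = 0.0;
            for (Future<Double> f : parts) {
                total += f.get(); // waits for the worker to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        double[] local = new double[1_000];
        for (int i = 0; i < local.length; i++) {
            local[i] = 1.0;
        }
        // Assumption: one thread per available core of the node.
        int threads = Runtime.getRuntime().availableProcessors();
        System.out.println(parallelSum(local, threads));
    }
}
```

The thread count is taken from the number of available cores here only for illustration; in a real cluster run it would have to be chosen together with the number of MPI processes placed on each node.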
Proceedings of the Institute for System Programming, vol. 23, 2012, pp. 13-32.
ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).
DOI: 10.15514/ISPRAS-2012-23-1