Instruction scheduling and software pipelining for modern architectures.
We describe an approach to instruction scheduling and software pipelining based on a two-stage extensible architecture for detecting and exploiting the available instruction-level parallelism. The detection stage is based on selective scheduling and consists of a kernel that supports instruction movement with bookkeeping code creation and instruction unification, together with modules that implement additional instruction transformations such as register renaming, control and data speculation, and predication (conditional execution). The usage stage is a set of heuristics for choosing the best instruction to schedule on each scheduler iteration. In addition to the usual critical-path heuristics, we introduce execution probability heuristics to control speculative movements, limits on register renaming based on tracked register pressure, and an additional scheduling pass that quickly removes the schedule holes that software pipelining may introduce. We concentrate on the improvements to the basic approach that were needed to implement the suggested method in an industrial compiler. We also show experimental results for the scheduler on Intel Itanium and ARM platforms: on Itanium the speedup reaches 4% on average on the SPEC FP 2000 tests, with up to 10% on individual tests, while initial ARM support yields a 1-3% speedup on smaller programs.
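To make the usage stage concrete, the following is a minimal sketch of how such heuristics might be combined when picking the next instruction. It is an illustration only and not the implementation described in the paper: the `Candidate` fields, the `choose_best` function, the probability threshold, and the register pressure limit are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """An instruction available for scheduling (hypothetical model)."""
    name: str
    critical_path: int        # longest latency path from this instruction to the region end
    exec_probability: float   # estimated chance the instruction executes if moved here
    needs_renaming: bool = False  # moving it requires a fresh register


def choose_best(candidates, min_probability=0.4,
                register_pressure=0, max_pressure=16):
    """Return the candidate to schedule next, or None if none is acceptable.

    All thresholds are assumed values for illustration.
    """
    eligible = []
    for c in candidates:
        # Execution probability heuristic: reject speculative moves
        # that are too unlikely to be useful.
        if c.exec_probability < min_probability:
            continue
        # Renaming limit: do not rename when register pressure is already high.
        if c.needs_renaming and register_pressure >= max_pressure:
            continue
        eligible.append(c)
    if not eligible:
        return None
    # Critical path heuristic first, execution probability as a tie-breaker.
    return max(eligible, key=lambda c: (c.critical_path, c.exec_probability))


candidates = [
    Candidate("load", critical_path=7, exec_probability=0.3),
    Candidate("add", critical_path=5, exec_probability=1.0),
    Candidate("mul", critical_path=5, exec_probability=0.8, needs_renaming=True),
]
best = choose_best(candidates)
print(best.name if best else None)
```

In this sketch the filters act as hard limits and the critical path length orders the remaining candidates; a production scheduler would likely weigh these factors in a more nuanced way.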
Proceedings of the Institute for System Programming, vol. 22, 2012, pp. 19-32.
ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).