Applying Java bytecode static instrumentation for software dynamic analysis
This paper focuses on dynamic analysis of Java programs. We consider the following limitations: analysis tool may not have access to target program source code, and the program may be interpreted by a non-standard virtual machine with bytecode format different from Java Virtual Machine specifications. The paper describes an approach to bytecode instrumentation which is used to perform iterative dynamic analysis for new execution path discovery. Path discovery is performed through automatic input data generation by tracing tainted data, collecting path conditions, and satisfiability checking. The proposed approach is based on static bytecode instrumentation. The main advantages of this approach are analysis speedup (because of one-time instrumentation) and explicit access to statically generated instrumented bytecode which makes it possible to run instrumented program on different virtual machines with different bytecode formats. Proposed approaches were implemented in the Coffee Machine tool. Paper sections dedicated to this tool provide a detailed description of taint data tracing and automatic branch traversing techniques as well as a set of instrumentation utilities based on Coffee Machine allowing executed instructions printing, taint trace dumping, and synchronization events trace generation. Coffee Machine uses BCEL (bytecode instrumentation library) for instrumentation. The paper concludes with an overview of practical restrictions existing for discussed methods and possible future work directions. Main disadvantage of proposed approach is the inability to access dynamic data at run-time and instrument a set of system class methods. It may be resolved by method simulation and execution environment modifications.
Proceedings of the Institute for System Programming, vol. 27, issue 1, 2015, pp. 25-38.
ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).
DOI: 10.15514/ISPRAS-2015-27(1)-2Full text of the paper in pdf (in Russian) Back to the contents of the volume