Within the scope of a research and development project for Samsung, the program analysis group developed an instrumentation tool for automatic modification of Android/Tizen ARM executables and libraries. Target files are modified by processing and replacing binary code directly, using an approach similar to aspect-oriented programming. The tool was primarily used for performance analysis of Android and Tizen system executables and libraries that draw the GUI.
Software plays a key role in many systems, e.g., safety-, security-, and mission-critical systems. Bugs in such software can lead to catastrophic consequences. As a result, the development of critical software is regulated by certification standards and guidelines (such as DO-178C, ISO/IEC 15408, etc.) that require following best practices in the development process.
Avalanche is an automatic program traversal and defect detection tool based on Valgrind (a dynamic instrumentation framework); the program analysis group started it as a project in 2009. It performs an extensive analysis of a target program by tracing tainted data flow (all external input data received through input streams, the file system, network sockets, environment variables and command line arguments, as well as internal program data derived from external input data) through the branching points of the executed code.
Many hardware-based techniques have been developed to support growing data flows: high-speed network channels and memory buses, high-frequency CPUs, and hard disks with high data density and low access time. However, numerous unsolved problems remain on the software side, which deals with processing, analyzing and storing data. This software must use hardware resources efficiently and also satisfy rigid requirements: support batch processing of huge data volumes with high throughput, function reliably on unreliable hardware, scale well and allow efficient random data access. This project is aimed at creating a framework for real-time data acquisition, filtering, analysis and storage on high-speed network channels. This framework will allow automation of a wide range of tasks related to high-speed data flows: classifying traffic, ensuring network security, analyzing social networks, and forecasting using big data.
QEMU is a full-system, multi-target, open source emulator. It is widely used for software cross-development. Many large companies (e.g., Google, Samsung, Oracle) prototype and emulate their hardware platforms and peripheral devices on QEMU. QEMU 2.9 emulates 20 different hardware platform families, including x86, PowerPC, SPARC, MIPS and ARM.
Software developers often face the problem of incorporating complex computations, data encryption and compression algorithms, and similar common functionality into their code. This is typically done by using standard libraries that specialize in a group of tasks; these libraries are often distributed in binary form only. At the same time, software maintenance is gradually becoming more and more important within the development cycle, and it includes updating both the software's own code and its external libraries. External libraries and auxiliary programs distributed in binary form need to conform to quality and security standards.
Casr automatically creates reports for crashes that happen during program testing or deployment. The tool works by analyzing Linux coredump files. The resulting reports contain the crash's severity and additional data that is helpful for pinpointing the cause of the error.
The idea of the project is to build a solution for processing Big Data collected from numerical simulation of continuum mechanics problems.
Analysis methods for random graphs and the construction of new mathematical models of scale-free graphs (those conforming to the so-called power law) form a topical area of research on networks on the Internet (in particular, social networks such as Facebook, Twitter and many others). However, the properties and parameters of these networks can change. To predict such changes, it is necessary to study the general properties of the mathematical models of such networks, which can be considered as random graphs.
This project was aimed at creating a firewall based on free software and capable of effectively protecting local networks from unauthorized outside access.
The project goal is to create system toolchain software that improves programmer productivity on distributed heterogeneous systems (typically with nodes that have a couple of multicore CPUs and one or more accelerators such as GPUs). We will research tools for finding program bottlenecks and critical errors (including multithreaded performance problems) and try out new programming standards. We will also improve problem-specific parallel algorithms in sparse matrix libraries and in the OpenFOAM framework for CFD problems.
Effective interaction between IPv4 and IPv6 networks could not be organized on the basis of existing mechanisms without creating application-level gateways that translate the corresponding protocol elements at this level. This project was aimed at the design and implementation of such tools. Moreover, debugging and testing network software is time-consuming and error-prone when done manually, so tools for automated testing of network software were also developed as part of this project.
The "Mathematics" test suite is a suite of tests for the mathematical functions of the POSIX standard programming interface. It checks both the conformance of an implementation to the standard and the accuracy of its results on a huge set of specifically selected test data. The test data sources for the suite are intervals of homogeneous behavior of the functions under test, boundary and special values of floating-point numbers, and numbers for which accurate computation of a function value requires more than average effort. The test reports can present results either in a brief view or with a detailed distribution of the detected errors and lists of the most serious errors.
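The kind of test data described above can be illustrated with a minimal Python sketch. It is not the suite itself: it uses Python's `math.sqrt` as a stand-in for a POSIX function, the helper names `boundary_doubles`, `sqrt_error_in_ulps` and `special_value_checks` are hypothetical, and the reference value comes from the `decimal` module at 60 digits, which is an assumption made for illustration. It requires Python 3.9 or later for `math.nextafter` and `math.ulp`.

```python
import math
import sys
from decimal import Decimal, localcontext


def boundary_doubles():
    """Boundary values of IEEE 754 double precision that are valid sqrt inputs."""
    tiny = sys.float_info.min  # smallest positive normal number
    return [
        0.0,
        5e-324,                      # smallest positive subnormal number
        math.nextafter(tiny, 0.0),   # largest subnormal number
        tiny,
        math.nextafter(1.0, 0.0),    # neighbours of 1.0
        1.0,
        math.nextafter(1.0, 2.0),
        sys.float_info.max,          # largest finite number
    ]


def sqrt_error_in_ulps(x):
    """Distance between math.sqrt(x) and a 60-digit reference, in ULPs of the result."""
    computed = math.sqrt(x)
    with localcontext() as ctx:
        ctx.prec = 60  # far more digits than a double carries
        reference = Decimal(x).sqrt()
        error = abs(Decimal(computed) - reference) / Decimal(math.ulp(computed))
        return float(error)


def special_value_checks():
    """Conformance-style checks on special values (infinity, NaN, signed zero)."""
    return {
        "sqrt(+inf) is +inf": math.sqrt(math.inf) == math.inf,
        "sqrt(nan) is nan": math.isnan(math.sqrt(math.nan)),
        "sqrt(-0.0) keeps the sign": math.copysign(1.0, math.sqrt(-0.0)) == -1.0,
    }


if __name__ == "__main__":
    for x in boundary_doubles():
        print(f"{x!r}: {sqrt_error_in_ulps(x):.3f} ulp")
    for name, ok in special_value_checks().items():
        print(name, "->", "ok" if ok else "FAILED")
```

A real suite would add the other two data sources mentioned above (intervals of homogeneous behavior and hard-to-round arguments) and would report the distribution of errors rather than individual values.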
Docmarking is a unique system for embedding digital watermarks into text documents. It allows creating a digital or physical document copy that is almost indistinguishable from the original yet exactly identifies the user or the device that was the intended recipient.
The aim of the project is to study various generalizations of the problems of unification and anti-unification of algebraic terms, to estimate their complexity, and to develop efficient algorithms for solving them. The project also studies formal models of programs in order to select those models in which the problem of detecting program similarity reduces to the problems of equivalence checking and minimization of programs, and also to the problems of unification and anti-unification of algebraic terms.
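For readers unfamiliar with the two basic problems, the following Python sketch shows their classical first-order forms: syntactic unification with an occurs check and anti-unification (least general generalization). It does not reflect the generalizations or algorithms studied in the project. The term encoding is an assumption made for illustration: variables are strings starting with `?`, constants are other strings, and compound terms are tuples `(functor, arg1, ...)`.

```python
def is_var(t):
    """Variables are strings that start with '?' (an encoding chosen for this sketch)."""
    return isinstance(t, str) and t.startswith("?")


def walk(t, subst):
    """Follow variable bindings until an unbound variable or a non-variable term."""
    while is_var(t) and t in subst:
        t = subst[t]
    return t


def occurs(v, t, subst):
    """Occurs check: does variable v appear inside term t under subst?"""
    t = walk(t, subst)
    if t == v:
        return True
    if isinstance(t, tuple):
        return any(occurs(v, arg, subst) for arg in t[1:])
    return False


def unify(a, b, subst=None):
    """Return a substitution that makes a and b equal, or None if none exists."""
    subst = {} if subst is None else subst
    a, b = walk(a, subst), walk(b, subst)
    if a == b:
        return subst
    if is_var(a):
        return None if occurs(a, b, subst) else {**subst, a: b}
    if is_var(b):
        return unify(b, a, subst)
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        for x, y in zip(a[1:], b[1:]):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None  # clash of different constants or functors


def resolve(t, subst):
    """Apply a substitution to a term exhaustively."""
    t = walk(t, subst)
    if isinstance(t, tuple):
        return (t[0],) + tuple(resolve(arg, subst) for arg in t[1:])
    return t


def anti_unify(a, b, table=None):
    """Least general generalization: equal pairs of subterms share one fresh variable."""
    table = {} if table is None else table
    if a == b:
        return a
    if (isinstance(a, tuple) and isinstance(b, tuple)
            and a[0] == b[0] and len(a) == len(b)):
        return (a[0],) + tuple(anti_unify(x, y, table) for x, y in zip(a[1:], b[1:]))
    if (a, b) not in table:
        table[(a, b)] = f"?G{len(table)}"
    return table[(a, b)]


if __name__ == "__main__":
    s = unify(("f", "?X", ("g", "b")), ("f", "a", ("g", "?Y")))
    print(s, resolve(("f", "?X", ("g", "?Y")), s))
    print(anti_unify(("f", "a", ("g", "a")), ("f", "b", ("g", "b"))))
```

Unification finds a common instance of two terms, while anti-unification finds their most specific common pattern, which is why the latter is a natural building block for similarity detection.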
In this project, a new method for providing compatibility between the IPv4 and IPv6 protocols was created and implemented in accordance with the draft of the "Stateless IP/ICMP Translator (SIIT)" standard. Coupled with other means, this method can provide a seamless transition to the IPv6 protocol. The method was successfully applied to the Linux operating system after a study of its specific features.
This project focused on the specific features of FreeBSD and on adapting the stateless translator developed in the previous project to the FreeBSD environment.
The Stateless IP/ICMP Translator (SIIT) has a number of limitations. The address and protocol translation method (NAT-PT) implemented in this project for Linux and FreeBSD made it possible to use ordinary IPv6 addresses instead of special ones in IPv6 subnetworks, and to assign IPv4 addresses to IPv6 nodes dynamically during session creation when IPv6 and IPv4 networks communicate.
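As an illustration of the stateless address mapping that SIIT relies on, here is a minimal Python sketch. It assumes the two address forms defined in RFC 2765, the published version of SIIT: IPv4-mapped addresses `::ffff:a.b.c.d` for IPv4-only nodes and IPv4-translated addresses `::ffff:0:a.b.c.d` for IPv6 nodes. The function names are hypothetical, and the sketch covers only address mapping, not the header translation performed by the Linux implementation described above.

```python
import ipaddress

# Upper 96 bits of the two SIIT address forms (as defined in RFC 2765).
MAPPED_PREFIX = 0xFFFF << 32      # ::ffff:a.b.c.d   (IPv4-mapped)
TRANSLATED_PREFIX = 0xFFFF << 48  # ::ffff:0:a.b.c.d (IPv4-translated)


def ipv4_mapped(v4):
    """IPv6 form that represents an IPv4-only node."""
    return ipaddress.IPv6Address(MAPPED_PREFIX | int(ipaddress.IPv4Address(v4)))


def ipv4_translated(v4):
    """IPv6 form assigned to an IPv6 node that talks through the translator."""
    return ipaddress.IPv6Address(TRANSLATED_PREFIX | int(ipaddress.IPv4Address(v4)))


def embedded_ipv4(v6):
    """Recover the IPv4 address from either form; no per-session state is needed."""
    value = int(ipaddress.IPv6Address(v6))
    prefix = (value >> 32) << 32   # upper 96 bits
    low = value & 0xFFFFFFFF       # lower 32 bits hold the IPv4 address
    if prefix not in (MAPPED_PREFIX, TRANSLATED_PREFIX):
        raise ValueError(f"{v6} is not an IPv4-mapped or IPv4-translated address")
    return ipaddress.IPv4Address(low)


if __name__ == "__main__":
    for addr in (ipv4_mapped("192.0.2.1"), ipv4_translated("192.0.2.1")):
        print(addr, "->", embedded_ipv4(addr))
```

Because the IPv4 address is carried inside the IPv6 address, the translator needs no lookup table, which is what makes the scheme stateless; it is also why IPv6 nodes must use these special address forms.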
ISP Crusher is a toolset that combines various dynamic analysis approaches. It includes ISP Fuzzer, a fuzzing tool, and Sydr, an automatic test generation tool for complex programs. Two other ISP RAS analyzers, BinSide and Casr, will be included in Crusher within the next two years. Crusher allows organizing a development process that is fully compliant with GOST R 56939-2016 and other regulatory requirements of FSTEC of Russia.
ISP Obfuscator is based on long-term research that ISP RAS started as early as 2002. The obfuscation technology has grown from basic research to industrial deployment. Over these years it has been covered in dozens of publications and two PhD theses. ISP Obfuscator integrates with compilers to make its transformations transparent to developers. At the moment, two compiler infrastructures are supported: LLVM and GCC.
The Linux Driver Verification (LDV) program aims to meet the increased demand for large-scale verification tools applicable to high-profile software.
The Modular Avionics System Integrator Workplace framework is intended to automate the design of real-time aviation electronics systems based on Integrated Modular Avionics (IMA) architecture.
MicroTESK (Microprocessor TEsting and Specification Kit) is a framework for generating test programs in assembly language for functional verification of microprocessors. It uses formal specifications as a source of knowledge about the configuration of the microprocessor under verification. Generation tasks are described in a special Ruby-based language which allows formulating verification goals in terms of test situations derived from formal specifications. Such an approach simplifies configuring the framework and improves the level of test coverage. MicroTESK has been successfully applied in industrial projects for verification of ARMv8 and MIPS64 microprocessors.
The project studies the possibility of using homomorphic computing to organize confidential calculations, including the development of a privacy model for cloud computing based on homomorphic threshold calculations. It also studies the algorithmic complexity of solving equations in the semigroup of finite permutations of the first order.
The main aim of the project is to create software tools that allow more efficient use of computing resources in the cloud. The results are applied in the UniHUB system for hosting applications on virtual machines running on OpenStack.
This project pursued two goals. The first was to provide means for active information exchange in order to enhance research in the area of system programming and the creation of new system software in collaboration with Russian and foreign scientific institutes, including projects that use the Internet. The second was to create ISP RAS's own web server.
The project is related to the development of complex software systems based on formal static models and addresses the key problems of verifying large-scale data models that are applied in different industrial domains and specified in general-purpose object-oriented modeling languages (EXPRESS, UML/OCL, etc.). The main directions of the project are:
Nowadays, the task of network traffic analysis is of increasing relevance because of the improvement and deployment of new network technologies (VoIP, P2P, streaming video) and the emergence of numerous application-level protocols used by new network applications. Offline or online analysis is employed, depending on the particular analysis system and the problem being solved.
Research and development of methods for distributed large graph processing.
The importance of network traffic analysis is constantly increasing: novel network technologies reach the market as soon as they are developed, which increases the volume of data (including personal and sensitive information) transmitted over the network by innumerable network applications, many of which implement closed application-level protocols. Available network analysis tools typically do not offer generic facilities for inspecting application protocols; usually only widespread protocols are supported.
The project is aimed at research and development of methods for storing, searching and processing information, taking into account its complex structure and spatio-temporal semantics. Within the scope of the project, a decomposition method based on dynamic octrees is to be developed and investigated, its complexity estimated, and recommendations for its practical usage elaborated.
Code fragments are often reused in software development. At the source code level, this may be a part of a program that performs a similar role but was copied with slight modifications. At the binary level, it may be object files from libraries that are included at the linking stage in several executable files of the program.
Many different methods are used to protect binary code from analysis; one of them is obfuscating transformations. Such transformations are usually performed by automatic obfuscators, which take source code or a binary file as input and produce an obfuscated executable program as output.
A joint project between ISP RAS and Synchro Software Ltd. (UK) focuses on research in the fields of system integration, software engineering, computer graphics and visualization, and big data management, and is aimed at the development of emerging visual modeling and planning systems. The obtained results made it possible to extend the multidisciplinary functionality of the developed system as well as to increase its scalability and performance.
HDL Retrascope is a toolkit for reverse engineering and transformation of digital hardware designs described in HDLs (hardware description languages) such as Verilog and VHDL. The toolkit allows analyzing HDL descriptions, reconstructing the underlying models (extended finite state machines, EFSMs) and using the derived models for test generation, property checking and other tasks. HDL Retrascope is organized as an extendable framework with the ability to add new types of models as well as tools for their analysis and transformation. The primary application domain of the toolkit is functional verification of hardware at the unit level.
SciNoon is a system for collaborative exploration of scientific papers. It helps a group of researchers dive quickly into a new area of knowledge and find answers to their questions, and then track new research on the topic of interest with highly customizable alerts.
The project goal is to create methods for solving program understanding problems that arise during the program lifecycle. The basic information for such methods is program structure, that is, program entities, relations between them, and their metrics. The methods will be used in the task of easing the back/forward porting of code changes between different versions of the given program.
The project is designed to research the feasibility of applying static program analysis techniques to dynamically typed languages. The first prototype implementation of the analysis tool targeted the Python language and performed type inference to automatically identify type-related errors. Current work on the project includes extending the type inference engine to process the program's control flow structure.
A joint project between the program analysis group and Klocwork, a Rogue Wave Company (previously Klocwork Inc.), focuses on the development of a tool set for static source code analysis of large (over 1 million SLOC) C, C++ and C# projects.
Summer is a Java-based test development and test execution framework similar to JUnit and TestNG, but supporting model-based testing techniques (like NModel for C#).
ISP RAS has developed the Svace static analysis tool, which satisfies all requirements for a production-quality analyzer. Svace supports the C/C++, Java, and C# programming languages (C# support can also be shipped separately, as it is implemented as a standalone tool), and it runs on Linux and Windows. Svace analyzes programs that can be built for Intel x86/x86-64 on Linux and Windows and for ARM/ARM64 architectures. Popular C/C++ compilers for Linux and Windows are supported, as well as a range of compilers for embedded systems.
ISP RAS has developed a number of original methods for social analysis, which were combined into a technology called Talisman. Unlike most existing solutions for social analytics, Talisman was aimed at working with large amounts of data from the start. It employs the most promising open solutions from the Big Data technology stack, such as Apache Spark, GraphX, MLlib, etc.
The project, conducted in partnership with JSC "VimpelCom", is aimed at developing software testing practices at the level of the enterprise information system as a whole. The project covers a wide variety of topics, from requirement gathering for legacy systems to coverage analysis in end-to-end system testing.
The basic problem of text analysis is natural language ambiguity: the same words can have different meanings depending on the context. Understanding context requires knowledge bases that describe real-world concepts. Construction of such knowledge bases (or ontologies) is a very resource- and time-consuming task. Texterra technology provides tools for automatic extraction of knowledge bases from partially structured resources, e.g. Wikipedia and Wikidata, and tools for analyzing text semantics on top of these knowledge bases. Texterra technology is actively applied in research and industrial projects of ISP RAS.
UniTESK is a technology for testing application program interfaces (API) which is primarily designed for unit testing.
UniTESK stands for Unified TEsting Specification based toolKit. UniTESK uniformity is provided by the fact that the common testing methodology and general architecture can be implemented for testing modules written in almost all programming languages. Currently there are UniTESK implementations for such languages as C (CTESK), C++ (C++TESK), Java (JavaTESK and Summer), Python (PyTESK).
Studies of the changes to IP-level security features introduced in IPsec v2 showed that the formal specification and test scenarios designed in the previous project were almost impossible to reuse. The new version of the security features consisted of new protocols for protecting and transmitting data that were incompatible with the protocols of the previous version of IPsec. This project was aimed at creating new formal specifications and test scenarios and at providing means for automated verification of security features in implementations of the new protocols. This project was also done in close collaboration with the Programming technologies department.
This project was devoted to research and development of formal methods for modeling telecommunication protocols in terms of security and mobility. New methods and tools were also developed for automated generation of tests that check compliance with Internet standards. This project was done in collaboration with the Programming technologies department.
An Android Java (Dalvik) memory profiling tool was developed by the program analysis group within the scope of a research and development project for Samsung. The tool is designed to compute various statistical parameters for specified Java processes running on an Android device. The tool is embedded within Dalvik (the Android Java virtual machine) and extends the existing debugging and analysis facilities in order to track virtual memory operations.
BizQuery is a package of servers and tools for application development in the presence of heterogeneous data sources. The main component of the package is BizQuery Integration Server, which supports querying across multiple heterogeneous databases in the XQuery language. BizQuery Integration Server supports the notion of a global schema defined in XML.
The goal of this project was to explore the capabilities of In-Memory Data Grid solutions for core banking tasks. GridGain, Red Hat Infinispan and Hazelcast were tested.
The framework provides full life cycle content and knowledge management services that are used to develop advanced information products based on encyclopedias and reference works. Our Sedna XML Database is the core component of the framework. It provides single-source publishing, powerful content reuse, superior search and navigation, and great flexibility in customizing information products.
"Virtual supercomputer" software was developed in this project. The software complex is developed in free software model and is based on open source code components.
Within the project, a prototype of a web center for program analysis was developed on the basis of the software components of the UniHUB technological platform developed at ISP RAS, the computation infrastructure of the "University cluster" program, and the Avalanche open program analysis package.
The analysis tool was developed within the scope of a research & development project for Samsung.
GNU SQL Server is a free portable multi-user relational database management system. It supports the full SQL89 dialect and has some extensions from SQL92. GNU SQL Server implements highly isolated transactions as well as static and dynamic query compilation. Both the client and server sides of the system work on Unix-like systems. Client/server interaction is based on an RPC mechanism. The server subprocess facility requires message passing and shared memory facilities.
A programming model for distributed heterogeneous computing systems in which a single node consists of a multicore general-purpose computer (the host machine) and one or several PLDs. The proposed model for programming heterogeneous systems combines the best approaches to creating high-level programming models with approaches that use accelerator capabilities with maximum efficiency through runtime libraries. At the high level, a programmer can describe a data-parallel algorithm that can be parameterized for a particular heterogeneous node.
The objective of the project is to develop ICT building blocks to integrate, complement and empower existing tools for design and operation management to a Virtual Energy Lab (VEL) based on an interoperable ontology-supported platform. This will allow evaluating, simulating and optimizing the energy efficiency of products for built facilities and facility components in variations of real life scenarios before their realization, acknowledging the stochastic life-cycle nature.
Participants: Technical University Dresden (Germany), Granlund Oy (Finland), University of Ljubljana (Slovenia), SOFiSTiK Hellas AE (Greece), Innovation Center Reykjavik (Iceland), National Observatory of Athens (Greece), Leonhardt, Andra und Partner GmbH (Germany), Trimo D.D. (Slovenia), University of Cyprus (Cyprus), Institute for System Programming RAS (Russia).
In the joint project ISP RAS has been responsible for the development of a product catalogue based on an ontology for unified energy aware prefabricated building elements representation as well as for implementation of intelligent web services for prefabricated building element selection, instantiation, consistency checking and configuration in the host facility.
ISP C++ ORB is a free tool for developing distributed software. The ORB acts as a communicator between the components of a distributed application, which can run on different platforms. ISP C++ ORB is compliant with the OMG Common Object Request Broker Architecture 2.0 (CORBA 2.0) standard.
LSB Infrastructure was run by ISPRAS under a contract with the Linux Foundation. The project started in September 2006 and was targeted at long term partnership to advance LSB infrastructure quality and usability for supporting the rapidly growing LSB community. The key ISPRAS areas in the project included LSB Infrastructure Tools (this is to develop and later on maintain and mature various infrastructure tools to support LSB development and promotion), Linux Testing, various investigation and analytical work to find and fix various LSB related issues, perform data mining and prepare decision making materials.
A system that compiles general-purpose languages while taking into account the specific features of the target hardware and the most likely usage patterns has to rely on dynamic and adaptive recompilation methods. The LLVM infrastructure is a convenient environment for researching those methods.
In the course of the project, access methods for high-performance resources were researched, and an experimental sample of a hardware-software platform providing access to high-performance resources as web services was developed.
Research and development of a basis for the computation platform and application programming interface (API) for automated numerical simulation of large-scale aerodynamic and hydrodynamic problems on petaflops supercomputers. Start of project: 2011. End of project: 2012. Customer: the Ministry of Education and Science.
The project was aimed at creating an experimental platform for numerical simulation on top of the OpenFOAM library for heterogeneous computer systems with graphics processing units. The platform transfers the most resource-intensive computations to the graphics processing unit using CUDA technology and manages the interaction between the central processing unit and the graphics processing unit.
One of the widespread problems in binary code analysis is recovering the structure of incoming network packets or of files read by a program. For protected binary code, the difficulty of manual format recovery becomes prohibitively high. This project proposes to create an automated format recovery system that does not require its user to have specific knowledge about the target system software. This system will increase work efficiency and recovery accuracy.
Most existing analysis tools for parallel programming libraries (MPI, OpenMP) and languages use low-level approaches to analyze the performance of parallel applications. There are many profiling tools and trace visualizers that produce tables and graphs with various statistics about the executed program. In most cases, the developer has to look manually through the produced statistics and graphs for bottlenecks and opportunities for performance improvement. The amount of information the developer has to handle manually increases dramatically with the number of cores, the number of processes and the problem size of the application. Therefore, new methods of performance analysis that process this output fully or partially on the developer's behalf would be more beneficial.
The project is aimed at developing a software toolset for automated vulnerability detection and exploit construction. The toolset is designed to reveal vulnerabilities in the binary code of programs that operate over a network.
The idea of the project is to lay the technological groundwork for developing an effective method for simulating unsteady near-field turbulent flows with the accuracy required by engineering applications, and for developing software that computes the acoustic fields of near-field turbulent flows on hybrid-architecture supercomputers.
TweetSieve is a system that allows obtaining news on any given subject by sifting the Twitter stream. Our work is related to frequency-based analysis applied to blogs, but the higher latency and lower coverage of blogs make such analysis less effective than in the case of micro-blogs.
The idea of the project was to lay the technological groundwork in the area of direct numerical simulation of turbulence and the large eddy method, as well as to find ways of using supercomputers effectively in industrial applications. Software implementing algorithms for numerical simulation of gas dynamics and hydrodynamics in industrial applications, based on the OpenFOAM free software package, was developed under the project. On the basis of this software, a method of using a supercomputer for numerical modeling of gas dynamics and hydrodynamics problems in industrial applications was developed.
Visontia is a service for visualizing Texterra.
This project provides a way of querying Wikipedia with XQuery. We have parsed Wikipedia content into a well-structured XML representation, loaded it into the Sedna XML database and implemented an XQuery web interface.