Research and development of methods for distributed large graph processing.
ISP RAS has developed a number of original methods for social analysis, which were combined into a technology called Talisman. Unlike most existing solutions for social analytics, Talisman was designed from the start to work with large amounts of data. It employs the most promising open solutions from the Big Data technology stack, such as Apache Spark, GraphX, and MLlib.
The basic problem of text analysis is natural language ambiguity: the same words can have different meanings depending on the context. Understanding context requires knowledge bases that describe real-world concepts. Constructing such knowledge bases (or ontologies) is a very resource- and time-consuming task. Texterra technology provides tools for automatically extracting knowledge bases from partially structured resources, e.g. Wikipedia and Wikidata, and tools for analyzing text semantics on top of these knowledge bases. Texterra technology is actively applied in research and industrial projects of ISP RAS.
SciNoon is a system for collaborative exploration of scientific papers. It helps a group of researchers dive quickly into a new area of knowledge and find answers to their questions, and then track new research on the topic of interest with highly customizable alerts.
Docmarking is a unique system for embedding digital watermarks into text documents. It allows creating a digital or physical document copy that is almost indistinguishable from the original yet exactly identifies the user or the device that was the intended recipient.
The goal of this project was to explore the capabilities of In-Memory Data Grid solutions for core banking tasks. GridGain, Red Hat Infinispan and Hazelcast have been tested.
Visontia is a service for visualizing Texterra.
This project provides a way of querying Wikipedia with XQuery. We have parsed Wikipedia content into a well-structured XML representation, loaded it into the Sedna XML database and implemented an XQuery Web interface.
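As a rough illustration of what querying a structured XML view of Wikipedia can look like, here is a minimal Python sketch. The element names (`wikipedia`, `article`, `category`, `section`) and the sample data are assumptions made for this example; the project's actual XML schema is not described here, and the project itself uses XQuery over Sedna rather than Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML layout: the real schema used by the project is not given.
SAMPLE = """
<wikipedia>
  <article title="Sedna">
    <category>XML databases</category>
    <section name="Overview"><p>Sedna is a native XML database.</p></section>
  </article>
  <article title="XQuery">
    <category>Query languages</category>
    <category>XML</category>
    <section name="Overview"><p>XQuery is a query language for XML data.</p></section>
  </article>
</wikipedia>
""".strip()


def titles_in_category(root, category):
    """Return titles of articles that carry exactly the given category label."""
    return [
        article.get("title")
        for article in root.findall("article")
        if any(c.text == category for c in article.findall("category"))
    ]


root = ET.fromstring(SAMPLE)
print(titles_in_category(root, "XML"))
```

Under the same assumed layout, a comparable path expression in XQuery would select the `title` attribute of every `article` that has a matching `category` child.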
The framework provides full life-cycle content and knowledge management services that are used to develop advanced information products based on encyclopedias and reference works. Our Sedna XML Database is the core component of the framework. It provides single-source publishing, powerful content reuse, superior search and navigation, and great flexibility in customizing information products.
TweetSieve is a system that delivers news on any given subject by sifting the Twitter stream. Our work is related to frequency-based analysis applied to blogs, but the higher latency and lower coverage of blogs make that analysis less effective than it is for micro-blogs.
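To make the idea of frequency-based analysis concrete, the following Python sketch flags terms whose relative frequency in a recent window of messages is much higher than in earlier background messages. This is not TweetSieve's actual algorithm: the function names, the thresholds `min_count` and `ratio`, and the add-one smoothing are all assumptions chosen for illustration.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokens with surrounding punctuation stripped."""
    tokens = (w.strip(".,!?#@:;\"'").lower() for w in text.split())
    return [t for t in tokens if t]


def bursty_terms(background, window, min_count=2, ratio=2.0):
    """Return terms whose rate in `window` is at least `ratio` times the background rate.

    Hypothetical helper: thresholds and smoothing are illustrative choices.
    """
    bg = Counter(t for msg in background for t in tokenize(msg))
    cur = Counter(t for msg in window for t in tokenize(msg))
    bg_total = sum(bg.values())
    cur_total = sum(cur.values()) or 1
    result = []
    for term, count in cur.items():
        if count < min_count:
            continue  # ignore terms seen too rarely in the window
        cur_rate = count / cur_total
        # Add-one smoothing so unseen background terms do not divide by zero.
        bg_rate = (bg[term] + 1) / (bg_total + 1)
        if cur_rate / bg_rate >= ratio:
            result.append(term)
    return sorted(result)


background = ["good morning everyone", "coffee time this morning", "nice weather today"]
window = ["earthquake reported downtown", "big earthquake just now", "morning coffee"]
print(bursty_terms(background, window))
```

A short window reacts quickly to new topics, which fits the point above that low-latency micro-blog streams suit this kind of analysis better than blogs do.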
BizQuery is a package of servers and tools for application development in the presence of heterogeneous data sources. The main component of the package is BizQuery Integration Server, which supports querying across multiple heterogeneous databases in the XQuery language. BizQuery Integration Server supports the notion of a global schema defined in XML.
ISP C++ ORB is a free tool for the development of distributed software. The ORB acts as a communicator between the components of a distributed application, which can run on different platforms. ISP C++ ORB is compliant with the OMG Common Object Request Broker Architecture 2.0 (CORBA 2.0) standard.
GNU SQL Server is a free portable multi-user relational database management system. It supports the full SQL-89 dialect and has some extensions from SQL-92. GNU SQL Server implements highly isolated transactions and both static and dynamic query compilation. Both the client and server sides of the system work on Unix-like systems. Client/server interaction is based on an RPC mechanism. The server's subprocess facility requires message-passing and shared-memory support.