Cloud IaaS Solutions Based on Free Software and Developed at ISP RAS
Cloud infrastructure can significantly reduce resources and development time by optimizing resource usage and shortening the time required to deploy and configure systems. For example, the load on web services with a large number of users can change drastically depending on the time of day, the season, and events (such as Christmas). With elastic balancing of resources in a cloud environment, it is possible to save a huge amount of resources. It is worth mentioning that cloud resources are not limited to virtual machines (or containers): they also include additional services, such as reliable block storage, object storage, and DBMSs that support automatic scaling on demand (both SQL and NoSQL). Instagram is a very good example of using such resources: on the one hand, the number of stored photos and video recordings is not known in advance, and the Amazon S3 object storage allows storage space to grow without loss of performance for each individual user. On the other hand, the relations between users are stored in a classic DBMS (most likely the Amazon RDS service, which allows dynamic scaling of classic RDBMSs).
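The elastic-balancing idea above can be sketched as a simple scaling rule: pick the number of instances that keeps per-instance utilization below a target. The capacity and threshold figures below are illustrative assumptions, not measurements from a real deployment.

```python
# Minimal sketch of elastic scaling: choose how many instances to run
# so that average per-instance load stays below a target utilization.
# capacity_per_instance and target_utilization are assumed values.
import math

def instances_needed(requests_per_sec: float,
                     capacity_per_instance: float = 100.0,
                     target_utilization: float = 0.7,
                     min_instances: int = 1) -> int:
    """Return the instance count that keeps utilization under the target."""
    required = requests_per_sec / (capacity_per_instance * target_utilization)
    return max(min_instances, math.ceil(required))

# Night-time trickle vs. a holiday spike:
print(instances_needed(30))    # -> 1
print(instances_needed(2500))  # -> 36
```

In a real cloud the same rule would drive an autoscaling service, so that resources shrink back automatically after the peak passes.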
However, despite the significant benefits that cloud technologies provide, their manual deployment requires significant expertise in networking, a deep understanding of interactions between nodes, and the principles of organizing reliable resources. In addition, existing open source solutions are far from ideal, and it is impossible to solve all problems with a single technology.
The cloud infrastructure of ISP RAS consists of several components based on the most promising technologies that provide virtualization and reliable storage.
Big Data Open Lab - a computing cluster based on OpenStack for collaborative work with Big Data
OpenStack is an open source technology that aims to be the industry standard for large cloud systems. More than a hundred companies are involved in the project; the most active participants are Red Hat, Mirantis, Rackspace, IBM, Intel, HPE and Huawei. OpenStack provides most of the capabilities of cloud environments: virtualization of computing resources, reliable block storage, object storage, virtual networks, collection of resource-usage metrics, DBMS on demand, and so on. ISP RAS developers actively participate in OpenStack development: 9 code reviews, 3 critical errors detected and fixed, and 6 commits with new features in released versions.
Understanding the principles behind the technology allowed ISP RAS to build its own small data center based on OpenStack. Using equipment provided by Dell (a partner of ISP RAS), we organized a laboratory for Big Data analysis (Big Data Open Lab) for ISP RAS staff and partners.
Users are given the basic functionality of OpenStack:
- Management of virtual computing clusters and virtual networks, using the Keystone, Neutron and Nova systems (an alternative to Amazon EC2). In total, 512 CPUs, 4 TB of RAM and 30 TB of ephemeral virtual machine storage are available, along with images for all major cloud-ready operating systems.
- Block data storage based on the Cinder system (an alternative to Amazon Elastic Block Store). Users are provided with 200 TB of storage capacity.
- Easily expandable object storage based on OpenStack Swift (an alternative to Amazon S3), with a current capacity of 30 TB.
To improve the integration of OpenStack with Big Data analysis technologies, ISP RAS has developed a solution that automatically creates and destroys virtual clusters with fully configured systems from the Big Data stack, including Apache Spark, Apache Hadoop and Apache Ignite. Creating a cluster with an arbitrary number of computational nodes is done by running a single script and takes about 5 minutes, after which the developer receives fully ready systems.
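As a rough illustration of what such a cluster-creation script does first, the sketch below lays out a one-master/N-worker node plan that could then be handed to Nova for provisioning. The node names, flavor and image identifiers here are hypothetical, not the actual ISP RAS tooling.

```python
# Hypothetical planning step for a virtual Big Data cluster:
# one master node plus N workers, each described by the attributes
# a provisioning call (e.g. via openstacksdk or the openstack CLI)
# would need. Names, flavor and image are illustrative assumptions.

def plan_cluster(workers: int, flavor: str = "m1.large",
                 image: str = "spark-hadoop-base") -> list:
    nodes = [{"name": "master-0", "role": "master",
              "flavor": flavor, "image": image}]
    nodes += [{"name": f"worker-{i}", "role": "worker",
               "flavor": flavor, "image": image}
              for i in range(workers)]
    return nodes

plan = plan_cluster(3)
print(len(plan))          # -> 4 (1 master + 3 workers)
print(plan[0]["role"])    # -> master
```

In the real solution, each planned node would be booted from a pre-built image and finished by configuration management, so that Spark, Hadoop and Ignite come up already wired together.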
Resources of the Big Data Open Lab are used in various industry projects. For example, the TALISMAN social media monitoring technology uses virtual computing resources to analyze information flows on the Internet.
Virtualization based on container technology
Using virtual machines is not always convenient: sometimes resource-allocation time plays a significant role. Moreover, some projects distribute container images that provide the project's functionality simply by running the image, which is very convenient for testing and development.
In such situations, we provide access to a relatively simple cloud for Docker containers based on the Rancher technology. The container cloud uses the resources of the Big Data Open Lab and makes it easy to control the resources allocated to our container projects.
In addition to creating computational clusters, developers regularly need to deploy and configure virtual machines for presentation purposes. For security reasons, this functionality was placed in a separate server pool built on the XenServer system. The key requirement for such virtual machines is reliability. However, there is no free solution for easy management of a XenServer virtual machine pool (even the Xen Orchestra project does not provide one). To solve this problem, the VMEmperor system was developed at ISP RAS. It makes it possible to easily create virtual machines, automatically proxies them through the central node of the system, and binds them to ISP RAS domain names. The cloud resources include 800 GB of RAM, 92 CPUs and 27 TB of shared storage.
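The automatic proxying and domain binding described above can be illustrated with a small sketch: deriving a public hostname and a backend address on the central node for a freshly created VM. The naming scheme and the domain are illustrative assumptions, not VMEmperor's actual logic.

```python
# Illustrative sketch (not VMEmperor's actual code): derive the public
# hostname and the reverse-proxy upstream for a new virtual machine.
# The zone "example.ispras.ru" is a placeholder domain.

def proxy_route(vm_name: str, internal_ip: str, port: int = 80,
                zone: str = "example.ispras.ru") -> dict:
    hostname = f"{vm_name}.{zone}".lower()
    return {"hostname": hostname,
            "upstream": f"http://{internal_ip}:{port}"}

route = proxy_route("Demo-VM", "10.0.0.42")
print(route["hostname"])   # -> demo-vm.example.ispras.ru
print(route["upstream"])   # -> http://10.0.0.42:80
```

The central node would then serve such routes through its reverse proxy, so each VM is reachable by name without exposing the internal network.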
Fanlight - work in the Desktop as a Service (DaaS) model
The Fanlight platform is a software technology for building a unified Web environment (Web laboratory) based on the concept of virtual workplaces in the DaaS model. It is designed to support multidisciplinary teams carrying out computational experiments with applied engineering analysis applications as part of solving scientific tasks.
The Web laboratory offers the following basic features:
- User access to virtual workstations with integrated application packages and hardware resources via a Web browser over a network (including the Internet);
- Customization for a given application area by integrating specialized packages into virtual workstations as Docker images;
- Expansion of the computing capabilities of the web laboratory by connecting the hardware resources necessary for organizing computing and data storage (HPC / BigData clusters, storage systems, servers including graphics accelerators);
- User account management;
- Quota management and accounting of computational resources;
- Import of user data into the working environment of the Web laboratory and their export to a local workstation or external storage;
- Access to Web laboratory functions via a Web interface or a RESTful API.
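As a hypothetical illustration of the RESTful access mentioned above, the snippet below builds a request URL for listing a user's virtual workplaces. The base URL, endpoint path and query parameter are assumptions for illustration, not Fanlight's documented API.

```python
# Hypothetical REST-style URL construction for a Web-laboratory API.
# The host, path layout and parameters are illustrative assumptions.
from urllib.parse import urlencode, urljoin

BASE = "https://weblab.example.org/api/v1/"

def workplace_url(user: str, **params) -> str:
    """Build the URL for listing a user's virtual workplaces."""
    url = urljoin(BASE, f"users/{user}/workplaces")
    query = urlencode(params)
    return f"{url}?{query}" if query else url

print(workplace_url("alice", state="running"))
# -> https://weblab.example.org/api/v1/users/alice/workplaces?state=running
```

An actual client would issue an HTTP GET against such a URL with the user's credentials; the same operations are available through the Web interface.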
At the same time, the following are provided:
- Comfortable work with heavy engineering CAD/CAE applications that require hardware-accelerated 3D graphics. Support for full-screen and multi-display modes;
- Support for CUDA applications through access to NVIDIA GPU hardware;
- Scalability through flexible management of containerized applications on a distributed hardware infrastructure using Docker Swarm;
- Low entry threshold when deploying your own Web laboratory due to the possibility of flexible selection of hardware with further expansion of computing capabilities on a "pay as you grow" basis. A Web laboratory can be deployed on a single server, a computer farm, a public cloud (from the IaaS level up) or in your own cloud-based data center (OpenStack);
- Replicability and easy installation by automatically deploying Web labs using Docker Compose;
- Ability to work with various client devices (from workstations to mobile devices) due to adaptive web-design;
- Easy start for users without deep knowledge of computer administration, thanks to the absence of special requirements for client devices and their configuration: there is no need to install any software (Java, Flash, VNC clients, etc.) or configure network access rights, because all communication goes through standard Web-socket ports. All user work is carried out through any modern Web browser: Chrome, Internet Explorer, Firefox, Opera, etc.
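Deployment via Docker Compose, mentioned among the features above, could look roughly like the following sketch. The service names, images and port mapping are illustrative assumptions, not the actual Fanlight distribution.

```yaml
# Illustrative docker-compose sketch (service and image names are assumed)
version: "3"
services:
  weblab-gateway:
    image: weblab/gateway:latest        # central node: Web UI + REST API
    ports:
      - "443:443"                       # all traffic over standard ports
  weblab-workplace:
    image: weblab/cad-workplace:latest  # containerized engineering application
    deploy:
      replicas: 2                       # scaled out across nodes via Docker Swarm
```

A single `docker-compose up` (or `docker stack deploy` on a Swarm) would then bring up the whole Web laboratory, which is what makes it easy to replicate.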