Research on the impact of network degradation on automatic speech recognition models
Abstract
Automatic speech recognition models perform well on many datasets and in many languages, but their everyday use remains limited because they handle some scenarios poorly, such as calls over unstable network connections or telephone channels with interference. In this paper, we present a specialized benchmark for Russian-language speech that includes a representative dataset simulating the effects of an unstable internet connection on speech. The benchmark is designed to test and compare modern speech recognition approaches. To quantify the level of degradation, we use an automated method based on a set of acoustic features and neural network metrics. The benchmark results identify the methods that are more robust to acoustic distortion, which can be used to improve the reliability of speech recognition systems in challenging real-world conditions.
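As a rough illustration of what simulating an unstable connection can mean for an audio signal, the sketch below silences randomly chosen fixed-length packets of a mono waveform. The abstract does not describe how the benchmark dataset was produced, so everything here is an assumption made for illustration: the function name `simulate_packet_loss`, the 20 ms packet length, the independent per-packet loss model, and replacing lost packets with silence.

```python
import numpy as np


def simulate_packet_loss(signal, sample_rate, loss_prob=0.1, packet_ms=20, seed=0):
    """Silence randomly chosen fixed-length packets of a mono signal.

    Hypothetical degradation model (not taken from the paper): each packet
    is lost independently with probability `loss_prob` and replaced by zeros.
    Returns the degraded copy and a boolean mask of the lost packets.
    """
    rng = np.random.default_rng(seed)
    # Number of samples carried by one packet (at least one sample).
    packet_len = max(1, int(sample_rate * packet_ms / 1000))
    degraded = np.array(signal, dtype=np.float64, copy=True)
    n_packets = int(np.ceil(len(degraded) / packet_len))
    # Draw one uniform number per packet; values below loss_prob mark a loss.
    lost = rng.random(n_packets) < loss_prob
    for i in np.flatnonzero(lost):
        degraded[i * packet_len:(i + 1) * packet_len] = 0.0
    return degraded, lost


if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    clean = np.sin(2 * np.pi * 440 * t)  # one second of a 440 Hz tone
    noisy, mask = simulate_packet_loss(clean, sr, loss_prob=0.2)
    print(f"lost {int(mask.sum())} of {mask.size} packets")
```

Real network degradation also involves jitter, codec artifacts, and loss concealment, none of which this sketch models; it only shows the simplest form of the idea.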
Keywords
Edition
Proceedings of the Institute for System Programming, vol. 38, issue 3, part 4, 2026, pp. 101-118
ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).
DOI: 10.15514/ISPRAS-2026-38(3)-49
For citation
Full text of the paper in pdf (in Russian)