Ivannikov Institute for System Programming of the RAS


Methods for Automatic Term Recognition in Domain-Specific Text Collections: A Survey

Authors

N. A. Astrakhantsev, D. G. Fedorenko, D. Yu. Turdakov

Abstract

Applications that process domain-specific text often rely on glossaries and ontologies, and the
main step in constructing such resources is term recognition. This paper surveys existing definitions
of the term and its linguistic features, formulates the task of term recognition, and analyzes
currently available methods for automatic term recognition, including methods for candidate collection,
methods based on statistics and contexts of term occurrences, methods using topic models, and methods
based on external resources (such as text collections from other domains, ontologies, and Wikipedia). The
paper also provides an overview of standard methodologies and datasets for experimental research.

Keywords

automatic term recognition, term extraction, domain-specific terms, terminology, ontology

Edition

Programming and Computer Software, 2015, Vol. 41, No. 6, pp. 336–349.

DOI: 10.1134/S036176881506002X

ISSN: 0361-7688

Research Group

Information Systems
