Automatic Enrichment of Informal Ontology by Analyzing a Domain-Specific Text Collection.
The core component of an entity linking system, in particular one oriented toward wikification, is an ontology, which is often informal and supports semantic relatedness as its only relation type. Most such systems suffer from ontology incompleteness. The problem is especially acute in specific domains, where plain text is often the only source of extractable knowledge. This paper formulates the incompleteness problem as a task of ontology enrichment from domain-specific texts and presents a novel approach that combines state-of-the-art methods for terminology enrichment, our own ML-based method for homonymy detection, and relation extraction methods adopted from a related field.
Experimental evaluation shows that the bottleneck is the terminology enrichment step: its average precision is about 35%, which is too low for fully automatic use, especially given the strict requirements for ontology correctness; however, recall is high enough to support semi-automatic terminology enrichment.
We also show that the best features for terminology enrichment differ from those for the classic terminology recognition task.
Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference "Dialogue" (2014) Issue 13, pp. 29-42.