Proceedings of ISP RAS

Use of Multiple Features for Extracting Topics from News Clusters

A.A. Alekseev, N.V. Loukachevitch.


In this paper we consider a method for extracting alternative names of a concept or a named entity mentioned in a news cluster. The method is based on the structural organization of news clusters and compares the various contexts in which words occur. These word contexts serve as a basis for extracting multiword expressions and detecting the main entity. At the end of cluster processing we obtain groups of near-synonyms, with the main synonym of each group identified.


near-synonym detection; text-structure models; multi-document summarization


Proceedings of the Institute for System Programming, vol. 23, 2012, pp. 257-276.

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

DOI: 10.15514/ISPRAS-2012-23-15

