Proceedings of ISP RAS


Use of Multiple Features for Extracting Topics from News Clusters

A.A. Alekseev, N.V. Loukachevitch

Abstract

In this paper we consider a method for extracting alternative names of a concept or named entity mentioned in a news cluster. The method is based on the structural organization of news clusters and compares the various contexts in which words occur. These word contexts serve as the basis for multiword expression extraction and main entity detection. Cluster processing yields groups of near-synonyms, with the main synonym of each group identified.

Keywords

near-synonym detection; text-structure models; multi-document summarization

Edition

Proceedings of the Institute for System Programming, vol. 23, 2012, pp. 257-276.

ISSN 2220-6426 (Online), ISSN 2079-8156 (Print).

DOI: 10.15514/ISPRAS-2012-23-15
