Ivannikov Institute for System Programming of the RAS


Sense Disambiguation of Wikipedia's Terms based on Hidden Markov Model.

Authors

Turdakov D.

Abstract

The paper presents a method for word sense disambiguation that uses external knowledge extracted from the open encyclopedia Wikipedia. We analyse the drawbacks of existing word sense disambiguation algorithms and propose our own algorithm, based on a Hidden Markov Model (HMM), to overcome them. The HMM parameters are estimated from empirical probabilities derived from the Wikipedia dictionary and link structure. A heuristic for reducing the computational cost of the algorithm is proposed, and the algorithm is evaluated on several test collections.
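
To illustrate the general idea of HMM-based disambiguation, the following is a minimal sketch of standard first-order Viterbi decoding, where hidden states are candidate senses (Wikipedia articles) and observations are the terms of the text. It is not the paper's algorithm and does not include the proposed speed-up heuristic. The function name `viterbi`, the parameter layout, the smoothing floor, and all probabilities in the toy data are assumptions made up for this example; in the paper the parameters are estimated from the Wikipedia dictionary and link structure.

```python
from math import log


def viterbi(terms, senses, start_p, trans_p, emit_p, floor=1e-9):
    """Return the most probable sense sequence for `terms`.

    senses:  term -> list of candidate senses (hypothetical dictionary)
    start_p: sense -> prior probability of the sense
    trans_p: (previous sense, sense) -> transition probability
    emit_p:  (sense, term) -> probability that the sense is written as the term
    floor:   assumed smoothing value for probabilities missing from the tables
    """
    def lp(p):
        # Work in log space to avoid underflow on long term sequences.
        return log(max(p, floor))

    # best[s] = (log-probability, path) of the best sequence ending in sense s
    first = terms[0]
    best = {
        s: (lp(start_p.get(s, 0.0)) + lp(emit_p.get((s, first), 0.0)), [s])
        for s in senses[first]
    }
    for term in terms[1:]:
        step = {}
        for s in senses[term]:
            # Pick the best predecessor sense for the current candidate s.
            score, path = max(
                (prev_score + lp(trans_p.get((prev, s), 0.0)), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            step[s] = (score + lp(emit_p.get((s, term), 0.0)), path + [s])
        best = step
    return max(best.values())[1]


# Toy data, invented for illustration only.
terms = ["java", "island"]
senses = {
    "java": ["Java (programming language)", "Java (island)"],
    "island": ["Island"],
}
start_p = {"Java (programming language)": 0.7, "Java (island)": 0.3}
emit_p = {
    ("Java (programming language)", "java"): 1.0,
    ("Java (island)", "java"): 1.0,
    ("Island", "island"): 1.0,
}
trans_p = {
    ("Java (island)", "Island"): 0.9,
    ("Java (programming language)", "Island"): 0.01,
}

result = viterbi(terms, senses, start_p, trans_p, emit_p)
```

With these toy numbers the more common sense of "java" has the higher prior, but the transition to "Island" outweighs it, so `result` is `["Java (island)", "Island"]`.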

Full text of the paper in pdf (in Russian)

Edition

RCDL 2009: Digital Libraries: Advanced Methods and Technologies, Digital Collections.

Research Group

Information Systems
