Ivannikov Institute for System Programming of the RAS


HMM Expanded to Multiple Interleaved Chains as a Model for Word Sense Disambiguation.

Authors

Turdakov D., Lizorkin D.

Abstract

The paper proposes a method for Word Sense Disambiguation (WSD) based on an expanded Hidden Markov Model (HMM). The method rests on our observation that natural language text typically traces multiple interleaved chains of semantically related terms. This observation indicates that the classical HMM is too restrictive for the WSD task, so we propose an expansion of the HMM that supports multiple interleaved chains. The paper presents an algorithm for computing the most probable sequence of meanings for the terms in a text and proposes a technique for estimating the parameters of the model using the structure and content of Wikipedia. Experiments indicate that the presented method produces systematically better WSD results than existing state-of-the-art knowledge-based WSD methods.
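
For orientation, the classical single-chain HMM that the abstract describes as too restrictive can be decoded with the standard Viterbi algorithm. The sketch below shows only that baseline, not the expanded multi-chain model of the paper. The function name, the sense labels, and all probabilities are hypothetical toy values chosen for illustration; they are not estimated from Wikipedia.

```python
import math


def viterbi_senses(terms, senses, start_p, trans_p, emit_p, floor=1e-12):
    """Most probable sense sequence for `terms` under a classical HMM.

    senses:  term -> list of candidate senses (hypothetical inventory)
    start_p: sense -> probability of starting the chain in that sense
    trans_p: (previous sense, sense) -> transition probability
    emit_p:  (sense, term) -> probability that the sense is expressed by the term
    Missing entries are treated as `floor` so that log() stays defined.
    """
    if not terms:
        return []

    def lp(p):
        return math.log(max(p, floor))

    # best[i][s] = (log-probability of the best path ending in s, backpointer)
    best = [{}]
    for s in senses[terms[0]]:
        best[0][s] = (lp(start_p.get(s, 0.0)) + lp(emit_p.get((s, terms[0]), 0.0)), None)

    for i in range(1, len(terms)):
        best.append({})
        for s in senses[terms[i]]:
            score, prev = max(
                (best[i - 1][p][0] + lp(trans_p.get((p, s), 0.0)), p)
                for p in best[i - 1]
            )
            best[i][s] = (score + lp(emit_p.get((s, terms[i]), 0.0)), prev)

    # Backtrack from the best final sense.
    last = max(best[-1], key=lambda s: best[-1][s][0])
    path = [last]
    for i in range(len(terms) - 1, 0, -1):
        last = best[i][last][1]
        path.append(last)
    path.reverse()
    return path


# Toy example with made-up numbers.
terms = ["jaguar", "engine"]
senses = {"jaguar": ["jaguar_animal", "jaguar_car"], "engine": ["engine_motor"]}
start_p = {"jaguar_animal": 0.6, "jaguar_car": 0.4}
trans_p = {("jaguar_animal", "engine_motor"): 0.1, ("jaguar_car", "engine_motor"): 0.8}
emit_p = {("jaguar_animal", "jaguar"): 1.0, ("jaguar_car", "jaguar"): 1.0,
          ("engine_motor", "engine"): 1.0}

result = viterbi_senses(terms, senses, start_p, trans_p, emit_p)
```

In this toy setup the less frequent car sense wins because of its strong transition to the motor sense. A single chain like this links each term only to its immediate predecessor, which is the limitation the multiple interleaved chains are meant to address.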


Edition

PACLIC 2009: The 23rd Pacific Asia Conference on Language, Information and Computation.

Research Group

Information Systems
