Hierarchical clustering of words
Open Access
- 1 January 1996
- proceedings article
- Published by Association for Computational Linguistics (ACL)
- Vol. 2, 1159-1162
- https://doi.org/10.3115/993268.993390
Abstract
This paper describes a data-driven method for hierarchical clustering of words in which a large vocabulary of English words is clustered bottom-up, with respect to corpora ranging in size from 5 to 50 million words, using a greedy algorithm that tries to minimize average loss of mutual information of adjacent classes. The resulting hierarchical clusters of words are then naturally transformed to a bit-string representation of (i.e. word bits for) all the words in the vocabulary. Introducing word bits into the ATR Decision-Tree POS Tagger is shown to significantly reduce the tagging error rate. Portability of word bits from one domain to another is also discussed.
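The two steps named in the abstract, greedy bottom-up merging by mutual information and conversion of the resulting tree into bit strings, can be illustrated with a small sketch. This is not the paper's implementation: it is a brute-force toy that recomputes the average mutual information for every candidate merge, which is only feasible for a handful of words. The function names (`average_mutual_information`, `greedy_cluster`, `word_bits`), the bigram-count input format, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def average_mutual_information(bigrams, assign):
    """Average mutual information (in bits) between adjacent classes.

    bigrams: dict mapping (word1, word2) -> positive count.
    assign:  dict mapping word -> class id.
    """
    joint = Counter()
    for (w1, w2), n in bigrams.items():
        joint[(assign[w1], assign[w2])] += n
    total = sum(joint.values())
    left, right = Counter(), Counter()
    for (c1, c2), n in joint.items():
        left[c1] += n
        right[c2] += n
    return sum(
        (n / total) * math.log2(n * total / (left[c1] * right[c2]))
        for (c1, c2), n in joint.items()
    )


def greedy_cluster(bigrams):
    """Merge classes bottom-up, each time picking the pair whose merge
    keeps the most mutual information (i.e. loses the least).
    Returns the binary merge tree as nested 2-tuples of words."""
    words = sorted({w for pair in bigrams for w in pair})
    assign = {w: i for i, w in enumerate(words)}   # one class per word
    trees = {i: w for i, w in enumerate(words)}    # class id -> subtree
    while len(trees) > 1:
        best = None
        for a, b in combinations(sorted(trees), 2):
            trial = {w: (a if c == b else c) for w, c in assign.items()}
            score = average_mutual_information(bigrams, trial)
            if best is None or score > best[0]:
                best = (score, a, b)
        _, a, b = best
        assign = {w: (a if c == b else c) for w, c in assign.items()}
        trees[a] = (trees[a], trees.pop(b))
    return next(iter(trees.values()))


def word_bits(tree, prefix=""):
    """Read each word's bit string off the tree: 0 = left, 1 = right."""
    if isinstance(tree, str):
        return {tree: prefix}
    left, right = tree
    bits = word_bits(left, prefix + "0")
    bits.update(word_bits(right, prefix + "1"))
    return bits


if __name__ == "__main__":
    # Hypothetical toy bigram counts, not data from the paper.
    toy = {
        ("the", "cat"): 4, ("the", "dog"): 3,
        ("a", "cat"): 2, ("a", "dog"): 3,
        ("cat", "sleeps"): 3, ("dog", "sleeps"): 2,
        ("sleeps", "the"): 1,
    }
    for word, bits in sorted(word_bits(greedy_cluster(toy)).items()):
        print(word, bits)
```

In this sketch, words that share a long bit-string prefix were merged early, so a prefix of the bit string acts as a coarser word class. Bit strings here vary in length with tree depth; how the paper balances or pads them is not stated in the abstract.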