HITS-based Seed Selection and Stop List Construction for Bootstrapping

Tetsuo Kiso, Masashi Shimbo, Mamoru Komachi, Yuji Matsumoto
NAIST


Abstract

In bootstrapping (seed set expansion), selecting good seeds and creating stop lists are two effective ways to reduce semantic drift, but these methods generally need human supervision. In this paper, we propose a graph-based approach to helping editors choose effective seeds and stop list instances, applicable to Pantel and Pennacchiotti's Espresso bootstrapping algorithm. The idea is to select seeds and create a stop list using the rankings of instances and patterns computed by Kleinberg's HITS algorithm. Experimental results on a variation of the lexical sample task show the effectiveness of our method.
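
As a rough illustration of the ranking step only, the sketch below runs HITS-style power iteration on a bipartite instance-pattern co-occurrence graph and returns instances and patterns sorted by score. It is a minimal sketch, not the authors' implementation: the function names, the dictionary input format, the iteration count, and the toy data are all assumptions made for this example. How the rankings are turned into seeds or stop list entries is specified in the full paper and is not reproduced here.

```python
from math import sqrt


def _normalize(scores):
    """Scale a score dict to unit L2 norm (left unchanged if all zeros)."""
    norm = sqrt(sum(v * v for v in scores.values()))
    if norm == 0.0:
        return dict(scores)
    return {k: v / norm for k, v in scores.items()}


def hits_rankings(cooccurrence, iterations=50):
    """Rank instances and patterns with HITS-style mutual reinforcement.

    cooccurrence: dict mapping (instance, pattern) -> non-negative weight.
    This input format is hypothetical and chosen for illustration.
    Returns (instances sorted by score, patterns sorted by score).
    """
    instances = sorted({i for i, _ in cooccurrence})
    patterns = sorted({p for _, p in cooccurrence})
    inst_score = {i: 1.0 for i in instances}
    pat_score = {p: 1.0 for p in patterns}

    for _ in range(iterations):
        # A pattern's score is the weighted sum of its instances' scores.
        new_pat = {p: 0.0 for p in patterns}
        for (i, p), w in cooccurrence.items():
            new_pat[p] += w * inst_score[i]
        pat_score = _normalize(new_pat)

        # An instance's score is the weighted sum of its patterns' scores.
        new_inst = {i: 0.0 for i in instances}
        for (i, p), w in cooccurrence.items():
            new_inst[i] += w * pat_score[p]
        inst_score = _normalize(new_inst)

    def ranked(scores):
        # Highest score first; ties are broken alphabetically.
        return sorted(scores, key=lambda k: (-scores[k], k))

    return ranked(inst_score), ranked(pat_score)


# Toy graph (made-up data): instance "a" co-occurs with every pattern.
toy_counts = {
    ("a", "p1"): 1.0,
    ("a", "p2"): 1.0,
    ("a", "p3"): 1.0,
    ("b", "p1"): 1.0,
    ("c", "p3"): 1.0,
}
instance_rank, pattern_rank = hits_rankings(toy_counts)
```

In this toy graph the best-connected instance ends up at the top of `instance_rank`, which is the mutual-reinforcement behaviour HITS is known for. A real setup would build the co-occurrence weights from corpus counts.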

Full paper: http://www.aclweb.org/anthology/P/P11/P11-2006.pdf