POS Induction (State of the art)

Evaluation

Many-to-1: Greedily map each induced label to the gold-standard tag it co-occurs with most often (45 induced labels to the 45 tags of the Penn tag set); several induced labels may map to the same tag. Use the mapping to compute tag accuracy on the Wall Street Journal portion of the Penn Treebank.
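
The many-to-1 measure can be sketched in a few lines of Python. This is a minimal illustration, not the evaluation script used by any of the listed systems. It assumes the induced labels and gold tags are given as two token-aligned sequences; the function name `many_to_one_accuracy` and the toy data are hypothetical.

```python
from collections import Counter, defaultdict


def many_to_one_accuracy(induced, gold):
    """Return (mapping, accuracy) under the many-to-1 measure.

    induced: sequence of induced labels, one per token (assumed input format)
    gold:    sequence of gold-standard tags, aligned token by token
    """
    if len(induced) != len(gold):
        raise ValueError("induced and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty corpus")

    # Count how often each induced label co-occurs with each gold tag.
    cooccurrence = defaultdict(Counter)
    for label, tag in zip(induced, gold):
        cooccurrence[label][tag] += 1

    # Map each induced label to its most frequent gold tag.
    # Several labels may share a tag, hence "many-to-1".
    mapping = {
        label: counts.most_common(1)[0][0]
        for label, counts in cooccurrence.items()
    }

    # Tag accuracy after relabeling every token through the mapping.
    correct = sum(mapping[label] == tag for label, tag in zip(induced, gold))
    return mapping, correct / len(gold)


# Toy example (made-up data, not from the Penn Treebank).
toy_induced = [0, 0, 0, 1, 1, 2]
toy_gold = ["NN", "NN", "VB", "VB", "VB", "NN"]
toy_mapping, toy_accuracy = many_to_one_accuracy(toy_induced, toy_gold)
```

In the toy example, labels 0 and 2 both map to NN and label 1 maps to VB, so five of the six tokens are scored as correct. Because the mapping is chosen on the same data it is scored on, the measure tends to reward systems that induce more labels, which is why the number of induced labels is fixed at 45 here.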

Results

Systems are listed in order of decreasing many-to-1 accuracy.

System name	Short description	Main publications	Software	Many-to-1
UPOS	Learning Syntactic Categories Using Paradigmatic Representations of Word Context	Yatbaz et al. (2012)	upos	80.2%
Brown+proto	MRF initialized with Brown prototypes	Christodoulopoulos, Goldwater and Steedman (2010)		76.1%
	Logistic regression with features and LBFGS	Berg-Kirkpatrick et al. (2010)		75.5%
Clark DMF	Distributional clustering + morphology + frequency	Clark (2003)	alexc	71.2%*