POS Induction (State of the art)
Software
- alexc: http://www.cs.rhul.ac.uk/home/alexc/pos2.tar.gz
- brown-cluster: https://github.com/percyliang/brown-cluster
- mkcls: https://code.google.com/p/giza-pp/
- unsupos: http://wortschatz.uni-leipzig.de/~cbiemann/software/unsupos.html
- upos: https://github.com/ai-ku/upos
Latest revision as of 17:40, 7 March 2014
Evaluation
Many-to-1: Map each induced label greedily to the gold-standard tag it co-occurs with most often, so several induced labels may share one tag (45 induced labels, 45 tags in the Penn tag set). Use this mapping to compute tagging accuracy on the Wall Street Journal portion of the Penn Treebank.
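The many-to-1 measure can be sketched in a few lines of Python. This is only an illustration of the mapping described above, not the evaluation script used by any of the listed systems; the function name `many_to_one_accuracy` and the toy data are assumptions made for the example, and ties between equally frequent gold tags are broken arbitrarily.

```python
from collections import Counter, defaultdict


def many_to_one_accuracy(induced, gold):
    """Return (accuracy, mapping) under the greedy many-to-1 mapping.

    `induced` and `gold` are parallel sequences with one label per token.
    Hypothetical helper written for illustration only.
    """
    if len(induced) != len(gold):
        raise ValueError("induced and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty corpus")

    # Count how often each induced label co-occurs with each gold tag.
    cooccurrence = defaultdict(Counter)
    for label, tag in zip(induced, gold):
        cooccurrence[label][tag] += 1

    # Map every induced label to its most frequent gold tag.
    # Several induced labels may map to the same tag (many-to-1).
    mapping = {
        label: counts.most_common(1)[0][0]
        for label, counts in cooccurrence.items()
    }

    # Token-level accuracy after relabelling the induced labels.
    correct = sum(mapping[label] == tag for label, tag in zip(induced, gold))
    return correct / len(gold), mapping


# Toy example: three induced clusters over six tokens.
induced = [0, 1, 0, 2, 1, 0]
gold = ["DT", "NN", "DT", "VB", "NN", "NN"]
accuracy, mapping = many_to_one_accuracy(induced, gold)
print(mapping, accuracy)
```

In the toy example, cluster 0 maps to DT, cluster 1 to NN and cluster 2 to VB, so five of the six tokens are scored as correct. Because each cluster independently picks its best tag, the score can only stay the same or rise as the number of induced labels grows, which is why the label count is fixed at 45 here.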
Results
Listed in order of decreasing accuracy
| System name | Short description | Main publications | Software | Many-to-1 |
|---|---|---|---|---|
| UPOS | Learning Syntactic Categories Using Paradigmatic Representations of Word Context | Yatbaz et al. (2012) | upos | 80.2% |
| Brown+proto | MRF initialized with Brown prototypes | Christodoulopoulos, Goldwater and Steedman (2010) | | 76.1% |
| | Logistic regression with features and LBFGS | Berg-Kirkpatrick et al. (2010) | | 75.5% |
| Clark DMF | Distributional clustering + morphology + frequency | Clark (2003) | alexc | 71.2%* |
* according to Christodoulopoulos, Goldwater and Steedman (2010)
References
Listed alphabetically.
- Berg-Kirkpatrick, Taylor, Alexandre Bouchard-Côté, John DeNero and Dan Klein. 2010. Painless Unsupervised Learning with Features. In Proceedings of NAACL 2010. http://www.aclweb.org/anthology/N/N10/N10-1083.pdf
- Christodoulopoulos, Christos, Sharon Goldwater and Mark Steedman. 2010. Two Decades of Unsupervised POS induction: How far have we come? In Proceedings of EMNLP 2010. http://www.aclweb.org/anthology/D/D10/D10-1056.pdf
- Clark, Alexander. 2003. Combining distributional and morphological information for part of speech induction. In Proceedings of EACL 2003, pages 59–66, Morristown, NJ, USA. http://www.aclweb.org/anthology/E/E03/E03-1009.pdf
- Yatbaz, Mehmet Ali, Enis Sert and Deniz Yuret. 2012. Learning Syntactic Categories Using Paradigmatic Representations of Word Context. In Proceedings of EMNLP 2012, pages 940–951. http://aclweb.org/anthology//D/D12/D12-1086.pdf