Homa B. Hashemi


2017

A Corpus of Annotated Revisions for Studying Argumentative Writing
Fan Zhang | Homa B. Hashemi | Rebecca Hwa | Diane Litman
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

This paper presents ArgRewrite, a corpus of between-draft revisions of argumentative essays. Drafts are manually aligned at the sentence level, and the writer’s purpose for each revision is annotated with categories analogous to those used in argument mining and discourse analysis. The corpus should enable advanced research in writing comparison and revision analysis, as demonstrated via our own studies of student revision behavior and of automatic revision purpose prediction.

2016

An Evaluation of Parser Robustness for Ungrammatical Sentences
Homa B. Hashemi | Rebecca Hwa
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

ArgRewrite: A Web-based Revision Assistant for Argumentative Writings
Fan Zhang | Rebecca Hwa | Diane Litman | Homa B. Hashemi
Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations

2014

A Comparison of MT Errors and ESL Errors
Homa B. Hashemi | Rebecca Hwa
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

Generating fluent and grammatical sentences is a major goal for both Machine Translation (MT) and second-language Grammar Error Correction (GEC), but there has been little cross-fertilization between the two research communities. Arguably, an automatic translate-to-English system might be seen as an English as a Second Language (ESL) writer whose native language is the source language. This paper investigates whether research findings from the GEC community may help characterize MT errors. We describe a method for the automatic classification of MT errors according to ESL error categories and conduct a large comparison experiment that includes both high-performing and low-performing translate-to-English MT systems for several source languages. Comparing the distributions of MT error types across all the systems suggests that MT systems have fairly similar error distributions regardless of their source languages, and that high-performing MT systems have error distributions more similar to those of low-performing MT systems than to those of ESL learners with the same L1.