Jiefu Gong


2023

pdf bib
Sentence Ordering with a Coherence Verifier
Sainan Jia | Wei Song | Jiefu Gong | Shijin Wang | Ting Liu
Findings of the Association for Computational Linguistics: ACL 2023

This paper presents a novel sentence ordering method by plugging a coherence verifier (CoVer) into pair-wise ranking-based and sequence generation-based methods. It does not change the model parameters of the baseline, and only verifies the coherence of candidate (partial) orders produced by the baseline and reranks them in beam search. We also propose a coherence model as CoVer with a novel graph formulation and a novel data construction strategy for contrastive pre-training independently of the sentence ordering task. Experimental results on four benchmarks demonstrate the effectiveness of our method with topological sorting-based and pointer network-based methods as the baselines. Detailed analyses illustrate how CoVer improves the baselines and confirm the importance of its graph formulation and training strategy. Our code is available at https://github.com/SN-Jia/SO_with_CoVer.

pdf bib
Chinese Metaphorical Relation Extraction: Dataset and Models
Guihua Chen | Tiantian Wu | MiaoMiao Cheng | Xu Han | Jiefu Gong | Shijin Wang | Wei Song
Findings of the Association for Computational Linguistics: EMNLP 2023

Metaphor identification is usually formulated as a sequence labeling or a syntactically related word-pair classification problem. In this paper, we propose a novel formulation of metaphor identification as a relation extraction problem. We introduce metaphorical relations, which are links between two spans, a target span and a source-related span, which are realized in sentences. Based on spans, we can use more flexible and precise text units beyond single words for capturing the properties of the target and the source. We create a dataset for Chinese metaphorical relation extraction, with more than 4,200 sentences annotated with metaphorical relations, corresponding target/source-related spans, and fine-grained span types. We develop a span-based end-to-end model for metaphorical relation extraction and demonstrate its effectiveness. We expect that metaphorical relation extraction can serve as a bridge for connecting linguistic and conceptual metaphor processing. The dataset is at https://github.com/cnunlp/CMRE.

2021

pdf bib
IFlyEA: A Chinese Essay Assessment System with Automated Rating, Review Generation, and Recommendation
Jiefu Gong | Xiao Hu | Wei Song | Ruiji Fu | Zhichao Sheng | Bo Zhu | Shijin Wang | Ting Liu
Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing: System Demonstrations

Automated Essay Assessment (AEA) aims to judge students’ writing proficiency in an automatic way. This paper presents a Chinese AEA system IFlyEssayAssess (IFlyEA), targeting on evaluating essays written by native Chinese students from primary and junior schools. IFlyEA provides multi-level and multi-dimension analytical modules for essay assessment. It has state-of-the-art grammar level analysis techniques, and also integrates components for rhetoric and discourse level analysis, which are important for evaluating native speakers’ writing ability, but still challenging and less studied in previous work. Based on the comprehensive analysis, IFlyEA provides application services for essay scoring, review generation, recommendation, and explainable analytical visualization. These services can benefit both teachers and students during the process of writing teaching and learning.

2020

pdf bib
Combining ResNet and Transformer for Chinese Grammatical Error Diagnosis
Shaolei Wang | Baoxin Wang | Jiefu Gong | Zhongyuan Wang | Xiao Hu | Xingyi Duan | Zizhuo Shen | Gang Yue | Ruiji Fu | Dayong Wu | Wanxiang Che | Shijin Wang | Guoping Hu | Ting Liu
Proceedings of the 6th Workshop on Natural Language Processing Techniques for Educational Applications

Grammatical error diagnosis is an important task in natural language processing. This paper introduces our system at NLPTEA-2020 Task: Chinese Grammatical Error Diagnosis (CGED). CGED aims to diagnose four types of grammatical errors which are missing words (M), redundant words (R), bad word selection (S) and disordered words (W). Our system is built on the model of multi-layer bidirectional transformer encoder and ResNet is integrated into the encoder to improve the performance. We also explore two ensemble strategies including weighted averaging and stepwise ensemble selection from libraries of models to improve the performance of single model. In official evaluation, our system obtains the highest F1 scores at identification level and position level. We also recommend error corrections for specific error types and achieve the second highest F1 score at correction level.

2018

pdf bib
Chinese Grammatical Error Diagnosis using Statistical and Prior Knowledge driven Features with Probabilistic Ensemble Enhancement
Ruiji Fu | Zhengqi Pei | Jiefu Gong | Wei Song | Dechuan Teng | Wanxiang Che | Shijin Wang | Guoping Hu | Ting Liu
Proceedings of the 5th Workshop on Natural Language Processing Techniques for Educational Applications

This paper describes our system at NLPTEA-2018 Task #1: Chinese Grammatical Error Diagnosis. Grammatical Error Diagnosis is one of the most challenging NLP tasks, which is to locate grammar errors and tell error types. Our system is built on the model of bidirectional Long Short-Term Memory with a conditional random field layer (BiLSTM-CRF) but integrates with several new features. First, richer features are considered in the BiLSTM-CRF model; second, a probabilistic ensemble approach is adopted; third, Template Matcher are used during a post-processing to bring in human knowledge. In official evaluation, our system obtains the highest F1 scores at identifying error types and locating error positions, the second highest F1 score at sentence level error detection. We also recommend error corrections for specific error types and achieve the best F1 performance among all participants.