Tiejun Zhao

Also published as: Tie-jun Zhao, Tie-Jun Zhao, TieJun Zhao


2019

CN-HIT-MI.T at SemEval-2019 Task 6: Offensive Language Identification Based on BiLSTM with Double Attention
Yaojie Zhang | Bing Xu | Tiejun Zhao
Proceedings of the 13th International Workshop on Semantic Evaluation

Offensive language has become pervasive in social media. In Offensive Language Identification tasks, it may be difficult to predict accurately based on surface words alone, so we try to extract deeper semantic information from the text. This paper presents an attention-based two-layer bidirectional long short-term memory neural network (BiLSTM) for semantic feature extraction. Additionally, a residual connection mechanism is used to synthesize two different deep features, and an emoji attention mechanism is used to extract semantic information of emojis in text. We participated in three sub-tasks of SemEval 2019 Task 6 as the CN-HIT-MI.T team. Our macro-averaged F1-score in sub-task A is 0.768, ranking 28/103. We got 0.638 in sub-task B, ranking 30/75. In sub-task C, we got 0.549, ranking 22/65. We also tried some other methods whose results were not submitted.

Understanding and Improving Hidden Representations for Neural Machine Translation
Guanlin Li | Lemao Liu | Xintong Li | Conghui Zhu | Tiejun Zhao | Shuming Shi
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

Multilayer architectures are currently the gold standard for large-scale neural machine translation. Existing works have explored some methods for understanding the hidden representations; however, they have not sought to improve the translation quality rationally according to their understanding. Towards understanding for performance improvement, we first artificially construct a sequence of nested relative tasks and measure the feature generalization ability of the learned hidden representation over these tasks. Based on our understanding, we then propose to regularize the layer-wise representations with all tree-induced tasks. To overcome the computational bottleneck resulting from the large number of regularization terms, we design efficient approximation methods by selecting a few coarse-to-fine tasks for regularization. Extensive experiments on two widely-used datasets demonstrate that the proposed methods lead to only small extra overheads in training and no additional overheads in testing, and achieve consistent improvements (up to +1.3 BLEU) compared to the state-of-the-art translation model.

Improving Neural Machine Translation with Neural Syntactic Distance
Chunpeng Ma | Akihiro Tamura | Masao Utiyama | Eiichiro Sumita | Tiejun Zhao
Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)

The explicit use of syntactic information has been proved useful for neural machine translation (NMT). However, previous methods resort to either tree-structured neural networks or long linearized sequences, both of which are inefficient. Neural syntactic distance (NSD) enables us to represent a constituent tree using a sequence whose length is identical to the number of words in the sentence. NSD has been used for constituent parsing, but not in machine translation. We propose five strategies to improve NMT with NSD. Experiments show that it is not trivial to improve NMT with NSD; however, the proposed strategies are shown to improve translation performance of the baseline model (+2.1 (En–Ja), +1.3 (Ja–En), +1.2 (En–Ch), and +1.0 (Ch–En) BLEU).

Unsupervised Bilingual Word Embedding Agreement for Unsupervised Neural Machine Translation
Haipeng Sun | Rui Wang | Kehai Chen | Masao Utiyama | Eiichiro Sumita | Tiejun Zhao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Unsupervised bilingual word embedding (UBWE), together with other technologies such as back-translation and denoising, has helped unsupervised neural machine translation (UNMT) achieve remarkable results in several language pairs. In previous methods, UBWE is first trained using non-parallel monolingual corpora and then this pre-trained UBWE is used to initialize the word embedding in the encoder and decoder of UNMT. That is, UBWE and UNMT are trained separately. In this paper, we first empirically investigate the relationship between UBWE and UNMT. The empirical findings show that the performance of UNMT is significantly affected by the performance of UBWE. Thus, we propose two methods that train UNMT with UBWE agreement. Empirical results on several language pairs show that the proposed methods significantly outperform conventional UNMT.

Sentence-Level Agreement for Neural Machine Translation
Mingming Yang | Rui Wang | Kehai Chen | Masao Utiyama | Eiichiro Sumita | Min Zhang | Tiejun Zhao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

The training objective of neural machine translation (NMT) is to minimize the loss between the words in the translated sentences and those in the references. In NMT, there is a natural correspondence between the source sentence and the target sentence. However, this relationship has only been represented using the entire neural network, and the training objective is computed at the word level. In this paper, we propose a sentence-level agreement module to directly minimize the difference between the representations of the source and target sentences. The proposed agreement module can be integrated into NMT as an additional training objective function and can also be used to enhance the representation of the source sentences. Empirical results on the NIST Chinese-to-English and WMT English-to-German tasks show that the proposed agreement module can significantly improve NMT performance.

Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets
Guanhua Zhang | Bing Bai | Jian Liang | Kun Bai | Shiyu Chang | Mo Yu | Conghui Zhu | Tiejun Zhao
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Natural Language Sentence Matching (NLSM) has gained substantial attention from both academia and industry, and rich public datasets contribute a lot to this process. However, biased datasets can also hurt the generalization performance of trained models and give untrustworthy evaluation results. For many NLSM datasets, the providers select some pairs of sentences for inclusion in the datasets, and this sampling procedure can easily introduce unintended patterns, i.e., selection bias. One example is the QuoraQP dataset, where some content-independent naive features are unreasonably predictive. Such features reflect the selection bias and are termed “leakage features.” In this paper, we investigate the problem of selection bias on six NLSM datasets and find that four of them are significantly biased. We further propose a training and evaluation framework to alleviate the bias. Experimental results on QuoraQP suggest that the proposed framework can improve the generalization ability of trained models and give more trustworthy evaluation results for real-world adoption.

2018

Neural Document Summarization by Jointly Learning to Score and Select Sentences
Qingyu Zhou | Nan Yang | Furu Wei | Shaohan Huang | Ming Zhou | Tiejun Zhao
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Sentence scoring and sentence selection are two main steps in extractive document summarization systems. However, previous works treat them as two separate subtasks. In this paper, we present a novel end-to-end neural network framework for extractive document summarization by jointly learning to score and select sentences. It first reads the document sentences with a hierarchical encoder to obtain the representation of sentences. Then it builds the output summary by extracting sentences one by one. Unlike previous methods, our approach integrates the selection strategy into the scoring model, which directly predicts the relative importance given previously selected sentences. Experiments on the CNN/Daily Mail dataset show that the proposed framework significantly outperforms the state-of-the-art extractive summarization models.

Forest-Based Neural Machine Translation
Chunpeng Ma | Akihiro Tamura | Masao Utiyama | Tiejun Zhao | Eiichiro Sumita
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Tree-based neural machine translation (NMT) approaches, although they have achieved impressive performance, suffer from a major drawback: they only use the 1-best parse tree to direct the translation, which potentially introduces translation mistakes due to parsing errors. For statistical machine translation (SMT), forest-based methods have been proven to be effective for solving this problem, while for NMT this kind of approach has not been attempted. This paper proposes a forest-based NMT method that translates a linearized packed forest under a simple sequence-to-sequence framework (i.e., a forest-to-sequence NMT model). The BLEU score of the proposed method is higher than that of the sequence-to-sequence NMT, tree-based NMT, and forest-based SMT systems.

2017

Context-Aware Smoothing for Neural Machine Translation
Kehai Chen | Rui Wang | Masao Utiyama | Eiichiro Sumita | Tiejun Zhao
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

In Neural Machine Translation (NMT), each word is represented as a low-dimensional, real-valued vector encoding its syntactic and semantic information. This means that even if the word appears in a different sentence context, it is represented by the same fixed vector when learning the source representation. Moreover, a large number of Out-Of-Vocabulary (OOV) words, which have different syntactic and semantic information, are represented by the same vector representation of “unk”. To alleviate this problem, we propose a novel context-aware smoothing method to dynamically learn a sentence-specific vector for each word (including OOV words) depending on its local context words in a sentence. The learned context-aware representation is integrated into the NMT to improve the translation performance. Empirical results on the NIST Chinese-to-English translation task show that the proposed approach achieves an average improvement of 1.78 BLEU points over a strong attentional NMT, and outperforms some existing systems.

Neural Machine Translation with Source Dependency Representation
Kehai Chen | Rui Wang | Masao Utiyama | Lemao Liu | Akihiro Tamura | Eiichiro Sumita | Tiejun Zhao
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing

Source dependency information has been successfully introduced into statistical machine translation. However, there are only a few preliminary attempts for Neural Machine Translation (NMT), such as concatenating the representations of a source word and its dependency label. In this paper, we propose a novel NMT with source dependency representation to improve the translation performance of NMT, especially for long sentences. Empirical results on the NIST Chinese-to-English translation task show that our method achieves an average improvement of 1.6 BLEU points over a strong NMT system.

2016

Building A Case-based Semantic English-Chinese Parallel Treebank
Huaxing Shi | Tiejun Zhao | Keh-Yih Su
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

We construct a case-based English-to-Chinese semantic constituent parallel Treebank for a Statistical Machine Translation (SMT) task by labelling each node of the Deep Syntactic Tree (DST) with our refined semantic cases. Since subtree span-crossing is harmful in tree-based SMT, DST is adopted to alleviate this problem. At the same time, we tailor an existing case set to represent bilingual shallow semantic relations more precisely. This Treebank is a part of a semantic corpus building project, which aims to build a semantic bilingual corpus annotated with syntactic, semantic cases and word senses. Data in our Treebank is from the news domain of Datum corpus. 4,000 sentence pairs are selected to cover various lexicons and part-of-speech (POS) n-gram patterns as much as possible. This paper presents the construction of this case Treebank. Also, we have tested the effect of adopting DST structure in alleviating subtree span-crossing. Our preliminary analysis shows that the compatibility between Chinese and English trees can be significantly increased by transforming the parse-tree into the DST. Furthermore, the human agreement rate in annotation is found to be acceptable (90% in English nodes, 75% in Chinese nodes).

A Distribution-based Model to Learn Bilingual Word Embeddings
Hailong Cao | Tiejun Zhao | Shu Zhang | Yao Meng
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We introduce a distribution-based model to learn bilingual word embeddings from monolingual data. It is simple, effective and does not require any parallel data or any seed lexicon. We take advantage of the fact that word embeddings are usually in the form of dense, real-valued, low-dimensional vectors, and therefore their distribution can be accurately estimated. A novel cross-lingual learning objective is proposed which directly matches the distribution of word embeddings in one language with that in the other language. During the joint learning process, we dynamically estimate the distributions of word embeddings in the two languages respectively and minimize the dissimilarity between them through the standard backpropagation algorithm. Our learned bilingual word embeddings allow each word and its translations to be grouped together in the shared vector space. We demonstrate the utility of the learned embeddings on the task of finding word-to-word translations from monolingual corpora. Our model achieved encouraging performance on data in both related languages and substantially different languages.

Constraint-Based Question Answering with Knowledge Graph
Junwei Bao | Nan Duan | Zhao Yan | Ming Zhou | Tiejun Zhao
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

WebQuestions and SimpleQuestions are two benchmark data-sets commonly used in recent knowledge-based question answering (KBQA) work. Most questions in them are ‘simple’ questions which can be answered based on a single relation in the knowledge base. Such data-sets lack the capability of evaluating KBQA systems on complicated questions. Motivated by this issue, we release a new data-set, namely ComplexQuestions, aiming to measure the quality of KBQA systems on ‘multi-constraint’ questions which require multiple knowledge base relations to get the answer. Besides, we propose a novel systematic KBQA approach to solve multi-constraint questions. Compared to state-of-the-art methods, our approach not only obtains comparable results on the two existing benchmark data-sets, but also achieves significant improvements on ComplexQuestions.

2015

Efficient Disfluency Detection with Transition-based Parsing
Shuangzhi Wu | Dongdong Zhang | Ming Zhou | Tiejun Zhao
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Knowledge-Based Question Answering as Machine Translation
Junwei Bao | Nan Duan | Ming Zhou | Tiejun Zhao
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Improving Pivot-Based Statistical Machine Translation by Pivoting the Co-occurrence Count of Phrase Pairs
Xiaoning Zhu | Zhongjun He | Hua Wu | Conghui Zhu | Haifeng Wang | Tiejun Zhao
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

A Lexicalized Reordering Model for Hierarchical Phrase-based Translation
Hailong Cao | Dongdong Zhang | Mu Li | Ming Zhou | Tiejun Zhao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

Soft Dependency Matching for Hierarchical Phrase-based Machine Translation
Hailong Cao | Dongdong Zhang | Ming Zhou | Tiejun Zhao
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2013

Additive Neural Networks for Statistical Machine Translation
Lemao Liu | Taro Watanabe | Eiichiro Sumita | Tiejun Zhao
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Hierarchical Phrase Table Combination for Machine Translation
Conghui Zhu | Taro Watanabe | Eiichiro Sumita | Tiejun Zhao
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Cross-lingual Projections between Languages from Different Families
Mo Yu | Tiejun Zhao | Yalong Bai | Hao Tian | Dianhai Yu
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

A Tightly-coupled Unsupervised Clustering and Bilingual Alignment Model for Transliteration
Tingting Li | Tiejun Zhao | Andrew Finch | Chunyue Zhang
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Tuning SMT with a Large Number of Features via Online Feature Grouping
Lemao Liu | Tiejun Zhao | Taro Watanabe | Eiichiro Sumita
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Repairing Incorrect Translation with Examples
Junguo Zhu | Muyun Yang | Sheng Li | Tiejun Zhao
Proceedings of the Sixth International Joint Conference on Natural Language Processing

Compound Embedding Features for Semi-supervised Learning
Mo Yu | Tiejun Zhao | Daxiang Dong | Hao Tian | Dianhai Yu
Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

Improving Pivot-Based Statistical Machine Translation Using Random Walk
Xiaoning Zhu | Zhongjun He | Hua Wu | Haifeng Wang | Conghui Zhu | Tiejun Zhao
Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing

2012

Syllable-based Machine Transliteration with Extra Phrase Features
Chunyue Zhang | Tingting Li | Tiejun Zhao
Proceedings of the 4th Named Entity Workshop (NEWS) 2012

Locally Training the Log-Linear Model for SMT
Lemao Liu | Hailong Cao | Taro Watanabe | Tiejun Zhao | Mo Yu | Conghui Zhu
Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning

Expected Error Minimization with Ultraconservative Update for SMT
Lemao Liu | Tiejun Zhao | Taro Watanabe | Hailong Cao | Conghui Zhu
Proceedings of COLING 2012: Posters

2011

Target-dependent Twitter Sentiment Classification
Long Jiang | Mo Yu | Ming Zhou | Xiaohua Liu | Tiejun Zhao
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies

2010

A Joint Rule Selection Model for Hierarchical Phrase-Based Translation
Lei Cui | Dongdong Zhang | Mu Li | Ming Zhou | Tiejun Zhao
Proceedings of the ACL 2010 Conference Short Papers

PKU_HIT: An Event Detection System Based on Instances Expansion and Rich Syntactic Features
Shiqi Li | Pengyuan Liu | Tiejun Zhao | Qin Lu | Hanjing Li
Proceedings of the 5th International Workshop on Semantic Evaluation

PengYuan@PKU: Extracting Infrequent Sense Instance with the Same N-Gram Pattern for the SemEval-2010 Task 15
Peng-Yuan Liu | Shi-Wen Yu | Shui Liu | Tie-Jun Zhao
Proceedings of the 5th International Workshop on Semantic Evaluation

Using Deep Belief Nets for Chinese Named Entity Categorization
Yu Chen | You Ouyang | Wenjie Li | Dequan Zheng | Tiejun Zhao
Proceedings of the 2010 Named Entities Workshop

Exploring Deep Belief Network for Chinese Relation Extraction
Yu Chen | Wenjie Li | Yan Liu | Dequan Zheng | Tiejun Zhao
CIPS-SIGHAN Joint Conference on Chinese Language Processing

Hybrid Decoding: Decoding with Partial Hypotheses Combination over Multiple SMT Systems
Lei Cui | Dongdong Zhang | Mu Li | Ming Zhou | Tiejun Zhao
Coling 2010: Posters

Combining Constituent and Dependency Syntactic Views for Chinese Semantic Role Labeling
Shiqi Li | Qin Lu | Tiejun Zhao | Pengyuan Liu | Hanjing Li
Coling 2010: Posters

Reexamination on Potential for Personalization in Web Search
Daren Li | Muyun Yang | HaoLiang Qi | Sheng Li | Tiejun Zhao
Coling 2010: Posters

Head-modifier Relation based Non-lexical Reordering Model for Phrase-Based Translation
Shui Liu | Sheng Li | Tiejun Zhao | Min Zhang | Pengyuan Liu
Coling 2010: Posters

Utilizing Variability of Time and Term Content, within and across Users in Session Detection
Shuqi Sun | Sheng Li | Muyun Yang | Haoliang Qi | Tiejun Zhao
Coling 2010: Posters

All in Strings: a Powerful String-based Automatic MT Evaluation Metric with Multiple Granularities
Junguo Zhu | Muyun Yang | Bo Wang | Sheng Li | Tiejun Zhao
Coling 2010: Posters

2009

References Extension for the Automatic Evaluation of MT by Syntactic Hybridization
Bo Wang | Tiejun Zhao | Muyun Yang | Sheng Li
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009

A Study of Translation Rule Classification for Syntax-based Statistical Machine Translation
Hongfei Jiang | Sheng Li | Muyun Yang | Tiejun Zhao
Proceedings of the Third Workshop on Syntax and Structure in Statistical Translation (SSST-3) at NAACL HLT 2009

Train the Machine with What It Can Learn—Corpus Selection for SMT
Xiwu Han | Hanzhang Li | Tiejun Zhao
Proceedings of the 2nd Workshop on Building and Using Comparable Corpora: from Parallel to Non-parallel Corpora (BUCC)

A Statistical Machine Translation Model Based on a Synthetic Synchronous Grammar
Hongfei Jiang | Muyun Yang | Tiejun Zhao | Sheng Li | Bo Wang
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

Chinese Term Extraction Using Different Types of Relevance
Yuhang Yang | Tiejun Zhao | Qin Lu | Dequan Zheng | Hao Yu
Proceedings of the ACL-IJCNLP 2009 Conference Short Papers

2008

Chinese Term Extraction Using Minimal Resources
Yuhang Yang | Qin Lu | Tiejun Zhao
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Diagnostic Evaluation of Machine Translation Systems Using Automatically Constructed Linguistic Check-Points
Ming Zhou | Bo Wang | Shujie Liu | Mu Li | Dongdong Zhang | Tiejun Zhao
Proceedings of the 22nd International Conference on Computational Linguistics (Coling 2008)

Chinese Term Extraction Based on Delimiters
Yuhang Yang | Qin Lu | Tiejun Zhao
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

2007

A Unified Tagging Approach to Text Normalization
Conghui Zhu | Jie Tang | Hang Li | Hwee Tou Ng | Tiejun Zhao
Proceedings of the 45th Annual Meeting of the Association of Computational Linguistics

HIT-WSD: Using Search Engine for Multilingual Chinese-English Lexical Sample Task
PengYuan Liu | TieJun Zhao | MuYun Yang
Proceedings of the Fourth International Workshop on Semantic Evaluations (SemEval-2007)

Meta-Structure Transformation Model for Statistical Machine Translation
Jiadong Sun | Tiejun Zhao | Huashen Liang
Proceedings of the Second Workshop on Statistical Machine Translation

The Extraction of Trajectories from Real Texts Based on Linear Classification
Hanjing Li | Tiejun Zhao | Sheng Li | Jiyuan Zhao
Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA 2007)

2006

Improving English Subcategorization Acquisition with Diathesis Alternations as Heuristic Information
Xiwu Han | Tiejun Zhao | Xingshang Fu
Proceedings of the COLING/ACL 2006 Main Conference Poster Sessions

Two-Fold Filtering for Chinese Subcategorization Acquisition with Diathesis Alternations Used as Heuristic Information
Xiwu Han | Tiejun Zhao
International Journal of Computational Linguistics & Chinese Language Processing, Volume 11, Number 2, June 2006

2005

A Hybrid Chinese Language Model based on a Combination of Ontology with Statistical Method
Dequan Zheng | Tiejun Zhao | Sheng Li | Hao Yu
Companion Volume to the Proceedings of Conference including Posters/Demos and tutorial abstracts

2004

Subcategorization Acquisition and Evaluation for Chinese Verbs
Xiwu Han | Tiejun Zhao | Haoliang Qi | Hao Yu
COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics

2002

Automatic Information Transfer between English and Chinese
Jianmin Yao | Hao Yu | Tiejun Zhao | Xiaohong Li
COLING-02: Machine Translation in Asia

Learning Chinese Bracketing Knowledge Based on a Bilingual Language Model
Yajuan Lü | Sheng Li | Tiejun Zhao | Muyun Yang
COLING 2002: The 19th International Conference on Computational Linguistics

An Automatic Evaluation Method for Localization Oriented Lexicalised EBMT System
Jianmin Yao | Ming Zhou | Tiejun Zhao | Hao Yu | Sheng Li
COLING 2002: The 19th International Conference on Computational Linguistics

2001

Automatic Translation Template Acquisition Based on Bilingual Structure Alignment
Yajuan Lu | Ming Zhou | Sheng Li | Changning Huang | Tiejun Zhao
International Journal of Computational Linguistics & Chinese Language Processing, Volume 6, Number 1, February 2001: Special Issue on Natural Language Processing Researches in MSRA

Automatic Detection of Prosody Phrase Boundaries for Text-to-Speech System
Xin Lv | Tie-jun Zhao | Zhan-yi Liu | Mu-yun Yang
Proceedings of the Seventh International Workshop on Parsing Technologies

2000

Statistics Based Hybrid Approach to Chinese Base Phrase Identification
Tie-jun Zhao | Mu-yun Yang | Fang Liu | Jian-min Yao | Hao Yu
Second Chinese Language Processing Workshop