Jixing Li


2023

Roles of Scaling and Instruction Tuning in Language Perception: Model vs. Human Attention
Changjiang Gao | Shujian Huang | Jixing Li | Jiajun Chen
Findings of the Association for Computational Linguistics: EMNLP 2023

Recent large language models (LLMs) have shown strong abilities to understand natural language. Since most of them share the same basic architecture, the transformer block, the likely contributors to their success during training are scaling and instruction tuning. However, how these factors affect the models' language perception is unclear. This work compares the self-attention of several existing LLMs (LLaMA, Alpaca and Vicuna) at different sizes (7B, 13B, 30B, 65B) with eye saccades, an aspect of human reading attention, to assess the effect of scaling and instruction tuning on language perception. Results show that scaling increases resemblance to human attention and improves effective attention by reducing reliance on trivial patterns, while instruction tuning does not; instruction tuning does, however, significantly enhance the models' sensitivity to instructions. We also find that current LLMs are consistently closer to non-native than to native speakers in their attention, suggesting sub-optimal language perception in all models. Our code and data used in the analysis are available on GitHub.
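
Purely as an illustration of this kind of comparison, the sketch below extracts self-attention from a small pretrained causal language model and correlates a per-token "attention received" profile with a placeholder saccade vector. The choice of gpt2 as a stand-in for the LLaMA/Alpaca/Vicuna checkpoints, the averaging over layers and heads, and the random human_saccade_profile are assumptions for illustration, not the paper's procedure.

```python
# Hedged sketch: compare model self-attention with a saccade-based profile.
# "gpt2" stands in for the LLaMA/Alpaca/Vicuna checkpoints, and
# `human_saccade_profile` is a random placeholder for per-token saccade data;
# neither reflects the paper's actual setup.
import numpy as np
from scipy.stats import spearmanr
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, attn_implementation="eager")

sentence = "The editor who the author praised revised the manuscript."
inputs = tokenizer(sentence, return_tensors="pt")
outputs = model(**inputs, output_attentions=True)

# outputs.attentions: one (batch, heads, seq_len, seq_len) tensor per layer.
# Average over layers and heads, then sum the attention each token *receives*
# as a crude per-token salience profile.
att = np.stack([a.detach().numpy() for a in outputs.attentions])  # (L, 1, H, T, T)
att = att.mean(axis=(0, 2)).squeeze(0)   # (T, T): averaged over layers and heads
model_profile = att.sum(axis=0)          # attention received per token

# Placeholder for normalized human saccade counts aligned to the same tokens.
human_saccade_profile = np.random.rand(model_profile.shape[0])

rho, p = spearmanr(model_profile, human_saccade_profile)
print(f"Spearman rho between model attention and saccade profile: {rho:.2f} (p={p:.3f})")
```

In practice the human measure would come from aligned eye-tracking records rather than a random vector, and the aggregation over layers and heads would be a deliberate analysis choice.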

2022

Quantifying Discourse Support for Omitted Pronouns
Shulin Zhang | Jixing Li | John Hale
Proceedings of the Fifth Workshop on Computational Models of Reference, Anaphora and Coreference

Pro-drop is common in many languages, but its discourse motivations have not been well characterized. Inspired by the topic chain theory in Chinese, this study shows how character-verb usage continuity distinguishes dropped pronouns from overt references to story characters. We model the choice to drop vs. not drop a pronoun as a function of character-verb continuity. The results show that omitted subjects have higher continuity salience between a character's verb-usage history and the current verb than non-omitted subjects. This is consistent with the idea that discourse coherence with a particular topic, such as a story character, facilitates the omission of pronouns in languages and contexts where they are optional.
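
A minimal sketch of this kind of model, assuming a single continuity predictor and toy data: a logistic regression of dropped vs. overt subject on a continuity score. The column values are illustrative; the paper's measure is computed from each character's verb-usage history.

```python
# Hedged sketch: predict pronoun omission from a continuity score.
# The toy data are illustrative assumptions; the paper's continuity measure
# is derived from each character's verb-usage history in the story.
import numpy as np
from sklearn.linear_model import LogisticRegression

# One row per subject mention.
# continuity: character-history / current-verb continuity salience (higher = more continuous)
# dropped:    1 if the subject pronoun was omitted, 0 if overt
continuity = np.array([0.9, 0.8, 0.75, 0.6, 0.4, 0.3, 0.2, 0.1]).reshape(-1, 1)
dropped = np.array([1, 1, 1, 1, 0, 0, 0, 0])

clf = LogisticRegression().fit(continuity, dropped)
print("coef (continuity):", clf.coef_[0][0])            # positive => continuity favors omission
print("P(drop | continuity=0.7):", clf.predict_proba([[0.7]])[0, 1])
```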

2018

Modeling Brain Activity Associated with Pronoun Resolution in English and Chinese
Jixing Li | Murielle Fabre | Wen-Ming Luh | John Hale
Proceedings of the First Workshop on Computational Models of Reference, Anaphora and Coreference

Typological differences between English and Chinese suggest a stronger reliance on the salience of the antecedent during pronoun resolution in Chinese. We examined this hypothesis by correlating a difficulty measure of pronoun resolution, derived from the activation-based ACT-R model, with the brain activity of English and Chinese participants listening to the same audiobook during fMRI recording. The ACT-R model predicts higher overall difficulty for English speakers, which is supported at the brain level in left Broca's area. More generally, it confirms that a computational modeling approach can dissociate the different dimensions involved in the complex process of pronoun resolution in the brain.
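
For readers unfamiliar with activation-based measures, the sketch below uses the standard ACT-R base-level learning equation, B_i = ln(sum_j t_j^(-d)), and the retrieval-latency mapping F * exp(-B_i). How the paper links these quantities to a per-pronoun difficulty regressor is not reproduced here, so the functions and parameter values are illustrative assumptions.

```python
# Hedged sketch of an ACT-R-style activation measure for antecedents.
# The base-level equation B_i = ln(sum_j t_j^(-d)) and the latency mapping
# F * exp(-B_i) are standard ACT-R; their use as a per-pronoun difficulty
# regressor here is an illustrative assumption, not the paper's exact setup.
import math

def base_level_activation(mention_times, now, d=0.5):
    """Activation of an antecedent given the times it was previously mentioned."""
    return math.log(sum((now - t) ** (-d) for t in mention_times))

def retrieval_difficulty(activation, F=1.0):
    """Predicted retrieval latency: lower activation -> harder resolution."""
    return F * math.exp(-activation)

# A referent mentioned recently and often is easier to retrieve
# than one mentioned once, long ago.
recent = base_level_activation(mention_times=[2.0, 5.0, 9.0], now=10.0)
distant = base_level_activation(mention_times=[1.0], now=10.0)
print(retrieval_difficulty(recent) < retrieval_difficulty(distant))  # True
```

The comparison reflects the ACT-R prediction that recently and frequently mentioned referents are easier to retrieve, which is the intuition behind salience-based difficulty.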

The Role of Syntax During Pronoun Resolution: Evidence from fMRI
Jixing Li | Murielle Fabre | Wen-Ming Luh | John Hale
Proceedings of the Eighth Workshop on Cognitive Aspects of Computational Language Learning and Processing

The current study examined the role of syntactic structure during pronoun resolution. We correlated complexity measures derived from the syntax-sensitive Hobbs algorithm and from a neural network model for pronoun resolution with the brain activity of participants listening to an audiobook during fMRI recording. Compared to the neural network model, the Hobbs algorithm is associated with larger clusters of brain activation in a network including the left Broca's area.
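
As a rough illustration of Hobbs-style search effort, the sketch below counts NP nodes visited in a breadth-first, left-to-right traversal of parse trees until a marked antecedent is reached. This is a heavily simplified stand-in for the full Hobbs (1978) algorithm and for the paper's complexity measure; the function, example trees, and counting scheme are assumptions.

```python
# Hedged sketch: a crude proxy for Hobbs-style search effort. The full
# Hobbs (1978) algorithm walks up the pronoun's parse tree and then through
# preceding sentences, proposing NPs breadth-first, left to right. Here we
# only count NP nodes visited in that order until a marked antecedent is
# reached, which is a simplified stand-in, not the paper's implementation.
from collections import deque
from nltk import Tree

def hobbs_proposals(trees, pronoun_leaf, antecedent_leaf):
    """Count NP proposals up to and including the NP spanning `antecedent_leaf`.
    `trees` are parse trees ordered from the pronoun's sentence backwards."""
    proposals = 0
    for tree in trees:
        queue = deque([tree])
        while queue:                              # breadth-first, left-to-right
            node = queue.popleft()
            if isinstance(node, Tree):
                if node.label() == "NP" and pronoun_leaf not in node.leaves():
                    proposals += 1
                    if antecedent_leaf in node.leaves():
                        return proposals
                queue.extend(node)                # enqueue children in order
    return proposals

prev = Tree.fromstring("(S (NP (NNP John)) (VP (VBD saw) (NP (NNP Mary))))")
curr = Tree.fromstring("(S (NP (PRP He)) (VP (VBD smiled)))")
print(hobbs_proposals([curr, prev], "He", "John"))  # prints 1
```

A per-pronoun count of this kind is one simple way to turn a symbolic resolution procedure into a graded complexity regressor for neuroimaging data.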

2016

Temporal Lobes as Combinatory Engines for both Form and Meaning
Jixing Li | Jonathan Brennan | Adam Mahar | John Hale
Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)

The relative contributions of meaning and form to sentence processing remain an outstanding issue across the language sciences. We examine this issue by formalizing four incremental complexity metrics and comparing them against freely available ROI timecourses. Syntax-related metrics based on top-down parsing and structural dependency distance turn out to significantly improve a regression model, compared to a simpler model that formalizes only conceptual combination using a distributional vector-space model. This confirms the view of the anterior temporal lobes as combinatory engines that deal in both form (see e.g. Brennan et al., 2012; Mazoyer, 1993) and meaning (see e.g. Patterson et al., 2007). The same characterization applies to a posterior temporal region in roughly "Wernicke's Area."
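
A toy sketch of the regression-model comparison described above, assuming synthetic word-by-word predictors and a fake ROI signal: a baseline model containing only a conceptual-combination metric is compared against a model that adds the two syntax-derived metrics via a nested F-test. The predictors, the signal, and the OLS/F-test choice are illustrative, not the paper's pipeline.

```python
# Hedged sketch: compare a baseline regression (conceptual combination only)
# against one that adds syntax-derived metrics, for a single ROI timecourse.
# The toy predictors, the simulated ROI signal, and the use of OLS + F-test
# for nested model comparison are illustrative assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n_scans = 200

# Word-by-word metrics aligned to scan times (HRF convolution omitted here).
semantic_combination = rng.normal(size=n_scans)   # vector-space combination metric
top_down_parsing = rng.normal(size=n_scans)       # top-down node count
dependency_distance = rng.normal(size=n_scans)    # structural dependency distance
roi_signal = 0.5 * top_down_parsing + rng.normal(size=n_scans)  # simulated ROI BOLD

baseline = sm.OLS(roi_signal, sm.add_constant(
    np.column_stack([semantic_combination]))).fit()
full = sm.OLS(roi_signal, sm.add_constant(
    np.column_stack([semantic_combination, top_down_parsing, dependency_distance]))).fit()

f_stat, p_value, _ = full.compare_f_test(baseline)
print(f"Do syntax metrics improve the fit? F={f_stat:.2f}, p={p_value:.4f}")
```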