Bias and Fairness in Natural Language Processing

Kai-Wei Chang, Vinodkumar Prabhakaran, Vicente Ordonez


Abstract
Recent advances in data-driven machine learning techniques (e.g., deep neural networks) have revolutionized many natural language processing applications. These approaches automatically learn how to make decisions based on the statistics and diagnostic information in large amounts of training data. Despite the remarkable accuracy of machine learning in various applications, learning algorithms run the risk of relying on societal biases encoded in the training data to make predictions. This often occurs even when gender and ethnicity information is not explicitly provided to the system, because learning algorithms are able to discover implicit associations between individuals and their demographic information from other variables such as names, titles, and home addresses. Machine learning algorithms therefore risk encouraging unfair and discriminatory decision making and raise serious privacy concerns. Without properly quantifying and reducing the reliance on such correlations, broad adoption of these models might have the undesirable effect of magnifying harmful stereotypes or implicit biases that rely on sensitive demographic attributes.

In this tutorial, we review the history of bias and fairness studies in machine learning and language processing and present recent community efforts in quantifying and mitigating bias in natural language processing models for a wide spectrum of tasks, including word embeddings, coreference resolution, machine translation, and vision-and-language tasks. In particular, we focus on the following topics:
+ Definitions of fairness and bias.
+ Data, algorithms, and models that propagate and even amplify social bias in NLP applications, and metrics to quantify these biases.
+ Algorithmic solutions, learning objectives, and design principles to prevent social bias in NLP systems, and their potential drawbacks.

The tutorial aims to make researchers and practitioners aware of this issue, and to encourage the research community to propose innovative solutions that promote fairness in NLP.
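As a concrete illustration of the kind of bias metric discussed in the tutorial (this sketch is not taken from the tutorial materials themselves), one simple measure of direct gender bias in word embeddings projects word vectors onto a "gender direction" such as he - she; occupation words with projections far from zero are associated with one gender. The `embeddings` dictionary below is a hypothetical placeholder for any pretrained word vectors (e.g., loaded from GloVe or word2vec files).

import numpy as np

def gender_direction(embeddings):
    """Unit vector pointing from 'she' to 'he' in embedding space."""
    d = embeddings["he"] - embeddings["she"]
    return d / np.linalg.norm(d)

def direct_bias(word, embeddings, direction):
    """Cosine similarity between a word vector and the gender direction.
    Values far from 0 suggest the word leans toward one gender."""
    v = embeddings[word]
    return float(np.dot(v, direction) / np.linalg.norm(v))

# Usage (assuming `embeddings` maps words to numpy arrays):
# g = gender_direction(embeddings)
# for occupation in ["nurse", "engineer", "homemaker", "programmer"]:
#     print(occupation, direct_bias(occupation, embeddings, g))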
Anthology ID:
D19-2004
Volume:
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): Tutorial Abstracts
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Timothy Baldwin, Marine Carpuat
Venues:
EMNLP | IJCNLP
Publisher:
Association for Computational Linguistics
URL:
https://aclanthology.org/D19-2004
Cite (ACL):
Kai-Wei Chang, Vinodkumar Prabhakaran, and Vicente Ordonez. 2019. Bias and Fairness in Natural Language Processing. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP): Tutorial Abstracts, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Bias and Fairness in Natural Language Processing (Chang et al., EMNLP-IJCNLP 2019)