Second Workshop on Computational Approaches to Code Switching

Event Notification Type: 
Call for Papers
Abbreviated Title: 
Code-Switching Workshop
Location: 
Hilton
Date: 
Wednesday, 2 November 2016
State: 
TX
Country: 
USA
City: 
Austin
Contact: 
Thamar Solorio
Fahad Ghamdi
Submission Deadline: 
Wednesday, 17 August 2016

Code-switching (CS) is the phenomenon by which multilingual speakers switch back and forth between their common languages in written or spoken communication. CS typically occurs at the inter-sentential level, the intra-sentential level (mixing of words from multiple languages in the same utterance), and even the morphological level (mixing of morphemes). CS presents serious challenges for language technologies, including parsing, machine translation (MT), automatic speech recognition (ASR), information retrieval (IR), information extraction (IE), and semantic processing. Traditional techniques trained for one language quickly break down when input from another language is mixed in. Even for problems that are considered solved, such as language identification or part-of-speech tagging, performance degrades at a rate proportional to the amount and level of mixed-language content present.

CS is pervasive in the informal text communications of multilingual communities, such as newsgroups, tweets, blogs, and other social media. Such genres are increasingly being studied as rich sources of social, commercial, and political information. Beyond the challenges that informal genres already pose for single-language processing, CS adds another significant layer of complexity to the processing of the data. Efficiently and robustly processing CS data presents a new frontier for NLP algorithms at all levels. This workshop aims to bring together researchers interested in solving the problem and to raise awareness in the community at large of viable solutions for reducing the complexity of processing such data.

The workshop invites contributions from researchers working on NLP approaches for the analysis and/or processing of mixed-language data, especially with a focus on intra-sentential code-switching. Topics of relevance to the workshop include the following:

Development of linguistic resources to support research on code-switched data
NLP approaches for language identification in code-switched data
NLP techniques for the syntactic analysis of code-switched data
Domain/dialect/genre adaptation techniques applied to code-switched data processing
Language modeling approaches to code-switched data processing
Crowdsourcing approaches for the annotation of code-switched data
Machine translation approaches for code-switched data
Position papers discussing the challenges that code-switched data poses for NLP techniques
Methods for improving ASR on code-switched data
Survey papers on NLP research for code-switched data
Sociolinguistic aspects of code-switching
Sociopragmatic aspects of code-switching