Over the last few years, there has been growing public and enterprise interest in 'social media' and their role in modern society. At the heart of this interest is the ability for users to create and share content via a variety of platforms such as blogs, micro-blogs, collaborative wikis, multimedia sharing sites, and social networking sites. The volume and variety of user-generated content (UGC) and the user participation network behind it are creating new opportunities for understanding web-based practices and building socially intelligent and personalized applications. Investigations around social data can be broadly categorized along the following dimensions: (a) understanding aspects of the user-generated content, (b) modeling and observing the user network in which the content is generated, and (c) characterizing the individuals and groups that produce and consume the content.
The goal of this workshop is to share research efforts and results in the first of these areas: understanding language usage on social media.
While there is a rich body of previous work on processing textual content, certain characteristics of UGC on social media introduce challenges for its analysis. A large portion of the language found in UGC is in the Informal English domain: a blend of abbreviations, slang, and context-specific terms, lacking sufficient context and regularity, and delivered with an indifferent approach to grammar and spelling. Traditional content analysis techniques developed for more formal genres such as news, Wikipedia, or scientific articles do not translate effectively to UGC. Consequently, well-understood problems such as information extraction, search, and monetization on the Web face new challenges owing to this class of textual data.
CALL FOR PAPERS
We invite original and unpublished research papers on all topics related to the intersection of computational linguistics and language in social media, including but not limited to the sample topics below.
Sample topics of interest:
What are people talking about?
What are the Named Entities and topics that people are making references to?
What are effective summaries of volumes of user comments around a news-worthy event that offer a lens into the society's perceptions?
How are cultures interpreting a given situation in their local contexts, and how are they supported in their varying observations on a social medium?
How are they expressing themselves?
What do word usages tell us about an active population or about individual allegiances or non-conformity to group practices?
Are we seeing differences in how users self-present on this new form of digital media?
Why do they scribe?
What are the diverse intentions that produce the diverse content on social media?
Can we understand why we share by looking at what we predominantly do with the medium?
What emotions are people sharing about content?
What level of linguistic analysis is possible/necessary in a noisy medium such as social media?
How can existing analysis techniques be adapted to this medium?
Language and network structure: How do language and social network properties interact?
What properties of a network (structural connections) or the participants (personalities, influencers, followers) correlate with which properties of the language used?
Semantic Web / Ontologies / Domain models to aid in social data understanding:
Given the recent interest within the Semantic Web and Linked Open Data (LOD) communities in exposing domain models, how can we utilize these public knowledge bases as priors in linguistic analysis?
John Breslin (U of Galway)
Cindy Chung (UTexas)
Munmun De Choudhury (Arizona State University)
Cristian Danescu-Niculescu-Mizil (Cornell)
Susan Dumais (Microsoft Research)
Jennifer Foster (Dublin City University)
Sam Gosling (UTexas)
Julia Grace (IBM)
Daniel Gruhl (IBM)
Kevin Haas (Microsoft)
Emre Kiciman (Microsoft Research)
Nicolas Nicolov (Microsoft)
Daniel Ramage (Stanford)
Alan Ritter (University of Washington)
Christine Robson (IBM)
Hassan Sayyadi (University of Maryland)
Valerie Shalin (Wright State)
Amit Sheth (Wright State)
Ian Soboroff (NIST)
Hari Sundaram (Arizona State University)