= Introduction =
From Caesar's "Veni, Vidi, Vici" to "What might be in a summary?" (Karen Spärck Jones, 1993), summarization techniques have been key to successfully grasping the main points of large amounts of information, and much research has been devoted to improving such techniques. In the past two decades, the progress of summarization research has been supported by evaluation exercises and shared tasks such as DUC, TAC and, more recently, MultiLing (2011, 2013). MultiLing is a community-driven initiative for benchmarking multilingual summarization systems, nurturing further research, and pushing the state-of-the-art in the area. The aim of MultiLing 2015 is to continue this evolution and, in addition, to introduce new tasks promoting research on summarizing free human interaction in online fora and customer call centres. With this call we wish to invite the summarization research community to participate in MultiLing 2015.
= The Tasks =
MultiLing 2015 will feature the Multilingual Multi-document Summarization
task familiar from previous editions, as well as the Multilingual
Single-document Summarization task piloted in 2013. In addition, we will pilot two new tracks,
Online Forum Summarization (OnForumS) and Call Centre Conversation
Summarization (CCCS), in collaboration with the SENSEI EU project
(http://www.sensei-conversation.eu). We describe each task in turn below.
== Multilingual Multi-document Summarization (MMS) ==
The multilingual multi-document summarization track aims to evaluate the
application of (partially or fully) language-independent summarization
algorithms on a variety of languages. Each system participating in the track
will be called to provide summaries for a range of different languages,
based on a news corpus. Participating systems will be required to
apply their methods to a minimum of two languages.
Evaluation will favor systems that apply their methods to more languages.
The corpus used in the Multilingual multi-document summarization track
will be based on WikiNews texts (http://www.wikinews.org/). Source
texts will be clean UTF-8 texts (without any mark-up, images, etc.).
The task requires systems to generate a single, fluent, representative
summary from a set of documents describing an event sequence. The language of
the document set will be within a given range of languages and all documents
in a set share the same language. The output summary should be in the same
language as its source documents and should be at most 250 words
long.
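As a rough illustration of the length limit above, the following sketch caps a summary at 250 words. The whitespace-based word count and the function name `cap_words` are assumptions made for this example only. The call does not define how words are counted, and whitespace splitting is not meaningful for languages written without spaces between words.

```python
def cap_words(summary: str, max_words: int = 250) -> str:
    """Return the summary cut to at most max_words whitespace-separated tokens.

    Assumption: a "word" is any run of non-whitespace characters.
    """
    words = summary.split()
    if len(words) <= max_words:
        # Already within the limit: return the text unchanged.
        return summary
    # Keep only the first max_words tokens, joined by single spaces.
    return " ".join(words[:max_words])
```

A system would typically apply such a check as a final step, after sentence selection, so that the limit is respected without cutting more content than necessary.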
== Multilingual Single-document Summarization (MSS) ==
Following the pilot task of 2013, the multilingual single-document summarization
task will be to generate a single-document summary for each of the given Wikipedia
featured articles in one of the approximately 40 languages provided. The provided training
data will be the 2013 Single-Document Summarization Pilot Task data from MultiLing 2013.
A new set of data will be generated based on additional Wikipedia featured articles.
For each language 30 documents are given. The documents will be UTF-8 text without mark-up or images.
For each document of the training set, the human-generated summary is provided. For MultiLing 2015,
the character length of the human summary for each document, called the target length, will also be provided.
Each machine summary should be as close to the target length provided as possible. For the purpose of
evaluation all machine summaries longer than the target length will be truncated to the target length.
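The truncation rule above can be sketched as follows. This is a minimal illustration, not the organizers' evaluation code: the function name `truncate_to_target` is hypothetical, and it assumes that length is counted in Unicode code points (which is what Python string slicing does), whereas the call only speaks of "character length".

```python
def truncate_to_target(summary: str, target_length: int) -> str:
    """Cut a machine summary to at most target_length characters.

    Assumption: characters are counted as Unicode code points.
    """
    if target_length < 0:
        raise ValueError("target_length must be non-negative")
    # Slicing leaves shorter summaries unchanged and cuts longer ones.
    return summary[:target_length]
```

Since anything beyond the target length is discarded before scoring, a system gains nothing by overshooting, and a summary much shorter than the target leaves space unused.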
The summaries will be evaluated via automatic methods, and participants will be required to perform
a limited number of manual summary evaluations.
The manual evaluation will consist of pairwise comparisons of machine-generated summaries. Each evaluator
will be presented with the human-generated summary and two machine-generated summaries. The evaluation task
is to read the human summary and then judge whether one machine-generated summary is significantly closer to
the human-generated summary in information content (e.g. system A > system B or system B > system A) or whether
the two machine-generated summaries contain comparable quantities of the human-generated summary's information.
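To make the three possible outcomes of a pairwise judgement concrete, here is a small sketch that tallies wins and ties per system. The tuple layout, the outcome codes "A", "B" and "tie", and the function name `tally_pairwise` are all assumptions for illustration. The call does not specify how judgements will be recorded or aggregated.

```python
from collections import Counter


def tally_pairwise(judgements):
    """Count wins and ties per system.

    judgements: iterable of (system_a, system_b, outcome), where outcome is
    "A" (system_a closer to the human summary), "B" (system_b closer),
    or "tie" (comparable). These codes are hypothetical.
    """
    wins = Counter()
    ties = Counter()
    for system_a, system_b, outcome in judgements:
        if outcome == "A":
            wins[system_a] += 1
        elif outcome == "B":
            wins[system_b] += 1
        elif outcome == "tie":
            # A tie is recorded for both systems in the pair.
            ties[system_a] += 1
            ties[system_b] += 1
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
    return wins, ties
```

For example, `tally_pairwise([("s1", "s2", "A"), ("s1", "s3", "tie")])` records one win for `s1` and one tie each for `s1` and `s3`.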
== Online Forum Summarization (OnForumS) ==
Most major on-line news publishers, such as The Guardian or Le Monde,
publish articles on different topics and encourage reader engagement
through the provision of an on-line comment facility. A given news
article can often give rise to thousands of reader comments, some
related to specific points within the article, others that are replies
to previous comments. The great volume of such user-supplied comments
suggests the need for automated methods to summarize this content,
which in turn poses an exciting and novel challenge for the
summarization community.
The purpose of the Online Forum Summarization (OnForumS) track at
MultiLing'15 is to set the ground for investigating how such a mass
of comments can be summarised. We posit that a crucial initial step in
developing reader comment summarization systems is to determine what
comments relate to, be that either specific points within the text of
the article, the global topic of the article, or comments made by
other users. This constitutes a linking task. Furthermore, a set of
link types or labels may be articulated to capture whether, for
example, a comment agrees with, elaborates, disagrees with, etc., the
point made in the commented-upon text. Solving this labelled linking
problem should facilitate the creation of reader comment summaries by
allowing, for example, that comments relating to the same article
content can be clustered, points attracting the most comment can be
identified, representative comments can be chosen for each key point,
and the implications of labelled links can be digested (e.g., numbers
for or against a particular point), etc.
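The following sketch illustrates how labelled links, once available, could support the steps just listed: grouping comments by the content they address, ranking points by the amount of comment they attract, and counting labels per point. The `Link` record, its field names, and the label strings are hypothetical. The actual data format and label set are left to the formal task specification.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    comment_id: str  # hypothetical identifier of the linking comment
    target_id: str   # hypothetical identifier of a sentence, the article, or a comment
    label: str       # illustrative label, e.g. "agree" or "disagree"


def summarise_links(links):
    """Group comments by target and count link labels per target."""
    comments_by_target = defaultdict(list)
    labels_by_target = defaultdict(Counter)
    for link in links:
        comments_by_target[link.target_id].append(link.comment_id)
        labels_by_target[link.target_id][link.label] += 1
    # Targets ordered by number of linked comments, most commented first.
    ranked = sorted(
        comments_by_target,
        key=lambda target: len(comments_by_target[target]),
        reverse=True,
    )
    return ranked, dict(comments_by_target), dict(labels_by_target)
```

With such a grouping in hand, a summarizer could pick one representative comment per highly ranked target and report the agree/disagree counts alongside it.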
The OnForumS task at MultiLing'15 is a particular specification of the
linking task, in which systems will take as input a news article with
a reduced set of comments (sifted, according to predefined criteria,
from what could otherwise be thousands of comments) and will be asked to
link each comment, with a label, to sentences in the article (which, for
simplification, are assumed to be the appropriate units here), to the
article topic as a whole, or to preceding comments. Precise guidelines
for when to link and for the link types will be released as part of
the formal task specification, but we anticipate the condition for
linking will require sentences addressing the same assertion, and that
link types will include at least agreement, disagreement, and
sentiment indicators. The data will cover at least three
languages (English, Italian, and French); a small set of
link-labelled articles will be provided by the SENSEI project
for each of these languages for illustration and for
development. Additional languages may be covered if the data for these
are provided by the participants in the task. These data could be
either translations of the data for other languages, or comparable
articles *on the same topics*.
Evaluation will be based on the results of a crowd-sourcing exercise,
in which crowd workers are asked to judge whether potential links, and
associated labels, are correct for each given test article plus
associated comments.
== Call Centre Conversation Summarization (CCCS) ==
Speech summarization has been of great interest to the community
because speech is the principal modality of human communications and
it is not as easy to skim, search or browse speech transcripts as it
is for textual messages. Speech recorded from call centers offers a
great opportunity to study goal-oriented and focused conversations
between an agent and a caller. The Call Centre Conversation
Summarization (CCCS) task consists in automatically generating
summaries of spoken conversations in the form of textual synopses that
shall inform on the content of a conversation and might be used for
browsing a large database of recordings. Compared to news
summarization where extractive approaches have been very successful,
the CCCS task's objective is to foster work on abstractive
summarization in order to depict what happened in a conversation
instead of what people actually said.
The MultiLing'15 CCCS track leverages conversations from the DECODA
and LUNA corpora of French and Italian call center recordings, both
with transcripts available in their original language as well as
English translation (both manual and automatic). Recording durations
range from a few minutes to 15 minutes, involving two or sometimes
more speakers. In the public transportation and help desk domains, the
dialogs offer a rich range of situations (with emotions such as anger
or frustration) while staying in a coherent domain.
Given transcripts, participants in the task shall generate abstractive summaries
informing a reader about the main events of the conversations, such as
the objective of the caller, whether and how it was solved by the
agent, and the attitude of both parties. Evaluation will be performed
by comparing submissions to reference synopses written by experts.
Both conversations and reference summaries are kindly provided by the
SENSEI project.
= How can I participate? =
For now you only need to fill in your contact details in the following form:
http://go.scify.gr/multiling2015participation
Make sure you also visit the MultiLing community website:
http://multiling.iit.demokritos.gr/
= Roadmap =
Finalization pending.
(PLEASE PROVIDE FEEDBACK on the submission dates, if you plan to participate,
by e-mailing: ggianna AT iit DOT demokritos DOT gr.)
* Training data ready: Dec 12th, 2014 (date to be finalized per task)
* Test data available: Feb 15th, 2015
* System submissions due: Feb 28th, 2015
* Evaluation starts: Mar 1st, 2015
* Evaluation ends: Mar 31st, 2015
* Paper submission due: May 1st, 2015
* Paper reviews due: May 15th, 2015
* Camera-ready due: Jun 15th, 2015
* Workshop: 1st week of Sep, 2015
*NOTE*: Individual task dates may differ. Please check the MultiLing
website (http://multiling.iit.demokritos.gr) for more information.
= Venue =
(Finalization pending)
Collocated with SIGDIAL, Prague, Czech Republic
= Program Committee Members =
(Full list of PC members pending)
The Program Committee members are:
George Giannakopoulos - NCSR Demokritos (overall chair, MMS Task chair)
Jeff Kubina, John Conroy - IDA Center for Computing Sciences (MSS Task chairs)
Mijail Kabadjov - University of Essex (OnForumS Task co-chair)
Josef Steinberger - University of West Bohemia, Czech Republic (OnForumS Task co-chair)
Benoit Favre - University of Marseille (CCCS Task co-chair)
Udo Kruschwitz and Massimo Poesio - University of Essex
Emma Barker, Rob Gaizauskas and Mark Hepple - University of Sheffield
Vangelis Karkaletsis - NCSR Demokritos
Fabio Celli - University of Trento
= Data Contributors (from MultiLing 2013) =
Georgios Petasis, George Giannakopoulos - NCSR "Demokritos", Greece
Josef Steinberger - University of West Bohemia, Czech Republic
Mahmoud El-Haj - Lancaster University, UK
Ahmad Alharthi - King Saud University, Saudi Arabia
Maha Althobaiti - Essex University, UK
Corina Forascu - Romanian Academy Research Institute for Artificial
Intelligence (RACAI), and Alexandru Ioan Cuza University of Iasi (UAIC), Romania
Jeff Kubina, John Conroy, Judith Schlesinger - IDA/Center for Computing Sciences, USA
Lei Li - Beijing University of Posts and Telecommunications (BUPT), China
Marina Litvak - Sami Shamoon College of Engineering, Israel
Sabino Miranda - Center for Computing Research, Instituto Politécnico Nacional, Mexico