SemEval 2012 versus 2013


Revision as of 08:57, 3 November 2010

So far, SemEval has been organised on a 3-year cycle. However, many participants feel that this is a long wait; many other shared tasks, such as CoNLL and RTE, run annually. For this reason we are giving task organisers the opportunity to choose between a 2-year and a 3-year cycle. Task proposers will be asked to vote on the date of SemEval-3 and choose amongst:

  • 2012 - ACL
  • 2013 - NAACL
  • 2013 - ACL

Deadline for votes: November 22, 2010

Increment the numbers below according to your preference.

For 2012: 3

For 2013: 9

For 2013 NAACL: 0
For 2013 ACL: 0

Reasons for 2012

We still have almost 2 years! If it's after 3 years, we probably won't be doing anything in the first year anyway. Hence, I vote for 2012.
-- Naushad UzZaman, University of Rochester (TempEval-2 participant and TempEval-3 co-organizer)

Reasons for 2013

I understand that some participants are eager to run in yet another "competition" sooner rather than later. That is no reason to believe that squeezing the cycle into two years serves a useful purpose. My perspective is that of an organizer. A new task requires much thought and even more legwork. An old task merely repeated is not worth the bytes its data sit in: there must be new elements, and reusing the old data is easier said than done. I could share our experience with tasks 4 (2007) and 8 (2010); the effort in 2009-2010 was markedly higher than any of us had initially expected. If the community goes with the idea of a common annotation style, well, that alone requires deeper reflection.

I could go on, but you may already be bored. Let me just make a social observation. What we do is not a spat, a fistfight or a race; it is a shared evaluation exercise. That many people treat it as a fight is painfully obvious. I suspect that it is not uncommon for someone to use the scores -- especially a showing close to the top -- as an argument in grant applications or requests for promotion. A stimulating intellectual challenge turns into (excusez le mot) a pissing contest. Naturally, it is better to have more chances to win that medal.

I propose to keep the usual pace. A three-year cycle has worked well. It allows organizers to do their work carefully and thoughtfully, without overstraining themselves. Those who run annual events probably survive only because innovation is very incremental.

-- Stan Szpakowicz, PhD, Professor, SITE (Computer Science), University of Ottawa

As a task organiser from 2007 (task 10) and 2010 (task 2), I concur with Stan's sentiments. It all depends on the thought that is required for the new approach/annotation and how much time the organisers have to spare. Another argument for a longer cycle is that it gives some time for analysis of the previous data before implementing the new ideas. I do agree with Suresh and Deniz (the current SemEval co-chairs) that the decision should rest with those who are willing to organise the tasks.

-- Diana McCarthy, (co-) Director Lexical Computing Ltd., Brighton UK

As of the date of this posting (Oct 28, 2010), I would note that about 6? months have already passed since the results were submitted for the last SemEval. I think there is some argument for allowing ourselves a bit of breathing room, and also time to reflect on what happened last time before moving on to the next round.

I particularly like the idea of having a workshop of some kind about one year after the conclusion of a SemEval, to allow for more detailed discussion of what happened last time around; it might help to avoid too much repetition in tasks and also provide a nice opportunity for a more in-depth discussion of lessons learned. Given that, I tend to prefer a 3-year window (particularly if this enables a lessons-learned style workshop after 1 year). There was, for example, an ACL 2002 workshop after Senseval-2 (2001) that focused on "Recent Successes and Future Directions" and featured some papers that did a bit more analysis of results than is normally possible right after the event. I couldn't find the call for this event, but you can see the proceedings here:


I'd also be concerned that a 2-year window this time around might not be sustainable, so we'd end up fluctuating a bit on the interval as the years go by. I think having it generally understood that SemEval will happen every 3 years is nice in that folks can plan on it. This can be helpful when working with, or planning to work with, students or others on a fixed time interval.

-- Ted Pedersen, task participant 2001, 2004, 2007, 2010 and task organizer 2004

In my opinion, *before* one can design a proposal, one needs to know the timeline in advance: is it 2-year or 3-year?

One could plan a task very differently for a 2-year and for a 3-year cycle: (a) for a 2-year cycle, given the rush, there would be a tendency to repeat an old task with some minor changes, which is of questionable utility; (b) for a 3-year cycle, one could think of something really new and interesting; this would require careful task design (which involves much discussion and fighting), for which one month would not be enough.

--- Preslav Nakov, Ph.D. National University of Singapore http://nakov.eu

As a co-organiser of two SemEval-2007 tasks, I can say that you need to: a) carefully think about what to do next (you want your task to allow for something new), b) choose the dataset(s), c) build an annotation interface, d) annotate, and e) think about how to evaluate and prepare the evaluation software. Point (a) especially needs time: devising a new task that is not derivative takes time and thinking. Also, I agree that analyzing the results of previous tasks and meeting to talk about them is an important step before developing new tasks and ideas. So, if it wasn't clear before :-), I am all for SemEval 2013!

--- Roberto Navigli SAPIENZA Universita' di Roma http://www.dsi.uniroma1.it/~navigli

I have been participating in Senseval/SemEval since the beginning, both as a participant and as an organizer. I have two major points to make:

(1) As currently envisaged, the identification of a task occurs in the mind of a single individual or a small coordinated group. As a result, the set of tasks is somewhat random. This year's organizers have made a strong point about combining tasks, so as to have several subtasks with some coherence. I would suggest that we all need some overall coherence about outstanding semantic evaluation issues, in the hope that the tasks and subtasks home in on those problems. I made a start on this with a quick grouping of task types at http://aclweb.org/aclwiki/index.php?title=SemEval_Portal. I would like to encourage people to expand on this grouping with a greater identification of issues; this might facilitate the development of future tasks.

(2) With the given structure, a task is defined and then somewhat cast in concrete. As both a participant and an organizer, I have found this a major difficulty: the task can't evolve and play out. As an organizer, I've found it necessary to have some back and forth before the final form of a task is crystallized. As a participant, it takes some time to play around with trial data and to wrap oneself around the task. The FrameNet linking task in SemEval-2010 is a case in point: it was sufficiently complex that many of those with an initial interest ended up not participating, and the original task was subdivided into a couple of smaller tasks. So, what I'm suggesting is that potential organizers throw out an initial, rough conceptualization with some trial data, and that the task not be locked down without a full airing of potential issues and refinements. I don't think it's necessary to have a lot of time (5 months) between the availability of the training data and the submission of results: once you're pretty sure that you can get the kind of results you want, it's just a matter of making the final run.

Ken Litkowski CL Research http://www.clres.com