SemEval 2014 Task 7 - Analysis of Clinical Text

Abbreviated Title: ACT
Call for Participation
Submission Deadline: 30 Apr 2014
Event Dates: 23 Aug 2014 - 24 Aug 2014
Location: Co-located with COLING and *SEM
City: Dublin
Contact: Suresh Manandhar
Contact Email: suresh [at] cs [dot] york [dot] ac [dot] uk

http://alt.qcri.org/semeval2014/task7

The purpose of this task is to enhance current research in natural language processing methods used in the clinical domain.

The aim of the task is to identify entities in the clinical domain and to map them to UMLS CUIs (Concept Unique Identifiers). The focus of this task is on identifying and disambiguating disorder mentions.

TASK DESCRIPTION
The task is a continuation of the CLEF/eHealth ShARe 2013 Shared Task. Significant additional annotations will be provided for subtasks A and B with the aim of correcting any existing errors and creating additional data to address sparsity issues.

Task A
This task involves recognizing mentions of concepts that belong to the UMLS semantic group disorders.

Here are a few examples; more are provided in the annotation guidelines and on the task website (under Datasets).

  1. The rhythm appears to be atrial fibrillation.
  2. The left atrium is moderately dilated.
  3. 53 year old man s/p fall from ladder.

In examples 1 and 3, the phrases atrial fibrillation and fall from ladder belong to the disorder semantic group in the UMLS. Example 2 is a case of a discontiguous mention, represented as left atrium...dilated. This phenomenon, where a discontiguous phrase is the best representative of the disorder, occurs more commonly in the clinical domain than in the general domain, and is therefore annotated as such.
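
One way to picture contiguous and discontiguous mentions is as lists of character-offset spans over the sentence. The sketch below is only an illustration: the DisorderMention class, the find_span helper, and the "..." joiner are hypothetical names chosen for this example, not the official annotation format, which is defined in the annotation guidelines.

```python
from dataclasses import dataclass
from typing import Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive


@dataclass(frozen=True)
class DisorderMention:
    """Hypothetical container: a mention is one or more character spans."""
    spans: Tuple[Span, ...]

    @property
    def is_discontiguous(self) -> bool:
        # More than one span means the mention has a gap in the text.
        return len(self.spans) > 1

    def text(self, sentence: str, gap: str = "...") -> str:
        # Join the covered pieces, marking each gap with "...".
        return gap.join(sentence[start:end] for start, end in self.spans)


def find_span(sentence: str, phrase: str) -> Span:
    """Locate the first occurrence of phrase (raises ValueError if absent)."""
    start = sentence.index(phrase)
    return (start, start + len(phrase))


s1 = "The rhythm appears to be atrial fibrillation."
s2 = "The left atrium is moderately dilated."

# Example 1: a single contiguous span.
m1 = DisorderMention((find_span(s1, "atrial fibrillation"),))
# Example 2: two spans separated by "is moderately".
m2 = DisorderMention((find_span(s2, "left atrium"), find_span(s2, "dilated")))

print(m1.text(s1), m1.is_discontiguous)
print(m2.text(s2), m2.is_discontiguous)
```

Storing offsets rather than strings keeps the link back to the source text, which matters when the same phrase occurs more than once in a report.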

Task B
This task involves mapping each disorder mention to a unique UMLS CUI. This is referred to as normalization, and the mapping is limited to UMLS CUIs that correspond to SNOMED codes.

The disorder entities in the examples above map to the following CUIs:

  1. atrial fibrillation - C0004238; UMLS preferred term atrial fibrillation
  2. left atrium...dilated - C0344720; UMLS preferred term left atrial dilatation
  3. fall from ladder - C0337212; UMLS preferred term accidental fall from ladder
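
The normalization step can be pictured as a lookup from a mention string to a CUI. The following is a minimal sketch under strong assumptions: the EXAMPLE_CUIS table holds only the three mappings listed above, and the normalize function and its whitespace and case handling are hypothetical. A real system would consult the UMLS itself and would need to handle ambiguity and lexical variation.

```python
from typing import Optional

# Toy lookup table containing only the three examples from this call.
EXAMPLE_CUIS = {
    "atrial fibrillation": "C0004238",
    "left atrium...dilated": "C0344720",
    "fall from ladder": "C0337212",
}


def normalize(mention: str) -> Optional[str]:
    """Return the CUI for a mention, or None if it is not in the toy table."""
    # Lowercase and collapse runs of whitespace before the lookup.
    key = " ".join(mention.lower().split())
    return EXAMPLE_CUIS.get(key)


for mention in ["Atrial  fibrillation", "fall from ladder", "headache"]:
    print(mention, "->", normalize(mention))
```

Returning None for unknown mentions is a choice made for this sketch only; how unmapped mentions should be reported is specified by the task guidelines.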

DATASETS
The following tarball contains the trial data along with their annotations:
semeval-2014-task-7-trial.tar.gz

Access to the full training data requires a Data Use Agreement (DUA). Details are provided on the task website under the "Data and Tools" tab.

PARTICIPATION
Participants are free to participate in one or both tasks.

IMPORTANT DATES
Trial data ready: October 31, 2013
Training data ready: December 15, 2013
Evaluation period: March 15-30, 2014
Paper submission due: April 30, 2014 [TBC]
SemEval workshop: August 23-24, 2014, co-located with COLING and *SEM in Dublin, Ireland

MORE INFORMATION
The SemEval-2014 Task 7 website includes details on the training data, evaluation, and examples of the comparison types:
http://alt.qcri.org/semeval2014/task7

ORGANIZERS
Sameer S. Pradhan, Harvard University
Suresh Manandhar, University of York, UK
Wendy W. Chapman, University of Utah
Noemie Elhadad, Columbia University
Guergana K. Savova, Harvard University