Announcement of Data Release and Call for Participation
2014 i2b2/UTHealth Shared-Tasks and Workshop
on Challenges in Natural Language Processing for Clinical Data
Registration: begins March 14, 2014 ** NOW OPEN **
Training Data Release: 1 May, 2014
Test Data Release: 1 July, 2014
System Outputs Due: 3 July, 2014
Systems Due: 15 July, 2014
Paper Submission: 1 August, 2014
The 2014 i2b2/UTHealth challenge consists of two traditional NLP tracks:
Track 1: De-identification: Removing protected health information (PHI) is a critical step in making medical records accessible to more people, yet it is a very difficult and nuanced task. This track addresses the problem of de-identifying medical records using a new set of more than 1300 patient records, with surrogate PHI for participants to identify.
Track 2: Identifying risk factors for heart disease over time: Medical records for diabetic patients contain information about heart disease risk factors such as high blood pressure and cholesterol levels, obesity, smoking status, etc. This track aims to identify the information that is medically relevant to assessing heart disease risk, and to track its progression over sets of longitudinal patient records.
The data for this task is provided by Partners HealthCare. All records have been fully de-identified and manually annotated for risk factors related to diabetes and heart disease.
Data for the challenge will be released under a Rules of Conduct and Data Use Agreement. Obtaining the data requires completing a registration, which is NOW OPEN.
For either track, i2b2 will give the participants the option to submit their system software in addition to their system output for evaluation. Teams that submit software will be evaluated separately from teams that submit only system output.
In addition, the i2b2 organizing committee is currently considering hosting a general "i2b2 Software for Clinical NLP" track that allows all past and current i2b2 challenge participants to submit and share with the community any software developed for any i2b2 shared-task.
Evaluation Dates, File Formats, and Evaluation Metrics
The evaluation for the NLP tracks will be conducted using withheld test data. Participating teams are asked to stop development as soon as they download the test data. Each team is allowed to upload (through this website) up to three system runs for each of the tasks. System output is expected in the form of standoff annotations, following the exact format of the ground truth annotations to be provided by the organizers.
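As a rough illustration of what standoff annotations look like and how system output might be compared against ground truth, here is a minimal sketch. It assumes a hypothetical representation (character offsets plus a label) and exact-match precision, recall, and F1; the actual file format and evaluation metrics are those provided by the organizers, and the names `Span` and `strict_prf` are invented for this example.

```python
from __future__ import annotations

from typing import NamedTuple


class Span(NamedTuple):
    """A standoff annotation: it points into the text instead of altering it."""
    start: int  # character offset where the annotation begins (inclusive)
    end: int    # character offset where the annotation ends (exclusive)
    label: str  # hypothetical category name, e.g. "NAME" or "DATE"


def strict_prf(gold: set[Span], system: set[Span]) -> tuple[float, float, float]:
    """Precision, recall, and F1 where a system span counts as correct
    only if its offsets and label exactly match a gold span."""
    true_positives = len(gold & system)
    precision = true_positives / len(system) if system else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up sentence with surrogate values, for illustration only.
text = "Seen by Dr. Smith on 03/14/2014."
gold = {Span(12, 17, "NAME"), Span(21, 31, "DATE")}
# The system finds the name but truncates the date, so the date span
# does not count as a match under exact-match scoring.
system = {Span(12, 17, "NAME"), Span(21, 26, "DATE")}

precision, recall, f1 = strict_prf(gold, system)
print(text[12:17], text[21:31], precision, recall, f1)
```

Because the annotations live apart from the text, the same record can be scored against any number of system runs without modifying the record itself.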
Submitted software will be evaluated by the program committee on factors such as ease of installation and ease of use.
Participants are asked to submit a short paper describing their system and analyzing their performance. Papers should be in AMIA style and should not exceed five pages. Authors of top performing systems and of particularly novel approaches will be invited to present or demo their systems at the workshop. Submitted software can be presented at the final workshop in the form of a poster or live demo.
March 14, 2014 Registration Opens ** NOW OPEN **
May 1, 2014 Training Data Release
July 1, 2014 Test Data Release
July 3, 2014 System Outputs on Test Data Due at 11:59pm Eastern Time
July 15, 2014 Systems Due
August 1, 2014 Paper Submissions
Ozlem Uzuner, co-chair, SUNY at Albany
Amber Stubbs, co-chair, SUNY at Albany
Hua Xu, co-chair, University of Texas, Houston
John Aberdeen, MITRE
Susanne Churchill, Partners Healthcare
Cheryl Clark, MITRE
Dina Demner-Fushman, NIH/NLM
Joshua Denny, Vanderbilt University
Bill Hersh, Oregon Health and Science University
Lynette Hirschman, MITRE
Isaac Kohane, Partners Healthcare
Vishesh Kumar, Massachusetts General Hospital
Anna Rumshisky, UMass Lowell
Stanley Shaw, Massachusetts General Hospital
Peter Szolovits, MIT
Meliha Yetisgen, University of Washington
Please see the announcements for more information. Questions on the challenge can be addressed to Ozlem Uzuner, firstname.lastname@example.org.