Announcement of Data Release and Call for Participation
2014 i2b2/UTHealth Shared-Tasks and Workshop
on Challenges in Natural Language Processing for Clinical Data
Registration: begins March 14, 2014
Training Data Release: May 1, 2014
Test Data Release: July 1, 2014
System Outputs Due: July 3, 2014
Systems Due: July 15, 2014
Paper Submission: August 1, 2014
The 2014 i2b2/UTHealth challenge consists of two traditional NLP tracks:
Track 1: De-identification: Removing protected health information (PHI) is a critical step in making medical records accessible to more people, yet it is a very difficult and nuanced task. This track addresses the problem of de-identifying medical records, using a new set of more than 1,300 patient records with surrogate PHI for participants to identify.
Track 2: Identifying risk factors for heart disease over time: Medical records for diabetic patients contain information about heart disease risk factors such as high blood pressure and cholesterol levels, obesity, smoking status, etc. This track aims to identify the information that is medically relevant to heart disease risk and to track its progression over sets of longitudinal patient records.
The data for this task is provided by Partners HealthCare. All records have been fully de-identified and manually annotated for risk factors related to diabetes and heart disease.
Data for the challenge will be released under a Rules of Conduct and Data Use Agreement. Obtaining the data requires completing registration, which opens March 14, 2014.
For either track, i2b2 will give the participants the option to submit their system software in addition to their system output for evaluation. Teams that submit software will be evaluated separately from teams that submit only system output.
In addition, the i2b2 organizing committee is currently considering hosting a general "i2b2 Software for Clinical NLP" track that allows all past and current i2b2 challenge participants to submit and share with the community any software developed for any i2b2 shared-task.
Evaluation Dates, File Formats, and Evaluation Metrics
The evaluation for the NLP tracks will be conducted using withheld test data. Participating teams are asked to stop development as soon as they download the test data. Each team is allowed to upload (through this website) up to three system runs for each of the tasks. System output is expected in the form of standoff annotations, following the exact format of the ground truth annotations to be provided by the organizers.
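As a rough illustration of what standoff annotation means in general, the sketch below keeps the record text untouched and stores each annotation as character offsets plus a label, then scores a system run against the ground truth by exact span match. The `Span` type, the label names, the sample sentence, and the choice of exact-match precision, recall, and F1 are all assumptions made for this example; the actual file format and evaluation metrics are those provided by the organizers.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """A hypothetical standoff annotation: character offsets plus a label."""
    start: int  # inclusive character offset into the record text
    end: int    # exclusive character offset
    label: str  # illustrative category name, not the official tag set


def score(gold: set[Span], predicted: set[Span]) -> tuple[float, float, float]:
    """Exact-match precision, recall, and F1 over two sets of spans."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up sample text; the annotations only point into it.
text = "Mr. Smith was seen on 2014-03-14 for hypertension."
gold = {Span(4, 9, "NAME"), Span(22, 32, "DATE")}
predicted = {Span(4, 9, "NAME"), Span(22, 26, "DATE")}  # second span is too short

for span in sorted(gold):
    print(span.label, repr(text[span.start:span.end]))

precision, recall, f1 = score(gold, predicted)
print(precision, recall, f1)
```

Because the annotations live apart from the text, a run that gets a span boundary wrong (as with the truncated date here) counts as a miss under exact matching, which is one reason to follow the ground truth format precisely.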
Submitted software will be evaluated by the program committee on factors such as ease of installation and ease of use.
Participants are asked to submit a short paper describing their system and analyzing its performance. Papers should be in AMIA style and should not exceed five pages. Authors of top-performing systems and of particularly novel approaches will be invited to present or demo their systems at the workshop. Submitted software can be presented at the final workshop in the form of a poster or live demo.
|March 14, 2014||Registration Opens|
|May 1, 2014||Training Data Release|
|July 1, 2014||Test Data Release|
|July 3, 2014||System Outputs on Test Data Due at 11:59pm Eastern Time|
|July 15, 2014||Systems Due|
|August 1, 2014||Paper Submissions|
|Ozlem Uzuner, co-chair,||SUNY at Albany|
|Amber Stubbs, co-chair,||SUNY at Albany|
|Hua Xu, co-chair,||University of Texas, Houston|
|Susanne Churchill,||Partners HealthCare|
|Dina Demner-Fushman,||NIH/NLM|
|Joshua Denny,||Vanderbilt University|
|Bill Hersh,||Oregon Health and Science University|
|Isaac Kohane,||Partners HealthCare|
|Vishesh Kumar,||Massachusetts General Hospital|
|Anna Rumshisky,||UMass Lowell|
|Stanley Shaw,||Massachusetts General Hospital|
|Meliha Yetisgen,||University of Washington|
Please see the announcements for more information. Questions on the challenge can be addressed to Ozlem Uzuner, firstname.lastname@example.org.