i2b2: Informatics for Integrating Biology & the Bedside - A National Center for Biomedical Computing
2008 NLP Shared Task

Announcement of Data Release and Call for Participation

Second i2b2 Shared-Task and Workshop
Challenges in Natural Language Processing for Clinical Data
Obesity Challenge (A Shared-Task on Obesity): Who's obese and what co-morbidities do they (definitely/likely) have?

Data Release: 15 March, 2008
Evaluation: 23-25 June, 2008
Paper Submission: 1 September, 2008
Workshop: 7 November, 2008 in Washington, DC

Organizer: Informatics for Integrating Biology and the Bedside, i2b2, a National Center for Biomedical Computing

The obesity challenge is a multi-class, multi-label classification task focused on obesity and its co-morbidities. The data for the challenge consist of discharge summaries from Partners Healthcare. All records have been fully de-identified. Obesity information and co-morbidities have been marked at a document level as present, absent, questionable, or unmentioned in the documents. For each patient, both textual judgments, i.e., what the text explicitly states about obesity and co-morbidities, and intuitive judgments, i.e., what the text implies about obesity and co-morbidities, are provided. The goal of the challenge is to evaluate systems on their ability to recognize whether a patient is obese and what co-morbidities they exhibit.
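
As a rough illustration of what "multi-class, multi-label" means here, the Python sketch below models the judgments for a single document: each condition (obesity or a co-morbidity) receives one of the four labels, once under the textual judgment and once under the intuitive judgment. The names Label, JudgmentType, and DocumentJudgments, the document ID, and the placeholder co-morbidity name are hypothetical and chosen only for illustration; they do not reflect the actual i2b2 annotation format.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional, Tuple


class Label(Enum):
    """The four document-level labels named in the task description."""
    PRESENT = "present"
    ABSENT = "absent"
    QUESTIONABLE = "questionable"
    UNMENTIONED = "unmentioned"


class JudgmentType(Enum):
    """Textual = what the text states; intuitive = what the text implies."""
    TEXTUAL = "textual"
    INTUITIVE = "intuitive"


@dataclass
class DocumentJudgments:
    """Hypothetical container: one label per (judgment type, condition)."""
    doc_id: str
    labels: Dict[Tuple[JudgmentType, str], Label] = field(default_factory=dict)

    def set_label(self, kind: JudgmentType, condition: str, label: Label) -> None:
        self.labels[(kind, condition)] = label

    def get_label(self, kind: JudgmentType, condition: str) -> Optional[Label]:
        # Returns None when no judgment has been recorded for this pair.
        return self.labels.get((kind, condition))


# Illustrative values only; "doc-001" and "SomeComorbidity" are made up.
doc = DocumentJudgments(doc_id="doc-001")
doc.set_label(JudgmentType.TEXTUAL, "Obesity", Label.UNMENTIONED)
doc.set_label(JudgmentType.INTUITIVE, "Obesity", Label.PRESENT)
doc.set_label(JudgmentType.TEXTUAL, "SomeComorbidity", Label.QUESTIONABLE)
```

The same document can carry a different label for each condition (multi-label), each label is drawn from more than two classes (multi-class), and the textual and intuitive judgments for one condition may disagree.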

The challenge opened to preregistration on February 1, 2008. Training data for the challenge will be released in installments; the first installment will be released on March 15, 2008. The rest of the installments will follow soon after. Test data are scheduled to be released for only three days and will be used only for evaluation purposes. The results of the challenge will be presented at the workshop organized by i2b2.

Data for the obesity challenge will be released under a Data Use Agreement and are to be used for the challenge only. Obtaining the data requires completing a preregistration and signing the Data Use Agreement.

Evaluation Dates, File Formats, and Evaluation Metrics.

The obesity challenge evaluation will be conducted on the test data. Participating teams are asked to stop development as soon as they download the test data. Each team is allowed to upload (through this website) up to three system runs. System output is expected in the form of standoff annotations, following the exact format of the ground truth annotations provided by i2b2. Precision, recall, and F-measure will be used as evaluation metrics.
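
For readers unfamiliar with these metrics, the Python sketch below shows one common way to compute per-class precision, recall, and F-measure for a single condition, followed by a macro-average over the four labels. It is an illustration under stated assumptions, not the official scorer: the announcement does not specify the averaging scheme or the F-measure weighting, so the balanced F-measure (F1), the macro-average, the treatment of empty classes as 0.0, and the function names per_class_scores and macro_average are all assumptions.

```python
from collections import Counter

CLASSES = ("present", "absent", "questionable", "unmentioned")


def per_class_scores(gold, predicted, classes=CLASSES):
    """Return {class: (precision, recall, f_measure)} for aligned label lists.

    gold[i] and predicted[i] are the labels for the same document.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # system claimed class p, but it was wrong
            fn[g] += 1  # system missed the true class g

    scores = {}
    for c in classes:
        # Assumption: a class with no predictions or no gold instances scores 0.0.
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        if precision + recall:
            f_measure = 2 * precision * recall / (precision + recall)
        else:
            f_measure = 0.0
        scores[c] = (precision, recall, f_measure)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and F-measure over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Made-up labels for four documents, for illustration only.
gold = ["present", "present", "absent", "unmentioned"]
predicted = ["present", "absent", "absent", "unmentioned"]
scores = per_class_scores(gold, predicted)
macro = macro_average(scores)
```

A macro-average gives rare labels such as "questionable" the same weight as frequent ones; a micro-average, which pools the counts across classes before dividing, would instead be dominated by the most frequent labels.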

Participants are asked to submit a short paper describing their system and analyzing their performance. Papers should be in AMIA style and should not exceed five pages. Authors of top performing systems and of particularly novel approaches will be invited to present or demo their systems at the workshop.

Tentative Schedule
February 1, 2008   Preregistration Open
March 15, 2008   Training Data Release
April 15, 2008   Commitment to Participate in Challenge
June 23, 2008   Test Data Release at 9am EST
June 25, 2008   Output Due at Midnight EST
August 1, 2008   Notification of Results to Each Participant
September 1, 2008   Short Papers Due
October 1, 2008   Invitations to Present at the Workshop
November 7, 2008   Workshop

Organizing Committee:

Ozlem Uzuner, Chair, SUNY at Albany
Middle East Technical University Northern Cyprus Campus
Peter Szolovits, MIT CSAIL
Isaac Kohane, Partners Healthcare

Please see the FAQs and announcements for more information. Questions on the challenge can be addressed to Ozlem Uzuner, i2b2nlp@albany.edu.


