Announcement of Data Release and Call for Participation
Fourth i2b2/VA Shared-Task and Workshop
Challenges in Natural Language Processing for Clinical Data
Data Release: 15 April, 2010
Evaluation: 22-24 July, 2010
Paper Submission: 1 September, 2010
Workshop: November, 2010 in Washington, DC
The Fourth i2b2/VA Challenge is a three-tiered challenge that studies:
- extraction of medical problems, tests, and treatments
- classification of assertions made on medical problems
- relations of medical problems, tests, and treatments
Sample records and their annotations can be found under the Documentation link.
The data for this challenge include discharge summaries from Partners HealthCare and from Beth Israel Deaconess Medical Center (MIMIC II Database), as well as discharge summaries and progress notes from University of Pittsburgh Medical Center. All records have been fully de-identified and manually annotated for concept, assertion, and relation information. Participants in the Fourth i2b2/VA Challenge may tackle any subset of the tiers; there is no limit on the number of tiers a team can enter.
The challenge registration started on March 15, 2010. Training data for the challenge will be released in installments starting April 15, 2010. Test data are scheduled to be released in July 2010. The results of the challenge will be presented at the workshop organized by i2b2 and VA.
Data for the Fourth i2b2/VA Challenge will be released under a Data Use Agreement and are to be used for research only. Obtaining the data requires completing a registration and signing the Data Use Agreement.
Evaluation Dates, File Formats, and Evaluation Metrics.
The Fourth i2b2/VA Challenge evaluation will be conducted using withheld test data. Participating teams are asked to stop development as soon as they download the test data. Each team may upload (through this website) up to three system runs for each tier of the challenge. System output is expected in the form of standoff annotations, following the exact format of the ground truth annotations provided by the organizers. Precision, recall, and F-measure will be used as evaluation metrics.
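As an informal illustration of the three metrics named above, the following minimal sketch scores a set of predicted annotations against a ground truth set. It is not the official evaluation script. It assumes exact matching of annotations and the balanced F-measure (beta = 1); the announcement does not specify the matching criteria or the weighting. The function name `score` and the tuple layout `(type, line, start_token, end_token)` are hypothetical and do not reflect the challenge's standoff format.

```python
from typing import Hashable, Iterable, NamedTuple


class Scores(NamedTuple):
    precision: float
    recall: float
    f_measure: float


def score(gold: Iterable[Hashable],
          predicted: Iterable[Hashable],
          beta: float = 1.0) -> Scores:
    """Score predicted annotations against gold ones by exact match.

    Assumption: an annotation counts as correct only if it is identical
    to a gold annotation. The official criteria may differ.
    """
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0

    if precision + recall == 0.0:
        return Scores(precision, recall, 0.0)

    # Weighted harmonic mean; beta = 1 gives the balanced F-measure.
    b2 = beta * beta
    f_measure = (1 + b2) * precision * recall / (b2 * precision + recall)
    return Scores(precision, recall, f_measure)


# Hypothetical annotations: (type, line, start_token, end_token).
gold = [
    ("problem", 1, 3, 4),
    ("test", 2, 0, 1),
    ("treatment", 5, 2, 2),
]
predicted = [
    ("problem", 1, 3, 4),    # exact match
    ("test", 2, 0, 1),       # exact match
    ("problem", 7, 1, 1),    # spurious
    ("treatment", 5, 2, 3),  # boundary error, counted as a miss here
]

result = score(gold, predicted)
print(f"P={result.precision:.3f} R={result.recall:.3f} F={result.f_measure:.3f}")
```

In this example two of the four predictions match the three gold annotations, so precision is 2/4, recall is 2/3, and the balanced F-measure is 4/7. Under exact matching a boundary error is penalized twice, once as a false positive and once as a false negative.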
Participants are asked to submit a short paper describing their system and analyzing its performance. Papers should be in AMIA style and should not exceed five pages. Authors of top-performing systems and of particularly novel approaches will be invited to present or demo their systems at the workshop.
| Date | Milestone |
|---|---|
| March 15, 2010 | Registration Opens |
| April 15, 2010 | Commitment to Participate in Challenge & Training Data Release |
| July 22, 2010 | Test Data Release |
| September 1, 2010 | Short Papers Due |
| October 1, 2010 | Invitations to Present at the Workshop |
| Name | Affiliation |
|---|---|
| Ozlem Uzuner (co-chair) | SUNY at Albany |
| Scott L DuVall (co-chair) | VA Salt Lake City Health Care System |
| Imre Solti | University of Washington |
| Leonard D'Avolio | VA Boston Healthcare System, Center for Surgery and Public Health, Brigham and Women's Hospital, Harvard School of Medicine |
| Wendy Chapman | Department of Biomedical Informatics, University of Pittsburgh |
| Shuying Shen | VA Salt Lake City Health Care System |
| Guergana Savova | Harvard Medical School and Children's Hospital Boston |
| Norris Harber Heintzelman | Lockheed Martin, IS&GS |
| Brett R South | VA Salt Lake City Health Care System, University of Utah, Department of Internal Medicine |
| Jennifer Hornung Garvin | VA Salt Lake City Health Care System and the University of Utah School of Medicine |
| Charlene Weir | VA Salt Lake City Health Care System, GRECC |
Please see the FAQs and announcements for more information. Questions on the challenge can be addressed to Ozlem Uzuner, firstname.lastname@example.org.