DAIR Digital Breast Tomosynthesis Cancer Detection Challenge - Phase 2 (DBTex2)

Organized by challenge-organizer


May 24, 2021, midnight UTC


June 15, 2021, midnight UTC


July 11, 2021, midnight UTC

Overview: We ask for participation in the DBTex2 Grand Challenge by submitting algorithms for the detection of breast lesions on digital breast tomosynthesis images. The results of the competition will be announced at the Grand Challenges Symposium session of the 2021 AAPM Annual Meeting.

Organizers: This challenge is organized by the Duke Center for Artificial Intelligence in Radiology (DAIR) in collaboration with the SPIE-AAPM-NCI Grand Challenges Committee.

Prizes: The winning team will receive a $1,000 prize sponsored by the Duke Center for Artificial Intelligence in Radiology. Additionally, one individual from each of the two top-performing teams will receive a waiver of the meeting registration fee in order to present their methods during the SPIE Medical Imaging Conference.

Publication: Members of top teams will be invited to prepare and co-author a joint publication on this competition alongside the competition organizers, detailing the challenge itself, the methods used by the participants, and the corresponding results.

Important Dates

  1. Release date of training set cases with truth: May 24, 2021
  2. Release date of validation set cases: May 24, 2021 
  3. Release date of test set cases: May 24, 2021
  4. Submissions allowed for validation set evaluation: June 15, 2021
  5. Submissions allowed for test set evaluation: July 11, 2021
  6. Test set submission deadline for participants: July 21, 2021
  7. Winning teams notified of challenge results: July 24, 2021
  8. Deadline for public GitHub repositories with teams' final models/code: July 26, 2021
  9. Winning results presented at Grand Challenges Symposium session of the 2021 AAPM Annual Meeting: July 28, 2021

Major Contributors:

  1. Maciej Mazurowski, Duke University (maciej.mazurowski@duke.edu)
  2. Sam Armato, University of Chicago (s-armato@uchicago.edu)
  3. Karen Drukker, University of Chicago (kdrukker@uchicago.edu)
  4. Lubomir Hadjiiski, University of Michigan (lhadjisk@umich.edu)
  5. Kenny Cha, FDA (Kenny.Cha@fda.hhs.gov)
  6. Keyvan Farahani, NIH/NCI (farahank@mail.nih.gov)
  7. Mateusz Buda, Duke University (mateusz.buda@duke.edu)
  8. Jichen Yang, Duke University (jy168@duke.edu)
  9. Nick Konz, Duke University (nicholas.konz@duke.edu)
  10. Ashirbani Saha, Duke University (as698@duke.edu)
  11. Reshma Munbodh, Brown University (reshma_munbodh@brown.edu)
  12. Jinzhong Yang, MD Anderson (jyang4@mdanderson.org)
  13. Nicholas Petrick, FDA (nicholas.petrick@fda.hhs.gov)
  14. Justin Kirby, NIH/NCI (kirbyju@mail.nih.gov)
  15. Jayashree Kalpathy-Cramer, Harvard University (kalpathy@nmr.mgh.harvard.edu)
  16. Benjamin Bearce, Massachusetts General Hospital (kalpathy@nmr.mgh.harvard.edu)

Note: The participants can visit https://www.reddit.com/r/DukeDBTData/ for additional advice and discussion.

Formatting the submission file:

The output of your method submitted to the evaluation system should be a single CSV file with the following columns:

  1. PatientID: string - patient identifier
  2. StudyUID: string - study identifier
  3. View: string - view name, one of: RLL, LCC, RMLO, LMLO
  4. X: integer - X coordinate (on the horizontal axis) of the left edge of the predicted bounding box in 0-based indexing (for the left-most column of the image x=0)
  5. Width: integer - predicted bounding box width (along the horizontal axis)
  6. Y: integer - Y coordinate (on the vertical axis) of the top edge of the predicted bounding box in 0-based indexing (for the top-most row of the image y=0)
  7. Height: integer - predicted bounding box height (along the vertical axis)
  8. Z: integer - the first bounding box slice number in 0-based indexing (for the first slice of the image z=0)
  9. Depth: integer - predicted bounding box slice span (size along the depth axis)
  10. Score: float - predicted bounding box confidence score on arbitrary scale, unified across all the cases within a single submission (e.g. 0.0 – 1.0)


PatientID,StudyUID,View,X,Width,Y,Height,Z,Depth,Score
ID1,UID1,RLL,X(int),Width(int),Y(int),Height(int),Z(int),Depth(int),Score(float)
ID2,UID2,LCC,X(int),Width(int),Y(int),Height(int),Z(int),Depth(int),Score(float)
ID3,UID3,RMLO,X(int),Width(int),Y(int),Height(int),Z(int),Depth(int),Score(float)
ID4,UID4,LMLO,X(int),Width(int),Y(int),Height(int),Z(int),Depth(int),Score(float)

Coordinates of the predicted bounding boxes should be given for the correct image orientation. In the official competition GitHub repository, we provide a Python function for loading image data from a DICOM file into a 3D array of pixel values.

Each entry (row) in the submission file must correspond to exactly one predicted bounding box. There may be an arbitrary number of predicted bounding boxes for each DBT volume. Predictions are not required for every DBT volume.

An example submission file will be provided on the competition website.
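The column layout above can be produced with Python's standard csv module. The following is a minimal sketch, not the organizers' tooling: the helper names box_to_row and write_submission are hypothetical, and the sketch assumes each box is available as 0-based corner coordinates where (x1, y1, z1) is one past the last pixel or slice inside the box.

```python
import csv
import io

# Column order as listed in the submission format above.
COLUMNS = ["PatientID", "StudyUID", "View", "X", "Width",
           "Y", "Height", "Z", "Depth", "Score"]


def box_to_row(patient_id, study_uid, view, x0, y0, z0, x1, y1, z1, score):
    """Build one submission row from corner coordinates (hypothetical helper).

    Assumption: (x0, y0, z0) is the first pixel/slice inside the box (0-based)
    and (x1, y1, z1) is one past the last, so Width = x1 - x0, and so on.
    """
    return {
        "PatientID": patient_id,
        "StudyUID": study_uid,
        "View": view,
        "X": int(x0),
        "Width": int(x1 - x0),
        "Y": int(y0),
        "Height": int(y1 - y0),
        "Z": int(z0),
        "Depth": int(z1 - z0),
        "Score": float(score),
    }


def write_submission(rows, fileobj):
    """Write a header line followed by one line per predicted box."""
    writer = csv.DictWriter(fileobj, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_submission(
        [box_to_row("ID1", "UID1", "LCC", 10, 20, 3, 110, 220, 8, 0.9)],
        buffer,
    )
    print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with open(path, "w", newline="") as the second argument.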

Definition of a true-positive detection

A predicted box is counted as a true positive if the distance in pixels, measured in the original image, between its center point and the center of a ground truth box is less than half of its diagonal or 100 pixels, whichever is larger.

In terms of the third dimension, the ground truth bounding box is assumed to span 25% of volume slices before and after the ground truth center slice and the predicted box center slice is required to be included in this range to be considered a true positive.
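The two criteria above can be sketched as a single check. This is an illustrative reading, not the official evaluation code. It assumes that "its diagonal" refers to the ground truth box, that the ground truth record carries a center slice under the hypothetical key "Slice", and that the number of slices in the volume is passed in as volume_slices.

```python
import math


def box_center_xy(box):
    """Return the in-plane center (x, y) of a box given as X, Y, Width, Height."""
    return box["X"] + box["Width"] / 2.0, box["Y"] + box["Height"] / 2.0


def is_true_positive(pred, truth, volume_slices, min_dist=100.0):
    """Check one predicted box against one ground truth box.

    pred:  dict with X, Width, Y, Height, Z, Depth (submission columns).
    truth: dict with X, Width, Y, Height and "Slice" (center slice; the key
           name is an assumption made for this sketch).
    """
    px, py = box_center_xy(pred)
    tx, ty = box_center_xy(truth)
    distance = math.hypot(px - tx, py - ty)

    # Assumption: the diagonal is that of the ground truth box.
    half_diagonal = math.hypot(truth["Width"], truth["Height"]) / 2.0
    max_distance = max(half_diagonal, min_dist)

    # The ground truth is taken to span 25% of the volume slices on each
    # side of its center slice; the predicted center slice must fall inside.
    pred_center_slice = pred["Z"] + pred["Depth"] / 2.0
    slice_tolerance = 0.25 * volume_slices
    in_depth_range = abs(pred_center_slice - truth["Slice"]) <= slice_tolerance

    return distance < max_distance and in_depth_range
```

For small lesions the 100-pixel floor dominates: a 50 x 50 ground truth box has a half-diagonal of about 35 pixels, so any prediction whose center lies within 100 pixels in-plane, and within the slice range, would count.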

Performance metric:

The overall performance will be assessed as the average sensitivity at 1, 2, 3, and 4 false positives (FPs) per volume. The competition performance will be assessed only on studies containing a biopsied lesion.
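As a rough illustration of this metric, the sketch below sweeps the score threshold from high to low and records the best sensitivity reachable at or below each FP-per-volume budget. It is a simplified reading, not the official scoring script: the function names are hypothetical, detections are assumed to be already matched to ground truth lesions, and details such as how ties or multiple hits on one lesion are handled in the real evaluation may differ.

```python
def sensitivity_at_fps(detections, n_lesions, n_volumes,
                       fps_per_volume=(1, 2, 3, 4)):
    """Sensitivity at each FP-per-volume budget (simplified sketch).

    detections: iterable of (score, lesion_id) pairs pooled over all volumes;
                lesion_id is None when the detection is a false positive.
    """
    ordered = sorted(detections, key=lambda d: d[0], reverse=True)
    best = {k: 0.0 for k in fps_per_volume}
    found = set()   # lesions hit at the current threshold
    n_fp = 0        # false positives at the current threshold
    i = 0
    while i < len(ordered):
        # Detections sharing a score enter together: one threshold step.
        score = ordered[i][0]
        while i < len(ordered) and ordered[i][0] == score:
            lesion_id = ordered[i][1]
            if lesion_id is None:
                n_fp += 1
            else:
                found.add(lesion_id)  # repeated hits count once
            i += 1
        sensitivity = len(found) / n_lesions
        for k in fps_per_volume:
            if n_fp / n_volumes <= k:
                best[k] = max(best[k], sensitivity)
    return best


def mean_sensitivity(detections, n_lesions, n_volumes):
    """Average of the sensitivities at 1, 2, 3, and 4 FPs per volume."""
    per_budget = sensitivity_at_fps(detections, n_lesions, n_volumes)
    return sum(per_budget.values()) / len(per_budget)
```

Because only the ranking of scores matters in this sweep, the scores may be on any scale, provided the scale is consistent across all cases in a submission.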

Participants may use the training set cases in any manner they would like for the purpose of training their systems (consistent with the data license); there will be no restrictions on the advice sought from local experts for training purposes. The participants can also combine the provided training data with other data if they disclose that in the description of the algorithm.

The validation and test set cases, however, are to be manipulated, processed, and analyzed without human intervention.

Participants are free to download the training set and, subsequently, the validation and test sets when these datasets become available. It is important to note that once participants submit their test set output to the challenge organizers, they will be considered fully vested in the challenge, so that their performance results (without links to the identity of the participant) will become part of any presentations, publications, or subsequent analyses derived from the challenge at the discretion of the organizers.

The submission of test set output will not be considered complete unless it is accompanied by (1) a public GitHub repository that contains fully documented and reproducible code for your team's method (see the Overview page for the deadline) and (2) an agreement to be acknowledged in the Acknowledgment section (by name and institution—but without any link to the performance score of your particular method) of any manuscript that results from the challenge.

Members of top teams will be invited to prepare and co-author a joint publication on this competition alongside the competition organizers. This manuscript will describe the details of the challenge itself, as well as the methods used by the participants and the corresponding results. The representatives of the invited teams will be expected to provide descriptions of their algorithms and participate in the analysis of results and the preparation of the manuscript. However, please note that this is not a condition of participation.

Challenges are designed to motivate and reward novel computational approaches to a defined task. The use of commercial software (unless your group is affiliated with the organization that holds intellectual property rights to that software) or open source software (unless your group has a recognized association with the creation of that software) is not allowed, unless you can clearly demonstrate an innovative use, alteration, or enhancement to the application of such software.

Participation in the DBTex Challenges acknowledges the educational, friendly competition, and community-building nature of the challenges and commits to conduct consistent with this spirit for the advancement of the medical imaging research community. See this article for a discussion of lessons learned from the LUNGx Challenge, also sponsored by SPIE, AAPM, and NCI.

Terms and Conditions

By participating in this challenge, each participant agrees to:

  1. Detect lesions that were subsequently sent to biopsy. Those include both cancers and benign lesions.
  2. All participants must attest that they are not directly affiliated with the labs of any of the DBTex2 organizers or major contributors. Please refer to the Challenge Organizer Guidance document of the AAPM Working Group on Grand Challenges here.
  3. All participants must create a public GitHub repository with a working, documented and fully reproducible version of their code/models to have a place in the competition (see the deadline for this under "Overview").
  4. The participants are encouraged to use this data beyond this competition, consistent with the use conditions described on the TCIA website (https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=64685580). The use of the data should be acknowledged as described on the TCIA website.



