Solve for X: Lessons Learned from PROSTATEx
The PROSTATEx challenge organizers have published a paper in the Journal of Medical Imaging summarizing lessons learned and opportunities for future grand challenges.
In 2016, the American Association of Physicists in Medicine (AAPM), the National Cancer Institute (NCI), and SPIE partnered to issue a Grand Challenge to the medical imaging research community: develop a method to automatically classify prostate lesions as either clinically significant or not, based on MRI images. In other words: we'll give you the datasets; you submit the results of your computerized method.
On the heels of the first challenge, a second part was announced in 2017, with the task of developing a computerized method to determine the Gleason Grade Group of prostate cancer based on those same MRI images. Prostate cancer cells are classified into Gleason Grade Groups 1 to 5 based on how closely the cancerous cells resemble normal, healthy prostate tissue: a close match is a 1; very abnormal cells are a 5.
The purpose of these Grand Challenges is not to find one algorithm to rule them all, but to provide an opportunity for the comparison of perhaps dozens of algorithms through a specific and well-defined infrastructure. Sam Armato, SPIE Member and one of the organizers of the two-part PROSTATEx Challenge, believes that these challenges foster interest in the task and encourage innovation in the field.
Now that the challenge has concluded, Armato, along with other challenge organizers, has published a paper summarizing lessons learned and opportunities for the future:
1. Of the 71 methods submitted to the first part of the challenge (classifying prostate lesions as clinically significant or not), the majority outperformed random guessing, and the four best-performing methods could not be statistically distinguished from one another. The conclusion: automated classification of clinically significant prostate cancer appears feasible.
2. Of the 43 methods submitted to the second part of the challenge (computationally assigning lesions to a Gleason Grade Group), only two did marginally better than random guessing. The conclusion: it is very difficult for computerized methods to differentiate among five possible pathological grades. This result shows that the subject remains a novel line of investigation, open for creative ideas.
3. The relatively small size of the test datasets (204 cases in the first challenge and 99 in the second) made it difficult to reach statistically significant conclusions. Larger datasets in future challenges will improve the likelihood of drawing such conclusions.
4. Dataset quality can be a problem for challenges: sometimes annotations are ambiguous, and sometimes the reference standard has inherent variability. Challenge organizers must minimize these issues as much as possible.
5. Grand Challenges for medical imaging shouldn't be one-and-done. There's a need to keep the challenges going, even after the competition deadlines have passed.
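The "random guessing" benchmark in lessons 1 and 2 is typically expressed as area under the ROC curve (AUC), where a random guesser hovers near 0.5 and a perfect classifier reaches 1.0. The sketch below, using hypothetical scores on an invented 204-case dataset (not actual PROSTATEx data), illustrates how an AUC is computed and why 0.5 is the chance baseline:

```python
import random

def auc(labels, scores):
    """AUC via the Mann-Whitney formulation: the probability that a
    randomly chosen positive case receives a higher score than a
    randomly chosen negative case (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

random.seed(0)
# Hypothetical ground truth: 204 lesions, roughly 30% clinically significant.
labels = [int(random.random() < 0.3) for _ in range(204)]

# A random-guessing "method" lands near AUC 0.5 ...
random_scores = [random.random() for _ in labels]

# ... while an informative method tends to score positives higher.
informative_scores = [random.gauss(1.0 if y else 0.0, 1.0) for y in labels]

print(round(auc(labels, random_scores), 2))
print(round(auc(labels, informative_scores), 2))
```

With only ~200 test cases, the AUC of a random guesser can easily stray a few hundredths from 0.5, which is one concrete reason lesson 3 calls for larger test sets before declaring one method superior to another.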
Answering the last lesson learned, the challenge lives on at prostatex.grand-challenge.org. New methods can be added at any time, and researchers can continue to receive objective feedback.
The paper summarizing these results can be found in the Journal of Medical Imaging on the SPIE Digital Library.
The latest Grand Challenge co-sponsored by SPIE, AAPM, and NCI is BreastPathQ. Challenge results will be released to participants on 4 January 2019, and a workshop will take place as part of the 2019 SPIE Medical Imaging Conference in February.