Teach AI to Compare Mammogram Views, Part 2
Learn what a 0.96 AUC score means.

The mammogram screening process captures X-ray pictures of the breasts. In a standard 2D mammogram, doctors take two pictures of each breast from different angles. Radiologists use this dual-view technique to compare both images of each breast and detect abnormalities, so they typically interpret four mammogram images per patient.

Last month, in the first article of the series, Revenue Cycle Insider provided a brief background on artificial intelligence (AI) and examined a 2025 study by Pelluet et al. in which researchers trained an AI tool to analyze both mammogram views of a breast without needing precise image alignment. Read on to learn the results of the study and how AI can fit into healthcare going forward.

Review the Study’s Results

Researchers used a breast-level area under the curve (AUC) score to evaluate the AI’s ability to discern cancerous from non-cancerous lesions. The AI achieved a 0.96 AUC score with the INbreast dataset and a 0.92 AUC score with the larger VinDR dataset. An AUC of 1.0 represents perfect discrimination, and 0.5 is random chance. AUCs of 0.96 and 0.92 mean this model has a strong capability to distinguish between cancerous and non-cancerous breasts.

The model detected 97 percent of cancers with only 0.33 false alarms per image. This means that if the AI model examines 100 patients who have breast cancer, it correctly identifies 97 of them as having cancer (true positives) and misses only three (false negatives).

This result is noteworthy because mammography detection rates vary widely. Fitzjohn et al.’s assessment of mammography accuracy across multiple studies of both human and computer-assisted detection found that sensitivity ranges from 27 to 97 percent, with the estimated true diagnostic accuracy averaging 60 percent. The new method by Pelluet et al. also reduced false positives compared to other approaches.
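To make the AUC and sensitivity figures concrete, here is a minimal Python sketch. It is not the study's code: the scores are invented, and the helper names auc_from_scores and sensitivity are hypothetical. It reads AUC as the share of (cancerous, non-cancerous) pairs in which the cancerous case gets the higher score, which is one standard interpretation of the metric, and it repeats the 97-out-of-100 sensitivity arithmetic.

```python
from itertools import product


def auc_from_scores(positive_scores, negative_scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pairs = list(product(positive_scores, negative_scores))
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)


def sensitivity(true_positives, false_negatives):
    """Fraction of actual cancers that the model flags."""
    return true_positives / (true_positives + false_negatives)


# Invented model scores, for illustration only (higher = more suspicious).
cancer_scores = [0.91, 0.85, 0.78, 0.40]
healthy_scores = [0.10, 0.35, 0.52, 0.20, 0.05]

toy_auc = auc_from_scores(cancer_scores, healthy_scores)

# The article's example: 97 cancers caught, 3 missed, out of 100 patients.
article_sensitivity = sensitivity(true_positives=97, false_negatives=3)

print(f"Toy AUC: {toy_auc:.2f}")
print(f"Sensitivity: {article_sensitivity:.0%}")
```

With these invented scores, 19 of the 20 possible pairs are ranked correctly, so the toy AUC is 0.95. A model that ranked every pair correctly would score 1.0, and one that guessed at random would hover around 0.5.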
For comparison, a different model mentioned in the study achieved the same high true positive rate (97 percent) but produced 1.94 false positives per image. This means that the comparative model would flag nearly six times as many non-cancerous regions as suspicious. Fewer false positives save radiologists time and help reduce anxiety for patients who would otherwise undergo additional testing.

One consideration is that this technology requires a computer with enough memory to analyze high-resolution images, which not all clinical environments have.

What Does This Mean for AI in Healthcare?

According to the American Cancer Society’s 2025 U.S. cancer statistics, breast cancer is the most commonly diagnosed cancer in women and the second leading cause of cancer death in women after lung cancer. Detecting it early improves the chances of successful treatment.

AI computer-aided diagnostic (CAD) systems address a real problem. With millions of mammograms ordered annually in the U.S. and radiologists reviewing on average four images per patient, the workload is immense. AI can help catch cancers that might be missed due to fatigue, and it can help with the challenge of spotting abnormalities in dense breast tissue.

Studies suggest AI assistance improves diagnostic accuracy, especially for junior radiologists. For example, a 2025 study by Gong et al. examined AI-assisted ultrasound assessment of axillary lymph nodes in breast cancer patients. Junior radiologists improved their mean (average) diagnostic accuracy in distinguishing malignant from benign nodes from 83.4 percent to 90.6 percent when using AI assistance, nearly matching the 92.0 percent accuracy of senior radiologists using AI.

Both the Pelluet and Gong studies focus on providing explainable results, such as heatmaps that show doctors exactly which regions in the images led to the AI’s conclusion. These tools are meant to assist, not replace, the professional judgment of radiologists.
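The false-alarm and accuracy figures quoted above come down to simple arithmetic, sketched below in Python. The per-exam numbers are an assumption made for illustration: they multiply the reported per-image rates by the four images of a standard exam, which the study itself does not report. The function name false_marks_per_exam is hypothetical.

```python
# Figures quoted in the article.
FP_PER_IMAGE_PROPOSED = 0.33    # Pelluet et al. model
FP_PER_IMAGE_COMPARISON = 1.94  # comparison model at the same 97 percent sensitivity
IMAGES_PER_PATIENT = 4          # two views of each breast

JUNIOR_ACCURACY_WITHOUT_AI = 83.4  # percent, Gong et al.
JUNIOR_ACCURACY_WITH_AI = 90.6     # percent, Gong et al.


def false_marks_per_exam(fp_per_image, images_per_exam=IMAGES_PER_PATIENT):
    """Assumption: the per-image false-positive rate applies equally to every image."""
    return fp_per_image * images_per_exam


# How many times more false alarms the comparison model raises per image.
ratio = FP_PER_IMAGE_COMPARISON / FP_PER_IMAGE_PROPOSED

proposed_per_exam = false_marks_per_exam(FP_PER_IMAGE_PROPOSED)
comparison_per_exam = false_marks_per_exam(FP_PER_IMAGE_COMPARISON)

# Gain for junior radiologists, in percentage points.
junior_gain = JUNIOR_ACCURACY_WITH_AI - JUNIOR_ACCURACY_WITHOUT_AI

print(f"False-alarm ratio: {ratio:.1f}x")
print(f"False marks per four-image exam: {proposed_per_exam:.2f} vs {comparison_per_exam:.2f}")
print(f"Junior radiologist gain: {junior_gain:.1f} percentage points")
```

The ratio works out to roughly 5.9, which is where "nearly six times" comes from. Under the four-image assumption, that is about 1.3 false marks per exam instead of about 7.8, and the junior radiologists in the Gong study gained about 7.2 percentage points of accuracy.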
While the Pelluet model is still in its research phase, other AI systems for breast cancer detection have already entered clinical practice. Between 2017 and 2021, the Food and Drug Administration (FDA) cleared nine AI breast cancer detection products. However, all nine were cleared based on retrospective data, which has limitations because the data was collected before the study rather than prospectively. Post-marketing trials and more emphasis on clinical diversity (e.g., different breast densities and patient ages) are needed to ensure effectiveness for all patients.

The work by Pelluet et al. is an important step in creating AI that can partner with radiologists to analyze dual-view mammograms. While the transition to widespread clinical use requires FDA approval, this smarter scan technology offers a path to increased cancer detection, fewer unnecessary follow-ups, and reduced radiologist workloads.

Angela Halasey, BS, CPC, CCS, Contributing Writer
