Representative chest X-ray images from the RSNA dataset, showing both normal cases and pneumonia-positive cases with annotated bounding boxes indicating areas of lung opacity.
Comparative Analysis of Faster R-CNN Architectures for Medical Image Classification
Project Overview: This project explores automated pneumonia detection in chest X-rays using three distinct Faster R-CNN implementations. We evaluated different backbone architectures, optimizers, and transfer learning approaches on the RSNA Pneumonia Detection Challenge dataset. The best validation accuracy, 76.81% (with 77.34% training accuracy), came from ResNet-50 pretrained on ImageNet.
| Field | Details |
|---|---|
| Completion Date | December 2024 |
| Course Context | CS 5785: Applied Machine Learning Final Project |
| Dataset | RSNA Pneumonia Detection Challenge (26,000+ training images) |
| Key Technologies | PyTorch, Faster R-CNN, ResNet-50, Transfer Learning |
| Core Concepts | Object Detection, Medical Image Analysis, Deep Learning, Binary Classification |
We implemented and compared three distinct approaches to automated pneumonia detection, each leveraging different aspects of the Faster R-CNN architecture for object detection and localization of lung opacities in chest radiographs.
Method 1: Torch + Adam
Architecture: Faster R-CNN with ResNet-50 Feature Pyramid Network (FPN) backbone
Transfer Learning: Pretrained weights from COCO dataset
Training Configuration:
Method 2: ImageNet + ResNet-50
Architecture: ResNet-50 with custom classification head
Transfer Learning: Pretrained weights from ImageNet
Network Design:
Training Configuration:
Method 3: PyTorch + SGD
Architecture: Faster R-CNN with ResNet-50-FPN backbone
Transfer Learning: PyTorch default pretrained weights
Training Configuration:
The Faster R-CNN architecture combines two key components for object detection:
Region Proposal Network (RPN): Generates object proposals by sliding a small network over the CNN feature map. For each location, it predicts object/no-object scores and bounding box coordinates.
Fast R-CNN Detection Network: Takes the proposed regions and performs classification and bounding box regression. The network outputs class probabilities and refined bounding box coordinates.
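To make the RPN's "sliding" behaviour concrete, here is a minimal pure-Python sketch of how candidate anchor boxes could be laid out over a feature map before the network scores and refines them. The function name `generate_anchors`, the stride, the scales, and the aspect ratios are illustrative assumptions, not values taken from this project.

```python
from itertools import product


def generate_anchors(feat_h, feat_w, stride, scales, ratios):
    """Return anchor boxes (x1, y1, x2, y2) in image coordinates.

    One anchor per (scale, ratio) pair is centred on every feature-map cell.
    All parameter values used below are illustrative assumptions.
    """
    anchors = []
    for row, col in product(range(feat_h), range(feat_w)):
        # Centre of this feature-map cell, mapped back to image pixels.
        cx = (col + 0.5) * stride
        cy = (row + 0.5) * stride
        for scale, ratio in product(scales, ratios):
            # ratio = height / width; the box area equals scale ** 2.
            w = scale / ratio ** 0.5
            h = scale * ratio ** 0.5
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = generate_anchors(
    feat_h=4, feat_w=4, stride=16, scales=(32, 64), ratios=(0.5, 1.0, 2.0)
)
print(len(anchors))  # 4 * 4 cells * 6 anchors per cell = 96
```

In a real Faster R-CNN, the RPN predicts an object/no-object score and four coordinate offsets for each of these anchors, and the highest-scoring refined boxes are passed to the detection network.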
The loss function combines classification loss and localization loss:
L = L_cls + λL_reg
where L_cls is the classification loss (cross-entropy), L_reg is the bounding box regression loss (smooth L1 loss), and λ is a weight that balances the regression term against the classification term.
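The combined loss can be sketched for a single region in plain Python. This is an illustration under stated assumptions rather than the code used in the project: the helper names, the `beta` threshold of 1.0, averaging over the four box coordinates, and `lam=1.0` are all choices made for the example.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class given predicted probabilities."""
    return -math.log(probs[label])


def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 loss, averaged over the box coordinates (an assumption here)."""
    total = 0.0
    for p, t in zip(pred, target):
        diff = abs(p - t)
        # Quadratic for small errors, linear for large ones.
        total += 0.5 * diff ** 2 / beta if diff < beta else diff - 0.5 * beta
    return total / len(pred)


def detection_loss(probs, label, pred_box, target_box, lam=1.0):
    """L = L_cls + lam * L_reg for one region.

    The box term is only applied to positive (non-background) regions;
    label 0 is assumed to mean background.
    """
    l_cls = cross_entropy(probs, label)
    l_reg = smooth_l1(pred_box, target_box) if label > 0 else 0.0
    return l_cls + lam * l_reg


# One positive region: 80% confidence in "opacity", imperfect box offsets.
loss = detection_loss(
    probs=[0.2, 0.8],
    label=1,
    pred_box=[0.1, 0.2, 0.0, 2.0],
    target_box=[0.0, 0.0, 0.0, 0.0],
)
print(f"{loss:.4f}")
```

Smooth L1 is used instead of a plain squared error because it grows only linearly for large coordinate errors, which makes the regression term less sensitive to outlier boxes.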
The RSNA Pneumonia Detection Challenge dataset contains over 26,000 chest X-ray images from the National Institutes of Health Clinical Center. Each image includes:
| Method | Training Accuracy | Validation Accuracy | Training Loss | Validation Loss | Epochs |
|---|---|---|---|---|---|
| Method 1: Torch + Adam | 82.45% | 72.61% | 0.52 | 0.55 | 2 |
| Method 2: ImageNet + ResNet-50 | 77.34% | 76.81% | 0.5450 | 0.5151 | 1 |
| Method 3: PyTorch + SGD | 83.33% | 73.95% | 0.30 | N/A | 2 |
Method 3 loss convergence: Training loss decreased to approximately 0.3 over batch iterations
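One way to read the results table is to look at the gap between training and validation accuracy for each method. The short sketch below does that arithmetic using only the figures reported in the table; the variable names are illustrative.

```python
# (training accuracy %, validation accuracy %) copied from the results table.
results = {
    "Method 1: Torch + Adam": (82.45, 72.61),
    "Method 2: ImageNet + ResNet-50": (77.34, 76.81),
    "Method 3: PyTorch + SGD": (83.33, 73.95),
}

# Train-validation gap in percentage points; a large gap hints at overfitting.
gaps = {name: round(train - val, 2) for name, (train, val) in results.items()}

# Method with the highest validation accuracy.
best_val = max(results, key=lambda name: results[name][1])

for name, gap in gaps.items():
    print(f"{name}: gap = {gap:.2f} points")
print(f"Highest validation accuracy: {best_val}")
```

The gaps come out at 9.84, 0.53, and 9.38 percentage points for Methods 1, 2, and 3 respectively, so Method 2 has both the highest validation accuracy and by far the smallest gap.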
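Best Overall Performance: Method 2 (ResNet-50 pretrained on ImageNet) achieved the highest validation accuracy, 76.81%, after only 1 epoch of training. Its training accuracy (77.34%) was the lowest of the three methods, so it also showed the smallest gap between training and validation accuracy, which suggests better generalization.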
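Transfer Learning Impact: In these experiments, ImageNet pretraining gave better validation results than COCO pretraining on this medical imaging task. One possible explanation is the larger and more diverse ImageNet dataset, although the methods also differ in architecture and optimizer, so the comparison is not controlled.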
Training Efficiency: Method 2 reached its best reported performance after a single epoch, suggesting that ImageNet features transfer effectively to medical image analysis tasks.
Computational Constraints: Limited computational resources restricted extensive hyperparameter exploration and prevented evaluation of additional architectures like YOLO.
This work demonstrates the feasibility of automated pneumonia detection systems that could assist radiologists in clinical settings. The achieved accuracy levels suggest potential for: