Automated Pneumonia Detection using Deep Learning

Comparative Analysis of Faster R-CNN Architectures for Medical Image Classification

CS 5785 Applied Machine Learning • Fall 2024 • Team Project

Project Overview: This project explores automated pneumonia detection in chest X-rays using three distinct Faster R-CNN implementations. We evaluated different backbone architectures, optimizers, and transfer learning approaches on the RSNA Pneumonia Detection Challenge dataset, achieving up to 77.34% training accuracy and 76.81% validation accuracy with ResNet-50 pretrained on ImageNet.

Example chest X-ray images from the RSNA dataset showing normal and pneumonia cases

Representative chest X-ray images from the RSNA dataset, showing both normal cases and pneumonia-positive cases with annotated bounding boxes indicating areas of lung opacity.

Completion Date	December 2024
Course Context	CS 5785: Applied Machine Learning Final Project
Dataset	RSNA Pneumonia Detection Challenge (26,000+ training images)
Key Technologies	PyTorch, Faster R-CNN, ResNet-50, Transfer Learning
Core Concepts	Object Detection, Medical Image Analysis, Deep Learning, Binary Classification

Technical Approach

We implemented and compared three distinct approaches to automated pneumonia detection, each leveraging different aspects of the Faster R-CNN architecture for object detection and localization of lung opacities in chest radiographs.

Method 1: Torch Faster R-CNN with ResNet-50 and Adam Optimizer

Architecture: Faster R-CNN with ResNet-50 Feature Pyramid Network (FPN) backbone

Transfer Learning: Pretrained weights from COCO dataset

Training Configuration:

Optimizer: Adam (learning rate: 0.001)
Epochs: 2
Batch size: 4
Image resolution: 128×128 pixels
Data split: 80% training, 20% validation

Method 2: Faster R-CNN with ResNet-50 (ImageNet Pretrained)

Architecture: ResNet-50 with custom classification head

Transfer Learning: Pretrained weights from ImageNet

Network Design:

Feature extractor: ResNet-50 (frozen pretrained layers)
Global Average Pooling layer
Dense layer: 1024 units with ReLU activation
Output layer: Sigmoid activation for binary classification

Training Configuration:

Loss function: Binary cross-entropy
Optimizer: Adam
Epochs: 10 (early stopping at 1 epoch for optimal performance)
Batch size: 32

Method 3: PyTorch Faster R-CNN with ResNet-50-FPN and SGD

Architecture: Faster R-CNN with ResNet-50-FPN backbone

Transfer Learning: PyTorch default pretrained weights

Training Configuration:

Optimizer: Stochastic Gradient Descent (SGD)
Learning rate: 0.01
Weight decay: 0.0005
Epochs: 2
Data split: 80% training, 20% validation

Faster R-CNN Architecture

The Faster R-CNN architecture combines two key components for object detection:

Region Proposal Network (RPN): Generates object proposals by sliding a small network over the CNN feature map. For each location, it predicts object/no-object scores and bounding box coordinates.

Fast R-CNN Detection Network: Takes the proposed regions and performs classification and bounding box regression. The network outputs class probabilities and refined bounding box coordinates.

The loss function combines classification loss and localization loss:

L = L_cls + λL_reg

where L_cls is the classification loss (cross-entropy) and L_reg is the bounding box regression loss (smooth L1 loss).

Dataset and Preprocessing

The RSNA Pneumonia Detection Challenge dataset contains over 26,000 chest X-ray images from the National Institutes of Health Clinical Center. Each image includes:

Patient ID and binary pneumonia classification target
Bounding box annotations (x-min, y-min, width, height) for pneumonia regions
Multiple bounding boxes per image when applicable

Data Preprocessing Pipeline

Custom PyTorch dataset class for DICOM image handling
Image resizing and tensor conversion
Normalization and data augmentation
Invalid bounding box filtering for data quality
Train/validation split with minibatch loading

Results and Performance Comparison

Method	Training Accuracy	Validation Accuracy	Training Loss	Validation Loss	Epochs
Method 1: Torch + Adam	82.45%	72.61%	0.52	0.55	2
Method 2: ImageNet + ResNet-50	77.34%	76.81%	0.5450	0.5151	1
Method 3: PyTorch + SGD	83.33%	73.95%	0.30	N/A	2

Loss convergence graph for Method 3 showing decreasing loss over batch iterations

Method 3 loss convergence: Training loss decreased to approximately 0.3 over batch iterations

Key Findings

Best Overall Performance: Method 2 (Faster R-CNN with ImageNet pretrained ResNet-50) achieved the highest validation accuracy of 76.81%, demonstrating superior generalization despite training for only 1 epoch.

Transfer Learning Impact: ImageNet pretraining provided better initialization than COCO pretraining for this medical imaging task, likely due to the larger and more diverse ImageNet dataset.

Training Efficiency: Method 2 reached optimal performance in fewer epochs, suggesting that ImageNet features transfer effectively to medical image analysis tasks.

Computational Constraints: Limited computational resources restricted extensive hyperparameter exploration and prevented evaluation of additional architectures like YOLO.

Technical Achievements

Multi-architecture Evaluation: Successfully implemented and compared three distinct Faster R-CNN variants with different backbones and optimization strategies
Medical Image Pipeline: Developed robust preprocessing pipeline for DICOM chest X-ray images with bounding box handling
Transfer Learning Analysis: Demonstrated effectiveness of ImageNet pretraining over COCO pretraining for medical imaging applications
Performance Optimization: Achieved competitive accuracy (76.81% validation) on challenging medical dataset with limited computational resources
Reproducible Framework: Created modular, extensible codebase for pneumonia detection that can be adapted for other medical imaging tasks

Clinical Relevance

This work demonstrates the feasibility of automated pneumonia detection systems that could assist radiologists in clinical settings. The achieved accuracy levels suggest potential for:

Prioritizing urgent cases requiring immediate radiologist review
Reducing diagnostic burden in resource-limited healthcare settings
Providing consistent preliminary screening for large volumes of chest X-rays
Supporting radiologist decision-making with automated opacity localization

← Back to Projects