Adversarial ML / 2023-09-15
Adversarial Robustness in Medical Image Classification
This master's thesis investigated how adversarial perturbations affect image-classification models in a medical screening context. The work used the Intel and MobileODT Cervical Cancer Screening dataset to evaluate four attack methods: the Fast Gradient Sign Method (FGSM), a Random Perturbation Attack (RPA), a Gaussian Noise Attack (GNA), and the Basic Iterative Method (BIM). It then assessed defensive strategies such as adversarial training and ensemble-based robustness.
Program: Artificial Intelligence and Big Data
Dataset: Cervical cancer screening images
Attacks: FGSM, RPA, GNA, and BIM
Defenses: Adversarial training and ensembles
Medical image classifiers can appear accurate on clean validation data while remaining vulnerable to small perturbations that change model decisions.
Implemented multiple adversarial attack methods against a transfer-learning classifier, compared clean and attacked accuracy, and evaluated defensive training strategies.
The work showed that adversarial exposure materially degrades model accuracy and that defensive training improves robustness, while also highlighting the limits of current defense methods.
- Python
- TensorFlow/Keras
- ResNet50
- Adversarial ML
- FGSM
- BIM
- Gaussian noise
- Medical imaging
Research context
The thesis studies a practical risk in applied machine learning: a classifier can perform well on clean validation images and still be brittle when the input is intentionally perturbed.
The work focuses on cervical cancer screening image classification, where model robustness matters because the prediction context is sensitive and errors can carry real downstream consequences.
Methodology
The project used transfer learning with ResNet50 and evaluated several adversarial attack strategies against the image classifier. The attacks were used to compare the model's behavior on clean inputs with its behavior on attacked inputs, and to identify where the classifier was most fragile.
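The two gradient-based attacks in this comparison can be illustrated on a much smaller model. The following is a minimal sketch, not the thesis implementation: it assumes a toy logistic classifier standing in for the ResNet50 network, inputs scaled to [0, 1], and hypothetical names (`fgsm`, `bim`, `eps`, `alpha`) chosen only for illustration. FGSM takes a single step of size `eps` along the sign of the loss gradient with respect to the input; BIM repeats smaller steps of size `alpha` and keeps the result within `eps` of the original input.

```python
import math

def predict_proba(w, b, x):
    """Class-1 probability of a toy logistic model (stand-in for the CNN)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-60.0, min(60.0, z))  # keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(w, b, x, y):
    """Gradient of binary cross-entropy with respect to the input x."""
    p = predict_proba(w, b, x)
    return [(p - y) * wi for wi in w]

def sign(v):
    return (v > 0) - (v < 0)

def clip(v, lo, hi):
    return max(lo, min(hi, v))

def fgsm(w, b, x, y, eps):
    """Single step of size eps along the sign of the input gradient."""
    g = input_gradient(w, b, x, y)
    return [clip(xi + eps * sign(gi), 0.0, 1.0) for xi, gi in zip(x, g)]

def bim(w, b, x, y, eps, alpha, steps):
    """Repeated small steps, projected back into the eps-ball around x."""
    x_adv = list(x)
    for _ in range(steps):
        g = input_gradient(w, b, x_adv, y)
        x_adv = [
            clip(clip(xa + alpha * sign(gi), xi - eps, xi + eps), 0.0, 1.0)
            for xa, gi, xi in zip(x_adv, g, x)
        ]
    return x_adv

# Hypothetical weights and a single three-"pixel" input with true label 1.
w, b = [2.0, -3.0, 1.0], 0.0
x, y = [0.6, 0.3, 0.5], 1
x_fgsm = fgsm(w, b, x, y, eps=0.2)
x_bim = bim(w, b, x, y, eps=0.2, alpha=0.05, steps=10)
print(predict_proba(w, b, x), predict_proba(w, b, x_fgsm), predict_proba(w, b, x_bim))
```

For a real network the input gradient would come from automatic differentiation rather than a closed-form expression, but the update rules keep the same shape.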
The defense work then tested whether adversarial training and ensemble-oriented strategies could improve robustness, judging success by accuracy under attack as well as by clean accuracy.
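As a rough illustration of these two defenses, the sketch below trains a toy logistic model on each clean example plus an FGSM copy generated against the current weights, and combines several classifiers by majority vote. Everything here is an assumption made for illustration: the toy model, the one-to-one clean/adversarial mix, the learning rate, and the function names are hypothetical, and majority voting is only one common way to build an ensemble. The thesis itself used ResNet50 and may have configured both defenses differently.

```python
import math
import random
from collections import Counter

def predict_proba(w, b, x):
    """Class-1 probability of a toy logistic model (stand-in for the CNN)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    z = max(-60.0, min(60.0, z))  # keep math.exp in a safe range
    return 1.0 / (1.0 + math.exp(-z))

def fgsm(w, b, x, y, eps):
    """One-step attack; (p - y) * w is the input gradient of cross-entropy."""
    p = predict_proba(w, b, x)
    signs = [((p - y) * wi > 0) - ((p - y) * wi < 0) for wi in w]
    return [min(1.0, max(0.0, xi + eps * s)) for xi, s in zip(x, signs)]

def sgd_step(w, b, x, y, lr):
    """One gradient-descent update on a single (x, y) example."""
    err = predict_proba(w, b, x) - y
    return [wi - lr * err * xi for wi, xi in zip(w, x)], b - lr * err

def adversarial_training(data, eps, lr=0.5, epochs=200, seed=0):
    """Update on each clean example, then on an FGSM copy of it."""
    rng = random.Random(seed)
    data = list(data)
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            w, b = sgd_step(w, b, x, y, lr)
            # The attack is regenerated against the weights as they are now.
            w, b = sgd_step(w, b, fgsm(w, b, x, y, eps), y, lr)
    return w, b

def ensemble_predict(models, x):
    """Majority vote; each model is a callable mapping an input to a label.
    Ties go to the label that was voted for first."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Toy data: class 1 when the first feature dominates.
data = [([0.9, 0.1], 1), ([0.8, 0.2], 1), ([0.1, 0.9], 0), ([0.2, 0.8], 0)]
w, b = adversarial_training(data, eps=0.1)
```

Regenerating the adversarial copy inside the loop matters: examples crafted once against the initial weights would stop being adversarial as the model changes.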
| Area | Implementation |
|---|---|
| Modeling | Transfer learning with ResNet50 |
| Dataset | Intel and MobileODT cervical cancer screening images |
| Attacks | FGSM, Random Perturbation Attack, Gaussian Noise Attack, BIM |
| Defenses | Adversarial training and ensemble-based robustness checks |
Findings
- Adversarial examples exposed failure modes that were not visible through clean validation accuracy alone.
- Gradient-based attacks (FGSM and its iterative variant, BIM) were especially useful for stress-testing model behavior.
- Defensive training improved robustness, but the thesis also shows why adversarial defense should be evaluated as an ongoing reliability problem rather than a one-time fix.
Professional relevance
The work connects directly to production AI engineering: reliable systems need evaluation beyond average-case accuracy, especially when model outputs influence operational or high-stakes decisions.
The same evaluation mindset applies to the rest of this site: define the failure mode, build a repeatable test, measure the system under stress, and choose the architecture from observed behavior.
Detailed methodology and results
Master's thesis work for the M.S. in Artificial Intelligence and Big Data, focused on adversarial robustness for medical image classification.
Executive Summary
The thesis evaluates how small, intentionally constructed image perturbations can change the predictions of a deep-learning classifier trained for cervical cancer screening. The goal was not only to demonstrate attacks, but to measure their practical impact and evaluate defensive strategies that could improve model robustness.
The project used the Intel and MobileODT Cervical Cancer Screening dataset and a transfer-learning workflow based on ResNet50. It compared model behavior on clean inputs against behavior on adversarially attacked inputs, then tested whether adversarial training and ensemble-oriented defenses could reduce the observed vulnerability.
Research Questions
- How vulnerable is a medical image classifier when adversarial perturbations are applied to otherwise valid input images?
- Which attack methods create the most meaningful degradation in model behavior for this image-classification setting?
- Can adversarial training and ensemble-based defenses improve robustness without hiding the model's remaining weaknesses?
Methods
| Area | Details |
|---|---|
| Dataset | Intel and MobileODT Cervical Cancer Screening image dataset |
| Model | Transfer learning with ResNet50 |
| Attack methods | Fast Gradient Sign Method, Random Perturbation Attack, Gaussian Noise Attack, Basic Iterative Method |
| Defense methods | Adversarial training and ensemble-oriented robustness evaluation |
| Evaluation focus | Clean accuracy, attacked accuracy, and robustness behavior under perturbed inputs |
The implementation treated adversarial examples as an evaluation tool, not only as a security demonstration. Each attack made the model answer the same classification problem under a different stress condition, exposing behavior that a clean validation set would not show.
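One way to picture this evaluation setup is a small harness that scores the same classifier on clean inputs and again under each attack. The sketch below is illustrative only: the brightness-threshold `predict` function, the sample data, and the noise strengths are hypothetical, and the two noise attacks are assumed to add bounded uniform noise and zero-mean Gaussian noise respectively, which may differ from the exact definitions used in the thesis.

```python
import random

def clip01(v):
    return max(0.0, min(1.0, v))

def random_perturbation(x, eps, rng):
    """Assumed form of a random perturbation: uniform noise in [-eps, eps]."""
    return [clip01(xi + rng.uniform(-eps, eps)) for xi in x]

def gaussian_noise(x, sigma, rng):
    """Zero-mean Gaussian noise with standard deviation sigma."""
    return [clip01(xi + rng.gauss(0.0, sigma)) for xi in x]

def accuracy(predict, samples):
    """Fraction of (x, y) pairs that the classifier labels correctly."""
    return sum(1 for x, y in samples if predict(x) == y) / len(samples)

def robustness_report(predict, samples, attacks):
    """Score one classifier on clean inputs and under each named attack.

    `attacks` maps a name to a function (x, y) -> perturbed x.
    """
    report = {"clean": accuracy(predict, samples)}
    for name, attack in attacks.items():
        attacked = [(attack(x, y), y) for x, y in samples]
        report[name] = accuracy(predict, attacked)
    return report

# Hypothetical classifier: label 1 when mean brightness exceeds 0.5.
def predict(x):
    return int(sum(x) / len(x) > 0.5)

samples = [
    ([0.9, 0.8, 0.95], 1),
    ([0.1, 0.2, 0.05], 0),
    ([0.55, 0.6, 0.5], 1),
    ([0.45, 0.4, 0.5], 0),
]
rng = random.Random(0)
attacks = {
    "random_perturbation": lambda x, y: random_perturbation(x, 0.1, rng),
    "gaussian_noise": lambda x, y: gaussian_noise(x, 0.1, rng),
}
report = robustness_report(predict, samples, attacks)
print(report)
```

Reporting clean and attacked accuracy side by side keeps the gap between them visible, and that gap is the quantity a clean validation set alone cannot show.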
Results and Interpretation
- The experiments showed a meaningful drop in model accuracy once adversarial perturbations were introduced.
- Gradient-based and iterative attacks provided a clearer picture of model fragility than clean-sample accuracy alone.
- Defensive training improved robustness, but the results also made clear that adversarial defense has to be continuously evaluated against changing attack assumptions.
- The work reinforced a production AI principle: model quality should be measured under realistic and adversarial stress, not only under ideal benchmark conditions.
Resources
The public implementation is available on GitHub, and the full thesis PDF is linked from this page for readers who want the complete academic writeup.