Adversarial Attacks And Defense Mechanisms In Deep Learning

Date

2025

Publisher

University of Bordj Bou Arreridj

Abstract

This work explores the adversarial vulnerabilities of deep learning models in image classification, with a focus on evaluating and defending against evasion-based attacks. Using the MNIST dataset and a ResNet18 architecture, we implemented several notable adversarial attacks, including FGSM, PGD, Clean Label, Backdoor (BadNet), and Square Attack.

To mitigate these threats, we applied a variety of defense mechanisms across three categories: preprocessing (Gaussian noise, bit-depth reduction, JPEG compression), training-based (adversarial training, label smoothing), and postprocessing (confidence thresholding, randomized smoothing). Evaluation was conducted using standard performance metrics and qualitative visualizations.

The results confirm the effectiveness of adversarial training and hybrid approaches in enhancing model robustness. This work provides a reproducible framework and contributes to ongoing efforts toward secure and resilient deep learning systems.
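For illustration, below is a minimal sketch of the FGSM attack and an FGSM-based adversarial training step, assuming a PyTorch classifier (such as the ResNet18 used here) and inputs scaled to [0, 1]. The function names and epsilon parameter are illustrative, not taken from the thesis code.

    import torch.nn.functional as F

    def fgsm_attack(model, images, labels, epsilon):
        # One-step FGSM: shift each pixel by epsilon in the direction
        # (sign of the loss gradient) that most increases the loss.
        images = images.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(images), labels)
        loss.backward()
        adv = images + epsilon * images.grad.sign()
        return adv.clamp(0.0, 1.0).detach()

    def adversarial_training_step(model, optimizer, images, labels, epsilon):
        # Adversarial training: fit the model on adversarial examples
        # generated on the fly so it learns to resist the perturbation.
        adv = fgsm_attack(model, images, labels, epsilon)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(adv), labels)
        loss.backward()
        optimizer.step()
        return loss.item()

PGD follows the same pattern, repeating the FGSM step several times with a projection back into an epsilon-ball around the original image after each iteration.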

Keywords

Deep learning, adversarial attacks, model robustness, image classification, adversarial training, defense mechanisms

Citation

MM/881
