Abstract
Deep neural networks are susceptible to various backdoor attacks, such as training-time attacks, in which the attacker injects a trigger pattern into a small portion of the dataset to control the model’s predictions at runtime. Backdoor attacks are dangerous because they do not degrade the model’s performance on clean inputs. This paper explores the feasibility of a new type of backdoor attack, a *data-free* backdoor. Unlike traditional backdoor attacks, which require poisoned data and injection during training, our approach, the Iterative Optimization Trigger Method (IOTM), enables trigger generation without compromising the integrity of the models and datasets. We propose an attack based on the IOTM technique, guided by an Adaptive Trigger Generator (ATG) and employing a custom objective function. ATG dynamically refines the trigger using feedback from the model’s predictions. We empirically evaluated the effectiveness of IOTM on three deep learning models (CNN, VGG16, and ResNet18) using the CIFAR10 dataset. The achieved Runtime-Attack Success Rate (R-ASR) varies across classes: for some classes the R-ASR reached 100%, whereas for others it reached 62%. Furthermore, we conducted an ablation study on the CIFAR100 dataset to investigate how critical factors in the runtime backdoor, including the optimizer, weight, “REG”, and trigger visibility, affect the R-ASR. We observed significant variations in the R-ASR when switching between Adam and SGD with and without momentum. The R-ASR reached 81.25% with the Adam optimizer, whereas SGD with and without momentum reached 46.87% and 3.12%, respectively.
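The general idea of refining a trigger from a frozen model's predictions can be illustrated with a minimal sketch. This is not the paper's IOTM or ATG implementation: the tiny linear softmax classifier, the plain gradient-descent update, the L2 penalty weighted by `reg`, and all function and variable names below are assumptions chosen only to keep the example self-contained. The only point it makes is that the trigger is optimized while the model weights and the data stay untouched.

```python
import math

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def logits(W, b, x):
    # Frozen linear model: one weight row and one bias per class.
    return [sum(w * xi for w, xi in zip(row, x)) + bk for row, bk in zip(W, b)]

def predict(W, b, x):
    z = logits(W, b, x)
    return max(range(len(z)), key=z.__getitem__)

def optimize_trigger(W, b, inputs, target, steps=200, lr=0.1, reg=0.01):
    """Gradient descent on an additive trigger; W and b are never modified."""
    d = len(inputs[0])
    trigger = [0.0] * d
    for _ in range(steps):
        # Gradient of the assumed penalty reg * ||trigger||^2.
        grad = [2.0 * reg * t for t in trigger]
        for x in inputs:
            xt = [xi + ti for xi, ti in zip(x, trigger)]
            p = softmax(logits(W, b, xt))
            # d(cross-entropy)/d(logit_k) = p_k - 1[k == target]
            for k, row in enumerate(W):
                err = p[k] - (1.0 if k == target else 0.0)
                for j in range(d):
                    grad[j] += err * row[j] / len(inputs)
        trigger = [t - lr * g for t, g in zip(trigger, grad)]
    return trigger

def runtime_asr(W, b, inputs, trigger, target):
    """Percentage of triggered inputs that the model assigns to the target class."""
    hits = sum(
        predict(W, b, [xi + ti for xi, ti in zip(x, trigger)]) == target
        for x in inputs
    )
    return 100.0 * hits / len(inputs)

# Hypothetical toy setup: 3 classes, 2 features, target class 2.
W = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
inputs = [[2.0, 0.0], [3.0, 1.0], [0.0, 2.0], [1.0, 3.0]]
target = 2

trigger = optimize_trigger(W, b, inputs, target)
print("R-ASR without trigger:", runtime_asr(W, b, inputs, [0.0, 0.0], target))
print("R-ASR with trigger:   ", runtime_asr(W, b, inputs, trigger, target))
```

The `reg` term here trades attack strength against trigger magnitude, which is one plausible reading of a visibility constraint; the objective actually used in the paper may differ.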
Original language | English |
---|---|
Pages (from-to) | 1-12 |
Number of pages | 12 |
Journal | IEEE Transactions on Artificial Intelligence |
Publication status | Accepted/In press - 2024 |
Keywords
- Adaptation models
- Cyber security
- Data models
- Data-free backdoor attack
- Deep learning
- Iterative methods
- Runtime
- Runtime attack
- Toxicology
- Training
- Training data
- Trigger generation