Abstract
Audio DeepFakes are artificially generated utterances created with deep learning methods whose main aim is to fool listeners; much of this audio is highly convincing. Its quality is sufficient to pose a serious threat in terms of security and privacy, such as the reliability of news or defamation. To counter these threats, multiple neural-network-based methods for detecting generated speech have been proposed. In this work, we cover the topic of adversarial attacks, which decrease the performance of detectors by adding superficial (difficult for a human to spot) changes to the input data. Our contribution consists of evaluating the robustness of 3 detection architectures against adversarial attacks in two scenarios (white-box and using a transferability mechanism) and subsequently enhancing their robustness through adversarial training performed with our novel adaptive training method.
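Below is a minimal, hedged sketch (PyTorch) of the two mechanisms the abstract refers to: an FGSM-style adversarial perturbation of an input waveform and a plain adversarial-training step. The `detector` module, the raw-waveform input convention and the label encoding are illustrative assumptions, and the sketch does not reproduce the paper's adaptive training method.

```python
# A minimal sketch of the ideas referenced above, assuming a detector that
# consumes raw waveforms in [-1, 1] and outputs two logits
# (index 0 = bona fide, index 1 = spoofed -- an assumed convention).
import torch
import torch.nn.functional as F


def fgsm_attack(detector: torch.nn.Module,
                waveform: torch.Tensor,   # (batch, num_samples)
                label: torch.Tensor,      # (batch,) class indices
                eps: float = 0.0005) -> torch.Tensor:
    """White-box FGSM: one gradient-sign step that nudges the waveform
    towards a misclassification while keeping the change barely audible."""
    waveform = waveform.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(detector(waveform), label)
    loss.backward()
    adv = waveform + eps * waveform.grad.sign()
    return adv.clamp(-1.0, 1.0).detach()


def adversarial_training_step(detector: torch.nn.Module,
                              optimizer: torch.optim.Optimizer,
                              waveform: torch.Tensor,
                              label: torch.Tensor,
                              eps: float = 0.0005) -> float:
    """Plain adversarial training (not the paper's adaptive variant):
    craft adversarial examples on the fly and train on them."""
    adv = fgsm_attack(detector, waveform, label, eps)
    optimizer.zero_grad()
    loss = F.cross_entropy(detector(adv), label)
    loss.backward()
    optimizer.step()
    return loss.item()
```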
Samples
We provide samples of successfully attacked utterances:
FGSM attacks
Sample Type | FGSM (ε=0.0005) | FGSM (ε=0.00075) | FGSM (ε=0.001)
---|---|---|---
Original | *(audio sample)* | *(audio sample)* | *(audio sample)*
Adversarial Attack | *(audio sample)* | *(audio sample)* | *(audio sample)*
PGDL2 attacks
Sample Type | PGDL2 (ε=0.10) | PGDL2 (ε=0.15) | PGDL2 (ε=0.20)
---|---|---|---
Original | *(audio sample)* | *(audio sample)* | *(audio sample)*
Adversarial Attack | *(audio sample)* | *(audio sample)* | *(audio sample)*
FAB attacks
Sample Type | FAB (η=10) | FAB (η=20) | FAB (η=30)
---|---|---|---
Original | *(audio sample)* | *(audio sample)* | *(audio sample)*
Adversarial Attack | *(audio sample)* | *(audio sample)* | *(audio sample)*
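Attacks like the ones sampled above can be generated with, e.g., the torchattacks library. The sketch below is a rough illustration only: the toy detector, the dummy batch, and most hyper-parameters are assumptions rather than the exact configuration used in the paper (in particular, the FAB η values listed above are not mapped to a specific torchattacks argument here).

```python
# Hypothetical end-to-end example using torchattacks; everything named below
# (toy detector, dummy data) is a stand-in for illustration only.
import torch
import torch.nn as nn
import torchattacks

# Toy stand-in detector: raw audio -> 2 logits (bona fide / spoofed).
detector = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=64, stride=16),
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
).eval()

attacks = {
    "FGSM (eps=0.0005)": torchattacks.FGSM(detector, eps=0.0005),
    "PGDL2 (eps=0.10)": torchattacks.PGDL2(detector, eps=0.10, alpha=0.02, steps=10),
    "FAB (L2)": torchattacks.FAB(detector, norm="L2", n_classes=2),
}

# Dummy batch of 1-second clips; note that torchattacks clamps inputs to
# [0, 1] by default, so real waveforms in [-1, 1] would need rescaling.
waveform = torch.rand(4, 1, 16000)
labels = torch.randint(0, 2, (4,))

for name, attack in attacks.items():
    adv = attack(waveform, labels)
    print(f"{name}: max |perturbation| = {(adv - waveform).abs().max():.6f}")
```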
Citation
If you want to cite this work, please use the following BibTeX entry:
@inproceedings{kawa23_interspeech,
  author={Piotr Kawa and Marcin Plata and Piotr Syga},
  title={{Defense Against Adversarial Attacks on Audio DeepFake Detection}},
  year={2023},
  booktitle={Proc. INTERSPEECH 2023},
  pages={5276--5280},
  doi={10.21437/Interspeech.2023-409}
}