Defense against Adversarial Attacks on Audio DeepFake Detection

Piotr Kawa*, Marcin Plata*, Piotr Syga

Department of Artificial Intelligence, Wrocław University of Science and Technology, Poland

* Equal contribution.
The paper is available here.

Abstract

Audio DeepFakes are artificially generated utterances created with deep learning methods, with the main aim of fooling the listeners; most such audio is highly convincing. Their quality is sufficient to pose a serious threat to security and privacy, for instance by undermining the reliability of news or enabling defamation. To counter these threats, multiple neural-network-based methods for detecting generated speech have been proposed. In this work, we cover the topic of adversarial attacks, which decrease the performance of detectors by adding superficial (difficult for a human to spot) changes to the input data. Our contribution consists of evaluating the robustness of three detection architectures against adversarial attacks in two scenarios (white-box and using the transferability mechanism), and then enhancing that robustness through adversarial training performed with our novel adaptive training method.
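To make the threat model concrete, the sketch below shows a white-box FGSM attack on a generic detector and one step of standard adversarial training on the resulting examples. This is a minimal PyTorch sketch under assumed names (model is a detector mapping waveforms to class logits, optimizer is its optimizer); the training step shown is the common baseline, not the adaptive training method introduced in the paper.

    import torch
    import torch.nn.functional as F

    def fgsm_attack(model, waveform, label, eps):
        """Single-step FGSM: shift the input by eps in the direction of the sign
        of the loss gradient (an L-infinity-bounded, hard-to-hear perturbation)."""
        waveform = waveform.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(waveform), label)  # detector outputs class logits
        model.zero_grad()
        loss.backward()
        return (waveform + eps * waveform.grad.sign()).detach()

    def adversarial_training_step(model, optimizer, waveform, label, eps=0.0005):
        """One step of standard adversarial training: craft adversarial examples
        against the current model and update the model on them.
        (Generic baseline only, not the paper's adaptive training method.)"""
        adv = fgsm_attack(model, waveform, label, eps)
        optimizer.zero_grad()
        loss = F.cross_entropy(model(adv), label)
        loss.backward()
        optimizer.step()
        return loss.item()

For FGSM, ε bounds the per-sample change of the waveform, which is why the perturbations at the values listed below remain difficult to notice by ear.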

Samples

We provide samples of successfully attacked utterances:

FGSM attacks

[Audio table: original samples and their adversarially attacked counterparts for FGSM with ε = 0.0005, 0.00075, and 0.001.]

PGDL2 attacks

[Audio table: original samples and their adversarially attacked counterparts for PGDL2 with ε = 0.10, 0.15, and 0.20.]

FAB attacks

[Audio table: original samples and their adversarially attacked counterparts for FAB with η = 10, 20, and 30.]
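This page does not state which toolkit was used to generate the clips above. As a rough illustration only, an off-the-shelf library such as torchattacks exposes these attacks through a one-line interface; the detector, data, and parameter values below are assumptions, and the exact attack configuration is given in the paper.

    import torchattacks

    # Assumed setup: detector is a trained DeepFake detection network returning
    # class logits, and (x, y) is a labelled batch of utterances in the format
    # the detector expects. The epsilons mirror the values listed above but are
    # illustrative only.
    fgsm   = torchattacks.FGSM(detector, eps=0.0005)
    pgd_l2 = torchattacks.PGDL2(detector, eps=0.10, alpha=0.02, steps=40)

    x_adv_fgsm = fgsm(x, y)     # adversarial counterparts of the original clips
    x_adv_pgd  = pgd_l2(x, y)
    # (torchattacks also implements FAB; its parameterisation is not mapped here.)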

Citation

If you want to cite this work, please use the following BibTeX entry:

@inproceedings{kawa23_interspeech,
    author={Piotr Kawa and Marcin Plata and Piotr Syga},
    title={{Defense Against Adversarial Attacks on Audio DeepFake Detection}},
    year=2023,
    booktitle={Proc. INTERSPEECH 2023},
    pages={5276--5280},
    doi={10.21437/Interspeech.2023-409}
}