A Review of Modern Audio Deepfake Detection Methods: Challenges and Future Directions

Authors:

Zaynab Almutairi

Where published:

Algorithms

 

Dataset names (used for):

  • The M-AILABS Speech: AD detection
  • Baidu Silicon Valley AI Lab cloned audio: neural voice cloning with a few samples
  • Fake or Real (FoR): synthetic speech detection
  • AR-DAD: Arabic Diversified Audio
  • H-Voice: used to train a machine learning system to classify original and fake voice recordings obtained with the imitation and Deep Voice algorithms
  • ASVspoof 2021 Challenge
  • FakeAVCeleb
  • ADD

 

Some description of the approach:

The article reviews existing audio deepfake (AD) detection methods and compares the available fake-audio datasets. It introduces the types of AD attacks, then analyzes the detection methods and datasets for imitation-based and synthetic-based deepfakes.

 

Some description of the data (number of data points, any other features that describe the data):

The paper reviews detection methods rather than reporting metrics for a single dataset. The datasets it discusses vary in type and size from study to study.

 

Keywords:

Audio Deepfakes (ADs); Machine Learning (ML); Deep Learning (DL); imitated audio

Instances Represent:

Various types of audio samples including real and synthetically generated voices.

Dataset Characteristics:

Varied, depending on the study and dataset discussed.

Subject Area:

Audio Security, Machine Learning

Associated Tasks:

Detection of audio deepfakes.

Feature Type:

Audio features such as Mel-spectrograms.
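For readers unfamiliar with the feature type named above, the following is a minimal sketch of how a log-mel spectrogram can be computed with NumPy alone. It is illustrative only and is not taken from the reviewed paper: the function names (hz_to_mel, mel_filterbank, mel_spectrogram), the parameter defaults (n_fft=512, hop=128, n_mels=40), and the 440 Hz test tone are all assumptions chosen for the example. Detection systems typically rely on an audio library rather than hand-written code like this.

```python
import numpy as np

def hz_to_mel(f):
    # Common (HTK-style) Hz-to-mel conversion.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # n_mels triangular filters, evenly spaced on the mel scale from 0 Hz to sr/2.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def mel_spectrogram(signal, sr, n_fft=512, hop=128, n_mels=40):
    # Assumes len(signal) >= n_fft. Returns an array of shape (n_frames, n_mels) in dB.
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2   # power spectrum per frame
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T  # pool FFT bins into mel bands
    return 10.0 * np.log10(mel + 1e-10)                # small offset avoids log(0)

# Hypothetical input: one second of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 440.0 * t)
spec = mel_spectrogram(tone, sr)
```

Each row of spec is one time frame and each column is one mel band, so a pure tone shows up as a single bright band that stays constant across frames.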

Main Paper Link


License: © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).


Last Accessed: 6/13/2024 (5:30 PM)

NSF Award #2346473