A Deep Learning Framework for Audio Deepfake Detection

Authors:

Janavi Khochare, Chaitali Joshi, Bakul Yenarkar, Shraddha Suratkar, Faruk Kazi

Where published:

Arabian Journal for Science and Engineering, 2021, Issue 3, pp. 3447-3458


Dataset names (used for):

  • Fake or Real (FoR)


Some description of the approach:

The authors used a Temporal Convolutional Network (TCN) and a Spatial Transformer Network (STN) to classify samples from the benchmark Fake or Real (FoR) dataset, with mel spectrograms of the audio as the input feature. Being limited to the FoR dataset reduces the generalizability of the model, since that dataset covers only one sub-type of audio deepfake: text-to-speech (TTS).
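The mel spectrogram front end mentioned above can be sketched in plain NumPy. This is not the authors' pipeline; the frame size, hop length, and number of mel bands below are illustrative assumptions, and in practice a library such as librosa would typically be used.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    # Triangular filters with center frequencies evenly spaced on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):          # rising slope
            if center > left:
                fb[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):         # falling slope
            if right > center:
                fb[i - 1, k] = (right - k) / (right - center)
    return fb

def mel_spectrogram(y, sr, n_fft=1024, hop=256, n_mels=80):
    # Frame the signal, window each frame, take the power spectrum,
    # then project onto the mel filterbank.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(y) - n_fft) // hop
    frames = np.stack([y[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    return power @ mel_filterbank(sr, n_fft, n_mels).T  # (frames, n_mels)

# Example: one second of a 440 Hz tone at 16 kHz (hypothetical input).
sr = 16000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440.0 * t)
S = mel_spectrogram(y, sr)
print(S.shape)  # (59, 80): 59 frames x 80 mel bands
```

The resulting 2-D array (time frames x mel bands) is the image-like representation that convolutional models such as a TCN or STN consume.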


Some description of the data (number of data points, any other features that describe the data):

The FoR dataset includes only text-to-speech (TTS) samples.


Keywords:

TTS, audio anti-spoofing

Instance Represent:

Hand-crafted acoustic features and more

Dataset Characteristics:

Only audio deepfake TTS samples

Subject Area:

Security of audio authentication systems

Dataset Link


Main Paper Link


License Link


Last Accessed: 11/26/2024

NSF Award #2346473