Authors:
Pascu, Octavian and Stan, Adriana and Oneata, Dan and Oneata, Elisabeta and Cucu, Horia
Where published:
INTERSPEECH
Dataset names (used for):
- ASVspoof 2019
- In-the-wild
- FoR
- MLAAD
- TIM
Some description of the approach:
For representation learning, this study adopts a common model previously used to improve generalization in image deepfake detection [7], with the aim of achieving generalization in SAD. As its calibration technique, it proposes a “direct method of estimating the uncertainty from the output probabilities of the detector, by computing the entropy over the outputs”. The study is considered state of the art in terms of generalization and calibration performance.
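The entropy-based uncertainty estimate quoted above can be illustrated with a minimal sketch. It assumes a binary detector that outputs a single probability that a clip is fake. The function names, the 0.5 decision threshold, and the `max_entropy` cut-off are hypothetical choices made for illustration and are not taken from the paper.

```python
import math


def binary_entropy(p_fake: float, eps: float = 1e-12) -> float:
    """Shannon entropy, in bits, of the two-class output [p_fake, 1 - p_fake]."""
    # Clamp the probability so that log2(0) is never evaluated.
    p = min(max(p_fake, eps), 1.0 - eps)
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def predict_with_uncertainty(p_fake: float, max_entropy: float = 0.5):
    """Return (label, entropy, is_confident) for one detector output.

    max_entropy is an illustrative cut-off (an assumption): predictions whose
    entropy exceeds it are flagged as uncertain.
    """
    label = "fake" if p_fake >= 0.5 else "real"
    entropy = binary_entropy(p_fake)
    return label, entropy, entropy <= max_entropy


if __name__ == "__main__":
    for p in (0.99, 0.55, 0.02):
        print(p, predict_with_uncertainty(p))
```

Entropy is highest (1 bit) when the detector outputs 0.5 and approaches 0 as the output nears 0 or 1, so a low value indicates a confident prediction. Whether that confidence is justified depends on how well the detector is calibrated.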
Some description of the data (number of data points, any other features that describe the data):
This paper focuses on data that includes both full utterances and partial fakes, covering diverse domains, languages and spoofing systems. Representations are extracted from self-supervised models (e.g., wav2vec 2.0, XLS-R).
Keywords:
Deepfake detection, anti-spoofing, pretrained representations
Instance Represent:
Audio clips (real or synthesized)
Dataset Characteristics:
N/A
Subject Area:
Security of audio authentication systems