Authors:
Tak, Hemlata and Patino, Jose and Todisco, Massimiliano and Nautsch, Andreas and Evans, Nicholas and Larcher, Anthony
Where published:
ICASSP
Dataset names (used for):
- ASVspoof 2019 Logical Access Dataset
Some description of the approach:
RawNet2 is a DNN-based model that takes raw audio waveforms as input. It combines the original RawNet1 [1] architecture with SincNet. “The first layer of RawNet2 is essentially the same as that of SincNet”: it operates directly on the raw waveform, which avoids handcrafted feature extraction, and it uses sinc functions as predefined filter shapes [2]. The upper layers of RawNet2 consist of the same residual blocks and GRU layer as RawNet1. The authors also apply “filter-wise feature map scaling (FMS) using a sigmoid function applied to residual block outputs”. FMS acts like an attention mechanism, yielding more discriminative representations.
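The two building blocks named above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation. The function names (sinc_bandpass, fms), the Hamming window, and the weights/biases arguments are assumptions made for the example. The sketch also assumes that the FMS scale vector comes from average pooling over time followed by a fully connected layer and a sigmoid, and that it is applied multiplicatively; the note does not state how the scale is combined with the feature map.

```python
import math


def sinc_bandpass(f_low, f_high, kernel_size, sample_rate):
    """Band-pass FIR kernel built as the difference of two sinc low-pass filters.

    In a SincNet-style layer only the cut-offs f_low and f_high (in Hz)
    would be learned; the filter shape itself is fixed by the sinc function.
    """
    def lowpass(cutoff, n):
        # Ideal low-pass impulse response with the cut-off in cycles per sample.
        fc = cutoff / sample_rate
        return 2 * fc if n == 0 else math.sin(2 * math.pi * fc * n) / (math.pi * n)

    half = (kernel_size - 1) / 2
    kernel = []
    for i in range(kernel_size):
        n = i - half
        # Hamming window to smooth the truncation of the ideal response (assumed).
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (kernel_size - 1))
        kernel.append((lowpass(f_high, n) - lowpass(f_low, n)) * window)
    return kernel


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fms(feature_map, weights, biases):
    """Filter-wise feature map scaling (multiplicative variant, assumed).

    feature_map: list of filters, each a list of activations over time.
    weights, biases: hypothetical parameters of the fully connected layer.
    Returns the scaled feature map and the per-filter scales.
    """
    # Average over time: one descriptor per filter.
    pooled = [sum(channel) / len(channel) for channel in feature_map]
    # Fully connected layer followed by a sigmoid: one scale in (0, 1) per filter.
    scales = [
        sigmoid(sum(w * p for w, p in zip(row, pooled)) + b)
        for row, b in zip(weights, biases)
    ]
    # Every time step of a filter is multiplied by that filter's scale.
    scaled = [[s * v for v in channel] for s, channel in zip(scales, feature_map)]
    return scaled, scales
```

Because each filter receives a single scale shared across all time steps, the operation reweights whole filters rather than individual samples, which is what gives it its attention-like character.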
Some description of the data (number of data points, any other features that describe the data):
Raw audio waveforms. The dataset covers 19 spoofing attack algorithms, labelled A01-A19: A01-A06 appear in the training and development partitions, and A07-A19 appear in the evaluation partition.
Keywords:
Audio anti-spoofing
Instances Represent:
Waveforms
Dataset Characteristics:
Raw audio waveforms, diverse spoofing attack algorithms.
Subject Area:
Security of audio authentication systems