End-to-End Anti-spoofing with RawNet2

Authors:

Tak, Hemlata and Patino, Jose and Todisco, Massimiliano and Nautsch, Andreas and Evans, Nicholas and Larcher, Anthony

Where published:

ICASSP

 

Dataset names (used for):

  • ASVspoof 2019 Logical Access Dataset

 

Some description of the approach:

RawNet2 is a DNN-based model that takes raw audio waveforms as input. It combines the original RawNet1 [1] architecture with SincNet. “The first layer of RawNet2 is essentially the same as that of SincNet”: SincNet processes raw audio waveforms directly, avoiding the need for handcrafted feature extraction, and employs sinc functions as predefined filter shapes [2]. The upper layers of RawNet2 consist of the same residual blocks and GRU layer as RawNet1. The authors also use “filter-wise feature map scaling (FMS) using a sigmoid function applied to residual block outputs”. FMS acts like an attention mechanism, providing more discriminative representations.
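
The two ideas above, a sinc-shaped band-pass filter in the first layer and sigmoid-based filter-wise scaling of a feature map, can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names (sinc_bandpass, fms), the kernel size, the cutoff frequencies, the Hamming window, and the random weights are all hypothetical, and the scaling is assumed to be purely multiplicative, with the scale vector computed from a time-average of each filter's output.

```python
import numpy as np

def sinc_bandpass(f_low, f_high, kernel_size=129, sample_rate=16000):
    """Band-pass FIR kernel built as the difference of two low-pass sinc filters.

    In a SincNet-style layer only the two cutoffs per filter would be learned;
    here they are fixed, hypothetical values.
    """
    n = np.arange(kernel_size) - (kernel_size - 1) / 2  # sample indices centred on zero
    f1 = f_low / sample_rate    # normalised low cutoff (cycles per sample)
    f2 = f_high / sample_rate   # normalised high cutoff (cycles per sample)
    # np.sinc(x) is sin(pi*x)/(pi*x), so 2*f*np.sinc(2*f*n) is an ideal low-pass at f.
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    return kernel * np.hamming(kernel_size)  # window to smooth the truncation

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fms(feature_map, weight, bias):
    """Filter-wise feature map scaling (multiplicative form, assumed here).

    feature_map: array of shape (filters, time), e.g. a residual block output.
    weight, bias: parameters of a hypothetical fully connected layer.
    """
    pooled = feature_map.mean(axis=1)            # one summary value per filter
    scale = sigmoid(weight @ pooled + bias)      # one scale in (0, 1) per filter
    return feature_map * scale[:, None]          # rescale every filter over time

# Example with placeholder values (not taken from the paper).
rng = np.random.default_rng(0)
kernel = sinc_bandpass(300.0, 3400.0)
block_output = rng.standard_normal((4, 50))      # 4 filters, 50 time steps
W = rng.standard_normal((4, 4))
b = np.zeros(4)
scaled = fms(block_output, W, b)
```

Because each scale lies between 0 and 1, filters judged less informative are attenuated relative to the others, which is the sense in which FMS behaves like an attention mechanism.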

 

Some description of the data (number of data points, any other features that describe the data):

Raw audio waveforms. The spoofed audio is generated by 19 different algorithms: A01-A06 appear in the training and development partitions, and A07-A19 appear in the evaluation partition.

 

Keywords:

Audio anti-spoofing

Instance Representation:

Waveforms

Dataset Characteristics:

Raw audio waveforms, diverse spoofing attack algorithms.

Subject Area:

Security of audio authentication systems

Dataset Link


Main Paper Link


License Link


Last Accessed: 11/26/2024

NSF Award #2346473