End-to-End Anti-spoofing with RawNet2

Authors:

Hemlata Tak, Jose Patino, Massimiliano Todisco, Andreas Nautsch, Nicholas Evans, Anthony Larcher

Where published:

ICASSP

Dataset names (used for):

ASVspoof 2019 Logical Access Dataset

Some description of the approach:

RawNet2 is a DNN-based model that takes raw audio waveforms as input. It combines the original RawNet1 [1] architecture with SincNet. “The first layer of RawNet2 is essentially the same as that of SincNet”, which processes raw audio waveforms directly and so avoids the need for handcrafted feature extraction. SincNet employs sinc functions as predefined filter shapes [2]. The upper layers of RawNet2 consist of the same residual blocks and GRU layer as RawNet1. The authors also use “filter-wise feature map scaling (FMS) using a sigmoid function applied to residual block outputs”. FMS acts like an attention mechanism, yielding more discriminative representations.
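The two building blocks named above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names (sinc_bandpass, fms), the Hamming window, and the dense layer that produces the scale vector are all assumptions made for the example. The FMS shown is a simple multiplicative variant; the exact formulation used in the paper may differ.

```python
import math


def sinc(x):
    """Unnormalised sinc, sin(x)/x, with the limit value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def sinc_bandpass(f_low, f_high, kernel_size, sample_rate):
    """Band-pass kernel built as the difference of two low-pass sinc filters.

    In a SincNet-style first layer only the two cut-off frequencies of each
    filter would be learned; here they are fixed inputs for illustration.
    kernel_size is assumed to be odd so the kernel is symmetric.
    """
    f1 = f_low / sample_rate   # normalised cut-offs (cycles per sample)
    f2 = f_high / sample_rate
    half = (kernel_size - 1) // 2
    kernel = []
    for i in range(kernel_size):
        n = i - half  # time index centred on zero
        ideal = (2 * f2 * sinc(2 * math.pi * f2 * n)
                 - 2 * f1 * sinc(2 * math.pi * f1 * n))
        # Hamming window (an assumption) to smooth the truncated response.
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (kernel_size - 1))
        kernel.append(ideal * window)
    return kernel


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fms(feature_map, weights, biases):
    """Multiplicative filter-wise feature map scaling (assumed variant).

    feature_map: list of filters, each a list of time samples.
    weights, biases: hypothetical dense-layer parameters that map the
    per-filter time averages to one scale per filter.
    """
    # Global average over time gives one summary value per filter.
    means = [sum(channel) / len(channel) for channel in feature_map]
    # A dense layer followed by a sigmoid gives a scale in (0, 1) per filter.
    scales = [
        sigmoid(sum(w * m for w, m in zip(row, means)) + b)
        for row, b in zip(weights, biases)
    ]
    # Every time sample of a filter is multiplied by that filter's scale.
    return [[v * s for v in channel]
            for channel, s in zip(feature_map, scales)]
```

For example, sinc_bandpass(300.0, 3000.0, 129, 16000) returns a 129-tap kernel passing roughly 300 to 3000 Hz at a 16 kHz sampling rate; these values are illustrative and are not taken from the paper. Because each FMS scale lies between 0 and 1, a filter whose scale is near 0 is suppressed and one whose scale is near 1 passes almost unchanged, which is the attention-like behaviour described above.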

Some description of the data (number of data points, any other features that describe the data):

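Raw audio waveforms. The dataset includes spoofed audio generated by 19 different algorithms, labelled A01-A19: A01-A06 appear in the training and development partitions, and A07-A19 in the evaluation partition.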

Keywords:

Audio anti-spoofing

Instances Represent:

Waveforms

Dataset Characteristics:

Raw audio waveforms, diverse spoofing attack algorithms.

Subject Area:

Security of audio authentication systems

Dataset Link

Main Paper Link

License Link

Last Accessed: 11/26/2024

NSF Award #2346473

Community Infrastructure to Strengthen AI for Audio Deepfake analysis (CISAAD)

College of Engineering and Information Technology
