Prior studies show that deepfake audio and natural human speech share some linguistic features. Some of those features are listed here, along with how you can differentiate between a real person speaking and computer-generated speech.
Watch the videos below to learn how to identify each linguistic feature when detecting spoofed audio.
Linguistic Cues of Audio Deepfakes – An Overview
Listening for Pitch
This video goes into detail about one of our Expert-Defined Linguistic Features (EDLFs), Pitch, showing you how to listen for this feature when you hear potentially fake audio.
Listening for Pause
This video gives a more in-depth discussion of Pause, one of our Expert-Defined Linguistic Features (EDLFs), and shows how to listen for this feature in a speech sample.
Listening for Consonant Bursts
This video defines Initial and Final Stop Consonant Bursts, one of our Expert-Defined Linguistic Features (EDLFs), and shows you how to identify this feature in stop consonants in spoken English.
Listening for Breath
In this video, we explain one of our Expert-Defined Linguistic Features (EDLFs), Breath, and show you how to listen for it in a speech sample.
Listening for Audio Quality
This video discusses Audio Quality, one of our Expert-Defined Linguistic Features (EDLFs) that can be used to spot fake speech.
Linguistic Cues of Deepfake Audio, in Review
In this video, we’ll briefly review how to use our five Expert-Defined Linguistic Features (EDLFs) as a tool to spot fake audio. Remember, human language is complex, and variation in speech is normal and natural. As you use EDLFs to help you spot misleading content, it’s important to always keep context (information about the individual speaker, intended audience, and setting) in mind when discerning real from fake.
Infographics by Pragya Pandit
Website Design by Lavanya Neelakandan