Deepfake Voice and Scam Call Detector

Nirmal Poreddiwar; Shubham Gale

Cite This Article :

Shubham Gale | Nirmal Poreddiwar "Deepfake Voice and Scam Call Detector" Published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | Recent Advances in Computer Applications and Information Technology, March 2026, pp.358-363, URL: https://www.ijtsrd.com/papers/ijtsrd101361.pdf

Abstract :
The landscape of voice security in 2026 has evolved into a hybrid ecosystem in which deepfake voice and scam call detectors combine two layers of defense: algorithmic precision and human intuition. On the technical front, modern detectors such as Pindrop Pulse and Reality Defender use "liveness" detection to scan for microscopic digital artifacts (spectral gaps, unnatural "robotic" prosody, and the absence of biological breathing patterns) that are inaudible to the human ear. These systems are now being integrated directly into carrier networks and smartphone operating systems, providing a real-time "Trust Score" for incoming calls. By leveraging the C2PA standard, which attaches signed provenance metadata to authentic media much as a digital watermark would, these tools can flag synthetic audio that lacks verified provenance, creating a technological shield against the initial wave of automated AI phishing and high-fidelity voice cloning. However, as AI models become more adept at mimicking human imperfections, the final and most critical line of defense remains Human-Centric Analysis. This approach moves beyond simple audio scanning to focus on "Social Engineering" detection, in which users are trained to identify the psychological red flags of a scam, such as manufactured urgency, requests for sensitive data, or subtle delays in response time during a conversation. Human-in-the-loop protocols involve "Challenge-Response" tests, in which the recipient asks the caller to recall a specific, un-indexed personal memory or to use a pre-arranged family "safe word." Because 2026 generative models still struggle with the high cognitive load of unexpected, non-linear interruptions, this human intervention acts as a "stress test" for the AI. Ultimately, the most resilient security posture treats the detector's output as a warning signal and relies on human verification and behavioral skepticism to confirm the caller's true identity.
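The posture described above, where the detector's score is only a warning and a challenge-response test confirms identity, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the `CallSignals` fields, the 0-to-1 trust score scale, the `warn_below` threshold, and the action names are hypothetical and are not taken from the paper or from any specific product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallSignals:
    trust_score: float                # hypothetical scale: 0.0 = likely synthetic, 1.0 = likely genuine
    red_flags: int                    # social-engineering cues noticed (urgency, data requests, odd delays)
    challenge_passed: Optional[bool]  # None = safe word / memory question not asked yet


def next_action(signals: CallSignals, warn_below: float = 0.5) -> str:
    """Suggest the recipient's next step; warn_below is an illustrative threshold."""
    # A failed challenge overrides any detector score.
    if signals.challenge_passed is False:
        return "hang_up"
    # Only a passed challenge confirms identity.
    if signals.challenge_passed is True:
        return "verified"
    # No challenge asked yet: the detector and the red flags act as warnings.
    if signals.trust_score < warn_below or signals.red_flags > 0:
        return "ask_challenge"
    # A clean score alone does not confirm who is calling.
    return "unverified"


print(next_action(CallSignals(trust_score=0.2, red_flags=0, challenge_passed=None)))
```

In this sketch a high trust score never yields "verified" by itself, which mirrors the point that the human check, not the detector, has the final say.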
The proliferation of generative adversarial networks (GANs) has facilitated the rise of high-fidelity voice cloning, enabling sophisticated "vishing" attacks that bypass traditional biometric and metadata-based security. This paper presents a dual-layered detection framework designed to identify deepfake audio and fraudulent intent in real-time telecommunications. The primary layer uses Mel-Frequency Cepstral Coefficients (MFCCs) and Constant-Q Transform (CQT) features to detect subtle spectral anomalies and phase inconsistencies inherent in synthetic speech. These features are processed by a Lightweight Convolutional Neural Network (LCNN) optimized for low-latency mobile environments. To supplement the acoustic analysis, a secondary Natural Language Processing (NLP) module employs a Bidirectional Encoder Representations from Transformers (BERT) model to analyze live transcripts for linguistic markers of social engineering and urgency. Experimental evaluation on the ASVspoof 2019 and 2021 datasets demonstrates that the proposed hybrid approach achieves a significantly lower Equal Error Rate (EER) than baseline models. The results indicate that integrating acoustic forensics with intent analysis provides a robust defense against the evolving landscape of AI-driven telecommunication fraud.
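For readers unfamiliar with the reported metric, the Equal Error Rate is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The following is a minimal Python sketch of that calculation, assuming each utterance receives a single score where higher means more likely bona fide. The function name and the toy scores are illustrative and are not results from the paper.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more likely bona fide'."""
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best = None
    for t in thresholds:
        # False rejection: bona fide speech scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed speech scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores for illustration only.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2f} at threshold {threshold}")
```

With these toy scores, one of four bona fide samples falls below 0.6 and one of four spoofed samples reaches it, so both error rates are 25% at that threshold. A lower EER indicates better separation between genuine and synthetic speech.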

Keywords :
Deepfake Detection, Voice Cloning, Vishing, Speech Forensics, Deep Learning, Telecommunication Security.

Publication Details:

Unique Identification Number : IJTSRD101361

Published In : Special Issue | Recent Advances in Computer Applications and Information Technology, March 2026

Page Number(s) : 358-363

Publisher Name : IJTSRD | www.ijtsrd.com | E-ISSN 2456-6470

Copyright © 2019 by author(s) and International Journal of Trend in Scientific Research and Development Journal. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0) (http://creativecommons.org/licenses/by/4.0)
