IJIRCST

1	Title of the Article	AI-Powered Pronunciation Mistake Detection Using Gemini 1.5 Flash: A Training-Free Approach
2	Author's name	Supritha P O: Assistant Professor, Department of Computer Science & Engineering, Sri Dharmasthala Manjunatheshwara Institute of Technology, Ujire, Karnataka, India
3	Author's name	Omkar Mahale, Shalya Gaonkar, Shetty Aditya Udaya, Sooraj Devadiga
4	Subject	Computer Science
5	Keyword(s)	Pronunciation Error Detection; Gemini 1.5 Flash; Speech Processing; Multimodal LLMs; Prompt Engineering; Phoneme Analysis; Real-Time Pronunciation Feedback; AI-Assisted Learning
6	Abstract	Pronunciation accuracy is a fundamental factor in effective language learning; however, many existing systems face difficulties in delivering real-time error analysis without relying on computationally intensive acoustic model training. This paper introduces an AI-driven pronunciation mistake detection system developed using Google Gemini 1.5 Flash, a low-latency multimodal large language model capable of directly processing spoken input. Unlike conventional approaches based on MFCC features or task-specific deep learning pipelines, the proposed system employs prompt-guided reasoning combined with algorithmic scoring methods to detect pronunciation errors at the word, phoneme, and prosodic levels. Learner speech is transmitted to the Gemini API, which generates a structured pronunciation analysis that includes phoneme-level interpretations and word-level discrepancies. These outputs are further processed by a custom scoring framework to evaluate pronunciation quality and produce clear, actionable feedback. Experimental evaluation using diverse English utterances demonstrates the system’s effectiveness in identifying vowel–consonant substitutions, omitted syllables, and stress-related errors. The findings underscore the potential of LLM-based audio reasoning as a lightweight, scalable, and real-time solution for automated pronunciation assessment.
7	Publisher	Innovative Research Publication
8	Journal Name; vol., no.	International Journal of Innovative Research in Computer Science & Technology (IJIRCST); Volume-14 Issue-1
9	Publication Date	January 2026
10	Type	Peer-reviewed Article
11	Format	PDF
12	Uniform Resource Identifier	https://ijircst.org/view_abstract.php?title=AI-Powered-Pronunciation-Mistake-Detection-Using-Gemini-1.5-Flash:-A-Training-Free-Approach&year=2026&vol=14&primary=QVJULTE0Mzg=
13	Digital Object Identifier (DOI)	10.55524/ijircst.2026.14.1.10 (https://doi.org/10.55524/ijircst.2026.14.1.10)
14	Language	English
15	Page No	79-88
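
The abstract describes a custom scoring framework that post-processes the model's word-level discrepancies, but this record does not specify how that framework works. The following is only a minimal sketch of one plausible word-level step, under stated assumptions: the function name `score_words`, the 0-100 scale, and the use of a plain transcript string as a stand-in for the model's structured output are all hypothetical, and no Gemini API call is shown.

```python
from difflib import SequenceMatcher


def score_words(expected, heard):
    """Align the target sentence with what was heard and list discrepancies.

    Hypothetical helper, not the authors' scoring method: `heard` stands in
    for a word-level transcript extracted from the model's analysis.
    """
    exp = expected.lower().split()
    got = heard.lower().split()
    issues = []
    correct = 0
    # autojunk=False keeps every word eligible for alignment.
    matcher = SequenceMatcher(a=exp, b=got, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            correct += i2 - i1
        elif tag == "replace":
            issues.append(("substituted", exp[i1:i2], got[j1:j2]))
        elif tag == "delete":
            issues.append(("omitted", exp[i1:i2], []))
        elif tag == "insert":
            issues.append(("inserted", [], got[j1:j2]))
    # Assumed scale: percentage of target words matched exactly.
    score = round(100 * correct / len(exp)) if exp else 0
    return {"score": score, "issues": issues}


result = score_words("she sells sea shells", "she sell sea")
print(result["score"])
for issue in result["issues"]:
    print(issue)
```

A word-level alignment like this would only cover the coarsest of the three levels named in the abstract; phoneme-level and prosodic scoring would need the richer structured output the paper describes.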