IJIRCST

1	Title of the Article	Speech Recognition Technologies: Design, Challenges, and Real-World Applications
2	Author's name	Maruti Maurya: Assistant Professor, Department of Computer Science and Engineering, Integral University, Lucknow, India
3	Author's name	Mohd Zaheer, Nawab Mohammad, Sadaf Siddiqui, Mohd Zeeshan Khan, Mohd Ayan Akram
4	Subject	Computer Science and Engineering
5	Keyword(s)	OpenAI Whisper Model, YouTube Audio Transcription, Word Error Rate (WER), Character Error Rate (CER), Multilingual Speech Recognition, Audio Preprocessing
6	Abstract	This paper presents an automatic speech recognition (ASR) system that transcribes audio from YouTube videos into text using OpenAI's Whisper model. The system combines yt_dlp, FFmpeg, and PyTorch into a speech-to-text pipeline. Given a video URL, it extracts and preprocesses the audio, transcribes it with Whisper, and evaluates transcription quality using Word Error Rate (WER), Character Error Rate (CER), and Match Error Rate (MER). The pipeline supports offline use, making it suitable for accessible, cost-effective deployment in educational, research, and assistive applications.
7	Publisher	Innovative Research Publication
8	Journal Name; vol., no.	International Journal of Innovative Research in Computer Science & Technology (IJIRCST); Volume-13 Issue-3
9	Publication Date	May 2025
10	Type	Peer-reviewed Article
11	Format	PDF
12	Uniform Resource Identifier	https://ijircst.org/view_abstract.php?title=Speech-Recognition-Technologies:-Design,-Challenges,-and-Real-World-Applications&year=2025&vol=13&primary=QVJULTEzNzI=
13	Digital Object Identifier(DOI)	10.55524/ijircst.2025.13.3.9 https://doi.org/10.55524/ijircst.2025.13.3.9
14	Language	English
15	Page No	55-61
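
The abstract names three evaluation metrics (WER, CER, MER) without defining them. The sketch below computes them from their commonly used definitions, with H, S, D and I the hits, substitutions, deletions and insertions of a minimum-edit alignment: WER = (S + D + I) / N over words, CER is the same ratio over characters, and MER = (S + D + I) / (H + S + D + I). It is an illustration only, not the authors' evaluation code; the function names `align_counts` and `error_rates` are hypothetical, and it assumes no text normalization (case and punctuation are compared as-is).

```python
from typing import Dict, Sequence, Tuple


def align_counts(ref: Sequence[str], hyp: Sequence[str]) -> Tuple[int, int, int, int]:
    """Return (hits, substitutions, deletions, insertions) for one
    minimum-edit (Levenshtein) alignment of hyp against ref."""
    n, m = len(ref), len(hyp)
    # Each cell holds (cost, hits, subs, dels, ins) for ref[:i] vs hyp[:j].
    # Row 0: an empty reference needs j insertions.
    prev = [(j, 0, 0, 0, j) for j in range(m + 1)]
    for i in range(1, n + 1):
        # Column 0: an empty hypothesis needs i deletions.
        curr = [(i, 0, 0, i, 0)]
        for j in range(1, m + 1):
            c, h, s, d, ins = prev[j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, h + 1, s, d, ins)      # hit
            else:
                diag = (c + 1, h, s + 1, d, ins)  # substitution
            c, h, s, d, ins = prev[j]
            delete = (c + 1, h, s, d + 1, ins)
            c, h, s, d, ins = curr[j - 1]
            insert = (c + 1, h, s, d, ins + 1)
            # Ties in cost are broken in the order diag, delete, insert.
            curr.append(min(diag, delete, insert, key=lambda t: t[0]))
        prev = curr
    _, hits, subs, dels, inserts = prev[m]
    return hits, subs, dels, inserts


def error_rates(reference: str, hypothesis: str) -> Dict[str, float]:
    """Compute WER, MER (word level) and CER (character level)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    h, s, d, i = align_counts(ref_words, hypothesis.split())
    word_errors = s + d + i
    _, cs, cd, ci = align_counts(list(reference), list(hypothesis))
    return {
        "wer": word_errors / len(ref_words),
        "mer": word_errors / (h + word_errors),
        "cer": (cs + cd + ci) / len(reference),
    }


if __name__ == "__main__":
    print(error_rates("the quick brown fox", "the quick brown dog jumps"))
```

In the example call, the alignment has three hits, one substitution and one insertion, so WER is 2/4 and MER is 2/5. MER stays within [0, 1] because insertions also enlarge its denominator, whereas WER can exceed 1 when the hypothesis is much longer than the reference.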