How to Use OpenAI's Whisper in React Native

Beto, October 27, 2025 · 10,458 views

Learn how to integrate OpenAI's Whisper speech recognition system directly into React Native apps to run fully offline. Whisper is an open-source, multilingual speech-to-text model that can transcribe and translate audio locally on Android and iOS devices.

You'll see a demo app that downloads Whisper models from Hugging Face, stores them locally using Expo File System, and runs live transcription sessions. This is perfect for apps needing real-time voice notes, translations, or audio transcription without network dependency or cost.

What's inside

  • Introduction to Whisper and its capabilities in React Native
  • Downloading and managing Whisper models from Hugging Face
  • Storing models locally with Expo File System
  • Running live speech transcription sessions on Android and iOS
  • Testing on real devices versus emulators
  • Transcribing audio files besides live speech
  • Multilingual transcription and translation features
  • Practical use cases and integration tips

Introduction to Whisper and its capabilities in React Native

Whisper is an automatic speech recognition system released by OpenAI in September 2022. Unlike large language models, Whisper focuses on turning speech into text, and it supports multiple languages. It can run locally on the device, enabling real-time transcription without internet access or API costs.

In React Native, you can use Whisper through the Whisper RN project (the whisper.rn package), which has existed for a couple of years. It lets you integrate offline speech recognition into your app on both Android and iOS. I emphasize Whisper's reliability and cost-effectiveness for voice-driven features.

Downloading and managing Whisper models from Hugging Face

Whisper models are hosted on Hugging Face, a platform for open-source machine learning models. I show how to download different Whisper model sizes (like tiny and small) directly to the user's device.

Model sizes vary significantly: the tiny model is about 74 MB and lightweight, while the small model is around 500 MB and more accurate but heavier. The app lets users select and download models on demand, with options to delete models to save space.
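
As a rough sketch of what on-demand model selection might look like, the snippet below keeps a small catalog and builds a download URL for each entry. The sizes are the approximate figures quoted above. The ggerganov/whisper.cpp repository path and the ggml-<name>.bin file naming are assumptions about where the converted models live on Hugging Face, and WhisperModel, modelUrl, fitsOnDevice, and downloadableModels are hypothetical names, not taken from the demo app.

```typescript
// Hypothetical catalog; sizes are the approximate figures quoted in this article.
interface WhisperModel {
  name: string; // e.g. "tiny" or "small"
  approxSizeMB: number;
}

const MODELS: WhisperModel[] = [
  { name: "tiny", approxSizeMB: 74 },
  { name: "small", approxSizeMB: 500 },
];

// Assumption: converted model files are served from this Hugging Face repo.
const HF_BASE = "https://huggingface.co/ggerganov/whisper.cpp/resolve/main";

function modelFileName(model: WhisperModel): string {
  return `ggml-${model.name}.bin`;
}

function modelUrl(model: WhisperModel): string {
  return `${HF_BASE}/${modelFileName(model)}`;
}

// Only offer a download when it fits while leaving a safety margin free.
function fitsOnDevice(model: WhisperModel, freeMB: number, marginMB: number = 100): boolean {
  return model.approxSizeMB + marginMB <= freeMB;
}

function downloadableModels(freeMB: number): WhisperModel[] {
  return MODELS.filter((m) => fitsOnDevice(m, freeMB));
}
```

With 200 MB free, only the tiny model would be offered; the 100 MB margin is an arbitrary default that an app would tune for its own needs.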

Storing models locally with Expo File System

To keep the models available offline, I demonstrate using Expo File System to save downloaded Whisper models on the device. This local storage approach ensures transcription works without Wi-Fi or cellular data.

The app also manages model files by allowing users to remove unwanted models. This is important for managing device storage, especially with larger models.
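
The download-once and delete-on-request behavior described above can be sketched independently of any particular file API. In the snippet below, FileOps is a hypothetical interface that an Expo app would back with Expo File System calls; createModelStore and the in-memory stand-in are illustrative only and are not the demo app's code.

```typescript
// Minimal file operations the store needs. In an Expo app these would be
// backed by Expo File System; the method names here are hypothetical.
interface FileOps {
  exists(path: string): Promise<boolean>;
  download(url: string, path: string): Promise<void>;
  remove(path: string): Promise<void>;
}

interface ModelStore {
  pathFor(fileName: string): string;
  ensure(url: string, fileName: string): Promise<string>;
  remove(fileName: string): Promise<void>;
}

function createModelStore(fs: FileOps, baseDir: string): ModelStore {
  const pathFor = (fileName: string): string => `${baseDir}${fileName}`;
  return {
    pathFor,
    // Download only when the file is not already on disk.
    async ensure(url: string, fileName: string): Promise<string> {
      const path = pathFor(fileName);
      if (!(await fs.exists(path))) {
        await fs.download(url, path);
      }
      return path;
    },
    // Removing a model that is not there is treated as a no-op.
    async remove(fileName: string): Promise<void> {
      const path = pathFor(fileName);
      if (await fs.exists(path)) {
        await fs.remove(path);
      }
    },
  };
}

// In-memory stand-in, useful for trying the logic without a device.
interface MemoryFileOps extends FileOps {
  files: Set<string>;
  downloadCount(): number;
}

function createMemoryFileOps(): MemoryFileOps {
  const files = new Set<string>();
  let downloads = 0;
  return {
    files,
    downloadCount: () => downloads,
    exists: async (path: string) => files.has(path),
    download: async (_url: string, path: string) => {
      downloads += 1;
      files.add(path);
    },
    remove: async (path: string) => {
      files.delete(path);
    },
  };
}
```

Injecting the file operations keeps the storage logic testable on a laptop, while the real implementation handles the actual download and disk access on the device.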

Running live speech transcription sessions on Android and iOS

The demo app supports live transcription sessions where users speak and get real-time text output. I show live demos on both Android and iOS devices.

I recommend testing on real devices, especially Android phones, because emulators may not handle live audio input well. iOS simulators worked better in the demo. The transcription is fairly accurate and responsive, though performance depends on device specs and background processes.
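
One way to keep the UI in sync during a live session is to fold incoming events into a single state object. The sketch below assumes a simplified event shape (isCapturing, text, error); these field names are assumptions for illustration and should be mapped from whatever the speech library actually emits.

```typescript
// Hypothetical event shape for a live session; adapt it to the library's events.
interface LiveEvent {
  isCapturing: boolean; // false once recording has stopped
  text?: string; // latest transcript for the session so far
  error?: string;
}

interface LiveState {
  status: "idle" | "listening" | "done" | "error";
  transcript: string;
  error?: string;
}

const initialLiveState: LiveState = { status: "idle", transcript: "" };

function reduceLive(state: LiveState, event: LiveEvent): LiveState {
  if (event.error) {
    // Keep whatever text was captured before the failure.
    return { ...state, status: "error", error: event.error };
  }
  // Keep the previous text when an event carries no new transcript.
  const transcript = event.text !== undefined ? event.text.trim() : state.transcript;
  return { status: event.isCapturing ? "listening" : "done", transcript };
}
```

A React component could hold this state with useReducer and dispatch each event from the session's callback.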

Testing on real devices versus emulators

I stress the importance of testing Whisper transcription on physical devices. Android emulators struggled with live audio input and translation, while iOS simulators performed better.

For production apps, real device testing is crucial to ensure smooth, reliable transcription experiences. This advice helps avoid common pitfalls during development.

Transcribing audio files besides live speech

Besides live sessions, Whisper can transcribe pre-recorded audio files. I show a button that processes a sample audio file and outputs the transcript.

This feature enables use cases like transcribing downloaded podcasts, YouTube audio, or voice memos. I caution against very long audio files (e.g., 2 hours), which might be too heavy to process locally, depending on the device.
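
To get a feel for why a 2-hour file can be heavy, it helps to estimate the decoded audio size. Whisper works on 16 kHz mono audio; assuming 16-bit PCM samples, that is 32,000 bytes per second. The sketch below does this arithmetic and plans fixed-length chunks. planChunks and the 10-minute chunk length used in the note below are hypothetical choices, and cutting at fixed boundaries can split words, so treat it as a starting point only.

```typescript
const SAMPLE_RATE_HZ = 16000; // Whisper's expected input sample rate
const BYTES_PER_SAMPLE = 2; // assumption: 16-bit mono PCM

// Size of the decoded audio in bytes for a given duration.
function pcmBytes(durationSec: number): number {
  return durationSec * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE;
}

interface Chunk {
  startSec: number;
  endSec: number;
}

// Split a recording into consecutive chunks; the last one may be shorter.
function planChunks(totalSec: number, chunkSec: number): Chunk[] {
  if (totalSec <= 0 || chunkSec <= 0) return [];
  const chunks: Chunk[] = [];
  for (let start = 0; start < totalSec; start += chunkSec) {
    chunks.push({ startSec: start, endSec: Math.min(start + chunkSec, totalSec) });
  }
  return chunks;
}
```

Under these assumptions, a 2-hour recording (7,200 seconds) decodes to 230,400,000 bytes, roughly 230 MB, and splits into 12 chunks of 10 minutes each.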

Multilingual transcription and translation features

Whisper supports multilingual audio recognition and can translate speech in other languages into English. I demonstrate speaking Spanish and getting an English transcript.

This is useful for apps that want to offer live translation or support multiple languages. The multilingual Whisper models can detect the spoken language automatically and, when translating, always output English. That is helpful since many models only understand English well.

Practical use cases and integration tips

Whisper unlocks many possibilities like voice note taking, transcription workflows, and AI-powered summarization. I mention integrating Whisper with AI to summarize transcripts or generate bullet points.

I also highlight Whisper's offline capability, which makes it cost-effective and privacy-friendly. I encourage developers to explore Whisper RN for React Native apps and share tips on model selection, storage management, and testing.

Resources

Course: React Native course

Premium resource: Pro membership