Integrating Speech-to-Text functionality into Django applications can significantly enhance user experience by allowing audio transcription directly within the app. According to AssemblyAI, developers can use its API and Python SDK to add this feature with a small amount of code.
Setting Up the Project
To get started, create a new project folder and establish a virtual environment:
    # Mac/Linux
    python3 -m venv venv
    . venv/bin/activate
    # Windows
    python -m venv venv
    .\venv\Scripts\activate.bat
Next, install the necessary packages, including Django, the AssemblyAI Python SDK, and python-dotenv:
    pip install Django assemblyai python-dotenv
Creating the Django Project
Create a new Django project named 'stt_project' and a new app within it called 'transcriptions':
    django-admin startproject stt_project
    cd stt_project
    python manage.py startapp transcriptions
Building the View
In the 'transcriptions' app, create a view to handle file uploads and transcriptions. Open transcriptions/views.py and add the following code:
    from django.shortcuts import render
    from django import forms
    import assemblyai as aai

    class UploadFileForm(forms.Form):
        audio_file = forms.FileField()

    def index(request):
        context = None
        if request.method == 'POST':
            form = UploadFileForm(request.POST, request.FILES)
            if form.is_valid():
                file = request.FILES['audio_file']
                transcriber = aai.Transcriber()
                # Blocks until the transcription has finished
                transcript = transcriber.transcribe(file.file)
                file.close()
                # Pass either the transcript text or the error to the template
                context = {'transcript': transcript.text} if not transcript.error else {'error': transcript.error}
        return render(request, 'transcriptions/index.html', context)
Defining URL Configuration
Map the view to a URL by creating transcriptions/urls.py:
    from django.urls import path
    from . import views

    urlpatterns = [
        path('', views.index, name='index'),
    ]
Include this app URL pattern in the global project URL configuration in stt_project/urls.py:
    from django.contrib import admin
    from django.urls import include, path

    urlpatterns = [
        path('', include('transcriptions.urls')),
        path('admin/', admin.site.urls),
    ]
Creating the HTML Template
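The view renders 'transcriptions/index.html', so create index.html inside the 'transcriptions/templates/transcriptions' directory, and make sure 'transcriptions' is listed in INSTALLED_APPS in stt_project/settings.py so that Django searches the app's templates folder. Add the following content: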
    <!DOCTYPE html>
    <html lang="en">
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>AssemblyAI Django App</title>
    </head>
    <body>
        <h1>Transcription App with AssemblyAI</h1>
        <form method="post" enctype="multipart/form-data">
            {% csrf_token %}
            <input type="file" accept="audio/*" name="audio_file">
            <button type="submit">Upload</button>
        </form>
        <h2>Transcript:</h2>
        {% if error %}
        <p style="color: red">{{ error }}</p>
        {% endif %}
        <p>{{ transcript }}</p>
    </body>
    </html>
Setting the API Key
Store the AssemblyAI API key in a .env file in the root directory:
    ASSEMBLYAI_API_KEY=your_api_key_here
Load this environment variable in stt_project/settings.py:
    from dotenv import load_dotenv

    load_dotenv()  # reads .env so ASSEMBLYAI_API_KEY is available in the environment
Running the Django App
Start the server using the following command:
    python manage.py runserver
Visit the app in your browser, upload an audio file, and see the transcribed text appear.
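If the upload comes back with an authentication error, the API key was most likely not loaded before the server started. The following is a minimal sketch of an early check, assuming the key lives in the ASSEMBLYAI_API_KEY variable from the .env step. The helper name `require_api_key` is hypothetical and is not part of Django or the AssemblyAI SDK.

```python
import os


def require_api_key(env=None, name="ASSEMBLYAI_API_KEY"):
    """Return the API key from the environment, or raise if it is missing.

    Hypothetical helper for illustration; call it after load_dotenv() has run.
    """
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    # Treat the placeholder from the .env example as "not configured"
    if not key or key == "your_api_key_here":
        raise RuntimeError(f"{name} is not set; check your .env file")
    return key
```

Calling `require_api_key()` at the bottom of settings.py, after `load_dotenv()`, would make a missing key fail at startup rather than on the first upload.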
Non-blocking Implementations
To avoid blocking operations, consider using webhooks or async functions. Webhooks notify you when the transcription is ready, while async calls allow the app to continue running during the transcription process.
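The difference between a blocking call and a future-based call can be sketched without touching the network. The example below is only an illustration: `fake_transcribe` is a hypothetical stand-in for a slow transcription request, and the thread pool plays the role the SDK would play when it hands back a future.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_transcribe(audio_name, delay=0.2):
    """Hypothetical stand-in for a slow transcription request."""
    time.sleep(delay)
    return f"transcript of {audio_name}"


def submit_transcription(executor, audio_name, delay=0.2):
    """Start the slow call in a worker thread and return a Future immediately."""
    return executor.submit(fake_transcribe, audio_name, delay)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as executor:
        start = time.monotonic()
        future = submit_transcription(executor, "meeting.mp3")
        print(f"submit returned after {time.monotonic() - start:.3f}s")
        # The caller is free to do other work here.
        print(future.result())  # blocks only at the point the result is needed
```

A blocking call would hold the request for the full duration of `fake_transcribe`, while the future lets the caller decide when, or whether, to wait.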
Using Webhooks
Set a webhook URL in the transcription config and handle the webhook delivery in a separate view function:
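    # Build an absolute URL (scheme and host included) that matches the 'webhook/' route
    webhook_url = request.build_absolute_uri('/webhook/')
    config = aai.TranscriptionConfig().set_webhook(webhook_url)
    # submit() returns without waiting for the transcription to finish
    transcriber.submit(file.file, config)
Define the webhook receiver:
    import json
    from django.http import HttpResponse
    from django.views.decorators.csrf import csrf_exempt

    @csrf_exempt  # the POST comes from AssemblyAI, so it carries no CSRF token
    def webhook(request):
        if request.method == 'POST':
            data = json.loads(request.body)
            transcript_id = data['transcript_id']
            transcript = aai.Transcript.get_by_id(transcript_id)
        return HttpResponse(status=200)  # a Django view must always return a response
Map this view to a URL:
    urlpatterns = [
        path('', views.index, name='index'),
        path('webhook/', views.webhook, name='webhook'),
    ]
Using Async Functions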
For non-blocking transcription, call transcribe_async, which returns a future instead of waiting for the result:
    transcript_future = transcriber.transcribe_async(file.file)
    # Later, check whether the result is ready before reading it
    if transcript_future.done():
        transcript = transcript_future.result()
Speech-to-Text Options for Django Apps
When implementing Speech-to-Text, consider cloud-based APIs like AssemblyAI or Google Cloud Speech-to-Text for high accuracy and scalability, or open-source libraries like SpeechRecognition and Whisper for greater control and privacy.
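One way to keep that choice open is to hide the provider behind a single function, so the Django view does not depend on a specific service. The sketch below is an assumption-laden illustration, not a recommended architecture: the names `BACKENDS`, `transcribe`, `transcribe_with_assemblyai`, and `transcribe_with_whisper` are hypothetical, the AssemblyAI backend assumes ASSEMBLYAI_API_KEY is set in the environment, and the Whisper backend assumes the open-source `openai-whisper` package and ffmpeg are installed.

```python
from typing import Callable, Dict, Optional


def transcribe_with_assemblyai(path: str) -> str:
    """Cloud backend; imports lazily so the SDK is only needed when used."""
    import assemblyai as aai

    transcript = aai.Transcriber().transcribe(path)
    if transcript.error:
        raise RuntimeError(transcript.error)
    return transcript.text


def transcribe_with_whisper(path: str) -> str:
    """Local backend; runs the open-source Whisper model on this machine."""
    import whisper

    model = whisper.load_model("base")
    return model.transcribe(path)["text"]


# Hypothetical registry mapping a backend name to its implementation
BACKENDS: Dict[str, Callable[[str], str]] = {
    "assemblyai": transcribe_with_assemblyai,
    "whisper": transcribe_with_whisper,
}


def transcribe(
    path: str,
    backend: str = "assemblyai",
    backends: Optional[Dict[str, Callable[[str], str]]] = None,
) -> str:
    """Dispatch to the selected backend, failing clearly on unknown names."""
    registry = BACKENDS if backends is None else backends
    if backend not in registry:
        raise ValueError(f"unknown backend {backend!r}; choose from {sorted(registry)}")
    return registry[backend](path)
```

With this shape, a view would call `transcribe(path, backend="whisper")` to keep audio on the server, or keep the default to use the cloud API, without other changes.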
Conclusion
This guide shows how to integrate Speech-to-Text into Django apps using the AssemblyAI API. Developers can choose between blocking and non-blocking implementations and select the best Speech-to-Text solution based on their needs.
For more details, visit the AssemblyAI blog.