This project provides an API with user-level access support for transcribing speech to text using a fine-tuned and processed Whisper ASR model.
This open-source project provides a self-hostable API for speech-to-text transcription using a fine-tuned Whisper ASR model. The API converts audio files to text through HTTP requests, which makes it well suited to adding speech recognition to your applications.
Key features:
This repository contains the code for deploying the API server, as well as for fine-tuning and quantizing models. Check out the documentation to get started!
To install the necessary dependencies, run the following command:
# Install ffmpeg for Audio Processing
sudo apt install ffmpeg
# Install Python Package
pip install -r requirements.txt
To run the project, use the following command:
uvicorn app.main:app --reload
To get your token, use the following command:
curl -X 'POST' \
'https://innovatorved-whisper-api.hf.space/api/v1/users/get_token' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"email": "[email protected]",
"password": "password"
}'
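The same token request can be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_token_request` and `get_token_response` are hypothetical and not part of this project. The shape of the JSON response is not documented here, so the sketch returns the parsed body as-is rather than assuming a field name.

```python
import json
import urllib.request


def build_token_request(base_url: str, email: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for /api/v1/users/get_token."""
    # Same JSON body as the curl command: an email and a password.
    payload = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/users/get_token",
        data=payload,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def get_token_response(base_url: str, email: str, password: str) -> dict:
    """Send the request and return the decoded JSON response unchanged."""
    request = build_token_request(base_url, email, password)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `get_token_response("http://localhost:8000", email, password)` against a running server should return whatever JSON the endpoint produces; inspect it to find the token value.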
To upload a file and transcribe it, use the following command:
Note: The token shown in the command is a dummy token and will not work. Please use the token provided by the admin.
Here are the available models:
# Modify the token and audioFilePath
curl -X 'POST' \
'http://localhost:8000/api/v1/transcribe/?model=tiny.en.q5' \
-H 'accept: application/json' \
-H 'Authentication: e9b7658aa93342c492fa64153849c68b8md9uBmaqCwKq4VcgkuBD0G54FmsE8JT' \
-H 'Content-Type: multipart/form-data' \
-F '[email protected];type=audio/wav'
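For applications that call the API from Python, the upload can be sketched with the standard library as follows. This is an illustration under stated assumptions, not the project's own client: the helper name `build_transcribe_request` is hypothetical, and the multipart field name `file` is an assumption, since the form field is not legible in the command above. The `Authentication` header and the `model` query parameter are taken from the curl command.

```python
import mimetypes
import urllib.parse
import urllib.request
import uuid
from pathlib import Path


def build_transcribe_request(
    base_url: str, token: str, audio_path: str, model: str = "tiny.en.q5"
) -> urllib.request.Request:
    """Build (but do not send) a multipart POST for /api/v1/transcribe/."""
    path = Path(audio_path)
    boundary = uuid.uuid4().hex  # random separator for the multipart body
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    # Assumption: the server reads the upload from a form field named "file".
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'.encode(),
        f"Content-Type: {content_type}\r\n\r\n".encode(),
        path.read_bytes(),
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    query = urllib.parse.urlencode({"model": model})
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1/transcribe/?{query}",
        data=body,
        headers={
            "accept": "application/json",
            "Authentication": token,  # header name as used in the curl command
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request. Keeping construction separate from sending makes it possible to inspect the URL, headers, and body without a running server.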
Just try to be a developer!
For support, email [email protected]