voice2json

Command-line tools for speech and intent recognition on Linux

voice2json is a collection of command-line tools for offline speech/intent recognition on Linux. It is free, open source (MIT), and supports 18 human languages.

From the command line:

$ voice2json -p en transcribe-wav \
      < turn-on-the-light.wav | \
      voice2json -p en recognize-intent | \
      jq .

produces a JSON event like:

{
    "text": "turn on the light",
    "intent": {
        "name": "LightState"
    },
    "slots": {
        "state": "on"
    }
}

when trained with this template:

[LightState]
states = (on | off)
turn (<states>){state} [the] light
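To make the template concrete, the sketch below enumerates the sentences it appears to describe, reading `(on | off)` as a choice, `[the]` as an optional word, and `{state}` as a tag that records the chosen word in the `state` slot. This is a hand-written model of this one template for illustration, not voice2json's own parser; the function name `expand_light_state` is hypothetical.

```python
from itertools import product

# Hand-written model of the [LightState] template above.
# Illustration only; this is not how voice2json parses templates.
STATES = ["on", "off"]       # states = (on | off)
OPTIONAL_THE = ["the", ""]   # [the] may be present or absent


def expand_light_state():
    """Yield (sentence, slots) pairs that the template can match."""
    for state, the in product(STATES, OPTIONAL_THE):
        words = ["turn", state, the, "light"]
        # Drop the empty string left behind when [the] is omitted.
        sentence = " ".join(word for word in words if word)
        yield sentence, {"state": state}


if __name__ == "__main__":
    for sentence, slots in expand_light_state():
        print(sentence, slots)
```

Under this reading, the three template lines cover four sentences, each carrying a `state` slot of either `on` or `off`.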

voice2json is optimized for:

It can be used to:

Supported speech to text systems include:

Supported Languages

voice2json is more than just a wrapper around open source speech to text systems!

Training produces both a speech recognizer and an intent recognizer. By describing your voice commands with voice2json’s templating language, you get intent recognition on top of transcription at no extra cost.
Re-training is fast enough to be done at runtime (usually < 5s), even with millions of possible voice commands. This means you can change referenced slot values or add/remove intents on the fly.
All of the available commands are designed to work well in Unix pipelines, typically consuming/emitting plaintext or newline-delimited JSON. Audio input/output is file-based, so you can receive audio from any source.
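Because the commands emit newline-delimited JSON, a downstream consumer can treat each line as one event. The following is a minimal sketch of such a consumer, assuming events shaped like the `LightState` example earlier (`text`, `intent.name`, `slots`); the helper names `read_events` and `describe` are hypothetical and not part of voice2json.

```python
import json

# One event per line, in the shape of the LightState example above.
SAMPLE = (
    '{"text": "turn on the light", '
    '"intent": {"name": "LightState"}, "slots": {"state": "on"}}\n'
)


def read_events(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def describe(event):
    """Return (intent name, slots) for an event, tolerating missing fields."""
    name = event.get("intent", {}).get("name", "")
    slots = event.get("slots", {})
    return name, slots


if __name__ == "__main__":
    # Any iterable of lines works here; pass sys.stdin to read from a pipe.
    for event in read_events(SAMPLE.splitlines()):
        name, slots = describe(event)
        print(f"{name}: {slots}")
```

Reading line by line rather than loading the whole input at once lets the same loop follow a long-running command that keeps producing events.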

download-profile - Download missing files for a profile
train-profile - Generate speech/intent artifacts
transcribe-wav - Transcribe WAV file to text
    - Add --open for unrestricted speech to text
transcribe-stream - Transcribe live audio stream to text
    - Add --open for unrestricted speech to text
recognize-intent - Recognize intent from JSON or text
wait-wake - Listen to live audio stream for wake word
record-command - Record voice command from live audio stream
pronounce-word - Look up or guess how a word is pronounced
generate-examples - Generate random intents
record-examples - Generate and record speech examples
test-examples - Test recorded speech examples
show-documentation - Run HTTP server locally with documentation
print-profile - Print profile settings
print-downloads - Print profile file download information
print-files - Print user profile files for backup
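The commands above can also be driven from a script. The sketch below reproduces the `transcribe-wav | recognize-intent` pipeline from the opening example using Python's `subprocess` module. It assumes `voice2json` is on the `PATH` and that an `en` profile has been downloaded and trained; the helper names are hypothetical, and placing extra flags such as `--open` after the command name is an assumption rather than something this page specifies.

```python
import subprocess


def voice2json_argv(command, profile="en", *extra):
    """Build an argument list such as: voice2json -p en transcribe-wav."""
    # Assumption: command-specific flags follow the command name.
    return ["voice2json", "-p", profile, command, *extra]


def transcribe_and_recognize(wav_path, profile="en"):
    """Feed a WAV file to transcribe-wav, then pipe the result to recognize-intent."""
    with open(wav_path, "rb") as wav:
        transcribed = subprocess.run(
            voice2json_argv("transcribe-wav", profile),
            stdin=wav,
            capture_output=True,
            check=True,  # raise CalledProcessError on a non-zero exit status
        )
    recognized = subprocess.run(
        voice2json_argv("recognize-intent", profile),
        input=transcribed.stdout,
        capture_output=True,
        check=True,
    )
    # Returned as text: newline-delimited JSON, one event per line.
    return recognized.stdout.decode("utf-8")
```

For example, `transcribe_and_recognize("turn-on-the-light.wav")` would run the same two commands as the shell pipeline at the top of this page, without the final `jq` step.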