A speech-to-text-to-speech program that does things.

vocalization

Vocalization (from now on just voc) is a speech-to-text-to-speech program.

The process it uses is as follows:

  1. Loads a whisper.cpp-compatible model (available from https://huggingface.co/ggerganov/whisper.cpp)
  2. Initializes access to espeak
  3. Listens to the microphone live, applying the settings from config.json (which is live-reloaded)
  4. Sends the audio to whisper to decode
    • If there is roughly 1s of audio or more, it goes through as-is
    • If there is less than roughly 1s of audio, it is padded with silence first
  5. Sends the decoded text to espeak to be spoken aloud
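
The padding in step 4 can be sketched roughly as follows. This is a minimal illustration, not voc's actual code: the function name `pad_with_silence` is hypothetical, and it assumes the audio is already a buffer of mono `f32` samples at 16 kHz (the input format whisper.cpp works with) and that "silence" means zero-valued samples appended at the end.

```rust
/// Pads `samples` with trailing silence (zero-valued samples) until the
/// buffer holds at least `min_secs` seconds of audio at `sample_rate` Hz.
/// Buffers that are already long enough are returned unchanged.
///
/// Hypothetical helper: the name and signature are assumptions, not voc's API.
fn pad_with_silence(mut samples: Vec<f32>, sample_rate: u32, min_secs: f32) -> Vec<f32> {
    // Minimum number of samples needed to cover `min_secs` seconds.
    let min_len = (sample_rate as f32 * min_secs).ceil() as usize;
    if samples.len() < min_len {
        // `resize` appends zeros, keeping the original audio at the front.
        samples.resize(min_len, 0.0);
    }
    samples
}
```

For example, a quarter-second clip (4000 samples at 16 kHz) passed in with `min_secs` set to `1.0` comes back as 16000 samples, with the original audio at the front and zeros after it.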

Future Goals

  • Integrate a neural TTS (piper is the leading option)
  • Output to a fake microphone instead
  • UX improvements
  • Better silence detection and noise suppression

Usage

It's evil, you shouldn't. It's ROCm-only and probably requires some system libraries that I can't list here, because I already had them installed and don't know which ones they are.

But if you want to:

  1. Change the model path in the config.json file
  2. Start the program and let it do its work
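
Since config.json is described as live-reloaded, edits to it should not need a restart. One way such reloading could work, sketched with only the standard library, is to poll the file's modification time and re-read it when that changes. The names `ConfigWatcher` and `poll` are hypothetical, and voc may do this differently (for example, with a file-watching crate); parsing the JSON is left out because the config's fields aren't documented here.

```rust
use std::fs;
use std::io;
use std::path::PathBuf;
use std::time::SystemTime;

/// Tracks a config file and reports its contents whenever it changes.
/// Hypothetical type for illustration; not taken from voc's source.
struct ConfigWatcher {
    path: PathBuf,
    /// Modification time seen on the last successful poll, if any.
    last_modified: Option<SystemTime>,
}

impl ConfigWatcher {
    fn new(path: impl Into<PathBuf>) -> Self {
        Self { path: path.into(), last_modified: None }
    }

    /// Returns `Ok(Some(contents))` on the first call and whenever the
    /// file's modification time differs from the last one seen;
    /// returns `Ok(None)` if the file looks unchanged.
    fn poll(&mut self) -> io::Result<Option<String>> {
        let modified = fs::metadata(&self.path)?.modified()?;
        if self.last_modified == Some(modified) {
            return Ok(None);
        }
        self.last_modified = Some(modified);
        fs::read_to_string(&self.path).map(Some)
    }
}
```

A listening loop could call `poll()` between audio chunks and apply the new settings only when it returns `Some`. Polling is simple and portable, at the cost of noticing changes only as often as it is called.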

Extras

Yes I know I've committed the config. No I don't think it matters. ;3 There's nothing in the config that's useful to anyone outside of my home.