Getting Started
Build babylon.cpp from source, download the required model files, and run your first phonemization and speech synthesis in a few minutes.
Prerequisites
The following tools must be available on your system before building:
| Tool | Version | Notes |
|---|---|---|
| CMake | 3.18+ | Used to configure and build the project |
| C++17 compiler | GCC, Clang, or MSVC | MSVC requires Visual Studio 2019+ |
| Git | Any recent version | Required for submodule checkout |
| Xcode CLI Tools | — | macOS only |
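Before building, it can help to confirm that the tools in the table are on your PATH. The script below is a minimal sketch and not part of babylon.cpp: `version_ge` and `check_tool` are hypothetical helper names, and the CMake version is assumed to be the third word of the first line of `cmake --version`.

```shell
#!/bin/sh
# Hypothetical pre-build check; helper names are illustrative only.

# version_ge A B: succeeds when dotted version A is >= dotted version B.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        for (i = 1; i <= (n > m ? n : m); i++) {
            xi = x[i] + 0; yi = y[i] + 0   # missing fields count as 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# check_tool NAME: reports whether NAME is on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

check_tool git || true
check_tool cmake || true

# Compare the installed CMake version against the 3.18 minimum from the table.
if command -v cmake >/dev/null 2>&1; then
    cmake_ver=$(cmake --version | head -n 1 | awk '{print $3}')
    version_ge "$cmake_ver" 3.18 || echo "cmake $cmake_ver is older than 3.18"
fi
```

The script only reports problems and never aborts, so it is safe to run before any of the build steps. It does not check for a C++17 compiler, since the right command name depends on your toolchain.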
Clone the repository
Babylon.cpp uses Git submodules for ONNX Runtime and other dependencies. The --recursive flag is required.
git clone --recursive https://github.com/Mobile-Artificial-Intelligence/babylon.cpp.git
cd babylon.cpp
Build targets
The project provides a Makefile that wraps CMake for common build scenarios. All output is placed in bin/.
| Target | Description |
|---|---|
| make lib | Build the shared library only |
| make cli | Library + CLI binary + runtime files copied to bin/ |
| make debug | CLI build in Debug mode |
| make android | Cross-compile for Android (requires NDK r27+) |
For most users, make cli is the right starting point. It also copies data/config.json, data/index.html, and the models/ directory into bin/.
make cli
Download model files
Babylon.cpp requires external ONNX model files that are not bundled in the repository. Place them in the bin/models/ directory (created by make cli), or configure custom paths in bin/config.json.
| File | Description |
|---|---|
| open-phonemizer.onnx | Neural G2P model (Open Phonemizer) |
| dictionary.json | ~130,000-entry pronunciation dictionary |
| kokoro-quantized.onnx | Kokoro TTS model (recommended engine) |
| voices/*.bin | Kokoro voice style embeddings (one file per voice) |
| curie.onnx | VITS TTS model (optional, Piper-compatible) |
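A quick way to see which of the files above are still missing is sketched below. `missing_models` is a hypothetical helper, not a babylon.cpp command. It assumes the default layout described above (files directly under the models directory, voices under `voices/`) and skips the optional `curie.onnx`.

```shell
#!/bin/sh
# Hypothetical helper: print each required file from the table that is
# absent from the given models directory. Prints nothing when all are present.
missing_models() {
    dir=$1
    for f in open-phonemizer.onnx dictionary.json kokoro-quantized.onnx; do
        [ -f "$dir/$f" ] || echo "$f"
    done
    # Expect at least one voice embedding; an unmatched glob stays literal
    # and therefore fails the -f test.
    set -- "$dir"/voices/*.bin
    [ -f "$1" ] || echo "voices/*.bin"
}

# Usage, from the repository root after `make cli`:
#   missing_models bin/models
```

If you configured custom paths in bin/config.json, point the helper at those locations instead.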
Configure model paths
On startup, the babylon binary automatically loads config.json from the same directory as the executable. Edit it to point to your model files:
{
  "phonemizer_model": "models/open-phonemizer.onnx",
  "dictionary": "models/dictionary.json",
  "kokoro_model": "models/kokoro-quantized.onnx",
  "kokoro_voice": "en-US-heart",
  "kokoro_voices": "models/voices"
}
Paths are resolved relative to the config file. You can also override any key with a CLI flag at runtime.
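To make "relative to the config file" concrete, the sketch below joins a configured path onto the directory that holds config.json. `resolve_from_config` is a hypothetical helper used only for illustration, and passing absolute paths through unchanged is an assumption, not documented babylon.cpp behaviour.

```shell
#!/bin/sh
# resolve_from_config CONFIG PATH: print PATH as seen from CONFIG's directory.
resolve_from_config() {
    case $2 in
        /*) printf '%s\n' "$2" ;;                       # assumption: absolute paths are kept as-is
        *)  printf '%s/%s\n' "$(dirname "$1")" "$2" ;;  # relative paths hang off the config directory
    esac
}

resolve_from_config bin/config.json models/open-phonemizer.onnx
# prints bin/models/open-phonemizer.onnx
```

In practice this means the default config keeps working if you move the whole bin/ directory, as long as models/ moves with it.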
Run your first synthesis
From the bin/ directory, run:
# Phonemize text to IPA
./babylon phonemize "Hello world"
# → hɛloʊ wɜːld
# Synthesise speech (writes output.wav by default)
./babylon tts "Hello world"
# Choose a voice and output path
./babylon tts --voice en-US-nova --speed 1.2 "Hello world" -o hello.wav
Android builds
Cross-compiling for Android requires the Android NDK r27 or later. Set the ANDROID_NDK_HOME environment variable before running make android:
export ANDROID_NDK_HOME=/path/to/android-ndk-r27
make android
The output library (libbabylon.so) is placed in bin/android/ for each supported ABI.
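Since the Android build depends on ANDROID_NDK_HOME, a small guard can catch a missing or mistyped path before the build starts. This is a sketch only: `check_ndk` is a hypothetical helper, and it checks that the directory exists, not that the NDK is r27 or later.

```shell
#!/bin/sh
# check_ndk: succeeds when ANDROID_NDK_HOME is set and names an existing directory.
check_ndk() {
    if [ -z "${ANDROID_NDK_HOME:-}" ]; then
        echo "ANDROID_NDK_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$ANDROID_NDK_HOME" ]; then
        echo "ANDROID_NDK_HOME is not a directory: $ANDROID_NDK_HOME" >&2
        return 1
    fi
}

# Usage: only start the build when the guard passes.
#   check_ndk && make android
```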