Getting Started

Build babylon.cpp from source, download the required model files, and run your first phonemization and speech synthesis in a few minutes.

Prerequisites

The following tools must be available on your system before building:

| Tool | Version | Notes |
| --- | --- | --- |
| CMake | 3.18+ | Used to configure and build the project |
| C++17 compiler | GCC, Clang, or MSVC | MSVC requires Visual Studio 2019+ |
| Git | Any recent version | Required for submodule checkout |
| Xcode CLI Tools | | macOS only |
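
Before building, it can help to confirm that the tools are on your PATH. The following is a minimal sketch, not part of babylon.cpp: `missing_tools` is a hypothetical helper, and the command name `c++` is an assumption about how your compiler is installed. It only checks presence, not versions, so the CMake 3.18+ requirement still needs a manual `cmake --version`.

```shell
# Hypothetical helper: print each given command that is NOT found on PATH,
# one per line. Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Example call (commented out so sourcing this snippet has no side effects).
# "c++" is an assumed compiler command name; substitute g++, clang++, etc.
# missing_tools cmake git c++
```

An empty result means all listed commands were found.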

Clone the repository

Babylon.cpp uses Git submodules for ONNX Runtime and other dependencies. The --recursive flag is required.

git clone --recursive https://github.com/Mobile-Artificial-Intelligence/babylon.cpp.git
cd babylon.cpp
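
If the repository was already cloned without --recursive, the submodules can usually be fetched afterwards with standard Git commands. The sketch below assumes nothing about babylon.cpp beyond its use of submodules; `count_uninitialized` is a hypothetical helper that relies on Git's convention of prefixing uninitialized entries in `git submodule status` output with `-`.

```shell
# Hypothetical helper: read `git submodule status` output on stdin and print
# how many submodules are uninitialized (Git marks those lines with "-").
count_uninitialized() {
    # grep -c exits non-zero when the count is 0; "|| true" keeps that quiet.
    grep -c '^-' || true
}

# Example calls, run from the repository root (commented out so sourcing
# this snippet has no side effects):
# git submodule status --recursive | count_uninitialized
# git submodule update --init --recursive
```

A non-zero count suggests running the `git submodule update --init --recursive` line before building.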

Build targets

The project provides a Makefile that wraps CMake for common build scenarios. All output is placed in bin/.

| Target | Description |
| --- | --- |
| make lib | Build the shared library only |
| make cli | Library + CLI binary + runtime files copied to bin/ |
| make debug | CLI build in Debug mode |
| make android | Cross-compile for Android (requires NDK r27+) |

For most users, make cli is the right starting point. It also copies data/config.json, data/index.html, and the models/ directory into bin/.

make cli

Download model files

Babylon.cpp requires external ONNX model files that are not bundled in the repository. Place them in the bin/models/ directory (created by make cli), or configure custom paths in bin/config.json.

| File | Description |
| --- | --- |
| open-phonemizer.onnx | Neural G2P model (Open Phonemizer) |
| dictionary.json | ~130,000-entry pronunciation dictionary |
| kokoro-quantized.onnx | Kokoro TTS model (recommended engine) |
| voices/*.bin | Kokoro voice style embeddings (one file per voice) |
| curie.onnx | VITS TTS model (optional, Piper-compatible) |

The Kokoro model and voice files are available from the Kokoro repository. The Open Phonemizer model and dictionary are from the OpenPhonemizer repository.
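
Once the files are downloaded, a quick presence check can catch a missing model before the first run. This is a minimal sketch under the assumption that the default layout from the table above is used (custom paths in config.json are not considered); `missing_models` is a hypothetical helper, not a babylon.cpp command.

```shell
# Hypothetical helper: print the required model files that are absent from
# the given directory (default: bin/models). curie.onnx is optional and is
# therefore not checked.
missing_models() {
    dir="${1:-bin/models}"
    for f in open-phonemizer.onnx dictionary.json kokoro-quantized.onnx; do
        [ -f "$dir/$f" ] || printf '%s\n' "$f"
    done
    # Expect at least one voice embedding under voices/.
    found_voice=no
    for v in "$dir"/voices/*.bin; do
        if [ -f "$v" ]; then
            found_voice=yes
            break
        fi
    done
    [ "$found_voice" = yes ] || printf '%s\n' 'voices/*.bin'
    return 0
}

# Example call from the repository root (commented out to avoid side effects):
# missing_models bin/models
```

No output means every required file from the table was found.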

Configure model paths

On startup, the babylon binary automatically loads config.json from the same directory as the executable. Edit it to point to your model files:

{
  "phonemizer_model": "models/open-phonemizer.onnx",
  "dictionary": "models/dictionary.json",
  "kokoro_model": "models/kokoro-quantized.onnx",
  "kokoro_voice": "en-US-heart",
  "kokoro_voices": "models/voices"
}

Paths are resolved relative to the config file. You can also override any key with a CLI flag at runtime.
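
To illustrate what "relative to the config file" means in practice, the sketch below joins a config value onto the directory that contains config.json. `resolve_from_config` is a hypothetical helper written for this guide, not the binary's actual implementation, and returning absolute paths unchanged is an assumption.

```shell
# Hypothetical helper: resolve a config value against the directory that
# holds the config file. Absolute paths are returned unchanged (assumption).
resolve_from_config() {
    config_file="$1"
    value="$2"
    case "$value" in
        /*) printf '%s\n' "$value" ;;
        *)  printf '%s/%s\n' "$(dirname "$config_file")" "$value" ;;
    esac
}

# Example call (commented out so sourcing this snippet has no side effects):
# resolve_from_config bin/config.json models/dictionary.json
```

With the default config, "models/dictionary.json" in bin/config.json therefore refers to bin/models/dictionary.json, regardless of the directory the binary is launched from.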

Run your first synthesis

From the bin/ directory, run:

# Phonemize text to IPA
./babylon phonemize "Hello world"
# → hɛloʊ wɜːld

# Synthesise speech (writes output.wav by default)
./babylon tts "Hello world"

# Choose a voice and output path
./babylon tts --voice en-US-nova --speed 1.2 "Hello world" -o hello.wav

If you see a model-not-found error, double-check that the paths in config.json are correct and the files have been downloaded.

Android builds

Cross-compiling for Android requires the Android NDK r27 or later. Set the ANDROID_NDK_HOME environment variable before running make android:

export ANDROID_NDK_HOME=/path/to/android-ndk-r27
make android

The output library (libbabylon.so) is placed in bin/android/ for each supported ABI.
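
A small guard can prevent a confusing build failure when the variable is unset or points at the wrong place. This is a minimal sketch; `ndk_ready` is a hypothetical helper, and it only checks that the directory exists, not that the NDK is r27 or later.

```shell
# Hypothetical helper: succeed only if ANDROID_NDK_HOME is set and points
# at an existing directory. Does not verify the NDK version.
ndk_ready() {
    [ -n "${ANDROID_NDK_HOME:-}" ] && [ -d "$ANDROID_NDK_HOME" ]
}

# Example call (commented out so sourcing this snippet has no side effects):
# ndk_ready && make android
```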