Getting Started

Build babylon.cpp from source, download the required model files, and run your first phonemization and speech synthesis in a few minutes.

Prerequisites

The following tools must be available on your system before building:

Tool	Version	Notes
CMake	3.18+	Used to configure and build the project
C++17 compiler	GCC, Clang, or MSVC	MSVC requires Visual Studio 2019+
Git	Any recent version	Required for submodule checkout
Xcode CLI Tools	—	macOS only
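
One way to confirm these prerequisites before building is a small shell check along the lines of the sketch below. It is not part of babylon.cpp: the helper name version_ge is made up for illustration, and the script assumes the compiler is reachable on PATH as g++, clang++, or cl. It only checks that a compiler is present, not that it supports C++17.

```shell
#!/bin/sh
# Hypothetical prerequisite check (not shipped with babylon.cpp).

# version_ge A B: succeed when dotted numeric version A is >= B.
version_ge() {
  a=$1 b=$2
  while [ -n "$a" ] || [ -n "$b" ]; do
    x=${a%%.*} y=${b%%.*}
    [ "${x:-0}" -gt "${y:-0}" ] && return 0
    [ "${x:-0}" -lt "${y:-0}" ] && return 1
    # Drop the field just compared; stop when no dot is left.
    case $a in *.*) a=${a#*.} ;; *) a= ;; esac
    case $b in *.*) b=${b#*.} ;; *) b= ;; esac
  done
  return 0
}

# Git and CMake must both be on PATH.
for tool in git cmake; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "found: $tool"
  else
    echo "missing: $tool"
  fi
done

# CMake prints e.g. "cmake version 3.28.3" on its first line.
if command -v cmake >/dev/null 2>&1; then
  v=$(cmake --version | sed -n '1s/[^0-9]*\([0-9][0-9.]*\).*/\1/p')
  if version_ge "$v" 3.18; then
    echo "cmake $v satisfies 3.18+"
  else
    echo "cmake $v is older than 3.18"
  fi
fi

# Presence check only: this does not verify C++17 support.
cxx=
for c in g++ clang++ cl; do
  if command -v "$c" >/dev/null 2>&1; then cxx=$c; break; fi
done
echo "C++ compiler: ${cxx:-none found}"
```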

Clone the repository

Babylon.cpp uses Git submodules for ONNX Runtime and other dependencies, so the --recursive flag is required. Without it, the submodule directories are left empty.

git clone --recursive https://github.com/Mobile-Artificial-Intelligence/babylon.cpp.git
cd babylon.cpp
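
If the repository was already cloned without --recursive, the submodules can normally be fetched afterwards with git submodule update --init --recursive. The sketch below wraps that in a check. The function names are illustrative, not part of the project, and the check relies on git submodule status prefixing uninitialized submodules with a '-'.

```shell
#!/bin/sh
# Hypothetical helpers for illustration (not part of babylon.cpp).

# Reads `git submodule status` output on stdin and succeeds if any
# submodule is uninitialized (git prefixes those lines with '-').
has_uninitialized_submodules() {
  grep -q '^-'
}

# Run from inside the babylon.cpp checkout.
fetch_missing_submodules() {
  if git submodule status --recursive | has_uninitialized_submodules; then
    git submodule update --init --recursive
  else
    echo "all submodules are initialized"
  fi
}
```

Calling fetch_missing_submodules from the repository root before building would then only touch the network when something is actually missing.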

Build targets

The project provides a Makefile that wraps CMake for common build scenarios. All output is placed in bin/.

Target	Description
make lib	Build the shared library only
make cli	Library + CLI binary + runtime files copied to bin/
make debug	CLI build in Debug mode
make android	Cross-compile for Android (requires NDK r27+)

For most users, make cli is the right starting point. It also copies data/config.json, data/index.html, and the models/ directory into bin/.

make cli

Download model files

Babylon.cpp requires external ONNX model files that are not bundled in the repository. Place them in the bin/models/ directory (created by make cli), or configure custom paths in bin/config.json.

File	Description
open-phonemizer.onnx	Neural G2P model (Open Phonemizer)
dictionary.json	~130,000-entry pronunciation dictionary
kokoro-quantized.onnx	Kokoro TTS model (recommended engine)
voices/*.bin	Kokoro voice style embeddings (one file per voice)
curie.onnx	VITS TTS model (optional, Piper-compatible)

The Kokoro model and voice files are available from the Kokoro repository; the Open Phonemizer model and dictionary are available from the OpenPhonemizer repository.
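
To see at a glance which of the files in the table are still missing, a check like the following sketch can help. The function name list_missing_models is hypothetical, the default bin/models location assumes the script is run from the repository root, and curie.onnx is skipped because it is optional.

```shell
#!/bin/sh
# Hypothetical helper (not part of babylon.cpp).
# list_missing_models DIR: print each required file that is absent under DIR.
list_missing_models() {
  dir=$1
  for f in open-phonemizer.onnx dictionary.json kokoro-quantized.onnx; do
    [ -f "$dir/$f" ] || echo "$f"
  done
  # Assumption: at least one voice embedding must exist in voices/.
  # If the glob matches nothing it stays literal and the -f test fails.
  set -- "$dir"/voices/*.bin
  [ -f "$1" ] || echo "voices/*.bin"
}

# Run from the repository root after `make cli`.
missing=$(list_missing_models bin/models)
if [ -z "$missing" ]; then
  echo "all required model files are present"
else
  echo "missing from bin/models:"
  echo "$missing"
fi
```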

Configure model paths

On startup, the babylon binary automatically loads config.json from the same directory as the executable. Edit it to point to your model files:

{
  "phonemizer_model": "models/open-phonemizer.onnx",
  "dictionary":       "models/dictionary.json",
  "kokoro_model":     "models/kokoro-quantized.onnx",
  "kokoro_voice":     "en-US-heart",
  "kokoro_voices":    "models/voices"
}

Relative paths are resolved against the directory that contains the config file. You can also override any key with a CLI flag at runtime.
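
Since a wrong path only shows up when the binary starts, it can be useful to check the config first. The sketch below is an illustration under assumptions, not project tooling: config_value and check_config_paths are made-up names, and the sed-based parsing only handles the flat, one-key-per-line layout shown above (a real JSON parser would be more robust). Relative values are joined with the config file's directory, mirroring the resolution rule described in this section.

```shell
#!/bin/sh
# Hypothetical helpers (not part of babylon.cpp).

# config_value FILE KEY: print the string value of KEY.
# Only handles flat JSON with one "key": "value" pair per line.
config_value() {
  sed -n 's/^[[:space:]]*"'"$2"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# check_config_paths FILE: report whether each path-valued key points
# at something that exists, resolving relative paths against FILE's directory.
check_config_paths() {
  cfg=$1
  base=$(dirname "$cfg")
  for key in phonemizer_model dictionary kokoro_model kokoro_voices; do
    val=$(config_value "$cfg" "$key")
    case $val in
      "") echo "unset: $key"; continue ;;
      /*) path=$val ;;
      *)  path=$base/$val ;;
    esac
    if [ -e "$path" ]; then
      echo "ok: $key"
    else
      echo "missing: $key -> $path"
    fi
  done
}

# Run from the repository root after `make cli`.
if [ -f bin/config.json ]; then
  check_config_paths bin/config.json
fi
```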

Run your first synthesis

From the bin/ directory, run:

# Phonemize text to IPA
./babylon phonemize "Hello world"
# → hɛloʊ wɜːld

# Synthesize speech (writes output.wav by default)
./babylon tts "Hello world"

# Choose a voice, speed, and output path
./babylon tts --voice en-US-nova --speed 1.2 "Hello world" -o hello.wav

If you see a model-not-found error, double-check that the paths in config.json are correct and the files have been downloaded.
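
To confirm that a synthesis run actually produced audio, the output file's header can be inspected. The sketch below assumes babylon writes a standard RIFF/WAVE file, which the .wav extension suggests but this guide does not state. The function name is_wav is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of babylon.cpp).
# is_wav FILE: succeed if FILE starts with a RIFF/WAVE header,
# i.e. bytes 1-4 are "RIFF" and bytes 9-12 are "WAVE".
is_wav() {
  [ "$(head -c 4 "$1" 2>/dev/null)" = "RIFF" ] &&
    [ "$(tail -c +9 "$1" 2>/dev/null | head -c 4)" = "WAVE" ]
}

# Run from bin/ after `./babylon tts "Hello world"`.
if [ -f output.wav ]; then
  if is_wav output.wav; then
    echo "output.wav has a valid WAV header"
  else
    echo "output.wav exists but does not look like a WAV file"
  fi
fi
```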

Android builds

Cross-compiling for Android requires the Android NDK r27 or later. Set the ANDROID_NDK_HOME environment variable before running make android:

export ANDROID_NDK_HOME=/path/to/android-ndk-r27
make android

The output library (libbabylon.so) is placed in bin/android/ for each supported ABI.
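
Before running make android, it can save time to confirm that ANDROID_NDK_HOME points at a new enough NDK. The following sketch reads the Pkg.Revision line from the source.properties file that NDK releases ship in their root directory. The function name ndk_major is made up for illustration and is not part of the babylon.cpp build.

```shell
#!/bin/sh
# Hypothetical helper (not part of babylon.cpp).
# ndk_major DIR: print the major NDK release number found in
# DIR/source.properties, e.g. "Pkg.Revision = 27.0.12077973" -> 27.
ndk_major() {
  sed -n 's/^Pkg\.Revision[[:space:]]*=[[:space:]]*\([0-9][0-9]*\)\..*/\1/p' \
    "$1/source.properties"
}

if [ -n "${ANDROID_NDK_HOME:-}" ] && [ -f "$ANDROID_NDK_HOME/source.properties" ]; then
  major=$(ndk_major "$ANDROID_NDK_HOME")
  if [ "${major:-0}" -ge 27 ]; then
    echo "NDK r$major: ok"
  else
    echo "NDK r${major:-?} is older than r27"
  fi
else
  echo "ANDROID_NDK_HOME is not set or does not point at an NDK"
fi
```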
