C / C++ API
Include babylon.h and link against libbabylon to embed local G2P (grapheme-to-phoneme conversion) and TTS (text-to-speech) directly in your C or C++ application. The stable C API can also be called through FFI from any language that supports C bindings.
Linking
After building with make cli, link against the shared library and include the header:
# Linux / Android
gcc -o myapp myapp.c -I/path/to/babylon.cpp/include -L/path/to/bin -lbabylon
# macOS
clang -o myapp myapp.c -I/path/to/babylon.cpp/include -L/path/to/bin -lbabylon
The library must be able to locate its ONNX Runtime dependency at runtime. On Linux, add the bin/ directory to LD_LIBRARY_PATH. On macOS, add it to DYLD_LIBRARY_PATH instead.
C API — G2P (Phonemization)
babylon_g2p_init
int babylon_g2p_init(const char* model_path,
                     babylon_g2p_options_t options);
Loads the Open Phonemizer ONNX model. Must be called before any phonemization or TTS function. Returns 0 on success, non-zero on failure.
typedef struct {
    const char* dictionary_path;         // path to dictionary.json, or NULL
    const unsigned char use_punctuation; // 1 to preserve punctuation, 0 to strip
} babylon_g2p_options_t;
babylon_g2p
char* babylon_g2p(const char* text);
Phonemizes text and returns a heap-allocated IPA string. The caller must call free() on the result.
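The ownership rule above can be wrapped in a small helper so the free() call is never forgotten. The following is a minimal sketch, not part of the library: take_ipa is a hypothetical name, and treating a NULL return as failure is an assumption, since the behaviour of babylon_g2p on error is not documented here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of babylon.h): copy a heap-allocated
   string into a caller-owned buffer, then free the original.
   Returns 0 on success, -1 if `ipa` is NULL or does not fit in `out`.
   Assumption: a NULL input signals failure. */
int take_ipa(char *ipa, char *out, size_t cap) {
    int rc = -1;
    if (ipa == NULL)
        return -1;
    size_t len = strlen(ipa);
    if (len < cap) {
        memcpy(out, ipa, len + 1); /* include the terminating NUL */
        rc = 0;
    }
    free(ipa); /* the pointer is released on every path */
    return rc;
}
```

With the library linked, a call might look like take_ipa(babylon_g2p("Hello world"), buf, sizeof buf), after which buf holds the IPA string and nothing is left to free.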
babylon_g2p_tokens
int* babylon_g2p_tokens(const char* text);
Phonemizes text and returns a heap-allocated, -1-terminated array of Kokoro-compatible token IDs. The caller must call free() on the result.
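Because the array carries no explicit length, the caller has to walk it up to the -1 sentinel. A minimal sketch of that walk follows; token_count is a hypothetical helper name, not a library function, and accepting NULL as an empty array is a defensive assumption.

```c
#include <stddef.h>

/* Hypothetical helper (not part of babylon.h): count the token IDs in a
   -1-terminated array. The -1 sentinel itself is not counted.
   Assumption: a NULL pointer is treated as an empty array. */
size_t token_count(const int *tokens) {
    size_t n = 0;
    if (tokens == NULL)
        return 0;
    while (tokens[n] != -1)
        n++;
    return n;
}
```

A caller would typically store the pointer returned by babylon_g2p_tokens, pass it to token_count, use the IDs, and then release the array with free().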
babylon_g2p_free
void babylon_g2p_free(void);
Releases the G2P session and frees all associated memory.
C API — Kokoro TTS
babylon_kokoro_init
int babylon_kokoro_init(const char* model_path);
Loads the Kokoro ONNX model. The G2P session must already be initialised. Returns 0 on success.
babylon_kokoro_tts
void babylon_kokoro_tts(const char* text,
                        const char* voice_path,
                        float speed,
                        const char* output_path);
Synthesises text using the voice style loaded from voice_path (.bin file) at the given speed, writing a 24 kHz WAV file to output_path.
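Since babylon_kokoro_tts returns void, its signature gives the caller no status code. One possible sanity check, sketched here under that observation, is to confirm afterwards that the output file exists and begins with the standard RIFF/WAVE magic bytes. looks_like_wav is a hypothetical helper and not part of the library.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of babylon.h): return 1 if the file at
   `path` starts with a RIFF/WAVE header, 0 if it is missing, too short,
   or has different magic bytes. */
int looks_like_wav(const char *path) {
    unsigned char hdr[12];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    size_t got = fread(hdr, 1, sizeof hdr, f);
    fclose(f);
    if (got != sizeof hdr)
        return 0;
    /* Bytes 0-3 are "RIFF", bytes 8-11 are "WAVE"; bytes 4-7 hold a size. */
    return memcmp(hdr, "RIFF", 4) == 0 && memcmp(hdr + 8, "WAVE", 4) == 0;
}
```

Calling looks_like_wav("output.wav") right after synthesis catches the case where nothing was written, although it does not validate the audio data itself.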
babylon_kokoro_free
void babylon_kokoro_free(void);
Releases the Kokoro session.
C API — VITS TTS
babylon_tts_init
int babylon_tts_init(const char* model_path);
Loads a VITS ONNX model. The G2P session must already be initialised. Returns 0 on success.
babylon_tts
void babylon_tts(const char* text, const char* output_path);
Synthesises text and writes a WAV file to output_path. The sample rate is determined by metadata embedded in the model.
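Because the sample rate depends on the model, a caller that needs it (for playback, say) can read it back from the written file. The sketch below assumes the canonical WAV layout, in which the "fmt " chunk immediately follows the RIFF header and the sample rate is a little-endian 32-bit value at byte offset 24. Whether babylon always writes that layout is an assumption, and wav_sample_rate is a hypothetical helper name.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of babylon.h): read the sample rate in Hz
   from a WAV file with the canonical header layout.
   Returns 0 if the file is missing, too short, or laid out differently. */
unsigned long wav_sample_rate(const char *path) {
    unsigned char h[28];
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    size_t got = fread(h, 1, sizeof h, f);
    fclose(f);
    if (got != sizeof h)
        return 0;
    /* Assumption: "fmt " is the first chunk after the 12-byte RIFF header. */
    if (memcmp(h, "RIFF", 4) != 0 || memcmp(h + 8, "WAVE", 4) != 0 ||
        memcmp(h + 12, "fmt ", 4) != 0)
        return 0;
    /* Sample rate: little-endian 32-bit integer at offsets 24..27. */
    return (unsigned long)h[24] | ((unsigned long)h[25] << 8) |
           ((unsigned long)h[26] << 16) | ((unsigned long)h[27] << 24);
}
```

A return value of 0 means the header could not be interpreted, in which case a full WAV parser would be the safer choice.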
babylon_tts_free
void babylon_tts_free(void);
Releases the VITS session.
C API example
#include "babylon.h"
#include <stdlib.h>

int main(void) {
    babylon_g2p_options_t opts = {
        .dictionary_path = "models/dictionary.json",
        .use_punctuation = 1,
    };

    // The G2P session must be initialised before any TTS session.
    if (babylon_g2p_init("models/open-phonemizer.onnx", opts) != 0)
        return 1;

    if (babylon_kokoro_init("models/kokoro-quantized.onnx") != 0) {
        babylon_g2p_free(); // release the G2P session before exiting
        return 1;
    }

    babylon_kokoro_tts(
        "Hello world",
        "models/voices/en-US-heart.bin",
        1.0f,
        "output.wav"
    );

    // Release the sessions in reverse order of initialisation.
    babylon_kokoro_free();
    babylon_g2p_free();
    return 0;
}
C++ API
Higher-level C++ session classes are available in their respective namespaces. Include babylon.h and link against the same libbabylon.
OpenPhonemizer::Session
namespace OpenPhonemizer {
    class Session {
    public:
        Session(const std::string& model_path,
                const std::string& dictionary_path = "",
                bool use_punctuation = false);
        // Returns concatenated IPA string
        std::string phonemize(const std::string& text);
        // Returns Kokoro-compatible token IDs
        std::vector<int64_t> phonemize_tokens(const std::string& text);
    };
}
Kokoro::Session
namespace Kokoro {
    class Session {
    public:
        Session(const std::string& model_path);
        void tts(const std::string& phonemes,
                 const std::string& voice_path,
                 float speed,
                 const std::string& output_path);
    };
}
Vits::Session
namespace Vits {
    class Session {
    public:
        Session(const std::string& model_path);
        void tts(const std::vector<std::string>& phonemes,
                 const std::string& output_path);
    };
}
C++ API example
#include "babylon.h"
#include <string>

int main() {
    OpenPhonemizer::Session phonemizer(
        "models/open-phonemizer.onnx",
        "models/dictionary.json",
        /* use_punctuation = */ true
    );
    Kokoro::Session kokoro("models/kokoro-quantized.onnx");

    // Kokoro::Session::tts takes phonemes, so phonemize the text first.
    std::string phonemes = phonemizer.phonemize("Hello world");
    kokoro.tts(phonemes,
               "models/voices/en-US-heart.bin",
               /* speed = */ 1.0f,
               "output.wav");
    return 0;
}