Voices & Languages Reference
Maise ships with 68 Kokoro voices across 9 languages — all bundled in the app, no downloads required. Every voice runs entirely on-device.
How to preview and change voices
Voice catalogue
All 68 voices are listed below by language. Voice names are the short identifiers used in the dropdown. To get the full ID, prefix the short name with the language code; as the example shows, the full ID also ends in -kokoro (e.g. en-US-nova-kokoro).
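The ID convention described above can be sketched in a few lines of Python. This is an illustration only, not part of Maise: the helper names `full_voice_id` and `split_voice_id` are hypothetical, the language codes are copied from the catalogue headings on this page, and the `-kokoro` suffix is inferred from the single example ID rather than from a documented rule.

```python
# Language codes copied from the catalogue headings on this page.
LANGUAGE_CODES = {
    "en-US": "English (US)",
    "en-GB": "English (UK)",
    "de-DE": "German",
    "fr-FR": "French",
    "el-GR": "Greek",
    "it-IT": "Italian",
    "ja-JP": "Japanese",
    "pt-BR": "Portuguese (BR)",
    "zh-CN": "Chinese (Simplified)",
}

# Assumption: inferred from the example ID "en-US-nova-kokoro".
ENGINE_SUFFIX = "kokoro"


def full_voice_id(language_code: str, short_name: str) -> str:
    """Hypothetical helper: build a full voice ID from its two parts."""
    if language_code not in LANGUAGE_CODES:
        raise ValueError(f"unknown language code: {language_code!r}")
    if not short_name:
        raise ValueError("short_name must not be empty")
    return f"{language_code}-{short_name}-{ENGINE_SUFFIX}"


def split_voice_id(voice_id: str) -> tuple[str, str]:
    """Hypothetical inverse: recover (language_code, short_name)."""
    suffix = f"-{ENGINE_SUFFIX}"
    for code in LANGUAGE_CODES:
        prefix = f"{code}-"
        if voice_id.startswith(prefix) and voice_id.endswith(suffix):
            short_name = voice_id[len(prefix):-len(suffix)]
            if short_name:
                return code, short_name
    raise ValueError(f"not a recognised voice ID: {voice_id!r}")


if __name__ == "__main__":
    voice_id = full_voice_id("en-US", "nova")
    print(voice_id)
    print(split_voice_id(voice_id))
```

Matching against the known language codes, rather than splitting on hyphens, keeps the parsing correct because the language codes themselves contain a hyphen.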
English (US) (en-US)
English (UK) (en-GB)
German (de-DE)
French (fr-FR)
Greek (el-GR)
Italian (it-IT)
Japanese (ja-JP)
Portuguese (BR) (pt-BR)
Chinese (Simplified) (zh-CN)
Voice quality & characteristics
All voices are generated by the Kokoro neural TTS model, which produces natural-sounding speech at a 24 kHz sample rate. Voice quality is consistent across the catalogue: the voices differ in character, accent, and speaking style rather than in fidelity.
English (US) has the largest selection with 20 voices covering a range of tones — from warm and conversational (heart, bella) to clear and neutral (nova, alloy). If you're using Maise primarily for AI responses in Maid, voices like heart, nova, or echo tend to work well for conversational text.
The other languages have fewer voices, but all of them are production-quality. The Japanese voices in particular are well suited to both conversational and narrative text.
Reporting mispronunciations
Neural TTS can occasionally mispronounce uncommon words, proper nouns, or technical terms. If a word is spoken incorrectly, tap Report mispronunciation in the Maise app to open a GitHub issue. Include the exact text and the voice you were using so the maintainers can reproduce and fix the problem.