Changes between Version 4 and Version 5 of FrequentlyAskedQuestions

Timestamp:: 02/20/06 08:53:43 (19 years ago)
Author:: schroed
Comment:: added "How to add new language" question

FrequentlyAskedQuestions

-                      v4
+                      v5
 ...
+'''How difficult is it to put together the database needed in order to synthesize Hebrew/Italian/Spanish/Hindi/...? Is Mary modular in that sense?'''
+Mary is very modular, and a number of modules exist in a language-independent, configurable implementation, but a fair amount of work still remains for each new language.
+For many languages, you could start with the existing MBROLA diphone voices:
+http://tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html
+You would then need at least the following MARY TTS modules:
+ * needed: a Tokeniser, cutting the input into sentences and tokens (it may be possible to re-use source:trunk/java/de/dfki/lt/mary/modules/JTokeniser.java for a number of languages)
+ * optional: a text normalisation module, which expands numbers, abbreviations, etc. into a pronounceable form (this can be left out at the beginning)
+ * optional: a part-of-speech tagger, distinguishing at least between content words and function words
+ * crucially needed: a phonemiser, converting the input text into sound symbols, e.g. in SAMPA. For some languages (probably Spanish) this can be based on rules, but for others, where the link between spelling and pronunciation is less regular, a pronunciation lexicon is required. In that case, the lexicon must also be complemented with "letter-to-sound" rules for unknown words.
+ * optional: a prosody assignment module, predicting e.g. ToBI labels based on part-of-speech and other information.
+source:java/de/dfki/lt/mary/modules/ProsodyGeneric.java, written by my student Stephanie Becker, may be a good place to start.
+ * needed: a duration assignment module, predicting phone durations. As a first step, the Klatt rules could be used, adapted to the language-specific phoneme set; they are currently used in the Tibetan language component: source:java/de/dfki/lt/mary/modules/tib/KlattDurationModeller
+ * optional: an intonation contour realisation module. For example, there is a generic source:java/de/dfki/lt/mary/modules/TobiContourGenerator that can be used for different languages by writing appropriate config files.
+ * needed: synthesis, e.g. using MBROLA voices.
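The needed/optional split in the list above can be written down as a small checklist. The following is only a sketch: the class `NewLanguageChecklist`, the `Stage` enum and the method `missingNeeded` are hypothetical names chosen for illustration and are not part of the MARY code base.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; these names are not taken from the MARY code base.
class NewLanguageChecklist {

    // Processing stages in the order of the list above,
    // each flagged as needed (true) or optional (false).
    enum Stage {
        TOKENISER(true),
        TEXT_NORMALISATION(false),
        POS_TAGGER(false),
        PHONEMISER(true),
        PROSODY(false),
        DURATION(true),
        CONTOUR(false),
        SYNTHESIS(true);

        final boolean needed;

        Stage(boolean needed) {
            this.needed = needed;
        }
    }

    // Returns the needed stages that are not yet available, in list order.
    static List<Stage> missingNeeded(Set<Stage> available) {
        List<Stage> missing = new ArrayList<>();
        for (Stage stage : Stage.values()) {
            if (stage.needed && !available.contains(stage)) {
                missing.add(stage);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Example: a reusable tokeniser and an MBROLA voice are already there.
        Set<Stage> available = EnumSet.of(Stage.TOKENISER, Stage.SYNTHESIS);
        System.out.println(missingNeeded(available)); // prints [PHONEMISER, DURATION]
    }
}
```

With a tokeniser and a synthesis voice in place, the sketch reports the phonemiser and the duration module as the remaining needed pieces.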
+So, in summary, for adding a new language, you most crucially need a
+phonemiser, and you need to get at least a tokeniser and a duration
+assigner to work, assuming that there is already an acceptable MBROLA
+voice for your language.
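Since the phonemiser is the most crucial piece, here is a minimal sketch of the lexicon-plus-rules idea: look the word up in a pronunciation lexicon first, and fall back to letter-to-sound rules for unknown words. The class `PhonemiserSketch`, the greedy longest-match strategy, and the toy lexicon and rule entries are all assumptions made for illustration; they are not MARY code and not a real SAMPA inventory for any language.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch; not part of the MARY code base.
class PhonemiserSketch {
    private final Map<String, String> lexicon; // word -> space-separated phones
    private final Map<String, String> rules;   // letter sequence -> phone
    private final int maxRuleLength;

    PhonemiserSketch(Map<String, String> lexicon, Map<String, String> rules) {
        this.lexicon = lexicon;
        this.rules = rules;
        int max = 1;
        for (String key : rules.keySet()) {
            max = Math.max(max, key.length());
        }
        this.maxRuleLength = max;
    }

    // Lexicon lookup first; letter-to-sound rules for unknown words.
    String phonemise(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        String known = lexicon.get(lower);
        if (known != null) {
            return known;
        }
        List<String> phones = new ArrayList<>();
        int i = 0;
        while (i < lower.length()) {
            boolean matched = false;
            // Try the longest letter sequence first, e.g. "ch" before "c".
            for (int len = Math.min(maxRuleLength, lower.length() - i); len >= 1; len--) {
                String phone = rules.get(lower.substring(i, i + len));
                if (phone != null) {
                    phones.add(phone);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                i++; // no rule for this letter: skip it
            }
        }
        return String.join(" ", phones);
    }

    // Toy data, invented purely for this example.
    static PhonemiserSketch toyExample() {
        Map<String, String> lexicon = new HashMap<>();
        lexicon.put("taxi", "t a k s i");
        Map<String, String> rules = new HashMap<>();
        rules.put("ch", "tS");
        rules.put("c", "k");
        rules.put("a", "a");
        rules.put("o", "o");
        return new PhonemiserSketch(lexicon, rules);
    }

    public static void main(String[] args) {
        PhonemiserSketch p = toyExample();
        System.out.println(p.phonemise("Taxi"));  // prints "t a k s i" (from the lexicon)
        System.out.println(p.phonemise("chaco")); // prints "tS a k o" (from the rules)
    }
}
```

A real phonemiser would need context-sensitive rules, stress and syllable marking; the sketch only shows how lexicon and rules can complement each other.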
+On the bright side, as data representation is based on Unicode, there
+should be no problem with non-European scripts.
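To illustrate the Unicode point, the following sketch splits a Hebrew string into word tokens using only the standard `java.text.BreakIterator` class. It is an assumption-laden example, not the way JTokeniser works: the class `UnicodeTokenSketch` and the method `words` are hypothetical, and sentence splitting and punctuation tokens are left out.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch; this is not how JTokeniser is implemented.
class UnicodeTokenSketch {

    // Returns the word tokens of the text, dropping whitespace and punctuation.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end);
            // Keep only pieces that begin with a letter or digit in any script.
            if (Character.isLetterOrDigit(piece.codePointAt(0))) {
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Two Hebrew words, written as Unicode escapes to avoid
        // any dependency on the source file encoding.
        String text = "\u05E9\u05DC\u05D5\u05DD \u05E2\u05D5\u05DC\u05DD";
        List<String> tokens = words(text, Locale.ROOT);
        System.out.println(tokens.size()); // prints 2
    }
}
```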