Changes between Version 4 and Version 5 of FrequentlyAskedQuestions


Timestamp:
02/20/06 08:53:43 (19 years ago)
Author:
schroed
Comment:

added "How to add new language" question

  • FrequentlyAskedQuestions

    v4 v5  
    1111 
    1212... 
     13 
     14'''How difficult is it to put together the database needed in order to synthesize Hebrew/Italian/Spanish/Hindi/...? Is Mary modular in that sense?''' 
     15 
     16Mary is very modular, and a number of modules exist in a language-independent, configurable implementation, but adding a new language still takes a fair amount of work. 
     17 
     18For many languages, you could start with the existing MBROLA diphone voices:  
     19http://tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html 
     20 
     21You would then need at least the following MARY TTS modules: 
     22 
     23 * needed: a Tokeniser, cutting the input into sentences and tokens (it may be possible to re-use source:trunk/java/de/dfki/lt/mary/modules/JTokeniser.java for a number of languages) 
     24 
     25 * optional: a text normalisation module, which expands numbers, abbreviations, etc. into a pronounceable form (this can be left out at the beginning) 
     26 
     27 * optional: a part-of-speech tagger, distinguishing at least between content words and function words 
     28 
     29 * crucially needed: a phonemiser, converting the input text into sound symbols, e.g. in SAMPA. For some languages (probably Spanish) this can be based on rules, but for others, where the link between spelling and pronunciation is less regular, a pronunciation lexicon is required. In that case, the lexicon must also be complemented with "letter-to-sound" rules for unknown words. 
     30 
     31 * optional: a prosody assignment module, predicting e.g. ToBI labels based on part-of-speech and other information.  
     32source:java/de/dfki/lt/mary/modules/ProsodyGeneric.java, written by my student Stephanie Becker, may be a good place to start. 
     33 
     34 * needed: a duration assignment module, predicting phone durations. As a first step, you could use the Klatt rules currently used in the Tibetan language component (source:java/de/dfki/lt/mary/modules/tib/KlattDurationModeller),  
     35adapted, of course, to the language-specific phoneme set. 
     36 
     37 * optional: an intonation contour realisation module. For example, there is a generic source:java/de/dfki/lt/mary/modules/TobiContourGenerator that can be used for different languages by writing appropriate config files. 
     38 
     39 * needed: synthesis, e.g. using MBROLA voices. 
     40 
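The phonemiser bullet above describes a lexicon lookup complemented by letter-to-sound rules for unknown words. A minimal sketch of that fallback logic follows. It is not taken from the MARY TTS code base: the class name `PhonemiserSketch`, the method `phonemise`, and the toy Spanish-like entries are all assumptions made for illustration, and a real letter-to-sound component would need context-dependent rules rather than a one-letter-to-one-sound table.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; not part of MARY TTS.
class PhonemiserSketch {
    // Toy pronunciation lexicon: word -> space-separated SAMPA-like symbols.
    private final Map<String, String> lexicon = new HashMap<>();
    // Toy letter-to-sound table: one letter -> one sound symbol.
    private final Map<Character, String> letterToSound = new HashMap<>();

    PhonemiserSketch() {
        // Irregular case kept in the lexicon: the written "h" is silent.
        lexicon.put("hola", "o l a");
        // Regular one-to-one correspondences used as the fallback.
        for (char c : "aeioulmnpst".toCharArray()) {
            letterToSound.put(c, String.valueOf(c));
        }
    }

    // Look the word up in the lexicon first; fall back to the
    // letter-to-sound table only for words the lexicon does not know.
    String phonemise(String word) {
        String key = word.toLowerCase(Locale.ROOT);
        String known = lexicon.get(key);
        if (known != null) {
            return known;
        }
        StringBuilder symbols = new StringBuilder();
        for (char c : key.toCharArray()) {
            String sound = letterToSound.get(c);
            if (sound == null) {
                continue; // no rule for this letter in the toy table
            }
            if (symbols.length() > 0) {
                symbols.append(' ');
            }
            symbols.append(sound);
        }
        return symbols.toString();
    }
}
```

With this toy data, `new PhonemiserSketch().phonemise("hola")` is answered from the lexicon, while `phonemise("mesa")` is not in the lexicon and goes through the letter-to-sound table.
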
     41So, in summary, to add a new language you most crucially need a  
     42phonemiser, and you need to get at least a tokeniser and a duration  
     43assigner to work. This assumes that an acceptable MBROLA  
     44voice already exists for your language. 
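
The needed/optional split in the list above can be written down as a small checklist. The sketch below is only a planning aid under that reading of the list; the names `NewLanguageChecklist`, `Stage`, and `missingRequired` are hypothetical and do not correspond to any MARY TTS class or configuration format.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical illustration only; not part of MARY TTS.
class NewLanguageChecklist {
    // One entry per module in the list, flagged as needed (true) or optional (false).
    enum Stage {
        TOKENISER(true),
        TEXT_NORMALISATION(false),
        POS_TAGGER(false),
        PHONEMISER(true),
        PROSODY(false),
        DURATION(true),
        CONTOUR(false),
        SYNTHESIS(true);

        final boolean required;

        Stage(boolean required) {
            this.required = required;
        }
    }

    // Returns the needed stages that are not yet available for a language.
    static Set<Stage> missingRequired(Set<Stage> available) {
        Set<Stage> missing = EnumSet.noneOf(Stage.class);
        for (Stage stage : Stage.values()) {
            if (stage.required && !available.contains(stage)) {
                missing.add(stage);
            }
        }
        return missing;
    }
}
```

For example, a language with only a tokeniser and an MBROLA voice would pass `EnumSet.of(Stage.TOKENISER, Stage.SYNTHESIS)` and get back the phonemiser and duration stages as still missing.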
     45 
     46On the bright side, as data representation is based on Unicode, there  
     47should be no problem with non-European scripts.