Changes between Version 10 and Version 11 of NewLanguageSupport

06/30/09 14:09:27 (15 years ago)

added short description for new language nlp components


  • NewLanguageSupport

    v10 v11  
== 4. Minimal NLP components for the new language ==
With the files generated by the Transcription tool, we can now create a first instance of the NLP components in the TTS system for our language.
We add support for our language to MARY TTS by creating a new config file in the folder MARY TTS\conf. By convention the file is called <locale>.config. It tells the MARY server which TTS modules to load and which data files to use.
The following is an example for Turkish (locale "tr").
# MARY TTS configuration file tr.config

name = tr
tr.version = 4.0.0

provides = a-language

requires = \
    marybase

############################## The Modules  ###############################

modules.classes.list = \
        marytts.modules.JPhonemiser(tr.)  \
        marytts.modules.MinimalisticPosTagger(tr,tr.)

####################### Module settings  ###########################

# Phonemiser settings
tr.allophoneset = MARY_BASE/lib/modules/tr/lexicon/
tr.lexicon = MARY_BASE/lib/modules/tr/lexicon/tr_lexicon.fst
tr.lettertosound = MARY_BASE/lib/modules/tr/lexicon/tr.lts
#tr.userdict = MARY_BASE/lib/modules/tr/lexicon/userdict.txt

# POS tagger settings
tr.partsofspeech.fst = MARY_BASE/lib/modules/tr/tagger/tr_pos.fst
tr.partsofspeech.punctuation = ,.?!;
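The tr.config file refers to several data files below MARY_BASE/lib/modules/tr/: the allophone set, the lexicon (tr_lexicon.fst), the letter-to-sound rules (tr.lts) and the POS tagger data (tr_pos.fst).
These files must be copied from the TranscriptionGUI folder to the locations given in the config file.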
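Before starting the server, it can help to check that every file named in the config is really in place. The following is a minimal sketch, not part of MARY TTS: the helper names `referenced_files` and `missing_files` are made up for illustration, and the parser assumes simple one-line `key = value` settings whose values start with `MARY_BASE/`.

```python
from pathlib import Path

# Excerpt of the settings from tr.config that point at data files.
SAMPLE = """\
# Phonemiser settings
tr.lexicon = MARY_BASE/lib/modules/tr/lexicon/tr_lexicon.fst
tr.lettertosound = MARY_BASE/lib/modules/tr/lexicon/tr.lts
#tr.userdict = MARY_BASE/lib/modules/tr/lexicon/userdict.txt
# POS tagger settings
tr.partsofspeech.fst = MARY_BASE/lib/modules/tr/tagger/tr_pos.fst
tr.partsofspeech.punctuation = ,.?!;
"""


def referenced_files(config_text, mary_base):
    """Map each property whose value starts with MARY_BASE/ to a resolved path."""
    refs = {}
    for raw in config_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments, so the commented-out userdict is ignored.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if value.startswith("MARY_BASE/"):
            refs[key] = Path(mary_base) / value[len("MARY_BASE/"):]
    return refs


def missing_files(config_text, mary_base):
    """Return the property names whose referenced file does not exist."""
    refs = referenced_files(config_text, mary_base)
    return [key for key, path in refs.items() if not path.is_file()]


if __name__ == "__main__":
    # "." stands in for the MARY TTS installation folder (an assumption).
    for key in missing_files(SAMPLE, "."):
        print("missing file for", key)
```

Run from the MARY TTS installation folder, the script prints one line per setting whose file has not been copied yet and prints nothing when all files are present.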
It should now be possible to start the MARY server and send a query via the HTTP interface, with input format TEXT, locale tr, and any output format up to TARGETFEATURES. A suitable test request can be placed from http://localhost:59125/documentation.html. It is a good idea to check whether the output for TOKENS, PARTSOFSPEECH, PHONEMES, INTONATION and ALLOPHONES looks roughly as expected.
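The check of the intermediate output types can also be scripted. The sketch below only builds the request URLs and offers a small fetch helper. Note that the `/process` path and the parameter names `INPUT_TEXT`, `INPUT_TYPE`, `OUTPUT_TYPE` and `LOCALE` are assumptions about the HTTP interface and should be compared with the documentation page of the running server. The function names are likewise made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Output types worth inspecting, in processing order.
OUTPUT_TYPES = ["TOKENS", "PARTSOFSPEECH", "PHONEMES", "INTONATION", "ALLOPHONES"]


def process_url(text, output_type, locale="tr", host="localhost", port=59125):
    """Build a request URL; endpoint and parameter names are assumed, not verified."""
    query = urlencode({
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": output_type,
        "LOCALE": locale,
    })
    return f"http://{host}:{port}/process?{query}"


def fetch(url):
    """Send the request and return the response body as text (needs a running server)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    for output_type in OUTPUT_TYPES:
        print("====", output_type)
        print(fetch(process_url("Merhaba dünya", output_type)))
```

Printing the result for each output type in turn makes it easier to see at which processing stage the output first stops looking as expected.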
To continue with the next step, you need a MARY server running with this config file, so that the FeatureMaker can compute the feature vectors used to determine diphone coverage.
== 5. Run feature maker with the minimal NLP components ==