Changes between Version 10 and Version 11 of NewLanguageSupport


Timestamp:
06/30/09 14:09:27 (15 years ago)
Author:
masc01
Comment:

added short description for new language nlp components

  • NewLanguageSupport

    v10 v11  
    184184== 4. Minimal NLP components for the new language == 
    185185 
     186With the files generated by the Transcription tool, we can now create a first instance of the NLP components in the TTS system for our language. 
     187 
     188We add support for our language to MARY TTS by creating a new config file in the folder MARY TTS\conf. By convention the file is called <locale>.config. It tells the MARY server which TTS modules to load, and which data files to use. 
     189 
     190The following is an example for Turkish (locale "tr"). 
     191 
     192{{{ 
     193########################################################################## 
     194# MARY TTS configuration file tr.config 
     195########################################################################## 
     196 
     197name = tr 
     198tr.version = 4.0.0 
     199 
     200provides = a-language 
     201 
     202requires = \ 
     203    marybase 
     204 
     205 
     206########################################################################### 
     207############################## The Modules  ############################### 
     208########################################################################### 
     209modules.classes.list = \ 
     210        marytts.modules.JPhonemiser(tr.)  \ 
     211        marytts.modules.MinimalisticPosTagger(tr,tr.) \ 
     212 
     213 
     214#################################################################### 
     215####################### Module settings  ########################### 
     216#################################################################### 
     217 
     218# Phonemiser settings 
     219tr.allophoneset = MARY_BASE/lib/modules/tr/lexicon/allophones.tr.xml 
     220tr.lexicon = MARY_BASE/lib/modules/tr/lexicon/tr_lexicon.fst 
     221tr.lettertosound = MARY_BASE/lib/modules/tr/lexicon/tr.lts 
     222#tr.userdict = MARY_BASE/lib/modules/tr/lexicon/userdict.txt 
     223 
     224# POS tagger settings 
     225tr.partsofspeech.fst = MARY_BASE/lib/modules/tr/tagger/tr_pos.fst 
     226tr.partsofspeech.punctuation = ,.?!; 
     227 
     228}}} 
     229 
     230 
     231The tr.config file refers to the following files: 
     232 
     233{{{ 
     234MARY_BASE/lib/modules/tr/lexicon/allophones.tr.xml 
     235MARY_BASE/lib/modules/tr/lexicon/tr_lexicon.fst 
     236MARY_BASE/lib/modules/tr/lexicon/tr.lts 
     237MARY_BASE/lib/modules/tr/tagger/tr_pos.fst 
     238}}} 
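
The list above can also be derived from the config file itself. The following is a minimal sketch, not part of MARY TTS: it assumes the config uses Java-properties-style syntax (`#` comments, trailing `\` for line continuation) and that every data file is written with a literal `MARY_BASE/` prefix, as in the tr.config example. The function names `referenced_files` and `missing_files` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

PREFIX = "MARY_BASE/"

# Shortened excerpt of the tr.config example, used only for illustration.
SAMPLE = """\
name = tr
requires = \\
    marybase

# Phonemiser settings
tr.allophoneset = MARY_BASE/lib/modules/tr/lexicon/allophones.tr.xml
tr.lexicon = MARY_BASE/lib/modules/tr/lexicon/tr_lexicon.fst
tr.lettertosound = MARY_BASE/lib/modules/tr/lexicon/tr.lts
#tr.userdict = MARY_BASE/lib/modules/tr/lexicon/userdict.txt

# POS tagger settings
tr.partsofspeech.fst = MARY_BASE/lib/modules/tr/tagger/tr_pos.fst
tr.partsofspeech.punctuation = ,.?!;
"""


def referenced_files(config_text):
    """Return {setting: path} for every active setting that points below MARY_BASE."""
    refs = {}
    pending = ""  # accumulates a logical line split with trailing backslashes
    for raw in config_text.splitlines():
        line = raw.strip()
        if not pending and (not line or line.startswith("#")):
            continue  # skip blank lines and comments (including commented-out settings)
        if line.endswith("\\"):
            pending += line[:-1].strip() + " "
            continue
        line, pending = pending + line, ""
        key, sep, value = line.partition("=")
        if sep and value.strip().startswith(PREFIX):
            refs[key.strip()] = value.strip()
    return refs


def missing_files(refs, mary_base):
    """Return the referenced paths that do not exist below the given MARY base folder."""
    base = Path(mary_base)
    return [p for p in refs.values() if not (base / p[len(PREFIX):]).is_file()]
```

Calling `missing_files(referenced_files(text), mary_base)` on the real config text and installation folder gives a quick check of which files still need to be put in place before the server is started.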
     239 
     240Copy them from the TranscriptionGUI folder to the locations given in tr.config, below MARY_BASE. 
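
The copy step can be scripted. This is only a sketch under stated assumptions: it assumes the generated files lie directly in the TranscriptionGUI folder under the same file names that tr.config expects, which may not match your setup (rename or adjust as needed). The function `install_language_files` is a hypothetical helper, not part of MARY TTS.

```python
import shutil
from pathlib import Path

# Target locations relative to MARY_BASE, taken from the tr.config example.
TARGETS = [
    "lib/modules/tr/lexicon/allophones.tr.xml",
    "lib/modules/tr/lexicon/tr_lexicon.fst",
    "lib/modules/tr/lexicon/tr.lts",
    "lib/modules/tr/tagger/tr_pos.fst",
]


def install_language_files(source_dir, mary_base, targets=TARGETS):
    """Copy each file from source_dir to its expected place below mary_base.

    Assumption: the source file has the same file name as the target.
    Raises FileNotFoundError if a source file is missing.
    """
    copied = []
    for rel in targets:
        src = Path(source_dir) / Path(rel).name
        dst = Path(mary_base) / rel
        dst.parent.mkdir(parents=True, exist_ok=True)  # create lexicon/ and tagger/ if needed
        shutil.copy2(src, dst)
        copied.append(dst)
    return copied
```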
     241 
     242 
     243It should now be possible to start the MARY server and place a query via the HTTP interface, with input format TEXT, locale tr, and any output format up to TARGETFEATURES. A suitable test request can be placed from http://localhost:59125/documentation.html. It is a good idea to check whether the output for TOKENS, PARTSOFSPEECH, PHONEMES, INTONATION and ALLOPHONES looks roughly as expected. 
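
The same checks can be run from a script instead of the browser. The sketch below assumes that the server accepts GET requests at a `/process` path with the parameters `INPUT_TEXT`, `INPUT_TYPE`, `OUTPUT_TYPE` and `LOCALE`; these names are an assumption here and should be verified against the documentation page of your running server. The helper names `build_query_url` and `query` are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(text, output_type, locale="tr", host="localhost", port=59125):
    """Build a request URL for one output type (parameter names are assumed, see above)."""
    params = {
        "INPUT_TEXT": text,
        "INPUT_TYPE": "TEXT",
        "OUTPUT_TYPE": output_type,
        "LOCALE": locale,
    }
    return f"http://{host}:{port}/process?{urlencode(params)}"


def query(text, output_type, **kwargs):
    """Send the request to a running server and return the response body as text."""
    with urlopen(build_query_url(text, output_type, **kwargs), timeout=10) as resp:
        return resp.read().decode("utf-8")
```

With the server running, looping `query(sentence, t)` over TOKENS, PARTSOFSPEECH, PHONEMES, INTONATION and ALLOPHONES and reading each result by eye covers the checks suggested above.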
     244 
     245The next step requires a running MARY server that uses this config file, so that the FeatureMaker can compute the feature vectors needed to compute diphone coverage. 
    186246 
    187247== 5. Run feature maker with the minimal NLP components ==