Changes between Version 20 and Version 21 of HMMVoiceCreationMary4.0


Timestamp:
08/31/10 17:39:09 (14 years ago)
Author:
marcela_charfuelan
Comment:

--

  • HMMVoiceCreationMary4.0

    v20 v21  
    1  
    2 = '''Voice Import Tools Tutorial : How to build a HMM-based voice for the MARY 4.0 platform''' = 
    3  
    4 For creating HMM-based voices we use a version of the speaker dependent training scripts provided by [http://hts.sp.nitech.ac.jp/ HTS] that was adapted to the MARY 4.0 platform. The steps for building a HMM voice for the MARY platform can be summarised as follows:[[BR]] 
    5  
    6 I) Download MARY TTS including Voice import tools[[BR]] 
    7 II) Check necessary programs and files[[BR]] 
    8 III) Check data: audio and text files[[BR]] 
    9 IV) Run the Voice import tools [[BR]] 
    10 V) Creating a voice in a language other than German or English (US). 
    11  
    12 The training scripts used here are the latest versions, that is, HTS-2.1 and SPTK-3.2 are required. Some scripts have been added or modified to:[[BR]] 
    13 - Use MARY instead of festival as text analyzer.[[BR]] 
    14 - Train bandpass voicing strengths and Fourier magnitudes for mixed excitation.[[BR]] 
    15  
    16 '''MARY requirements:'''[[BR]] 
    17 - Operating System - Linux (tested on Ubuntu 9.04) [[BR]] 
    18 - MARY TTS 4.0.0 including Voice import tools during installation - link: [http://mary.dfki.de/download/4.0/openmary-standalone-install-4.0.0.jar MARY TTS 4.0.0] [[BR]] 
    19 - HTS '''speaker dependent training demo''' adapted to the MARY 4.0.0 platform, included in your MARY TTS 4.0 installation. 
    20  
    21  
    22 ''' 
    23 == I) Download MARY TTS including Voice import tools == 
    24 ''' 
    25  
    26 Click on the latest MARY release [http://mary.dfki.de/download/4.0/openmary-standalone-install-4.0.0.jar MARY download] or download the file and run it with: 
    27 {{{ 
    28 java -jar openmary-standalone-install-4.0.0.jar 
    29 }}} 
    30  
    31  
    32 ''' 
    33 == II) Check the necessary programs and files: == 
    34 ''' 
    35  
    36 We provide a script to facilitate checking and installing the necessary external programs. Once MARY TTS is installed, open a command line shell in your voice building directory and run the shell script: 
    37 {{{ 
    38 $MARY_BASE/lib/external/check_install_external_programs.sh 
    39 }}} 
    40  
    41 With the option '''-check''', this script will check if the necessary programs and versions are installed (that is, the programs can be found in the PATH or in the paths provided by the user).[[BR]] 
    42 With the option '''-install''', this script will try to download and install the necessary programs in: $MARY_BASE/lib/external/bin (if problems occur, it will suggest how to install the programs manually). 
    43  
    44 If you have already installed some of the required programs, '''please include their paths in the PATH variable or provide the paths''', for example: 
    45 {{{ 
    46 $MARY_BASE/lib/external/check_install_external_programs.sh -check /your/path/to/htk/bin /your/path/to/Festival/festvox/src/ehmm/bin 
    47 }}} 
    48  
    49 This script generates a '''$MARY_BASE/lib/external/externalBinaries.config''' file that will be used by the Voice import tools to locate the necessary external programs. 
    50  
    51 The necessary programs that this script checks are:[[BR]] 
    52  
    53 '''HTS requirements:'''[[BR]] 
    54 - [http://hts.sp.nitech.ac.jp/archives/2.1/HTS-2.1_for_HTK-3.4.tar.bz2 HTS-2.1_for_HTK-3.4.patch] [[BR]] 
    55 - HTK-3.4 and HDecode patched with HTS-2.1_for_HTK-3.4.patch links: 
    56     * [http://htk.eng.cam.ac.uk/ftp/software/HTK-3.4.tar.gz HTK-3.4] (you will need to register first) [[BR]]  
    57     * [http://htk.eng.cam.ac.uk/prot-docs/hdecode.shtml HDecode] (you will need to register first) [[BR]] 
    58 - [http://downloads.sourceforge.net/sp-tk/SPTK-3.2.tar.gz SPTK-3.2] [[BR]] 
    59 - [http://downloads.sourceforge.net/hts-engine/hts_engine_API-1.01.tar.gz hts_engine_API-1.01] [[BR]] 
    60  
    61 '''Other requirements:'''[[BR]] 
    62 - awk, normally available in Linux [[BR]] 
    63 - perl, normally available in Linux [[BR]] 
    64 - bc, normally available in Linux [[BR]] 
    65 - sox, v13.0 or greater [http://sox.sourceforge.net/ SoX], normally available in Linux. [[BR]] 
    66 - tcl supporting snack, for example  [http://www.activestate.com/Products/ActiveTcl/ ActiveTcl.] Note that only ActiveTcl 8.4 includes snack; 8.5+ requires manual installation. [[BR]] 
    67 - [http://www.speech.kth.se/snack/download.html snack] library for tcl.  [[BR]] 
    68 - EHMM for automatic labeling, available with [http://festvox.org/download.html festvox-2.1] [[BR]] 
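A quick way to see which of the listed programs are already on the PATH can be sketched as below. This is only an illustration and not the check_install_external_programs.sh script itself; the function name `check_tools` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of MARY): report which of the given
# programs can be found in the PATH. Returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_tools awk perl bc sox` prints one line per program and returns a non-zero status if any of them is missing.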
    69  
    70  
    71  
    72  
    73 ''' 
    74 == III) Check data: audio and text files == 
    75 ''' 
    76  
    77 In your voice building directory execute the step-by-step procedure in [http://mary.opendfki.de/wiki/VoiceImportToolsTutorial VoiceImportToolsTutorial] to make 
    78 sure that the data, sound (wav) and text files are in the correct place and format.[[BR]] 
    79  
    80 As a result of this step your voice building directory should contain a wav directory and a text directory. 
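The result of this step can be checked with a small shell function such as the one below. It is a minimal sketch: the name `check_pairs` is hypothetical, and it assumes that each transcription is stored as text/<basename>.txt next to wav/<basename>.wav, which may differ in your setup.

```shell
#!/bin/sh
# Hypothetical helper: list wav files that have no matching transcription.
# Assumption: transcriptions are named text/<basename>.txt.
check_pairs() {
    voice_dir=$1
    status=0
    for dir in wav text; do
        if [ ! -d "$voice_dir/$dir" ]; then
            echo "missing directory: $dir"
            status=1
        fi
    done
    [ "$status" -eq 0 ] || return 1
    for wav in "$voice_dir"/wav/*.wav; do
        [ -e "$wav" ] || continue      # the glob matched nothing
        base=$(basename "$wav" .wav)
        if [ ! -f "$voice_dir/text/$base.txt" ]; then
            echo "no transcription for: $base.wav"
            status=1
        fi
    done
    return "$status"
}
```

Running `check_pairs .` in the voice building directory prints nothing and returns 0 when both directories exist and every wav file has a transcription.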
    81  
    82  
    83 ''' 
    84 == IV) Run the Voice Import tools == 
    85 ''' 
    86  
    87 In your voice building directory run the voice import tools: 
    88 {{{ 
    89 export MARY_BASE="/your/path/to/MARY TTS/" 
    90 java -Xmx1024m -jar "$MARY_BASE/java/voiceimport.jar" 
    91 }}} 
    92  
    93 After starting the Voice Import Tools, check the global settings of the voice and make sure that the allophones file is provided and exists: 
    94 {{{ 
    95 db.allophoneset = $MARY_BASE/lib/modules/xx/lexicon/allophones.xx.xml  (where xx is the corresponding language) 
    96 }}} 
    97  
    98  
    99 And run the following components: 
    100  
    101  
    102 '''1-''' Run the HMMVoiceDataPreparation of the HMM Voice Trainer group to set up the environment to create a HMM voice and check if required external programs and text and wav files are available and in the correct paths.  
    103  
    104 '''2-''' Run the AllophonesExtractor of the Automatic Labeling group to create the '''prompt_allophones''' directory required in the next step. This component requires the MARY server. [[BR]] 
    105  
    106 '''3-''' Run the EHMMLabeler component of the Automatic Labeling group to automatically label the wav files using the corresponding transcriptions. This procedure might 
    107 take several hours. Before running EHMMLabeler, please use the settings editor of this component to set the following variable according to your festvox installation: 
    108 {{{ 
    109    EHMMLabeler.ehmm  = ../festvox/src/ehmm/bin/ 
    110 }}} 
    111 The result of this step is a '''ehmm/lab''' directory. 
    112  
    113 '''4-''' Run the LabelPauseDeleter component of the Automatic Labeling group. Please use the settings editor of this component to set the variable: 
    114 {{{ 
    115    LabelPauseDeleter.threshold  =  10 
    116 }}} 
    117 The result of this step is a '''lab''' directory. 
    118  
    119 '''5-''' Run the TranscriptionAligner component of the Label-Transcript Alignment group.  This program will create the '''allophones''' directory. 
    120  
    121 '''6-''' Run the PhoneUnitLabelComputer component of the Label-Transcript Alignment group. This procedure has as input the '''lab''' directory and will create as an output the  '''phonelab''' directory.  
    122  
    123 '''7-''' Run the FeatureSelection component of the Feature Extraction group. This program will create a '''mary/features.txt''' file; it requires the MARY server to be running. Select all the features here and save the file. 
    124  
    125 '''8-''' Run the PhoneUnitFeatureComputer component of the Feature Extraction group to extract context feature vectors from the text data. This procedure will create a '''phonefeatures''' directory. For running this component the MARY server should be running as well.  
    126  
    127 '''9-''' Run the PhoneLabelFeatureAligner component of the Verify Alignment group. This procedure will verify the alignment between "phonefeatures" and "phonelab".[[BR]] 
    128  
    129 As a result of previous steps we should have:[[BR]] 
    130 - phonefeatures directory [[BR]] 
    131 - phonelab directory [[BR]] 
    132 - mary/features.txt file [[BR]] 
    133 - $MARY_BASE/lib/external/externalBinaries.config 
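Before starting the training, the presence of these results can be verified with a sketch like the following. The function name `check_step_outputs` and its two arguments (the voice building directory and the MARY base directory) are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: verify that the results of steps 1-9 exist.
# $1 = voice building directory, $2 = MARY base directory (MARY_BASE).
check_step_outputs() {
    voice_dir=$1
    mary_base=$2
    status=0
    for d in phonefeatures phonelab; do
        [ -d "$voice_dir/$d" ] || { echo "missing directory: $d"; status=1; }
    done
    for f in "$voice_dir/mary/features.txt" \
             "$mary_base/lib/external/externalBinaries.config"; do
        [ -f "$f" ] || { echo "missing file: $f"; status=1; }
    done
    return "$status"
}
```

It can be called as `check_step_outputs . "$MARY_BASE"` from the voice building directory.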
    134  
    135  
    136 ''' 
    137 === HMM models training: === 
    138 ''' 
    139  
    140 '''10-''' Run the HMMVoiceConfigure component of the HMM Voice trainer group. The default setting values are already fixed for the arctic slt voice. Some path settings depend on your installation and will be taken from $MARY_BASE/lib/external/externalBinaries.config 
    141  
    142 If running configure for another voice, for example a male German voice, please use the settings editor of this component to set the variables: 
    143 {{{ 
    144   HMMVoiceConfigure.dataSet      =  german_set_name 
    145   HMMVoiceConfigure.speaker      =  speaker_name  
    146   HMMVoiceConfigure.lowerF0      =  40  (male=40,  female=80)   
    147   HMMVoiceConfigure.upperF0      =  280 (male=280, female=350) 
    148 }}} 
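The F0 limits suggested in the comments above can be summarised in a small lookup. This is only an illustrative sketch; the function name `f0_settings` is hypothetical, and the values are the ones given in the settings above, which may need tuning for a particular speaker.

```shell
#!/bin/sh
# Hypothetical helper: print the F0 limits suggested above for a gender.
f0_settings() {
    case $1 in
        male)   lower=40; upper=280 ;;
        female) lower=80; upper=350 ;;
        *)      echo "unknown gender: $1" >&2; return 1 ;;
    esac
    echo "HMMVoiceConfigure.lowerF0 = $lower"
    echo "HMMVoiceConfigure.upperF0 = $upper"
}
```

Calling `f0_settings female` prints the two lines to enter in the settings editor for a female speaker.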
    149  
    150 Using the settings editor of this component you can also change other variables, such as using LSP instead of MGC, the sampling frequency, etc., the same as you would do when running "make configure + parameters" with the original HTS scripts. 
    151  
    152 '''11-''' Run the HMMVoiceFeatureSelection component of the HMM Voice trainer group. This program reads the '''mary/features.txt''' file (created in step 7), and generates the file '''mary/hmmFeatures.txt'''. This file contains extra features, apart from phone and phonological features, that will be used to train HMMs. When running this program a small set of features will be presented on top, separated from the rest by an empty line:[[BR]] 
    153 {{{ 
    154    pos_in_syl 
    155    syl_break 
    156    prev_syl_break 
    157    position_type 
    158     
    159    accented 
    160    accented_syls_from_phrase_end 
    161    accented_syls_from_phrase_start 
    162    breakindex 
    163    edge 
    164    ... 
    165 }}} 
    166 If you are not sure about using other features, use the first four, delete the others and save the file. 
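If you prefer to do this selection from the command line, keeping only the features listed before the first empty line can be sketched as follows. The function name `first_feature_block` is hypothetical, and the sketch assumes that the file on disk has the same layout as shown in the editor (one feature per line, first group ended by an empty line).

```shell
#!/bin/sh
# Hypothetical helper: print the features listed before the first empty line.
first_feature_block() {
    # NF == 0 matches an empty or whitespace-only line; stop reading there.
    # Printing $1 drops the leading indentation of each feature name.
    awk 'NF == 0 { exit } { print $1 }' "$1"
}
```

For example, `first_feature_block mary/hmmFeatures.txt > hmmFeatures.short.txt` writes the short list to a separate file (the output name is arbitrary), so the original stays untouched until you have checked the result.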
    167  
    168 '''12-''' Run the HMMVoiceMakeData component of the HMM Voice trainer group to run the HTS procedure "make data". This procedure requires the following files: 
    169 {{{ 
    170    HMMVoiceMakeData.allophonesFile   = allophones.en_US.xml  # allophones set (language dependent) 
    171    HMMVoiceMakeData.featureListFile  = mary/hmmFeatures.txt  # extra context features used for training HMMs. 
    172 }}} 
    173  
    174 The allophones set file is language dependent; for US English it can be found in $MARY_BASE/lib/modules/en/us/lexicon/allophones.en_US.xml[[BR]] 
    175 The hmmFeatures.txt is the file created in step 11 and contains additional context features, apart from phone and phonological features, used for training HMMs.[[BR]] 
    176  
    177 The HMMVoiceMakeData procedure is similar to the original HTS scripts with additional sections for calculating strengths, Fourier magnitudes (for mixed excitation), global variance and composing training data files from mgc, lf0, str and mag files. This component will execute in the hts/data/ directory:  
    178 {{{ 
    179   make all-mary  or 
    180   make mgc lf0 str-mary mag-mary cmp-mary gv-mary gv list scp  
    181 }}} 
    182 The '''label''' directory and the '''mlf''' files in MARY are created by the Voice Import Tools: HMMVoiceMakeData.makeLabels()[[BR]] 
    183 The '''questions''' file in MARY is created by the Voice Import Tools: HMMVoiceMakeData.makeQuestions()  
    184  
    185  
    186 Individual procedures can be repeated in isolation by adjusting the settings of this component. For example, if the procedure that creates strengths (in the str directory) has to be repeated with a different set of filters (data/filters/), set: 
    187 {{{ 
    188   HMMVoiceMakeData.makeSTR       =  1 
    189   HMMVoiceMakeData.makeCMPMARY   =  1 
    190 }}} 
    191 Set all the other variables to 0 and run the component again. (In this case you need to run makeCMPMARY again because the mgc+lf0+str+mag vectors have to be composed again.) 
    192  
    193 The procedures can be repeated manually as well, going to the hts/data directory and running "make str-mary" and "make cmp-mary". 
    194  
    195 NOTE: the Makefile in data/ includes a gv: section to calculate global variance files. In MARY, these files are generated little endian and contain a header of size one short to indicate the size of the vectors it contains. 
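To inspect such a file, the header can be read with `od`. The sketch below is only an illustration: the function name `gv_vector_size` is hypothetical, and it assumes that the "short" is 2 bytes long and can be treated as an unsigned value.

```shell
#!/bin/sh
# Hypothetical helper: read the header of a little endian gv file.
# Assumptions: the header "short" is 2 bytes and is treated as unsigned.
gv_vector_size() {
    # od prints the first two bytes as unsigned decimals, e.g. "  25   0".
    set -- $(od -An -t u1 -N 2 "$1")
    # Little endian: the first byte is the least significant one.
    echo $(( $1 + 256 * $2 ))
}
```

It can be applied to each of the files matching data/gv/gv-*-littend.pdf to print the vector size stored in its header. Reading the two bytes separately keeps the result independent of the byte order of the machine running the check.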
    196  
    197  
    198 '''13-''' Run the HMMVoiceMakeVoice component of the HMM Voice trainer group. Here again, particular training steps can be repeated by selecting them in the settings of this component (setting them to 1 and all the others to 0). This is equivalent to running again: 
    199 {{{ 
    200    perl scripts/Training.pl scripts/Config.pm > logfile & 
    201 }}} 
    202 after modifying the Config.pm file, as is normally done with the original HTS scripts. 
    203   
    204 This component will generate general information about the execution of the training steps. Detailed information about the training status can be found in the logfile in the current directory. 
    205  
    206 The training procedure can take several hours; please check the log file from time to time to follow the progress. 
    207  
    208  
    209 ''' 
    210 === Adding a new voice in the MARY platform: === 
    211 ''' 
    212  
    213 '''14-''' Run the HMMVoiceInstaller component of the Install Voice group. The default setting values of this component are already fixed for the HTS-demo_CMU-ARCTIC-SLT voice. Some settings of the voice can be changed here, for example: 
    214 {{{ 
    215   HMMVoiceInstaller.useMixExc   =  true 
    216                                    set this variable to true if using mixed excitation 
    217   HMMVoiceInstaller.useGV       =  true  
    218                                    set this variable to true if using global variance in parameter generation. 
    219 }}} 
    220  
    221 The VoiceInstaller will:  [[BR]] 
    222 - Create a new mary config file in: $MARY_BASE/conf/german-hsmm-voice.config [[BR]] 
    223 - Add the files corresponding to this voice in: $MARY_BASE/lib/voices/hsmm-voice/  [[BR]] 
    224 - Copy one example of phonefeatures for testing the synthesiser: data/phonefeatures/cmu_us_arctic_slt_xxxx.pfeats to $MARY_BASE/lib/voices/hsmm-voice  [[BR]] 
    225 - Copy the HTS trees: voices/qst001/ver1/*.inf to $MARY_BASE/lib/voices/hsmm-voice  [[BR]] 
    226 - Copy the HTS PDF models: voices/qst001/ver1/*.pdf to $MARY_BASE/lib/voices/hsmm-voice [[BR]] 
    227 - Copy global variance models (if useGV is set to true): data/gv/gv-*-littend.pdf to $MARY_BASE/lib/voices/hsmm-voice [[BR]] 
    228 - Copy filter taps for mixed excitation: data/filters/mix_excitation_filters.txt to $MARY_BASE/lib/voices/hsmm-voice [[BR]] 
    229 - Copy the trickyPhones.txt file, if one was created during training, to $MARY_BASE/lib/voices/hsmm-voice [[BR]] 
    230 After successfully installing a new voice, it can be used with the mary_server and the mary_client. 
    231  
    232  
    233 ''' 
    234 == V) Creating a voice in a language other than German or English (US) == 
    235 ''' 
    236  
    237 If you are creating a voice in another language you will need to specify: 
    238  
    239 - '''Minimal NLP components''': if you are creating a new voice from scratch, for example following the steps in [http://mary.opendfki.de/wiki/NewLanguageSupport NewLanguageSupport], you will need to create Minimal NLP components for the new language. These minimal components are necessary to run the MARY server in the new language and extract context features ('''phonefeatures''' directory). 
    240  
    241 - '''Phoneme set''':  contained in $MARY_BASE/lib/modules/xx/lexicon/allophones.xx.xml , where xx corresponds to the new language. 
    242  
    243 - After creating the minimal components, you will need wav files (in a wav directory) and the corresponding transcriptions (one file per wav file in a text directory). [[BR]] 
    244 Afterwards follow the instructions as normal from step 1. Provide general settings for: 
    245 {{{ 
    246    db.gender    =  male  (or female) 
    247    db.locale    =  new_language locale (according to your minimal NLP components, e.g. tr for Turkish, te for Telugu, etc.) 
    248    db.marybase  =  /path/to/mary/base/ 
    249    db.voicename =  new_language_voice_name 
    250 }}} 
    251  
    252  
    253    
    254  
    255 [[BR]] 
    256 [[BR]] 
    257  
    258 Marcela Charfuelan[[BR]] 
    259 Thu Sep 24 15:19:25 CEST 2009 
     1 This page has been redirected to: [wiki:HMMVoiceCreation HMMVoiceCreation]