Changes between Version 1 and Version 2 of HMMVoiceCreationAdapt


Timestamp:
05/09/08 14:53:26
Author:
marcela_charfuelan
Comment:

--

  • HMMVoiceCreationAdapt

    v1 v2  
    22 
    33 
    4 = '''(Draft) Voice Import Tools Tutorial : How to build an adapted HMM-based voice for the MARY platform''' = 
     4= '''Voice Import Tools Tutorial : How to build an adapted HMM-based voice for the MARY platform''' = 
    55 
    66An adapted HMM-based voice is a voice created after adapting the (generally) small corpus of a particular voice to another voice that has been trained with  
     
    30300.1) download and un-zip, un-tar the latest speaker adaptation/adaptive training demo for English. 
    3131 
    32 Here it is used: http://hts.sp.nitech.ac.jp/archives/2.0.1/HTS-demo_CMU-ARCTIC-ADAPT.tar.bz2 for HTS-2.0.1 
     32http://hts.sp.nitech.ac.jp/archives/2.0.1/HTS-demo_CMU-ARCTIC-ADAPT.tar.bz2 for HTS-2.0.1 
    3333 
    34340.2) download and unzip the adaptation/adaptive patch file for using MARY instead of Festival as text analyser. 
     
    3636https://mary.opendfki.de/repos/trunk/lib/hts/HTS-2.0.1-demo_CMU-ARCTIC-ADAPT_for_Mary-3.6.patch.zip  [[BR]] 
    3737 
    38 apply the patch to the HTS-demo_CMU-ARCTIC-ADAPT directory:  [[BR]]  
     38apply the patch to the HTS-demo_CMU-ARCTIC-ADAPT directory: 
     39{{{ 
    3940   patch -p1 -d . < HTS-2.0.1-demo_CMU-ARCTIC-ADAPT_for_Mary-3.6.patch 
     41}}} 
    4042 
    41430.3) create a wav directory. 
     
    43450.4) Run the VoiceImport program 
    4446 
    45 First of all you need to set your MARY_BASE directory:  [[BR]] 
     47First of all you need to set your MARY_BASE directory and then run the program: 
     48{{{ 
    4649   export MARY_BASE="/dir/to/openmary" 
    47  
    48 then you can run:  [[BR]] 
    4950   java -jar -Xmx1024m  $MARY_BASE/java/voiceimport.jar 
     51}}} 
    5052 
    5153If you are not familiar or have problems with the VoiceImport program, please read and follow the instructions in the Voice Import Tools 
     
    5355 
    5456If you want to create another adapted voice in German or English please see the section V below. 
     57 
     58Please remember that whenever you are in doubt about the settings of a particular component you can check its corresponding help for a description of the meaning (and possible values) of each variable. 
    5559 
    5660''' 
     
    129133HTS-demo_CMU-ARCTIC-ADAPT voice. 
    130134 
    131 IMPORTANT: the names of the files should contain a label that identifies the data set and a label that identifies the voice.  
     135IMPORTANT: the names of the files contain a label that identifies the data set (cmu_us_arctic) and another label that identifies the voice (awb).  
    132136This is important because the training scripts require a mask to differentiate the data from one user to another.  
    133 The spkrMask label should reflect the data set and the voice name labels. For example for the CMU_US_ARTIC data the settings are: 
     137Another important configuration setting is the f0Ranges, that is the set of F0 ranges for all the voices. The format of this setting is:[[BR]] 
     138spkr1 lowerF01 upperF01  spkr2 lowerF02 upperF02  ... . The voice order of appearance is first the trainSpkr names and then the adaptSpkr names. 
     139 
     140For example, the spkrMask and f0Ranges settings for the CMU_US_ARCTIC data are: 
    134141{{{ 
    135142HMMVoiceConfigureAdapt.dataSet     = cmu_us_arctic 
    136 HMMVoiceConfigureAdapt.trainSpkr   = awb bdl clb jmk rms 
     143HMMVoiceConfigureAdapt.trainSpkr   = 'awb bdl clb jmk rms'   (please use quotes if there is more than one name) 
    137144HMMVoiceConfigureAdapt.adaptSpkr   = slt 
    138145HMMVoiceConfigureAdapt.spkrMask    = */cmu_us_arctic_%%%_* 
    139146                                    (here the voice name is exactly 3 letters, so all the voice names should be 3 letters long) 
    140 }}} 
    141  
    142 The file names for CMU_US_ARCTIC have the fromat: 
     147HMMVoiceConfigureAdapt.f0Ranges    = 'awb 40 280  bdl 40 280  clb 80 350  jmk 40 280  rms 40 280  slt 80 350'    
     148                                     (please leave two spaces after each set) 
     149}}} 
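The f0Ranges format described above can be checked before running the component. The following is a minimal Python sketch, not part of the MARY tools: the function names `parse_f0_ranges` and `order_matches` are hypothetical, and the sketch only assumes the "name lower upper" triplet layout and the trainSpkr-then-adaptSpkr ordering stated above. It splits on any whitespace, so it does not verify the two-space separation between sets.

```python
def parse_f0_ranges(setting):
    """Parse an f0Ranges value into {speaker: (lower_f0, upper_f0)}.

    Hypothetical helper: assumes triplets of 'name lower upper',
    optionally wrapped in single quotes as in the settings shown above.
    """
    tokens = setting.strip().strip("'").split()
    if len(tokens) % 3 != 0:
        raise ValueError("expected triplets of: name lowerF0 upperF0")
    ranges = {}
    for i in range(0, len(tokens), 3):
        name = tokens[i]
        lower, upper = int(tokens[i + 1]), int(tokens[i + 2])
        if lower >= upper:
            raise ValueError("lower F0 must be below upper F0 for " + name)
        ranges[name] = (lower, upper)
    return ranges


def order_matches(ranges, train_spkr, adapt_spkr):
    """True if speakers appear as trainSpkr names first, then adaptSpkr names."""
    expected = train_spkr.strip("'").split() + adapt_spkr.strip("'").split()
    return list(ranges) == expected


if __name__ == "__main__":
    f0 = "'awb 40 280  bdl 40 280  clb 80 350  jmk 40 280  rms 40 280  slt 80 350'"
    parsed = parse_f0_ranges(f0)
    print(parsed["slt"])  # (80, 350)
    print(order_matches(parsed, "'awb bdl clb jmk rms'", "slt"))  # True
```

The order check relies on Python dictionaries preserving insertion order (Python 3.7 or later).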
     150 
     151The file names for CMU_US_ARCTIC have the format: 
    143152{{{ 
    144153awb --> cmu_us_arctic_awb_*.* 
     
    149158slt --> cmu_us_arctic_slt_*.* 
    150159}}} 
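To see why the voice label must have a fixed length, the mask can be emulated in a few lines of Python. This is only a sketch under an assumption: that the mask follows the usual HTK-style convention in which `*` matches any run of characters, `?` matches one character, and the characters matched by `%` form the speaker name. The helper names `mask_to_regex` and `speaker_from_path`, and the example paths, are hypothetical and not part of the HTS or MARY scripts.

```python
import re


def mask_to_regex(mask):
    """Translate a speaker mask into a regex; each '%' captures one character."""
    parts = []
    for ch in mask:
        if ch == "%":
            parts.append("(.)")   # one character of the speaker name
        elif ch == "*":
            parts.append(".*")    # any run of characters
        elif ch == "?":
            parts.append(".")     # exactly one character, not captured
        else:
            parts.append(re.escape(ch))
    return re.compile("^" + "".join(parts) + "$")


def speaker_from_path(mask, path):
    """Return the speaker label selected by the mask, or None if no match."""
    match = mask_to_regex(mask).match(path)
    if match is None:
        return None
    return "".join(match.groups())


if __name__ == "__main__":
    mask = "*/cmu_us_arctic_%%%_*"
    # Example path is illustrative only.
    print(speaker_from_path(mask, "data/cmp/cmu_us_arctic_awb_a0001.cmp"))  # awb
```

With this mask a file whose voice label is longer than three letters does not match at all, which is the reason all voice names need the same length.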
     160 
    151161Using the settings of this component you can also change other variables like using LSP instead of MGC, sampling frequency, etc., 
    152162the same as you would do when running "make configure" with the original HTS scripts. 
    153163 
    154164 
    155 8- Run HMMVoiceMakeData component of the HMM Voice trainer group to run the HTS procedure "make data". This procedure is the same as in the original scripts with additional sections for calculating strenghts (for mixed exitation), global variance, and handling of MARY context features. 
    156  
    157 Particular procedures can be repeated isolated, fixing the particular settings for this component. For example, if the procedure that creates strengths (str directory) has to be repeated with a different set of filters (data/filters/), please set: 
    158 {{{ 
    159   HMMVoiceMakeData.makeSTR       1 
    160   HMMVocieMakeData.makeCMPMARY   1 
    161 }}} 
    162 all the other variables in 0, and run again the component. (In this case you need to run as well makeCMPMARY because you need to compose again the vectors mgc+lf0+str). 
    163  
    164 The procedures can be repeated manually as well, going to the data directory and running "make data" or "make str", as is normally done with the original HTS scripts. 
    165  
    166 Note: the Makefile in data/ includes a gv: section copied from HTS-2.1alpha version to calculate global variance files. In MARY, these files are generated little endian and contain a header of size one short to indicate the size of the vectors it contains. In the case of adapted voices, the gv variance is calculated from the adapted corpus. 
     1658- Run the HMMVoiceMakeData component of the HMM Voice trainer group to execute the HTS procedure "make data". This procedure is the same as in the original scripts with additional sections for calculating strengths (for mixed excitation), global variance, and handling of MARY context features. 
     166 
     167NOTE: the Makefile in data/ includes a gv: section copied from the HTS-2.1alpha version to calculate global variance files. In MARY, these files are generated little endian and contain a header of size one short that indicates the size of the vectors they contain. In the case of adapted voices, the gv variance is calculated from the adapted corpus for each adapted voice. 
    167168 
    1681699- Run the HMMVoiceMakeVoiceAdapt component of the HMM Voice trainer group. Here again, particular training steps can be repeated by selecting them in the settings of this component (set them to 1 and all the others to 0). This is equivalent to running again: 
     
    174175This component will generate general information about the execution of the training steps. Detailed information about the training status can be found in the logfile in the current directory. 
    175176 
    176 The training procedure can take several hours (or days), check the log file time to time to check progress. 
     177The adaptive training procedure can take several hours (or days); check the log file from time to time to monitor progress. 
    177178 
    178179 
     
    181182''' 
    182183 
    183 10- Run HMMVoiceInstaller component of the Install Voice group. This step is similar to the the speaker dependent demo, but please set the appropriate directories 
    184 for the gv and voices data, which should be in the directory of the adapted voice, for example for the adapted slt voice: 
     18410- Run the HMMVoiceInstaller component of the Install Voice group. This step is similar to the speaker dependent demo, but the gv and voice directories 
     185have to be set according to the adapted voice. For example, for the adapted slt voice please set: 
    185186{{{ 
    186187HMMVoiceInstaller.Fgva data/gv/slt/gv-mag-littend.pdf 
     
    203204}}} 
    204205 
     206If more than one voice is adapted, this procedure should be repeated setting the appropriate directories for gv and voice. 
    205207 
    206208''' 
     
    210212If using German: 
    211213 
    212 For creating a new German voice it is necessary: [[BR]] 
     214For creating a new adapted German voice it is necessary: [[BR]] 
    213215  * a wav or raw directory with the speech files you will use for training the German voice. [[BR]] 
    214216  * transcriptions of the files, one text file per speech file, or transcriptions in festival format if available. [[BR]] 
    215217 
     218Please be aware of the file name format required by the adaptive scripts. Since a mask is applied to the names, your file names should follow a fixed format. 
     219For example, we experimented with adapting a neutral voice to different styles using the male German PAVOQUE database. For this database the file names have the format: 
     220{{{ 
     221neutr --> pavoque_neutr_*.*    training data, big corpus, male voice with neutral style. 
     222obadi --> pavoque_obadi_*.*    data for adaptation, small corpus, the same male voice but with depressed style.  
     223poppy --> pavoque_poppy_*.*    data for adaptation, small corpus, the same male voice but with happy style.  
     224spike --> pavoque_spike_*.*    data for adaptation, small corpus, the same male voice but with angry style.  
     225}}} 
     226 
     227Having this distribution of files, our settings for configureAdapt looked like: 
     228{{{ 
     229HMMVoiceConfigureAdapt.dataSet     = pavoque 
     230HMMVoiceConfigureAdapt.trainSpkr   = neutr 
     231HMMVoiceConfigureAdapt.adaptSpkr   = 'obadi poppy spike' 
     232HMMVoiceConfigureAdapt.spkrMask    = */pavoque_%%%%%_* 
     233                                    (here the voice names are exactly 5 letters long; a voice name cannot have more than 5 letters!) 
     234HMMVoiceConfigureAdapt.f0Ranges    = 'neutr 40 280  obadi 40 280  poppy 40 280  spike 40 280'    
     235                                     (please leave two spaces after each set) 
     236}}} 
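A quick consistency check of these settings can catch a voice name whose length does not fit the mask. The sketch below is illustrative only: `misfit_speaker_names` is a hypothetical helper, and it assumes that the number of `%` characters in spkrMask is the required length of every name in trainSpkr and adaptSpkr, as the comment in the settings above indicates.

```python
def misfit_speaker_names(spkr_mask, train_spkr, adapt_spkr):
    """Return the speaker names whose length differs from the mask width.

    Assumption: each '%' in the mask stands for one character of the name.
    """
    width = spkr_mask.count("%")
    names = train_spkr.strip("'").split() + adapt_spkr.strip("'").split()
    return [name for name in names if len(name) != width]


if __name__ == "__main__":
    bad = misfit_speaker_names("*/pavoque_%%%%%_*", "neutr", "'obadi poppy spike'")
    print(bad)  # [] because every name is exactly 5 letters long
```

An empty list means all names fit the mask; any name returned would have to be renamed (together with its files) before training.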
     237 
    216238Then we use as a base the original HTS-demo_CMU-ARCTIC-SLT directory: 
    217239 
     
    227249- Move your transcription files to this directory. If you have a text directory containing the transcription of each file in separate files, copy it into the current directory (german_voice/text). If you have transcriptions in festival format, copy them into the data/utts directory (german_voice/data/utts/). 
    228250 
    229  
    230 - Please be aware of the names format for the adaptive scripts. Since it is used a mask for the names it is bettr if the names of your files have a particular format.  
    231 For example for our PAVOQUE database the file names have the format: 
    232 {{{ 
    233 neutr --> pavoque_neutr_*.*    training data, big corpus, male voice and sound effect neutral. 
    234 obadi --> pavoque_obadi_*.*    adapted voice, small corpus, the same male voice but sound effect depressed.  
    235 poppy --> pavoque_poppy_*.*    adapted voice, small corpus, the same male voice but sound effect happy.  
    236 spike --> pavoque_spike_*.*    adapted voice, small corpus, the same male voice but sound effect angry.  
    237 }}} 
    238  
    239 Having this distribution of files, our setting for configure will look like: 
    240 {{{ 
    241 HMMVoiceConfigureAdapt.dataSet     = pavoque 
    242 HMMVoiceConfigureAdapt.trainSpkr   = neutr 
    243 HMMVoiceConfigureAdapt.adaptSpkr   = obadi poppy spike 
    244 HMMVoiceConfigureAdapt.spkrMask    = */pavoque_%%%%%_* 
    245                                     (here the voice names are exactly 5 letters long, it can not be a voice name with more that 5 letters!) 
    246 }}} 
    247  
    248 - Now run the VoiceImport program and follow the instructions as normal. Provide settings for locale must be "de", path to mary_base and name of the voice. 
     251- Now run the VoiceImport program and follow the HMMVoiceCreationAdapt instructions as normal. Provide the settings for the locale (for German it must be "de"), the path to mary_base and the name of the voice.  
     252NOTE: In our PAVOQUE example we generate three adapted voices, so the name of the voice at the beginning (during training) can be a general one, but during installation, please set the name according to the adapted voice you are going to install. 
     253 
    249254 
    250255[[BR]] 
    251256[[BR]] 
    252257 
    253 Marcela Charfuelan DFKI 
    254 Thu May  8 17:26:44 CEST 2008 
    255  
    256  
     258Marcela Charfuelan [[BR]] 
     259DFKI - Fri May  9 14:35:27 CEST 2008 
     260