Changes between Version 9 and Version 10 of HMMVoiceCreation


Timestamp:
05/09/08 11:55:16 (17 years ago)
Author:
marcela_charfuelan
Comment:

--

  • HMMVoiceCreation

    v9 v10  
    1313V)   Creating other voice in German or English (__if you want to train HMMs with another speech database__).[[BR]] 
    1414 
    15 The previous steps will be explained creating a HMM voice using the HTS '''speaker dependent training demo''':[[BR]] 
    16 http://hts.sp.nitech.ac.jp/archives/2.0.1/HTS-demo_CMU-ARCTIC-SLT.tar.bz2 
    17  
    18 For an explanation on how to create an adapted HMM voice using the '''speaker adaptation/adaptive training demo''':[[BR]] 
    19 http://hts.sp.nitech.ac.jp/archives/2.0.1/HTS-demo_CMU-ARCTIC-ADAPT.tar.bz2 [[BR]] 
    20 please see [wiki:HMMVoiceCreationAdapt]. 
     15The previous steps are explained below by creating an HMM voice using the HTS '''speaker dependent training demo'''.[[BR]] 
     16 
     17For an explanation on how to create an adapted HMM voice using the '''speaker adaptation/adaptive training demo''' please see [wiki:HMMVoiceCreationAdapt]. [[BR]] 
    2118 
    2219The training scripts used here are the latest versions; that is, HTS_2.0.1 and SPTK-3.1 are required. Some scripts have been added or modified to:[[BR]] 
     
    51480.1) download and un-zip, un-tar the latest speaker dependent training demo for English. 
    5249 
    53 Here it is used: HTS-demo_CMU-ARCTIC-SLT.tar.bz2 for HTS-2.0.1 
     50http://hts.sp.nitech.ac.jp/archives/2.0.1/HTS-demo_CMU-ARCTIC-SLT.tar.bz2 for HTS-2.0.1 
    5451 
    55520.2) download and unzip the patch file for using MARY instead of Festival as text analyser. 
     
    5855 
    5956apply the patch to the HTS-demo_CMU-ARCTIC-SLT directory:  [[BR]]  
     57{{{ 
    6058   patch -p1 -d . < HTS-2.0.1-demo_CMU-ARCTIC-SLT_for_Mary-3.5.0.patch 
     59}}} 
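
Steps 0.1 and 0.2 can be sketched as a small script. This is only a sketch under assumptions: `wget` as the download tool, the patch file already saved in the current directory, and the `run` helper and `DRY_RUN` variable are hypothetical names introduced here. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of steps 0.1 and 0.2. DRY_RUN defaults to 1, so nothing is
# downloaded or patched unless DRY_RUN=0 is set explicitly.
DEMO_URL="http://hts.sp.nitech.ac.jp/archives/2.0.1/HTS-demo_CMU-ARCTIC-SLT.tar.bz2"
DEMO_DIR="HTS-demo_CMU-ARCTIC-SLT"
# Assumption: the MARY patch file has been saved in the current directory.
PATCH_FILE="$PWD/HTS-2.0.1-demo_CMU-ARCTIC-SLT_for_Mary-3.5.0.patch"

# Hypothetical helper: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_demo() {
    run wget "$DEMO_URL"                            # 0.1) download
    run tar -xjf "$(basename "$DEMO_URL")"          # 0.1) un-zip and un-tar
    run patch -p1 -d "$DEMO_DIR" -i "$PATCH_FILE"   # 0.2) apply the MARY patch
}

prepare_demo
```

Running the script with `DRY_RUN=0` executes the commands instead of printing them.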
    6160 
    62610.3) create a wav directory. 
     
    64630.4) Run the VoiceImport program 
    6564 
    66 First of all you need to set your MARY_BASE directory:  [[BR]] 
     65First of all you need to set your MARY_BASE directory and then run the program:  [[BR]] 
     66{{{ 
    6767   export MARY_BASE="/dir/to/openmary" 
    68  
    69 then you can run:  [[BR]] 
    7068   java -Xmx1024m -jar "$MARY_BASE/java/voiceimport.jar" 
     69}}} 
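
A wrapper can check the MARY_BASE setting before starting the program. The following is a minimal sketch; the function name `check_mary_base` and the `--run` switch are hypothetical, and only the jar path `$MARY_BASE/java/voiceimport.jar` is taken from the command above.

```shell
#!/bin/sh
# Check that MARY_BASE is set and contains voiceimport.jar.
# Returns 1 if MARY_BASE is unset, 2 if the jar is missing, 0 otherwise.
check_mary_base() {
    if [ -z "${MARY_BASE:-}" ]; then
        echo "MARY_BASE is not set" >&2
        return 1
    fi
    if [ ! -f "$MARY_BASE/java/voiceimport.jar" ]; then
        echo "voiceimport.jar not found under $MARY_BASE/java" >&2
        return 2
    fi
    return 0
}

# Launch VoiceImport only when called with --run (hypothetical switch),
# so that sourcing this file has no side effects.
if [ "${1:-}" = "--run" ]; then
    check_mary_base && java -Xmx1024m -jar "$MARY_BASE/java/voiceimport.jar"
fi
```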
    7170 
    7271If you are not familiar with the VoiceImport program or have problems with it, please read and follow the instructions in the Voice Import Tools 
     
    7978''' 
    8079 
    81 1- Run HMMVoiceDataPreparation of the HMM Voice Trainer group, to check if text, wav and data/raw files are available and in the correct paths. 
    82 If just data/raw provided, the program will do the conversion.  If no text files are available but data/utts in festival format, the program will do the conversion as well. 
    83  
    84 2- Run PhoneUnitFeatureComputer component of the Feature Extraction group to extract context feature vectors from the text data. This procedure will create a "phonefeatures" directory.  For running this component the MARY server should be running as well. 
    85  
    86 3- Run the EHMMlabeler component of the Automatic Labeling group to label automatically the wav files using the corresponding transcriptions. 
    87 For running EHMMLabeler, please set:  [[BR]] 
    88    * EHMMLabeler.ehmm to corresponding path in ../festvox/src/ehmm/bin/ 
    89  
    90 4- Run LabelPauseDeleter component of the Automatic Labeling group. Please set:  [[BR]] 
    91    * LabelPauseDeleter.threshold = 10. 
    92  
    93 5- Run PhoneUnitLabelComputer component of the Labels and Pause Correction group. 
    94  
    95 6- Run PhonelabelFeatureAligner component of the Labels and Pause Correction group. This procedure will create a "phonelab" directory. 
     801- Run the HMMVoiceDataPreparation component of the HMM Voice Trainer group to check that the text, wav and data/raw files are available and in the correct paths. 
     81If only data/raw is provided, the program will do the conversion.  If no text files are available but data/utts in Festival format is provided, the program will do the  
     82conversion as well. 
     83 
     842- Run the PhoneUnitFeatureComputer component of the Feature Extraction group to extract context feature vectors from the text data. This procedure will create a "phonefeatures" directory.  For running this component the MARY server should be running as well. 
     85 
     863- Run the EHMMLabeler component of the Automatic Labeling group to automatically label the wav files using the corresponding transcriptions. This procedure might 
     87take several hours. Before running EHMMLabeler, please use the settings editor of this component to set the following variable according to your festvox installation: 
     88{{{ 
     89   EHMMLabeler.ehmm  = ../festvox/src/ehmm/bin/ 
     90}}} 
     91 
     924- Run the LabelPauseDeleter component of the Automatic Labeling group. Please use the settings editor of this component to set the variable: 
     93{{{ 
     94   LabelPauseDeleter.threshold  =  10 
     95}}} 
     96 
     975- Run the PhoneUnitLabelComputer component of the Labels and Pause Correction group. This procedure will create a "phonelab" directory. 
     98 
     996- Run the PhonelabelFeatureAligner component of the Labels and Pause Correction group. This procedure will verify alignment between "phonefeatures" and "phonelabels". 
     100 
    96101 
    97102''' 
     
    99104''' 
    100105 
    101 7- Run HMMVoiceConfigure component of the HMM Voice trainer group, the default setting values of this component are already fixed for the 
     1067- Run the HMMVoiceConfigure component of the HMM Voice trainer group. The default setting values of this component are already fixed for the 
    102107HTS-demo_CMU-ARCTIC-SLT voice. 
    103108 
    104 If running for other voice, for example a male German voice, please set:  [[BR]] 
    105   * HMMVoiceConfigure.dataSet     : german_set_name  [[BR]] 
    106   * HMMVoiceConfigure.featureList : feature_list_de.pl (context features used for this voice can be change in this file). [[BR]] 
    107   * HMMVoiceConfigure.lowerF0     : 40 (for male) [[BR]] 
    108   * HMMVoiceConfigure.speaker     : speaker_name [[BR]] 
    109   * HMMVoiceConfigure.upperF0     : 280 (for male)  [[BR]] 
    110   * HMMVoiceConfigure.voiceLang   : de [[BR]] 
    111  
    112 Using the setting of this component you can also change other variables like using LSP instead og MGC, sampling frequency, etc., 
    113 the same as you would do when running "make configure" with the original HTS scripts. 
    114  
    115 8- Run HMMVoiceMakeData component of the HMM Voice trainer group to run the HTS procedure "make data". This procedure is the same as in the original scripts with additional sections for calculating strenghts (for mixed exitation), global variance, and handling of MARY context features. 
    116  
    117 Particular procedures can be repeated isolated, fixing the particular settings for this component. For example, if the procedure that creates strengths (str directory) has to be repeated with a different set of filters (data/filters/), please set: [[BR]] 
    118   * HMMVoiceMakeData.makeSTR       1  [[BR]] 
    119   * HMMVocieMakeData.makeCMPMARY   1  [[BR]] 
     109If running configure for another voice, for example a male German voice, please use the settings editor of this component to set the variables: 
     110{{{ 
     111  HMMVoiceConfigure.dataSet      =  german_set_name 
     112  HMMVoiceConfigure.featureList  =  feature_list_de.pl  (the set of context features used for this voice can be changed in this file). 
     113  HMMVoiceConfigure.lowerF0      =  40 (for male)  
     114  HMMVoiceConfigure.speaker      =  speaker_name  
     115  HMMVoiceConfigure.upperF0      =  280 (for male) 
     116  HMMVoiceConfigure.voiceLang    =  de 
     117}}} 
     118 
     119Using the settings editor of this component you can also change other variables like using LSP instead of MGC, sampling frequency, etc., 
     120the same as you would do when running "make configure + parameters" with the original HTS scripts. 
     121 
     1228- Run the HMMVoiceMakeData component of the HMM Voice trainer group to run the HTS procedure "make data". This procedure is the same as in the original scripts with additional sections for calculating strengths (for mixed excitation), global variance, and handling of MARY context features. 
     123 
     124Particular procedures can be repeated in isolation by fixing the corresponding settings of this component. For example, if the procedure that creates strengths (in the str directory) has to be repeated with a different set of filters (data/filters/), please set: 
     125{{{ 
     126  HMMVoiceMakeData.makeSTR       =  1 
     127  HMMVoiceMakeData.makeCMPMARY   =  1 
     128}}} 
    120129with all the other variables set to 0, and run the component again. (In this case you also need to run makeCMPMARY, because the mgc+lf0+str vectors have to be composed again.) 
    121130 
    122131The procedures can be repeated manually as well, going to the data directory and running "make data" or "make str", as is normally done with the original HTS scripts. 
    123132 
    124 Note: the Makefile in data/ includes a gv: section copied from HTS-2.1alpha version to calculate global variance files. In MARY, this files are generated little endian and contain a header of size one short to indicate the size of the vectors it contains. 
    125  
    126 9- Run HMMVoiceMakeVoice component of the HMM Voice trainer group, here again particular training steps can be repeated selecting them (setting in 1, all the others in 0) from the settings of this component. This is equivalent to run again:  [[BR]] 
    127    perl scripts/Training.pl scripts/Config.pm  [[BR]] 
     133NOTE: the Makefile in data/ includes a gv: section copied from the HTS-2.1alpha version to calculate global variance files. In MARY, these files are generated in little-endian format and contain a header of one short that indicates the size of the vectors they contain. 
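
The header described in the note can be inspected from the shell. The sketch below assumes the header is a single unsigned 16-bit little-endian value at the very start of the file; the function name `gv_vector_size` is hypothetical, and the path of the global variance file has to be supplied by the caller.

```shell
#!/bin/sh
# Print the vector size stored in the first two bytes of a global variance
# file, interpreted as one unsigned little-endian short (an assumption).
gv_vector_size() {
    gv_file=$1
    # od prints the two header bytes as unsigned decimals, low byte first.
    set -- $(od -An -tu1 -N2 "$gv_file")
    if [ $# -ne 2 ]; then
        echo "cannot read a 2-byte header from $gv_file" >&2
        return 1
    fi
    # Little endian: value = low byte + 256 * high byte.
    echo $(( $1 + 256 * $2 ))
}
```

Reading the two bytes separately keeps the result independent of the byte order of the machine that runs the check.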
     134 
     1359- Run the HMMVoiceMakeVoice component of the HMM Voice trainer group. Here again, particular training steps can be repeated by selecting them (set them to 1 and all the others to 0) in the settings of this component. This is equivalent to running again: 
     136{{{ 
     137   perl scripts/Training.pl scripts/Config.pm 
     138}}} 
    128139after modifying the Config.pm file, as is normally done with the original HTS scripts. 
    129140  
    130141This component will generate general information about the execution of the training steps. Detailed information about the training status can be found in the logfile in the current directory. 
    131142 
    132 The training procedure can take several hours, check the log file time to time to check progress. 
     143The training procedure can take several hours; please check the log file from time to time to follow the progress. 
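
Checking the log can be scripted with a small helper. This is a sketch only: the function name `show_progress` is hypothetical, and the log file path is passed as an argument because its actual name depends on the setup.

```shell
#!/bin/sh
# Print the last lines of a training log file.
# Arguments: path to the log file, number of lines to show (default 5).
show_progress() {
    logfile=$1
    lines=${2:-5}
    if [ ! -f "$logfile" ]; then
        echo "no log file at $logfile yet" >&2
        return 1
    fi
    tail -n "$lines" "$logfile"
}
```

For example, `while sleep 600; do show_progress path/to/logfile; done` would print the latest log lines every ten minutes, with `path/to/logfile` replaced by the real log file.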
    133144 
    134145 
     
    137148''' 
    138149 
    139 10- Run HMMVoiceInstaller component of the Install Voice group. The default setting values of this component are already fixed for the  HTS-demo_CMU-ARCTIC-SLT voice. If you are training other voice  please set: [[BR]] 
    140   * HMMVoiceInstaller.FeaList: make sure that this name is the one used during training, it should be the one in: [[BR]] 
    141          data/feature_list_xx.pl  (xx=en for English or xx=de for German) [[BR]] 
    142   * HMMVoiceInstaller.Flab: this is an example of file to synthesise in HTSCONTEXT format. One example can be found in: [[BR]] 
    143           data/labels/gen/ [[BR]] 
    144   * HMMVoiceInstaller.useMixExc: set this variable to true for using mixed excitation [[BR]] 
    145   * HMMVoiceInstaller.useGV: set this variable to true for using global variance in parameter generation.  [[BR]] 
     15010- Run the HMMVoiceInstaller component of the Install Voice group. The default setting values of this component are already fixed for the HTS-demo_CMU-ARCTIC-SLT voice. If you are training another voice, please use the settings editor of this component to set: 
     151{{{ 
     152  HMMVoiceInstaller.FeaList     =  data/feature_list_xx.pl  
     153                                   make sure that this file is the one used during training, xx=en for English or xx=de for German. 
     154  HMMVoiceInstaller.Flab        =  data/labels/gen/xx.lab  
     155                                   this is an example of a label file in HTSCONTEXT format to be synthesised during start-up.  
     156  HMMVoiceInstaller.useMixExc   =  true 
     157                                   set this variable to true if using mixed excitation 
     158  HMMVoiceInstaller.useGV       =  true  
     159                                   set this variable to true if using global variance in parameter generation. 
     160}}} 
    146161 
    147162The VoiceInstaller will:  [[BR]] 
     
    181196- Move your transcription files to this directory. If you have a text directory containing the transcription of each file in a separate file, copy it into the current directory (german_voice/text). If you have transcriptions in Festival format, please copy them into the data/utts directory (german_voice/data/utts/). 
    182197 
    183 - Now run the VoiceImport program and follow the instructions as normal. Provide general settings for gender, locale must be "de", path to mary_base and name of the voice. 
     198- Now run the VoiceImport program and follow the instructions as normal. Provide general settings for: 
     199{{{ 
     200   db.gender    =  male  (or female) 
     201   db.locale    =  de 
     202   db.marybase  =  /path/to/mary/base/ 
     203   db.voicename =  german_voice 
     204}}} 
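
Before starting VoiceImport, the layout of the new voice directory can be checked with a short script. The sketch below assumes the recordings are in a wav subdirectory and the transcriptions are in either text or data/utts, as described above; the function name `check_voice_dir` is hypothetical.

```shell
#!/bin/sh
# Check that a voice directory contains recordings and transcriptions.
# Assumption: recordings live in wav/, transcriptions in text/ or data/utts/.
check_voice_dir() {
    voice_dir=$1
    if [ ! -d "$voice_dir/wav" ]; then
        echo "missing $voice_dir/wav" >&2
        return 1
    fi
    if [ -d "$voice_dir/text" ] || [ -d "$voice_dir/data/utts" ]; then
        return 0
    fi
    echo "need $voice_dir/text or $voice_dir/data/utts" >&2
    return 1
}
```

Calling `check_voice_dir german_voice` returns a non-zero status and prints the missing path if the layout is incomplete.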
    184205 
    185206[[BR]] 
    186207[[BR]] 
    187208 
    188 Marcela Charfuelan 
    189 DFKI 04.09.2008 
    190  
    191  
    192  
     209Marcela Charfuelan[[BR]] 
     210DFKI - Fri May  9 11:54:00 CEST 2008 
     211 
     212 
     213