Changes between Initial Version and Version 1 of HMMVoiceCreation


Timestamp:
04/28/08 16:24:27 (17 years ago)
Author:
marcela_charfuelan
  • HMMVoiceCreation

    v1 v1  
     1 
     2''' 
     3 
     4== Creating HMM voices for MARY with the HMM Voice building tools. == 
     5 
     6''' 
     7 
     8The steps for building an HMM voice for the Mary system can be summarised as follows:[[BR]] 
     9I)   Checking necessary programs and files[[BR]] 
     10II)  Data preparation[[BR]] 
     11III) Training of HMM models[[BR]] 
     12IV) Adding a new HMM voice to the Mary system.[[BR]] 
     13 
     14The following is an explanation of these steps for running the HTS-demo_CMU-ARCTIC-SLT (http://hts.sp.nitech.ac.jp/archives/2.0.1/HTS-demo_CMU-ARCTIC-SLT.tar.bz2) 
     15using the Mary system. 
     16 
     17The training scripts used here are the latest versions, that is, HTS-2.0.1 and SPTK-3.1 are required. Some scripts have been added or modified to:[[BR]] 
     18- Use Mary instead of festival as text analyzer.[[BR]] 
     19- Train bandpass voicing strengths for mixed excitation.[[BR]] 
     20- Process language specific settings as parameters.[[BR]] 
     21 
     22''' 
     23=== I) Checking the necessary programs and files: === 
     24''' 
     25 
     26MARY requirements:[[BR]] 
     27- Operating System - Linux[[BR]] 
     28- MARY TTS Recent Version - Download Link: http://mary.dfki.de/Download [[BR]] 
     29- Openmary - SVN from http://mary.opendfki.de [[BR]] 
     30- Mary patch for HTS demo: HTS-2.0.1-demo_CMU-ARCTIC-SLT_for_Mary-3.5.0.patch [[BR]] 
     31 
     32 
     33HTS requirements, please download and follow the instructions for installing:[[BR]] 
     34- HTK-3.4 [[BR]] 
     35- HTS-2.0.1_for_HTK-3.4.patch [[BR]] 
     36- SPTK-3.1 [[BR]] 
     37- HTS-demo_CMU-ARCTIC-SLT (for HTS-2.0.1) [[BR]] 
     38 
     39Other requirements, please download and follow the instructions for installing: [[BR]] 
     40- EHMM for automatic labeling, available with festvox-2.1 (Recent Version) http://festvox.org/download.html [[BR]] 
     41- sox, normally available on Linux. [[BR]] 
     42- tcl-tk supporting snack, for example ActiveTcl - Download Link: http://www.activestate.com/Products/ActiveTcl/ [[BR]] 
     43- perl, normally available on Linux. [[BR]] 
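
A quick way to see which command-line tools from the lists above are already installed is sketched below. The helper name `check_tools` and the executable names passed to it (`sox`, `perl`, `tclsh`, `java`, `patch`) are assumptions for illustration; the HTK, SPTK and EHMM binaries are not checked here and should be verified following their own installation instructions.

```shell
#!/bin/sh
# Report which of the given commands are not found on the PATH.
# Prints one "missing: <name>" line per absent tool and returns 1 if any is absent.
check_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    return $rc
}

# The executable names below are assumptions; adjust them to your installation.
check_tools sox perl tclsh java patch || echo "Install the missing tools before continuing."
```

If nothing is printed, all the listed commands were found on the PATH.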
     44 
     450.1) Download and unpack (un-zip, un-tar) the latest speaker-dependent training demo for English. 
     46 
     47Here we use: HTS-demo_CMU-ARCTIC-SLT.tar.bz2 for HTS-2.0.1 
     48 
     490.2) Download the patch file for using Mary instead of Festival as the text analyser. 
     50 
     51http://mary.dfki.de/Download/HTS-2.0.1-demo_CMU-ARCTIC-SLT_for_Mary-3.5.0.patch  [[BR]] 
     52 
     53Apply the patch to the HTS-demo_CMU-ARCTIC-SLT directory:  [[BR]]  
     54   patch -p1 -d . < HTS-2.0.1-demo_CMU-ARCTIC-SLT_for_Mary-3.5.0.patch 
     55 
     560.3) Create a wav directory. 
     57 
     580.4) Run the VoiceImport program 
     59 
     60First of all you need to set your MARY_BASE directory:  [[BR]] 
     61   export MARY_BASE="/dir/to/openmary" 
     62 
     63Then you can run:  [[BR]] 
     64   java -Xmx1024m -jar "$MARY_BASE/java/voiceimport.jar" 
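
Before launching, it can help to confirm that MARY_BASE really points at a directory containing the jar. The following is a minimal sketch; the function name `voiceimport_jar` is hypothetical, and the only path it relies on is the `java/voiceimport.jar` location used in the command above.

```shell
# Print the path of voiceimport.jar under the given MARY base directory,
# or fail with a message on stderr if the file is not there.
voiceimport_jar() {
    jar="$1/java/voiceimport.jar"
    if [ -f "$jar" ]; then
        echo "$jar"
    else
        echo "voiceimport.jar not found under $1/java" >&2
        return 1
    fi
}

# Usage (not executed here, because it would start the VoiceImport GUI):
#   jar=$(voiceimport_jar "$MARY_BASE") && java -Xmx1024m -jar "$jar"
```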
     65 
     66If you are not familiar with the VoiceImport program or have problems with it, please read and follow the instructions in the Voice Import Tools 
     67Tutorial: http://mary.opendfki.de/wiki/VoiceImportToolsTutorial 
     68 
     69If you want to create another voice in German or English, please see section V below. 
     70 
     71''' 
     72=== II) Data preparation: === 
     73''' 
     74 
     751- Run HMMVoiceDataPreparation of the HMM Voice Trainer group to check that the text, wav and data/raw files are available and in the correct paths. 
     76If only data/raw is provided, the program will convert the raw files to wav.  If no text files are available but data/utts contains utterances in Festival format, the program will convert them to text files as well. 
     77 
     782- Run the PhoneUnitFeatureComputer component of the Feature Extraction group to extract context feature vectors from the text data. This procedure will create a "phonefeatures" directory.  The MARY server must be running while this component runs. 
     79 
     803- Run the EHMMLabeler component of the Automatic Labeling group to automatically label the wav files using the corresponding transcriptions. 
     81For running EHMMLabeler, please set:  [[BR]] 
     82   * EHMMLabeler.ehmm to the corresponding path in ../festvox/src/ehmm/bin/ 
     83 
     844- Run the LabelPauseDeleter component of the Automatic Labeling group. Please set:  [[BR]] 
     85   * LabelPauseDeleter.threshold = 10. 
     86 
     875- Run the PhoneUnitLabelComputer component of the Labels and Pause Correction group. 
     88 
     896- Run the PhonelabelFeatureAligner component of the Labels and Pause Correction group. This procedure will create a "phonelab" directory. 
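
Since step 1 expects one transcription per speech file, a quick consistency check before running the components can save time. The sketch below assumes that `wav/<name>.wav` pairs with `text/<name>.txt`; this naming convention and the function name `missing_transcriptions` are assumptions, not something the tools above are documented here to require.

```shell
# List the basenames of wav files that have no matching transcription.
# Assumes wav/<name>.wav pairs with text/<name>.txt (naming is an assumption).
missing_transcriptions() {
    voice_dir="$1"
    for wav in "$voice_dir"/wav/*.wav; do
        # Skip the unexpanded pattern when there are no wav files at all.
        [ -e "$wav" ] || continue
        name=$(basename "$wav" .wav)
        if [ ! -f "$voice_dir/text/$name.txt" ]; then
            echo "$name"
        fi
    done
}

# Usage: missing_transcriptions /path/to/voice_dir
```

An empty result means every wav file found has a transcription with the same basename.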
     90 
     91''' 
     92=== III) HMM models training: === 
     93''' 
     94 
     957- Run the HMMVoiceConfigure component of the HMM Voice Trainer group. The default settings of this component are already set for the 
     96HTS-demo_CMU-ARCTIC-SLT voice. 
     97 
     98If training another voice, for example a male German voice, please set:  [[BR]] 
     99  * HMMVoiceConfigure.dataSet     : german_set_name  [[BR]] 
     100  * HMMVoiceConfigure.featureList : feature_list_de.pl (context features used for this voice can be changed in this file). [[BR]] 
     101  * HMMVoiceConfigure.lowerF0     : 40 (for male) [[BR]] 
     102  * HMMVoiceConfigure.speaker     : speaker_name [[BR]] 
     103  * HMMVoiceConfigure.upperF0     : 280 (for male)  [[BR]] 
     104  * HMMVoiceConfigure.voiceLang   : de [[BR]] 
     105 
     106Using the settings of this component you can also change other variables, such as using LSP instead of MGC, the sampling frequency, etc., 
     107the same as you would do when running "make configure" with the original HTS scripts. 
     108 
     1098- Run the HMMVoiceMakeData component of the HMM Voice Trainer group to run the HTS procedure "make data". This procedure is the same as in the original scripts, with additional sections for calculating strengths (for mixed excitation), global variance, and handling of Mary context features. 
     110 
     111Individual procedures can be repeated in isolation by adjusting the settings of this component. For example, if the procedure that creates strengths (str directory) has to be repeated with a different set of filters (data/filters/), please set: [[BR]] 
     112  * HMMVoiceMakeData.makeSTR       1  [[BR]] 
     113  * HMMVoiceMakeData.makeCMPMARY   1  [[BR]] 
     114and all the other variables to 0, then run the component again. (In this case you also need to run makeCMPMARY because the mgc+lf0+str vectors have to be composed again.) 
     115 
     116The procedures can also be repeated manually by going to the data directory and running "make data" or "make str", as is normally done with the original HTS scripts. 
     117 
     118Note: the Makefile in data/ includes a gv: section copied from the HTS-2.1alpha version to calculate global variance files. In Mary, these files are generated in little-endian format and contain a header of one short that indicates the size of the vectors they contain. 
     119 
     1209- Run the HMMVoiceMakeVoice component of the HMM Voice Trainer group. Here again, particular training steps can be repeated by selecting them in the settings of this component (set them to 1 and all the others to 0). This is equivalent to running again:  [[BR]] 
     121   perl scripts/Training.pl scripts/Config.pm  [[BR]] 
     122after modifying the Config.pm file, as is normally done with the original HTS scripts. 
     123  
     124This component will generate general information about the execution of the training steps. Detailed information about the training status can be found in the logfile in the current directory. 
     125 
     126The training procedure can take several hours; check the log file from time to time to follow the progress. 
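
One simple way to follow the training is to print the tail of the log file periodically. The helper below is only a sketch: the name `show_progress` is hypothetical, and the log file path has to be supplied by the caller, since its name is not fixed here.

```shell
# Print the last N lines (default 5) of a training log file.
# The log file path is passed in by the caller; its name is not assumed here.
show_progress() {
    logfile="$1"
    lines="${2:-5}"
    if [ ! -f "$logfile" ]; then
        echo "no log file at $logfile" >&2
        return 1
    fi
    tail -n "$lines" "$logfile"
}

# Usage: show_progress path/to/logfile 20
```

For continuous monitoring, `tail -f` on the same file does the job without a helper.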
     127 
     128 
     129''' 
     130=== IV) Adding a new voice in the MARY platform: === 
     131''' 
     132 
     13310- Run the HMMVoiceInstaller component of the Install Voice group. The default settings of this component are already set for the HTS-demo_CMU-ARCTIC-SLT voice. If you are training another voice, please set: [[BR]] 
     134  * HMMVoiceInstaller.FeaList: make sure that this name is the one used during training, it should be the one in: [[BR]] 
     135         data/feature_list_xx.pl  (xx=en for English or xx=de for German) [[BR]] 
     136  * HMMVoiceInstaller.Flab: an example file to synthesise, in HTSCONTEXT format. One example can be found in: [[BR]] 
     137          data/labels/gen/ [[BR]] 
     138  * HMMVoiceInstaller.useMixExc: set this variable to true for using mixed excitation [[BR]] 
     139  * HMMVoiceInstaller.useGV: set this variable to true for using global variance in parameter generation.  [[BR]] 
     140 
     141The VoiceInstaller will:  [[BR]] 
     142- Create a new mary config file in: $MARY_BASE/conf/german-hmm-voice.config [[BR]] 
     143- Add the files corresponding to this voice in: $MARY_BASE/lib/voices/hmm-voice/  [[BR]] 
     144- copy features list: data/feature_list_en.pl to  $MARY_BASE/lib/voices/hmm-voice  [[BR]] 
     145- copy one example of phonelab for testing the synthesiser: data/labels/gen/gen_cmu_us_arctic_slt_xxxx.lab to $MARY_BASE/lib/voices/hmm-voice  [[BR]] 
     146- copy the HTS trees: voices/qst001/ver1/*.inf to $MARY_BASE/lib/voices/hmm-voice  [[BR]] 
     147- copy the HTS PDF models: voices/qst001/ver1/*.pdf to $MARY_BASE/lib/voices/hmm-voice [[BR]] 
     148- copy global variance models (if useGV is set to true): data/gv/gv-*-littend.pdf to $MARY_BASE/lib/voices/hmm-voice [[BR]] 
     149- copy filter taps for mixed excitation: data/filters/mix_excitation_filters.txt to $MARY_BASE/lib/voices/hmm-voice [[BR]] 
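
After the installer has run, the presence of the copied files can be checked with a small script. This is a sketch under assumptions: the function name `check_voice_dir` is hypothetical, the file patterns are taken from the list above, and the filter file may legitimately be absent if mixed excitation is not used.

```shell
# Print each expected file pattern that has no match in the voice directory.
check_voice_dir() {
    dir="$1"
    for pattern in '*.inf' '*.pdf' 'feature_list_*.pl' 'mix_excitation_filters.txt'; do
        found=no
        # $pattern is left unquoted on purpose so that the shell expands it.
        for f in "$dir"/$pattern; do
            if [ -e "$f" ]; then
                found=yes
            fi
        done
        if [ "$found" = no ]; then
            echo "missing: $pattern"
        fi
    done
}

# Usage: check_voice_dir "$MARY_BASE/lib/voices/hmm-voice"
```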
     150 
     151After successfully installing a new voice, it can be used with the mary_server and the mary_client. 
     152 
     153 
     154''' 
     155=== V) Creating another voice in German or English. === 
     156''' 
     157 
     158If using German: 
     159 
     160To create a new German voice you need: [[BR]] 
     161  * a wav or raw directory with the speech files you will use for training the German voice. [[BR]] 
     162  * transcriptions of the files, one text file per speech file, or transcriptions in festival format if available. [[BR]] 
     163 
     164Then we use as a base the original HTS-demo_CMU-ARCTIC-SLT directory: 
     165 
     166- Download and un-zip, un-tar the HTS-demo_CMU-ARCTIC-SLT for HTS-2.0.1 
     167 
     168- Rename this directory to your new voice name, for example german_voice, and delete the directories data/raw and data/utts. 
     169 
     170- Apply the Mary patch to the german_voice directory. [[BR]] 
     171  patch -p1 -d . < HTS-2.0.1-demo_CMU-ARCTIC-SLT_for_Mary-3.5.0.patch 
     172 
     173- Move your speech files to this directory. If you have a wav directory, copy it into the current directory (german_voice/wav). If you have a raw directory, copy it into the data directory (german_voice/data/raw). 
     174 
     175- Move your transcription files to this directory. If you have a text directory containing the transcription of each file in a separate file, copy it into the current directory (german_voice/text). If you have transcriptions in Festival format, copy them into the data/utts directory (german_voice/data/utts/). 
     176 
     177- Now run the VoiceImport program and follow the instructions as normal. Provide the general settings for gender, the path to mary_base and the name of the voice; the locale must be "de". 
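
The directory preparation described above can be sketched as a shell function for the case where wav and text directories are available. The function name `prepare_voice_dir` and its argument order are hypothetical. Note that it copies the demo directory instead of renaming it, so the original demo stays intact, and it assumes the demo's utterance directory is data/utts. Applying the Mary patch is left as a manual step.

```shell
# Create a new voice directory from an unpacked demo directory.
# Arguments: demo directory, new voice directory, source wav dir, source text dir.
prepare_voice_dir() {
    demo="$1"; voice="$2"; wav_src="$3"; text_src="$4"
    if [ -e "$voice" ]; then
        echo "$voice already exists" >&2
        return 1
    fi
    # Copy rather than rename, so the demo directory is left untouched.
    cp -R "$demo" "$voice" || return 1
    # Remove the demo's own speech data and utterances.
    rm -rf "$voice/data/raw" "$voice/data/utts"
    # Put the new speech files and transcriptions in place.
    cp -R "$wav_src" "$voice/wav" || return 1
    cp -R "$text_src" "$voice/text" || return 1
}

# Usage: prepare_voice_dir HTS-demo_CMU-ARCTIC-SLT german_voice my_wav my_text
# Then apply the Mary patch inside german_voice as shown above.
```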
     178 
     179 
     180Marcela Charfuelan 
     181DFKI 04.09.2008 
     182 
     183 
     184