Changes between Version 22 and Version 23 of VoiceImportToolsTutorial
Timestamp: 11/17/09 15:48:43
VoiceImportToolsTutorial
= Voice Import Tools Tutorial : How to build a new Voice with Voice Import Tools =

This tutorial explains the procedure for building a new voice with the Voice Import Tools (VIT) in the MARY environment.

…

</div>
}}}
== Requirements Needed: ==
 * Operating System - Linux (Recommended)
 * MARY TTS Recent Version - Download Link: http://mary.dfki.de/Download

…

(Windows can also be used, provided the following dependent tools can be compiled properly on it.)

== Dependent Tools: ==
 * Praat Pitch Marker or Snack - For pitch marks

Download link for Praat: http://www.fon.hum.uva.nl/praat

Installing Snack: requires Tcl and Snack. Installation instructions are available at http://www.speech.kth.se/snack/

 * Edinburgh Speech Tools Library – For MFCCs and Wagon (CART)

Download link for Speech Tools: http://www.cstr.ed.ac.uk/projects/speech_tools/

 * EHMM or Sphinx – For Automatic Labeling

EHMM is available with festvox-2.1 (Recent Version) - http://festvox.org/download.html

…

Sphinx - http://cmusphinx.sourceforge.net/webpage/html/download.php

== Voice Import Components: ==
The following components are available in the Voice Import Tools:

 * !PraatPitchmarker
 * !SnackPitchmarker
 * MCEPMaker
 * Festvox2MaryTranscripts
 * Mary2FestvoxTranscripts
 * !PhoneUnitFeatureComputer
 * !HalfPhoneUnitFeatureComputer
 * EHMMLabeler
 * !LabelledFilesInspector
 * !PhoneUnitLabelComputer
 * !PhoneLabelFeatureAligner
 * !HalfPhoneUnitLabelComputer
 * !HalfPhoneLabelFeatureAligner
 * !QualityControl
 * !HalfPhoneUnitfileWriter
 * !HalfPhoneFeatureFileWriter
 * !JoinCostFileMaker
 * !AcousticFeatureFileWriter
 * CARTBuilder
 * CARTPruner
 * !VoiceInstaller

== Step-by-Step Procedure: ==
1. First, you need the following two basic requirements for voice building:

 a. Wave files
 b. Corresponding transcriptions (in MARY or Festvox format)

MARY Format : Each transcription is stored in its own file, and all these files are placed in a single directory. By default, this is the 'text' directory of the voice building directory.

Festvox (Festival) Format : A single file contains all transcriptions, as in the example below.

{{{
…

}}}

For example, one way of getting data to test this is to use the ARCTIC data from CMU. Download and unpack http://www.speech.cs.cmu.edu/cmu_arctic/packed/cmu_us_slt_arctic-0.95-release.tar.bz2, and copy the following two items to a new empty directory:

 * wav/ folder including all wav files;
 * etc/txt.done.data.

2. Create a new voice building directory

 * Put all wave files in the "wav" directory

3. Run the commands below through a shell script from the voice building directory.

{{{
…

}}}
When you run the above shell script for the first time, it presents a GUI window in which you enter a few basic configuration settings. Almost all other settings are derived from these first settings and are set automatically.

The Global Configuration Settings window looks like this:

{{{
…
</p>
}}}
'''Global Configuration Settings:'''

Domain - general or limited[[BR]]
Gender - male or female[[BR]]
Locale - specifies the language of the domain (de - Deutsch or en - English)[[BR]]
(Currently, MARY supports only two languages: 1. Deutsch 2. English)[[BR]]
Marybase - MARY installation directory (global path)[[BR]]
Rootdir - voice building directory (global path)[[BR]]
Wavdir - where the wave files are stored[[BR]]
Textdir - where the corresponding transcriptions are stored[[BR]]

After clicking the "'''Save'''" button, you get to the main window of Voice Import Tools, as shown in the screenshot. There you can see a list of modules. A component is executed by ticking the associated checkbox and clicking on "Run".

4. You can also change the settings of each individual component by clicking on the '''wrench symbol''' next to the component. Clicking on "Settings" takes you to the window where you can change the basic settings. In a settings window, you can switch to the settings of another module or to the basic settings via the drop-down menu. Basically, all modules need to be run to import the voice into MARY. For more detailed information, check the general help file by clicking on "Help" in the main window. Clicking on help in the settings window opens a help window with details about the displayed settings. We recommend giving absolute paths for the individual configuration settings. These settings are passed as arguments to the components, which use them to perform their tasks.

The import tool creates two files in the directory where you started it: database.config and importMain.config. database.config contains the values of the settings. You can also change the settings in this file, but be aware that this may cause problems.

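Because absolute paths are recommended for the configuration settings, it can help to work them out once before typing them into the settings window. The following is only a minimal Python sketch: the helper name `global_settings_paths` is hypothetical and not part of the Voice Import Tools, and the "wav" and "text" subdirectory names are assumed to be the defaults mentioned in this tutorial.

```python
from pathlib import Path


def global_settings_paths(rootdir, marybase):
    """Return absolute paths for the path-valued global settings.

    Hypothetical helper: "wav" and "text" are assumed to be the default
    subdirectories of the voice building directory.
    """
    root = Path(rootdir).resolve()
    return {
        "Marybase": str(Path(marybase).resolve()),
        "Rootdir": str(root),
        "Wavdir": str(root / "wav"),
        "Textdir": str(root / "text"),
    }


if __name__ == "__main__":
    # "." is the voice building directory; "../mary" is a placeholder
    # for wherever MARY is installed on your machine.
    for name, path in global_settings_paths(".", "../mary").items():
        print(f"{name}: {path}")
```

The printed values can then be copied into the corresponding fields of the Global Configuration Settings window.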
5. The simplest way of using the Voice Import Components:

 * Give the configuration settings for each and every component.
 * Tick all components.
 * Click the RUN button.[[BR]]

It will then complete all tasks sequentially.[[BR]]

6. However, you need to make a few decisions before doing Step 5.

Not all components are needed for building a new voice.[[BR]] For example, for pitch marks we can choose either Praat or Snack.[[BR]]

 * Choose Praat or Snack (only one) for pitch mark extraction.[[BR]]

 * If your transcriptions are in Festvox format, you must select the "''Festvox2MaryTranscripts''" component, because it converts Festvox format transcriptions to MARY format transcriptions. Voice Import Tools uses MARY format transcriptions for building a voice. There is no need to select the "''Mary2FestvoxTranscripts''" component while building a new voice; it is provided only so that users can convert between formats as required.[[BR]]

 * ''!PhoneUnitFeatureComputer'' and ''!HalfPhoneUnitFeatureComputer'' need a running MARY server. This is a very important point: make sure a MARY server is running while these two components execute. In addition, the MARY server must contain at least one voice of the language (German or English) for which you want to build the new voice.[[BR]] '''**''' Before starting the MARY server, please make sure that "english-targetfeatures.config" and "english-halfphone-targetfeatures.config" are present in the "$MARY_BASE/conf/" directory for building an English voice. Similarly, "german-targetfeatures.config" and "german-halfphone-targetfeatures.config" are required for building a German voice.[[BR]]

 * ''!LabelledFilesInspector'' provides a GUI for checking the quality of the automatic labeling. It also lets you listen to phone segments according to the timestamps given by the automatic labeling. If you do not want to inspect the labeling, do not select this component, because it pauses the voice building process.

7. While each component is executing, a progress bar shows the percentage of work completed for that component. A component turns GREEN if it executed successfully. If it failed, it turns RED and throws an exception; in that case, check the configuration settings once again.

We hope this tutorial helps you to build a new '''unit selection voice''' using the Voice Import Tools under the MARY platform. The individual Voice Import Components are explained [wiki:VoiceImportComponents here].[[BR]]

 * [wiki:VoiceImportComponents Explanation on Individual Voice Import Components]

…

 * [wiki:HMMVoiceCreationMary4.0 Explanation on how to create HMM-based voices for MARY]

- Sathish Chandra Pammi (Sathish.Chandra@dfki.de)
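As a companion to steps 1 and 6, the relationship between the two transcription formats can be illustrated outside the tool. The sketch below is not the Festvox2MaryTranscripts component itself. It assumes the usual Festvox layout of one `( name "sentence" )` entry per line (as in txt.done.data) and assumes that MARY-format transcriptions are files named `<name>.txt`. All function names are hypothetical. It also reports wave files and transcriptions that have no counterpart, which is worth checking before running any components.

```python
import re
from pathlib import Path

# Assumed Festvox entry layout, one per line: ( utt_name "Sentence text." )
ENTRY = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_festvox(lines):
    """Map utterance name -> sentence text for Festvox-style entries."""
    entries = {}
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            name, text = match.groups()
            # Undo backslash-escaped double quotes inside the sentence.
            entries[name] = text.replace('\\"', '"')
    return entries


def write_mary_transcripts(entries, textdir):
    """Write one <name>.txt file per utterance into textdir."""
    out = Path(textdir)
    out.mkdir(parents=True, exist_ok=True)
    for name, text in entries.items():
        (out / f"{name}.txt").write_text(text + "\n", encoding="utf-8")


def missing_pairs(entries, wavdir):
    """Return (wavs without transcription, transcriptions without wav)."""
    wavs = {p.stem for p in Path(wavdir).glob("*.wav")}
    names = set(entries)
    return sorted(wavs - names), sorted(names - wavs)
```

A typical use would be to call `parse_festvox` on the lines of etc/txt.done.data, pass the result to `write_mary_transcripts` with the "text" directory, and confirm that `missing_pairs` returns two empty lists for the "wav" directory.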