Changes between Version 22 and Version 23 of VoiceImportToolsTutorial


Timestamp:
11/17/09 15:48:43 (15 years ago)
Author:
masc01
Comment:

--

  • VoiceImportToolsTutorial

    v22 v23  
    1  
    2  
    3  
    41= Voice Import Tools Tutorial : How to build a new Voice with Voice Import Tools = 
    5  
    62This tutorial explains how to build a new voice with the Voice Import Tools (VIT) in the MARY environment. 
    73 
     
    5652</div> 
    5753}}} 
    58  
    59  
    6054== Requirements Needed: == 
    61  
    62  
    6355 * Operating System - Linux (Recommended) 
    6456 * MARY TTS Recent Version - Download Link: http://mary.dfki.de/Download 
     
    6759(Windows can also be used, provided the dependent tools listed below can be compiled properly there.) 
    6860 
     61== Dependent Tools: == 
     62 * Praat Pitch Marker or Snack  - For pitch marks 
    6963 
    70 == Dependendent Tools:  == 
    71  
    72  
    73  - Praat Pitch Marker or Snack  - For pitch marks 
    74   
    7564  Download Link for praat : http://www.fon.hum.uva.nl/praat 
    7665 
    7766  Installing Snack : Requires Tcl and Snack. Installation instructions are available at http://www.speech.kth.se/snack/ 
    78          
    79  - Edinburgh Speech Tools Library – For MFCCs and Wagon (CART) 
    80   
     67 
     68 * Edinburgh Speech Tools Library – For MFCCs and Wagon (CART) 
     69 
    8170  Download Link for Speech Tools: http://www.cstr.ed.ac.uk/projects/speech_tools/ 
    8271 
    83  - EHMM or Sphinx – For Automatic Labeling 
     72 * EHMM or Sphinx – For Automatic Labeling 
    8473 
    8574  EHMM is available with festvox-2.1 (Recent Version) - http://festvox.org/download.html 
     
    8776  Sphinx -  http://cmusphinx.sourceforge.net/webpage/html/download.php 
    8877 
    89  
    9078== Voice Import Components: == 
    91  
    92  
    9379The following components are available as Voice Import Components: 
    9480 
    95  - !PraatPitchmarker 
    96  - !SnackPitchmarker 
    97  - MCEPMaker 
    98  - Festvox2MaryTranscripts 
    99  - Mary2FestvoxTranscripts 
    100  - !PhoneUnitFeatureComputer 
    101  - !HalfPhoneUnitFeatureComputer 
    102  - EHMMLabeler 
    103  - !LabelledFilesInspector 
    104  - !PhoneUnitLabelComputer  
    105  - !PhoneLabelFeatureAligner  
    106  - !HalfPhoneUnitLabelComputer  
    107  - !HalfPhoneLabelFeatureAligner  
    108  - !QualityControl 
    109  - !HalfPhoneUnitfileWriter 
    110  - !HalfPhoneFeatureFileWriter 
    111  - !JoinCostFileMaker  
    112  - !AcousticFeatureFileWriter 
    113  - CARTBuilder 
    114  - CARTPruner 
    115  - !VoiceInstaller  
    116  
    117  
     81 * !PraatPitchmarker 
     82 * !SnackPitchmarker 
     83 * MCEPMaker 
     84 * Festvox2MaryTranscripts 
     85 * Mary2FestvoxTranscripts 
     86 * !PhoneUnitFeatureComputer 
     87 * !HalfPhoneUnitFeatureComputer 
     88 * EHMMLabeler 
     89 * !LabelledFilesInspector 
     90 * !PhoneUnitLabelComputer 
     91 * !PhoneLabelFeatureAligner 
     92 * !HalfPhoneUnitLabelComputer 
     93 * !HalfPhoneLabelFeatureAligner 
     94 * !QualityControl 
     95 * !HalfPhoneUnitfileWriter 
     96 * !HalfPhoneFeatureFileWriter 
     97 * !JoinCostFileMaker 
     98 * !AcousticFeatureFileWriter 
     99 * CARTBuilder 
     100 * CARTPruner 
     101 * !VoiceInstaller 
    118102 
    119103== Step-by-Step Procedure: == 
    120  
    1211041. First, you need the following two basic requirements for voice building: 
    122105 
    123                   a. Wave files  
    124                   b. Corresponding Transcription (in MARY or Festvox Format)  
     106 a. Wave files 
     107 a. Corresponding Transcription (in MARY or Festvox Format) 
    125108 
    126109  MARY Format : Each transcription is stored in its own file, and all these files are placed in a single directory. By default, this is the 'text' directory of the voice-building directory. 
    127110 
    128   Festvox (Festival) Format : A single file contains all transcriptions. For examples see below example.  
     111  Festvox (Festival) Format : A single file contains all transcriptions. See the example below. 
    129112 
    130113{{{ 
     
    137120 
    138121}}} 
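
The difference between the two transcription formats can be illustrated with a minimal Python sketch that splits a Festvox-style file into one text file per utterance. This is not the code of the Festvox2MaryTranscripts component. The function names and the `.txt` extension of the output files are assumptions made for illustration, and the sketch expects one `( utterance_id "text" )` entry per line.

```python
import re
from pathlib import Path

# One Festvox entry per line: ( utterance_id "Transcription text." )
ENTRY = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_festvox_line(line):
    """Return (utterance_id, text) for one entry, or None if the line does not match."""
    match = ENTRY.match(line.strip())
    if match is None:
        return None
    utt_id, text = match.groups()
    # Festvox escapes double quotes inside the text with a backslash.
    return utt_id, text.replace('\\"', '"')


def split_transcriptions(festvox_file, text_dir):
    """Write one <utterance_id>.txt file per entry (hypothetical helper).

    The .txt extension is an assumption; check what your MARY version expects.
    Returns the list of file names written, in input order.
    """
    text_dir = Path(text_dir)
    text_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with open(festvox_file, encoding="utf-8") as handle:
        for line in handle:
            parsed = parse_festvox_line(line)
            if parsed is None:
                continue  # skip blank or malformed lines
            utt_id, text = parsed
            out_file = text_dir / f"{utt_id}.txt"
            out_file.write_text(text + "\n", encoding="utf-8")
            written.append(out_file.name)
    return written
```

Calling `split_transcriptions("txt.done.data", "text")` would fill the 'text' directory with one file per utterance. For an actual voice build, the Festvox2MaryTranscripts component should be used for this conversion.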
    139      
    140 2. Create a new Voice Building Directory  
    141    
    142    - Put all Wave files in "wav" directory 
    143     
    144 3. Run below commands through Shell script from Voice Building Directory. 
    145122 
     123 
     124For example, one way of getting data to test this is to use the ARCTIC data from CMU. Download and unpack http://www.speech.cs.cmu.edu/cmu_arctic/packed/cmu_us_slt_arctic-0.95-release.tar.bz2, and copy the following two items to a new empty directory: 
     125 
     126 * wav/ folder including all wav files; 
     127 * etc/txt.done.data. 
     128 
     129 2. Create a new Voice Building Directory  
     130 
     131 * Put all wave files in the "wav" directory 
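
Preparing the voice building directory from an unpacked ARCTIC release can be sketched as follows. This is only an illustration: the function name `prepare_voice_dir` is hypothetical, and keeping the transcription file under `etc/` in the new directory is an assumption. Place it wherever your transcription settings expect it.

```python
import shutil
from pathlib import Path


def prepare_voice_dir(arctic_dir, voice_dir):
    """Copy wav/*.wav and etc/txt.done.data into a new voice building directory.

    Hypothetical helper; returns the number of wave files copied.
    """
    arctic_dir, voice_dir = Path(arctic_dir), Path(voice_dir)

    # All wave files go into the "wav" subdirectory of the voice building directory.
    wav_target = voice_dir / "wav"
    wav_target.mkdir(parents=True, exist_ok=True)
    copied = 0
    for wav_file in sorted((arctic_dir / "wav").glob("*.wav")):
        shutil.copy2(wav_file, wav_target / wav_file.name)
        copied += 1

    # Assumption: keep the Festvox transcription file at the same relative path.
    etc_target = voice_dir / "etc"
    etc_target.mkdir(exist_ok=True)
    shutil.copy2(arctic_dir / "etc" / "txt.done.data", etc_target / "txt.done.data")
    return copied
```

For example, `prepare_voice_dir("cmu_us_slt_arctic", "my_voice")` would be called with the unpacked release directory as the first argument; both directory names here are placeholders.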
     132 
     133 3. Run the commands below through a shell script from the Voice Building Directory. 
    146134 
    147135{{{ 
     
    151139 
    152140}}} 
     141When you run the above shell script for the first time, it asks for some basic configuration settings in a GUI window, where you have to enter a few basic settings. Almost all other settings are derived from these first settings and set automatically. 
    153142 
    154  
    155  
    156 When you are running first time above shell script, It asks you some basic configuration settings by presenting with a GUI window where you  
    157 have to enter a few basic settings. Almost all other settings are based on these first settings and set automatically. 
    158  
    159  
    160  
    161  
    162 Global Configuration Settings window looks like below:  
    163  
    164  
     143The Global Configuration Settings window looks like this: 
    165144 
    166145{{{ 
     
    170149</p> 
    171150}}} 
    172  
    173151'''Global Configuration Settings:''' 
    174152 
     153  Domain   - general or limited[[BR]] Gender   - male or female[[BR]] Locale   - specifies the language of the domain (de - Deutsch or en - English) [[BR]] (Currently, MARY supports only two languages: 1. Deutsch 2. English)[[BR]] Marybase - MARY Installation Directory (Global Path)[[BR]] Rootdir  - Voice Building Directory (Global Path)[[BR]] Wavdir   - Directory where the wave files are stored [[BR]] Textdir  - Directory where the corresponding transcriptions are stored [[BR]] 
    175154 
    176  Domain   - general or  limited[[BR]] 
    177  Gender   - male or female[[BR]] 
    178  Locale   - which specifies language of domain (de - Deutsch or en - English) [[BR]] 
    179  (Currently,  MARY supporting 2 language only: 1. Deutsch 2. English)[[BR]] 
    180  Marybase - MARY Installation Directory (Global Path)[[BR]] 
    181  Rootdir  - Voice Building Directory (Global Path)[[BR]] 
    182  Wavdir   - Where we can store Wave files [[BR]] 
    183  Textdir  - Where we can store corresponding Transcriptions [[BR]] 
     155After clicking the "'''Save'''" button, you get to the main window of Voice Import Tools, as shown in the screenshot. There you can see a list of modules. A component is executed by ticking the associated checkbox and clicking on "Run". 
    184156 
     1574. You can also change the settings of each individual component by clicking on the '''wrench symbol''' next to the component.  Clicking on "Settings" takes you to the window where you can change the basic settings.  In a settings window, you can switch the view to the settings of another module or to the basic settings via the drop-down menu. Basically, all modules need to be run to import the voice into MARY. For more detailed information, check the general help file by clicking on "Help" in the main window. Clicking on "Help" in the settings window opens a help window with details about the displayed settings. We recommend giving absolute paths for the individual configuration settings. These settings are passed as arguments to the components so that they can perform their tasks. 
    185158 
    186 After clicking the "'''Save'''"-button, you will get to the main window of Voice Import Tools as shown in Screen shot. There you can see a list of modules. A component is executed by ticking the associated checkbox and clicking on "Run".  
     159The import tool creates two files in the directory where you started it - database.config and importMain.config. database.config contains the values of the settings - you can also change the settings in this file, but be aware that this may cause problems. 
    187160 
    188 4. User also can change the settings for each individual component by clicking on the '''wrench symbol''' next to the component.  
    189 Clicking on "Settings" takes you to the window where you can change the basic settings.  
    190 In a settings window, you can change the view to the settings of another module or the basic settings via the  
    191 drop-down menu. Basically, all modules need to be run to import the voice into MARY. For more detailed information, check the 
    192 general help file - just click on "Help" in the main window. Clicking on help in the settings window opens a help 
    193 window with details about the displayed settings. We recommended to give Absolute Paths for individual Configuration Settings. These config. settings are arguments to components to perform corresponding task.  
     161 5. The simplest way of using the Voice Import Components:  
    194162 
    195 The import tool creates two files in the directory where you started it - database.config and importMain.config. 
    196 database.config contains the values of the settings - you can change the settings also in this file, but be aware that  
    197 this may cause problems.  
     163 * Give the configuration settings for each component. 
     164 * Tick all components 
     165 * Click the RUN button  [[BR]] 
    198166 
    199   
    200 5. Simplest way of Using Voice Import Components:  
    201    
    202   - Give Config. Settings for Each and Every Component.  
    203   - Tick mark all components 
    204   - Click RUN button  [[BR]] 
    205  
    206            
    207 It will complete all tasks in sequential manner. [[BR]] 
     167 It will then complete all tasks sequentially. [[BR]] 
    208168 
    2091696. However, the user needs to make a few decisions before doing Step 5. 
    210170 
    211  Because there is no need to use all components for Building a New Voice.[[BR]] 
    212  For Example: For Pitch marks we can choose Praat or Snack. [[BR]] 
     171  This is because not all components are needed for building a new voice.[[BR]] For example, for pitch marks we can choose either Praat or Snack. [[BR]] 
    213172 
    214  - Choose Praat or Snack (only one) for Pitch marks Extraction.[[BR]] 
     173 * Choose Praat or Snack (only one) for pitch mark extraction.[[BR]] 
    215174 
    216  - If your transcriptions are in Festvox Format, It is necessary to choose "''Festvox2MaryTranscripts''" Component. Because It will convert Festvox format transcriptions to MARY format transcriptions. Voice Import Tools uses MARY format transcription for building Voice. No need to choose "''Mary2FestvoxTranscripts''" component while Building a new Voice. Just we are providing that component for facilitating user to convert any format depending on requirements.[[BR]] 
     175 * If your transcriptions are in Festvox format, you need to choose the "''Festvox2MaryTranscripts''" component, because it converts Festvox format transcriptions to MARY format transcriptions. Voice Import Tools uses MARY format transcriptions for building a voice. There is no need to choose the "''Mary2FestvoxTranscripts''" component while building a new voice; it is provided only so that users can convert between formats as required.[[BR]] 
    217176 
    218  - ''!PhoneUnitFeatureComputer'' and ''!HalfPhoneUnitFeatureComputer'' needs a running MARY Server. It is very important point. User need to make sure a Mary Server running while executing above two Components. And one more important issue is MARY Server need to contain at least one Voice of language (German or English), which user wanted build a new voice.[[BR]] 
    219    '''**''' Before running Mary Server, please make sure "english-targetfeatures.config" and "english-halfphone-targetfeatures.config" in "$MARY_BASE/conf/" directory for building an English voice. Similarly, "german-targetfeatures.config" and "german-halfphone-targetfeatures.config" required for German voice building.[[BR]] 
     177 * ''!PhoneUnitFeatureComputer'' and ''!HalfPhoneUnitFeatureComputer'' need a running MARY Server. This is a very important point: the user needs to make sure that a MARY Server is running while executing these two components. Another important point is that the MARY Server needs to contain at least one voice of the language (German or English) for which the user wants to build a new voice.[[BR]] '''**''' Before running the MARY Server, please make sure that "english-targetfeatures.config" and "english-halfphone-targetfeatures.config" are present in the "$MARY_BASE/conf/" directory for building an English voice. Similarly, "german-targetfeatures.config" and "german-halfphone-targetfeatures.config" are required for German voice building.[[BR]] 
    220178 
    221  - ''!LabelledFilesInspector'' gives a GUI interface to check how good Automatic labeling. It will also support user to listen phone segments according to given timestamps from Automatic labeling. If user don't want to inspect labeling, better no need to choose this component. Because it will pause Voice building in between. 
    222      
    223 7. While executing each component, a Progress bar shows the percentage of work completed for that component. Each Component converted to GREEN, if that component is executed successfully. And it converts to RED and it throws an exception, if that component unsuccessfully executed. If a component unsuccessfully executed, check configuration settings once again.       
     179 * ''!LabelledFilesInspector'' provides a GUI to check the quality of the automatic labeling. It also lets the user listen to phone segments according to the timestamps given by the automatic labeling. If the user does not want to inspect the labeling, it is better not to choose this component, because it pauses voice building partway through. 
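
The two preconditions for the feature computer components (a reachable MARY Server and the target feature config files) can be checked before ticking those components. The following Python sketch is one possible way to do this. The function names are hypothetical, and the host and port defaults (`localhost`, `59125`) are assumptions about a default local server setup, so adjust them to your installation. The check only confirms that something accepts connections on that port, not that a voice of the right language is installed.

```python
import socket
from pathlib import Path


def server_is_reachable(host="localhost", port=59125, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds.

    The default port is an assumption about the local MARY Server setup.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def missing_targetfeature_configs(mary_base, language):
    """Return the target feature config files missing from <mary_base>/conf.

    `language` is "english" or "german", matching the file names in the tutorial.
    """
    conf_dir = Path(mary_base) / "conf"
    required = [
        f"{language}-targetfeatures.config",
        f"{language}-halfphone-targetfeatures.config",
    ]
    return [name for name in required if not (conf_dir / name).is_file()]
```

A possible use is to call `missing_targetfeature_configs(mary_base, "english")` with your MARY installation directory and to continue only when the list is empty and `server_is_reachable()` returns `True`.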
    224180 
    225     
    226 We hope this tutorial helps to build a new '''unit selection voice''' using the Voice Import Tools under the MARY platform. The Individual Voice Import Components are explained [wiki:VoiceImportComponents here].  
    227 [[BR]] 
     181 7. While each component is executing, a progress bar shows the percentage of work completed for that component. A component turns GREEN if it was executed successfully. It turns RED and throws an exception if it was not executed successfully. If a component fails, check its configuration settings once again. 
     182 
     183 We hope this tutorial helps to build a new '''unit selection voice''' using the Voice Import Tools under the MARY platform. The Individual Voice Import Components are explained [wiki:VoiceImportComponents here].  [[BR]] 
    228184 
    229185* [wiki:VoiceImportComponents Explanation on Individual Voice Import Components] 
     
    233189* [wiki:HMMVoiceCreationMary4.0 Explanation on how to create HMM-based voices for MARY] 
    234190 
    235 [[BR]] 
    236 [[BR]] 
    237 [[BR]] 
    238 [[BR]] 
    239  
    240  
    241  
    242191-  Sathish Chandra Pammi (Sathish.Chandra@dfki.de) 
    243  
    244  
    245  
    246  
    247  
    248  
    249  
    250  
    251  
    252  
    253  
    254  
    255