Changes between Version 25 and Version 26 of VoiceImportToolsTutorial


Timestamp:
11/17/09 17:05:21 (15 years ago)
Author:
masc01
Comment:

--

  • VoiceImportToolsTutorial

    v25 v26  
    55[[Image(VIC1.jpg)]] 
    66 
    7 '''Voice Import Tools GUI Screenshot''' 
    87 
    98Voice Import Tool is a Graphical User Interface (GUI) that contains a 
    109set of Voice Import Components and helps the user to build new voices 
    1110under the MARY (Modular Architecture for Research in speech sYnthesis) 
    12 Environment. This GUI Tool designing is primarily aims to build new 
    13 voices very easily by any user with out knowing much technical details 
    14 of Speech Synthesis. 
     11Environment. This GUI tool aims to simplify the task of building new 
     12synthesis voices so that users who do not have detailed technical knowledge 
     13of speech synthesis can build their own voices. 
    1514 
    16 Currently, Voice Import Tool supports following  categories mainly: 
     15In a nutshell, the Voice Import Tools cover the following steps in voice building: 
    1716 
    1817 1. Feature Extraction from Acoustic Data 
    19  
    2018 2. Feature Vector Extraction from Text Data 
    21  
    2219 3. Automatic Labeling 
    23  
    24  4. Unit Selection 
    25  
     20 4. Unit Selection voice building 
     21 4. HMM-based voice building 
    2622 5. Voice Installation to MARY 
    2723 
     
    133129 
    134130 
    135 '''Global Configuration Settings:''' 
     131'''Global Configuration Settings''' 
    136132 
    137133  Domain   - general or limited[[BR]] Gender   - male or female[[BR]] Locale   - the language of the domain (de - Deutsch or en - English) [[BR]] (Currently, MARY supports only two languages: Deutsch and English)[[BR]] Marybase - MARY installation directory (global path)[[BR]] Rootdir  - voice building directory (global path)[[BR]] Wavdir   - directory where the wave files are stored [[BR]] Textdir  - directory where the corresponding transcriptions are stored [[BR]] 
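
The constraints listed above can be checked before saving. The following is only a minimal sketch of such a check, not part of the Voice Import Tools: the dictionary keys (`domain`, `gender`, `locale`, `marybase`, `rootdir`) are hypothetical names chosen for illustration, and the allowed values are the ones listed in the settings above.

```python
import os

# Allowed values as listed in the global settings above.
# The key names are hypothetical; the real tool may name them differently.
ALLOWED = {
    "domain": {"general", "limited"},
    "gender": {"male", "female"},
    "locale": {"de", "en"},
}

# Settings described above as global paths, so they should be absolute.
PATH_KEYS = ("marybase", "rootdir")


def check_global_settings(settings):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    for key, allowed in ALLOWED.items():
        value = settings.get(key)
        if value not in allowed:
            problems.append(f"{key}: expected one of {sorted(allowed)}, got {value!r}")
    for key in PATH_KEYS:
        value = settings.get(key, "")
        if not os.path.isabs(value):
            problems.append(f"{key}: expected an absolute path, got {value!r}")
    return problems
```

Calling `check_global_settings` with a dictionary of the values you intend to enter returns one message per violated constraint.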
     
    139135After clicking the "'''Save'''" button, you will get to the main window of Voice Import Tools as shown in the screenshot. There you can see a list of modules. A component is executed by ticking the associated checkbox and clicking on "Run". 
    140136 
    141 4. User also can change the settings for each individual component by clicking on the '''wrench symbol''' next to the component.  Clicking on "Settings" takes you to the window where you can change the basic settings.  In a settings window, you can change the view to the settings of another module or the basic settings via the  drop-down menu. Basically, all modules need to be run to import the voice into MARY. For more detailed information, check the general help file - just click on "Help" in the main window. Clicking on help in the settings window opens a help window with details about the displayed settings. We recommended to give Absolute Paths for individual Configuration Settings. These config. settings are arguments to components to perform corresponding task. 
     137'''Component Configuration Settings''' 
     138 
     139You can verify and change the settings for each individual component by clicking on the '''wrench symbol''' next to the component.  Clicking on "Settings" takes you to the window where you can change the basic settings.  In a settings window, you can switch the view to the settings of another module or to the basic settings via the drop-down menu. Basically, all modules need to be run to import the voice into MARY. For more detailed information, check the general help file - just click on "Help" in the main window. Clicking on "Help" in the settings window opens a help window with details about the displayed settings. We recommend giving absolute paths for individual configuration settings. These config settings are passed as arguments to the components to perform the corresponding task. 
    142140 
    143141The import tool creates two files in the directory where you started it - database.config and importMain.config. database.config contains the values of the settings - you can also change the settings in this file, but be aware that this may cause problems. 
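
If you only want to look at the stored values rather than edit them, a small read-only script is less risky than changing the file by hand. The sketch below assumes that database.config is a plain-text file with one setting per line, written as `key value` or `key = value`, with `#` starting a comment; this layout is an assumption and should be checked against your own file. The key names in the usage example are hypothetical.

```python
import re


def parse_settings(text):
    """Parse 'key value' or 'key = value' lines into a dict (assumed layout)."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        # Split once, on '=' (with optional spaces) or on whitespace.
        parts = re.split(r"\s*=\s*|\s+", line, maxsplit=1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        settings[key] = value
    return settings
```

For example, `parse_settings(open("database.config", encoding="utf-8").read())` returns a dictionary that can be printed or compared against the values shown in the settings window.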
    144142 
    145  5. Simplest way of Using Voice Import Components:  
     143'''How to run the Voice Import Components''' 
     144 
     145In an ideal world the process of building a voice would look like this: 
    146146 
    147147 * Provide config settings for each component. 
     
    151151 It will complete all tasks in a sequential manner. [[BR]] 
    152152 
    153 6. But user need to make few decisions before doing Step 5. 
     153In the real world, however, the user needs to make a few decisions here and there, so the real-world process is usually a bit more complex than that. 
     154For example, pitch marking can be done with either Praat or Snack. When using a pitch marker, you may want to verify that the frequency range is appropriate for your recordings, and adapt the component's config settings before running it again. 
    154155 
    155   Because there is no need to use all components for Building a New Voice.[[BR]] For Example: For Pitch marks we can choose Praat or Snack. [[BR]] 
     156 * If your transcriptions are in Festvox format, it is necessary to choose "''Festvox2MaryTranscripts''" Component. This will convert the transcriptions in Festvox format (`txt.done.data`) to MARY format (`text/*.txt`). Voice Import Tools uses MARY format transcription for building a voice. If you have recorded your voice using Redstart, there is no need to run the "''Mary2FestvoxTranscripts''" component. 
    156157 
    157  * Choose Praat or Snack (only one) for Pitch marks Extraction.[[BR]] 
     158 * ''!PhoneUnitFeatureComputer'' and ''!HalfPhoneUnitFeatureComputer'' need a MARY server which runs the NLP components for the target locale. This is a very important point, since the server is needed to convert the text of an utterance into a phone sequence to align with the audio data. You need to make sure a MARY server is running while executing these two components. 
    158159 
    159  * If your transcriptions are in Festvox Format, It is necessary to choose "''Festvox2MaryTranscripts''" Component. Because It will convert Festvox format transcriptions to MARY format transcriptions. Voice Import Tools uses MARY format transcription for building Voice. No need to choose "''Mary2FestvoxTranscripts''" component while Building a new Voice. Just we are providing that component for facilitating user to convert any format depending on requirements.[[BR]] 
     160 * ''!LabelledFilesInspector'' provides a GUI to check the results of automatic labeling. It lets you listen to phone segments according to the timestamps from automatic labeling. If you don't want to inspect the labeling, there is no need to choose this component. 
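
To illustrate the transcription conversion mentioned in the list above, here is a minimal sketch of turning Festvox `txt.done.data` entries into one text file per utterance under `text/`. It is not the actual ''Festvox2MaryTranscripts'' implementation. It assumes the common Festvox entry layout `( utterance_id "sentence" )`, one entry per line, and it assumes that each MARY transcript is a UTF-8 file named after the utterance id; the function names are hypothetical.

```python
import re
from pathlib import Path

# One Festvox entry per line: ( utterance_id "sentence text" )
ENTRY = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_festvox(text):
    """Map utterance ids to sentences; lines that do not match are skipped."""
    entries = {}
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            utt_id, sentence = match.groups()
            # Festvox escapes double quotes inside the sentence as \"
            entries[utt_id] = sentence.replace('\\"', '"')
    return entries


def write_mary_transcripts(entries, text_dir):
    """Write one <utterance_id>.txt file per entry into text_dir."""
    out = Path(text_dir)
    out.mkdir(parents=True, exist_ok=True)
    for utt_id, sentence in entries.items():
        (out / f"{utt_id}.txt").write_text(sentence + "\n", encoding="utf-8")
```

A typical call would read `txt.done.data`, pass its content to `parse_festvox`, and then hand the result to `write_mary_transcripts` together with the path of the `text` directory.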
    160161 
    161  * ''!PhoneUnitFeatureComputer'' and ''!HalfPhoneUnitFeatureComputer'' needs a running MARY Server. It is very important point. User need to make sure a Mary Server running while executing above two Components. And one more important issue is MARY Server need to contain at least one Voice of language (German or English), which user wanted build a new voice.[[BR]] '''**''' Before running Mary Server, please make sure "english-targetfeatures.config" and "english-halfphone-targetfeatures.config" in "$MARY_BASE/conf/" directory for building an English voice. Similarly, "german-targetfeatures.config" and "german-halfphone-targetfeatures.config" required for German voice building.[[BR]] 
     162While executing each component, a progress bar shows the percentage of work completed for that component. A component turns GREEN if it is executed successfully. It turns RED and throws an exception if it encounters an error. If that happens, you will need to understand what went wrong and how it must be fixed. There is no simple recipe for that case. 
    162163 
    163  * ''!LabelledFilesInspector'' gives a GUI interface to check how good Automatic labeling. It will also support user to listen phone segments according to given timestamps from Automatic labeling. If user don't want to inspect labeling, better no need to choose this component. Because it will pause Voice building in between. 
    164  
    165  7. While executing each component, a Progress bar shows the percentage of work completed for that component. Each Component converted to GREEN, if that component is executed successfully. And it converts to RED and it throws an exception, if that component unsuccessfully executed. If a component unsuccessfully executed, check configuration settings once again. 
    166  
    167  We hope this tutorial helps to build a new '''unit selection voice''' using the Voice Import Tools under the MARY platform. The Individual Voice Import Components are explained [wiki:VoiceImportComponents here].  [[BR]] 
     164We hope this tutorial helps you to build a new '''unit selection voice''' using the Voice Import Tools under the MARY platform. The individual Voice Import Components are explained [wiki:VoiceImportComponents here].  [[BR]] 
    168165 
    169166* [wiki:VoiceImportComponents Explanation on Individual Voice Import Components] 
     
    173170* [wiki:HMMVoiceCreationMary4.0 Explanation on how to create HMM-based voices for MARY] 
    174171 
    175 -  Sathish Chandra Pammi (Sathish.Chandra@dfki.de) 
     172-  Sathish Chandra Pammi (Sathish.Chandra@dfki.de) and Marc Schröder