Changes between Version 11 and Version 12 of NewLanguageSupport
- Timestamp:
- 07/02/09 19:21:43
NewLanguageSupport
v11  v12
247  247  == 5. Run feature maker with the minimal nlp components ==
248  248
249       The '''FeatureMaker Server''' program splits the clean text obtained in step 2 into sentences, classify them as reliable, or non-reliable (sentences with unknownWords or strangeSymbols) and extracts context features from the reliable sentences. All this extracted data will be
     249  The '''FeatureMaker''' program splits the clean text obtained in step 2 into sentences, classify them as reliable, or non-reliable (sentences with unknownWords or strangeSymbols) and extracts context features from the reliable sentences. All this extracted data will be
250  250  kept in the DB.[[BR]]
251  251
…    …
260  260  # just the not processed records.
261  261
262       #Usage: java FeatureMakerMaryServer -locale language -mysqlHost host -mysqlUser user
     262  #Usage: java FeatureMaker -locale language -mysqlHost host -mysqlUser user
263  263  # -mysqlPasswd passwd -mysqlDB wikiDB
264       # [-maryHost localhost -maryPort 59125 -strictCredibility strict]
     264  # [-reliability strict]
265  265  # [-featuresForSelection phoneme,next_phoneme,selection_prosody]
266  266  #
267  267  # required: This program requires a MARY server running and an already created cleanText table in the DB.
268  268  # The cleanText table can be created with the WikipediaProcess program.
269       # default/optional: [-maryHost localhost -maryPort 59125]
270       # default/optional: [-featuresForSelection phoneme,next_phoneme,selection_prosody] (features separated by ,)
271       # optional: [-strictCredibility [strict|lax]]
272       #
273       # -strictCredibility: setting that determines what kind of sentences
274       # are regarded as credible. There are two settings: strict and lax. With
275       # setting strict (default), only those sentences that contain words in the lexicon
276       # or words that were transcribed by the preprocessor are regarded as credible;
277       # the other sentences as unreliable. With setting lax, also those words that
278       # are transcribed with the Denglish and the compound module are regarded as credible.
     269  # default/optional: [-featuresForSelection phone,next_phone,selection_prosody] (features separated by ,)
     270  # optional: [-reliability [strict|lax]]
     271  #
     272  # -reliability: setting that determines what kind of sentences
     273  # are regarded as reliable. There are two settings: strict and lax. With
     274  # setting strict, only those sentences that contain words in the lexicon
     275  # or words that were transcribed by the preprocessor can be selected for the synthesis script;
     276  # the other sentences as unreliable. With setting lax (default), also those words that
     277  # are transcribed with the letter to sound component can be selected.
279  278
280  279
…    …
283  282
284  283  java -Xmx1000m -classpath $CLASSPATH -Djava.endorsed.dirs=$MARY_BASE/lib/endorsed \
285       -Dmary.base=$MARY_BASE marytts.tools.dbselection.FeatureMakerMaryServer \
     284  -Dmary.base=$MARY_BASE marytts.tools.dbselection.FeatureMaker \
286  285  -locale "en_US" \
287  286  -mysqlHost "localhost" \
…    …
289  288  -mysqlPasswd "wiki123" \
290  289  -mysqlDB "wiki" \
291       -featuresForSelection "phoneme,next_phoneme,selection_prosody"
292
293       }}}
294
     290  -featuresForSelection "phone,next_phone,selection_prosody"
     291
     292  }}}
     293
     294  There is a variant of the program, '''FeatureMakerMaryServer''', which calls an external Mary server instead of starting the Mary components internally. It takes the additional command line arguments ''-maryHost localhost -maryPort 59125''.
295  295
296  296  Output: