= Transcription Tool =

The MARY Transcription Tool is a graphical user interface that supports a semi-automatic procedure for transcribing the text corpus of a new language and for automatically training letter-to-sound (LTS) rules for that language. It also stores the function words of that language in order to build a primitive POS tagger.

== Requirements: ==

 1. Prepare a phone set for your language.

 ''' Example for locale en-US: ''' [http://mary.opendfki.de/wiki/TranscriptionTool/allophones_en-US.xml]

 2. Provide the input in one of the accepted formats (input from file or from a database).

 Example 1: List of words
{{{
Live
Item
Top
Eintracht
Spieltags
Hannover
sechsundneunzig
Borussia
Arminia
}}}

 Example 2: List of words, with transcriptions for a few of them
{{{
Live 'laIf
Item
Top 'tOp
Eintracht '?aIn-tRaxt
Spieltags 'Spi:l-ta:ks
Hannover
sechsundneunzig
Borussia bo:-'RU_si:-a:
Arminia
}}}

 Example 3: Load from a MySQL table with 3 columns: id, word, frequency (the word's frequency in the text corpus)
{{{
mysql> desc en_US_wordList;
+-----------+------------------+------+-----+---------+----------------+
| Field     | Type             | Null | Key | Default | Extra          |
+-----------+------------------+------+-----+---------+----------------+
| id        | int(11)          | NO   | PRI | NULL    | auto_increment |
| word      | varchar(255)     | NO   |     |         |                |
| frequency | int(10) unsigned | NO   |     |         |                |
+-----------+------------------+------+-----+---------+----------------+

mysql> SELECT * from en_US_wordList WHERE id <= 5;
+----+-------------+-----------+
| id | word        | frequency |
+----+-------------+-----------+
|  1 | treason     |        15 |
|  2 | indignation |         2 |
|  3 | Oilinvest   |         1 |
|  4 | helgu       |         1 |
|  5 | perentie    |         1 |
+----+-------------+-----------+
}}}

== How to run? ==

Run the following commands from a shell script, replacing the placeholder with the path to your MARY installation:
{{{
export MARY_BASE="[PATH TO MARYBASE]"
java -Xmx1024m -classpath $MARY_BASE/java:$MARY_BASE/java/mary-common.jar:\
$MARY_BASE/java/log4j-1.2.8.jar:$MARY_BASE/java/weka.jar:\
$MARY_BASE/java/mysql-connector-java-5.1.7-bin.jar \
-Djava.endorsed.dirs=$MARY_BASE/lib/endorsed \
marytts.tools.transcription.TranscriptionGUI
}}}

{{{
#!html