12 | | The core OpenMary system, as released on this development page, is distributed under a very liberal BSD-style license, which basically allows you to do anything you want with the code provided that you acknowledge where you got it from: http://mary.dfki.de/download/MARY%20software%20user%20agreement.html Scientific publications based on MARY are requested to cite the MARY reference paper, Schröder & Trouvain (2003). |
13 | | |
14 | | The German language modules as well as the English part-of-speech tagger are released in binary form, under a research license: http://mary.dfki.de/download/DFKI%20MARY%20software%20user%20agreement.html You must not use this code in a commercial setup unless you obtain a separate license from DFKI, and there are other restrictions. Do read the license agreement carefully when you use the German components. |
15 | | |
16 | | The MBROLA binaries and voices, finally, are distributed with MARY because that is allowed by the MBROLA license: http://mary.dfki.de/download/Mbrola%20software%20user%20agreement.html These can only be used in a non-commercial, non-military setting. |
17 | | |
18 | | '''How difficult is it to add support for Hebrew/Italian/Spanish/Hindi/...? Is Mary modular in that sense?''' |
19 | | |
20 | | Mary is very modular, and a number of modules exist in a language-independent and configurable implementation, but adding a new language still takes a fair amount of work.
21 | | |
22 | | For many languages, you could start with the existing MBROLA diphone voices: |
23 | | http://tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html |
24 | | |
25 | | You would then need at least the following MARY TTS modules: |
26 | | |
27 | | * needed: a Tokeniser, cutting the input into sentences and tokens (it may be possible to re-use source:trunk/java/de/dfki/lt/mary/modules/JTokeniser.java for a number of languages) |
28 | | |
29 | | * optional: a text normalisation which expands numbers, abbreviations etc. into a pronounceable form (but that can be left out at the beginning) |
30 | | |
31 | | * optional: a part-of-speech tagger, distinguishing at least between content words and function words |
32 | | |
33 | | * crucially needed: a phonemiser, converting the input text into sound symbols, e.g. in SAMPA. This can be based on rules for some languages (probably Spanish), but a pronunciation lexicon is required for others, where the link between spelling and pronunciation is less regular. In that case, the lexicon must also be complemented with "letter-to-sound" rules for unknown words.
34 | | |
35 | | * optional: a prosody assignment module, predicting e.g. ToBI labels based on part-of-speech and other information. |
36 | | source:trunk/java/de/dfki/lt/mary/modules/ProsodyGeneric.java, written by my student Stephanie Becker, may be a good place to start. |
37 | | |
38 | | * needed: a duration assignment module, predicting phone durations. As a first step, the Klatt rules currently used in the Tibetan language component, source:trunk/java/de/dfki/lt/mary/modules/tib/KlattDurationModeller.java
39 | | could be used, adapted of course to the language-specific phoneme set.
40 | | |
41 | | * optional: an intonation contour realisation module. For example, there is a generic source:trunk/java/de/dfki/lt/mary/modules/TobiContourGenerator.java that can be used for different languages by writing appropriate config files. |
42 | | |
43 | | * needed: synthesis, e.g. using MBROLA voices. |
44 | | |
45 | | So, in summary, to add a new language you most crucially need a
46 | | phonemiser, and you need to get at least a tokeniser and a duration
47 | | assigner to work, assuming that there is already an acceptable MBROLA
48 | | voice for your language.
49 | | |
50 | | On the bright side, as data representation is based on Unicode, there |
51 | | should be no problem with non-European scripts. |
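As a rough illustration of the module list above, the following sketch encodes which stages are marked "needed" and reports the ones still missing for a new language. It is not part of MARY: the class name `NewLanguageChecklist`, the `Stage` enum and the `missingRequired` method are hypothetical names made up for this example, and the required/optional flags simply mirror the bullets above.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not a MARY class: a checklist of the stages listed above.
class NewLanguageChecklist {

    // Stages in processing order; 'required' mirrors "needed" vs "optional" in the list.
    enum Stage {
        TOKENISER(true),
        TEXT_NORMALISATION(false),
        POS_TAGGER(false),
        PHONEMISER(true),
        PROSODY(false),
        DURATION(true),
        CONTOUR(false),
        SYNTHESIS(true);

        final boolean required;

        Stage(boolean required) {
            this.required = required;
        }
    }

    // Returns the required stages that are not yet available, in pipeline order.
    static List<Stage> missingRequired(Set<Stage> available) {
        List<Stage> missing = new ArrayList<>();
        for (Stage stage : Stage.values()) {
            if (stage.required && !available.contains(stage)) {
                missing.add(stage);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Example: a re-used tokeniser and an existing MBROLA voice are available.
        Set<Stage> have = EnumSet.of(Stage.TOKENISER, Stage.SYNTHESIS);
        System.out.println("Still needed: " + missingRequired(have));
    }
}
```

With only a tokeniser and a synthesis voice in place, the phonemiser and the duration module remain on the list, which matches the summary above.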
| 11 | What takes time is starting components and, in particular, unit selection voices. So the more languages and voices you install, the longer maryserver will take to start up. Use the mary-component-installer to uninstall what you don't need. |
54 | | '''Unfortunately, I'm just a C++ programmer and have no experience with Java. I have made some changes in the Mary source code -- how can I compile and test my changes?'''
| 14 | === What exactly is the license for the software? === |
| 15 | |
| 16 | The OpenMary core system is released under the GNU Lesser General Public License [LGPL|http://www.gnu.org/licenses/lgpl-3.0-standalone.html]. Language components for English, German, Telugu and Turkish are currently also released under the LGPL.
| 17 | |
| 18 | Different speech synthesis voices are distributed under different licenses: |
| 19 | |
| 20 | * the [Arctic license|http://mary.dfki.de/download/voices/arctic-license.html] |
| 21 | * the [Creative Commons Attribution-NoDerivatives license|http://mary.dfki.de/download/by-nd-3.0.html] |
| 22 | * the [MBROLA license|http://mary.dfki.de/download/Mbrola%20software%20user%20agreement.html]
| 23 | * possibly other licenses in the future.
| 24 | |
| 25 | The installer should show you the respective license for a component you select. You must agree to a license before you can install and use a component. |
99 | | {{{ |
100 | | ... |
101 | | 2006-07-06 20:31:12,635 [main] INFO MbrolaSynthesizer Starting my own MbrolaCaller |
102 | | (de.dfki.lt.mary.modules.MbrolaJniCaller) |
103 | | # |
104 | | # An unexpected error has been detected by HotSpot Virtual Machine: |
105 | | # |
106 | | # EXCEPTION_ACCESS_VIOLATION (0xc0000005) at pc=0x00000000, pid=3320, tid=3324 |
107 | | # |
108 | | # Java VM: Java HotSpot(TM) Client VM (1.5.0_04-b05 mixed mode, sharing) |
109 | | # Problematic frame: |
110 | | # C 0x00000000 |
111 | | # |
112 | | # An error report file with more information is saved as hs_err_pid3320.log |
113 | | # |
114 | | # If you would like to submit a bug report, please visit: |
115 | | # http://java.sun.com/webapps/bugreport/crash.jsp |
116 | | # |
117 | | }}} |
118 | | |
119 | | (or similar), copy the files mbrola.dll and MbrolaJNI.dll to your system directory (e.g. C:\Windows\System32).
120 | | |
121 | | |
122 | | |
123 | | '''Will there be support for the open 'ogg vorbis' format for audio output?''' |
124 | | |
125 | | |
126 | | '''What are the requirements for MARY on Ubuntu Studio without an online connection?'''
127 | | |
128 | | * Ubuntu Studio 7.04 comes without a Java compiler. Which Java (gij/gcj, the JRE 6 from sun.com, ...) is needed?
129 | | |
130 | | * Is an online connection a requirement for the MARY installer? If it is not available, is the sources package an option?
| 75 | If someone writes a reliable Ogg Vorbis encoder in Java, we will be happy to add support for it. We do not intend to use native libraries, though; the deployment issues are simply too complex and time-consuming.