
Synthesis of vocalizations using MARY TTS

Listener vocalizations play an important role in communicating listener intentions while the interlocutor is talking. They include non-linguistic vocalizations such as uh-huh, mhm, (laughter), and (sigh), as well as verbal response tokens such as yes, right, really, and absolutely. To communicate different intentions, a synthesizer should be capable of generating a broad range of vocalizations with different acoustic properties. In multimodal human-computer interaction, the ability of a system to generate vocal listener behavior is an important requirement for affective interaction. This page provides examples of synthesizing vocalizations with the MARY speech synthesis framework.

The MARY framework supports synthesis of vocalizations through the following MARYXML requests (input type WORDS):

1. Synthesis using a 'variant'

Example:

<maryxml version="0.5" xmlns="http://mary.dfki.de/2002/MaryXML" xml:lang="en-GB">
<voice name="dfki-poppy">
<p>
<vocalization variant="14"/>
</p>
</voice>
</maryxml>
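
A request like the one above can also be generated programmatically. The following is a minimal Python sketch, not part of MARY itself: the function name build_vocalization_request is hypothetical, and the element and attribute names are copied from the example above. It only builds the MaryXML string and does not contact a server.

```python
import xml.etree.ElementTree as ET

MARYXML_NS = "http://mary.dfki.de/2002/MaryXML"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def _q(tag):
    """Qualify a tag name with the MaryXML namespace."""
    return "{%s}%s" % (MARYXML_NS, tag)


def build_vocalization_request(voice, lang="en-GB", **attributes):
    """Return a MaryXML string holding one <vocalization> element.

    Hypothetical helper: keyword arguments become attributes of the
    <vocalization> element, e.g. variant=14.
    """
    # Emit the MaryXML namespace as the default namespace (no prefix).
    ET.register_namespace("", MARYXML_NS)
    root = ET.Element(
        _q("maryxml"),
        {"version": "0.5", "{%s}lang" % XML_NS: lang},
    )
    voice_el = ET.SubElement(root, _q("voice"), {"name": voice})
    paragraph = ET.SubElement(voice_el, _q("p"))
    ET.SubElement(
        paragraph,
        _q("vocalization"),
        {key: str(value) for key, value in attributes.items()},
    )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_vocalization_request("dfki-poppy", variant=14))
```

Building the document with an XML library rather than string concatenation keeps attribute values correctly escaped.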

2. Synthesis of the vocalization that best fits a given target

Example 2.1:

<maryxml version="0.5" xmlns="http://mary.dfki.de/2002/MaryXML" xml:lang="en-GB">
<voice name="dfki-poppy">
<p>
<vocalization name="yeah" meaning="uncertain" intonation="falling" voicequality="modal"/>
</p>
</voice>
</maryxml>

Example 2.2:

<maryxml version="0.5" xmlns="http://mary.dfki.de/2002/MaryXML" xml:lang="en-GB">
<voice name="dfki-poppy">
<p>
<vocalization name="yeah" meaning="agreeing" intonation="mid" voicequality="modal"/>
</p>
</voice>
</maryxml>
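
The requests above have to be sent to a running MARY server to produce audio. The sketch below shows one way this might be done from Python. It rests on several assumptions that are not stated on this page: a server reachable at localhost:59125, a /process endpoint, and the parameter names INPUT_TEXT, INPUT_TYPE, OUTPUT_TYPE, AUDIO, and LOCALE. Check them against the interactive documentation of your server before relying on them. The function names are hypothetical.

```python
import urllib.parse
import urllib.request

# Assumption: a MARY server is running locally on this port.
MARY_SERVER = "http://localhost:59125"


def build_process_query(maryxml, locale="en_GB"):
    """URL-encode a MaryXML document as form parameters.

    The parameter names are assumptions about the server's HTTP
    interface; WORDS is the input type named on this page.
    """
    return urllib.parse.urlencode({
        "INPUT_TEXT": maryxml,
        "INPUT_TYPE": "WORDS",
        "OUTPUT_TYPE": "AUDIO",
        "AUDIO": "WAVE_FILE",
        "LOCALE": locale,
    })


def synthesize(maryxml, out_path, server=MARY_SERVER):
    """POST the request and write the returned audio bytes to out_path."""
    data = build_process_query(maryxml).encode("ascii")
    with urllib.request.urlopen(server + "/process", data=data) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
```

Calling synthesize(request_xml, "yeah.wav") with one of the MaryXML documents above as request_xml would then store the synthesized vocalization, provided the assumed endpoint and parameters match your server.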

Possible values currently supported for each of the attributes of the <vocalization> element in MaryXML:

Attribute      Possible values
meaning        anger, sadness, amusement, happiness, contempt, certain, uncertain, agreeing, disagreeing, interested, uninterested, low-anticipation, high-anticipation, low-solidarity, high-solidarity, low-antagonism, high-antagonism
intonation     rising, falling, high, mid, low
voicequality   modal, creaky, whispery, breathy, tense, lax
name           yeah, yes, mhmh, mhm, right, tsright, tsyeah, aha, (snort), (sigh), (laughter), definitely, really, gosh, ah_I_see, oh_god_(gasp), yeah_absolutely
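
Attribute values can be checked on the client side before a request is built. The following Python sketch copies the meaning, intonation, and voicequality values from the table above; the function name check_vocalization_attributes is hypothetical. The name and variant attributes are deliberately not validated, on the assumption that their valid values depend on the voice. How the server itself reacts to unsupported values is not covered here.

```python
# Values copied from the table of supported attribute values.
SUPPORTED_VALUES = {
    "meaning": {
        "anger", "sadness", "amusement", "happiness", "contempt",
        "certain", "uncertain", "agreeing", "disagreeing",
        "interested", "uninterested",
        "low-anticipation", "high-anticipation",
        "low-solidarity", "high-solidarity",
        "low-antagonism", "high-antagonism",
    },
    "intonation": {"rising", "falling", "high", "mid", "low"},
    "voicequality": {"modal", "creaky", "whispery", "breathy", "tense", "lax"},
}

# Assumed to be voice specific, so not checked against a fixed list.
UNCHECKED_ATTRIBUTES = {"name", "variant"}


def check_vocalization_attributes(**attributes):
    """Return a list of human-readable problems (empty if none found)."""
    problems = []
    for key, value in attributes.items():
        if key in UNCHECKED_ATTRIBUTES:
            continue
        allowed = SUPPORTED_VALUES.get(key)
        if allowed is None:
            problems.append("unknown attribute: %s" % key)
        elif value not in allowed:
            problems.append("unsupported %s value: %s" % (key, value))
    return problems
```

For instance, check_vocalization_attributes(name="yeah", meaning="uncertain", intonation="falling", voicequality="modal") returns an empty list, while a misspelled intonation value is reported as a problem.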

Values of the name attribute are voice-specific; the interactive documentation lists the values each voice supports.

See also the interactive documentation at http://mary.dfki.de:59125/documentation.html#vocalizations

Last modified on 12/10/10 13:41:42