First Words
Welcome to “First Words” – the VoiceXML
Review’s column to teach you about VoiceXML and
how you can use it. We hope you enjoy the lesson.
VoiceXML 2.1
In this lesson, we’re going to continue investigating
VoiceXML 2.1.
You may recall that as VoiceXML platform vendors and
application developers began to widely deploy VoiceXML
applications, they began to identify potential future
extensions to the language. The result of this experience
is a collection of field-proven features that are candidates
for addition to the VoiceXML language. These features
are being proposed as part of VoiceXML 2.1.
Just as a reminder, VoiceXML 2.1 has been released
as a Last Call Working Draft. Here is a pointer:
http://www.w3.org/TR/2004/WD-voicexml21-20040728/
Note: if you’re reading this article after VoiceXML
2.1 has been finalized and published, you should spend
a few minutes tracking down the final specification
rather than this link, as the specification may have
undergone minor changes.
The new features proposed for VoiceXML 2.1 are based
on feedback from application developers and VoiceXML
platform developers. The features we’ve covered
already include:
- Referencing Grammars Dynamically – Generation of a grammar URI reference with an expression;
- Referencing Scripts Dynamically – Generation of a script URI reference with an expression;
- Recording user utterances while attempting recognition – Provides access to the actual caller utterance, for use in the user interface or for submission to the application server;
- Adding namelist to <disconnect> – The ability to pass information back to the VoiceXML platform environment (for example, if the application wishes to pass results to a CCXML session related to the call);
- Using <mark> to detect barge-in during prompt playback – Placement of ‘bookmarks’ within a prompt stream to identify where a barge-in has occurred.
Here are the links to the previous articles in this
series:
http://www.voicexmlreview.org/Sep2004/columns/sep2004_first_words.html
http://www.voicexmlreview.org/Nov2004/columns/nov2004_first_words.html
http://www.voicexmlreview.org/Feb2005/columns/Feb2005_first_words.html
This issue, we’re going to look at:
- Concatenating Prompts Dynamically using <foreach>
The <foreach> Tag
The <foreach> tag is perhaps one of the most widely implemented extensions
to VoiceXML 2.0. As a result of its usefulness, it was selected for inclusion
in VoiceXML 2.1. This feature adds a looping construct
to VoiceXML.
The primary use-case is constructing a dynamic list
of prompts without requiring a trip to the application
server. This allows wider use of static VoiceXML pages,
particularly when combined with some of the other features
in VoiceXML 2.1. This can lead to more efficient applications
with a better partitioning between presentation and
business logic as well.
The <foreach> tag takes two attributes, both of which are required.
If either is missing, an error.badfetch event is thrown
(as is usual) when the page is processed. The
two attributes are:
- array – an ECMAScript expression evaluating to an ECMAScript array. The loop will be executed for each member of this array;
- item – an ECMAScript variable that is used as the ‘loop variable’. For each iteration through the loop, this variable will be set to the current array element being processed.
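The looping behavior these two attributes describe can be modeled in plain ECMAScript. What follows is only a sketch of the semantics, not how any VoiceXML platform implements the tag. The function name runForeach, the scope object, and the callback standing in for the loop body are all assumptions made for this illustration.

```javascript
// Hypothetical model of <foreach> semantics; not part of any VoiceXML platform.
//   arrayValue: the value the "array" attribute evaluates to
//   itemName:   the variable name given in the "item" attribute
//   body:       a callback standing in for the content inside <foreach>
function runForeach(arrayValue, itemName, body) {
  // Both attributes are required; a missing one is modeled as error.badfetch.
  if (arrayValue === undefined || itemName === undefined) {
    throw new Error("error.badfetch");
  }
  var scope = {};   // stands in for the variable scope of the dialog
  var output = [];  // collects whatever the body produces on each pass
  for (var i = 0; i < arrayValue.length; i++) {
    scope[itemName] = arrayValue[i]; // loop variable takes the current element
    output.push(body(scope));
  }
  return output;
}

// Illustrative data only: two menu choices read back to the caller.
var queued = runForeach(["sales", "support"], "choice", function (scope) {
  return "option: " + scope.choice;
});
console.log(queued.join("; "));
```

The body runs once per array element, in order, with the loop variable rebound each time, which is all the tag itself adds to the language.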
The VoiceXML 2.1 Last Call Working Draft section on <foreach> has
a great selection of example code:
http://www.w3.org/TR/2004/WD-voicexml21-20040728/#sec-foreach
We’re going to have a look at the first example,
which should give you a flavor of how <foreach> can
be used.
There are two snippets of code in this example. The
first is the VoiceXML component.
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <script src="movies.js"/>
  <form id="pick_movie">
    <!--
      GetMovieList returns an array of objects
      with properties audio and tts.
      The size of the array is undetermined until runtime.
    -->
    <var name="prompts" expr="GetMovieList()"/>
    <field name="movie">
      <grammar type="application/srgs+xml" src="movie_names.grxml"/>
      <prompt>Say the name of the movie you want.</prompt>
      <prompt count="2">
        <audio>
          When you hear the name of the movie you want,
          just say it.
        </audio>
        <foreach item="thePrompt" array="prompts">
          <audio expr="thePrompt.audio">
            <value expr="thePrompt.tts"/>
          </audio>
          <break time="300ms"/>
        </foreach>
      </prompt>
      <noinput>
        I'm sorry. I didn't hear you.
        <reprompt/>
      </noinput>
      <nomatch>
        I'm sorry. I didn't get that.
        <reprompt/>
      </nomatch>
    </field>
  </form>
</vxml>
The second code snippet, again straight from the VoiceXML
2.1 working draft, is an ECMAScript function returning
an array of movies that can be requested:
function GetMovieList()
{
  var movies = new Array(3);
  movies[0] = new Object();
  movies[0].audio = "godfather.wav";
  movies[0].tts = "the godfather";
  movies[1] = new Object();
  movies[1].audio = "high_fidelity.wav";
  movies[1].tts = "high fidelity";
  movies[2] = new Object();
  movies[2].audio = "raiders.wav";
  movies[2].tts = "raiders of the lost ark";
  return movies;
}
In this function, we’ve created an array consisting
of three ECMAScript objects. Each of these objects
has two properties: the name of an audio file
that contains the (pre-recorded) name of the movie,
and a string containing text that can be played in
the event that the audio file can’t be found.
Here are the interesting bits from the VoiceXML page:
<var name="prompts" expr="GetMovieList()"/>
This variable declaration calls our ECMAScript movie function,
generating the array of objects that will
be used in the <foreach> loop, shown below:
<foreach item="thePrompt" array="prompts">
  <audio expr="thePrompt.audio">
    <value expr="thePrompt.tts"/>
  </audio>
  <break time="300ms"/>
</foreach>
If the caller is unsure of what to say, or names a movie
not in our grammar file, the <foreach> loop
will be processed. This is triggered by the ‘count="2"’ attribute
on the <prompt> enclosing the <foreach> loop.
By doing this the second time we prompt, we allow
the experienced user to move through their task quickly
while providing a fallback in the event that the
caller needs help.
What this amounts to is the queuing of the following
audio for the second recognition attempt with the user:
<audio src="godfather.wav">the godfather</audio>
<break time="300ms"/>
<audio src="high_fidelity.wav">high fidelity</audio>
<break time="300ms"/>
<audio src="raiders.wav">raiders of the lost ark</audio>
<break time="300ms"/>
You’ll note that we’ve
queued up the three pre-recorded audio files, each
with its alternate text and followed by an embedded pause.
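To make the expansion concrete, here is a rough ECMAScript sketch that turns an array of objects with audio and tts properties into the lines of markup listed above. The helper name expandPrompts is hypothetical and exists only for this illustration. A real platform queues audio rather than building strings, and the sketch does no XML escaping.

```javascript
// Hypothetical helper: shows what the <foreach> loop effectively queues.
// It builds markup strings purely for illustration and does not escape XML.
function expandPrompts(prompts) {
  var lines = [];
  for (var i = 0; i < prompts.length; i++) {
    var p = prompts[i];
    // One <audio> per element, with the tts text as the fallback content...
    lines.push('<audio src="' + p.audio + '">' + p.tts + '</audio>');
    // ...followed by the fixed pause from the loop body.
    lines.push('<break time="300ms"/>');
  }
  return lines;
}

// Same data that GetMovieList() returns in the example.
var movies = [
  { audio: "godfather.wav", tts: "the godfather" },
  { audio: "high_fidelity.wav", tts: "high fidelity" },
  { audio: "raiders.wav", tts: "raiders of the lost ark" }
];
console.log(expandPrompts(movies).join("\n"));
```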
This is a rudimentary example that doesn’t
show much advantage over just naming the prompts
in the original page. You may well ask why we didn’t
just generate the page with this list of files initially. But
imagine being able to decide how to generate this list
of prompts based on user input that has been received
earlier within this page (“No action movies for
me please”). Or imagine being able to retrieve the movie
data (as an XML data object) from within a static VoiceXML
page (using the <data> tag), and then constructing and
managing a list of prompts on the fly, without
going back to the application server.
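As a rough illustration of the “no action movies” idea, the movie function could accept the caller’s preference and filter the list before <foreach> ever sees it. Everything new here is an assumption made for the sketch: the function name GetFilteredMovieList, its excludedGenre parameter, and the genre property with its values do not come from the working draft.

```javascript
// Hypothetical variant of GetMovieList. The genre property, its values, and
// the excludedGenre parameter are assumptions added for this illustration.
function GetFilteredMovieList(excludedGenre) {
  var movies = [
    { audio: "godfather.wav", tts: "the godfather", genre: "drama" },
    { audio: "high_fidelity.wav", tts: "high fidelity", genre: "comedy" },
    { audio: "raiders.wav", tts: "raiders of the lost ark", genre: "action" }
  ];
  var result = [];
  for (var i = 0; i < movies.length; i++) {
    // Keep every movie whose genre the caller has not ruled out.
    if (movies[i].genre !== excludedGenre) {
      result.push(movies[i]);
    }
  }
  return result;
}

// A caller who said "no action movies" would hear only the remaining titles.
var prompts = GetFilteredMovieList("action");
for (var i = 0; i < prompts.length; i++) {
  console.log(prompts[i].tts);
}
```

The resulting array could be handed to the array attribute of <foreach> in the same way as prompts in the example, so the page itself stays static.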
The <foreach> tag can be useful on its own,
but when combined with <data> it allows the construction
of very powerful pages using only static VoiceXML.
For a detailed example of this method in action, have
a look at the other examples provided in the VoiceXML
2.1 specification for the <foreach> tag.
Summary
Here is the direct link to the ‘foreach’ tag
feature:
http://www.w3.org/TR/2004/WD-voicexml21-20040728/#sec-foreach
VoiceXML 2.1 proposes some useful additional features for
VoiceXML 2.0, based on real-world deployment
experience. We’re going to continue drilling down
into these features in forthcoming issues.
In future issues, we’re going to look at these:
- Using <data> to fetch XML without requiring a dialog transition – Retrieval of XML data, and construction of a related DOM object, without requiring a transition to another VoiceXML page.
- Adding type to <transfer> – Support for additional transfer flexibility (in particular, a supervised transfer), among other capabilities.
These are features that will likely get a full article
each, as they are powerful and can provide the VoiceXML
developer with new ways to build applications. And
the astute reader will note that I’ve left the
hardest ones for last!
As always, if you have questions or topics for VoiceXML
2.0 or 2.1, drop us a line!


Copyright © 2001-2005 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE
Industry Standards and Technology Organization (IEEE-ISTO).