First Words
Welcome to “First Words” – the VoiceXML
Review’s column to teach you about VoiceXML and
how you can use it. We hope you enjoy the lesson.
VoiceXML 2.1
As promised last issue, we’re going to start
learning about VoiceXML 2.1. You may recall that as
VoiceXML platform vendors and application developers
began to widely deploy VoiceXML applications, they began
to identify potential future extensions to the language.
The result of this experience is a collection of field-proven
features that are candidates for addition to the VoiceXML
language. These features are being proposed as part
of VoiceXML 2.1.
For those keeping score, VoiceXML 2.1 has just (as
of this writing) been released as a Last Call Working
Draft. We encourage you to have a look:
http://www.w3.org/TR/2004/WD-voicexml21-20040728/
To review, the new features proposed for VoiceXML 2.1
are based on feedback from application developers and
VoiceXML platform developers. Those features proposed
as part of VoiceXML 2.1 include:
- Referencing Grammars Dynamically – Generation
of a grammar URI reference with an expression;
- Referencing Scripts Dynamically – Generation of
a script URI reference with an expression;
- Using <mark> to detect barge-in during prompt
playback – Placement of ‘bookmarks’
within a prompt stream to identify where a barge-in
has occurred;
- Using <data> to fetch XML without requiring a
dialog transition – Retrieval of XML data, and
construction of a related DOM object, without requiring
a transition to another VoiceXML page;
- Concatenating prompts dynamically using <foreach>
– Building of prompt sequences dynamically using ECMAScript;
- Recording user utterances while attempting recognition
– Access to the actual caller utterance,
for use in the user interface, or for submission to
the application server;
- Adding namelist to <disconnect> – The ability
to pass information back to the VoiceXML platform environment
(for example, if the application wishes to pass results
to a CCXML session related to this call);
- Adding type to <transfer> – Support for additional
transfer flexibility (in particular, a supervised transfer),
among other capabilities.
We’re going to peek at the first two in this
issue. These, along with several of the other features,
provide increased ability to process within the VoiceXML
page itself, rather than having to regenerate a VoiceXML
page from the application server. For more information
on how to generate dynamic VoiceXML, have a look at
the following First Words columns:
http://www.voicexmlreview.org/Jun2001/columns/Jun2001_first_words.html
http://www.voicexmlreview.org/Nov2001/columns/Nov2001_first_words.html
In VoiceXML 2.0, a grammar or script is either placed
in-line (within an XML <grammar> or <script>
element, respectively) or referenced by a URL identifying
the data to be used for the grammar or script. The URL
in both cases is specified with the ‘src’
attribute, which contains a static value. It is useful, however,
to be able to select a grammar or script
based on the evaluation of an expression. VoiceXML 2.1
adds the ‘srcexpr’ attribute for this purpose. The following
example, taken directly from the VoiceXML 2.1 Last Call Working
Draft, illustrates this feature for the <grammar> element. Note
the two <grammar> elements:
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form id="get_address">
    <field name="citystate">
      <grammar type="application/srgs+xml" src="citystate.grxml"/>
      <prompt>Say a city and state.</prompt>
    </field>
    <field name="street">
      <grammar type="application/srgs+xml" srcexpr="citystate + '.grxml'"/>
      <prompt> What street are you looking for? </prompt>
    </field>
    <filled>
      <prompt>
        You chose
        <value expr="street"/>
        in
        <value expr="citystate"/>
      </prompt>
      <exit/>
    </filled>
  </form>
</vxml>
The first field in the example collects a city and
state using a conventional static URL reference to a
grammar. The grammar is identified using the ‘src’
attribute.
The second field then uses this response in an expression
to select the grammar to be used for the second field
collection. The expression simply takes the first recognition
result and then appends “.grxml” to the
returned value. For example, recognition for the utterance
“Toronto Ontario” might return the semantic
interpretation “TorontoOntario” (I know,
it’s not a state, but I’m Canadian). The
ECMAScript expression that is contained in the ‘srcexpr’
attribute would then evaluate to “TorontoOntario.grxml”.
This URL is then used to fetch the grammar for the second
field. (You may recall that the “grxml”
filename extension is the convention used for SRGS format
grammar files.)
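To make the fetch concrete, here is a minimal sketch of what a file named “TorontoOntario.grxml” could contain. It is only an illustration: the rule name and the street names are invented here, and the Working Draft example does not show the actual grammar files.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical street grammar for one city; all names below are assumptions -->
<grammar xmlns="http://www.w3.org/2001/06/grammar" version="1.0"
         xml:lang="en-CA" mode="voice" root="street">
  <rule id="street" scope="public">
    <one-of>
      <item>Yonge Street</item>
      <item>Bloor Street</item>
      <item>Queen Street</item>
    </one-of>
  </rule>
</grammar>
```

With one such file per city, the page itself stays static and only the grammar that is fetched changes from call to call.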
A sample of the same technique with the <script>
tag is shown below. The ‘user_id’ variable
is used in the expression to dynamically construct the
URL of an ECMAScript resource. Notice that the URL in this
case contains a query component, which the server uses
to pass a parameter to the ‘passport’ program
(which might be a script, servlet, or some other
CGI-compatible program).
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form>
    <var name="user_id" expr="12345"/>
    <script srcexpr="'http://example.org/passport?id=' + user_id"/>
  </form>
</vxml>
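A numeric ID like the one above is safe to append as-is. If the value could contain spaces or characters such as “&”, it would need to be escaped before it goes into the query component. The variant below is a sketch under two assumptions: the string value of ‘user_id’ is made up for illustration, and the platform's ECMAScript interpreter provides the standard encodeURIComponent function (defined in ECMAScript 3rd edition).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml xmlns="http://www.w3.org/2001/vxml" version="2.1">
  <form>
    <!-- Hypothetical ID containing characters that are unsafe in a query string -->
    <var name="user_id" expr="'jane doe&amp;co'"/>
    <!-- Escape the value so it arrives at the server as a single 'id' parameter -->
    <script srcexpr="'http://example.org/passport?id=' + encodeURIComponent(user_id)"/>
  </form>
</vxml>
```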
Error handling is the same for both features. Exactly
one of "src", "srcexpr", or an inline
grammar or script must be specified; otherwise, an error.badfetch
event is thrown.
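The fragment below sketches that rule. It is not a complete VoiceXML document; the elements are shown side by side only for comparison, and they reuse the file names from the earlier example.

```xml
<!-- Valid: a static reference only -->
<grammar type="application/srgs+xml" src="citystate.grxml"/>

<!-- Valid: a computed reference only -->
<grammar type="application/srgs+xml" srcexpr="citystate + '.grxml'"/>

<!-- Invalid: both src and srcexpr are present, so error.badfetch is thrown -->
<grammar type="application/srgs+xml" src="citystate.grxml"
         srcexpr="citystate + '.grxml'"/>

<!-- Invalid: no src, no srcexpr, and no inline content, so error.badfetch is thrown -->
<script/>
```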
These two new features allow the author to accomplish
more dynamically within a VoiceXML page, and help support
the development model where the focus is more on static
VoiceXML pages, as opposed to placing more processing
on the application server. A number of the other features
that we’ll review in the next few issues are directly
related to this development model as well, and provide
some exciting capabilities to the developer. These features
also make the language somewhat more consistent, since
URLs can be generated from expressions in a number of
other areas of the language.
Here are the direct links to these two new features.
http://www.w3.org/TR/2004/WD-voicexml21-20040728/#sec-grammar_expr
http://www.w3.org/TR/2004/WD-voicexml21-20040728/#sec-script_expr
Watch for more information on VoiceXML 2.1 in our forthcoming
issues.
Summary
VoiceXML 2.1 proposes some useful additional features
for VoiceXML 2.0, based on real-world deployment experience.
We’re going to continue drilling down into these features
in forthcoming issues.
As always, if you have questions or topic suggestions for VoiceXML 2.0
or 2.1, drop us a line!

Copyright © 2001-2004 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards and Technology Organization (IEEE-ISTO).