Elvira - a VoiceXML Platform for Research
Continued from page 1...
This feature is crucial for researchers. It allows them to
extend practically any aspect of VoiceXML by invoking a custom
function, and to perform virtually any operation with ease.
External functions integrate seamlessly into the system and
can be easily reused. Moreover, external functions can also be
called from the VoiceXML <object> tag.
External functions can perform tasks that are beyond the scope
of VoiceXML. They are often used to connect to a database and
retrieve data that is then accessible from ECMAScript within
VoiceXML. However, the spectrum of their use is much broader.
We mention some possibilities in the following section.
Research Scenarios
This section describes some tasks handled by researchers
in the field of
dialogue systems and presents Elvira as a base tool
for their solution.
Statistical Processing of Dialogues
Statistical methods play a key role in human language
technologies. Huge collections of statistical data extracted
from dialogues are analyzed, and models describing various
dialogue properties are derived from them.
Elvira is well suited to collecting such statistical data.
Logging can be done in three different ways:
- Using the VoiceXML <log> tag - this is the standard way,
allowing one to log e.g. the dialogue flow.
- Logging within an external function - this can be used
e.g. for logging database queries.
- Logging within a component - this is typically used by
input and output components for fine-grained logging of
events related to speech synthesis and recognition.
Research in the Field of Dialogue Strategies
A dialogue strategy determines the next step of the dialogue
for every dialogue state. The dialogue strategy of VoiceXML
is described by the form interpretation algorithm (FIA).
However, the FIA is not always powerful enough or suitable
for all dialogue models. Let us name at least two such situations:
- The FIA does not support repeating a prompt when the user
asks "what did you say". (If a <link> that generates an event
is used to catch the phrase, the prompt counters are increased
and a different prompt may be played.)
- There is no way for the FIA to determine where the user
interrupted a spoken prompt. This information is needed to
implement intelligent tapered prompts.
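The first situation follows from the way prompts are selected
by counter. The sketch below is a simplified Python model of
that selection rule as we understand it from the VoiceXML
specification (cond attributes are ignored). It is not code
from Elvira, and the Prompt class, the select_prompts function
and the prompt texts are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prompt:
    text: str
    count: int = 1  # value of the prompt's count attribute


def select_prompts(prompts, counter):
    """Return the prompts played for a given prompt counter value."""
    # Only prompts whose count does not exceed the counter are eligible.
    eligible = [p for p in prompts if p.count <= counter]
    if not eligible:
        return []
    # Among those, only the prompts with the highest count are played.
    correct_count = max(p.count for p in eligible)
    return [p for p in eligible if p.count == correct_count]


prompts = [
    Prompt("Where would you like to travel to?", count=1),
    Prompt("Please say the name of the destination city.", count=2),
]

# First visit of the form item: the counter is 1.
first = select_prompts(prompts, counter=1)
# After the "what did you say" event the item is visited again
# with an increased counter, so a different prompt is selected.
second = select_prompts(prompts, counter=2)
```

In this model the user who asks for a repetition hears the
second, more detailed prompt instead of the one that was
actually missed.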
Our VoiceXML platform makes it possible to replace the FIA
with an external decision mechanism. The idea is quite
simple. Each VoiceXML form item has a cond attribute, which
is an ECMAScript expression, and hence an external function
can be called within the condition. The function can simply
enable or disable the items as needed, and it can base the
decision on information that is not accessible in VoiceXML.
Strictly speaking, this does not replace the FIA; rather, it
restricts the FIA's choice to a single possibility.
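The mechanism can be sketched in Python as follows. This is an
illustrative model only, not Elvira's implementation: fia_select
is a simplified stand-in for the item selection step of the FIA,
and the item names and the date_first strategy are hypothetical.

```python
ITEMS = ["origin", "destination", "date"]  # form items in document order


def fia_select(items, filled, cond):
    """Simplified FIA selection: first unfilled item whose cond is true."""
    for name in items:
        if name not in filled and cond(name):
            return name
    return None


def make_cond(choose_next, filled):
    """Build the cond check that delegates to an external strategy."""
    def cond(name):
        # Only the item picked by the external strategy is enabled.
        return name == choose_next(filled)
    return cond


def date_first(filled):
    """Hypothetical external strategy: ask for the date before the cities."""
    for name in ["date", "origin", "destination"]:
        if name not in filled:
            return name
    return None


def run(cond_factory):
    """Visit items until none is selectable; assume each visit fills the item."""
    filled, visited = set(), []
    while True:
        item = fia_select(ITEMS, filled, cond_factory(filled))
        if item is None:
            return visited
        visited.append(item)
        filled.add(item)


default_order = run(lambda filled: (lambda name: True))
external_order = run(lambda filled: make_cond(date_first, filled))
```

With every cond true, the items are visited in document order.
With the external strategy, exactly one item is enabled at each
step, so the selection loop has no real choice left.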
Wizard of Oz
Wizard of Oz (WOZ) is a technique used for dialogue design.
It helps to find out how people are likely to interact
with a system before the system is finished, or even
before its design has begun. When using this technique, the
user interacts with what appears to be a computer system
but is in fact a simulation provided by either a human
(called the wizard) or a combination of a human and a
computer.
An environment for WOZ simulations was built upon the Elvira
VoiceXML platform. It demonstrates the capabilities of the
platform very well.
As mentioned above, a WOZ simulation requires a wizard who
is able to inspect every step of the dialogue and influence
the subsequent dialogue flow if needed. In this case, a
web-based user interface was created for the wizard. The
interface is depicted in the following figure.

User interface of the Wizard of Oz application. The wizard can see
all values already specified by the user for the current frame and
the values specified in the last dialogue step. The wizard can
change the values as needed and classify the last speech act. The
next dialogue step is performed accordingly.
A technique similar to the external decision mechanism
described above is used in this WOZ system. After each
dialogue step, an external function logs all important
information about the step into a database and waits until
a response is stored in the database. A PHP script waits
for the information stored by the external function and
regenerates the web page for the wizard. When the wizard
submits the form, the script stores the wizard's corrections
in the database. The external function returns the
corrections to VoiceXML and the dialogue is modified
accordingly.
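The handshake through the database can be sketched as follows.
This is a minimal illustration under several assumptions: it
uses Python with SQLite, whereas the article's wizard side is a
PHP script and the database is not specified. The table layout
and the function names are hypothetical, and a thread plays the
role of the wizard.

```python
import os
import sqlite3
import tempfile
import threading
import time


def init_db(path):
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE steps ("
        "id INTEGER PRIMARY KEY, recognized TEXT, correction TEXT)"
    )
    db.commit()
    db.close()


def log_step_and_wait(path, recognized, poll=0.05, timeout=5.0):
    """External-function side: log the step, then wait for the wizard."""
    db = sqlite3.connect(path)
    try:
        cur = db.execute(
            "INSERT INTO steps (recognized) VALUES (?)", (recognized,)
        )
        db.commit()
        step_id = cur.lastrowid
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            row = db.execute(
                "SELECT correction FROM steps WHERE id = ?", (step_id,)
            ).fetchone()
            if row[0] is not None:
                return row[0]  # the wizard's (possibly corrected) value
            time.sleep(poll)
        return recognized  # no answer in time: keep the recognized value
    finally:
        db.close()


def wizard_submit(path, correction, poll=0.05, timeout=5.0):
    """Wizard side: wait for an unanswered step and store the correction."""
    db = sqlite3.connect(path)
    try:
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            row = db.execute(
                "SELECT id FROM steps WHERE correction IS NULL "
                "ORDER BY id LIMIT 1"
            ).fetchone()
            if row is not None:
                db.execute(
                    "UPDATE steps SET correction = ? WHERE id = ?",
                    (correction, row[0]),
                )
                db.commit()
                return True
            time.sleep(poll)
        return False
    finally:
        db.close()


path = os.path.join(tempfile.mkdtemp(), "woz.db")
init_db(path)
wizard = threading.Thread(target=wizard_submit, args=(path, "Prague"))
wizard.start()
result = log_step_and_wait(path, "Prag")  # value handed back to the dialogue
wizard.join()
```

Polling keeps both sides simple and lets them run as separate
processes, at the cost of a small delay per dialogue step. The
rows left in the table are also the data that can be analyzed
after the experiment.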
The data in the database is analyzed after the experiment
and an improved version of the dialogue is created. This is
where VoiceXML excels: modifications of the dialogue can
usually be made easily and quickly.
Continued...


Copyright © 2001-2003 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards
and Technology Organization (IEEE-ISTO).