The results indicate that Java technology has made major
advances since 1999. The old Microsoft 1.1 JVM
did not implement threading efficiently, and so it does
not scale at all well in this test as the number of
concurrent channels increases. But the recent
Sun and IBM JVMs run efficiently and scale very well
due to advances in garbage collection, threading, and
just-in-time compilation. In fact, the IBM 1.3
JVM handles one dialog every three seconds on each of
200 channels, five or six times the speed required by
the temperature conversion dialog. Scaling the 200 channels
by that factor of five or six suggests that the desktop could
handle over 1,000 channels of this artificial test. So
Java performance is probably not far from C++ performance.
We can also conclude that VoiceXML interpretation by
itself can be very efficient.
Therefore, we endorse the use of Java for writing voice browsers.
If you already use a Java-based voice browser, do experiment
with different JVMs to see which work best for your
system. Other data gathered with our browser suggest
that the Sun JVM is as fast as the IBM JVM.
Other tests indicate that the BEA JRockit JVM is significantly
faster than the IBM JVM, but sadly has stability issues.
You might also experiment with the IBM Jikes Java compiler
and see if it helps.
Methodology
Our team used an agile methodology, and when Kent Beck's
Extreme Programming Explained came out in September
1999, we quickly agreed with most of his ideas.
The VoxGateway was a good candidate for an agile methodology:
the area of voice browsing was new and innovative; we
had a small, highly experienced team; and the requirements
changed rapidly as VoiceXML 1.0 took form. So
it made good sense to be agile.
We built the system in small increments of functionality,
always tried to have a working system, and tested every
change before committing it to the project source control
system. We continually refactored our code.
Although we intellectually assented to pair programming,
we never had the nerve to shift to it. Refactoring
was essential to deal with the huge changes between
VoiceXML 0.9 and 1.0, while testing each change before
inclusion into the project source tree gave us the courage
to refactor. I would recommend an agile approach
to anyone developing their own VoiceXML system.
Architecture
The key architectural decision was to use the Factory pattern
(see Design Patterns by Gamma, Helm, Johnson,
and Vlissides, Addison-Wesley, 1995). In this
pattern a dimension of variability is identified and
then captured in an abstract superclass. The concrete
subclasses of this class then represent variations on
this dimension. A Factory object is the only place
where the subclasses are referenced: the rest of the
system only sees the abstraction. For example,
in our system we need abstract "URLFetcher"
objects to go off and get web pages. Our system
shouldn't care what particular URLFetcher is used.
So in our Factory object we have a method called newURLFetcher()
which returns a URLFetcher whose actual subclass can
be a plain vanilla JavaURLFetcher, a Win32URLFetcher
that uses the Microsoft Wininet DLL used by Internet
Explorer, or a JigsawURLFetcher that uses the W3C Jigsaw
client to fetch web pages.
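
As a rough illustration of the pattern just described, the sketch below shows a Factory whose newURLFetcher() method hides the concrete fetcher from its callers. Only the names Factory, URLFetcher, JavaURLFetcher, and newURLFetcher() come from the text above; the fetch() signature, the CannedURLFetcher, the two Factory subclasses, and the demo class are assumptions made for the example, not the VoxGateway's actual code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// The abstraction the rest of the system sees (fetch() signature is assumed).
abstract class URLFetcher {
    abstract byte[] fetch(String url);
}

// Plain variant built on the JDK's own URL handling.
class JavaURLFetcher extends URLFetcher {
    @Override
    byte[] fetch(String url) {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

// Hypothetical variant that returns a fixed page, handy for offline tests.
class CannedURLFetcher extends URLFetcher {
    private final String page;

    CannedURLFetcher(String page) {
        this.page = page;
    }

    @Override
    byte[] fetch(String url) {
        return page.getBytes(StandardCharsets.UTF_8);
    }
}

// The only place where concrete fetcher classes are named.
abstract class Factory {
    abstract URLFetcher newURLFetcher();
}

// Hypothetical configuration that fetches over the network.
class NetworkFactory extends Factory {
    @Override
    URLFetcher newURLFetcher() {
        return new JavaURLFetcher();
    }
}

// Hypothetical configuration that never touches the network.
class OfflineFactory extends Factory {
    @Override
    URLFetcher newURLFetcher() {
        return new CannedURLFetcher("<vxml/>");
    }
}

class FactoryDemo {
    // Client code depends only on Factory and URLFetcher.
    static int pageLength(Factory factory, String url) {
        return factory.newURLFetcher().fetch(url).length;
    }

    public static void main(String[] args) {
        System.out.println(pageLength(new OfflineFactory(), "http://example.invalid/start.vxml"));
    }
}
```

Swapping OfflineFactory for NetworkFactory changes which fetcher is used without touching pageLength(), which is the property the pattern is meant to provide.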
The Factory is itself an abstract superclass, so that different
subclasses of Factory can define different configurations
of the VoxGateway. For instance, the FlexibleFactory
is a very generic subclass that determines the configuration
settings from a properties file. One property is the name
of the URLFetcher class to use.
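
One plausible way to build such a properties-driven factory is to load the configured class by name through reflection. The sketch below assumes a property key "urlFetcher.class" and a no-argument constructor on every fetcher; both are inventions for this example, as are StubURLFetcher and the demo class. The real FlexibleFactory may work differently.

```java
import java.util.Properties;

// Minimal abstraction, repeated here so the sketch stands alone.
abstract class URLFetcher {
    abstract byte[] fetch(String url);
}

// Hypothetical fetcher used only to give the factory something to load.
class StubURLFetcher extends URLFetcher {
    @Override
    byte[] fetch(String url) {
        return new byte[0];
    }
}

abstract class Factory {
    abstract URLFetcher newURLFetcher();
}

// Chooses the concrete URLFetcher from a configuration property.
class FlexibleFactory extends Factory {
    // Assumed property name; the original key is not given in the article.
    static final String FETCHER_KEY = "urlFetcher.class";

    private final Properties config;

    FlexibleFactory(Properties config) {
        this.config = config;
    }

    @Override
    URLFetcher newURLFetcher() {
        String className = config.getProperty(FETCHER_KEY, StubURLFetcher.class.getName());
        try {
            // Load the class by name and call its no-argument constructor.
            Object instance = Class.forName(className).getDeclaredConstructor().newInstance();
            return (URLFetcher) instance;
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException("Cannot create URLFetcher: " + className, e);
        }
    }
}

class FlexibleFactoryDemo {
    public static void main(String[] args) {
        // In a deployment these settings would come from Properties.load() on a file.
        Properties config = new Properties();
        config.setProperty(FlexibleFactory.FETCHER_KEY, StubURLFetcher.class.getName());
        URLFetcher fetcher = new FlexibleFactory(config).newURLFetcher();
        System.out.println(fetcher.getClass().getName());
    }
}
```

With this arrangement, switching fetchers is a one-line change to the properties file rather than a code change.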
The Factory pattern is exploited repeatedly. To protect
our code from knowing which particular XML parser is
being used, the Factory's newXMLParser() method returns
an object of type XMLParser. This allowed us to
shift from the IBM XML Parser for Java to the Xerces
XML Parser with very little effort. Likewise,
to protect our code from knowing which ECMAScript interpreter
is being used (currently Rhino), the Factory's newECMAScript()
method returns an ECMAScript object.
Another dimension of variability is the particular language
a voice markup page is written in. We handle this
by defining a "markup language compiler" object
(an MLCompiler), and then subclasses of this for each
language we support (VoxMLCompiler for VoxML and VoiceXMLCompiler
for VoiceXML 1.0 and 2.0). The Factory's newMLCompiler()
method looks at a byte array returned by the fetching
subsystem and returns an MLCompiler based on its content.
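
The content-based selection might look roughly like the sketch below. The article does not say how the byte array is inspected, so the detection rule here (looking for a root element name near the start of the page, with DIALOG assumed as the VoxML root) is purely an assumption, as are the language() method and the MLCompilerFactory holder class.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Abstraction over the markup-language compilers (language() is assumed).
abstract class MLCompiler {
    abstract String language();
}

class VoxMLCompiler extends MLCompiler {
    @Override
    String language() {
        return "VoxML";
    }
}

class VoiceXMLCompiler extends MLCompiler {
    @Override
    String language() {
        return "VoiceXML";
    }
}

class MLCompilerFactory {
    // Picks a compiler by sniffing the first bytes of the fetched page.
    static MLCompiler newMLCompiler(byte[] page) {
        int n = Math.min(page.length, 512);
        // ISO-8859-1 maps every byte to a char, so decoding cannot fail.
        String head = new String(page, 0, n, StandardCharsets.ISO_8859_1)
                .toLowerCase(Locale.ROOT);
        if (head.contains("<vxml")) {
            return new VoiceXMLCompiler();
        }
        if (head.contains("<dialog")) {
            return new VoxMLCompiler();
        }
        throw new IllegalArgumentException("Unrecognized markup language");
    }

    public static void main(String[] args) {
        byte[] page = "<?xml version=\"1.0\"?><vxml version=\"2.0\"></vxml>"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(newMLCompiler(page).language());
    }
}
```

Because the decision is made from the page content rather than from its URL or MIME type, the caller does not need to know in advance which language a server will return.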
Various operations, administration, and maintenance (OA&M)
subsystems are plugged in the same way: logging, billing,
user definition, etc. The OutputFilter is an abstract
superclass that defines a hierarchy of classes that
filter TTS prompts based on the speech synthesis markup
language supported by the text to speech system.
But the major decision made by the Factory is which speech
and telephony resources to use. The speech and
telephony API is called the ExecutionContext, and it
has very abstract methods like speak() for prompting,
setGrammar() for conveying speech recognition grammars,
record() for recording, transfer() for transferring
calls, and listen() for performing speech recognition.
The Factory's newExecutionContext() method can return
different subclasses of ExecutionContext, including
one that implements a text-only interface for batch
regression testing, a JSCExecutionContext for integrating
with the Nuance Java Speech Channel, a MIXExecutionContext
to fit it into the MIX Vlet environment, and so forth.
Licensees have implemented various other ExecutionContexts
based on their needs.
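
To make the idea of a text-only ExecutionContext concrete, here is a minimal sketch. The method names speak(), setGrammar(), record(), transfer(), and listen() come from the description above, but their parameter and return types are guesses, and TextExecutionContext, its transcript format, and the small temperature dialog are hypothetical stand-ins rather than the VoxGateway's real classes.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Abstract speech and telephony API (signatures are assumed).
abstract class ExecutionContext {
    abstract void speak(String prompt);
    abstract void setGrammar(String grammar);
    abstract String listen();              // null stands for "no input"
    abstract byte[] record();
    abstract void transfer(String destination);
}

// Text-only variant: scripted caller input in, transcript out.
class TextExecutionContext extends ExecutionContext {
    private final Deque<String> scriptedInputs;
    private final List<String> transcript = new ArrayList<>();

    TextExecutionContext(String... inputs) {
        this.scriptedInputs = new ArrayDeque<>(Arrays.asList(inputs));
    }

    @Override
    void speak(String prompt) {
        transcript.add("SAY: " + prompt);
    }

    @Override
    void setGrammar(String grammar) {
        transcript.add("GRAMMAR: " + grammar);
    }

    @Override
    String listen() {
        String heard = scriptedInputs.poll();
        transcript.add("HEARD: " + heard);
        return heard;
    }

    @Override
    byte[] record() {
        transcript.add("RECORD");
        return new byte[0];
    }

    @Override
    void transfer(String destination) {
        transcript.add("TRANSFER: " + destination);
    }

    List<String> transcript() {
        return transcript;
    }
}

// Toy dialog that sees only the abstract API.
class TemperatureDialog {
    static void run(ExecutionContext ctx) {
        ctx.speak("What is the temperature in Fahrenheit?");
        ctx.setGrammar("digits");
        String heard = ctx.listen();
        if (heard == null) {
            ctx.speak("Sorry, I did not hear anything.");
            return;
        }
        int celsius = Math.round((Integer.parseInt(heard) - 32) * 5 / 9.0f);
        ctx.speak(celsius + " degrees Celsius");
    }

    public static void main(String[] args) {
        TextExecutionContext ctx = new TextExecutionContext("212");
        run(ctx);
        ctx.transcript().forEach(System.out::println);
    }
}
```

A regression test can compare the transcript against a stored expected transcript, which is one way a batch test could run with no speech or telephony hardware attached.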
This architecture has given rise to the terms "core
interpreter" and "framework". The core
interpreter (or just "core") consists of all
the classes that are used in every configuration of
the VoxGateway, whereas the framework consists of those
classes that can be used or not used based on the configuration
specified by the Factory in use.
Figure 2: The VoxGateway's dialog processing cycle.
Copyright © 2001-2003 VoiceXML Forum. All rights reserved.
The VoiceXML Forum is a program of the IEEE Industry Standards and Technology Organization (IEEE-ISTO).