Using HTTP headers
VoiceXML 2.0 explicitly mandates, as a minimum, support
for the Cache-Control and Expires header fields. The
Expires header gives the date after which the object
is no longer considered fresh. The Cache-Control:
max-age directive indicates that the object is stale after
max-age seconds. For example, Listing 2 above indicates
that index.html should be considered stale after 24
hours.
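The freshness rules above can be sketched in a few lines of Python. This
is a minimal illustration, not how any particular VoiceXML interpreter
works: the function names and the use of a plain dict for the response
headers are assumptions made for the example. It gives max-age precedence
over Expires when both are present.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
import re


def freshness_lifetime(headers, response_time):
    """Return the freshness lifetime in seconds, or None if none is declared.

    `headers` is a plain dict of response header names to values
    (a hypothetical shape chosen for this sketch).
    """
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"max-age\s*=\s*(\d+)", cache_control)
    if match:
        # max-age wins over Expires when both are present.
        return int(match.group(1))
    expires = headers.get("Expires")
    if expires:
        expires_at = parsedate_to_datetime(expires)
        return max(0.0, (expires_at - response_time).total_seconds())
    return None


def is_fresh(headers, response_time, now):
    """Return True while the object's age is below its freshness lifetime."""
    lifetime = freshness_lifetime(headers, response_time)
    if lifetime is None:
        return False
    age = (now - response_time).total_seconds()
    return age < lifetime
```

With a header of Cache-Control: max-age=86400, is_fresh returns True 23
hours after the response was received and False 25 hours after it.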
When an object expires, it must not be used unless it
has been revalidated. There are two main ways of doing
this. The first, available in both HTTP 1.0 and HTTP 1.1,
is an efficient method based on conditional fetches. A
request for an object is made with the If-Modified-Since
header and an associated date. Assuming no errors, the
origin server returns either a 304 status code, indicating
that the object has not been modified and the cached
version is the latest, or a 200 status code followed by
the new version of the object (similar to the response in
Listing 2) if the object has changed. The second method
(HTTP 1.1 only) uses an identifier called an Entity Tag,
or ETag, to uniquely identify a version of an object. The
client sends an If-None-Match request header carrying the
ETag obtained in a previous response. Assuming no errors,
the origin server returns a 304 (Not Modified) response if
its current ETag for the requested document matches the
supplied ETag, or a 200 response followed by the new
document if it does not.
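As a rough sketch of the two revalidation methods, the function below
shows the decision an origin server might make for a conditional request.
The function name, its arguments and the dict of request headers are
hypothetical, and a real server handles more cases (weak ETags and error
conditions, for instance).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def revalidate(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still current, else 200.

    `current_etag` is the server's ETag for the object and `last_modified`
    is a timezone-aware datetime; both are hypothetical inputs.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # The header may list several ETags separated by commas.
        candidates = [tag.strip() for tag in if_none_match.split(",")]
        matched = current_etag in candidates or "*" in candidates
        return 304 if matched else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        if since.tzinfo is None:
            # Treat a date without zone information as UTC.
            since = since.replace(tzinfo=timezone.utc)
        # HTTP dates have one-second resolution, so drop microseconds.
        unchanged = last_modified.replace(microsecond=0) <= since
        return 304 if unchanged else 200

    # Not a conditional request: send the full object.
    return 200
```

A 304 result means the server sends headers only and the cache keeps
using its stored copy; a 200 result means the new object follows in the
response body.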
If a document specifies Cache-Control: no-cache or Pragma:
no-cache, the user agent will not cache the document.
This is useful for dynamic content that changes unpredictably.
If none of Expires, Cache-Control, Pragma, or ETag appears
in the response, a cache may use the Last-Modified date
to calculate an expiration time. This is called heuristic
caching, and the formulas usually choose an expiration
time based on a fraction of the interval since the object
was last modified. Since heuristic caching might cause
problems with dynamic content generation mechanisms such
as JSPs or ASPs, these mechanisms typically omit the
Last-Modified date so that their output is not cached.
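To make the heuristic concrete, here is a small sketch of one such
formula. The fraction of 0.1 is an assumed value used only for
illustration, since each cache implementation chooses its own, and the
function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone

HEURISTIC_FRACTION = 0.1  # assumed value; implementations choose their own


def heuristic_expiry(response_time, last_modified, fraction=HEURISTIC_FRACTION):
    """Estimate an expiration time from the Last-Modified date."""
    since_modified = response_time - last_modified
    if since_modified < timedelta(0):
        # A Last-Modified date in the future gives no usable interval.
        return response_time
    return response_time + since_modified * fraction
```

Under this assumption, an object last modified ten days before it was
fetched would be treated as fresh for roughly one further day.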
Setting HTTP header fields in responses is specific
to the server-side technology used. Typically, however,
setting the headers is straightforward. Listing 3 illustrates
setting a header for a JavaServer Page (JSP).
Listing 3: Setting an HTTP header in a JavaServer Page
Using VoiceXML attributes
VoiceXML 1.0 and VoiceXML 2.0 have slightly different
mechanisms for controlling caching policy and we will
mention both here. In VoiceXML 1.0, an attribute called
caching is specified on elements requiring resources
to be fetched (e.g.
VoiceXML 2.0 brings more advanced control with the introduction
of maxage and maxstale attributes. When these attributes
are not specified, the behaviour is the same as the
default for VoiceXML 1.0, i.e. the cached version
of the object is used if it has not expired. The maxage attribute
allows the developer to effectively specify an earlier
expiration date than that associated with the object
in the cache. Thus,
<audio src="news.wav" maxage="86400"/>
Listing 4: Specifying maxage caching of an audio
resource in VoiceXML 2.0
indicates that the cached version should be used up
to a maximum age of 24 hours. Setting maxage to 0 forces
the user agent to ensure it has the latest version of
the resource and is thus equivalent to safe in VoiceXML
1.0.
The maxstale attribute indicates that a cached object
may be used up to a specified number of seconds after
it has expired and could be used, for example, to ensure
that the same audio is used during a call.
<audio src="quiz_question.wav" maxstale="300"/>
Listing 5: Specifying maxstale caching of an audio
resource in VoiceXML 2.0
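The interplay of maxage and maxstale with the object's own expiry can be
summarised in a short decision function. This is a sketch based on the
description above rather than a transcription of the VoiceXML 2.0
algorithm, and the function and parameter names are assumptions.

```python
def use_cached_copy(age, freshness_lifetime, maxage=None, maxstale=None):
    """Decide whether a cached copy may be used without refetching.

    All arguments are in seconds. `age` is how long the object has been
    in the cache and `freshness_lifetime` is how long the origin server
    said it stays fresh.
    """
    if maxage is not None and age > maxage:
        # The document asks for a younger copy than the cached one.
        return False
    if age <= freshness_lifetime:
        # Not yet expired.
        return True
    # Expired: usable only if maxstale tolerates the overrun.
    overrun = age - freshness_lifetime
    return maxstale is not None and overrun <= maxstale
```

For example, with a freshness lifetime of 86400 seconds and maxstale set
to 300, a copy that is 86500 seconds old is still used, while one that is
86800 seconds old is fetched again.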
Caching Recommendations
In this section we offer some recommendations
to observe when employing caching with VoiceXML.
Meta tags
Although HTTP header equivalent values can be specified
with the <meta> tag in VoiceXML, this practice is generally
to be discouraged. Even though a VoiceXML interpreter
might understand their meaning, it is unlikely that
a proxy cache will, and thus they will be ignored.
Caching does not mean High Availability
By definition, caching is designed to provide a temporary
store of information and thus should not be considered
a substitute for proper high availability mechanisms.
Use proven high availability mechanisms such as clustering,
mirror sites etc.
Consider ETag formulation
Large-scale deployments that employ web server farms
with load balancing should ensure that the ETag generation
algorithm is identical on each server for identical
content. Otherwise a user agent revalidating against a
different server might be sent content it already holds.
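One way to obtain identical ETags across a server farm is to derive them
from the content alone, so that no server-specific detail enters the
value. The sketch below assumes a SHA-256 digest truncated to 32
characters; the function name and the truncation length are choices made
for the example, not a requirement of HTTP.

```python
import hashlib


def content_etag(body: bytes) -> str:
    """Build an ETag from the response body alone."""
    digest = hashlib.sha256(body).hexdigest()
    # ETag values are quoted strings; truncation keeps the header short.
    return '"' + digest[:32] + '"'
```

Because the value depends only on the bytes served, any server holding
the same file produces the same ETag, and a revalidation request
receives a 304 response regardless of which server handles it.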
Separate static content from dynamic content
It is important to determine which content is suitable
for caching and for how long. It is typical to consider
the largest objects as the most suitable caching candidates
and for VoiceXML applications these are typically large
audio and grammar files. A useful strategy is to employ
an industry-grade web server to serve static content
with the correct caching attributes (HTTP headers)
and to reserve the typically more computationally expensive
application server for dynamic content.
Remember that there is no 'Reload' button
Whilst developing HTML applications, hitting the 'Reload'
button or equivalent will refresh an updated file that
is erroneously cached. This is usually not possible
in a VoiceXML paradigm, so care should be taken to
avoid these scenarios. As a rule of thumb, it is advisable
to use no caching whilst developing an application and
to introduce it only when performance tuning the
application at the end of the development cycle. If a
forced update is required, setting the maxage attribute
to 0 achieves this. Alternatively, changing the name of
the referenced resource will have the same effect.
POST vs. GET
The HTTP POST method is exceptionally useful for sending
large amounts of data to an origin server in a reliable
and secure (at least over SSL) manner. However, the
response to a POST is generally not cached, so the
alternative GET method with a query string should be
considered if large amounts of data are not being sent
to the server and the resultant content is suitable for
caching.
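Since a cache identifies a GET response by its URL, it helps if
equivalent requests always produce the same URL. The following sketch
builds a query string with the parameters in sorted order; the function
name and the example URL are hypothetical.

```python
from urllib.parse import urlencode


def cacheable_url(base_url, params):
    """Build a GET URL whose query string is stable for equal parameters."""
    # Sorting means the same parameters always yield the same URL,
    # and therefore the same cache entry.
    return base_url + "?" + urlencode(sorted(params.items()))
```

For instance, cacheable_url("http://example.com/weather.vxml",
{"day": "today", "city": "Dublin"}) places city before day however the
dict was built.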
Avoid depending on Pragma: no-cache
The HTTP 1.1 specification does not define what Pragma:
no-cache means in a response header. Use Cache-Control:
no-cache, and perhaps an Expires header with a date in the
past, in addition to Pragma: no-cache to be sure that
the object is not stored in any cache along the way.
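The combination recommended here can be gathered into one helper so that
every dynamic response carries the same set of headers. This is a
minimal sketch; the function name is hypothetical and the Unix epoch is
used simply as a convenient date in the past.

```python
from email.utils import formatdate


def no_cache_headers():
    """Return response headers that ask caches not to reuse the object."""
    return {
        "Cache-Control": "no-cache",
        "Pragma": "no-cache",
        # Any date in the past will do; the Unix epoch is an easy choice.
        "Expires": formatdate(0, usegmt=True),
    }
```

How the headers are attached to the response depends on the server-side
technology in use.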
Cache-Control: max-age takes precedence over Expires
To avoid confusion, use only one mechanism.
Heuristic algorithms are meant as a fallback strategy
For consistency, it is recommended to always specify
your caching requirements so as not to depend on
platform-dependent heuristic algorithms.
Conclusion
HTTP caching provides a powerful mechanism for improving
the performance of applications. A performant VoiceXML application
that yields customer satisfaction will promote customer
retention and also save money on deployment costs. Caching
is often poorly understood and under-utilised on the
Internet, yet can be effectively harnessed by observing
some simple practices as outlined in this article.