Creating the Audio Interactive Comic Book
for the Mainstream Market
Of all truly American inventions in literature, two of the least
appreciated in academic circles are the radio drama and the comic
book. These two art forms are surprisingly similar. Their
main focus is text in the form of narration and dialog. Certain
key moments, sights and sounds are illustrated, leaving the rest to the
imagination. Even those illustrations could quickly be described
in a very few words of text, as I'm sure they originally were, at least
in the planning stage.
Another truly American art form is the cartoon. I spent many a
Saturday morning ignoring my parents' warnings that they would "rot my
brain", no doubt because my imagination was sitting idle as all my
senses were held captive. The cartoon has no lack of fulfillment
in computer games, since nearly every computer game today is, in
effect, an interactive cartoon.
This paper is about bringing the comic book and the radio drama to the
next level by combining them together in an interactive computer game,
which I will call the Audio Interactive Comic Book.
The Audio Adventure
Game Engine is an ideal tool for this, since its characteristic
limitations are not relevant to the art, and it can be programmed by
anyone in a form of plain English text. The Engine can provide a
rich tapestry of layered sounds as well as text in the form of computer
generated speech, prerecorded MP3 speech by real actors, and/or visual
text windows. It also has advanced features that support joystick
driven real time three dimensional action. But rather than
cartoon-like moving graphics, the live action is displayed by radar and
flight instrument functions. For a Comic Book rendering of such
live action, and for the hearing impaired, such radar and instrument
functions can be graphically represented by still pictures,
superimposed and shown in quick succession as necessary. Thus the
scope of what defines an Audio Interactive Comic Book and the
limitations of the Engine are a perfect match for each other.
Having established the right tool for the job, I'll present my own idea
of the ideal implementation.
Two important aspects must be considered for a successful mainstream
Audio Engine Game.
1. The Game should involve as many senses as the player has available.
2. The Game should only require as many senses as the player has
available.
It should be playable and enjoyable by people who are sighted and
hearing, visually impaired or hearing impaired.
With that in mind, here is how I would create such a literary work:
The Game is constructed as a series of text based situations, spiced up
from time to time by a live action sequence. Still picture
illustrations are used throughout. Each still picture has a text
description that can be accessed by the visually impaired.
Background sounds, sound effects and music are also used. The
sounds also have text descriptions, which can be accessed by the hearing
impaired. Options regarding performance of text descriptions are
chosen by the player from the accessibility menu of the Engine.
The central and most important aspect of a game consists of the text
information told to the player, and the response typed by the
player. The hearing impaired should be able to understand the game
from just the normal delivered text, the pictures and the text
descriptions of any sounds played. The visually impaired should be
able to understand the game from just the normal delivered text, the
sounds and the text descriptions of any pictures used.
For the most part, sounds and pictures should be there to enhance the
enjoyment and realism of the game, but should not normally be necessary
to understand the flow of the game. In those cases where either a
sound or a picture is important in understanding the flow of the game,
the game author should begin the picture or sound description with
"IMPORTANT:". That will attract the visual attention of the
hearing impaired, and it will also cause the description to be played
for the visually impaired if they have set their accessibility option
for just the important picture descriptions.
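The filtering logic above can be sketched in a few lines. This is only an illustration, written in Python rather than the Engine's plain English scripting; the function and setting names are hypothetical, but the "IMPORTANT:" prefix convention is exactly as described.

```python
# Hypothetical sketch: deciding whether to speak a picture description,
# given an accessibility setting of "all", "important" or "none".
# The function and setting names are illustrative, not Engine commands.

def should_speak_description(description, setting):
    """Return True if this picture description should be spoken aloud."""
    if setting == "all":
        return True
    if setting == "important":
        return description.startswith("IMPORTANT:")
    return False  # setting == "none"

# Under "important", only the flagged description is spoken.
print(should_speak_description("IMPORTANT: The bridge is out.", "important"))  # True
print(should_speak_description("A quiet forest path.", "important"))           # False
```

The same test would decide whether to visually display a sound description for the hearing impaired.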
The highest quality mainstream games can provide the normally delivered
text in both written and audio form. The audio form should be MP3
or OGG audio files of real people acting out the lines, like voices in a
radio drama. The written and audio forms of the text should say
exactly the same thing, so that a player who can both see and hear
will not be confused by any mismatch between them.
If real time action sequences are used, those players using visual
feedback would receive an unfair advantage in terms of weapons
targeting. Therefore, in order to keep everyone equally
challenged, an adaptive approach to difficulty control should be used
throughout all real time action. I will soon be posting a paper on
that subject.
The Engine will soon be updated to allow multiple still pictures to be
selected, sized and superimposed over each other. I'll post
notices when that does happen. In the meantime, individual still
pictures could still be used effectively for real time action, with some
care to blur out the backgrounds, leaving only the relevant locations,
objects or attackers in focus. Since the graphic overlay features
will be ready soon, I will describe my implementation using them.
The Engine has flight instruments that tell the player their
orientation in terms of pitch (up or down angle), yaw (direction facing
on a map) and roll (rolling over sideways angle). The Engine also
has radar that tells the player about anything nearby in their field of
vision. To graphically represent yaw (map direction) and pitch (up
or down angle) at certain locations, the Engine can select pictures from
a table of them, and then show segments of those pictures. This
selection and segment showing process would go something like this:
1. Each action sequence takes place within only one location.
Divide up that location into a large grid of squares. Each of
these squares has an X value and a Y value in the table. The
squares can be spaced fairly far apart, for reasons that will soon
become clear. When the
player is closest to a certain square, the background image for the
graphic radar illustration will be chosen according to the instructions
found in the table for that square.
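Step 1 above can be sketched as follows. This is an illustrative Python sketch, not the Engine's own table syntax; the grid size, table layout and picture names are all hypothetical.

```python
# Hypothetical sketch of step 1: snap the player's position to the
# nearest grid square, then look up that square's background picture.
# Grid size, table contents and names are illustrative assumptions.

def nearest_square(px, py, grid_size):
    """Snap a player position to the (X, Y) of its grid square."""
    return (round(px / grid_size), round(py / grid_size))

background_table = {
    (0, 0): "hangar_floor.jpg",
    (0, 1): "hangar_north_wall.jpg",
    (1, 0): "runway.jpg",
}

def background_for(px, py, grid_size=100):
    """Pick the background for the square the player is closest to."""
    return background_table.get(nearest_square(px, py, grid_size))

print(background_for(30, 80))   # near square (0, 1) -> hangar_north_wall.jpg
```

Because the background only changes when the player crosses into a new square, the squares can be spaced widely without the lookup becoming expensive.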
2. Each square also has a table associated with it for
determining what to show in response to the angle that the player is
facing, both in pitch (vertical angle) and yaw (map direction).
For this, you can have eight pictures, one for each map angle: one
for north, one for northeast and so on. Then you can show just a
segment of that picture, depending on how up or down the player is
facing, and you can also show a little to the left or to the right of a
picture to fudge the map angle too. I will be writing a radar
function that does this, and you can plug your own content into it.
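Step 2 might look something like this. The radar function mentioned above has not been written yet, so this Python sketch is only a guess at its shape; the picture names and segment math are illustrative assumptions.

```python
# Hypothetical sketch of step 2: choose one of eight pictures by yaw
# (map direction), then choose a vertical slice of it by pitch.
# This is not the promised radar function, just an illustration.

YAW_PICTURES = ["north.jpg", "northeast.jpg", "east.jpg", "southeast.jpg",
                "south.jpg", "southwest.jpg", "west.jpg", "northwest.jpg"]

def picture_for_yaw(yaw_degrees):
    """Map a compass heading to the nearest of the eight pictures."""
    return YAW_PICTURES[round(yaw_degrees / 45.0) % 8]

def segment_top(pitch_degrees, picture_height=600, view_height=200):
    """Pick the top edge of the slice to show: looking up shows a
    higher part of the picture, looking down a lower part."""
    fraction = (pitch_degrees + 90) / 180.0   # -90 (down)..+90 (up) -> 0..1
    return int((1.0 - fraction) * (picture_height - view_height))

print(picture_for_yaw(50))   # northeast.jpg
print(segment_top(0))        # 200, the middle slice of a 600-pixel picture
```

Sliding the slice a little left or right of centre is how the in-between map angles would be fudged.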
3. If the game author is doing aerial combat and wishes to provide
roll also, then after the entire graphic radar screen is populated with
background and objects, the whole screen can be rotated according to the
roll angle. This will be accomplished by a RotateScreen command
I'll provide that invokes a graphic capability of DirectX 8. I'll
post when that is added.
To save room on the hard drive, some backgrounds can be made from other
background pictures just by selecting portions and zooming in.
I'll provide a ZoomScreen and ZoomObject command, and will post when it
is added.
4. Now that the background has been painted, the next thing that
needs to be done is to paint in each "blip" on the radar screen, that
is, each object that is important on the radar screen. This
information is obtained from using the ResearchRadar and GetBlip
commands, which pack the appropriate variables so you can look them
up. Each object can be seen from one of eight possible angles, and
possibly distorted slightly to simulate other angles. Keep in mind
that this is not realism; this is a comic book. The object can
be located on the screen and given the size appropriate to its distance
from the player.
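Step 4 could be sketched as below. ResearchRadar and GetBlip are the Engine commands named above, but their exact outputs are not shown here, so the variables and scaling factors in this Python sketch are stand-in assumptions.

```python
# Hypothetical sketch of step 4: place one radar "blip" on screen.
# The angle and distance would come from ResearchRadar/GetBlip; the
# eight-view split and the size scaling are illustrative assumptions.

def blip_view(relative_angle_degrees):
    """Pick one of eight object views by the angle it is seen from."""
    return round(relative_angle_degrees / 45.0) % 8

def blip_size(distance, base_size=64, reference_distance=10.0):
    """Shrink the sprite in proportion to distance from the player,
    never smaller than a few pixels."""
    return max(4, int(base_size * reference_distance / max(distance, 1)))

print(blip_view(90))    # view index 2, the object seen side-on
print(blip_size(20))    # 32: half the base size at twice the reference distance
```

The slight distortion for in-between angles could then be a small horizontal squash or stretch of the chosen view.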
5. Certain locations, such as doors, can have two looks to them,
based upon how accessible they are, namely, whether they are open or
closed. Objects can have just one look per angle. Other
persons, namely wandering monsters or the like, can mostly have one
look per angle, except that the angle directly facing the player head on
can have two more looks per weapon used. Those looks would be for
aiming and firing their weapons. Each look is embodied by a
separate picture that would be superimposed upon the background.
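The look selection in step 5 amounts to a simple lookup. In this Python sketch the picture names and function signatures are placeholders; only the rules (two looks for doors, one look per angle for objects and monsters, plus aiming and firing looks head on) come from the text.

```python
# Hypothetical sketch of step 5: choose which picture embodies an
# object's current "look". Picture names are placeholders.

def door_look(is_open):
    """Doors have two looks, based on whether they are open or closed."""
    return "door_open.png" if is_open else "door_closed.png"

def monster_look(angle_index, facing_player, action=None):
    """One look per angle (0-7); aiming and firing looks apply only
    when the monster is facing the player head on."""
    if facing_player and action in ("aiming", "firing"):
        return f"monster_front_{action}.png"
    return f"monster_angle_{angle_index}.png"

print(door_look(False))                  # door_closed.png
print(monster_look(0, True, "firing"))   # monster_front_firing.png
```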
Thus all of the information given in a radar report is delivered to the
visual player for a full high speed action comic book experience.
The difficulty scaling can be set such that if a multisensing player
wants to shut off the visual or audio feedback, the difficulty of
related activities is compensated, resulting in a normalized difficulty
for all players.
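The compensation idea can be sketched as follows. The scaling factors here are illustrative assumptions, not Engine values; only the principle, that switching off a feedback channel lowers the difficulty of activities that depend on it, comes from the text.

```python
# Hypothetical sketch: normalize difficulty when a player shuts off
# visual or audio feedback. The factor values are illustrative guesses.

def normalized_difficulty(base, visual_on, audio_on,
                          visual_factor=0.7, audio_factor=0.85):
    """Reduce the base difficulty for each feedback channel turned off."""
    difficulty = base
    if not visual_on:
        difficulty *= visual_factor   # no visual targeting advantage
    if not audio_on:
        difficulty *= audio_factor
    return difficulty

print(round(normalized_difficulty(10.0, visual_on=False, audio_on=True), 2))  # 7.0
```

An adaptive version would tune these factors during play, measuring each player's actual hit rate rather than using fixed constants.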
I invite any comments.