Creating Audio Interactive Comic Books
for the Mainstream Market:

Introduction:

Of all truly American inventions in literature, two of the least appreciated in academic circles are the radio drama and the comic book.  These two art forms are surprisingly similar.  Their main focus is text in the form of narration and dialog.  Certain key moments, sights and sounds are illustrated, leaving the rest to the imagination.  Even those illustrations could quickly be described in a very few words of text, as I'm sure they originally were, at least in the planning stage.

Another truly American art form is the cartoon.  I spent many a Saturday morning ignoring my parents' warnings that they would "rot my brain", no doubt because my imagination was sitting idle while all my senses were held captive.  The cartoon has no lack of fulfillment in computer games, since nearly every computer game today is an interactive cartoon.

This paper is about bringing the comic book and the radio drama to the next level by combining them in an interactive computer game, which I will call the Audio Interactive Comic Book.

The Audio Adventure Game Engine is an ideal tool for this, since its characteristic limitations are not relevant to the art, and it can be programmed by anyone in a form of plain English text.  The Engine can provide a rich tapestry of layered sounds as well as text in the form of computer generated speech, prerecorded MP3 speech by real actors, and/or visual text windows.  It also has advanced features that support joystick driven real time three dimensional action.  But rather than cartoon-like moving graphics, the live action is displayed by radar and flight instrument functions.  For a Comic Book rendering of such live action, and for the hearing impaired, those radar and instrument functions can be graphically represented by still pictures, superimposed and shown in quick succession as necessary.  Thus what defines an Audio Interactive Comic Book and the limitations of the Engine are a perfect match for each other.  Having established the right tool for the job, I'll present my own idea of the ideal implementation.

Accessibility Considerations:

Two important aspects must be considered for a successful mainstream Audio Engine Game.

1. The Game should involve as many senses as the player has available.
2. The Game should only require as many senses as the player has available.

It should be playable and enjoyable by people who are sighted and hearing, visually impaired, or hearing impaired.

With that in mind, here is how I would create such a literary masterpiece.

The Game is constructed as a series of text based situations, spiced up from time to time by a live action sequence.  Still picture illustrations are used throughout.  Each still picture has a text description that can be accessed by the visually impaired.  Background sounds, sound effects and music are also used.  The sounds also have text descriptions, which can be accessed by the hearing impaired.  Options governing how text descriptions are delivered are chosen by the player from the accessibility menu of the Engine.
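To make this concrete, here is a minimal sketch, in Python, of how each picture or sound might be paired with its text description and importance flag.  The names here are my own illustration, not the Engine's plain English script syntax.

    from dataclasses import dataclass

    @dataclass
    class GameAsset:
        kind: str          # "picture" or "sound"
        filename: str      # the media file to show or play
        description: str   # text for players who cannot see or hear it
        important: bool    # True if needed to follow the game's flow

    # Example: a picture with a description for the visually impaired.
    alley = GameAsset("picture", "alley.jpg",
                      "A dark alley, lit by one flickering lamp.", False)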

The central and most important aspect of a game is the text information told to the player, and the response typed by the player.  The hearing impaired should be able to understand the game from just the normally delivered text, the pictures and the text descriptions of any sounds played.  The visually impaired should be able to understand the game from just the normally delivered text, the sounds and the text descriptions of any pictures used.

For the most part, sounds and pictures should be there to enhance the enjoyment and realism of the game, but should not normally be necessary to understand the flow of the game.  In those cases where either a sound or a picture is important to understanding the flow of the game, the game author should begin the picture or sound description with "IMPORTANT:".  That will attract the visual attention of the hearing impaired, and it will also cause the description to be played for the visually impaired if they have set their accessibility option to play only the important descriptions.
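Here is a sketch of how that convention could work, again in Python, with option names that are my own assumption rather than the Engine's:

    def should_deliver(description, option):
        """option is 'all', 'important_only', or 'none'."""
        if option == "all":
            return True
        if option == "important_only":
            return description.startswith("IMPORTANT:")
        return False

    print(should_deliver("IMPORTANT: The bridge ahead is out.",
                         "important_only"))   # True
    print(should_deliver("Rain patters on the window.",
                         "important_only"))   # False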

The highest quality mainstream games can provide the normally delivered text in both written and audio form.  The audio form should be MP3 or OGG audio files of real people acting out the lines, like voices in a radio drama.  The written and audio forms of text should say exactly the same thing.  That way, a player who can both see and hear will not be confused by mismatched versions.

If real time action sequences are used, players using visual feedback would receive an unfair advantage in terms of weapons targeting.  Therefore, in order to keep everyone equally challenged, an adaptive approach to difficulty control should be used throughout all real time action.  I will soon be posting a paper on adaptive techniques.
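Until then, here is one minimal sketch of what an adaptive approach could look like, purely my own illustration and not necessarily what that paper will describe.  After each exchange the challenge is nudged toward a target success rate, so it settles wherever the player, using whatever senses they have, hits about half the time.

    def adapt_difficulty(difficulty, player_hit,
                         target_rate=0.5, step=0.05):
        """Raise the challenge a little after each hit, lower it a
        little after each miss; it settles where the player's hit
        rate equals target_rate."""
        if player_hit:
            difficulty += step * (1.0 - target_rate)
        else:
            difficulty -= step * target_rate
        return min(max(difficulty, 0.1), 1.0)   # keep in a sane range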

The Engine will soon be updated to allow multiple still pictures to be selected, sized and superimposed over each other.  I'll post notices when that happens.  In the meantime, individual still pictures could still be used effectively for real time action, with some care to blur out the backgrounds, leaving only the relevant locations, objects or attackers in focus.  Since the graphic overlay features will be ready soon, I will describe my implementation using them.

The Engine has flight instruments that tell the player their orientation in terms of pitch (up or down angle), yaw (direction facing on a map) and roll (rolling over sideways angle).  The Engine also has radar that tells the player about anything nearby in their field of vision.  To graphically represent yaw (map direction) and pitch (up or down angle) at certain locations, the Engine can select pictures from a table of them, and then show segments of those pictures.  This selection and segment showing process would go something like this:

1. Each action sequence takes place within only one location.  Divide that location up into a large grid of squares.  Each of these squares has an X value and a Y value in the table.  The squares can be spaced pretty far apart, for reasons you'll soon see.  When the player is closest to a certain square, the background image for the graphic radar illustration is chosen according to the instructions found in the table for that square.

2.  Each square also has a table associated with it for determining what to show in response to the angle the player is facing, both in pitch (vertical angle) and yaw (map direction).  For this, you can have eight pictures, one for each map angle: one for north, one for northeast and so on.  Then you can show just a segment of that picture, depending on how far up or down the player is facing, and you can also show a little to the left or to the right of the picture to fudge the map angle too.  I will be writing a radar function that does this, and you can just plug your own material into it.
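Here is a rough sketch of how steps 1 and 2 might work together, in Python; the Engine's own radar function will differ, so treat the names and numbers as illustration only.

    def nearest_square(x, y, spacing):
        """Snap the player's position to the nearest grid square."""
        return (round(x / spacing), round(y / spacing))

    def pick_view(yaw_degrees, pitch_degrees, square_pictures):
        """square_pictures is that square's list of eight pictures,
        indexed 0=north, 1=northeast, ... 7=northwest."""
        sector = round(yaw_degrees / 45.0) % 8
        picture = square_pictures[sector]
        # Vertical segment: map pitch -90..+90 onto 0..1 of the height.
        vertical = (pitch_degrees + 90.0) / 180.0
        # Sideways fudge: how far the yaw is off the sector's center,
        # as a fraction of one sector (-0.5 to +0.5).
        offset = ((yaw_degrees - sector * 45.0 + 180.0) % 360.0) - 180.0
        sideways = offset / 45.0
        return picture, vertical, sideways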

3.  If the game author is doing aerial combat and wishes to provide roll as well, then after the entire graphic radar screen is populated with background and objects, the whole screen can be rotated according to the roll angle.  This will be accomplished by a RotateScreen command I'll provide that invokes a graphic capability of DirectX 8.  I'll post when that is added.
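For the curious, the math underneath such a command is the standard rotation of every screen point about the screen's center.  This Python sketch shows only that formula, not the DirectX implementation:

    import math

    def rotate_point(x, y, cx, cy, roll_degrees):
        """Rotate the screen point (x, y) about the center (cx, cy)
        by the roll angle."""
        a = math.radians(roll_degrees)
        dx, dy = x - cx, y - cy
        return (cx + dx * math.cos(a) - dy * math.sin(a),
                cy + dx * math.sin(a) + dy * math.cos(a))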

To save room on the hard drive, some backgrounds can be made from other background pictures just by selecting portions and zooming in.  I'll provide ZoomScreen and ZoomObject commands, and will post when they are available.
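The operation itself is just a crop followed by a scale.  This sketch shows it with the Pillow imaging library, purely as an illustration of what the planned commands would do:

    from PIL import Image

    def zoomed_background(source_file, box, out_size):
        """Crop the region box (left, top, right, bottom) out of the
        source picture and scale it up to out_size (width, height)."""
        return Image.open(source_file).crop(box).resize(out_size)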

4.  Now that the background has been painted, the next thing that needs to be done is to paint in each "blip" on the radar screen, that is, each object that is important on the radar screen.  This information is obtained by using the ResearchRadar and GetBlip commands, which pack the appropriate variables so you can look them up.  Each object can be seen from one of eight possible angles, and possibly distorted slightly to simulate other angles.  This is not realism; keep in mind, this is a comic book.  The object can be located on the screen and given a size appropriate to its distance from the player.
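Here is a rough sketch of that blip painting.  The geometry is my own simplified illustration; in the Engine, ResearchRadar and GetBlip would supply the real bearing, facing and distance values.

    def paint_blip(bearing, distance, object_facing, player_facing,
                   screen_w, screen_h, fov=90.0):
        """bearing is degrees off the player's facing, negative to
        the left.  Returns which of the eight looks to draw, where
        to place it, and how big to draw it."""
        # Which of the 8 pictures: the object's facing relative to
        # the player's, snapped to the nearest 45 degrees.
        look = round(((object_facing - player_facing) % 360.0) / 45.0) % 8
        x = screen_w / 2 + (bearing / (fov / 2)) * (screen_w / 2)
        y = screen_h / 2                        # level with the player
        scale = 1.0 / max(distance, 1.0)        # farther means smaller
        return look, x, y, scale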

5.  Certain locations, such as doors, can have two looks to them, based upon how accessible they are, namely, whether they are open or closed.  Objects can have just one look per angle.  Other persons, namely, wandering monsters and the like, can mostly have one look per angle, except that the angle directly facing the player head on can have two more looks per weapon used.  Those looks would be for aiming and firing their weapons.  Each look is embodied by a separate picture that is superimposed upon the background.
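A quick sketch of how those looks might be keyed, with names that are illustrative only:

    def pick_look(kind, look_angle, state="idle", weapon=None):
        """Build a key for looking up the overlay picture.  Doors get
        open/closed looks; a monster facing the player head on (angle
        0) gets extra aiming/firing looks per weapon."""
        if kind == "door":
            return f"door_{state}_{look_angle}"
        if kind == "monster" and look_angle == 0 and weapon:
            return f"monster_{weapon}_{state}_{look_angle}"
        return f"{kind}_{look_angle}"

    print(pick_look("door", 2, state="open"))             # door_open_2
    print(pick_look("monster", 0, "firing", "crossbow"))  # monster_crossbow_firing_0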

Thus all of the information given in a radar report is delivered to the visual player for a full high speed action comic book experience.  The difficulty scaling can be set such that if a multisensing player wants to shut off the visual or audio feedback, the difficulty of related activities is adjusted to compensate, resulting in a normalized difficulty for all players.

I invite any comments.

Robison Bryan