The following rambling document is a draft pertaining to the concept of a rather simple but dynamic high-level music playback system, which is to say it is meant to work above the audio engine. My goal here is to help develop the music in video games. While many games have highly developed music systems, I have found that most open source games utilise notably primitive systems.
Hopefully the developers of the game will take interest in this idea.
FPS simple audio system
Note: This document is a theoretical concept which has not been put into practice or programmed.
'Sownd system' is a music delivery system designed for dynamic music playback within a gameplay environment. Specifically, the system is most suitable for FPS (first person shooter) games; however, it could be adapted to many simple games or programmes which lack an advanced GUI-based middleware engine. The key goals of this system are to be easy to integrate, expandable and, most importantly, to provide seamless and dynamic audio in a multiplayer/skirmish format.
The major problems concerning music and the composer for video games are making music that is suitable without becoming boring very quickly, and solidifying the aural atmosphere within the gaming environments. Music is often overlooked in favour of graphics and other gameplay elements. While it might be wrong to suggest that audio is more important, one only needs to take note of a number of games with fantastically composed and implemented music to see the positive effects of good audio. This system is specifically suited to the non-linear gameplay format common when playing online in nearly all FPSs, RTSs etc.
The Sownd system
The Sownd system works on a simple system of 'banks', 'events' and 'layers' and attempts to avoid listener boredom and fatigue by use of short clips of music (called 'Elements') played seamlessly one after another.
Events are the triggers that select a new layer. These layers can then be divided into sub-layers to more accurately represent a piece of music while still keeping it relatively random. Each layer contains a group of files that will be played (either randomly or in a prioritised manner) until either the next event is triggered or, for whatever reason, the music is told to stop playing.
The music files are called Elements for the sake of distinction between any other files that may be used.
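As a rough illustration of how banks, layers, sub-layers and elements might be laid out in code, here is a minimal Python sketch. None of these class or field names come from an existing implementation. In particular, the 'priority' weight and the file names are assumptions made purely for the example.

```python
from __future__ import annotations

import random
from dataclasses import dataclass, field


@dataclass
class Element:
    """One short music clip. 'priority' is a hypothetical weighting field."""
    filename: str
    priority: float = 1.0


@dataclass
class Layer:
    """A group of elements; sub-layers are modelled as nested layers."""
    name: str
    elements: list[Element] = field(default_factory=list)
    sub_layers: dict[str, Layer] = field(default_factory=dict)

    def pick(self, rng: random.Random, prioritised: bool = False) -> Element:
        """Choose the next element, uniformly or weighted by priority."""
        if prioritised:
            weights = [e.priority for e in self.elements]
            return rng.choices(self.elements, weights=weights, k=1)[0]
        return rng.choice(self.elements)


@dataclass
class Bank:
    """Top level of organisation: a named set of layers."""
    name: str
    layers: dict[str, Layer] = field(default_factory=dict)


# Example data: the file names are placeholders.
segue = Layer("segue", [Element("segue_a.ogg"), Element("segue_b.ogg", priority=3.0)])
bank = Bank("bank_1", {"segue": segue})
chosen = bank.layers["segue"].pick(random.Random(), prioritised=True)
print(chosen.filename)
```

Treating a sub-layer as just another layer keeps the structure small, though a real implementation might well want something different.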
The following example should demonstrate one instance of how the system could be applied.
A multiplayer 'Team Deathmatch' game begins. A 'bank' is chosen out of the 3 or 4 selectable banks, each with its own set of layers and thus elements. The banks are the top level of organisation within the system. This arrangement is devised both for greater control and ease of organisation and to reduce the overall number of files being loaded.
(This may in fact be a less efficient method; programmers will know, I imagine.)
This is then brought down to the layers and sub-layers level. (This may well be the more efficient point to load files; perhaps while preloading the map the system could load half or something.)
The following are the musical events, with annotation. Each triggered event triggers a layer, which at some point may trigger a sub-layer. Each layer sets the specific elements which can be chosen to play at this point until the next event is triggered OR a new sub-layer is triggered.
The use of sub-layers could be seen if trying to arrange a 12 bar blues (a poor example, but for explanation's purposes it suffices) by composing a number of 1 bar samples based around chord I, chord IV and chord V. Each chord is a sub-layer that specifies to choose from only the files you tell it to, which would be the ones centred on the correct chords. This of course could be used in an entirely different manner.
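The blues example might be sketched in Python as follows. This assumes one common 12 bar form, and the file names and the 'arrange' helper are invented for illustration only.

```python
import random

# One common 12 bar blues form: one chord name per bar.
TWELVE_BAR = ["I", "I", "I", "I", "IV", "IV", "I", "I", "V", "IV", "I", "I"]

# Hypothetical sub-layers: each chord maps to the one-bar elements written for it.
SUB_LAYERS = {
    "I": ["I_bar_a.ogg", "I_bar_b.ogg", "I_bar_c.ogg"],
    "IV": ["IV_bar_a.ogg", "IV_bar_b.ogg"],
    "V": ["V_bar_a.ogg", "V_bar_b.ogg"],
}


def arrange(form, sub_layers, rng):
    """Return one element per bar, chosen at random from that bar's sub-layer."""
    return [rng.choice(sub_layers[chord]) for chord in form]


playlist = arrange(TWELVE_BAR, SUB_LAYERS, random.Random())
print(playlist)
```

The harmony stays fixed while the individual bars vary on each pass, which is the point of restricting each sub-layer to its own files.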
-----------------
Intro – A short 5-20 second piece of energetic, aggressive music plays, chosen at random from 2 or 3 pieces (which piece is chosen could influence what the following tracks would be).
Segue – A short period of time before 'First blood' when the first enemy player is taken out of action. Typically this could be a number of short 2-4 bar samples randomly chosen. The music would generally be rather quiet and subdued.
First Blood – The music again becomes more aggressive (but not too heavy). This would be the majority of game time (unless more events were created). Typically a larger number (10-30?) of tracks of varying lengths depending on content. Priority at this point would be very important.
Sub-layers, as explained prior to this list, would be extremely powerful in this instance.
A simple use of sub-layers here could be to play 8 of the elements in the 1st sub-layer, then move to the 2nd sub-layer, which could provide a variation on the theme. This could allow for a change of key or melodic phrase.
The 1st sub layer could then be repeated in order to maintain a coherent form in the piece – to reaffirm a theme or atmospheric element. This example is very much like organising a song. Anyone knowledgeable in composing and arranging will understand the idea of AABA form or ABAC, AABCA etc. The possibilities can vary and creativity here could provide some fantastic results.
Last minute – In the last minute of game play the music becomes very aggressive, perhaps uptempo and louder. Since the time will be definite, music chosen could simply be made from a selection of 1 minute elements.
Other ideas could be used here too: another layer could be triggered at 30 seconds to ramp up the tempo further or move the piece up a tone.
Outro – Played in the short break before moving to the next map. A pensive, slow track, which might require just a short 2/4 bar loop.
These could be easily changed and manipulated to suit whatever the environment and situation demand.
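The event list above could be wired up with something as small as a lookup table. This is a minimal Python sketch only: the event names and the 'MusicState' class are hypothetical and not taken from any game.

```python
# Hypothetical event names mapped to the layer each one selects.
EVENT_TO_LAYER = {
    "match_start": "intro",
    "intro_finished": "segue",
    "first_blood": "first_blood",
    "last_minute": "last_minute",
    "match_end": "outro",
}


class MusicState:
    """Tracks which layer elements should currently be drawn from."""

    def __init__(self):
        self.current_layer = None

    def on_event(self, event):
        """Switch layer if the event is known; unknown events are ignored."""
        layer = EVENT_TO_LAYER.get(event)
        if layer is not None:
            self.current_layer = layer
        return self.current_layer


state = MusicState()
for event in ["match_start", "intro_finished", "player_jumped", "first_blood"]:
    print(event, "->", state.on_event(event))
```

Adding or renaming an event is then a matter of editing the table, which fits the aim of keeping the system easy to change.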
Transitioning between files
Due to the simplicity of the system, which lacks a complex method of cueing or arranging overlapping files in a mixer-like environment akin to FMOD or Wwise (or one's DAW, such as Pro Tools, Reaper etc.), one cannot simply string a number of elements together. Doing so results in clicks and pops, or just abrupt, dissatisfying changes, which would pull the listener out of the atmosphere and be otherwise unpleasant to hear.
The method of rectifying this problem is straightforward: record the reverb that would occur after the element finishes as a separate audio file, which is linked automatically to its relevant element.

The highlighted area here shows the region in the mixer from which I created the 'Intro' file. The red 2 at the top right is the point at which the element ends. The period after that is the reverb file.
The next image shows how these files would be organised within the game, laid out in the mixer (this is only a visual representation).
The file in the top track is the intro track, the second is its reverb, played just after. The third track down is the first element of the segue and the fourth its reverb. Of course, the actual system would lack this mixer format and would have to be coded. It would be appropriate to have the reverb attached intrinsically to its appropriate element to play automatically when the relevant element finishes playing.
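A minimal Python sketch of that pairing might look like the following. It assumes each element knows its own length and the name of its reverb file, and it only works out when each file should start rather than playing anything. The 'Element' fields and the 'schedule' function are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Element:
    filename: str
    duration: float      # seconds of the element itself, without its tail
    tail_filename: str   # separately recorded reverb for this element


def schedule(elements):
    """Return (start_time, filename) pairs for elements and their tails.

    Each tail starts at the moment its element ends, which is also the
    moment the next element starts, so the two overlap.
    """
    cues = []
    t = 0.0
    for el in elements:
        cues.append((t, el.filename))
        t += el.duration
        cues.append((t, el.tail_filename))
    return cues


cues = schedule([
    Element("intro.ogg", 12.0, "intro_tail.ogg"),
    Element("segue_a.ogg", 8.0, "segue_a_tail.ogg"),
])
for start, name in cues:
    print(f"{start:6.1f}s  {name}")
```

Here the intro's reverb and the first segue element both start at 12 seconds, which matches the second and third tracks in the mixer layout described above.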

This is a basic conceptual draft. I am no programmer; however, I have little doubt that it would be programmable, and I doubt that it would be overly difficult.
Extra features to consider
A few extra features are worth considering for such an audio system.
Audio-Damage relationship
This occurs in many FPSs, and it helps immerse the player in the experience. When the player takes a hit or drops below a certain number of hitpoints, the music changes. The usual approach might be to slow the elements' playback speed while injured. A very easy method to implement would be simply applying effects to the elements while injured. This could either use DSP effects or simply extra files relating to the elements, which are swapped to when a player is injured.
Applying a huge degree of reverb, cutting the highs and increasing the bass to add boom is a common method of creating the 'appropriate effect'.
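The file-swapping variant could be sketched in Python like this. It assumes every element has a pre-rendered injured version sitting beside it with an '_injured' suffix. That naming scheme, the function name and the threshold of 30 hitpoints are all hypothetical.

```python
import os


def element_for_health(filename, hitpoints, threshold=30):
    """Return the file to play: the normal element, or its injured variant.

    Assumes each element 'x.ogg' has a pre-rendered 'x_injured.ogg'
    (heavy reverb, highs cut, bass boosted) stored alongside it.
    """
    if hitpoints >= threshold:
        return filename
    stem, ext = os.path.splitext(filename)
    return f"{stem}_injured{ext}"


print(element_for_health("segue_a.ogg", hitpoints=100))
print(element_for_health("segue_a.ogg", hitpoints=10))
```

Pre-rendered files cost disk space but need no DSP at run time, which may suit a simple engine.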
Ducking
Ducking might not be hugely necessary; however, it can be helpful at times if a more important file is playing over another or if some significant vocal comes into play, however unlikely that may be. Most audio engines, such as OpenAL, provide the per-source gain control needed to do this.
P.S. It seems the image files have been buggered a bit. The idea should still be coherent in any case.
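If ducking did have to be done by hand, a gradual gain change is probably all that is needed. The following is a minimal Python sketch. The function name, the ducked level of 0.4 and the step size are assumptions, and the returned value would still have to be passed on to whatever gain control the audio engine offers.

```python
def ducked_gain(current, priority_playing, duck_to=0.4, step=0.1):
    """Move the music gain one step towards its target and return it.

    The target is 'duck_to' while a priority sound plays, otherwise 1.0.
    Call once per update tick so the change is gradual rather than a jump.
    """
    target = duck_to if priority_playing else 1.0
    if current < target:
        return min(current + step, target)
    return max(current - step, target)


gain = 1.0
for tick in range(8):
    gain = ducked_gain(gain, priority_playing=tick < 5)
    print(f"tick {tick}: gain {gain:.1f}")
```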