Worlds Built by Sound: The Evolution of Video Game Audio

Metal Gear Exclamation Sound

Throughout the history of gaming, our medium has been heavily driven by technological innovation. Every new console generation affords game development higher limits and new frontiers. Most commonly we recognize new ceilings in graphical fidelity as a gold standard of technological development. With the recent release of the Nintendo Switch and the rise of VR headsets, the means by which we control games has also become a measuring stick for technological development.

One developing aspect of gaming that I think has flown under the radar is the leaps we’ve seen in audio power through the years. To avoid starting any console wars, I’ve decided to use older examples. The PlayStation One’s sound processor (SPU) had 512 KB of RAM. The amount of available sound RAM would quadruple in the PlayStation Two. Going back even further, we find that the Sega Genesis sported about 8 KB of sound RAM while the SNES had a whopping 64 KB of sound RAM. There are many other aspects of audio specifications beyond sound RAM; indeed, some consoles defer audio to the CPU while others seem to have varying sound chips based on manufacture date and region.

Overall, my point is this – sound is obviously something that is considered in console design. After all, Sony and AMD made headlines when it became known that the PS4 would use AMD’s TrueAudio tech. As always, we must ask ourselves: “Why?” What does audio add to the average gamer’s experience? The answer is: a heck of a lot more than you’d expect.

Battlefield Bad Company 2
Anyone who has played Bad Company 2 can attest. Source: GeForce

Improvements in things like sound memory (as well as other kinds of technology) have enabled sound effects, music, and voice acting to reach richer and more complex heights. Hell, in the history of gaming, technology capable of handling widespread voice acting is a fairly recent phenomenon. This newfound complexity and depth have led to larger AAA game budgets (just as growing complexity in graphics and physics engines has ballooned game budgets). However, underneath the bottom line, there is something more fundamental that is thrust into contention as complexity in sound design grows.

To see what’s at stake as far as sound design goes, let us visit the uncanny valley. Typically, the term “Uncanny Valley” is used in reference to visual character design. A character is considered “uncanny” when it seems life-like, but is not quite life-like enough to pass as flawlessly human. The ultimate result is an unsettling “uncanny” feeling. In other words, something about the depiction feels off. Something feels incomplete. I would argue that this phenomenon is even stronger in sound design. A quick look at some of the more embarrassing moments in the history of video game design reveals the extent to which sound can be uncanny.

Consider, for a moment, the infamous laughing scene with Tidus from FF10. Or the “master of unlocking” line from Resident Evil. Recently, Watch Dogs 1 found itself in hot water over a lack of bullet impact sounds or effects on water and various surfaces. The reason these moments feel so off is the “uncanny valley.” The world, as it is presented, is played straight, but a technical or aesthetic component leaves the setting flawed and incomplete. As such, sound – especially sound in games with high production budgets – plays a very important role in making a world feel “whole.”

Final Fantasy 10 Tidus Laugh

On the flip side, sound effects done well quickly become iconic. How many times have you heard someone with a Metal Gear Solid “alert” sound as their text tone? Same goes for the Mario Bros. “Coin Sound.” In both cases these individual sound effects (which are both less than a second long) have become inseparable from the game in which they originate. These sounds conveyed the setting of their respective games so well that they became symbolic of the game as a whole. Very rarely does this sort of phenomenon occur in other mediums such as film, but it happens quite often in games.

Similarly, sound plays an important role in Esports for an entirely different reason. Oftentimes, sounds in competitive games will act as indicators and clues. In a game such as Counter-Strike you can hear your opponent’s footsteps in the next room over, and you can tell if they are walking on carpet or on wood. One can ascertain that the gunman around the corner has an AWP because of the distinct sound made by the weapon when it is fired. Nearly every Esports game (especially first person shooters!) has numerous tutorials on how to “read sound” in any given situation. As such, in Esports, it is important for the different sounds in game to be consistent and clear. If they were anything less, the sound effects could actually undermine the balance and skill ceiling of the game.

The audio cues that are so critical in Esports are often just as important in horror games. In a 2015 study out of Leiden University, Gizem Kockesen switched around the sound effects in Amnesia to study the effects. Her study found that gamers relied on the sound effects in game to act as cues. From there they adjusted their behavior in response to those cues. This led the study to conclude the following: “First of all, it was shown that if a game starts with reliable audio cues which slowly become unreliable players’ fear levels increase; this can be used to make survival horror games scarier.” This conclusion makes sense; without reliable cues, players partially lose the ability to make informed gameplay decisions.

Of course, no discussion of game audio would be complete without touching on how music is used in the medium. Ludwig Wittgenstein wrote that “Music conveys to us itself!” This statement is an articulation of what might be known as musical formalism: that is, the sense that the meaning of music is contained within the formal structure of the music itself. Video games, like film, reject musical formalism for expressionism. Unlike musical formalism, musical expressionism seeks to create meaning through feeling and emotion. Soundtracks rely on mood and emotional upswells as a means of conveying setting. Video game soundtracks in particular need to account for the fact that the individual tracks that make up the soundtrack are often extremely situational. Think about how many games have “battle music,” for example, or “victory music” or “defeat music.”

One type of game, however, does seem at least a little bit more formalistic in its approach to music; that is, music games themselves. Music games such as Guitar Hero and Rock Band revel in the structure of individual songs by involving the players in those structures. Players are measured by their ability to connect mechanically with the structures of the music. Because they plunge the players so deeply into the actual architecture of the music, these music games often rely very heavily on visual aesthetics to convey mood and tone.

With all of that said, how does one write a good and memorable soundtrack for a game? To some extent this is a subjective process. However, two aspects of the way games are built stand out to me as things that have to be considered when writing music for games. First, video games are repetitive. Second, music in games depends heavily on situational and emotional context. In my view, understanding these unique characteristics of video games is key to writing good game music.

Guitar Hero
Source: Gamespot

Unfortunately, there are those who are left out of the experience of sound in video games – the hearing impaired. As James Knack says in Making Deaf Accessible Games is Hard, without sound “The huge explosion behind you, as far as you know, didn’t even happen.” From my own research, it seems there are very few games that have closed captioning beyond subtitles. There are almost none that employ closed captioning in a creative and innovative way that does not distract from gameplay. Knack calls for the industry to step up and make a change – I, for one, agree. The video game development process, especially for AAA games, has reached a point of complexity where there really is no excuse not to be as inclusive and as innovative as possible in this regard, lest more and more folks be left behind.

In an article written for Polygon, Richard Moss argues that “Accessibility benefits everyone.” He continues: “It encourages creativity on the parts of designers.” In his article, Knack applauds the innovation of Thomas Was Alone as far as its experimentation with subtitling. In this way, I think that Moss’s and Knack’s theses line up nicely. Accessibility takes creativity, but we all benefit if everyone has a seat at the table.

Thus, the future of sound in video games is more complicated than increased audio RAM or more expensive music soundtracks… at least I hope so. The future of sound in video games has to do with making sound itself just as expressive and interesting for those who might have trouble accessing it. At present, although accessibility is incomplete, sound serves a rich purpose in video games because, as we’ve seen, sound goes a long way to build the worlds in which we play.
