Back in 1996, a new company known as 3dfx changed the world when it introduced the Voodoo1 graphics chip, one of the first add-in hardware graphics accelerators for the personal computer. Graphics on the PC would never be the same, as the dedicated graphics processor gave developers their first real 3D rendering tools, and gave birth to modern rendering techniques that all of us now take for granted.
With hardware acceleration, game authors could increase resolution while adding new lighting effects, textures, shadows and reflectivity, all of which were shown off to an amazed gaming world in GLQuake. Hardcore gamers rushed to buy a pair of Voodoo2 cards (the first SLI solution), and enjoyed double the frame rates and a maximum resolution increase from 800x600 to 1024x768. Soon, those without hardware acceleration were left behind, as OpenGL and DirectX 5 imposed new hardware requirements, and presented developers and gamers with even more to play with.
But this is 2006! I was in diapers when GLQuake was released. What does this have to do with multi-core technology?
Well, folks, hold onto your hats, because on Wednesday Valve’s Gabe Newell likened the arrival of dual- and quad-core CPUs to the release of the first GPU, and stated quite clearly his company’s commitment to leveraging the new technology to its fullest, a process Newell called “painful and expensive” but also “critical.” As we saw during Valve’s presentation, that process is already well underway at Valve’s Bellevue, Washington facility.

The Present State of Gaming

We have all borne witness to the breakneck pace at which hardware vendors have been pushing graphics cards to the limits. Just over three years have passed since cards like the ATi 9800 Pro debuted, but here we are today with triple the fill rate and memory bandwidth of that solution, and six times the number of parallel shaders, to say nothing of the multi-GPU solutions now becoming commonplace.
However, as games have become prettier and prettier, the behind-the-scenes operations that add to realism and immersion have lagged behind. As Brian Jacobson, Valve Senior Software Engineer and one of our hosts, put it, “I can render a photorealistic person, but I can’t make it act like a person.”

Serial Code Paths

For single-threaded applications, game code runs “in serial,” meaning that a series of queries and calculations takes place before each frame is rendered, and that each query or calculation in the series depends on input from the prior one, and therefore must wait its turn. An example might be as follows:

  • Build asset lists (textures, sprites, lighting sources)
  • Build object lists (crates, doors, walls, ammo packs)
  • Update animation (from physics modeling, user input, NPC actions, environmental actions and changes)
  • Compute shadows
  • Draw frame

These calculations take up substantial processing power, and must be completed three times for a single scene you might see in Half-Life 2: once to create the player’s point of view, once for the world as it is reflected in water, and once again for the view through any video monitors on the level! This sequence represents a ton of operations, and any inefficiency in the process can produce lag in the game.
The rendering speed and detail level available in today’s powerhouse GPUs can make the world within the game look real with ever-increasing ease, but what developers want is to find ways to make the world ACT real. With the visual component nearing its peak, the simulation component is, according to Valve software engineers, what will take games to the next level. . .allowing greater interactivity with the environment (imagine leaving accurate footprints in dew-covered grass), better AI (elimination of unnatural NPC behaviors), and more realistic particle effects (imagine smoke swirling and eddying behind an object passing through it). All these things are easy to draw, but the CPU power needed to make the necessary calculations simply isn’t available on a single-core machine.
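
To make the serial path concrete, here is a minimal sketch in Python. It is an illustration only, not Valve's code: the stage names mirror the list above, while `STAGES`, `VIEWS`, `render_view` and `render_frame` are hypothetical names I made up for the example, and each stage is a stand-in that simply records that it ran.

```python
# Stage names taken from the list above; the stages themselves are stand-ins.
STAGES = [
    "build asset lists",
    "build object lists",
    "update animation",
    "compute shadows",
    "draw frame",
]

# The three passes the article describes for a single Half-Life 2 scene.
VIEWS = ["player", "water reflection", "video monitor"]


def render_view(view, log):
    """Run every stage in order for one view; each stage waits for the previous one."""
    for stage in STAGES:
        log.append((view, stage))


def render_frame():
    """Serial frame: one full pass per view, one after the other, on a single core."""
    log = []
    for view in VIEWS:
        render_view(view, log)
    return log


frame_log = render_frame()
print(len(frame_log), "stage executions for one frame")  # 15 stage executions
```

Nothing in this loop can start until the step before it has finished, which is exactly why a second core sits idle when the code is written this way.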

Multi-Threaded Strategies

The first problem faced by those entering the world of multi-core optimized code is that a decade of effort has been spent writing serial code for single cores. As some of our recent Kentsfield benchmarks here at DH showed, software only realizes gains from multiple cores when it is written to efficiently execute multiple threads. The basic theory of multiple threads across multiple cores is obvious. . .multiple cores can run instructions in parallel, and divide a large task into smaller ones. Less obvious, from the programmer’s standpoint, is how to divide that work.

Coarse Threading

“Coarse threading” has one significant advantage over other multi-threaded alternatives, and that is simplicity. Under this model, each subsystem within the game runs on its own core. For example, the rendering, AI and sound subsystems would each have their own core. The threads would have to be synchronized, of course, adding to the potential for delay and game lag, but that is an issue faced anytime instructions are executed in parallel.
In their first forays into multi-threaded code, Valve experimented with the “coarse” approach, as it lent itself well to the bifurcated structure of the Source game engine. As you are probably aware, Source uses a server/client model, in which the game Client is responsible for user input, rendering, and graphics simulation, while the Server side manages AI, physics and game logic. The first step, then, was to give each half of the game its own core. In “contrived” maps, this arrangement led to near-perfect results, with the multithreaded code doubling the frame rate of its serial counterpart. In real-world maps, however, the advantage fell to a 1.2x increase: while the client core was pegged at near 100% CPU utilization, the server core spent most of its time idle, using only about 20% of its potential, a reflection of the uneven workloads of the two engine components.
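
The 2x and 1.2x figures line up with a simple back-of-the-envelope model, sketched below in Python. This is my own illustration, not something Valve presented: it assumes the serial frame time is the sum of the per-subsystem workloads, the threaded frame time is set by the busiest core, and synchronization costs nothing. The function name `coarse_speedup` is hypothetical.

```python
def coarse_speedup(core_loads):
    """Ideal speedup when each subsystem gets its own core.

    core_loads: time each subsystem needs per frame, in any consistent unit.
    Serial time is the sum of the loads; parallel time is the largest load.
    Synchronization overhead is ignored (an assumption of this sketch).
    """
    if not core_loads or max(core_loads) <= 0:
        raise ValueError("need at least one positive load")
    return sum(core_loads) / max(core_loads)


# Contrived map: client and server do equal work.
print(coarse_speedup([1.0, 1.0]))  # 2.0
# Real-world map: client pegged, server needing about 20% as much time.
print(coarse_speedup([1.0, 0.2]))  # 1.2
```

Under these assumptions, the only way to get past 1.2x is to move work off the busy core, which is where the finer-grained approaches come in.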

Fine Threading

As opposed to the heavy hand of coarse threading, fine threading attempts to take a more educated approach to dividing the workload by spreading identical operations across cores. In the game environment, this approach is well suited to looping operations which continually perform the same functions, such as lookups which update the positions of objects within a map. In early experimentation, fine threading showed good scalability (meaning two cores were twice as fast, and four twice as fast as that), and moderate difficulty in coding.
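
As a rough sketch of the idea, the Python below splits one position-update loop into equal chunks and hands each chunk to a worker. The names `update_positions`, `split` and `update_all`, and the `(x, y, vx, vy)` object layout, are assumptions made for the example. Note also that standard CPython threads will not actually run this arithmetic in parallel because of the global interpreter lock, so read it as a picture of how the work is divided rather than as a benchmark.

```python
from concurrent.futures import ThreadPoolExecutor


def update_positions(chunk, dt):
    """Same operation for every object: advance position by velocity * dt."""
    return [(x + vx * dt, y + vy * dt, vx, vy) for (x, y, vx, vy) in chunk]


def split(items, parts):
    """Cut a list into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def update_all(objects, dt, workers=4):
    """Spread the identical per-object update across `workers` threads."""
    chunks = split(objects, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in chunk order, so object order is preserved.
        results = list(pool.map(update_positions, chunks, [dt] * len(chunks)))
    return [obj for chunk in results for obj in chunk]


objects = [(float(i), 0.0, 1.0, 2.0) for i in range(10)]
moved = update_all(objects, dt=0.5)
print(moved[0])  # (0.5, 1.0, 1.0, 2.0)
```

Because every object gets the same treatment and no chunk depends on another, adding workers simply means cutting the list into more pieces.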

Hybrid Threading

After examination of the models described above, it became clear that different portions of Source engine code were better suited to different threading strategies, thus the birth of “hybrid threading.” Some systems, such as sound, do well simply isolated on their own core. Others, such as the looping lookup function discussed above, are more efficient running in parallel across multiple cores. Additionally, using a combination of coarse and fine threaded processes allows the developers to get closer to the holy grail of game coding. . .100 percent efficiency, with no unused CPU cycles.
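
A hybrid arrangement might look something like the Python sketch below: one subsystem (sound) gets a dedicated thread, while a repetitive per-object loop is spread over a pool of workers. This is only my illustration of the structure; `sound_system`, `move` and `run_hybrid` are hypothetical names, the "work" is trivial stand-in arithmetic, and standard CPython threads will not show a real multi-core speedup here.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def sound_system(frames, mixed):
    """Coarse-threaded subsystem: runs start to finish on its own thread."""
    for frame in range(frames):
        mixed.append(frame)  # stand-in for mixing one frame of audio


def move(obj):
    """Fine-threaded work item: the same small update applied to each object."""
    x, vx = obj
    return (x + vx, vx)


def run_hybrid(objects, frames=3, workers=4):
    """Run sound on a dedicated thread while object updates use a worker pool."""
    mixed = []
    sound = threading.Thread(target=sound_system, args=(frames, mixed))
    sound.start()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(frames):
            # One frame: every object is updated by whichever worker is free.
            objects = list(pool.map(move, objects))
    sound.join()  # synchronize with the sound thread before returning
    return objects, mixed


final_objects, mixed_frames = run_hybrid([(0, 1), (10, -1)])
print(final_objects, mixed_frames)  # [(3, 1), (7, -1)] [0, 1, 2]
```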


Multi-threading at Work

It took a while, but here is the nugget many of you were probably seeking. . .what will these optimizations do for me, and when? Well, you will be pleased to know that multi-core optimizations will be delivered over Steam prior to the release of Episode 2! According to Valve, these optimizations should result in immediate performance improvements for dual- and quad-core rigs. Keep an eye on DriverHeaven, and we will provide benchmarks for you as soon as they become available.

While the present is exciting, the future is more so. As I alluded to above, Valve engineers are busily seizing upon the opportunity to improve the calculations behind the eye candy, in order to make the Source universe, and Half-Life in particular, more immersive.

Multi-core machines will afford two major areas of improvement that are already being explored on Valve’s test beds as you read this. The first is in the area of AI. With more computing power on tap, AI programmers like Valve’s Tom Leonard will improve AI behaviors, making NPCs better tacticians, and better at adapting to and using their environments. Consider a Combine soldier “looking for cover”: the program must query the world, taking into account line of sight, available structures, and changes in the environment. The process is computationally intensive, and must be done for every NPC “trying to take cover.” With more CPU power to burn, Tom and his colleagues will make AI smarter, better at adaptation, and less likely to kill immersion by doing something, well. . .stupid!
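
To give a feel for why a cover search gets expensive, here is a toy 2D version in Python. It is purely my own sketch and has nothing to do with how Source actually does it: obstacles are modeled as circles, a candidate spot counts as cover if the straight line from the threat to that spot is blocked by some obstacle, and the NPC picks the nearest such spot. The names `segment_hits_circle` and `find_cover` are hypothetical.

```python
import math


def segment_hits_circle(p, q, center, radius):
    """True if the segment p->q passes within `radius` of `center`."""
    px, py = p
    qx, qy = q
    cx, cy = center
    dx, dy = qx - px, qy - py
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(px - cx, py - cy) <= radius
    # Project the circle center onto the segment, clamped to its endpoints.
    t = max(0.0, min(1.0, ((cx - px) * dx + (cy - py) * dy) / length_sq))
    nearest_x, nearest_y = px + t * dx, py + t * dy
    return math.hypot(nearest_x - cx, nearest_y - cy) <= radius


def find_cover(npc, threat, candidates, obstacles):
    """Return the closest candidate whose line of sight to the threat is blocked."""
    hidden = [
        c for c in candidates
        if any(segment_hits_circle(threat, c, o, r) for (o, r) in obstacles)
    ]
    if not hidden:
        return None
    return min(hidden, key=lambda c: math.hypot(npc[0] - c[0], npc[1] - c[1]))


crate = ((5.0, 0.0), 1.0)  # obstacle: center and radius
spots = [(8.0, 0.0), (8.0, 5.0), (20.0, 0.0)]
print(find_cover(npc=(6.0, 3.0), threat=(0.0, 0.0),
                 candidates=spots, obstacles=[crate]))  # (8.0, 0.0)
```

Even in this toy form, the work grows with NPCs times candidate spots times obstacles, and it has to be redone whenever the environment changes. Since each NPC's query is independent of the others, it is also the kind of job that could be handed out across cores.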

The second area being explored is particle effects. As many of you have likely noticed, particle effects in games look pretty, but are often completely divorced from the universe of the game. Smoke and dust hang in the air, but don’t react to forces acting upon them, like a hand passing through, or wind blowing in through an open window. With multithreaded code, we will be seeing this change in the near future (but no, not in Episode 2).


The Demos

One of the real treats of Hardware Day 2006 was to see some of these effects in action, and to see them on a quad-core Kentsfield, sporting an X1950XTX and a huge flat panel display! We were shown two demos, one each on AI and particle effects. The AI demo was fascinating, and really showed the potential within Source that is about to be unleashed.

In the AI demo, the player first walked into a room where hundreds of orange bugs swarmed, each one of course running an AI routine which dictated its behaviors. This was impressive in itself for the sheer number of creatures, but then in another room (Collision Detection), the same swarm navigated over and through a dozen or so static obstacles, and in the third (Advanced Collision Detection), over obstacles that were influenced by their movement. Boxes toppled with typical Source/Havok realism, and you could clearly see the bugs reacting and adapting to the changes they had wrought. We were told that the calculations needed to keep a swarm of this size moving would bring a Pentium 4 to its knees, but the Kentsfield pumped out the data as quickly as it could be drawn. AWESOME.



The above graph shows results from Valve’s Particle Simulation Benchmark, and demonstrates the raw power and effective scaling of the Kentsfield setup. The demo was provided to us for additional testing, so I have included some short captures of the effects in action below. They speak for themselves, loud and clear! (Click below for WMV videos. Problems streaming? Right click and save to your desktop.)






Final Thoughts

I was a bit skeptical at first of Valve’s position that multi-core CPUs would usher in a new era in gaming, as the first graphics accelerator had, but after listening to the Valve engineers, seeing the excitement on their faces, and experiencing the demos first-hand, I am a believer. The coming years should see some truly amazing advances in the immersion of gaming, and it seems that Valve is ready to make the investments needed to lead the way.

However, one thing did occur to me as I listened to Tom Leonard outline his goals for advancement of AI. I thought back to the release of Half-Life 2, and I remembered how the Source Engine, unlike that of id’s Doom 3, was very forgiving of less-than-bleeding-edge hardware. Sure, someone with a 9800 Pro had to turn down visual settings to get playable frame rates, but the game still looked very, very good, and the gameplay experience was the same across a variety of hardware.

If Valve leverages the power of multi-core, and quad-core in particular, to improve AI routines, single-core users will surely be left out in the cold when those advanced features are turned off. In other words, lack of computing power will substantively change the game experience for those who don’t upgrade, because NPCs won’t be doing the same things.

That is a change in posture from the initial release of HL2, and may send a lot of people running to the computer store.

Thanks to Gabe Newell, Doug Lombardi and the rest of the Valve staff for hosting DriverHeaven at this fascinating event.



Copyright ©2002-2006 DriverHeaven.net, All rights reserved.
