Introduction to Gaming AI
You probably don't think that the average rabbit is particularly intelligent. They generally like eating grass, hopping around and digging holes. How complicated could such a small animal be?
Well, in early 2008, IBM embarked on a project at the Thomas J Watson Research Center to digitally simulate the cerebral cortex of just such a small mammal. The test used 8,192 processors and 2.8TB of memory on the Blue Gene/L supercomputer to simulate the 22 million neurons and 11 billion synapses found in the 16cm² cortical surface of a rabbit's brain.
Simulating even primitive intelligence is extremely demanding, so how does a basic PC go about simulating human intelligence in multiple game characters at once?
Gaming AI is in many ways similar to 3D graphics rasterisation, which uses complex trickery such as shaders and textures to give the impression of real light and shadow. If you were to properly ray trace a 3D scene in a game, as you would in a CGI movie, it would take considerably more processing power, and the same applies to AI.
Empire: Total War features thousands of men on screen at once. All need to have a degree of intelligence to make the game playable.
Creative Assembly's AI programmer, Richard Bull, who masterminded the battle AI system in Empire: Total War explains that "the AI academics are your wizards and we're your stage magicians - it's all smoke and mirrors with game AI.
"We're trying to give you the perception of reality, and we therefore often look like we've got systems that are much more impressive than anything the academics can create. In fact, in most cases, we're not doing anything more intelligent - what we're doing is giving the perception of the AI being much cleverer than it actually is."
When an enemy in F.E.A.R. kicks you as you move closer and then dives through an open window, it gives you the impression that it's a real human character wanting to kill you and survive, but all it's really doing is responding to a computer program and using the right animations at the right time. Unwrapping gaming AI is rather like watching the camera zoom out to reveal Frank Oz controlling Miss Piggy; it's disappointing at first, but becomes incredible when you see the amount of work that goes into the puppetry.
So how does a gaming character give the impression of being intelligent? To answer that requires a little grounding in the basics of AI. Let's start with the traditional first-person shooter enemies of yore, which idly patrol their area until they notice you and then start shooting at you. The decisions of these characters are made using state machines.
Each non-player character or squad of non-player characters has a number of states, such as ‘man shooting' or ‘seeking enemy,' and each of these states has various actions and behaviours associated with it. For example, a character in a ‘man shooting' state has actions such as firing and reloading weapons, and the character then changes to another state if a certain number of criteria are met. If the enemy moves out of sight, the character might change to a ‘seeking enemy' state and employ a set of actions that involve looking around the game world for new enemies.
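To make the idea concrete, here's a minimal Python sketch of such a state machine. The state names and transition conditions are hypothetical; in a real engine, each state would also drive a set of actions and animations every tick.

```python
# Minimal finite state machine sketch for an FPS guard. States and
# conditions are illustrative, not from any particular engine.
class Guard:
    def __init__(self):
        self.state = "patrolling"

    def update(self, can_see_enemy, enemy_in_range):
        # Each state checks its exit criteria every tick and switches
        # to a new state when they're met.
        if self.state == "patrolling":
            if can_see_enemy:
                self.state = "shooting" if enemy_in_range else "seeking_enemy"
        elif self.state == "shooting":
            if not can_see_enemy:
                self.state = "seeking_enemy"
        elif self.state == "seeking_enemy":
            if can_see_enemy and enemy_in_range:
                self.state = "shooting"
        return self.state
```

Calling `update()` once per frame with fresh perception data is all it takes: a guard that loses sight of you drops from "shooting" back to "seeking_enemy", just as described above.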
This concept is simple enough, but the decision trees that make up state machines become massively more complicated as you make characters more intelligent. These decision trees feature a hierarchy of behaviours, and are therefore called hierarchical finite state machines (HFSMs).
Like a family tree, this starts with a single state (such as charge), which can then branch out to a behaviour (such as engage or retreat). Each of these behaviours will then have a number of sub-options (such as flee or guard), which can also branch out into even more actions. The sub-options will compete with each other for execution, and the winner will be decided by the parent behaviour, which makes its decision based on the sub-option's weighting and relevance to the current situation.
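The competition between sub-options can be sketched as follows: each parent scores its children by weight multiplied by situational relevance and runs the winner. The behaviour names echo the examples above, but the scoring functions and values are invented for illustration.

```python
# Hierarchical behaviour-selection sketch: a parent behaviour scores
# its sub-options (weight * relevance to the situation) and delegates
# to the winner, recursively. All numbers here are illustrative.
class Behaviour:
    def __init__(self, name, weight, relevance_fn, children=()):
        self.name = name
        self.weight = weight
        self.relevance_fn = relevance_fn   # situation dict -> 0.0..1.0
        self.children = list(children)

    def score(self, situation):
        return self.weight * self.relevance_fn(situation)

    def select(self, situation):
        """Walk down the tree, letting each parent pick its best child."""
        node = self
        while node.children:
            node = max(node.children, key=lambda c: c.score(situation))
        return node.name

flee    = Behaviour("flee",    1.0, lambda s: s["threat"])
guard   = Behaviour("guard",   1.0, lambda s: 1.0 - s["threat"])
retreat = Behaviour("retreat", 0.8, lambda s: s["threat"], [flee, guard])
engage  = Behaviour("engage",  1.0, lambda s: s["advantage"])
charge  = Behaviour("charge",  1.0, lambda s: 1.0, [engage, retreat])
```

A high-threat, low-advantage situation filters down through retreat to flee; flip the numbers and engage wins instead.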
The AI for Relic's Dawn of War 2 uses a state machine with roughly 20 states
Chris Jurney, a senior programmer for Relic, offered the example of the state machines in its RTS, Dawn of War 2, to illustrate this. "The AI for Dawn of War 2 has roughly three main layers: the computer player, the squad and the entity," says Jurney. "The squad and the entities are both hierarchical finite state machines, and we have roughly 20 states at the squad level and 20 at the entity level. The states at the squad level pretty much map directly to orders that can be issued by the user.
"For example, if you issue a capture order to a squad, the squad will enter the SquadStateCapture state," continues Jurney. "This state might find that it's not close enough yet to capture the point, so it will sub-state to SquadStateMove. In this state, the formation movement system will kick in and start issuing Move orders to individual entities, and the entities receive these commands and enter the StateMove state. Inside that, paths are found, dynamics are applied and generally, the individuals try to look smart as they perform their orders."
Some states will also be shared among multiple character types where relevant, and most characters spend most of their time using just a few standard states, with the others waiting in the wings for extraordinary circumstances.
Crytek's Matthew Jack provides the example of the North Korean soldiers in Crysis, which he says have around 25 states from which to choose, but their "core behaviour is defined by only a small number of these". Jack estimates that "perhaps 90 per cent of their time is spent in only five states", while the other states "handle specific circumstances such as using or reacting to vehicles, using special weapons and so on".
State machines have been a staple of gaming AI for decades, but they can only go so far before they get out of hand. A non-player character can have scripted or partly random behaviour, but using a state machine will mean that it's inherently predictable to a certain degree. If you want to make a character appear more sophisticated, you need to make its state machines ever bigger - until they reach the point where they become unmanageable.
Creative Assembly's studio communications manager, Kieran Brigden, explains, "When you start talking about an ever-expanding number of states then you start to run into problems. For example, if you have a huge number of states for a man in Empire: Total War, you then put 160 men in a unit, and 20 units in an army, think how many decision trees you're having to do for every single man on the screen. We're talking about several thousands of guys on the screen at one time; each of those has a massive state-based logic tree, so never mind your two percent processor overhead."
Crytek's Matthew Jack concurs, saying that "the problem with state machines is defining all the transitions between them; 25 states isn't such a big number for an AI character, but it can be hard for a developer to manage all the ways that a state might be reached, and ensure that it can be reached where it's needed". Without revealing any specifics, Jack also adds that "since Crysis, we've moved towards other behaviour models to help us manage that complexity".
The Planning System
So is there an alternative to using state machines in gaming AI? Not entirely - state machines are still required to define simple behaviours in gaming characters, and they're perfect for the job, so there's no need to change them. Creative Assembly, for example, still uses a state machine for ‘man shooting', among many other states.
The problem is when you want to make your characters act more intelligently without cluttering up your CPU resources and creating huge, complicated workloads for programmers. The answer to this is a relatively recent development in gaming AI called the goal-orientated planning system.
Monolith's F.E.A.R. was one of the first games to bring planning systems to the fore when it launched at the end of 2005, and the difference was instantly noticeable when you saw multiple characters on-screen acting uniquely and, in some cases, very humanly. Gone was the straightforward shooting and wandering around, and in came an advanced force of enemies that would vault over obstacles, duck to avoid fire and kick you during close combat.
In simple terms, a planning system makes a gaming character work out what it has to do in order to fulfil an objective. Instead of having a series of rule-based behaviours in a state machine jostling for attention, it will have a set of goals or objectives, and the character has several choices of ways to achieve them, which depend on the current circumstances. This diagram (see right) was drawn by Jeff Orkin from the MIT Media Lab, who also worked as a senior software engineer on F.E.A.R., and it shows how an AI character forms a plan using a goal-orientated planning system.
In Empire: Total War, typical goals would be ‘make sure my flanks are secure' or ‘look after this fort efficiently', which are always prioritised according to the situation. In order to achieve a goal, the computer needs to look at the situation in which it wants to be, and then work backwards from it to calculate the best way of achieving the desired result.
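Working backwards like this can be sketched as a naive backward-chaining search: start from the condition you want to be true, and chain in actions whose effects satisfy it, recursing on their preconditions. The actions, state keys and ordering below are invented for illustration; a real planner adds action costs, search heuristics and cycle detection.

```python
# Goal-orientated planning sketch: depth-first backward search from
# the desired world state. Actions and conditions are illustrative.
ACTIONS = {
    # name: (preconditions, effects)
    "shoot_enemy": ({"armed": True, "enemy_visible": True}, {"enemy_dead": True}),
    "pick_up_gun": ({"near_gun": True}, {"armed": True}),
    "move_to_gun": ({}, {"near_gun": True}),
    "find_enemy":  ({}, {"enemy_visible": True}),
}

def plan(state, goal):
    """Return an ordered list of actions achieving `goal` from `state`."""
    unmet = {k: v for k, v in goal.items() if state.get(k) != v}
    if not unmet:
        return []
    key = next(iter(unmet))
    for name, (pre, eff) in ACTIONS.items():
        if eff.get(key) != unmet[key]:
            continue
        sub = plan(state, pre)             # satisfy preconditions first
        if sub is None:
            continue
        after = dict(state, **pre, **eff)  # world state once this action runs
        rest = plan(after, goal)           # mop up any remaining conditions
        if rest is not None:
            return sub + [name] + rest
    return None
```

Asked to make `enemy_dead` true from a bare starting state, the search chains backwards through shooting, being armed and being near a gun, and emits the plan in forward order: move to the gun, pick it up, find the enemy, shoot.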
Creative Assembly introduced its own planning system in Empire: Total War, which could make the AI much more unpredictable compared with previous Total War games. Richard Bull worked on the battle AI in the game, and describes a planning system as "like a state machine that re-writes itself", but adds that "it isn't a magic system in that it isn't capable of creating its own objectives or tasks; we have a repository of things that the AI can do, and a repository of things that the AI might want to do".
When using a planning system, the AI works on a per-goal basis, working out which goal to prioritise after analysing the situation. In Empire: Total War, Bull gives the example of the AI "rolling its line infantry with cavalry on its flanks and with artillery on its wide flanks waiting to get within range". Bull explains, "It would then be dangerous for the planning system to say 'well actually, I want to use these guys to do this job over here' at the risk of damaging the whole army plan, so the AI takes the resources into account, as well as their current activity and state of action, plus the enemy threat and the terrain."
Bethesda also moved over to a planning system for the combat AI in Fallout 3. Jean-Sylvere Simonet, a programmer at Bethesda who worked on the AI in the game, explains that it "combines simple actions (small state-machines) into entire plans. We implemented a planner for this after trying out other alternatives. In the end, an NPC in combat was able to prioritise goals and actions to maximise their chance of success".
If you kill one of these Super Mutants in Fallout 3, the other won't think twice about picking up his dead brother's weapon if that's the most efficient way of achieving his current goal
The end result in Fallout 3 is that an NPC works out the most efficient way of achieving its goals, which makes it appear as though it's thinking for itself. It's impressive stuff to watch. For example, suppose there are two Super Mutants in front of you, you kill one of them, and the other one then runs out of ammo. The remaining Super Mutant will then have the goal of finding a new weapon, so the planning system will work out the most efficient route to obtain one; if the nearest weapon is on his dead brother's body, he'll pick it up. This level of unpredictable behaviour is very complicated in a state machine, but a planning system means that the NPC does it automatically. Planning systems aren't the answer for everyone though.
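The "most efficient route" part of that decision is, at heart, a cost comparison. As a toy illustration (the positions and weapon names below are entirely made up, and real planners weigh much more than distance), satisfying an 'armed' goal might come down to picking the cheapest weapon source to reach:

```python
# Cost-comparison sketch: to satisfy an 'armed' goal, fetch whichever
# weapon is cheapest (here, simply closest) to reach. Illustrative only.
import math

def nearest_weapon(npc_pos, weapon_positions):
    """Return the position of the closest weapon to fetch."""
    return min(weapon_positions, key=lambda p: math.dist(npc_pos, p))

mutant = (0.0, 0.0)
weapons = {"armoury rifle": (40.0, 5.0), "fallen brother's gun": (3.0, 2.0)}
best = nearest_weapon(mutant, list(weapons.values()))
```

If the dead brother's weapon is the closest source, that's the one the NPC goes for - exactly the behaviour described above, but emerging from a cost function rather than a hand-written script.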
Relic, for example, still uses state machines for the AI in Dawn of War 2, since the developer wanted the AI to be predictable to a certain degree. Chris Jurney points out that when a player issues an order to a squad, "he then expects that unit to carry it out to the best of his abilities. Because of this, it's very important that units perform consistently so that users can build an expectation of what will happen when they click".
"As a result," says Jurney, "for the purposes of AI, this means that predictability trumps smarts, so things like learning or re-writing decision trees are out. The expectation of getting the exact most optimal behaviour is a lot higher in RTS games as well, because the player can see every action of every entity he's controlling and fighting against."
So you now have characters that look as if they can think for themselves, but how do they know where they are, and how do they know what their environment looks like so that they can reach you in the most efficient way?
NPCs can't see the same game world you're viewing - that would take a ridiculous amount of processing power. Instead, they need a guide to the areas to which they can go. This often takes the form of a navigation grid that shows obstacles and the shape of the terrain, but developers often also use a navigation mesh (usually called a navmesh) that's local to the current 3D scene.
A navmesh is basically a low-resolution 3D mesh that sits on top of a 3D scene and shows an AI character the areas to which it can move, as well as featuring tags and hints to the AI that have been placed by the designer. On the AI development blog for F.E.A.R. 2: Project Origin, Monolith's senior software engineer, Matt Rice, explains that "the designer will place this invisible mesh any place he wants the AI character to potentially move to".
The navmesh in F.E.A.R. 2 features tags that indicate to the AI that it must play a particular animation, such as a dive, in order to traverse that area
Rice provides the example of an NPC in the game deftly manoeuvring between objects that provide cover, and then diving through an open window to flank a buddy who's taking fire. Rice explains that "this is accomplished by our fancy animators creating a palette of animations for the designers to use in any given space. The designer will then tag a portion of the navmesh to indicate to the AI that it must play a particular animation (a dive for instance) in order to traverse that area".
This might sound a lot like scripting, but the NPC will decide for itself whether diving through the window is the best course of action by using the game's planning system. If it needs to dive through a window, the navmesh will tell it which animation it needs to use when doing it. Amusingly, Rice adds that "if the resulting path ends up by looking cinematic and awesome, then sweet; I'll take credit for it. If not, it's back to the drawing board for myself and the designer or animator, or both".
A navmesh provides more flexibility than a standard navigation grid, and not just because of the animation hints. Bethesda's Jean-Sylvere Simonet points out that the navmesh used in Fallout 3 (see image, right) for local path finding "turned out to be the perfect support for storing cover information that the NPCs needed, and it also allowed us to have a unified pipeline for making small and large creature paths around the environment. Whereas our previous system (a navigation grid) assumed a standard character size, the navmesh let both Bloatflies and Behemoths use the same data to find their way around corridors and alleyways".
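One way to picture how a single navmesh serves everything from Bloatflies to Behemoths is to store a clearance radius per polygon and simply skip polygons that are too tight for the agent during the search. The polygon names, clearance values and breadth-first search below are a simplified illustration, not Bethesda's implementation.

```python
# Navmesh search sketch: each polygon stores a clearance radius, so
# one mesh serves creatures of any size. All data here is invented.
from collections import deque

# polygon id -> (clearance in metres, adjacent polygon ids)
NAVMESH = {
    "courtyard": (5.0, ["corridor", "alleyway"]),
    "corridor":  (0.5, ["courtyard", "vault"]),
    "alleyway":  (1.5, ["courtyard", "vault"]),
    "vault":     (3.0, ["corridor", "alleyway"]),
}

def find_path(start, goal, agent_radius):
    """Breadth-first search over polygons wide enough for the agent."""
    frontier = deque([[start]])
    seen = {start}
    while frontier:
        path = frontier.popleft()
        if path[-1] == goal:
            return path
        for nxt in NAVMESH[path[-1]][1]:
            if nxt not in seen and NAVMESH[nxt][0] >= agent_radius:
                seen.add(nxt)
                frontier.append(path + [nxt])
    return None
```

A small creature squeezes through the corridor, a larger one is routed around via the alleyway, and anything too big for both gets no path at all - all from the same data.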
Simonet also points out that a navmesh can help the AI with much more than pathfinding. "Because of the inherent spatial information it stores," says Simonet, "we were able to write algorithms that 'thought' about the space. Searching behaviour, for instance, was able to use the navmesh data to map out areas that an NPC had investigated quickly and efficiently, and it could then come to a conclusion about where the target was."
Lie of the Land
In a similar fashion to Fallout 3 and F.E.A.R. 2, the terrain analysis in Crysis was managed by first triangulating the terrain; after that, the AI can derive a path-finding grid from the triangulation.
Crytek's Pavel Mores explains that this grid "was unified in the sense that multiple kinds of agents could use it. This was achieved by storing additional data with nodes and links (hints placed by the game designer) to decide, for instance, whether an agent is small enough to pass through a link or whether it can handle the water depth associated with a link".
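Per-link annotations of this kind boil down to a passability predicate evaluated per agent. A minimal sketch, with invented field names and values (Crytek's actual data layout isn't public):

```python
# Link-annotation sketch: each graph link carries designer-placed data,
# and a per-agent predicate decides whether the link is passable.
def can_use_link(agent, link):
    """An agent may use a link only if it fits and can wade the water."""
    return (agent["radius"] <= link["max_radius"]
            and agent["max_wade_depth"] >= link["water_depth"])

scout   = {"radius": 0.4, "max_wade_depth": 1.2}
vehicle = {"radius": 1.8, "max_wade_depth": 0.3}

river_ford   = {"max_radius": 2.5, "water_depth": 0.9}
narrow_trail = {"max_radius": 0.6, "water_depth": 0.0}
```

The path-finder then simply ignores any link for which the predicate fails: the scout can take either route, while the vehicle can neither wade the ford nor fit down the trail.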
As well as the terrain triangulation system, Crysis used traditional waypoint graphs to define longitude and latitude on the surface, but it also featured what Mores describes as "specialised support for navigation through 3D space".
He explains that "one part of this is what we call flight navigation, which is effective for volumes where an up-vector can be reasonably defined (where there's gravitation). The other is volume navigation, which supports free movement in space, so it's more general at the cost of more complex pre-processing".
Navigating inside 3D volumes is an AI challenge that's becoming increasingly important. We aren't only up against foes on the ground any more; in a game such as F.E.A.R. 2, or even BioShock, we're now up against mutant beasties that can climb walls and traverse ceilings, as well as hanging off ledges and falling through the air. Monolith uses an impressive method of addressing this in F.E.A.R. 2 with the use of what it calls ‘segments'.
The segments in F.E.A.R. 2 are complicated work for programmers, but they allow game characters to move through a 360-degree environment, rather than just on the ground
Monolith's Matt Rice explains that "segments are a series of points and links, which may or may not be on or attached to the navmesh. The AI is free to create paths using these segments just as it would on a normal navmesh". However, unlike a polygon in a navmesh, whereby an AI character is allowed to move to any point within that polygon, a segment is placed in a very specific point in the scene, and these points can be located anywhere in the environment. This, Rice says, gives the "AI characters free rein over the environment, no longer tying them to moving along the floor".
When you combine segments with a planning system, you end up with chaotic scenes in which you can have monsters coming at you from all directions, and in an unpredictable fashion. As an example, Rice says that "while the player is focused on the half-naked Abomination directly in the front, his compatriot will have leapt from the floor to the ceiling directly above, to the wall at the right, and ended up directly behind the player ready to pounce".
Segments are a very complicated feature to implement though. Rice explains that "for the designer, the added complexity of managing all of the segments and links can be daunting". However, he's confident that "anyone who has battled F.E.A.R. 2's Abominations will appreciate the many acetaminophen pills the team consumed".
Meanwhile, Relic uses a large number of one-megapixel navigation grids to store the terrain data for the AI in Dawn of War 2. As RTS veterans will know, Company of Heroes introduced the concept of using terrain and objects in the game as various levels of cover, and this continues in Dawn of War 2. To do this, the game needs two grids for cover - one for the terrain, and another for objects and obstacles.
In addition, the game features a grid that describes the largest size of unit that can go into each cell, plus a grid that the units can use to reserve their destinations when moving, so they don't end up competing with another unit for the same space. On top of that, there's another grid that describes where units are going to be located in the near future - again to avoid collisions. There's also a ‘hierarchical sector map' that describes where cells with identical properties are merged together, which speeds up long-distance searches. That amounts to a total of six maps needed to transport NPCs from A to B in an intelligent fashion.
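The destination-reservation grid in particular is easy to picture: before a unit moves, it claims its target cell, and a second unit asking for the same cell is diverted to the nearest free one. The sketch below is purely illustrative of the idea, not Relic's code.

```python
# Destination-reservation sketch: units claim cells before moving, so
# two units never converge on the same spot. Illustrative only.
class ReservationGrid:
    def __init__(self, width, height):
        self.width, self.height = width, height
        self.reserved = {}          # (x, y) -> unit id

    def reserve(self, unit, cell):
        """Reserve `cell`, or the nearest free cell if it's taken."""
        if cell not in self.reserved:
            self.reserved[cell] = unit
            return cell
        x, y = cell
        # Scan growing neighbourhoods for a free cell (re-scanning the
        # inner cells as the radius grows is wasteful but fine here).
        for radius in range(1, max(self.width, self.height)):
            for dx in range(-radius, radius + 1):
                for dy in range(-radius, radius + 1):
                    c = (x + dx, y + dy)
                    if (0 <= c[0] < self.width and 0 <= c[1] < self.height
                            and c not in self.reserved):
                        self.reserved[c] = unit
                        return c
        return None                 # grid completely full
```

The first unit to ask for a cell gets it; the next is bumped to an adjacent free cell, so formations settle without jostling.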
Relic draws its maps dynamically using rasterisation, just as you would in the 3D world. Relic's Chris Jurney explains that it's "just like what a graphics card does when it draws triangles to the screen. When you put a new building into the world, we draw the box that it blocks into the various map grids to update them, exactly like rendering a quad (a four-sided polygon) in an overhead view".
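Rasterising a blocking box into a grid really is as simple as it sounds - a fill over the covered cells. A minimal sketch (the grid and box coordinates are invented):

```python
# Rasterisation sketch: mark every grid cell covered by a building's
# blocking box, much as a GPU fills a quad. Illustrative data only.
def rasterise_box(grid, x0, y0, x1, y1):
    """Mark cells in the inclusive rectangle (x0,y0)-(x1,y1) as blocked."""
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            grid[y][x] = 1
    return grid

passable = [[0] * 6 for _ in range(4)]   # 6x4 map, all clear
rasterise_box(passable, 1, 1, 3, 2)      # drop a 3x2 building in
```

Run the same fill over each of the relevant map grids whenever a building is placed or destroyed, and every layer of the AI's world view stays current.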
An advantage of Relic's system of rasterised designer-hinted maps is that it makes it easy for the game to keep up with changes in the environment. With increasingly complicated physics algorithms, game developers are under pressure to make game worlds react as realistically as possible.
One of the hot ideas at the moment is destructible environments, in which you can destroy or damage an object such as a tree or a building, with the result having an effect on the gameplay. Dawn of War 2 will feature destructible battlefields, but Jurney says that handling this is pretty simple.
"We keep all the maps and world representations up to date as the world changes due to construction or destruction," he explains. "So as long as the AI makes decisions often enough, they respond instantly to changes. Our AIs analyse their environments continuously (roughly every 0.5 to 1 second); as soon as the world changes, their behaviour changes to match it."
Crysis and Crysis Warhead both featured eminently destructible environments
A game that was particularly notable for its destructible environments was Crysis, in which you could drive a jeep through a jungle and knock down trees in front of you. Crytek's Pavel Mores explains that the developer needed multiple ways of dealing with an environment that could change dynamically and unpredictably. He gives the example of the AI performing a path-finding search on the navigation grid as though there were no dynamic obstacles present, and then retrieving the dynamic obstacles near the resulting path and adjusting it accordingly if there was an intersection.
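That two-phase approach - plan on the static world first, then repair the path where dynamic obstacles now intersect it - can be sketched very simply. The detour logic below (sidestepping one cell) is a deliberately toy stand-in for the local re-search a real engine would run:

```python
# Two-phase path sketch: take a path found on the static grid, then
# patch any cell now covered by a dynamic obstacle. The one-cell
# sidestep is a toy repair; real engines re-search locally.
def adjust_path(path, dynamic_obstacles):
    fixed = []
    for (x, y) in path:
        if (x, y) in dynamic_obstacles:
            fixed.append((x, y + 1))   # naive local detour
        else:
            fixed.append((x, y))
    return fixed

static_path = [(0, 0), (1, 0), (2, 0), (3, 0)]
fallen_tree = {(2, 0)}                  # a knocked-down tree on the route
```

The win is that the expensive global search never has to consider the dynamic objects at all; only the stretch of path they actually touch gets reworked.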
A rich navmesh can also help when it comes to destructible environments, and Bethesda used this in Fallout 3. The navmesh can be dynamically updated in the game when large rigid bodies are removed. This, of course, would usually use a fair amount of processing power, but Bethesda got around this by using simple square boxes to represent a rigid body on the mesh.
Bethesda's Jean-Sylvere Simonet says that this works fine though. "In practice," he says, "most dynamic objects have a pretty square outline (cars, buses, tables and so on), so this worked out. The process is to then find out the area of the navmesh that's overlapped by the rigid body and retesselate that area, leaving a hole along the bounding box's outline. Then we have to recompute cover edges on the fly, but that turns out to be quick, since there are only four new edges to check. The result is that NPCs can then use these dynamic objects as cover."
Could AI be the next big thing for GPGPU processing to tackle in games? There aren't many gaming features that don't have a hardware accelerator these days. A GPU can accelerate both 3D graphics and physics, while a decent sound card can accelerate advanced surround-sound effects. As AI is such a fundamental part of gameplay, shouldn't we be accelerating that with hardware too?
In 2005, an Israeli company called AIseek announced a dedicated AI processor called the Intia, which was designed to accelerate some AI features, including terrain analysis and path finding. However, we've yet to see any products based on the technology. The problem with this sort of dedicated hardware is that AI is such a fundamental part of gameplay that it can't be optional.
As such, an AI processor would need to be a part of a standard PC. This is what makes AI ripe for the picking when it comes to GPGPU technology. Almost every PC gamer has a graphics card, and providing it's compatible with Nvidia's CUDA or AMD's Stream technology (or a cross-platform GPU API such as OpenCL), it could be used to take some of the load from the CPU when it comes to repetitive AI processing.
AMD's head of developer relations, Richard Huddy, explains that the most common AI tasks involve visibility queries and path finding queries. "Our recent research into AI suggests that it isn't uncommon for gaming AI to spend more than 90 per cent of its time resolving these two simple questions," says Huddy. He adds that these two queries are "almost perfect for GPU implementation", since they "make excellent use of the GPU's inherently parallel architecture and typically aren't memory-bound".
Could the GPU soon be accelerating AI in games?
Nvidia agrees with this. Director of product management for PhysX, Nadeem Mohammad, explained that "the simple, complex operations" involved with pathfinding and collision detection "are all very repetitive, so pathfinding is one of the algorithms that works very well on CUDA". Mohammad adds that ray tracing via CUDA could play a useful part in AI when it comes to visibility queries. We aren't talking about graphical ray tracing, but tracing a ray from a bot in order to work out what it can see. "You have to shoot rays from point A to point B to see if they hit anything," says Mohammad. "We do the same calculation in PhysX for operations such as collision detection.
"You can always imagine CUDA as loads of processors running the same program but not the same instruction, and ideally on the same data set but with different input parameters," adds Mohammad. "So, in the context of AI, the data set consists of the whole game world, and the parameters going into it are the individual bots - that's one way of neatly parallelising the problem. If you look at it in that context then any AI program could be accelerated."
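The shape of that parallelism is easy to see even on the CPU: one visibility routine, run once per bot against the same world data. In the sketch below (with an invented wall layout, and ray sampling standing in for proper ray casting), each loop iteration is exactly the work a GPU would hand to a separate thread.

```python
# Batched visibility sketch: the same line-of-sight routine applied
# per bot over shared world data - on a GPU, one thread per bot.
# The wall layout and positions are invented for illustration.
def line_of_sight(blocked, a, b, steps=32):
    """Sample along the ray from a to b; fail if any sample is blocked."""
    (ax, ay), (bx, by) = a, b
    for i in range(steps + 1):
        t = i / steps
        cell = (round(ax + (bx - ax) * t), round(ay + (by - ay) * t))
        if cell in blocked:
            return False
    return True

walls = {(5, y) for y in range(3)}   # wall cells (5,0)..(5,2), open above
player = (9, 2)
bots = [(0, 2), (0, 5)]              # one behind the wall, one with an angle
visible = [line_of_sight(walls, bot, player) for bot in bots]
```

The bot level with the player is blocked by the wall; the bot further up has a clear diagonal over its top. Swap the list comprehension for a GPU kernel launch and the algorithm is unchanged - only the parameters differ per thread.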
Both AMD and Nvidia claim to be working with several game developers and middleware developers in AI. According to Huddy, "some middleware providers are looking at this in terms of packaging a GPU AI library for games, while some developers are looking to transfer their own existing AI code from CPU to GPU".
Mohammad estimates that we'll see GPGPU-accelerated AI soon. "I don't expect it within a year," he says, "but definitely within 18 months." GPGPU-accelerated AI also appeals to game developers, who are constantly on the lookout for ways to make their systems intelligent without using up valuable CPU resources. "I think there's a lot of potential for GPU acceleration to benefit AI," says Chris Jurney, Relic's senior programmer on Dawn of War 2. "All our AI is grid-based, and we're already using rasterisation to keep our maps up to date, and for line-draws on those maps to test for passability, so it's a great match."
Crytek's Markus Mohr also supports GPGPU-accelerated AI, saying that "with widely available parallel architectures, we have the opportunity to achieve new levels of quality and quantity". Mohr also notes that it "doesn't make much sense to develop dedicated hardware for AI," such as the Intia chip. Creative Assembly's Richard Bull agrees that "hardware advances that can offer processing power will allow us to think of more things that we can possibly do, certainly for planning or prediction". However, he also points out that the battle AI in Empire: Total War "peaks at around two percent of the logic usage" (see image, right).
This doesn't mean that AI isn't in need of advanced processing power; it just means that AI in its current form has limited resources available to it, and it has to compete with graphics for even a small share of processing power. Bull offers the example of an unbeatable chess computer that "gets to an end state by working out every possible move for every piece. A chess problem like that can, even now, with a lot of processors take hours, weeks or months to work out an optimal solution". Basically, there's always room for more processing power for AI.
"What runs slowly on the CPU, maybe taking 25 percent of the CPU load, can run efficiently on the GPU, adding only one or two percent to the total GPU load," Richard Huddy points out. Even a small share of a GPU's resources could boost AI processing power for some tasks, and it looks as though GPU companies and game developers are keen to see this working soon.
Not everyone is convinced that GPGPU is the best way of processing AI though. Bethesda's Jean-Sylvere Simonet notes that "we might be able to take advantage of parallel architectures, but not for everything. You could probably speed up some individual parts of the decision process, such as replacing your AI search with a brute-force GPU approach, or running a pattern detection algorithm". However, Simonet also points out that "most AI processing is very sequential and usually requires a lot of data.
"For an NPC to decide on its next action, it will usually have to query the world for a tonne of information, and most of that information is conditional on a previous query result. For that reason, fewer processors that are more versatile, such as the SPEs in the PlayStation 3's Cell chip, are ideal".
It's also worth noting that both AMD and Nvidia are talking about utilising this in CUDA and Stream at the moment, which would make it hard for GPGPU-accelerated AI to become a standard in the gaming industry. "We rely on the fact that all CPUs in all PCs come up with the exact same result to make multiplayer work," points out Relic's Chris Jurney. "We're only transmitting the inputs and commands between players, so if two GPUs decide on slightly different results in the AI, different players will be playing in different worlds."
Perhaps OpenCL or DirectX 11's Compute Shader could provide a way to accelerate AI on an even larger array of GPUs. Either way, it looks as though AI could be a major advancement in GPGPU processing.
AI in Games - Where Next?
AI is clearly becoming much more important in games, but it still has to compete with pretty graphics when it comes to processing resources, and it's often at the bottom of the pile.
Creative Assembly's Richard Bull notes that "there's still this disturbing mindset among programmers, particularly game programmers, that if the AI is taking any kind of considerable chunk of time, that's a really bad thing. It's only just getting to the stage now where people regard it as important enough to deserve this chunk of time in a game. If your graphics rendering is taking up 50 percent of your CPU time it's like 'well, never mind, it looks great', but if you try to tell people that you have this really intelligent decision-making system that's taking up 30 percent of the CPU time, they'll say 'you obviously don't know what you're doing, it's badly programmed' and so on".
However, with GPU hardware support and multithreading becoming more widespread, hopefully, we'll see AI using a greater amount of resources. Just take a look at your CPU resources when you're running Fallout 3 or GTA IV, and you'll see that your CPU is being hammered. Anything that can provide game developers with more AI processing power will be warmly welcomed by the gaming industry, and we can then progress to even more sophisticated AI systems.
Creative Assembly's planning system means Empire: Total War's AI analyses its resources before deciding on a goal and then working out the most efficient way of achieving it.
Where is gaming AI going in the future? Creative Assembly's Richard Bull points out that "once you're past the state machine, you've got the planning system, which is the rewired state machine, then the next step is online learning". Online chatbots such as Jabberwacky and Eliza learn as a result of constant online input, and an AI system could do the same. Creative Assembly already has a similar technology in the form of a debugging system that's used in its research and development process.
"We have this system of replays, and we have the ability to push these replays back through our AI debugging system, and we can get information from that," says Bull. "A lot of the advances in our planning system were trained by just running replays through this thing. It's completely offline, but it would be interesting to have some kind of system where we could just get gamers to send their replays in."
With a constant online feed of battle results coming in from players using different strategies, the AI could then balance the stats and learn other strategies from human players. Creative Assembly's Kieran Brigden enthusiastically notes that "it's impressive watching the AI go through the debugger - you're basically almost talking about a contained feedback neural network". Bull describes the potential as a "kind of 17th century Skynet", which could feed off loads of online data and improve considerably.