There is, I think, a major difference between game performance and performance in other applications. For most software, the goal is to make everything as fast as possible all the time. For games, the goal is slightly different. At Supergiant, we want our games to run at 60 frames per second, which means we want each frame to take no more than ~16 milliseconds.
Running much faster than 16 milliseconds is wasteful. Most gamers aren’t going to notice a difference between 100 frames per second and 60, and the roughly 6 milliseconds per frame we’d free up by dropping from 100 to 60 could be used to add more effects and objects – essentially, to create a better game.
Frame rate is tied to another concern, too: input response time. Games are an interactive experience, and gamers expect an immediate response to their actions. A response time that’s greater than 4/60ths of a second (about 67 milliseconds) feels sluggish and unresponsive, and fluctuations in response time are even worse.
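The arithmetic behind these numbers can be sketched in a few lines of Python. This is purely illustrative and not code from Bastion; the function and variable names are made up for the example.

```python
def frame_time_ms(fps: float) -> float:
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / fps


# Per-frame budget at the two frame rates discussed above.
budget_60 = frame_time_ms(60)     # roughly 16.67 ms
budget_100 = frame_time_ms(100)   # 10 ms

# Time per frame that becomes available when targeting 60 fps instead of 100.
headroom_ms = budget_60 - budget_100

# The "sluggish" threshold of 4/60ths of a second, i.e. four frames at 60 Hz.
sluggish_threshold_ms = 4 * budget_60

print(f"60 fps budget:      {budget_60:.2f} ms")
print(f"100 fps budget:     {budget_100:.2f} ms")
print(f"headroom:           {headroom_ms:.2f} ms")
print(f"sluggish threshold: {sluggish_threshold_ms:.2f} ms")
```

The headroom works out to a little under 7 milliseconds per frame, which is the time that can go to extra effects and objects instead of to frames nobody will notice.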
All this means there’s a continual back and forth to keep the game running at 60 hertz. As the engineers create more headroom through optimization, the artists and designers can make the game that much better. The more times we can iterate on this loop of optimization, the better the game becomes.
This is where performance profiling comes in – it’s critical for delivering a seamless, immersive, and interactive experience. We need to identify bottlenecks and fix them, then measure and verify that the frame time is both low and consistent. We use ANTS Performance Profiler, which allows us to attack both of these problems. Two features of ANTS are especially important for my work.
The first is the ability to create event markers. It means that if there’s a hitch, I can click my mouse, mark that section of the timeline, and investigate further. Similarly, I can click to mark a section that I’m interested in profiling. With other profilers, you have to enable and disable profiling at just the right time, which is error-prone and wastes time.
The second feature is the ability to look at just specific regions of the timeline. This feature enables me to look at specific frames that are too slow and find the bottlenecks. Most other profilers don’t allow you to look at regions of the timeline. All the profiler data is aggregated together, which is very problematic for me, because I’m looking for the 10 frames that are slow out of 1000, and that data gets lost in the aggregation.
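A small Python sketch shows why aggregation hides this kind of problem. The frame times below are invented for illustration (990 frames at 12 ms and 10 hitches at 50 ms); they are not measurements from Bastion, and this is not how ANTS works internally.

```python
from statistics import mean

BUDGET_MS = 1000.0 / 60  # roughly 16.67 ms per frame at 60 fps

# Hypothetical capture of 1000 frames: every hundredth frame hitches.
frame_times_ms = [50.0 if i % 100 == 50 else 12.0 for i in range(1000)]

# Aggregated view: the average looks comfortably within budget.
average_ms = mean(frame_times_ms)

# Per-frame view: pick out the individual frames that blew the budget.
slow_frames = [(i, t) for i, t in enumerate(frame_times_ms) if t > BUDGET_MS]

print(f"average frame time: {average_ms:.2f} ms (budget {BUDGET_MS:.2f} ms)")
print(f"frames over budget: {len(slow_frames)}")
```

With these made-up numbers the average sits well under the 60 fps budget, yet ten frames take three times too long. Only a view that isolates individual regions of the timeline surfaces them.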
I find it best to profile throughout the development process, so I can stay on top of performance issues as they arise – as soon as I notice the frame rate dropping, it’s time to fire up ANTS. It’s proved invaluable for a couple of problems we had with our latest project, an original action role-playing game called Bastion.
There was one weapon in the game that one day, all of a sudden, started causing a significant hitch when you fired it. The frame rate would drop to 2 frames per second. It was a big mystery because, as far as anyone knew, nothing had changed with that weapon in months. I fired up ANTS and set out to investigate. Right before firing the weapon in the game, I clicked my mouse to mark the region on the timeline. Looking at the marked region, I immediately located the culprit. We had recently implemented homing for some projectiles, and the code that searched for a target to lock on to was especially slow. But the real problem was not the homing code; it was that the weapon’s projectiles didn’t use homing in the first place! The code was finding a target to lock on to, and then ignoring it. Because this weapon had a dozen or more projectiles, it was bringing performance to its knees. I changed the code to only find a target if the projectile needed one, and we were back to running at a steady 60 frames per second. The whole process took less than an hour.
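The shape of that bug and its fix can be sketched in Python. Everything here is hypothetical (the class, the function names, the call counter standing in for an expensive search); it illustrates the pattern rather than reproducing Bastion's actual code.

```python
class TargetFinder:
    """Stand-in for the slow target search; counts how often it runs."""

    def __init__(self, enemies):
        self.enemies = enemies
        self.calls = 0

    def find_target(self):
        self.calls += 1  # in a real game, this is where the time would go
        return self.enemies[0] if self.enemies else None


def spawn_before_fix(finder, uses_homing):
    # Buggy shape: always search, then ignore the result if homing is off.
    target = finder.find_target()
    return target if uses_homing else None


def spawn_after_fix(finder, uses_homing):
    # Fixed shape: only pay for the search when the projectile needs it.
    return finder.find_target() if uses_homing else None


# A weapon that fires a dozen non-homing projectiles.
before = TargetFinder(["enemy_a", "enemy_b"])
after = TargetFinder(["enemy_a", "enemy_b"])
for _ in range(12):
    spawn_before_fix(before, uses_homing=False)
    spawn_after_fix(after, uses_homing=False)

print(f"searches before fix: {before.calls}")
print(f"searches after fix:  {after.calls}")
```

Both versions produce the same projectiles, which is why the problem looked like a slow function rather than a logic bug until the surrounding code was visible next to the timings.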
ANTS made this process so smooth because I could look at exactly the region that was causing the issue, and see the source code side by side with the timings. Being able to see the source code and the context let me see that this was a logic bug. If I had been using another profiler that only showed a call graph, it’s likely I would’ve immediately started optimizing the “find target” function, rather than looking at the context, which would’ve been a waste of time.
In another case, we noticed that some of our larger levels performed much worse than the smaller ones. Using ANTS, I realized the problem was that too many game objects were being updated every frame. ANTS allowed me to see this problem in two ways. First, the Hit Count column showed me that these lines of code were being hit thousands of times per frame. Second, the line level timings showed me, through the Avg. Time column, that each line on its own was very fast, or as fast as it could be. In fact, most objects’ update function didn’t do anything at all and just returned. This meant that the problem was the number of objects. The fix was to identify the objects that didn’t need updates and simply not update them. It was one of those rare cases where we had so many objects that the function call overhead itself was the bottleneck.
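One way to picture this kind of fix is a world that keeps a separate list of objects that need per-frame updates. The sketch below is an assumption about the general approach, not the actual Bastion implementation; the `needs_update` flag and all class names are invented for the example.

```python
class GameObject:
    # Hypothetical flag: objects opt in to per-frame updates.
    needs_update = False

    def update(self, dt):
        return  # most objects have nothing to do each frame


class Mover(GameObject):
    needs_update = True

    def __init__(self):
        self.x = 0.0

    def update(self, dt):
        self.x += 10.0 * dt  # move at 10 units per second


class World:
    def __init__(self, objects):
        self.objects = list(objects)
        # Built once up front, so the per-frame loop never touches
        # objects whose update would return immediately.
        self.active = [o for o in self.objects if o.needs_update]

    def update(self, dt):
        for obj in self.active:
            obj.update(dt)


# A large level: thousands of static objects and a handful that move.
movers = [Mover() for _ in range(20)]
world = World([GameObject() for _ in range(5000)] + movers)
world.update(0.5)

print(f"objects in level:    {len(world.objects)}")
print(f"updated every frame: {len(world.active)}")
```

The per-object work is unchanged; the saving comes entirely from making thousands fewer calls per frame, which matches what the hit counts pointed to.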
Over the course of Bastion’s development, we have tried alternatives. Before we started using ANTS, I once had to instrument our code so we could do our own profiling. It was a slow and tedious process. It took me all day to track down a single line of code that was causing a hitch. With ANTS it would’ve taken me a few minutes.
Other performance profilers slowed Bastion down so much that it was unplayable, which made getting profiling results for a “typical” playthrough very difficult. In addition, none of the free profilers matched ANTS for ease of use without sacrificing depth, and thus far no other profiler has let me optimize as quickly or as frequently. The speed with which I am able to iterate on optimization is one of ANTS’s strongest assets. As with debugging, 50-90% of optimization time is often spent tracking down the bug or the bottleneck. With ANTS, that time is reduced to 5-15% at most, a huge boost in productivity that makes a world of difference in our final product.
We are also at a stage now where I am optimizing for the Xbox 360, which has a slower CPU than our development machines. Although I can’t use ANTS to profile code on the Xbox, I can still use it on a development machine to see which code is slowest relative to the rest and optimize that way. So far it’s proven to be just as effective.
To try out the latest version of ANTS Performance Profiler, visit the Red Gate website.