Porting to Lower Systems Difficult

Porting to current-gen platforms is always a difficult task. The excerpt below explains why:

Fast, Simple, Higher Level Optimization

We recently had the rather daunting task of porting a next-generation SDK to current-generation and handheld platforms. This presented a variety of difficult challenges, with one of the most rewarding being performance optimization. Over several weeks, we used the often-overlooked methods in this article to gain an approximately 4x increase in performance and a healthy reduction in library size. A little assembly language was necessary, but on the whole the bulk of the optimization was high level.

First Step – Profile Profile Profile

Before undertaking any optimization exercise, it is necessary to establish a solid baseline against which to benchmark optimization changes. To do this, we set up a simple, repeatable scenario in our game. This scenario represented a rather typical but nonetheless demanding situation for the SDK. In our case, we captured approximately 30 seconds of profiling data on each run. Results were then saved and compared with subsequent profiling runs.
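As a rough illustration of saving results and comparing them against later runs, here is a minimal C++ sketch. The names ProfileRun, timeScenarioMs and compareRuns are hypothetical and do not come from the article, and the sketch assumes timings have already been reduced to one figure in milliseconds per function.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical record of one profiling run: function name -> total milliseconds.
using ProfileRun = std::map<std::string, double>;

// Time a single execution of a repeatable scenario (any callable).
template <typename Scenario>
double timeScenarioMs(Scenario&& scenario) {
    const auto start = std::chrono::steady_clock::now();
    scenario();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}

// For every function present in both runs, report baseline / current.
// A value above 1.0 means the current run is faster than the baseline.
std::map<std::string, double> compareRuns(const ProfileRun& baseline,
                                          const ProfileRun& current) {
    std::map<std::string, double> speedup;
    for (const auto& entry : baseline) {
        const auto it = current.find(entry.first);
        // Skip functions missing from the current run or with unusable timings.
        if (it == current.end() || it->second <= 0.0) {
            continue;
        }
        speedup[entry.first] = entry.second / it->second;
    }
    return speedup;
}
```

Keeping the scenario identical between runs is what makes the ratios meaningful. If the workload changes, the comparison says little about the optimization itself.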

We spent a lot of time in the profiler doing essentially two different types of run. The first type was sample-based profiling, where the profiler interrupts the application at a high frequency. With each interrupt, the program counter is queried and the tuning software makes a note of which function is being executed. This method of profiling is crude in terms of accuracy (due to the overhead of interrupting and logging results), but it is useful for quickly spotting where most cycles are being spent. Once we’d established potential offenders, we would then go through the source code and perform a second, more focused, function-specific profile. We were then able to quickly establish exactly where the time was being spent. The next step was then to figure out why.
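The second, function-specific kind of profile can be approximated by hand with a scoped timer. The sketch below is one common way to do it and is not the tooling the authors used. ScopedTimer, profileRegistry and sumOfSquares are hypothetical names, and the registry is not thread-safe.

```cpp
#include <chrono>
#include <map>
#include <string>

// Accumulated statistics for one instrumented function.
struct FunctionStats {
    long long calls = 0;
    double totalMs = 0.0;
};

// Hypothetical global registry keyed by function name (single-threaded use only).
inline std::map<std::string, FunctionStats>& profileRegistry() {
    static std::map<std::string, FunctionStats> registry;
    return registry;
}

// RAII timer: records the elapsed time when it goes out of scope.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        FunctionStats& stats = profileRegistry()[name_];
        stats.calls += 1;
        stats.totalMs +=
            std::chrono::duration<double, std::milli>(end - start_).count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical suspect function, instrumented after a sampling run flagged it.
int sumOfSquares(int n) {
    ScopedTimer timer("sumOfSquares");
    int total = 0;
    for (int i = 1; i <= n; ++i) {
        total += i * i;
    }
    return total;
}
```

After the scenario has run, the registry holds a call count and a total time for each instrumented function, which helps separate a function that is slow per call from one that is simply called too often.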

Second Step – Look at the Code

Sometimes we get lucky and can spot a bottleneck in a higher-level function immediately. Other times it is not so obvious, and this was especially true in our case as the code had already been optimized for next-generation platforms. With this in mind, we resorted to cross-referencing the source code with the actual assembly language that had been generated. Most debuggers have a ‘source code annotation’ option that makes this easier. By doing this, we were quickly able to spot areas where we thought the code was unnecessarily bloated. It is important to say that one doesn’t need a great knowledge of assembler programming to do this. Sure, it helps, but as developers we have a ‘feel’ for how much code something should take.

On the following pages, I’ll spend some time covering some of the more beneficial optimizations that we made.

More at: http://www.gamasutra.com/features/20060913/whitaker_01.shtml


