Evil Stick Blog

Tag: Technology

DX9 Engine Design, part 1

by Evil Stick Man on May.20, 2009, under Game related, Technology

Note: A lot of this is based on http://ati.amd.com/developer/gdc/D3DTutorial3_Pipeline_Performance.pdf, which was a presentation given at GDC several years ago (I don’t know the exact year). It is based on DirectX 9, and while most of the info is probably still valid, it wouldn’t surprise me if there are several items that were no longer valid as of the advent of DirectX 10 and later. Most of what I say will be a basic thought exercise stemming from the presentation, coupled with my experiences working with a DX9 engine at my last position.

At the core of every 3D game is its graphics engine. The graphics engine determines what the player sees, which then determines how they react. It’s something that’s very easy to do wrong, and the specifics are often under heated debate both in a company and on the web. Below are some guidelines to use when designing a high-performance 3D engine, or when optimizing an existing engine. There won’t be any code here, as I currently don’t have an engine utilizing these principles (it’s currently under development), but more some tips I discovered on the job.

Know thine enemy

First, before we embark on any engine enhancement journey, we need to make sure we can actually pinpoint where our engine slowdown is originating from. There’s no point in streamlining your vertex information if what you perceive as slowdown is actually taking place in your AI or physics code. There are a number of ways to gauge overall engine performance, and determine at least the major causes of slowdown:

  • Keep ample statistics - the success of your optimization effort, as well as turnaround time on improvements, will hinge on your ability to accurately judge the results of your improvements quickly. One of the best ways to do this is to develop a set of metrics in your engine. Proper encapsulation coupled with judicious use of time keeping will go a long way towards identifying problem areas. Encase your code in #ifdef __DEBUG” blocks and you can’t go wrong. At my last position, we often relied on these statistics to detect major problems both in the engine and in the engine management structure - for instance, if we weren’t releasing an image and were leaking memory to the video card, we could tell pretty quickly by watching a list of loaded images, coupled with instance counts.
  • Find some effective third-party tools - when I was tracking down engine speed blocks, both PIX for windows (included with the D3D SDK) and NVIDIA’s PerfHUD proved invaluable. PIX was great for tracking actual geometry issues - if something was coming out garbled, I could quickly use PIX to view the offending D3D call and the vertex/index buffers causing the trouble. PerfHUD, on the other hand, did a great job of providing metrics on the graphics pipeline itself. At a glance it’d give you load on your shader processors (vertex, pixel, and geometry), number of DIPs, memory usage, triangles in a scene, and so on. The instruction manual for this tool is in excess of 40 pages, and it is packed full of useful ways to utilize this tool.
  • Profiling helps, but it depends on your profiler - I was able to obtain some info using Visual Studio’s profiler, but at a certain level of engine complexity it fails to be useful. Most of the discoveries I was able to make were largely due to our own engine metrics, as well as use of the aforementioned tools.

Basically, you’ve got two bottlenecks in your engine: the CPU, and the GPU. Maxing one will impair the performance of the other. 95% of games are CPU limited, so start there in your quest for optimization. You can tell pretty quickly if you’re CPU limited by watching the graphs in PerfHUD - if you’re pinging only 25% GPU usage and only sending out 100k triangles and you’re still running slow, odds are you’re CPU bound and no amount of engine optimization will help you. In fact, contrary to popular wisdom, in these cases you want to do more work on the graphics card, not less, to take some load off your CPU.

Watch your creations and deletions

If you’re creating and deleting graphics objects during your game’s main loop, I highly recommend reorganizing your code structure. Creation and deletion are time-consuming operations, and can drastically impact performance. Your game should have, at least, a load and an unload phase that doesn’t impact play at all. There are exceptions to this (specifically when dealing with large, open worlds), but if your levels are small there should be no reason for you to constantly create and delete objects in your game loop. Store objects to be destroyed in a queue, and kill them when you have some extra time in your rendering (metrics in your engine help here).

Use all of your time

Set your D3DPRESENT settings to “IMMEDIATE,” and take control of engine rendering yourself. Letting DirectX do vsync for you is nice, but when you use this your present calls will block until their 1/60th of a second is complete (at least in my testing that would happen). In a highly dynamic environment, such as a game engine, this will mean that you’re going to start dropping frames once your non-rendering code takes more than its alloted time to run, and that for a brief segment of time you’ll be effectively running at 1/2 your refresh rate.

Basically all you need here is a method to call “present” at a regular time. You can either set a timer event to simply fire off a present call every 1/60th of a second (if you’re going for 60FPS), or at the least you can control the slowdown on a more granular level. By making some intelligent use of engine compartmentalization, you can also spread out your engine’s labor so that you minimize cyclic slowdown. Keep a queue of tasks that don’t necessarily need to be performed immediately, and then wait for some free time in your engine. Keep an overall frame time, and if you get to a draw call after having taken less time than expected, use that extra time to kill some objects, or load some objects, or do some poly sorting. How you end up specifically using this will depend on your engine architecture, but this’ll give you one more dial to play with when optimizing for performance.

The graphics card is a pipeline

One thing to keep in mind is that a graphics card, at its core, is largely just a big parallel-processing beast. It wants to do one thing, and do it repeatedly. Think of your graphics card as a water pipe - it takes the water it gets in a fixed path, and does it very well. However, if you want to change that path you need to completely drain the pipe before you reconfigure things, which reduces the amount of water you can push through. With that analogy in mind, here are some sorting criterion from your object that transcend the standard “Sort objects front to back, and back to front for transparency” logic.

Avoid excessive shader changes

After you’ve sorted your polys from front-back, try sorting your object by shader. The vital distinction here is to avoid excessive switches between the programmable pipeline and the fixed function pipeline. A previous engine I worked on had cars that were made up of multiple objects split pretty evenly among shaded and unshaded. By organizing the game to draw all the shaded parts first and then draw all the fixed function stuff, we were able to get about 5 extra frames per second.

Avoid excessive state changes

State changes (such as texture state changes) are processed by the CPU, and not the graphics card. After you’ve sorted your objects by z-val (which you might try doing in the shader when you’re CPU-bound) and vertex shader, try sorting by texture stage state.

Avoid excessive texture changes

Sort your objects by the texture they use, if possible.

Divest your rendering code from your game

At its core, the graphics card just wants to sit there and draw polys. It doesn’t care about your awesome physics, or your lifelike AI, it just wants to make pretty pictures. So try to divest your actual geometry data from your game logic as much as humanly possible. The ideal engine in my mind has the rendering code running on a separate thread, just spinnig in a loop, flinging polys at the graphics card. Consider storing all of your geometry objects in a geometry manager, which handles all of your sorting and communication with the card. Return a pointer to your parent object if you need it, and you’re good to go. More on this in later posts

There’ll be more to come here, but I figure that’s enough for now. Watch for part 2 in the near future.

Leave a Comment :, , more...

The joys of a new job

by Evil Stick Man on Apr.22, 2009, under Randomness, Ravings, Technology

Somewhere along the line I read an article (I think it was on http://www.joelonsoftware.com, but I can’t be sure) that stated that it takes a new programmer anywhere from 6 months to a year to become fully productive and competent in a new environment. In my experience this has been pretty accurate. Mind you I’m not saying that you can expect a new programmer to sit there with his thumb up his ass for 180 days - far from it! It’s just that there’s a certain amount of sluggishness in their work as they flounder in the unfamiliar code, finding their way around much as a new citizen would wander the town without a map. I typically take 4-6 months before I feel competent enough in a system to have strong opinions about its general function. Of course I’ll have made contributions by then - with my past jobs I’ve typically had functional code released within the first two weeks (depending on the company and the release cycle) - but inexperience with the system coupled with a need to discover the lay of the land tend to put a dampening effect on my desire to rock the boat.

My new position has been more of the same, so far, but with this being my fourth company over the past 5 or so years, I’m starting to notice some patterns:

Whether or not you realize it, you are being unconsciously judged by your coworkers based on how long it takes you to pick things up.

This is not something that will necessarily be set in stone for the duration of your employment, but it can color many of your early interactions. Of course you never exactly know how you measure up, not until it’s too late to change anything, but you can gauge your progress based on interactions with coworkers. Call it the “time to guru rapport” measurement - every company has its guru programmers, the guys that have been there since the beginning and know everything there is to know. The sooner you are on a comradely level with a majority of them, the better you are doing.

A coherent, well-documented code architecture is invaluable to ramping up new programmers.

80% of your development time should be spent in design. The architecture of a system is by far the best indicator of an organization’s dedication to code quality. It’s also the first evidence that a new developer will see of the type of coding environment at your company - will that neophyte see a culture that praises re-use and judicious application of patterns in a logical manner? Or will they see five solutions to the same problem written under a looming deadline?

No matter how intuitive something looks, it is not as intuitive as you think it is.

Everyone comes from different perspectives. Coworker A may have 5 years in .NET, coworker B 6 years in LISP, and you’re all writing in FORTRAN. What passes the “5 minute” test with you may not do so for someone used to a different idiom. What’s obvious in functional programming may be anathema in procedural programming (we’ll hold the “FP makes coders better” discussion at another time), and what seems like the simplest thing in the world may be a foreign concept to the most surprising of people.

There’s no such thing as too many comments for a programmer stumbling his way through a new system.

It doesn’t matter how ridiculous your current programmers think comments are. “Preposterous!” they say, “Good code comments itself! Anyone who can’t grasp this is a moron!” The question we need to ask ourselves is that if our organization is spending $50 an hour on a programmer, and one comment that takes thirty seconds to write can reduce their ramp-up time by an hour, how can you not write comments? I have yet to hear a decent argument that convinced me that there is any such thing as too many comments in the code - at the absolute worst, you lose nothing by having them (no matter how unnecessary they seem).

You can tell about a company’s commitment to code by the discussions that take place around its architecture.

If you kick off projects without an architecture meeting, you’re doing things wrong. It may seem like too corporate a model for your company, but keeping the knowledge of the overall program structure in the head of two or three people leads only to labor duplication. Conversely, multiple discussions for a simple service may point to a process that is too top-heavy.

Process is good if it is well thought-out, but process for the sake of process is overkill.

Having a clear sequence of steps from conception to realization is invaluable. On the flip side, though, one page of documentation per line of code is excessive. Similar to the previous item, balance is key in this area.

One can tell if a development shop is a firehouse with the answer to one simple question

“Was my workstation ready to go when I got to my desk?” An oversight such as this says more about the company than any two of these points combined. If your infrastructure group is not on the ball enough to have desks configured for new employees, or if there is not some process built up around onboarding, you’re sending a message that you like to fly by the seat of the pants and get things done quickly as opposed to getting things done well.

Firehouse development, while edge-of-your-seat cool, doesn’t lead to good code.

Every office has a hero - that guy who’s in until 4 AM, on his second pot of coffee working towards a release. Does your environment encourage this style of development? Many people have pointed out the negative correlation between hours worked per week and code quality, but organizations still persist in encouraging long hours in development with shortened schedules. The only time a programmer should be pulling 12 hour days is if something is metaphorically (or sometimes literally, depending on your job description) on fire. Any other time and it’s a failure of management, process, or both.

In any case, these are the things I tend to notice. What do you people look for in your first couple weeks on the job?

2 Comments :, , more...

Coding problems, logic problems, and the joys of job hunting

by Evil Stick Man on Apr.13, 2009, under Randomness, Ravings, Technology

This morning I came up with an answer to a programming/logic question I was given during an interview back in early February. Before going into the rest of it, let me give the problem (and the solution I belatedly derived):

Problem: Given a dimension n, output an array that represents an n x n matrix of numbers beginning at (n*n)-1 and decreasing to 0. The numbers must be output in such a manner as to “wrap around” the resulting matrix, and must be output as a single array. For example, if the function is given the number ‘3′, the output will be as follows:

8, 7, 6, 1, 0, 5, 2, 3, 4

Which is a linear, row-major representation of the matrix

8, 7, 6
1, 0, 5
2, 3, 4

You may not pre-calculate the matrix and index into it, nor may you store an array of results and index them based on the dimension. In other words, you need to develop a function that, when given a position (i, j) and a dimension n, outputs the appropriate value for (i, j).

My solution:

int calc(int i, int j, int n)
{
  if(i == 0 || j == (n-1))
  {
    return ((n*n) - 1 - i - j)
  }
  if(i == (n-1) || j == 0)
  {
    return (((n-1)*(n-1)) - ((n-1) - j) - ((n-1) - i))
  }
  if(n > 2)
  {
    return calc(i-1, j-1, n-2)
  }
  // should never get here
  return 0
}

Explaining the solution:
I don’t claim that this solution is optimal, or that this solution is the “right” solution. All I can say for it is that when I worked out the “wrapped” matrices for several dimensions and then ran those resulting matrices cell-by-cell through the above algorithm the results matched, meaning that if nothing else the function above successfully takes a coordinate input and returns the “wrapped” value. Indeed, if anyone has a better solution to the problem, please let me know!

One of the first things that should be obvious in this solution is the recursion (when n > 2), and I figure that deserves some explanation. If you write out several examples, you start to notice a pattern. For example, take a look at the matrices for n = 3 and n = 5:

            24, 23, 22, 21, 20
8, 7, 6,     9,  8,  7,  6, 19,
1, 0, 5,    10,  1,  0,  5, 18,
2, 3, 4     11,  2,  3,  4, 17,
            12, 13, 14, 15, 16

The larger dimension, n = 5, actually contains the matrix for n = 3 verbatim. In fact, this is true of all odd and all even dimensions - the matrix for n = 6 contains n = 4, and the matrix for n = 7 contains both n = 5 and  n = 3 (I’ll leave it to you to work out these matrices to see that this is true). So the recursion needs to occur on n -2 in each case, to move to the next odd or even dimension as applicable. A dimension of 1 outputs 0 in a single location, and a dimension of 2 outputs the matrix 3, 2, 0, 1 as expected given the above conditions, meaning that the guard against n > 2 is unnecessary, but I felt uncomfortable leaving it out.

Once you figure out the recursion, you then need a means to calculate the “border” values. The two equations you see above are the results of two or three hours of deliberations and testing by hand. There is a division between how you count each edge - the top and rightmost edge use one formula, while the bottom and leftmost edge use another. This is reflected in the two conditions above. (I originally wrote it out as four separate conditions, but condensed it into 2). The conditions check only for the edge cases - when i = 0 (top edge), when j = 0 (left edge), when i = n-1(right edge, or when j = n-1 (bottom edge).

OK, so what’s the big deal?
This problem has been bothering me for almost two months now. It was something I should have been able to do (and, indeed, I was eventually able to do it), and I’m pretty sure that my inability to provide the complete answer during the interview resulted directly in me not getting the job (especially considering I nailed the technical portion of the interview). More than that, though, it exposed a worry that I had that I was actually losing intelligence as I age. I felt strongly that I should have been able to solve it sooner, and indeed some anxiety as a result of this internal assertion probably led to my inability to provide the complete solution during the interview. That and the general stress of interviewing with a company whose phone screen was so technically intense that anyone who could make it into their offices for the in-person pretty much officially Knows Their Stuff (or so I’m told).

This problem bugged me so much that I spent time solving it while on vacation, sitting around with graph paper and a pen while my family socialized around me. For some reason I let this one problem, this one failed interview get to me in a very primal way, and I’ve been wracking my brains trying to figure out why I just couldn’t get over it. I tied my ability to solve problems (which, as a programmer, is pretty much my bread and butter) to my self-worth, and became slightly obsessed as a result. Indeed the obsession is still fading - half the reason for this post is the hope that someone out there in internet-land has a better solution, or at the very least can confirm my efforts!

One of the biggest problems any coder looking for work has is in proving themselves. Companies are looking for programmers who know what they claim to know, who aren’t bullshitting the interviewer into a job they’ll be forced out of in 6 months due to incompetence. Furthermore, many companies ascribe to the philosophy that a great programmer is many times more productive than a mediocre programmer (a philosophy I don’t necessarily disagree with, but more on that later) and thus orient their screening process towards ferretting out these artisans of ascii. This basically boils out into one of three interview patterns:

1 - The candidate is subjected to several intense hours of technical interviews covering topics ranging from mathematics to algorithms to data design
2 - The candidate is subjected to a programming or logic test designed to be unique enough so as not to be easily searchable on the web and to test the candidate’s ability to both work under pressure and produce excellent code
3 - The candidate is subjected to both pattern 1 and pattern 2.

The problem faced by the companies, of course, is that there are a lot of people out there who would attempt to subvert the interview process by learning a few buzzwords and faking their way through the process. As a result, these charlatans have caused the rest of us decent people trouble and made companies highly selective.

The problem faced by the job candidate, though, is more insidious. First off, some people (I am not one of them, but I can understand the mindset) take offense at having to take coding tests, thinking that their years of experience and stellar references should be adequate proof of their ability. Second, these kinds of problems can have a profound effect on the job candidate, and can affect the course of the interview. Sure, throwing a curve ball at someone is a good opportunity to judge how they react to adverse situations, but with programmers (and problem-solvers in general), if they get this kind of thing wrong, or even worse if they can’t solve it at that time, this problem will work itself into their psyche, DEMANDING that it be solved, and will take a measurable toll on the individual as they obsess about the problem while trying to continue to answer the interviewer’s questions as they move on to other areas, and in the end this kind of problem colors the company’s perception of the candidate in a manner disproportionate to the applicability of the skills involved to the job being applied for. Third, these kinds of problems have yet to demonstrably select the “best” candidate for the job. The skills used to solve logic puzzles are not necessarily those used to solve programming problems, and 95% of any programmer’s job is not going to require thought of that order at all. Take Microsoft, for example - they are notorious for using these kinds of interviewing tactics and as a result they do write some very good code, but they also make some of the most horrific blunders in code and application design known to man.

Probably the biggest problem with this kind of interviewing pattern, though, is that every worthwhile company has their own problem or three, and every worthwhile company thinks that pushing their candidates through the technical interview/logic/coding problem wringer results in getting the best of the best. Each and every company prides themselves on having the best and the brightest, evidenced by how well these individuals did on their tests. This calls into view a white lie that recruiters tell themselves - that there are more bright people than there are job openings. Any person of moderate intelligence can do the math and realize that the number of positions governed by hiring managers employing these tactics exceeds the statistically probable numbers of “best and brightest”. Not everyone gets to be an astronaut, and there are a lot more fry cooks than there are rocket scientists. But most importantly of all, these managers and these companies ignore one of the brutal truths of programming in general: 95% of coding is performing the equivalent of menial labor, the kind of tasks that you could train a monkey to perform if you had the time. No matter how intelligent your programmers are, their intelligence is largely wasted if all they do is implement specs. For every person developing an algorithm for an autonomous combat drone, there are dozens squirreled away in a cube farm updating corporate software for the next big product switch, changing enumerations occasionally throughout the code, or hunting down an elusive typo.

As a job seeker, there’s not much we can do except rail against the system quietly after we’ve already gone through the wringer and found that next employer. The only means to change this process come when you have power that is typically granted to people that don’t have Computer Science anywhere in their credentials, and even then a significant portion of technical manners swear by the “beat down” method of interviewing. For myself, all I can do is know that I am competent, that given enough time I can perform 95% of coding tasks in the business world, and that no matter what my inability to solve a problem is much more easily explained by stress than by a phantom loss of intelligence.

Note that this is not an indictment of my current employer. Yes, I had to take a coding test when I was interviewing with them, but they also had one of the least BS-laden interview processes I’ve ever been through. I only wish the rest of the companies I interviewed with had been as straightforward. All I’m really doing here is releasing some pent-up aggression from my job hunt in the form of an internet rant. Not really effective, but it does help me feel better :) Anyway, I’m done with the incoherent ramble.

52 Comments :, , , , more...