How Should We Judge a Game Jam?
This year I was invited to my first Global Game Jam event. It was sponsored by the Northern College of Leeuwarden (pronounced lei-warden, to rhyme with garden) in the Netherlands. I went not as a jammer, but as a judge. In all, there were four judges with a good mix of talents: an audio engineer, an artist, an executive, and me on the design side. No programmers, although I was a software engineer for 12 years and still have a pretty good sense of what's involved. The organizers left it to us to decide what to look for and how to vote, and I was startled to discover how varied our opinions were.
Our jam consisted of six teams, five building video games and one making a board game. There were some strange asymmetries among our dev teams. We had teams in which people could make 3D models but nobody could draw in 2D, and vice versa. We had teams with no audio person at all (a constant problem with small groups). Our jammers had access to a great motion capture studio, but in the end none of them used the results because there wasn't time.
For my part, I hung around the workroom, chatted with the developers (briefly; they were very busy!), and of course played all the games near the end of the jam to assess them. Then the other three judges and I retired to our chambers to debate the results. It took longer than I expected to choose a winner and a runner-up, and it got me thinking about how we judge games: completed commercial games, student projects, indie festival games, and game jam games.
Let's go back to the beginning for a minute. Just about ten years ago, Sean Barrett, Chris Hecker and their friends ran the very first (well, officially the 0th) Indie Game Jam. They started with a uniform code base and programmed for four days to make a bunch of crazy games that explored the question, "What can we do with 100,000 sprites?" I covered this in an earlier Designer's Notebook column.
In 2003 and 2004 they did two more Indie Game Jams, and then in 2009 Susan Gold borrowed the idea and took it worldwide as an IGDA project, with help from Gorm Lai and Ian Schreiber. Since then it has been a huge success, with over 200 jams and thousands of participants.
There are some pretty significant differences between the original Indie Game Jams and the Global Game Jam. The Indie Game Jam was invitation-only, a private event for friends of the founders. The Indie Jam lasted four days, while the Global Game Jam lasts 48 hours. Because we now have so many ways to make games quickly (Flash, GameMaker, etc.) the Global Game Jam has a common theme rather than a common code base. Perhaps more significantly, the first game jam was in no way a competition. The jammers shared everything, including their rather limited art resources.
I like the idea of an entirely collaborative jam, but I can't deny that the element of competition adds some motivation to the process. The Global Game Jam's own web site says that competition is fine as long as it's friendly and mature. There was no problem about that in ours; everybody was great. (However, I hope the competitions remain optional and at the local level only. A worldwide game competition with big prizes would quickly get too professional, which defeats the purpose and shuts out student and small-scale jammers.) The tricky part was figuring out who should win. All the judges had distinctly different preferences.
One thing we all did agree on was that it's simply not fair to put board games and video games in the same competition. Writing and testing code is time-consuming. Changing the rules of a board game consists of making a note on a piece of paper. I'm not saying it's easy to make a good board game, but it's unquestionably faster to build one, and that means you can spend a lot more time polishing the design.
Our board game was called Gautar Hero (!) and named after the ancient kingdom of Gautar in Sweden. It was a clever asymmetric four-player game, set in a Viking town where a man-eating snake is on the loose. All the judges thought it was a very good game, and we gave it a special mention during the awards ceremony. (You can download the rules, and the graphics for the board and cards, at the Global Game Jam Leeuwarden web site.)
Reviewing the video games, even though I'm a designer, I gave more weight to completeness than to design innovation. I have seen so many student projects (and commercial projects, too, for that matter) fail through overambition and inability to scope correctly that I give major props to anybody who can build a complete and functional game on time. I've also worked with people who were all big creative talk and never got anything done because they couldn't commit to a decision. So to me, a small finished game is better than a big broken one. Of course, a game still has to be reasonably imaginative; even if it were finished, I wouldn't have rated a Tetris clone very highly. Two of our games included the concept of health points but had no health bar, so the player couldn't tell how much damage he was taking. The developers knew they needed one; they just didn't have time to implement it. I considered this a demerit, but the other judges didn't.
I think I was the only judge who took programming difficulty into account. One of our games required some tricky coding, but the other judges either didn't notice or didn't consider that an important criterion. Our artist judge was more concerned about design innovation than he was about artistic quality, which surprised me. Our audio engineer judge rated one game strongest that I rated weakest. We also had very different experiences playing the games. Some judges observed gameplay problems that never occurred while I was playing. Games that I thought were fun, others thought were boring, and vice versa. Each judge played them by himself rather than with the other judges watching, which may have been a mistake.
I'm not complaining about the outcome; I'm happy to stand behind our collective decision, and so are the other judges. Our winning game, which was entitled "We couldn't think of a funny title so we call it 'Walkabout' or 'The Great Balance.' Pick one," included artistic, audio and gameplay innovations, and genuinely deserved to win.
It seems to me that game judges have four major areas for discussion: programming, graphics, sound, and gameplay. You can judge each of these on three metrics: innovation, quality, and completeness. You might also consider a fourth, size, but in the context of a game jam I don't think there's much point. Other things being equal, a bigger game requires more work and deserves more kudos than a smaller one, but other things never really are equal, and it would be a shame for participants to sacrifice anything else just for size.
Programming innovation will be difficult to determine because the judges are unlikely to see the code. If the developers include some innovative but entirely hidden algorithm (in a combat model, for example), the judges will never notice unless the programmers tell them. However, if the game tries to do something more difficult than most of the other games it's competing with—it includes AI, for example, or more realistic physics—then I think that qualifies as innovation in the context of a 48-hour jam. Quality means no bugs or crashes, obviously. Completeness means that the whole intended feature set is in and the game plays from a beginning to an end (even if the end comes soon—most of our games only had one level).
Graphics innovation is easy to spot: are the images and animation different from the things we've seen a thousand times before? Quality likewise is fairly straightforward by conventional video game standards, although if a game goes for an avant-garde or retro look the judges will start debating their aesthetic preferences, which could get tricky. Completeness means that all the intended graphical elements are there, including user interface elements (such as health bars!).
Sound innovation is pretty tough because few dev teams can afford to devote much time to sound unless they're making a strongly sound-themed game. In a game jam I think you get points for recording your own effects and composing your own music, even if they aren't especially innovative. As for quality, well, if you do create your own effects and music they need to sound decent, and of course effects should play when they're supposed to and harmonize with the theme of the game. As with graphics, completeness again means that all the sounds that should be in are in.
Gameplay innovation is obviously a high priority, but it's hard to do in a game jam because time is so short, and judges are likely to disagree about whether a game is innovative or not. I've seen a lot of games in 42 years of playing them. Quality means "is it fun?" and is another debatable point, since different players like different kinds of things; but this also includes the ease of using the controls. Completeness refers to the number of play features included—several of our games started with plans for more different kinds of challenges and actions than they could actually get into the game.
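For judges who want to pin this down before the jam starts, the four areas and three metrics above can be written out as a simple score sheet. What follows is only a rough sketch in Python, not something we used in Leeuwarden: the 0-to-5 marking scale, the equal default weights, and every name in it (score_sheet, panel_result, uniform_sheet) are my own assumptions for illustration.

```python
from statistics import mean

# The four areas and three metrics come from the discussion above.
AREAS = ("programming", "graphics", "sound", "gameplay")
METRICS = ("innovation", "quality", "completeness")

# Hypothetical weights: equal by default. A panel that values innovation
# over completeness would agree on different numbers before the jam.
DEFAULT_WEIGHTS = {"innovation": 1.0, "quality": 1.0, "completeness": 1.0}


def uniform_sheet(mark):
    """Build one judge's sheet with the same mark (assumed 0-5) in every cell."""
    return {area: {metric: mark for metric in METRICS} for area in AREAS}


def score_sheet(sheet, weights=DEFAULT_WEIGHTS):
    """Weighted average of one judge's sheet, on the same scale as the marks."""
    total_weight = sum(weights[m] for m in METRICS) * len(AREAS)
    weighted = sum(sheet[a][m] * weights[m] for a in AREAS for m in METRICS)
    return weighted / total_weight


def panel_result(sheets, weights=DEFAULT_WEIGHTS):
    """Return (mean score, spread) for one game across all judges' sheets.

    The spread (highest minus lowest judge score) flags games the panel
    disagrees about and should discuss before voting.
    """
    scores = [score_sheet(sheet, weights) for sheet in sheets]
    return mean(scores), max(scores) - min(scores)
```

To use it, each judge fills in one sheet per game, for instance starting from uniform_sheet(3) and adjusting individual cells such as sheet["graphics"]["completeness"] for a missing health bar, and the panel then calls panel_result on the list of sheets. A large spread doesn't settle anything by itself; it just tells you which games to argue about first.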
Completeness sort of depends on another variable, ambition. If you say you're going to make something complicated and you only get a quarter of it done, is that better than saying you're going to make something simple and getting it all done? I discussed this with some colleagues on Google+, and several felt that, at a game jam, innovation is by far the most important factor and completeness shouldn't come into it much. Another thought that game jams are effectively rapid prototyping, and the result should be judged like a prototype, for its ability to give a feel of the real product.
It's possible that I allowed myself to be influenced by my other work. As a freelance professor I spend quite a lot of time looking at student projects, and in a student project, completeness is critical. A student needs to build a portfolio of finished work. Publishers and developers hire new graduates for their ability to get things done, not for their imagination. It's painful, but it's true. But a game jam changes these rules: in 48 hours, we can't reasonably expect too much, and it's a good opportunity to go a little crazy.
This raises the question of how many different kinds of events there are. From shortest to longest development cycles, I've judged 2-, 4-, and 8-hour game design (no coding) workshops; a 48-hour game jam; three five-day student development festivals; numerous multi-week student class projects; and some professional, commercial games for a major award. The lighthearted awards are definitely the most fun and require the least work (from the judges).
In the end we were helped by the small number of teams; we didn't have many games to argue about. If there had been 20 games and several that were really close together, it might have been difficult to arrive at a conclusion that we could all feel confident about. It was also good that our judges had diverse backgrounds. But if I'm invited to do this again, I'd like to sit down with the other judges in advance to discuss our expectations about the judging criteria. We'll probably get it done faster if we know in advance what we want to reward. And of course, if any of the developers are taking the awards seriously, they would like to know too.