PyWeek - Anyone notice an oxymoron about the ratings?

It says to rate the games on their own merit, but the ratings are all based on the average of that PyWeek's games: "Below Average, About Average" etc.

hidas on 2011/09/28 21:10


Comments:

I was thinking the same thing. How can you say a game is "average" unless you compare it with other games? Perhaps it's the "average" of games not in the compo?
Yep I agree. I think we should just go to numerical ratings (1, 2, 3, 4, 5), since that's how they're tallied anyway. I guess it should just be pointed out that 1 = worst and 5 = best.

I also don't like the fact that "About Average" and "Above Average" look almost exactly the same from a quick glance.
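To make the "that's how they're tallied anyway" point concrete, here is a minimal sketch of how the labels might map to numbers. The label names are the ones quoted in this thread; the 1-to-5 mapping, the function name, and the sample ratings are assumptions for illustration, not the site's actual code.

```python
from statistics import mean

# Assumed mapping: the labels appear in this thread, but the exact
# numbers the site uses internally are a guess (1 = worst, 5 = best).
LABEL_SCORES = {
    "Not at all": 1,
    "Below Average": 2,
    "About Average": 3,
    "Above Average": 4,
    "Exceptional": 5,
}


def average_score(labels):
    """Return the mean numeric score for a list of rating labels."""
    return mean(LABEL_SCORES[label] for label in labels)


# Made-up ratings for one hypothetical entry's "fun" category.
fun_ratings = ["About Average", "Above Average", "Exceptional", "About Average"]
print(average_score(fun_ratings))
```

If the tally really works like this, showing the numbers directly would lose nothing, as long as the form states which end of the scale is best.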
I found it hard to give ratings for games also. In the end, when reviewing a game, I gave all ratings 'average' and then bumped it up or down depending on the quality.

The rating system is based on Ludum Dare's, but maybe we could try an alternative like YouTube's thumbs up/down, which they moved to when they found 5-star ratings weren't very helpful: http://youtube-global.blogspot.com/2009/09/five-stars-dominate-ratings.html
In the ratings distribution in your link, it looks like about 90% of ratings were 5 stars, and about 8% were 1 star, with 2, 3, and 4 almost never being chosen. If PyWeek was seeing a similar distribution, then that would be a strong case for moving to a thumbs-up/thumbs-down system. I doubt that's the case, but it wouldn't be that hard for someone to scrape the data and make a graph or two.

Ideally you'd want as few rating categories as possible without losing the information in the ratings. Rating out of 10 stars is almost never justified, and in the YouTube example, there's clearly very little difference between 2 stars and 4 stars. But I think most PyWeek judges are discriminating enough to justify at least 3 categories, and actually 5 feels about right to me. But I'd be interested to see the distribution.
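For whoever does get hold of the data, the tallying part could look roughly like the sketch below. It assumes the ratings are already available as a plain list of integers from 1 to 5; the scraping itself is not shown, and the function names and sample scores are invented for illustration, not real PyWeek results.

```python
from collections import Counter


def rating_distribution(scores, scale=range(1, 6)):
    """Return {score: fraction of all ratings} for every score on the scale."""
    counts = Counter(scores)
    total = len(scores)
    return {s: (counts[s] / total if total else 0.0) for s in scale}


def text_histogram(distribution, width=40):
    """Render the distribution as one bar of '#' characters per score."""
    lines = []
    for score, fraction in distribution.items():
        bar = "#" * round(fraction * width)
        lines.append(f"{score} | {bar} {fraction:.0%}")
    return "\n".join(lines)


# Invented sample scores, NOT real PyWeek data.
sample_scores = [3, 4, 3, 2, 5, 3, 4, 4, 1, 3]
print(text_histogram(rating_distribution(sample_scores)))
```

A distribution piled up at 1 and 5 would support thumbs up/down; one spread across the middle would support keeping 5 ranks.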
5 ranks feels about right to me, too. But I support the idea of just numbering them instead of giving them descriptions. I'm always reluctant to give a "Not at all" rating to anything, because it's rare for an entry to have no shred of merit whatsoever.

Also some category/rank combinations are non-sequiturs. "Production: Not at all." What???
I would say that part of the reason behind the YouTube results is that many people would not bother to rate unless they feel very strongly in either direction.
I just take average to mean "OK, but could be better". Below average = big problems and above average = pretty good.

For fun, average means it can hold my attention for more than 5 minutes.
 
For production, it means the game doesn't look amateurish, i.e. it's intuitive how to play, there are no awkward controls/UI, the use of art/sound is appropriate, and the game generally feels complete. I'll sometimes let poor art slide if the other aspects make up for it.
 
For innovation, it means there is some originality to the game and it fits the theme well.

I'd hate to move to a thumbs up/down system unless we broke up the categories more. Most of the ratings I give fall into the middle 3 categories, with a few exceptionals/not at alls.

"Production: Not at all" is a bit pointless, yeah, I was thinking about that too. I think that would mean you are basically submitting someone else's work, but without breaking the rules of the competition...
Also: I don't like the idea of a numerical score, since people could get confused about which one is best. If anything, we should just change the terminology.

Poor/Adequate/Good/Exceptional would do.