Rules, DQ, and stuff...
Hi.
Last pyweek, there were some disagreements and arguments about how games are (dis)qualified and how they should be, and about the clarity of some rules. You can see some of that in these posts:
- PyWeek judging idea
- ADMIN: judging irregularities
- library before challenge
- ADMIN: disqualifiying for using certain libraries
This is a summary of some of the issues and complaints that were presented:
- One or two users tried to abuse the voting system (voting 1/1/1 to everyone else). No serious problem because they were caught.
- Some people voted low ratings for some games instead of DNW (Did Not Work).
- The criteria for voting DNW are not clear (should it work out of the box? is it ok/fair to vote a game as working even if it needs a small patch or a change to a filename's case?). This led some people to complain about getting DNW even though a trivial patch/fix had been posted.
- There were some obvious DQs (Disqualify), like people submitting a game without source code. They got only about 3-5% DQ votes.
- There were complaints of unjustified DQs (like DQs for not following the theme on games where the theme was there but not blatantly obvious).
- There were many gray cases where it was unclear if it was OK to DQ someone (especially related to the "library released 1 month before" rule).
- Many people participated in the contest knowing that they were breaking some rule (especially the library rule, but sometimes the theme rule). The argument is "hey, this is for fun, and I am not very far from the spirit of the compo". Or "I just use pyweek as a testbed for my library".
- Many other people who constrained themselves to the rules felt it was unfair to be competing alongside people who break the rules and have no chance of getting a DQ.
OK, so what do we do about this? I think all of these are symptoms of a few problems:
1. Some rules are not clear enough.
2. Some people enter pyweek for personal fun, and take the rules only as fuzzy guidelines, not as something to enforce.
3. Some people enter pyweek thinking of it as a place where the rules set interesting limits, and think that those rules should be enforced.
The solution to (1) is to clarify the rules. That is more or less easy (I expect comments below saying "I have had problems applying rule 'foo'").
The solution to (2) and (3) is not that easy. We are talking about large groups of people on each side (and in the middle :) ). Probably both groups are right; both have valid viewpoints. We want all of those people in pyweek, because their position on this has no effect on their ability to contribute great games.
An easy solution would be to say "OK, we split pyweek into two compos". But that would create a split, and I think that would be sad because I find pyweek to be a nice and healthy community.
So, this is a proposal I have been thinking about: the pyweek competition stays as one, but with two "categories": strict and free-style. In the entry settings you can decide which category you wish to enter (and perhaps it could be changed before the end of the contest).
The free-style category has the same rules, but the judges/participants rate with open criteria. The disqualify option could perhaps be removed for these people (or left in... anyway it doesn't change much given the DQ rates seen in previous pyweeks).
The users participating with strict entries agree to play "by the book": trying not to step into black or gray areas, asking before doing ("Can I use library foo?"... "Is it ok to include a 30000-line C extension?"), and rating other strict entries conscientiously, checking that the judged entries have followed the rules strictly. Perhaps a small set of people can be appointed to check requests for disqualification and act as judges in the difficult cases (as proposed by adam in one of the comments to the "judging idea" post); these people would have the authority to actually disqualify games (if somebody else requested a DQ for a valid reason).
Other than those flags, and the different commitments for different kinds of users, the competition would be mostly the same. The entries would be rated together (or, well, separated into solo and team), with an extra column indicating strict or not.
What do you think? Comments?
Actually, I always thought that the Pyweek rules are quite clear. If an entry doesn't work out of the box you are free to rate it DNW. It's up to each rater's politeness, patience and (lack of) time to apply any patches or little fixes. Since Pygame and Pyglet were around looong before the 1-month cutoff for the challenge, using the updated versions was perfectly ok. The 1-month rule applies to libraries that are not so widely used and written from scratch. Interpreting and including the theme in the game can be done just "cosmetically", so DQing participants for not following the theme is kind of hard. And AFAIK the few games that did break that rule were rather bad, anyway..
I guess we can take DQ out of the rating system because no one would ever be DQed this way. Instead, we should use this message board to point out who broke which rule and ask them to fix it first. If it's something really sinister and evil they may be DQ'd right away.. I can't think of an example, though. ^^
Now, I don't want Pyweek to be a competition where everyone uses his personal codebase and his very own library (that nobody else can use) and nobody cares about the theme and deadline. We should make sure that the few rules we have are followed without being too harsh. So if somebody says "Hey, I'll use my own library but I didn't finish it in time. I just want to join Pyweek for fun." then let him participate but exclude him from the ratings.
I like PyWeek's informality. And as participants we should be responsible for making sure our entry is a fair one, that it follows the theme and that it uses only libraries within the spirit of the competition. If others think we were unfair in these decisions then they can say so by marking DQ. The "problem" (and I quote it because, to be honest, I don't think it truly is one) is surely that people as judges are too permissive in their interpretation of the rules. Quite frankly I am happy with this.
PyWeek is fun, it is challenging (because writing a fun game is not easy), it is inspiring, it produces lots of Python game developers who go and do other things, it produces lots of Python code related to game programming, it produces a bucket load of game ideas and most importantly it develops a community. So what we have are a few superficial quirks of the judging, so what? It doesn't change the competition. The theme is there to inspire, the rules are there to guide.
@dmoisset: you put a lot of thought into your proposal, but I would be -0 on splitting pyweek entries into free-style/strict
imo the pyweek rules are clear and we shouldn't make pyweek more complex. it's fun how it is.
the one thing i would be +1 on is having some sort of "master reviewer" who does predefined sanity checks to catch cheaters and obvious DQs (no source) before the ratings go online.
Comments
I think, though, that the DQ's should remain, and that adding judges would be a very good idea. They could basically be contestants themselves: if someone notes something about a game as DQ-worthy, they evaluate it and, from an impartial view, give a DQ or not.
That way you don't need some special people outside of the competition to do the judging. I mean, come on, I don't think many people would abuse that position if given it.
If we do away with DQ's altogether, then what is the point of the 1-month before rule on libs? Or the "you must use this theme in an important way" rule?
I think they add flavor to the competition. Even though I have disliked the themes (personally) as often as I have liked them, they have always added challenge and focus to the project for me.
Since this is a competition, and if you wish it to remain one, you must have rules that are enforced. DQ's are the only way right now to do this.
richard on 2008/07/30 05:41:
Or I could just do away with disqualification altogether? I started this comp to have fun, and want it to be fun, rather than playing-for-sheep-stations serious. This is all getting a bit serious. Maybe I'm mellowing or something, but I think that maybe it's OK if a really cool entry that only works on 50% of computers wins the comp :)
I really don't think I want to introduce the complexity of strict/freestyle.