Barry here from the Bridgespan team. Let me just say that the dialogue developing here is outstanding! A real testament to your commitment to Wikimedia's community and work. Thanks!
On the question of ratings, this is a really productive area, and I agree there are good ways to "crowdsource" information on quality that would help drive higher quality, both in the reliability and depth of the content provided and in the readability of articles. I agree with the sentiment that a simple 1-10 rating scale might oversimplify, but I also think that it isn't necessary or realistic to create something overly complex. I think the right answer lies in a trial-and-error process that starts to use reader feedback (the crowd) and has a team of contributors do sample-based analysis to see how the ratings correlate with expert judgment. I'd imagine that it wouldn't take long (given the huge volume of readership) to find a simple rating tool that provides great info about article reliability, depth and readability. One might marry this with the work ongoing within the community (sorry, couldn't quickly locate the link) to rate articles using more of an expert approach (which is not scalable).
I also think some simple rating tools would provide great data to start recognizing contributions of different sorts. Reliability and depth speak to the work of contributors who write; readability speaks to the work of those who edit, maintain and monitor. One might create some recognition categories that draw from the data captured. For example (purely an illustration), software could be designed to give contributors a tag for an article where they wrote a lot of content or where they made a lot of small edits. As the ratings are generated for each article, they could become part of a portfolio that provides for recognition of the contributor's work. (This would have positive synergies with the community health work, as it would reward positive contributions more clearly.)
Thanks again for engaging so effectively on this issue. --BarryN 17:56, 8 December 2009 (UTC)
I think that making it too subjective will just make it gamey. You'll see different political ideologies and different religions using it as a way to express disapproval over perceived "bias". That's if they haven't already gotten there first to give the article a ten and use it as an excuse to prevent the article from improving: "My Criticism of Barack Obama was rated a ten, so you have no right to start changing it."
I think this is a bad idea if there aren't some checks and balances.
I agree that any kind of rating system will be "gamey". If introduced, the emphasis should be on reader feedback (readers outnumber editors by a huge margin), not on ratings by users, but even that will not preclude gamesmanship. A very likely scenario is that articles that get good ratings will attract attention from editors, with deleterious consequences. - Brya 06:19, 10 December 2009 (UTC)