Quality of Articles and Quality of Sources


Thread summary

Starting point

MissionInn.Jim suggested users could rate articles (more weight to recent ratings), and that a Wikibibliography of sources (also rateable) might be valuable. Articles could then be rated based on the quality of sources used. Similarly editors could be rated too, based on quality of edits.
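One way to read "more weight to recent ratings" is sketched below. This is only an illustration, not something specified in the thread: the exponential decay, the `half_life_days` parameter, and the names `Rating` and `weighted_article_score` are all assumptions. The 1 to 10 scale comes from the original proposal.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Rating:
    score: int           # reader's rating on the proposed 1 to 10 scale
    submitted: datetime  # when the rating was given


def weighted_article_score(ratings, now, half_life_days=30.0):
    """Return the recency-weighted mean score, or None if there are no ratings.

    Assumption: a rating's weight halves every `half_life_days` days.
    The thread only says newer ratings should count for more.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for rating in ratings:
        if not 1 <= rating.score <= 10:
            raise ValueError("score must be between 1 and 10")
        # Ratings dated in the future are treated as brand new.
        age_days = max((now - rating.submitted).total_seconds() / 86400.0, 0.0)
        weight = 0.5 ** (age_days / half_life_days)
        weighted_sum += weight * rating.score
        total_weight += weight
    if total_weight == 0.0:
        return None
    return weighted_sum / total_weight
```

A shorter half-life makes the score react faster after a major edit, which bears on the concern Piotrus raises below, at the cost of discarding older feedback sooner.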

General discussion

FT2 agreed with rating of articles by users, but disagreed with rating editors due to its incendiary potential and the pressure to "game" the system (to make oneself and friends look good or, worse, to make opponents look bad). He felt a sourcing index might have issues too: sources contain both good and bad material, making generalization hard; cites can be gamed like nothing else; and overall it would be a lot of work for questionable hard benefit.

(Slrubenstein states he "agrees completely with FT2 here" but it's not entirely clear what this relates to, and it might relate to the distinctions noted below on "trusted/senior editors")

MissionInn.Jim suggested editor rating issues could perhaps be mitigated, and a catalog of rated sources "could be a challenge, but I think it is worth exploring".

Article rating, editor rating, and experts

Slrubenstein notes that our consumers are also our producers, meaning we are not simply "providing a service" as such. Users who visit to read an article may not know enough to assess it for quality. His main point is:

"One index of Wikipedia's poor standing is the number of university professors who discourage students from using Wikipedia as a source. I think one reason why Wikipedia has quality problems is because too few of our editors are experts (e.g. university professors) on the relevant topics. When more professors are editors, more professors will judge article content highly and encourage their students to use it".

Slrubenstein noted that while more professors (etc.) are contributing, the rise in non-experts/non-academics is much faster. We need (he feels) to get more experts on board to balance the community and improve our ability to rate articles to a high standard.

Piotrus stated that users/editors rating articles is good, but that care is needed over what to do with ratings after the page is edited, especially after "major edits".

BarryN (Bridgespan) stated that rating content was "a really productive area" and agreed there were good ways to crowdsource suitable quality information. One approach would be to obtain simpler feedback and have a team correlate the feedback received with expert assessments, which would (probably) quickly allow a "simple content rating tool" to be set up which would partly compensate for the lower proportion of experts.

He also thought that such a tool might provide a basis for user rating, basing user ratings upon the changes in quality in each of their many edits in some manner; "As the ratings are generated for each article, they could become part of a portfolio that provides for recognition of the contributor's work. [T]his would have positive synergies with the community health work as it would reward positive contributions more clearly".

Randomran cautioned that "making it too subjective will just make it gamey. You'll see different political ideologies, different religions, using it as a way to express disapproval over the perceived "bias". ... that's if they haven't already gotten there first to give the article a ten, and use it as an excuse to prevent the article from improving. ("My 'Criticism of Barack Obama' [article] was rated a ten, so you have no right to start changing it")." He feels rating would be "a bad idea if there aren't some checks and balances". Woodwalker felt that feedback on different areas would allow the POV data to be separated from other signals.

Brya also agreed that any kind of rating system would be "gamey". "[T]he emphasis should be on reader feedback (readers outnumber editors by a huge margin), not on ratings by users, but... A very likely scenario is that articles that get good ratings will attract attention from editors, with deleterious consequences."

Editor rating and trusted/senior editors

MissionInn.Jim asked FT2 how disagreement with editor rating was consistent with the idea (elsewhere) of recognizing trusted/senior editors ("How would you arrive at trusted / high quality users if they were not rated in some manner?").

FT2 clarified that rating editors automatically or via a formal schema would be a target for gaming. But a "trusted/senior user" system would be just one level that's granted or not:

"Users aren't being 'rated' [in that proposal]. It's a means of recognition of trust. A user who "sometimes" edit wars but "mostly" edits well, or "usually" adds cites but "a few times" has acted improperly in content work per consensus, doesn't get a "slightly lower" rating. They get no "trusted content editor" standing at all".

Sources, cites, and trust

Woodwalker states that "The poor state of the verifiability principle is probably the main reason why Wikipedia isn't seen as a trustworthy source. The problem is not the quantity of sources, but the quality of sources and [their] balance".

A couple of ideas to help manage quality. (Sorry if these ideas have been suggested already. It is going to take me a while to review all the quality info already developed, but I wanted to put these ideas down in the meantime.) 1) Rather than trying to get a consensus of users to rate the quality of articles, allow every person that reads an article to rate its quality on a scale of 1 to 10, then average the responses. Since articles change over time, newer ratings would be weighted higher than older ratings. 2) Build a Wikibibliography where each source in the bibliography is assigned a quality rating (possibly in the manner described above). The quality of the article would be rated based on the quality of its sources. Articles in Wikisource could be part of the bibliography.

MissionInn.Jim 15:53, 26 November 2009 (UTC)

MissionInn.Jim‎

3) Wikipedia contributors could also be rated, using a method similar to item #1 above, based on the quality of their contributions. A contributor with a high enough score would be allowed to assign a quality rating to individual articles.

MissionInn.Jim‎

Building in a "rate this article" feature is essential. Flagged revisions has something like this, I think, but it's capable of being a separate tool as well. Disagree with rating editors. Although in an ideal world it would help to do this, in the real world it's incendiary and adds a pressure to "game" - either to improve oneself and friends or (worryingly) to discredit "opponents". More harm than good.

A recognized sourcing index might be an interesting idea, but disagree. The reasons are interesting though.

Outside scientific literature, many sources will contain both good quality and poor quality content. Generalization's hard.
A huge part of quality depends on the bias and writing of the article. A common feature in edit wars is to stuff articles full of a dozen cites to "prove a point". Dangerous to then rate articles purely on the repute of sources for the cites they contain. Cites can be gamed like nothing else.
A lot of work for questionable hard benefit to content.

In a way it's conceptually nice but in practice probably a non-starter.

FT2 (Talk | email)

Some of the issues you raise regarding rating editors could be mitigated. If each user is allowed to rate any other user only once, it would be more difficult to improve or discredit other users, without creating many accounts. An editor would not receive a rating until they have x number of ratings from unique users. The scoring could be dropped completely, and only allow the ability to indicate if you believe someone is a good editor. The only way to give someone a negative is not to rate the person, or withdraw your rating, if that was an option.

I can see where building a catalogue of rated sources could be a challenge, but I think it is worth exploring.

MissionInn.Jim‎
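The score-free variant described in the post above (one endorsement per rater per editor, no standing shown until enough distinct users have endorsed, withdrawal as the only "negative") can be sketched roughly as follows. The class name `EndorsementBook`, its methods, and the default threshold of 5 are hypothetical; the post leaves "x number of ratings" unspecified.

```python
class EndorsementBook:
    """Tracks which users have endorsed which editors.

    Assumption: `min_endorsers` stands in for the unspecified
    "x number of ratings from unique users" in the proposal.
    """

    def __init__(self, min_endorsers=5):
        self.min_endorsers = min_endorsers
        self._endorsers = {}  # editor name -> set of endorsing user names

    def endorse(self, rater, editor):
        # A set ensures each rater counts at most once per editor.
        self._endorsers.setdefault(editor, set()).add(rater)

    def withdraw(self, rater, editor):
        # Withdrawing is the only way to register disapproval.
        self._endorsers.get(editor, set()).discard(rater)

    def standing(self, editor):
        """Return the endorsement count, or None while below the threshold."""
        count = len(self._endorsers.get(editor, ()))
        return count if count >= self.min_endorsers else None
```

This sketch does nothing about one person operating many accounts, which the post itself identifies as the remaining way to game such a scheme.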

FT2 - It would seem to me that your discussion about the benefits of having "trusted / high quality" user recognition is an argument in favor of rating users. How would you arrive at trusted / high quality users if they were not rated in some manner?

MissionInn.Jim‎

I wouldn't have a "bare numbers" rating system, like "how many users like/don't like this person". I would be wary of trying to deduce automatically from their editing the quality of their work. I would expect a formal rating system would be a target for gaming.

The approach in the other thread is different. It assumes one "level" that's granted or not, rather than a "rating system", and relies on review and discussion of their editing conduct not automation (possibly with weight to other trusted users per Piotrus).

In that approach (crucially) users aren't being "rated". It's a means of recognition of trust. A user who "sometimes" edit wars but "mostly" edits well, or "usually" adds cites but "a few times" has acted improperly in content work per consensus, doesn't get a "slightly lower" rating. They get no "trusted content editor" standing at all. Not till the community considers their content work and interactions on content are consistently appropriate and consistently of a reasonable/good standard. Which is what we actually want to see.

FT2 (Talk | email)

I agree completely with FT2 here. But I have to add, I am uncomfortable with and resist a "customer satisfaction" approach. I like the basic model, Wikipedia is the encyclopedia anyone can edit at any time - this means our consumers are also our producers. The issue here is not so simple as our providing a service to consumers. The problem is this: those people who come to Wikipedia because they do not know anything about Hegel simply cannot assess the quality of the Hegel article. They CAN assess how readable it is, and a comment on the talk page "I do not understand the third paragraph because ...." should always be welcomed and valued. So I have no problem with saying any reader can give us feedback on how readable an article is. But the only way to know whether the article on Hegel is really good or not is for an expert on Hegel to say it is.

I am not calling for some board of experts to rate articles.

This is my main point: One index of Wikipedia's poor standing is the number of university professors who discourage students from using Wikipedia as a source. I think one reason why Wikipedia has quality problems is because too few of our editors are experts (e.g. university professors) on the relevant topics. When more professors are editors, more professors will judge article content highly and encourage their students to use it.

So I see the real problem as in the recruitment of experts as editors. University professors are used to writing things without getting paid; some will not edit Wikipedia because they hate the fact that their work will be edited by others - I wouldn't even want such people contributing to Wikipedia. Many more simply are not used to writing something collectively. But I think most academics do not contribute to Wikipedia because they are too busy and receive no recognition by their employer for contributing to Wikipedia. I do not see any solution to this.

But the fact is, more and more university professors are contributing to Wikipedia. As Wikipedia has grown, so has the number of academics contributing. But I bet that the number of users has expanded exponentially while the number of expert editors has expanded arithmetically.

I think we need to find ways to recruit more.

By the way I use academics as the example but I mean of course any kind of expert. Slrubenstein 14:26, 8 December 2009 (UTC)

Slrubenstein‎

The poor state of the verifiability principle is probably the main reason why Wikipedia isn't seen as a trustworthy source. The problem is not the quantity of sources, but the quality of sources and the balance between them. Woodwalker 12:33, 9 December 2009 (UTC)

213.213.172.254‎

I am pretty sure something like this was discussed on Wikipedia in the past, but I cannot find it now. Anybody? I think that the current assessment scheme is a result of that, but I like the idea of editors - and readers - being able to vote on the quality of an article (or its sources). The problem with voting on an article's quality is what to do after the article is edited (particularly if it is a major edit). --Piotrus 21:26, 26 November 2009 (UTC)

Piotrus‎

Barry here from the Bridgespan team. Let me just say that the dialogue developing here is outstanding! A real testament to your commitment to Wikimedia's community and work. Thanks!

On the question of ratings, this is a really productive area and I agree there are good ways to "crowdsource" information on quality that would help drive higher quality both in terms of the reliability and depth of the content provided and the readability of articles. I agree with the sentiment that a simple 1-10 rating scale might oversimplify, but I also think that it isn't necessary or realistic to create something that is overly complex. I think the right answer lies in a trial and error process that starts to use reader feedback (the crowd) and has a team of contributors do sample-based analysis to see how the ratings correlate with expert judgment. I'd imagine that it wouldn't take long (given the huge volume of readership) to find a simple rating tool that provides great info about article reliability, depth and readability. One might marry this with the work ongoing within the community (sorry couldn't quickly locate the link) to rate articles using more of an expert approach (which is not scalable).

I also think some simple rating tools provide great data to start recognizing contributions of different sorts. Reliability and depth speak to the work of contributors who write; readability speaks to the work of those who edit/maintain/monitor. One might create some recognition categories that draw from the data captured. For example (purely an illustration), software could be designed to give contributors a tag for an article where they wrote a lot of content or where they make a lot of small edits. As the ratings are generated for each article, they could become part of a portfolio that provides for recognition of the contributor's work. (This would have positive synergies with the community health work as it would reward positive contributions more clearly.)

Thanks again for engaging so effectively on this issue. --BarryN 17:56, 8 December 2009 (UTC)

BarryN‎
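The sample-based check BarryN describes above, seeing how reader ratings correlate with expert judgment, could start from something as simple as a Pearson correlation over a sample of articles. This is a minimal sketch under that assumption; the post does not name a statistic, and the function name `pearson` and the example figures are purely illustrative.

```python
from math import sqrt


def pearson(reader_scores, expert_scores):
    """Pearson correlation between two equal-length lists of scores.

    Each position holds the mean reader rating and the expert rating
    for the same sampled article.
    """
    if len(reader_scores) != len(expert_scores) or len(reader_scores) < 2:
        raise ValueError("need two equal-length samples of at least 2 articles")
    n = len(reader_scores)
    mean_r = sum(reader_scores) / n
    mean_e = sum(expert_scores) / n
    cov = sum((r - mean_r) * (e - mean_e)
              for r, e in zip(reader_scores, expert_scores))
    var_r = sum((r - mean_r) ** 2 for r in reader_scores)
    var_e = sum((e - mean_e) ** 2 for e in expert_scores)
    if var_r == 0 or var_e == 0:
        raise ValueError("correlation is undefined when one sample is constant")
    return cov / sqrt(var_r * var_e)


# Made-up sample: mean reader rating vs. expert rating for five articles.
readers = [6.5, 8.0, 4.0, 7.5, 9.0]
experts = [5.0, 8.0, 3.0, 6.0, 9.0]
print(round(pearson(readers, experts), 3))
```

A value near 1 on a given question would suggest that reader feedback tracks expert judgment for that aspect of quality; a low value would suggest the question needs rewording or that readers cannot judge that aspect.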

I think that making it too subjective will just make it gamey. You'll see different political ideologies, different religions, using it as a way to express disapproval over the perceived "bias". ... that's if they haven't already gotten there first to give the article a ten, and use it as an excuse to prevent the article from improving. "My Criticism of Barack Obama was rated a ten, so you have no right to start changing it."

I think this is a bad idea if there aren't some checks and balances.

Randomran‎

That is one reason why I insisted on having feedback questions based on different aspects/factors of quality. If the feedback comes along such lines, the POV-signal can be separated from other signals and the feedback can still be valuable.

Woodwalker‎

I agree that any kind of rating system will be "gamey". If introduced, the emphasis should be on reader feedback (readers outnumber editors by a huge margin), not on ratings by users, but even that will not preclude gamesmanship. A very likely scenario is that articles that get good ratings will attract attention from editors, with deleterious consequences. - Brya 06:19, 10 December 2009 (UTC)

Brya‎