The discussion below prompted a lot of thinking on my part. I thought I would lay out some questions I have been pondering about how to assess the usefulness of a Wikipedia to a given language community.
- Is there a threshold number of articles that the Wikipedia must cross before it becomes the number one reference resource for a given language?
- Given the data we have available, what is the best metric we could develop to assess the robustness of a given Wikipedia? (By robustness I mean the degree to which it approaches covering the sum of all knowledge.)
Does anyone else have any other questions or any thoughts on these questions? Sarah476 23:33, 24 August 2009 (UTC)
Articles per number of speakers
Number of articles is misleading; maybe we should count the number of words, or the database size. Number of speakers is misleading, too; perhaps we should consider only people who speak that language as their main language: we have lots of Italian-speaking people around the world (e.g. in Argentina), who most likely don't contribute to Italian-language projects. Page views can be an interesting (rough) estimate of the "reader audience", and they differ greatly from project size (in fact WikiStats abandoned the number of articles as its ordering criterion). Nemo 21:25, 19 August 2009 (UTC)
You bring up a very good point. While number of articles is an easy metric for getting a quick sense of how a given language Wikipedia is performing, it is clearly insufficient. To get a truly robust sense, we would also need to understand the range and detail of the different topics covered. One can certainly imagine that a Wikipedia with a smaller number of articles covering many important topics across a wide range of disciplines could be more useful than a larger Wikipedia that consists mostly of information on pop culture. It would be really interesting to figure out what it takes for a Wikipedia to become the default Internet reference source in a given language. I have posted an additional chart that only includes native language speakers. Sarah476 21:55, 24 August 2009 (UTC)
Sister projects global penetration
I added this section because I have many questions and doubts on this issue. I can't tell whether there's a reason to be so Wikipedia-centric (maybe some Wikipedians are right and sister projects are not so useful or do not need such a great amount of effort?). Nemo 09:39, 14 September 2009 (UTC)
- Thanks for adding data on other projects. The Wikipedia-centricity derives from the fact that the public at large is looking at Wikipedia 96% of the time, and at all other projects combined only 4% of the time. And somehow it might be easier to expand the reach of Wikipedia by 10% than to triple the reach of all other projects combined, even though both would involve about the same absolute number of people. The amount of attention given to other projects (for example in project-specific proposals) is, I guess, probably proportional to the relative attention these projects get in general. In my view, when someone says "The WMF hosts Wikipedia and other projects", nearly half the characters in that sentence go to other projects, which exaggerates the attention they get, compared to the general attention for these projects, by a factor of 10. The importance of Commons in this picture is grossly underestimated: nearly all pages on Wikipedia with images show images from Commons. So nearly all page views for Wikipedia should count indirectly as a view of a work hosted on Commons as well. Dedalus 10:29, 14 September 2009 (UTC)
- Yes, that's what I meant. :-) I understand your point, but I can't fully agree (nor disagree). For many small languages, moreover, I think that it would be easier (and more useful) to work on a Wikisource which would host almost all their written documents. As I wrote on a list, texts in small languages are always very interesting, and such a Wikisource could be easier to develop than a Wikipedia (we are considering a Ladin Wikisource), since the texts are already there and need preservation (while a "Treacherous Computing" dispute in Ladin or even Latin may be of small interest...).
- For Commons, I don't know if that figure includes "requests" for Commons' images, but I think it doesn't, because those are counted as requests made directly to upload.wikimedia.org. So yes, obviously you can't consider that figure a measurement of Commons' "general usefulness", but you can consider it a measurement of Commons' "fame" (and reputation, which is far too low, IMHO). Nemo 12:32, 14 September 2009 (UTC)
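For anyone who wants to check the arithmetic in Dedalus's comment above, here is a minimal sketch. It assumes the 96%/4% split quoted there and treats "reach" as a share of total public attention, which is an assumption since the thread does not define the term. The variable names are made up for illustration, and counting "and other projects" as the part of the sentence devoted to other projects is likewise one possible reading.

```python
# Shares of public attention quoted in the thread (not independently verified).
wikipedia_share = 0.96
other_share = 0.04

# Extra attention gained by expanding Wikipedia's reach by 10%.
wikipedia_gain = wikipedia_share * 0.10

# Extra attention gained by tripling the reach of all other projects combined.
other_gain = other_share * 3 - other_share

# Fraction of the example sentence's characters spent on "other projects"
# (assumption: the phrase "and other projects" is what counts).
sentence = "The WMF hosts Wikipedia and other projects"
other_part = "and other projects"
char_fraction = len(other_part) / len(sentence)

# How much the sentence over-represents the other projects relative to
# their quoted 4% share of attention.
exaggeration = char_fraction / other_share

print(f"Wikipedia +10%: {wikipedia_gain:.3f} of total attention")
print(f"Other projects x3: {other_gain:.3f} of total attention")
print(f"Character fraction: {char_fraction:.2f}, exaggeration: {exaggeration:.1f}x")
```

Under these assumptions the two gains come out close to each other (about 0.096 versus 0.08 of total attention), and the sentence gives other projects roughly ten times their share, which matches the figures in the comment.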