Wikimedia penetration

From Strategic Planning

Wikimedia penetration by country (i.e. percent of internet users within the country who use Wikimedia projects)

Wikipedia Penetration Map Updated v2.png
One way to think about expanding reach is to increase the number of Internet users who use Wikimedia projects. Looking at Wikipedia, the most popular project, there is clearly ample opportunity to gain users simply by expanding its use among people who are already online. The following table provides estimates of the percentage of the online population using Wikipedia for all regions of the world and for selected countries. It shows that there is still some room for expansion even in countries such as the United States, Germany, and Japan, whose populations speak the languages of the three largest Wikipedias and which are home to many of the most active Wikipedia community members. However, the greatest opportunities for growth are in Asian countries such as China, South Korea, and India, where the lower-bound estimate is below 10% of the online population. An estimate of the total global population that uses Wikipedia is between 17 and 25%.

Wikimedia users by region by language.png


Estimate of Internet users who use Wikipedia, by region and selected countries.[1]
Region                  Country               Estimated range of Internet users who use Wikipedia
Asia & Pacific                                10-16%
                        Australia             21-27%
                        China                 1%
                        India                 8-20%
                        Japan                 26-38%
                        South Korea           5-7%
                        Taiwan                10-13%
Middle East & Africa                          12-30%
                        South Africa          14%
Europe                                        26-37%
                        France                36-39% *
                        Germany               25-42%
                        Russian Federation    18%
                        United Kingdom        28-37%
Latin America                                 15-30%
                        Argentina             22-26%
                        Brazil                9-22%
                        Mexico                19-35%
North America
                        Canada                45-50%
                        United States         24-32%

Wikimedia penetration in areas of rapid Internet growth.png

Internet access speed

See also Regional bandwidth

Participation is more likely in countries with higher broadband adoption and higher broadband speeds. Dial-up Internet users cannot use Wikimedia projects much and are less likely to have enough connection time to contribute.

See the Akamai State of the Internet Report for 2010 Q1, figure 24 (average and average maximum connection speed, and average megabytes downloaded per month, by mobile provider).

Wikimedia penetration by language

The following table lists the top 11 languages by number of native speakers, together with the number of Wikipedia articles in each language and the number of speakers per Wikipedia article. This ratio is one possible way to measure penetration by language: a higher number means that less content is available per speaker. The table makes clear that some of the most widely spoken languages, such as Arabic, Hindi, and Bengali, have significantly smaller Wikipedias.

Number of native language speakers and number of Wikipedia articles. [2]
Language      Number of speakers (millions)   Number of articles (thousands)   Number of speakers per article
Chinese       1,213                           255                              4,757
Spanish       329                             440                              748
English       328                             3,048                            112
Arabic        221                             99                               2,232
Hindi         182                             34                               5,353
Bengali       181                             20                               9,050
Portuguese    178                             496                              359
Russian       144                             440                              327
Japanese      122                             603                              202
German        90                              925                              97
Javanese      84                              108                              778
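
The ratio in the last column can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic: the function name `speakers_per_article` and the `ROWS` dictionary are illustrative, and only a handful of rows from the table are copied in.

```python
# Speakers (millions) and articles (thousands) for a few rows of the table.
ROWS = {
    "Chinese": (1213, 255),
    "Arabic": (221, 99),
    "Bengali": (181, 20),
    "German": (90, 925),
}


def speakers_per_article(speakers_millions, articles_thousands):
    """Return the number of native speakers per Wikipedia article."""
    return (speakers_millions * 1_000_000) / (articles_thousands * 1_000)


# Round to whole speakers, as the table does.
ratios = {
    lang: round(speakers_per_article(speakers, articles))
    for lang, (speakers, articles) in ROWS.items()
}

# Highest ratio first: the least content available per speaker.
for lang, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{lang}: {ratio:,} speakers per article")
```

Sorting in descending order puts the languages with the least content per speaker at the top.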


Articles per number of speakers

The following chart provides another take on the penetration of Wikipedia by language.

The main resource for creating Wikipedia articles is the speakers of the language. The drivers to contribute might differ by culture or language.

The table below lists Wikipedias in descending order of the number of articles per thousand speakers of the language. It might change your view of reach and participation. It includes languages with 20 million speakers or more, as well as Wikipedias with more than one hundred thousand articles. Note that English is not in the top 10 of this list.

Warning: The numbers of speakers are rough estimates of vernacular language speakers (secondary language speakers have apparently been ignored). The data should instead be taken from an accurate and maintained source, such as the Unicode CLDR extra data, which also includes data about the literacy level (in order to estimate the number of readers of the language).

Language            Wikipedia   Speakers (million)  Articles (thousand)  Articles per thousand speakers
Volapük             vo          0.00003             120                  4,000,000.0
Esperanto           eo          2                   117                  58.5
norsk bokmål        nb          5                   225                  45.0
svenska             sv          9                   325                  36.1
suomi               fi          6                   213                  35.5
Nederlands          nl          20                  553                  27.7
català              ca          7                   192                  27.4
dansk               da          5                   114                  22.8
slovenčina          sk          5                   109                  21.8
polski              pl          50                  627                  12.5
čeština             cs          12                  134                  11.2
Deutsch             de          105                 943                  9.0
italiano            it          100                 596                  6.0
English             en          510                 3,000                5.9
română              ro          26                  130                  5.0
日本語                 ja          130                 609                  4.7
українська          uk          44                  158                  3.6
français            fr          270                 839                  3.1
português           pt          180                 500                  2.8
Türkçe              tr          60                  134                  2.2
Bahasa Melayu       ms          23                  46                   2.0
ไทย                 th          30                  49                   1.6
español             es          320                 504                  1.6
русский             ru          280                 422                  1.5
Tiếng Việt          vi          68                  96                   1.4
한국어                 ko          78                  108                  1.4
srpskohrvatski / српскохрватски + српски / srpski + hrvatski + bosanski   sh + sr + hr + bs   21   24   1.1
فارسی               fa          70                  66                   0.9
azərbaycanca        az          31                  25                   0.8
తెలుగు              te          70                  44                   0.6
Bahasa Indonesia    id          180                 109                  0.6
Basa Sunda          su          27                  14                   0.5
العربية             ar          250                 100                  0.4
मराठी               mr          68                  24                   0.4
മലയാളം              ml          35                  11                   0.3
中文                  zh          900                 270                  0.3
தமிழ்               ta          66                  19                   0.3
Basa Jawa           jv          75                  20                   0.3
ಕನ್ನಡ               kn          35                  7                    0.2
贛語                  gan         21                  4                    0.2
ગુજરાતી             gu          46                  8                    0.2
اردو                ur          60                  10                   0.2
粵語                  zh-yue      71                  11                   0.2
Bân-lâm-gú          zh-min-nan  46                  6                    0.1
বাংলা               bn          185                 20                   0.1
हिन्दी              hi          360                 36                   0.1
မြန်မာဘာသာ          my          32                  2                    0.1
吴语                  wuu         77                  4                    0.1
پښتو                ps          23                  1                    0.0
ਪੰਜਾਬੀ              pa          57                  2                    0.0
ଓଡ଼ିଆ               or          32                  1                    0.0
客家語/Hak-kâ-ngî      hak         34                  1                    0.0
hsn (Xiang)         hsn         36                  0                    0.0
मैथिली              mai         30                  0                    0.0
भोजपुरी             bh          26                  0                    0.0
Hausa               ha          24                  0                    0.0
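
The ranking in this table can be sketched in Python as follows. The names `articles_per_thousand_speakers` and `is_included` are illustrative, and the inclusion rule is an assumption based on the description above (20 million speakers or more, or more than one hundred thousand articles). Only four rows from the table are copied in.

```python
# (language, Wikipedia code, speakers in millions, articles in thousands)
ROWS = [
    ("English", "en", 510, 3000),
    ("हिन्दी", "hi", 360, 36),
    ("Esperanto", "eo", 2, 117),
    ("Volapük", "vo", 0.00003, 120),
]


def articles_per_thousand_speakers(speakers_millions, articles_thousands):
    """Thousands of articles per million speakers equals articles per thousand speakers."""
    return articles_thousands / speakers_millions


def is_included(speakers_millions, articles_thousands):
    """Assumed inclusion rule: a large language or a large Wikipedia."""
    return speakers_millions >= 20 or articles_thousands > 100


# Keep the rows that satisfy the rule, then sort by the ratio, highest first.
ranked = sorted(
    (row for row in ROWS if is_included(row[2], row[3])),
    key=lambda row: articles_per_thousand_speakers(row[2], row[3]),
    reverse=True,
)

for language, code, speakers, articles in ranked:
    ratio = articles_per_thousand_speakers(speakers, articles)
    print(f"{language} ({code}): {ratio:,.1f} articles per thousand speakers")
```

Because the two inputs are already expressed in thousands and millions, dividing them directly gives the ratio per thousand speakers without further unit conversion.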

Sister projects global penetration

Wikipedia: 96 % of total requests
Number of contributors and page views, compared
Sister project   Number of contributors[3]   Contributors over total (%)   Monthly page views (millions)   Page views over total (%)
Wikibooks        12,942                      1.28                          35.6                            0.33
Commons          6,557[4]                    0.65                          173                             1.63
Wikinews         2,969                       0.30                          12                              0.11
Wikipedia        962,555[5]                  95.44                         10,175                          95.71
Wikiquote        5,963                       0.59                          33.5                            0.31
Wikisource       4,605                       0.46                          34.1                            0.32
Wikispecies      633[6]                      0.06                          ca. 8[7]                        0.08
Wikiversity      2,329                       0.23                          6.6                             0.06
Wiktionary       10,021                      0.99                          151                             1.42
Total            1,008,574                   99.7[8]                       10,631                          99.97[8]
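
One way to compare the two share columns is to divide each project's share of contributors by its share of page views. The sketch below does this in Python; `PROJECTS` and `share_ratio` are illustrative names. The totals are summed from the rows, so the page-view total may differ slightly from the rounded total in the table.

```python
# (number of contributors, monthly page views in millions), copied from the table.
PROJECTS = {
    "Wikibooks": (12_942, 35.6),
    "Commons": (6_557, 173),
    "Wikinews": (2_969, 12),
    "Wikipedia": (962_555, 10_175),
    "Wikiquote": (5_963, 33.5),
    "Wikisource": (4_605, 34.1),
    "Wikispecies": (633, 8),
    "Wikiversity": (2_329, 6.6),
    "Wiktionary": (10_021, 151),
}

total_contributors = sum(c for c, _ in PROJECTS.values())
total_views = sum(v for _, v in PROJECTS.values())


def share_ratio(project):
    """Contributor share divided by page-view share for one project."""
    contributors, views = PROJECTS[project]
    return (contributors / total_contributors) / (views / total_views)


# A ratio above 1 means more contributors than the readership share would suggest.
for name in sorted(PROJECTS, key=share_ratio, reverse=True):
    print(f"{name}: {share_ratio(name):.2f}")
```

A ratio near 1 means the project's contributor share and page-view share are balanced; the Commons value should be read with the caveat in note 4.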

Comparing shares, Wikiversity and Wikibooks have a share of contributors about four times their share of page views, Wikinews about three times, Wikiquote about twice, Wiktionary about two thirds, and Commons almost one third (but the Commons figure is wrong); the other projects are fairly balanced. What does this mean? If

  1. the ratio of total edits to active contributors is more or less the same in all projects, and
  2. the number of edits indicates the total effort of the community more or less equally well in all projects,

then we can consider that

  1. we have some projects where the community does better-focused work (i.e. less work that is more useful to readers), either
    1. because the community is better directed or more able,
    2. or because the whole project is more needed than another (e.g. a dictionary may be more needed than a wiki newspaper),
    3. or because the project is more difficult and needs more effort to reach the critical mass at which it becomes useful,
  2. or we have an unequal distribution of:
    1. readers: some projects should be advertised more, or linked more from other projects, other web resources, or offline resources (e.g. schools);
    2. editors: some projects fail to attract the editors they need, either because those editors are "wasted" on another project that needs them less, or because the project is unable to attract new Wikimedia editors (i.e. editors who do not arrive from another project) for some reason (see Participation).

See also Proposal:Make Wikimedia projects scale.

Note that the premises need to be verified:

  1. one project may attract more very active editors while another attracts more less-active editors (e.g. because collecting quotations may be less difficult than writing a book);
  2. some projects may require more "external" preparatory work (e.g. a public domain dictionary that is processed offline and then uploaded to Wiktionary by bots) or spread the work over many edits (e.g. uploading the full text of a book to Wikisource requires one edit per page plus others for the index, metadata, etc.).

References

  1. Upper-bound estimates are from ComScore for the month of December 2008. Lower-bound estimates were produced by Bridgespan using both the ComScore December 2008 figures and the International Telecommunication Union (ITU) 2008 data on Internet use. Where no range is given, the two sources converged on the same number. Note that ComScore data is produced through an opt-in panel of two million Internet users around the globe and uses a range of statistical techniques to create an internally consistent portrait of the global Internet audience. However, ComScore does not estimate users who access the Internet via Internet cafés or people below age 15, so its statistics will not be representative of certain populations. For information on ComScore, as well as which countries are included in the regions, see ComScore data on Wikimedia. (*) Upper bound from ITU and ComScore data; lower bound from ComScore data.
  2. Data on language speakers from Ethnologue 2009, http://www.ethnologue.com/ethno_docs/distribution.asp?by=size#3; data on Wikipedia articles from wikistats, http://stats.wikimedia.org/EN/Sitemap.htm
  3. From WikiStats: authors who have edited at least 10 times since they arrived, total for all subdomains (where applicable), e.g. Wikiquote. August 2009.
  4. This figure is wrong because WikiStats does not consider edits to categories and file description pages, and thus considers only 1/30 of the total edits (compared to e.g. 1/3 on Wikipedia).
  5. July 2009.
  6. June 2009.
  7. From WikiSpecial WikiStats figures (August 2009): 11844 views/hour*24 hours*30 days in September/1,000,000.
  8. Approximation error.