South Asia
Table of major South Asian languages and their Wikipedias
Wiki code | Language | Primary country | Speakers (millions) | Potential users (millions) | Articles (7/09) | Articles >1,500 bytes (7/09) | Articles, 1-year growth rate (5/08-5/09) | 5+ editors (5/09) | 5+ editors, 1-year growth rate (5/08-5/09) | 5+ editors, 2-year growth rate (5/07-5/09) | Article-to-editor ratio |
---|---|---|---|---|---|---|---|---|---|---|---|
hi | Hindi | India | *302 | 18.1 | 34,614 | 3,461 | 65% | 72 | 60% | 112% | 481 |
bn | Bengali | Bangladesh | *251 | 15.1 | 20,049 | 2,406 | 14% | 53 | 33% | 104% | 378 |
ur | Urdu | Pakistan | *101 | 6 | 10,667 | 1,707 | 36% | 39 | 22% | 50% | 274 |
pa | Punjabi | Pakistan | 91 | 5.5 | 1,408 | 113 | 380% | 18 | 157% | 260% | 78 |
te | Telugu | India | 70 | 4.2 | 43,556 | 3,049 | 8% | 45 | -21% | 32% | 968 |
mr | Marathi | India | 68 | 4.1 | 23,651 | 1,419 | 32% | 47 | 38% | 74% | 503 |
ta | Tamil | India | 66 | 3.9 | 18,951 | 4,359 | 31% | 61 | 22% | 65% | 311 |
gu | Gujarati | India | 47 | 2.8 | 7,454 | 373 | 490% | 25 | 39% | 178% | 298 |
ml | Malayalam | India | 36 | 2.2 | 10,631 | 3,508 | 60% | 83 | 17% | 84% | 128 |
kn | Kannada | India | 35 | 2.1 | 6,812 | 1,022 | 19% | 32 | 28% | 100% | 213 |
or | Oriya | India | 32 | 1.9 | 538 | 5 | 79% | 7 | 40% | N/A | 77 |
ps | Pushto | Pakistan | 20 | 1.2 | 1,363 | 654 | 19% | 20 | 33% | 82% | 68 |
si | Sinhala | Sri Lanka | *18 | 1.1 | 1,847 | 55 | 161% | 16 | 33% | 167% | 115 |
as | Assamese | India | 17 | 1 | 249 | 960 | 15% | 5 | -17% | -38% | 50 |
ne | Nepali | Nepal | 14 | 0.8 | 2,746 | 247 | 4% | 14 | -7% | -22% | 196 |
*Includes second-language speakers.[1]
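The two derived columns in the table above can be reproduced approximately from its other columns. The sketch below is illustrative only. The function names are hypothetical, and it assumes a single Internet use rate of 6% (the figure quoted for the region as a whole) for every language, whereas the notes say national or regional rates were used, so a few rows (Urdu and Tamil, for example) differ from the published values in the last digit.

```python
# Assumption: one region-wide Internet use rate (6%) applied to every language.
REGIONAL_INTERNET_USE_RATE = 0.06


def potential_users_millions(speakers_millions, internet_use_rate=REGIONAL_INTERNET_USE_RATE):
    """Speakers (millions) multiplied by the Internet use rate, to one decimal."""
    return round(speakers_millions * internet_use_rate, 1)


def article_to_editor_ratio(articles, active_editors):
    """Articles divided by the '5+ editors' count, rounded to a whole number."""
    return round(articles / active_editors)


# Rows copied from the table: (speakers in millions, articles, 5+ editors).
rows = {
    "Hindi": (302, 34_614, 72),
    "Telugu": (70, 43_556, 45),
    "Malayalam": (36, 10_631, 83),
}

for language, (speakers, articles, editors) in rows.items():
    print(
        language,
        potential_users_millions(speakers),
        article_to_editor_ratio(articles, editors),
    )
```

For the three rows shown, both computed values match the table.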
Table comparing article depth and quality of the top Indian language Wikipedias (as of August 2009)
Language | Official article count | Articles over 0.5 KB (%) | Articles over 2 KB (%) | Average bytes per article | Edits per article | Database size (MB) | Words (millions) | Images | Page depth |
---|---|---|---|---|---|---|---|---|---|
Bengali | 20,022 | 54 | 12 | 1244 | 15.8 | 74 | 3.7 | 1197 | 64 |
Hindi | 33,497 | 34 | 10 | 1162 | 7.9 | 128 | 7.9 | 3386 | 21 |
Kannada | 6,685 | 55 | 15 | 1381 | 12.4 | 28 | 1.2 | 1413 | 16 |
Malayalam | 10,271 | 83 | 33 | 2590 | 24.9 | 80 | 3.0 | 6112 | 173 |
Marathi | 23,448 | 24 | 6 | 716 | 12.8 | 56 | 2.4 | 1872 | 16 |
Tamil | 18,625 | 81 | 23 | 1840 | 15.8 | 100 | 4.1 | 3533 | 26 |
Telugu | 43,370 | 21 | 7 | 714 | 7.7 | 82 | 3.9 | 4947 | 5 |
South Asian languages
- The 15 South Asian languages listed in the first table have over 900 million native speakers, comprising 57% of the population of India, Pakistan, Bangladesh, Sri Lanka, and Afghanistan.
- All 15 languages are official languages of at least one South Asian country or state and are used extensively in the local print media.
- Although many of these languages have a long written history, few digital resources exist in them on science, technology, and world history.
- Many other languages are spoken in South Asia; these fifteen were chosen for their large numbers of speakers, their status as official languages, and their use as mediums of instruction.
South Asia and Internet access and usage
- Although these languages have many speakers, most of those speakers lack access to computers and the Internet: the Internet use rate for the region as a whole is 6% of the population. In addition, most people in South Asia who do have Internet access can also read English.[2]
- Although English is still the dominant language on the web in India, there are active and growing online communities in Indian languages. For example, www.maayboli.com, a Marathi-language Internet community, reports 100,000 hits monthly. Many of these online communities use a Drupal-based platform that makes it easier to type in Indian language scripts.[3]
South Asian languages and education
- All of the languages listed are used as mediums of instruction at the elementary and high school levels in different states and countries of South Asia.
- Many private high schools and some elite public high schools teach in English, as do some advanced high school science classes.
- For example, in the Indian state of Maharashtra, approximately 1,500,000 students graduated from the 10th grade in the state high school system in 2009. Of these, 750,000 went on to study the humanities, which are taught in Marathi, the official state language; 300,000 went on to study the sciences, which are taught in English; and 450,000 ended their studies at the 10th grade.[4]
- English is the primary language for higher education in South Asia. However, some universities in India use local languages at the undergraduate level.
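To make the Maharashtra figures above easier to compare, the following sketch converts them to shares of the graduating cohort. It uses only the approximate numbers quoted above; the variable names are illustrative.

```python
# Approximate 2009 figures for Maharashtra, as quoted in the text.
total_graduates = 1_500_000
paths = {
    "humanities (taught in Marathi)": 750_000,
    "sciences (taught in English)": 300_000,
    "ended studies at 10th grade": 450_000,
}

# The three groups account for the whole cohort.
assert sum(paths.values()) == total_graduates

# Share of the cohort on each path.
shares = {path: count / total_graduates for path, count in paths.items()}
for path, share in shares.items():
    print(f"{path}: {share:.0%}")
```

On these figures, half of the graduates continue in Marathi, about one in five continues in an English-medium track, and the rest leave school.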
South Asian language Wikipedias
- There is a Wikipedia in each of the 15 languages listed in the table. Some have shown moderate growth over time, while others have grown very slowly.
- Three Wikipedias in South Asian languages with more than 3 million speakers are not listed (Kashmiri, Sindhi, and Bhojpuri); none of them has more than 2,500 articles.
Barriers to growth of the South Asian language Wikipedias
- There are several major barriers to the growth of South Asian Wikipedias:
  - Low awareness of tools for typing Indian scripts on Western-style keyboards, along with outdated computers and operating systems that do not allow people to read and type in South Asian language scripts
  - Lack of Wikipedia tools to facilitate editing in Indian languages, and a lack of editors with the technical skills to address problems and fix bugs
  - Low awareness of the existence of Indian language Wikipedias
  - Strong emphasis on English for advanced education and professional advancement
Updating of Potential user calculation
Notes
- ↑ Information on languages from Ethnologue 2009, http://www.ethnologue.com. Potential users are calculated by multiplying the number of speakers of a language by the national or regional Internet use rate. Internet use rates are from the International Telecommunication Union, 2008.
- ↑ Information on Internet use from the International Telecommunication Union, 2008.
- ↑ Wikimediaindia mailing list post. Note that this e-mail uses the word "lack" (also spelled "lakh"), taken from Hindi, which means 100,000; 2.5 lack monthly hits therefore means 250,000 monthly hits.
- ↑ Letter from the Wikimediaindia mailing list. Note that this letter uses the Hindi word "lack" (also spelled "lakh") for the number 100,000; 7.5 lack students therefore means 750,000 students.