Could Wikipedia be too slow to load in many countries?
Does anyone know whether bandwidth is something that hinders smaller Wikipedias from growing? If Wikipedia is slow to load in many countries, this would probably slow the growth and reduce the usage significantly. Could the sites be made to load faster in that case? Should picture sizes be limited on Wikipedias where users are likely to have slow connections, and other kinds of media also restricted?
I summed up the sizes of the pictures contained in today's (29th of December) featured article on the English Wikipedia. The total size was about 11.5Mb. I really think that such a large article size can prevent usage in many countries where users have low bandwidth.
Are you sure this is correct? The Wikimedia servers serve low res renditions of the images. Firebug gives me 289kb of images as an anonymous user (this includes all MediaWiki icons that are not part of the article content).
No, I am not sure at all; I would very much like real data on this. Could you provide data for a couple of pages? I just wrote this thought down here and in the recommendation document in the hope of getting some answers.
So the Wikimedia software scales down thumbs automatically?
Maybe a list of the last 20 featured articles would be a good data set.
Sorry, no time. I recommend you install Firefox and the Firebug plugin. Then get the data by fully reloading the page as an anonymous user (that will also load CSS and scripts).
If I have done everything right, I get the following results for the featured articles of December:
The average is 340.7kb per article. Can this be seen as the standard size of a large article? It then remains to find an estimate of the bandwidth in different regions.
The following links provide some information for Africa:
- The second slide seems to show that the bits per capita figure is below 10 for all African countries. Bits per connection would, however, be a much more interesting number.
Some further analysis of the main page sizes of the top ten sites according to alexa.com:
- Google.com 34kb
- Facebook.com 93kb
- Youtube.com 12100kb
- Yahoo.com 284kb
- Bing.com 174kb
- wikipedia.com 51kb
- blogger.com 55kb
- baidu.com 6kb
- msn.com 252kb
- yahoo.co.jp 167kb
I don't know if I understand all the output from the Firebug extension, but I made the following experiment.
I started from today's featured article and randomly clicked a blue link in the article, then continued to click a random link in the article I was taken to, and so on. I never clicked reload. Most of the results were in the range 10kb-500kb. Does that mean that this amount of data actually had to be loaded from the server? I noticed that when I randomly walked through articles in this way, most of the content that appeared in the list was from upload.wikimedia.org. So I really think that most of the data actually sent when an article is loaded is media. The figure of 11.5Mb that I got from simply adding the sizes of the full-size pictures was, however, far too high.
I also found some rather large featured articles. For example, the article of the 18th of November loaded 1.3Mb of data, without clicking refresh.
So I wonder if it makes sense to recommend a turn-off button for media? I guess this largely depends on internet connection speeds around the world. But with a connection speed of 50kbps (which I remember was standard here in Sweden ten years ago), 200kb of data (which might be expected to be roughly the average from the random walk I did) would take 32 seconds to load at absolute full speed (200 kilobytes is 1600 kilobits, and 1600/50 = 32). As I got the impression that the major part of the data loaded was media, I think that it does make sense in that case.
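To make the arithmetic above easy to repeat for other page sizes and speeds, here is a minimal sketch. It assumes that "kb" for page sizes means kilobytes and that "kbps" means kilobits per second, and it ignores latency and protocol overhead, so it only gives a best-case figure. The function name is my own.

```python
def load_time_seconds(size_kilobytes: float, speed_kbps: float) -> float:
    """Best-case transfer time for a page.

    Assumes size is in kilobytes and speed in kilobits per second,
    so the size is multiplied by 8 to convert bytes to bits.
    Latency and protocol overhead are ignored.
    """
    return size_kilobytes * 8 / speed_kbps


# The example from the discussion: 200 kb of data on a 50 kbps line.
print(load_time_seconds(200, 50))  # 32.0
```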
I talked to a friend who had been to Swaziland and who told me that the only internet provider there in 2007 was MTN (http://www.mtn.com). According to this article, "MTN describes itself as 'the leader in telecommunications in Africa and the Middle East'". I checked their website and found this map
that shows the mobile internet connection methods they provide in different parts of Africa and the Middle East. It seems like GPRS is the main solution provided.
These pages show the speeds of the four different solutions provided:
The speeds are:
- GPRS: 40kbps
- EDGE: 5*GPRS = 200kbps
- 3G: 384kbps
- HSDPA: ? (up to speeds similar to fixed lines)
I think this company is a provider of fixed landlines as well, but can't find any information about these solutions at that site.
With GPRS solutions for internet connections it seems like my estimate of 50kbps really wasn't that bad =).
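As a rough illustration of what these speeds would mean for the 340.7kb average featured article measured earlier, here is a small sketch. It assumes the page size is in kilobytes, that the listed speeds are the actual sustained throughput in kilobits per second (real connections are usually slower), and it leaves out HSDPA since no figure was found. All names in the code are my own.

```python
# Nominal speeds from the list above, in kilobits per second.
# HSDPA is left out because no speed was found for it.
SPEEDS_KBPS = {"GPRS": 40, "EDGE": 200, "3G": 384}

# Average size of the December featured articles, assumed to be kilobytes.
AVERAGE_ARTICLE_KB = 340.7


def estimate_all(size_kb: float) -> dict:
    """Best-case transfer time in seconds for each connection type."""
    return {name: size_kb * 8 / speed for name, speed in SPEEDS_KBPS.items()}


for name, seconds in estimate_all(AVERAGE_ARTICLE_KB).items():
    print(f"{name}: about {seconds:.0f} s")
```

Under these assumptions an average featured article would need more than a minute over GPRS, which fits the impression that media is the main cost.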
I have asked a couple of friends who either are from or have been to some countries for at least a couple of months about their experience of internet connections. I didn't ask whether Wikipedia was slow to load, but tried to get information on how fast a typical internet connection was. The countries and the years they were there, as well as some comments they gave, are listed here. I hope to be able to fill this in as I get more answers:
- West China (2008)
  - A lot has happened from 2004 to 2008. In 2008 I think I had an internet connection of 20Mbit, which I also think was the slowest choice available. As long as you visited sites within China the connection speed was very good. But as soon as you tried to access other sites it became very slow. It could take about a minute to log in to Facebook. The internet provider was China Telecom (http://en.chinatelecom.com.cn/)
- Malaysia (citizen)
- Pakistan (citizen)
  - I don't know what the internet connection speeds are, but I know that it is a problem and some websites load slowly.
- Kenya (citizen)
  - On my connection I guess it takes an average of 10 seconds to load a Wikipedia article. Some ISPs are ISPKenya (http://www.ispkenya.com), Safaricom (http://www.safaricom.co.ke) and Zain.
- Uganda (2009)
  - I don't know what the speed was, but sometimes it was quite slow.
- Swaziland (2007)
  - I don't know what the internet connection speeds were, but Swaziland only had a single telecom company called MTN.
- Ecuador (2009)
This page provides some information on internet connections around the world. I think the statistics are gathered from people who run the program on the site, so the results are probably biased toward the connection speeds of people who are interested in knowing their speed (I guess this makes the results higher than the true average), but I am not sure about this:
Looking at the "surf speed" statistics country by country, and using the "past 8 weeks" option to have statistics for most countries, there seem to be very many countries where the average is at 100kbps or below when surfing outside the country. Many of the countries in these statistics are also quite well developed, so the countries that aren't included in the list probably have much lower speeds.
Would it also make sense to host the local projects on servers within the countries where the languages are spoken, as there seem to be much higher traffic speeds within the countries than when accessing material outside?
Some information from ISPs in Kenya:
Safaricom provides broadband connections from 512Kbps to 4Mbps. These are listed as business solutions and only the 512Kbps solution is listed as a single-user connection, so I guess this is a fairly high connection speed in Kenya.
ISP Kenya gives connection speeds ranging from 12Kbps up to 2Mbps+ for their "leased lines" solutions. The 12Kbps is referred to as a home/single-user solution while the 2Mbps+ is referred to as a solution for a corporate client with large LAN/WAN connectivity. This once again indicates that the average home user connection is probably very low speed.
A bigger problem than bandwidth is latency IMHO. 13 stylesheets plus 12 external scripts is a lot of round trips. I'm in Australia with a fast ADSL connection, and on an empty cache it can often take 5 seconds or more for the page to start displaying. Mobile networks and PSTN modems add additional latency.
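To get a feel for how round trips alone add up, here is a crude sketch. It assumes each stylesheet or script costs one full round trip, that the browser fetches a fixed number of them in parallel, and it ignores DNS lookups, connection setup and the transfer time itself. The round-trip time of 0.3 seconds and the limit of 6 parallel connections are illustrative guesses of mine, not measurements.

```python
import math


def latency_seconds(num_requests: int, rtt_seconds: float,
                    parallel_connections: int = 1) -> float:
    """Time spent purely waiting on round trips.

    Requests are assumed to go out in batches of `parallel_connections`,
    each batch costing one round trip. Transfer time is ignored.
    """
    batches = math.ceil(num_requests / parallel_connections)
    return batches * rtt_seconds


requests = 13 + 12  # stylesheets plus external scripts, as counted above

# Hypothetical round-trip time of 0.3 s (an assumption, not a measurement).
print(latency_seconds(requests, 0.3))                          # one at a time
print(latency_seconds(requests, 0.3, parallel_connections=6))  # six in parallel
```

Even with parallel fetching, this waiting time does not shrink when bandwidth goes up, which is why reducing the number of separate files helps on high-latency links.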