Exponential growth, followed by a plateau phase.
I intended to analyze how the number of active contributors has historically been affected by the introduction of new projects. I therefore used Octave and the statistics for the first 29 languages on http://stats.wikimedia.org/EN/TablesWikipediansEditsGt5.htm and plotted the number of active contributors for each language, normalized against its current number of active contributors. I have not been able to see any effect of the introduction of new projects on the number of contributors to Wikipedia, but there was a striking similarity between the growth of the different projects, so I thought I had better share it here.
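The normalization described above can be sketched as follows. This is only an illustration in Python, not the Octave script that was actually used; the function name `normalize_to_current` and the sample numbers are made up for the example and are not taken from the statistics page.

```python
def normalize_to_current(counts):
    """Scale a monthly series so that its most recent value equals 1.0."""
    current = counts[-1]
    if current == 0:
        raise ValueError("current number of active contributors is zero")
    return [c / current for c in counts]


# Hypothetical monthly counts of active contributors (illustration only).
example = {
    "xx": [10, 40, 160, 400, 410, 400],  # grows, then levels off
    "yy": [2, 5, 11, 24, 50, 100],       # still roughly doubling
}

# After normalization every curve ends at 1.0, so projects of very
# different size can be drawn in the same plot and compared by shape.
normalized = {lang: normalize_to_current(series) for lang, series in example.items()}
```

Dividing by the current value rather than by the maximum means that a project which has declined from its peak will show values above 1.0 earlier in its history.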
First, there is one class of projects (en,ja,de,es,fr,it,pl,pt,nl,zh,tr,sv,fi,cs,id,he,no,da) which have all gone through exponential growth up until about three years ago, when they all more or less plateaued.
Second, there is one class of projects (ru,ar,hu,vi,ko,th,ro,uk,fa,hr,bg) that still seems to be in the initial exponential growth phase.
I guess this is just the normal growth curve for pretty much any system that is given space to grow. But I think this division of projects into two classes might signal that there are two different approaches that need to be considered. For the projects that have plateaued, it is important to consider what can be done to make the plateau value of active contributors as high as possible: how can the growth space be made as large as possible? For the other class of projects, it is probably more important to consider how the current growth can be maintained or fueled.
I really love this. I want some time to digest it, but thanks for putting it together... I'm going to pass it around here and see what folks think and try to get some feedback on it.
I did some further analysis, from which I would like to group the projects into the following four categories.
Have had exponential growth, but plateaued about three years ago: en,ja,de,es,fr,it,pl,pt,nl,zh,tr,sv,fi,cs,id,he,no,da,sk,el
Are in exponential growth phase: ru,ar,vi,ko,ro,uk,fa
Have grown linearly, and are still growing linearly: hu,hr,bg,ca
Ambiguous. Most seem to have been at a plateau since the data I used started being collected: lt,eo,sl,ms,sr,et,simple,eu,bs,ka,gl,hi,mk,cy,te,nn,lv,ml,br,af,la,mr,ta,bn,tl,az,zh_yue,is,sq,be,sh,lb,an,be_x_old
The first two classes seem healthy to me, and show the growth that can be expected. If we want to do better, we can however ask what can be done to raise the plateau level. The third class also shows growth, which is good. But I think really healthy growth should show a period of exponential growth, so this class might be a problem class. The biggest problem as I see it is the fourth class, which seems to have been stuck at a low level (these projects have a couple of hundred active editors or fewer), and for many of these projects I think there is an opportunity for exponential growth. But what are the problems that have kept them from growing?
For the languages I haven't included in this analysis there is a lack of data, but most of them (if not all) are doing worse than those in the fourth class. So these all belong to the problem class as well.
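For anyone who wants to repeat the grouping with something other than the eye, here is one possible sketch in Python. It is not how the categories above were arrived at (those came from looking at the plots); it simply compares how well a straight line fits the raw counts versus the logarithm of the counts over a chosen window of months. The function names and the threshold `min_r2` are assumptions made for this example.

```python
import math


def r_squared(xs, ys):
    """Coefficient of determination of a least-squares straight-line fit."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # flat series: there is no trend to explain
    return (sxy * sxy) / (sxx * syy)


def classify_growth(counts, min_r2=0.9):
    """Label a window of positive monthly counts.

    Exponential growth is a straight line in log(counts), linear growth
    is a straight line in counts. If neither fit reaches min_r2 (an
    arbitrary threshold chosen for this sketch), the window is called
    'ambiguous'.
    """
    months = list(range(len(counts)))
    r2_linear = r_squared(months, counts)
    r2_exponential = r_squared(months, [math.log(c) for c in counts])
    if max(r2_linear, r2_exponential) < min_r2:
        return "ambiguous"
    return "exponential" if r2_exponential > r2_linear else "linear"
```

Note that this only describes the window it is given. To detect "exponential, then plateau" one would have to apply it separately to the earlier and the later part of a project's history.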
Hoi, the new presentations of Erik Zachte show nicely where a language is at. I blogged about this as well: http://ultimategerardm.blogspot.com/2010/04/anonymous-coversion-rate.html The message is that the conversion from unregistered to registered will not happen for several languages.
It is likely that this has everything to do with the lack of support for the languages involved. Thanks,
I checked the graphs on your blog. But I wonder whether the number of anonymous edits on the Hindi Wikipedia really is strikingly low. Comparing the share of anonymous and registered edits in January, the quotient is about 3.21 (61%/19%) for the Russian Wikipedia and 6.14 (43%/7%) for the Hindi Wikipedia. That is, they agree within a factor of 2.
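The comparison can be checked with a few lines of Python, using only the percentages quoted above as read off the graphs. The variable names are chosen for this example.

```python
def quotient(larger_share_pct, smaller_share_pct):
    """Ratio between the two edit shares quoted for one wiki."""
    return larger_share_pct / smaller_share_pct


# Percentages for January as quoted in this thread.
russian = quotient(61, 19)
hindi = quotient(43, 7)

# How much the two wikis differ from each other.
factor = hindi / russian
print(f"Russian: {russian:.2f}  Hindi: {hindi:.2f}  factor: {factor:.2f}")
```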
It is true, however, that the ratio was much higher for the Hindi Wikipedia during most of 2009. But according to the graph, the high number of registered edits seems to be the anomaly rather than the low number of unregistered edits.
To me it seems very much like a visual illusion that the graphs signal a low number of unregistered edits on the Hindi Wikipedia. The high peak in the cumulative curve presses all other curves closer to the bottom, and the high number of bot edits does the same for the registered and unregistered curves. If bot edits had been excluded and the normalization had been done against the cumulative value in January instead of the peak value, then the curves would not seem as different.
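To make the point about normalization concrete, here is a small Python sketch with made-up monthly edit counts (they are not the Hindi Wikipedia figures). The function `normalize_curves` and its parameters are assumptions for this illustration and not how the blog graphs were produced.

```python
def normalize_curves(curves, reference_month=None, exclude=()):
    """Divide every curve by one reference value of the cumulative curve.

    curves          -- dict mapping a name to a list of monthly edit counts
    reference_month -- index of the month to normalize against;
                       None means "use the peak of the cumulative curve"
    exclude         -- names of curves to leave out (for example bots)
    """
    kept = {name: values for name, values in curves.items() if name not in exclude}
    totals = [sum(month) for month in zip(*kept.values())]
    reference = max(totals) if reference_month is None else totals[reference_month]
    return {name: [v / reference for v in values] for name, values in kept.items()}


# Made-up data: the last entry plays the role of January, and the
# second entry is a one-off peak dominated by bot edits.
edits = {
    "registered": [300, 900, 400, 430],
    "anonymous": [60, 70, 65, 70],
    "bots": [500, 2000, 450, 500],
}

peak_scaled = normalize_curves(edits)
january_scaled = normalize_curves(edits, reference_month=-1, exclude=("bots",))
```

With these numbers the anonymous curve ends near 0.02 when scaled against the peak with bots included, but at 0.14 when bots are left out and January is used as the reference, although the underlying counts are the same.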
By the way, does anyone know what happened about three years ago, when so many projects seem to have plateaued?
Thanks for this analysis! It's mirrored some of the work that PARC and others have done, some of which is on this wiki. Philippe is currently undertaking the Herculean task of merging all this work into one place, and I hope he'll include this as well.
Regarding your question about plateauing project participation: This is the great unanswered question. The best hypothesis I've heard so far: the worldwide economic downturn.