What particular factors might have begun to inhibit participation in 2006, when we know it began to stagnate?

One of the participants in the discussion of this topic on my LiveJournal linked to a kerfuffle over pages being deleted in 2005 and 2006, and another person referred to a case where people who complained about admin actions got their accounts blocked. I have begun to look at deletion discussion logs and have requested access to data from page deletion and user block logs to see whether there was an upswing of editors being blocked or leaving due to debates over notability.

If you have other theories for relevant patterns that we might query log or survey data for, please suggest them.

We might also develop strategies for further surveys or interviews of people who chose not to become editors or to stop editing in order to determine root causes.

Netmouse 00:32, 30 October 2009

This is a really interesting question, and the more data the better. Just to brainstorm some possible explanations:

  • By 2006, WM had more veterans to claim "ownership" of content, making it hard for new people to contribute.
  • By 2006, Wikipedia was relying on "verifiability" to ensure quality (e.g.: featured articles had to be verified top to bottom) and avoid disputes, making it hard for new people to contribute.
  • By 2006, there were other major competitors for people's attention online that were just more fun and usable. (MySpace, Facebook, MMOs, alternative wikis?)
  • By 2006, people felt more and more that the major content had been created, and there were fewer gaps in content that were large enough to inspire new contributors.
  • By 2006, there was more infighting (e.g.: religious, ideological, etc.), with no end in sight, and people began to burn out more quickly.
  • By 2006, Wikipedia's reader traffic began to level off, and participation just followed by becoming less exponential.
  • ... something else.

I think all of these (and then some) contribute to the dropoff somewhat. Some just a few percent, but I think we could confirm every one of these is a factor. Much harder will be to figure out which one is the biggest factor.

Randomran 18:17, 30 October 2009

These are hard questions. When I first started editing there wasn't really any citation method in place. I just used to gambol around putting in what I "knew" and I wouldn't run into any resistance. My edits would stay even though I provided no citations. Things became more strict but I sort of grew up with things becoming more stringent in such a way that I just adapted as things became more "difficult". So I'm trying to think how I would have responded if, when I made my first few edits, I had been reverted with the summary "unreferenced". It's hard for me to see through the eyes of a newbie. That's something I'm going to have to try to do to come up with good answers for this task force.

Bodnotbod 19:33, 30 October 2009
 

As abandoned editor Rahere, I found a number of turn-offs:

1. Admin remoteness and slowness. I know, cost, but you're playing Oscar Zoroaster Phadrig Isaac Norman Henkel Emmannuel Ambroise Diggs's game a bit too obviously. Something this size needs charismatic leadership as the sine qua non - and that's rare. You must delegate a lot more through the Projects, and require them to get off their bums and lead.

2. Bot flyers - there seem to be more people inside Wikipedia doing it down than actually doing something to fix the problems. Why post a flyer complaining about something when you could get stuck in and fix it yourself? Are standards really that important? Are they really more important than IAR? More than content? If so, why not get specialists interested in that kind of thing to work? Orphaned pages have been beaten to death by the bots, not because of poor inherent value, but because they did not meet the requisite statistical norm - I now destroy flyers on sight, on the basis that if the fly-poster doesn't have the knowledge or commitment to put his editing where his mouth is, then he lacks the credibility to comment - standards are not one-size-fits-all across all of academia. Who wants to read a page covered in years' worth of stickers? The most valuable bot of all, I think, would be a cleaner, going through pages deleting any bot flyer over two weeks old! I looked after the Albigensian Crusade page for a bit, putting in citations at the request of a bot, posting totally unnecessary references to the source texts which are already indexed in parallel, being chronologies: when I checked, the bot owner told me I should know what I was doing and then flamed me, not for the quality of the citations, but for daring to ask him if he was satisfied thus far - so I desisted, see full records on the discussion page.

I was then complimented by general readership on the quality of my work, and a page whose Good Article status had been withdrawn - and which remained withdrawn because of what is functionally administrative BS - none the less was adopted by the Schools History program with the request that editing should be cautious. You can't have it both ways, folks: if it's good in the eyes of the specialists, then it's good, period, regardless of what some ivory-tower merchant thinks, IAR. That page was orphaned when I arrived, was orphaned again after Admin killed the foster-parent by neglect, and remains adrift, despite its value. That is what happens out there, folks, all too often. Ah, and another thought for something that's getting out of control - Project Bands on discussion pages. All of this belongs in the footnotes - what the reader needs is a superficial intro, a deeper study linking out to more specialist pages, and a summary.

3. Lack of general support and positive feedback. Sign-up-here to projects doesn't really get any reaction, so you're steering your own ship without direction.

4. Lack of "If you need to know more, contact so-and-so". It would be good on occasion to compare notes in advance.

You are welcome to write to me privately on my old address - Rahere also runs an occasional LiveJournal blog, if you've lost that.

81.241.227.84 18:15, 1 February 2010

See also this proposal - Brya 05:20, 2 February 2010 (UTC)

Brya 05:20, 2 February 2010
 

[I did not write this here. This is just one more of the vagaries of LiquidThreads (it never stops finding ways to malfunction). - Brya 05:33, 2 February 2010 (UTC)]

Brya 05:29, 2 February 2010
 
 

A couple of relevant things:

  • Late 2005 and early 2006 were probably the period of most intense media attention for Wikipedia, particularly because of the Seigenthaler incident and the Nature accuracy study (published in late 2005, and spread widely in early 2006). So that was when the most new people were introduced to Wikipedia for the first time.
  • Nevertheless, the net article creation rate is jittery in this period, falling in early 2006 before continuing to rise, and peaking around July 2006.
  • The end of 2005 is when new article creation was disabled for anonymous accounts. One possible interpretation is that this blunted wildly rapid and largely uncontrolled growth that was happening at that time, even while continued media exposure kept growth up until mid-2006, by which time most internet-connected people in English-speaking countries had already heard of Wikipedia and later media exposure had diminishing returns in terms of attracting new editors.
  • The speedy deletion criteria, which are the main ways of enforcing the idea that Wikipedia is a serious encyclopedia and not everything or everyone can have an article, essentially developed into their current form between early 2005 and October 2006.
  • As Randomran and Bodnotbod note, the rise of citations and the associated norms are surely a factor as well, both in terms of more confusing markup and norms for what counts as quality content that won't get deleted.
  • The null hypothesis, I'd say, is the "low-hanging fruit" explanation: by 2006, most of the things that most people wanted to write about had already been covered.

--ragesoss 22:30, 30 October 2009 (UTC)

ragesoss 22:30, 30 October 2009

One other thing worth noting, which I think is relevant for the null hypothesis of "low-hanging fruit": while article creation peaked in July 2006, editing frequency actually continued to rise until peaking in March 2007, and has declined somewhat (but not dramatically) since then. So there is a lag of eight months or so between the peak in article creation and the peak in overall editing activity/number of active editors. One explanation is that by 2006, most of the articles that most people wanted to write about had already been created, but they didn't seem complete. By early 2007, most articles that people wanted to write about were not only created, but also pretty well fleshed out, with few gaping holes for a newcomer to add sections to. --ragesoss 22:51, 30 October 2009 (UTC)

ragesoss 22:51, 30 October 2009

Interesting, ragesoss. I'm wondering if a similar lag time occurs on other language Wikipedias--we have the charts for these in the "analysis" section of the task force.

JohnF 23:13, 30 October 2009

Ragesoss has a really interesting theory, and it makes a lot of sense. If the same trend were true in the other Wikipedias, then we might be able to confirm that Wikipedia is experiencing a natural slowdown in growth, rather than some failure of vision. That said, my instinct tells me that "natural slowdown" is only part of the explanation, and that there are lots of things we can still do to make the community more vibrant. But let's investigate, if we can.

Randomran 23:55, 30 October 2009

Long mail ahead -- bear with me!

It seems to me that either A) article creation tends to peak at a certain number of articles, supporting the "low-hanging fruit" hypothesis. Or B) article creation tends to peak after a certain period of time has passed, or is associated with some other variable, and is unrelated to the number of articles. Which would debunk "low-hanging fruit."

So, from Erik Zachte's stats pages: New article creation in enWP peaked in July 2006, when enWP had 1.2 million articles. And, new article creation in svWP peaked in May 2006, at 162K articles.

That suggests that A, low-hanging-fruit, is false. If we looked at other language versions, and saw new article creation peaking at widely varying article counts, then A would be, in my view, thoroughly debunked.

Which leaves us looking for a different cause. I would say that then would leave us with three possibilities.

B1) If new article creation peaked across all language versions more-or-less simultaneously (meaning, on the same date), then I can only imagine that the cause is somehow both external to us, and global. Examples: a global blossoming of interactive sites lured away our editors, or, a terrible global economic collapse meant people everywhere needed to focus solely on paid work. (A non-external hypothesis: I also speculate sometimes: if Jimmy talked publicly, a lot, about quality post-Seigenthaler, then maybe that somehow engendered a large global increase in restrictiveness inside the editing communities. I can't think of any internal factor but Jimmy that would potentially have that kind of large global impact.)

B2) If new article creation peaked for each language version at roughly the same time post-launch (like, launch + five-and-a-half years), that would support the idea that our editing communities have a natural internal lifecycle. (That wouldn't mean the "natural" lifecycle was necessarily a positive one, but it would suggest that the cause of the peak is internal to each editing community.) I have sometimes wondered whether online communities have a certain period of time (possibly varying according to the nature of the community) during which they either thrive or fail: maybe that's true, and we have some of each type.

B3) If new article creation peaked for English in July 2006, and peaked for other language versions within a year afterwards, that would suggest to me that other language versions were possibly modelling their behaviours on the behaviours in English, regardless of their suitability. So for example, if "low-hanging fruit" were true for English, and English responded with a number of behaviours (such as more deletionism, more emphasis on "quality," higher barriers to new article creation, new focus on multimedia, etc.) -- then perhaps other language versions started adopting those same behaviours, accidentally triggering a premature peak of new article creation. That hypothesis has always sounded true to me (anecdotally, people in other language versions have told me stories that tend to support it), but Swedish new article creation peaking before English suggests that it is not true.

Does anyone have time to look at the stats pages and check a few other random languages for the date on which new article creation peaked, and the number of articles at that point? Because I am provisionally thinking, based on the Swedish example, that "low-hanging fruit" is not true.
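A rough sketch of how that check could be tabulated, assuming the peak month and the article count at the peak are copied by hand from the stats pages. Only the two data points quoted above (enWP and svWP) are filled in; the `Peak` record and the `summarize` helper are hypothetical names made up for this illustration, and more languages would need to be added before drawing any conclusion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    year: int
    month: int
    articles: int  # total article count when new-article creation peaked


def month_index(p: Peak) -> int:
    """Months since year 0, so that two peak dates can be subtracted."""
    return p.year * 12 + (p.month - 1)


def summarize(peaks: dict[str, Peak]) -> tuple[float, int]:
    """Return (largest/smallest article count at peak, spread of peak dates in months)."""
    counts = [p.articles for p in peaks.values()]
    months = [month_index(p) for p in peaks.values()]
    return max(counts) / min(counts), max(months) - min(months)


# The only two data points given in this thread; others are still to be looked up.
PEAKS = {
    "en": Peak(2006, 7, 1_200_000),
    "sv": Peak(2006, 5, 162_000),
}

size_ratio, date_spread = summarize(PEAKS)
print(f"article counts at peak differ by a factor of {size_ratio:.1f}")
print(f"peak dates differ by {date_spread} months")
```

A large size ratio together with a small date spread would point away from A and towards B1; the B2 check would additionally need each project's launch date, which is not included here.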

Sue Gardner 22:07, 11 December 2009
 

Yeah, I've always thought that the low hanging fruit explanation is, at best, a partial explanation. There are definitely other factors at play, some external, some internal. ... and we can really only have an impact on the internal factors.

As far as I know, some Wikipedias are more liberal, some are more tight, and this has had very little impact on article or editor growth. The real issue is cultural and behavioral, and a culture acts without policy (sometimes in spite of policy).

I think there is much more support for the hypothesis that there is more friction in the community, due to its increasing size, and maybe due to the kinds of people it attracts, and the kinds of conflicts that have emerged due to Wikipedia's popularity. Check the rise in administrator incidents, and then use the same tool to look up Wikipedia:Wikiquette alerts. People are just fighting a lot more, and I'm not so sure that we've been able to address the real root causes of those fights.

I'd like to take a closer look at Erik's stats. It's unfortunate that the best study I've been able to look at has come from an external source. I think there are some important trends worth looking at.

Randomran 22:27, 11 December 2009
 
Edited by 3 users.
Last edit: 11:26, 28 January 2010

The data presented from the earlier fact base work indicates that there doesn't seem to be a relationship between new article growth and participation in the English, French, and German Wikipedias:

What this data says is that participation tends to plateau after a period of time, but that the most active contributors get even more active in expanding content. I will also look for a visualization that Erik did - I think it shows a similar take-off.

The data on increased reverts from Ed Chi speaks to the original question that Netmouse posed. This data actually raises a question about whether there really was much of a shift in 2006? Looks to me like there has been a rather consistent increase in reverts over time. This may speak to continuous tightening of editorial standards/control across the project over time rather than a one time increase.

One other helpful source of article growth info is Erik Z's visualization.

BarryN 01:04, 13 December 2009
 

There's a lot of useful information in the wikistats in terms of cross-language comparisons, and there isn't much analysis that I'm aware of. Just from an initial eye-ball estimate at a few of the top languages, there does indeed seem to be a pattern of new article rate peaking around 6-18 months before editing rate and active contributors peak.

When I get a chance, I'll try to plot total articles against active contributors for a bunch of languages, which should give some indication of the extent to which the opportunity to start new articles (probably the most important form of low-hanging fruit) is what draws people into the community.--ragesoss 20:35, 5 November 2009 (UTC)

ragesoss 20:35, 5 November 2009

While this pattern is interesting, we should avoid jumping to the conclusion that the theoretical explanation for it that led to this examination of the data (e.g. "most of the articles that most people wanted to write about had already been created") is in fact the most significant cause of the pattern. We should probably cross-reference with other data like number of people who are participating regularly, number of people with admin privs, number of people "touching" each article (or debating them on the talk pages), and rate of article deletion. It may be the case that a slow-down in article creation doesn't indicate a slow-down in the number of interesting topics people want to introduce to Wikipedia, but rather that topics get harder to introduce due to an "entrenched" community watching and reacting to article production, and people get discouraged from trying to introduce more obscure topics that are not part of the knowledgespace of the existing editor/admin society (or even non-obscure topics) because they have less time to fiddle with an article all on their own, they start suffering cross-editing activity that requires merging, they get into debates/wars with other editors who have a different vision for the article, or their work (or whole article) gets deleted or tagged as needing improvement.

(The tags identifying how an article needs improvement are supposed to encourage good articles and more work on the articles, but I suspect that newbie editors find them challenging--not always in a fun way--and possibly discouraging.)

Netmouse 21:04, 5 November 2009

Yeah, I don't think we'd want to make a specific conclusion like "articles are already created, and people ran out of things to contribute". But we could probably make a broad conclusion that there is a natural slowdown, and that it's not based on idiosyncrasies like the introduction of flagged revisions, or changing the privileges for anonymous editors. There are things that happen as a mass of people develops into a community with a culture, and we need to create a better on-ramp for new users so they can mingle and embrace that culture.

Randomran 22:26, 5 November 2009
 

I am the antithesis of your average Wikipedian: female, 56, did not finish 10th grade and an autodidact. I also consider the concept and execution of "political correctness" to be obscene.

Bluntly then:

  • Wikipedia is top heavy with policies and guidelines, in language that is NOT easily comprehensible; it discourages participation from the get-go. In other words WP has become "Nerdsville". (And no offense to nerds intended, either!) A major overhaul and condensation is indicated.
  • Advertise in language that is short, sweet and to the point. Run banners that change every minute of the day:
Got the info you wanted? Tell us something you know!
If you learned something useful - leave some knowledge behind!
Know something interesting? Put it here!
People like you make the difference! Join Wikipedia today!
Everyone knows something important. Share it with us! etc.
  • In business an owner will almost always choose a manager in their own image, i.e. someone they understand and can relate to (or crudely: 'Fish always stinks from the head'). WikiMedia need to bring an iconoclast or two aboard.
  • Finally: get a professional trouble-shooter with a proven track-record to analyze the problem. "You get what you pay for."
Shir-El too 20:39, 19 January 2010
 
 
 
 

That's a lot of useful information. I almost always come to Wikipedia logged into my user account, and now that you mention it I *have* noticed that there are barriers to creating (and editing) pages anonymously. Like Bodnotbod noted, we need to try to see through the eyes of newbies as we consider these questions; editing anonymously from time to time might help with that. Netmouse 19:24, 5 November 2009 (UTC)

Netmouse 19:24, 5 November 2009
 

I have done a new analysis of Wikipedia's growth. It seems that the new-article rate fits a two-phase model fairly well: exponential growth (doubling every 11 months) until 2005, exponential decay (halving every 5 years) since 2006:


[Figures: new article rate N'(t), linear scale and log scale]

The dots are the data, the solid lines are the model. The fairly abrupt transition in 2006 rules out the "low hanging fruit" theory, and the steady decline after 2006 (instead of a sudden drop and gradual recovery) rules out "bad media image". The best explanation that I can think of for the shape of that graph is by assuming that

  1. over 90% of the new articles are created by regular editors (as opposed to newbies)
  2. a reader only feels the need to register after creating one article as an IP user.
  3. regular editors leave or become less active with a half-life of 5 years or so
  4. the rate at which new editors were recruited was growing exponentially until 2005
  5. that rate dropped to nearly zero in 2006
  6. the cause was some change in wikipedia (not an event in the outside world)
  7. regular editors were not affected by that change

Assumptions 4 and 5 seem necessary to explain the exponential growth of the regular editor corps until 2005, and the lack of growth after that. Assumptions 1 and 7 seem necessary to explain why the new article rate did not drop immediately when the recruitment rate fell to nearly zero. Assumption 3 then explains the decay since 2006. Assumption 6 seems the only way to explain the abruptness and persistence of the 2006 drop in recruitment. Finally, assumption 2 provides a possible explanation for that drop: namely, the policy that prevents article creation by IP users, which closed the main and most natural path through which readers used to become regular editors.
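For anyone who wants to play with the two-phase model, here is a minimal sketch of its shape in Python. It is not the fitting code behind the graphs. The doubling and halving times are the ones stated above; the single sharp breakpoint at t = 0, the normalization of the peak rate to 1.0, and the name `new_article_rate` are simplifying assumptions made only for this illustration.

```python
DOUBLING_MONTHS = 11.0  # from the post: the rate doubles every 11 months before the peak
HALVING_MONTHS = 60.0   # from the post: the rate halves every 5 years after the peak


def new_article_rate(months_from_peak: float, peak_rate: float = 1.0) -> float:
    """Relative new-article rate N'(t), with t in months from an assumed breakpoint.

    Negative t is the growth phase, positive t is the decay phase.
    peak_rate is an arbitrary normalization, not a measured value.
    """
    if months_from_peak <= 0:
        # Growth phase: the rate doubles every DOUBLING_MONTHS up to the breakpoint.
        return peak_rate * 2.0 ** (months_from_peak / DOUBLING_MONTHS)
    # Decay phase: the rate halves every HALVING_MONTHS after the breakpoint.
    return peak_rate * 2.0 ** (-months_from_peak / HALVING_MONTHS)


for t in (-22, -11, 0, 60, 120):
    print(f"{t:>4} months from peak: {new_article_rate(t):.3f} of peak rate")
```

Under these assumptions the rate is a quarter of its peak value 22 months before the breakpoint and again 10 years after it, which shows how asymmetric the two phases are.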

143.106.24.25 04:54, 30 January 2010
 

Signing the previous entry: --Jorge Stolfi 05:05, 30 January 2010 (UTC)

Jorge Stolfi 05:05, 30 January 2010