Editor Trends Study
This is a research project supported by the Wikimedia Foundation. For other ongoing research projects, please see the project tracking page on Meta.
This study aims to understand trends among Wikimedia editors. The main challenge is that many users are joining, but the number of “Active Editors” is not growing. Who is leaving: newbies or elites?
Current result: Non-vandal newbies are the ones leaving.
Caveat: it is important to note that the measures used in the above description (namely “New Wikipedians” and “Active Editors”) are not entirely comparable (but even with this distinction, the broad pattern still holds). We will use the current definitions:
- New Wikipedians: any editor who has completed at least 10 cumulative edits at any moment in history is considered a “New Wikipedian” in the month of the 10th edit.
- Active Editors: any editor who makes at least five edits within a given month.
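As a rough illustration of the two definitions above, the following Python sketch derives both measures from a list of edits. The input format (pairs of editor name and edit date) and the function names are assumptions made for this example; they are not part of the study's actual tooling.

```python
from collections import Counter
from datetime import date


def new_wikipedian_months(edits, threshold=10):
    """Map each editor to the (year, month) in which they made their
    `threshold`-th cumulative edit. Editors below the threshold are omitted."""
    counts = Counter()
    result = {}
    # Process edits in chronological order so the cumulative count is meaningful.
    for editor, day in sorted(edits, key=lambda e: e[1]):
        counts[editor] += 1
        if counts[editor] == threshold:
            result[editor] = (day.year, day.month)
    return result


def active_editors(edits, year, month, min_edits=5):
    """Return the set of editors with at least `min_edits` edits in the month."""
    per_editor = Counter(
        editor for editor, day in edits
        if (day.year, day.month) == (year, month)
    )
    return {editor for editor, n in per_editor.items() if n >= min_edits}


# Hypothetical sample data: (editor, date of edit).
sample_edits = [("alice", date(2010, 6, d)) for d in range(1, 7)]
sample_edits += [("alice", date(2010, 7, d)) for d in range(1, 5)]
sample_edits += [("bob", date(2010, 7, d)) for d in range(1, 4)]

print(new_wikipedian_months(sample_edits))
print(active_editors(sample_edits, 2010, 6))
```

In this sample, one editor can be a New Wikipedian in a month in which she is not an Active Editor, which is one reason the two measures are not directly comparable.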
Note: The definition of “leaving” is a complex one. We know that editors take wikibreaks, so the fact that an editor hasn’t edited for a certain amount of time doesn’t guarantee that the editor has left the project. This project does not assume a specific definition of “leaving.” Rather, measurements of retention are used to provide an approximation.
This project aims to build a better understanding of editor trends within Wikipedia projects. Every month, roughly 15–19K users become “New Wikipedians”, yet the number of “Active Editors” (at least 5 edits/month) has not grown accordingly (source: Wikimedia Statistics).
With an influx of New Wikipedians equivalent to almost 20% of the existing Active Editor base, one would expect the number of Active Editors to increase steadily over time. The fact that the number of Active Editors has not increased suggests that the community is losing editors as fast as it is gaining editors. This had been previously predicted in Top_risks_2009 and Participation Crisis. An alternative explanation is that a smaller proportion of new editors are becoming fully active.
| Month | New Editors | Active editors (> 5 e/mo) | Very active editors (> 100 e/mo) |
| Jul-10 | 15,380 | 81,474 | 10,411 |
| Jun-10 | 16,143 | 83,087 | 10,574 |
| May-10 | 18,339 | 88,465 | 11,115 |
| Apr-10 | 17,861 | 86,577 | 10,871 |
| Mar-10 | 19,276 | 90,024 | 11,224 |
| Feb-10 | 17,823 | 85,610 | 10,747 |
| Jan-10 | 19,481 | 90,758 | 11,475 |
| Dec-09 | 17,469 | 84,478 | 10,443 |
| Nov-09 | 18,155 | 86,328 | 10,510 |
| Oct-09 | 18,687 | 86,793 | 10,836 |
| Sep-09 | 17,474 | 84,575 | 10,834 |
| Aug-09 | 18,791 | 87,456 | 11,103 |
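The “almost 20%” influx figure can be checked against the monthly numbers above. The sketch below is only an arithmetic illustration: the figures are copied from the table, while the variable and function names are chosen for this example.

```python
# (new editors, active editors) per month, copied from the table above.
MONTHLY = {
    "Aug-09": (18791, 87456),
    "Sep-09": (17474, 84575),
    "Oct-09": (18687, 86793),
    "Nov-09": (18155, 86328),
    "Dec-09": (17469, 84478),
    "Jan-10": (19481, 90758),
    "Feb-10": (17823, 85610),
    "Mar-10": (19276, 90024),
    "Apr-10": (17861, 86577),
    "May-10": (18339, 88465),
    "Jun-10": (16143, 83087),
    "Jul-10": (15380, 81474),
}


def influx_ratio(new_editors, active_editors):
    """New Wikipedians in a month as a fraction of that month's Active Editors."""
    return new_editors / active_editors


ratios = {month: influx_ratio(*row) for month, row in MONTHLY.items()}
# Change in Active Editors between the first and last month of the table.
net_change = MONTHLY["Jul-10"][1] - MONTHLY["Aug-09"][1]
# New Wikipedians added over the same twelve months.
total_new = sum(new for new, _ in MONTHLY.values())

for month, ratio in ratios.items():
    print(f"{month}: {ratio:.1%}")
print(f"New Wikipedians over the period: {total_new}")
print(f"Change in Active Editors over the period: {net_change}")
```

Comparing the cumulative number of New Wikipedians with the net change in Active Editors over the same period makes the gap described in the text concrete.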
Other researchers who have thought about editor trends include:
- On the distressing trend of editors leaving Wikipedia
- WikiProject Wikipedia Reform - Attrition Study: Reasons for leaving
- Finding social roles in Wikipedia
- Readers are not free-riders: Reading as a form of participation on Wikipedia
- Wikipedia burnout analysis
- Wikipedia: A quantitative analysis, see the PDF Version 9.2 MB
- Rethinking Wikipedia contribution rates
- Interface and tools for community health
- Growth trajectories of different Wikimedia projects
The question, then, is: which editors are leaving, the new ones or the more tenured ones? This study is designed to help the Wikipedia community better understand this dynamic from a quantitative standpoint. If this method of analysis proves insightful, we can repeat it for other projects.
Two Sets of Analysis
We are proposing the following two sets of analysis to help us get a quantitative understanding of these broad editor trends:
1. Active editor composition by tenure
In this analysis, we would take the monthly Active Editor number and break it down by “tenure”, that is, how long the editors have been editing Wikipedia. For example, for Jul 2010, of the 36,148 Active Editors on the English Wikipedia, how many made their first edit within the past 3 months? 6 months? One year? 2 years? 3 years? We could also do the same analysis with the date that they became a “New Wikipedian” as the starting point. By comparing the Jul 2010 composition with, say, the Jul 2009 composition, we should get a sense of how the tenure mix of our Active Editors is changing.
User:WereSpielChequers has done a similar analysis of administrators on enwiki.
2. Cohort analysis of New Wikipedians
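A minimal sketch of the tenure breakdown, assuming we already have the first-edit date of every Active Editor in the month of interest. The bucket labels, the function names, and the choice of non-overlapping buckets (rather than cumulative "within the past N months" counts) are assumptions made for this illustration.

```python
from collections import Counter
from datetime import date

# Upper bound of each tenure bucket in whole calendar months, with a label.
TENURE_BUCKETS = [
    (3, "<= 3 months"),
    (6, "<= 6 months"),
    (12, "<= 1 year"),
    (24, "<= 2 years"),
    (36, "<= 3 years"),
]


def months_between(first, ref):
    """Number of calendar months from `first` to `ref` (day of month ignored)."""
    return (ref.year - first.year) * 12 + (ref.month - first.month)


def tenure_bucket(first_edit, ref):
    """Return the label of the first bucket that contains the editor's tenure."""
    months = months_between(first_edit, ref)
    for upper, label in TENURE_BUCKETS:
        if months <= upper:
            return label
    return "> 3 years"


def tenure_composition(first_edits, ref):
    """Share of editors per tenure bucket.

    `first_edits` maps each Active Editor to the date of their first edit.
    """
    counts = Counter(tenure_bucket(d, ref) for d in first_edits.values())
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical Active Editors for July 2010.
sample = {
    "a": date(2010, 6, 1),
    "b": date(2009, 12, 15),
    "c": date(2005, 1, 1),
    "d": date(2010, 7, 3),
}
print(tenure_composition(sample, date(2010, 7, 31)))
```

Running the same function for July 2009 and July 2010 and comparing the two dictionaries would give the year-over-year shift in tenure mix.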
A related analysis is to understand cohorts of New Wikipedians at different points in time and how the age of the cohort affects their likelihood of staying with Wikipedia. This is similar to the previous analysis, but looking forward instead of looking backwards.
We would take a cohort of, say, all users who became New Wikipedians in January 2009 and determine the number who were Active Editors 3 months later, 6 months later, 1 year later, etc. The January 2009 cohort could then be compared with the January 2010, January 2008, January 2007, etc. cohorts.
3. A related analysis would measure how long it takes a first-time editor to become a New Wikipedian. Taking all New Wikipedians in, say, July 2010, we would break out:
- Number who reached the 10-edit mark within 1 day, 3 days, 7 days, 14 days, 1 month, 2 months, 3 months, 6 months, etc.
- Within each of these groups, % that edited Article/Talk, User/Talk.
- Average number of edits/day on days when editing occurs
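The cohort comparison can be sketched as follows. The code assumes a cohort is given as a set of editor names, that edits are (editor, date) pairs, and that "active N months later" means meeting the Active Editor threshold in the calendar month N months after the cohort month. All names are hypothetical, and this is one possible reading of the analysis rather than the study's implementation.

```python
from collections import Counter
from datetime import date


def add_months(year, month, offset):
    """Return the (year, month) that lies `offset` months after (year, month)."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def cohort_retention(edits, cohort, cohort_month, offsets=(3, 6, 12), min_edits=5):
    """Fraction of `cohort` with at least `min_edits` edits in the month that
    falls `offset` months after `cohort_month`, for each offset."""
    # Edits per (editor, year, month), restricted to cohort members.
    monthly = Counter(
        (editor, day.year, day.month)
        for editor, day in edits
        if editor in cohort
    )
    retention = {}
    for offset in offsets:
        year, month = add_months(cohort_month[0], cohort_month[1], offset)
        active = sum(
            1 for editor in cohort if monthly[(editor, year, month)] >= min_edits
        )
        retention[offset] = active / len(cohort) if cohort else 0.0
    return retention


# Hypothetical January 2009 cohort of two editors.
cohort = {"a", "b"}
edits = [("a", date(2009, 4, d)) for d in range(1, 6)]   # active 3 months later
edits += [("a", date(2010, 1, d)) for d in range(1, 6)]  # active 12 months later
print(cohort_retention(edits, cohort, (2009, 1)))
```

Calling the function once per cohort month yields one retention curve per cohort, which can then be compared across years.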
- One interesting fact: about 90% of users that create an account and make at least one edit within 10 days of creating that account do so the same day they create their account. This could suggest a relatively compressed timeline for making the initial edit(s) on Wikipedia.
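The first item in the breakdown above could be computed along these lines. The sketch assumes each editor's edit timestamps are available, measures time from the first edit to the 10th edit, and approximates one month as 30 days; these choices and all names are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# Thresholds in days: 1, 3, 7, 14 days, then 1, 2, 3 and 6 months
# (a month is approximated as 30 days here).
THRESHOLDS_DAYS = (1, 3, 7, 14, 30, 60, 90, 180)


def days_to_nth_edit(timestamps, n=10):
    """Days elapsed between an editor's first and n-th edit, or None if the
    editor has fewer than n edits."""
    ordered = sorted(timestamps)
    if len(ordered) < n:
        return None
    return (ordered[n - 1] - ordered[0]).total_seconds() / 86400


def cumulative_breakdown(days_list, thresholds=THRESHOLDS_DAYS):
    """For each threshold, count the editors who reached the mark within it."""
    return {t: sum(1 for d in days_list if d <= t) for t in thresholds}


# Hypothetical editor making ten edits, one per hour.
start = datetime(2010, 7, 1, 12, 0)
hourly = [start + timedelta(hours=h) for h in range(10)]
print(days_to_nth_edit(hourly))
print(cumulative_breakdown([0.375, 5.0, 45.0]))
```

The counts are cumulative, so an editor who reaches the 10-edit mark within one day is also counted under every longer threshold.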
On Five Projects
Each Wikipedia has its own community with unique dynamics. While providing this data Wikipedia-wide could be informative, the real insight would probably be gained by having the data at the language-project level. We recommend researching the following Wikipedias:
- English Wikipedia
- German Wikipedia
- Russian Wikipedia
- French Wikipedia
- Japanese Wikipedia
We will start with the English, German, and Russian Wikipedias, and will look into French and Japanese if we have time.
The chart below, from the study, shows this quite clearly for the English Wikipedia. It plots the number of active editors (blue) against the percentage of editors who joined in a given month and are still active one year later (red). Please note that these trends hold true even when looking at new users who have completed at least 50 edits: it is not just an increase in experimentation and vandalism... Newbies are making up a smaller percentage of editors overall than ever before, and the absolute number of newbies is dropping as well.
This analysis represents the first step in understanding editor trends over time. The research process will likely be iterative: once we complete this analysis, we are sure to find areas to pursue that we haven’t thought of. We will evaluate these new directions for further research as they come. These new directions may include other research methods, such as focus groups, surveys, etc.
As a follow-up to the two sets of analysis already underway, we will conduct an experimental study of a certain category of edits (regardless of whether the editors making them are new or not). As noted in discussions about the Editor Trends Study and elsewhere, edit count alone is not a definitive tool for measuring community health. Many types of volunteer work that are vital to the growth and maintenance of the projects do not produce high edit counts. Authorship of Wikipedia articles in the main namespace is one of these activities. For example, editors may compose high-quality drafts offline, and then add them to the site later in one or two edits.
Thus we are going to attempt to get a rough count of the number of Wikipedians who do a measurably significant amount of work on articles, as opposed to any other type of edit. The initial metrics we have developed for this count are as follows:
One paragraph: approximately 1024 characters (1 KB)
- (A.1) Active author: > 1 KB added to the article namespace per month
- (A.2) Very active author: > 5 KB added to the article namespace per month
- (B.1) Active article talker: > 1 KB added to the article talk namespace per month
- (B.2) Very active article talker: > 5 KB added to the article talk namespace per month
- (C.1) Active meta debater: > 1 KB added to the Wikipedia/Wikipedia talk namespaces per month
- (C.2) Very active meta debater: > 5 KB added to the Wikipedia/Wikipedia talk namespaces per month
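The thresholds above could be applied to revision data roughly as follows. The sketch assumes each revision is available as (editor, namespace number, change in page size in bytes) for a single month, and that only positive size changes count as added text; both are assumptions for illustration, as are the function and role names. The namespace numbers are the standard MediaWiki ones (0 for articles, 1 for article talk, 4 and 5 for Wikipedia and Wikipedia talk).

```python
from collections import defaultdict

KB = 1024  # one paragraph, approximately

# MediaWiki namespace number -> role measured in that namespace.
ROLE_BY_NAMESPACE = {
    0: "author",          # article namespace (A)
    1: "article talker",  # article talk namespace (B)
    4: "meta debater",    # Wikipedia namespace (C)
    5: "meta debater",    # Wikipedia talk namespace (C)
}


def classify(bytes_added):
    """Return 'very active', 'active', or None for a monthly byte total."""
    if bytes_added > 5 * KB:
        return "very active"
    if bytes_added > 1 * KB:
        return "active"
    return None


def monthly_roles(revisions):
    """Map (editor, role) to an activity level for one month of revisions.

    `revisions` is an iterable of (editor, namespace, size_delta) tuples.
    Only positive size deltas are counted (an assumption of this sketch).
    """
    added = defaultdict(int)
    for editor, namespace, delta in revisions:
        role = ROLE_BY_NAMESPACE.get(namespace)
        if role is not None and delta > 0:
            added[(editor, role)] += delta
    levels = {}
    for key, total in added.items():
        level = classify(total)
        if level is not None:
            levels[key] = level
    return levels


# Hypothetical revisions for one month.
sample = [
    ("a", 0, 6000), ("a", 1, 500),
    ("b", 4, 800), ("b", 5, 400),
    ("c", 0, -3000), ("c", 0, 1024),
]
print(monthly_roles(sample))
```

Because the thresholds are strict (> 1 KB, > 5 KB), an editor who adds exactly 1024 bytes in a month is not counted as active in this sketch.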
We will be refining our metrics as the study continues, so keep in mind that this is only an initial effort at taking a more fine-grained look at trends in editing. If you'd like to suggest ways we could refine our count in order to get a substantive feel for article authorship, please do so on the talk page.
- See also: this research project