Product Whitepaper


Introduction

Product priorities are informed by our five-year plan

The objective of this whitepaper is to provide the facts and analysis required to determine how to prioritize WMF product development efforts. This document will:

  • Start with the strategic priorities defined in Wikimedia's five-year plan.
  • Provide yearly business targets for each of the Priorities.
  • Put these goals within the updated context of (a) the state of Wikimedia projects as of late 2010/early 2011, and (b) some research on the broader web landscape.
  • Use the yearly targets and updated context/learnings to establish a Product Roadmap for WMF development efforts.

"Product" here means the entirety of technology through which people receive and develop Wikimedia content, whether that be through the regular website, a mobile gateway, an offline copy, etc. It does not attempt to capture engineering and research processes, code review/QA, or implementation details. Nor does it include purely community-focused programs and activities.

Strategic Plan and Business Goals

The Strategic Plan established Movement Priorities for 2010-2015. The Movement Priorities consist of the following goals: Increase Reach, Improve Content Quality, Increase Participation, Stabilize Infrastructure, and Encourage Innovation.

Movement Priorities – Five Year Targets

In October 2010, the Board passed a resolution approving the following targets for July 2015:

  • Reach:
    • 1B unique visitors per month (up from ~424M in July 2010)
  • Quality:
    • Metric TBD, but a target of a 25% increase in quality relative to July 2010 has been established
  • Participation:
    • Number of articles: 50M (up from 16.1M in July 2010)
    • Active editors: 200,000 (up from ~85,000 in July 2010)
  • Diversity: double percentage of Global South (18.5% to 37%) and female (12.6% to 25%) contributors

These are “big hairy audacious goals” (BHAG), intended to inspire and rally Wikimedia’s collective energy. They are understood to be highly ambitious, and perhaps impossible to achieve. For the purposes of our planning, we will treat them as stretch targets and seek to align our programmatic work so as to reach them. They deliberately omit “infrastructure” and “innovation” (which are among the larger movement priorities) because these areas are seen as internal rather than external measures of success. That said, within the product context below, we do include innovation as an explicit priority, because it relates to certain product investments.

The five-year targets set by the Board will be complemented by additional key performance indicators developed and monitored by the staff on an ongoing basis. These KPIs will establish baselines and targets for:

  • Broken-out reach indicators, such as geographically broken-down unique visitors, desktop/mobile/offline usage stats, time-on-site
  • Financials, such as number of individual donors, financial reserves, and percentage of restricted funds
  • Infrastructure, such as uptime of key services, site load time in different parts of the world, time-to-interactivity, availability of secure off-site copies, public content snapshots
  • Content, such as number of media files, content objects in other Wikimedia projects
  • Community health, such as editor retention and satisfaction

For the purposes of determining product development priorities, we are focusing first and foremost on the five-year targets set by the Board. Additional key performance indicators (especially those related to content and community) will also inform product development.

Movement Priorities: Business Targets for 2011

For planning purposes, the five-year targets need to be broken down on a yearly basis. To do so, we need to significantly refine our planning as to which program activities are likely to help us achieve these targets, and to improve our forecasting.

For example, we may determine that in order to meet our 2015 target of 200k contributors we need to grow our contributor base to 100k contributors in 2011. Starting from roughly 85k active editors, this target would require the Wikipedia projects to add 15k new contributors (net) over the next year. We should be able to assign projects and estimate each project’s contribution to this target. While both Community and Technology projects will be included in the yearly plan, the focus of this document is on Technology projects that help the movement achieve its growth goals.
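
One way to break a five-year target into yearly targets is sketched below. It assumes a constant compound growth rate between the July 2010 baseline and the July 2015 target; that assumption, and the function name yearly_targets, are illustrative only and are not the planning method used in this document.

```python
def yearly_targets(baseline, target, years):
    """Return (annual growth rate, list of yearly targets).

    Assumes constant compound growth from baseline to target.
    This is an illustrative assumption, not the plan's method.
    """
    rate = (target / baseline) ** (1 / years) - 1
    targets = [baseline * (1 + rate) ** n for n in range(years + 1)]
    return rate, targets


# Active editors: ~85,000 in July 2010, target of 200,000 for July 2015.
rate, targets = yearly_targets(85_000, 200_000, 5)
print(f"Required annual growth: {rate:.1%}")
for offset, value in enumerate(targets):
    print(f"July {2010 + offset}: {value:,.0f} active editors")
```

Under this assumption the first-year figure lands close to the 100k used in the example above. A different growth shape (for instance, slower growth early and faster growth later) would give different intermediate targets.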

The business targets should also reflect expectations of growth from mature projects and emerging projects. The drivers of growth are likely to vary based on where a project is in its lifecycle.

Framework for Strategic Product Analysis

The diagram above represents the overall framework for the Product Planning process. As stated, we start off with the five-year movement priorities and strategic goals to establish the business targets. We then put these targets within the context of:

  • Community Trends: What are the major trends within the community? How are the different user groups evolving? How are their needs changing? Is there a healthy flow of users going from one stage in their lifecycle to another? Is there a healthy inflow of new contributors? How are projects retaining valuable contributors? How do these dynamics vary across projects?
  • External Trends: What are the developments in the overall landscape of the web that could affect how users interact with WMF projects? Are there broad behavior patterns that could impact the likelihood of new users contributing/tenured users continuing to contribute? How have user expectations been shifting and what could be the likely impact on WMF projects? Are there other disruptive technologies we need to anticipate?

Imagine each area of product development as a card that has written on it a description of that feature or product area (e.g. "slideshows for images shown in articles") and a hypothesis about how that feature will impact the strategic priorities (e.g. "will increase likelihood that readers fully explore the content of an article").

The stack of "feature cards" could easily contain hundreds of individual cards: How do we sort them? The answer is the validation of hypotheses by experimentation ("Let's build a prototype slideshow gadget and see whether it increases usage"), but also by checking them against the established priorities, the trends in the community, and the surrounding context. This validation process ultimately needs to result in a prioritization of product areas. We can reach conclusions relevant to our short term decision-making, and posit questions that need to be answered to support longer term planning.

Product areas relevant to Wikimedia and impact hypotheses are described in further detail later in the document.

Basic User Model

There are limitless ways to get involved with and to contribute to Wikimedia projects. The Contribution Taxonomy Project is an effort to create a very granular view of these different roles.

An initial draft of a detailed taxonomy of contributions to Wikimedia projects

For the purposes of this document, we are using a simplified user model which does not attempt to account for the full breadth and depth of contribution types: readers are users who participate passively in the Wikimedia projects. They can potentially be converted into new editors. After intense contribution over an extended period of time, they may become advanced editors, which often goes along with taking on more specialized functions in the Wikimedia community and acquiring additional access privileges, such as administrator status.

Community Trends: Reach

Readership Growth

Readership of WMF projects continues its historically strong growth. As of December 2010, WMF projects had approximately 395 million unique visitors, an increase of roughly 14% from December 2009. In the same month, WMF sites generated 13.9B page views, a 22% increase from December 2009.

Source: comScore Media Metrix; Wikimedia server logs (pageviews)

Indexing the unique visitors per region to 100, we can more clearly see which regions show the strongest growth in unique visitors. India is broken out separately because it represents a strategic priority for WMF. The three strategic priority regions identified by WMF (India, Brazil, and the Middle East/North Africa) all show strong growth trends.

Source: comScore Media Metrix
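
As a minimal sketch of what "indexing to 100" means: each region's series is divided by its own first value and multiplied by 100, so that regions of very different size can be compared by growth alone. The region names and figures below are made up for illustration and are not comScore data.

```python
def index_to_base(series, base=100.0):
    """Rescale a series so that its first value equals `base`."""
    first = series[0]
    if first == 0:
        raise ValueError("cannot index a series that starts at zero")
    return [base * value / first for value in series]


# Hypothetical unique visitors (millions) for two made-up regions.
regions = {
    "Region A": [50.0, 55.0, 60.0],
    "Region B": [10.0, 13.0, 16.0],
}

indexed = {name: index_to_base(values) for name, values in regions.items()}
for name, values in indexed.items():
    print(name, [round(v, 1) for v in values])
```

In this made-up example Region B is much smaller in absolute terms, but its indexed series rises faster, which is the kind of comparison the indexed chart is meant to support.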

There are a number of possible explanations for the continued readership growth, including:

  • Internet growth: The global Internet-connected population continues to expand.
  • Content: WMF projects have increasing breadth of content at a requisite level of quality.
  • Search: The content on WMF sites is quickly indexed and highly ranked in all key search engines.
  • Lack of direct competition: For many of our larger projects, there are no significant direct competitors (e.g., online encyclopedias), though there are numerous other ways users may obtain information.

Wikimedia Ranking

WMF projects are currently ranked 5th in worldwide usage, measured by Unique Visitors per month. WMF projects were ranked 4th overall for most of 2008 and the first half of 2009. In July 2009, Facebook overtook WMF projects to take 4th place overall.

Worldwide Ranking by Monthly Unique Visitors (comScore): November 2010

Rank  Site                         UV (Nov 2010, thousands)
1     Google Sites                 970,109
2     Microsoft Sites              869,373
3     Facebook                     647,482
4     Yahoo! Sites                 630,275
5     Wikimedia Foundation Sites   410,816
6     Amazon Sites                 252,698
7     AOL, Inc.                    239,205
8     Ask Network                  233,155
9     eBay                         229,915
10    CBS Interactive              229,265

Wikimedia's worldwide ranking can in large part be attributed to its availability in more than 250 languages, whereas many other top websites are only available in one country's language(s), or in a set of languages widely spoken by Internet users. In the United States, WMF projects rank only 12th.


United States Ranking by Monthly Unique Visitors (comScore): November 2010

Rank  Site                         UV (Nov 2010, thousands)
1     Yahoo! Sites                 180,987
2     Google Sites                 178,726
3     Microsoft Sites              175,731
4     Facebook                     151,722
5     AOL, Inc.                    114,484
6     Ask Network                  92,369
7     Glam Media                   89,864
8     CBS Interactive              88,017
9     Turner Digital               86,452
10    Amazon Sites                 83,875
11    Viacom Digital               81,710
12    Wikimedia Foundation Sites   77,773
13    New York Times Digital       72,280
14    Apple Inc.                   69,896
15    Fox Interactive Media        69,184

Community Trends: Contributors

(Note: Since the combined Wikipedia language editions have by far the largest contributor base of all Wikimedia projects, we are focusing on them. That being said, trends in other Wikimedia projects require further study. For example, Wikimedia Commons has more than 6,000 active contributors, and has experienced strong contributor growth through 2010.)

Every month, Wikipedia projects have a stable base of approximately 80-90k Active Editors and 10k Very Active Editors contributing to the projects (Active Editors are defined as editors who make at least 5 edits in a given month; Very Active Editors are defined as editors who make at least 100 edits in a given month). In addition, approximately 17-20k users become “New Wikipedians” every month (editors who complete their first 10 cumulative edits in that month). While the number of “New Wikipedians” is certainly small compared to the overall Unique Visitor count of 411M, it represents approximately 20% of the Active Editor count, suggesting that there is, on a relative basis, a material inflow of new contributors to Wikipedia projects every month. Even so, there are two points to note:

  1. Even though there is a significant number of New Wikipedians every month, the total number of Active Editors has not grown. This suggests that there are a sizeable number of users that “leave” Wikipedia (the definition of “leave” is a complex one, which we will leave open for now). At present, we do not know whether these New Wikipedians are the ones who are leaving, or whether Wikipedia is losing its more experienced editors.
  2. Over time, the Editor growth (or lack thereof) has become more and more decoupled from the Reader growth. With Readers growing at a 20+% rate and Editors slightly declining, the number and fraction of Readers that choose not to edit is increasing (see related analysis and discussion in Mako Hill's blog).

Month    New Wikipedians   Active Editors      Very Active Editors
                           (≥5 edits/month)    (≥100 edits/month)
Sep-10   14,327            79,413              10,539
Aug-10   15,303            82,794              10,909
Jul-10   15,379            81,397              10,409
Jun-10   16,150            83,070              10,574
May-10   18,338            88,451              11,115
Apr-10   17,863            86,570              10,870
Mar-10   19,276            90,020              11,224
Feb-10   17,820            85,603              10,747
Jan-10   19,483            90,748              11,475
Dec-09   17,463            84,471              10,443
Nov-09   18,152            86,321              10,510
Oct-09   18,686            86,790              10,836
Sep-09   17,475            84,571              10,834
Aug-09   18,785            87,450              11,103

These numbers have been slightly declining over the past few years, though June and July 2010 show an uncharacteristic dip. When the data are examined on a project-by-project basis, we find that the English Wikipedia shows a larger decline than other Wikipedias.
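
Two of the observations above (the inflow of New Wikipedians at roughly 20% of Active Editors, and the slight decline in Active Editors) can be checked against the table. The sketch below is restricted to the 14 months tabulated above and uses an ordinary least-squares line; the helper names are illustrative, and a 14-month window says little about the multi-year trend.

```python
# (month, new_wikipedians, active_editors), oldest first, copied from the table above.
ROWS = [
    ("Aug-09", 18785, 87450),
    ("Sep-09", 17475, 84571),
    ("Oct-09", 18686, 86790),
    ("Nov-09", 18152, 86321),
    ("Dec-09", 17463, 84471),
    ("Jan-10", 19483, 90748),
    ("Feb-10", 17820, 85603),
    ("Mar-10", 19276, 90020),
    ("Apr-10", 17863, 86570),
    ("May-10", 18338, 88451),
    ("Jun-10", 16150, 83070),
    ("Jul-10", 15379, 81397),
    ("Aug-10", 15303, 82794),
    ("Sep-10", 14327, 79413),
]


def inflow_shares(rows):
    """New Wikipedians as a fraction of Active Editors, per month."""
    return [new / active for _, new, active in rows]


def least_squares_slope(values):
    """Slope of the ordinary least-squares line through (0, v0), (1, v1), ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


shares = inflow_shares(ROWS)
slope = least_squares_slope([active for _, _, active in ROWS])

print(f"Inflow share range: {min(shares):.1%} to {max(shares):.1%}")
print(f"Active Editor trend: {slope:+.0f} editors per month")
```

On these 14 months the fitted slope is negative, which is consistent with the slight decline described above.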

Overall Editorship has been stagnant/declining

As previously mentioned, the number of Active Editors on Wikipedia has remained stagnant. We can approximate a trendline:


The pattern of stagnation and, in some cases, slow decline is not limited to a few Wikipedias. It has been observed across many different projects. Here are the Active Editor trends for a few projects (de=German, fr=French, ru=Russian, ja=Japanese, pt=Portuguese, zh=Chinese, es=Spanish, he=Hebrew):

Fewer people join the community every month

While the trend across all Wikipedia languages combined is one of stagnation, when looking at the number of people completing their first 10 edits in any given month, a much more noticeable decline can be seen in the aggregate count. The decline of "New Editors" is even more pronounced for some individual projects like the English Wikipedia. In March 2007 (the peak month), 14,734 people completed their first 10 edits; in September 2010, only 6,677 people did so, a drop of roughly 55%.


Editor Trends Study

In October 2010, WMF commissioned the Editor Trends Study to investigate trends within Wikipedia’s editor community. As previously mentioned, the Active Editor base in the Wikipedia projects has been declining slightly over the past several years even though several thousand New Wikipedians join the project each month. This dynamic suggests that editors are leaving the project faster than new editors are joining, but the existing data do not tell us anything about who is leaving. The central questions the Editor Trends Study aims to answer are:

If new editors keep joining Wikipedia, why isn’t the number of active editors growing? Are the new editors leaving as quickly as they join? Or are the more experienced editors the ones that are leaving?

Here are the main findings of the study (see the full paper for complete results).

(Note: the study does the most in-depth analysis on the English Wikipedia, with comparisons to the German, French, Spanish, Japanese, and Russian Wikipedias. As part of the project, there is a software toolkit that researchers may use to investigate other projects).

Wikipedia communities are aging.

By looking at the age composition of editors who contribute in a given year, we find that Wikipedia communities are aging; that is, new users are becoming a smaller and smaller portion of the editing population. The following chart shows the age composition, in wiki-years, of editors that made at least 50 yearly edits to the English Wikipedia:

The data show the age composition of the English Wikipedia is clearly shifting, most dramatically between 2006 and 2007. We see a similar pattern across the larger Wikipedia projects, with the German Wikipedia having the lowest percentage of “young” Wikipedians and the Russian Wikipedia with the highest.

Some of this aging is to be expected. As a community matures, we should expect to see more “veteran” community members (assuming that these community members are retained at a reasonable rate). On the other hand, developing communities have a higher percentage of new members (e.g., the Russian Wikipedia).

The retention of New Wikipedians dropped dramatically from mid-2005 to early 2007, and has since remained at a record low.

The following chart shows the change in retention rate (defined as percentage of New Wikipedians still active one year after making their 10th edit) on the English Wikipedia. The chart also shows the Active Editor growth.

We see from the graph that there is a noticeable difference in retention of editors that joined during 2004-2005 compared to editors that joined after 2007. The data show that new editors in recent years are leaving at faster rates than ever before. Of the editors that joined in 2004 and 2005, approximately 35-40% were still editing one year later (see paper for full results). Of the editors that joined in 2009, only 12-15% were editing a year later. Not only are fewer users becoming New Wikipedians, but those who do cross the 10-edit threshold are leaving at very high rates.
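
The retention rate defined above can be sketched in a few lines. The sketch assumes that "still active one year later" means the editor made at least one edit 365 days or more after their 10th edit; the study's exact operational definition may differ, and the cohort below is made up for illustration.

```python
from datetime import date, timedelta


def one_year_retention(cohort):
    """Fraction of a cohort with any edit at least 365 days after the 10th edit.

    cohort: list of (tenth_edit_date, list_of_later_edit_dates).
    The 365-day cutoff is an assumed reading of "still active one year later".
    """
    if not cohort:
        raise ValueError("cohort must not be empty")
    retained = 0
    for tenth_edit, edit_dates in cohort:
        cutoff = tenth_edit + timedelta(days=365)
        if any(edit >= cutoff for edit in edit_dates):
            retained += 1
    return retained / len(cohort)


# Made-up cohort of four editors, for illustration only.
cohort = [
    (date(2009, 1, 10), [date(2009, 2, 1), date(2010, 3, 5)]),  # edits after one year
    (date(2009, 1, 15), [date(2009, 1, 20)]),                   # stops within days
    (date(2009, 1, 20), []),                                    # never edits again
    (date(2009, 1, 25), [date(2010, 1, 24)]),                   # last edit one day short
]

print(f"One-year retention: {one_year_retention(cohort):.0%}")
```

Computing this rate separately for each joining month or year gives the kind of cohort comparison discussed above.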

While there is not a definitive explanation for the decrease in retention yet, it is likely the English Wikipedia experienced an “Eternal September” effect from mid-2005 to early 2007. Under this notion, the influx of users created a situation where the existing community tried to accommodate the new users while at the same time accomplishing its work of writing a high-quality encyclopedia. In order to cope with the torrent of new editors joining, existing editors established defensive mechanisms to protect the encyclopedia (e.g., vandalism fighting tools, increased requirements for acceptable edits, etc.). These changes resulted in policies that, intentionally or not, made it more difficult for new editors to become acculturated within Wikipedia.

More in-depth analysis of individual cohorts of users reveals some other interesting trends. The following chart shows the retention patterns of editors that became New Wikipedians in the Januaries of various years.

Editor retention has not worsened significantly over the past several years. While 2006-2007 saw a downward step-change in editor retention, the change in retention rate since 2007 has been relatively small, albeit at a lower level. This trend is shown by the close clustering of the retention curves of users joining between 2007 and 2010, and is consistent with the initial retention chart, where the retention rate appears to bottom out around 2007.

Retention rate of veteran editors appears to be stable. We know that editors, even very active ones, leave the project for a variety of reasons. Some take wiki-breaks and others leave permanently. A certain amount of churn is to be expected. The recent activity of veteran editors suggests that they are continuing to edit Wikipedia at reasonable rates. This result is not entirely surprising as these veteran editors are the “survivors.”

Why Do Editors Stop Editing Wikipedia?

We know from evidence that the community can be harsh in its treatment of New Editors (Newbies). There is also evidence that suggests Newbie treatment (e.g., reversions with little or no explanation, hostile editors) impacts these users’ decisions to either continue editing Wikipedia or find some other online pursuit.

Reversion and Newbie Treatment

We have quantitative data that indicate New Editors are reverted more frequently than experienced editors. The following graphs from Erik Zachte show the difference in reversion rates for edits made by anonymous users, registered users, and bots for the English, Dutch, Spanish, and Portuguese Wikipedias:

Erik Zachte’s revert analysis shows that edits made by anonymous users are about five times more likely to be reverted than edits made by registered users. Anonymous edits have a ~25% chance of being reverted, while registered edits have a less than 5% chance of being reverted. Moreover, the reversion rate of anonymous edits has dramatically increased over time, while the reversion rate of registered edits has remained relatively stable.

Analysis from Ed Chi et al. describes a similar dynamic. Their 2009 paper shows that editors who make fewer edits per month get reverted at higher rates than editors who make more edits per month.

Reversion rate by editing activity (English Wikipedia only)

Each line in the above graph shows the reversion rate for a class of users (e.g., users who make 1 edit/month, between 2-9 edits/month, 10-99 edits/month, etc.).

This reversion data suggests a strategic question: How do reversions impact editors' decisions to continue editing Wikipedia?

To further dig into this data, it would be useful to assess how the percentage of vandalism has changed over time. If vandalism has increased together with reverts, the revert trend would be less potentially problematic. While vandalism cannot be reliably detected, some types of destructive editing (such as page blanking or replacement of a long page with a very short text) can be. Qualitative sampling-based research could complement this assessment. Using this data, we could better understand to what extent editors making good faith edits are more likely to be reverted now than five years ago.
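
As a rough sketch of how the reliably detectable cases mentioned above (page blanking, or replacement of a long page with a very short text) might be flagged from the text before and after an edit: the function name and both thresholds below are hypothetical choices for illustration, not values taken from any existing analysis or tool.

```python
def classify_edit(old_text, new_text, long_page_chars=2000, short_ratio=0.1):
    """Flag two easily detectable kinds of destructive edit.

    long_page_chars and short_ratio are hypothetical thresholds chosen
    for illustration; real values would need to be calibrated by sampling.
    """
    old_len = len(old_text.strip())
    new_len = len(new_text.strip())
    if old_len > 0 and new_len == 0:
        return "blanking"
    if old_len >= long_page_chars and new_len <= short_ratio * old_len:
        return "drastic_shortening"
    return "other"


long_page = "x" * 3000
print(classify_edit(long_page, ""))           # blanking
print(classify_edit(long_page, "lol"))        # drastic_shortening
print(classify_edit(long_page, "x" * 2900))   # other
```

A heuristic like this only gives a lower bound on destructive editing, which is why the sampling-based qualitative research described above would still be needed.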

Former Contributors Survey

Qualitative data is needed to help us understand how these revert trends impact a user’s decision to continue editing Wikipedia.

In January 2010, WMF conducted a survey of casual contributors who stopped editing the English Wikipedia. We received 1,238 responses from users who had made between 20 and 99 lifetime edits to Wikipedia, but made none during the period Oct-Dec 2009. The survey revealed a complex notion of what it means to “leave” Wikipedia, and provided some very valuable insight into the difficulties New Contributors had during their time editing Wikipedia.

The survey revealed that approximately half of casual editors stop contributing because of reasons outside of the community’s control (e.g., they start a new job, get married, etc.). But a sizable percentage stop contributing because of experiences they have with the community, and this percentage increases as their editing activity increases:

25% of all respondents said they stopped contributing because

"Some editors made Wikipedia a difficult place to work"

This number goes up to 40% for users that reported making over 10 edits/month.

The most revealing insights into the experience of these casual editors, however, come from their comments. Many users relayed negative experiences with the community, many of which involved reversions to their edits:

They always delete my edits.
Having my edits reversed - also, nothing about this was explained in a fashion that was easily understood. The people "overseeing" the edits appeared to be too power-hungry in their roles.
One or Two Editors that think they are God and make life uneasy even though I follow the rules of Wikipedia.
Having edits reversed, or eliminated because other people feel territorial about certain topics, and refuse to accept the input from other people.
Editors pushing their POV whilst claiming they are without bias...
There was no one horrible experience. I just found it unpleasant over time... it seemed like there were an awful lot of bullies.
Other contributors reversing whole paragraphs that I spent a long time on to write carefully.

Open questions around editor trends:

  • Current lifecycle trends within the editor community: who comprises our existing active editor base? Are new editors quickly leaving? Or are the more experienced editors leaving? The Editor Trends Study will help us answer these questions.
  • How are reversions affecting new editors? What percentage of reversions are fully desirable, and how has that percentage changed over time? How strongly do revert trends predict the growth/decline of an entire editor community?
  • How does the way experienced editors treat new editors affect the new editor’s likelihood of becoming a consistent contributor?
  • When editors leave a particular WMF project, what percent shift their activity to alternative WMF projects? If so, which projects, and how do those trends correlate to differential availability of editor amenities, e.g. specific software extensions like LiquidThreads?
  • When editors leave, is there a significant cohort which re-registers under a fresh start, perhaps after some time interval, or do they leave permanently? Can different types of public/private users be investigated to cull usable IP number data, or is this situation opaque to investigation?
  • What kinds of "exit interview" or "please don't go" pitch bring back more potential "quitters", what kinds of information can be culled from them, and what modifications or returnee programs might draw them back in?
  • What experiments can we run to give us a better understanding of how new editors are being affected by how experienced editors are treating them?

External Trends

Evolution of the Web: 2001-2010

The web has evolved dramatically since Wikipedia was founded in 2001. The types of applications available today are vastly more powerful and enable general users to accomplish more tasks more easily than ever before. The following information is not meant to be an exhaustive evaluation of major developments over the past nine years. Rather, it focuses on developments that have had a major impact on (a) the ability of users to contribute/participate on the web, and (b) models of user-web interaction.

The period 2005-2007 was an incredibly transformative time for web applications. This period brought along the following developments:

  1. Major changes in user interaction models as a result of the adoption of Web 2.0 technologies and UX patterns
  2. Blogging/publishing software becoming more powerful. WYSIWYG/rich text editing became commonplace around this time.
  3. The rise of social networking software and web applications with a heavy social component (e.g., MySpace, Facebook, Flickr, Twitter)

The graphic below illustrates some major developments in the area of online participation and user interaction:

Prior to this period of change, websites tended to fall into two main categories -- static HTML websites, and dynamic applications implemented through server-side scripting and web forms -- and web development was constrained by both the capabilities and execution times of client-side scripting. Most users who wanted to publish on the web used early forms of online blogging tools (Blogger, Livejournal, Xanga) which had limited functionality or required HTML. Web 2.0 brought a number of widespread innovations to web applications. The following are important events:

  • On April 5, 2006, W3C released the first draft of the specification for XMLHttpRequest, the JavaScript object that “gives AJAX its power”.
  • Gmail, one of the first large-scale deployments with heavy AJAX use, launched via invite in mid-2004. In February 2007, Gmail became open to everyone.
  • Facebook opened its service to the general public in September 2006.
  • Twitter launched in 2006, and reached a tipping point at SXSW in March 2007.
  • Movable Type launched a WYSIWYG editor in June 2007.

Other developments that may have an impact on WMF Projects:

  • Yahoo Answers launched July 5, 2005.
  • Answers.com launched January 2005.
  • Wikia founded “late 2004”, changed name March 27, 2006.

While it is difficult to quantify changes in user expectations as they relate to web applications, it’s reasonable to assume that user expectations have changed drastically as a result of the innovations that became mainstream during the 2005-2007 period and continue today. The studies conducted during the Usability Initiative provide evidence that the editing interface is confusing and does not match user expectations:

"In many websites, you kind of see the screen just the way you see it in the article. Here it looks like they converted it into plain text. I think what I’ll have to do is open another Wikipedia, so I can compare the views. In blogs, it’s easier to add stuff- you don’t go into the programming mode. This html version- it's much easier to edit a blog." -- Saurab, 28, Retail Software Developer

"I couldn’t really understand the format, I didn’t know what it was saying. I would just go to the stuff that’s readable. It looks kinda like a website, lingo stuff." -- Tito, 21, Student and Video Producer

We can hypothesize that users who started editing Wikipedia during the 2001-2006 time period were accustomed to a very different web environment than users who start to edit Wikipedia today. There simply weren’t easy, yet powerful, WYSIWYG editors to enable the types of publishing that are present today, and in general, web applications were less intuitive, less social, and less responsive.

User Allocation of Online Time is Shifting

Another external trend is the recent shift in how users are spending time online. Here is a chart that compares the time users spent on the top 10 websites in July 2009 vs. July 2010 (constructed from comScore data):


Facebook time per user has increased nearly 50%, while many other sites (e.g., AOL, Yahoo, Microsoft) saw a decline. Google had a modest 6% gain. Interestingly, “All other sites” also increased 49%.

comScore data also show us that the Facebook/Wikipedia userbase overlap is increasing:

The question for WMF projects is how this shift is impacting a user's choice to participate in WMF projects. The above data are for overall Internet users -- the relevant question is how current and potential editors are being affected by the trend. Are users who would have otherwise edited Wikipedia spending time on Facebook instead? If so, why? Are current editors spending less time on Wikipedia and more time on social networking and gaming sites? Today's Internet users have many more ways to contribute and interact on the web than they did 5 and 10 years ago. A deeper understanding of how Wikimedia sits within this "competitive" environment is likely an important step in understanding editor trends.

It should be noted that the social networking effect on user time is not new. A compete.com study released in January 2007 indicated that MySpace accounted for 12% of online time.

Time Spent on Wikimedia

Aggregate Wikimedia time per month per user does not reveal an obvious increasing or decreasing trend. These figures, however, are dominated by Readers, as Readers comprise the vast majority of users that comScore tracks.

Product Areas and Hypotheses for Growth

The following mindmap lists out the main product areas relevant to Wikimedia Foundation projects (green: features related to readers and new editors, red: features related to editing and collaboration, blue: features related to infrastructure and platform). Detailed descriptions of the first-level and second-level areas in the taxonomy can be found below. These include hypotheses of impact and assumed risks.

  • Impact on Reach means increase in the number of people perusing Wikimedia projects.
  • Impact on Participation means increase in the number of people actively contributing content of value to our audience.
  • Impact on Quality means increase in measurable quality of existing content, or generation of quality content by the existing community.
  • Impact on Innovation means increased likelihood that the movement (in the largest sense) will develop tools that advance strategic priorities.
  • Impact on Diversity means increased likelihood that groups currently underrepresented in the Wikimedia movement will join it.


Please see Feature map on MediaWiki.org for a living, documented version of this taxonomy.

Product priority recommendations

With an eye to the internal and external trends we have described, and the large number of opportunities before us, we want to provide a first answer to the question of how the Wikimedia movement and the Wikimedia Foundation should prioritize their product-focused work in order to achieve the five-year goals outlined in the strategic plan.

To do so, we use a framework that distinguishes between the following categories:

  • Great Movement Projects are critical to the success of the Wikimedia movement. They require significant Wikimedia Foundation resources and leadership, as well as strategic alignment-building across the movement. Their success is of paramount importance, and to guarantee it, other projects may need to be deemphasized or discontinued.
  • Strategic Opportunities are specific areas which the Wikimedia Foundation has identified as of high strategic significance, which should receive substantial resource investments, and in which we should strive to continually deliver milestone accomplishments.
  • Frontier Projects are investment areas that could help us make leaps towards our strategic goals, but which come with some risk and complexity. These are areas toward which the Wikimedia Foundation will invest some resources, typically involving considerable prototyping and data analysis to better understand impact and risks.
  • Red Links are projects that could have a very high impact toward our strategic goals, but on which the Wikimedia Foundation is unlikely to be able to take a leadership role; for these, it is requesting the help of the entire movement.

The intent of this framework is to create a healthy balance between calculated risks and known opportunities, and to clearly identify priority areas in which the larger community is especially invited to help. In each area, we’re constraining ourselves to enumerate no more than three projects.

Based on the above, we present the following table of proposed priorities, which are explained in more detail below.

Type of project | Project name | Goal
Great Movement Project | Rich-Text Editor | Develop and deploy a rich-text editing environment for Wikimedia projects with visual editing tools for all key markup.
Great Movement Project | -1 to 100 | Develop, test and productize interventions designed to increase the retention of new contributors. Then create new entry vectors to increase the inflow of contributors.
Strategic Opportunity | Mobile | Develop an optimized site experience for audiences using 2G or 3G phones to access Wikimedia content. Create entry vectors for meaningful participation.
Strategic Opportunity | Multimedia Contribution and Review | Create a delightful experience for contributing media files. In parallel, develop effective tools for vetting large numbers of media contributions.
Strategic Opportunity | Internationalization Blockers | Eliminate software barriers in specific languages that prevent people from using or contributing to Wikimedia projects. Focus on strategic priority languages.
Frontier Project | Quality Review | Develop a comprehensive quality review and labeling toolkit. Systematically invite reviews from people with demonstrable expertise. Make all feedback maximally usable and useful to readers and contributors alike.
Frontier Project | Discussion | Redesign talk pages to ensure they are maximally effective for supporting collaboration and the new user experience.
Frontier Project | Offline | Develop a rich toolset for exporting, managing and using Wikimedia content with limited or no Internet connectivity.
Red Link | Structured Data | Build a structured data repository (a “Wikidata Commons”) to effectively share structured data across Wikimedia projects.
Red Link | WikiProjects | Create a full set of tools to support the organization of collaborative work by identified individuals around a specific topic or issue.
Red Link | Social Media Updates | Implement lightweight and privacy-sensitive tools that enable users to extend their Wikimedia activity into social networks.

Legend for the mindmap subset view: ! = Part of Great Movement Project ; ^^ = Strategic Opportunity ; ^ = Frontier. "Red link" projects omitted.

Great Movement Projects: Rich-text editing interface, and the -1 to 100 edit experience

An analysis of all available research findings strongly supports four dominant overall conclusions:

Conclusion Explanation
The decline in new contributor growth is the single most serious challenge facing the Wikimedia movement in the year 2011.

As of December 2010, the number of active contributors to Wikimedia's largest project, the English Wikipedia, is at its lowest level since April 2006. This represents a decrease of 37% from its peak in March 2007. [1] Other large and mature language editions are experiencing similar trends of decline or stagnation. This overall trend coincides with a decline of the number of New Wikipedians. [insert editor trends study summary]

Removing the avoidable technical impediments associated with Wikimedia’s editing interface is a necessary, but not sufficient precondition for increasing the number of Wikimedia contributors.

Wikimedia’s editing environment, which fundamentally is based on 1995 technology, represents a highly complex and intimidating way for users to engage with content online. In usability studies, users themselves call out the editing environment as unusual, and ask why a rich-text editing environment as used in tools like Blogger or Google Docs is not present. (See, for example, findings from Wikimedia's usability studies, e.g. general editing issues.) Data from other environments, such as Wikia’s deployment of rich-text editing, strongly supports the hypothesis that the complexity of the editing environment acts as a major deterrent before users even make their first edit. Wikia reports that, with the rich-text editor, the save ratio on edit actions is about twice as high for anonymous contributors (16% vs. 8%). Even registered contributors are more likely to complete edits using the rich-text editor (44% vs. 36%).

We need to ensure that Wikimedia’s increasing standards of quality are accompanied by a commensurate improvement in the acculturation of new good faith editors, so that their experience is as uniformly as possible a positive one. In addition, we need to carefully review these standards (policies, procedures and implementations) to ensure that they best serve the goals of the Wikimedia movement.

Through the years of Wikimedia’s most dramatic growth (2005-2007), a significant drop occurred in the retention of new contributors, which has ultimately led to a slow decline in the number of active contributors. This drop in retention is most likely associated with evolving standards of quality and response mechanisms to ensure that only high quality edits survive (new and evolving policies, tools and processes for fighting vandalism, etc.). Creating a more positive and nurturing environment for new users appears to be the most promising strategy to increase retention.

We need to experiment with new technical and social approaches for engaging new contributors beyond the “Edit” link.

It is clear that the number of new contributors (people who make it past 10 cumulative edits in a given month) is declining in many mature Wikimedia projects. Regardless of the causes for this, it is unlikely that we will be able to fully reverse this trend by purely focusing on the experience of individuals who have already chosen to edit.

Based on these conclusions, we recommend that the Wikimedia movement undertake two Great Movement Projects, projects that need to receive sufficient support to be successful, even at the expense of other activities:

  • Implement a rich-text editing environment. This will likely necessitate some substantial architectural work on the MediaWiki parser and possibly also require MediaWiki backend changes. Given the large complexity of content and programmatic constructs expressed with wiki markup, it represents a major undertaking. Our initial focus should be on the core of a rich-text editing environment (including visual editing for tables, citations and templates), with other ideas from the “editing” branch of the feature catalog being prototyped primarily as labs projects.
  • Improve the -1 to 100 edit experience. Focus on increasing retention before increasing inflow. Hundreds of users complete their first 10 edits every day. They discontinue editing at a faster rate than ever before in the history of our projects. Before we experiment with mechanisms that increase the number of people who edit, we suggest running a series of technology and community experiments designed to increase retention, and productizing the most effective approaches. If we are successful, we can then focus on maximizing the inflow of new editors (cf. "Reader Conversion" section of feature map, but also consider e-mail alerts to reactivate retired editors).

While both projects have technical and social aspects, developing a rich-text editing environment is primarily a technological undertaking, with some required support in execution and socialization. Improving the -1 to 100 edit experience is primarily a community project, with a need for some technical interventions where implementations and measurement are concerned. The structure of the teams working on these large activities should reflect this difference in focus.

Strategic opportunity: Multimedia

In the time period between July 2007 and December 2010, the number of new articles created per day in the English Wikipedia decreased from 2,190 to 1,025 (-53.2%). During the same time period, the number of media files uploaded every month to Wikimedia Commons increased from 109,922 to 236,824 (+115.45%). Both the number of “active contributors” and the number of “new contributors per month” for Wikimedia Commons have increased, while other projects have stagnated or declined.
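
The trend figures above are relative changes, computed as (new - old) / old. The following is a minimal Python sketch of that arithmetic using only the counts quoted in the paragraph; the function name `percent_change` is a hypothetical helper introduced for illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (new - old) / old * 100.0


# Counts quoted in the text (July 2007 vs. December 2010).
articles_per_day = percent_change(2190, 1025)       # English Wikipedia
uploads_per_month = percent_change(109922, 236824)  # Wikimedia Commons

print(f"New articles per day: {articles_per_day:+.1f}%")
print(f"Uploads per month:    {uploads_per_month:+.2f}%")
```

Note the difference between a ratio and a change: 1,025 is about 46.8% of 2,190, which corresponds to a decrease of about 53.2%.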

This reflects both a growing interest by the Wikimedia movement in multimedia as an area of improvement (in part through community activities such as photo projects, in part through partnerships with the cultural sector), and multimedia contributions as an opportunity for new contributors to make a significant difference. It’s important to note that these trends are visible in spite of many known usability issues with Wikimedia Commons, including:

  • the process of uploading media files being very complex and confusing to new users (see videos and report from a Wikimedia Commons usability study);
  • the distinction between Wikipedia (and other sister projects) and Wikimedia Commons, in purpose and policies, which is far from intuitive or clear.

We therefore see multimedia as a key opportunity for investment, where we can strengthen positive trends (as opposed to having to reverse a negative trend), with high expected payoff in terms of contributor numbers and content quality. At the same time, to prevent plunging Wikimedia Commons into an Eternal September phenomenon as low quality contributions increase and large numbers of new contributors have to be assimilated, we strongly recommend a two-pronged approach where increases in inflow are accompanied by better inflow management.

This leads us to the following two emergent priorities:

  1. Build a delightful experience for contributing media content from any device. Adding relevant freely licensed media content to an article, or directly to Wikimedia Commons, should be an intuitive and satisfying experience. Insofar as this includes smartphones, it relates to the mobile opportunity described below.
  2. Build a strong toolset for reviewing media uploads. We should give the community the tools now to deal with a large influx of new content in a way that is also socially conscious and only minimally deters contributions. Such a toolset will be especially important for inflow sources that have a low signal-to-noise ratio. See the media review proposal for some initial notes on the requirements for this toolset.

There are many other changes that could improve the rich media experience on Wikimedia projects, ranging from the reader-oriented presentation of the content to metadata management, geo-tagging and multilingual functionality (cf. feature map). While all of those features are important, we recommend an initial Wikimedia Foundation focus on contribution and contribution review, to clearly demonstrate the value of a continued investment in this area, and to achieve impact relevant to our primary strategic priorities.

That being said, multimedia is an area with a lot of room for community innovation, and we hope that the Wikimedia Foundation will be able to develop a thriving labs environment through which community experiments can be prototyped, or even used at limited production scale.

Strategic opportunity: Mobile

As the mobile strategy summary states:

By the end of 2010, it is estimated that there will be 5.3 billion mobile phone subscribers worldwide; approximately 77 percent of the world population will have a mobile phone subscription. By the end of 2010, 68 percent of people living in the Global South are expected to have a mobile phone subscription. Obviously, Wikimedia should have a strategy that allows Wikipedia and other Wikimedia projects to be easily read and edited using mobile technology, especially in the Global South, where mobile technology is the gateway to the Internet.

Additionally, Morgan Stanley projects that smartphone shipments will outpace desktop/laptop PC shipments in 2012, displacing them as a universal and ubiquitous computing platform.

Wikimedia’s read-only mobile portal, at present, is only active for some Wikipedia languages. A (slow and fragile) redirect sends users with certain selected smartphones to the portal, which serves only high-end phones well (it is too bandwidth-intensive for low-end phones and connections). Even so, this alone has already resulted in mobile pageviews being more than 4% of site pageviews as of December 2010.

The mobile strategy recommends the following, a recommendation with which we concur:

  1. Develop a strong mobile base platform integrated into MediaWiki. This platform should support both low-end and high-end phones/connections.
  2. Experiment with mobile contribution mechanisms. This could include minor edits, image uploads, and article ratings.

A remaining question is the "app vs. web" implementation strategy. It may be desirable to build an open source companion app that relies on the mobile gateway for formatting, but provides additional features that cannot yet be delivered through the web experience, consistent with the approach taken with the official Wikipedia iPhone app to date (which uses device access to provide geo-specific information).

Strategic opportunity: Internationalization

The strategic plan calls for priority investments in India, Brazil, and the Middle East/North Africa. India has an especially complex language landscape, with 22 regional languages in addition to English, and a set of complex scripts to represent them.

It is assumed that at least a significant percentage of growth in new contributors will need to come from these world regions, and that many of them will want to write in their native language and script. This is supported by the fact that although the numbers are small, language versions like Hindi and Malayalam are experiencing growth in both active and new contributors.

Where major issues remain in displaying or entering characters in a language, or where localizations are incomplete, this presents serious and completely avoidable impediments for new contributors. In many cases, the difference between good language support and no language support will be the difference of being able to recruit a contributor or not.

As of December 2010, the Wikipedia community in Indic languages has already done a considerable amount of work to add language support to the respective wikis (primarily by adding input methods). However, the solutions are incomplete, imperfect and not consistently implemented.

Internationalization and localization work can quickly become a game of diminishing returns as the work turns from the absolutely necessary to the highly specialized. We therefore recommend a highly focused approach:

  • Continually eliminate critical internationalization issues for widely spoken languages, focusing on text input and text display on all key devices/operating systems, with highest priority given to languages spoken in priority geographies. Implement these solutions in MediaWiki proper so that they are available in all Wikimedia projects (e.g. typing in Hindi should be possible everywhere).

Frontier: Quality assessment tools

The principle that, with few exceptions, any page can be modified by anyone at any time is at the heart of Wikimedia projects. It is what has enabled their rapid growth to begin with, and what allows them to continually accept input from the general public. This principle is also the cause of most of the public concerns regarding the quality of information delivered through our projects. While any individual article may be of very high quality and a recommendable resource, we generally caution readers to use citations as the primary means of validating the veracity of Wikimedia project content.

In some projects, the Flagged Revisions extension is used to at least minimize the risk of obvious vandalism being served to the reader (German Wikipedia being the most prominent example), but in most projects, this is not the case. Beyond that, the community diligently adds quality assessments (typically to article talk pages) and subjects articles to intense but slow quality review through candidacy processes like “Featured article candidates”.

The input from readers or subject-matter experts is rarely solicited by the Wikimedia community. Instead, the flip side of the “anyone can edit” principle tends to be “anyone should edit” (as opposed to reviewing content or reporting problems). There is some degree of aversion against non-Wikimedians adding their stamp of approval or disapproval to articles, and expressed fear that actively soliciting such input would deter people from just fixing problems.

With that said, early experiments with the Article Feedback Tool indicate that readers are willing to provide qualitative feedback in large numbers, even on complex topics. Is the feedback itself useful? That is something we need to continue to assess.

What seems evident is that building a rich toolset of quality assessment, quality labeling and quality analysis tools has the potential to dramatically impact the reader experience on Wikimedia projects, shifting the perception of the projects from “highly variable and unpredictable quality” to “often high quality; clearly indicated when low quality”. If well-designed, such tools will give the reader the possibility to determine, at a glance:

  • where an article is in the community’s view of its development lifecycle
  • whether the general audience or specific sub-groups (such as credentialed or self-identified experts) have raised important objections concerning the article.

It would also allow the Wikimedia Foundation, Wikimedia chapters and individual volunteers to better understand how their actions are impacting the perception of quality. For example, if the Wikimedia Foundation organizes a program to improve content with the help of university students, we need to be able to make a reasonable determination as to whether content has actually improved. The quality assessment toolset would allow us to make that determination.

Considerable risks remain especially around the engagement of credentialed experts in a highly egalitarian community. Another key risk we need to understand better is whether such tools can cannibalize editing, or whether they could actually be used as powerful entry vectors for new contributors (with each rating transaction followed by an invitation to edit).

Because of the transformative potential of quality assessment tools, and because they are necessary to measure the movement's impact towards its quality goals, we recommend the following approach:

  1. Iteratively develop a comprehensive quality assessment toolset which makes it easy to report content quality, to express changes in quality over time, and to associate metadata such as expertise with individual ratings. With regard to expert reviews, we should carefully explore whether we can build APIs and user interfaces that will allow experts to authenticate themselves and to submit credentialed ratings (without such ratings necessarily receiving any additional weight).
  2. Bake quality indicators into the user experience so that readers get a quick at-a-glance view of an article’s key quality characteristics.

As a frontier project, this project will likely not be able to receive primary resource attention through 2011. We suggest shooting for modest milestones, such as a first Wikimedia-wide deployment of quality assessment tools no later than mid-2011, and an experimental toolset for expert review to be released at about the same time.

Frontier: Discussion system

Discussion pages are the mechanism through which the majority of social interactions occur on Wikimedia projects. The basic principle is simple: a single, versioned wiki page functions as a discussion space linked to another page. There is very little additional software support for discussions beyond that.

This has created a number of major usability issues with the current discussion system:

  • Lack of discoverability. There is no “Reply” link; instead, in order to reply, one has to edit the page or section, position the cursor below the target comment, and indent the response with a number of indentation symbols (typically “:”s). It is also necessary to sign the comment with three or four tildes. The usability videos show that users asked to leave comments on talk pages are confused by the large number of banners and the lack of obvious indicators of how to post a comment, to the point that they often fail to do so entirely.
  • Lack of clear alerts and indicators. There is no notification of replies other than a general “new message” indicator for user-to-user messages and the watchlist system that is also used for article changes. The watchlist system focuses on “diffs” as the expression of changes in a side-by-side comparison, a view that requires interpretation of wiki markup to understand the flow of a discussion.
  • No response tracking. User talk pages do not function like an inbox where each user retains a full copy of the thread; instead, the choice where a response is given matters and has implications about whether one receives a notification for follow-up responses.
  • No alternative views or automation. Because comments do not have a database identity as such, it is impossible to sort them by certain criteria, filter them, show a feed of selected comments, etc. Archiving of comments is manually done by copying or moving page content to a separate page (mature wikis often use special archival bots to do this).
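
To make the manual reply convention from the first item above concrete, here is a minimal Python sketch of what a contributor currently has to do by hand. It is illustrative only: `comment_depth` and `format_reply` are hypothetical names, not part of MediaWiki, and the sketch assumes that a comment's nesting depth equals its number of leading “:” characters and that the reply is signed with four tildes.

```python
def comment_depth(wikitext_line: str) -> int:
    """Count the leading ':' indentation symbols of a talk-page comment."""
    return len(wikitext_line) - len(wikitext_line.lstrip(":"))


def format_reply(parent_line: str, reply_text: str) -> str:
    """Build a reply line: one level deeper than the parent, then signed.

    Assumption: four tildes are used, which the wiki software expands
    into the user's signature when the page is saved.
    """
    indent = ":" * (comment_depth(parent_line) + 1)
    return f"{indent}{reply_text} ~~~~"


# A reply to a top-level comment, then a reply to that reply.
first = format_reply("I think this section needs a source. ~~~~", "Agreed.")
second = format_reply(first, "I have added one.")
print(first)
print(second)
```

A “Reply” link would in effect perform these steps on the user's behalf, which is why its absence weighs most heavily on new users.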

These and other usability issues mean that especially for new users, the interaction with advanced users is seriously impaired by their lack of a mental model of the discussion system. Paradigms that the user may be familiar with (forums, inboxes, social media feeds) do not apply. Indeed, it is challenging to find any discussion system that is wilfully designed to resemble Wikimedia’s.

It is likely that the deficiencies of the discussion system also impact the effectiveness of day-to-day collaboration on Wikimedia projects. Rather than getting a single at-a-glance view of new or important discussions, one has to carefully review a number of “diffs” for watchlisted talk pages, in addition to checking one’s own contribution history and tracking changes to recently visited pages that are not on the personal watchlist.

A more sophisticated discussion system would create several additional opportunities that cannot even be explored in the current model, especially through additions of feedback mechanisms such as helpful/unhelpful ratings, the fluid transition from asynchronous to synchronous communication (e.g. chat), the addition of user avatars to create a more human discussion experience, tracking user interaction graphs to provide relevant feeds, etc.

The LiquidThreads extension presents a realistic and powerful alternative to ordinary discussion pages. It does still have a large number of documented interaction and design issues, and it does not yet begin to explore some of the aforementioned opportunities. Most importantly, discussion pages represent such a critical part of how Wikimedia works that replacing them with a new system will only be possible if the system enjoys a very high across-the-board acceptance by experienced users. This means the system cannot simply be better overall; it needs to be widely perceived as a positive step change in usefulness.

We do not yet fully understand how large the benefits are that we may derive from a reformed discussion system in terms of our strategic goals. It may turn out to be a dramatic game-changer, but the deployment challenges in combination with the large technical complexity and the not yet fully understood benefits argue for a cautious, iterative and exploratory approach at this time. We recommend the following:

  1. Continued, iterative improvement of the existing LiquidThreads discussion system with the goal to deploy to a specific mid-scale production use case.
  2. Careful experimentation with interventions on traditional talk pages to better understand whether reputation and recognition systems can create a more collaborative discussion culture. Some of this can occur within the scope of the -1 to 100 edit project.

Frontier: Offline content packaging

Although cell phones are increasingly becoming the mass medium and communication tool of choice for the developing world, the vast majority of users are still not able to access the full Internet. Moreover, Internet use in the educational sector is often limited.

Packaging content offline in various ways can enable us to reach anyone with any access, however limited, to infrastructure that can be used for storing and displaying information. It can potentially circumvent government or institutional restrictions that would apply to net access.

Offline content packaging can therefore help us bridge the digital divide, and especially in combination with quality assessment tools, it has the potential to allow transformative use cases in the educational sector.

The single largest drawback of offline approaches to content packaging is that, without connectivity, there is no participation, and with limited connectivity, we can at best hope for backchannels or clunky edit synchronization tools.

Beyond that, the main question remains to what extent an investment in this area is going to help us reach the hundreds of millions of people that our strategic plan calls for. Scalability to this degree would require either market adoption or large government buy-in for offline solutions. Whether either is feasible or likely remains to be demonstrated.

For now, our recommendation is consistent with the work done on offline strategy:

  • Support the development of a standard toolset for packaging and perusing offline content. This includes the file format and reader, but it also includes generation of full dumps or selections, with or without images and metadata, and finally, the application of quality assessment criteria to the selections.
  • Prototype offline content in the context of at least one large scale commercial pilot and one large-scale non-commercial pilot. Concrete strategies are beyond the scope of this whitepaper (some are discussed in Offline/Target Market), but it is clear that demonstration of impact through pilots needs to precede a larger strategic push.

Red link: Structured data repository

One of the largest continuing challenges in Wikimedia projects is that structured data (such as information in tables and lists) is plain text, with only minimal separation of text and content through the template system. There's no shared repository for such data, which means that every language and every project has to keep copies of all data.

The lack of any storage or search mechanism for structured data also means that there's no way to automatically construct queries like "List of countries by GDP". Instead, these lists have to be manually maintained by the community (often with the help of bots).

An alternative is to think about a lot of the data in Wikimedia projects in the same terms in which we think about images: It's not language-dependent (although conversions and label translations are required), and it should be possible to maintain a single repository where key data (such as data about countries) can be stored and maintained.
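
As a rough illustration of what a shared repository would enable, the Python sketch below stores facts once and derives a "List of countries by GDP" style listing on demand. Everything in it is an assumption made for illustration: the `Fact` record, the `list_by` function, and the placeholder entities and values (which are not real data) do not describe any existing Wikimedia software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """One language-independent statement: entity, property, value."""
    entity: str
    prop: str
    value: float


# Placeholder facts, NOT real figures; stored once, shared by all projects.
FACTS = [
    Fact("Country A", "gdp", 300.0),
    Fact("Country B", "gdp", 500.0),
    Fact("Country A", "population", 10.0),
    Fact("Country C", "gdp", 100.0),
]


def list_by(facts: list[Fact], prop: str, descending: bool = True) -> list[tuple[str, float]]:
    """Return (entity, value) pairs for one property, sorted by value."""
    rows = [(f.entity, f.value) for f in facts if f.prop == prop]
    return sorted(rows, key=lambda row: row[1], reverse=descending)


# Each language edition could render this list with its own labels.
for entity, value in list_by(FACTS, "gdp"):
    print(entity, value)
```

The design point is that the list is derived from the data rather than maintained by hand, so a corrected value would propagate to every project that renders it.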

Such a "Wikidata Commons" would a) dramatically reduce duplicate effort, b) create new potential opportunities for participation, c) make it easier to bootstrap new language communities, d) lead to higher content quality overall.

Solving this problem will not immediately address any of the major issues we're facing (e.g. reversing problematic trends in participation). But it's a critical building block for more mature and useful Wikimedia projects that can serve the planet in many new ways. It's also a logical precondition for some other high-wishlist items such as better geo-data support, better structured data management for media files, etc.

Because of this, we believe it would be a good project for a mature chapter organization or a technology partner organization to take on: It's a long-range enabling project, with huge potential for transformative impact; the Wikimedia Foundation is not going to be able to immediately address it, but it very much needs attention and effort. It is a red link.

Red link: WikiProject support

We know from research that collaboration in teams and affiliation with teams is understood to be one of the major factors influencing group effectiveness. It may also have an impact on retention. WikiProjects are the closest thing Wikimedia has to an interest graph of its contributors. They enable teams to form and to collaborate, making them perhaps the structural backbone of the Wikimedia community.

Virtually the entirety of technologies and processes supporting WikiProject collaboration have been created by the community over the years. The downside of this is that best practices are not well understood across languages, and that many processes may suffer from limitations in the technology. For example, we know that new users typically join a WikiProject fairly late in their "wiki career". That's probably because WikiProjects aren't very discoverable -- there's no software feature that suggests joining one.

An ideal WikiProject should perhaps be more systematically designed around the concept of an interest graph. The fact that a user "likes" a topic or expresses an opinion about it is a good indicator that they may be interested in collaborating around this topic. This suggests that such systems could be integrated with article feedback mechanisms.

Ultimately, a WikiProject toolkit could both help recruit new editors (joining a project could be a new way to join the community) and dramatically increase effectiveness of existing community work. There's a lot of groundwork to do, and it's a project that could benefit from bit-by-bit experimentation that any community member can engage in, culminating in a systematic requirements definition and sustained engineering and community effort. The potential for transformative impact on community work makes this project a red link.

Red link: Post to social media feeds

The audience overlap analysis shows that Wikimedia's audience is highly active on social media. Whether that's true for the contributor community isn't known, but the composition of that community as established by the UNU-Merit survey suggests that it is. If we take as a given that at least some of the people in a typical social circle have an overlap of skills and interests, this makes them a powerful potential recruiting tool for new editors. Even to the extent that this isn't true, social media are still an effective way to get a message out to large numbers of people quickly.

That makes it remarkable how relatively invisible Wikimedia is as a social activity -- relative to gaming, taking photos and videos, sharing news, etc. Sharing on social media isn't integrated into the Wikimedia site experience. Beyond privacy-aware "Share" and "Like" buttons, this could include ways to turn on automatic social sharing for specific activities: uploading a picture to Wikimedia Commons, creating a new page, successfully promoting an article to "Featured Article" status, joining a WikiProject, etc.

This could have the effect of mainstreaming much of the "backstage" work in Wikimedia, and each of these sharing mechanisms could be designed to invite participation, specific ("share your photos") or general ("join Wikipedia").

Of course, such mechanisms would be strictly opt-in, and they should at minimum be vendor neutral and ideally provide strong support for federated open source solutions like StatusNet or Diaspora.

These mechanisms have the potential to strongly influence participation, although there's little data that could be used at this point in time to estimate the magnitude of that effect. What's clear is that it's unexplored terrain with significant potential for positive impact on strategic priorities. This makes it a red link.