

From Strategic Planning
For early discussions, see m:Quality

Goal and function of this page

This page is a space for collecting data and analysis regarding the content and quality of the various Wikimedia projects. This collection of data and analysis will serve as a starting point for identifying growth opportunities for the various Wikimedia projects.

Data and analysis on content quality

Discussion regarding this data and analysis can be found on the talk pages of each of the subpages below.

This section is an overview of the available data and research. So far, the research and analysis fall into several categories:


  • What is the content landscape in which Wikimedia operates?
    • Types and categories of content
    • Sources of content by type and category
    • Relevant trends in content and content sourcing (e.g. digital textbooks)

  • What is Wikimedia’s current position in this content landscape?
    • Current content penetration by category
    • Sources of existing content
    • Initiatives currently underway or planned

  • What options does Wikimedia have for extending the scope of its content?

  • What initiatives could Wikimedia consider to support this scope extension?
    • Partnerships (e.g. with content institutions, educational institutions, libraries, online encyclopedias)
    • Others TBD

  • What is the potential impact of these content initiatives?
    • Likely impact on content scope
    • Resource requirements and funding availability
    • Benefits and/or risks (for reach, participation, etc.)


  • What is the quality landscape in which Wikimedia operates?
    • Quality criteria (e.g. accurate, credible, complete, neutral)
    • Audience/stakeholder expectations (including online context)
    • Changes/trends over time

  • What is Wikimedia’s current position in this quality landscape?
    • Perceived vs. actual
    • Key challenges (e.g. translations)
    • Comparisons to relevant benchmarks

  • What quality control/assurance initiatives are already in place, or are being tested by Wikimedia and the community?
    • How effective is the combination of requiring references and allowing not merely peer review but peer editing in sustaining accuracy?
    • Often the quality problem is not getting quality in, but stopping subtle entropic processes such as noise and vandalism. Could the resulting loss of information be detected in articles?
      • One obvious case is the loss of a reference from an article, which is or can be flagged automatically for follow-up.

  • What approaches to quality control/assurance could Wikimedia consider to improve actual and perceived quality?
    • Current initiatives underway/tried within Wikipedias (e.g. flagged revisions)
    • Other initiatives tried in the field
    • Others TBD

  • What is the potential impact of these quality control/assurance approaches?
    • On content scope/generation
    • On reach and participation
    • On perceptions of key stakeholders

  • Where are the most salient intersections between content and quality?
    • Quality vs. quantity debate
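
The outline above mentions that the loss of a reference from an article can be flagged automatically. As a rough illustration only, the following Python sketch compares two revisions of an article's wikitext and flags the newer one if it contains fewer `<ref>` tags. The function names (`count_references`, `lost_references`, `should_flag`) are hypothetical, and counting tags with a regular expression is an assumption made for this sketch, not a description of how any existing Wikimedia tool works.

```python
import re

# Matches opening <ref> tags, including named ones such as <ref name="x">
# and self-closing reuses such as <ref name="x" />. It does not match
# the <references /> list tag.
REF_TAG = re.compile(r"<ref\b[^>]*>", re.IGNORECASE)


def count_references(wikitext: str) -> int:
    """Count the <ref> tags in a piece of wikitext."""
    return len(REF_TAG.findall(wikitext))


def lost_references(old_wikitext: str, new_wikitext: str) -> int:
    """Return how many fewer <ref> tags the new revision has (never negative)."""
    return max(0, count_references(old_wikitext) - count_references(new_wikitext))


def should_flag(old_wikitext: str, new_wikitext: str) -> bool:
    """Flag the edit for follow-up if at least one reference disappeared."""
    return lost_references(old_wikitext, new_wikitext) > 0


# Hypothetical example revisions (not taken from any real article).
old_revision = 'Fact.<ref>Source A</ref> More.<ref name="b">Source B</ref>\n<references />'
new_revision = 'Fact. More.<ref name="b">Source B</ref>\n<references />'

if should_flag(old_revision, new_revision):
    print("Reference lost: flag for follow-up")
```

A simple count like this would miss an edit that removes one reference and adds another; comparing the reference contents themselves would be needed to catch that case.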


Flagged Revisions

m:FlaggedRevs_Report_December_2008 -- Basic overview of the flagged revisions feature in the German Wikipedia

Quality statistics on Wikipedia projects

Organised by WikiProject; in some cases, you need to scroll down to see global stats. Only some languages participate.

File Types on Wikimedia Commons

Content Partnerships

Wikimedia Germany partnerships:

Wikimedia France partnerships:

Postmortem on an attempt to create illustrations using a $20K restricted donation in 2007.


Inter wiki improvements

  • meta:Death anomalies table: a new improvement program for biographies that produces lists of people recorded as dead in one language edition but as still living in another. Currently used only by the DE and EN Wikipedias, but available to others on request.
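
The kind of cross-language comparison described above might be sketched as follows. This is only an illustration under stated assumptions: the input format (a mapping from language code to each person's recorded status) and the function name `find_death_anomalies` are hypothetical, and the actual Death anomalies program may collect and compare its data quite differently.

```python
def find_death_anomalies(status_by_language):
    """List people recorded as dead in one language and living in another.

    status_by_language maps a language code to a dict of
    {person_id: is_dead}, where is_dead is True or False.
    People missing from a language are simply ignored for that language.
    Returns a list of (person_id, dead_in, alive_in) tuples.
    """
    # Collect every person mentioned in any language edition.
    people = set()
    for statuses in status_by_language.values():
        people.update(statuses)

    anomalies = []
    for person in sorted(people):
        dead_in = sorted(
            lang for lang, statuses in status_by_language.items()
            if statuses.get(person) is True
        )
        alive_in = sorted(
            lang for lang, statuses in status_by_language.items()
            if statuses.get(person) is False
        )
        # An anomaly exists only when the editions disagree.
        if dead_in and alive_in:
            anomalies.append((person, dead_in, alive_in))
    return anomalies


# Hypothetical sample data; the identifiers are placeholders, not real people.
sample = {
    "de": {"person_1": True, "person_2": False, "person_3": True},
    "en": {"person_1": False, "person_2": False},
}

for person, dead_in, alive_in in find_death_anomalies(sample):
    print(person, "dead in", dead_in, "but living in", alive_in)
```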