Quality/Data and analysis

Data and analysis on content quality

Discussion regarding this data and analysis can be found on the talk pages of each of the subpages below.

This section is an overview of the available data and research. So far, research and analysis fall into several categories:

Contents (en)

  • What is the content landscape in which Wikimedia operates?
    • Types and categories of content
    • Sources of content by type and category
    • Relevant trends in content and content sourcing (e.g. digital textbooks)


  • What is Wikimedia’s current position in this content landscape?
    • Current content penetration by category
    • Sources of existing content
    • Initiatives currently underway or planned


  • What options does Wikimedia have for extending the scope of its content?


  • What initiatives could Wikimedia consider to support this scope extension?
    • Partnerships (e.g. with content institutions, educational institutions, libraries, online encyclopedias)
    • Others TBD


  • What is the potential impact of these content initiatives?
    • Likely impact on content scope
    • Resource requirements and funding availability
    • Benefits and/or risks (for reach, participation, etc.)

Quality (en)

  • What is the quality landscape in which Wikimedia operates?
    • Quality criteria (e.g. accurate, credible, complete, neutral)
    • Audience/stakeholder expectations (including online context)
    • Changes/trends over time


  • What is Wikimedia’s current position in this quality landscape?
    • Perceived vs. actual quality
    • Key challenges (e.g. translations)
    • Comparisons to relevant benchmarks


  • What quality control/assurance initiatives are already in place, or are being tested by Wikimedia and the community?
    • How effective is the combination of requiring references and allowing not merely peer review but peer editing in sustaining accuracy?
    • Often the quality problem is not getting quality in, but stopping subtle entropic processes of noise and vandalism: could the resulting loss of information be detected in articles?
      • An obvious loss is the loss of a reference from an article: this is, or can be, flagged automatically for follow-up.
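
The reference-loss check raised in the last point could be sketched roughly as follows. This is a minimal illustration only, not a description of any existing Wikimedia tool: the function names (`count_refs`, `lost_references`, `needs_followup`) are hypothetical, and it assumes that references appear in the wikitext as `<ref>` tags and that the old and new revision texts are already available as strings.

```python
import re

# Matches opening <ref> tags, including named and self-closing forms
# such as <ref name="geo" />. Closing </ref> tags and <references />
# are not matched.
REF_TAG = re.compile(r"<ref\b[^>]*>", re.IGNORECASE)


def count_refs(wikitext: str) -> int:
    """Count the reference tags in one revision's wikitext."""
    return len(REF_TAG.findall(wikitext))


def lost_references(old_text: str, new_text: str) -> int:
    """Return the net number of references dropped between two revisions.

    Returns 0 when the new revision has as many references as the old
    one, or more.
    """
    return max(0, count_refs(old_text) - count_refs(new_text))


def needs_followup(old_text: str, new_text: str) -> bool:
    """Flag an edit for review if it reduced the number of references."""
    return lost_references(old_text, new_text) > 0


# Hypothetical example revisions, invented for illustration.
old_revision = (
    "Paris is the capital of France.<ref>Atlas, 2009</ref> "
    'It lies on the Seine.<ref name="geo" />'
)
new_revision = (
    "Paris is the capital of France. "
    'It lies on the Seine.<ref name="geo" />'
)

if needs_followup(old_revision, new_revision):
    print("Flag for follow-up: references lost =",
          lost_references(old_revision, new_revision))
```

Because this compares only net counts, it would miss an edit that swaps one reference for another or removes one while adding an unrelated one; a fuller check would compare the reference contents themselves.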


  • What approaches to quality control/assurance could Wikimedia consider to improve actual and perceived quality?
    • Current initiatives underway/tried within Wikipedias (e.g. flagged revisions)
    • Other initiatives tried in the field
    • Others TBD


  • What is the potential impact of these quality control/assurance approaches?
    • On content scope/generation
    • On reach and participation
    • On perceptions of key stakeholders


  • Where are the most salient intersections between content and quality?
    • Quality vs. quantity debate