Task force/Analytics/Principles

From Strategic Planning

These are some principles that should guide the development of the analytics infrastructure.

  • Goal-oriented
  • Privacy is important and must be taken seriously. The goal is not to track individuals, but to study patterns in aggregate. That said, the privacy policy may need further refinement, as the analytics infrastructure will push the edge of what is known and understood.
  • Transparency is in service to privacy. The data available from the analytics infrastructure should be available to everyone, as long as it does not reveal any private information (and does not lend itself to de-anonymization).
  • In general, Wikimedia wants to support existing open source initiatives. To the extent that we have the opportunity to build and improve an open source analytics project in cooperation with partners, we should. It would serve our mission and pay homage to our volunteer culture at the same time. Arguably, the choice between a proprietary solution (e.g. Google Analytics, comScore) and an open source solution (e.g. Piwik) is the most fundamental and far-reaching choice to be made.
  • A choice of architecture is not lightly redone. Reversing such a decision wastes resources and can be very disruptive. Any decision that favors short-term benefits over the long haul could therefore lead to perfect tactics serving a flawed strategy. Our challenge is to weigh all factors appropriately. On the other hand, the pursuit of a mission-aligned and future-proof solution should not lead us away from the need to produce tangible results in the short to medium term. A fair estimate of what could reasonably be accomplished with any solution in one, two, or three years is critical. No one on the team really doubts that commercial analytics tools will bring us faster results than any open source initiative, but how much would we later come to regret a wasted opportunity? Conversely, if no existing open source initiative turns out to be a viable platform with potential for greatness within a reasonable time frame, then we don't think it is within our mission to start building a generic open source solution from scratch.
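The privacy and transparency principles above (study patterns in aggregate, publish only what does not reveal private information) could be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the event fields `user_id`, `country`, and `page`, the function name `aggregate_page_views`, and the threshold of 5 are hypothetical and not part of any existing Wikimedia system. Suppressing small counts is only one possible safeguard and does not by itself guarantee that data cannot be de-anonymized.

```python
from collections import Counter

# Hypothetical threshold: buckets with fewer views than this are withheld,
# because very small counts are more likely to point at an individual.
MIN_BUCKET_SIZE = 5


def aggregate_page_views(events, min_bucket_size=MIN_BUCKET_SIZE):
    """Count views per (country, page) and drop small buckets.

    `events` is an iterable of dicts with the hypothetical keys
    'user_id', 'country', and 'page'. The user_id is never copied
    into the output, so only aggregate counts are published.
    """
    counts = Counter((e["country"], e["page"]) for e in events)
    return {key: n for key, n in counts.items() if n >= min_bucket_size}


# Six views of one page from one country, plus a single rare view.
events = [
    {"user_id": i, "country": "NL", "page": "Main_Page"} for i in range(6)
] + [{"user_id": 99, "country": "IS", "page": "Rare_Page"}]

published = aggregate_page_views(events)
print(published)
```

In this sketch the single view from the rare bucket is withheld, while the larger bucket is published as a count with no user identifiers attached.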
Potential advantages of a mainstream proprietary solution
  • Maturity: Systems with a fully grown feature set (although this is a moving target) already exist
  • Lots of expertise at the solutions provider
  • Lots of expertise in the market (books, consultants, huge customer base)
  • Wikimedia can expect favorable treatment from the vendor or host due to our market position
Potential disadvantages of a mainstream proprietary solution
  • Uncertainty: Any favorable treatment (e.g. reduced fee) could have an expiration date
  • Obscurity: Much more difficult to assess robustness of process and strength of algorithms
  • Rigidity: Much more difficult to influence the way the product evolves and expands
Potential advantages of open source solution
  • Flexibility: Architecture would ideally be oriented towards modular expansion
  • Verifiability: Many eyes make all bugs shallow
  • Opportunities: Decision making based on collaboration between partners on equal footing
Potential disadvantages of open source solution
  • Uncertainty: Not all open source communities maintain a healthy level of participation over the years
  • Uncertainty: Much more effort is needed to assess the architectural foundations of the product