Editor Trends Study/Progress

From Strategic Planning

Introduction

This page will be used to track the progress of the Editor Trends Study.

Progress

We will highlight, by week, the different milestones we reach.

Week 41

  • Launch of wiki

Week 42

  • Collecting input from the Wikipedia community
  • First commit of source code to SVN

Week 43

  • Added a command-line interface that allows people to conduct their own analyses
  • Improved performance of XML processing
  • Redesigned the MongoDB database to use less hard disk space and improve performance

Week 44

  • Started small group of alpha testers

Week 45

  • Expanded documentation
  • Made first charts
  • download.wikimedia.org is offline...

Week 46

  • Posted an online tutorial on how to install Editor Trends
  • Benchmarked results against Erik Zachte's stats software module
  • Refactored large parts of code to make it more object oriented

Week 47

  • Version 0.1 has been committed to Subversion. It offers the full data pipeline: downloading a dump file, splitting it into smaller chunks, extracting the required information, presorting that information (which improves performance), storing it in MongoDB, transforming it into a real dataset and exporting the dataset to a CSV file. This process is fully automated and configurable.
  • Initial charts have been made
  • Waiting for download.wikimedia.org to resume. Once it does, I will make a second video tutorial demonstrating how to use this tool.
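
The pipeline stages listed for version 0.1 can be illustrated with a minimal sketch. This is not the Editor Trends code: all function names and record fields here (editor, timestamp) are hypothetical, the download step is left out, and an in-memory dictionary stands in for MongoDB so that the example runs without a database.

```python
import csv
import io
from collections import defaultdict
from itertools import islice


def chunk(records, size):
    """Split an iterable of records into lists of at most `size` items."""
    it = iter(records)
    while True:
        part = list(islice(it, size))
        if not part:
            return
        yield part


def extract(record):
    """Keep only the fields needed downstream (hypothetical field names)."""
    return {"editor": record["editor"], "year": record["timestamp"][:4]}


def presort(rows):
    """Sort rows by editor so that one editor's edits are stored together."""
    return sorted(rows, key=lambda r: r["editor"])


def store(rows, db):
    """Append rows to an in-memory dict (a stand-in for MongoDB)."""
    for row in rows:
        db[row["editor"]].append(row["year"])


def transform(db):
    """Turn per-editor edit lists into (editor, year, edit_count) rows."""
    dataset = []
    for editor in sorted(db):
        counts = defaultdict(int)
        for year in db[editor]:
            counts[year] += 1
        for year in sorted(counts):
            dataset.append((editor, year, counts[year]))
    return dataset


def export_csv(dataset):
    """Serialize the dataset as CSV text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["editor", "year", "edits"])
    writer.writerows(dataset)
    return out.getvalue()


def run_pipeline(records, chunk_size=2):
    """Chain the stages: chunk, extract, presort, store, transform, export."""
    db = defaultdict(list)
    for part in chunk(records, chunk_size):
        store(presort(extract(r) for r in part), db)
    return export_csv(transform(db))
```

Calling run_pipeline on a list of dictionaries with "editor" and "timestamp" keys returns CSV text with one row per editor and year. Each stage is a separate function, so any one of them (for example the storage step) could be swapped out without touching the others.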

Week 48

  • Presented preliminary results together with Howie
  • Currently downloading latest dumps as download.wikimedia.org is back online

Week 49

  • Optimizing code, adding documentation

Week 50

  • Met Wikimedia staff in SF
  • Presented preliminary results

Week 51 / 52

  • Rewrote some parts of the code to reduce processing time, thanks to Nimish for his suggestions
  • Holidays

Week 1

  • Finalizing cohort charts

Week 2

  • Creating a prediction model for active Wikipedians

Source Code

You can find the source code of this study in the MediaWiki SVN repository. To check out the Editor Trends Study Analytics package into the folder "editor_trends", run:

svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/tools/editor_trends editor_trends

Datasets

The datasets will be made available as soon as possible.