Editor Trends Study/Progress
This page will be used to track the progress of the Editor Trends Study.
We will highlight, by week, the different milestones we reach.
- Launch of wiki
- Collecting input from the Wikipedia community
- First commit of source code to SVN
- Added a command-line interface that allows people to conduct their own analyses
- Improved performance of XML processing
- Redesigned the MongoDB storage to use less hard disk space and improve performance
- Started small group of alpha testers
- Expanded documentation
- Made first charts
- download.wikimedia.org is offline...
- Posted an online tutorial on how to install Editor Trends
- Benchmarked results against Erik Zachte's stats software module
- Refactored large parts of the code to make it more object-oriented
- Version 0.1 has been committed to Subversion. It offers the full data pipeline: downloading a dump file, chunking it into smaller parts, extracting the required information, presorting that information (which improves performance), storing it in MongoDB, transforming it into a dataset and exporting the dataset to a CSV file. This process is fully automated and configurable.
- Initial charts have been made
- Waiting for download.wikimedia.org to resume. Once it does, I will make a second video tutorial demonstrating how to use this tool.
- Presented preliminary results together with Howie
- Currently downloading the latest dumps, now that download.wikimedia.org is back online
- Optimizing code, adding documentation
- Met Wikimedia staff in SF
- Presented preliminary results
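The pipeline stages listed for Version 0.1 can be illustrated with a minimal Python sketch. This is not the actual Editor Trends code: every function name, the record fields ("editor", "timestamp") and the output columns are hypothetical, the download step is left out, and a plain in-memory dictionary stands in for MongoDB.

```python
import csv
import io
from collections import defaultdict
from itertools import islice


def chunk(records, size):
    """Split an iterable of edit records into lists of at most `size` items."""
    it = iter(records)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def extract(block):
    """Keep only the fields needed later (hypothetical: editor and timestamp)."""
    return [(r["editor"], r["timestamp"]) for r in block]


def presort(pairs):
    """Sort by editor, then timestamp, so each editor's edits are adjacent."""
    return sorted(pairs)


def store(db, pairs):
    """Append timestamps per editor; `db` is a dict standing in for MongoDB."""
    for editor, timestamp in pairs:
        db[editor].append(timestamp)


def transform(db):
    """Build one dataset row per editor (hypothetical columns)."""
    return [
        {"editor": editor, "first_edit": min(stamps), "edit_count": len(stamps)}
        for editor, stamps in sorted(db.items())
    ]


def export_csv(rows, out):
    """Write the dataset rows to a file-like object as CSV."""
    writer = csv.DictWriter(out, fieldnames=["editor", "first_edit", "edit_count"])
    writer.writeheader()
    writer.writerows(rows)


def run_pipeline(records, chunk_size=2):
    """Run chunk -> extract -> presort -> store -> transform on the records."""
    db = defaultdict(list)
    for block in chunk(records, chunk_size):
        store(db, presort(extract(block)))
    return transform(db)
```

Calling run_pipeline on a list of record dictionaries and passing the result to export_csv together with an open file (or an io.StringIO buffer) yields one CSV row per editor. Timestamps are assumed to be ISO-formatted strings, so that min() picks the earliest one.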
Week 51 / 52
- Rewrote some parts of the code to reduce processing time, thanks to Nimish for his suggestions
- Finalizing cohort charts
- Creating a prediction model for active Wikipedians
You can find the source code of this study in the MediaWiki SVN repository. To check out the Editor Trends Study analytics package into the folder "editor_trends", run:
svn checkout http://svn.wikimedia.org/svnroot/mediawiki/trunk/tools/editor_trends editor_trends
Will be made available as soon as possible.