Troubleshooting

Samat: can you send me the console output? I need more information.

Whether I run python manage.py dataset or python manage.py -l Hungarian -c new_editor_count dataset, I get the same result.

In the console:

...

Final settings after parsing command line arguments:
         Project: Wikipedia
 Input directory: c:\wikimedia\hu\wiki
Output directory: c:\wikimedia\hu\wiki and subdirectories
        Language: Hungarian / Magyar / hu
Start exporting dataset
Exporting data for chart: new_editor_count
Project: wikilytics
Dataset: huwiki_editors_dataset
wikilytics huwiki_editors_dataset new_wikipedian
100% |########################################################################|
Processing time: 0:00:07.050000
Storing dataset: C:\editor_trends\datasets\huwiki_new_editor_count_max_year=2012_min_year=2003.csv
Serializing dataset to wikilytics_charts
+----------+---------------+--------+---------------+---------+---------+---------+---------------+---------------------+---------------------+
| Variable |          Mean | Median |            SD | Minimum | Maximum | Num Obs |        Num of |           First Obs |           Final Obs |
|          |               |        |               |         |         |         | Unique Groups |                     |                     |
+----------+---------------+--------+---------------+---------+---------+---------+---------------+---------------------+---------------------+
|    count | 973.555555556 |  789.0 | 853.956836016 |      14 |    2154 |    8762 |             9 | 2003-07-09 06:43:25 | 2011-02-22 19:01:16 |
+----------+---------------+--------+---------------+---------+---------+---------+---------------+---------------------+---------------------+
Dataset contains 1 variables
Project: huwiki
JSON encoder: to_bar_json
Raw data was retrieved from: huwiki/huwiki_editors_dataset
None

Processing time: 0:00:07.090000

In the huwiki_new_editor_count_max_year=2012_min_year=2003.csv file:

"date	count"
"1-1-2006:12-31-2006	789"
"1-1-2007:12-31-2007	1560"
"1-1-2005:12-31-2005	287"
"1-1-2004:12-31-2004	66"
"1-1-2003:12-31-2003	14"
"1-1-2010:12-31-2010	1613"
"1-1-2011:12-31-2011	308"
"1-1-2008:12-31-2008	2154"
"1-1-2009:12-31-2009	1971"

Samat‎
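The rows in the CSV file above are not in chronological order, and each line is wrapped in quotes with a tab between the two fields. As a minimal sketch (not part of editor_trends; the function name `parse_dataset` and the inline `SAMPLE` string are only illustrative, and the separator is assumed to be a tab), the file could be read and sorted by year like this:

```python
# Sketch only: parse the quoted, tab-separated dataset shown above.
# SAMPLE reproduces the file content from this thread; parse_dataset is a
# hypothetical helper, not a function of the editor_trends package.
SAMPLE = "\n".join([
    '"date\tcount"',
    '"1-1-2006:12-31-2006\t789"',
    '"1-1-2007:12-31-2007\t1560"',
    '"1-1-2005:12-31-2005\t287"',
    '"1-1-2004:12-31-2004\t66"',
    '"1-1-2003:12-31-2003\t14"',
    '"1-1-2010:12-31-2010\t1613"',
    '"1-1-2011:12-31-2011\t308"',
    '"1-1-2008:12-31-2008\t2154"',
    '"1-1-2009:12-31-2009\t1971"',
])


def parse_dataset(text):
    """Return a list of (year, count) tuples sorted by year."""
    rows = []
    lines = text.strip().splitlines()
    for line in lines[1:]:  # skip the "date count" header line
        # Drop the surrounding quotes, then split the two fields on the tab.
        date_range, count = line.strip().strip('"').split("\t")
        # The range looks like 1-1-2006:12-31-2006; take the year of its start.
        start = date_range.split(":")[0]
        year = int(start.split("-")[2])
        rows.append((year, int(count)))
    return sorted(rows)


if __name__ == "__main__":
    for year, count in parse_dataset(SAMPLE):
        print(year, count)
```

To read the real file instead of `SAMPLE`, pass the result of `open(path).read()` to `parse_dataset`. The nine counts add up to 8762, the "Num Obs" value in the console summary.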

That's the correct behavior, as new_editor_count is the default plugin that runs if you do not explicitly give a plugin name. So python manage.py dataset and python manage.py dataset -c new_editor_count give the same result. The data from the CSV file looks good to me :) so I am happy to see that you are making progress. I will start preparing a video on how to replicate the editor trends study. Thanks for all the questions and feedback!

Drdee‎
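To illustrate why both commands behave the same, here is a minimal sketch of an option with a default value, using Python's standard argparse module. It is not the actual manage.py code: the `build_parser` function, the `--chart` long option and the `command` positional are assumptions made only for this example.

```python
import argparse


def build_parser():
    """Build a toy parser that mimics a -c option with a default plugin."""
    parser = argparse.ArgumentParser(prog="manage.py")
    # Hypothetical option: when -c is omitted, the default plugin is used.
    parser.add_argument("-c", "--chart", default="new_editor_count",
                        help="plugin to run (defaults to new_editor_count)")
    # Hypothetical positional argument for the sub-command, e.g. "dataset".
    parser.add_argument("command")
    return parser


if __name__ == "__main__":
    parser = build_parser()
    implicit = parser.parse_args(["dataset"])
    explicit = parser.parse_args(["dataset", "-c", "new_editor_count"])
    # Both invocations select the same plugin.
    print(implicit.chart == explicit.chart)
```

With a default like this, leaving out -c and passing -c new_editor_count select the same plugin, which matches the identical output reported above.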

As I mentioned above, it works fine.

I'd like to repeat this study and generate the same figures for the Hungarian language that the results page shows for the big language versions, but after the calculation finished, I have only 9 numbers. I expected a more complex result file. :) Which plugins should I run? (Bdamokos and you wrote that many of them don't work yet.)

I also still have a small problem during the process (I'm not sure whether you can fix it or not):

BSON document too large, unable to store TXiKiBoT                             |
BSON document too large, unable to store SieBot                               |
BSON document too large, unable to store Xqbot###                             |
BSON document too large, unable to store Luckas-bot##########                 |
BSON document too large, unable to store SamatBot################             |
BSON document too large, unable to store AsgardBot########################### |
BSON document too large, unable to store DeniBot

What about these editors and their edits?

Thank you for all your trouble,

Samat‎

I've got the same errors, but as all these accounts belong to bots, I don't think it is a big loss if they are not stored among the humans.

(I think Diederik is currently making a video on how to replicate the study, so the second problem should be fixable as well...)

Bdamokos‎

Exactly, these are bot edits and they are discarded at the moment. The reason is a limitation of Mongo. With Mongo 1.8 this should be resolved, so if you are really interested in these edits, I suggest you wait for Mongo 1.8. Otherwise, there is nothing to worry about.

Drdee‎

You are right, Mongo 1.8 solved this problem.

Samat‎

You are right! I am waiting for the tutorial video :)

Thanks for everything, Diederik!

Samat‎