Problem with transforming
The python manage.py transform command gives me this message:
Start transforming dataset wikilytics huwiki_editors_raw
38018 {u'date': datetime.datetime(2003, 9, 13, 7, 14, 41), u'article': 684328, u'ns': 505}
Traceback (most recent call last):
  File "manage.py", line 461, in <module>
    main()
  File "manage.py", line 457, in main
    args.func(rts, logger)
  File "manage.py", line 328, in transformer_launcher
    transformer.transform_editors_single_launcher(rts)
  File "C:\editor_trends\etl\transformer.py", line 313, in transform_editors_single_launcher
    editor()
  File "C:\editor_trends\etl\transformer.py", line 80, in __call__
    character_count = determine_edit_volume(edits, first_year, final_year)
  File "C:\editor_trends\etl\transformer.py", line 226, in determine_edit_volume
    if edit['delta'] < 0:
KeyError: 'delta'
Yes, you need to redo the store and transform phases: I have made significant changes in the last couple of days and added more variables, so records stored before those changes lack fields such as 'delta'.
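The KeyError in the traceback is what Python raises when an edit record has no 'delta' key. As a minimal sketch (the helper name and the sample records are made up for illustration and are not part of editor_trends), one could check a batch of stored edits for missing fields before running the transform:

```python
def find_incomplete_edits(edits, required_fields=('delta',)):
    """Return (edit, missing_fields) pairs for edits lacking a required field."""
    incomplete = []
    for edit in edits:
        missing = [field for field in required_fields if field not in edit]
        if missing:
            incomplete.append((edit, missing))
    return incomplete


# Hypothetical sample: the first record mimics the old format seen in the
# traceback (no 'delta'); the second mimics a record stored after the change.
sample_edits = [
    {'article': 684328, 'ns': 505},
    {'article': 684328, 'ns': 505, 'delta': -12},
]

for edit, missing in find_incomplete_edits(sample_edits):
    print('edit for article %s is missing: %s' % (edit['article'], ', '.join(missing)))
```

If a check like this reports any records, the collection was most likely written by the older code and has to be dropped and stored again.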
To start over, open the mongo shell, enter "use wikilytics", then enter "show collections", and then enter:
db.huwiki_editors_raw.drop()
db.huwiki_editors_dataset.drop()
db.huwiki_articles_raw.drop()
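The same clean-up could also be scripted from Python. The sketch below is only an illustration: drop_collections is a made-up helper, and it assumes a pymongo Database object from a reasonably recent pymongo release (one that offers list_collection_names and drop_collection), which may differ from the version editor_trends was written against.

```python
def drop_collections(db, names):
    """Drop each named collection that exists in db; return the names dropped.

    db is assumed to behave like a pymongo Database object.
    """
    existing = set(db.list_collection_names())
    dropped = []
    for name in names:
        if name in existing:
            db.drop_collection(name)
            dropped.append(name)
    return dropped


# The three collections named in the shell commands above.
STALE = ['huwiki_editors_raw', 'huwiki_editors_dataset', 'huwiki_articles_raw']

# Usage against a local MongoDB (assumption: pymongo installed, server running):
#   from pymongo import MongoClient
#   print(drop_collections(MongoClient()['wikilytics'], STALE))
```

Checking the existing names first means the helper reports only what it actually removed, so it is safe to run twice.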
I get the following error when trying to run manage.py transform:
Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.

C:\wikimedia\editor_trends>manage.py transform
Final settings after parsing command line arguments:
Project: Wikipedia
Input directory: c:\wikimedia
Output directory: c:\wikimedia\hu\wiki and subdirectories
Language: Hungarian / Magyar / hu
Start transforming dataset wikilytics huwiki_editors_raw
Traceback (most recent call last):
  File "C:\wikimedia\editor_trends\manage.py", line 461, in <module>
    main()
  File "C:\wikimedia\editor_trends\manage.py", line 457, in main
    args.func(rts, logger)
  File "C:\wikimedia\editor_trends\manage.py", line 328, in transformer_launcher
    transformer.transform_editors_single_launcher(rts)
  File "C:\wikimedia\editor_trends\etl\transformer.py", line 310, in transform_editors_single_launcher
    ids = db.retrieve_distinct_keys(rts.dbname, rts.editors_raw, 'editor')
  File "C:\wikimedia\editor_trends\database\db.py", line 144, in retrieve_distinct_keys
    ids = retrieve_distinct_keys_mapreduce(editors, field)
  File "C:\wikimedia\editor_trends\database\db.py", line 156, in retrieve_distinct_keys_mapreduce
    cursor = collection.map_reduce(map, reduce)
  File "build\bdist.win-amd64\egg\pymongo\collection.py", line 943, in map_reduce
  File "build\bdist.win-amd64\egg\pymongo\database.py", line 293, in command
  File "build\bdist.win-amd64\egg\pymongo\helpers.py", line 119, in _check_command_response
pymongo.errors.OperationFailure: command SON([('mapreduce', u'huwiki_editors_raw'), ('map', Code('function () { emit(this.editor, 1)};', {})), ('reduce', Code('function()', {}))]) failed: db assertion failure
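In the failed command, the reduce entry is printed as function() with no body, which may be why the server rejects the map/reduce job; that is a guess from the log, not a confirmed diagnosis. Since the map function only emits this.editor, the call appears to be collecting the distinct editor ids. Below is a minimal sketch of doing that on the client side instead, assuming the documents can be iterated (for example through a pymongo cursor from collection.find()). The name distinct_values and the sample documents are made up and are not the project's retrieve_distinct_keys.

```python
def distinct_values(documents, field):
    """Collect the distinct values of field across an iterable of dicts."""
    seen = set()
    for doc in documents:
        if field in doc:
            seen.add(doc[field])
    return seen


# Hypothetical documents standing in for huwiki_editors_raw records.
docs = [
    {'editor': 101, 'article': 684328},
    {'editor': 101, 'article': 12},
    {'editor': 202, 'article': 12},
    {'article': 99},  # no 'editor' field, so it is skipped
]

print(sorted(distinct_values(docs, 'editor')))
```

pymongo collections also offer a distinct(key) method that does this on the server, although its result has to fit in a single BSON document, so it may not suit a very large set of editors.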