Store Wikipedia dump file

I'm not sure whether this is related to the recent issues with the extraction phase, but I'm seeing a problem in the store phase, even though extraction and sorting appeared to finish without any major errors:

rfaulkner@wmf128:~/trunk/projects/editor_trends$ python manage.py -l Polish store

Wikilytics is (c) 2010-2011 by the Wikimedia Foundation.
Written by Diederik van Liere (dvanliere@gmail.com).
This software comes with ABSOLUTELY NO WARRANTY. This is 
    free software, and you are welcome to distribute it under certain 
    conditions.
See the README.1ST file for more information.

Final settings after parsing command line arguments:
         Project: Wikipedia
 Input directory: /home/rfaulkner/wikimedia/pl/wiki
Output directory: /home/rfaulkner/wikimedia/pl/wiki and subdirectories
        Language: Polish / Polski / pl
Start storing data in MongoDB
Storing article titles...
/home/rfaulkner/wikimedia/pl/wiki
2	False	AWK
Traceback (most recent call last):
  File "manage.py", line 583, in <module>
    main()
  File "manage.py", line 579, in main
    args.func(rts, logger)
  File "manage.py", line 306, in store_launcher
    store.launcher(rts)
  File "/home/rfaulkner/trunk/projects/editor_trends/etl/store.py", line 106, in launcher
    store_articles(rts)
  File "/home/rfaulkner/trunk/projects/editor_trends/etl/store.py", line 96, in store_articles
    collection.insert({'id':id, 'title':title})
UnboundLocalError: local variable 'id' referenced before assignment

Any clues as to what may be happening here? I recall there may have been an issue with XML parsing in cElementTree's iterparse. Could this be related?

Renklauf 23:18, 29 March 2011

This should be fixed in the current SVN repository.
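
For anyone who hits the same traceback: the store.py source is not quoted in this thread, so the following is only a guess at the general pattern, not the actual code or the actual fix. An UnboundLocalError like the one above usually means a local variable is assigned on only one branch (or inside a loop body that did not run as expected) and is then read unconditionally. Because `id` is assigned somewhere in the function, Python treats it as a local name, so the built-in `id` is not used as a fallback. The helper names below (`parse_line`, `store_articles_buggy`, `store_articles_safe`) and the assumed two-column, tab-separated input are hypothetical, and a plain list stands in for the MongoDB collection.

```python
def parse_line(line):
    """Split a tab-separated line into (id, title); return None if malformed.

    Assumption: a well-formed line has exactly two fields.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        return None
    return fields[0], fields[1]


def store_articles_buggy(lines):
    """Reproduces the failure mode: names are bound only on one branch."""
    stored = []
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            article_id, title = parsed
        # If the first line is malformed, article_id was never assigned,
        # so this read raises UnboundLocalError.
        stored.append({"id": article_id, "title": title})
    return stored


def store_articles_safe(lines):
    """Skips malformed lines instead of reading unassigned names."""
    stored, skipped = [], []
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            skipped.append(line)  # keep for later inspection
            continue
        article_id, title = parsed
        stored.append({"id": article_id, "title": title})
    return stored, skipped
```

In this sketch, a three-field line such as the `2	False	AWK` printed just before the traceback would trip the buggy version, while the safe version sets it aside and carries on. Whether that line is what actually triggered the error here is not something the log confirms.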

Drdee 17:59, 1 April 2011