Problem (?) with extracting

Fragment of a discussion from Talk:Wikilytics

What OS are you using? And can you email me the log file from /logs/?

Drdee 13:51, 5 April 2011

Win7 64bit. Sure.

Samat 13:53, 5 April 2011

OK, I am making a lot of changes (as you have noted) and I am really trying to get a stable version out as soon as possible. Thanks for your patience.

Drdee 13:55, 5 April 2011

Hmm, I cannot replicate this. Could you please update to the most recent version on SVN and try again?

Drdee 15:38, 5 April 2011

Hmm. I used the latest revision (85439; I update editor_trends through SVN before every run). I have just reproduced it:

Processing of huwiki-latest-stub-meta-history.xml took 0:21:48.918000
Number of articles: 512102
Number of revisions: 8256238
Finished parsing bz2 archives

[waited indefinitely, then pressed CTRL+C]

Traceback (most recent call last):
  File "manage.py", line 461, in <module>
    main()
  File "manage.py", line 457, in main
    args.func(rts, logger)
  File "manage.py", line 405, in all_launcher
    res = function(rts, logger)
  File "manage.py", line 281, in extract_launcher
    enricher.launcher(rts)
  File "C:\editor_trends\etl\enricher.py", line 823, in launcher
    multiprocessor_launcher(function, dataset, storage, locks, rts)
  File "C:\editor_trends\etl\enricher.py", line 777, in multiprocessor_launcher
    input_queue.join()
  File "C:\Program Files\Python 2.7\lib\multiprocessing\queues.py", line 316, in join
    self._cond.wait()
  File "C:\Program Files\Python 2.7\lib\multiprocessing\synchronize.py", line 220, in wait
    self._wait_semaphore.acquire(True, timeout)
KeyboardInterrupt

Samat 16:39, 5 April 2011

So it did finish extracting; the problem is that it did not exit the queue (the traceback shows it still waiting in input_queue.join()). How many processors does your computer have? In the meantime, you can continue with the sort, store and transform phases.
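
For anyone reading along, here is a minimal sketch of how this kind of hang can arise in general. It is not the enricher.py code, and it is only one possible cause. It uses threads and queue.Queue from the standard library, which has the same task_done()/join() contract as multiprocessing.JoinableQueue. The names worker, join_returns, run and acknowledge are made up for this illustration. The idea: join() only returns once task_done() has been called for every item that was put on the queue, so a consumer that finishes its work without acknowledging an item leaves join() waiting forever.

```python
import queue
import threading


def worker(q, results, acknowledge):
    """Consume items until a None sentinel arrives (hypothetical consumer)."""
    while True:
        item = q.get()
        try:
            if item is None:
                return
            results.append(item * 2)
        finally:
            # join() counts every put(), including the sentinel, so each
            # get() must be matched by exactly one task_done().
            if acknowledge:
                q.task_done()


def join_returns(q, timeout=0.5):
    """Return True if q.join() comes back within `timeout` seconds."""
    waiter = threading.Thread(target=q.join, daemon=True)
    waiter.start()
    waiter.join(timeout)
    return not waiter.is_alive()


def run(acknowledge, n_items=5):
    q = queue.Queue()
    results = []
    consumer = threading.Thread(
        target=worker, args=(q, results, acknowledge), daemon=True
    )
    consumer.start()
    for i in range(n_items):
        q.put(i)
    q.put(None)       # sentinel: tells the consumer to stop
    consumer.join()   # all the real work is finished at this point
    return results, join_returns(q)


if __name__ == "__main__":
    print(run(acknowledge=True))
    print(run(acknowledge=False))
```

In both runs all items are processed, but only the run with acknowledge=True gets past join(). That matches the symptom above: the work is reported as finished, yet the program never returns from the join.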

Drdee 17:00, 5 April 2011

The processor has 4 physical cores; with Hyper-Threading the program can use 8 logical cores. It worked fine a few days ago.

Samat 17:13, 5 April 2011
 

I have updated the code with some extra debugging info. Can you update your code, rerun the extract phase and then copy the output on the console and email it to me?

Try both of these options to see whether the result differs:

python manage.py extract

python manage.py all -e download

Drdee 19:51, 5 April 2011