From: Verweij, Arjen on
Hi,

I'm trying to parallelize a loop using Process and Queue. My code looks similar to the example in http://docs.python.org/library/multiprocessing.html?highlight=multiprocessing#module-multiprocessing.managers

It looks like this:

def pp(q, t, s, a, h, p):
    doStuff()
    c.generate()
    q.put(c, True)  # put the generated object on the queue

main:

for a in aa:
    processes = []
    q = Queue()
    for b in bb:
        try:
            if bla > 24:
                print "too old"
                continue
        except:
            print "etc" + t  # file not found
            pass
        p = Process(target=pp, args=(q, t, s, a, h, p,))  # trailing comma, do not touch m'kay
        p.start()
        processes.append(p)
    # end for b in bb
    for p in processes:
        # p.join()  # this deadlocks in combination with the Queue() at some point
        ss = q.get()
        bigcontainer.add(ss)
    bigcontainer.generate()
    world.add(bigcontainer)
# end for a in aa
world.generate()

So I have some XML that requires translation into HTML. I take a sublist, traverse it, spawn a process for every XML file in that list, and generate the HTML inside that process. Then I would very much like to have the results back in the original main() so they can be used. Is there a way to guarantee that the a-in-aa loop will not start its next iteration until all results are in? In other words, I'm worried that q will be depleted due to some unknown factor while a subprocess from the b-in-bb loop still has to write to the Queue(), and that the outer loop will continue anyway, leaking or destroying data.

Before I found the example on that page that pointed out the deadlock, I couldn't get the program to finish at all; but finishing without all the data would be equally bad.

Thanks,

Arjen

Arjen Verweij
QA/Test Engineer



Schoemakerstraat 97
2628 VK  Delft
The Netherlands


Phone:  +31 88 827 7086
Fax:       +31 88 827  7003
Email:  arjen.verweij(a)tass-safe.com
www.tass-safe.com
This e-mail and its contents are subject to a DISCLAIMER with important RESERVATIONS.