From: EW on
Hi

I'm writing a multithreaded app that relies on Queues to move data
between the threads. I'm trying to write my objects in a general way
so that I can reuse them in the future, which means I can't assume
how many producer threads and how many consumer threads I might need.
I also might have different consumer threads do different tasks (for
example, one might write to a log and one might write to SQL), so
again I can't plan for a set ratio of consumers to producers.

So instead of having 1 Queue that all the producers put to and all
the consumers get from, I actually have 1 Queue per producer thread,
which the main body hands to the correct type of consumer thread. So
I could get something like this, where 3 producer threads write to 3
different Queues, all of which get read by 1 consumer thread:

P1    P2    P3
  \    |    /
   \   |   /
      C1

So producers 1, 2, and 3 all write to individual Queues, and consumer 1
has a list of those Queues and reads them all. The problem I'm having
is that those producer threads can come and go pretty quickly. When
they die I can clean up the thread with join(), but I'm still left with
the Queue. So I could get something like this:

P1          P3
  \    |    /
   \   |   /
      C1

So here the P2 thread has ended and gone away but I still have his
Queue lingering.

So on a thread I can use is_alive() to check status and use join() to
clean up, but I don't see any analogous functionality for Queues. How
do I kill them? I thought about putting a suicide message on the
Queue and then C1 would read it and set the variable to None, but I'm
not sure setting the variable to None actually makes the Queue go
away. It could just end up sitting in memory unreferenced, and
that's not good. Additionally, I could have any number of consumer
threads reading that Queue, so once the first one gets the suicide note
the other consumer threads never would.

I figure there has to be an elegant way for managing my Queues but so
far I can't find it. Any suggestions would be appreciated and thanks
in advance for any help.


ps Python rocks.
From: EW on
On Aug 11, 12:55 pm, EW <ericwoodwo...(a)gmail.com> wrote:
> Hi
>
> I'm writing a multithreaded app that relies on Queues to move data
> between the threads.  I'm trying to write my objects in a general way
> so that I can reuse them in the future so I need to write them in such
> a way that I don't know how many producer and how many consumer
> threads I might need.  I also might have different consumer threads do
> different tasks (for example one might write to a log and one might
> write to SQL) so that again means I can't plan for a set ratio of
> consumers to producers.  So it's unknown.
>
> So this means that instead of having 1 Queue that all the producers
> put to and that all the consumers get from I actually have 1 Queue per
> producer thread  that the main body sends to the correct type of
> consumer thread.  So I could get something like this where 3 producer
> threads write to 3 different Queues all of which get read by 1
> consumer thread:
>
> P1    P2   P3
>      \    |   /
>        \  |  /
>         C1
>
> So producers 1, 2, and 3 all write to individual Queues and consumer 1
> had a list of those Queues and reads them all.  The problem I'm having
> is that those producer threads can come and go pretty quickly and when
> they die I can cleanup the thread with join() but I'm still left with
> the Queue.  So I could get something like this:
>
> P1         P3
>      \    |   /
>        \  |  /
>         C1
>
> So here the P2 thread has ended and gone away but I still have his
> Queue lingering.
>
> So on a thread I can use is_alive() to check status and use join() to
> clean up but I don't see any analogous functionality for Queues.  How
> do I kill them?  I thought about putting a suicide message on the
> Queue and then C1 would read it and set the variable to None but i'm
> not sure setting the variable to None actually makes the Queue go
> away.  It could just end up sitting in memory unreferenced - and
> that's not good.  Additionally, I could have any number of consumer
> threads reading that Queue so once the first one get the suicide note
> the other consumer threads never would.
>
> I figure there has to be an elegant way for managing my Queues but so
> far I can't find it.  Any suggestions would be appreciated and thanks
> in advance for any help.
>
> ps Python rocks.

Whoo..the formatting got torn up! My terrible diagrams are even more
terrible! Oh well, I think you'll catch my meaning :)
From: Paul Rubin on
EW <ericwoodworth(a)gmail.com> writes:
> I also might have different consumer threads do
> different tasks (for example one might write to a log and one might
> write to SQL) so that again means I can't plan for a set ratio of
> consumers to producers.... So it's unknown.
>
> So this means that instead of having 1 Queue that all the producers
> put to and that all the consumers get from I actually have 1 Queue per
> producer thread

That doesn't sound appropriate. Queues can have many readers and many
writers. So use one queue per task (logging, SQL, etc), regardless of
the number of producer or consumer threads. Any producer with an SQL
request sends it to the SQL queue, which can have many listeners. The
different SQL consumer threads listen to the SQL queue and pick up
requests and handle them.
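
A rough sketch of that layout, in case it helps. It assumes Python 3
(where the module is named queue rather than Queue). The names STOP,
producer, consumer and run_task_queue_demo are made up for
illustration, and appending to a list stands in for the real SQL and
logging work:

```python
import queue
import threading

STOP = object()  # hypothetical "no more work" marker, one per consumer


def producer(name, sql_queue, log_queue, count):
    # Every producer writes to the same two shared queues.
    for i in range(count):
        sql_queue.put(f"{name}: INSERT row {i}")
        log_queue.put(f"{name}: produced row {i}")


def consumer(task_queue, handled):
    # Any number of consumers can block on the same queue.
    while True:
        item = task_queue.get()
        if item is STOP:
            break
        handled.append(item)  # stand-in for the real SQL/log work


def run_task_queue_demo(n_producers=3, n_sql=2, n_log=1, items_each=4):
    sql_queue, log_queue = queue.Queue(), queue.Queue()
    sql_handled, log_handled = [], []

    consumers = [threading.Thread(target=consumer, args=(sql_queue, sql_handled))
                 for _ in range(n_sql)]
    consumers += [threading.Thread(target=consumer, args=(log_queue, log_handled))
                  for _ in range(n_log)]
    producers = [threading.Thread(target=producer,
                                  args=(f"P{i + 1}", sql_queue, log_queue, items_each))
                 for i in range(n_producers)]

    for t in consumers + producers:
        t.start()
    for t in producers:
        t.join()  # producers can come and go; the queues stay put

    # One STOP per consumer, queued behind all the real work.
    for _ in range(n_sql):
        sql_queue.put(STOP)
    for _ in range(n_log):
        log_queue.put(STOP)
    for t in consumers:
        t.join()
    return sql_handled, log_handled
```

The number of queues here depends only on the number of task types, so
a producer thread ending leaves nothing behind to clean up.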
From: EW on
On Aug 11, 1:18 pm, Paul Rubin <no.em...(a)nospam.invalid> wrote:
> EW <ericwoodwo...(a)gmail.com> writes:
> > I also might have different consumer threads do
> > different tasks (for example one might write to a log and one might
> > write to SQL) so that again means I can't plan for a set ratio of
> > consumers to producers....  So it's unknown.
>
> > So this means that instead of having 1 Queue that all the producers
> > put to and that all the consumers get from I actually have 1 Queue per
> > producer thread
>
> That doesn't sound appropriate.  Queues can have many readers and many
> writers.  So use one queue per task (logging, SQL, etc), regardless of
> the number of producer or consumer threads.  Any producer with an SQL
> request sends it to the SQL queue, which can have many listeners.  The
> different SQL consumer threads listen to the SQL queue and pick up
> requests and handle them.

I thought about doing it that way, and I could, but it still seems
like there should be a way to clean up Queues on my own. If I did it
this way then I guess I'd be relying on garbage collection when the
script ended to clean up the Queues for me.

What if I want to clean up my own Queues? Regardless of the specifics
of my current design, I'm just generally curious how people manage
cleanup of their Queues when they don't want them anymore.
From: MRAB on
EW wrote:
[snip]
> So here the P2 thread has ended and gone away but I still have his
> Queue lingering.
>
> So on a thread I can use is_alive() to check status and use join() to
> clean up but I don't see any analogous functionality for Queues. How
> do I kill them? I thought about putting a suicide message on the
> Queue and then C1 would read it and set the variable to None but i'm
> not sure setting the variable to None actually makes the Queue go
> away. It could just end up sitting in memory unreferenced - and
> that's not good. Additionally, I could have any number of consumer
> threads reading that Queue so once the first one get the suicide note
> the other consumer threads never would.
>
> I figure there has to be an elegant way for managing my Queues but so
> far I can't find it. Any suggestions would be appreciated and thanks
> in advance for any help.
>
An object will be available for garbage collection when nothing refers
to it either directly or indirectly. If it's unreferenced then it will
go away.
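
One way to see this for yourself is with a weak reference, which does
not keep its target alive. This is only a sketch: it assumes Python 3's
queue module, the function name is made up, and the gc.collect() call
is there for implementations other than CPython (CPython frees the
queue as soon as its reference count reaches zero):

```python
import gc
import queue
import weakref


def queue_is_collected_after_dropping_references():
    q = queue.Queue()
    q.put("leftover item")   # unread items do not keep the queue alive
    queues = [q]             # e.g. the consumer's list of queues
    probe = weakref.ref(q)   # lets us check without holding a reference

    queues.remove(q)         # drop the reference held by the list
    del q                    # drop the local name
    gc.collect()             # harmless in CPython, helps elsewhere

    return probe() is None   # True once the queue object is gone
```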

As for the suicide note, if a consumer sees it then it can put it back
into the queue so other consumers will see it and then forget about the
queue (set the variable which refers to the queue to None, or, if the
references are in a list, delete it from the list).
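
Something along these lines, as a sketch only. It assumes Python 3's
queue module, one producer per queue (so the note is always the last
thing put on it), and a short polling timeout so that a consumer can
watch several queues. The names STOP, consumer and run_sentinel_demo
are made up:

```python
import queue
import threading

STOP = object()  # hypothetical suicide note


def consumer(queues, handled):
    queues = list(queues)  # each consumer keeps its own list of queues
    while queues:
        for q in list(queues):
            try:
                item = q.get(timeout=0.05)
            except queue.Empty:
                continue
            if item is STOP:
                q.put(STOP)       # put the note back for the other consumers
                queues.remove(q)  # then forget about this queue
            else:
                handled.append(item)


def run_sentinel_demo(n_consumers=3):
    q1, q2 = queue.Queue(), queue.Queue()
    for i in range(5):
        q1.put(("P1", i))
        q2.put(("P2", i))
    q1.put(STOP)  # each producer's last act before it ends
    q2.put(STOP)

    handled = []
    threads = [threading.Thread(target=consumer, args=([q1, q2], handled))
               for _ in range(n_consumers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(handled)
```

Each queue ends up holding just the note. Once the last consumer has
removed the queue from its list and nothing else refers to it, the
queue and the note are garbage collected together.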