From: Nathan on
I want one process to continually loop through a list of objects (in
the form of a hash), while another process continually refreshes that
list. I've done it the most obvious way below, but are there any
pitfalls here? I don't really know how the each method works, so will
it be looking for keys that may not be there anymore? Would I be
better off doing some kind of merge of the old hash with the new,
rather than replacing the old hash entirely?
Any help much appreciated.
Nathan

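# Both variables must exist before the thread block is created,
# otherwise the block cannot see them.
terminate = false
myHash = MyClass.generate_list_of_objects

tester = Thread.new {
  until terminate
    myHash.each do |key, value|
      value.test()
    end
  end
}

while Time.now < time_to_finish
  myHash = MyClass.generate_list_of_objects
end

terminate = true

tester.join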

From: Brian Candler on
Nathan Macinnes wrote:
> I want one process to continually loop through a list of objects (in
> the form of a hash), while another process continually refreshes that
> list. I've done it the most obvious way below, but are there any
> pitfalls here? I don't really know how the each method works, so will
> it be looking for keys that may not be there anymore? Would I be
> better off doing some kind of merge of the old hash with the new,
> rather than replacing the old hash entirely?

Fortunately you are not modifying the hash object at all (and
modifying a hash while iterating through it is a really bad idea).

Rather, you are updating the local variable 'myHash' to point to a new
hash. The tester thread, once it has started iterating through the old
hash, will continue to do so until it gets to the end.
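
To see the difference between rebinding the variable and mutating the hash, here is a small single-threaded sketch. The data is made up for illustration; the same reasoning should apply when the reassignment happens in another thread.

```ruby
# Hypothetical data, for illustration only.
my_hash = { a: 1, b: 2 }

seen = []
my_hash.each do |key, _value|
  # Rebinding the variable mid-iteration: `each` was called on the old
  # hash object, so it keeps walking that object to the end.
  my_hash = { x: 10, y: 20 }
  seen << key
end

p seen      # keys of the old hash
p my_hash   # the variable now refers to the new hash

# By contrast, adding a key to the hash that is being iterated raises.
error = nil
begin
  h = { a: 1 }
  h.each { |k, _| h[:"#{k}_copy"] = 0 }
rescue => e
  error = e
end
p error.class
```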

You could make this even more explicit by eliminating the 'terminate'
variable:

tester = Thread.new {
  while myHash
    myHash.each ...
  end
}

...
myHash = nil
tester.join

But is there a particular reason to do this using threads? It would be
simpler like this:

while Time.now < time_to_finish
  myHash = MyClass.generate_list_of_objects
  myHash.each do |key, value|
    value.test
  end
end

Note that MRI won't give you thread concurrency across multiple cores,
although JRuby does.
--
Posted via http://www.ruby-forum.com/.

From: Nathan on
Thanks for the clarification... My application is network based, and
some operations take several seconds. generate_list_of_objects takes
anything between 2 and 15 seconds, so I wouldn't like to have my whole
program on hold while that's happening. Each test process takes a few
seconds too, so if the object is no longer on the list, it isn't a
particularly bad thing if test tries to run, but it'll take up a lot
of time unnecessarily.

I'm trying to work out if there's another way of doing it then.
Perhaps I'll modify the test method so that it knows if the current
object is on the list, and only runs the test if it is.
Nathan

From: Matthew K. Williams on
On Wed, 7 Apr 2010, Nathan wrote:

> Thanks for the clarification... My application is network based, and
> some operations take several seconds. generate_list_of_objects takes
> anything between 2 and 15 seconds, so I wouldn't like to have my whole
> program on hold while that's happening. Each test process takes a few
> seconds too, so if the object is no longer on the list, it isn't a
> particularly bad thing if test tries to run, but it'll take up a lot
> of time unnecessarily.
>
> I'm trying to work out if there's another way of doing it then.
> Perhaps I'll modify the test method so that it knows if the current
> object is on the list, and only runs the test if it is.
> Nathan

Might a messaging queue work better? That way you don't have the
concurrency issues, as well as the issues with changing a hash in the
middle of iteration.
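
A rough sketch of what that could look like with Ruby's built-in Queue. The Item struct, the :done sentinel and the fixed three rounds are stand-ins invented for the example, not part of the original code.

```ruby
require 'thread' # Queue is built in on current Rubies; the require is harmless

# Hypothetical stand-in for the objects returned by
# MyClass.generate_list_of_objects.
Item = Struct.new(:name) do
  def test
    "tested #{name}"
  end
end

queue   = Queue.new
results = []

# Consumer: blocks in pop until work arrives, stops at the :done sentinel.
tester = Thread.new do
  while (item = queue.pop) != :done
    results << item.test
  end
end

# Producer: push each object as it is generated (three rounds here
# instead of a time limit, to keep the sketch short).
3.times do |round|
  [Item.new("a#{round}"), Item.new("b#{round}")].each { |item| queue << item }
end
queue << :done

tester.join
puts results
```

One trade-off to be aware of: anything already on the queue still gets tested, even if a later refresh would have dropped it.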

Matt

From: Dan Drew on
Depending on how memory efficient you want to be, you could also try one of these:

1) Memory hog method... this will have potentially two copies of your data at a given time

Thread 1:
  lock_shared # using whatever mechanism you want, such as a mutex
  testHash = sharedHash
  unlock_shared
  # iterate and test

Thread 2:
  newHash = generate_hash()
  lock_shared
  sharedHash = newHash
  unlock_shared
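
For what it's worth, method 1 might look roughly like this with a Mutex standing in for lock_shared/unlock_shared. The hash contents and the `tested` array are invented for the example, and the literal `{ c: 3 }` takes the place of generate_hash().

```ruby
require 'thread'

lock        = Mutex.new
shared_hash = { a: 1, b: 2 } # hypothetical initial data
tested      = []

# Thread 1: grab a reference to the current hash under the lock, then
# iterate over that snapshot with the lock released.
tester = Thread.new do
  test_hash = lock.synchronize { shared_hash }
  test_hash.each { |key, value| tested << [key, value] }
end

# Thread 2: build the replacement outside the lock and swap it in
# under the lock.
refresher = Thread.new do
  new_hash = { c: 3 } # stands in for generate_hash()
  lock.synchronize { shared_hash = new_hash }
end

[tester, refresher].each(&:join)
p tested
p shared_hash
```

Depending on which thread gets the lock first, the tester sees either the old hash or the new one in full, never a mixture. The old hash stays in memory until the tester is done with it, which is where the extra memory comes from.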

2) Memory efficient but more contention

Thread 1:
  lock_shared
  test_keys = sharedHash.keys
  unlock_shared
  test_keys.each do |k|
    lock_shared
    v = sharedHash[k]
    unlock_shared
    test(v) if v
  end

Thread 2:
  # for each item as it's loaded from the network
  generate_index do |k,v|
    next if sharedHash[k] == v # Optional to avoid unnecessary contention
    lock_shared
    if v
      sharedHash[k] = v
    else
      sharedHash.delete(k)
    end
    unlock_shared
  end
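
And a comparable sketch of method 2, again with a Mutex. The lambdas apply_update and run_tests are names made up for this example, and the updates are applied before the tester starts only so the outcome is predictable; in real use they would run concurrently.

```ruby
require 'thread'

lock        = Mutex.new
shared_hash = { a: 1, b: 2, c: 3 } # hypothetical data
tested      = []

# Refresher side: apply one update at a time; a nil value means "removed".
apply_update = lambda do |key, value|
  lock.synchronize do
    if value.nil?
      shared_hash.delete(key)
    else
      shared_hash[key] = value
    end
  end
end

# Tester side: snapshot only the keys, then re-read each value under the
# lock just before testing, skipping entries that have since disappeared.
run_tests = lambda do
  keys = lock.synchronize { shared_hash.keys }
  keys.each do |key|
    value = lock.synchronize { shared_hash[key] }
    tested << [key, value] if value
  end
end

apply_update.call(:b, nil) # :b is removed
apply_update.call(:d, 4)   # :d is added
Thread.new { run_tests.call }.join
p tested
```

The lock is held only for the brief hash lookups, never while a slow test is running, so the refresher is not blocked for seconds at a time.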

Dan

