Using array instead of hash and still have efficient cross-referencing [Ruby]


From: Intransition on 5 May 2010 17:59

On May 5, 2:40 pm, Caleb Clausen <vikk...(a)gmail.com> wrote:
> On 5/5/10, Intransition <transf...(a)gmail.com> wrote:
>
> > On May 5, 12:58 pm, Caleb Clausen <vikk...(a)gmail.com> wrote:
> >> Why not create a temporary hash before the other.modules.each loop and
> >> use it to tell which modules are present.... something like (assuming
> >> you rewrite @modules as an array):
>
> >> known_mods = {}
> >> c.modules.each { |mod| known_mods[mod.base] = mod }
> >> other.modules.each do |ofmod|
> >>   if known_mods[ofmod.base]
> >>     ...
> >>   end
> >> end
>
> >> Your Snapshot#- won't be as efficient as it is now, since you have to
> >> build that index up every time it's called, but it should be a fairly
> >> minor performance degradation, I would think.
>
> > That was my first alternative idea too. I'm just not sure if it's
> > worth the efficiency trade-off.
>
> From what you say about how this is used, it doesn't sound like this
> method is called all that much. Once per file in the lib tested? I'd
> say just use a temporary hash and get on with your life until and
> unless the performance actually proves to be a problem. Or stick with
> your current solution of a permanent hash.
>
> It's worth noting that (if I recall correctly) Array#-, which you are
> calling 6 times inside that if statement, creates a temporary hash of
> the receiver's contents in order to do its own work efficiently. So,
> the cost of your one temporary hash is probably vastly dwarfed by the
> cost of the 6 temporary hashes created by stdlib on every loop
> iteration.
>
> I expect that a temp hash would perform reasonably even with thousands
> of files to scan and/or thousands of classes/modules in ObjectSpace.
> So, it should scale to all but the very largest projects, and those
> should probably expect a performance hit for their large size.

Interesting. I may go ahead and do it this way then. Thanks.
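
For reference, here is a minimal sketch of the temporary-hash approach as I understand it. ModInfo, its fields and common_by_base are hypothetical stand-ins for illustration, not the actual Snapshot code:

```ruby
# Hypothetical stand-in for the per-module records kept in a snapshot.
ModInfo = Struct.new(:base, :meths)

# Build a throwaway index of `ours` keyed by base, then walk `theirs`
# and collect the records that appear on both sides.
def common_by_base(ours, theirs)
  index = {}
  ours.each { |mod| index[mod.base] = mod }

  pairs = []
  theirs.each do |other|
    mine = index[other.base]
    pairs << [mine, other] if mine
  end
  pairs
end

before = [ModInfo.new(:String, [:upcase, :strip]), ModInfo.new(:Array, [:push])]
after  = [ModInfo.new(:String, [:upcase]), ModInfo.new(:Hash, [:fetch])]

# Compare only the modules present in both snapshots.
common_by_base(before, after).each do |mine, other|
  puts "#{mine.base}: #{(mine.meths - other.meths).inspect}"
end
```

The index is rebuilt on every call, which is the trade-off discussed above, but each lookup is then a single hash access instead of a scan of the array.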

> > I was thinking there might be a way to
> > do it where the two arrays are sorted by name and then iterate down the
> > list popping off one or the other and merging based on <=>, but I
> > haven't worked it out yet. Even though there are two sorts involved it
> > should be just as fast I think.
>
> It's a neat idea.... but sounds a little complex to implement.

Which is why I haven't yet tried it. I may give it a shot. If it
works, great. If not, I'll go with the above.
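
A rough sketch of how the sorted merge might look, assuming each entry has a unique, comparable name. Entry and merge_by_name are hypothetical names used only for illustration:

```ruby
# Hypothetical record type; `name` is assumed unique within each array.
Entry = Struct.new(:name, :meths)

# Sort both arrays by name, then walk them in step, advancing whichever
# side holds the smaller name and pairing entries when the names match.
def merge_by_name(left, right)
  a = left.sort_by(&:name)
  b = right.sort_by(&:name)
  i = j = 0
  matched = []

  while i < a.size && j < b.size
    case a[i].name <=> b[j].name
    when 0
      matched << [a[i], b[j]]
      i += 1
      j += 1
    when -1
      i += 1 # name exists only on the left side
    else
      j += 1 # name exists only on the right side
    end
  end
  matched
end

left  = [Entry.new("String", [:upcase, :strip]), Entry.new("Array", [:push])]
right = [Entry.new("Hash", [:fetch]), Entry.new("String", [:upcase])]

merge_by_name(left, right).each do |l, r|
  puts "#{l.name}: #{(l.meths - r.meths).inspect}"
end
```

The walk itself is linear, but the two sorts are O(n log n), so this is unlikely to beat the temporary hash unless the arrays are already kept in sorted order.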

> > Lemon is a unit testing framework that has a strict testcase<->class/
> > module and unit<->method correspondence. By taking a snapshot of the
> > system before and after a target library is loaded it can provide test
> > coverage information.
>
> Is this just verifying that there is a test of some sort for every
> method? Or actually a deeper level of coverage (line coverage) like
> what rcov does?

Just verifying, nothing like rcov. Its main purpose is to act as a
guide.

Having a new test library is perhaps overkill for what it does. I have
been thinking about refactoring it into an extension for Test::Unit/
MiniTest. But I do not particularly relish working with either of
those code bases. We'll see.
