Count substrings in string, scan too slow [Ruby]

Prev: ruby gem installation problem on MAC PC which is behind firewall
Next: IO select. Reconnect procedure.

From: Charles Oliver Nutter on 29 Jun 2010 15:13

On Thu, Jun 24, 2010 at 2:48 PM, <brabuhr(a)gmail.com> wrote:
> :-( that wasn't the right one :-)
>
> Â x.report 'boyer_moore' do
> Â Â TIMES.times do
> Â Â Â count = BoyerMoore.match("yo", s).size
> Â Â Â check count
> Â Â end
> Â end

FYI, a large part of the overhead here is probably the Java calls,
which are a bit slower than Ruby to Ruby calls (plus it's decoding the
"yo" string to UTF-16 each call). For a larger string and fewer calls,
the pure Java BoyerMoore performance would likely benchmark a lot
better than this.

- Charlie

From: Robert Dober on 29 Jun 2010 16:19

On Thu, Jun 24, 2010 at 6:05 PM, Robert Klemme
<shortcutter(a)googlemail.com> wrote:
I too took the liberty to change the benchmark and I found a strange
way to beat the "while"
but by little :(

http://gist.github.com/457751

--
The best way to predict the future is to invent it.
-- Alan Kay

From: brabuhr on 29 Jun 2010 22:04

On Tue, Jun 29, 2010 at 3:13 PM, Charles Oliver Nutter
<headius(a)headius.com> wrote:
> On Thu, Jun 24, 2010 at 2:48 PM, <brabuhr(a)gmail.com> wrote:
>> x.report 'boyer_moore' do
>> TIMES.times do
>> count = BoyerMoore.match("yo", s).size
>> check count
>> end
>> end
>
> FYI, a large part of the overhead here is probably the Java calls,
> which are a bit slower than Ruby to Ruby calls (plus it's decoding the
> "yo" string to UTF-16 each call). For a larger string and fewer calls,
> the pure Java BoyerMoore performance would likely benchmark a lot
> better than this.

I had a similar suspicion and had started a modified benchmark doing
fewer loops over larger data, but had to move on to other things.

This gives me a chance to try out the JRuby Mac Installer...

Original benchmark:

jruby 1.5.0 (ruby 1.8.7 patchlevel 249) (2010-05-12 6769999) (Java
HotSpot(TM) 64-Bit Server VM 1.6.0_20) [x86_64-java]
Rehearsal -----------------------------------------------
scan 8.851000 0.000000 8.851000 ( 8.784000)
scan ++ 14.186000 0.000000 14.186000 ( 14.186000)
scan re 8.594000 0.000000 8.594000 ( 8.594000)
scan re ++ 15.558000 0.000000 15.558000 ( 15.558000)
while 8.102000 0.000000 8.102000 ( 8.101000)
strscan 14.023000 0.000000 14.023000 ( 14.023000)
boyer_moore 7.446000 0.000000 7.446000 ( 7.446000)
------------------------------------- total: 76.760000sec

user system total real
scan 8.157000 0.000000 8.157000 ( 8.157000)
scan ++ 13.953000 0.000000 13.953000 ( 13.953000)
scan re 8.346000 0.000000 8.346000 ( 8.346000)
scan re ++ 15.332000 0.000000 15.332000 ( 15.333000)
while 8.087000 0.000000 8.087000 ( 8.087000)
strscan 14.303000 0.000000 14.303000 ( 14.303000)
boyer_moore 6.885000 0.000000 6.885000 ( 6.885000)

Even with the Ruby to Java call overhead, the Java BoyerMoore is
coming back the fastest on this machine. For comparison:

ruby 1.8.7 (2009-06-12 patchlevel 174) [universal-darwin10.0]
Rehearsal ----------------------------------------------
scan 31.030000 0.020000 31.050000 ( 31.094718)
scan ++ 62.310000 0.900000 63.210000 ( 63.227271)
scan re 31.030000 0.030000 31.060000 ( 31.110528)
scan re ++ 62.820000 0.870000 63.690000 ( 63.718876)
while 26.090000 0.020000 26.110000 ( 26.095308)
strscan 28.440000 0.010000 28.450000 ( 28.485140)
----------------------------------- total: 243.570000sec

user system total real
scan 31.240000 0.020000 31.260000 ( 31.264699)
scan ++ 64.000000 0.860000 64.860000 ( 64.865223)
scan re 31.570000 0.020000 31.590000 ( 31.581045)
scan re ++ 64.180000 0.980000 65.160000 ( 65.401667)
while 26.580000 0.030000 26.610000 ( 26.757658)
strscan 28.730000 0.030000 28.760000 ( 28.831860)

Unfortunately, I do not have 1.9.x on this machine at the moment.

First | Prev |
Pages: 1 2 3 4
Prev: ruby gem installation problem on MAC PC which is behind firewall
Next: IO select. Reconnect procedure.