From: brabuhr on
On Thu, Jun 24, 2010 at 1:16 PM, Michael Fellinger
<m.fellinger(a)gmail.com> wrote:
> I've just run some benchmarks with strscan, and it's at least in the
> same ballpark as the other approaches, unless you're on rubinius, but
> then all string processing is really slow on that anyway.
>
> Benchmark with strscan here: http://gist.github.com/451675

http://en.literateprograms.org/Boyer-Moore_string_search_algorithm_%28Java%29

require 'java'
java_import 'BoyerMoore'

x.report 'boyer_moore' do
  count = BoyerMoore.match("yo", s).size
  check count
end
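
For anyone without the Java class on the classpath, here is a rough pure-Ruby sketch of the same idea. Note that it is the simpler Horspool variant (bad-character shifts only), not the full Boyer-Moore from the page linked above, and the method name horspool_matches is made up for this example. It returns the start offsets of every match, assuming that is also what BoyerMoore.match returns.

```ruby
# Hypothetical helper: Boyer-Moore-Horspool search over bytes.
# Returns the byte offsets of every (possibly overlapping) match.
def horspool_matches(pattern, text)
  m = pattern.bytesize
  n = text.bytesize
  return [] if m.zero? || m > n

  pat = pattern.bytes.to_a

  # Bad-character table: how far the window may slide when the byte under
  # its last position is b. Bytes not in the pattern allow a full shift.
  shift = Hash.new(m)
  pat[0...-1].each_with_index { |b, i| shift[b] = m - 1 - i }

  matches = []
  pos = 0
  while pos <= n - m
    # Compare right to left inside the current window.
    j = m - 1
    j -= 1 while j >= 0 && text.getbyte(pos + j) == pat[j]
    matches << pos if j < 0
    pos += shift[text.getbyte(pos + m - 1)]
  end
  matches
end

p horspool_matches("yo", "yo dawg, yo")   # => [0, 9]
```

With a two-byte pattern like "yo" the shifts are at most 2, so the skipping buys far less here than it would with a long pattern.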

$ jruby -v yomark.rb
jruby 1.5.0 (ruby 1.8.7 patchlevel 249) (2010-05-12 6769999) (Java
HotSpot(TM) Client VM 1.6.0_20) [i386-java]
Rehearsal -----------------------------------------------
scan         22.423000   0.000000  22.423000 ( 22.334000)
scan ++      36.738000   0.000000  36.738000 ( 36.738000)
scan re      19.451000   0.000000  19.451000 ( 19.451000)
scan re ++   39.222000   0.000000  39.222000 ( 39.222000)
while        22.621000   0.000000  22.621000 ( 22.622000)
strscan      29.075000   0.000000  29.075000 ( 29.076000)
boyer_moore   0.009000   0.000000   0.009000 (  0.009000)
------------------------------------ total: 169.539000sec

                  user     system      total        real
scan         18.050000   0.000000  18.050000 ( 18.051000)
scan ++      35.046000   0.000000  35.046000 ( 35.046000)
scan re      17.807000   0.000000  17.807000 ( 17.807000)
scan re ++   34.086000   0.000000  34.086000 ( 34.085000)
while        22.089000   0.000000  22.089000 ( 22.089000)
strscan      29.538000   0.000000  29.538000 ( 29.538000)
boyer_moore   0.005000   0.000000   0.005000 (  0.004000)

$ jruby -v --server --fast yomark.rb
jruby 1.5.0 (ruby 1.8.7 patchlevel 249) (2010-05-12 6769999) (Java
HotSpot(TM) Server VM 1.6.0_20) [i386-java]
yobench.rb:50 warning: Useless use of a variable in void context.
Rehearsal -----------------------------------------------
scan         17.340000   0.000000  17.340000 ( 17.154000)
scan ++      23.986000   0.000000  23.986000 ( 23.987000)
scan re      15.170000   0.000000  15.170000 ( 15.169000)
scan re ++   22.805000   0.000000  22.805000 ( 22.806000)
while        12.050000   0.000000  12.050000 ( 12.050000)
strscan      31.396000   0.000000  31.396000 ( 31.396000)
boyer_moore   0.010000   0.000000   0.010000 (  0.010000)
------------------------------------ total: 122.756999sec

                  user     system      total        real
scan         15.201000   0.000000  15.201000 ( 15.201000)
scan ++      23.758000   0.000000  23.758000 ( 23.758000)
scan re      14.770000   0.000000  14.770000 ( 14.770000)
scan re ++   22.455000   0.000000  22.455000 ( 22.455000)
while        12.182000   0.000000  12.182000 ( 12.182000)
strscan      24.497000   0.000000  24.497000 ( 24.497000)
boyer_moore   0.002000   0.000000   0.002000 (  0.002000)

From: brabuhr on
> http://en.literateprograms.org/Boyer-Moore_string_search_algorithm_%28Java%29
>
>  require 'java'
>  java_import 'BoyerMoore'
>
>  x.report 'boyer_moore' do
>    count = BoyerMoore.match("yo", s).size
>    check count
>  end

:-( that wasn't the right one: the block above ran the search only once instead of TIMES times :-)

x.report 'boyer_moore' do
  TIMES.times do
    count = BoyerMoore.match("yo", s).size
    check count
  end
end
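
To show why the repeat count matters so much, here is a small self-contained harness in the same shape. It is only a sketch: TIMES, the input string and check are defined in the gist, which is not reproduced here, so the values and the check body below are guesses, and String#scan stands in for the Java BoyerMoore class.

```ruby
require 'benchmark'

TIMES    = 100                    # assumed; the gist's real value is not shown here
INPUT    = "yo dawg " * 1_000     # assumed input, one "yo" per repetition
EXPECTED = 1_000

# Stand-in for the gist's check: fail loudly if a counter returns the wrong total.
def check(count)
  raise "expected #{EXPECTED}, got #{count}" unless count == EXPECTED
end

Benchmark.bmbm do |x|
  # Runs the search once, so the time is not comparable with looped reports.
  x.report 'once' do
    check INPUT.scan("yo").size
  end

  # Runs the search TIMES times, like the corrected report block.
  x.report 'TIMES' do
    TIMES.times { check INPUT.scan("yo").size }
  end
end
```

The 'once' row should come out roughly TIMES times smaller than the 'TIMES' row, which is the same kind of gap as between the 0.005 s figure and the corrected figures below.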

jruby 1.5.0 (ruby 1.8.7 patchlevel 249) (2010-05-12 6769999) (Java
HotSpot(TM) Client VM 1.6.0_20) [i386-java]
Rehearsal -----------------------------------------------
boyer_moore  25.742000   0.000000  25.742000 ( 25.661000)
------------------------------------- total: 25.742000sec

                  user     system      total        real
boyer_moore  24.869000   0.000000  24.869000 ( 24.869000)

jruby 1.5.0 (ruby 1.8.7 patchlevel 249) (2010-05-12 6769999) (Java
HotSpot(TM) Server VM 1.6.0_20) [i386-java]
Rehearsal -----------------------------------------------
boyer_moore  16.733000   0.000000  16.733000 ( 16.401000)
------------------------------------- total: 16.733000sec

                  user     system      total        real
boyer_moore  15.970000   0.000000  15.970000 ( 15.971000)

From: botp on
On Fri, Jun 25, 2010 at 1:16 AM, Michael Fellinger wrote:
> I've just run some benchmarks with strscan, and it's at least in the
> same ballpark as the other approaches, unless you're on rubinius, but
> then all string processing is really slow on that anyway.
>
> Benchmark with strscan here: http://gist.github.com/451675
>

that is not fair for strscan.. you are recreating the object inside the loop :)

outside loop do:
  s = StringScanner.new "some string foo..."

inside loop do:
  s.reset   # rewind the scan pointer instead of building a new scanner
  .... s.scan_until...
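
A runnable sketch of the reuse idea, for what it's worth. It builds the scanner once and rewinds it with StringScanner#reset on every pass, so nothing is allocated inside the loop. The helper name count_with_scanner and the sample string are made up for illustration.

```ruby
require 'strscan'

# Hypothetical helper: count non-overlapping matches of pattern using a
# scanner that the caller created once.
def count_with_scanner(scanner, pattern)
  scanner.reset                 # back to position 0, no new object
  count = 0
  count += 1 while scanner.scan_until(pattern)
  count
end

scanner = StringScanner.new("yo dawg, yo yo")   # built once, outside the loop
3.times { puts count_with_scanner(scanner, /yo/) }   # prints 3 each time
```

Assigning the same scanner object to another variable would not rewind it; the second pass would start at the end of the string and find nothing, which is why the reset is there.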

best regards -botp

From: Michael Fellinger on
On Fri, Jun 25, 2010 at 1:01 PM, botp <botpena(a)gmail.com> wrote:
> On Fri, Jun 25, 2010 at 1:16 AM, Michael Fellinger wrote:
>> I've just run some benchmarks with strscan, and it's at least in the
>> same ballpark as the other approaches, unless you're on rubinius, but
>> then all string processing is really slow on that anyway.
>>
>> Benchmark with strscan here: http://gist.github.com/451675
>>
>
> that is not fair for strscan.. you are recreating the object inside the loop :)

That's not fair for the others, and doesn't make any difference in the
benchmark anyway.

--
Michael Fellinger
CTO, The Rubyists, LLC

From: botp on
On Fri, Jun 25, 2010 at 3:38 PM, Michael Fellinger wrote:
> That's not fair for the others,

indeed, in general. but if multiple/repeated passes are done on the
same string, then strscan will make a very big difference.

> and doesn't make any difference in the
> benchmark anyway.

which makes me think that ruby strings might be strscan-ready
without any added init load :)

best regards -botp