From: Peter J. Holzer on
On 2010-07-26 10:07, Wolfram Humann <w.c.humann(a)arcor.de> wrote:
> On Jul 26, 11:39 am, Ben Morrow <b...(a)morrow.me.uk> wrote:
>> I seriously doubt the issue is with Strawberry specifically; almost
>> certainly any issue applies to Win32 perl in general and should be reported
>> to p5p with perlbug. If you can confirm it is specific to Strawberry
>> (so, e.g., a self-compiled mingw perl *doesn't* have the problem) then I
>> think the correct place is the Perl::Dist::Strawberry queue on
>> rt.cpan.org (mail bug-Perl-Dist-Strawbe...(a)rt.cpan.org).
>>
>> (IIRC malloc on Win32 is *also* known to be deadly slow, and also IIRC
>> it's impossible to use perl's malloc without breaking things...)
>
> Thanks for the pointers. My comparison is Strawberry Perl (5.10 and
> 5.12) against Cygwin Perl on the same machine. The latter (as well as
> Perl on Linux) doesn't have the issues I see. Is that a sufficient
> "proof" for the issues being Strawberry specific?

I vaguely remember that ActiveState Perl has similar issues. I think Ben
is correct that this is a problem with Win32 malloc and will affect any
Perl build which uses the Win32 malloc implementation (Cygwin probably
provides its own malloc implementation). A fix which works on both
ActiveState and Strawberry would certainly be preferable to a
Strawberry-specific one.

hp

From: Ben Morrow on

Quoth Wolfram Humann <w.c.humann(a)arcor.de>:
> On Jul 26, 11:39 am, Ben Morrow <b...(a)morrow.me.uk> wrote:
> >
> > I seriously doubt the issue is with Strawberry specifically; almost
> > certainly any issue applies to Win32 perl in general and should be reported
> > to p5p with perlbug. If you can confirm it is specific to Strawberry
> > (so, e.g., a self-compiled mingw perl *doesn't* have the problem) then I
> > think the correct place is the Perl::Dist::Strawberry queue on
> > rt.cpan.org (mail bug-Perl-Dist-Strawbe...(a)rt.cpan.org).
> >
> > (IIRC malloc on Win32 is *also* known to be deadly slow, and also IIRC
> > it's impossible to use perl's malloc without breaking things...)
>
> Thanks for the pointers. My comparison is Strawberry Perl (5.10 and
> 5.12) against Cygwin Perl on the same machine. The latter (as well as
> Perl on Linux) doesn't have the issues I see. Is that a sufficient
> "proof" for the issues being Strawberry specific?

No. As far as Perl is concerned, Cygwin is a separate OS. A fair
comparison would be with ActiveState or (as I said) with a Win32 perl
you've compiled yourself.

If the issue simply turns out to be 'Microsoft don't know how to write a
decent malloc', there is very little p5p can do about it, of course. On
most platforms perl can, and often does, use its own malloc
implementation which is optimised for perl's use (lots of tiny
allocations and deallocations all the time). This isn't possible on
Win32 unless you make a custom build of perl that doesn't support the
fork emulation.
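
To see which of the two cases a given perl falls into, its build configuration can be queried. This is a minimal sketch, assuming the standard Config module and its `usemymalloc` and `d_pseudofork` entries; the helper name `malloc_report` is made up for illustration.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: summarise the malloc-related build settings.
# 'usemymalloc' is 'y' when perl was built with its bundled malloc.
# 'd_pseudofork' is set when fork() is emulated (as on Win32 builds).
sub malloc_report {
    my $mymalloc   = $Config{usemymalloc} // 'unknown';
    my $pseudofork = $Config{d_pseudofork} ? 'yes' : 'no';
    return sprintf "OS: %s, perl's own malloc: %s, fork emulation: %s",
        $^O, $mymalloc, $pseudofork;
}

print malloc_report(), "\n";
```

Running this under each perl being compared (Cygwin, Strawberry, ActiveState) shows whether they differ in the allocator they were built with.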

Ben

From: Peter J. Holzer on
On 2010-07-26 13:11, Ben Morrow <ben(a)morrow.me.uk> wrote:
> If the issue simply turns out to be 'Microsoft don't know how to write a
> decent malloc', there is very little p5p can do about it, of course. On
> most platforms perl can, and often does, use its own malloc
> implementation which is optimised for perl's use (lots of tiny
> allocations and deallocations all the time). This isn't possible on
> Win32 unless you make a custom build of perl that doesn't support the
> fork emulation.

Since the fork emulation works with Win32 malloc, I think it should be
possible to write a custom malloc based on Win32 malloc (or the
underlying API calls) which still works with the fork emulation but is
faster. But it's probably not easy or somebody would have done it
already (I don't pretend to understand either memory allocation or the
fork emulation on windows).

hp

From: Wolfram Humann on
On Jul 26, 3:11 pm, Ben Morrow <b...(a)morrow.me.uk> wrote:
> Quoth Wolfram Humann <w.c.hum...(a)arcor.de>:
> > Thanks for the pointers. My comparison is Strawberry Perl (5.10 and
> > 5.12) against Cygwin Perl on the same machine. The latter (as well as
> > Perl on Linux) doesn't have the issues I see. Is that a sufficient
> > "proof" for the issues being Strawberry specific?
>
> No. As far as Perl is concerned, Cygwin is a separate OS. A fair
> comparison would be with ActiveState or (as I said) with a Win32 perl
> you've compiled yourself.

Oh dear, you're right: ActiveState Perl is just as bad as Strawberry.
Here's my test-case:
(I always append a number of chunks with a total size of 1E6 chars to
an existing string, but the start-size of the existing string and the
chunk-size vary)


use strict;
use warnings;
use Time::HiRes qw(time);

my $c1E1 = '#' x 1E1;
my $c1E2 = '#' x 1E2;
my $c1E3 = '#' x 1E3;
my $c1E4 = '#' x 1E4;
my $c1E5 = '#' x 1E5;


my $str1 = '#' x 1E5;
my $str2 = '#' x 1E6;
my $str3 = '#' x 1E7;

my $str4 = '#' x 1E7;
my $str5 = '#' x 1E7;
my $str6 = '#' x 1E7;
my $str7 = '#' x 1E7;
my $str8 = '#' x 1E7;

my $str9 = '#' x 2E7;
$str9 = '#' x 1E7;

my @ar1 = map{ $c1E2 } 1..1E5;

my @c = (
'1E5 chars + 1E4 x 1E2 chars' => sub{ $str1 .= $c1E2 for 1..1E4 },
'1E6 chars + 1E4 x 1E2 chars' => sub{ $str2 .= $c1E2 for 1..1E4 },
'1E7 chars + 1E4 x 1E2 chars' => sub{ $str3 .= $c1E2 for 1..1E4 },
'',
'1E7 chars + 1E5 x 1E1 chars' => sub{ $str4 .= $c1E1 for 1..1E5 },
'1E7 chars + 1E4 x 1E2 chars' => sub{ $str5 .= $c1E2 for 1..1E4 },
'1E7 chars + 1E3 x 1E3 chars' => sub{ $str6 .= $c1E3 for 1..1E3 },
'1E7 chars + 1E2 x 1E4 chars' => sub{ $str7 .= $c1E4 for 1..1E2 },
'1E7 chars + 1E1 x 1E5 chars' => sub{ $str8 .= $c1E5 for 1..1E1 },
'',
'1E7 chars (pre-extend to 2E7) + 1E4 x 1E2 chars' => sub{ $str9 .= $c1E2 for 1..1E4 },
'1E7 (1E5 x 1E2 chars) array + 1E4 x 1E2 chars ' => sub{ push @ar1, $c1E2 for 1..1E4 },
);

while (@c) {
    my $name = shift @c;
    print("\n"), next unless $name;
    my $code = shift @c;
    my $t1 = time; $code->(); my $t2 = time;
    printf "%s: %6.1f ms\n", $name, 1000 * ($t2 - $t1);
}

##########################################################
And these are the results:

c:\cygwin\bin\perl LongStrings.pl
1E5 chars + 1E4 x 1E2 chars: 1.6 ms
1E6 chars + 1E4 x 1E2 chars: 2.4 ms
1E7 chars + 1E4 x 1E2 chars: 1.5 ms

1E7 chars + 1E5 x 1E1 chars: 11.3 ms
1E7 chars + 1E4 x 1E2 chars: 1.5 ms
1E7 chars + 1E3 x 1E3 chars: 0.9 ms
1E7 chars + 1E2 x 1E4 chars: 1.0 ms
1E7 chars + 1E1 x 1E5 chars: 0.9 ms

1E7 chars (pre-extend to 2E7) + 1E4 x 1E2 chars: 1.2 ms
1E7 (1E5 x 1E2 chars) array + 1E4 x 1E2 chars : 5.5 ms

##########################################################
c:\strawberry\perl\bin\perl LongStrings.pl
1E5 chars + 1E4 x 1E2 chars: 94.4 ms
1E6 chars + 1E4 x 1E2 chars: 319.9 ms
1E7 chars + 1E4 x 1E2 chars: 2710.4 ms

1E7 chars + 1E5 x 1E1 chars: 2656.0 ms
1E7 chars + 1E4 x 1E2 chars: 2656.1 ms
1E7 chars + 1E3 x 1E3 chars: 2609.1 ms
1E7 chars + 1E2 x 1E4 chars: 1109.1 ms
1E7 chars + 1E1 x 1E5 chars: 118.3 ms

1E7 chars (pre-extend to 2E7) + 1E4 x 1E2 chars: 1.2 ms
1E7 (1E5 x 1E2 chars) array + 1E4 x 1E2 chars : 6.5 ms

I compared Strawberry and ActiveState on another machine: the times
are close to each other but even longer than the ones above due to the
older hardware.
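
The two fast rows in the Strawberry results (pre-extended string, array push) suggest possible workarounds for code that has to run on a Win32 perl. The following is only a sketch modelled on the test case above, not a verified fix: the sub names are made up, and whether the pre-extended buffer is actually kept depends on perl internals and may differ between versions.

```perl
use strict;
use warnings;

my $chunk = '#' x 1E2;

# Workaround 1 (assumption: mirrors the $str9 trick from the test case):
# grow the string to its expected final size once, then shrink it back,
# hoping perl keeps the larger buffer so later appends need no realloc.
sub append_preextended {
    my ($start_len, $extra, $n) = @_;
    my $str = '#' x ($start_len + $extra);   # allocate the final size once
    $str = '#' x $start_len;                 # back to the starting length
    $str .= $chunk for 1 .. $n;
    return length $str;
}

# Workaround 2: collect the chunks in an array and join them once,
# so the long string is built in a single step at the end.
sub append_via_join {
    my ($start_len, $n) = @_;
    my @parts = ('#' x $start_len);
    push @parts, $chunk for 1 .. $n;
    my $str = join '', @parts;
    return length $str;
}

print "pre-extended: ", append_preextended(1E6, 1E6, 1E4), " chars\n";
print "via join:     ", append_via_join(1E6, 1E4), " chars\n";
```

Both subs return the final string length so the result can be checked; timing them would need the same Time::HiRes wrapper as in the test case.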

Wolfram

From: Ilya Zakharevich on
On 2010-07-26, Peter J. Holzer <hjp-usenet2(a)hjp.at> wrote:
> On 2010-07-26 13:11, Ben Morrow <ben(a)morrow.me.uk> wrote:
>> If the issue simply turns out to be 'Microsoft don't know how to write a
>> decent malloc', there is very little p5p can do about it, of course. On
>> most platforms perl can, and often does, use its own malloc
>> implementation which is optimised for perl's use (lots of tiny
>> allocations and deallocations all the time). This isn't possible on
>> Win32 unless you make a custom build of perl that doesn't support the
>> fork emulation.

> Since the fork emulation works with Win32 malloc, I think it should be
> possible to write a custom malloc based on Win32 malloc (or the
> underlying API calls) which still works with the fork emulation but is
> faster. But it's probably not easy or somebody would have done it
> already (I don't pretend to understand either memory allocation or the
> fork emulation on windows).

"My" malloc() (one shipped with Perl) needs only 1 syscall implemented
on a particular architecture: get some pages from the system. So the
macro should be defined on the command line of $(CC) -c malloc.c.

Hope this helps,
Ilya