From: Tad McClellan on
James <hslee911(a)yahoo.com> wrote:
> On Jun 21, 10:45 pm, Xho Jingleheimerschmidt <xhos...(a)gmail.com>
> wrote:
>> James wrote:

>> > use vars qw($db $x %h $k $v $i $key $val);
>>
>> Ugg.  Scope variables to the smallest scope you can.


> my $k;
> my @v;
> my $r;


You have not scoped those to the smallest scope you can.

They are "global" within this file.

If you are only going to use them in the for loop, then they
should be scoped to only the for loop. Lose the declarations
above and instead define each at its first use:

> for (<DATA>)
> {
> @v = ();


That line doesn't do anything useful, as you are going to stomp
over @v in the next line anyway. Lose that line too.


> ($k, @v) = split;

my($k, @v) = split;

> $r = \@v;

my $r = \@v;
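
Put together, the loop might look like the sketch below. It is only an
illustration: @lines stands in for the DATA section and %seen is a
made-up in-memory hash, neither of which is in the original script.
Because @v is declared inside the loop, each pass gets a fresh array,
so a reference to it can be kept without later passes overwriting it.

```perl
use strict;
use warnings;

# Assumption: stand-in for the lines under __DATA__ in the quoted script.
my @lines = ("aa 1 2 3", "cc 678 99", "zz foo fee fuu fun");

# Assumption: a plain in-memory hash, used only to show what gets kept.
my %seen;

for my $line (@lines) {
    my ($k, @v) = split ' ', $line;   # declared at first use, loop scope only
    my $r = \@v;                      # refers to this iteration's own @v
    $seen{$k} = $r;
    print "$k -> @$r\n";
}
```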


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.
From: Xho Jingleheimerschmidt on
James wrote:
>
> Thanks for your reply. Where can I find a reference to your statement,
>
> "because the tied hash only accepts strings, not array references."

I don't have a reference, just empirical evidence. (Use Data::Dumper
to dump your hash, and it shows stringified used-to-be-references.)

Also, DB_File is a Perl wrapper around a C library. I would not expect
a C library to support Perl nested structures, and if the Perl wrapper
took great pains to emulate such support, it probably would have been
mentioned.
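
Here is a rough way to see the same effect without DB_File at all.
StringOnlyHash is a made-up class, not part of any module; it just
stringifies whatever is stored, which is my assumption about what a
string-only backend effectively does to a reference.

```perl
package StringOnlyHash;    # hypothetical stand-in for a DBM-backed tie
use strict;
use warnings;
require Tie::Hash;         # provides Tie::StdHash
our @ISA = ('Tie::StdHash');

# Keep only the string form of the value, as a string-only store would.
sub STORE {
    my ($self, $key, $value) = @_;
    $self->{$key} = "$value";
}

package main;
use Data::Dumper;

tie my %h, 'StringOnlyHash';
$h{aa} = [1, 2, 3];

# Dumper shows a quoted string of the form ARRAY(0x...), not a nested array.
print Dumper($h{aa});
print ref($h{aa}) ? "still a reference\n" : "plain string\n";
```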

> Anyway, I've re-written the run.pl script and run twice, see below.
> The second time, it looks as though the reference to an array seems
> working, but I may be wrong.

What you need to do is prevent the hash from getting created during one
execution. An easy way to do that is to wrap
if (@ARGV<=1) { ...}
around the part that populates (writes) the hash. Then you can suppress
the population of the hash by supplying an extra argument.

Or just break the populating and the reading into different scripts.


What you will find is that the nested parts of the hash are being
written into Perl's memory, not onto disk.

> $ cat run.pl
> use strict vars;

The issue you are having is not covered by strict vars, but rather is
covered by strict refs.
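
A small sketch of what strict refs would have told you. The variable
names are made up for the illustration; $str holds the kind of string
that ends up in the hash once a reference has been stringified.

```perl
use strict;
use warnings;

my @v   = (1, 2, 3);
my $str = "" . \@v;    # text like ARRAY(0x...), no longer a reference

# Under strict refs, using that string as an array reference is fatal.
my $ok  = eval { my @copy = @{$str}; 1 };
my $err = $ok ? '' : $@;

print $ok ? "dereference succeeded\n" : "strict refs refused: $err";
```

With only "strict vars" in effect, the same dereference quietly becomes
a symbolic reference to a package variable named after the string.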


> $ rm testdb
> $ ./run.pl testdb
> === write testdb ===
> (aa -> 0 -> ) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> ) (cc -> 1 -> 99)
> (zz -> 0 -> ) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)
> === read testdb ===
> (aa -> 0 -> ) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> ) (cc -> 1 -> 99)
> (zz -> 0 -> ) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)

At this point, the strings that look like references, but aren't, have
been written to disk under DB_File.

>
> $ ./run.pl testdb
> === write testdb ===
> (aa -> 0 -> 1) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> 678) (cc -> 1 -> 99)
> (zz -> 0 -> foo) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)

Here, the array is not autovivified, because the hash (being tied to a
previously populated disk file) already has entries, strings that look
like references, but aren't. So now the first value of each set is
picked up and stuffed into the funnily named variables using symbolic
references, rather than being lost as before.

> === read testdb ===
> (aa -> 0 -> 1) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> 678) (cc -> 1 -> 99)
> (zz -> 0 -> foo) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)

Now you are printing out the things you just stuffed into memory.

If you change the code so it just ties the hash and skips the "writing"
part and goes right to read, you will find you get no output.

Xho
From: Mumia W. on
On 06/22/2010 02:43 PM, James wrote:
> On Jun 21, 10:45 pm, Xho Jingleheimerschmidt <xhos...(a)gmail.com>
> wrote:
>> James wrote:
>>> I am trying to write to a database a hash of array (as seen by
>>> __DATA__ in the code).
>>> But somehow the first element of the array is missing. Any idea why?
>> I don't believe DB_File supports nested data structures.
>>
>>> The second time, it is working correctly.
>>> $ cat run.pl
>>> use DB_File;
>>> use vars qw($db $x %h $k $v $i $key $val);
>> Ugg. Scope variables to the smallest scope you can.
>>
>> You should use strict.
>>
>> You seem to be accidentally using symbolic references.
>>
>> The inner data never got stored in DB_File in the first place, it is
>> only stored in Perl's memory.
>>
>>> for $i (0..$#v) {
>>> $h{$k}->[$i] = $r->[$i];
>>> }
>> The first time through, an array is auto-vivified, and it contains
>> $r->[0]. When a reference to this array is stuffed into $h{$k}, it gets
>> stringified to something like 'ARRAY(0x825c3dc)' because the tied hash
>> only accepts strings, not array references. At that point, the array
>> and the string become disconnected from each other, and the value of
>> that auto-vivified array, the copy of $r->[0], is lost.
>>
>> The second and subsequent times, you are using a symbolic reference to
>> a variable with the peculiar name 'ARRAY(0x825c3dc)', into which you
>> stuff the remaining values.
>>
>>> sub read_db {
>>> print "=== read $db ===\n";
>>> for ( $status = $x->seq($key, $val, R_FIRST); $status == 0; $status =
>>> $x->seq($key, $val, R_NEXT) )
>>> {
>>> print "$key -> @{$val}\n";
>> At this point, you are pulling the values out of the peculiarly named
>> variable using symbolic references.
>>
>> If you separate your program so the perl instance that reads the DB is
>> not the same one that created it, you will find the values never got
>> stored to the DB in the first place.
>>
>> Xho
>
> Thanks for your reply. Where can I find a reference to your statement,
>
> "because the tied hash only accepts strings, not array references."
>
> Anyway, I've re-written the run.pl script and run twice, see below.
> The second time, it looks as though the reference to an array seems
> working, but I may be wrong.
>
>
> $ cat run.pl
> use strict vars;

As Xho said, if you were to use the other features of strict.pm, you
would find out that you are unintentionally using strings as references.

> use DB_File;
> my ($db) = @ARGV;
> my %h = ();
> my $x = tie %h, "DB_File", $db, O_RDWR|O_CREAT, 0640, $DB_HASH;
>
> print "=== write $db ===\n";
> my $k;
> my @v;
> my $r;
> for (<DATA>)
> {
> @v = ();
> ($k, @v) = split;
> $r = \@v;
> for my $i (0..$#v)
> {
> $h{$k}->[$i] = $r->[$i];
> print "($k -> $i -> ", $h{$k}->[$i], ") ";
> }
> print "\n";
> }
>
> print "=== read $db ===\n";
> my $st;
> my $key;
> my $val;
> my @val;
> for ( $st = $x->seq($key, $val, R_FIRST); $st == 0;
>       $st = $x->seq($key, $val, R_NEXT) )
> {
> @val = @{$val};
> for my $i (0..$#val) {
> print "($key -> $i -> $val[$i]) ";
> }
> print "\n";
> }
>
> untie %h;
> undef $x;
>
> __DATA__
> aa 1 2 3
> cc 678 99
> zz foo fee fuu fun
>
>
> $ rm testdb
> $ ./run.pl testdb
> === write testdb ===
> (aa -> 0 -> ) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> ) (cc -> 1 -> 99)
> (zz -> 0 -> ) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)
> === read testdb ===
> (aa -> 0 -> ) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> ) (cc -> 1 -> 99)
> (zz -> 0 -> ) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)
>
> $ ./run.pl testdb
> === write testdb ===
> (aa -> 0 -> 1) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> 678) (cc -> 1 -> 99)
> (zz -> 0 -> foo) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)
> === read testdb ===
> (aa -> 0 -> 1) (aa -> 1 -> 2) (aa -> 2 -> 3)
> (cc -> 0 -> 678) (cc -> 1 -> 99)
> (zz -> 0 -> foo) (zz -> 1 -> fee) (zz -> 2 -> fuu) (zz -> 3 -> fun)
>
>
>
> JL

This is from "man perltie":
> You cannot easily tie a multilevel data structure (such as a hash of
> hashes) to a dbm file. The first problem is that all but GDBM and
> Berkeley DB have size limitations, but beyond that, you also have
> problems with how references are to be represented on disk.


While arrays cannot be written directly to DB_File databases, you can
easily create a db filter to convert an array reference to a string.

See this example:

#!/usr/bin/perl
use strict;
use warnings;
use DB_File;

my $db = '/tmp/file2.db';

write_dbf($db);
read_dbf($db);

sub write_dbf {
    my $filename = $_[0];
    tie my %h, 'DB_File', $filename, O_RDWR | O_CREAT, 0644
        or die "Tie failed: $!";
    sethandlers(\%h);

    while (<DATA>) {
        my ($k, @v) = split;
        next unless @v;    # skip blank lines and lines with no values
        $h{$k} = [@v];
    }

    untie %h;
}

sub read_dbf {
    my $filename = $_[0];
    tie my %h, 'DB_File', $filename, O_RDWR | O_CREAT, 0644
        or die "Tie failed: $!";
    sethandlers(\%h);

    while (my ($k, $v) = each %h) {
        print "$k => @{$v}\n";
    }

    untie %h;
}

# Install the value filters on the object behind the tied hash.
sub sethandlers {
    my $obj = tied %{$_[0]};
    return unless defined $obj;
    $obj->filter_store_value(\&filter_store_value);
    $obj->filter_fetch_value(\&filter_fetch_value);
}

# Called before a value is written: array reference in $_ becomes a string.
sub filter_store_value {
    $_ = join(',', @$_);
}

# Called after a value is read: string in $_ becomes an array reference.
sub filter_fetch_value {
    $_ = [ split(',', $_) ];
}

__DATA__
aa 1 2 3
cc 678 99
zz foo fee fuu fun

__END__

Try the program both with and without the calls to sethandlers.
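
If you want to see what the two filters do without touching a database,
here is a rough sketch. apply_filter is a made-up helper, not part of
DB_File; it only mimics the way a filter sees its value in $_. The last
part shows a limit of the comma-joined format: a value that itself
contains a comma does not survive the round trip.

```perl
use strict;
use warnings;

# Same filters as in the program above.
sub filter_store_value { $_ = join(',', @$_) }
sub filter_fetch_value { $_ = [ split(',', $_) ] }

# Hypothetical helper: run a filter on a private copy of the value in $_.
sub apply_filter {
    my ($filter, $value) = @_;
    local $_ = $value;
    $filter->();
    return $_;
}

# Round trip for ordinary values.
my $on_disk = apply_filter(\&filter_store_value, [qw(foo fee fuu fun)]);
my $back    = apply_filter(\&filter_fetch_value, $on_disk);
print "stored as: $on_disk\n";
print "read back: @$back\n";

# A value containing a comma is split into extra elements on the way back.
my $lossy = apply_filter(\&filter_fetch_value,
                         apply_filter(\&filter_store_value, ['a,b', 'c']));
print "elements after round trip: ", scalar(@$lossy), "\n";
```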