From: Jürgen Exner on
ccc31807 <cartercc(a)gmail.com> wrote:
>I get some of the data in CSV format. One of my sources switched from
>an Access database to an Excel file. Turns out that Excel strips out
>the leading zeros if it thinks that the datum is an integer.

Which I would argue is the correct behaviour for a numerical data field.
If you don't want a canonical numerical form, then declare the data
field to be text. Problem solved.

jue
From: Uri Guttman on
>>>>> "TM" == Tad McClellan <tadmc(a)seesig.invalid> writes:

>> my ($order, $first, $last, @years) = split /\|/;
>> __DATA__
>> 1|George|Washington|1788 1792
>> 2|John|Adams|1796
>> 3|Thomas|Jefferson|1800 1804
>> 4|James|Madison|1808 1812
>> 32|Franklin|Roosevelt|1932 1936 1940 1944


TM> @years always contains exactly one element, it is a non-arrayish array.

TM> $years would work as well, and would avoid looking like it wouldn't
TM> work...

gack, i didn't see that! no wonder it 'worked'. i was so caught up in
the wrong use of an array there i didn't notice it was only getting one
value which had the whole string with numbers. he never split that field
into a list of numbers. d'oh!!
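
For illustration, a minimal sketch of the second split that the quoted code leaves out. It assumes the record layout shown in the __DATA__ lines; the sub name parse_president is hypothetical and not from the thread. The years are stored as an array reference so the hash value stays a single scalar.

```perl
use strict;
use warnings;

# Parse one pipe-delimited record. The last field holds
# space-separated years, so it needs a second split.
# (parse_president is a hypothetical helper name.)
sub parse_president {
    my ($line) = @_;
    chomp $line;
    my ($order, $first, $last, $years) = split /\|/, $line;
    return {
        order => $order,
        first => $first,
        last  => $last,
        years => [ split ' ', $years ],   # array ref, one year per element
    };
}

my $rec = parse_president("32|Franklin|Roosevelt|1932 1936 1940 1944\n");
print "$rec->{last}: ", scalar @{ $rec->{years} }, " years listed\n";
```

With an array reference in place, @{ $rec->{years} } gives the individual years back as a real list.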

uri

--
Uri Guttman ------ uri(a)stemsystems.com -------- http://www.sysarch.com --
----- Perl Code Review , Architecture, Development, Training, Support ------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------
From: bugbear on
Uri Guttman wrote:
>>>>>> "TM" == Tad McClellan <tadmc(a)seesig.invalid> writes:
>
> >> my ($order, $first, $last, @years) = split /\|/;
> >> __DATA__
> >> 1|George|Washington|1788 1792
> >> 2|John|Adams|1796
> >> 3|Thomas|Jefferson|1800 1804
> >> 4|James|Madison|1808 1812
> >> 32|Franklin|Roosevelt|1932 1936 1940 1944
>
>
> TM> @years always contains exactly one element, it is a non-arrayish array.
>
> TM> $years would work as well, and would avoid looking like it wouldn't
> TM> work...
>
> gack, i didn't see that! no wonder it 'worked'. i was so caught up in
> the wrong use of an array there i didn't notice it was only getting one
> value which had the whole string with numbers. he never split that field
> into a list of numbers. d'oh!!

Chuckle. *MY* experience tells me that bugs are never where you're looking ;-)

BugBear
From: ccc31807 on
On Jun 8, 9:08 pm, Jürgen Exner <jurge...(a)hotmail.com> wrote:
> ccc31807 <carte...(a)gmail.com> wrote:
> >I get some of the data in CSV format. One of my sources switched from
> >an Access database to an Excel file. Turns out that Excel strips out
> >the leading zeros if it thinks that the datum is an integer.
>
> Which I would argue is the correct behaviour for a numerical data field.
> If you don't want a canonical numerical form, then declare the data
> field to be text. Problem solved.

I get these kinds of files as user input. My supposition is that prior
to this experience, the user was using Access, and configured the ID
field as text (even though it consists entirely of digits), so that
when exported as CSV it kept all seven digits, which it would have
done as a text field, a string.

Users don't normally bother to set the data type of Excel columns
unless they are currency, dates, or specific numeric fields, so Excel
treats a column with numeric characters as numeric, which is entirely
reasonable. When you save the Excel file as CSV, it only saves the
significant digits, not leading zeros. Again, this is entirely
reasonable.
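
As a rough sketch of the effect, and of one way to undo it on the Perl side: if every ID is known to be a fixed width (seven digits, per the description above), the stripped zeros can be restored with sprintf. The sub name pad_id and its default width are assumptions for illustration only.

```perl
use strict;
use warnings;

# Restore leading zeros on an ID that arrived as a bare integer.
# The default width of 7 is taken from the post; pad_id itself
# is a hypothetical helper name.
sub pad_id {
    my ($id, $width) = @_;
    $width = 7 unless defined $width;
    return sprintf '%0*d', $width, $id;
}

print pad_id(12345), "\n";   # prints 0012345
```

This is only safe when the fixed width is really guaranteed; if IDs can legitimately vary in length, padding would invent digits that were never there.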

My problem was that I wasn't aware of the switch (might have been told
but wasn't really aware of it) and ASSUMED that the numeric IDs were
all present including the leading zeros. When I figured out that the
errors were associated with the records whose IDs began with
leading zeros, I ASSUMED that it was a software problem, a bug I had
introduced, a programming error.

As bugbear notes, it was indeed a programming error, but related to
validation of data, not conversion of data types. When I converted the
numeric fields to strings, I got the same error, which ultimately led
me to examine the data file.
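
A minimal sketch of the kind of input validation meant here, assuming IDs must be exactly seven digits. The sub name invalid_ids and the sample values are made up for illustration.

```perl
use strict;
use warnings;

# Return the IDs that are not exactly $width digits long.
# (invalid_ids is a hypothetical helper name.)
sub invalid_ids {
    my ($width, @ids) = @_;
    return grep { !/\A\d{$width}\z/ } @ids;
}

my @bad = invalid_ids(7, qw(0012345 12345 1234567));
print "suspect ID: $_\n" for @bad;   # prints "suspect ID: 12345"
```

Running a check like this on the incoming file, before any other processing, would flag the shortened IDs right away.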

As to use of the @courses variable, I'll change that to $courses. I've
already explained why that happened, and I honestly don't feel too bad
about that, as that's the kind of error we all make when we write in
different languages at the same time.

CC
From: Peter J. Holzer on
On 2010-06-08 21:12, ccc31807 <cartercc(a)gmail.com> wrote:
> #! perl
> # array.plx
> use strict;
> use warnings;
> my %presidents;
> while (<DATA>)
> {
> chomp;
> my ($order, $first, $last, @years) = split /\|/;
> $presidents{$order} = {
> first => $first,
> last => $last,
> years => @years,
> };
> }
>
> foreach my $k (sort keys %presidents)
> {
> print "$k => $presidents{$k}\n";
> foreach my $k2 (sort keys %{$presidents{$k}})
> {
> print " $k2 => $presidents{$k}{$k2}\n";
> }
> }
> exit(0);

This script never pads $order to two digits.

> __DATA__
> 1|George|Washington|1788 1792
^ here $order has only one digit.
> 2|John|Adams|1796
> 3|Thomas|Jefferson|1800 1804
> 4|James|Madison|1808 1812
> 32|Franklin|Roosevelt|1932 1936 1940 1944
>
> ----------OUTPUT----------------
> D:\PerlLearn>perl array.plx
> 01 => HASH(0x248e5c)
^^ Thus I do not believe that this output is from the script above.
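
For comparison, a sketch of what the script would need in order to print "01": an explicit sprintf on $order before it is used as a hash key. The sub name order_key is hypothetical; nothing like it appears in the quoted code. The padding also matters for "sort keys", which compares strings by default.

```perl
use strict;
use warnings;

# Zero-pad an order number to two digits so that the default
# string sort agrees with numeric order.
# (order_key is a hypothetical helper name.)
sub order_key {
    my ($order) = @_;
    return sprintf '%02d', $order;
}

my @orders = qw(1 2 3 4 32);
my @raw    = sort @orders;                  # 1 2 3 32 4
my @keys   = map { order_key($_) } @orders;
my @padded = sort @keys;                    # 01 02 03 04 32
print "@raw\n@padded\n";
```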

hp