From: "Choate, Paul@DDS" on
Jue - try this -


PROC IMPORT OUT= WORK.A
DATAFILE= "c:\data.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
GUESSINGROWS=3276;
RUN;


hth

Paul Choate
DDS Data Extraction
(916) 654-2160

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Jue
Wang
Sent: Friday, September 22, 2006 11:01 AM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Variable length when using Proc Import

I am using the following program to read in data. The data has a
variable "name" whose length varies from 3 to 18, unfortunately, since
the first observation's name is 3-character, the read-in SAS dataset
truncates all other names to have length 3. I understand that the
easiest resolution is to put the longest name in the first row of the
original data set, but is there any other way to correctly read in data
without changing the original data set?

Thank You

PROC IMPORT OUT= WORK.A
DATAFILE= "C:\data.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
From: Jue Wang on
-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU]On Behalf Of
Choate, Paul(a)DDS
Sent: Friday, September 22, 2006 3:26 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Re: [SAS-L] Variable length when using Proc Import


Jue - try this -


PROC IMPORT OUT= WORK.A
DATAFILE= "c:\data.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
GUESSINGROWS=3276;
RUN;


hth

Paul Choate
DDS Data Extraction
(916) 654-2160

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Jue
Wang
Sent: Friday, September 22, 2006 11:01 AM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Variable length when using Proc Import

I am using the following program to read in data. The data has a
variable "name" whose length varies from 3 to 18, unfortunately, since
the first observation's name is 3-character, the read-in SAS dataset
truncates all other names to have length 3. I understand that the
easiest resolution is to put the longest name in the first row of the
original data set, but is there any other way to correctly read in data
without changing the original data set?

Thank You

PROC IMPORT OUT= WORK.A
DATAFILE= "C:\data.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
From: "Xiao, Steven" on
That might works for 9.1 ver, but it seems to be that 8.x under Window
OS doesn't have that option.

Any other way to do it in 8.x ver?

Steven


-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of
Choate, Paul(a)DDS
Sent: Friday, September 22, 2006 3:26 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Re: Variable length when using Proc Import

Jue - try this -


PROC IMPORT OUT= WORK.A
DATAFILE= "c:\data.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
GUESSINGROWS=3276;
RUN;


hth

Paul Choate
DDS Data Extraction
(916) 654-2160

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Jue
Wang
Sent: Friday, September 22, 2006 11:01 AM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Variable length when using Proc Import

I am using the following program to read in data. The data has a
variable "name" whose length varies from 3 to 18, unfortunately, since
the first observation's name is 3-character, the read-in SAS dataset
truncates all other names to have length 3. I understand that the
easiest resolution is to put the longest name in the first row of the
original data set, but is there any other way to correctly read in data
without changing the original data set?

Thank You

PROC IMPORT OUT= WORK.A
DATAFILE= "C:\data.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
RUN;
From: Jue Wang on
Dear Paul,

Thank you for the Help. But somehow I got error message in the log, and no data was read in:


519 PROC IMPORT OUT= WORK.A
520 DATAFILE= "C:\data.csv"
521 DBMS=CSV REPLACE;
NOTE: The previous statement has been deleted.
522 GETNAMES=YES;
523 DATAROW=2;
NOTE: SCL source line.
524 GUESSINGROWS=3276;
------------
180
ERROR 180-322: Statement is not valid or it is used out of proper order.

525 RUN;


-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU]On Behalf Of
Choate, Paul(a)DDS
Sent: Friday, September 22, 2006 3:26 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Re: [SAS-L] Variable length when using Proc Import


Jue - try this -


PROC IMPORT OUT= WORK.A
DATAFILE= "c:\data.csv"
DBMS=CSV REPLACE;
GETNAMES=YES;
DATAROW=2;
GUESSINGROWS=3276;
RUN;


hth

Paul Choate
DDS Data Extraction
(916) 654-2160
From: "Choate, Paul@DDS" on
Well - it works on v9. I guess you are using SAS 8?

There is a v8 tweak that may be useful to you....

SN-V8+-001075
How to scan more than 20 records to determine variable attributes in EFI

------------------------------------------------------------------------
In Versions 7 and 8 of the SAS System, by default the Import Wizard,
PROC IMPORT and the External File Interface (EFI) scan 20 records to
determine variable attributes when reading delimited text files.
Changing the default setting can only be done for EFI in Version 7,
Release 8 and Release 8.1. Beginning in Release 8.2 changing the
default setting is applicable to the Import Wizard, PROC IMPORT and
EFI.
Follow the steps below to change the default behavior:

1) Type regedit on the command line (white box with a check mark)
2) When the Registry Editor window opens, double click on the
PRODUCTS icon
3) Double click on the BASE icon
4) Double click on the EFI icon
5) In the window on the right the Contents of EFI will be
populated
with EFI options
6) Double click on GuessingRows
7) When the new window opens with the old value of 20, delete it,
enter the new value, and click on OK
8) Close the Registry Editor window
9) Invoke the Import Wizard, PROC IMPORT or EFI to use the new
GuessingRows value

The new value entered for GuessingRows will remain until you change
it.

Beginning in SAS 9.1 there is a new statement, GUESSINGROWS=, that
can
be specified with PROC IMPORT. By specifying the GUESSINGROWS=
statement with PROC IMPORT, you do not have to change the
GuessingRows
value in the SAS Registry. Also beginning in SAS 9.1 you can specify
the Number of Rows to Guess when using the Import Wizard in the SAS
Import: Delimited File Options Window or when using EFI in the
Options
for Import Window.
------------------------------------------------------------------------
----
Product: Base SAS
Component: External File Interface(OS)
Priority: N/A
Note Type: Usage Issue
Date: Fri, 15 Apr 2005


Here's post you might be interested in:


From: Bruce Gilsen
Date: Thurs, Nov 17 2005 8:53 am
Groups: comp.soft-sys.sas

By default, the Import Wizard, PROC IMPORT, and the External File
Interface (EFI) scan the first 20 records to determine variable
attributes
such as field length when reading delimited text files (CSV, TAB, and
DLM
files). If character fields have longer values past the first 20
records,
PROC IMPORT, the Import Wizard, and the EFI truncate them.

Before Version 9.1, you had to update the SAS registry to fix this
problem
(SAS Note 1075,
http://support.sas.com/techsup/unotes/SN/001/001075.html).


Starting in Version 9.1, the PROC IMPORT GUESSINGROWS= option tells SAS
how many records to scan. The Import Wizard and the EFI also have ways
to
do this (see SAS Note 1075).


One problem is that SAS always checks all variables; there's no way to
request that just that some variables are checked, and GUESSINGROWS
slows
down program execution for large files.


Here's a test I ran on Linux 9.1.3 with CSV files. In all cases, all
but
1 variable was numeric (so that no checking beyond the default 20
records
was needed), and a single character variable, COMPANY, was "ibm" or
"aol"
in the first 20 observations and "microsoft" in all other observations.
Without GUESSINGROWS=21 or higher, PROC IMPORT sets the length of
COMPANY
to 3 based on the first 20 records and truncates "microsoft" to "mic".
All tests had 16,000 observations, and I either omitted GUESSINGROWS or
set it to 16,000 to check the whole file.


Number of variables GUESSINGROWS=16,000? SECONDS (cpu/real)


3 no .11/.12
3 yes 5.92/6.00
35 no .31/.33
35 yes 50.63/50.76


So, most of the time in the last test is presumably spent checking
variables that I know don't need to be checked.


My suggestion: an option paired with GUESSINGROWS called GUESSINGCOLUMNS

that would specify which columns to check past the default 20 records.
If
omitted, the current behavior, check all columns, takes place.


Here's the code I used. One of the first 2 DATA steps would be
commented
out, based on which tests I am doing.


/* Create data for tests 1 and 2 (3 variables) */
data one;
retain i company itimes10;
length company $10;
do i=1 to 16000;
itimes10 = i * 10;
if i le 10 then company="ibm";
else if i le 20 then company="aol";
else company="microsoft";
output;
end;
run;


/* Create data for tests 3 and 4 (35 variables) */
data one;
retain i company itimes10 i1-i32;
array iall (*) i1-i32;
drop j;
length company $10;
do i=1 to 16000;
itimes10 = i * 10;
/* in 1st obs, i1=1 i2=2 ... i32=32.
in 2nd obs, i1=2 i2=3 ... i32=33. etc.
*/
do j = 1 to dim(iall);
iall(j) = i+j-1;
end;
if i le 10 then company="ibm";
else if i le 20 then company="aol";
else company="microsoft";
output;
end;
run;


/* create a CSV file for testing */
proc export
data=one
outfile = '/it/support/it/m1bfg00/csv4.csv'
dbms=csv;
run;


/* read the CSV file */
proc import
out=csvin
datafile = '/it/support/it/m1bfg00/csv4.csv'
dbms=csv;
getnames=yes;
guessingrows=16000;
datarow=2;
run;

......................................

Paul Choate
DDS Data Extraction
(916) 654-2160

-----Original Message-----
From: SAS(r) Discussion [mailto:SAS-L(a)LISTSERV.UGA.EDU] On Behalf Of Jue
Wang
Sent: Friday, September 22, 2006 12:42 PM
To: SAS-L(a)LISTSERV.UGA.EDU
Subject: Re: Variable length when using Proc Import

Dear Paul,

Thank you for the Help. But somehow I got error message in the log, and
no data was read in:


519 PROC IMPORT OUT= WORK.A
520 DATAFILE= "C:\data.csv"
521 DBMS=CSV REPLACE;
NOTE: The previous statement has been deleted.
522 GETNAMES=YES;
523 DATAROW=2;
NOTE: SCL source line.
524 GUESSINGROWS=3276;
------------
180
ERROR 180-322: Statement is not valid or it is used out of proper order.

525 RUN;


-----Original Message-----
F
 |  Next  |  Last
Pages: 1 2
Prev: Normalization in SAS
Next: Unzip a file