From: Joe Matise on
1. Use KEEP= as a dataset option on the SET statement instead of a separate
KEEP statement - not sure if it saves time, but it might.
2. Try PROC COPY, again using a dataset option for the keep statement.
Very similar to the PROC DATASETS copy statement, but I don't think the
latter supports dataset options (though it might?).
3. Try creating views instead of datasets; if you're just doing it for some
temporary purpose, this may be better. Doesn't help if you need to
transport this file somewhere else, though.
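
For suggestion 1, a minimal sketch, assuming the dataset and variable names from the original question (large, small, var1-var1000). The KEEP= dataset option on the SET statement limits which variables are read into the step, whereas a KEEP statement only limits what is written out. Whether that helps at 500,000 variables would need to be timed:

```sas
/* KEEP= on the input dataset: only var1-var1000 are brought into the step */
data small;
    set large(keep=var1-var1000);
run;
```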

-Joe

On Wed, Oct 7, 2009 at 2:45 PM, Claus Yeh <phoebe.caulfield42(a)gmail.com> wrote:

> Dear all,
>
> I have a very large SAS dataset - 500,000 variables and 4000
> observations.
>
> I want to create smaller datasets that contain about 1000 to 10,000
> variables of the original 500,000 variable dataset.
>
> I used a DATA step to do this, but it was very slow (I need to
> create multiple smaller datasets).
>
> i.e.
>     data small;
>         set large;
>         keep var1-var1000;
>     run;
>
> Is there a way to do it in PROC DATASETS that can output the smaller
> dataset much quicker? If there are other efficient ways, please let
> me know too.
>
> thank you so much,
> claus
>
From: Michael Raithel on
Dear SAS-L-ers,

Claus, yeh, I can think of a way of doing this that will run so fast that you will hear a sonic boom as the DATA Step reaches Mach I! And it won't cost you one bit more of storage, to boot!

How about using a DATA Step view? You could code:

    data smallarge / view=smallarge;
        set large;
        keep var1-var1000;
    run;

...which would create a DATA Step view file in the blink of an eye. Thereafter, you could use that view to surface only Var1 - Var1000 in future SAS PROCs or DATA Steps.
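
As a sketch of how the view might be used afterwards, assuming the view name smallarge from above. PROC MEANS and the var1-var10 list are only illustrative choices, and the variables are assumed to be numeric. A DATA Step view stores the program rather than the data, so large is read each time the view is referenced:

```sas
/* Reading the view runs its DATA step against large at this point */
proc means data=smallarge n mean;
    var var1-var10;   /* hypothetical subset; assumes numeric variables */
run;
```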

Would that work for you, or are you going to wait for some other SAS-L-sharpie's clever-er suggestion?

Claus, best of luck in all of your SAS endeavors!


I hope that this suggestion proves helpful now, and in the future!

Of course, all of these opinions and insights are my own, and do not reflect those of my organization or my associates. All SAS code and/or methodologies specified in this posting are for illustrative purposes only and no warranty is stated or implied as to their accuracy or applicability. People deciding to use information in this posting do so at their own risk.

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Michael A. Raithel
"The man who wrote the book on performance"
E-mail: MichaelRaithel(a)westat.com

Author: Tuning SAS Applications in the MVS Environment

Author: Tuning SAS Applications in the OS/390 and z/OS Environments, Second Edition
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=58172

Author: The Complete Guide to SAS Indexes
http://www.sas.com/apps/pubscat/bookdetails.jsp?catid=1&pc=60409

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
....fire all of your guns at once and explode into space... - Steppenwolf, Born to be Wild
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
From: Claus Yeh on
Hi Michael,

Thank you so much. I will do some test runs on the "view" by running PROC
LOGISTIC on it.

thanks again,
claus
From: Claus Yeh on
Hi Mike,

I tried the "view". The step that creates it finished fast, but the
subsequent procedures took quite a long time.

I was wondering if "indexing" the dataset would help?

thanks,
claus