From: hs AT dc-sug DOT org on
See http://www.sascommunity.org/wiki/Rolling_Calculations

On Wed, 24 Dec 2008 13:11:16 -0600, ST <sheepsas(a)GMAIL.COM> wrote:

>Hi there,
>
>I have a dataset which has 10000 observations. For each obs, I want to
>create a regression model (based on variable x and y in the dataset) using
>previous 20 observations and save the estimated slope as a new variable in
>the dataset. Totally, I will have 9980 slopes.
>
>Can someone show me how to do this efficiently?
>
>Many thanks,
>
>Hu
From: Muthia Kachirayan on
On Wed, Dec 24, 2008 at 2:11 PM, ST <sheepsas(a)gmail.com> wrote:

> Hi there,
>
> I have a dataset which has 10000 observations. For each obs, I want to
> create a regression model (based on variable x and y in the dataset) using
> previous 20 observations and save the estimated slope as a new variable in
> the dataset. Totally, I will have 9980 slopes.
>
> Can someone show me how to do this efficiently?
>
> Many thanks,
>
> Hu
>


A most efficient way to do this MOVING REGRESSION lies in not using the Proc
REG. Hold x and y for a size of WSIZE at a time, sliding up by removing one
at the left and adding one at the right. The slope is detemined by

(Sum(XY) - Sum(X) * SUM(Y) / WSIZE) / (SUM(X * X) - SUM(X) ** 2 / WSIZE)

If y-intercept is also needed, the following statement can be included.

y_intercept = sumy / &wsize - slope * sumx / &wsize;

The following code is an adaption of Howard who posted in SAS-L to find
moving Standard Deviations.


%let wsize=7;

data c(keep = x y slope);
array xx(&wsize) (&wsize * 0);
array yy(&wsize) (&wsize * 0);
retain xx : yy :;
x2remove = xx(mod(_n_ - 1, &wsize) + 1);
y2remove = yy(mod(_n_ - 1, &wsize) + 1);
set sashelp.class(keep = weight height rename = (weight = y height = x)) ;
xx(mod(_n_ - 1, &wsize) + 1) = x;
yy(mod(_n_ - 1, &wsize) + 1) = y;
sumx + x - x2remove ;
sumxsq + x**2 - x2remove**2;
sumy + y - y2remove;
sumxy + x * y - x2remove * y2remove;
if _n_ >= &wsize then
slope = (sumxy - sumx * sumy / &wsize)/(sumxsq - sumx ** 2 / &wsize);
run;


Happy Holidays to All.

Muthia Kachirayan
From: Bill West on
I see you've received several suggested methods and I may be showing my
ignorance but I'm not sure what you mean by “ save the estimated slope as
a new variable.
if you mean a predicted value
I believe the regout dataset below will include a predicted value for
each obs. Is that what you want?


data expanded.;
set file;
if 1< =_N_ <=20 then span=1;
if 21< =_N_ <=30 then span=2;
if 31< =_N_ <=40 then span=3;
etc up to 500;

proc reg data=expanded noprint outest=regout ;
by span;
model x =y;
From: Bill West on
On Wed, 24 Dec 2008 18:42:30 -0500, Muthia Kachirayan
<muthia.kachirayan(a)GMAIL.COM> wrote:

>On Wed, Dec 24, 2008 at 2:11 PM, ST <sheepsas(a)gmail.com> wrote:
>
>> Hi there,
>>
>> I have a dataset which has 10000 observations. For each obs, I want to
>> create a regression model (based on variable x and y in the dataset)
using
>> previous 20 observations and save the estimated slope as a new variable
in
>> the dataset. Totally, I will have 9980 slopes.
>>
>> Can someone show me how to do this efficiently?
>>
>> Many thanks,
>>
>> Hu
>>
>
Hi Hu ,
I see you've received several suggested methods and I may be showing my
ignorance but I'm not sure what you mean by “ save the estimated slope as
a new variable.
if you mean a predicted value
I believe the regout dataset below will include a predicted value for
each obs. Is that what you want?


data expanded.;
set file;
if 1< =_N_ <=20 then span=1;
if 21< =_N_ <=30 then span=2;
if 31< =_N_ <=40 then span=3;
etc up to 500;

proc reg data=expanded noprint outest=regout ;
by span;
model x =y;
From: Muthia Kachirayan on
On Thu, Dec 25, 2008 at 11:08 AM, Bill West <wm_a_west(a)yahoo.com> wrote:

> On Wed, 24 Dec 2008 18:42:30 -0500, Muthia Kachirayan
> <muthia.kachirayan(a)GMAIL.COM> wrote:
>
> >On Wed, Dec 24, 2008 at 2:11 PM, ST <sheepsas(a)gmail.com> wrote:
> >
> >> Hi there,
> >>
> >> I have a dataset which has 10000 observations. For each obs, I want to
> >> create a regression model (based on variable x and y in the dataset)
> using
> >> previous 20 observations and save the estimated slope as a new variabl=
e
> in
> >> the dataset. Totally, I will have 9980 slopes.
> >>
> >> Can someone show me how to do this efficiently?
> >>
> >> Many thanks,
> >>
> >> Hu
> >>
> >
> Hi Hu ,
> I see you've received several suggested methods and I may be showing my
> ignorance but I'm not sure what you mean by =E2=80=9C save the estimated =
slope as
> a new variable.
> if you mean a predicted value
> I believe the regout dataset below will include a predicted value for
> each obs. Is that what you want?
>
>
> data expanded.;
> set file;
> if 1< =3D_N_ <=3D20 then span=3D1;
> if 21< =3D_N_ <=3D30 then span=3D2;
> if 31< =3D_N_ <=3D40 then span=3D3;
> etc up to 500;
>
> proc reg data=3Dexpanded noprint outest=3Dregout ;
> by span;
> model x =3Dy;
>


The OP wanted the slope for each of the set of obsevations (_N_ ) between

1 to 20,
2 to 21,
3 to 22,
4 to 23,

10,081 to 10,000.

This is similar to finding Moving Averages, having overlapping observations=
,
leaving one at the left and adding one to the right .

Thus there will be 9980 slopes(10,000 - 20).

The use of

if 1< =3D_N_ <=3D20 then span=3D1;
if 21< =3D_N_ <=3D30 then span=3D2;
if 31< =3D_N_ <=3D40 then span=3D3;
etc up to 500;

will result in slopes for segmented observations and essentially not a
moving average slope.

My code does the moving average slope by computing 4 sums (Sum(X), Sum(Y),
SUM(X * X) and Sum(X * Y) on a sliding scale.

Hope this clarifies.

Kind regards,

Muthia Kachirayan
 | 
Pages: 1
Prev: Lab Tables
Next: Need help with %qscan and %scan