From: Anthony on
Hey folks,

I'm dissecting a script and found a part of my code that is ridiculously slow. It's a loop to "create" a series a bunch of times and so it involves a recursion.

Essentially I have 48 original time series (of different sizes, that's why I'm using cells)..and with each one I want to create B "fake" or bootstrapped series.

Here is a piece of the script:

B = 100;
for i = 1:ncountries % ncountries = 48
    phis = resultsE{i}.beta; %phis is a 6x1 vector of regression coefficients, calced earlier

    for b = 1:B
        yact = actdiffE{i}; %actual data
        for j = 1:pE(i)
            yboot{b,i}(j) = yact(end+1-j); %assign the first six values to be the same as the original data series, so I can start the recursion.
        end

        for j = p+1:length(actdiffE{i})
            index = randi([1,numel(w{i})]); %noise that is added
            wstar = w{i}(index);
            yboot{b,i}(j) = phis(1) + sum(flipud(phis(2:end)).*yboot{b,i}(j-p:j-1)') + wstar;
        end

        yboot{b,i} = yboot{b,i}';
        yboot{b,i} = flipud(yboot{b,i}); %From past to present
    end
end

This takes about a minute with 100 replicates, but I'm looking to generate 1000 or 10000. So I need to be more efficient.

Thanks a lot
Anthony
From: Steven_Lord on


"Anthony " <antfarinaccio(a)gmail.comremove.spam> wrote in message
news:i43rop$ogh$1(a)fred.mathworks.com...
> Hey folks,
>
> I'm dissecting a script and found a part of my code that is ridiculously
> slow. It's a loop to "create" a series a bunch of times and so it involves
> a recursion.
>
> Essentially I have 48 original time series (of different sizes, that's why
> I'm using cells)..and with each one I want to create B "fake" or
> bootstrapped series.

Do you have Statistics Toolbox? If so, have you looked at the BOOTSTRP
function?

> Here a piece of the script:

Since you've only given the group part of your code, and you haven't
provided any data, I'm not sure how much specific advice the group can give.

> B = 100;
> for i = 1:ncountries % ncountries = 48
> phis = resultsE{i}.beta; %phis is a 6x1 vector of regression coefficients, calced earlier
> for b = 1:B
> yact = actdiffE{i}; %actual data
> for j = 1:pE(i)
> yboot{b,i}(j) = yact(end+1-j); %assign the first six values to be the same as the original data series, so I can start the recursion.

Have you preallocated yboot?

> end
> for j = p+1:length(actdiffE{i})
> index = randi([1,numel(w{i})]); %noise that is added
> wstar = w{i}(index);
> yboot{b,i}(j) = phis(1) + sum(flipud(phis(2:end)).*yboot{b,i}(j-p:j-1)') + wstar;
> end
>
> yboot{b,i} = yboot{b,i}';
> yboot{b,i} = flipud(yboot{b,i}); %From past to present
> end
> end
>
> This takes about a minute with 100 replicates, but I'm looking to generate
> 1000 or 10000. So I need to be more efficient.

Use the Profiler to identify the bottleneck or bottlenecks. Use the PROFILE
function to do so.
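
A minimal sketch of that workflow, for illustration only. The short loop below is a made-up stand-in for the real bootstrap script; in practice you would run your own code between the two profile calls.

```matlab
profile on                      % start collecting timing data

% Stand-in workload (hypothetical); replace with the code you want to time.
x = zeros(1, 1e5);
for k = 2:numel(x)
    x(k) = 0.5*x(k-1) + 1;
end

profile off                     % stop collecting
info = profile('info');         % timing results as a struct
profile viewer                  % open the report and look for the slowest lines
```

The report lists time per line, so the line that dominates the run is usually easy to spot.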

Post some (small) example data to which the bottleneck code can be applied.

Check to make sure Code Analyzer (nee M-Lint) doesn't flag any part of your
code as room for potential improvement.

--
Steve Lord
slord(a)mathworks.com
comp.soft-sys.matlab (CSSM) FAQ: http://matlabwiki.mathworks.com/MATLAB_FAQ
To contact Technical Support use the Contact Us link on
http://www.mathworks.com

From: Anthony on
Thanks Steven. I preallocated and shaved off 1/10 of the time. And using the profiler I noticed that there is one line that is causing the slowdown:

> > index = randi([1,numel(w{i})]); %noise that is added
> > wstar = w{i}(index);

What I'm trying to do with these two lines is randomly select a value from the ith cell in w. w is a 48x1 cell, and each element is a column vector. I want to randomly select one value out of the column for the ith element...

For example:
Let the first element of w, w{1}, be a 200x1 column vector. I want some random element of this column (the actual value, not the location), and I need to do this every time I enter the j loop.
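
To make that concrete, here is a rough sketch of the draw, using a made-up 200x1 column in place of the real w{1}. The names wi, n, nSteps, idx and wstarAll are only for illustration. It also shows one possible variation: drawing all the indices for a series in a single randi call, so the lookup does not have to happen inside the j loop.

```matlab
% Made-up stand-in for one element of w: a 200x1 column of residuals.
wi = randn(200, 1);
n  = numel(wi);

% One draw: randi(n) returns a random integer between 1 and n.
wstar = wi(randi(n));

% Many draws at once: an nSteps-by-1 vector of indices, then one lookup.
nSteps   = 500;                 % hypothetical number of recursion steps
idx      = randi(n, nSteps, 1);
wstarAll = wi(idx);             % wstarAll(j) is the noise value for step j
```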

Also, thanks for the bootstrp function, but I'm following a specific methodology that I must stick to.

I hope this is clear.
Thanks
Anthony
From: Anthony on
Hey,

The above post has a mistake...

I had added a line to my code, and forgot to rerun the profiler...so the line that is taking the most time is really the recursion and not the random generator.

yboot{b,i}(j) = phis(1) + sum(flipud(phis(2:end)).*yboot{b,i}(j-p:j-1)') + wstar;

This recursion process takes half the time of the script, so I'm trying to make it faster ...

Any ideas?
From: Walter Roberson on
Anthony wrote:

> Here a piece of the script:

> B = 100;
> for i = 1:ncountries % ncountries = 48
> phis = resultsE{i}.beta; %phis is a 6x1 vector of regression coefficients, calced earlier
> for b = 1:B
> yact = actdiffE{i}; %actual data
> for j = 1:pE(i)

Note that in the above two lines, there are values that depend upon the value
of i but not upon the index of the inner for loops, and the variables being
read from are not updated in the loops. That implies that you can pull those
referenced variables out to above the loops and calculate them at the level of
the loop over "i".

> yboot{b,i}(j) = yact(end+1-j); %assign the first six values to be the same as
> %the original data series, so I can start the recursion.
> end

You do not appear to pre-allocate yboot{b,i} even though you know ahead of
time how long it is going to be.
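
A small sketch of what that preallocation could look like. The sizes and the contents of actdiffE here are made up so the snippet runs on its own; only the pattern matters.

```matlab
% Made-up sizes and data, for illustration only.
B = 3;
ncountries = 2;
actdiffE = {randn(40,1), randn(55,1)};   % stand-in for the real series

yboot = cell(B, ncountries);             % allocate the cell array once
for i = 1:ncountries
    T = length(actdiffE{i});             % known before the inner loops start
    for b = 1:B
        yboot{b,i} = zeros(T, 1);        % full-length column, filled in later
    end
end
```

Because each yboot{b,i} already has its final length, assigning to element j no longer forces MATLAB to grow the vector on every pass.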

> for j = p+1:length(actdiffE{i})

actdiffE isn't changing over the loops, so the length() can be calculated at
the level of the "i" loop.

> index = randi([1,numel(w{i})]); %noise that is added

w isn't changing in the loops, so the numel(w{i}) can be calculated at the
level of the "i" loop

> wstar = w{i}(index);

w{i} isn't changing in the loops, so w{i} can be de-referenced at the level of
the "i" loop and assigned to a variable, avoiding a cell reference here

> yboot{b,i}(j) = phis(1) + sum(flipud(phis(2:end)).*yboot{b,i}(j-p:j-1)') + wstar;

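You could remove the phis(1) addition from the j loop by adding it to the
noise values in one vectorized step before the loop. (Adding it to the
finished series after the loop would not be equivalent, because each value
feeds back into the later ones.)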

flipud(phis(2:end)) is independent of all of the loops and so could be
precalculated before any of them.

If you were to preallocate yboot{b,i} then you could pre-allocate it as a
column vector and so avoid the transpose here.

> end

> yboot{b,i} = yboot{b,i}';
> yboot{b,i} = flipud(yboot{b,i}); %From past to present
> end
> end
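
Putting those suggestions together, the loop might look something like the sketch below. This is only an illustration under several assumptions: the input data are made up so that it runs on its own, the lag order p is taken to be numel(phis)-1 (the original mixes p and pE(i)), and the noise indices are drawn in one randi call per series, which gives draws from the same distribution but not the same random sequence as the original code. The names c, arRev, wi, n, y and shock are introduced here and are not from the original script.

```matlab
% Made-up inputs so the sketch runs on its own.
B = 4;
ncountries = 2;
actdiffE = {randn(40,1), randn(55,1)};            % stand-in series
w        = {randn(35,1), randn(50,1)};            % stand-in residuals
resultsE = {struct('beta', [0.1; 0.2; -0.1; 0.05; 0.0; 0.1]), ...
            struct('beta', [0.0; 0.3;  0.1; 0.00; 0.1; -0.2])};

yboot = cell(B, ncountries);                      % preallocate the cell array
for i = 1:ncountries
    % Everything that depends only on i is computed once here.
    phis  = resultsE{i}.beta;
    c     = phis(1);                              % intercept
    arRev = flipud(phis(2:end));                  % lag coefficients, flipped once
    p     = numel(arRev);                         % assumption: p = numel(phis)-1
    yact  = actdiffE{i};
    T     = length(yact);
    wi    = w{i};                                 % one cell dereference per i
    n     = numel(wi);

    for b = 1:B
        y = zeros(T, 1);                          % plain preallocated column
        y(1:p) = yact(end:-1:end-p+1);            % same seeds as yact(end+1-j)
        shock = c + wi(randi(n, T, 1));           % intercept plus noise, drawn up front
        for j = p+1:T
            y(j) = arRev' * y(j-p:j-1) + shock(j);    % dot product replaces sum(.*)
        end
        yboot{b,i} = flipud(y);                   % same final flip as the original
    end
end
```

Working on a local numeric vector y and storing it in the cell only once per replicate avoids indexing into yboot{b,i} on every step of the recursion, which is where the profiler reported most of the time going.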