From: Peter Dickerson on
"D Yuniskis" <not.going.to.be(a)seen.com> wrote in message
news:i0ajtd$n5q$1(a)speranza.aioe.org...
[snip]
>
> Ah, OK. So, it just tries to look-ahead "better" based on
> history (?). E.g., my estimate[i] are averages of polygon/chord
> length and endpoint length. Track successive estimate[i]'s
> and, at each step, compute an ESTIMATE[i] from Romberg?
> This would have no "resale value" in the next iteration of
> my algorithm but would (hopefully) be a better representation
> of the *real* "curve length" -- better than my "average of
> chords and endpoints" (?)

That's it.
In fact your method has error O(h^4) which is the same (up to a factor) as
using the distance between endpoints and then applying one round of Romberg.
The difference is your method uses 4 sqrts at each stage and mine uses 3
(one from the previous lower resolution + 2 from the current split).
However, your factor is smaller, so the actual error at each step is
slightly smaller in your case. In the example table below, the raw length
estimate uses the number of square roots shown in the sqrts column for that
row. The Romberg result needs that row's sqrts plus the previous row's. The
double Romberg needs that row's sqrts plus the previous two rows'. So my
method can get to 1
part per billion (ppb) using 56 sqrts (32+16+8), while yours uses 96
(64+32).

Control points
(0.000000,0.000000)
(0.000000,0.500000)
(0.500000,0.500000)
(0.500000,0.000000)

sqrts length Romberg Romberg^2
1 0.500000000
2 0.901387819 1.035183758
4 0.975359556 1.000016801 0.997672337
8 0.993857921 1.000024042 1.000024525
16 0.998465633 1.000001538 1.000000037
32 0.999616481 1.000000096 1.000000000
64 0.999904125 1.000000006 1.000000000
128 0.999976031 1.000000000 1.000000000
256 0.999994008 1.000000000 1.000000000
512 0.999998502 1.000000000 1.000000000
1024 0.999999625 1.000000000 1.000000000

sqrts DY length DY Romberg
4 1.000000000
8 1.002470605 1.002635312
16 1.000128079 0.999971911
32 1.000007988 0.999999982
64 1.000000499 1.000000000
128 1.000000031 1.000000000
256 1.000000002 1.000000000
512 1.000000000 1.000000000
1024 1.000000000 1.000000000
2048 1.000000000 1.000000000
4096 1.000000000 1.000000000
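
For anyone who wants to reproduce the first table, here is a minimal sketch
of the calculation as I understand it (the function names are mine): sum n
uniformly spaced chords for the raw estimate, then combine successive rows
with the standard Richardson/Romberg weights.

#include <math.h>
#include <stdio.h>

typedef struct { double x, y; } pt;

/* Evaluate the cubic Bezier defined by p[0..3] at parameter t. */
static pt bezier_point(const pt p[4], double t)
{
    double u = 1.0 - t;
    double b0 = u * u * u, b1 = 3.0 * u * u * t;
    double b2 = 3.0 * u * t * t, b3 = t * t * t;
    pt r = { b0 * p[0].x + b1 * p[1].x + b2 * p[2].x + b3 * p[3].x,
             b0 * p[0].y + b1 * p[1].y + b2 * p[2].y + b3 * p[3].y };
    return r;
}

/* Sum of n chords at uniform t spacing: n square roots, error O(h^2). */
static double chord_sum(const pt p[4], int n)
{
    double len = 0.0;
    pt prev = p[0];
    for (int i = 1; i <= n; i++) {
        pt cur = bezier_point(p, (double)i / n);
        len += hypot(cur.x - prev.x, cur.y - prev.y);
        prev = cur;
    }
    return len;
}

int main(void)
{
    pt p[4] = { {0.0, 0.0}, {0.0, 0.5}, {0.5, 0.5}, {0.5, 0.0} };
    double prev_raw = 0.0, prev_rom = 0.0;

    for (int n = 1; n <= 1024; n *= 2) {
        double raw  = chord_sum(p, n);
        double rom  = (4.0 * raw - prev_raw) / 3.0;   /* kills O(h^2): Romberg   */
        double rom2 = (16.0 * rom - prev_rom) / 15.0; /* kills O(h^4): Romberg^2 */

        if (n >= 4)      printf("%5d %.9f %.9f %.9f\n", n, raw, rom, rom2);
        else if (n >= 2) printf("%5d %.9f %.9f\n", n, raw, rom);
        else             printf("%5d %.9f\n", n, raw);

        prev_raw = raw;
        prev_rom = rom;
    }
    return 0;
}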

[snip]
>>
>> 1 >= length of curve/distance along the control points >= 0.5
>
> In some pathological cases, that's not true. (e.g., look at
> your original example, I think -- sorry, early in the
> morning to be doing this without pencil and paper...)

Yes, you are right. Not particularly pathological. The point I was making is
that I think there is a lower limit on the ratio of path length to distance
along the control points; otherwise the path length could approach zero with
the control points a finite distance apart.

[snip]

>> I see this as the cost of the calculation not the recursion since the
>> main difference between this method and stepping along the curve is the
>> order that things are done, not how many.
>
> If you walk up the curve (t 0->1), you have to take a fixed
> dt for each pass. Even in regions of low curvature.

It's a cubic! How much change of curvature are you expecting?

> I can spend those calculations elsewhere -- in places where
> the curvature demands more work for a particular degree of
> fitness.

And spend more effort working out which places they are (if any). That
doesn't come for free.

> The cost of the actual recursion is tiny (another stack frame
> and function invocation) compared to all the number crunching
> that it (hopefully) avoids (or, spends in better places).

But that has little to do with recursion. That I integrated along the path
in t order is not the point. I could equally have done it via the curve
splitting (had I known how to do it at the time). The point was to show that
the Romberg method can reduce the amount of work dramatically without any
smarts. In my example above it reduced 1024 sqrts (proxy for work) for 375
ppb error to 56 sqrts for <1ppb. For your method it does somewhat less well
because you are already getting a better convergence but still 512 sqrts for
< 1ppb becomes 96 sqrts.

[snip]

>> There aren't any hard curves. It's a cubic.
>
> Sorry, I was trying to describe the (total) work required
> for particular "types" of curves.
>
> E.g., (making up numbers entirely in my head :> )
>
> (0,0) (0,0) (1000,1000) (1000,1000)
>
> is an *easy* curve. The estimate converges instantly
> (regardless of how you -- rationally -- create that
> estimate).

If this curve turns up a lot, then you are using the wrong technology :-)

> OTOH,
>
> (0,0) (0,400) (10,0) (0,0)
>
> is a "hard" curve. The estimate requires multiple
> iterations to get a given "goodness".

Yes, a bit harder, particularly if you are looking for ppb accuracy.

> The point of my statement (above) was to try to
> come up with some simple/algorithmic criteria
> (short of actually *solving* the curve) that would
> let you determine how hard you have to work on
> a particular curve (i.e., set of control points)
> to get a particular degree of fit.

That has to be a heuristic until you know how you are calculating the
estimates. However, the ratio of the endpoint distance to the sum of control
point distances will give some idea. If it is small, then the curve has turned
back on itself. If it's large (approaching 1), then it's nearly straight. It
can't be nearly straight and highly twisted and still be a cubic...
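
A sketch of that ratio, for concreteness (the names are mine): it is 1 for a
straight control polygon and drops toward 0 as the curve folds back on itself.

#include <math.h>

typedef struct { double x, y; } pt;

static double dist(pt a, pt b)
{
    return hypot(b.x - a.x, b.y - a.y);
}

/* |P3-P0| / (|P1-P0| + |P2-P1| + |P3-P2|), in [0, 1]. */
static double straightness(const pt p[4])
{
    double polygon = dist(p[0], p[1]) + dist(p[1], p[2]) + dist(p[2], p[3]);
    return polygon > 0.0 ? dist(p[0], p[3]) / polygon : 1.0;
}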

Peter


From: D Yuniskis on
Hi Peter,

Peter Dickerson wrote:
> "D Yuniskis" <not.going.to.be(a)seen.com> wrote in message
> news:i0ajtd$n5q$1(a)speranza.aioe.org...
> [snip]
>> Ah, OK. So, it just tries to look-ahead "better" based on
>> history (?). E.g., my estimate[i] are averages of polygon/chord
>> length and endpoint length. Track successive estimate[i]'s
>> and, at each step, compute an ESTIMATE[i] from Romberg?
>> This would have no "resale value" in the next iteration of
>> my algorithm but would (hopefully) be a better representation
>> of the *real* "curve length" -- better than my "average of
>> chords and endpoints" (?)
>
> That's it.
> In fact your method has error O(h^4) which is the same (up to a factor) as
> using the distance between endpoints and then applying one round of Romberg.
> The difference is your method uses 4 sqrts at each stage and mine uses 3

The first iteration of my approach uses 4 sqrts. Thereafter,
you only need 2.5 of them. (I haven't checked to see what
other optimizations might be implied by the geometry)

> (one from the previous lower resolution + 2 from the current split).
> However, your factor is smaller, so the actual error at each step is
> slightly smaller in your case. In the example table below, the raw length
> estimate uses the number of square roots shown in the sqrts column for that
> row. The Romberg result needs that row's sqrts plus the previous row's. The
> double Romberg needs that row's sqrts plus the previous two rows'. So my
> method can get to 1

"Double Romberg"? Is that like "double secret probation"?? :>

> part per billion (ppb) using 56 sqrts (32+16+8), while yours uses 96
> (64+32).
>
> Control points
> (0.000000,0.000000)
> (0.000000,0.500000)
> (0.500000,0.500000)
> (0.500000,0.000000)
>
> sqrts length Romberg Romberg^2
> 1 0.500000000
> 2 0.901387819 1.035183758
> 4 0.975359556 1.000016801 0.997672337
> 8 0.993857921 1.000024042 1.000024525
> 16 0.998465633 1.000001538 1.000000037
> 32 0.999616481 1.000000096 1.000000000
> 64 0.999904125 1.000000006 1.000000000
> 128 0.999976031 1.000000000 1.000000000
> 256 0.999994008 1.000000000 1.000000000
> 512 0.999998502 1.000000000 1.000000000
> 1024 0.999999625 1.000000000 1.000000000
>
> sqrts DY length DY Romberg
> 4 1.000000000
> 8 1.002470605 1.002635312
> 16 1.000128079 0.999971911
> 32 1.000007988 0.999999982
> 64 1.000000499 1.000000000
> 128 1.000000031 1.000000000
> 256 1.000000002 1.000000000
> 512 1.000000000 1.000000000
> 1024 1.000000000 1.000000000
> 2048 1.000000000 1.000000000
> 4096 1.000000000 1.000000000

I'm not sure how to adjust the sqrts column, here, to reflect
my 2.5/iteration... (?). I should just write it up and let it
do the crunching. (I assume you are just forcing the algorithm
to run to a depth of N regardless of the actual error that is
estimated at that depth?)

> [snip]
>>> 1 >= length of curve/distance along the control points >= 0.5
>> In some pathological cases, that's not true. (e.g., look at
>> your original example, I think -- sorry, early in the
>> morning to be doing this without pencil and paper...)
>
> Yes, you are right. Not particularly pathological. The point I was making is
> that I think there is a lower limit on the ratio of path length to distance
> along the control points; otherwise the path length could approach zero with
> the control points a finite distance apart.

Agreed.

>>> I see this as the cost of the calculation not the recursion since the
>>> main difference between this method and stepping along the curve is the
>>> order that things are done, not how many.
>> If you walk up the curve (t 0->1), you have to take a fixed
>> dt for each pass. Even in regions of low curvature.
>
> It's a cubic! How much change of curvature are you expecting?
>
>> I can spend those calculations elsewhere -- in places where
>> the curvature demands more work for a particular degree of
>> fitness.
>
> And spend more effort working out which places they are (if any). That
> doesn't come for free.

But that's the point: see what it is that makes a curve
(or, a "subcurve") too curvy and see how easy it is to identify
those criteria *without* having to invest lots of effort.
E.g., the example I gave shows one of the "control point
vectors" as having a really big magnitude compared to the
angle it forms with its neighbors --> sharp bend. (I already
need to know that magnitude so that doesn't cost anything
extra).

>> The cost of the actual recursion is tiny (another stack frame
>> and function invocation) compared to all the number crunching
>> that it (hopefully) avoids (or, spends in better places).
>
> But that has little to do with recursion. That I integrated along the path
> in t order is not the point. I could equally have done it via the curve
> splitting (had I known how to do it at the time). The point was to show that
> the Romberg method can reduce the amount of work dramatically without any
> smarts. In my example above it reduced 1024 sqrts (proxy for work) for 375
> ppb error to 56 sqrts for <1ppb. For your method it does somewhat less well
> because you are already getting a better convergence but still 512 sqrts for
> < 1ppb becomes 96 sqrts.
>
> [snip]
>
>>> There aren't any hard curves. It's a cubic.
>> Sorry, I was trying to describe the (total) work required
>> for particular "types" of curves.
>>
>> E.g., (making up numbers entirely in my head :> )
>>
>> (0,0) (0,0) (1000,1000) (1000,1000)
>>
>> is an *easy* curve. The estimate converges instantly
>> (regardless of how you -- rationally -- create that
>> estimate).
>
> If this curve turns up a lot, then you are using the wrong technology :-)

Agreed. :> But, any representation used for "curves" has to
also handle it -- else you need to provide a different encoding
and a "flavor marker".

[I have assumed any equivalent representation for said "curve"
would be equally efficient, computationally. WITHOUT CHECKING
FOR SPECIAL CONDITIONS, I am not sure that is always the case
with this. I.e., it may be wiser to use a particular
representation than some others...?]

>> OTOH,
>>
>> (0,0) (0,400) (10,0) (0,0)
>>
>> is a "hard" curve. The estimate requires multiple
>> iterations to get a given "goodness".
>
> Yes, a bit harder, particularly if you are looking for ppb accuracy.

But, I might be able to detect this "problem area"
just by looking at dx,dy's of contiguous chords/vectors.
I.e., as if doing trig but without the *actual* trig
(e.g., if dx >= 100*dy ...)

>> The point of my statement (above) was to try to
>> come up with some simple/algorithmic criteria
>> (short of actually *solving* the curve) that would
>> let you determine how hard you have to work on
>> a particular curve (i.e., set of control points)
>> to get a particular degree of fit.
>
> That has to be a heuristic until you know how you are calculating the
> estimates. However, the ratio of the endpoint distance to the sum of control
> point distances will give some idea. If it is small, then the curve has turned
> back on itself. If it's large (approaching 1), then it's nearly straight. It
> can't be nearly straight and highly twisted and still be a cubic...

Yes, but for a given ratio of the sum of control point distances to the
endpoint distance, you can have very different curve characters.
E.g., compare a closed/nearly closed curve (distance from
start to end very small) to an "s/z" formed with endpoints the
same distance apart but control points on opposite sides of
the endpoint-line.

I think you have to look at two contiguous chords and their angle
(plus magnitudes) to properly identify a "tight corner".
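
One way to cast that as a test (the names and the 60-degree threshold are
illustrative, not something agreed in the thread): compare two contiguous
chords using only a dot product and squared magnitudes, so no trig and no
square roots are needed.

#include <stdbool.h>

typedef struct { double x, y; } pt;

/* True if the turn from chord (a->b) into chord (b->c) is "tight". */
static bool tight_corner(pt a, pt b, pt c)
{
    double ux = b.x - a.x, uy = b.y - a.y;     /* first chord  */
    double vx = c.x - b.x, vy = c.y - b.y;     /* second chord */
    double dot = ux * vx + uy * vy;

    if (dot < 0.0)                   /* turn of more than 90 degrees */
        return true;

    /* cos^2(turn) = dot^2 / (|u|^2 |v|^2); flag turns sharper than 60
     * degrees (cos^2 < 0.25) without taking any square roots.        */
    double uu = ux * ux + uy * uy;
    double vv = vx * vx + vy * vy;
    return dot * dot < 0.25 * uu * vv;
}
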
From: D Yuniskis on
Hi Peter,

Peter Dickerson wrote:
> In the example table below, the raw length
> estimate uses the number of square roots shown in the sqrts column for that
> row. The Romberg result needs that row's sqrts plus the previous row's. The
> double Romberg needs that row's sqrts plus the previous two rows'. So my
> method can get to 1
> part per billion (ppb) using 56 sqrts (32+16+8), while yours uses 96
> (64+32).
>
> Control points
> (0.000000,0.000000)
> (0.000000,0.500000)
> (0.500000,0.500000)
> (0.500000,0.000000)
>
> sqrts length Romberg Romberg^2
> 1 0.500000000
> 2 0.901387819 1.035183758
> 4 0.975359556 1.000016801 0.997672337
> 8 0.993857921 1.000024042 1.000024525
> 16 0.998465633 1.000001538 1.000000037
> 32 0.999616481 1.000000096 1.000000000
> 64 0.999904125 1.000000006 1.000000000
> 128 0.999976031 1.000000000 1.000000000
> 256 0.999994008 1.000000000 1.000000000
> 512 0.999998502 1.000000000 1.000000000
> 1024 0.999999625 1.000000000 1.000000000
>
> sqrts DY length DY Romberg
> 4 1.000000000
> 8 1.002470605 1.002635312
> 16 1.000128079 0.999971911
> 32 1.000007988 0.999999982
> 64 1.000000499 1.000000000
> 128 1.000000031 1.000000000
> 256 1.000000002 1.000000000
> 512 1.000000000 1.000000000
> 1024 1.000000000 1.000000000
> 2048 1.000000000 1.000000000
> 4096 1.000000000 1.000000000

OK, I made a stab at trying to "adjust" this for
a more efficient algorithm (2.5/iteration).

Trusting your "results", I assume the table would be:

sqrts DY length DY Romberg
4 1.000000000
5 1.002470605 1.002635312
10 1.000128079 0.999971911
20 1.000007988 0.999999982
40 1.000000499 1.000000000

I.e., at this point, I have invested *79* sqrts (4+5+10+...).
Note, however, that some of those might never occur if the
estimate for a particular "subcurve" (ick!) looks "good
enough" (it need not be further subdivided).

If I blindly force N subdivisions, it changes the mix.

I'll try to make some time to code up a "real" algorithm
and instrument SQRT() to do the counting for me with
various "estimate tolerances".
From: Peter Dickerson on

"D Yuniskis" <not.going.to.be(a)seen.com> wrote in message
news:i0derm$en9$1(a)speranza.aioe.org...
> Hi Peter,
>

[snip]

> OK, I made a stab at trying to "adjust" this for
> a more efficient algorithm (2.5/iteration).

Sorry, I don't understand where the 2.5 sqrts/iteration comes from. When you
split the curve and have two sets of four control points, you need eight
distances (sqrts), but none of those have been computed at the previous level.
The only place I see five appear is that there are five new points to
calculate, one of which is used twice (giving six), plus the start and end
points from the original (now we have eight). None of that uses sqrts and
the time to calculate it is potentially insignificant relative to the sqrts
(e.g. it can be done with integers of sufficient size by rescaling by eight
at each nesting).
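
For reference, a sketch of that split done in integers (the midpoint
construction is the one DY spells out later in the thread; the rescale-by-8
bookkeeping and the type names are just my reading of the remark above):

#include <stdint.h>

typedef struct { int64_t x, y; } ipt;

static ipt mid(ipt a, ipt b)
{
    ipt m = { (a.x + b.x) / 2, (a.y + b.y) / 2 };
    return m;
}

/* Split p[0..3] into left[0..3] and right[0..3].  The inputs are scaled
 * up by 8 first (once per nesting level), so every halving below is
 * exact integer arithmetic; the outputs carry that extra factor of 8,
 * to be divided out (or tracked) when lengths are finally wanted.
 * Five new points, one of them (left[3] == right[0]) used twice, plus
 * the original start and end points -- and no square roots anywhere.
 */
static void split(const ipt p[4], ipt left[4], ipt right[4])
{
    ipt q[4];
    for (int i = 0; i < 4; i++) {
        q[i].x = p[i].x * 8;
        q[i].y = p[i].y * 8;
    }

    ipt m = mid(q[1], q[2]);       /* midpoint of the two inner control points */

    left[0]  = q[0];               /* original start point */
    right[3] = q[3];               /* original end point   */
    left[1]  = mid(q[0], q[1]);
    right[2] = mid(q[2], q[3]);
    left[2]  = mid(left[1], m);
    right[1] = mid(m, right[2]);
    left[3]  = right[0] = mid(left[2], right[1]);
}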

> Trusting your "results", I assume the table would be:
>
> sqrts DY length DY Romberg
> 4 1.000000000
> 5 1.002470605 1.002635312
> 10 1.000128079 0.999971911
> 20 1.000007988 0.999999982
> 40 1.000000499 1.000000000
>
> I.e., at this point, I have invested *79* sqrts (4+5+10+...).
> Note, however, that some of those might never occur if the
> estimate for a particular "subcurve" (ick!) looks "good
> enough" (it need not be further subdivided).

You've still lost me.

> If I blindly force N subdivisions, it changes the mix.
>
> I'll try to make some time to code up a "real" algorithm
> and instrument SQRT() to do the counting for me with
> various "estimate tolerances".

Surely you can identify any 'sharp' regions of the curve from the radius of
curvature (or at least its square, otherwise you'll need a sqrt for that). A
proxy might be dx/dt*d2y/dt2 - dy/dt*d2x/dt2 being small.
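
For concreteness, that proxy for a cubic Bezier works out to the sketch below
(standard derivative formulas; the names are mine). Note that the full
curvature is this cross product divided by the cube of the speed, so the
squared speed dx*dx + dy*dy is worth carrying alongside it when deciding what
counts as "small".

typedef struct { double x, y; } pt;

/* x'(t)*y''(t) - y'(t)*x''(t) for the cubic Bezier p[0..3]: no sqrts. */
static double curvature_cross(const pt p[4], double t)
{
    double u = 1.0 - t;

    /* B'(t) = 3[(1-t)^2 (P1-P0) + 2(1-t)t (P2-P1) + t^2 (P3-P2)] */
    double dx = 3.0 * (u * u * (p[1].x - p[0].x)
                       + 2.0 * u * t * (p[2].x - p[1].x)
                       + t * t * (p[3].x - p[2].x));
    double dy = 3.0 * (u * u * (p[1].y - p[0].y)
                       + 2.0 * u * t * (p[2].y - p[1].y)
                       + t * t * (p[3].y - p[2].y));

    /* B''(t) = 6[(1-t)(P0 - 2 P1 + P2) + t(P1 - 2 P2 + P3)] */
    double ddx = 6.0 * (u * (p[0].x - 2.0 * p[1].x + p[2].x)
                        + t * (p[1].x - 2.0 * p[2].x + p[3].x));
    double ddy = 6.0 * (u * (p[0].y - 2.0 * p[1].y + p[2].y)
                        + t * (p[1].y - 2.0 * p[2].y + p[3].y));

    return dx * ddy - dy * ddx;
}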

Peter


From: D Yuniskis on
Hi Peter,

Peter Dickerson wrote:
> "D Yuniskis" <not.going.to.be(a)seen.com> wrote in message
> news:i0derm$en9$1(a)speranza.aioe.org...
>
>> OK, I made a stab at trying to "adjust" this for
>> a more efficient algorithm (2.5/iteration).
>
> Sorry, I don't understand where the 2.5 sqrts/iteration comes from. When you
> split the curve and have two sets of four control points, you need eight
> distances (sqrts), but none of those have been computed at the previous level.
> The only place I see five appear is that there are five new points to
> calculate, one of which is used twice (giving six), plus the start and end
> points from the original (now we have eight). None of that uses sqrts and
> the time to calculate it is potentially insignificant relative to the sqrts
> (e.g. it can be done with integers of sufficient size by rescaling by eight
> at each nesting).

Given a Start point (S), departing control point (D), arriving
control point (A), and End point (E), [i.e., these are P0 - P3]:

The first iteration computes the lengths of |SD|, |DA|, |AE| and
compares their sum against |SE| (i.e., 4 sqrts).

Then the curve is subdivided into two half-curves ("Left" and
"Right"). The control points for the two half curve are:
M = (D+A)/2

L0 = S
R3 = E

L1 = (S+D)/2
R2 = (A+E)/2

L2 = (L1+M)/2
R1 = (M+R2)/2

L3 = R0 = (L2+R1)/2

Looking at the Left half curve L0, L1, L2, L3:

Compute |L0L1|, |L1L2| and |L2L3| and compare their sum against |L0L3|.

But, |L0L1| is actually just half of |SD| -- which we already
have from the first iteration!

However, we do have to compute |L1L2|. And |L0L3|.
That's two sqrts.

OTOH, when we compute |L2L3|, this is really half of |L2R1|.
So, whatever result we get there we can use for the |R0R1|
computation! So, that's a 0.5 sqrt.

(Did I do that right?)
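
Spelling that bookkeeping out (my names; the point labels are the ones above):
given the parent's |SD| and |AE| plus the split control points, the polygon
and endpoint lengths of *both* halves cost five fresh square roots, i.e. 2.5
per half.

#include <math.h>

typedef struct { double x, y; } pt;

static int nsqrt;                          /* instrumented sqrt counter */

static double dist(pt a, pt b)
{
    nsqrt++;
    return hypot(b.x - a.x, b.y - a.y);
}

/* Chord sums for both halves L[0..3] and R[0..3] of a split curve,
 * reusing the parent's |SD| and |AE|: 5 calls to dist() in total.
 */
static void half_lengths(const pt L[4], const pt R[4],
                         double SD, double AE,
                         double *left_poly, double *left_ends,
                         double *right_poly, double *right_ends)
{
    double shared = dist(L[2], R[1]);              /* |L2R1|: used twice    */

    *left_poly  = SD / 2.0                         /* |L0L1| = |SD|/2, free */
                + dist(L[1], L[2])                 /* |L1L2|                */
                + shared / 2.0;                    /* |L2L3| = |L2R1|/2     */
    *left_ends  = dist(L[0], L[3]);                /* |L0L3|                */

    *right_poly = shared / 2.0                     /* |R0R1| = |L2R1|/2     */
                + dist(R[1], R[2])                 /* |R1R2|                */
                + AE / 2.0;                        /* |R2R3| = |AE|/2, free */
    *right_ends = dist(R[0], R[3]);                /* |R0R3|                */
}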

>> Trusting your "results", I assume the table would be:
>>
>> sqrts DY length DY Romberg
>> 4 1.000000000
>> 5 1.002470605 1.002635312
>> 10 1.000128079 0.999971911
>> 20 1.000007988 0.999999982
>> 40 1.000000499 1.000000000
>>
>> I.e., at this point, I have invested *79* sqrts (4+5+10+...).
>> Note, however, that some of those might never occur if the
>> estimate for a particular "subcurve" (ick!) looks "good
>> enough" (it need not be further subdivided).
>
> You've still lost me.

Assume I am correct with 2.5 sqrts per "half curve".
And 4 on the first iteration.

So, second iteration is 5 (2.5 + 2.5) sqrts (assuming
I am not looking at the closeness of fit).

Third iteration cuts each of the two half curves from
the second iteration in half using 5 sqrts for each
of the second iteration half curves -- 10 sqrts total.

20 for the next. etc.

Since my algorithm starts with the *first* ITERATION
and conditionally subdivides, it takes on the costs of
each iteration as it subdivides the curve(s). So,
my total cost is the sum of the costs of *every*
iteration.

I.e., my code computes "sum of segments" and "endpoint
length"; checks to see how close they are to each other
(or, does your magic Romberg estimate and compares
*that* to these lengths to see how closely *it*
fits); and, if the fit isn't good enough, *then*
it subdivides the curve and does another iteration of
chord length computations. (but, it has already
incurred the cost of the computations leading up to
that decision to recurse!)
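
A minimal sketch of that flow, so the sqrt counting can be instrumented (the
termination test, the halving of the tolerance at each split, and the counter
itself are my choices, not anything agreed in the thread; it charges the full
4 sqrts per call rather than the shared 2.5 per half discussed above):

#include <math.h>

typedef struct { double x, y; } pt;

static int nsqrt;                                  /* proxy for work */

static double dist(pt a, pt b)
{
    nsqrt++;
    return hypot(b.x - a.x, b.y - a.y);
}

static pt mid(pt a, pt b)
{
    pt m = { (a.x + b.x) / 2.0, (a.y + b.y) / 2.0 };
    return m;
}

/* Average of polygon and endpoint lengths; subdivide if they disagree. */
static double adaptive_length(const pt p[4], double tol)
{
    double poly = dist(p[0], p[1]) + dist(p[1], p[2]) + dist(p[2], p[3]);
    double ends = dist(p[0], p[3]);

    if (poly - ends <= tol)                        /* good enough: stop here */
        return (poly + ends) / 2.0;

    /* Midpoint split (no square roots needed for this part). */
    pt m  = mid(p[1], p[2]);
    pt l1 = mid(p[0], p[1]), r2 = mid(p[2], p[3]);
    pt l2 = mid(l1, m),      r1 = mid(m, r2);
    pt l3 = mid(l2, r1);

    pt left[4]  = { p[0], l1, l2, l3 };
    pt right[4] = { l3, r1, r2, p[3] };

    return adaptive_length(left,  tol / 2.0)
         + adaptive_length(right, tol / 2.0);
}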

>> If I blindly force N subdivisions, it changes the mix.
>>
>> I'll try to make some time to code up a "real" algorithm
>> and instrument SQRT() to do the counting for me with
>> various "estimate tolerances".
>
> Surely you can identify any 'sharp' regions of the curve from the radius of
> curvature (or at least its square, otherwise you'll need a sqrt for that). A
> proxy might be dx/dt*d2y/dt2 - dy/dt*d2x/dt2 being small.