From: Wade Humeniuk on
For LispWorks (LW) 4.3.7:

(defun pi/2-opt ()
  (declare (optimize (speed 3) (safety 0) (debug 0) (float 0)))
  (loop with sum of-type double-float = #.(realpart (+ (exp (- (expt -10.0d0 2.0d0)))
                                                       (exp (- (expt 10.0d0 2.0d0)))))
        for x of-type double-float from -9.99d0 by 0.01d0
        while (< x 10.0d0) do
          (incf sum (* 2d0 (exp (- (* (abs x) (abs x))))))
        finally (return (* sum 0.005d0))))


I removed some of the expt calls; I think it is still correct.
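
One way to check the "still correct" claim is to compare against a plain, unoptimized reference. The following is only a sketch, not the poster's code: the name gauss-trapezoid and its optional parameters are hypothetical, and it uses an integer loop counter so that x does not accumulate rounding error.

```lisp
;; Reference version (hypothetical helper, not from the thread).
;; An integer counter K is used so X is recomputed rather than accumulated.
(defun gauss-trapezoid (&optional (n 2000) (a -10.0d0) (b 10.0d0))
  "Trapezoidal estimate of the integral of exp(-x^2) over [A, B] with N steps."
  (let ((h (/ (- b a) n))
        ;; The two endpoint terms enter the sum once each.
        (sum (+ (exp (- (* a a))) (exp (- (* b b))))))
    ;; Interior points enter the sum twice.
    (loop for k from 1 below n
          for x = (+ a (* k h))
          do (incf sum (* 2 (exp (- (* x x))))))
    (* sum (/ h 2))))

(format t "~a~%" (gauss-trapezoid))
```

With the defaults (2000 steps of 0.01 over [-10, 10]) this should agree with the optimized function to many digits, so the two results can be compared directly at the REPL.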

CL-USER 47 > (pi/2-opt)
1.7724538509055174

CL-USER 46 > (time (dotimes (i 10000) (pi/2-opt)))
Timing the evaluation of (DOTIMES (I 10000) (PI/2-OPT))

user time = 1.832
system time = 0.000
Elapsed time = 0:00:02
Allocation = 160984 bytes standard / 112783 bytes conses
0 Page faults
Calls to %EVAL 10035
NIL


CL-USER 44 > (disassemble 'pi/2-opt)
2067FC1A:
0: 55 push ebp
1: 89E5 move ebp, esp
3: 83EC24 sub esp, 24
6: C7042445240000 move [esp], 2445
13: DD05E02A6920 fldl [20692AE0] ; 7.440151952041305E-44
19: DD5DF0 fstpl [ebp-10]
22: DD05C02B6920 fldl [20692BC0] ; -9.99
28: DD55F8 fstl [ebp-8]
31: DD05882C6920 fldl [20692C88] ; 10.0
37: DED9 fcompp
39: DFE0 fnstsw ax
41: 9E sahf
42: 0F8ABF000000 jp L6
48: 0F86B9000000 jbe L6
L1: 54: DD45F0 fldl [ebp-10]
57: DD5DE0 fstpl [ebp-20]
60: DD45F8 fldl [ebp-8]
63: DD55F8 fstl [ebp-8]
66: DD05C00B1920 fldl [20190BC0] ; 0.0
72: DED9 fcompp
74: DFE0 fnstsw ax
76: 9E sahf
77: 0F8AF6000000 jp L7
83: 0F86F0000000 jbe L7
89: DD45F8 fldl [ebp-8]
92: D9E0 fchs
94: DD5DE8 fstpl [ebp-18]
L2: 97: DD45F8 fldl [ebp-8]
100: DD55F8 fstl [ebp-8]
103: DD05C00B1920 fldl [20190BC0] ; 0.0
109: DED9 fcompp
111: DFE0 fnstsw ax
113: 9E sahf
114: 0F8ADC000000 jp L8
120: 0F86D6000000 jbe L8
126: DD45F8 fldl [ebp-8]
129: D9E0 fchs
131: DD5DF0 fstpl [ebp-10]
L3: 134: DD45E8 fldl [ebp-18]
137: DC4DF0 fmull [ebp-10]
140: D9E0 fchs
142: D9E5 fxam
144: DFE0 fnstsw ax
146: 3500010000 xor eax, 100
151: 9E sahf
152: 760D jbe L4
154: 7B23 jnp L5
156: F6C402 testb ah, 2
159: 741E je L5
161: DDD8 fstp st(0)
163: D9EE fldz
165: EB18 jmp L5
L4: 167: D9EA fld2e
169: DEC9 fmulp st(1), st
171: D9C0 fld st(0)
173: D9FC frndint
175: D9C0 fld st(0)
177: D9CA fxch st(2)
179: DEE1 fsubrp st(1), st
181: D9F0 f2xm1
183: D9E8 fld1
185: DEC1 faddp st(1), st
187: D9FD fscale
189: DDD9 fstp st(1)
L5: 191: DD05102D6920 fldl [20692D10] ; 2.0
197: DEC9 fmulp st(1), st
199: DC45E0 faddl [ebp-20]
202: DD5DF0 fstpl [ebp-10]
205: DD45F8 fldl [ebp-8]
208: DD05102C6920 fldl [20692C10] ; 0.01
214: DEC1 faddp st(1), st
216: DD55F8 fstl [ebp-8]
219: DD05882C6920 fldl [20692C88] ; 10.0
225: DED9 fcompp
227: DFE0 fnstsw ax
229: 9E sahf
230: 7A07 jp L6
232: 7605 jbe L6
234: E947FFFFFF jmp L1
L6: 239: DD45F0 fldl [ebp-10]
242: DD05602E6920 fldl [20692E60] ; 0.005
248: DEC9 fmulp st(1), st
250: FF75E4 push [ebp-1C]
253: FF75E0 push [ebp-20]
256: FF75DC push [ebp-24]
259: 8B75E8 move esi, [ebp-18]
262: 8975DC move [ebp-24], esi
265: 8B75EC move esi, [ebp-14]
268: 8975E0 move [ebp-20], esi
271: 8B75F0 move esi, [ebp-10]
274: 8975E4 move [ebp-1C], esi
277: 8B75F4 move esi, [ebp-C]
280: 8975E8 move [ebp-18], esi
283: 8B75F8 move esi, [ebp-8]
286: 8975EC move [ebp-14], esi
289: 8B75FC move esi, [ebp-4]
292: 8975F0 move [ebp-10], esi
295: 8B7500 move esi, [ebp]
298: 8975F4 move [ebp-C], esi
301: 83ED0C sub ebp, C
304: 8B7510 move esi, [ebp+10]
307: 897504 move [ebp+4], esi
310: DD5D0C fstpl [ebp+C]
313: C74508450C0000 move [ebp+8], C45
320: B501 moveb ch, 1
322: C9 leave
323: FF2530671120 jmp [20116730] ; SYSTEM::RAW-FAST-BOX-DOUBLE
L7: 329: DD45F8 fldl [ebp-8]
332: DD5DE8 fstpl [ebp-18]
335: E90DFFFFFF jmp L2
L8: 340: DD45F8 fldl [ebp-8]
343: DD5DF0 fstpl [ebp-10]
346: E927FFFFFF jmp L3
351: 90 nop
352: 90 nop
353: 90 nop
From: Pierre THIERRY on
On Sat, 12 Aug 2006 19:38:26 +0100, verec wrote:
> static double piOver2() {
> double sum = std::exp(-std::pow(-10.0, 2.0)) ;
> sum += std::exp(-std::pow(10.0, 2)) ;

Out of curiosity, why do you calculate the same number twice?
10^2 == (-10)^2
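
To illustrate the point in Lisp terms, and why the Lisp versions in this thread wrap the initial sum in realpart, here is a small sketch. It assumes a conforming Common Lisp; whether (expt -10.0d0 2.0d0) returns a real or a complex with a tiny imaginary part depends on the implementation, so the code only ever looks at the real part. The names edge-term and edge-term-float-expt are hypothetical.

```lisp
;; Both endpoints contribute the same value, because (-10)^2 = 10^2.
;; With an integer exponent, EXPT on a double-float base gives a real double.
(defun edge-term (x)
  "Return exp(-x^2) for double-float X, squaring with an integer exponent."
  (declare (type double-float x))
  (exp (- (expt x 2))))

;; With a float exponent and a negative base, some implementations return a
;; complex number. REALPART returns a real argument unchanged, so this works
;; whichever kind of result EXPT produces.
(defun edge-term-float-expt (x)
  "Return the real part of exp(-x^2.0) for double-float X."
  (declare (type double-float x))
  (realpart (exp (- (expt x 2.0d0)))))

(format t "~a ~a~%" (edge-term -10.0d0) (edge-term 10.0d0))
```

Since the two endpoint terms are equal, the initial sum could be written as twice one of them, which also avoids the negative base altogether.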

> But I just can't tell how long 10,000 iterations would
> take because I lost patience after 2 minutes.

That's very strange indeed. I can confirm it is pretty fast without any
optimization. I'm also working with SBCL. It goes from 5 seconds without
optimization down to 1.1 seconds with speed 3 and everything else set to 0.

Maybe there's a flaw in your compiler. Try it on another implementation.

Curiously,
Nowhere man
--
nowhere.man(a)levallois.eu.org
OpenPGP 0xD9D50D8A

From: Wade Humeniuk on
This version is better than the last:

(defun pi/2-opt ()
  (declare (optimize (speed 3) (safety 0) (debug 0) (float 0)))
  (loop with sum of-type double-float = #.(realpart (+ (exp (- (expt -10.0d0 2.0d0)))
                                                       (exp (- (expt 10.0d0 2.0d0)))))
        for x of-type double-float from -9.99d0 below 10.0d0 by 0.01d0
        do (incf sum (* 2d0 (exp (- (* x x)))))
        finally (return (* sum 0.005d0))))

CL-USER 1 > (time (dotimes (i 10000) (pi/2-opt)))
Timing the evaluation of (DOTIMES (I 10000) (PI/2-OPT))
; Loading fasl file C:\Program Files\Xanalys\LispWorks\lib\4-3-0-0\modules\util\callcoun.fsl

user time = 1.231
system time = 0.000
Elapsed time = 0:00:01
Allocation = 164304 bytes standard / 113113 bytes conses
0 Page faults
Calls to %EVAL 10035
NIL

CL-USER 4 > (disassemble 'pi/2-opt)
; Loading fasl file C:\Program Files\Xanalys\LispWorks\lib\4-3-0-0\modules\util\disass.fsl
; Loading fasl file C:\Program Files\Xanalys\LispWorks\lib\4-3-0-0\patches\disassembler\0001\0001.fsl
; Loaded public patch DISASSEMBLER 1.1

; Loading fasl file C:\Program Files\Xanalys\LispWorks\lib\4-3-0-0\modules\memory\findptr.fsl
214EBE2A:
0: 55 push ebp
1: 89E5 move ebp, esp
3: 83EC24 sub esp, 24
6: C7042445240000 move [esp], 2445
13: DD05500E6920 fldl [20690E50] ; 7.440151952041305E-44
19: DD5DF0 fstpl [ebp-10]
22: DD05300F6920 fldl [20690F30] ; -9.99
28: DD55F8 fstl [ebp-8]
31: DD05780F6920 fldl [20690F78] ; 10.0
37: DED9 fcompp
39: DFE0 fnstsw ax
41: 9E sahf
42: 7A5C jp L2
44: 775A ja L2
L1: 46: DD45F0 fldl [ebp-10]
49: DD0570116920 fldl [20691170] ; 0.005
55: DEC9 fmulp st(1), st
57: FF75E4 push [ebp-1C]
60: FF75E0 push [ebp-20]
63: FF75DC push [ebp-24]
66: 8B75E8 move esi, [ebp-18]
69: 8975DC move [ebp-24], esi
72: 8B75EC move esi, [ebp-14]
75: 8975E0 move [ebp-20], esi
78: 8B75F0 move esi, [ebp-10]
81: 8975E4 move [ebp-1C], esi
84: 8B75F4 move esi, [ebp-C]
87: 8975E8 move [ebp-18], esi
90: 8B75F8 move esi, [ebp-8]
93: 8975EC move [ebp-14], esi
96: 8B75FC move esi, [ebp-4]
99: 8975F0 move [ebp-10], esi
102: 8B7500 move esi, [ebp]
105: 8975F4 move [ebp-C], esi
108: 83ED0C sub ebp, C
111: 8B7510 move esi, [ebp+10]
114: 897504 move [ebp+4], esi
117: DD5D0C fstpl [ebp+C]
120: C74508450C0000 move [ebp+8], C45
127: B501 moveb ch, 1
129: C9 leave
130: FF2530671120 jmp [20116730] ; SYSTEM::RAW-FAST-BOX-DOUBLE
L2: 136: DD45F8 fldl [ebp-8]
139: DC4DF8 fmull [ebp-8]
142: D9E0 fchs
144: D9E5 fxam
146: DFE0 fnstsw ax
148: 3500010000 xor eax, 100
153: 9E sahf
154: 760D jbe L3
156: 7B23 jnp L4
158: F6C402 testb ah, 2
161: 741E je L4
163: DDD8 fstp st(0)
165: D9EE fldz
167: EB18 jmp L4
L3: 169: D9EA fld2e
171: DEC9 fmulp st(1), st
173: D9C0 fld st(0)
175: D9FC frndint
177: D9C0 fld st(0)
179: D9CA fxch st(2)
181: DEE1 fsubrp st(1), st
183: D9F0 f2xm1
185: D9E8 fld1
187: DEC1 faddp st(1), st
189: D9FD fscale
191: DDD9 fstp st(1)
L4: 193: DD0550106920 fldl [20691050] ; 2.0
199: DEC9 fmulp st(1), st
201: DC45F0 faddl [ebp-10]
204: DD5DF0 fstpl [ebp-10]
207: DD45F8 fldl [ebp-8]
210: DD05C80F6920 fldl [20690FC8] ; 0.01
216: DEC1 faddp st(1), st
218: DD55F8 fstl [ebp-8]
221: DD05780F6920 fldl [20690F78] ; 10.0
227: DED9 fcompp
229: DFE0 fnstsw ax
231: 9E sahf
232: 7A9E jp L2
234: 779C ja L2
236: E93DFFFFFF jmp L1
241: 90 nop
NIL
From: verec on
On 2006-08-12 21:49:58 +0100, Pierre THIERRY
<nowhere.man(a)levallois.eu.org> said:

> On Sat, 12 Aug 2006 19:38:26 +0100, verec wrote:
>> static double piOver2() {
>> double sum = std::exp(-std::pow(-10.0, 2.0)) ;
>> sum += std::exp(-std::pow(10.0, 2)) ;
>
> Out of curiosity, why do you calculate the same number twice?
> 10^2 == (-10)^2

Well spotted! I guess this is just my mindless translation
of the math formula

pi/2 = integral[-10, 10, x](e^-(x^2))

:-(
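
For reference, the sum that the code in this thread evaluates can be written as the trapezoidal rule below. This is a reading of the posted code, with h = 0.01 taken from its step size and 0.005 = h/2 from its final multiplication. Note that the closed form of the full Gaussian integral is \sqrt{\pi}, which is what the 1.7724538509055174 printed earlier in the thread matches; \pi/2 is about 1.5708.

```latex
% Trapezoidal rule on [-10, 10] with step h = 0.01 and f(x) = e^{-x^2}
\int_{-10}^{10} e^{-x^{2}}\,dx \;\approx\;
  \frac{h}{2}\Bigl[\, f(-10) + f(10) + 2\sum_{k=1}^{1999} f(-10 + kh) \Bigr],
  \qquad \frac{h}{2} = 0.005

% Closed form of the full Gaussian integral, for comparison
\int_{-\infty}^{\infty} e^{-x^{2}}\,dx = \sqrt{\pi} \approx 1.7724538509
```

The tails beyond \pm 10 are of the order of e^{-100}, so truncating the range to [-10, 10] does not change the printed digits.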

>> But I just can't tell how long 10,000 iterations would
>> take because I lost patience after 2 minutes.
>
> That's very strange indeed. I can confirm it is pretty fast without any
> optimization. I'm also working with SBCL. It goes from 5 seconds without
> optimization down to 1.1 seconds with speed 3 and everything else set to 0.
>
> Maybe there's a flaw in your compiler. Try it on another implementation.

Still on the case :-)

I'll report in due course.

> Curiously,
> Nowhere man

--
JFB

From: Barry Margolin on
In article <44de2997$0$639$5a6aecb4(a)news.aaisp.net.uk>,
verec <verec(a)mac.com> wrote:

> I thought (erroneously?) that (the float XYZ) would
> just drop the imaginary part, precisely because that's
> the purpose of a cast? Or should I use (coerce ... in
> addition?

THE is not a type cast; it is a declaration that the value of the
expression is known a priori to be of the specified type. It allows the
compiler to perform type dispatching at compile time rather than at run
time, but it doesn't perform any conversion to ensure that the object is
of that type.

So you should write (the float (realpart xyz)).
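
A minimal sketch of the distinction, assuming a conforming Common Lisp (the function names real-float-part and as-double are hypothetical, chosen for illustration):

```lisp
;; THE only promises the compiler a type; it never converts the value.
;; REALPART does the actual extraction, so the promise made by THE holds.
(defun real-float-part (z)
  "Return the real part of Z, declared to be a float."
  (the float (realpart z)))

;; COERCE, by contrast, performs a conversion between real types
;; (here rational -> double-float). It is not a way to discard an
;; imaginary part; use REALPART for that.
(defun as-double (r)
  "Convert the real number R to a double-float."
  (coerce r 'double-float))

(format t "~a~%" (real-float-part #c(100.0d0 -2.4d-14)))
(format t "~a~%" (as-double 1/4))
```

If the value passed to THE is not actually of the declared type, the consequences are undefined, which is why the conversion has to happen before the declaration rather than being expected from it.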

--
Barry Margolin, barmar(a)alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***