From: Wade Humeniuk on 12 Aug 2006 16:27

For LW 4.3.7

(defun pi/2-opt ()
  (declare (optimize (speed 3) (safety 0) (debug 0) (float 0)))
  (loop with sum of-type double-float =
          #.(realpart (+ (exp (- (expt -10.0d0 2.0d0)))
                         (exp (- (expt 10.0d0 2.0d0)))))
        for x of-type double-float from -9.99d0 by 0.01d0
        while (< x 10.0d0)
        do (incf sum (* 2d0 (exp (- (* (abs x) (abs x))))))
        finally (return (* sum 0.005d0))))

Removed some expt stuff, I think it is still correct.

CL-USER 47 > (pi/2-opt)
1.7724538509055174

CL-USER 46 > (time (dotimes (i 10000) (pi/2-opt)))
Timing the evaluation of (DOTIMES (I 10000) (PI/2-OPT))

user time = 1.832
system time = 0.000
Elapsed time = 0:00:02
Allocation = 160984 bytes standard / 112783 bytes conses
0 Page faults
Calls to %EVAL 10035
NIL

CL-USER 44 > (disassemble 'pi/2-opt)
2067FC1A:
  0: 55 push ebp
  1: 89E5 move ebp, esp
  3: 83EC24 sub esp, 24
  6: C7042445240000 move [esp], 2445
  13: DD05E02A6920 fldl [20692AE0] ; 7.440151952041305E-44
  19: DD5DF0 fstpl [ebp-10]
  22: DD05C02B6920 fldl [20692BC0] ; -9.99
  28: DD55F8 fstl [ebp-8]
  31: DD05882C6920 fldl [20692C88] ; 10.0
  37: DED9 fcompp
  39: DFE0 fnstsw ax
  41: 9E sahf
  42: 0F8ABF000000 jp L6
  48: 0F86B9000000 jbe L6
L1:
  54: DD45F0 fldl [ebp-10]
  57: DD5DE0 fstpl [ebp-20]
  60: DD45F8 fldl [ebp-8]
  63: DD55F8 fstl [ebp-8]
  66: DD05C00B1920 fldl [20190BC0] ; 0.0
  72: DED9 fcompp
  74: DFE0 fnstsw ax
  76: 9E sahf
  77: 0F8AF6000000 jp L7
  83: 0F86F0000000 jbe L7
  89: DD45F8 fldl [ebp-8]
  92: D9E0 fchs
  94: DD5DE8 fstpl [ebp-18]
L2:
  97: DD45F8 fldl [ebp-8]
  100: DD55F8 fstl [ebp-8]
  103: DD05C00B1920 fldl [20190BC0] ; 0.0
  109: DED9 fcompp
  111: DFE0 fnstsw ax
  113: 9E sahf
  114: 0F8ADC000000 jp L8
  120: 0F86D6000000 jbe L8
  126: DD45F8 fldl [ebp-8]
  129: D9E0 fchs
  131: DD5DF0 fstpl [ebp-10]
L3:
  134: DD45E8 fldl [ebp-18]
  137: DC4DF0 fmull [ebp-10]
  140: D9E0 fchs
  142: D9E5 fxam
  144: DFE0 fnstsw ax
  146: 3500010000 xor eax, 100
  151: 9E sahf
  152: 760D jbe L4
  154: 7B23 jnp L5
  156: F6C402 testb ah, 2
  159: 741E je L5
  161: DDD8 fstp st(0)
  163: D9EE fldz
  165: EB18 jmp L5
L4:
  167: D9EA fld2e
  169: DEC9 fmulp st(1), st
  171: D9C0 fld st(0)
  173: D9FC frndint
  175: D9C0 fld st(0)
  177: D9CA fxch st(2)
  179: DEE1 fsubrp st(1), st
  181: D9F0 f2xm1
  183: D9E8 fld1
  185: DEC1 faddp st(1), st
  187: D9FD fscale
  189: DDD9 fstp st(1)
L5:
  191: DD05102D6920 fldl [20692D10] ; 2.0
  197: DEC9 fmulp st(1), st
  199: DC45E0 faddl [ebp-20]
  202: DD5DF0 fstpl [ebp-10]
  205: DD45F8 fldl [ebp-8]
  208: DD05102C6920 fldl [20692C10] ; 0.01
  214: DEC1 faddp st(1), st
  216: DD55F8 fstl [ebp-8]
  219: DD05882C6920 fldl [20692C88] ; 10.0
  225: DED9 fcompp
  227: DFE0 fnstsw ax
  229: 9E sahf
  230: 7A07 jp L6
  232: 7605 jbe L6
  234: E947FFFFFF jmp L1
L6:
  239: DD45F0 fldl [ebp-10]
  242: DD05602E6920 fldl [20692E60] ; 0.005
  248: DEC9 fmulp st(1), st
  250: FF75E4 push [ebp-1C]
  253: FF75E0 push [ebp-20]
  256: FF75DC push [ebp-24]
  259: 8B75E8 move esi, [ebp-18]
  262: 8975DC move [ebp-24], esi
  265: 8B75EC move esi, [ebp-14]
  268: 8975E0 move [ebp-20], esi
  271: 8B75F0 move esi, [ebp-10]
  274: 8975E4 move [ebp-1C], esi
  277: 8B75F4 move esi, [ebp-C]
  280: 8975E8 move [ebp-18], esi
  283: 8B75F8 move esi, [ebp-8]
  286: 8975EC move [ebp-14], esi
  289: 8B75FC move esi, [ebp-4]
  292: 8975F0 move [ebp-10], esi
  295: 8B7500 move esi, [ebp]
  298: 8975F4 move [ebp-C], esi
  301: 83ED0C sub ebp, C
  304: 8B7510 move esi, [ebp+10]
  307: 897504 move [ebp+4], esi
  310: DD5D0C fstpl [ebp+C]
  313: C74508450C0000 move [ebp+8], C45
  320: B501 moveb ch, 1
  322: C9 leave
  323: FF2530671120 jmp [20116730] ; SYSTEM::RAW-FAST-BOX-DOUBLE
L7:
  329: DD45F8 fldl [ebp-8]
  332: DD5DE8 fstpl [ebp-18]
  335: E90DFFFFFF jmp L2
L8:
  340: DD45F8 fldl [ebp-8]
  343: DD5DF0 fstpl [ebp-10]
  346: E927FFFFFF jmp L3
  351: 90 nop
  352: 90 nop
  353: 90
From: Pierre THIERRY on 12 Aug 2006 16:49

On Sat, 12 Aug 2006 19:38:26 +0100, verec wrote:
> static double piOver2() {
> double sum = std::exp(-std::pow(-10.0, 2.0)) ;
> sum += std::exp(-std::pow(10.0, 2)) ;

For the sake of my curiosity, why do you calculate twice the same
number? 10^2 == (-10)^2

> But I just can't tell how long 10,000 iterations would
> take because I lost patience after 2 minutes.

That's very strange indeed. I can confirm it is pretty fast without any
optimization. I'm also working with SBCL. It gets from 5 seconds without
optimization down to 1.1 with speed 3 and anything else to 0.

Maybe there's a flaw in your compiler. Try it on another implementation.

Curiously,
Nowhere man
-- 
nowhere.man(a)levallois.eu.org
OpenPGP 0xD9D50D8A
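
A small sketch of the 10^2 == (-10)^2 point, written in Common Lisp since that is what the thread is tuning. It uses only the standard functions EXPT, EXP and REALPART; the helper names edge-term, edge-sum and squared-via-float-exponent are made up for illustration. With an integer exponent, EXPT on a float base stays real. With a float exponent and a negative base, an implementation may return a complex number, which is presumably why REALPART shows up in the posted Lisp code.

```lisp
;; Hypothetical helper: the e^-(x^2) endpoint term, using an integer exponent
;; so that a negative double-float base yields a real double-float.
(defun edge-term (x)
  (exp (- (expt x 2))))

;; Both endpoints give the same value, so one evaluation can simply be doubled.
(defun edge-sum ()
  (* 2 (edge-term 10.0d0)))

;; REALPART returns a real unchanged and drops the imaginary part of a complex,
;; so this form works whichever kind of number EXPT returns here.
(defun squared-via-float-exponent (x)
  (realpart (expt x 2.0d0)))
```

Evaluating (edge-term -10.0d0) and (edge-term 10.0d0) should give the same double-float, which is the duplication pointed out above.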
From: Wade Humeniuk on 12 Aug 2006 17:28

This version is better than the last

(defun pi/2-opt ()
  (declare (optimize (speed 3) (safety 0) (debug 0) (float 0)))
  (loop with sum of-type double-float =
          #.(realpart (+ (exp (- (expt -10.0d0 2.0d0)))
                         (exp (- (expt 10.0d0 2.0d0)))))
        for x of-type double-float from -9.99d0 below 10.0d0 by 0.01d0
        do (incf sum (* 2d0 (exp (- (* x x)))))
        finally (return (* sum 0.005d0))))

CL-USER 1 > (time (dotimes (i 10000) (pi/2-opt)))
Timing the evaluation of (DOTIMES (I 10000) (PI/2-OPT))
; Loading fasl file C:\Program Files\Xanalys\LispWorks\lib\4-3-0-0\modules\util\callcoun.fsl

user time = 1.231
system time = 0.000
Elapsed time = 0:00:01
Allocation = 164304 bytes standard / 113113 bytes conses
0 Page faults
Calls to %EVAL 10035
NIL

CL-USER 4 > (disassemble 'pi/2-opt)
; Loading fasl file C:\Program Files\Xanalys\LispWorks\lib\4-3-0-0\modules\util\disass.fsl
; Loading fasl file C:\Program Files\Xanalys\LispWorks\lib\4-3-0-0\patches\disassembler\0001\0001.fsl
; Loaded public patch DISASSEMBLER 1.1
; Loading fasl file C:\Program Files\Xanalys\LispWorks\lib\4-3-0-0\modules\memory\findptr.fsl
214EBE2A:
  0: 55 push ebp
  1: 89E5 move ebp, esp
  3: 83EC24 sub esp, 24
  6: C7042445240000 move [esp], 2445
  13: DD05500E6920 fldl [20690E50] ; 7.440151952041305E-44
  19: DD5DF0 fstpl [ebp-10]
  22: DD05300F6920 fldl [20690F30] ; -9.99
  28: DD55F8 fstl [ebp-8]
  31: DD05780F6920 fldl [20690F78] ; 10.0
  37: DED9 fcompp
  39: DFE0 fnstsw ax
  41: 9E sahf
  42: 7A5C jp L2
  44: 775A ja L2
L1:
  46: DD45F0 fldl [ebp-10]
  49: DD0570116920 fldl [20691170] ; 0.005
  55: DEC9 fmulp st(1), st
  57: FF75E4 push [ebp-1C]
  60: FF75E0 push [ebp-20]
  63: FF75DC push [ebp-24]
  66: 8B75E8 move esi, [ebp-18]
  69: 8975DC move [ebp-24], esi
  72: 8B75EC move esi, [ebp-14]
  75: 8975E0 move [ebp-20], esi
  78: 8B75F0 move esi, [ebp-10]
  81: 8975E4 move [ebp-1C], esi
  84: 8B75F4 move esi, [ebp-C]
  87: 8975E8 move [ebp-18], esi
  90: 8B75F8 move esi, [ebp-8]
  93: 8975EC move [ebp-14], esi
  96: 8B75FC move esi, [ebp-4]
  99: 8975F0 move [ebp-10], esi
  102: 8B7500 move esi, [ebp]
  105: 8975F4 move [ebp-C], esi
  108: 83ED0C sub ebp, C
  111: 8B7510 move esi, [ebp+10]
  114: 897504 move [ebp+4], esi
  117: DD5D0C fstpl [ebp+C]
  120: C74508450C0000 move [ebp+8], C45
  127: B501 moveb ch, 1
  129: C9 leave
  130: FF2530671120 jmp [20116730] ; SYSTEM::RAW-FAST-BOX-DOUBLE
L2:
  136: DD45F8 fldl [ebp-8]
  139: DC4DF8 fmull [ebp-8]
  142: D9E0 fchs
  144: D9E5 fxam
  146: DFE0 fnstsw ax
  148: 3500010000 xor eax, 100
  153: 9E sahf
  154: 760D jbe L3
  156: 7B23 jnp L4
  158: F6C402 testb ah, 2
  161: 741E je L4
  163: DDD8 fstp st(0)
  165: D9EE fldz
  167: EB18 jmp L4
L3:
  169: D9EA fld2e
  171: DEC9 fmulp st(1), st
  173: D9C0 fld st(0)
  175: D9FC frndint
  177: D9C0 fld st(0)
  179: D9CA fxch st(2)
  181: DEE1 fsubrp st(1), st
  183: D9F0 f2xm1
  185: D9E8 fld1
  187: DEC1 faddp st(1), st
  189: D9FD fscale
  191: DDD9 fstp st(1)
L4:
  193: DD0550106920 fldl [20691050] ; 2.0
  199: DEC9 fmulp st(1), st
  201: DC45F0 faddl [ebp-10]
  204: DD5DF0 fstpl [ebp-10]
  207: DD45F8 fldl [ebp-8]
  210: DD05C80F6920 fldl [20690FC8] ; 0.01
  216: DEC1 faddp st(1), st
  218: DD55F8 fstl [ebp-8]
  221: DD05780F6920 fldl [20690F78] ; 10.0
  227: DED9 fcompp
  229: DFE0 fnstsw ax
  231: 9E sahf
  232: 7A9E jp L2
  234: 779C ja L2
  236: E93DFFFFFF jmp L1
  241: 90 nop
NIL
From: verec on 12 Aug 2006 17:51

On 2006-08-12 21:49:58 +0100, Pierre THIERRY
<nowhere.man(a)levallois.eu.org> said:

> On Sat, 12 Aug 2006 19:38:26 +0100, verec wrote:
>> static double piOver2() {
>> double sum = std::exp(-std::pow(-10.0, 2.0)) ;
>> sum += std::exp(-std::pow(10.0, 2)) ;
>
> For the sake of my curiosity, why do you calculate twice the same
> number? 10^2 == (-10)^2

Well spotted! I guess this is only a mindless translation of mine
of the math formula pi/2 = integral[-10, 10, x](e^-(x^2)) :-(

>> But I just can't tell how long 10,000 iterations would
>> take because I lost patience after 2 minutes.
>
> That's very strange indeed. I can confirm it is pretty fast without any
> optimization. I'm also working with SBCL. It gets from 5 seconds without
> optimization down to 1.1 with speed 3 and anything else to 0.
>
> Maybe there's a flaw in your compiler. Try it on another implementation.

Still on the case :-) I'll report in due course.

> Curiously,
> Nowhere man

-- 
JFB
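
The quantity summed in this thread is a trapezoid-rule estimate of the integral quoted above, with step 0.01 over [-10, 10]. Here is a minimal, untuned sketch of it; gauss-integral and its keyword parameters n and h are names invented here, and an integer counter is assumed in place of the floating-point loop variable used in the posted code. For what it is worth, the closed form of the integral of e^-(x^2) over the whole real line is sqrt(pi), about 1.7724538509, which agrees with the 1.7724538509055174 printed earlier in the thread; pi/2 is about 1.5708.

```lisp
;; Trapezoid estimate of the integral of e^-(x^2) over [-n*h, n*h].
;; N and H are illustrative parameters, not part of the original code.
(defun gauss-integral (&key (n 1000) (h 0.01d0))
  (flet ((f (x) (exp (- (* x x)))))
    ;; Endpoint terms are counted once, interior points twice.
    (let ((sum (+ (f (* (- n) h)) (f (* n h)))))
      ;; Deriving x from an integer counter avoids accumulating
      ;; rounding error from repeatedly adding 0.01 to x.
      (loop for i from (- 1 n) to (- n 1)
            do (incf sum (* 2 (f (* i h)))))
      (* sum (/ h 2)))))
```

Comparing (gauss-integral) against (sqrt pi) is a quick sanity check on any optimized variant before timing it.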
From: Barry Margolin on 12 Aug 2006 17:54

In article <44de2997$0$639$5a6aecb4(a)news.aaisp.net.uk>,
verec <verec(a)mac.com> wrote:

> I thought (erroneously?) that (the float XYZ) would
> just drop the imaginary part, precisely because that's
> the purpose of a cast? Or should I use (coerce ... in
> addition?

THE is not a type-cast, it is a declaration that the value of the
expression is known a priori to be of the specified type. It allows the
compiler to perform type dispatching at compile time rather than at run
time, but it doesn't perform any conversion to ensure that the object is
of that type.

So you should write (the float (realpart xyz)).

-- 
Barry Margolin, barmar(a)alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
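
To make the distinction above concrete, here is a short sketch. The function names real-double and as-double are hypothetical, and the input to real-double is assumed to be a double-float or a complex whose parts are double-floats. REALPART does the actual work of dropping the imaginary component; THE merely tells the compiler what type the result already has.

```lisp
;; Hypothetical helper: extract the real part first, then declare its type.
;; THE only asserts that the value is already a double-float; it converts
;; nothing. Assumes Z is a double-float or a complex of double-floats.
(defun real-double (z)
  (the double-float (realpart z)))

;; For contrast, COERCE is an actual conversion: it builds a double-float
;; from any real number, such as an integer or a ratio.
(defun as-double (r)
  (coerce r 'double-float))
```

Writing (the double-float z) directly on a complex z would be a false declaration, and with (safety 0) the compiler is free to trust it without checking.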