[Openmcl-devel] Optimizing a stupid benchmark

Milan Jovanovic milanj at gmail.com
Thu May 4 16:55:06 PDT 2017


SBCL shouldn’t cons even if you are using full 64 bits (with proper
declaration) … except that bignum needs to be allocated as result.

On Fri, May 5, 2017 at 1:16 AM, Shannon Spires <svs at bearlanding.com> wrote:

> Very interesting. So let's normalize things. Looks like you're seeing
> about (/ 3.1 2147483647 <(214)%20748-3647>) = 1.4 ns per addition in SBCL
> and about (/ 48.3 2147483647 <(214)%20748-3647>) = 22.5 ns per addition
> in CCL.
>
> Note that your original total, 2305843005992468481, is a fixnum in SBCL
> but not in CCL.
>
> In CCL, most-positive-fixnum is (- (expt 2 60) 1).
> In SBCL, most-positive-fixnum is (- (expt 2 62) 1). So a fixnum is 2 bits
> larger in SBCL.
>
> Knowing that the sum of numbers from 0..n is (/ (* n (1+ n)) 2), and here
> n = 2147483646 <(214)%20748-3646>, we know how to limit the sum to be a
> fixnum. If we set the loop limit to (truncate (sqrt most-positive-fixnum))
> it's a quick and dirty way to be even more conservative than the true
> limit given by the quadratic formula.
>
> If I rewrite the CCL code such that sum is guaranteed to be a fixnum it's
> much faster:
>
> (defun sum-test-iterative ()
>   (declare (optimize speed (safety 0)))
>   (let ((sum 0))
>     (declare (fixnum sum))
>     (dotimes (i (truncate (sqrt most-positive-fixnum)))
>       (setf sum  (+ sum i)))
>   sum))
>
> ? (time (SUM-TEST-ITERATIVE))
> (SUM-TEST-ITERATIVE)
> took 5,061,389 microseconds (5.061389 seconds) to run.
> During that period, and with 4 available CPU cores,
>      5,077,467 microseconds (5.077467 seconds) were spent in user mode
>          6,632 microseconds (0.006632 seconds) were spent in system mode
> 576460751766552576
>
> This is about (/ 5.0 (truncate (sqrt most-positive-fixnum))) = 4.7 ns per
> addition.
>
> If I add two characters ( #. ), it becomes 2x faster still:
>
> (defun sum-test-iterative ()
>   (declare (optimize speed (safety 0)))
>   (let ((sum 0))
>     (declare (fixnum sum))
>     (dotimes (i #.(truncate (sqrt most-positive-fixnum)))  ; (declare
> (fixnum i)) works here too, without the #.
>       (setf sum  (+ sum i)))
>   sum))
>
> ? (time (SUM-TEST-ITERATIVE))
> (SUM-TEST-ITERATIVE)
> took 2,342,141 microseconds (2.342141 seconds) to run.
> During that period, and with 4 available CPU cores,
>      2,355,535 microseconds (2.355535 seconds) were spent in user mode
>          2,729 microseconds (0.002729 seconds) were spent in system mode
> 576460751766552576
>
> This is about (/ 2.3 (truncate (sqrt most-positive-fixnum))) = 2.1 ns per
> addition.
>
> The last function above in SBCL:
> * (time (sum-test-iterative))
>
> Evaluation took:
>   1.182 seconds of real time
>   1.180927 seconds of total run time (1.179539 user, 0.001388 system)
>   99.92% CPU
>   3,184,332,795 processor cycles
>   0 bytes consed
>
> 2305843008139952128
>
> This is about (/ 1.18 (truncate (sqrt most-positive-fixnum))) = 0.5 ns per
> addition which is close to one add per clock cycle on my 2.7 GHz machine.
> This is C-level speed, but it's only 4x faster than CCL if we keep
> everybody in the fixnum domain. I'd still like to know why CCL is 4x
> slower, but at least now we're comparing apples-to-apples.
>
> -SS
>
> On May 4, 2017, at 2:26 PM, Jonathan Fischer wrote:
>
> This is a stupid thing to test, I was just curious about it, and I’m
> wondering what obvious thing it is I’m doing wrong to get really bad
> results.
>
> For starters, a silly little function to sum up an arithmetic series:
>
> (defun sum-test-iterative ()
>   (let ((sum 0))
>     (dotimes (i 2147483647 <(214)%20748-3647>)
>       (setf sum (+ sum i)))
>     sum))
>
> In ClozureCL 1.11, this is horribly slow and conses like crazy:
>
> ? (time (sum-test-iterative))
> (SUM-TEST-ITERATIVE)
> took 62,875,836 microseconds (62.875835 seconds) to run.
>       2,818,179 microseconds ( 2.818179 seconds, 4.48%) of which was spent
> in GC.
> During that period, and with 4 available CPU cores,
>      55,949,019 microseconds (55.949020 seconds) were spent in user mode
>       7,153,305 microseconds ( 7.153305 seconds) were spent in system mode
>  20,127,468,688 bytes of memory allocated.
>  10,397 minor page faults, 8 major page faults, 0 swaps.
> 2305843005992468481
>
> SBCL 1.3.17 does much better:
>
> * (time (sum-test-iterative))
>
> Evaluation took:
>   7.741 seconds of real time
>   7.677832 seconds of total run time (7.594504 user, 0.083328 system)
>   99.19% CPU
>   18,536,585,073 processor cycles
>   0 bytes consed
>
> 2305843005992468481
>
> If I sprinkle in some declarations I can get SBCL down to a bit over 3
> seconds, but ClozureCL’s still pretty bad:
>
> (defun sum-test-iterative ()
>   (declare (optimize speed (safety 0)))
>   (let ((sum 0))
>     (declare ((signed-byte 64) sum))
>     (dotimes (i 2147483647 <(214)%20748-3647>)
>       (setf sum (the (signed-byte 64) (+ sum i))))
>     sum))
>
> ? (time (sum-test-iterative))
> (SUM-TEST-ITERATIVE)
> took 48,308,393 microseconds (48.308390 seconds) to run.
>       2,875,113 microseconds ( 2.875113 seconds, 5.95%) of which was spent
> in GC.
> During that period, and with 4 available CPU cores,
>      42,849,312 microseconds (42.849310 seconds) were spent in user mode
>       6,040,092 microseconds ( 6.040092 seconds) were spent in system mode
>  20,127,468,707 bytes of memory allocated.
>  14,039 minor page faults, 1 major page faults, 0 swaps.
> 2305843005992468481
>
> * (time (sum-test-iterative))
>
> Evaluation took:
>   3.267 seconds of real time
>   3.248668 seconds of total run time (3.228591 user, 0.020077 system)
>   99.45% CPU
>   7,821,424,512 processor cycles
>   0 bytes consed
>
> 2305843005992468481
>
> How can I help ClozureCL out here? Both of these are running 64-bit on
> macOS, btw.
> _______________________________________________
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
> https://lists.clozure.com/mailman/listinfo/openmcl-devel
>
>
>
> _______________________________________________
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
> https://lists.clozure.com/mailman/listinfo/openmcl-devel
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.clozure.com/pipermail/openmcl-devel/attachments/20170505/48e9992b/attachment.htm>


More information about the Openmcl-devel mailing list