[Openmcl-devel] Faster Formatting for fun (the OpenMCL version)
Gary Byers
gb at clozure.com
Mon Jun 7 10:13:48 PDT 2004
On Mon, 7 Jun 2004, Gary King wrote:
> I've posted a similar question to MCL's list but this is the slightly
> different OpenMCL version.
>
> I'm generating a whole bunch of SQL statements mostly of the form
> "INSERT INTO FOO (...) VALUES (...);". Format is taking up about a
> third of my processing time so I was hoping to speed things up by using
> lower level calls. My attempts to do this, however, have failed
> miserably. Below I show a format statement that does what I want and
> then another bit of code that does the "same" thing using princ
> instead. The trouble is that the princ method is actually slower
> (though it appears not to cons at all). Note that the MCL data showed a
> much larger slow down and lots and lots of consing. Any ideas / help /
> commiseration would be appreciated.
>
This is one of many cases where a profiling/sampling utility would be
helpful: I can tell you what my intuition is, but one's intuition is
often wrong ...
Most writes to a file stream involve copying some bytes into a buffer
and adjusting some related pointers. (A few such calls also involve
doing I/O to the file system, but that's basically constant in the
two approaches you're looking at.)
In OpenMCL, all (well, most ...) streams are assumed to be globally
accessible, so all transactions on potentially shared streams involve
locking. In cases where there's no contention involved (i.e., most or
all of the time), locking isn't incredibly expensive in and of itself,
but the locking/unlocking has to be UNWIND-PROTECTed, and
UNWIND-PROTECT's fairly expensive in OpenMCL (for threading-related
reasons). The actual "store some bytes in a buffer and adjust some
pointers" operation isn't reentrant, so it's also within a
WITHOUT-INTERRUPTS (which involves another UNWIND-PROTECT.)
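To make the shape of that per-call overhead concrete, here is a minimal
sketch in portable Common Lisp. It is not OpenMCL's stream code: TOY-LOCK,
TOY-STREAM and TOY-WRITE-STRING are hypothetical stand-ins that only count
lock acquisitions, and no real locking or interrupt masking takes place.
The point is that every call pays one acquire/UNWIND-PROTECT/release
round trip, however few characters it copies.

```lisp
;; Hypothetical stand-ins so the sketch runs in any Common Lisp.
;; A real implementation would use its own lock primitives here.
(defstruct toy-lock
  (held nil)
  (acquisitions 0))

(defun acquire-toy-lock (lock)
  (setf (toy-lock-held lock) t)
  (incf (toy-lock-acquisitions lock)))

(defun release-toy-lock (lock)
  (setf (toy-lock-held lock) nil))

(defstruct toy-stream
  (lock (make-toy-lock))
  (buffer (make-array 64 :element-type 'character
                         :adjustable t :fill-pointer 0)))

(defun toy-write-string (string stream)
  "Copy STRING into STREAM's buffer while holding the stream's lock."
  (let ((lock (toy-stream-lock stream)))
    (acquire-toy-lock lock)
    ;; The cleanup form releases the lock even on a non-local exit.
    (unwind-protect
         (loop for ch across string
               do (vector-push-extend ch (toy-stream-buffer stream)))
      (release-toy-lock lock)))
  string)
```

Two short writes cost two lock round trips; one longer write carrying the
same characters would cost one:

```lisp
(let ((s (make-toy-stream)))
  (toy-write-string "INSERT INTO " s)
  (toy-write-string "HATS" s)
  (toy-lock-acquisitions (toy-stream-lock s)))   ; => 2
```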
It's generally the case that something like WRITE-STRING or
WRITE-SEQUENCE is likely to be faster than an equivalent sequence
of WRITE-CHAR calls; this is probably more pronounced in OpenMCL than
in other implementations because of this locking/WITHOUT-INTERRUPTS/
UNWIND-PROTECT overhead.
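One way to check this on a given implementation is to time both styles
against the same file. The sketch below uses only standard Common Lisp;
the function names, the sample text and the default iteration count are
assumptions made for illustration, and no timings are claimed here.

```lisp
(defun emit-with-write-char (string stream)
  "One stream call per character of STRING."
  (loop for ch across string
        do (write-char ch stream)))

(defun emit-with-write-string (string stream)
  "One stream call for the whole of STRING."
  (write-string string stream))

(defun time-both (pathname
                  &key (iterations 3000)
                       (text "INSERT INTO HATS (A, B, C) VALUES ('1', '2', '3');"))
  "Run TIME on each emitter writing TEXT to PATHNAME, ITERATIONS times."
  (loop for (label . emit) in (list (cons "WRITE-CHAR" #'emit-with-write-char)
                                    (cons "WRITE-STRING" #'emit-with-write-string))
        do (with-open-file (s pathname :direction :output
                                       :if-exists :supersede
                                       :if-does-not-exist :create)
             (format *trace-output* "~&~A version:~%" label)
             (time (loop repeat iterations
                         do (terpri s)
                            (funcall emit text s))))))
```

In OpenMCL this could be run as (time-both "ccl:foo.temp"), reusing the
file name from the test cases quoted below.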
In your test cases, FORMAT does a fair amount of work (and some
consing) to interpret and process its arguments, then probably
writes a relatively small number of strings to the stream; the
PRINC version doesn't do much computation (or any consing), but writes
a larger number of shorter strings and pays the per-transaction overhead
more often.
If this intuition is at all accurate, then the performance of the
PRINC-based version could be improved by reducing the number of
WRITE-STRING (or GRAY:STREAM-WRITE-STRING) calls. (If this is correct,
then it would also be worth exploring ways of avoiding the overhead: I
would guess that most streams are thread-private and the reentrancy
issue's a lot more pronounced for things like *TERMINAL-IO* than for
things created with WITH-OPEN-FILE.)
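A minimal sketch of that idea, assuming it is acceptable to assemble each
statement in a reusable string buffer and hand it to the stream with a
single WRITE-STRING. APPEND-STRING, BUILD-INSERT and WRITE-INSERTS are
hypothetical names, not part of OpenMCL. Unlike the PRINC version,
PRINC-TO-STRING conses, so whether this comes out ahead is something to
measure rather than assume. Note also that the quoted PRINC loops test
(REST VARS) rather than (REST VAR), so they write a separator after the
last element as well; the sketch tests the remaining sublist instead.

```lisp
(defun append-string (string buffer)
  "Append STRING to BUFFER, an adjustable string with a fill pointer."
  (loop for ch across string
        do (vector-push-extend ch buffer))
  buffer)

(defun build-insert (buffer table columns values)
  "Reset BUFFER and fill it with a newline plus one INSERT statement."
  (setf (fill-pointer buffer) 0)
  (vector-push-extend #\Newline buffer)
  (append-string "INSERT INTO " buffer)
  (append-string (princ-to-string table) buffer)
  (append-string " (" buffer)
  (loop for (column . more) on columns
        do (append-string (princ-to-string column) buffer)
        when more do (append-string ", " buffer))
  (append-string ") VALUES (" buffer)
  (loop for (value . more) on values
        do (append-string "'" buffer)
           (append-string (princ-to-string value) buffer)
           (append-string "'" buffer)
        when more do (append-string ", " buffer))
  (append-string ");" buffer)
  buffer)

(defun write-inserts (stream table columns values &key (repeat 3000))
  "Write REPEAT statements to STREAM, one WRITE-STRING call per statement."
  (let ((buffer (make-array 128 :element-type 'character
                                :adjustable t :fill-pointer 0)))
    (loop repeat repeat
          do (build-insert buffer table columns values)
             (write-string buffer stream))))
```

Inside the WITH-OPEN-FILE form of the quoted test this would be called as
(time (write-inserts s 'hats '(a b c d e) (list 1 2 3 4 5))).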
(Of course, it may be that the problem's something else entirely ...)
> ? (declaim (optimize (speed 3) (safety 0) (debug 0)))
> NIL
> ? (progn
>     (when (probe-file "ccl:foo.temp")
>       (delete-file "ccl:foo.temp"))
>     (with-open-file (s "ccl:foo.temp" :direction :output
>                        :if-does-not-exist :create)
>       (time (loop repeat 3000 do
>                   (format s "~%INSERT INTO ~A (~{~A~^, ~}) ~
>                              VALUES (~{'~A'~^, ~});"
>                           'HATS
>                           '(a b c d e)
>                           (list 1 2 3 4 5))))))
> (LOOP REPEAT 3000 DO (FORMAT S "~%INSERT INTO ~A (~{~A~^, ~}) ~
> VALUES (~{'~A'~^, ~});" 'HATS '(A B C D E)
> (LIST 1 2 3 4 5))) took 1,088 milliseconds (1.088 seconds) to run.
> Of that, 620 milliseconds (0.620 seconds) were spent in user mode
> 30 milliseconds (0.030 seconds) were spent in system mode
> 438 milliseconds (0.438 seconds) were spent executing other OS
> processes.
> 120,288 bytes of memory allocated.
> NIL
> ? (progn
>     (when (probe-file "ccl:foo.temp")
>       (delete-file "ccl:foo.temp"))
>
>     (with-open-file (s "ccl:foo.temp" :direction :output
>                        :if-does-not-exist :create)
>       (let* ((vars '(a b c d e))
>              (vals (list 1 2 3 4 5)))
>         (time (loop repeat 3000 do
>                     (terpri s)
>                     (princ "INSERT INTO " s)
>                     (princ 'hats s)
>                     (princ " (" s)
>                     (loop for var = vars then (rest var)
>                           while var
>                           do (princ (first var) s)
>                           when (rest vars) do (princ ", " s))
>                     (princ ") VALUES (" s)
>                     (loop for var = vals then (rest var)
>                           while var
>                           do (princ "'" s) (princ (first var) s) (princ "'" s)
>                           when (rest vars) do (princ ", " s))
>                     (princ ");" s))))))
> (LOOP REPEAT 3000 DO (TERPRI S) (PRINC "INSERT INTO " S) (PRINC 'HATS
> S) (PRINC " (" S) (LOOP FOR VAR = VARS THEN (REST VAR) WHILE VAR DO
> (PRINC (FIRST VAR) S) WHEN (REST VARS) DO (PRINC ", " S)) (PRINC ")
> VALUES (" S) (LOOP FOR VAR = VALS THEN (REST VAR) WHILE VAR DO (PRINC
> "'" S) (PRINC (FIRST VAR) S) (PRINC "'" S) WHEN (REST VARS) DO (PRINC
> ", " S)) (PRINC ");" S)) took 1,674 milliseconds (1.674 seconds) to
> run.
> Of that, 1,170 milliseconds (1.170 seconds) were spent in user mode
> 30 milliseconds (0.030 seconds) were spent in system mode
> 474 milliseconds (0.474 seconds) were spent executing other OS
> processes.
> NIL
> ?
>
>
> --
> Gary Warren King, Lab Manager
> EKSL East, University of Massachusetts * 413 577 0176
>
> Power is actualized only where word and deed have not parted company,
> where words are not empty and deeds not brutal, where words are not
> used to veil intentions but to disclose realities, and deeds are not
> used to violate and destroy but to establish relations and create new
> realities.
> -- Hannah Arendt
>
> _______________________________________________
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
> http://clozure.com/mailman/listinfo/openmcl-devel
>
>