[Openmcl-devel] Why is with-output-to-string so slow?
Ron Garret
ron at flownet.com
Wed Oct 27 17:10:39 PDT 2010
I guess I didn't state my question clearly enough. Yes, obviously opening a zillion file streams is going to be slow because you have to interact with the OS. But to create a string output stream you don't have to interact with the OS. Why is creating a string output stream so much slower than allocating an adjustable vector with a fill pointer?
rg
On Oct 27, 2010, at 4:37 PM, Gary Byers wrote:
>
>
> On Wed, 27 Oct 2010, Ron Garret wrote:
>
>> WITH-OUTPUT-TO-STRING seems to have an inordinate amount of overhead. Baseline benchmark:
>>
>> ? (setf devnull (open "/dev/null" :direction :output :if-exists :overwrite))
>> #<BASIC-CHARACTER-OUTPUT-STREAM ISO-8859-1 (CHARACTER-SPECIAL/12) #x30200593DFCD>
>> ? (time (dotimes (i 1000000) (princ i devnull)))
>> ;Compiler warnings :
>> ; In an anonymous lambda form at position 36: Undeclared free variable DEVNULL
>> (DOTIMES (I 1000000) (PRINC I DEVNULL)) took 1,054,338 microseconds (1.054338 seconds) to run
>>
>>
>> Capturing the output in strings is almost four times slower:
>>
>> ? (time (dotimes (i 1000000) (with-output-to-string (s) (princ i s))))
>> (DOTIMES (I 1000000) (WITH-OUTPUT-TO-STRING (S) (PRINC I S))) took 3,952,464 microseconds (3.952464 seconds) to run
>>
>
> Well, creating 1000000 STRING-OUTPUT-STREAMs and writing an integer to
> each of them is about 4x slower than creating a single FILE-STREAM to
> /dev/null and writing 1000000 integers to it.
>
> To compare apples to apples, we can compare the time it takes to create
> 1000000 FILE-STREAMs and write an integer to each of them. On a 2.2GHz
> Core 2 Duo running FreeBSD, I get:
>
> ? (time (dotimes (i 1000000) (with-open-file (s "/dev/null" :direction :output :if-exists :overwrite) (princ i s))))
> (DOTIMES (I 1000000) (WITH-OPEN-FILE (S "/dev/null" :DIRECTION :OUTPUT :IF-EXISTS :OVERWRITE) (PRINC I S))) took 128,002,346 microseconds (128.002350 seconds) to run
> with 2 available CPU cores.
>
> Not surprisingly, creating and using a FILE-STREAM is fairly expensive
> (even if writes to that stream are NOPs, we still have to persuade the
> OS to open and close a file descriptor 1000000 times.)
>
> Comparing oranges to oranges is a little more complicated. One complicating
> factor is that printing integers via PRINC conses. (I don't know why and I'm
> not sure that it should, but printing lots of integers conses a lot). As in:
>
> ? (defvar *devnull* (open "/dev/null" :direction :output :if-exists :overwrite))
>
> *DEVNULL*
> ? (time (dotimes (i 1000000) (princ i *devnull*)))
> (DOTIMES (I 1000000) (PRINC I *DEVNULL*)) took 2,327,625 microseconds (2.327625 seconds) to run
> with 2 available CPU cores.
> During that period, 2,245,475 microseconds (2.245475 seconds) were spent in user mode
> 84,842 microseconds (0.084842 seconds) were spent in system mode
> 376,831 microseconds (0.376831 seconds) was spent in GC.
> 256,000,544 bytes of memory allocated.
> 97 minor page faults, 0 major page faults, 0 swaps.
> NIL
>
> That's around 256 bytes per integer. Compare that to:
>
> ? (time (dotimes (i 1000000) (write-char #\space *devnull*)))
> (DOTIMES (I 1000000) (WRITE-CHAR #\ *DEVNULL*)) took 99,138 microseconds (0.099138 seconds) to run
> with 2 available CPU cores.
> During that period, 76,754 microseconds (0.076754 seconds) were spent in user mode
> 12 microseconds (0.000012 seconds) were spent in system mode
> 0 minor page faults, 1 major page faults, 0 swaps.
>
> Compared to:
>
> ? (time (progn (with-output-to-string (s) (dotimes (i 1000000) (write-char #\space s))) nil))
> (PROGN (WITH-OUTPUT-TO-STRING (S) (DOTIMES (I 1000000) (WRITE-CHAR #\ S))) NIL) took 122,553 microseconds (0.122553 seconds) to run
> with 2 available CPU cores.
> During that period, 122,552 microseconds (0.122552 seconds) were spent in user mode
> 0 microseconds (0.000000 seconds) were spent in system mode
> 26,528 microseconds (0.026528 seconds) was spent in GC.
> 14,486,432 bytes of memory allocated.
> NIL
>
> Notice that we're still consing a lot here, because:
>
> a) this form of WITH-OUTPUT-TO-STRING returns a 1000000-element string (which
> the form above ignores, to keep Emacs from trying to print it ...)
> b) the internal string used by the STRING-OUTPUT-STREAM probably has to grow
> several times before it's capable of holding 10000000 characters (where "grow"
> means "be replaced with a larger string".) We try to recycle these strings
> in many cases; the GC tries to limit this recycling. (Note also that the
> time difference between this case and writing to /dev/null is almost identical
> to the GC time.)
>
> If you do something like:
>
> ? (defvar *s* (make-string-output-stream))
> *S*
> ? (time (dotimes (i 1000000) (write-char #\space *s*)))
> ;;; Don't care here, just want the string in *s* to "grow".
>
> ? (progn (get-output-stream-string *s*) ; "reset" the stream
> (time (dotimes (i 1000000) (write-char #\space *s*))))
>
> there shouldn't be any consing and the times are almost identical to the
> "writing to /dev/null" case (a few hundred NOP syscalls vs some presumably
> greater number of cache misses.)
>
> As one would expect.
>
>
>>
>> Simply allocating the extra storage does not account for the difference:
>>
>> ? (defun foo () (make-array 10 :element-type 'character :fill-pointer t :adjustable t))
>> FOO
>> ? (time (dotimes (i 1000000) (foo) (princ i devnull)))
>> ;Compiler warnings :
>> ; In an anonymous lambda form at position 42: Undeclared free variable DEVNULL
>> (DOTIMES (I 1000000) (FOO) (PRINC I DEVNULL)) took 1,453,295 microseconds (1.453295 seconds) to run
>>
>>
>> What accounts for all the extra overhead? Is there any way to speed it up?
>>
>> Thanks,
>> rg
>>
>> _______________________________________________
>> Openmcl-devel mailing list
>> Openmcl-devel at clozure.com
>> http://clozure.com/mailman/listinfo/openmcl-devel
>>
>>
More information about the Openmcl-devel
mailing list