[Openmcl-devel] Transfer the contents of a file to a tcp stream

Gary Byers gb at clozure.com
Tue May 3 14:22:52 PDT 2005



On Tue, 3 May 2005 rm at fabula.de wrote:

> On Tue, May 03, 2005 at 09:07:26PM +0100, Richard Newman wrote:
>> Just a thought -- the calculated speed of a transfer includes the setup
>> and tear-down times for the connection, as far as I can see.
>>
>> I don't know how big that file is, but try it with one that's 50MB or
>> more, so that the transfer time becomes more significant. It might be
>> that you're optimising the wrong part of your program.
>>
>> Failing that... profile!
>>
>> -R
>
> Well, even without the setup/teardown this OpenMCL code will hardly
> ever reach Apache (BTW, how would teardown affect the measured time?).
> When sending _files_ (not content from memory) Apache tries hard to
> use sendfile(2) or similar techniques (depending on the underlying OS).
> When sendfile is used the kernel is instructed to copy (send) a file directly
> from one file descriptor to another. First of all, with this trick data
> never has to cross the (expensive) border between kernel space and userspace
> (as opposed to crossing it twice in the OpenMCL sample code). On a reasonably
> smart OS the data might never even pass through the kernel, but travel via
> DMA from disk to network card.
> Of course, since OpenMCL has a nice FFI interface it would be trivial to
> call sendfile from it .... but that would be cheatin' :-)
>
>
> Cheers Ralf Mattes


Hmm.  I was just about to recommend cheating (using sendfile) if it's
available (it's available on Linux, but doesn't seem to be there on
Darwin/OSX).

Apache seems to try to use sendfile if it's available, and apparently
tries to partially emulate it if it isn't.  (I'm not sure what's
involved in that emulation; at a minimum, it'd probably involve having
the file and socket share a single userspace buffer.)

If Apache's cheating (doing something other than an #_fread/#_fwrite
loop), it seems pretty reasonable for Christan's server to do so as
well.  I imagine that a lot of effort's been put into getting Apache
to cheat efficiently, and that it'd be worth looking into exactly
how it does that.

If one has one's hands on a CL STREAM s, the generic function
(CCL::STREAM-DEVICE s direction) - where "direction" is one of
:INPUT, :OUTPUT, or NIL - returns the underlying file descriptor.
(Yes, this should be exported and documented.)  You can cheat a bit
by doing something like:

(defun copy-file-to-socket (file-stream socket-stream)
   (let* ((file-fd (ccl::stream-device file-stream :input))
          (socket-fd (ccl::stream-device socket-stream :output))
          (bufsize 8192)) ; arbitrary
     (force-output socket-stream) ; flush any buffered output
     (%stack-block ((buf bufsize))
       ;; You could just say (#_lseek ...) here, but Linux's
       ;; #_lseek may have difficulty with large file offsets.
       (ccl::fd-lseek file-fd 0 #$SEEK_SET)
       (loop
         (let* ((nread (#_read file-fd buf bufsize)))
           (cond ((zerop nread) (return)) ; end of file
                 ((minusp nread) (error "read failed on fd ~d" file-fd))
                 (t
                  ;; #_write may accept fewer bytes than requested, so
                  ;; keep writing until all NREAD bytes have gone out.
                  (let* ((done 0))
                    (loop
                      (when (= done nread) (return))
                      (let* ((nwritten (#_write socket-fd
                                                (%inc-ptr buf done)
                                                (- nread done))))
                        (when (minusp nwritten)
                          (error "write failed on fd ~d" socket-fd))
                        (incf done nwritten))))))))))))

That's an artist's conception and may be a little buggy, but using
a single buffer and bypassing most of the buffered stream overhead
might get things -closer- to Apache's performance.
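
For comparison, here is a rough sketch of the same single-buffer idea
done at the stream level in portable Common Lisp. It doesn't bypass any
of the buffered stream overhead, so it's probably only useful as a
baseline to measure the fd-level version against. The name COPY-OCTETS
and the :BUFSIZE keyword are made up for illustration, and it assumes
both streams have element type (UNSIGNED-BYTE 8).

```lisp
;; Portable baseline: copy everything from IN to OUT through one
;; reusable octet buffer.  Both streams are assumed to have element
;; type (UNSIGNED-BYTE 8).  Returns the number of octets copied.
(defun copy-octets (in out &key (bufsize 8192))
  (let ((buffer (make-array bufsize :element-type '(unsigned-byte 8)))
        (total 0))
    (loop
      ;; READ-SEQUENCE returns the index of the first element it did
      ;; not fill; 0 means we are at end of file.
      (let ((end (read-sequence buffer in)))
        (when (zerop end) (return))
        (write-sequence buffer out :end end)
        (incf total end)))
    (force-output out)
    total))
```

Timing this against COPY-FILE-TO-SOCKET on the same large file should
give some idea of how much of the cost is stream overhead and how much
is the copying itself.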



