[Openmcl-devel] Transfer the contents of a file to a tcp stream
Gary Byers
gb at clozure.com
Tue May 3 14:22:52 PDT 2005
On Tue, 3 May 2005 rm at fabula.de wrote:
> On Tue, May 03, 2005 at 09:07:26PM +0100, Richard Newman wrote:
>> Just a thought -- the calculated speed of a transfer includes the setup
>> and tear-down times for the connection, as far as I can see.
>>
>> I don't know how big that file is, but try it with one that's 50MB or
>> more, so that the transfer time becomes more significant. It might be
>> that you're optimising the wrong part of your program.
>>
>> Failing that... profile!
>>
>> -R
>
> Well, even without the setup/teardown this OpenMCL server will hardly
> ever reach Apache (BTW, how would teardown affect the measured time?).
> When sending _files_ (not content from memory) Apache tries hard to
> use sendfile(2) or similar techniques (depending on the underlying OS).
> When sendfile is used the kernel is instructed to copy (send) a file directly
> from one file descriptor to another. First of all, with this trick data
> never has to cross the (expensive) border between kernel space and userspace
> (as opposed to crossing it twice in the OpenMCL sample code). On a reasonably
> smart OS that data might never even pass through the kernel but travel via DMA from
> disk to network card.
> Of course, since OpenMCL has a nice FFI interface it would be trivial to
> call sendfile from it .... but that would be cheatin' :-)
>
>
> Cheers Ralf Mattes
Hmm. I was just about to recommend cheating (using sendfile) if it's
available (it's available on Linux, but doesn't seem to be there on
Darwin/OSX).
Apache seems to try to use sendfile if it's available, and apparently
tries to partially emulate it if it isn't. (I'm not sure what's
involved in that emulation; at a minimum, it'd probably involve having
the file and socket share a single userspace buffer.)
If Apache's cheating (doing something other than an #_fread/#_fwrite
loop), it seems pretty reasonable for Christan's server to do so as
well. I imagine that a lot of effort's been put into getting Apache
to cheat efficiently, and that it'd be worth looking into exactly
how it does that.
If one has one's hands on a CL STREAM s, the generic function
(CCL::STREAM-DEVICE s direction) - where "direction" is one of
:INPUT, :OUTPUT, or NIL - returns the underlying file descriptor.
(Yes, this should be exported and documented.) You can cheat a bit
by doing something like:
(defun copy-file-to-socket (file-stream socket-stream)
  (let* ((file-fd (ccl::stream-device file-stream :input))
         (socket-fd (ccl::stream-device socket-stream :output))
         (bufsize 8192))                ; arbitrary
    (force-output socket-stream)        ; flush any buffered output
    (%stack-block ((buf bufsize))
      ;; You could just say (#_lseek ...) here, but Linux's
      ;; #_lseek may have difficulty with large file offsets.
      (ccl::fd-lseek file-fd 0 #$SEEK_SET)
      (loop
        (let* ((nread (#_read file-fd buf bufsize)))
          (cond ((zerop nread) (return))
                ((minusp nread) (error ...))
                (t (let* ((nwritten (#_write socket-fd buf nread)))
                     (when (< nwritten nread)
                       ;; Handle partial writes, errors
                       )))))))))
That's an artist's conception and may be a little buggy, but using
a single buffer and bypassing most of the buffered stream overhead
might get things -closer- to Apache's performance.
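
The "handle partial writes, errors" part of the Lisp sketch above is left open. One common way to fill it in, shown here in C against the same read/write calls that #_read and #_write reach, is to keep writing from where the last write stopped until the whole chunk is out. This is only a sketch under assumptions: the names write_all and copy_fd are made up, the descriptors are assumed to be blocking, and the only error treated specially is EINTR.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write exactly len bytes from buf to fd,
   resuming after each partial write. Returns 0 on success, -1 on error. */
int write_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;          /* partial write: advance and go again */
    }
    return 0;
}

/* Hypothetical helper: copy everything readable from in_fd to out_fd
   through a single buffer. Returns 0 at end of file, -1 on error. */
int copy_fd(int in_fd, int out_fd)
{
    char buf[8192];                 /* arbitrary, as in the Lisp sketch */
    for (;;) {
        ssize_t nread = read(in_fd, buf, sizeof buf);
        if (nread == 0)
            return 0;               /* end of file */
        if (nread < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (write_all(out_fd, buf, (size_t)nread) < 0)
            return -1;
    }
}
```

In the Lisp version the same loop would advance the buffer pointer by nwritten and retry with the remaining byte count.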