[Openmcl-devel] Read all available input from socket stream?

Wed Jul 24 17:16:00 PDT 2013

Hopefully, this answers Zach's question as well.  The general answer is
that you can likely read whatever data's available on a socket without
blocking, but:

  a) it involves using CCL-specific primitives
  b) some of those primitives are OS-specific as well
  c) when dealing with a CCL stream, you can have "unread" data that's
     sitting in the stream's buffer, "unread" data that the OS says
     can be read immediately, and "unread" data that the other end of
     the socket hasn't sent yet (or that hasn't been received yet.)  Off
     the top of my head, I don't remember whether there's a fit-for-public-
     consumption way of determining whether or not a stream has buffered
     data available.  If there isn't, there easily could be.

If we somehow know that a stream's input buffer is empty and if we don't
care about running on Windows or FreeBSD, we can obtain the stream's input
file descriptor via:

(ccl:stream-device stream :input)

and if that returns a non-nil result (as it will for an open TCP stream)
call CCL::UNREAD-DATA-AVAILABLE-P on that file descriptor.

On Unix-based systems other than FreeBSD, CCL::UNREAD-DATA-AVAILABLE-P
will either return NIL or a positive integer N, and in the latter case you
should be able to do (e.g.) (READ-SEQUENCE large-enough stream :START 0 :END N)
without blocking and be confident that that reads N octets.
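
As a rough illustration of the two steps above, here is a minimal sketch.
READ-AVAILABLE-OCTETS is a made-up name, not part of CCL.  It assumes that
the stream's own input buffer is already empty, that the stream accepts
octet reads, and that we're on a Unix other than FreeBSD.  It relies on the
internal CCL::UNREAD-DATA-AVAILABLE-P, so treat it as untested guidance
rather than a supported interface.

```lisp
;; Sketch only: READ-AVAILABLE-OCTETS is a hypothetical helper, not a CCL API.
;; Assumes STREAM's own input buffer is empty (this code cannot check that).
(defun read-available-octets (stream)
  "Return a vector of the octets the OS reports as immediately readable
on STREAM, or NIL if it reports none (which may mean EOF or just
'nothing yet')."
  (let* ((fd (ccl:stream-device stream :input))
         ;; Internal function: NIL or a positive integer count.
         (n (and fd (ccl::unread-data-available-p fd))))
    (when n
      (let ((buffer (make-array n :element-type '(unsigned-byte 8))))
        ;; N octets are already available, so this should not block.
        (read-sequence buffer stream)
        buffer))))
```

A NIL result here carries the same ambiguity as the underlying call: the
caller can't tell EOF from "no data right now."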

If I remember correctly, the underlying mechanism only worked for sockets
on FreeBSD (and not on other kinds of file descriptors), and if there's a way
to determine how much unread data exists in the OS for an arbitrary Windows
file handle I have no idea what that'd be.

There's a related question, namely "is it possible to read from this fd
now without blocking ?", and the answer to that question almost always
is "that depends on how much you want to read, as well as other factors."
You can (on Unix variants) put a file descriptor in non-blocking mode
and this will cause attempts to read more data than is immediately available
to return a distinguished error result, but that leaves you having to determine
how much unread data is available via binary search or by reading one byte
at a time.  (It's also the case that whatever support still exists for doing
non-blocking I/O on CCL streams is mostly leftover from the days of cooperative
threads; there are probably a few cases where non-blocking I/O would otherwise
be useful, but I don't feel that people are missing too much by not being
able to do some things with non-blocking CCL streams.)

Note also that if CCL::UNREAD-DATA-AVAILABLE-P returns NIL, there's no way
to tell from that result alone whether that's because the fd is at EOF or
whether "there's no data available now, but there might be later" if that
concept applies.

So ... I don't remember whether there's a current way of determining whether
a buffered input stream has buffered input; if there isn't, it'd be fairly easy
to define.  You can find out whether the OS has buffered input ready (and if so,
how much is available) if the OS provides a way of determining this and most
OSes that CCL runs on do so.  The internal function that tries to provide this
information hasn't changed in a long time and isn't likely to disappear soon,
but I don't want to promise that it won't change or disappear in the future if
there's good reason to change or remove it.
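
For the delimiter-terminated protocol described in the quoted message below,
none of the CCL-specific machinery is needed: a blocking READ-BYTE loop that
stops at the end-of-message octet works in portable Common Lisp.  The
following is a minimal sketch.  READ-MESSAGE, its DELIMITER argument and the
MAX-LENGTH limit are hypothetical names chosen for illustration, and it
assumes the stream accepts READ-BYTE (i.e. is binary or bivalent).

```lisp
;; Sketch only: READ-MESSAGE, DELIMITER and MAX-LENGTH are illustrative names.
(defun read-message (stream delimiter &key (max-length 4096))
  "Read octets from STREAM up to and including the octet DELIMITER.
Return the message as an octet vector with a fill pointer, or NIL if
EOF is seen before any octet is read.  Signal an error if EOF arrives
mid-message or if MAX-LENGTH octets are read without a delimiter."
  (let ((message (make-array 16 :element-type '(unsigned-byte 8)
                                :adjustable t :fill-pointer 0)))
    (loop
      ;; READ-BYTE blocks until an octet arrives; returns NIL at EOF.
      (let ((octet (read-byte stream nil nil)))
        (cond ((null octet)
               (if (zerop (length message))
                   (return nil)
                   (error "EOF after ~D octets, before the delimiter"
                          (length message))))
              (t
               (vector-push-extend octet message)
               (when (= octet delimiter)
                 (return message))
               (when (>= (length message) max-length)
                 (error "No delimiter within ~D octets" max-length))))))))
```

With a newline-terminated protocol, for example, (READ-MESSAGE stream 10)
would block until one complete message has arrived, however many packets it
was split across, and leave any following octets unread on the stream.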

On Wed, 24 Jul 2013, Joshua Kordani wrote:

> Your code works for me.
> 
> I realize that I was mixing my metaphors up.  I am receiving a message that
> I was assuming would fit inside one packet, and what I was wanting to write
> was code that assumed this, blocked until input, and once input came in,
> read the entirety of it, no matter how large, into something I could use.
> The CLHS example for read-sequence successfully reads into an array less
> data than the size of the array allocated.  This led me to suspect that I
> could do the same thing with a ccl socket stream, but it seems that
> read-sequence won't come back until the array is completely filled.  So if I
> want to issue a message and expect a variable length response, I can't make
> the assumption that the entire response will arrive inside one packet (even
> though I'm pretty sure that this is the case, as I'm in control of all of
> the hardware that sits between me and the device in question, and I will
> never receive a response larger than say 100 bytes).  I was making that
> assumption, and also the assumption that read-sequence would encounter eof
> when there was no more data to be read.
> 
> It seems that the correct approach is to walk through the input a byte at a
> time and look for the end of message byte specified by the protocol I'm
> using, and if I stop reading before I get to the end of input I know I need
> to take what comes next and append it to what I've collected so far (or
> leave the buffer alone until I can read a full message out).  This last
> activity I've done before in other languages; I should take some time to
> figure out how this same kind of work would be done in CL.
> 
> For posterity, the code that does what I want is here:
> http://paste.lisp.org/+2YMI
> 
> Thanks,
> Josh
> On 7/24/13 6:31 AM, Gary Byers wrote:
>       Here's a very simple test case that tries to do something similar
>       to what your code is apparently trying to do.
>       ;---
>       (defun server ()
>         (let* ((lsock (make-socket :connect :passive
>                                    :local-host "localhost"
>                                    :local-port 40000
>                                    :reuse-address t))
>                (data (make-array 10 :element-type '(unsigned-byte 8))))
>           (dotimes (i 10) (setf (aref data i) i))
>           (let* ((stream (accept-connection lsock :wait t)))
>             (write-sequence data stream)
>             (when t
>               (sleep 5)
>               (write-sequence data stream))
>             (close stream)
>             (close lsock))))
>
>       (defun test ()
>         (process-run-function "server" #'server)
>         (let* ((stream (make-socket :remote-host "localhost"
>                                     :remote-port 40000))
>                (data (make-array 19 :element-type '(unsigned-byte 8)))
>                (n (read-sequence data stream)))
>           (close stream)
>           (format t "~& requested 19 octets, got ~d" n)))
>       ;---
>
>       Calling (TEST) should pause for ~5 seconds, then report that it
>       requested and got 19 octets.  Does this fail for you ?  If not,
>       how does the code that you're dealing with differ ?
>
>       If you change the WHEN T in #'SERVER to WHEN NIL (and wait a few
>       minutes, since I'm likely not using :reuse-address correctly), the
>       server will close its side of the connection after writing 10
>       octets; READ-SEQUENCE will see that the stream is at EOF after
>       those 10 octets are read.  This (EOF) is the only situation in
>       which READ-SEQUENCE should return prematurely, and it's the only
>       situation I'm aware of in which it does so in CCL.
> 
>
>       On Tue, 23 Jul 2013, Joshua Kordani wrote:
>
>             Greetings all.
>
>             I am attempting to read all currently available data from a
>             stream.  I am not expecting much data, so I suspect I need
>             to call finish-output.  (read) seems to return immediately,
>             before it seems that data is available.  I guess in
>             hindsight I don't know how the implementation is supposed
>             to know when input has stopped coming in on what is
>             conceptually a stream with no eof representing any
>             termination of input.
>
>             I guess what I'm saying is, given a socket stream, is there
>             a way to read, blocking until there is input to be read and
>             until there is no more data to be read, this kind of stream?
>
>             http://paste.lisp.org/+2YLT
>             this is a paste of the function I'm trying to get working.
>             I want to write something to the stream, and I expect
>             something back, which I want to either return from this
>             function or simply emit to *terminal-io*
>
>             Josh