[Openmcl-devel] wx86cl-Windows XP SP3 32bit-socket error #22 during write

Gary Byers gb at clozure.com
Fri Apr 17 08:13:06 PDT 2009



On Fri, 17 Apr 2009, John Miller wrote:

> Gary,
>

> Well, I modified _dosmaperr to return the error value it is passed
> if it cannot find a corresponding POSIX error.  Now it is returning
> error 997, which according to %windows-error-string is "Overlapped
> I/O operation is in progress."  According to MSDN
> (http://msdn.microsoft.com/en-us/library/aa365747(VS.85).aspx) it is
> more of a notification than an error, signaling that WriteFile
> returned before the write operation completed.  I found another web
> page
> (http://www.microgate.com/techcenter/htmlhelp/html/hdlc5nzi.htm)
> that talks a little bit about how to handle async communication.  It
> makes the point that if this ERROR_IO_PENDING is returned one should
> take care not to make the same WriteFile (which must mean calling
> WriteFile with the same OVERLAPPED data) call until the
> communication is complete.
>
> What doesn't make sense is that it looks like the function lisp_open
> calls CreateFile without the FILE_FLAG_OVERLAPPED flag set, so the
> communication should be synchronous.  I am not sure why on my
> machine WriteFile seems to think it is writing to an asynchronous
> handle.

The file handle that we're dealing with is a socket (not created by lisp_open),
and it seems that sockets have the FILE_FLAG_OVERLAPPED bit set; doing a
read on a socket has to go through the whole song-and-dance of dealing
with overlapped I/O.

Writing to a socket (or, more generally, writing to any file handle) -might-
not complete synchronously, but I don't think that I'd seen this happen,
and the code in lisp_write doesn't deal with it at all (it completely
misinterprets the ERROR_IO_PENDING return value as an error).  It seems
like lisp_write() needs to go through a song-and-dance somewhat like
lisp_read() does.
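
As a rough illustration of the distinction lisp_write() would have to
make, here is a sketch only, not the actual CCL code.  The enum and
function names are invented for this example, and the value 997 for
ERROR_IO_PENDING is copied by hand so that the fragment compiles without
<windows.h>.

```c
/* Hypothetical sketch: these names are invented for illustration and are
   not part of the CCL kernel. */

/* ERROR_IO_PENDING as defined in winerror.h; copied here so that the
   fragment compiles without <windows.h>. */
#define SKETCH_ERROR_IO_PENDING 997UL

typedef enum {
  WRITE_COMPLETED,  /* WriteFile returned nonzero; the byte count is valid */
  WRITE_PENDING,    /* the write was started but has not finished yet */
  WRITE_FAILED      /* a genuine error; map it and return -1 */
} write_outcome;

/* writefile_ok: the BOOL result of WriteFile (nonzero means success).
   last_error:   the value GetLastError() returned right after the call;
                 only meaningful when writefile_ok is zero. */
write_outcome
classify_write(int writefile_ok, unsigned long last_error)
{
  if (writefile_ok) {
    return WRITE_COMPLETED;
  }
  if (last_error == SKETCH_ERROR_IO_PENDING) {
    /* Not an error: the caller would now wait for completion, e.g. with
       GetOverlappedResult(hfile, &overlapped, &nwritten, TRUE), and only
       then trust nwritten. */
    return WRITE_PENDING;
  }
  return WRITE_FAILED;
}
```

Under this scheme only the WRITE_FAILED case would go through
_dosmaperr; the WRITE_PENDING case would wait for the overlapped
operation to finish before returning a byte count.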



>
> I wonder what would happen if the function just ignores this error (err.. notification)...

My understanding is that the write has been initiated (or at least
scheduled), but I don't think that we can tell whether or not it
completed or whether all of the bytes that we wanted to write have
been written without waiting around (waiting for the event handle
in the overlapped structure to be signaled, checking for other
errors, etc.)

I have a dim and perhaps incorrect memory that that code had been
written at one point (we're still setting up an event handle), but was
removed because it didn't seem necessary.  It may be the case that the
decision of whether to complete socket writes synchronously is made
or influenced by the NIC driver, and that we just haven't seen this so
far because most drivers usually decide that there's no good reason to
return ERROR_IO_PENDING.


>
> Regards,
> John
>
>
>
> On Friday, April 17, 2009, at 01:44AM, "Gary Byers" <gb at clozure.com> wrote:
>>
>>
>> On Thu, 16 Apr 2009, John Miller wrote:
>>
>>>
>>> On Apr 15, 2009, at 2:36 PM, Gary Byers wrote:
>>>
>>>>
>>>>
>>>> On Wed, 15 Apr 2009, John Miller wrote:
>>>>
>>>>> I am running CCL 1.3-r1195 (WindowsX8632) and getting the below
>>>>> "(error #22) during write" when trying to connect to the Xming
>>>>> server (or is that client? never could get that straight) on my
>>>>> Windows XP, SP 3 machine.  I also get the same error when I try to
>>>>> connect to swank from Emacs using slime.
>>>>
>>>>> I have another Windows XP machine that runs as a virtual machine
>>>>> under parallels and I do not have any problems running slime+swank
>>>>> on ccl on that machine (haven't tried clx, though).  I am stymied-
>>>>> which is admittedly a pretty easy thing to do- but I was wondering
>>>>> if anyone could provide insight into what an "(error #22) during
>>>>> write" means.
>>>>
>>>> It means "The device doesn't recognize the command."  (I suppose that
>>>> you'll want to know what that means now. I don't know yet.)
>>>>
>>>> On the machine that has problems, does networking generally work?
>>>> E.g., is some network interface configured? One would think that
>>>> that wouldn't matter much, since you're just trying to connect to
>>>> the loopback address, but this is Windows ...
>>>>
>>>
>>> I also should add that I rebooted my Win machine in diagnostic mode, which
>>> disables all the cruft my employer has placed on this machine, and I still
>>> get the socket error.  I installed SBCL 1.0.22 and it runs the slime/swank
>>> combo just fine.  Not sure what else I can check, and I believe I am the only
>>> unfortunate slob in the CCL universe with this problem, so I have a feeling
>>> this one is going to remain a mystery.
>>>
>>> Thanks anyway...
>>
>>
>> Confusingly, there are a couple of different sets of error numbers
>> that can be relevant under Windows:
>>
>>  - some functions return (perhaps indirectly, by setting the "errno"
>>    thread-local variable) a POSIX error number to indicate an error.
>>
>>  - many functions return (in another thread-local variable that can
>>    be accessed via #_GetLastError) a Windows error code.
>>
>> The actual numbers can conflict, but conflicting POSIX and Windows
>> error numbers generally have nothing to do with each other.
>>
>> In the case where you get an "error 22" during FD-STREAM-FORCE-OUTPUT,
>> we're interpreting the "22" as a Windows error number:
>>
>> ? (ccl::%windows-error-string 22)
>> "The device does not recognize the command. "
>>
>> but the function that failed actually returns a POSIX error number:
>>
>> ? (ccl::%strerror 22)
>> "Invalid argument"
>>
>> I don't know which of these is more generic and less helpful, but
>> I actually think that the POSIX interpretation may be very slightly
>> helpful.  The function that generates the error is a call to WriteFile
>> in the function lisp_write (in ccl/lisp-kernel/windows-calls.c):
>>
>>   if (WriteFile(hfile, buf, count, &nwritten, &overlapped)) {
>>     return nwritten;
>>   }
>>
>>   err = GetLastError();
>>   _dosmaperr(err);         /* map Windows error to POSIX error, set errno */
>>   return -1;
>>
>> So, the call to WriteFile is returning a Windows error number that gets
>> mapped to EINVAL (=22).  If you look at the function _dosmaperr (in that
>> same C source file), you'll see that a small number of Windows errors
>> are mapped to specific POSIX errors and anything not enumerated gets
>> mapped to EINVAL: we don't know with any confidence what WriteFile was
>> really complaining about.
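
The mapping behaviour described above can be sketched roughly as
follows.  This is an illustration under assumptions: the function name
and the table entries are invented for the example and are not the table
in windows-calls.c; only the "anything unlisted becomes EINVAL" fallback
is taken from the description.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical sketch of a _dosmaperr-style mapping.  The table entries
   are examples only and are NOT the table in windows-calls.c. */

typedef struct {
  unsigned long win_error;  /* Windows error code (value from winerror.h) */
  int posix_error;          /* corresponding errno value */
} error_map_entry;

static const error_map_entry error_map[] = {
  { 2UL, ENOENT },  /* ERROR_FILE_NOT_FOUND */
  { 5UL, EACCES },  /* ERROR_ACCESS_DENIED */
  { 6UL, EBADF  },  /* ERROR_INVALID_HANDLE */
  { 8UL, ENOMEM },  /* ERROR_NOT_ENOUGH_MEMORY */
};

/* Returns the errno value for win_error, or EINVAL when the code is not
   in the table. */
int
map_windows_error(unsigned long win_error)
{
  size_t i;
  for (i = 0; i < sizeof(error_map) / sizeof(error_map[0]); i++) {
    if (error_map[i].win_error == win_error) {
      return error_map[i].posix_error;
    }
  }
  return EINVAL;  /* catch-all: the original Windows code is lost here */
}
```

With a mapping of this shape, any unlisted code (997, for instance) comes
back as EINVAL, so the caller cannot tell what WriteFile actually
reported.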
>>
>>
>>
>
>


