[Openmcl-devel] wx86cl-Windows XP SP3 32bit-socket error #22 during write

Gary Byers gb at clozure.com
Sat Apr 18 03:07:25 PDT 2009


The good news is that I checked in some changes (to windows-calls.c, in
both 1.3 and the trunk) that're intended to handle the case where
WriteFile returns ERROR_IO_PENDING.

The bad news is that none of the (real or virtual) Win32 machines that
I have access to seems to ever return ERROR_IO_PENDING from calls
to WriteFile, so I haven't been able to test this (beyond convincing
myself that the changes don't seem to break anything in the case
where the WriteFile call completes synchronously.)
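
For anyone who wants to see the general shape of "handling
ERROR_IO_PENDING", here is a rough sketch of the usual pattern.  It is
not the code that was checked in: the names IO_PENDING_CODE,
classify_write and write_and_wait are made up for illustration, and it
assumes the handle was opened for overlapped I/O.  The classification
step is plain C; the part that calls WriteFile and GetOverlappedResult
is only compiled on Windows.

```c
#include <stddef.h>

/* Numeric value of ERROR_IO_PENDING, under a local (hypothetical) name. */
#define IO_PENDING_CODE 997UL

enum write_outcome { WRITE_DONE, WRITE_PENDING, WRITE_FAILED };

/* Decide what a WriteFile result means.  writefile_ok is WriteFile's
   return value; last_error is GetLastError() when it returned FALSE. */
enum write_outcome classify_write(int writefile_ok, unsigned long last_error)
{
  if (writefile_ok) return WRITE_DONE;                 /* completed synchronously */
  if (last_error == IO_PENDING_CODE) return WRITE_PENDING; /* started, not finished */
  return WRITE_FAILED;                                 /* a real error */
}

#ifdef _WIN32
#include <windows.h>
#include <string.h>

/* Hypothetical helper: write count bytes and, if the write is still
   pending, block until it finishes.  Returns bytes written or -1. */
long write_and_wait(HANDLE hfile, const void *buf, DWORD count)
{
  OVERLAPPED overlapped;
  DWORD nwritten = 0, err;
  BOOL ok;
  long result = -1;

  memset(&overlapped, 0, sizeof(overlapped));
  overlapped.hEvent = CreateEvent(NULL, TRUE, FALSE, NULL); /* manual-reset event */
  if (overlapped.hEvent == NULL) return -1;

  ok = WriteFile(hfile, buf, count, &nwritten, &overlapped);
  err = ok ? 0 : GetLastError();

  switch (classify_write(ok, err)) {
  case WRITE_DONE:
    result = (long)nwritten;
    break;
  case WRITE_PENDING:
    /* bWait = TRUE: wait for the event in the OVERLAPPED structure. */
    if (GetOverlappedResult(hfile, &overlapped, &nwritten, TRUE))
      result = (long)nwritten;
    else
      err = GetLastError();
    break;
  case WRITE_FAILED:
    break;
  }

  CloseHandle(overlapped.hEvent);
  if (result < 0) SetLastError(err);  /* preserve the error for the caller */
  return result;
}
#endif
```

The point of the sketch is only that a FALSE return with
ERROR_IO_PENDING is a third case, separate from success and from
failure, and that the byte count isn't known until the wait finishes.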

If John or anyone else who gets the spurious "error 22" trying to
force output to a socket could do an "svn update" and a kernel
rebuild and let us know whether this fixes the problem, that'd
be great.
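
As for why this shows up as "error 22" in the first place: as
described further down the thread, _dosmaperr maps any Windows error
it doesn't enumerate to EINVAL, so ERROR_IO_PENDING (997) gets
reported as 22.  A minimal sketch of that kind of fallback follows.
The table entries and the names errmap_table and map_windows_error
are illustrative assumptions, not the actual contents of _dosmaperr;
the numeric Windows codes are the usual winerror.h values.

```c
#include <errno.h>
#include <stddef.h>

/* A few Windows error codes, under local (hypothetical) names. */
enum {
  WIN_ERROR_FILE_NOT_FOUND = 2,
  WIN_ERROR_ACCESS_DENIED  = 5,
  WIN_ERROR_INVALID_HANDLE = 6,
  WIN_ERROR_IO_PENDING     = 997
};

struct errmap { unsigned long win; int posix; };

/* Illustrative table: only a handful of codes have a specific mapping. */
static const struct errmap errmap_table[] = {
  { WIN_ERROR_FILE_NOT_FOUND, ENOENT },
  { WIN_ERROR_ACCESS_DENIED,  EACCES },
  { WIN_ERROR_INVALID_HANDLE, EBADF  },
};

/* Map a Windows error code to a POSIX errno value; anything that is
   not in the table falls through to EINVAL. */
int map_windows_error(unsigned long win_err)
{
  size_t i;
  for (i = 0; i < sizeof(errmap_table) / sizeof(errmap_table[0]); i++) {
    if (errmap_table[i].win == win_err) return errmap_table[i].posix;
  }
  return EINVAL;
}
```

With a table like this, map_windows_error(WIN_ERROR_IO_PENDING)
returns EINVAL, which is how a "write still in progress" notification
can end up looking like an invalid-argument error.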

On Fri, 17 Apr 2009, Gary Byers wrote:

>
>
> On Fri, 17 Apr 2009, John Miller wrote:
>
>> Gary,
>>
>
>> Well, I modified _dosmaperr to return the error value it is passed
>> if it cannot find a corresponding POSIX error.  Now it is returning
>> error 997, which according to %windows-error-string is "Overlapped
>> I/O operation is in progress."  According to the MSDN
>> (http://msdn.microsoft.com/en-us/library/aa365747(VS.85).aspx) it is
>> more of a notification than an error, signaling that WriteFile
>> returned before the write operation completed.  I found another web
>> page
>> (http://www.microgate.com/techcenter/htmlhelp/html/hdlc5nzi.htm)
>> that talks a little bit about how to handle async communication.  It
>> makes the point that if this ERROR_IO_PENDING is returned one should
>> take care not to make the same WriteFile (which must mean calling
>> WriteFile with the same OVERLAPPED data) call until the
>> communication is complete.
>>
>> What doesn't make sense is that it looks like the function lisp_open
>> calls CreateFile without the FILE_FLAG_OVERLAPPED flag set, so the
>> communication should be synchronous.  I am not sure why on my
>> machine WriteFile seems to think it is writing to an asynchronous
>> handle.
>
> The file handle that we're dealing with is a socket (not created by lisp_open),
> and it seems that sockets have the FILE_FLAG_OVERLAPPED bit set; doing a
> read on a socket has to go through the whole song-and-dance of dealing
> with overlapped I/O.
>
> Writing to a socket (or, more generally, writing to any file handle) -might-
> not complete synchronously, but I don't think that I'd seen this happen
> and the code in lisp_write doesn't deal with it at all (and completely
> misinterprets the ERROR_IO_PENDING return value as an error.)  It seems
> like lisp_write() needs to go through a song-and-dance somewhat like
> lisp_read() does.
>
>
>
>>
>> I wonder what would happen if the function just ignores this error (err.. notification)...
>
> My understanding is that the write has been initiated (or at least
> scheduled), but I don't think that we can tell whether or not it
> completed or whether all of the bytes that we wanted to write have
> been written without waiting around (waiting for the event handle
> in the overlapped structure to be signaled, checking for other
> errors, etc.)
>
> I have a dim and perhaps incorrect memory that that code had been
> written at one point (we're still setting up an event handle), but was
> removed because it didn't seem necessary.  It may be the case that the
> decision of whether to complete socket writes synchronously is made
> or influenced by the NIC driver, and that we just haven't
> seen this so far because most drivers usually decide that there's no
> good reason to return ERROR_IO_PENDING.
>
>
>>
>> Regards,
>> John
>>
>>
>>
>> On Friday, April 17, 2009, at 01:44AM, "Gary Byers" <gb at clozure.com> wrote:
>>>
>>>
>>> On Thu, 16 Apr 2009, John Miller wrote:
>>>
>>>>
>>>> On Apr 15, 2009, at 2:36 PM, Gary Byers wrote:
>>>>
>>>>>
>>>>>
>>>>> On Wed, 15 Apr 2009, John Miller wrote:
>>>>>
>>>>>> I am running CCL 1.3-r1195 (WindowsX8632) and getting the below
>>>>>> "(error #22) during write" when trying to connect to the Xming
>>>>>> server (or is that client? never could get that straight) on my
>>>>>> Windows XP, SP 3 machine.  I also get the same error when I try to
>>>>>> connect to swank from Emacs using slime.
>>>>>
>>>>>> I have another Windows XP machine that runs as a virtual machine
>>>>>> under parallels and I do not have any problems running slime+swank
>>>>>> on ccl on that machine (haven't tried clx, though).  I am stymied-
>>>>>> which is admittedly a pretty easy thing to do- but I was wondering
>>>>>> if anyone could provide insight into what an "(error #22) during
>>>>>> write" means.
>>>>>
>>>>> It means "The device doesn't recognize the command."  (I suppose that
>>>>> you'll want to know what that means now. I don't know yet.)
>>>>>
>>>>> On the machine that has problems, does networking generally work ?
>>>>> E.g., is some network interface configured ? One would think that
>>>>> that wouldn't matter much, since you're just trying to connect to
>>>>> the loopback address, but this is Windows ...
>>>>>
>>>>
>>>> I also should add that I rebooted my Win machine in diagnostic mode, which
>>>> disables all the cruft my employer has placed on this machine, and I still
>>>> get the socket error.  I installed SBCL 1.0.22 and it runs the slime/swank
>>>> combo just fine.  Not sure what else I can check, and I believe I am the only
>>>> unfortunate slob in the CCL universe with this problem, so I have a feeling
>>>> this one is going to remain a mystery.
>>>>
>>>> Thanks anyway...
>>>
>>>
>>> Confusingly, there are a couple of different sets of error numbers
>>> that can be relevant under Windows:
>>>
>>>  - some functions return (perhaps indirectly, by setting the "errno"
>>>    thread-local variable) a POSIX error number to indicate an error.
>>>
>>>  - many functions return (in another thread-local variable that can
>>>    be accessed via #_GetLastError) a Windows error code.
>>>
>>> The actual numbers can conflict, but conflicting POSIX and Windows
>>> error numbers generally have nothing to do with each other.
>>>
>>> In the case where you get an "error 22" during FD-STREAM-FORCE-OUTPUT,
>>> we're interpreting the "22" as a Windows error number:
>>>
>>> ? (ccl::%windows-error-string 22)
>>> "The device does not recognize the command. "
>>>
>>> but the function that failed actually returns a POSIX error number:
>>>
>>> ? (ccl::%strerror 22)
>>> "Invalid argument"
>>>
>>> I don't know which of these is more generic and less helpful, but
>>> I actually think that the POSIX interpretation may be very slightly
>>> helpful.  The function that generates the error is a call to WriteFile
>>> in the function lisp_write (in ccl/lisp-kernel/windows-calls.c):
>>>
>>>   if (WriteFile(hfile, buf, count, &nwritten, &overlapped)) {
>>>     return nwritten;
>>>   }
>>>
>>>   err = GetLastError();
>>>   _dosmaperr(err);         /* map Windows error to POSIX error, set errno */
>>>   return -1;
>>>
>>> So, the call to WriteFile is returning a Windows error number that gets
>>> mapped to EINVAL (=22).  If you look at the function _dosmaperr (in that
>>> same C source file), you'll see that a small number of Windows errors
>>> are mapped to specific POSIX errors and anything not enumerated gets
>>> mapped to EINVAL: we don't know with any confidence what WriteFile was
>>> really complaining about.
>>>
>>>
>>>
>>
>>
> _______________________________________________
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
> http://clozure.com/mailman/listinfo/openmcl-devel
>
>
