[Openmcl-devel] Re: Read operation to unmapped address 0x00000020

Gary Byers gb at clozure.com
Tue Apr 12 12:22:45 UTC 2005



On Tue, 12 Apr 2005, Marco Baringer wrote:

> Gary Byers <gb at clozure.com> writes:
>
> > Yes, I'd be interested in seeing this if you still have it.
>
> i didn't have the original, but i was able to cause it fairly quickly
> by sending in 100 simultaneous requests for a few minutes.
>
>

The crash scenario seems to be something like:

- some thread gets a segfault; this may or may not be benign

- the lisp kernel's exception-handling code is invoked, and starts to look
  at the saved machine register context that it receives as an argument.

- the register information isn't there (that "can't happen"), and the
  exception handler segfaults.
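
To make the failure mode above concrete, here is a minimal sketch in C of a
POSIX SIGSEGV handler that checks the context it is handed before looking at
it. This is not the lisp kernel's actual handler; the names
context_is_usable, segv_handler and install_segv_handler are hypothetical,
and the sketch only assumes the standard sigaction/SA_SIGINFO interface. It
shows why a missing register context turns into a second, recursive fault
if the handler dereferences it unconditionally.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the handler received both a siginfo
   and a saved-context pointer that it can inspect, 0 otherwise. */
int context_is_usable(const siginfo_t *info, const void *context)
{
    return info != NULL && context != NULL;
}

/* Hypothetical handler. With SA_SIGINFO the third argument normally points
   to a ucontext_t holding the saved registers. Dereferencing it when it is
   missing would fault again inside the handler. */
void segv_handler(int signo, siginfo_t *info, void *context)
{
    if (!context_is_usable(info, context)) {
        static const char msg[] = "SIGSEGV delivered without a register context\n";
        /* write() and _exit() are async-signal-safe. */
        ssize_t n = write(STDERR_FILENO, msg, sizeof msg - 1);
        (void)n;
        _exit(1);
    }
    /* A real handler would decode the saved registers here and decide
       whether the fault is benign. This sketch just restores the default
       action and returns, so the faulting instruction re-executes and the
       process terminates in the usual way. */
    signal(signo, SIG_DFL);
}

/* Installs the handler; returns 0 on success, -1 on failure. */
int install_segv_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

A check like this would not explain why the context went missing, but it
would turn the recursive segfault into a clear diagnostic.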

We -can- see the machine context from the second (recursive) segfault
(that's what the kernel debugger is showing), and the kernel exception
handler runs many times per session (and seems to run reliably under
much heavier load), so it's not clear what was different this time.

There -were- some changes in how some of this stuff works in 0.14.3.
There was also a bug in 0.14.2-p1 that could cause WAIT-ON-SEMAPHORE
to think that it had been interrupted when it had actually timed out.
(If it got interrupted but didn't return control until it would have
timed out anyway, it's probably more accurate to say that it timed
out.)
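
The interrupted-versus-timed-out distinction can be sketched as a small C
function. This is an illustration of the rule described above, not the code
in either release; classify_wait, timespec_at_or_after and the
SEM_WAIT_* names are all hypothetical. The idea is that if the deadline has
already passed by the time control returns, the wait is reported as a
timeout even if an interrupt arrived along the way.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdbool.h>
#include <time.h>

typedef enum {
    SEM_WAIT_ACQUIRED,
    SEM_WAIT_INTERRUPTED,
    SEM_WAIT_TIMED_OUT
} sem_wait_result;

/* True if time a is at or after time b. */
bool timespec_at_or_after(const struct timespec *a, const struct timespec *b)
{
    if (a->tv_sec != b->tv_sec)
        return a->tv_sec > b->tv_sec;
    return a->tv_nsec >= b->tv_nsec;
}

/* Classify the outcome of a timed wait.
   acquired:    the semaphore was obtained
   interrupted: the underlying wait reported an interruption
   now:         the time at which control returned to the caller
   deadline:    the absolute time at which the wait should give up */
sem_wait_result classify_wait(bool acquired, bool interrupted,
                              const struct timespec *now,
                              const struct timespec *deadline)
{
    if (acquired)
        return SEM_WAIT_ACQUIRED;
    /* Past the deadline: call it a timeout, even if interrupted. */
    if (timespec_at_or_after(now, deadline))
        return SEM_WAIT_TIMED_OUT;
    if (interrupted)
        return SEM_WAIT_INTERRUPTED;
    /* Not acquired, not interrupted: the wait itself gave up. */
    return SEM_WAIT_TIMED_OUT;
}
```

A caller that retries on SEM_WAIT_INTERRUPTED would then stop retrying once
the deadline has passed, instead of treating a late interrupt as a reason
to wait again.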

I can't promise that upgrading to 0.14.3 will fix the problem, because
I don't understand enough about the problem.

If it's possible for you to send me the code, I'll try to look at it
under GDB.



