[Openmcl-devel] Clozure CL 1.3-RC1 available

Gary Byers gb at clozure.com
Fri Feb 13 20:07:11 PST 2009


On Fri, 13 Feb 2009, John Stoneham wrote:

>> If you just start the lisp and type ^C in the listener, does it behave
>> itself, or do you get the same sort of "error reporting error" behavior
>> or other nonsense ?
>
> No behaving. I get the "error reporting error" and it freezes, can't even do a backtrace.
>
>> Is the machine a "Core Duo" (as opposed to a "Core 2 Duo" ?)  E.g., a 32-bit
>> machine ?
>
> Yes, it's a straight 32-bit machine.
>
>> At some very low level, I can imagine that hardware differences
>> could cause Mach to report the exception differently if it's raised on
>> a 32-bit machine than on a 64-bit machine.  I certainly don't know that to
>> be true, but I don't have a better theory, either: things seem to run
>> fine for me on 10.4.11, but I'm running the 32-bit lisp on a 64-bit
>> machine.  So far, that's the only difference that I can think of.
>
> Ouch, doesn't sound like good news. Let me know if there's anything I can test.
>

It's actually fairly good news: it's a strong indication that
the problem has to do with differences in the way that an exception
is reported to/by system software on a 32-bit machine (as opposed to
a 64-bit machine).  If we can understand what those differences are,
it's probably not too hard to fix the problem.  (Famous last words.)

Let me see if I can create a little program that'd tell us what's
different.  OK, if I remembered to attach it, the source should be
enclosed.  Compile and run via:

shell> cc -m32 -g -o mach32 mach32.c
shell> ./mach32

When I run it (on a Core 2 Duo), I get:

Mach exception: 2 (EXC_BAD_INSTRUCTION) with 2 codes: 0x1 0x0

If you get substantially different output, that'd be good (in the
sense that it'd tell us what's different and suggest how to handle it.)
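
For comparing reports from different machines, here is a minimal sketch (not part of the attached program) that parses a line in the format shown above and checks it against the Core 2 Duo result quoted in this message. The names `struct report`, `parse_report`, `matches_reference` and `MAX_CODES` are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_CODES 8   /* arbitrary limit chosen for this sketch */

struct report {
  int exception;
  char name[64];
  int code_count;
  unsigned int codes[MAX_CODES];
};

/* Parse one line of the form
     Mach exception: 2 (EXC_BAD_INSTRUCTION) with 2 codes: 0x1 0x0
   Returns 1 on success, 0 if the line does not match. */
static int
parse_report(const char *line, struct report *out)
{
  int used = 0, i;

  assert(line != NULL && out != NULL);
  if (sscanf(line, "Mach exception: %d (%63[^)]) with %d codes:%n",
             &out->exception, out->name, &out->code_count, &used) != 3)
    return 0;
  /* used stays 0 if the trailing " codes:" text did not match. */
  if (used == 0 || out->code_count < 0 || out->code_count > MAX_CODES)
    return 0;
  line += used;
  for (i = 0; i < out->code_count; i++) {
    int n = 0;
    /* %x accepts the leading "0x" that the program prints. */
    if (sscanf(line, " %x%n", &out->codes[i], &n) != 1)
      return 0;
    line += n;
  }
  return 1;
}

/* Compare against the Core 2 Duo result quoted above:
   exception 2 with the two codes 0x1 and 0x0. */
static int
matches_reference(const struct report *r)
{
  return r->exception == 2 && r->code_count == 2 &&
         r->codes[0] == 0x1 && r->codes[1] == 0x0;
}
```

Feeding it the line printed on the 32-bit machine and calling `matches_reference` would say whether the two reports agree; any differing field is then available in the struct for a closer look.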

If you get the same output, that's less good: it suggests that the
program counter value that gets passed to a real exception handler
is inaccurate, and that we can't decode the bad instruction in the
handler because the PC isn't pointing at the illegal instruction but
is perhaps pointing a few bytes past it.  That may be harder to recover
from and is harder to test for (because Apple decided to rename a lot
of exception-related structures and fields, and I'm too lazy to wrestle
with that at the moment).
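
A rough sketch of the kind of check described above, offered as an illustration rather than as code from Clozure CL: given a copy of the instruction bytes around the reported PC, look for the two-byte ud2 opcode (0x0f 0x0b, which is what the "ud2a" mnemonic assembles to) at the PC or a few bytes before it. The function name and the buffer-based interface are hypothetical, and fetching the thread state and the code bytes from a real handler is left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Clozure CL or of mach32.c.
   `code` is a copy of `len` instruction bytes and `pc` is the index
   in that copy that corresponds to the reported program counter.
   Returns how many bytes before `pc` a ud2 opcode (0x0f 0x0b) starts,
   looking back at most `max_back` bytes; 0 means the PC points right
   at the instruction.  Returns -1 if no ud2 is found in that window. */
static int
ud2_distance_before_pc(const unsigned char *code, size_t len,
                       size_t pc, size_t max_back)
{
  size_t back;

  assert(code != NULL);
  for (back = 0; back <= max_back && back <= pc; back++) {
    size_t at = pc - back;
    /* Both opcode bytes must lie inside the buffer. */
    if (at + 1 < len && code[at] == 0x0f && code[at + 1] == 0x0b)
      return (int)back;
  }
  return -1;
}
```

A result of 0 would mean the PC is accurate; a result of 2 would mean it points just past the instruction. Since x86 instructions vary in length, a backward byte scan like this can also match the tail of some unrelated instruction, so it's a heuristic at best.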
-------------- next part --------------
#include <pthread.h>
#include <mach/mach.h>  /* mach_port_allocate(), mach_msg_server(), etc. */
#include <mach/machine/thread_state.h>
#include <mach/machine/thread_status.h>
#include <mach/exception_types.h>
#include <stdio.h>
#include <stdlib.h>


#define MACH_CHECK_ERROR(n, s) \
  do {if (n != KERN_SUCCESS) {fprintf(stderr, "mach error return %d in %s\n", n, s); exit(1);}} while (0)


char *exception_name(int xnum)
{
  switch (xnum) {
  case EXC_BAD_ACCESS: return "EXC_BAD_ACCESS";
  case EXC_BAD_INSTRUCTION: return "EXC_BAD_INSTRUCTION";
  case EXC_ARITHMETIC: return "EXC_ARITHMETIC";
  case EXC_SOFTWARE: return "EXC_SOFTWARE";
  case EXC_EMULATION: return "EXC_EMULATION";
  default: return "other";
  }
}
    

void *
exception_handler_proc(void *arg)
{
  extern boolean_t exc_server();

  mach_msg_server(exc_server, 256, *(mach_port_t *)arg, 0);
  /* Never returns.  Abort if it does. */
  abort();
}


pthread_t
create_system_thread(void* (*start_routine)(void *),
		     void* param)
{
  pthread_attr_t attr;
  pthread_t returned_thread = (pthread_t) 0;

  pthread_attr_init(&attr);
  pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);  

  /* 
     I think that's just about enough ... create the thread.
     Well ... not quite enough.  In Leopard (at least), many
     pthread routines grab an internal spinlock when validating
     their arguments.  If we suspend a thread that owns this
     spinlock, we deadlock.  We can't in general keep that
     from happening: if arbitrary C code is suspended while
     it owns the spinlock, we still deadlock.  It seems that
     the best that we can do is to keep -this- code from
     getting suspended (by grabbing TCR_AREA_LOCK).  This test
     program doesn't suspend threads or take that lock, so the
     plain pthread_create() below is enough.
  */
  pthread_create(&returned_thread, &attr, start_routine, param);
  pthread_attr_destroy(&attr);
  return returned_thread;
}


void
mach_exception_port_setup(mach_port_t *port)
{
  create_system_thread(exception_handler_proc, (void *)port);
}

kern_return_t
catch_exception_raise(mach_port_t exception_port,
		      mach_port_t thread,
		      mach_port_t task, 
		      exception_type_t exception,
		      exception_data_t code_vector,
		      mach_msg_type_number_t code_count)
{
  mach_msg_type_number_t i;  /* unsigned, like code_count */

  fprintf(stderr, "Mach exception: %d (%s) with %u codes: ", exception, exception_name(exception), code_count);
  for (i = 0; i < code_count; i++) {
    fprintf(stderr,"0x%x ",code_vector[i]);
  }
  fprintf(stderr, "\n");
  exit(2);
}


mach_port_t main_thread_exception_port;

void
illegal()
{
  /*  __asm__ volatile("int $0xd5"); */
  __asm__ volatile("ud2a");
}


int
main(void)
{
  kern_return_t kret;
  
  kret = mach_port_allocate(mach_task_self(),
                            MACH_PORT_RIGHT_RECEIVE, 
                            &main_thread_exception_port);
  MACH_CHECK_ERROR(kret, "mach_port_allocate");
  
  kret = mach_port_insert_right(mach_task_self(),
                                main_thread_exception_port,
                                main_thread_exception_port,
                                MACH_MSG_TYPE_MAKE_SEND);

  MACH_CHECK_ERROR(kret, "mach_port_insert_right");

  kret = thread_set_exception_ports(mach_thread_self(),
                                    (EXC_MASK_ALL & ~(EXC_MASK_BREAKPOINT|EXC_MASK_SYSCALL|EXC_MASK_MACH_SYSCALL)),
                                    main_thread_exception_port,
                                    EXCEPTION_DEFAULT,
                                    MACHINE_THREAD_STATE);
  MACH_CHECK_ERROR(kret, "thread_set_exception_ports");
  

  mach_exception_port_setup(&main_thread_exception_port);
  illegal();
}


  


