[Openmcl-devel] P4 to i5 port

Lou Vanek lou.vanek at gmail.com
Tue Oct 12 16:04:36 PDT 2010

Thanks for the detailed response.

I'm not using threads in this portion of my program, but hunchentoot is, and
hunchentoot 1.1 uses bordeaux-threads. I found where thread creation occurs
and changed one line so that :use-standard-initial-bindings is no longer passed.

THIS FIXED MY PROBLEM. My program ran fine after I changed that one line and
forced a recompile of the bordeaux-threads package.

In file openmcl.lisp in the bordeaux-threads package, one line was changed
in function %make-thread.

(defun %make-thread (function name)
  (ccl:process-run-function (list :name name)
  ;;lv (ccl:process-run-function (list :name name
  ;;lv                                 :use-standard-initial-bindings nil)
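
For reference, a minimal sketch of the call shape after the change, assuming Clozure CL. SPAWN-WORKER is a hypothetical helper name used only for illustration; it is not part of bordeaux-threads. CCL:PROCESS-RUN-FUNCTION takes a process specifier (a name, or a list of MAKE-PROCESS keyword arguments) followed by the function to run. Leaving :use-standard-initial-bindings out keeps its default, so the new thread should get the standard initial bindings.

```lisp
;; Clozure CL only; the #+ccl guard makes other implementations skip the form.
#+ccl
(defun spawn-worker (name function)
  "Hypothetical helper: run FUNCTION in a new thread called NAME.
:USE-STANDARD-INITIAL-BINDINGS is deliberately not passed, so the
default (standard initial bindings in effect) applies."
  (ccl:process-run-function (list :name name) function))
```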

I'm not sure whether this is a bordeaux issue or an openmcl issue. Do you
think I should inform the bordeaux package maintainer?

Thanks again!

Lou Vanek

On Tue, Oct 12, 2010 at 6:22 PM, Gary Byers <gb at clozure.com> wrote:
> The error has to do with data structures ("frag vectors") that're used
> in the assembler/code-generator.  These things (a) are entirely private
> to the assembler (b) have a very well-defined lifetime - once the assembler
> has used them to generate a function, they aren't referenced (c) are used
> very heavily, so they're freelisted (recycled).
>
> The freelists are supposed to be thread-specific (e.g., each thread is
> supposed to have its own binding of CCL::*X86-LAP-FRAG-VECTOR-FREELIST*,
> and the value of that variable in each thread is supposed to be
> an object of type CCL::POOL.)  The only thing that's special about that
> kind of object is that the GC will set a POOL's data slot to NIL whenever
> it encounters one (so that freelists don't grow indefinitely and their
> contents eventually get GCed.)
>
> Between GCs then, the assembler pops "frag vectors" off of a thread-private
> freelist (the fact that it's a thread-private list means that locking isn't
> needed, though this code isn't reentrant and has to disable interrupts) or
> creates one if that list is empty; when the assembler's created the
> function, it returns all of the frag vectors to the freelist.  There are
> ways to go wrong here, but that strategy does significantly reduce consing
> (enough to measurably and significantly improve compilation speed, or did
> the last time that I tried to measure it.)
>
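
The pop-or-create strategy described above can be sketched in portable Common Lisp. This is only an illustration under stated assumptions: SCRATCH-POOL, *SCRATCH-FREELIST*, ALLOCATE-SCRATCH-VECTOR, FREE-SCRATCH-VECTOR and WITH-PRIVATE-FREELIST are hypothetical names, not CCL internals, and the GC behaviour (resetting the pool's data slot to NIL) and the disabling of interrupts are not modelled.

```lisp
(defstruct scratch-pool
  (data nil))   ; list of recycled vectors; the real POOL's slot is cleared by the GC

(defvar *scratch-freelist* (make-scratch-pool)
  "Hypothetical stand-in for a freelist that each thread should bind privately.")

(defun allocate-scratch-vector (&optional (size 24))
  "Pop a recycled vector off the freelist, or cons a fresh one if it is empty."
  (or (pop (scratch-pool-data *scratch-freelist*))
      (make-array size :element-type '(unsigned-byte 8) :initial-element 0)))

(defun free-scratch-vector (vector)
  "Return VECTOR to the freelist so that a later allocation can reuse it."
  (push vector (scratch-pool-data *scratch-freelist*))
  nil)

(defmacro with-private-freelist (&body body)
  "Rebind the freelist around BODY, much as a per-thread binding isolates a thread."
  `(let ((*scratch-freelist* (make-scratch-pool)))
     ,@body))
```

Because the freelist lives in a special variable, giving each thread its own binding is what makes lock-free use safe; code that shares one global binding across threads loses that guarantee.
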
> One of the things that can go wrong would do so if two threads were trying
> to use the same freelist: if that happened, they could both pop the same
> element off of the freelist at (roughly) the same time and eventually return
> those elements to the freelist more than once.  If that happens, the freelist
> would become circular, and there's some leftover debugging code that checks
> for that and signals the error that appears in the backtrace.
>
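
To see why returning the same element twice makes a freelist circular, here is a single-threaded sketch. It assumes an intrusive linked list, where each element carries its own "next" link; whether CCL's frag-vector freelist is laid out this way is an assumption, and FRAG-NODE, FREELIST-PUSH, FREELIST-CIRCULAR-P and SIMULATE-DOUBLE-RETURN are hypothetical names.

```lisp
(defstruct frag-node
  (next nil))   ; intrusive link: the node itself stores the freelist pointer

(defun freelist-push (node head)
  "Link NODE in front of HEAD and return the new head."
  (setf (frag-node-next node) head)
  node)

(defun freelist-circular-p (head)
  "Tortoise-and-hare check: true if following NEXT links from HEAD loops."
  (let ((slow head)
        (fast head))
    (loop
      (when (or (null fast) (null (frag-node-next fast)))
        (return nil))
      (setf slow (frag-node-next slow)
            fast (frag-node-next (frag-node-next fast)))
      (when (eq slow fast)
        (return t)))))

(defun simulate-double-return ()
  "Push the same node twice, as two threads sharing one freelist might."
  (let ((node (make-frag-node))
        (head nil))
    (setf head (freelist-push node head))   ; first return of NODE
    (setf head (freelist-push node head))   ; second return: NODE now points at itself
    (freelist-circular-p head)))
```

After the second push the node's link refers to the node itself, so any code walking the list never reaches the end; a check of this kind is what would report the inconsistency.
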
> That's not supposed to happen, because each thread is supposed to use
> its own binding of the variable that contains that freelist: that's a
> "standard initial binding" that ordinarily takes effect whenever a
> thread is created.  There's an option to MAKE-PROCESS/PROCESS-RUN-FUNCTION
> that allows a thread to be created without any standard initial
> bindings, but the documentation doesn't stress that that's an
> exceptional and potentially dangerous thing to do.  A few years ago
> some problem that several people reported was traced to the fact that
> some third-party package was creating threads without "standard
> initial bindings" and were therefore stepping on shared resources that
> they thought were thread-private, and we promised to deprecate that
> option or to use scary language in the documentation. ("You should
> only use this option if you fully understand the implications of doing
> so.  If you do understand those issues, please explain them to us
> ...")
>
> My first guess is that your application is running threads that don't
> have standard initial bindings (and that two or more threads are
> therefore stepping on internal assembler data structures), and that
> the thread-creation code that you're using (whether it's yours or a
> third party's) should be changed to not say
> :USE-STANDARD-INITIAL-BINDINGS NIL.  I don't know for sure that that's
> the explanation, but if I'm correct in remembering that your P4 was a
> single-core machine and that the i5 is multi-core, then the fact that
> this problem showed up when you moved it to the i5 is consistent with
> that explanation: the bad things that can happen when two threads try
> to modify a data structure at the same time are more likely to happen
> when those threads are really running concurrently (on multiple cores)
> and literally trying to do that modification at the same time.
>
> If that's not it, I don't have a good guess: I don't think that it's
> too easy for the freelist to get corrupted if it's only modified from
> a single thread, and I haven't seen or heard of this happening in several
> years.
>
> On Tue, 12 Oct 2010, Lou Vanek wrote:
>> Hi,
>>
>> I'm in the process of porting an openmcl project from a 32-bit Pentium 4
>> to a 64-bit i5. Most of the code runs fine, but I'm having problems with
>> a snippet of code that is compiled on the fly.
>>
>> The line of code that is causing the problem is:
>>   (setq res (funcall (coerce form 'function)))
>> 'form' is bound to,
>>  (LAMBDA NIL (IFDEF "mode" 0 "Select a cube."))
>> The IFDEF function is never called. This is where openmcl throws an error.
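
A minimal sketch of the "compile on the fly" step, in portable Common Lisp. GREETING stands in for an application function such as IFDEF, whose definition is not shown in this thread, and EVAL-EMBEDDED-FORM is a hypothetical helper name. Coercing a lambda expression to FUNCTION is standard; in Clozure CL that step appears to go through the native compiler, which would be consistent with X862-COMPILE showing up in the backtrace.

```lisp
(defun greeting (name)
  "Hypothetical stand-in for an application function called from a template."
  (format nil "Hello, ~a" name))

(defun eval-embedded-form (form)
  "Wrap FORM in a zero-argument lambda expression, coerce it to a function, and call it."
  (funcall (coerce `(lambda () ,form) 'function)))

;; Example call: (eval-embedded-form '(greeting "cube"))
```
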
>> I can eval this form just fine in the REPL, but not at run-time in the
>> web app on the i5 using code compiled by the i5. The P4 never had a
>> problem running this.
>>
>> I know this code looks funky but since this is a web app some of the web
>> pages are built at runtime using text templates with embedded lisp forms.
>> It's these embedded forms that are causing the problem. None of the
>> static code throws errors.
>>
>> A partial backtrace is shown below.
>>
>> Some background. This code runs fine both on my P4 and the i5 as long as
>> I use FASLs that are compiled on the P4. When I compile on the i5 I get
>> the error at runtime.
>>
>> My setup is a bit complicated. This is being run in a debian lenny
>> virtual machine, using the latest stable virtualbox on a win7 host.
>> Debian Lenny is stable and patched up.
>>
>> Guest OS:
>>> uname -a
>> Linux deb 2.6.26-2-686 #1 SMP Thu Sep 16 19:35:51 UTC 2010 i686 GNU/Linux
>> I believe this is a 32-bit OS running on a 64-bit cpu.
>> Openmcl version running on the i5:
>> CL-USER> (lisp-implementation-type)
>> "Clozure Common Lisp"
>> CL-USER> (lisp-implementation-version)
>> "Version 1.6-dev-r14347M-trunk  (LinuxX8632)"
>> The openmcl version that is running on the P4 is about 3 months old, IIRC.
>> The ccl/hunchentoot/slime error log:
>> [2010-10-12 15:01:32 [ERROR]] Compiler bug or inconsistency:
>> frag-vector freelist is circular
>> #x1A74E066>) 71
>> (B665F3B8) : 2 (GET-BACKTRACE) 327
>> #x1A74E07E>) 119
>> (B665F408) : 4 (SIGNAL #<CCL::COMPILER-BUG #x1A74E07E>) 903
>> (B665F430) : 5 (%ERROR #<CCL::COMPILER-BUG #x1A74E07E> NIL -308708079) 111
>> (B665F444) : 6 (COMPILER-BUG "frag-vector freelist is circular") 127
>> (B665F46C) : 8 ((SETF %VECTOR-LIST-REF) 0 (#(85 137 229 106 235 ...)) 24) 119
>> (B665F490) : 9 (FRAG-PUSH-BYTE #<FRAG  #x1A1A434E> 0) 151
>> (B665F4AC) : 10 (FRAG-LIST-PUSH-32 #<DLL-HEADER  #x1A74E766> 12) 223
>> NIL ...)) 3943
>> CCL::X862-EXPAND-VINSN)> (525 (ASH # 2) 2)) 359
>> CCL::X862-EXPAND-VINSN)> ((:NOT #) (525 # 2))) 607
>> (B665F554) : 14 (X862-EXPAND-VINSN #<SET-NARGS 3> #<DLL-HEADER
>> NIL ...) #S(X86::X86-IMMEDIATE-OPERAND :TYPE 256 :VALUE 12)
>> #<DLL-HEADER  #x1A74E756>) 1135
>> (B665F590) : 15 (X862-EXPAND-VINSNS #<DLL-HEADER  #x1A74E8CE>
>> NIL ...) #<DLL-HEADER  #x1A74E756>) 543
>> (B665F5B8) : 16 (X862-COMPILE #<CCL::AFUNC #x1A74EDA6> NIL T) 8863
>> "Select a cube.")) :NAME NIL :ENV NIL :POLICY NIL
>> "Select a cube.")) NIL NIL) 183
>> WEB::EVAL-EMBEDDED-LISP)> "<div style='display:none'>
>>   <div class='contact-top'></div>
>>   <div class='contact-content'>
>>       <!-- warning! don't put too many characters in this h1 or the
>> whole dialog goes to crap -->
>>       <h1 class='contact-title'>
>>           <?cl (web::ifdef \"mode\" 0 \"Select a cube.\")?>
>> ...
>> [Above is the text template that is being evaluated. The last line of
>> the backtrace shows the lisp form that is causing openmcl to throw an
>> error.]
>> If you require additional information I would be glad to get it to you.
>> Thanks,
>> Lou Vanek
>> _______________________________________________
>> Openmcl-devel mailing list
>> Openmcl-devel at clozure.com
>> http://clozure.com/mailman/listinfo/openmcl-devel
