[Openmcl-devel] representing infinity

Gary Byers gb at clozure.com
Wed Mar 24 21:50:26 PST 2004



On Wed, 24 Mar 2004, Gary King wrote:

> Hi Gary,
>
> To follow up a bit on this, please look at the following OpenMCL session.
> I can create infinities one step at a time but not "all at once" in an
> unwind-protect. I'm afraid this goes beyond my debugging skills but I'm
> hoping to learn more from your answer. FWIW, the unwind-protect does
> work in MCL 5.0.
>
> Welcome to OpenMCL Version (Beta: Darwin) 0.14.1-p1!
> ? (ccl:get-fpu-mode)
> (:ROUNDING-MODE :NEAREST :OVERFLOW T :UNDERFLOW NIL :DIVISION-BY-ZERO T
> :INVALID T :INEXACT NIL)
> ? (ccl:set-fpu-mode :division-by-zero nil)
> (:ROUNDING-MODE :NEAREST :OVERFLOW T :UNDERFLOW NIL :DIVISION-BY-ZERO
> NIL :INVALID T :INEXACT NIL)
> ? (setf positive-infinity (/ 0d0))
> #<INFINITY 1.797693134862316D+308>
> ? (ccl:set-fpu-mode :division-by-zero t)
> (:ROUNDING-MODE :NEAREST :OVERFLOW T :UNDERFLOW NIL :DIVISION-BY-ZERO T
> :INVALID T :INEXACT NIL)
> ? positive-infinity
> #<INFINITY 1.797693134862316D+308>
> ? (setf another-positive-infinity
> (unwind-protect
>                (progn
>                  (ccl:set-fpu-mode :division-by-zero nil)
>                  (/ 0d0))
>                (ccl:set-fpu-mode :division-by-zero t))))
> Unhandled exception 11 at 0x00004774, context->regs at #xf0134ad8
> Read operation to unmapped address 0x004ba000
> ? for help
> [5351] OpenMCL kernel debugger: ?
> (S)  Set specified GPR to new value
> (A)  Advance the program counter by one instruction (use with caution!)
> (D)  Describe the current exception in greater detail
> (R)  Show raw GPR/SPR register values
> (L)  Show Lisp values of tagged registers
> (F)  Show FPU registers
> (B)  Show backtrace
> (X)  Exit from this debugger, asserting that any exception was handled
> (K)  Kill OpenMCL process
> (?)  Show this help
> [5351] OpenMCL kernel debugger: D
> Read operation to unmapped address 0x004ba000
> [5351] OpenMCL kernel debugger: R
> r00 = 0x00000000  r08 = 0x00002015  r16 = 0x00002015  r24 = 0x05142A4E
> r01 = 0xF0134FC0  r09 = 0x05396580  r17 = 0x050124EE  r25 = 0x0000006C
> r02 = 0x00101F50  r10 = 0x05390000  r18 = 0x050124DE  r26 = 0x051429C6
> r03 = 0x004B9FFC  r11 = 0x051429C6  r19 = 0x00000033  r27 = 0x00000088
> r04 = 0x00427930  r12 = 0x004B8730  r20 = 0x00000000  r28 = 0x00002015
> r05 = 0x05141102  r13 = 0x00426070  r21 = 0x05399736  r29 = 0x7FFFFFFC
> r06 = 0x004B87BE  r14 = 0x0110C624  r22 = 0x000000E4  r30 = 0x0539A2FE
> r07 = 0x00000000  r15 = 0x0510BD9E  r23 = 0x00000340  r31 = 0x053983DE
>
>   PC = 0x00004774   LR = 0x0110C624  CTR = 0x01014B56  CCR = 0x42842482
> XER = 0x00000000  MSR = 0x0000F930  DAR = 0x004BA000  DSISR = 0x40000000
> [5351] OpenMCL kernel debugger: L
> rcontext = 0x00101F50
> nargs = 21301873
> r15 (fn) = #<Function PRINT-A-NAN #x0510bd9e>
> r23 (arg_z) = 208
> r22 (arg_y) = 57
> r21 (arg_x) = "performing CCL::UNKNOWN on (#<NaN
> -2.696539702543562D+308>
> r20 (temp0) = 0
> r19 (temp1/next_method_context) = #<Unbound>
> r18 (temp2/nfn) = #<Function %SET-FPSCR-CONTROL #x050124de>
> r17 (temp3/fname) = %SET-FPSCR-CONTROL
> r16 (temp4) = ()
> r31 (save0) = #<3-element vector subtag = 72 @#x053983de>
> r30 (save1) =
> r29 (save2) = 536870911
> r28 (save3) = ()
> r27 (save4) = 34
> r26 (save5) = #<2-element vector subtag = 9A @#x051429c6>
> r25 (save6) = 27
> r24 (save7) = FORMAT-ERROR
> [5351] OpenMCL kernel debugger: F
> f00 : 0xFFF80000000000D0 (nan)
> f01 : 0xBFF0000000000000 (-1.000000)
> f02 : 0x41D6403067400000 (1493221789.000000)
> f03 : 0x412E848000000000 (1000000.000000)
> f04 : 0x4173EA3800000000 (20882304.000000)
> f05 : 0x3FE40C7A3A33C850 (0.626523)
> f06 : 0xFFF8000000098F5B (nan)
> f07 : 0xC24BC19587859393 (-238423838475.152924)
> f08 : 0xC24BC19587859393 (-238423838475.152924)
> f09 : 0xC24BC19587859393 (-238423838475.152924)
> f10 : 0xC24BC19587859393 (-238423838475.152924)
> f11 : 0xC24BC19587859393 (-238423838475.152924)
> f12 : 0xC24BC19587859393 (-238423838475.152924)
> f13 : 0xC24BC19587859393 (-238423838475.152924)
> f14 : 0xC24BC19587859393 (-238423838475.152924)
> f15 : 0xC24BC19587859393 (-238423838475.152924)
> f16 : 0xC24BC19587859393 (-238423838475.152924)
> f17 : 0xC24BC19587859393 (-238423838475.152924)
> f18 : 0xC24BC19587859393 (-238423838475.152924)
> f19 : 0xC24BC19587859393 (-238423838475.152924)
> f20 : 0xC24BC19587859393 (-238423838475.152924)
> f21 : 0xC24BC19587859393 (-238423838475.152924)
> f22 : 0xC24BC19587859393 (-238423838475.152924)
> f23 : 0xC24BC19587859393 (-238423838475.152924)
> f24 : 0xC24BC19587859393 (-238423838475.152924)
> f25 : 0xC24BC19587859393 (-238423838475.152924)
> f26 : 0xC24BC19587859393 (-238423838475.152924)
> f27 : 0xC24BC19587859393 (-238423838475.152924)
> f28 : 0xC24BC19587859393 (-238423838475.152924)
> f29 : 0xC24BC19587859393 (-238423838475.152924)
> f30 : 0x4330000080000000 (4503601774854144.000000)
> f31 : 0x0000000000000000 (0.000000)
> FPSCR = 000000D0

There are two bugs here.  The one that causes the segfault is the
result of a change in some assembly-language code that's called
by SET-FPU-MODE: it stopped using a stack as a temporary location
but continued to push a frame on that stack (without popping the
frame off).  The catching/throwing/unwind-protecting that the
REPL does might mask that bug, but trashing the stack in something
like UNWIND-PROTECT probably isn't going to work ...

The second bug is a little more subtle: the Floating Point Status
and Control Register (FPSCR, as seen above) has 24 "status" bits
and 8 "control" bits.  The "control" bits (mostly) indicate which
exceptions are enabled (and SET-FPU-MODE provides a high-level
interface to that); the "status" bits indicate which exceptions
have occurred.  If an enabled exception occurs, you get a trap
which'll manifest itself as a lisp ARITHMETIC-ERROR of some kind.
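
To make the control bits concrete, here's a small sketch (in Python
rather than lisp, since it's only bit arithmetic) that decodes the
low 8 bits of an FPSCR value such as the FPSCR = 000000D0 shown in
the debugger output.  The bit positions follow the PowerPC FPSCR
layout; the function name and dictionary keys are made up for
illustration and aren't part of OpenMCL.

```python
# Control bits in the low byte of the PowerPC FPSCR.
VE = 0x80       # invalid-operation exception enable
OE = 0x40       # overflow exception enable
UE = 0x20       # underflow exception enable
ZE = 0x10       # zero-divide exception enable
XE = 0x08       # inexact exception enable
RN_MASK = 0x03  # rounding-mode field

# Names chosen to resemble the keywords printed by GET-FPU-MODE.
ROUNDING = {0: "nearest", 1: "zero", 2: "positive", 3: "negative"}


def decode_control(fpscr):
    """Hypothetical helper: describe the low 8 bits of an FPSCR value."""
    return {
        "rounding-mode": ROUNDING[fpscr & RN_MASK],
        "overflow": bool(fpscr & OE),
        "underflow": bool(fpscr & UE),
        "division-by-zero": bool(fpscr & ZE),
        "invalid": bool(fpscr & VE),
        "inexact": bool(fpscr & XE),
    }


mode = decode_control(0x000000D0)
```

For #xD0 this gives the same settings that the first GET-FPU-MODE
call in the quoted session reports.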

SET-FPU-MODE basically writes to the control bits without changing the
status bits.  If one of the status bits indicates that a
DIVISION-BY-ZERO occurred (even if that exception was disabled) and
we then write to the control bits and re-enable DIVISION-BY-ZERO,
the act of doing this might cause an exception (an enabled exception
will have its "occurred" bit set, even if the operation that caused
that happened hundreds of instructions earlier).
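
A very simplified model of that sequence, in Python: it only tracks
the zero-divide status bit (ZX) and its enable bit (ZE), and it
ignores the FEX summary bit and the MSR bits that the real hardware
also consults.  All of the names are invented for the sketch.

```python
ZX = 0x04000000            # status bit: a zero-divide has occurred (sticky)
ZE = 0x00000010            # control bit: zero-divide exception is enabled
CONTROL_MASK = 0x000000FF  # low 8 bits of the FPSCR


class FPTrap(Exception):
    """Stands in for the trap that lisp reports as an ARITHMETIC-ERROR."""


def check(fpscr):
    # An exception that is both enabled and marked as "occurred" traps.
    if fpscr & ZX and fpscr & ZE:
        raise FPTrap("division-by-zero")
    return fpscr


def divide_by_zero(fpscr):
    # The event is recorded whether or not the exception is enabled.
    return check(fpscr | ZX)


def write_control_keep_status(fpscr, control):
    # Model of the old behaviour: replace the low 8 bits, keep the status.
    return check((fpscr & ~CONTROL_MASK) | (control & CONTROL_MASK))


fpscr = 0x000000D0                              # division-by-zero enabled
fpscr = write_control_keep_status(fpscr, 0xC0)  # disable it
fpscr = divide_by_zero(fpscr)                   # no trap, but ZX is now set
try:
    write_control_keep_status(fpscr, 0xD0)      # re-enabling traps here
    trapped = False
except FPTrap:
    trapped = True
```

In this model the trap is raised by the write that re-enables the
exception, not by the division itself.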

I think that the fix is to zero all of the status bits whenever
the control bits are written to (i.e., all pending exceptions
get cleared whenever the set of enabled exceptions changes).  There
are lower-level functions that read and write the control and status
fields independently; if any code that uses these functions wants
to check if a disabled exception occurred, it should do so before
re-enabling the exception.
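
Sketched in Python (a model of the intended behaviour, not the lisp
code; the helper names are hypothetical), the rule and the "check
before re-enabling" ordering look something like this:

```python
ZX = 0x04000000            # status bit: a zero-divide has occurred (sticky)
CONTROL_MASK = 0x000000FF  # low 8 bits of the FPSCR


def zero_divide_occurred(fpscr):
    # Read the sticky status bit; must be done before the status is cleared.
    return bool(fpscr & ZX)


def write_control_clear_status(control):
    # Proposed behaviour: the new FPSCR is just the control byte,
    # so every pending "occurred" bit is dropped.
    return control & CONTROL_MASK


fpscr = 0x040000C0                             # ZX set while ZE was disabled
saw_zero_divide = zero_divide_occurred(fpscr)  # check first ...
fpscr = write_control_clear_status(0xD0)       # ... then re-enable
```

After the write, the status bit is gone, so code that asks only
afterwards can no longer tell that the division by zero happened.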

The enclosed patch seems to fix both bugs; I haven't checked to
see whether any existing code (e.g., the float printer) expects
exception status to be preserved after control information changes.
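
For anyone reading the patch who doesn't have the PPC instruction
set in their head: the first operand of MTFSF is an 8-bit mask in
which each bit selects one 4-bit field of the FPSCR (field 0 is the
most significant), and CLRLWI clears the high-order N bits of a
32-bit word.  Here's a small Python sketch of that arithmetic; the
function names are mine, and it ignores the details of how the
hardware treats individual status bits.

```python
def mtfsf_mask(fm):
    """Expand an 8-bit MTFSF field mask into a 32-bit FPSCR bit mask."""
    mask = 0
    for field in range(8):
        if fm & (0x80 >> field):              # field 0 is the leftmost bit
            mask |= 0xF << (28 - 4 * field)   # each field is 4 bits wide
    return mask


def clrlwi(value, n):
    """Clear the high-order n bits of a 32-bit value."""
    return value & (0xFFFFFFFF >> n)


old_write = mtfsf_mask(0x03)   # fields 6-7: the low 8 "control" bits
new_write = mtfsf_mask(0xFF)   # fields 0-7: the whole register
control_only = clrlwi(0x040000D0, 24)
```

So the old code wrote only the low 8 bits, while the patched code
writes all 32, with the upper 24 forced to zero beforehand.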
-------------- next part --------------
Index: level-0/PPC/ppc-float.lisp
===================================================================
RCS file: /usr/local/tmpcvs/ccl-0.14/ccl/level-0/PPC/ppc-float.lisp,v
retrieving revision 1.3
diff -u -r1.3 ppc-float.lisp
--- level-0/PPC/ppc-float.lisp	10 Feb 2004 21:47:28 -0000	1.3
+++ level-0/PPC/ppc-float.lisp	25 Mar 2004 05:35:46 -0000
@@ -544,14 +544,13 @@
   (mtfsf #xfc fp0)                      ; set status fields [0-5]
   (blr))
 
-; Set the low 8 bits of the FPSCR; leave the high 24 unchanged
+; Set the low 8 bits of the FPSCR. Zero the upper 24 bits.
 (defppclapfunction %set-fpscr-control ((new arg_z))
   (unbox-fixnum imm0 new)
-  (stwu tsp -16 tsp)
-  (stw tsp 4 tsp)
+  (clrlwi imm0 imm0 24)                 ; ensure that "status" fields are clear
   (stw imm0 target::tcr.lisp-fpscr-low rcontext)
   (lfd fp0 target::tcr.lisp-fpscr-high rcontext)
-  (mtfsf #x03 fp0)                      ; set control fields [6-7]
+  (mtfsf #xff fp0)                      ; set all fields [0-7]
   (blr))
 
 (defppclapfunction %ffi-exception-status ()

