[Openmcl-devel] Re: [cl-ppcre-devel] cl-ppcre speedup
Gary Byers
gb at clozure.com
Wed Dec 15 10:22:45 PST 2004
On Wed, 15 Dec 2004, Alan Ruttenberg wrote:
>
> On Dec 15, 2004, at 3:41 AM, Gary Byers wrote:
>
> > I wonder if there's a clever way to inline CHAR-UPCASE/CHAR-DOWNCASE ?
> > (There are obviously straightforward ways to do this; it's sometimes
> > faster to do this via table lookup, but there may be ways to do this
> > without conditional branches or memory references.)
> >
>
> You never have to change more than one bit
>
> (loop for code from 0 below 256
> for char = (code-char code)
> for upcase-code = (char-code (char-upcase char))
> always (if (/= upcase-code code) (= #b100000 (logxor code
> upcase-code)) t))
>
> T
>
> If the top nibble isn't 6 or 7 then you don't change the bit
>
> (remove-duplicates
> (loop for code from 0 below 256
> for char = (code-char code)
> for upcase-code = (char-code (char-upcase char))
> when (/= upcase-code code)
> collect (/ (logand #xf0 code) 16)))
>
> (6 7)
>
> You need to do a lookup of the bit for 32 cases
>
> (loop for code from 0 below 256
> for char = (code-char code)
> for upcase-code = (char-code (char-upcase char))
> when (and (/= upcase-code code) (= (logand #xf0 code) #60)) sum 1)
> 15
>
> (loop for code from 0 below 256
> for char = (code-char code)
> for upcase-code = (char-code (char-upcase char))
> when (and (/= upcase-code code) (= (logand #xf0 code) #x70)) sum 1)
>
> 11
>
> I presume that the ifs aren't really conditionals but rather uses of a
> condition register value
>
> (defun char-upcase-by-bits (char)
> (let ((code (char-code char)))
> (let* ((nochange (if (= #x60 (logand code #xe0)) 1 0))
> (offset (if (= 0 (logand #x10 code)) 0 1))
> (testbit (if (logbitp (logior (ash offset 4) (logand #xf code))
> #b00000111111111111111111111111110) 1 0))
> (flipper (ash (logand testbit nochange) 5)))
> (code-char (logxor flipper code)))))
>
> seems to work:
>
> (loop for code from 0 below 256 for char = (code-char code)
> always (eq (char-upcase-by-bits char) (char-upcase char)))
>
> T
>
> The question is, how few instructions can this be packed in to and
> whether it runs faster than the simpler memory lookup.
>
>
What I was thinking of (and eventually figured out) was something like:
;;; Assume that arg_z contains a character. If it's uppercase, map it
;;; to lowercase. Don't branch.
(lis imm0 (char-code #\A))
(ori imm0 imm0 target::subtag-character) ; imm0 <- #\A
(lis imm1 (char-code #\A))
(ori imm1 imm1 target::subtag-character) ; imm1 <- #\Z)
(subfc imm1 arg_z imm1) ; imm1 <- 1, if
(addze imm1 rzero) ; (char<= arg_z #\Z),
; else 0
(subfc imm0 imm0 arg_z) ; imm0 <- 1, if
(addze imm0 rzero) ; (char>= arg_z #\A),
; else 0
(and imm0 imm1 imm0) ; imm0 <- 1 if upcase,
; else 0
(slwi imm0 imm0 5) ; imm0 < 32 or 0
(or arg_z arg_z imm0) ; maybe make lowercase
You might be able to knock a few instructions out of that, and I wouldn't
want to guarantee that I don't have the operands backwards in one or
both "subfc" instructions. The more straightforward:
(lis imm0 (char-code #\A))
(ori imm0 imm0 target::subtag-character) ; imm0 <- #\A
(lis imm1 (char-code #\A))
(ori imm1 imm1 target::subtag-character) ; imm1 <- #\Z)
(cmplw :cr0 arg_z imm0)
(cmplw :cr1 arg_z imm1)
(blt :cr0 no)
(bgt :cr1 no)
(ori arg_z arg_z 32)
no
is probably going to have worse performance in general (depending on
how often branches are mispredicted and how much that costs); that's
what I was trying to avoid.
(This is all just to address the question "if the compiler were to
open-code CHAR-UPCASE/CHAR-DOWNCASE, what kind of code should it
generate ?")
More information about the Openmcl-devel
mailing list