[Openmcl-devel] how many angels can dance on a unicode character?

Takehiko Abe keke at gol.com
Sat Apr 21 05:31:52 PDT 2007

Agh. Gary, you are too fast... 

A quick response.

> If wanted to exchange the first and last characters in that
> string, I might use something (stupid) like:
> (defun exchange-first-and-last-characters (string)
>    (let* ((len (length string)))
>      (when (> len 1)
>        (let* ((temp (char string (1- len))))
>          (setf (char string (1- len)) (char string 0)
>                (char string 0) temp)))
>       string))

You win. A UTF-16 version would be hairy.
But this isn't fair, because you don't do this in practice.

But if I really need it I'm sure I'll write it. And once I
have exchange-first-and-last-characters I'll never have
to look back.
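
For illustration, here is a rough sketch of why the UTF-16 version is hairier. It assumes a "UTF-16 string" is modelled as a vector of (unsigned-byte 16) code units; that representation and every function name below are hypothetical and are not taken from OpenMCL. Because the first and last characters may occupy one or two code units each, the sketch builds a fresh vector instead of swapping in place, which is a simplification compared with the destructive version quoted above.

```lisp
;; Sketch only: UNITS is any vector of 16-bit code units.
;; All names here are hypothetical helpers, not an existing API.

(defun high-surrogate-p (unit)
  (<= #xD800 unit #xDBFF))

(defun low-surrogate-p (unit)
  (<= #xDC00 unit #xDFFF))

(defun first-char-width (units)
  "Number of code units (1 or 2) taken by the first character."
  (if (and (> (length units) 1)
           (high-surrogate-p (aref units 0))
           (low-surrogate-p (aref units 1)))
      2
      1))

(defun last-char-width (units)
  "Number of code units (1 or 2) taken by the last character."
  (let ((len (length units)))
    (if (and (> len 1)
             (low-surrogate-p (aref units (- len 1)))
             (high-surrogate-p (aref units (- len 2))))
        2
        1)))

(defun exchange-first-and-last-characters-utf16 (units)
  "Return a fresh code-unit vector with the first and last characters swapped."
  (let ((len (length units)))
    (if (< len 2)
        (copy-seq units)
        (let ((fw (first-char-width units))
              (lw (last-char-width units)))
          (if (> (+ fw lw) len)
              ;; The whole vector is a single surrogate pair: nothing to swap.
              (copy-seq units)
              (concatenate '(vector (unsigned-byte 16))
                           (subseq units (- len lw))     ; last character
                           (subseq units fw (- len lw))  ; untouched middle
                           (subseq units 0 fw)))))))     ; first character
```

Unpaired surrogates are treated as one-unit characters here. An in-place version would also have to shift the middle of the vector whenever the two widths differ.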

> Suppose we were to instead say that - formally or not - these 16-bit
> strings were really UTF-16-encoded; we could allow the use of
> surrogate pairs inside 16-bit strings.  If we did this "informally",
> functions like SCHAR would either return true CHARACTER objects or the
> high or low half of a surrogate pair.  Since we aren't inventing a new
> language, the values returned by CHAR and SCHAR would have to be

Yes, but the CL standard does not say what CHARACTERS are
other than the standard characters. 

> even though they aren't "real": we can't ask ICU or
> anything else what the uppercase version of such a pseudo-character is
> in some locale. 

You can't. But I don't think that's a problem.

CHAR-UPCASE (and STRING-UPCASE) can safely return supplementary
characters untouched. (Not only supplementary characters: I think
it's fine for the CL case conversion functions to ignore every
character except ASCII, or rather, the standard characters.)
When you want to do case conversion, you know what your locales
are and you ought to have a certain expectation of what the result
would look like. Then you ask ICU to do the job by passing a
Sorry. I'll try again tomorrow.

"Just carry on."
