[Openmcl-devel] code-char from #xD800 to #xDFFF
peter
p2.edoc at googlemail.com
Tue Jul 31 09:48:33 PDT 2012
It seems that code-char returns nil for codes from #xD800 to #xDFFF;
otherwise it returns characters for codes from 0 to
(- (lsh 1 16) 3), i.e. #xFFFD. I take this to be the behaviour
defined in ccl::fixnum->char.
<http://www.unicode.org/charts/PDF/UDC00.pdf> and
<http://www.unicode.org/charts/PDF/UD800.pdf> say
"Isolated surrogate code points have no interpretation; consequently,
no character code charts or names lists are provided for this range."
<http://ccl.clozure.com/manual/chapter4.5.html#Unicode> says that
these codes "will never be valid character codes and will return NIL
for arguments in that range".
When using CCL to run a dynamic web service, this can be inconvenient
when passing material from external sources through CCL to remote
browsers (for instance, Japanese Emoji icon characters occupy this
code area; sources use them and web browsers render them).
I cannot understand why CCL behaves as it does here, but I assume
there is a good reason. That is, would it not make sense to return a
character with the appropriate code value even if CCL itself has no
use for it?
Is there any efficient strategy which side-steps this issue?
At the moment I am intercepting character codes in this area and
replacing them with #\Replacement_Character or #\null, but in doing
so I lose the character code. Material passing through CCL therefore
has some characters eliminated in transit, which changes the original
meaning or intent of the material.
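One possible direction, offered only as a sketch: if the codes in
this range reach the service as UTF-16 code units (an assumption, not
something established above), then a high unit (#xD800 to #xDBFF)
followed by a low unit (#xDC00 to #xDFFF) encodes a single code point
at or above #x10000, and the pair can be combined before any
character is constructed. Only units that remain unpaired would then
need replacing. The sketch below is in Python purely to show the
arithmetic; the function names are hypothetical and nothing here is
CCL API.

```python
REPLACEMENT = 0xFFFD  # code of U+FFFD REPLACEMENT CHARACTER


def is_high_surrogate(code):
    """True for the first unit of a UTF-16 surrogate pair."""
    return 0xD800 <= code <= 0xDBFF


def is_low_surrogate(code):
    """True for the second unit of a UTF-16 surrogate pair."""
    return 0xDC00 <= code <= 0xDFFF


def combine_surrogates(units):
    """Turn a sequence of UTF-16 code units into a list of code points.

    A well-formed high/low pair becomes one code point >= 0x10000.
    A surrogate without a partner is replaced by REPLACEMENT.
    Every other unit is passed through unchanged.
    """
    out = []
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if (is_high_surrogate(u) and i + 1 < n
                and is_low_surrogate(units[i + 1])):
            # Each half of the pair carries 10 bits of the code point.
            out.append(0x10000
                       + ((u - 0xD800) << 10)
                       + (units[i + 1] - 0xDC00))
            i += 2
        elif is_high_surrogate(u) or is_low_surrogate(u):
            # Unpaired surrogate: no code point can be recovered.
            out.append(REPLACEMENT)
            i += 1
        else:
            out.append(u)
            i += 1
    return out
```

For example, combine_surrogates([0xD83D, 0xDE00]) yields the single
code #x1F600, which lies outside the #xD800 to #xDFFF range. Whether
this helps depends on the input really being paired code units; a
unit that arrives alone still cannot be preserved this way.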