Fwd: [Openmcl-devel] Unicode in OpenMCL
stevej at pobox.com
Wed Jun 23 21:13:40 UTC 2004
Whoops, forgot to reply to the list.
Begin forwarded message:
> From: Steve Jenson <stevej at pobox.com>
> Date: June 23, 2004 2:04:55 PM PDT
> To: "Andrew P. Lentvorski, Jr." <bsder at mail.allcaps.org>
> Subject: Re: [Openmcl-devel] Unicode in OpenMCL
> On Jun 23, 2004, at 1:50 PM, Andrew P. Lentvorski, Jr. wrote:
>> On Jun 23, 2004, at 1:00 PM, Gary Byers wrote:
>>> I think that it would be bad to have EXTENDED-CHAR (basically, bad to
>>> have more than one type of CHARACTER); it makes more sense to me to
>>> make all CHARACTERs (BASE-CHARs) and make CHAR-CODE-LIMIT be 2^24
>>> or so (I think that Unicode 4 needs about 21 bits to natively encode
>>> any character.)
>> I can't seem to find it, but I seem to recall a discussion about this
>> on one of the Lisp Wikis.
>> The general consensus was that adding Unicode to the basic character
>> type was a bad idea because ASCII can make certain guarantees that
>> Unicode cannot. Collation sequence and canonical representation
>> are the ones I can remember (i.e., characters have an order, and a set
>> of characters is defined by one and only one bytestream).
>> I am in favor of some form of Unicode string. However, the notion of
>> a "character" in Unicode is a very fuzzy thing.
> I think Unicode Normalization Forms deals adequately with the
> canonicalization problem: http://www.unicode.org/reports/tr15/
> Do you remember the problems that Collation introduced?
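
To make the canonicalization point concrete, here is a minimal sketch in Python (not Lisp, and not anything OpenMCL provides) using the standard unicodedata module. It assumes that "canonical representation" means comparing strings after applying one of the TR15 normalization forms; the helper name canonically_equal is made up for illustration.

```python
import unicodedata

# Two different code point sequences for the same visible character "é":
# the precomposed U+00E9, and "e" followed by the combining acute U+0301.
composed = "\u00e9"
decomposed = "e\u0301"


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# Without normalization the two strings are different sequences.
raw_equal = composed == decomposed

# After normalization they compare equal.
normalized_equal = canonically_equal(composed, decomposed)

# NFD splits the precomposed form; NFC recombines the decomposed form.
nfd_length = len(unicodedata.normalize("NFD", composed))
nfc_length = len(unicodedata.normalize("NFC", decomposed))
```

Normalizing once at input or comparison time gives each string a single representation, which is the guarantee the quoted objection says plain Unicode lacks. It does not address collation, which is a separate problem.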