Fwd: [Openmcl-devel] Unicode in OpenMCL

Wed Jun 23 14:13:40 PDT 2004

Whoops, forgot to reply to the list.

Begin forwarded message:

> From: Steve Jenson <stevej at pobox.com>
> Date: June 23, 2004 2:04:55 PM PDT
> To: "Andrew P. Lentvorski, Jr." <bsder at mail.allcaps.org>
> Subject: Re: [Openmcl-devel] Unicode in OpenMCL
>
>
> On Jun 23, 2004, at 1:50 PM, Andrew P. Lentvorski, Jr. wrote:
>
>> On Jun 23, 2004, at 1:00 PM, Gary Byers wrote:
>>
>>> I think that it would be bad to have EXTENDED-CHAR (basically, bad to
>>> have more than one type of CHARACTER); it makes more sense to me to
>>> make all CHARACTERs (BASE-CHARs) and make CHAR-CODE-LIMIT be 2^24
>>> or so (I think that Unicode 4 needs about 21 bits to natively encode
>>> any character.)
>>
>> I can't seem to find it, but I seem to recall a discussion about this 
>> on one of the Lisp Wikis.
>>
>> The general consensus was that adding Unicode to the basic character 
>> type was a bad idea because ASCII can make certain guarantees that 
>> Unicode cannot.  Collation sequence and canonical representation 
>> being the ones I can remember (i.e., characters have an order, and a 
>> set of characters is defined by one and only one byte stream).
>>
>> I am in favor of some form of Unicode string.  However, the notion of 
>> a "character" in Unicode is a very fuzzy thing.
>
> I think the Unicode Normalization Forms deal adequately with the 
> canonicalization problem: http://www.unicode.org/reports/tr15/
>
> Do you remember the problems that Collation introduced?
>
>
> -steve
>
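
For readers following the canonicalization point, here is a minimal sketch, in Python rather than Lisp and not taken from the thread. It shows two different code point sequences for the same visible character comparing equal once both are put into one normalization form (NFC here). The helper name canonically_equal is hypothetical. The last line checks the "about 21 bits" figure against the highest Unicode code point, U+10FFFF.

```python
import unicodedata

# Two encodings of the same user-perceived character "e with acute accent".
composed = "\u00e9"      # single code point: LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after NFC normalization."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A raw code-point comparison treats the two spellings as different strings.
raw_equal = composed == decomposed

# After normalization both spellings collapse to one representation.
nfc_equal = canonically_equal(composed, decomposed)

# Number of bits needed to hold the highest code point, U+10FFFF.
max_code_point_bits = (0x10FFFF).bit_length()

print(raw_equal, nfc_equal, max_code_point_bits)
```

Normalization only addresses canonical representation; it does not by itself define a language-appropriate sort order, which is the separate collation question raised above.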