[Openmcl-devel] Unicode in OpenMCL

Andrew P. Lentvorski, Jr. bsder at mail.allcaps.org
Wed Jun 23 22:53:56 UTC 2004


On Jun 23, 2004, at 2:13 PM, Steve Jenson wrote:

>> I think Unicode Normalization Forms deals adequately with the 
>> canonicalization problem: http://www.unicode.org/reports/tr15/
>>
>> Do you remember the problems that Collation introduced?

That report is listed as applying to Unicode 4.0.0.  A quick look at 
the appendix shows that different conversions apply to different 
versions of Unicode.  Not a huge problem, but not straightforward 
either.

Ah!  I remember the collation issue now--hash keys.  Using a Unicode 
string as a hash key hits all of the strange issues with collation and 
canonicalization.  Getting around the collation problem requires an 
extra indirection into a collation table.
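
To make the canonicalization half of the hash-key problem concrete, 
here is a minimal sketch in Python.  It uses the standard 
unicodedata.normalize call; the helper name canonical_key and the 
choice of NFC are my own assumptions for illustration, not anything 
OpenMCL does.  It only covers canonical equivalence and does not 
address collation, which would still need a separate collation key.

```python
import unicodedata

def canonical_key(s: str) -> str:
    """Hypothetical helper: normalize to NFC before using s as a hash key."""
    return unicodedata.normalize("NFC", s)

composed = "\u00e9"      # e-acute as a single code point
decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT

# The two spellings are canonically equivalent but compare unequal as
# raw code point sequences, so they would hash to different keys.
raw_equal = composed == decomposed

table = {}
table[canonical_key(composed)] = "found"

# Normalizing on lookup as well as on insert makes both spellings
# reach the same entry.
lookup = table.get(canonical_key(decomposed))
```

The cost is a normalization pass on every insert and every lookup, 
and the result depends on the Unicode version of the normalization 
data in use.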

There is also the question of text direction if Unicode is going to be 
fully integrated into the language.

While languages like Python support Unicode strings, the programs 
themselves are not Unicode.  This is a much more difficult problem in 
Lisp, where the line between data and program is far more ambiguous.

-a



