[Openmcl-devel] Plans for Unicode support within OpenMCL?

David Tolpin dvd at davidashen.net
Wed Mar 22 05:23:34 PST 2006


Hi,

I think the main confusion with regard to Unicode is that many people
think a Unicode code point is a character. It is not. It is an integer
index into UCS. A successful approach is the one that gave birth to
UTF-8 (the most widely used Unicode encoding) and is used in Plan 9.

There is a character, which is what a character is without Unicode.
There is a rune, which is an integer referring to a Unicode code
point. One can use Unicode encoded as character strings, that is,
UTF-8, or decode it into a Rune (that is, integer) array. Unicode
support then boils down to decoders/encoders and classifiers on Runes
and Rune arrays.

That's how Unicode is used and supported in real-world applications  
that deal with it.

David


