[Openmcl-devel] Plans for Unicode support within OpenMCL?
David Tolpin
dvd at davidashen.net
Wed Mar 22 05:23:34 PST 2006
Hi,
I think the main confusion with regard to Unicode is that many tend
to think that a Unicode code point is a character. It is not. It is
an integer index into UCS. A successful approach is the one that gave
birth to UTF-8 (the most widely used Unicode encoding) and is used in
Plan 9.
There is a character, which is what a character is without Unicode.
There is a rune, which is an integer referring to a Unicode code
point. One can use Unicode encoded as character strings, that is,
UTF-8, or decode it into a Rune (that is, integer) array. Unicode
support then boils down to decoders/encoders and classifiers on Runes
and Rune arrays.
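
To make that concrete, here is a rough sketch of the three pieces in
Python (chosen only for illustration). The names decode_utf8,
encode_utf8 and is_letter are hypothetical and are not taken from
Plan 9 or OpenMCL. The decoder is deliberately minimal and assumes
reasonably well-formed input: it does not reject overlong forms or
surrogate code points.

```python
import unicodedata


def decode_utf8(data: bytes) -> list[int]:
    """Decode UTF-8 bytes into a list of runes (integer code points)."""
    runes = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte tells us how many bytes the sequence occupies.
        if lead < 0x80:
            size, rune = 1, lead
        elif lead >> 5 == 0b110:
            size, rune = 2, lead & 0x1F
        elif lead >> 4 == 0b1110:
            size, rune = 3, lead & 0x0F
        elif lead >> 3 == 0b11110:
            size, rune = 4, lead & 0x07
        else:
            raise ValueError(f"invalid lead byte at offset {i}")
        if i + size > len(data):
            raise ValueError(f"truncated sequence at offset {i}")
        # Each continuation byte contributes its low six bits.
        for cont in data[i + 1:i + size]:
            if cont >> 6 != 0b10:
                raise ValueError(f"invalid continuation byte after offset {i}")
            rune = (rune << 6) | (cont & 0x3F)
        runes.append(rune)
        i += size
    return runes


def encode_utf8(runes: list[int]) -> bytes:
    """Encode a list of runes (integer code points) as UTF-8 bytes."""
    out = bytearray()
    for r in runes:
        if r < 0:
            raise ValueError(f"rune out of range: {r}")
        if r < 0x80:
            out.append(r)
        elif r < 0x800:
            out += bytes([0xC0 | (r >> 6), 0x80 | (r & 0x3F)])
        elif r < 0x10000:
            out += bytes([0xE0 | (r >> 12),
                          0x80 | ((r >> 6) & 0x3F),
                          0x80 | (r & 0x3F)])
        elif r < 0x110000:
            out += bytes([0xF0 | (r >> 18),
                          0x80 | ((r >> 12) & 0x3F),
                          0x80 | ((r >> 6) & 0x3F),
                          0x80 | (r & 0x3F)])
        else:
            raise ValueError(f"rune out of range: {r}")
    return bytes(out)


def is_letter(rune: int) -> bool:
    """Classifier on a rune: true if its general category is a letter (L*)."""
    return unicodedata.category(chr(rune)).startswith("L")
```

The encoded form stays an ordinary byte string, and only code that
needs to look at individual code points has to go through the rune
array.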
That's how Unicode is used and supported in real-world applications
that deal with it.
David