[Openmcl-devel] Unicode issues, esp security

Luis Oliveira luismbo at gmail.com
Mon Apr 13 15:02:52 PDT 2009

Stelian Ionescu <stelian.ionescu-zeus at poste.it> writes:

> On Mon, 2009-04-13 at 22:24 +0200, james anderson wrote:
>> [ironic in this discussion, is that utf-8b is non-conformant - by  
>> definition.]
> I don't think so. See http://www.unicode.org/versions/Unicode5.1.0/
> paragraph E: "in processing the UTF-8 code unit sequence <F0 80 80 41>,
> the only requirement on a converter is that the <41> be processed and
> correctly interpreted as <U+0041>."

I think James's point is that UTF-8B is not specified by any standard,
so there is nothing for it to conform to.

You are right, though, that the UTF-8B decoding process is compatible
with UTF-8 conformance. The same is not true of the encoding process:
when a UTF-8B encoder turns escaped bytes back into their original
values, its output may be ill-formed UTF-8.
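
As a rough illustration of that asymmetry, here is a small Python sketch.
It assumes that Python's "surrogateescape" error handler (PEP 383) is a
fair stand-in for UTF-8B; that is my assumption, not something any
standard says. The names raw, text, encoded and is_valid_utf8 are made
up for the example.

```python
# Sketch only: Python's "surrogateescape" handler stands in for UTF-8B here.
raw = b"\xf0\x80\x80\x41"  # the ill-formed sequence from the quoted passage

# Decoding: each undecodable byte becomes a lone surrogate U+DC80..U+DCFF,
# and the trailing <41> is still interpreted as U+0041.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding reverses the escape, so the original bytes come back unchanged.
encoded = text.encode("utf-8", errors="surrogateescape")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if a strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")  # strict mode rejects ill-formed input
        return True
    except UnicodeDecodeError:
        return False


# The round trip is lossless, but the encoder's output is not valid UTF-8.
encoder_output_valid = is_valid_utf8(encoded)
```

The decoder never rejects its input and still reads the <41> correctly,
while the encoder hands back bytes that a strict UTF-8 decoder refuses.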

Luís Oliveira

