[Openmcl-devel] Character Encoding Problem?

Philippe Sismondi psismondi at arqux.com
Thu Dec 23 19:17:40 PST 2010


In the past day or so I posted a question regarding file :external-format usage before I had learned everything I should have.

However, in attempting to sort out my character encoding problems I have observed some behaviour in ccl which seems problematic to me. This problem relates to the presence of a null character, i.e. #\Null, in a string.

When my function outputs the following two strings (in this order) to a file using external-format :utf-8, the character encoding of the second string gets messed up:

(format out "AB^@D~%")
(format out "María~%")

In the first string above, ^@ represents the null character. Notice that the second string contains an accented i (í), which is char-code 237. If the null character is not present, the second string is encoded properly on output. When the null is there, the bytes written for the second string are not valid UTF-8 for "María", but I don't really know what ccl is trying to do with it.
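
A self-contained way to check this might look like the sketch below. It writes the same two strings through a :utf-8 stream, reads the file back as raw bytes, and compares them with the bytes a correct UTF-8 encoder should produce. The function names, the variable name, and the file path are hypothetical, chosen only for illustration, and the expected bytes assume Unix line endings (a single linefeed per ~%).

```lisp
;; Hypothetical helper names; only standard Common Lisp operators are used.
;; The :utf-8 external-format keyword is implementation-defined, but both
;; ccl and sbcl accept it.

(defun write-test-file (path)
  "Write the two strings to PATH as UTF-8; the first contains #\\Null."
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :external-format :utf-8)
    ;; ~C inserts the character itself, so the source file needs no
    ;; literal null or non-ASCII character.
    (format out "AB~CD~%" (code-char 0))
    (format out "Mar~Ca~%" (code-char 237))))

(defun file-octets (path)
  "Return the raw bytes of PATH as a list of integers."
  (with-open-file (in path :element-type '(unsigned-byte 8))
    (loop for byte = (read-byte in nil nil)
          while byte
          collect byte)))

(defparameter *expected-octets*
  '(65 66 0 68 10             ; A B NUL D linefeed
    77 97 114 195 173 97 10)  ; M a r, then #xC3 #xAD for char-code 237, a linefeed
  "Bytes a correct UTF-8 encoder should produce for the two lines.")

(defun null-affects-encoding-p (path)
  "True if the bytes written to PATH differ from *EXPECTED-OCTETS*."
  (write-test-file path)
  (not (equal (file-octets path) *expected-octets*)))
```

Evaluating (null-affects-encoding-p "/tmp/null-encoding-test.txt") should return NIL on an implementation that encodes both lines correctly; a true result would point at the behaviour described above. Inspecting the result of file-octets directly shows exactly which bytes went wrong.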

The nulls are getting into the strings from external binary files that I am parsing. Either the input data is corrupt, or my parser is buggy. In any case, the string containing the null was output a thousand lines or so before the messed-up string, so it took me a long time to find the connection.

However the nulls got in my strings, it does not seem right to me that the character encoding on output should be affected by this. At least, I tried the same thing in sbcl and did not observe this behaviour.

Is this a bug? Or am I doin' it wrong?

Best,

- Phil -



