[Openmcl-devel] *default-character-encoding* should be :utf-8

Ron Garret ron at flownet.com
Sun Mar 4 08:23:02 PST 2012


On Mar 4, 2012, at 3:38 AM, Pascal J. Bourguignon wrote:

> Tim Bradshaw <tfb at tfeb.org> writes:
> 
>> On 4 Mar 2012, at 07:23, Pascal J. Bourguignon wrote:
>>> 
>>> On POSIX systems, I would like the defaults (*default-external-format*
>>> *default-file-character-encoding* *default-socket-character-encoding*)
>>> to be what is specified by the environment variables LC_ALL, or else
>>> LC_CTYPE, or else LANG, or else LANGUAGE (the latest being a GNU
>>> extension, I wonder why).  See http://clisp.org/impnotes/clisp.html
>>> (search: environment variables).
>> 
>> Do you understand how the encoding should depend on the locale
>> variables, as I've never been able to work that out (and the CLISP
>> documentation doesn't say how they work it out).  This isn't a
>> rhetorical question: I'd like to know.  Feel free to mail me privately
>> if you have any pointers, as this might be off-topic for the list.
> 
> LC_ALL=C               ==> :us-ascii
> LC_ALL=en_EN.UTF-8     ==> :utf-8
> LC_ALL=fr_FR.ISO8859-1 ==> :iso-8859-1
> LC_ALL=gr_GR.ISO8859-7 ==> :iso-8859-7
> 
> and so on.
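
The mapping quoted above can be sketched in a few lines. This is only an illustration, written in Python rather than Lisp, and not how CLISP or CCL actually does it. The function name `encoding_from_locale` is hypothetical. It assumes the precedence LC_ALL, then LC_CTYPE, then LANG; it skips LANGUAGE; and it assumes a locale with no codeset suffix falls back to :us-ascii.

```python
import re


def encoding_from_locale(env):
    """Guess an encoding name from POSIX locale variables (hypothetical helper)."""
    # Assumed precedence: first non-empty of LC_ALL, LC_CTYPE, LANG.
    value = next((env[name] for name in ("LC_ALL", "LC_CTYPE", "LANG")
                  if env.get(name)), "")
    if value in ("", "C", "POSIX"):
        return "us-ascii"
    # Locale names look like language[_territory][.codeset][@modifier].
    codeset = value.split("@")[0].partition(".")[2].lower()
    if not codeset:
        return "us-ascii"  # assumption: no codeset given means plain ASCII
    # Normalize spellings such as ISO8859-1 or iso_8859_1 to iso-8859-1.
    iso = re.fullmatch(r"iso[-_]?8859[-_]?(\d+)", codeset)
    if iso:
        return "iso-8859-" + iso.group(1)
    if codeset in ("utf8", "utf-8"):
        return "utf-8"
    return codeset
```

In a real program the argument would be `os.environ`; passing a plain dictionary such as `{"LC_ALL": "fr_FR.ISO8859-1"}` makes the lookup easy to try by hand.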

But (IMHO) code (Lisp or otherwise) should ALWAYS be in UTF-8 no matter what (and IMHO, CCL ought to ship with UTF-8 as the default encoding).  There are two reasons for this.  First, code consists primarily of code points (no pun intended) <128, and UTF-8 is the most compact encoding for that distribution that can still represent all code points.  Second, code often has to be dealt with by beginners who already have enough on their plate without having to worry about getting their encoding settings correct.
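
The size argument is easy to check for the common Unicode encodings. The sketch below, in Python for illustration, compares only UTF-8, UTF-16 and UTF-32, so it does not cover every encoding that can represent all code points. The helper name `encoded_sizes` and the sample form are hypothetical.

```python
def encoded_sizes(text):
    """Return the byte length of text in three Unicode encodings."""
    # The -le variants are used so that no byte-order mark is counted.
    return {enc: len(text.encode(enc))
            for enc in ("utf-8", "utf-16-le", "utf-32-le")}


# A typical line of source code: every code point is below 128.
sample = "(defun square (x) (* x x))\n"
sizes = encoded_sizes(sample)
for name, size in sizes.items():
    print(name, size)
```

For text made up only of code points below 128, UTF-8 uses one byte per character, UTF-16 two, and UTF-32 four.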

rg



