[Openmcl-devel] *default-character-encoding* should be :utf-8

Tom Emerson tree at dreamersrealm.net
Mon Sep 24 08:49:17 PDT 2012


Just some observations from the peanut gallery:

- Differentiating UTF-16/UTF-32 is possible because of the BOM at the
start of the file, which :UCS-2, :UTF-16, and :UTF-32 will process.

- While not required, :UTF-8 encoded files can start with a suitably
encoded BOM, which the transcoder could detect. I know that some
Windows-based editors will write the BOM on UTF-8 files, though none
on Unix will AFAIR.

- For source files I think having the compiler/loader grok the -*-
coding -*- directive at the start of the file is the best solution.
Python uses this, for example, and I usually put it in anyway when I'm
using Emacs because the majority of my Lisp programming is for
NLP-related tasks in non-English languages.
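
To make the three points above concrete, here is a rough sketch of the
detection order I have in mind: BOM first, then the -*- coding -*-
directive on the first line, then a fallback default. It is written in
Python only because that is easy to run anywhere; it is not how CCL
does (or would do) it. The name sniff_encoding and the choice of
iso-8859-1 as the fallback are my own assumptions for illustration.

```python
import codecs
import re

# BOM signatures, longest first: the UTF-32-LE BOM begins with the
# same two bytes as the UTF-16-LE BOM, so it has to be tested earlier.
# (A UTF-16-LE file whose first character is U+0000 would be misread
# as UTF-32-LE; this sketch ignores that corner case.)
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]

# Emacs-style file variable, e.g. ";;; -*- Mode: Lisp; coding: utf-8 -*-"
_CODING = re.compile(rb"-\*-.*?coding:\s*([-\w.]+).*?-\*-")


def sniff_encoding(head, default="iso-8859-1"):
    """Guess the encoding from the first bytes of a file.

    Returns (encoding_name, bom_length). The name is returned as
    written in the file; mapping it to a real codec is left out.
    """
    # 1. A BOM, if present, decides the matter.
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name, len(bom)
    # 2. Otherwise look for a coding directive on the first line only.
    first_line = head.split(b"\n", 1)[0]
    match = _CODING.search(first_line)
    if match:
        return match.group(1).decode("ascii").lower(), 0
    # 3. Nothing found: fall back to the (assumed) default.
    return default, 0
```

A caller would read the first few hundred bytes of the file, pass them
to sniff_encoding, skip bom_length bytes, and decode the rest with the
returned encoding.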

I tend to agree with Gary in that the default should probably be
:ISO-8859-1 for the reasons he states: you can round trip between it
and Unicode without loss of information, since each of the 256 octet
values maps to exactly one character. The same is not true of
:UTF-8 --- as Alexander found.
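
The round-trip property is easy to check outside of Lisp. The sketch
below uses Python's codecs, not CCL's, so it only illustrates the
general point; how CCL treats invalid UTF-8 input, and what exactly
Alexander ran into, may differ in detail. The variable names are mine.

```python
# Every one of the 256 possible byte values, in order.
raw = bytes(range(256))

# ISO-8859-1: each byte becomes the code point with the same number,
# so decoding and re-encoding gives back the original bytes.
latin1_text = raw.decode("iso-8859-1")
latin1_ok = latin1_text.encode("iso-8859-1") == raw

# UTF-8, strict: bytes such as 0x80 are not valid on their own,
# so decoding arbitrary data fails outright.
try:
    raw.decode("utf-8")
    utf8_strict_ok = True
except UnicodeDecodeError:
    utf8_strict_ok = False

# UTF-8, lenient: invalid bytes are replaced by U+FFFD, and the
# original byte values cannot be recovered on the way back out.
lossy_text = raw.decode("utf-8", errors="replace")
utf8_lossy_ok = lossy_text.encode("utf-8") == raw

print(latin1_ok, utf8_strict_ok, utf8_lossy_ok)
```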

I also agree with Ron, insofar as almost every text file I deal with
is encoded in UTF-8 and so I either have to set the default or
explicitly define the encoding every time I use with-open-file ---
which I usually do anyway in case I have to give my code to a coworker
who may not be using my init file. Doing this is annoying. But that's
life.
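
For what it's worth, the habit I am describing looks the same in most
languages: name the encoding at the call site instead of relying on
whatever default the reader's environment happens to have. In Common
Lisp that means passing :external-format to with-open-file; the
Python analogue below is just a sketch, and read_text_utf8 and the
sample file are made up for the example.

```python
import os
import tempfile


def read_text_utf8(path):
    # The encoding is stated explicitly, so the result does not depend
    # on the locale or on anyone's personal startup settings.
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Write a small UTF-8 file to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w", encoding="utf-8") as f:
        f.write("naïve café 東京\n")
    text = read_text_utf8(path)
```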

Perhaps it is worth adding some text to the installation guide that
explicitly talks about the default encodings, and giving the
appropriate recipes for defaulting to UTF-8 for those who want it? I
could put some text together (though probably not this week, since I'm
moving houses.)

    -tree

-- 
Tom Emerson
tree at dreamersrealm.net
http://www.dreamersrealm.net/tree


