[Openmcl-devel] *default-character-encoding* should be :utf-8

Gary Byers gb at clozure.com
Sun Mar 4 17:53:57 PST 2012



On Sun, 4 Mar 2012, Ron Garret wrote:

>
> On Mar 4, 2012, at 10:55 AM, Raymond Wiker wrote:
>
> > Another point here is that the encoding of a particular file is what
> > it is, no matter what the user's environment has been set up for. It
> > is no help if the user's environment has been set up for a
> > particular encoding when at least some source files are in a
> > completely different encoding.
>
> Yes, this is another reason why it's important for everyone to use
> the same encoding unless there's a COMPELLING reason not to.  If
> you're writing code, UTF-8 is the One True Encoding.

If you're writing code in a vacuum, UTF-8 is a good choice.

If your sources contain a significant number of CJK characters, some
variant of UTF-16 can be a better choice.
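
As a rough illustration of the size tradeoff (a sketch in Python rather
than CCL, with made-up sample strings): characters in the CJK ranges of
the Basic Multilingual Plane take 3 octets each in UTF-8 but 2 in UTF-16,
while ASCII takes 1 octet in UTF-8 and 2 in UTF-16.

```python
# Compare encoded sizes of ASCII-heavy and CJK-heavy text.
# The sample strings are hypothetical; only the ratios matter.
samples = {
    "ascii": "(defun hello () (print 42))",
    "cjk": "漢字仮名交じり文",
}


def encoded_sizes(text):
    """Return the number of octets needed for text in each encoding."""
    # "utf-16-le" is used so that no byte-order mark is counted.
    return {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le")}


for name, text in samples.items():
    print(name, encoded_sizes(text))
```

Source files that mix ASCII syntax with occasional CJK strings or
comments will usually still come out smaller in UTF-8; the advantage
shifts to UTF-16 only when CJK text dominates.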

If your sources are in some legacy encoding - MacRoman is an example
that still comes up from time to time - then you obviously need to
process them with that encoding in effect or you'll lose information.
If you don't do this and the default encoding is utf-8, you'll
generally lose more information than if the default encoding was
fixed-length.
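
A small sketch of that information loss, using Python's codecs as a
stand-in (this is not CCL's decoder, and the replacement-character
behavior shown is Python's `errors="replace"` policy, assumed here only
for illustration). A single-octet encoding such as ISO-8859-1 maps every
octet to some character, so the original octets survive a wrong guess;
a UTF-8 decoder has to reject or replace octets that don't form valid
sequences.

```python
# A MacRoman-encoded string read with the wrong default encoding.
original = "café"
legacy_bytes = original.encode("mac_roman")

# Wrong guess 1: UTF-8. The non-ASCII octet is not a valid UTF-8
# sequence, so it is replaced and the original octet is gone.
as_utf8 = legacy_bytes.decode("utf-8", errors="replace")

# Wrong guess 2: ISO-8859-1. The text looks wrong, but every octet
# maps to exactly one character, so nothing is discarded.
as_latin1 = legacy_bytes.decode("latin-1")

# The ISO-8859-1 misreading can be undone; the UTF-8 one cannot.
recovered = as_latin1.encode("latin-1").decode("mac_roman")

print(repr(as_utf8))    # contains U+FFFD REPLACEMENT CHARACTER
print(repr(as_latin1))  # mojibake, but reversible
print(recovered == original)
```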

I/O to and from CHARACTER streams encoded in UTF-8 is generally at least
somewhat more expensive than it is when the stream is encoded in ISO-8859-1.
Some of this overhead is implementation artifact and some of it's inherent.
The parts that I'm attributing to "implementation artifact" could be addressed
by fairly significant changes to CCL's stream code, but the inherent costs
would remain.  As things stand, those costs are measurable, can be significant,
and obviously affect things like compilation speed.
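
To make the "inherent" part concrete, here is a deliberately simplified
pair of decoders in Python. They are hypothetical illustrations, not
CCL's stream code, and the UTF-8 one skips checks a real decoder needs
(overlong forms, surrogates, range limits). The point is only that
ISO-8859-1 is an identity map from octet to code point, while UTF-8 has
to inspect each lead octet, branch on its length class, and combine
continuation octets.

```python
def decode_latin1(data: bytes) -> str:
    # Fixed-length: one octet is one character, code point == octet value.
    return "".join(map(chr, data))


def decode_utf8(data: bytes) -> str:
    # Variable-length: the lead octet determines how many follow.
    out = []
    i, n = 0, len(data)
    while i < n:
        b = data[i]
        if b < 0x80:
            cp, extra = b, 0
        elif b >> 5 == 0b110:
            cp, extra = b & 0x1F, 1
        elif b >> 4 == 0b1110:
            cp, extra = b & 0x0F, 2
        elif b >> 3 == 0b11110:
            cp, extra = b & 0x07, 3
        else:
            raise ValueError(f"invalid lead octet at offset {i}")
        if i + extra >= n and extra:
            raise ValueError(f"truncated sequence at offset {i}")
        for k in range(1, extra + 1):
            c = data[i + k]
            if c >> 6 != 0b10:
                raise ValueError(f"invalid continuation at offset {i + k}")
            cp = (cp << 6) | (c & 0x3F)
        out.append(chr(cp))
        i += extra + 1
    return "".join(out)


sample = "naïve 日本語 😀"
print(decode_utf8(sample.encode("utf-8")) == sample)
```

A stream built on the second decoder also can't know how many
characters a buffer of N octets holds without scanning it, which
complicates things like buffering and file positioning.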

One could make the argument that the benefits of changing CCL's defaults to UTF-8
are so compelling as to outweigh these and any similar costs that I'm not thinking
of.  I don't think that I agree with that, but it'd certainly be a plausible and
defensible argument with a lot of merit.

In reading some of the messages in this thread, I'm not sure that some of the
people arguing in favor of this change are even aware of the fact that there
are some costs associated with it, and may assume that my reluctance to make
the change has something to do with ... uh, I'm not sure, but it may be the
belief that I find it hard to change a DEFPARAMETER somewhere.  I'd like to
put these rumors to rest.  I don't mean to brag, but I've been changing
DEFPARAMETERs for a long time and have gotten pretty good at it (I can sometimes
do it without introducing typos or other catastrophes.)  I changed the value
in a DEFPARAMETER the other day, and just might change another one tomorrow.
(Whoops.  I said that I didn't want to brag, but just couldn't help it.)

>
> rg
>
> _______________________________________________
> Openmcl-devel mailing list
> Openmcl-devel at clozure.com
> http://clozure.com/mailman/listinfo/openmcl-devel
>
>


