[Openmcl-devel] *default-external-format* and encoding and decoding strings.

Dmitry Igrishin dfigrish at gmail.com
Mon Mar 7 15:50:52 PST 2016


2016-03-08 2:34 GMT+03:00 Ron Garret <ron at flownet.com>:

>
> On Mar 7, 2016, at 3:23 PM, Gary Byers <gb at clozure.com> wrote:
>
> >
> >
> > On 03/07/2016 03:36 PM, Ron Garret wrote:
> >> On Mar 7, 2016, at 1:30 PM, R. Matthew Emerson <rme at clozure.com> wrote:
> >>
> >>>> On Mar 5, 2016, at 12:03 PM, Ron Garret <ron at flownet.com> wrote:
> >>>>
> >>>>
> >>>> On Mar 5, 2016, at 8:39 AM, Dmitry Igrishin <dfigrish at gmail.com> wrote:
> >>>>
> >>>>>
> >>>>> 2016-03-05 16:25 GMT+03:00 Dmitry Igrishin <dfigrish at gmail.com>:
> >>>>> Hello,
> >>>>>
> >>>>> The *default-external-parameter* isn't considered by the
> >>>>> count-characters-in-octet-vector, decode-string-from-octets,
> >>>>> encode-string-to-octets and string-size-in-octets functions, which
> >>>>> have an :external-format parameter. I believe that
> >>>>> *default-external-parameter* should affect the behaviour of
> >>>>> all functions with an :external-format parameter, right?
> >>>>> Sorry, I meant the *default-external-format* special variable...
> >>>> This is a bug in ccl::lookup-character-encoding.  Here’s a patch:
> >>>>
> >>>> (in-package :ccl)
> >>>>
> >>>> (let ((ccl::*warn-if-redefine-kernel* nil))
> >>>>   (defun lookup-character-encoding (name)
> >>>>     (gethash (or name *default-external-format*)
> >>>>              *character-encodings*)))
> >>> I don't think I can apply this.
> >>>
> >>> The issue is that nil is a valid character encoding name: it's a
> >>> documented synonym for :iso-8859-1.
> >> That’s a bug in the documentation.  NIL should be a synonym for
> >> *default-character-encoding*.  (I’m not joking.  That part of the docs was
> >> written before the introduction of *default-character-encoding*.)
> > when that part (what part is it, by the way?) was written, :iso-8859-1
> > was the default character encoding.
>
> Yes, I know.  That’s why I thought it was wrong, because the introduction
> of *default-external-format* had rendered it obsolete.  But then I read it
> more closely and remembered my history...
>
> > Changing the functions in question to use (e.g.)
> >
> > (defun encode-string-to-octets (string &key (external-format :default)
> >   ....) ...
> >
> > seems to be another approach
>
> Yes, just more work.
>
> > I suspect (but don't claim to know) that most uses of things like
> > ENCODE-STRING-TO-OCTETS involve an explicit
> > :EXTERNAL-FORMAT argument.
>
> As one who believes that UTF-8 is the One True Encoding, I have:
>
> (setf *default-external-format* :utf-8)
>
> in my ccl-init file, and I never pass an explicit argument to
> ENCODE-STRING-TO-OCTETS.  I consider anything that doesn’t work under those
> circumstances to be a bug.  But I also don’t really care that much because
> it’s so easy to just patch or wrap things that don’t work the way I want
> them to.  So personally I think it’s not entirely unreasonable to just
> leave things the way they are.  It depends on how much value you want to
> put on adhering to the principle of least surprise for new users (though
> with CL that ship did sail a long, long time ago).
>
The current behaviour was indeed surprising after reading the documentation
for ccl:*default-external-format*, which says:

"The value of this variable is used when :external-format is unspecified or
specified as :default."

Well, I left :external-format unspecified and got the wrong result:

CL-USER> (decode-string-from-octets (encode-string-to-octets "лисп"))
"^Z^Z^Z^Z"
4
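
For anyone who hits the same thing, the "just wrap it" approach mentioned
above could look roughly like the sketch below. The names string-to-octets*
and octets-to-string* are hypothetical (they are not part of CCL); the sketch
only assumes that ccl:encode-string-to-octets and ccl:decode-string-from-octets
accept an explicit :external-format argument, and it passes the current value
of ccl:*default-external-format* whenever the caller does not supply one.

```lisp
;; Hypothetical wrappers, not part of CCL: they default :external-format
;; to the current value of ccl:*default-external-format* at call time.
(defun string-to-octets* (string &key (external-format ccl:*default-external-format*))
  "Encode STRING to an octet vector using EXTERNAL-FORMAT."
  (ccl:encode-string-to-octets string :external-format external-format))

(defun octets-to-string* (octets &key (external-format ccl:*default-external-format*))
  "Decode OCTETS to a string using EXTERNAL-FORMAT."
  (ccl:decode-string-from-octets octets :external-format external-format))

;; Usage: bind the default to UTF-8 for the duration of the call.
;; The first returned value should round-trip back to the original string.
(let ((ccl:*default-external-format* :utf-8))
  (octets-to-string* (string-to-octets* "лисп")))
```

Passing :external-format :utf-8 directly to the CCL functions should have the
same effect; the wrappers only save typing when the special variable is
already set in the init file.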