[Openmcl-devel] Slime's utf-8-unix
Gary Byers
gb at clozure.com
Fri Oct 27 15:32:02 PDT 2006
On Fri, 27 Oct 2006, Ben Hyde wrote:
>
> In the HTTP world the character set is very volatile. Coming from
> that world my first attempt was to slam the encoding just after the
> accept. At this moment I'm actually a bit unclear on how to do that
> in a safe and reliable manner.
The fact that STREAM-EXTERNAL-FORMAT is SETFable is supposed to help
to deal with this.
? (defvar *s* (make-socket :remote-host "clozure.com" :remote-port :smtp))
*S*
;;; Whoops; forgot that SMTP wants CRLF termination, and things should probably
;;; be in "NET-ASCII" at this point.
? (setf (stream-external-format *s*) '(:character-encoding :us-ascii
                                       :line-termination :crlf))
#<EXTERNAL-FORMAT :US-ASCII/:CRLF #x300040EBC27D>
? (read-line *s*)
"220 clozure.com ESMTP Sendmail 8.13.3/8.13.3; Fri, 27 Oct 2006 15:10:41 -0600 (MDT)"
So, the stream handled CRLF translation for us (which is at least and
at most a small victory.) In general, changing a stream's
external-format affects subsequent user-level character I/O
operations; if there are already some buffered octets, they stay buffered;
changing the stream's character encoding merely changes the way that those
octets will be interpreted as characters.
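As a rough illustration of that last point, here is a sketch that works on a bare octet vector rather than on a stream. It assumes a CCL that provides CCL:ENCODE-STRING-TO-OCTETS and CCL:DECODE-STRING-FROM-OCTETS; DECODE-BOTH-WAYS and *E-ACUTE-OCTETS* are made-up names. The same octets come out as different characters depending on the encoding used to read them:

```lisp
;;; Hypothetical helper: decode one octet vector under two encodings.
;;; Assumes CCL's ENCODE-STRING-TO-OCTETS / DECODE-STRING-FROM-OCTETS.
(defun decode-both-ways (octets)
  "Return two strings: OCTETS read as UTF-8 and as ISO-8859-1."
  (values (ccl:decode-string-from-octets octets :external-format :utf-8)
          (ccl:decode-string-from-octets octets :external-format :iso-8859-1)))

;;; LATIN SMALL LETTER E WITH ACUTE (U+00E9) takes two octets in UTF-8.
(defparameter *e-acute-octets*
  (ccl:encode-string-to-octets (string (code-char #xE9))
                               :external-format :utf-8))
```

Calling (decode-both-ways *e-acute-octets*) should return a one-character string for the UTF-8 reading and a two-character string for the ISO-8859-1 reading; the octets themselves never change.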
Real-world protocols may contain additional details (transfer encodings - if
that's the correct term - like base64 and quoted-printable) that this doesn't
help with at all, but it should be possible/safe/reliable to change character
encoding and/or line-termination on the fly.
Back to Swank: that seems to give us a third option, namely doing:
(let* ((stream-socket (ccl:accept-connection ... :wait t)))
  (setf (stream-external-format stream-socket)
        (make-external-format ...))
  stream-socket)
in swank-openmcl.lisp's ACCEPT-CONNECTION.
Whether that's at all preferable to binding the special variables is unclear;
in theory, if the CCL:ACCEPT-CONNECTION call was interrupted and something tried
to create a socket (from a break loop or something) at that time, the fact
that the defaults have been bound to non-default values might create
unexpected problems. For a number of reasons, that's an unlikely scenario,
so whether it's done via binding the defaults or via (SETF STREAM-EXTERNAL-FORMAT)
may not matter too much.
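For concreteness, the (SETF STREAM-EXTERNAL-FORMAT) variant might be wrapped up roughly as follows. ACCEPT-WITH-EXTERNAL-FORMAT and its keyword defaults are invented for illustration and are not what swank-openmcl.lisp actually contains; LISTENER is assumed to be a passive socket created with CCL:MAKE-SOCKET.

```lisp
;;; Hypothetical helper, not part of Swank or CCL.
;;; Accepts one connection on LISTENER, then sets the new stream's
;;; external format explicitly, without binding any global defaults.
(defun accept-with-external-format (listener
                                    &key (character-encoding :utf-8)
                                         (line-termination :unix))
  (let ((stream-socket (ccl:accept-connection listener :wait t)))
    (setf (stream-external-format stream-socket)
          (ccl:make-external-format :character-encoding character-encoding
                                    :line-termination line-termination))
    stream-socket))
```

Because nothing is bound around the accept, a socket created from a break loop in the meantime would still see the ordinary defaults.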
>> It'd probably be good - once some dust settles - to make utf-8 the
>> default;
>> getting Slime to support it would settle a lot of that dust.
>
> Absolutely! Unicode support in the emacs world is just as tangled as
> it is in the lisp world. Dust would come out of both sides.
>
I read a page recently which described the various ways that some versions
of XEmacs encoded characters internally; as one might expect, it was bloodcurdling.
> - ben
>