[Openmcl-devel] Bug with default-external-format

Pascal J. Bourguignon pjb at informatimago.com
Mon Mar 26 13:51:08 UTC 2012

ven <ven at mantra.io> writes:

> http://paste.lisp.org/display/128568
> Not sure if this SLIME or CCL related, but you can view the original
> hello-world function (that contains japanese characters), and the
> output in the slime-repl running ccl. 
> ccl is run with -K utf-8, and the system locale is en_US.UTF_8.

ccl ignores the locale environment variables (LC_ALL, LC_CTYPE, LANG)
when choosing the encoding :-(

A file such as:
(defun hello-world ()
  (format t "おはよう world!"))
is meaningless.  We don't know what encoding it's written in.

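If you use emacs, declare the encoding on the first line of the file,
then set the file encoding in ccl before loading it: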
;; -*- mode:lisp; coding:utf-8 -*-
(defun hello-world ()
  (format t "おはよう world!"))

$ ccl -norc 
Welcome to Clozure Common Lisp Version 1.7-dev-r14788M-trunk
? (setf ccl:*default-file-character-encoding* :utf-8)
? (load "hw.lisp")
? (hello-world)
おはよう world!
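
Instead of changing the global default, the encoding can also be given
per stream.  The following is only a sketch: the function name
write-and-read-utf-8 and the path argument are made up for illustration,
and it assumes the implementation accepts :utf-8 as an external-format
designator (ccl does).

```lisp
;; Hypothetical helper, not part of ccl: writes TEXT to PATH as UTF-8
;; and reads the first line back, with the encoding given explicitly
;; on both streams rather than taken from any default.
(defun write-and-read-utf-8 (path text)
  (with-open-file (out path :direction :output
                            :if-exists :supersede
                            :if-does-not-exist :create
                            :external-format :utf-8)
    (write-line text out))
  (with-open-file (in path :direction :input
                           :external-format :utf-8)
    (read-line in)))
```

The same keyword argument works for a single file with
(load "hw.lisp" :external-format :utf-8).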

Notice that the -K option only sets the terminal encoding, for the
*terminal-io* stream, not for files:

   -K, --terminal-encoding : specify character encoding to use for *TERMINAL-IO*

You may use the following in ~/ccl-init.lisp:

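;; PREFIXP is not defined by Common Lisp; this definition is an
;; assumption about the helper used below.
(defun prefixp (prefix string)
  "Returns true when STRING starts with PREFIX."
  (let ((end (mismatch prefix string)))
    (or (null end) (= end (length prefix)))))

(defun locale-terminal-encoding ()
  "Returns the terminal encoding specified by the locale(7)."
  #+(and ccl windows)
  ;; ccl doesn't support :windows-1252, otherwise we'd use:
  ;; (intern (format nil "WINDOWS-~A" (#_GetACP)) "KEYWORD")
  ;; Assumption: fall back to the same default as the other branch.
  :iso-8859-1
  #-(and ccl windows)
  (dolist (var '("LC_ALL" "LC_CTYPE" "LANG")
               :iso-8859-1) ; some random default…
    (let* ((val (getenv var))
           (dot (position #\. val))
           (at  (position #\@ val :start (or dot (length val)))))
      (when (and dot (< dot (1- (length val))))
        (return (intern (let ((name (string-upcase (subseq val (1+ dot)
                                                           (or at (length val))))))
                          (if (and (prefixp "ISO" name) (not (prefixp "ISO-" name)))
                              (concatenate 'string "ISO-" (subseq name 3))
                              name))
                        "KEYWORD"))))))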
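
To make the parsing step concrete, here is a minimal sketch of the same
idea as a pure function on a locale string, so it can be tried at the
REPL without touching the environment.  The names starts-with-p and
locale-string-encoding are hypothetical, chosen for this example only.

```lisp
;; Hypothetical helper: true when STRING starts with PREFIX.
(defun starts-with-p (prefix string)
  (let ((end (mismatch prefix string)))
    (or (null end) (= end (length prefix)))))

;; Hypothetical function: extracts the codeset from a locale string of
;; the form language_TERRITORY.codeset@modifier and returns it as a
;; keyword, or NIL when the string has no codeset part.
(defun locale-string-encoding (val)
  (let* ((dot (position #\. val))
         (at  (and dot (position #\@ val :start dot))))
    (when (and dot (< dot (1- (length val))))
      (let ((name (string-upcase (subseq val (1+ dot) at))))
        ;; Locales often spell it ISO8859-1; the keyword wants ISO-8859-1.
        (intern (if (and (starts-with-p "ISO" name)
                         (not (starts-with-p "ISO-" name)))
                    (concatenate 'string "ISO-" (subseq name 3))
                    name)
                "KEYWORD")))))
```

For example, (locale-string-encoding "en_US.UTF-8") gives :UTF-8 and
(locale-string-encoding "fr_FR.ISO8859-1") gives :ISO-8859-1.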

(defun set-terminal-encoding (encoding)
  ;; Swank sets the encoding on its streams correctly. 
  #-(and ccl (not swank)) (declare (ignore encoding))
  #+(and ccl (not swank))
  (mapc (lambda (stream)
          (setf (ccl::stream-external-format stream)
                (ccl:make-external-format :domain nil
                                          :character-encoding encoding
                                          :line-termination
                                          #+unix :unix
                                          #+windows :windows
                                          #-(or unix windows) :unix)))
        (list (two-way-stream-input-stream  *terminal-io*)
              (two-way-stream-output-stream *terminal-io*))))

(let ((encoding  (locale-terminal-encoding)))
    (set-terminal-encoding encoding)
    (setf ccl:*default-file-character-encoding*   encoding
          ccl:*default-socket-character-encoding* encoding
          ccl:*default-external-format*           #+windows :windows
                                                  #-windows :unix
          ccl:*default-line-termination*          #+windows :windows
                                                  #-windows :unix))

__Pascal Bourguignon__                     http://www.informatimago.com/
A bad day in () is better than a good day in {}.
