[Openmcl-devel] byte-strings

Gary Byers gb at clozure.com
Fri Aug 20 20:00:50 PDT 2004



On Fri, 20 Aug 2004, Ben wrote:

> hi,
>
> i'm writing an application which wants to treat strings as byte arrays
> for the purposes of FFI.  more specifically, i'd like to be able to
> quickly copy the contents of strings in and out of foreign character
> arrays.  i don't really care about what the character representation
> is, i'm happy treating everything as bits.
>
> under CMUCL and Franz i can directly pass strings to memcpy, which
> does the trick.  this works even with 16-bit strings in Franz,
> provided i pass in the correct byte-length.  under Lispworks there is
> a function "fli:replace-foreign-array" which i'm hoping will do the
> trick for me.
>
> of course there is nothing stopping me from doing (using UFFI)
>
> (defun copy-buf (str buf len &key (src-offset 0) (buf-offset 0))
>    (declare (optimize (speed 3) (safety 0))
>             (type string str)
>             (type array-char buf)
>             (type fixnum len src-offset buf-offset)
>             (dynamic-extent str buf len))
>    (typecase str
>      (simple-string
>       (loop for i fixnum from 0 below len
>             do
>             (setf (deref-array buf '(:array :char) (+ i buf-offset))
>                   (char-code (schar str (+ i src-offset))))))
>      (string
>       (loop for i fixnum from 0 below len
>             do
>             (setf (deref-array buf '(:array :char) (+ i buf-offset))
>                   (char-code (char str (+ i src-offset))))))))
>
> but i find memcpy is much, much faster.  is there any way to get at
> the underlying bits of a string?  e.g. is "with-cstr" using memcpy,
> and can i pass it a buffer to stuff the string into?
>

WITH-CSTR uses something called CCL::%COPY-IVECTOR-TO-PTR, which takes
5 arguments:

(ccl::%COPY-IVECTOR-TO-PTR     source			; an "ivector"
                               source-byte-offset	; not bounds-checked
                               dest			; a MACPTR
                               dest-byte-offset		; no checking here ..
                               nbytes)			; ... or here either

An "IVECTOR" is basically any (SIMPLE-ARRAY <type> (*)), where
<type> is something for which (UPGRADED-ARRAY-ELEMENT-TYPE <type>) is
other than T.  (E.g., a SIMPLE-STRING, a bit vector, a vector of
signed/unsigned bytes, a vector of some float type, etc.)  There are
other, implementation-level things that are also ivectors.

Note that a non-SIMPLE-STRING is -not- an ivector, but it's displaced
to one (see below.)
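
As a rough illustration of that definition, the predicate below approximates
"is this an ivector?" using only standard Common Lisp.  The name
PROBABLY-IVECTOR-P is hypothetical (it is not an OpenMCL function), and the
test can't see the implementation-level ivectors mentioned above, so treat
it as a sketch rather than the real check:

```lisp
;; Hypothetical helper, not part of OpenMCL: approximates the ivector
;; definition given above with portable Common Lisp only.
(defun probably-ivector-p (x)
  "True if X is a simple one-dimensional array whose (upgraded) element type
is something other than T."
  (and (typep x '(simple-array * (*)))
       ;; ARRAY-ELEMENT-TYPE reports the upgraded element type, so a
       ;; general vector (SIMPLE-VECTOR) shows up here as T.
       (not (eq (array-element-type x) t))))

;; Intended use:
;;   (probably-ivector-p "ABC")          ; a SIMPLE-STRING
;;   (probably-ivector-p (vector 1 2 3)) ; element type T, so not an ivector
;;   (probably-ivector-p (make-array 3 :element-type 'character
;;                                     :fill-pointer 0)) ; not simple
```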

CCL::%COPY-IVECTOR-TO-PTR's written in assembler; it could be faster
in some cases (using DOUBLE-FLOATs or AltiVec to move larger chunks
of congruently-aligned stuff around), but in practice it's pretty fast.

To copy the bytes in a SIMPLE-STRING to foreign memory:

(ccl::%copy-ivector-to-ptr "ABC" 0 p 0 3)

If a string isn't a SIMPLE-STRING, it's (perhaps transitively) displaced
to one.  CCL::ARRAY-DATA-AND-OFFSET returns two values:

 (0) the underlying (SIMPLE-ARRAY * (*))
 (1) the cumulative displacement

Calling CCL::ARRAY-DATA-AND-OFFSET on something that -is- a
(SIMPLE-ARRAY * (*)) returns that argument and 0.
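
Putting the two pieces together, something like the following should handle
both simple and non-simple strings.  This is only a sketch:
COPY-STRING-BYTES-TO-PTR is a hypothetical name, and it assumes one byte
per character, so that the displacement returned by
CCL::ARRAY-DATA-AND-OFFSET can be used directly as a byte offset.  With
wider characters the offset and count would both need scaling.

```lisp
;; Hypothetical helper; assumes each character occupies exactly one byte.
(defun copy-string-bytes-to-ptr (string ptr &optional (nchars (length string)))
  "Copy the first NCHARS characters of STRING to the foreign memory at PTR."
  ;; %COPY-IVECTOR-TO-PTR does no bounds checking, so do a little here.
  (assert (<= 0 nchars (length string)))
  (multiple-value-bind (data offset)
      (ccl::array-data-and-offset string) ; underlying ivector + displacement
    (ccl::%copy-ivector-to-ptr data offset ptr 0 nchars)))

;; Possible use, with a stack-allocated 3-byte buffer:
(defun first-copied-byte ()
  (ccl::%stack-block ((p 3))
    (copy-string-bytes-to-ptr "ABC" p)
    (ccl::%get-unsigned-byte p 0)))
```

The caller is still responsible for making sure the foreign buffer is at
least NCHARS bytes long; nothing here checks that.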

> thanks in advance, B

If you look at WITH-CSTR (I hadn't done so in a long time ...), you'll
see a lot of code that's a little too general wrapped around a call to
CCL::%COPY-IVECTOR-TO-PTR.  (Hmmm ... if you look at it as long as I
just did, you'll also see that CCL::DEREFERENCE-BASE-STRING should
just be returning the length of its argument as a third value, not
(+ offset length) as it's been doing.)


