[Openmcl-devel] Class instance vs struct creation question

Wed May 11 16:35:20 PDT 2011

Thanks for the explanation Gary.  That is indeed a fair amount of stuff to
account for after all...

/Jon

Original Message:
-----------------
From: Gary Byers gb at clozure.com
Date: Wed, 11 May 2011 00:32:19 -0600 (MDT)
To: j-anthony at comcast.net, openmcl-devel at clozure.com
Subject: Re: [Openmcl-devel] Class instance vs struct creation question

If you pretty-print the macroexpansion of

(defstruct foo a (b 1))

you'll see (amongst other things) the definition of the default constructor:

(DEFUN MAKE-FOO (&KEY ((:A #:A) NIL) ((:B #:B) 1))
   (GVECTOR :STRUCT
            '(#<CLASS-CELL for FOO #x302004726D4D>)
            #:A
            #:B))

CCL::GVECTOR is a macro whose effect is similar to the function CL:VECTOR;
rather than creating and initializing a SIMPLE-VECTOR with the specified
elements, the call above will create a vector-like object of the primitive
type used to represent structure instances.  The 0th element of that
structure will contain some type information (the list containing a 
CLASS-CELL object) and subsequent elements contain the values of
the constructor's arguments (defaulted to NIL and 1).  That operation
is a handful of instructions (maybe two hands full); processing a couple
of keyword arguments is probably at least as expensive.
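
To look at this yourself, something along the following lines should work
in any conforming Common Lisp. The printed expansion is
implementation-specific, so expect output resembling the MAKE-FOO
definition above only in CCL; the calls to the constructor and accessors
are portable.

```lisp
;; Define the structure from the example above.
(defstruct foo a (b 1))

;; Pretty-print one level of the DEFSTRUCT expansion.
;; What gets printed differs between implementations.
(pprint (macroexpand-1 '(defstruct foo a (b 1))))

;; The default keyword constructor fills in the slot defaults.
(foo-a (make-foo))        ; slot A defaults to NIL
(foo-b (make-foo))        ; slot B defaults to 1
(foo-b (make-foo :b 42))  ; an explicit keyword overrides the default
```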

In the similar case involving a standard class:

(defclass bar ()
   ((a :initform nil :initarg :a)
    (b :initform 1 :initarg :b)))

(make-instance 'bar ...)

conceptually has to look up the class name and do:

(make-instance (find-class 'bar) ...)

(though the cost of the lookup can often be avoided if the class name in the
call to MAKE-INSTANCE is constant.)

MAKE-INSTANCE ((class standard-class)) &rest initargs

does

(apply #'allocate-instance class initargs)

For STANDARD-CLASSes, the initargs are ignored by the primary
ALLOCATE-INSTANCE method.  That method creates a couple of vector-like
objects: the instance itself and a separate "slots vector".  (Class
redefinition can cause the number of slots in the instance to change;
keeping the slots in a separate vector-like object makes this easier to
deal with.)  The slots in the newly-allocated instance (actually in its
slots vector) are all unbound.

MAKE-INSTANCE passes the newly-allocated instance and the initargs to 
INITIALIZE-INSTANCE, which in turn passes the instance and initargs
to SHARED-INITIALIZE.  SHARED-INITIALIZE initializes each slot based
on whether it has an initarg in the initargs list, on whether it's
still unbound, and on its initform (which is usually implemented as
an initfunction.)  When this process is all done, you've got an initialized
instance.
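
The sequence just described can be sketched in portable Common Lisp. The
function SKETCH-MAKE-INSTANCE below is a hypothetical name used only for
illustration; it is not how CCL implements MAKE-INSTANCE, and it leaves out
things the real function does (initarg validity checking and default
initargs, for instance).

```lisp
(defclass bar ()
  ((a :initform nil :initarg :a)
   (b :initform 1 :initarg :b)))

;; Hypothetical, simplified model of the MAKE-INSTANCE protocol.
(defun sketch-make-instance (class-name &rest initargs)
  (let* ((class (find-class class-name))
         ;; Step 1: allocate.  The standard-class primary method ignores
         ;; the initargs and returns an instance with unbound slots.
         (instance (apply #'allocate-instance class initargs)))
    ;; Step 2: initialize.  INITIALIZE-INSTANCE hands the instance and
    ;; initargs to SHARED-INITIALIZE, which fills each slot from its
    ;; initarg if supplied, otherwise from its initform.
    (apply #'initialize-instance instance initargs)
    instance))

;; Slots are unbound straight after allocation:
(slot-boundp (allocate-instance (find-class 'bar)) 'a)

;; After initialization, A comes from the initarg and B from its initform:
(let ((x (sketch-make-instance 'bar :a 7)))
  (list (slot-value x 'a) (slot-value x 'b)))
```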

It can be hard to measure this sort of thing (if you do MAKE-FOO or
MAKE-INSTANCE enough times to get reasonable measurements, you're also
measuring memory allocation costs.)  Depending on lots of factors (how
many slots are involved, whether there are type constraints, whether
any of the GFs involved have applicable non-primary methods, ...) a
difference of 10X is probably atypically low.  I find that 10000 calls
to (MAKE-FOO) are about 400-500X faster than the same number of calls
to (MAKE-INSTANCE 'BAR).  (On a ~3GHz iMac, each instance of BAR seems
to take ~1.5usec to allocate and initialize and some of that's just
allocation time; that might be higher if (for instance) there are more
slots and more initargs.  Whether or not that's lost in the noise depends
on context, but it certainly might not be.)
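
A rough way to run this kind of comparison is sketched below. SECONDS-FOR
and COMPARE-CONSTRUCTORS are hypothetical helper names, and the loop
measures allocation and any GC activity along with construction, as noted
above. Results will vary with the machine, the implementation and the
optimization settings, so no particular ratio should be expected.

```lisp
(defstruct foo a (b 1))

(defclass bar ()
  ((a :initform nil :initarg :a)
   (b :initform 1 :initarg :b)))

;; Hypothetical helper: call THUNK N times and return elapsed
;; wall-clock time in seconds.
(defun seconds-for (thunk n)
  (let ((start (get-internal-real-time)))
    (dotimes (i n)
      (declare (ignorable i))
      (funcall thunk))
    (/ (float (- (get-internal-real-time) start))
       internal-time-units-per-second)))

;; Hypothetical helper: time N constructions of each kind and report both.
(defun compare-constructors (&optional (n 1000000))
  (let ((struct-time (seconds-for (lambda () (make-foo)) n))
        (class-time  (seconds-for (lambda () (make-instance 'bar)) n)))
    (format t "~&MAKE-FOO:      ~,4F s~%MAKE-INSTANCE: ~,4F s~%"
            struct-time class-time)
    (values struct-time class-time)))
```

Calling (compare-constructors) prints both timings; a larger N reduces
timer-resolution noise but increases the share of time spent in the GC.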

There's a lot of room for improvement there: a lot of the cost in the
STANDARD-INSTANCE case is attributable to generality.  Making an INSTANCE
object and a SLOTS-VECTOR and hooking them together is just a few hands
full of instructions; calling the generic function ALLOCATE-INSTANCE
and passing it an indefinite number of initargs (that the standard-class
primary method will ignore) is probably a lot more handfuls.  If we knew
a few things (that BAR was a STANDARD-CLASS and that there are no
applicable extending methods on ALLOCATE-INSTANCE), we could do something
simpler than call the generic function.  If we knew how many slots
instances of BAR
would have, we could inline the whole shebang.   We could do similar (less
general) things to optimize the initialization steps if we had similar
knowledge; unfortunately we generally don't have that knowledge (or can't
simply and generally use it.)  Methods can be added and classes can be
redefined in ways that can invalidate these optimizations, and the acts
of changing classes and methods have to invalidate and possibly update
affected optimizations.
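
As an illustration of trading generality for speed by hand, a user who is
willing to promise that BAR has no extending methods on INITIALIZE-INSTANCE
or SHARED-INITIALIZE could write a specialized constructor like the one
below. FAST-MAKE-BAR is a hypothetical name; this is a sketch under those
assumptions, not an optimization that CCL performs, and it silently goes
stale if BAR's slots or initforms are later changed.

```lisp
(defclass bar ()
  ((a :initform nil :initarg :a)
   (b :initform 1 :initarg :b)))

;; Hypothetical hand-specialized constructor.  It still calls the generic
;; function ALLOCATE-INSTANCE, but skips the generic initialization
;; protocol by storing into the slots directly.  The defaults here
;; duplicate the initforms in the DEFCLASS and must be kept in sync by hand.
(defun fast-make-bar (&key (a nil) (b 1))
  (let ((instance (allocate-instance (find-class 'bar))))
    (setf (slot-value instance 'a) a
          (slot-value instance 'b) b)
    instance))

;; Used like the structure constructor:
(slot-value (fast-make-bar :a 3) 'a)
```

Whether this is actually faster than MAKE-INSTANCE depends on the
implementation; it mainly shows the kind of knowledge the system would
need in order to do the same thing automatically and safely.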

CCL has (or had) a set of optimizations (on MAKE-INSTANCE and other
things) that kicked in when the CLOS world was "frozen" (class and
method [re]definition was disallowed.)  They worked fairly well as
far as they went, but developers found the notion that the CLOS world
was frozen impractical.  (Apparently, it made fixing bugs and adding
new features awkward or impossible.  Go figure.)

Some implementations do a lot of work (that CCL doesn't) when classes
and methods are [re]defined, so that optimizations are enabled and
invalidated whenever the world changes.  This can make things a bit ...
ponderous ... and it may result in scenarios where the system's putting
a lot of effort into optimizing things that aren't really important.

It might be nice to be able to say (for instance) that you wanted
calls to (MAKE-INSTANCE 'BAR) to be made as fast as possible based
on the current state of the world and have the lisp scrunch its brow
and try to achieve that and to have that invalidated if anything
relevant changed in the CLOS environment.  If you said that and
then redefined BAR or a superclass of BAR, you might need to say
it again in order to make subsequent calls to MAKE-INSTANCE faster
again.

On Tue, 10 May 2011, j-anthony at comcast.net wrote:

> Hi All,
>
> I thought I sent this before, but if so it fell off the edge of the
> universe.
>
> Doing some trials, I notice that creating an instance of a class takes
> around 10x more time than creating a "comparable" structure.  By
> "comparable" I mean 'has the same user level slot definitions'.  This is
> mostly an issue of curiosity as the difference only starts to be
> "noticeable" when creating 100s of thousands and in any practical context
> the difference would surely be in the noise of the domain level processing
> required in either case.  Even so, it (at least naively) seems like rather
> a large difference.  Simply for the sake of edification, would someone
> (Gary, Mathew, ...) care to comment?
>
> Thanks!
>
> /Jon