[Openmcl-devel] Re: Hemlock performance investigation

Andrew P. Lentvorski, Jr. bsder at mail.allcaps.org
Fri Aug 27 16:47:31 PDT 2004


On Aug 27, 2004, at 8:58 AM, Hamilton Link wrote:

> character-at-index is being called across the whole file on every 
> keystroke. Opening a file, it looks like it's being called about 34 
> times per character in the file (which is a little odd), but then it 
> settles down to scanning across the entire file once every keystroke.

Would the buffer happen to be one monolithic Unicode NSString (or is 
something expecting to be accessing an NSString)?  If so, O(n) accesses 
occur for every call to character-at-index since this is Unicode.  It 
has to walk the entire string from the beginning since the size in 
bytes of a character is not fixed.  If you scan every character in the 
file, then that is also O(n).

You wind up with an O(n^2) algorithm.  It doesn't matter *how* much you 
speed up the constant factors.  An O(n^2) algorithm is always going to 
be slow.  A quick test of this is whether an operation at the beginning 
of the file is much faster than at the end.  At one end, things will be 
pretty close to O(n), and fast.  At the other, things should be very 
close to O(n^2), and slow.  If you have an O(n) algorithm, then things 
should range from O(n), fast, on one side to O(1), blazing, on the 
other.

Firing up the (require "COCOA") system on "cocoa-editor.lisp" shows 
that insertions at the beginning of the file are pretty slow, but 
insertions at the end of the file are pretty fast.  This is backward 
from what I would expect, but I still smell an O(n^2) problem.


There are probably two options:

1) The better option is probably to use an NSTextView.  People seem to 
be doing very nice editors with them.  (cf. subethaedit and smultron).

2) The quicker option is probably to subclass NSString, implement a 
tree structure so that index accesses aren't O(n), and
use that instead.


>  At 4.5e-5 s per call, in a 5000-character file, this makes about a 
> .25 second delay per keystroke and the bulk of the delay per, and adds 
> up to tens of seconds in the course of typing a couple of lines, so 
> it's where most of the time is going.

45e-6 s per call is not that bad.  It's not great, but it's not bad.  
Especially for something which is crossing interface boundaries.  I 
would expect something closer to 4.5e-6, but even so, that would still 
be painful for a large file.

-a




More information about the Openmcl-devel mailing list