[Openmcl-devel] Profiling Parenscript compilation

Thu Dec 4 23:02:03 PST 2008

On Fri, Dec 5, 2008 at 01:49, Scott Bell <sblist at me.com> wrote:

> As a follow-up, I'd like to mention what our bottleneck was. It turned out that
> we were using some expensive CL-PPCRE scanners to do some basic dispatching
> when a simple cons pair would do (although in just a slightly more verbose syntax).
> It's not entirely clear why this is so different in SBCL vs. CCL, but it's something we
> plan to investigate.

If you have no need to handle non-Latin-1 characters with your regular
expressions, you can set CL-PPCRE:*REGEX-CHAR-CODE-LIMIT* to 256.
This will considerably reduce memory usage and speed up scanner
creation.  Memory usage can be further reduced by setting
CL-PPCRE:*USE-BMH-MATCHERS* to NIL (the Boyer-Moore-Horspool matchers
are the memory-hungry option), but that comes at the cost of slightly
slower matching.
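
A minimal sketch of the first setting, assuming CL-PPCRE is already
loaded and that none of your patterns or target strings need characters
above code 255.  The name *WORD-SCANNER* and the pattern are made up
for illustration:

```lisp
;; Assumes CL-PPCRE is already loaded (e.g. via ASDF).
;; Set the limit before any scanners are created, so that every
;; scanner built afterwards only has to cover char codes 0..255.
(setf cl-ppcre:*regex-char-code-limit* 256)

;; Hypothetical scanner, created after the limit was lowered.
(defparameter *word-scanner*
  (cl-ppcre:create-scanner "[a-z]+\\d*"))

;; The scanner is used like any other.
(cl-ppcre:scan *word-scanner* "abc123")
```

The setting only affects scanners created after it takes effect, so it
belongs early in your load sequence.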

In general, scanner creation is very slow, so CL-PPCRE tries to
create scanners at compile time.  This is done using a compiler macro
which tests whether the regular expression passed to one of the
matching/replacing functions is a constant string.  If it is, the
scanner is created at compile time.  Making the regular expressions
constants is often enough to get a noticeable runtime improvement.
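
To illustrate the difference, here is a rough sketch.  The function
names are hypothetical, and how much the first form gains depends on
your implementation actually applying compiler macros:

```lisp
;; Assumes CL-PPCRE is already loaded (e.g. via ASDF).

;; The regex is a literal string, so the compiler macro on
;; CL-PPCRE:SCAN can arrange for the scanner to be built only once.
(defun digits-p (string)
  (and (cl-ppcre:scan "^\\d+$" string) t))

;; The regex is only known at run time, so a new scanner has to be
;; built on every call when REGEX is a string.
(defun matches-p (regex string)
  (and (cl-ppcre:scan regex string) t))

;; If the regex cannot appear literally at the call site, you can
;; still build the scanner once yourself and pass it around.
(defparameter *digits-scanner* (cl-ppcre:create-scanner "^\\d+$"))

(defun digits-p/scanner (string)
  (and (cl-ppcre:scan *digits-scanner* string) t))
```

All three give the same answers; only the cost of scanner creation
differs.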

If regular expressions need to be created dynamically, I have had some
success caching the scanners in a hash table keyed by the regular
expression string.  Beware, though:  Scanners are rather large, and it
is possible to end up with huge processes if you cache a lot of
scanners.
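
Such a cache could look roughly like this.  This is only a sketch:
*SCANNER-CACHE*, CACHED-SCANNER and CLEAR-SCANNER-CACHE are names I
made up, the table is unbounded, and nothing here is thread-safe:

```lisp
;; Assumes CL-PPCRE is already loaded (e.g. via ASDF).

;; EQUAL hash table so that equal regex strings share one entry.
(defvar *scanner-cache* (make-hash-table :test #'equal))

(defun cached-scanner (regex-string)
  "Return the scanner for REGEX-STRING, creating it on first use."
  (or (gethash regex-string *scanner-cache*)
      (setf (gethash regex-string *scanner-cache*)
            (cl-ppcre:create-scanner regex-string))))

(defun clear-scanner-cache ()
  "Drop all cached scanners so their memory can be reclaimed."
  (clrhash *scanner-cache*))

;; Usage: pass the cached scanner wherever a regex is accepted.
(cl-ppcre:scan (cached-scanner "a+b") "xaab")
```

If the set of regular expressions is open-ended, clear the table from
time to time or put a limit on its size.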

As a general note, it is fair to say that SBCL often generates faster
code than CCL, although CCL has improved a lot over the last year.

HTH,
Hans