[Openmcl-devel] Quick HW question...

Tue Nov 16 22:03:24 PST 2010

On Tue, 16 Nov 2010, R. Matthew Emerson wrote:

>
> On Nov 16, 2010, at 10:16 AM, Jon Anthony wrote:
>
>> This is some good information.  Thanks for the pointers.  But it also
>> highlights an issue I've thought about from time to time: with modern
>> processor architectures (especially pipelines, caches, and now cores)
>> how does one _not_ write naive code for these things?  Sure, 90+% of the
>> worry on this goes to the compiler writers, but it can be easy to
>> accidentally write something that defeats their efforts.
>
> On modern x86, I've all but given up.  I just write
> naive and straightforward code, and assume (or hope) that
> the hardware guys have optimized for that. In my experience,
> measurements typically show that the difference in execution
> time between "clever" and naive code is negligible.
>

One of the things that (a) can have a big effect on performance and (b)
the programming language can give the programmer some control over is
branch prediction, and this seems like it could be useful if it were
kept at a fairly high level.  If you say:

(if (some-test)
   (exceptional-case)
   (common-case))

you generally don't want the pipeline to be filling up with the instructions
that comprise the exceptional case only to discover that the test was false,
that that stuff needs to be thrown away, and that it should start processing
the common case.

Exactly what needs to be done to help the processor predict whether a branch
is taken tends to be architecture- and implementation-dependent.  (It's often
the case that simple rules - like "backward conditional branches are predicted
to be taken and forward conditional branches are predicted to not be taken" -
go a long way.)

Suppose that there were semi-magic macros (I -think- that they could be macros)
called something like EXPECTED-TRUE and EXPECTED-FALSE that each took a single
form and just caused that form to return whatever it returned, but the contract
was that the compiler could treat them as advice about whether the primary
return value was likely to be NIL or not.  So you could say:

(if (expected-false (some-test))
   (exceptional-case)
   (common-case))

and the compiler could do whatever it can do at compile time to try to ensure
that branch prediction favors the common case.

In something like:

(if (flip-coin)
   (one-thing)
   (the-other))

... well, the best that you can hope for is that you don't lose more
than 50% of the time for some obscure reason.  Cases like the first
example - where there's a clear reason to favor one arm of the IF -
are probably fairly common in most code, and if there were some simple,
not-too-onerous way of providing this kind of advice to the compiler
I'd certainly want to use it and imagine that other people would, too.

I think that GCC has a construct - __builtin_expect() or something
similar - that's intended to provide this advice.  I don't know of
any other programming languages that do so (but wouldn't be surprised
if such existed).
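
For reference, GCC does document a builtin with that name:
__builtin_expect(exp, c) evaluates to exp and tells the compiler that
exp is expected to equal c.  What follows is only a minimal sketch of
how the EXPECTED-TRUE / EXPECTED-FALSE idea might look in C.  The macro
names EXPECTED_TRUE and EXPECTED_FALSE and the function sum_nonnegative
are hypothetical, made up for illustration.  Unlike the Lisp macros
described above, these normalize the test to 0 or 1 instead of
returning the form's own value.

```c
#include <stddef.h>

/* Hypothetical wrappers (names are assumptions, not a standard API).
 * __builtin_expect itself is a GCC builtin, also accepted by Clang;
 * on other compilers the macros fall back to the plain test. */
#if defined(__GNUC__)
#define EXPECTED_TRUE(x)  __builtin_expect(!!(x), 1)
#define EXPECTED_FALSE(x) __builtin_expect(!!(x), 0)
#else
#define EXPECTED_TRUE(x)  (!!(x))
#define EXPECTED_FALSE(x) (!!(x))
#endif

/* Illustrative function: sums the array, treating a negative entry
 * as the rare, exceptional case and returning -1 when one is seen. */
long sum_nonnegative(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        if (EXPECTED_FALSE(values[i] < 0)) {
            return -1;            /* exceptional case */
        }
        total += values[i];       /* common case */
    }
    return total;
}
```

The hint is advice only: the function returns the same results with or
without it, and whether the generated code gets any faster depends on
the compiler and the processor.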

You probably don't want to casually design this too deeply into the
language.  C has a 'register' keyword that's supposed to advise the
compiler about which variables should reside in machine registers.
One still sees that used occasionally, but it looks kind of quaint
(like the author of that code really thinks that they can do a better
job of register allocation than the compiler can).  I can believe
that someone in the future might look at old code and say "Wow.  Remember
2010, when there were branch-misprediction penalties and people said
__builtin_expect() and EXPECTED-TRUE to avoid them?", but it -is- 2010,
those penalties exist and are often fairly severe, and an optional,
non-intrusive way of providing advice that could help to avoid those
penalties seems like a good thing.