[Openmcl-devel] Quick HW question...
Jon Anthony
j-anthony at comcast.net
Tue Nov 16 11:53:29 PST 2010
On Tue, 2010-11-16 at 12:33 -0500, R. Matthew Emerson wrote:
> On Nov 16, 2010, at 10:16 AM, Jon Anthony wrote:
>
> > This is some good information. Thanks for the pointers. But it also
> > highlights an issue I've thought about from time to time: with modern
> > processor architectures (especially pipelines, caches, and now cores)
> > how does one _not_ write naive code for these things? Sure, 90+% of the
> > worry on this goes to the compiler writers, but it can be easy to
> > accidentally write something that defeats their efforts.
>
> On modern x86, I've all but given up.
That's actually an example where "LOL" was appropriate.
> I just write
> naive and straightforward code, and assume (or hope) that
> the hardware guys have optimized for that. In my experience,
> measurements typically show that the difference in execution
> time between "clever" and naive code is negligible.
>
> Intel has an optimization guide (you should be able to
> find it at http://www.intel.com/products/processor/manuals/).
>
> Clearly you can win big by writing cache-aware (or at least
> virtual memory-aware) code; I remember a fairly recent article in
> ACM Queue about this.
>
> http://queue.acm.org/detail.cfm?id=1814327
Thanks for these pointers as well.
/Jon
>
> One interesting quotation:
>
> The speed disparity between primary and secondary storage on the Atlas Computer was on the order of 1:1,000. The Atlas drum took 2 milliseconds to deliver a sector; instructions took approximately 2 microseconds to execute. You lost around 1,000 instructions for each VM page fault.
>
> On a modern multi-issue CPU, running at some gigahertz clock frequency, the worst-case loss is almost 10 million instructions per VM page fault. If you are running with a rotating disk, the number is more like 100 million instructions.
>
>
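For what it's worth, the arithmetic in that quotation is easy to sanity-check. The sketch below is only a back-of-the-envelope calculation: the Atlas figures (2 ms per sector, about 2 microseconds per instruction) come from the quoted text, while the modern-CPU parameters (3 GHz clock, 3 instructions per cycle, 1 ms and 10 ms fault latencies) are assumptions picked for illustration, not numbers from the article. The helper name instructions_lost is likewise made up here.

```python
# Back-of-the-envelope check of the figures in the quotation above.
# Only the Atlas numbers come from the quoted text; the modern-CPU
# parameters below are illustrative assumptions, not measurements.

def instructions_lost(fault_latency_s, instructions_per_s):
    """Instructions the CPU could have executed while waiting on one page fault."""
    return fault_latency_s * instructions_per_s

# Atlas: 2 ms to deliver a drum sector, about 2 microseconds per instruction.
atlas = instructions_lost(2e-3, 1 / 2e-6)

# Assumed modern CPU: 3 GHz clock, 3 instructions issued per cycle.
modern_rate = 3e9 * 3

# Assumed fault latencies: 1 ms for fast storage, 10 ms for a rotating disk.
modern_fast = instructions_lost(1e-3, modern_rate)
modern_disk = instructions_lost(10e-3, modern_rate)

print(f"Atlas:         {atlas:,.0f} instructions per fault")
print(f"modern, 1 ms:  {modern_fast:,.0f} instructions per fault")
print(f"modern, 10 ms: {modern_disk:,.0f} instructions per fault")
```

Under those assumed parameters the results land in the same range as the quoted "almost 10 million" and "more like 100 million"; different clock rates or latencies shift the numbers proportionally.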