Date: Mon, 04 May 2009 08:22:12 +0200
From: Terje Mathisen <"terje.mathisen at">
Newsgroups: comp.arch
Subject: Re: The coming death of all RISC chips.
Message-ID: <>

Brett Davis wrote:
> In article <>,
>  Terje Mathisen <"terje.mathisen at"> wrote:
>> There are indeed a few such instructions, i.e. prefetch hints and other
>> "free-to-ignore" opcodes that are designed to allow the
>> compiler/assembler to tell the cpu what is going to happen.
> The prefetch instructions in mid-range MIPS and PowerPC are almost
> useless. The problem is that a prefetch ties up one of the two load
> ports, so your code runs slower: less waiting for the cache miss, but
> overall it's a loss.

This is a problem even for asm on x86: The hardware doesn't know that
_this_ particular PREFETCH is absolutely required, so it is free to
disregard it, particularly when the memory bus is (mostly) saturated anyway.

I have seen prefetch slowdowns (in asm) far more often than I have seen
speedups, unfortunately. :-(
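
For readers who have not used them, a compiler-level prefetch hint looks roughly like the sketch below. It assumes GCC or Clang (for __builtin_prefetch) and a 64-byte cache line; PREFETCH_AHEAD and sum_with_prefetch are made-up names for illustration, not anything from the thread. Whether this beats the plain loop has to be measured on the actual machine.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical tuning knob: how many bytes ahead of the current
   position the hint points. */
#define PREFETCH_AHEAD 256

/* Sum a byte buffer, issuing one prefetch hint per assumed 64-byte
   cache line. */
uint64_t sum_with_prefetch(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < len; i++) {
        if ((i & 63) == 0 && i + PREFETCH_AHEAD < len) {
            /* Only a hint: the CPU is free to drop it, for example
               when the memory bus is already saturated. */
            __builtin_prefetch(buf + i + PREFETCH_AHEAD, 0, 0);
        }
        sum += buf[i];
    }
    return sum;
}
```

The result is the same with or without the hint; only the timing can change, in either direction.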

OTOH, I have also seen real speedups when the prefetch hints were
replaced by real (dummy) load operations, touching one byte in each
cache line of a 4 KB block (or any block smaller than half of L1).
This treats RAM like the burst/sequential device it really is, instead
of pretending the R really means you can do Random access.
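
A minimal sketch of that dummy-load approach in C rather than asm. The 64-byte line size, the names touch_lines and sum_blocked, and the summing workload are all assumptions for illustration; the original code was not posted.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE 64    /* assumed cache line size in bytes */
#define BLOCK_SIZE 4096  /* the 4K block mentioned above */

/* Read one byte from every CACHE_LINE bytes of [p, p+len).
   The volatile access keeps the compiler from deleting the otherwise
   useless loads. Assumes p is line-aligned; if it is not, the last
   partial line may be left untouched. */
static void touch_lines(const uint8_t *p, size_t len)
{
    const volatile uint8_t *v = p;
    for (size_t off = 0; off < len; off += CACHE_LINE)
        (void)v[off];
}

/* Process the buffer one block at a time: first touch the block with
   real loads, then do the actual work on it. */
uint64_t sum_blocked(const uint8_t *buf, size_t len)
{
    uint64_t sum = 0;
    for (size_t start = 0; start < len; start += BLOCK_SIZE) {
        size_t n = len - start < BLOCK_SIZE ? len - start : BLOCK_SIZE;
        touch_lines(buf + start, n);
        for (size_t i = 0; i < n; i++)
            sum += buf[start + i];
    }
    return sum;
}
```

Unlike a hint, a real load cannot be dropped by the hardware, which is the point of the substitution. A trivial sum like this one will not show the effect; it matters when the per-block work is heavier and would otherwise stall on each miss in turn.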

> What I really want is a tool based off of GCC that tells me all the
> lines where it found potential aliasing issues, and which type of
> aliasing issue.

That would be nice.

> I also want a __declspec(noalias) directive that I can apply to a
> function to force the compiler to ignore any (false) aliasing issues and
> generate fully optimized code for that function.
> It would also be nice if __declspec(noalias) actually worked for my
> compiler...

Indeed. :-)
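
Not the per-function directive asked for above, but the closest thing in standard C is the C99 restrict qualifier, which makes the same kind of no-aliasing promise one pointer at a time. A small sketch (function names are made up for illustration; how much a given compiler gains from it varies):

```c
#include <stddef.h>

/* Without restrict the compiler must assume dst may overlap src or
   *scale, so it may have to reload *scale after every store. */
void scale_may_alias(float *dst, const float *src,
                     const float *scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *scale;
}

/* restrict is the programmer's promise that the three objects do not
   overlap. The compiler may then keep *scale in a register and
   vectorize freely. Breaking the promise is undefined behaviour. */
void scale_noalias(float *restrict dst, const float *restrict src,
                   const float *restrict scale, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *scale;
}
```

For non-overlapping arguments both functions compute the same result; the difference is only in what the optimizer is allowed to assume.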


- <Terje.Mathisen at>
"almost all programming can be viewed as an exercise in caching"

Date: Tue, 08 Sep 2009 09:17:44 +0200
From: Terje Mathisen <>
Newsgroups: comp.arch
Subject: Re: What happened to computer architecture (and comp.arch?)
Message-ID: <>

Robert Myers wrote:
> Prefetch is hugely important, but how it actually works must involve a
> great deal of reverse-engineering on the part of competitors, because
> meaningful details never seem to be forthcoming from manufacturers.
> I'm assuming that Microsoft's compiler designers, for example, know
> lots of useful things that most others don't, and that they got them
> from the horse's mouth under an NDA.

I haven't seen a single x86-type CPU, from any manufacturer, where
letting the compiler issue PREFETCH instructions turns out to be a
general win.

Yes, they obviously do help a little with SPEC, particularly after the
compiler has been carefully tuned for these benchmarks, but in real life
I haven't seen this.

OTOH, hardware-based prefetch, in the form of stream detection in the
memory interface is indeed a huge win, but the compiler isn't involved
at all.

> It must be frustrating to see so much semi-ignorant discussion, but
> the little gems that occasionally fall on the carpet are well worth it
> to some of us.
> Why *didn't* the P4 have a barrel shifter?  Because the watts couldn't

To frustrate me and my asm code?

> be spared, I'm sure, but why was NetBurst jammed into that box?  I'm
> sure there is an answer that doesn't involve hopelessly arcane
> details.  Whether it's worth the time of any real computer architect to
> talk about it would have to be an individual decision.

- <Terje.Mathisen at>
"almost all programming can be viewed as an exercise in caching"
