From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: System Instruction Caches in Superscalar processors?
Date: 4 Feb 1996 03:03:06 GMT
In article <4eqngr$lml@news.connectnet.com>, rschnapp@fido.metaflow.com
(Russ Schnapp) writes:
|> The idea of dedicating some ICache to system processes is not entirely
|> daft. It's probably overkill, though. A set-associative cache does the
|> trick, and is a more general solution. It solves the interrupt-handler
This has been done for embedded applications, although generally with
off-chip caches, where one knows a lot about the behavior of the
overall system, and also where one expects to spend a fair amount of time
in system state, and do a lot of low-overhead context-switching.
For example, somebody once used MIPS R3000 chips with a double I-cache,
where the extra I-cache was selected by a bit settable by the kernel
of a telephone switching application.
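To make the idea concrete, here is a toy simulation of a direct-mapped I-cache that is optionally split into two banks selected by a kernel-controlled mode bit. It is only a sketch: the post gives no details of the actual R3000 design, so the line count, line size, addresses, and the names `ICache` and `run` are all made up for illustration. The user loop and the handler are deliberately placed so that they map to the same cache indices.

```python
# Toy model: direct-mapped I-cache, optionally split into two banks
# selected by a kernel-controlled mode bit. All sizes and addresses
# are hypothetical and chosen only to force conflicts.

LINES = 64          # lines per bank (hypothetical)
LINE_BYTES = 16     # bytes per line (hypothetical)


class ICache:
    def __init__(self, banks):
        # One tag array per bank; None marks an invalid line.
        self.tags = [[None] * LINES for _ in range(banks)]
        self.misses = 0

    def fetch(self, addr, kernel_mode):
        # With a single bank the mode bit is ignored.
        bank = 1 if (kernel_mode and len(self.tags) > 1) else 0
        line = addr // LINE_BYTES
        index = line % LINES
        tag = line // LINES
        if self.tags[bank][index] != tag:
            self.misses += 1
            self.tags[bank][index] = tag


def run(cache, rounds=100):
    """Alternate a user loop with a handler that maps to the same indices."""
    user = range(0x1000, 0x1000 + 32 * LINE_BYTES, LINE_BYTES)
    handler = range(0x80000, 0x80000 + 32 * LINE_BYTES, LINE_BYTES)
    for _ in range(rounds):
        for a in user:
            cache.fetch(a, kernel_mode=False)
        for a in handler:
            cache.fetch(a, kernel_mode=True)
    return cache.misses


shared_misses = run(ICache(banks=1))
split_misses = run(ICache(banks=2))
print(shared_misses, split_misses)
```

In this contrived setup the shared cache misses on every fetch, because user and handler code keep evicting each other, while the split cache takes only the initial cold misses. Real workloads will not conflict this perfectly; the point is just that the mode bit keeps system code from disturbing the application's lines, and vice versa.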
--
-john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: mash@sgi.com
DDD: 415-933-3090 FAX: 415-967-8496
USPS: Silicon Graphics 6L-005, 2011 N. Shoreline Blvd, Mountain View, CA 94039-7311
From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: Long latency instructions in optimizing compilers
Date: 1 Dec 1998 00:00:28 GMT
In article <abaum-3011981152330001@althea.pa.dec.com>, abaum@pa.dec.com
(Allen J. Baum) writes:
|> You can talk about a machine with extremely fast interrupt response.
|> I agree that interrupt response can be degraded by execution of a
|> non-interruptible
|> multi-cycle instruction.
So, it sounds like this thread was really about interrupt latency,
and guarantees thereof? I.e., the flavor of real-time that
cares about interrupt latency.
Note, of course, that people who like tight real-time bounds on interrupt
latency hate things like:
a) Long cycle-count instructions that have to be finished before
the interrupt is recognized. [this thread of discussion ... and only
the tip of the iceberg, since the following can be much worse]
b) Non-lockable caches
c) Code that can cause TLB misses
d) Complex implementations, with lots of internal state that has
to be unraveled, as in some out-of-order CPUs.
e) Complex multiprocessor issues
Suppose you want a real-time UNIX on a multiprocessor (SMP or ccNUMA);
such things exist, but usually with fairly generous latency guarantees.
Suppose you want a tight guarantee:
(a) Write tight interrupt handler, counting instructions.
(b) On systems where possible, lock down the interrupt code into
part of the cache.
(c) Use large-page mapping, or unmapped space for the handler,
to avoid TLB misses.
but still:
(d) The CPU may hold off interrupts for the length of a long
instruction (~100 cycles certainly possible). Worse, it could
end up being stalled for 100s of cycles on one instruction.
For example, suppose you can use uncached load instructions to
retrieve data from an I/O device, and you've already issued the
load to the device, and the load has side-effects (some do,
as in retrieving the next item from a queue), and can wait some modest
number of microseconds before returning (100s or 1000s of cycles).
(e) An aggressive o-o-o CPU might take 10s of cycles to unwind its
state.
(f) Even if the instruction stream looks OK, the CPU's bus may be
blocked off by a stream of memory requests that
were issued to the memory system before the interrupt arrived,
or by combinations of misses and writebacks of dirty data...
(100s, even 1000s of cycles, depending on how many outstanding
requests there are, and the nature of the coherency hardware, and
which events cause the CPU to stall).
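One way to see how these items stack up is to add them into a worst-case budget. The sketch below does only that arithmetic. Every cycle count is a placeholder picked to match the orders of magnitude mentioned above, not a measurement of any real CPU; the 200 MHz clock, the handler instruction count, and the name `worst_case_latency_us` are likewise assumptions. It also assumes, pessimistically, that all the causes can occur back to back.

```python
# Worst-case interrupt latency as a sum of individual worst cases.
# Every number is a placeholder matching only the orders of magnitude
# in the text above; none of them is a measurement.

WORST_CASE_CYCLES = {
    "long instruction must finish": 100,        # (d) "~100 cycles"
    "uncached I/O load stall": 1000,            # (d) "100s or 1000s"
    "out-of-order state unwind": 30,            # (e) "10s of cycles"
    "bus blocked by earlier requests": 1000,    # (f) "100s, even 1000s"
    "handler body (counted instructions)": 50,  # (a) hypothetical count
}


def worst_case_latency_us(cycles_by_cause, clock_mhz):
    """Return (total cycles, microseconds), assuming the causes stack."""
    total = sum(cycles_by_cause.values())
    return total, total / clock_mhz


total_cycles, micros = worst_case_latency_us(WORST_CASE_CYCLES, clock_mhz=200)
print(f"{total_cycles} cycles = {micros:.2f} us at 200 MHz")

# Share of the budget taken by the long instruction alone.
share = WORST_CASE_CYCLES["long instruction must finish"] / total_cycles
print(f"long instruction: {share:.1%} of the worst case")
```

With these made-up numbers the long instruction is a small slice of the total, and the I/O stall and the blocked bus dominate.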
Anyway, worst cases (not average cases) can get amazingly bad ... and
divisions or square roots are *nothing* compared to the other stuff
that can happen ... which is why hard real-time folks seem to prefer
simpler CPUs, with controllable caches, and understandable cycle counts :-)
--
-john mashey DISCLAIMER: <generic disclaimer: I speak for me only...>
EMAIL: mash@sgi.com DDD: 650-933-3090 FAX: 650-969-6289
USPS: Silicon Graphics/Cray Research 40U-005,
2011 N. Shoreline Blvd, Mountain View, CA 94043-1389