From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: System Instruction Caches in Superscalar processors?
Date: 4 Feb 1996 03:03:06 GMT

In article <4eqngr$lml@news.connectnet.com>, rschnapp@fido.metaflow.com
(Russ Schnapp) writes:

|> The idea of dedicating some ICache to system processes is not entirely
|> daft.  It's probably overkill, though.  A set-associative cache does the
|> trick, and is a more general solution.  It solves the interrupt-handler

This has been done for embedded applications, although generally with
off-chip caches, where one knows a lot about the behavior of the
overall system, and also where one expects to spend a fair amount of time
in system state, and do a lot of low-overhead context-switching.
For example, somebody once used MIPS R3000 chips with a double I-cache,
where the extra I-cache was selected by a bit settable by the kernel
of a telephone switching application.
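
As a toy illustration of the idea (not the actual telephone-switch design,
whose details are not given here), the sketch below models two direct-mapped
I-caches with a select bit that only the kernel would flip. The class names,
the 16-byte line size, the 64-line capacity, and the 0/1 encoding of the
select bit are all assumptions made for the example.

```python
class DirectMappedICache:
    """Toy direct-mapped instruction cache that tracks tags only."""

    def __init__(self, num_lines, line_bytes=16):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.tags = [None] * num_lines

    def fetch(self, addr):
        """Return True on a hit; on a miss, fill the line and return False."""
        line_addr = addr // self.line_bytes
        index = line_addr % self.num_lines
        tag = line_addr // self.num_lines
        hit = self.tags[index] == tag
        self.tags[index] = tag
        return hit


class BankedICache:
    """Two I-caches; a kernel-settable bit picks which one is active."""

    USER, SYSTEM = 0, 1  # hypothetical encoding of the select bit

    def __init__(self, num_lines, line_bytes=16):
        self.banks = [DirectMappedICache(num_lines, line_bytes),
                      DirectMappedICache(num_lines, line_bytes)]
        self.select = self.USER

    def fetch(self, addr):
        return self.banks[self.select].fetch(addr)


if __name__ == "__main__":
    user_pc = 0x1000
    kernel_pc = 0x1000 + 64 * 16   # maps to the same index as user_pc

    banked = BankedICache(num_lines=64)
    banked.fetch(user_pc)                  # cold miss fills the user bank
    banked.select = BankedICache.SYSTEM
    banked.fetch(kernel_pc)                # kernel code fills the system bank
    banked.select = BankedICache.USER
    print("banked: user line survived =", banked.fetch(user_pc))

    shared = DirectMappedICache(num_lines=64)
    shared.fetch(user_pc)
    shared.fetch(kernel_pc)                # evicts the user line
    print("shared: user line survived =", shared.fetch(user_pc))
```

In this model the kernel's fetches never evict user lines, which is the
property that makes time spent in system state more predictable; the price
is that each bank is only useful in one mode.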

-- 
-john mashey    DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP:    mash@sgi.com 
DDD:    415-933-3090	FAX: 415-967-8496
USPS:   Silicon Graphics 6L-005, 2011 N. Shoreline Blvd, Mountain View, CA 94039-7311


From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: Long latency instructions in optimizing compilers
Date: 1 Dec 1998 00:00:28 GMT

In article <abaum-3011981152330001@althea.pa.dec.com>, abaum@pa.dec.com
(Allen J. Baum) writes:

|> You can talk about a machine with extremely fast interrupt response.
|> I agree that interrupt response can be degraded by execution of a
|> non-interruptible
|> multi-cycle instruction.

So, it sounds like this thread was really about interrupt latency,
and guarantees thereof?  i.e., the flavor of real-time that cares
about bounded interrupt latency.

Note, of course, that people who like tight real-time bounds on interrupt
latency hate things like:
	a) Long cycle-count instructions that have to be finished before
	the interrupt is recognized.  [this thread of discussion ... and only
	the tip of the iceberg, since the following can be much worse]
	b) Non-lockable caches
	c) Code that can cause TLB misses
	d) Complex implementations, with lots of internal state that has
	to be unraveled, as in some out-of-order CPUs.
	e) Complex multiprocessor issues

Suppose you want a real-time UNIX on a multiprocessor (SMP or ccNUMA);
such things exist, but usually with fairly generous latency guarantees.
Suppose you want a tight guarantee:
	(a) Write a tight interrupt handler, counting instructions.
	(b) On systems where possible, lock down the interrupt code into
	part of the cache.
	(c) Use large-page mapping, or unmapped space for the handler,
	to avoid TLB misses.
but still:
	(d) The CPU may hold off interrupts for the length of a long
	instruction (~100 cycles certainly possible).  Worse, it could
	end up being stalled for 100s of cycles on one instruction.
	For example, suppose you can use uncached load instructions to
	retrieve data from an I/O device, and you've already issued the
	load to the device, and the load has side-effects (some do,
	as in retrieving the next item from a queue), and can take some modest
	number of microseconds to return (100s or 1000s of cycles).

	(e) An aggressive o-o-o CPU might take 10s of cycles to unwind its
	state.

	(f) Even if the instruction stream looks OK, the CPU's bus may be
	blocked off by a stream of memory requests that
	were issued to the memory system before the interrupt arrived,
	or by combinations of misses and writebacks of dirty data...
	(100s, even 1000s of cycles, depending on how many outstanding
	requests there are, the nature of the coherency hardware, and
	which events cause the CPU to stall).
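
To make the arithmetic concrete, here is a minimal sketch of a worst-case
budget built from items (d) through (f).  Every cycle count, the 200 MHz
clock, and the name LatencyBudget are illustrative assumptions picked to
match the orders of magnitude mentioned above; they are not measurements
of any real CPU.  The sketch also assumes that the long instruction and the
stalled uncached load are alternatives for the one instruction in flight,
so it takes the larger of the two rather than adding them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LatencyBudget:
    """Worst-case cycle counts; all defaults are illustrative assumptions."""
    long_instruction: int = 100        # (d) non-interruptible multi-cycle op
    stalled_uncached_load: int = 1000  # (d) load waiting on a slow I/O device
    ooo_unwind: int = 50               # (e) unwinding out-of-order state
    bus_drain: int = 1000              # (f) outstanding misses and writebacks
    handler: int = 200                 # counted instructions in the handler

    def worst_case_cycles(self) -> int:
        # Only one instruction holds off the interrupt, so take the worse
        # of the two (d) cases, then add everything that follows it.
        in_flight = max(self.long_instruction, self.stalled_uncached_load)
        return in_flight + self.ooo_unwind + self.bus_drain + self.handler

    def worst_case_microseconds(self, clock_mhz: float) -> float:
        # cycles / (cycles per microsecond)
        return self.worst_case_cycles() / clock_mhz


if __name__ == "__main__":
    complex_cpu = LatencyBudget()
    # A simpler CPU in this model: short divide, no o-o-o state, and no
    # bus or I/O stalls on the interrupt path (again, assumed numbers).
    simple_cpu = LatencyBudget(long_instruction=35, stalled_uncached_load=0,
                               ooo_unwind=0, bus_drain=0, handler=200)
    for name, cpu in (("complex", complex_cpu), ("simple", simple_cpu)):
        print(name, cpu.worst_case_cycles(), "cycles,",
              cpu.worst_case_microseconds(200.0), "us at 200 MHz")
```

With these made-up numbers the handler itself is under a tenth of the
complex CPU's bound, while on the simple CPU it dominates, which is the
point of the comparison: the terms you cannot count by reading your own
code are the ones that blow up the worst case.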

Anyway, worst cases (not average cases) can get amazingly bad ... and
divisions or square roots are *nothing* compared to the other stuff
that can happen ... which is why hard real-time folks seem to prefer
simpler CPUs, with controllable caches, and understandable cycle counts :-)
--
-john mashey    DISCLAIMER: <generic disclaimer: I speak for me only...>
EMAIL:  mash@sgi.com  DDD: 650-933-3090 FAX: 650-969-6289
USPS:   Silicon Graphics/Cray Research 40U-005,
2011 N. Shoreline Blvd, Mountain View, CA 94043-1389
