From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: OS support for tagged pointers (was Re: 64-bit pointers)
Date: 22 Jul 1998 19:05:05 GMT

In article <sjcEvpCD4.FLr@netcom.com>, sjc@netcom.com (Steven Correll) writes:

|> There's at least one (infamous) example of a widely used program which
|> desperately wanted direct-to-user traps: once upon a time Unix /bin/sh
|> brazenly accessed outside-of-range addresses and relied on a signal
|> handler to allocate more memory in response. (I'm not sure one wants to
|> encourage that sort of thing!)

1) This of course caused nasty problems in the Motorola 68K timeframe,
given the issues with continuation after exception rather than restart.
Numerous companies in the early 1980s had to deal with this.

2) I've always felt bad about this, since I'm the one who accidentally
goaded Steve Bourne into doing this.  [Bell Labs Piscataway used the PWB shell
quite heavily for its scripting needs in the mid-1970s, on PDP-11s of course,
and performance was actually important.  I complained to Steve that
we just couldn't hack a 2:1 performance hit, and he went all-out to tune
the shell up, with this being one of the tricks.]

--
-john mashey    DISCLAIMER: <generic disclaimer: I speak for me only...>
EMAIL:  mash@sgi.com  DDD: 650-933-3090 FAX: 650-969-6289
USPS:   Silicon Graphics/Cray Research 6L-005,
2011 N. Shoreline Blvd, Mountain View, CA 94043-1389


From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: OS support for tagged pointers (was Re: 64-bit pointers)
Date: 24 Jul 1998 22:43:09 GMT

In article <aegl.901295470@stratus>, aegl@swdc.stratus.com (Tony Luck) writes:

|> Organization: Stratus Computer, Inc.
|>
|> richard@cogsci.ed.ac.uk (Richard Tobin) writes:

|> >Since the stack segment is usually extended in the same manner - and
|> >no-one thinks that's unreasonable - why did this cause an additional
|> >problem?
|>
|> On the 68000 stack growth *was* a problem.  The UniPlus+ port (by UniSoft)

Yes; put another way, UNIX semantics provide stack extension,
effectively implicitly from the programmer's viewpoint;
brk/sbrk have never acted that way: they are explicit calls,
rather than implicit work done by the OS.

Although various tricks have been used for the
stack extension problem, in general, they tended to look like:
	1) User code references a location that implies stack growth.
	This can be a stylized sequence of the sort Tony mentions,
	if that is necessary on the given hardware/OS.
	On some hardware/OS combinations, this is unnecessary, and a
	common strategy is to assume that a reference to unallocated memory
	beyond the stack pointer is an implicit attempt to extend the
	stack, which the OS then does.

	2) But the semantics of brk/sbrk don't work that way,
	i.e., the changes occur only in response to explicit syscalls.

	3) The OS efforts for stack extension are relatively straightforward:
	suspend the process, allocate some memory, and either restart the
	trapped instruction (if that's possible), or do the kind of special-case
	skip-over that Tony describes.

	4) As for 3), the same is true for a normal page-fault: the OS
	takes care of things, and then gets back to the user.

	5) But, for the case in question, invoking a user-level signal-handler
	has some awkward issues that strongly complexify the OS, and
	can easily have security holes if done wrong.  For the kind of case like
	stack underflow, or page fault, there is one magic continuation
	record that need be kept around at one time, and such is easily
	stuck into the kernel stack associated with a process, or u-block, etc.
	On the other hand, if you have to save the continuation record
	somewhere and call a user signal handler, the handler can execute fairly
	arbitrary code, including more syscalls, and cause page faults,
	stack underflows, etc ... and recursively.  While one might be tempted
	to stuff the information away in user space, sometimes the nature of
	the continuation information could break security.

	6) This problem sounds like a peculiarity of the
	fact that MC68Ks were popular UNIX porting targets, but it's just
	an example of a generic problem that has happened many times.
	This one is well-remembered because it just happened to be faced by
	quite a few people, all rushing products out in parallel.
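
As a rough illustration of points 1) through 3), here is a toy model of
the policy difference, not real kernel code. Everything in it (ToyProcess,
SegFault, the page size, the address layout) is a hypothetical stand-in:
a miss just below the mapped stack is treated as an implicit request to
grow, while a miss past the heap break stays an error until the program
calls sbrk explicitly.

```python
PAGE = 4096  # assumed page size for this toy model


class SegFault(Exception):
    """An access the toy OS refuses to satisfy."""


class ToyProcess:
    """Hypothetical address space: heap [0, brk), stack [stack_base, stack_top)."""

    def __init__(self, brk, stack_base, stack_top, stack_limit):
        self.brk = brk
        self.stack_base = stack_base    # lowest mapped stack address
        self.stack_top = stack_top
        self.stack_limit = stack_limit  # page-aligned floor for stack growth
        self.faults = 0

    def mapped(self, addr):
        in_heap = 0 <= addr < self.brk
        in_stack = self.stack_base <= addr < self.stack_top
        return in_heap or in_stack

    def sbrk(self, incr):
        """Explicit syscall: the only way the heap ever grows."""
        old = self.brk
        if old + incr > self.stack_limit:
            raise MemoryError("heap would run into the stack region")
        self.brk = old + incr
        return old

    def access(self, addr):
        """One load/store, with the OS fault path inlined."""
        if not self.mapped(addr):
            self.faults += 1
            if self.stack_limit <= addr < self.stack_base:
                # Implicit growth: map down to the faulting page, then let
                # the access complete as if nothing had happened.
                self.stack_base = (addr // PAGE) * PAGE
            else:
                raise SegFault(hex(addr))  # heap misses are not guessed at
        return addr
```

The shell trick discussed in this thread amounts to wanting the SegFault
branch delivered to user code, which would then call sbrk and resume;
that is the part that drags in the continuation-record problems of 5).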




From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: OS support for tagged pointers (was Re: 64-bit pointers)
Date: 29 Jul 1998 19:20:38 GMT

In article <y4r9z7dbsy.fsf@mailhost.neuroinformatik.ruhr-uni-bochum.de>,
Jan Vorbrueggen <jan@mailhost.neuroinformatik.ruhr-uni-bochum.de> writes:

|> mash@mash.engr.sgi.com (John R. Mashey) writes:
|>
|> > 	5) But, for the case in question, invoking a user-level signal-handler
|> > 	has some awkward issues that strongly complexify the OS, and
|> > 	easily have security holes if done wrong.  For the kind of case like
|> > 	stack underflow, or page fault, there is one magic continuation
|> > 	record that need be kept around at one time, and such is easily
|> > 	stuck into kernel stack associated with a process, or u-block, etc.
|> > 	On the other hand, if you have to save the continuation record
|> > 	somewhere, call a user signal handler, the handler can execute fairly
|> > 	arbitrary code, including more syscalls, cause page faults,
|> > 	stack underflows, etc ... and recursively, and while one might be tempted
|> > 	to stuff the information away in user space, sometimes the nature of
|> > 	the continuation information could break security.
|>
|> Ah, now I understand the problem better: the point is that you want to
|> offer user exception handling for a page fault, and at the same time
|> offer to restart the original faulting instruction automatically. Well,
|> why does one have to? Make it the user's exception handler's duty to
|> make the restart of the mainline code work according to the
|> programmer's intentions, and let's not have the OS guess what those
|> might be. As long as the various instructions' semantics are
|> well-defined (even if sometimes problematical, as in the case of the
|> 68k), the services such as offered by VMS' exception handling mechanism
|> should be enough. Certainly the Bourne shell's optimization of handling
|> a page fault in the heap by extending it would have worked, even on the
|> 68k, because it knows that the page fault occurs in response to one
|> well-defined operation.

Let's try again; hopefully I'll find something I posted on this topic a
year or two back that handles all this in a coherent way.  The
fundamental tricky problem is still there, and shows up in various
forms: it is the tradeoff between two extremes:
	clean, simple, portable ... but not powerful enough for some uses
vs
	complex, not very simple, unportable, sometimes dangerous, but powerful

The signal mechanism could have been designed as you suggest, but it wasn't,
and it was there years before this issue arose, for better or worse.
The suggestion above might have solved the Bourne shell's case, but it's
not particularly clean or portable across the range of CPUs to which
UNIX has been ported, and it may well rely on exact knowledge of the
compiler-generated code [which has happened], and so is also vulnerable to
changes in compiler technology.

From the software view, the cleanest simplest model is:
	(a) The CPU provides an exception PC that points at the
	    instruction that caused the exception.
	(b) The instruction has had no side-effects, other than raising
	    the exception, or at least, if there are side-effects, enough
	    information is recorded to make it easy to undo them, or in
	    case of such things as S/370's MVCL, to continue them.
	(c) All instructions logically preceding the faulting instruction
	    are done, or at the very least, there are safe sequences for
	    assuring this.
	(d) No instruction logically after the faulting instruction has
	    had any effect.
	(e) It is cleanly possible to return to the faulting instruction and
	    restart it.
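
A toy sketch of why (b) and (e) go together; this is not any real CPU.
The one-instruction ISA ("store_postinc": store through a register, then
bump the register), the Fault class, and the handler that simply grows
memory by GROW words are all assumptions made up for illustration. In
the precise variant nothing is committed before the fault is raised, so
restarting at the exception PC is harmless. In the other variant the
register bump is committed first, and a naive restart stores to the
wrong place.

```python
GROW = 4  # words the toy "OS" adds on each fault (arbitrary)


class Fault(Exception):
    def __init__(self, pc, addr):
        super().__init__(f"fault at pc={pc}, addr={addr}")
        self.pc = pc      # exception PC: the faulting instruction, as in (a)
        self.addr = addr


def step(op, reg, value, regs, mem, pc, precise):
    """Execute one instruction of the hypothetical one-opcode ISA."""
    if op != "store_postinc":
        raise ValueError(op)
    addr = regs[reg]
    if precise:
        if addr >= len(mem):
            raise Fault(pc, addr)   # nothing has been committed yet
        mem[addr] = value
        regs[reg] = addr + 1
    else:
        regs[reg] = addr + 1        # side effect committed early...
        if addr >= len(mem):
            raise Fault(pc, addr)   # ...and only then is the fault seen
        mem[addr] = value


def run(program, mem_size, precise=True):
    """Run the program; on a fault, grow memory and restart at the exception PC."""
    regs = {"r0": 0}
    mem = [0] * mem_size
    pc = 0
    while pc < len(program):
        op, reg, value = program[pc]
        try:
            step(op, reg, value, regs, mem, pc, precise)
        except Fault as fault:
            mem.extend([0] * GROW)  # fix the cause
            pc = fault.pc           # restart the same instruction, as in (e)
            continue
        pc += 1
    return regs, mem
```

Running three stores into a two-word memory gives [7, 8, 9, ...] with
r0 == 3 in the precise variant, and leaves a hole at index 2 with r0 == 4
in the other one, unless the handler knows enough to undo the bump.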

Unfortunately, these kinds of wishes:
	(a) Sometimes impose some performance penalties or implementation
	costs upon CPU designs, although most RISC chips do these without
	too much overhead.
	(b) Have often clashed with odd cases in various CPUs, including
		360/91 & 360/67 (both of which had some imprecise exceptions)
		68000 stuff mentioned before
		PRISMA SPARC had some "interesting" issues
		Intel i860
		VAX 9000 (I think, not from personal experience)
	(c) Often conflict with the wish to have both simple and fast floating
		point units, especially when the design includes
		in-order issue to multiple floating-point units with long
		latencies.  In the wish to go fast, it is easy to end up
		with designs where late-detected exceptions violate
		properties (c) or (d) of the model above.
		This is also where the old R2000 patent came from: do
		a quick check on exponents, and either guarantee there will be
		no exception and go full-blast, or decide that there might
		be an exception and stall the pipeline until done.
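
The exponent-check idea can be sketched in software. This is only an
illustration of the general principle for a double-precision multiply,
not the logic of the actual patent or of any FPU; the function names and
the thresholds are my own assumptions, chosen to be conservative, and
only overflow and underflow are considered (not inexact or invalid).

```python
import math

# math.frexp writes x as m * 2**e with 0.5 <= |m| < 1, so for a product
# |x*y| lies in [2**(ex+ey-2), 2**(ex+ey)).  The bounds below keep that
# interval inside the normal range of an IEEE 754 double.
SAFE_MIN_SUM = -1020   # below this the product might be subnormal
SAFE_MAX_SUM = 1023    # above this the product might overflow


def multiply_cannot_trap(x, y):
    """True means x*y surely neither overflows nor underflows."""
    if not (math.isfinite(x) and math.isfinite(y)):
        return False              # NaN or infinity: be pessimistic
    if x == 0.0 or y == 0.0:
        return True               # exact zero result
    _, ex = math.frexp(x)
    _, ey = math.frexp(y)
    return SAFE_MIN_SUM <= ex + ey <= SAFE_MAX_SUM


def multiply(x, y):
    """Return (product, path); path says which route hardware would take."""
    if multiply_cannot_trap(x, y):
        return x * y, "fast"      # issue at full speed
    return x * y, "stall"         # hold the pipeline until the result is known
```

The check looks only at exponents, so it is cheap and can run early; the
price is that some operations that would have been fine get stalled anyway.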

Anyway, all of this has been traditionally one of the more subtle and harder
problems to get right:
	(a) Designing language and OS interfaces that are powerful enough,
	reasonably portable, and efficient
	(b) That do not make implicit assumptions about the underlying
	hardware that get broken by reasonable later implementations.
	(c) That do not require explicit guarantees of the underlying
	hardware and OS, that either constrain later systems, or need
	to get changed.

Following are some of the common assumptions that have been made that
caused trouble later:

	(a) Systems are uniprocessors, therefore the OS can use set-priority
	level for mutual exclusion.
		Broken by SMPs.
	(b) Within a given process, there is only one thread.
		Broken by threads, major work to get thread-safe libraries.
		errno is not such a good idea :-)
	(c) A CPU effectively is a sequential machine that executes one
		instruction at a time, in order.
		Broken by machines with funny exception models.
	(d) Not only (c), but it is allowed for one instruction to store
		into the immediately following instruction, and have that
		work.  [Code I wrote on S/360s in 1969 did this, and it
		still is being used; all the S/360-derivative family have
		had to implement this painful feature.]
	(e) A CPU uses strong-ordering of all memory accesses.
		Broken, sometimes, by distinctions between cached and
		uncached accesses, and of course by the great desire to
		go to weaker orderings in the presence of caches,
		long memory latencies, write buffers, etc, etc.
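
The errno remark under (b) can be shown with a small sketch. It uses
Python threads as a stand-in for C and a made-up failing_call; the error
codes 2 and 13 are arbitrary, and the two Events exist only to force the
unlucky interleaving every time. With one shared errno-style global,
thread A reads back the code left by thread B; with per-thread storage
it reads its own.

```python
import threading

last_error = 0                  # shared, errno-style global
tls = threading.local()         # per-thread storage


def failing_call(code, use_tls):
    """Pretend library call: fails and records an error code."""
    global last_error
    if use_tls:
        tls.last_error = code
    else:
        last_error = code
    return -1


def demo(use_tls):
    """A fails, then B fails, then A looks at 'its' error code."""
    a_failed = threading.Event()
    b_failed = threading.Event()
    seen = {}

    def thread_a():
        failing_call(2, use_tls)
        a_failed.set()
        b_failed.wait()          # B's call lands before A checks the code
        seen["a"] = tls.last_error if use_tls else last_error

    def thread_b():
        a_failed.wait()
        failing_call(13, use_tls)
        b_failed.set()

    threads = [threading.Thread(target=thread_a),
               threading.Thread(target=thread_b)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen["a"]
```

Here demo(False) returns 13, the other thread's error, while demo(True)
returns 2. Making errno per-thread is the usual fix, but every library
that quietly assumed a single thread needs the same kind of audit.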

