From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: OS support for tagged pointers (was Re: 64-bit pointers)
Date: 22 Jul 1998 19:05:05 GMT
In article <sjcEvpCD4.FLr@netcom.com>, sjc@netcom.com (Steven Correll) writes:
|> There's at least one (infamous) example of a widely used program which
|> desperately wanted direct-to-user traps: once upon a time Unix /bin/sh
|> brazenly accessed outside-of-range addresses and relied on a signal
|> handler to allocate more memory in response. (I'm not sure one wants to
|> encourage that sort of thing!)
1) This of course caused nasty problems in the Motorola 68K timeframe,
given the issues with continuation after exception rather than restart.
Numerous companies in the early 1980s had to deal with this.
2) I've always felt bad about this, since I'm the one who accidentally
goaded Steve Bourne into doing this. [Bell Labs Piscataway used the PWB shell
quite heavily for its scripting needs in the mid-1970s, on PDP-11s of course,
and performance was actually important. I complained to Steve that
we just couldn't hack a 2:1 performance hit, and he went all-out to tune
the shell up, with this being one of the tricks.]
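[For concreteness, a minimal sketch of the kind of trick being described, in
modern C; this is a hypothetical reconstruction, not the actual Bourne shell
source. The whole scheme hinges on the kernel restarting the faulting
instruction after the handler returns, which is exactly the property the
68000 discussion below is about.]

    /* Hypothetical sketch, not the real /bin/sh: all allocation comes
     * from the data segment; when code touches an address beyond the
     * current break, the SIGSEGV handler grows the segment with sbrk()
     * and simply returns, counting on the kernel to re-execute the
     * faulting instruction as if nothing had happened. */
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    #define CHUNK (64 * 1024)

    static void grow(int sig)
    {
        (void)sig;
        sbrk(CHUNK);    /* extend the data segment; the fault is synchronous,
                           so this worked in practice even though sbrk() is
                           not, formally, async-signal-safe */
    }

    int main(void)
    {
        signal(SIGSEGV, grow);          /* handler stays installed, as in the shell */

        char *p = (char *)sbrk(0) + 16; /* just past the current break */
        *p = 'x';                       /* may fault; handler grows, kernel retries */
        printf("got %c back\n", *p);
        return 0;
    }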
--
-john mashey DISCLAIMER: <generic disclaimer: I speak for me only...>
EMAIL: mash@sgi.com DDD: 650-933-3090 FAX: 650-969-6289
USPS: Silicon Graphics/Cray Research 6L-005,
2011 N. Shoreline Blvd, Mountain View, CA 94043-1389
From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: OS support for tagged pointers (was Re: 64-bit pointers)
Date: 24 Jul 1998 22:43:09 GMT
In article <aegl.901295470@stratus>, aegl@swdc.stratus.com (Tony Luck) writes:
|> Organization: Stratus Computer, Inc.
|>
|> richard@cogsci.ed.ac.uk (Richard Tobin) writes:
|> >Since the stack segment is usually extended in the same manner - and
|> >no-one thinks that's unreasonable - why did this cause an additional
|> >problem?
|>
|> On the 68000 stack growth *was* a problem. The UniPlus+ port (by UniSoft)
Yes; put another way, UNIX semantics provide stack extension,
effectively implicitly from the programmer's viewpoint;
brk/sbrk have never acted the same way: they are explicit calls,
rather than implicit work provided by the OS.
Although various tricks have been used for the
stack extension problem, in general, they tended to look like:
1) User code references a location that implies stack growth.
This can be a stylized sequence of the sort Tony mentions,
if the given hardware/OS requires one.
On other hardware/OS combinations this is unnecessary, and a
common strategy is to treat a reference to unallocated memory
beyond the stack pointer as an implicit attempt to extend the
stack, and the OS simply does so.
2) But the semantics of brk/sbrk don't work that way,
i.e., the changes occur only in response to explicit syscalls.
3) The OS efforts for stack extension are relatively straightforward:
suspend the process, allocate some memory, and either restart the
trapped instruction (if that's possible), or do the kind of special-case
skip-over that Tony describes.
4) As for 3), the same is true for a normal page fault: the OS
takes care of things, and then gets back to the user.
5) But, for the case in question, invoking a user-level signal handler
has some awkward issues that seriously complicate the OS, and can
easily open security holes if done wrong. For a case like
stack underflow, or a page fault, there is one magic continuation
record that needs to be kept around at any one time, and that is easily
stuck into the kernel stack associated with a process, or the u-block, etc.
On the other hand, if you have to save the continuation record
somewhere and call a user signal handler, the handler can execute fairly
arbitrary code, including more syscalls, page faults,
stack underflows, etc. ... and recursively; and while one might be tempted
to stuff the information away in user space, sometimes the nature of
the continuation information could break security. (A sketch of what
such a handler is handed follows this list.)
6) This problem sounds like a peculiarity of the fact that
MC68Ks were popular UNIX porting targets, but it's just
an example of a generic problem that has happened many times.
This one is well-remembered because it just happened to be faced by
quite a few people, all rushing products out in parallel.
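[As a concrete illustration of 5): a sketch, in modern POSIX terms and
assuming Linux/x86-64 for the register name, of what the kernel ends up
handing a user-level handler. The siginfo and ucontext arguments are
precisely the continuation record pushed out to user space; the handler may
inspect it, or even rewrite it, and the kernel will believe whatever it gets
back, which is where the complexity and the security worries come from.]

    /* Sketch only; assumes Linux/x86-64 for REG_RIP. */
    #define _GNU_SOURCE
    #include <signal.h>
    #include <stdio.h>
    #include <string.h>
    #include <ucontext.h>
    #include <unistd.h>

    static void on_segv(int sig, siginfo_t *si, void *uc_void)
    {
        ucontext_t *uc = uc_void;
        (void)sig;

        /* The faulting address and the PC the kernel will resume at:
         * this is the continuation record, now living in user space. */
        fprintf(stderr, "fault at %p, resume PC %#llx\n",
                si->si_addr,
                (unsigned long long)uc->uc_mcontext.gregs[REG_RIP]);

        /* A handler is free to modify uc before returning -- skip the
         * faulting instruction, redirect the stack, etc. -- and the
         * kernel will reload whatever it finds here.  Getting that
         * interface wrong is how the security holes happen. */
        _exit(1);       /* in this sketch, just give up */
    }

    int main(void)
    {
        struct sigaction sa;
        memset(&sa, 0, sizeof sa);
        sa.sa_sigaction = on_segv;
        sa.sa_flags = SA_SIGINFO;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        *(volatile int *)0 = 42;        /* deliberately fault */
        return 0;
    }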
--
-john mashey DISCLAIMER: <generic disclaimer: I speak for me only...>
EMAIL: mash@sgi.com DDD: 650-933-3090 FAX: 650-969-6289
USPS: Silicon Graphics/Cray Research 6L-005,
2011 N. Shoreline Blvd, Mountain View, CA 94043-1389
From: mash@mash.engr.sgi.com (John R. Mashey)
Newsgroups: comp.arch
Subject: Re: OS support for tagged pointers (was Re: 64-bit pointers)
Date: 29 Jul 1998 19:20:38 GMT
In article <y4r9z7dbsy.fsf@mailhost.neuroinformatik.ruhr-uni-bochum.de>,
Jan Vorbrueggen <jan@mailhost.neuroinformatik.ruhr-uni-bochum.de> writes:
|> mash@mash.engr.sgi.com (John R. Mashey) writes:
|>
|> > 5) But, for the case in question, invoking a user-level signal handler
|> > has some awkward issues that seriously complicate the OS, and can
|> > easily open security holes if done wrong. For a case like
|> > stack underflow, or a page fault, there is one magic continuation
|> > record that needs to be kept around at any one time, and that is easily
|> > stuck into the kernel stack associated with a process, or the u-block, etc.
|> > On the other hand, if you have to save the continuation record
|> > somewhere and call a user signal handler, the handler can execute fairly
|> > arbitrary code, including more syscalls, page faults,
|> > stack underflows, etc. ... and recursively; and while one might be tempted
|> > to stuff the information away in user space, sometimes the nature of
|> > the continuation information could break security.
|>
|> Ah, now I understand the problem better: the point is that you want to
|> offer user exception handling for a page fault, and at the same time
|> offer to restart the original faulting instruction automatically. Well,
|> why does one have to? Make it the user's exception handler's duty to
|> make the restart of the mainline code work according to the
|> programmer's intentions, and let's not have the OS guess what those
|> might be. As long as the various instructions' semantics are
|> well-defined (even if sometimes problematical, as in the case of the
|> 68k), services such as those offered by VMS's exception handling mechanism
|> should be enough. Certainly the Bourne shell's optimization of handling
|> a page fault in the heap by extending it would have worked, even on the
|> 68k, because it knows that the page fault occurs in response to one
|> well-defined operation.
Let's try again; hopefully I'll find something I posted on this topic a
year or two back that handles all this in a coherent way, but the
fundamental tricky problem is still there, and it shows up in various
forms: the tradeoff between two extremes:
clean, simple, portable ... but not powerful enough for some uses
vs
complex, not very simple, unportable, sometimes dangerous, but powerful
The signal mechanism could have been designed as you suggest, but it wasn't,
and it was there years before this issue arose, for better or worse.
The suggestion above might have solved the Bourne shell's case, but it's
not particularly clean or portable across the range of CPUs on which
UNIX has been ported, and it may well rely on exact knowledge of the
compiler-generated code [which has happened], and so is also vulnerable to
changes in compiler technology.
From the software view, the cleanest simplest model is:
(a) The CPU provides an exception PC that points at the
instruction that caused the exception.
(b) The instruction has had no side-effects, other than raising
the exception, or at least, if there are side-effects, enough
information is recorded to make it easy to undo them, or in
case of such things as S/370's MVCL, to continue them.
(c) All instructions logically preceding the faulting instruction
are done, or at the very least, there are safe sequences for
assuring this.
(d) No instruction logically after the faulting instruction has
had any effect.
(e) It is cleanly possible to return to the faulting instruction and
restart it.
Unfortunately, these kinds of wishes:
(a) Sometimes impose performance penalties and implementation
costs on CPU designs, although most RISC chips do these without
too much overhead.
(b) Have often clashed with odd cases in various CPUs, including
360/91 & 360/67 (both of which had some imprecise exceptions)
68000 stuff mentioned before
PRISMA SPARC had some "interesting" issues
Intel i860
VAX 9000 (I think, not from personal experience)
(c) Often conflict with the wish to have both simple and fast floating
point units, especially when the design included
in-order issue to multiple floating-point units with long
latencies. In the wish to go fast, it is easy to end up
with designs where late-detected exceptions violate (c) or (d).
This is also where the old R2000 patent came from: do
a quick check on exponents, either guaranteeing there could be
no exception and going full-blast, or deciding that there might
be an exception and stalling the pipeline until it was done.
(A software caricature of that exponent screen follows this list.)
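[To give the flavor of that exponent screen, a software caricature under
stated assumptions about IEEE double format; this is only an analogy to the
idea, not the actual R2000 logic, which of course was a few gates on the
operand exponents ahead of the FP unit.]

    /* Caricature of the quick exponent check: before committing a
     * long-latency FP multiply, decide conservatively whether it could
     * possibly raise a trapping exception (overflow, underflow, invalid).
     * If it provably cannot, keep issuing at full speed; otherwise stall
     * until the result (and any exception) is known. */
    #include <math.h>
    #include <stdbool.h>

    static bool fmul_cannot_trap(double a, double b)
    {
        int ea, eb;

        /* NaN, infinity, zero, and denormal operands all take the slow path. */
        if (!isnormal(a) || !isnormal(b))
            return false;

        frexp(a, &ea);          /* a = m * 2^ea, with 0.5 <= |m| < 1 */
        frexp(b, &eb);

        /* The product's exponent is about ea + eb; leave a wide guard band
         * so overflow (binary exponent > 1024) and underflow (< -1021)
         * are clearly impossible. */
        int e = ea + eb;
        return e < 1000 && e > -1000;
    }

    /* Conceptual issue logic:
     *     if (fmul_cannot_trap(a, b))  issue and keep going full-blast;
     *     else                         stall until the multiply completes.
     */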
Anyway, all of this has traditionally been one of the more subtle and harder
problems to get right:
(a) Designing language and OS interfaces that are powerful enough,
reasonably portable, and efficient
(b) That do not make implicit assumptions about the underlying
hardware that get broken by reasonable later implementations.
(c) That do not require explicit guarantees from the underlying
hardware and OS that either constrain later systems or need
to be changed.
Following are some of the common assumptions that have been made that
caused trouble later:
(a) Systems are uniprocessors, therefore the OS can use set-priority
level for mutual exclusion.
Broken by SMPs.
(b) Within a given process, there is only one thread.
Broken by threads, major work to get thread-safe libraries.
errno is not such a good idea :-)
(c) A CPU effectively is a sequential machine that executes one
instruction at a time, in order.
Broken by machines with funny exception models.
(d) Not only (c), but it is allowed for one instruction to store
into the immediately following instruction, and have that
work. [Code I wrote on S/360s in 1969 did this, and it
still is being used; all the S/360-derivative family have
had to implement this painful feature.]
(e) A CPU uses strong ordering of all memory accesses.
Broken, sometimes, by distinctions between cached and
uncached accesses, and of course by the great desire to
move to weaker orderings in the presence of caches,
long memory latencies, write buffers, etc., etc.
(A small illustration follows this list.)
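[A small illustration of (e), in C11 terms that postdate this post: the
classic flag/data handoff. On a machine that strongly orders all memory
accesses, plain stores in program order would suffice; with write buffers,
caches, and weaker orderings, the explicit release/acquire pair below is
what guarantees the consumer sees the data whenever it sees the flag.]

    /* Sketch: producer publishes data, then sets a flag; consumer waits
     * for the flag, then reads the data.  The release/acquire ordering
     * stands in for the strong ordering the old assumption took for free. */
    #include <stdatomic.h>
    #include <stdbool.h>
    #include <stdio.h>
    #include <threads.h>

    static int data;
    static atomic_bool ready;

    static int producer(void *arg)
    {
        (void)arg;
        data = 42;                                          /* payload first */
        atomic_store_explicit(&ready, true, memory_order_release);
        return 0;
    }

    static int consumer(void *arg)
    {
        (void)arg;
        while (!atomic_load_explicit(&ready, memory_order_acquire))
            ;                                               /* spin on the flag */
        printf("data = %d\n", data);                        /* guaranteed 42 */
        return 0;
    }

    int main(void)
    {
        thrd_t p, c;
        thrd_create(&c, consumer, NULL);
        thrd_create(&p, producer, NULL);
        thrd_join(p, NULL);
        thrd_join(c, NULL);
        return 0;
    }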
--
-john mashey DISCLAIMER: <generic disclaimer: I speak for me only...>
EMAIL: mash@sgi.com DDD: 650-933-3090 FAX: 650-969-6289
USPS: Silicon Graphics/Cray Research 6L-005,
2011 N. Shoreline Blvd, Mountain View, CA 94043-1389