Index Home About Blog
From: torvalds@transmeta.com (Linus Torvalds)
Newsgroups: comp.std.c
Subject: Comments on "norestrict"ed pointer extension
Date: 8 Jun 1999 20:26:44 GMT

Ho humm..

As part of doing Linux work (something that so far has only been done
using gcc as the compiler), we're starting to have trouble with alias
analysis based on type information.  The kernel often bypasses the types
and does various nasty mangling of the type information.  And that makes
the standard alias type rules do the wrong thing for us.

So what I did, was to propose a extension to gcc where we had a way of
telling the compiler to _not_ depend on the normal type-based alias
analysis, and only fall back on the statically provable alias
information.

In a very real sense, this is the reverse of the "restrict" keyword that
was introduced to allow compilers to optimize more.  I'll call it
"norestrict" for obvious reasons.

Now, when approaching the gcc people I made a mistake: I didn't
initially explain exactly what I wanted and what the ramifications were,
and that resulted in some personal flames.  That, together with this not
being a standard thing, made many gcc people less than enthusiastic, and
some people suggested I'd bring the issue up as a standards extension
_first_.  Which I am hereby doing.

The problem is not that we can't do certain type conversions, it's
mainly that in order to be strictly conforming in those conversions (not
that Linux is strictly conforming in many other respects) to make the
compiler happy, the syntactic issues really explode in your face, and it
becomes really ugly to do (and the compiler usually ends up generating
_really_ bad code because you've done memcpy()'s or similar to avoid the
alias rules).

What I am looking for is comments about a "norestrict" kind of keyword,
that works syntactically similarly to the restrict case, but which
instead of saying "this pointer will never alias with any other pointer
of its kind" it says "this pointer can alias with any other pointer even
if the other pointer is _not_ of the same kind".  See the mirror
symmetry?

Just as an example, a "char *" in current C style is always considered
to be unrestricted, and this would extend those kinds of semantics to
other pointers too at the discretion of the programmer.

I additionally suggested an extended rule which would mean that any
explicit cast would always imply this "norestrict" sense for that
particular cast, but before getting into that issue I'd like to see what
people think about the more limited proposal.  Does everybody but me
hate it as a concept?

		Linus


From: torvalds@transmeta.com (Linus Torvalds)
Newsgroups: comp.std.c
Subject: Re: Comments on "norestrict"ed pointer extension
Date: 9 Jun 1999 04:17:31 GMT

In article <7jkcok$4j9$1@murrow.corp.sgi.com>,
John R. Mashey <mash@mash.engr.sgi.com> wrote:

>In article <7jjua4$7jh$1@palladium.transmeta.com>,
>torvalds@transmeta.com (Linus Torvalds) writes:
>
>I'm not quite sure what you want, maybe you can say more...

Sure, I'm a blabbermouth..

>|> In a very real sense, this is the reverse of the "restrict" keyword that
>|> was introduced to allow compilers to optimize more.  I'll call it
>|> "norestrict" for obvious reasons.
>
>|> What I am looking for is comments about a "norestrict" kind of keyword,
>|> that works syntactically similarly to the restrict case, but which
>|> instead of saying "this pointer will never alias with any other pointer
>|> of its kind" it says "this pointer can alias with any other pointer even
>|> if the other pointer is _not_ of the same kind".  See the mirror
>|> symmetry?
>First question: when you say "any other pointer", do you mean:
>	(a) Any other pointers, including restricts
>	(b) Any other pointers, except those declared restrict
>	(c) Any other norestrict pointers

My gut feeling would say "b". Anybody who has explicitly told the world
that a pointer is restricted probably had a very good idea of what he
was doing, so imo the compiler might as well just always consider
explicitly "restrict"ed pointers to be completely alias-free regardless
of what else is going on.

(c) in turn is fairly useless: only having it be a "norestrict" pointer
restriction towards other "norestrict" pointers means that you have to
declare everything norestrict, even pointers that you don't consider
wild.

As an example of the use of "restrict", imagine a inline function that
does something similar to memcpy(), but with extensions (munging its
data some way etc). For example, something like "memcpy_toio()" which is
the Linux way of copying from a regular address space to a IO device in
PCI space.

The source pointer of that copy would be marked "norestrict", something
like this:

	inline void memcpy_toio(unsigned long pci_addr, norestrict void *src,
		unsigned int count)
	{
		.. do the work ..
	}

and the functionality I'm looking for is to be able to do (extremely
contrived example, don't consider this to be really real-world at all):

	double x;

	x = ....:
	memcpy_toio(device_addr, &x, sizeof(x));

while I would feel comfortable that the compiler would _not_ re-order
the assignment to x across the use of it only because my inline function
happened to use a "random" type (usually a 32-bit integer type) to do
the actual accesses.

So remember that while the specific example above is made up, I hope
that it kind of illuminates the problem set: a specific set of accesses
would be considered to potentially alias with just about anything else,
because they do bitwise accesses.

Right now there is no easy way to tell the compiler to not use the
type-based alias information in inline functions like the above (and
note that with many compilers the "inline" directive might actually be
something that is just implied - it might not actually have been marked
that way by the programmer).

>a) Seems a bad idea, as it overrides restrict.

I agree. There is no reason to override restrict that I see.

>c) Might be an OK idea, in that one is declaring a bunch of pointers for
>which the compiler must be careful, but most code gets whatever
>optimization it normally gets, and at least the norestrict is a good hint
>of tricks.

That was the intent. Using "norestrict" to tell the compiler that there
is something going on that it really doesn't understand, and should not
try to use the normal alias rules that rely on type information that is
no longer dependable because we're doing low-level operations that
really operate on streams of memory rather than any specific type.

>b) Seems a little bothersome, in that:
>
>norestrict	others
>		storeback	dead
>load		X
>store		X		X
>
>I.e., a store through a norestrict would seem to kill any values in
>registers, and any modified values in registers need to be written back to
>memory, and then the load or store done, if there is any chance of storage
>overlap.

Hmm..  I don't really see what you're aiming at as a problem.  It would
certainly kill cached contents of memory, yes.  But that's what the
current rules say happen for "char *" accesses or accesses of the same
type, so this should not at least complicate the compiler - the logic
must already have been there, it's just that we're expanding the logic
to allow user-defined behaviour.

>	norestrict short *p1;  int *p2;
>
>	*p2 = *p2 +1;	/* good compiler has *p2 in register */
>	*p1 = a;	/* must write *p2 back to memory, write a, kill *p2 */
>
>Really good optimizers may be able to know that some cases are safe.

Any static alias analysis can obviously be done _regardless_ of either
restrict or norestrict.  That includes the normal cass of offsets off a
common pointer, and knowing that local variables that haven't had their
address taken cannot alias with anything - regardless of restrict or
norestrict. The "norestrict" keyword would =not= mean that the compiler
cannot prove to itself that it cannot alias.

But yes, the above definitely would imply that you cannot cache the
contents of *p2 across the assignment of *p1 - exactly because of the
"norestrict" addition. But the user must have had some reason for the
"norestrict", and the compiler must have the notion of "potential
aliases" anyway, so I don't see that as a lack of optimization or a
complication of the compiler.

>Anyway, does any of these capture what you're looking for?

Your example exactly captures the behaviour I was looking for.  I added
my own mainly to show the kind of things I'm encountering, and where the
regular ANSI aliasing rules would lead to some rather silly syntactic
complications.

Right now this all means that we compile the kernel with a gcc flag to
turn off the strict type-based alias code, but what I'm really looking
for is a good way to take advantage of the type-based alias stuff (which
obviously works wonderfully in all the _common_ code) without horrible
syntax. Thus "norestrict".

(There is obviously lots of old code that would also similarly benefit
from "norestrict", although people on the gcc lists seem to be of the
opinion that it has all been fixed wrt clever compilers that know about
the alias issues.  I'm not as convinced myself, but...)

		Linus


From: torvalds@transmeta.com (Linus Torvalds)
Newsgroups: comp.std.c
Subject: Re: Comments on "norestrict"ed pointer extension
Date: 9 Jun 1999 04:19:49 GMT

In article <375DE6CE.CF94F936@null.net>,
Douglas A. Gwyn <DAGwyn@null.net> wrote:
>
>Sounds like a real mess,

Why? I'm looking for feedback, but I'd like a bit more than "a real
mess" ;)

<	 but anyway: use "volatile" -- it tells the
>compiler to treat the variable strictly according to the abstract
>machine semantics (no cached values).

No, volatile does a lot more than that.

Sure, volatile _might_ work, but volatile is like swatting a fly with an
atom bomb.

And from my reading of the standard, there isn't even any guarantee that
the compiler couldn't do alias analysis even with volatile accesses, so
the point is pretty moot anyway.

		Linus


From: torvalds@transmeta.com (Linus Torvalds)
Newsgroups: comp.std.c
Subject: Re: Comments on "norestrict"ed pointer extension
Date: 10 Jun 1999 01:05:32 GMT

In article <375ED317.D247790D@null.net>,
Douglas A. Gwyn <DAGwyn@null.net> wrote:
>Nick Maclaren wrote:
>> There is little doubt that the intent of the committee is that both
>> initialisation and argument association are "storing" operations,
>
>I think there is *considerable* doubt that initialization is a
>"storing operation".
>
>Anyway, this is all irrelevant to Linus's issue.  He could either:
>(a) use  proper types; or (b) use "volatile" declarations to force
>the compiler to synchronize storage so that his type puns work.

No.

The thing is, that "volatile" _really_ doesn't do what I want it to do.
Let me count the ways:

 - "volatile" is basically completely undefined behaviour. The standard
   says that the compiler can do pretty much whatever it chooses,
   although it gives some pointers about what the "best effort" should
   be and what "volatile" was meant to imply.

   In short, "volatile" is a bad feature. It's not well-defined, and
   whatever the compiler does you can't really complain. In my opinion,
   any code that depends on volatile is almost certainly buggy as the
   standard stands now (the only case I really personally approve of is
   a "clock tick counter" kind of use)

 - "volatile" as it stands now does NOT guarantee that the compiler
   hasn't done the normal alias decisions and re-ordered other accesses.
   In short, while "volatile" _does_ create bad code, it does NOT imply
   the specific kind of anti-optimization that "norestrict" is meant to
   imply.

   People seem to think that because "norestrict" implies a certain
   class of lack of optimizations, and because "volatile" implies a
   certain class of optimizations that they should magically be the
   same. They are not the same at all, and I don't even see the logic
   why they somehow are classed together.

"Volatile" may be fixable.  A future standard might have a better
definition of it that might make it clear what the heck it is supposed
to really do.  But even if it did, that is not going to be a replacement
for "norestrict", because it's fundamentally a question of two
completely different issues.

So "norestrict" really is much more closely related to "restrict" than
volatile. Thus the name.

"restrict" says that a certain pointer is guaranteed to not alias with
other pointers, and that type-based alias analysis is unnecessary: you
can just take for granted that a restricted pointer does not alias with
anything else even if the two pointers are in the same type set.

"norestrict" says the reverse: it says that type-based alias analysis is
unnecessary, but that's because we know there can be aliases
_regardless_ of the types of the pointers. See? They both end up being
doing the same thing, but returning the "reverse" decision.

Think of it as a decision tree:

 - can the accesses be statically determined not to alias (ie we know
   enough about them that we can tell that there is no way they can
   alias)?

	   Yes: return OK_TO_REORDER

 - Is either access through a pointer of type "restrict"?

	   Yes: return OK_TO_REORDER

 - Is either access through a pointer of type "norestrict"?

	   Yes: return NOT_OK_TO_REORDER

 - Can we use type-based information to determine that they can be
   re-ordered?

	    Yes: return OK_TO_REORDER

 - else: we don't know enough about the accesses, we cannot consider
   them independent:

	    return NOT_OK_TO_REORDER

Note that "volatile" really is done at a completely different level.
"volatile" has various rules about multiple accesses to the same
location (which can be seen as a subcase of the aliasing issue: is the
location considered to be independent wrt itself?), rules that are _not_
implied in the handling of "norestrict".

In short: the current standard does not seem to give _ANY_ way to do
what I'm asking for, except for the "memcpy()" approach or the "do it a
byte at a type" approach - both of which are obviously broken for fairly
obvious reasons. "volatile" is not the answer.

		Linus


From: torvalds@transmeta.com (Linus Torvalds)
Newsgroups: comp.std.c
Subject: Re: Comments on "norestrict"ed pointer extension
Date: 10 Jun 1999 01:26:50 GMT

In article <christian.bau-0906991106080001@christian-mac.isltd.insignia.com>,
Christian Bau <christian.bau@isltd.insignia.com> wrote:

>"norestrict" would only make sense if using a "norestrict" pointer would
>give defined behavior in cases where the behavior is currently undefined.

Agreed. That is exactly the intent.

>That is quite easy: Just define that storing to a "norestrict" pointer is
>the same as memcpy from a temporary location, and accessing an lvalue
>through a "norestrict" pointer is the same as memcpy to a temporary
>location and reading the value from that location.

Again, 100% agreement.  That's exactly the semantics I'm after, although
unlike the "memcpy" approach the "norestrict" pointer has things like
alignment information associated with it.

>On the other hand, memcpy is a standard library function and could be
>implemented by the compiler in a clever way. So instead of using a
>"norestrict" pointer, use memcpy and hope that the compiler is clever.

That's where we don't agree any more.

I agree that it would work.  But I happen to think that it is absolutely
horrible syntax, and that it is a deficiency in the language that
apparently the horrible syntax is required.  I happen to also feel that
"undefined behaviour" is a BAD feature of any language - and allowing
people to define the behaviour is good.

Which is why I suggested "norestrict" pointers.  If a compiler writer
wants to implement them internally as memcpy(), and the compiler is good
at optimizing them away, that's fine.  So yes, you could choose to
implement it as a pre-processing phase of the compiler if you really
wanted to.

I happen to think the much simpler straightforward implementation would
be done by just simply extending the alias logic, but from a language
standpoint the two would obviously be equivalent.

			Linus


From: torvalds@transmeta.com (Linus Torvalds)
Newsgroups: comp.std.c
Subject: Re: Comments on "norestrict"ed pointer extension
Date: 10 Jun 1999 01:39:45 GMT

In article <01beb274$70a14440$daee58c0@aaron>,
David Jones <drj@pobox.com> wrote:

> [ complain to compiler vendor if "memcpy()" doesn't generate good code ]

Sure. And that's just a trivial thing to fix?

It so happens that the compiler vendor in question is open source.  The
"norestrict" thing I would at least have a chance in hell of actually
implementing myself, although for fairly obvious reasons I'd be much
happier if somebody else did so (people have made some early patches
that apparently still have some problems, but it just shows that it's
not that far off from being reality).

The memcpy() case I wouldn't get that far on, I suspect.

If you expect the "memcpy()" thing to be easy or expect many vendors to
do that optimization, I wonder what planet of super-humans you come from ;)

And that's ignoring the horrible syntax issue completely - I really
think the syntax is one of the most important issues for the whole
thing.

> 		You might get more mileage by asking for the
>sequence in load_int above to be optimized to the obvious code.

Sure.  I might ask for my car to be supercharged to go 250 mph.  But _I_
think I'd get more mileage from my local garage if I instead asked them
to just tune my engine a bit..

If you really see the "memcpy()" approach as being a reasonable one,
then I don't see any reason why the compiler couldn't just do that
internally whenever it sees "norestrict".

But "norestrict" is meant exactly so that it would be _easier_ for the
compiler writers to get a much more limited case right, while still
giving people who do _want_ to access memory the oldfashioned way a way
to avoid having the compiler re-order it.

			Linus

Index Home About Blog