From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 1/1] mm: unify pmd_free() implementation
Date: Mon, 28 Jul 2008 15:57:32 UTC
Message-ID: <fa.vqM0PVj7wSSBhNp/oj2FWz4Hll4@ifi.uio.no>

On Mon, 28 Jul 2008, Andrea Righi wrote:
>
> Move multiple definitions of pmd_free() from different include/asm-* into
> mm/util.c.

But this is horrible, because it forces a totally unnecessary function
call for that empty function.

Yeah, the function will be cheap, but the call itself will not be (it's a
C language barrier and basically disables optimizations around it, causing
things like register spill/reload for no good reason).

		Linus
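
A minimal sketch of the point above, assuming GCC or Clang. Every name
here (demo_pmd_t, demo_pmd_free_inline, and so on) is hypothetical and
is not the kernel's real API. The noinline attribute and the empty asm
only imitate, in a single file, a function whose body lives in another
.c file and is therefore invisible to the caller.

```c
/* Hypothetical stand-ins; these are not the kernel's real types or names. */
typedef struct { unsigned long val; } demo_pmd_t;

/* Header-style version: the body is visible at every call site, so an
 * optimizing compiler can drop the call completely. */
static inline void demo_pmd_free_inline(demo_pmd_t *pmd)
{
	(void)pmd;
}

/* Out-of-line version, imitating a definition that lives in another .c
 * file. noinline plus the empty asm keep GCC/Clang from inlining or
 * deleting the call in this single-file example. */
__attribute__((noinline)) void demo_pmd_free_outofline(demo_pmd_t *pmd)
{
	(void)pmd;
	__asm__ __volatile__("");
}

/* With the inline version nothing is actually called, so this can be
 * compiled as a leaf function. */
unsigned long demo_sum_inline(const unsigned long *v, int n, demo_pmd_t *pmd)
{
	unsigned long sum = 0;
	for (int i = 0; i < n; i++)
		sum += v[i];
	demo_pmd_free_inline(pmd);
	return sum;
}

/* Same logic, but now there is a real call, and sum has to survive it. */
unsigned long demo_sum_outofline(const unsigned long *v, int n, demo_pmd_t *pmd)
{
	unsigned long sum = 0;
	for (int i = 0; i < n; i++)
		sum += v[i];
	demo_pmd_free_outofline(pmd);
	return sum;
}
```

Both functions return the same value; any difference shows up only in
the generated code, which can be compared with something like
"gcc -O2 -S" on the file.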


From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 1/1] mm: unify pmd_free() implementation
Date: Mon, 28 Jul 2008 16:51:36 UTC
Message-ID: <fa.lYfqOZME1DLTqmpOd/JP54BZ95U@ifi.uio.no>

On Mon, 28 Jul 2008, James Bottomley wrote:
>
> Are you sure about this (the barrier)?

I'm sure. Try it. It perturbs the code quite a bit to have a function call
in the thing, because it

 - clobbers all callee-clobbered registers.

   This means that all functions that _used_ to be leaf functions and
   needed no stack frame at all (because they were simple enough to use
   only the callee-clobbered registers) are suddenly now going to be
   significantly more costly.

   Ergo: you get more stack movement with save/restore crud.

 - it is a barrier wrt any variables that may be visible externally
   (perhaps because they had their address taken), so it forces a flush to
   memory for those.

 - if it has arguments and return values, it also ends up forcing a
   totally unnecessary argument setup (and all the fixed register crap
   that involves, which means that you lost almost all your register
   allocation freedom - not that you likely care, since most of your
   registers are dead _anyway_ around the function call)

So empty function calls are _deadly_, especially if the code was a leaf
function before, and suddenly isn't any more.

On the other hand, there are also many cases where function calls won't
matter much at all. If you had other function calls around that same area,
all the above issues essentially go away, since your registers are dead
anyway, and the function obviously wasn't a leaf function before the new
call.

So it does depend quite a bit on the pattern of use. And yes, function
argument setup can be a big part of it too.

				Linus
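
The "barrier wrt externally visible variables" item in the list above
can be sketched as follows, again assuming GCC or Clang. The names
demo_hits and demo_empty_hook are hypothetical. The "memory" clobber in
the hook only models the fact that, for a function defined in another
file, the compiler cannot see that the body is empty.

```c
/* Hypothetical global: visible outside these functions, so any opaque
 * call must be assumed to read or write it. */
unsigned long demo_hits;

/* Stand-in for an empty function defined elsewhere. The clobber imitates
 * the compiler's lack of knowledge about an external function body. */
__attribute__((noinline)) void demo_empty_hook(void)
{
	__asm__ __volatile__("" ::: "memory");
}

/* No call in the loop: the compiler is free to accumulate the count in
 * a register and store demo_hits once at the end. */
void demo_count_plain(const int *v, int n)
{
	for (int i = 0; i < n; i++)
		if (v[i] > 0)
			demo_hits++;
}

/* Call in the loop: demo_hits has to be up to date in memory before
 * each call and is treated as unknown after it. */
void demo_count_with_call(const int *v, int n)
{
	for (int i = 0; i < n; i++) {
		if (v[i] > 0)
			demo_hits++;
		demo_empty_hook();
	}
}
```

The two loops compute the same count. Whether the second one really
costs more depends on the surrounding code, as the message says: if the
loop already contained other calls, the extra one changes little.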


From: Linus Torvalds <torvalds@linux-foundation.org>
Newsgroups: fa.linux.kernel
Subject: Re: [PATCH 1/1] mm: unify pmd_free() implementation
Date: Mon, 28 Jul 2008 17:14:33 UTC
Message-ID: <fa.hvykhzw0tkQZG7b7aVL/a/84raY@ifi.uio.no>

On Mon, 28 Jul 2008, James Bottomley wrote:
>
> Sorry ... should have been clearer.  My main concern is the cost of
> barrier() which is just a memory clobber ... we have to use barriers to
> place the probe points correctly in the code.

Oh, "barrier()" itself has _much_ less cost.

It still has all the "needs to flush any global/address-taken-of variables
to memory" property and can thus cause reloads, but that's kind of the
point of it, after all. So in that sense "barrier()" is free: the only
cost of a barrier is the cost of what you actually need to get done. It's
not really "free", but it's also not any more costly than what your
objective was.

In contrast, the "objective" in an empty function call is seldom the
serialization, so in that case the serialization is all just unnecessary
overhead.

Also, barrier() avoids the big hit of turning a leaf function into a
non-leaf one. It also avoids all the fixed registers and the register
clobbers (although for tracing purposes you may end up setting up fixed
regs, of course).

The leaf -> non-leaf thing is actually often the major thing. Yes, the
compiler will often inline functions that are simple enough to be leaf
functions with no stack frame, so we don't have _that_ many of them, but
when it hits, it's often the most noticeable part of an unnecessary
function call. And "barrier()" should never trigger that problem.

			Linus
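
For comparison, a compiler barrier of the kind discussed above can be
sketched like this. The macro is named demo_barrier to make clear it is
an illustration; it is assumed to follow the usual GCC-style form of an
empty asm with a "memory" clobber. demo_flag and demo_read_twice are
hypothetical names.

```c
/* GCC-style compiler barrier: emits no instructions, but tells the
 * compiler that memory may have changed across this point. */
#define demo_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical global that something else might modify. */
unsigned long demo_flag;

/* Still a leaf function: there is no call, no argument setup and no
 * register clobber. The only effect of the barrier is that demo_flag
 * is loaded from memory again for the second read. */
unsigned long demo_read_twice(void)
{
	unsigned long a = demo_flag;
	demo_barrier();
	unsigned long b = demo_flag;
	return a + b;
}
```

Replacing demo_barrier() with a call to an out-of-line empty function
would force the same reload, and would add the call overhead on top.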
