Thread-synchronous signals (Linus Torvalds)


Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: x86: SIGTRAP handling differences from 2.4 to 2.6
Original-Message-ID: <Pine.LNX.4.44.0311221044480.2379-100000@home.osdl.org>
Date: Sat, 22 Nov 2003 19:03:22 GMT
Message-ID: <fa.jjco1hr.1g5q099@ifi.uio.no>

On Sat, 22 Nov 2003, Daniel Barlow wrote:
>
> There is a difference between 2.4 (tested in 2.4.23-rc2) and 2.6
> (tested in 2.6.0-pre9) in the handling of "int 3" inside a SIGTRAP
> handler.

Indeed.

The change is basically:

 - some signals are "thread synchronous", ie the thread _cannot_ continue
   without taking them. Basically, any instruction fault does this,
   since just returning would generally cause the instruction to just be
   done again, and cause the same fault.

 - the difference between 2.4.x and 2.6.x is that in 2.4.x such a
   thread-synchronous signal will just blow through being blocked. So
   even if you block them, they'll still happen. In 2.6.x, trying to block
   a thread-synchronous signal will just cause the process to be killed
   with that signal ("it can't be delivered, it can't be ignored, let's
   just tell the user").

The reason for the change is that the 2.4.x behaviour ends up hiding bugs,
and can cause surprising deadlocks in threaded programs. The 2.6.x
behaviour is "You did something fundamentally wrong, just _die_ now".
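A minimal sketch of the 2.6.x behaviour described above, assuming a Linux
2.6 or later kernel. The helper name blocked_fault_status() is hypothetical,
and the sketch uses a SIGSEGV fault rather than "int 3" so that it does not
depend on x86. A child installs a handler, blocks the signal, and then
faults; the parent looks at how the child ended.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Loaded at run time, so the compiler cannot prove the pointer is NULL. */
static int *volatile bad_pointer = NULL;

static void on_segv(int signo)
{
    (void)signo;
    _exit(42);  /* only reached if the signal is delivered to the handler */
}

/* Hypothetical helper: returns the child's wait status, or -1 on error. */
int blocked_fault_status(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        struct rlimit no_core = { 0, 0 };
        struct sigaction sa;
        sigset_t set;

        setrlimit(RLIMIT_CORE, &no_core);   /* do not leave core files behind */

        memset(&sa, 0, sizeof sa);
        sa.sa_handler = on_segv;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        sigemptyset(&set);
        sigaddset(&set, SIGSEGV);
        sigprocmask(SIG_BLOCK, &set, NULL); /* block the synchronous signal */

        *bad_pointer = 1;   /* faults while SIGSEGV is blocked */
        _exit(0);           /* not expected to be reached */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

On a 2.6.x-style kernel the status should satisfy WIFSIGNALED() with
WTERMSIG() equal to SIGSEGV: the handler is not run and the child does not
exit with code 42, because the blocked fault kills the process instead.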

> I'm not sure what the correct answer is, if indeed it's specified.
> For contrast, in FreeBSD 5.1 I'm told that the signal handler runs to
> completion and only on exit is it called again.

This works because "int 3" and "into" are what Intel calls a "trap" as
opposed to a "fault", and as such we _could_ delay handling the signal and
just continue along - when the exception happens, the CPU has already
executed the instruction, and the exception will return to _after_ the
instruction.

However, Linux will refuse to do that because delaying the SIGTRAP
defeats its purpose:
 - you'd get it at the wrong spot, so it no longer marks where the trap
   happened
 - the wrong thread could get it if you just consider it a normal signal.

So Linux considers both "int 3" and "into" to be thread-synchronous, even
though they are trivially recoverable. Which means that we have two
options, and two options only: punch through the fact that the signal is
blocked, or just say "that's wrong", and kill it.

NOTE NOTE NOTE!! If you actually _want_ the 2.4.x behaviour of recursive
signal invocation, you should just tell the kernel so: use the SA_NODEFER
flag to sigaction() to tell the kernel that you don't want to defer
recursive signals.
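A small sketch of what SA_NODEFER changes, under stated assumptions: the
helper max_trap_nesting() and the counters are hypothetical names, and the
handler re-raises SIGTRAP with raise() instead of executing "int 3" so the
code stays portable. raise() is an ordinary signal send, not a
thread-synchronous trap, so without SA_NODEFER the re-raised signal is merely
deferred here rather than fatal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* sig_atomic_t is the type that is safe to modify from a signal handler. */
static volatile sig_atomic_t calls, depth, max_depth;

static void on_trap(int signo)
{
    calls++;
    depth++;
    if (depth > max_depth)
        max_depth = depth;
    if (calls < 3)
        raise(signo);   /* nested at once with SA_NODEFER, deferred without */
    depth--;
}

/* Hypothetical helper: install the handler with the given sa_flags, raise
 * SIGTRAP once, and return the deepest handler nesting observed (-1 on error). */
int max_trap_nesting(int flags)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_trap;
    sa.sa_flags = flags;
    sigemptyset(&sa.sa_mask);

    calls = depth = max_depth = 0;
    if (sigaction(SIGTRAP, &sa, &old) != 0)
        return -1;

    raise(SIGTRAP);
    sigaction(SIGTRAP, &old, NULL);     /* restore the previous disposition */
    return max_depth;
}
```

Calling max_trap_nesting(0) should report a nesting depth of 1, since each
re-raised signal waits until the handler returns, while
max_trap_nesting(SA_NODEFER) should report 3, since each raise() re-enters
the handler immediately.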

In short, the 2.6.x behaviour is the right one. 2.4.x was a strange
violation of the signal blocking, and I consider the FreeBSD behaviour to
be just bizarre.

And with 2.6.x, if you actually _want_ recursive signal handlers, you can
do so (fairly portably, I might add - putting the SA_NODEFER flag there
should make everybody do the same thing, even *BSD).

			Linus

Newsgroups: fa.linux.kernel
From: Linus Torvalds <torvalds@osdl.org>
Subject: Re: x86: SIGTRAP handling differences from 2.4 to 2.6
Original-Message-ID: <Pine.LNX.4.44.0311230954460.17378-100000@home.osdl.org>
Date: Sun, 23 Nov 2003 18:02:56 GMT
Message-ID: <fa.kmipj8a.1d38kgq@ifi.uio.no>

On 22 Nov 2003, H. Peter Anvin wrote:
> >
> > Hmm.. Looking at the signal sending code, we actually do special-case
> > "init" there already - but only for the "kill -1" case. If the test for
> > "pid > 1" was moved into "group_send_sig_info()" instead, that would
> > pretty much do it, I think.
> >
>
> Okay... I'm going to ask the obvious dumb question:
>
> Why do we bother special-casing init at all?

Because the kernel depends on it existing. "init" literally _is_ special
from a kernel standpoint, because it's the "reaper of zombies" (and, may I
add, that would be a great name for a rock band).

So without init, the kernel wouldn't have anybody to fall back on when a
parent process dies, and would become very very unhappy. Historically it
actually oopsed the kernel.

UNIX semantics literally _require_ that "getppid()" should return 1 if
your parent dies, and that's "current->p_parent->tgid". So we have to have
a parent with pid 1, and thus init really _is_ special.
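The reparenting rule can be observed from user space. The following is a
sketch only; orphan_new_ppid() is a hypothetical helper name. It forks a
child that forks a grandchild and exits at once, and the orphaned grandchild
reports what getppid() returns afterwards. On a classic setup the answer is
1. As an assumption beyond the text above: on newer kernels a "child
subreaper" (set with prctl(PR_SET_CHILD_SUBREAPER)) or a PID namespace can
make a different process the adopter, so the sketch only checks that the
parent changed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: returns the orphaned grandchild's new parent pid,
 * or -1 on error. *dead_parent receives the pid of the process that exited. */
pid_t orphan_new_ppid(pid_t *dead_parent)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (child == 0) {
        pid_t me = getpid();
        pid_t grandchild = fork();
        if (grandchild == 0) {
            struct timespec tick = { 0, 1000000 };  /* 1 ms */
            close(fds[0]);
            /* Wait (up to about 5 s) for the kernel to reparent us. */
            for (int i = 0; i < 5000 && getppid() == me; i++)
                nanosleep(&tick, NULL);
            pid_t now = getppid();
            ssize_t n = write(fds[1], &now, sizeof now);
            _exit(n == (ssize_t)sizeof now ? 0 : 1);
        }
        _exit(0);   /* the middle process dies, orphaning the grandchild */
    }

    close(fds[1]);
    *dead_parent = child;
    waitpid(child, NULL, 0);    /* reap the middle process */

    pid_t new_parent = -1;
    if (read(fds[0], &new_parent, sizeof new_parent)
            != (ssize_t)sizeof new_parent)
        new_parent = -1;
    close(fds[0]);
    return new_parent;
}
```

The grandchild is never waited for by the caller; once it exits, whichever
process adopted it is responsible for reaping it, which is exactly the job
the text assigns to init.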

Yeah, we could have _other_ special cases (we could create another process
that is invisible and has pid 1), but the fact is, _some_ special case is
required. It might as well be "you can't kill init".

		Linus
