November 8, 2012
Remote SSH Commands and Broken Connections
One problem with executing commands via ssh (that is, on ssh's command line, not via an interactive login shell) is that the command isn't terminated when the ssh connection dies. You can see this by running:
ssh otherhost /bin/sleep 600
and interrupting ssh with Ctrl+C. On otherhost, sleep will still be running. Its parent, the sshd process forked to handle the connection, will be gone, and sleep will have been reparented to init (PID 1).
You don't get this problem when you use ssh interactively. All the processes that
you start have a controlling terminal, and when the connection dies and the controlling
terminal goes away, the processes you started are killed with SIGHUP (unless they detached
from the controlling terminal, such as with setsid).
One solution is to always allocate a terminal, by specifying ssh's -t option:
ssh -t otherhost /bin/sleep 600
But this isn't always feasible, especially if you're running ssh from a script which doesn't have a terminal.
We need a way to kill the remote command when its parent sshd process dies. Fortunately, on Linux, the prctl syscall provides a solution:
PR_SET_PDEATHSIG (since Linux 2.1.57)
Set the parent process death signal of the calling process to arg2 (either a signal value in the range 1..maxsig, or 0 to clear). This is the signal that the calling process will get when its parent dies. This value is cleared for the child of a fork(2).
We can write a simple C wrapper, called diewithparent, which calls prctl and then execs the command:
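#include <signal.h>
#include <sys/prctl.h>
#include <unistd.h>

int main (int argc, char** argv)
{
    /* Ask the kernel to send us SIGTERM when our parent dies. */
    prctl(PR_SET_PDEATHSIG, SIGTERM);
    execvp(argv[1], argv + 1);
    return 127;  /* reached only if exec failed */
}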
And use it like this:
ssh otherhost diewithparent /bin/sleep 600
Download the complete C source, which features error checking and options parsing (so you can specify the signal number). (Compile with cc -o diewithparent diewithparent.c.)
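For readers who want a feel for what error checking and a configurable signal might look like, here is a minimal sketch. It is not the downloadable source mentioned above. The helper names parse_signal and die_with_parent_exec are hypothetical, and using 126 as the status for a failed prctl call is an arbitrary choice made for this illustration.

```c
#include <errno.h>
#include <limits.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <unistd.h>

/* Hypothetical helper: parse a decimal signal number.
 * Returns -1 if the text is not a positive integer. Whether the number
 * names a real signal is left for prctl to decide. */
int parse_signal(const char *text)
{
    char *end;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0' || value < 1 || value > INT_MAX)
        return -1;
    return (int)value;
}

/* Hypothetical helper: set the parent death signal, then replace this
 * process with the command in argv. Returns only on failure:
 * 126 if prctl rejected the signal, 127 if the exec failed. */
int die_with_parent_exec(int sig, char *const argv[])
{
    if (prctl(PR_SET_PDEATHSIG, (unsigned long)sig) == -1) {
        perror("prctl");
        return 126;
    }
    execvp(argv[0], argv);
    perror(argv[0]);
    return 127;
}
```

A main function built on these would parse the signal number from wherever it is passed on the command line, then return die_with_parent_exec(sig, ...) with the remaining arguments.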
Naturally, this is a Linux-only solution. On other systems, the best solution (as far as I can tell) would be for the wrapper to
fork and exec the command. The wrapper, now the command's parent, then periodically polls its own parent PID (with getppid()).
When it changes to 1, the wrapper's parent has died, so the wrapper kills the command.
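The fork-and-poll approach could be sketched roughly as follows. This is an untested-in-production illustration, not a finished tool. The function name run_watched and the one-second polling interval are assumptions. It also departs slightly from the description above by checking whether getppid() differs from the PID seen at startup rather than whether it equals 1, since some systems can reparent orphans to a process other than PID 1.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv as a child and send it `sig` if our own
 * parent disappears. Returns the command's exit status, 128 + N if it was
 * killed by signal N, or 127 if it could not be started. */
int run_watched(char *const argv[], int sig)
{
    pid_t original_parent = getppid();  /* whoever started us, e.g. sshd */
    pid_t child = fork();
    if (child < 0)
        return 127;
    if (child == 0) {
        execvp(argv[0], argv);
        _exit(127);                     /* exec failed */
    }

    for (;;) {
        int status;
        pid_t done = waitpid(child, &status, WNOHANG);
        if (done < 0)
            return 127;
        if (done == child)              /* command finished on its own */
            return WIFEXITED(status) ? WEXITSTATUS(status)
                                     : 128 + WTERMSIG(status);
        if (getppid() != original_parent) {
            /* We were reparented, so the original parent is gone. */
            kill(child, sig);
            waitpid(child, &status, 0);
            return 128 + sig;
        }
        sleep(1);                       /* polling interval, chosen arbitrarily */
    }
}
```

A main function would be little more than return run_watched(argv + 1, SIGTERM). The cost compared with the prctl version is an extra process per command and a delay of up to one polling interval before the command is killed.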