Shells create process groups when running commands. This is true whether the commands are synchronous or asynchronous.
Unix processes are not automatically supervisory processes (unlike Erlang). This extends to Unix shells as well.
However, things like pipe groups and terminal TTY shortcuts (CTRL+C) obscure this fact.
Let's clear this up.
Imagine we have a Python program ./script.py which will run forever, but also
listen to signals, print them out and exit.

#!/usr/bin/env python3
import time
import signal

def sighandler(signum, frame):
    print('SIGNAL HANDLER called with SIGNAL:', signum)
    exit()

signal.signal(signal.SIGINT, sighandler)
signal.signal(signal.SIGTERM, sighandler)
signal.signal(signal.SIGQUIT, sighandler)
print('READY')
while True:
    print('LIVE')
    time.sleep(1)

Then we have a shell script ./orchestrate.sh that "orchestrates" the Python
program:
#!/usr/bin/env sh
echo $$
# it does not matter if this was asynchronous using `&` and wait
python3 ./script.py

If you run this script, you'll get something like:
> ./orchestrate.sh
16933
READY
LIVE
LIVE
^CSIGNAL HANDLER called with SIGNAL: 2
What has happened is that you have used the terminal shortcut CTRL+C, which is
first handled by your terminal emulator and passed on to the kernel's TTY
driver. The TTY driver will send SIGINT to the foreground process group 16933.
This will interrupt both the shell and the Python program, which results in the
entire process hierarchy stopping.
However, if you instead use kill -SIGINT 16933 from a different terminal, this
will send SIGINT to just the shell process, and the shell will not propagate the
SIGINT to its Python child process. The ./orchestrate.sh script exits, but
./script.py continues running.
To ensure that you are killing the entire process hierarchy you need to instead
use kill -SIGINT -16933. The - prefix sends the signal to the process group
instead.
Obviously our ./orchestrate.sh is not very robust. To make it robust, we must
explicitly turn it into a supervisor.
To do this, we can use traps:
#!/usr/bin/env sh
echo $$
trap 'exit' INT QUIT TERM
trap 'kill -TERM 0' EXIT
python3 ./script.py

The above will trap SIGINT, SIGQUIT, SIGTERM and raise an EXIT condition.
The EXIT condition will then be handled by kill -TERM 0, which sends SIGTERM to
the current process group. Note that SIGTERM is the default kill signal, but
it is good to be explicit. Make sure to use -TERM for POSIX compatibility.
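
The same supervisor idea can be sketched in Python. This is not a translation of the shell script, just one possible shape under stated assumptions: supervise is a hypothetical helper, the child is placed in its own session so only its group is signalled, and every trapped signal is forwarded as SIGTERM.

```python
import os
import signal
import subprocess
import sys

TRAPPED = (signal.SIGINT, signal.SIGQUIT, signal.SIGTERM)


def supervise(argv):
    """Run argv in its own process group; on a trapped signal, SIGTERM that group."""
    state = {"child": None, "stop": False}

    def terminate_group():
        child = state["child"]
        if child is None:
            return
        try:
            # The child leads its own group, so its PID is also the PGID.
            os.killpg(child.pid, signal.SIGTERM)
        except (ProcessLookupError, PermissionError):
            pass  # the group is already gone

    def on_signal(signum, frame):
        state["stop"] = True
        terminate_group()

    # Install handlers first so a signal during startup is not lost.
    previous = {sig: signal.signal(sig, on_signal) for sig in TRAPPED}
    try:
        state["child"] = subprocess.Popen(argv, start_new_session=True)
        if state["stop"]:
            terminate_group()
        return state["child"].wait()
    finally:
        # Rough analogue of `trap ... EXIT`: clean up however we leave.
        terminate_group()
        for sig, handler in previous.items():
            if handler is not None:
                signal.signal(sig, handler)


if __name__ == "__main__":
    sys.exit(abs(supervise([sys.executable, "-c", "import time; time.sleep(5)"])))
```

Signalling the child's own group rather than group 0 keeps the supervisor from signalling itself or whatever launched it.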
The above is just an example; your orchestrator traps should be customised to your situation.
Remember that it is possible to receive signals multiple times, so the Python
signal handler can be called multiple times. You must make sure that the handler
is idempotent.
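
One way to get an idempotent handler is to record only the first signal and return early on repeats. The sketch below assumes that approach. ShutdownFlag and install are hypothetical names, and the cleanup_runs counter stands in for real cleanup work.

```python
import signal


class ShutdownFlag:
    """Signal handler that acts on the first signal and ignores repeats."""

    def __init__(self):
        self.first_signal = None
        self.cleanup_runs = 0

    def __call__(self, signum, frame):
        if self.first_signal is not None:
            return  # already shutting down; a repeat must not redo cleanup
        self.first_signal = signum
        self.cleanup_runs += 1  # stand-in for real cleanup work


def install(handler):
    # Register the same handler for the signals script.py listens to.
    for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGQUIT):
        signal.signal(sig, handler)
```

A main loop can then run while handler.first_signal is None and exit normally afterwards, instead of calling exit() inside the handler.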
Finally:
# sends SIGINT to just the PID
kill -INT <pid>
# sends SIGINT to the process group using the pid
kill -INT -<pid>

The only good way of creating shell script orchestrators is like this:

cleanup () {
  kill -TERM $(jobs -p) 2>/dev/null || true
}
trap 'exit' INT QUIT TERM
trap cleanup EXIT
comm &
{
  comm &
  comm &
  wait
} &
wait

If you use set -m inside a script, each individual command group will get its own process group ID.
If you don't then they inherit the shell's process group ID.
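
The inheritance rule is easy to check from Python. This is a rough sketch, not what the shell does internally: child_pgid_report is a hypothetical helper, and preexec_fn=os.setpgrp is my assumption for approximating what set -m arranges for each job.

```python
import os
import subprocess
import sys

# The child prints its own PID and process group ID.
REPORT = "import os; print(os.getpid(), os.getpgrp())"


def child_pgid_report(own_group):
    # preexec_fn runs in the child between fork and exec; calling
    # os.setpgrp there makes the child the leader of a new process group.
    proc = subprocess.run(
        [sys.executable, "-c", REPORT],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
        preexec_fn=os.setpgrp if own_group else None,
    )
    pid, pgid = map(int, proc.stdout.split())
    return pid, pgid


if __name__ == "__main__":
    print("parent pgid:", os.getpgrp())
    print("inherited  :", child_pgid_report(own_group=False))
    print("own group  :", child_pgid_report(own_group=True))
```

In the inherited case the child reports the parent's PGID. In the own-group case its PGID equals its own PID.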
A session ID (SID) is meant to be associated with a "tty", or with a "session" that is separate from any tty. So daemons get their own SID.
Each terminal (pty or vty or tty) also gets its own SID. An SID groups processes at the "session" level, whether or not that session is attached to a terminal.
Process groups are intended for signalling a group of processes together. Thus Unix has no first-class process trees that you can signal; there are process groups instead. Process trees do exist, but only through PID to PPID relationships.
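
To make the PID, PPID, PGID and SID relationships concrete, here is a small Python sketch. child_session_report is a hypothetical helper; it uses subprocess's start_new_session flag, which calls setsid in the child.

```python
import os
import subprocess
import sys

# The child prints its PID, parent PID, process group ID and session ID.
REPORT = "import os; print(os.getpid(), os.getppid(), os.getpgrp(), os.getsid(0))"


def child_session_report(new_session):
    proc = subprocess.run(
        [sys.executable, "-c", REPORT],
        stdout=subprocess.PIPE,
        text=True,
        check=True,
        start_new_session=new_session,  # True makes the child call setsid()
    )
    pid, ppid, pgid, sid = map(int, proc.stdout.split())
    return {"pid": pid, "ppid": ppid, "pgid": pgid, "sid": sid}


if __name__ == "__main__":
    print("same session:", child_session_report(False))
    print("new session :", child_session_report(True))
```

In both cases the PPID still points at the launching process, so the "tree" survives only as that link. With a new session, the child's SID and PGID both equal its own PID.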
Oh, it turns out that kill 0 is a bad idea when you're a script, because you inherit the same process group as your parent. (This means you could in fact kill the parent process as well, so you're not really simulating process tree semantics.) Instead, we have to use the cleanup function above, which only signals the PIDs listed by jobs -p.
Unfortunately, relying on jobs -p to print a plain list of PIDs does not work in every shell. It works in Bash, but there's nothing in ZSH that I can find that allows you to get a simple list of the PIDs of the current job table.