systemd and StandardInput. taking control of tty - bash

I have a systemd unit script which looks something like this
cat /usr/lib/systemd/system/hello.service
[Unit]
Description=Simple Hello World service
After=syslog.target network.target
[Service]
Type=forking
EnvironmentFile=/root/hello.env
ExecStart=/bin/gdb /root/hello
StandardInput=tty-force
StandardOutput=inherit
TTYPath=/dev/pts/0
TTYReset=yes
TimeoutStartSec=infinity
[Install]
WantedBy=multi-user.target
The whole point is that I want to start the service under gdb at startup. (Because the process depends on a lot of environment variables, I cannot simply run gdb on it directly.)
systemctl start hello (which actually works).
But once I exit gdb, the tty is completely messed up. None of the control keys work (^Z, ^C).
These are my observations so far.
As described in the systemd man pages, "StandardInput=tty-force" forces the executed process to take control of the tty.
Before I launch the process
# tty
/dev/pts/0
# ps -aef | grep bash
root 2805 2803 0 10:42 pts/0 00:00:00 -bash
root 2860 2805 0 10:45 pts/0 00:00:00 grep --color=auto bash
After I launch
# tty
/dev/pts/0
# ps -aef | grep bash
root 2805 2803 0 10:42 ? 00:00:00 -bash
root 2884 2805 0 10:47 ? 00:00:00 grep --color=auto bash
I tried resetting the terminal; it still doesn't work.
Subsequent systemctl commands display the error below:
systemctl stop hello
Error creating textual authentication agent: Error opening current controlling terminal for the process (`/dev/tty'): No such device or address (polkit-error-quark, 0)
So the question is: is there a way to hand the tty back to bash?
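For debugging outside systemd, one possible workaround is to load the same variables into a shell before starting gdb by hand. The sketch below assumes /root/hello.env contains only plain KEY=value lines that a POSIX shell can also parse (systemd's EnvironmentFile syntax is not identical to shell syntax). run_with_env is a hypothetical helper name, not part of systemd or gdb.

```shell
#!/bin/sh
# Hypothetical helper: export every KEY=value line of an env file,
# then run the given command with those variables set.
# Assumes the file is valid POSIX shell (plain KEY=value lines)
# and that its path contains a slash, so '.' does not search PATH.
run_with_env() {
    env_file=$1
    shift
    (
        set -a            # auto-export every variable assigned from here on
        . "$env_file"     # read the assignments into this subshell
        set +a
        exec "$@"         # replace the subshell with the requested command
    )
}

# Intended use from an interactive login shell (paths taken from the unit file):
#   run_with_env /root/hello.env gdb /root/hello
```

Because the variables are set inside a subshell, they do not leak into the calling shell, and gdb stays attached to the terminal the shell already owns.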

Related

Using Caffeinate command in MacOS

I'm running a Torch implementation of DCGAN. As the training takes a lot of time, I thought of using caffeinate on macOS to keep the system awake until the training and generation finish. So I ran the ps command and got the following output.
PID TTY TIME CMD
5607 ttys000 0:00.18 -bash
6206 ttys000 16:06.47 python dcgan_nocomment.py
6209 ttys000 0:01.49 python dcgan_nocomment.py
6210 ttys000 0:01.53 python dcgan_nocomment.py
6318 ttys001 0:00.03 -bash
To use caffeinate, which process's PID should I put in the following command?
caffeinate -disu -w [pid]
You can also use caffeinate in utility mode, where you don't need to provide a pid, but a utility that should be executed. E.g. wrap your work in a shell script mywork.sh and:
caffeinate -disu mywork.sh

Can't send signals to process created by PTY.spawn() in Ruby

I'm running ruby in an Alpine docker container (it's a sidekiq worker, if that matters). At a certain point, my application receives instructions to shell out to a subcommand. I need to be able to stream STDOUT rather than have it buffer. This is why I'm using PTY instead of system() or another similar answer. I'm executing the following line of code:
stdout, stdin, pid = PTY.spawn(my_cmd)
When I connect to the docker container and run ps auxf, I see this:
root 7 0.0 0.4 187492 72668 ? Sl 01:38 0:00 ruby_program
root 12378 0.0 0.0 1508 252 pts/4 Ss+ 01:38 0:00 \_ sh -c my_cmd
root 12380 0.0 0.0 15936 6544 pts/4 S+ 01:38 0:00 \_ my_cmd
Note how the child process of ruby is "sh -c my_cmd", which itself then has a child "my_cmd" process.
"my_cmd" can take a significant amount of time to run. It is designed so that sending a signal USR1 to the process causes it to save its state so it can be resumed later and abort cleanly.
The problem is that the pid returned from "PTY.spawn()" is the pid of the "sh -c my_cmd" process, not the "my_cmd" process. So when I execute:
Process.kill('USR1', pid)
it sends USR1 to sh, not to my_cmd, so it doesn't behave as it should.
Is there any way to get the pid that corresponds to the command I actually specified? I'm open to ideas outside the realm of PTY, but it needs to satisfy the following constraints:
1) I need to be able to stream output from both STDOUT and STDERR as they are written, without waiting for them to be flushed (since STDOUT and STDERR get mixed together into a single stream in PTY, I'm redirecting STDERR to a file and using inotify to get updates).
2) I need to be able to send USR1 to the process to be able to pause it.
I gave up on a clean solution. I finally just executed
pgrep -P #{pid}
to get the child pid, and then I could send USR1 to that process. Feels hacky, but when ruby gives you lemons...
You should pass the command and its arguments as separate strings rather than as one string, which avoids the intermediate sh -c process. So instead of
stdout, stdin, pid = PTY.spawn("my_cmd arg1 arg2")
You should use
stdout, stdin, pid = PTY.spawn("my_cmd", "arg1", "arg2")
etc.
Also see:
Process.spawn child does not receive TERM
https://zendesk.engineering/running-a-child-process-in-ruby-properly-febd0a2b6ec8
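The pgrep -P workaround described above can be tried out in a plain shell. This is only a sketch: sleep 30 stands in for my_cmd, and the trailing ':' is there to force sh to keep running as a separate wrapper process, mirroring the "sh -c my_cmd" / "my_cmd" pair in the ps output. It assumes pgrep (from procps or an equivalent package) is installed.

```shell
#!/bin/sh
# 'sleep 30' is a stand-in for my_cmd; the trailing ':' forces sh to stay
# around as a wrapper instead of exec'ing the command directly.
sh -c 'sleep 30; :' &
wrapper_pid=$!

# Wait until the wrapper has forked its child, then look it up by parent PID.
child_pid=
for attempt in 1 2 3 4 5; do
    child_pid=$(pgrep -P "$wrapper_pid") && break
    sleep 1
done

echo "wrapper: $wrapper_pid, child: $child_pid"

# Signal the real command, not the wrapper. sleep does not handle USR1,
# so the default action terminates it and the wrapper then exits.
kill -USR1 "$child_pid"
wait "$wrapper_pid"
```

The retry loop is there because the lookup can race with the wrapper starting its child. The same race exists when calling pgrep right after PTY.spawn.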

nohup job control rhel6/centos

I have a question about UNIX job control in RHEL6
Basically, I am trying to implement passenger debug log rotation using logrotate. I am following the instructions here:
https://github.com/phusion/passenger/wiki/Phusion-Passenger-and-logrotation
I've got everything setup correctly (I think). My problem is this; when I spawn the background job using
nohup pipetool $HOME/passenger.log < $HOME/passenger.pipe &
And then log out and back in, if I inspect the process table (for example with 'ps aux') and look up the PID of the process, its command is shown as 'bash'. I have tried changing the first line of the script to "#!/usr/bin/ruby". Here is an example:
[root@server]# nohup pipetool /var/log/nginx/passenger-debug.log < /var/pipe/passenger.pipe &
[1] 63767
[root@server]# exit
exit
[me@server]$ sudo su
[sudo] password for me:
[root@server]# ps aux | grep 63767
root 63767 0.0 0.0 108144 2392 pts/0 S 15:26 0:00 bash
root 63887 0.0 0.0 103236 856 pts/0 S+ 15:26 0:00 grep 63767
[root@server]#
When this occurs the line in the supplied logrotate file ( killall -HUP pipetool ) fails because the 'pipetool' is not matched. Again, I've tried changing the first line to #!/usr/bin/ruby. This had no impact. So, my question is basically; is there any good way to have the actual command appear in the process table instead of just 'bash' when spawned using job control? I am using bash as the shell when I invoke the pipetool. I appreciate you taking the time to help me.
This should work for you: edit pipetool to set the global variable $PROGRAM_NAME:
$PROGRAM_NAME = 'pipetool'
The script should then show up as pipetool in the process list.

Monit fails to start process

I've written a script that works fine to start and stop a server.
#!/bin/bash
PID_FILE='/var/run/rserve.pid'
start() {
touch $PID_FILE
eval "/usr/bin/R CMD Rserve"
PID=$(ps aux | grep Rserve | grep -v grep | awk '{print $2}')
echo "Starting Rserve with PID $PID"
echo $PID > $PID_FILE
}
stop () {
pkill Rserve
rm $PID_FILE
echo "Stopping Rserve"
}
case $1 in
start)
start
;;
stop)
stop
;;
*)
echo "usage: rserve {start|stop}" ;;
esac
exit 0
If I start it by running
rserve start
and then start monit it will correctly capture the PID and the server:
The Monit daemon 5.3.2 uptime: 0m
Remote Host 'localhost'
status Online with all services
monitoring status Monitored
port response time 0.000s to localhost:6311 [DEFAULT via TCP]
data collected Mon, 13 May 2013 20:03:50
System 'system_gauss'
status Running
monitoring status Monitored
load average [0.37] [0.29] [0.25]
cpu 0.0%us 0.2%sy 0.0%wa
memory usage 524044 kB [25.6%]
swap usage 4848 kB [0.1%]
data collected Mon, 13 May 2013 20:03:50
If I stop it, it will properly kill the process and unmonitor it. However if I start it again, it won't start the server again:
ps ax | grep Rserve | grep -vc grep
1
monit stop localhost
ps ax | grep Rserve | grep -vc grep
0
monit start localhost
[UTC May 13 20:07:24] info : 'localhost' start on user request
[UTC May 13 20:07:24] info : monit daemon at 4370 awakened
[UTC May 13 20:07:24] info : Awakened by User defined signal 1
[UTC May 13 20:07:24] info : 'localhost' start: /usr/bin/rserve
[UTC May 13 20:07:24] info : 'localhost' start action done
[UTC May 13 20:07:34] error : 'localhost' failed, cannot open a connection to INET[localhost:6311] via TCP
Here is the monitrc:
check host localhost with address 127.0.0.1
start = "/usr/bin/rserve start"
stop = "/usr/bin/rserve stop"
if failed host localhost port 6311 type tcp with timeout 15 seconds for 5 cycles
then restart
I had problems starting and stopping processes via a shell script too.
One solution might be to add "/bin/bash" to the config like this:
start program = "/bin/bash /usr/bin/rserve start"
stop program = "/bin/bash /usr/bin/rserve stop"
It worked for me.
monit is a silent killer. It does not tell you anything. Here are things I would check which monit won't help you identify
Check permissions of all the files you are reading / writing. If you are redirecting output to a file, make sure that file is writable by uid and gid you are using to execute the program
Again check exec permission on the program you are trying to run
Specify full path to any program you are trying to execute ( not strictly necessary, but you don't have to worry about path not being set if you always specify full path )
Make sure you can run the program outside of monit without any error before trying to investigate why monit is not starting.
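For the last point, keep in mind that a script which works from your login shell may still depend on variables that shell exported. A rough way to approximate a stripped-down environment (an assumption about how monit launches programs, not a reproduction of it) is env -i, which starts a command with an empty environment. MY_SETTING is a hypothetical variable used only for illustration.

```shell
#!/bin/sh
# MY_SETTING is a hypothetical variable that exists only in the login shell.
MY_SETTING=from-login-shell
export MY_SETTING

# Normal child process: inherits the exported variable.
inherited=$(/bin/sh -c 'echo "${MY_SETTING:-unset}"')

# env -i starts the child with an empty environment, so the variable is gone.
clean=$(env -i /bin/sh -c 'echo "${MY_SETTING:-unset}"')

echo "inherited: $inherited"
echo "clean:     $clean"
```

The real start script can be exercised the same way, for example with env -i /bin/bash /usr/bin/rserve start, to see whether it still works without the interactive environment.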
If the Monit log is displaying
failed to start (exit status -1) -- no output
Then it may be that you're trying to run a script without any of the Bash infrastructure. You can run such a command by wrapping it in /bin/bash -c, like so:
check process my-process
matching "my-process-name"
start program = "/bin/bash -c '/etc/init.d/my-init-script'"
When monit starts, it looks for its own pidfile and checks whether a process with the matching PID is already running; if so, it just wakes that process up.
In your case, check whether this PID is being used by some other process:
ps -ef |grep 4370
If it is, remove the file below (usually under the /run directory) and start monit again:
monit.pid
For me, the issue was that the stop command was not being run, even though I had explicitly specified "then restart" in the configuration.
The solution was simply to make the start program perform a restart:
start program = "/etc/init.d/.... restart"

Why do processes spawned by cron end up defunct?

I have some processes showing up as <defunct> in top (and ps). I've boiled things down from the real scripts and programs.
In my crontab:
* * * * * /tmp/launcher.sh /tmp/tester.sh
The contents of launcher.sh (which is of course marked executable):
#!/bin/bash
# the real script does a little argument processing here
"$@"
The contents of tester.sh (which is of course marked executable):
#!/bin/bash
sleep 27 & # the real script launches a compiled C program in the background
ps shows the following:
user 24257 24256 0 18:32 ? 00:00:00 [launcher.sh] <defunct>
user 24259 1 0 18:32 ? 00:00:00 sleep 27
Note that tester.sh does not appear--it has exited after launching the background job.
Why does launcher.sh stick around, marked <defunct>? It only seems to do this when launched by cron--not when I run it myself.
Additional note: launcher.sh is a common script in the system this runs on, which is not easily modified. The other things (crontab, tester.sh, even the program that I run instead of sleep) can be modified much more easily.
Because they haven't been the subject of a wait(2) system call.
Since someone may wait for these processes in the future, the kernel can't completely get rid of them or it won't be able to execute the wait system call because it won't have the exit status or evidence of its existence any more.
When you start one from the shell, your shell is trapping SIGCHLD and doing various wait operations anyway, so nothing stays defunct for long.
But cron isn't in a wait state, it is sleeping, so the defunct child may stick around for a while until cron wakes up.
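The effect is easy to reproduce in a shell. In this sketch a short-lived child ends up with a parent that never calls wait(2): the sh process starts sleep 1 in the background and then replaces itself with sleep 4 via exec, and sleep has no code to reap children. The stat column name is accepted by procps and BSD ps; other ps implementations may differ.

```shell
#!/bin/sh
# The inner sh forks 'sleep 1', then exec replaces it with 'sleep 4'.
# 'sleep 4' is now the parent of 'sleep 1' but never calls wait(2).
sh -c 'sleep 1 & exec sleep 4' &
parent_pid=$!

sleep 2   # by now the child has exited but has not been waited for

# Find processes whose parent is parent_pid and print their state.
child_state=$(ps -A -o ppid= -o stat= | awk -v p="$parent_pid" '$1 == p { print $2 }')
echo "state of exited child: $child_state"   # a state starting with Z means zombie

wait "$parent_pid"   # once the parent exits, init adopts and reaps the child
```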
Update: Responding to comment...
Hmm. I did manage to duplicate the issue:
PPID PID PGID SESS COMMAND
1 3562 3562 3562 cron
3562 1629 3562 3562 \_ cron
1629 1636 1636 1636 \_ sh <defunct>
1 1639 1636 1636 sleep
So, what happened was, I think:
cron forks and cron child starts shell
shell (1636) starts sid and pgid 1636 and starts sleep
shell exits, SIGCHLD sent to cron 3562
signal is ignored or mishandled
shell turns zombie.
Note that sleep is reparented to init, so when the sleep exits, init will get the signal and clean up. I'm still trying to figure out when the zombie gets reaped. Probably, with no active children, cron 1629 figures out it can exit; at that point the zombie will be reparented to init and get reaped. So now we wonder about the missing SIGCHLD that cron should have processed.
It isn't necessarily vixie cron's fault. As you can see here, libdaemon installs a SIGCHLD handler during daemon_fork(), and this could interfere with signal delivery on a quick exit by intermediate 1629.
Now, I don't even know if vixie cron on my Ubuntu system is even built with libdaemon, but at least I have a new theory. :-)
In my opinion, it's caused by the CROND process (spawned by crond for every task) waiting for input on stdin, which is piped to the stdout/stderr of the command in the crontab. This is done so that cron can mail the resulting output to the user.
So CROND waits for EOF until the user command and all its spawned child processes have closed the pipe. Once that happens, CROND continues with the wait statement and the defunct user command disappears.
So I think you have to explicitly disconnect every spawned subprocess in your script from the pipe (e.g. by redirecting its output to a file or /dev/null).
So the following line should work in the crontab:
* * * * * ( /tmp/launcher.sh /tmp/tester.sh >/dev/null 2>&1 & )
I suspect that cron is waiting for all subprocesses in the session to terminate. See wait(2) with respect to negative pid arguments. You can see the SESS with:
ps faxo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
Here's what I see (edited):
STAT EUID RUID TT TPGID SESS PGRP PPID PID %CPU COMMAND
Ss 0 0 ? -1 3197 3197 1 3197 0.0 cron
S 0 0 ? -1 3197 3197 3197 18825 0.0 \_ cron
Zs 1000 1000 ? -1 18832 18832 18825 18832 0.0 \_ sh <defunct>
S 1000 1000 ? -1 18832 18832 1 18836 0.0 sleep
Notice that the sh and the sleep are in the same SESS.
Use the command setsid(1). Here's tester.sh:
#!/bin/bash
setsid sleep 27 # the real script launches a compiled C program in the background
Notice you don't need &, setsid puts it in the background.
I’d recommend that you solve the problem by simply not having two separate processes: Have launcher.sh do this on its last line:
exec "$@"
This will eliminate the superfluous process.
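A quick way to see the effect of exec is to have the launcher print its own PID and let the launched command print its PID too. The launcher below is a hypothetical stand-in written to a temporary directory, not the real launcher.sh.

```shell
#!/bin/sh
# Build a throwaway launcher that ends with exec "$@".
tmp_dir=$(mktemp -d)
cat > "$tmp_dir/launcher.sh" <<'EOF'
#!/bin/sh
echo "$$"      # PID of the launcher
exec "$@"      # replace the launcher with the requested command
EOF

# The launched command prints its own PID as well.
pids=$(sh "$tmp_dir/launcher.sh" sh -c 'echo "$$"')
rm -rf "$tmp_dir"

launcher_pid=$(echo "$pids" | sed -n 1p)
command_pid=$(echo "$pids" | sed -n 2p)
echo "launcher: $launcher_pid, command: $command_pid"
```

Both numbers are the same because exec reuses the launcher's process instead of forking a child, so there is no separate launcher process left behind to become defunct.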
I found this question while I was looking for a solution with a similar issue. Unfortunately answers in this question didn't solve my problem.
Killing a defunct process directly is not possible; you need to find and kill its parent process. I ended up clearing the defunct processes in the following way:
ps -ef | grep '<defunct>' | grep -v grep | awk '{print "kill -9 ",$3}' | sh
In "grep '<defunct>'" you can narrow down the search to the specific defunct process you are after.
I have run into the same problem many times.
I finally found a solution.
Just specify '/bin/bash' before the bash script, as shown below.
* * * * * /bin/bash /tmp/launcher.sh /tmp/tester.sh

Resources