Getting a background process ID is easy to do from the prompt by going:
$ my_daemon &
$ echo $!
But what if I want to run it as a different user like:
su - joe -c "/path/to/my_daemon &"
Now how can I capture the PID of my_daemon?
Succinctly - with a good deal of difficulty.
You have to arrange for the su'd shell to write the child PID to a file and then pick up the output. Given that 'joe' will be creating the file and not 'dex', that adds another layer of complexity.
The simplest solution is probably:
su - joe -c "/path/to/my_daemon & echo \$! > /tmp/su.joe.$$"
bg=$(</tmp/su.joe.$$)
rm -f /tmp/su.joe.$$ # Probably fails - joe owns it, dex does not
The next solution involves using a spare file descriptor - number 3.
su - joe -c "/path/to/my_daemon 3>&- & echo \$! 1>&3" 3>/tmp/su.joe.$$
bg=$(</tmp/su.joe.$$)
rm -f /tmp/su.joe.$$
If you're worried about interrupts etc (and you probably should be), then you trap things too:
tmp=/tmp/su.joe.$$
trap "rm -f $tmp; exit 1" 0 1 2 3 13 15
su - joe -c "/path/to/my_daemon 3>&- & echo \$! 1>&3" 3>$tmp
bg=$(<$tmp)
rm -f $tmp
trap 0 1 2 3 13 15
(The caught signals are HUP, INT, QUIT, PIPE and TERM - plus 0 for shell exit.)
Warning: nice theory - untested code...
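Since the code above is flagged as untested, here is a minimal sketch of the same file-descriptor-3 idea that can be tried without root. It assumes sh -c as a stand-in for su - joe -c, sleep 30 as a stand-in for my_daemon, and mktemp for the temporary file; none of these appear in the answer above.

```shell
#!/bin/sh
# Sketch only: sh -c stands in for `su - joe -c`, sleep 30 for my_daemon.
tmp=$(mktemp) || exit 1   # created and owned by the invoking user

# The invoking shell opens fd 3 on $tmp; the inner shell only writes to it.
# The stand-in daemon closes fd 3 so it does not keep the file open.
sh -c 'sleep 30 >/dev/null 2>&1 3>&- & echo $! 1>&3' 3>"$tmp"

bg=$(cat "$tmp")
rm -f "$tmp"              # succeeds: the invoking user owns the file
echo "background PID: $bg"
kill "$bg"                # clean up the stand-in daemon
```

Because the outer shell creates the file before the inner command runs, ownership stays with the invoking user, which is what makes the final rm work.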
The approaches presented here didn't work for me. Here's what I did:
PID_FILE=/tmp/service_pid_file
su -m $SERVICE_USER -s /bin/bash -c "/path/to/executable $ARGS >/dev/null 2>&1 & echo \$! >$PID_FILE"
PID=`cat $PID_FILE`
As long as the output from the background process is redirected, you can send the PID to stdout:
su "${user}" -c "${executable} > '${log_file}' 2>&1 & echo \$!"
The PID can then be redirected to a file owned by the first user, rather than the second user.
su "${user}" -c "${executable} > '${log_file}' 2>&1 & echo \$!" > "${pid_file}"
The log files do need to be owned by the second user to do it this way, though.
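Because the PID arrives on stdout, it can also be captured in a variable with command substitution instead of a file. A minimal sketch, assuming sh -c in place of su "${user}" -c and sleep 30 in place of ${executable} so that it runs unprivileged:

```shell
#!/bin/sh
# Sketch only: sh -c stands in for `su "${user}" -c`.
# The background process must not inherit stdout, or $(...) would
# block until it exits; hence the redirection to /dev/null.
pid=$(sh -c 'sleep 30 >/dev/null 2>&1 & echo $!')
echo "started PID $pid"
kill "$pid"   # clean up the stand-in process
```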
Here's my solution
su oracle -c "/home/oracle/database/runInstaller" &
pid=$(pgrep -P $!)
Explanation
pgrep -P $! - Gets the child process of the parent pid $!
I took the above solution by Linux, but had to add a sleep to give the child process a chance to start.
su - joe -c "/path/to/my_daemon > /some/output/file" &
parent=$!
sleep 1
pid=$(pgrep -P $parent)
Running in bash, it doesn't like pid=$(pgrep -P $!) but if I add a space after the ! it's ok: pid=$(pgrep -P $! ). I stuck with the extra $parent variable to remind myself what I'm doing next time I look at the script.
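A fixed sleep 1 can be too short or needlessly long. One possible variation is to poll pgrep until the child appears. This is a minimal sketch, assuming an sh -c wrapper as a stand-in for the su - joe -c command, a limit of 10 attempts chosen arbitrarily, and a parent with a single child (pgrep -P prints every child it finds):

```shell
#!/bin/sh
# Sketch only: this sh -c wrapper stands in for `su - joe -c "..." &`.
sh -c 'sleep 30 >/dev/null 2>&1; exit 0' >/dev/null 2>&1 &
parent=$!

pid=
tries=0
# Poll for the child instead of relying on one fixed sleep.
while [ -z "$pid" ] && [ "$tries" -lt 10 ]; do
    pid=$(pgrep -P "$parent") || sleep 1
    tries=$((tries + 1))
done
echo "parent=$parent child=${pid:-not found}"

# Clean up the stand-in child.
if [ -n "$pid" ]; then
    kill "$pid"
fi
```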
Related
I often end up in this situation:
$ sudo something &
[1] 21838
$ # Oh, shoot, it's hung, and assume the pid has scrolled off the screen
$ kill %1
-bash: kill: (21838) - Operation not permitted
$ # Ah, rats. I forgot I sudo'ed that.
$ # Wishful thinking:
$ sudo kill %1
kill: cannot find process "%1"
$ # Now I have to use ps and find the pid I want.
$ ps -elf | grep something
$ ps -elf | grep sleep
4 S root 21838 1928 0 80 0 - 53969 poll_s 11:28 pts/2 00:00:00 sudo sleep 100
4 S root 21840 21838 0 80 0 - 26974 hrtime 11:28 pts/2 00:00:00 sleep 100
$ sudo kill -9 21838
[1]+ Killed sudo something
I would really like to know if there is a better workflow for this. I'm really surprised there isn't a bash expression to turn %1 into a pid number.
Is there a bash trick for converting %1 to its underlying pid? (Yes, I know I could have saved it at launch with $!)
To get the PID of a job, use: jobs -p N, where N is the job number:
$ yes >/dev/null &
[1] 2189
$ jobs -p 1
2189
$ sudo kill $(jobs -p 1)
[1]+ Terminated yes > /dev/null
Alternatively, and more strictly answering your question, you might find jobs -x useful: it runs a command, replacing a job spec with the corresponding PID:
$ yes >/dev/null &
[1] 2458
$ jobs -x sudo kill %1
[1]+ Terminated yes > /dev/null
I find -p more intuitive, personally, but I get the appeal of -x.
Here is the piece of code from shell script that is causing the problem.
LOG_FILE="/home/sample.log"
PID_FILE="/home/sample.pid"
sudo -u user1 trinidad -e production > "$LOG_FILE" 2>&1 & echo $! > "$PID_FILE"
PARENT_PID=`cat "$PID_FILE"`
pgrep -P "$PARENT_PID" > "$PID_FILE"
But the last command does not write anything to PID_FILE. For debugging purposes I tried echo $PARENT_PID, and it correctly prints a value like 1234.
Also, if I run pgrep -P 1234 in the shell script, it prints the child process correctly; only pgrep -P $PARENT_PID prints nothing.
You are writing stuff into a file and then reading the file back in. That is merely wasteful and does not actually explain your problem, but I would refactor to
LOG_FILE="/home/sample.log"
PID_FILE="/home/sample.pid"
sudo -u user1 trinidad -e production > "$LOG_FILE" 2>&1 &
PARENT_PID=$!
pgrep -P "$PARENT_PID" > "$PID_FILE"
I'm guessing your actual problem is that the sudo process doesn't spawn any children. The action of pgrep -P is to print processes which are children of the PID you specify; if your process doesn't spawn any children, it won't print any.
I'm writing a bash script that launches a program whose run time is unknown. The script also starts a while loop that uses Linux commands or perf to record something once per second.
./my_app &
i=1
while true;
do
perf stat -a -A -e writeback:writeback_dirty_page sleep $i >> out
done
How can I stop the while loop when my_app has finished? Thank you.
Make your while loop conditional on the process id of the app existing:
./my_app &
app_pid=$!
i=1
while ps -p $app_pid >/dev/null 2>&1
do
perf stat -a -A -e writeback:writeback_dirty_page sleep $i >> out
done
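An alternative to ps -p is kill -0, which sends no signal and only reports whether the PID can still be signalled. A minimal sketch, assuming sleep 2 as a stand-in for ./my_app and sleep 1 as a stand-in for the perf stat sampling command:

```shell
#!/bin/sh
sleep 2 &               # stand-in for ./my_app
app_pid=$!
samples=0
while kill -0 "$app_pid" 2>/dev/null; do
    sleep 1             # stand-in for: perf stat ... sleep $i >> out
    samples=$((samples + 1))
done
wait "$app_pid"         # collect the exit status of the app
echo "took $samples samples"
```

With either check, the loop only notices that the app has finished between samples, so it can overrun by up to one sampling interval.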
Get the PID of the background process using
echo $!
and then use kill on it.
Alternatively, you can send a kill signal from my_app to the process that spawned my_app.
Here is the real example
test.sh
#!/bin/bash
./my_app.sh $$ &
while [ 1 ]
do
echo running....
sleep 2
done
my_app.sh
#!/bin/bash
sleep 10
kill -9 $1
I have a script that uses ssh to login to a remote machine, cd to a particular directory, and then start a daemon. The original script looks like this:
ssh server "cd /tmp/path ; nohup java server 0</dev/null 1>server_stdout 2>server_stderr &"
This script appears to work fine. However, it is not robust to the case when the user enters the wrong path so the cd fails. Because of the ;, this command will try to run the nohup command even if the cd fails.
The obvious fix doesn't work:
ssh server "cd /tmp/path && nohup java server 0</dev/null 1>server_stdout 2>server_stderr &"
that is, the SSH command does not return until the server is stopped. Putting nohup in front of the cd instead of in front of the java didn't work.
Can anyone help me fix this? Can you explain why this solution doesn't work? Thanks!
Edit: cbuckley suggests using sh -c, from which I derived:
ssh server "nohup sh -c 'cd /tmp/path && java server 0</dev/null 1>master_stdout 2>master_stderr' 2>/dev/null 1>/dev/null &"
However, now the exit code is always 0 when the cd fails; whereas if I do ssh server cd /failed/path then I get a real exit code. Suggestions?
See Bash's Operator Precedence.
The & is being attached to the whole statement because it has a lower precedence than &&: the entire && list is parsed first, and the & then backgrounds all of it. You don't need ssh to verify this. Just run this in your shell:
$ sleep 100 && echo yay &
[1] 19934
If the & were only attached to the echo yay, then your shell would sleep for 100 seconds and then report the background job. However, the entire sleep 100 && echo yay is backgrounded and you're given the job notification immediately. Running jobs will show it hanging out:
$ sleep 100 && echo yay &
[1] 20124
$ jobs
[1]+ Running sleep 100 && echo yay &
You can use parentheses to create a subshell around echo yay &, giving you what you'd expect:
sleep 100 && ( echo yay & )
This would be similar to using bash -c to run echo yay &:
sleep 100 && bash -c "echo yay &"
Tossing these into an ssh, and we get:
# using parentheses...
$ ssh localhost "cd / && (nohup sleep 100 >/dev/null </dev/null &)"
$ ps -ef | grep sleep
me 20136 1 0 16:48 ? 00:00:00 sleep 100
# and using `bash -c`
$ ssh localhost "cd / && bash -c 'nohup sleep 100 >/dev/null </dev/null &'"
$ ps -ef | grep sleep
me 20145 1 0 16:48 ? 00:00:00 sleep 100
Applying this to your command, and we get
ssh server "cd /tmp/path && (nohup java server 0</dev/null 1>server_stdout 2>server_stderr &)"
or:
ssh server "cd /tmp/path && bash -c 'nohup java server 0</dev/null 1>server_stdout 2>server_stderr &'"
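The same grouping also determines which exit status the caller sees. A minimal sketch, assuming false as a stand-in for a failing cd and sleep 1 as a stand-in for the server:

```shell
#!/bin/sh
# Case 1: & applies to the whole "false && sleep 1" list.
false && sleep 1 &
status_whole=$?      # status of starting the background job
wait

# Case 2: only the parenthesised part is backgrounded.
false && ( sleep 1 & )
status_grouped=$?    # status of false, which ran in the foreground

echo "whole list backgrounded: $status_whole"    # prints 0
echo "grouped with ( ):        $status_grouped"  # prints 1
```

A backgrounded list reports status 0 as soon as it is started, which is consistent with the question's observation that the nohup sh -c '...' & form always returns 0 even when the cd fails.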
Also, with regard to your comment on the post,
"Right, sh -c always returns 0. E.g., sh -c exit 1 has error code 0"
this is incorrect. Directly from the manpage:
Bash's exit status is the exit status of the last command executed in
the script. If no commands are executed, the exit status is 0.
Indeed:
$ bash -c "true ; exit 1"
$ echo $?
1
$ bash -c "false ; exit 22"
$ echo $?
22
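The example in the quoted comment may come down to quoting rather than to sh -c itself: without quotes, the command string is only exit, and the 1 is passed as $0. A minimal sketch of the difference (the variable names are illustrative):

```shell
#!/bin/sh
unquoted=0
sh -c exit 1 || unquoted=$?     # command string is just "exit"; 1 becomes $0
quoted=0
sh -c 'exit 1' || quoted=$?     # command string is "exit 1"
echo "unquoted=$unquoted quoted=$quoted"   # prints: unquoted=0 quoted=1
```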
ssh server "test -d /tmp/path" && ssh server "nohup ... &"
Answer roundup:
Bad: Using sh -c to wrap the entire nohup command doesn't work for my purposes because it doesn't return error codes. (#cbuckley)
Okay: ssh <server> <cmd1> && ssh <server> <cmd2> works but is much slower (#joachim-nilsson)
Good: Create a shell script on <server> that runs the commands in succession and returns the correct error code.
The last is what I ended up using. I'd still be interested in learning why the original use-case doesn't work, if someone who understands shell internals can explain it to me!
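A minimal sketch of what such a server-side script might contain. The function name start_in_dir is hypothetical, and output is sent to /dev/null here instead of server_stdout and server_stderr to keep the sketch free of side effects:

```shell
#!/bin/sh
# start_in_dir DIR CMD [ARGS...]
# Runs CMD in the background from DIR; fails without starting CMD
# if DIR cannot be entered. The ( ) body keeps the cd local.
start_in_dir() (
    dir=$1
    shift
    cd "$dir" || exit 1
    nohup "$@" </dev/null >/dev/null 2>&1 &
)

# Example with the question's values (not executed here):
# start_in_dir /tmp/path java server
```

Because the cd runs in the foreground and only the nohup command is backgrounded, a bad path produces a non-zero exit status that ssh can pass back to the caller.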
I'm trying to nohup a command and run it as a different user, but every time I do this two processes are spawned.
For example:
$ nohup su -s /bin/bash nobody -c "my_command" > outfile.txt &
This definitely runs my_command as nobody, but there's an extra process that I don't want to show up:
$ ps -Af
.
.
.
root ... su -s /bin/bash nobody my_command
nobody ... my_command
And if I kill the root process, the nobody process still lives... but is there a way to not run the root process at all? With the extra process in between, getting the ID of my_command and killing it is a bit more complicated.
This could be achieved as:
su nobody -c "nohup my_command >/dev/null 2>&1 &"
and to write the pid of 'my_command' in a pidFile:
pidFile=/var/run/myAppName.pid
touch $pidFile
chown nobody:nobody $pidFile
su nobody -c "nohup my_command >/dev/null 2>&1 & echo \$! > '$pidFile'"
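Once the PID file exists, stopping my_command no longer requires searching ps output. A minimal sketch, assuming a hypothetical helper named stop_from_pidfile; it would have to run as root or as the user that owns the process:

```shell
#!/bin/sh
# stop_from_pidfile FILE: send SIGTERM to the PID stored in FILE.
stop_from_pidfile() {
    pid_file=$1
    [ -r "$pid_file" ] || return 1
    pid=$(cat "$pid_file")
    # Refuse anything that is not a plain number.
    case $pid in
        ''|*[!0-9]*) return 1 ;;
    esac
    kill "$pid" && rm -f "$pid_file"
}

# Example with the path used above (not executed here):
# stop_from_pidfile /var/run/myAppName.pid
```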
nohup runuser nobody -c "my_command my_command_args....." < /dev/null >> /tmp/mylogfile 2>&1 &
If the user has a nologin shell, run as follows:
su - nobody -s /bin/sh -c "nohup your_command parameter >/dev/null 2>&1 &"
Or:
runuser - nobody -s /bin/sh -c "nohup your_command parameter >/dev/null 2>&1 &"
Or:
sudo su - nobody -s /bin/sh -c "nohup your_command parameter >/dev/null 2>&1 &"
Or (runuser does not accept -s or -c together with -u, so the shell is given as the command):
sudo runuser -u nobody -- /bin/sh -c "nohup your_command parameter >/dev/null 2>&1 &"
You might do best to create a small script in e.g. /usr/local/bin/start_my_command like this:
#!/bin/bash
nohup my_command > outfile.txt &
Use chown and chmod to set it to be executable and owned by nobody, then just run su nobody -c /usr/local/bin/start_my_command.
A note on running this in a session: if you run it in the background, there is a job associated with the session, and background jobs may be killed (the su -c gets around this).
To disassociate the process from the shell (so you can exit the shell but keep the process running), use disown.