How to check if a specific executable has a live process - bash

I want to write a script that checks periodically if a specific executable has a live process, something like this:
psping [-c ###] [-t ###] [-u user-name] exe-name
-c - limit amount of pings, Default is infinite
-t - define alternative timeout in seconds, Default is 1 sec
-u - define user to check process for. The default is ANY user.
For example, psping java will list all processes that are currently invoked by the java command.
The main goal is to count and echo the number of live processes, for a given user, whose executable file is exe-name (java in the above example).
I wrote a function:
perform_ping(){
    ps aux | grep "${EXE_NAME}" | awk '{print $2}' | while read PID
    do
        echo $PID # -> This will echo the correct PID
        # How to find if this PID was executed by ${EXE_NAME}?
    done
    sleep 1
}
I'm having a hard time figuring out how to check if a specific executable file has a live process.

To list all processes that have a file open, we can use the lsof command. Because an executable must be opened in order to be run, we can just use lsof for this purpose.
The next problem is that when we run a Java program, we simply type java some_file, and if we issue lsof java it will coldly say lsof: status error on java: No such file or directory, because the java we run is actually /usr/bin/java.
To convert from java to /usr/bin/java we can use which java, so the command would be:
lsof $(which $EXE_FILE)
The output may look like this:
lsof: WARNING: can't stat() tracefs file system /sys/kernel/debug/tracing
Output information may be incomplete.
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
python3 26969 user txt REG 8,1 4526456 15409 /usr/bin/python3.6
In this case I searched for python3 with lsof $(which python3). It reports the PID in the second field. But when another user invokes python3 too, lsof will print a warning on stderr like the first two lines, because it cannot read other users' info. Therefore, we modify the command as:
lsof $(which python3) 2> /dev/null
to suppress the warning. Then we're almost there:
lsof $(which python3) 2> /dev/null | awk 'NR > 1 { print $2 }'
Then you can use read to catch the PID.
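Putting the pipeline into the asker's function, a minimal sketch might look like this (count_pids and the error handling are my own illustrative additions, not part of the answer above):

```shell
#!/bin/bash
# count_pids: read lsof-style output on stdin and print the number of
# distinct PIDs (field 2), skipping the header line.
count_pids() {
    awk 'NR > 1 { print $2 }' | sort -u | wc -l
}

# perform_ping: resolve the executable with command -v, then count the
# processes that currently have it open. Function names are illustrative.
perform_ping() {
    local exe_path
    exe_path=$(command -v "$1") || return 1
    lsof "$exe_path" 2>/dev/null | count_pids
}
```

The parsing is separated into count_pids so it can be exercised with canned lsof output, independent of what is actually running.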
Edit: how to list all processes for all users?
The lsof invocation above needs the path to a specific file, but after further reading of man lsof I found that there are options that meet your needs.
-a causes list selection options to be ANDed.
-c c selects the listing of files for processes executing the command that begins with the characters of c. Multiple commands may be specified, using multiple -c options.
-u s selects the listing of files for the user whose login names or user ID numbers are in the comma-separated set s.
Therefore, you can use
lsof -c java
to list all processes whose command begins with java. And to restrict the listing to a specific user, add the -u option:
lsof -a -c java -u user
-a is needed for the AND operation. If you run this command you will see multiple entries for each process; to deduplicate them by PID, run
lsof -c java 2> /dev/null | sed 1d | sort -uk2,2
Also please notice that users may run their own java in their path and therefore you have to decide which one to monitor: java or /usr/bin/java.
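Combining the -a/-c/-u selection with the deduplication above, a rough sketch of the whole psping wrapper could look like this (the option names come from the question; the function body, defaults, and loop structure are illustrative assumptions):

```shell
#!/bin/bash
# psping sketch: -c limits the number of pings (0 = infinite),
# -t sets the interval in seconds, -u restricts to one user.
psping() {
    local count=0 timeout=1 user="" OPTIND=1 opt i=0
    while getopts "c:t:u:" opt; do
        case $opt in
            c) count=$OPTARG ;;
            t) timeout=$OPTARG ;;
            u) user=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    local exe=$1
    [ -n "$exe" ] || return 2
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        if [ -n "$user" ]; then
            lsof -a -c "$exe" -u "$user" 2>/dev/null | sed 1d | sort -uk2,2 | wc -l
        else
            lsof -c "$exe" 2>/dev/null | sed 1d | sort -uk2,2 | wc -l
        fi
        i=$((i + 1))
        sleep "$timeout"
    done
}
```

For example, `psping -c 3 -u alice java` would print the count three times, one second apart.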

Related

Shell Script: How to get all the Process Ids which are using a file-system/directory

I am checking which processes are using a file system. When I run fuser, hundreds of processes come back.
fuser -cu /xyz
Output truncated:
393ce(xyz) 1044c(root) 1068cm(oracle) 2065ce(xyz) 3729ce(xyz)
I want just process id in file separated by newline character so that I can run a loop to check the processes.
If you only want the id instead of id(user), then don't use the -u option. Documentation of fuser -u:
-u, --user
Append the user name of the process owner to each PID.
For me, fuser -c / has a different format than your sample. Each id is followed by letters denoting the type of access. The letters are printed to stderr, therefore I will use 2>&- to hide them.
$ fuser -c /
/: 1717rce 1754rce 1765rce 1785rce ...
$ fuser -c / 2>&-
1717 1754 1765 1785 ...
You can use grep to print one id per line:
$ fuser -c / 2>&- | grep -o '[0-9]*'
1717
1754
1765
1785
...
However, to run a loop you don't need one id per line. Ids separated by spaces work as well:
for id in $(fuser -c / 2>&-); do
echo "id = $id"
done

How to detect in bash script where stdout and stderr logs go?

I have a bash script called from cron multiple times with different parameters and redirecting their outputs to different logs approximately like this:
* * * * * /home/bob/bin/somescript.sh someparameters >> /home/bob/log/param1.log 2>&1
I need my script to obtain, in some variable, the value "/home/bob/log/param1.log" in this case. The log file name could as well contain a calculated date instead of "param1". The main reason, as of now, is to reuse the same script for similar purposes and to be able to inform a user, via a monitored folder, where he should look for more info: give him a logfile name in some warning file.
How do I detect to which log the output (&1 or both &1 and &2) goes?
If you are running Linux, you can read the information from the proc file system. Assume you have the following program in stdout.sh.
#! /bin/bash
readlink -f /proc/$$/fd/1 >&2
Interactively it shows your terminal.
$ ./stdout.sh
/dev/pts/0
And with a redirection it shows the destination.
$ ./stdout.sh > nix
/home/ceving/nix
At runtime, /usr/bin/lsof or /usr/sbin/lsof lists the open files:
lsof -p $$ -a -d 1
lsof -p $$ -a -d 2
filename1=$(lsof -p $$ -a -d 1 -F n)
filename1=${filename1#*$'\n'n}
filename2=$(lsof -p $$ -a -d 2 -F n)
filename2=${filename2#*$'\n'n}
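For completeness, the /proc idea can be wrapped so a script reports both destinations at once; this is a Linux-only sketch (the file name whereto.sh is made up):

```shell
#!/bin/bash
# whereto.sh: same idea as the stdout.sh example above, for both
# stdout and stderr. Relies on Linux's /proc file system; prints on
# stderr so the stdout answer is visible even when stdout is redirected.
readlink -f /proc/$$/fd/1 >&2
readlink -f /proc/$$/fd/2 >&2
```

Run as `./whereto.sh >> /home/bob/log/param1.log` and the first line printed will be the resolved log path.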

bash script parallel ssh remote command

I have a script that fires remote commands on several different machines through an ssh connection. The script goes something like:
for server in list; do
echo "output from $server"
ssh to server execute some command
done
The problem with this is evidently the time, as it needs to establish an ssh connection, fire the command, wait for an answer, and print it. What I would like is a script that would try to establish all connections at once and return echo "output from $server" and the command's output as soon as it gets it, so not necessarily in list order.
I've been googling this for a while but didn't find an answer. I cannot cancel the ssh session after the command runs, as one thread suggested, because I need the output, and I cannot use GNU parallel as suggested in other threads. Also, I cannot use any other tool; I cannot bring/install anything on this machine. The only usable tool is GNU bash, version 4.1.2(1)-release.
Another question is how ssh sessions like this are limited. If I simply paste 5+ or so lines of "ssh connect, do some command", it actually doesn't do anything, or executes only the first from the list (it works if I paste 3-4 lines). Thank you.
Have you tried this?
for server in list; do
ssh user@server "command" &
done
wait
echo finished
Update: Start subshells:
for server in list; do
(echo "output from $server"; ssh user@server "command"; echo End $server) &
done
wait
echo All subshells finished
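A sketch of the same fan-out pattern with per-host output files, which keeps interleaved lines apart (run_one stands in for the real ssh call; all names here are illustrative):

```shell
#!/bin/bash
# Fan-out sketch: start one background job per server, collect each
# server's output in its own file, then print everything after wait.
tmpdir=$(mktemp -d)

run_one() {
    # ssh "user@$1" "command"   # the real call; a local stand-in follows
    echo "output from $1"
}

for server in alpha beta gamma; do
    run_one "$server" > "$tmpdir/$server" &
done
wait
cat "$tmpdir"/*
rm -rf "$tmpdir"
```

Because each job writes to its own file, the final output is grouped by host even though the jobs finish in arbitrary order.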
There are several parallel SSH tools that can handle that for you:
http://code.google.com/p/pdsh/
http://sourceforge.net/projects/clusterssh/
http://code.google.com/p/sshpt/
http://code.google.com/p/parallel-ssh/
Also, you could be interested in configuration deployment solutions such as Chef, Puppet, Ansible, Fabric, etc. (see this summary).
A third option is to use a terminal broadcast such as pconsole
If you only can use GNU commands, you can write your script like this:
for server in $servers ; do
( { echo "output from $server" ; ssh user@$server "command" ; } | \
sed -e "s/^/$server:/" ) &
done
wait
and then sort the output to reconcile the lines.
I started with the shell hacks mentioned in this thread, then proceeded to something somewhat more robust: https://github.com/bearstech/pussh
It's my daily workhorse, and I basically run anything against 250 servers in 20 seconds (it's actually rate-limited, otherwise the connection rate kills my ssh-agent). I've been using this for years.
See for yourself from the man page (clone it and run 'man ./pussh.1'): https://github.com/bearstech/pussh/blob/master/pussh.1
Examples
Show all servers' rootfs usage in descending order:
pussh -f servers df -h / |grep /dev |sort -rn -k5
Count the number of processors in a cluster:
pussh -f servers grep ^processor /proc/cpuinfo |wc -l
Show the processor models, sorted by occurrence:
pussh -f servers sed -ne "s/^model name.*: //p" /proc/cpuinfo |sort |uniq -c
Fetch a list of installed packages in one file per host:
pussh -f servers -o packages-for-%h dpkg --get-selections
Mass copy a file tree (broadcast):
tar czf files.tar.gz ... && pussh -f servers -i files.tar.gz tar -xzC /to/dest
Mass copy several remote file trees (gather):
pussh -f servers -o '|(mkdir -p %h && tar -xzC %h)' tar -czC /src/path .
Note that the pussh -u feature (upload and execute) was the main reason why I programmed this; no other tool seemed able to do it. I still wonder if that's the case today.
You may like the parallel-ssh project with the pssh command:
pssh -h servers.txt -l user command
It will output one line per server when the command is successfully executed. With the -P option you can also see the output of the command.

Capture historical process history UNIX?

I'm wondering if there is a way of capturing a list of the processes executed by a non-interactive shell?
Basically I have a script which calls some variables from other sources and I want to see what the values of said variables are. However, the script executes and finishes very quickly so I can't capture the values using ps.
Is there a way to log processes and what arguments were used?
TIA
Huskie
EDIT:
I'm using Solaris in this instance. I even thought about having a quick looping script to capture the values being passed, but this doesn't seem very accurate and I'm sure some executions aren't being captured.
I tried this:
#!/bin/ksh
while [ true ]
do
ps -ef | grep $SCRIPT_NAME |egrep -v 'shl|lis|grep' >> grep_out.txt
done
I'd use sleep but I can't specify any precision as all my sleep executables want an integer value rather than any fractional value.
On Solaris:
truss -s!all -daDf -t exec yourCommand 2>&1 | grep -v ENOENT
On AIX and possibly other System V based OSes:
truss -s!all -daDf -t execve yourCommand 2>&1 | grep -v ENOENT
On Linux and other OSes supporting strace, you can use this command:
strace -ff -etrace=execve yourCommand 2>&1 >/dev/tty | grep -v ENOENT
In case the command you want to trace is already running, you can replace yourCommand by -p pid with pid being the process to be traced process id.
EDIT:
Here is a way to trace your running script(s) under Solaris:
for pid in $(pgrep -f $SCRIPT_NAME); do
truss -s!all -daDf -t exec -p $pid 2>&1 | grep -v ENOENT > log.$pid.out &
done
Note that with Solaris, you might also use dtrace to get the same (and more).
Most shells can be invoked in debug mode, where each statement being executed is printed to stdout (or stderr) after variable substitution and expansion.
For Bourne like shells (sh, bash), debug is enabled with the -x option (as in bash -x myscript) or using the set -x statement within the script itself.
However, debugging only works for the 'current' script. If the script calls other scripts, these other scripts will not execute in debug mode. Furthermore, the code inside functions may not be executed in debug mode either - depends on the specific shell - although you can use set -x within a function to enable debug explicitly.
A much more verbose (at least by default) option is to use something like strace for this.
strace -f -o trace.out script.sh
will give you huge amounts of information about what the script is doing. For your specific usage you will likely want to limit the output a bit with the -e trace=.... option to control which system calls are traced.
Use truss instead of strace on Solaris. Use dtruss on OS X (I believe). With appropriate command line argument changes as well.

How to set the process name of a shell script?

Is there any way to set the process name of a shell script? This is needed for killing this script with the killall command.
Here's a way to do it; it is a hack/workaround but it works pretty well. Feel free to tweak it to your needs. It certainly needs some checks on the symbolic link creation, or the use of a tmp folder, to avoid possible race conditions (if they are problematic in your case).
Demonstration
wrapper
#!/bin/bash
script="./dummy"
newname="./killme"
rm -iv "$newname"
ln -s "$script" "$newname"
exec "$newname" "$@"
dummy
#!/bin/bash
echo "I am $0"
echo "my params: $@"
ps aux | grep bash
echo "sleeping 10s... Kill me!"
sleep 10
Test it using:
chmod +x dummy wrapper
./wrapper some params
In another terminal, kill it using:
killall killme
Notes
Make sure you can write in your current folder (current working directory).
If your current command is:
/path/to/file -q --params somefile1 somefile2
Set the script variable in wrapper to /path/to/file (instead of ./dummy) and call wrapper like this:
./wrapper -q --params somefile1 somefile2
You can use the kill command on a PID, so what you can do is run something in the background, get its ID, and kill it.
PID of last job run in background can be obtained using $!.
echo test & echo $!
You cannot do this reliably and portably, as far as I know. On some flavors of Unix, changing what's in argv[0] will do the job. I don't believe there's a way to do that in most shells, though.
Here are some references on the topic.
Howto change a UNIX process and child process name by modifying argv0
Is there a way to change the effective process name in Python?
This is an extremely old post. Pretty sure the original poster got their answer long ago, but for newcomers, I thought I'd explain my own experience (after playing with bash for half an hour). If you start a script by its script name with something like:
./script.sh
the process name listed by ps will be "bash" (on my system). However if you start a script by calling bash directly:
/bin/bash script.sh
/bin/sh script.sh
bash script.sh
you will end up with a process name that contains the name of the script. e.g.:
/bin/bash script.sh
results in a process name containing the script name. This can be used to mark PIDs with a specific script name. And this can be useful to, for example, use the kill command to stop all processes (by PID) whose process name contains said script name.
You can also use the -f flag to pgrep/pkill, which will search the entire command line rather than just the process name. E.g.
./script &
pkill -f script
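As a small demonstration of the -f matching, using sleep as a stand-in for a script (the principle is the same for any command line):

```shell
#!/bin/bash
# pgrep/pkill -f match the full command line, not just the executable
# name, so a script that ps shows as plain "bash" can still be found
# by its script name. Here sleep stands in for the script.
sleep 300 &
bgpid=$!
if pgrep -f 'sleep 300' | grep -q "^$bgpid$"; then
    echo "found by full command line"
fi
kill "$bgpid" 2>/dev/null
```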
Include a shebang line:
#![path to shell]
Examples of a path to a shell:
/usr/bin/bash
/bin/bash
/bin/sh
Full example
#!/usr/bin/bash
On Linux at least, killall dvb works even though dvb is a shell script labelled with #!. The only trick is to make the script executable and invoke it by name, e.g.,
dvb watch abc write game7 from 9pm for 3:30
Running ps shows a process named
/usr/bin/lua5.1 dvb watch ...
but killall dvb takes it down.
%1, %2... also do an adequate job:
#!/bin/bash
# set -ex
sleep 101 &
FIRSTPID=$!
sleep 102 &
SECONDPID=$!
echo $(ps ax|grep "^\(${FIRSTPID}\|${SECONDPID}\) ")
kill %2
echo $(ps ax|grep "^\(${FIRSTPID}\|${SECONDPID}\) ")
sleep 1
kill %1
echo $(ps ax|grep "^\(${FIRSTPID}\|${SECONDPID}\) ")
I put these two lines at the start of my scripts so I do not have to retype the script name each time I revise the script. It won't take $0 if you put it after the first shebang. Maybe someone who actually knows can correct me, but I believe this is because the script hasn't started until the second line, so $0 doesn't exist until then:
#!/bin/bash
#!/bin/bash ./$0
This should do it.
My solution uses a trivial python script, and the setproctitle package. For what it's worth:
#!/usr/bin/env python3
from sys import argv
from setproctitle import setproctitle
from subprocess import run
setproctitle(argv[1])
run(argv[2:])
Call it e.g. run-with-title and stick it in your path somewhere. Then use via
run-with-title <desired-title> <script-name> [<arg>...]
Run the bash script with an explicit call to bash (not just ./test.sh). The process name will contain the script name in this case and the process can be found by it. Or use an explicit call to bash with its full path, as suggested in display_name_11011's answer:
bash test.sh # explicit bash mentioning
/bin/bash test.sh # or with full path to bash
ps aux | grep test.sh | grep -v grep # searching PID by script name
If the first line in script (test.sh) explicitly specifies interpreter:
#!/bin/bash
echo 'test script'
then it can be called without mentioning bash explicitly, to create a process with the name '/bin/bash test.sh':
./test.sh
ps aux | grep test.sh | grep -v grep
Also, as a dirty workaround, it is possible to copy bash and use it under a custom name:
sudo cp /usr/bin/bash /usr/bin/bash_with_other_name
/usr/bin/bash_with_other_name test.sh
ps aux | grep bash_with_other_name | grep -v grep
Erm... unless I'm misunderstanding the question, the name of a shell script is whatever you've named the file. If your script is named foo then killall foo will kill it.
We won't be able to find the PID of a shell script using "ps -ef | grep {scriptName}" unless the script's name is set via the shebang. All running shell scripts do show up in the output of "ps -ef | grep bash", but that makes it trickier to identify a particular process when multiple bash processes are running simultaneously.
So a better approach is to give the shell script an appropriate name.
Edit the shell script file and use the shebang (the very first line) to name the process, e.g. #!/bin/bash /scriptName.sh
In this way we would be able to grep the process id of scriptName using
"ps -ef | grep {scriptName}"
