Efficiently find PIDs of many processes started by services - bash

I have a file with many service names, some of them are running, some of them aren't.
foo.service
bar.service
baz.service
I would like to find an efficient way to get the PIDs of the running processes started by the services (for the not running ones a 0, -1 or empty results are valid).
Desired output example:
foo.service:8484
bar.server:
baz.service:9447
(bar.service isn't running).
So far I've managed to do the following: (1)
cat t.txt | xargs -I {} systemctl status {} | grep 'Main PID' \
| awk '{print $3}'
With the following output:
8484
9447
But I can't tell which service every PID belongs to.
(I'm not bound to use xargs, grep or awk.. just looking for the most efficient way).
So far I've managed to do the following: (2)
for f in `cat t.txt`; do
v=`systemctl status $f | grep 'Main PID:'`;
echo "$f:`echo $v | awk '{print \$3}'`";
done;
-- this gives me my desired result. Is it efficient enough?

I ran into similar problem and fount leaner solution:
systemctl show --property MainPID --value $SERVICE
returns just the PID of the service, so your example can be simplified down to
for f in `cat t.txt`; do
echo "$f:`systemctl show --property MainPID --value $f`";
done

You could also do:
while read -r line; do
statuspid="$(sudo service $line status | grep -oP '(?<=(process|pid)\s)[0-9]+')"
appendline=""
[[ -z $statuspid ]] && appendline="${line}:${statuspid}" || appendline="$line"
"$appendline" >> services-pids.txt
done < services.txt
To use within a variable, you could also have an associative array:
declare -A servicearray=()
while read -r line; do
statuspid="$(sudo service $line status | grep -oP '(?<=(process|pid)\s)[0-9]+')"
[[ -z $statuspid ]] && servicearray[$line]="statuspid"
done < services.txt
# Echo output of array to command line
for i in "${!servicearray[#]}"; do # Iterating over the keys of the associative array
# Note key-value syntax
echo "service: $i | pid: ${servicearray[$i]}"
done
Making it more efficient:
To list all processes with their execution commands and PIDs. This may give us more than one PID per command, which might be useful:
ps -eo pid,comm
So:
psidcommand=$(ps -eo pid,comm)
while read -r line; do
# Get all PIDs with the $line prefixed
statuspids=$(echo $psidcommand | grep -oP '[0-9]+(?=\s$line)')
# Note that ${statuspids// /,} replaces space with commas
[[ -z $statuspids ]] && appendline="${line}:${statuspids// /,}" || appendline="$line"
"$appendline" >> services-pids.txt
done < services.txt
OUTPUT:
kworker:5,23,28,33,198,405,513,1247,21171,22004,23749,24055
If you're confident your file has the full name of the process, you can replace the:
grep -oP '[0-9]+(?=\s$line)'
with
grep -oP '[0-9]+(?=\s$line)$' # Note the extra "$" at the end of the regex
to make sure it's an exact match (in the grep without trailing $, line "mys" would match with "mysql"; in the grep with trailing $, it would not, and would only match "mysql").

Building up on Yorik.sar's answer, you first want to get the MainPID of a server like so:
for SERVICE in ...<service names>...
do
MAIN_PID=`systemctl show --property MainPID --value $SERVICE`
if test ${MAIN_PID} != 0
than
ALL_PIDS=`pgrep -g $MAIN_PID`
...
fi
done
So using systemctl gives you the PID of the main process controlled by your daemon. Then the pgrep gives you the daemon and a list of all the PIDs of the processes that daemon started.
Note: if the processes are user processes, you have to use the --user on the systemctl command line for things to work:
MAIN_PID=`systemctl --user show --property MainPID --value $SERVICE`
Now you have the data you are interested in the MAIN_PID and ALL_PIDS variables, so you can print the results like so:
if test -n "${ALL_PID}"
then
echo "${SERVICE}: ${ALL_PIDS}"
fi

Related

shell if statement always returning true

I want to check if my VPN is connected to a specific country. The VPN client has a status option but sometimes it doesn't return the correct country, so I wrote a script to check if I'm for instance connected to Sweden. My script looks like this:
#!/bin/bash
country=Sweden
service=expressvpn
while true; do
if ((curl -s https://www.iplocation.net/find-ip-address | grep $country | grep -v "grep" | wc -l) > 0 )
then
echo "$service connected!!!"
else
echo "$service not connected!"
$service connect $country
fi;
sleep 5;
done
The problem is, it always says "service connected", even when it isn't. When I enter the curl command manually, wc -l returns 0 if it didn't find Sweden and 1 when it does. What's wrong with the if statement?
Thank you
Peter
(( )) enters a math context -- anything inside it is interpreted as a mathematical expression. (You want your code to be interpreted as a math expression -- otherwise, > 0 would be creating a file named 0 and storing wc -l's output in that file, not comparing the output of wc -l to 0).
Since you aren't using )) on the closing side, this is presumably exactly what's happening: You're storing the output of wc -l in a file named 0, and then using its exit status (successful, since it didn't fail) to decide to follow the truthy branch of the if statement. [Just adding more parens on the closing side won't fix this, either, since curl -s ... isn't valid math syntax].
Now, if you want to go the math approach, what you can do is run a command substitution, which replaces the command with its output; that is a math expression:
# smallest possible change that works -- but don't do this; see other sections
if (( $(curl -s https://www.iplocation.net/find-ip-address | grep $country | grep -v "grep" | wc -l) > 0 )); then
...if your curl | grep | grep | wc becomes 5, then after the command substitution this looks like:
if (( 5 > 0 )); then
...and that does what you'd expect.
That said, this is silly. You want to know if your target country is in curl's output? Just check for that directly with shell builtins alone:
if [[ $(curl -s https://www.iplocation.net/find-ip-address) = *"$country"* ]]; then
echo "Found $country in output of curl" >&2
fi
...or, if you really want to use grep, use grep -q (which suppresses output), and check its exit status (which is zero, and thus truthy, if and only if it successfully found a match):
if curl -s https://www.iplocation.net/find-ip-address | grep -q -e "$country"; then
echo "Found $country in output of curl with grep" >&2
fi
This is more efficient in part because grep -q can stop as soon as it finds a match -- it doesn't need to keep reading more content -- so if your file is 16KB long and the country name is in the first 1KB of output, then grep can stop reading from curl (and curl can stop downloading) as soon as that first match 1KB in is seen.
The result of the curl -s https://www.iplocation.net/find-ip-address | grep $country | grep -v "grep" | wc -l statement is text. You compare text and number, that is why your if statement does not work.
This might solve your problem;
if [ $(curl -s https://www.iplocation.net/find-ip-address | grep $country | grep -v "grep" | wc -l) == "0" ] then ...
That worked, thank you for your help, this is what my script looks now:
#!/bin/bash
country=Switzerland
service=expressvpn
while true; do
if curl -s https://www.iplocation.net/find-ip-address | grep -q -e "$country"; then
echo "Found $country in output of curl with grep" >&2
echo "$service not connected!!!"
$service connect Russia
else
echo "$service connected!"
fi;
sleep 5;
done

Bash script - how to count processes that have most parents

we were given a task to write a script in a course. We have to make the script find out which proccess is "deepest" in process hierarchy, something like "pstree" command, but the output will be "depth_of_process : processes_with_the_depth".
I have started something, but I can't make it work. Could you please look at it and help me ? I haven't even started producing the output, I am working on the algorithm now - trying to make it into something like reverse depth-first search. In case the code is not self-explanatory enough, please let me know, I will do my best to describe it.
#!/bin/bash
PROCS=$(ps -eo "%p %P" | tail -n +2 | sort -nr)
declare -a array
while read -r line; do
counter=1
read kid parent
while read -r otherline; do
read kid2 parent2
if [ "$parent" = "$kid2" ]; then
counter=$((counter+1))
parent="$parent2"
fi
done <<< "$PROCS"
test=2
array["$kid"]="$counter"
done <<< "$PROCS"
#for value in "${!array[#]}"; do
# echo "$value ${array[value]}"
#done
echo "$PROCS"
If pstree is allowed I could offer this (thanks #tripleee for optimizing):
for processid in $(ps -ax | awk 'NR>1 {print $1}' ); do
depth=$(pstree -sA $processid | head -n1 | sed -e 's#-+-.*#---foobar#' -e 's#---*#\n#g' -eq | wc -l)
echo "$depth: $processid"
done
It might have issues if your processes contain two or more dashes in a row.
Of course you can add " | sort" after "done" to get the deepest processes.

A script to find all the users who are executing a specific program

I've written the bash script (searchuser) which should display all the users who are executing a specific program or a script (at least a bash script). But when searching for scripts fails because the command the SO is executing is something like bash scriptname.
This script acts parsing the ps command output, it search for all the occurrences of the specified program name, extracts the user and the program name, verifies if the program name is that we're searching for and if it's it displays the relevant information (in this case the user name and the program name, might be better to output also the PID, but that is quite simple). The verification is accomplished to reject all lines containing program names which contain the name of the program but they're not the program we are searching for; if we're searching gedit we don't desire to find sgedit or gedits.
Other issues I've are:
I would like to avoid the use of a tmp file.
I would like to be not tied to GNU extensions.
The script has to be executed as:
root# searchuser programname <invio>
The script searchuser is the following:
#!/bin/bash
i=0
search=$1
tmp=`mktemp`
ps -aux | tr -s ' ' | grep "$search" > $tmp
while read fileline
do
user=`echo "$fileline" | cut -f1 -d' '`
prg=`echo "$fileline" | cut -f11 -d' '`
prg=`basename "$prg"`
if [ "$prg" = "$search" ]; then
echo "$user - $prg"
i=`expr $i + 1`
fi
done < $tmp
if [ $i = 0 ]; then
echo "No users are executing $search"
fi
rm $tmp
exit $i
Have you suggestion about to solve these issues?
One approach might looks like such:
IFS=$'\n' read -r -d '' -a pids < <(pgrep -x -- "$1"; printf '\0')
if (( ! ${#pids[#]} )); then
echo "No users are executing $1"
fi
for pid in "${pids[#]}"; do
# build a more accurate command line than the one ps emits
args=( )
while IFS= read -r -d '' arg; do
args+=( "$arg" )
done </proc/"$pid"/cmdline
(( ${#args[#]} )) || continue # exited while we were running
printf -v cmdline_str '%q ' "${args[#]}"
user=$(stat --format=%U /proc/"$pid") || continue # exited while we were running
printf '%q - %s\n' "$user" "${cmdline_str% }"
done
Unlike the output from ps, which doesn't distinguish between ./command "some argument" and ./command "some" "argument", this will emit output which correctly shows the arguments run by each user, with quoting which will re-run the given command correctly.
What about:
ps -e -o user,comm | egrep "^[^ ]+ +$1$" | cut -d' ' -f1 | sort -u
* Addendum *
This statement:
ps -e -o user,pid,comm | egrep "^\s*\S+\s+\S+\s*$1$" | while read a b; do echo $a; done | sort | uniq -c
or this one:
ps -e -o user,pid,comm | egrep "^\s*\S+\s+\S+\s*sleep$" | xargs -L1 echo | cut -d ' ' -f1 | sort | uniq -c
shows the number of process instances by user.

pgrep inside bash script not working

I'm trying to save the pgrep -f "instance" output inside a variable in a bash script. For some reason it doesn't work:
here is the code:
function check_running() {
local component_identifier=$1
local process_check=`get_component_param_value $component_identifier $process_check_col | tr -d '\n'`
if [ "$process_check" != "n/a" ]; then
log "process to check: ****""$process_check""****"
pid=`pgrep -f $process_check`
log "pid: " $pid
fi
}
I have tried with different ways, in single and double quotes.
Also, neither this works:
pid=$(pgrep -f "$process_check")
Please note that the process_check variable returns correctly and is definitely working.
I believe the problem is that this field is at the end of the line and may contain a \n char, that is why I've added a tr in the process_check var.
Any idea?
this is the output of the logs:
process to check: ****"instance"****
pid:
I found a way to answer this question:
echo `ps -ef | grep "$process_check" | grep -wv 'grep\|vi\|vim'` | awk '{print $2}'
I hope other people will find this useful
Well, in my case it didn't like the arguments I was trying to pass.
pgrep -f "example" # working
pgrep -f "-f -N example" # not working

Bash script checking cpu usage of specific process

First off, I'm new to this. I have some experience with windows scripting and apple script but not much with bash. What I'm trying to do is grab the PID and %CPU of a specific process. then compare the %CPU against a set number, and if it's higher, kill the process. I feel like I'm close, but now I'm getting the following error:
[[: 0.0: syntax error: invalid arithmetic operator (error token is ".0")
what am I doing wrong? here's my code so far:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="50"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3'}`
if [[ ! $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Obviously I'm going to replace the echos in the if/then statement but it's acting as if the statement is true regardless of what the cpu load actually is (I tested this by changing the -gt to -lt and it still echoed "crap"
Thank you for all the help. Oh, and this is on a OS X 10.7 if that is important.
I recommend taking a look at the facilities of ps to avoid multiple horrible things you do.
On my system (ps from procps on linux, GNU awk) I would do this:
ps -C "$app-name" -o pid=,pcpu= |
awk --assign maxcpu="$cpu_limit" '$2>maxcpu {print "crappy pid",$1}'
The problem is that bash can't handle decimals. You can just multiply them by 100 and work with plain integers instead:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="5000"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3*100'}`
if [[ $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Keep in mind that CPU percentage is a suboptimal measurement of application health. If you have two processes running infinite loops on a single core system, no other application of the same priority will ever go over 33%, even if they're trashing around.
#!/bin/sh
PROCESS="java"
PID=`pgrep $PROCESS | tail -n 1`
CPU=`top -b -p $PID -n 1 | tail -n 1 | awk '{print $9}'`
echo $CPU
I came up with this, using top and bc.
Use it by passing in ex: ./script apache2 50 # max 50%
If there are many PIDs matching your program argument, only one will be calculated, based on how top lists them. I could have extended the script by catching them all and avergaing the percentage or something, but this will have to do.
You can also pass in a number, ./script.sh 12345 50, which will force it to use an exact PID.
#!/bin/bash
# 1: ['command\ name' or PID number(,s)] 2: MAX_CPU_PERCENT
[[ $# -ne 2 ]] && exit 1
PID_NAMES=$1
# get all PIDS as nn,nn,nn
if [[ ! "$PID_NAMES" =~ ^[0-9,]+$ ]] ; then
PIDS=$(pgrep -d ',' -x $PID_NAMES)
else
PIDS=$PID_NAMES
fi
# echo "$PIDS $MAX_CPU"
MAX_CPU="$2"
MAX_CPU="$(echo "($MAX_CPU+0.5)/1" | bc)"
LOOP=1
while [[ $LOOP -eq 1 ]] ; do
sleep 0.3s
# Depending on your 'top' version and OS you might have
# to change head and tail line-numbers
LINE="$(top -b -d 0 -n 1 -p $PIDS | head -n 8 \
| tail -n 1 | sed -r 's/[ ]+/,/g' | \
sed -r 's/^\,|\,$//')"
# If multiple processes in $PIDS, $LINE will only match\
# the most active process
CURR_PID=$(echo "$LINE" | cut -d ',' -f 1)
# calculate cpu limits
CURR_CPU_FLOAT=$(echo "$LINE"| cut -d ',' -f 9)
CURR_CPU=$(echo "($CURR_CPU_FLOAT+0.5)/1" | bc)
echo "PID $CURR_PID: $CURR_CPU""%"
if [[ $CURR_CPU -ge $MAX_CPU ]] ; then
echo "PID $CURR_PID ($PID_NAMES) went over $MAX_CPU""%"
echo "[[ $CURR_CPU""% -ge $MAX_CPU""% ]]"
LOOP=0
break
fi
done
echo "Stopped"
Erik, I used a modified version of your code to create a new script that does something similar. Hope you don't mind it.
A bash script to get the CPU usage by process
usage:
nohup ./check_proc bwengine 70 &
bwegnine is the process name we want to monitor 70 is to log only when the process is using over 70% of the CPU.
Check the logs at: /var/log/check_procs.log
The output should be like:
DATE | TOTAL CPU | CPU USAGE | Process details
Example:
03/12/14 17:11 |20.99|98| ProdPROXY-ProdProxyPA.tra
03/12/14 17:11 |20.99|100| ProdPROXY-ProdProxyPA.tra
Link to the full blog:
http://felipeferreira.net/?p=1453
It is also useful to have app_user information available to test whether the current user has the rights to kill/modify the running process. This information can be obtained along with the needed app_pid and app_cpu by using read eliminating the need for awk or any other 3rd party parser:
read app_user app_pid tmp_cpu stuff <<< \
$( ps aux | grep "$app_name" | grep -v "grep\|defunct\|${0##*/}" )
You can then get your app_cpu * 100 with:
app_cpu=$((${tmp_cpu%.*} * 100))
Note: Including defunct and ${0##*/} in grep -v prevents against multiple processes matching $app_name.
I use top to check some details. It provides a few more details like CPU time.
On Linux this would be:
top -b -n 1 | grep $app_name
On Mac, with its BSD version of top:
top -l 1 | grep $app_name

Resources