I am writing a bash script that logs into remote nodes and returns the services being run on that node.
#!/bin/bash
declare -a SERVICES=('redis-server' 'kube-controller-manager' 'kubelet' 'postgres' 'mongod' 'elasticsearch');
for svc in "${SERVICES[@]}"
do
RESULT=`ssh 172.29.219.109 "ps -ef | grep -v grep | grep $svc"`
if [ -z ${RESULT} ]
then
echo "Is Empty" > /dev/null
else
echo "$svc is running on this node"
fi
done
Now the output of ssh 172.29.219.109 "ps -ef | grep -v grep | grep $svc" on the node is ::
postgres 2102 1 0 Jan29 ? 00:24:27 /opt/PostgresPlus/pgbouncer/bin/pgbouncer -d /opt/PostgresPlus/pgbouncer/share/pgbouncer.ini
postgres 2394 1 0 Jan29 ? 00:20:10 /opt/PostgresPlus/9.4AS/bin/edb-postgres -D /opt/PostgresPlus/9.4AS/data
postgres 2431 2394 0 Jan29 ? 00:00:01 postgres: logger process
postgres 2434 2394 0 Jan29 ? 00:07:15 postgres: checkpointer process
postgres 2435 2394 0 Jan29 ? 00:01:10 postgres: writer process
postgres 2436 2394 0 Jan29 ? 00:03:27 postgres: wal writer process
postgres 2437 2394 0 Jan29 ? 00:20:03 postgres: autovacuum launcher process
postgres 2438 2394 0 Jan29 ? 00:37:00 postgres: stats collector process
postgres 2494 1 0 Jan29 ? 00:08:12 /opt/PostgresPlus/9.4AS/bin/pgagent -l 1 -s /var/log/ppas-agent-9.4.log hostaddr=localhost port=5432 dbname=postgres user=postgres
postgres 2495 2394 0 Jan29 ? 00:11:25 postgres: postgres postgres 127.0.0.1[59246] idle
When I run the script, I do get the result I want, but I'm also getting an unwanted message which seems to be related to the variable in which I am storing my result.
# ./map_services_to_nodes.sh
./map_services_to_nodes.sh: line 12: [: too many arguments
postgres is found on this node
The algorithm I am using is:
Search for all services defined in my array.
Store the result in a variable.
If the variable is empty, the service is not running.
If it's not empty, the service is running.
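Avoid the outdated back-ticks for command substitution and use $(..) instead; see Why Use $(STATEMENT) instead of legacy `STATEMENT`. Leave $svc unescaped inside the double quotes: it is a local variable, so it has to be expanded locally before the command is sent over ssh (escaping it as \$svc would make the remote shell expand a variable that is not set there).
RESULT=$(ssh 172.29.219.109 "ps -ef | grep -v grep | grep $svc")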
and double quote variables inside test operator,
if [ -z "${RESULT}" ]
I changed the line below
if [ -z ${RESULT} ]
to
if [ -z "${RESULT}" ]
and it worked.
# ./map_services_to_nodes.sh
postgres is found on this node
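To see why the quotes matter, here is a minimal sketch of the same check run against a multi-line value. The helper name report_status and the sample text are assumptions made for illustration; the sample stands in for what ssh would return.

```shell
#!/bin/bash
# report_status is a hypothetical helper; its second argument stands in
# for the text captured from ssh.
report_status() {
    local svc="$1" result="$2"
    # Quoting "$result" keeps the whole multi-line value as one argument,
    # so [ sees exactly two arguments: -z and the string.
    if [ -z "$result" ]; then
        echo "$svc is not running on this node"
    else
        echo "$svc is running on this node"
    fi
}

# Hypothetical sample data: two ps lines, shortened from the postgres output.
sample=$(printf 'postgres 2102 1 0 Jan29 ?\npostgres 2394 1 0 Jan29 ?')

report_status postgres "$sample"
report_status mongod ""
```

Without the quotes, the shell splits the value on whitespace and hands [ a dozen separate words, which is what produces "too many arguments".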
In a bash script, I need an IF condition on the existence of a role in a PostgreSQL database. I have found solutions in SQL code [1, 2], but I need something I can use directly in bash, I assume with the help of psql. In [2] there are also psql solutions, but I can't manage to adapt them into an IF statement.
I have tried this unsuccessfully (I am a PostgreSQL and bash newbie):
psql_USER=my
if [ "$( psql -h db -U postgres --no-psqlrc --single-transaction --pset=pager=off --tuples-only --set=ON_ERROR_STOP=1 -tc "SELECT 1 FROM pg_user WHERE usename = $psql_USER" | grep -q 1 )" == '1' ] > /dev/null 2> /dev/null; then
echo "HOURRA !"
fi;
Result is:
Password for user postgres:
ERROR: column « my » does not exist
LINE 1: SELECT 1 FROM pg_user WHERE usename = my
^
I would avoid the quoting problem like this:
if psql -Atq -c "SELECT '#' || usename || '#' FROM pg_user" | grep -q '#'"$psql_USER"'#'
then
echo yes
fi
The psql invocation selects a list of all user names, each prefixed and suffixed with #. grep returns exit code 0 if that list contains the value of psql_USER wrapped in # characters, else 1. The then branch of the if is taken only when the exit code of the pipeline is 0, that is, when the user exists in the database.
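The matching step can be tried in isolation. In this sketch, printf stands in for the psql output, and user_in_list is a hypothetical helper name. It also adds the -x and -F options to grep, a slightly stricter variation on the command above, so that the name is compared literally against whole lines.

```shell
#!/bin/bash
# user_in_list is a hypothetical helper: it reads '#name#' lines on stdin
# and succeeds only if one line is exactly '#<user>#'.
user_in_list() {
    # -F: treat the pattern as a fixed string, -x: the whole line must match,
    # -q: print nothing, report the result through the exit code only
    grep -qxF "#$1#"
}

# printf stands in for:
#   psql -Atq -c "SELECT '#' || usename || '#' FROM pg_user"
if printf '#postgres#\n#my#\n' | user_in_list "my"; then
    echo yes
fi
```

Because the delimiters are part of the match, a partial name such as "post" does not match "#postgres#".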
Here is the 'smem' command I run on a Red Hat/CentOS Linux system. I would like the output printed without the rows whose swap size is zero, but I still want the column headings.
smem -kt -c "pid user command swap"
PID User Command Swap
7894 root /sbin/agetty --noclear tty1 0
9666 root ./nimbus /opt/nimsoft 0
7850 root /sbin/auditd 236.0K
7885 root /usr/sbin/irqbalance --fore 0
11205 root nimbus(hdb) 0
10701 root nimbus(spooler) 0
8446 trapsanalyzer1 /opt/traps/analyzerd/analyz 0
50316 apache /usr/sbin/httpd -DFOREGROUN 0
50310 apache /usr/sbin/httpd -DFOREGROUN 0
3971 root /usr/sbin/lvmetad -f 36.0K
63988 root su - 0
7905 ntp /usr/sbin/ntpd -u ntp:ntp - 4.0K
7876 dbus /usr/bin/dbus-daemon --syst 44.0K
9672 root nimbus(controller) 0
7888 root /usr/lib/systemd/systemd-lo 0
63990 root -bash 0
59978 postfix pickup -l -t unix -u 0
3977 root /usr/lib/systemd/systemd-ud 736.0K
9016 postfix qmgr -l -t unix -u 0
50303 root /usr/sbin/httpd -DFOREGROUN 0
3941 root /usr/lib/systemd/systemd-jo 52.0K
8199 root //usr/lib/vmware-caf/pme/bi 0
8598 daemon /opt/quest/sbin/.vasd -p /v 0
8131 root /usr/sbin/vmtoolsd 0
7881 root /usr/sbin/NetworkManager -- 8.0K
8364 root /opt/puppetlabs/puppet/bin/ 0
8616 daemon /opt/quest/sbin/.vasd -p /v 0
23290 root /usr/sbin/rsyslogd -n 3.8M
64091 root python /bin/smem -kt -c pid 0
7887 polkitd /usr/lib/polkit-1/polkitd - 0
8363 root /usr/bin/python2 -Es /usr/s 0
53606 root /usr/share/metricbeat/bin/m 0
24631 nagios /usr/local/ncpa/ncpa_passiv 0
24582 nagios /usr/local/ncpa/ncpa_listen 0
7886 root /opt/traps/bin/authorized 76.0K
7872 root /opt/traps/bin/pmd 12.0K
8374 root /opt/puppetlabs/puppet/bin/ 0
7883 root /opt/traps/bin/trapsd 64.0K
----------------------------------------------------
54 10 5.1M
Like this?:
$ awk '$NF!=0' file
PID User Command Swap
7850 root /sbin/auditd 236.0K
...
7883 root /opt/traps/bin/trapsd 64.0K
----------------------------------------------------
54 10 5.1M
Instead of the form awk ... file, you would probably pipe the output directly: smem ... | awk '$NF!=0'.
Could you please try the following? As an extra precaution, it strips any trailing whitespace from each line (in case there is any) before testing the last field.
smem -kt -c "pid user command swap" | awk 'FNR==1{print;next} {sub(/[[:space:]]+$/,"")} $NF==0{next} 1'
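The filter can be checked without smem by feeding it a few of the lines shown earlier. This is only a sketch: drop_zero_swap is a hypothetical helper name, and the here-document replaces the real command output.

```shell
#!/bin/bash
# drop_zero_swap is a hypothetical helper name for the filter.
drop_zero_swap() {
    # Keep the header (line 1) unconditionally; keep any other line whose
    # last field is not the number 0.
    awk 'NR == 1 || $NF != 0'
}

# A few lines copied from the smem output, in place of the real command.
drop_zero_swap <<'EOF'
PID User Command Swap
7894 root /sbin/agetty --noclear tty1 0
7850 root /sbin/auditd 236.0K
3971 root /usr/sbin/lvmetad -f 36.0K
63988 root su - 0
EOF
```

Values such as 236.0K are not numeric, so awk compares them to 0 as strings and keeps them; only a bare 0 in the last field is dropped.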
I have a simple bash script as follows that is part of a docker image.
test.sh,
#!/bin/bash
set -e
logit() {
log_date=`date +"%F %T"`
echo "[$log_date][INFO] $1"
}
waitForServerToStart() {
while true; do
logit "Testing .... 1"
netstat -anpt
logit "Testing .... 2"
netstat -anpt | grep tcp
logit "Testing .... 3"
sleep 5
logit "Testing .... 4"
done
}
waitForServerToStart
run.sh,
#!/bin/sh
/test.sh &
# Run forever
while true; do sleep 5; done
Dockerfile,
FROM openjdk:8u191-jre-alpine3.9
COPY files/run.sh /
COPY files/test.sh /
CMD ["/run.sh"]
If I run this container I get only the following output, which leads me to believe that grep and the pipe are somehow getting blocked.
[2019-03-06 11:10:45][INFO] Testing .... 1
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 172.17.0.2:58278 xxx.xxx.xx.xx:443 FIN_WAIT2 -
[2019-03-06 11:10:45][INFO] Testing .... 2
Can someone please shed some light on this?
It works fine if I comment out netstat -anpt | grep tcp: I then see the subsequent log lines, and the loop continues.
[2019-03-06 11:25:36][INFO] Testing .... 3
[2019-03-06 11:25:41][INFO] Testing .... 4
This one has me puzzled! But I have a solution for you:
Use awk instead of grep
In test.sh use this instead:
netstat -anpt | awk /tcp/
So that the file looks like this:
#!/bin/bash
set -e
logit() {
log_date=`date +"%F %T"`
echo "[$log_date][INFO] $1"
}
waitForServerToStart() {
while true; do
logit "Testing .... 1"
netstat -anpt
logit "Testing .... 2"
netstat -anpt | awk /tcp/
logit "Testing .... 3"
sleep 5
logit "Testing .... 4"
done
}
waitForServerToStart
For a reason that I cannot explain, grep does not return when reading from the pipe when invoked from the script. I created your container locally, ran it and entered it, and the command netstat -anpt | grep tcp runs just fine and exits. If you replace it with netstat -anpt | cat in your test.sh script, then it also passes just fine.
I looked all over the place for someone with an identical issue with grep in a container from the distro and version you are using, but came up empty-handed.
I believe that it may have to do with grep waiting for an EOF that never arrives, but I am not sure.
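One other explanation fits the symptoms, though nothing in the thread confirms it, so treat it as an assumption: test.sh runs under set -e, and grep exits with status 1 when no line matches. If no tcp line is present at that moment, the pipeline fails and the script stops silently, whereas awk and cat both exit with 0. The sketch below reproduces that behaviour with printf in place of netstat.

```shell
#!/bin/bash
# Child shell 1: errexit is on and grep matches nothing, so the pipeline
# returns 1 and the script ends before "after" is printed.
out_plain=$(bash -c 'set -e; echo before; printf "udp\n" | grep tcp; echo after')

# Child shell 2: "|| true" keeps a non-matching grep from ending the script.
out_guarded=$(bash -c 'set -e; echo before; printf "udp\n" | grep tcp || true; echo after')

printf 'plain:\n%s\n' "$out_plain"
printf 'guarded:\n%s\n' "$out_guarded"
```

If this is what happens in the container, writing netstat -anpt | grep tcp || true in test.sh would keep the loop running while still using grep.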
I have a bash script called sr_run_batch.sh which does super-resolution of images. Now I want to test it on several servers in parallel, i.e. first on 1 virtual machine at a time, then on 2 virtual machines at the same time, then 3, and then 4.
I tried writing into it the commands
for host in $(cat hosts.txt); do ssh "$host" "$command" >"output.$host"; done
ssh-keygen && for host in $(cat hosts.txt); do ssh-copy-id $host; done
where the file hosts.txt contains the list of servers in the format username@ip, but when I run this, it gives me a substitution error.
Hence, I tried pssh (parallel-ssh)
pssh -h hosts-file -l username -P $command
command being ./sr_run_batch.sh
but it didn't run, so I modified this to
pssh -h hosts-file -l ben -P -I<./sr_run_batch.sh
But, for some unknown reason, it just prints the echo statements in the code.
Here is the code:
NList=(5)
VList=(1)
FList=("input/flower1.jpg" "input/flower2.jpg" "input/flower3.jpg" "input/flower4.jpg")
IList=("320X240" "640X480" "1280X960" "1920X1200")
SList=(2 3)
for VM in ${VList[@]}; do
for ((index=0; index < ${#FList};)) do
file=$FList[$index]
image_size=$IList[$index]
width=`echo $image_size|cut -d "X" -f1`
height=`echo $image_size|cut -d "X" -f2`
for scale_factor in ${SList[@]}; do
for users in ${NList[@]}; do
echo "V: $VM, " "F: $file, " "S: $scale_factor, " "I: $width $height , " "N: $users"
for i in `seq 1 $users` ; do
./sr_run_once.sh $file $width $height $scale_factor &
done
wait
done # for users
done # for scale_factor
done # for index
done # for VM
exit 0
Have you also tried using pssh with a simple bash script, to see whether the communication is set up OK?
$ pssh -h hosts.txt -A -l ben -P -I<./uptime.sh
Warning: do not enter your password if anyone else has superuser
privileges or access to your account.
Password:
10.0.0.67: 11:06:50 up 28 min, 2 users, load average: 0.00, 0.00, 0.00
[1] 11:06:50 [SUCCESS] 10.0.0.67
10.0.0.218: 11:06:50 up 24 min, 2 users, load average: 0.00, 0.05, 0.20
[2] 11:06:50 [SUCCESS] 10.0.0.218
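For comparison with plain ssh, the host loop from the question could be sketched as below, reading hosts.txt line by line and starting every host in the background. The names run_on_hosts and RUNNER are hypothetical: RUNNER is a hook introduced here so the loop can be dry-run with echo instead of ssh, and the third argument is an assumed output directory.

```shell
#!/bin/bash
# RUNNER is a hypothetical hook: "ssh" for real use, "echo" for a dry run.
RUNNER=${RUNNER:-ssh}

# run_on_hosts HOSTS_FILE COMMAND OUTDIR  (hypothetical helper)
run_on_hosts() {
    local hosts_file="$1" command="$2" outdir="$3" host
    # "|| [ -n "$host" ]" also handles a last line without a trailing newline.
    while IFS= read -r host || [ -n "$host" ]; do
        [ -n "$host" ] || continue            # skip blank lines
        # Start each host in the background so they run in parallel;
        # stdin from /dev/null keeps ssh from consuming the host list.
        "$RUNNER" "$host" "$command" >"$outdir/output.$host" </dev/null &
    done <"$hosts_file"
    wait                                      # block until every host is done
}
```

A dry run such as RUNNER=echo run_on_hosts hosts.txt ./sr_run_batch.sh . writes the command line that would have been executed into each output.HOST file, which makes it easy to check the host list before connecting anywhere.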
I am trying to get my Capistrano deploy script working, but it is not doing the symlinking as it is configured to do as shown below.
set :linked_files, %w{config/database.yml}
set :linked_dirs, %w{log tmp vendor/bundle public/system}
When it runs the related command, I get the following:
WARN [SKIPPING] No Matching Host for /usr/bin/env [ -f /path/to/shared/config/database.yml ]
If I run this command on the server, either through ssh or through logging onto the server and running the command, I get no response from the command.
user: ~
$ [ -f /path/to/shared/config/database.yml ]
user: ~
$
The file does exist in the specified location and has permissions.
user: ~
$ ll /path/to/shared/config/
total 4.0K
drwxrwxr-x 2 user group 33 Nov 30 10:58 .
drwxrwxr-x 7 user group 89 Nov 30 10:58 ..
-rwxrwxr-x 1 user group 805 Nov 30 10:58 database.yml
user: ~
Shouldn't this return a true or a false, instead of nothing? Is there a configuration I may have changed that suppresses the output? I get no response at all whether the file exists or not.
In response to the actual question you ask: test (which [ is another name for) does not write anything to stdout. It returns an exit code.
user: ~
$ [ -f /path/to/shared/config/database.yml ] # if the file exists
user: ~
$ echo $?
0
user: ~
$ [ -f /path/to/shared/config/database.yml ] # if the file does not exist
user: ~
$ echo $?
1
test -f /path/to/file (or [ -f /path/to/file ]) yields an exit code of 0 if the file exists or 1 if it does not. If you want to check that a file is there and echo the path to it, try:
[ -f /path/to/file ] && echo "/path/to/file"
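In a script, the exit code is normally consumed by if rather than read from $?. A minimal sketch, using a temporary file instead of the real database.yml; file_status is a hypothetical helper name:

```shell
#!/bin/bash
# file_status is a hypothetical helper that turns the exit code of
# [ -f ... ] into a printed word.
file_status() {
    if [ -f "$1" ]; then
        echo "exists"
    else
        echo "missing"
    fi
}

tmp=$(mktemp)          # create a real, empty regular file
file_status "$tmp"     # the file is there
rm -f "$tmp"
file_status "$tmp"     # the file is gone
```

The same pattern applies to any command: if runs the then branch when the exit code is 0 and the else branch otherwise.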