Parsing named parameter values using AWK - bash

I am trying to come up with a script/alias that would quickly give me the list of processes run by an application. The parameters used to start the process are named parameters, not positional ones.
I need to extract the values of the -u, -s and -svn parameters.
$ ps -ef | grep pmdtm | grep -v grep
infa_adm 24581 31146 0 Oct24 ? 00:09:28 pmdtm -PR -gmh dhvifoapp03
-gmp 6015 -guid ddcbd7ab-2ed0-4696-aea3-01573968b1bc -rst 300
-s Address_Validator:wf_AddressValidator.s_m_AddressValidatorS
-dn Session task instance [s_m_AddressValidatorS] -c pmserver.cfg
-run 68_4262_654212_4_0_0_0_3263_77_2_2018_10_24___13_32_47_182
-u Administrator -uns Native -crd rlVuBI4mUFi1V/7/jyrD6f9dMurwD9Yxddio6KDy/
zwlzM5rRDMeV766VoSBqb3Snjlvu849sTXlWpJ8WjzPomNOF4U87H7x5oy
JKbtxVg/vjR6gPwWwVSdEHvPjlpwSKPcuDx6glCbB1ksrvKCAzRsW1BTlP
GOfQbnd1ptnkO83iY14k4LUpJlx8+upBhwSxk9a0TPD44byO+/4Qhe7Mg==
-svn Int01_dev -dmn Domain_dev
-nma https://DHVIFOAPP03.RENTERS-CHOICE-INC.COM:6005
-nmc w/Yt3IIMbmBQf+NnN1CAKmq5ab01nxZTJEA/YCf96Pb5zT9K9VFBO4+Nvqt
FuF8gzvqf/qHbw2tcXk4DnNP4m5vJvuEhxe9vQCN8pmpJytiZKV9Np7rBbapVzra
9TEOQVm9webRg8JZB70MQryVjQlGkJDpRs9cdOCXAu1aFhNE6LNF+
c5qhLdOz/vWCI3I2 -sid 3
-hhn dhvifoapp04.renters-choice-inc.com -hpn 15555
-hto 60 -rac 0 -SSL
-rsn RAC_dev ServiceResilienceTimeout=300
I can extract a single field using the following command, but how do I get multiple values?
$ echo "List of running jobs ==> "; ps -ef | grep pmdtm | grep -v grep | awk -F"-s " "{print \$2}"|awk -F" " "{print \$1}"
List of running jobs ==>
Address_Validator:wf_AddressValidator.s_m_AddressValidatorS
Desired output:
List of running jobs ==>
Address_Validator:wf_AddressValidator.s_m_AddressValidatorS | Administrator | Int01_dev

You can do multiple "OR" expressions in grep with something like this:
grep -E "^-s|^-u|^-svn" < file.txt
The above will only print out the lines that start with -s, -u or -svn. Based on that, the following command does exactly what you want:
echo "List of running jobs ==> " $(ps -ef | grep pmdtm | grep -v grep | grep -E "^-s|^-u|^-svn" | awk '{ print $2 " |" }')
Running the above command against the output in your post, I get this:
List of running jobs ==> Address_Validator:wf_AddressValidator.s_m_AddressValidatorS | Administrator | Int01_dev |
You get a trailing | at the end, but you can trim that out separately.
Updated:
After your comment below, I updated the command to do exactly what you need.
echo -e "List of running jobs ==> \n " $(ps -ef | grep pmdtm | grep -v grep | awk 'BEGIN { RS = " -"} $1 ~ /^s$|^u$|^svn$/ { print $2,"|"}')
It does assume a couple of things:
All the named parameters have non-empty values; otherwise, it will simply output a blank.
All the named parameters start with a -, immediately followed by the parameter name.
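If the trailing | matters, a single awk pass can store each value and join the values itself. A minimal sketch, reusing the same RS = " -" record-splitting trick from the updated command, with a printf line standing in for the real `ps -ef | grep pmdtm | grep -v grep` output:

```shell
# Each " -" separated record starts with the parameter name in $1 and
# its value in $2; store the three wanted values and print them once,
# pipe-separated, with no trailing "|".
printf '%s\n' 'infa_adm 24581 31146 0 Oct24 ? 00:09:28 pmdtm -PR -s Address_Validator:wf_AddressValidator.s_m_AddressValidatorS -u Administrator -uns Native -svn Int01_dev -dmn Domain_dev' |
awk 'BEGIN { RS = " -" }
     $1 == "s"   { s = $2 }
     $1 == "u"   { u = $2 }
     $1 == "svn" { svn = $2 }
     END { print s " | " u " | " svn }'
```

Swapping the printf for the real ps pipeline should give the desired line without the trailing pipe. Like the original command, the multi-character RS relies on gawk/mawk behavior, not strict POSIX awk.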

Related

Adjustment to a grep / sed output

I'm trying to edit grep/sed output; currently it is as follows:
grep -Pzo '(?s)INTO .?db.? VALUES[^(]\K[^;]*' aniko.sql | grep -Pao '\(([^,]*,){2}\K[^,]*'
'yscr_bbYcqN'
'yscr_bbS4kf'
'yscr_bbhrSZ'
'yscr_bbBl0C'
'yscr_bbrsKX'
I then want to add a test_ prefix to them all; however, I can only get it to amend one of them:
#!/bin/sh
read -p "enter the cPanel username: " cpuser
cd "/home/$cpuser/public_html"
dbusers=($(grep -Pzo '(?s)INTO .?db.? VALUES[^(]\K[^;]*' aniko.sql | grep -Pao '\(([^,]*,){2}\K[^,]*' | grep -Pzo [^\']));
for dbuser in $dbusers; do sed -i "s/$dbuser/${cpuser}\_${dbuser}/g" aniko.sql; done;
The result of this only updates one of the lines (all should show the sensationalhosti_ prefix):
sensationalhosti_yscr_bbYcqN
yscr_bbS4kf
yscr_bbhrSZ
yscr_bbBl0C
yscr_bbrsKX
The format of the file I am changing is as follows:
INSERT INTO `db` VALUES ('localhost','blog','sensationalhosti_yscr_bbYcqN','Y','Y','Y','Y','Y','N','N','N','N','N'),('localhost','blog1','yscr_bbS4kf','Y','Y','Y','Y','Y','N','N','N','N','N'),('localhost','blog','yscr_bbhrSZ','Y','Y','Y','Y','Y','N','N','N','N','N'),('localhost','blog','yscr_bbBl0C','Y','Y','Y','Y','Y','N','N','N','N','N'),('localhost','blog','yscr_bbrsKX','Y','Y','Y','Y','Y','N','N','N','N','N');
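The single-line result is consistent with a known sh/bash pitfall: `$dbusers` expands to only the first element of an array (and `#!/bin/sh` may not support arrays at all). A minimal sketch of the likely fix, a bash shebang plus `"${dbusers[@]}"`, using the usernames from the grep output above:

```shell
#!/bin/bash
# "${dbusers[@]}" expands to every array element; bare $dbusers would
# expand to only the first one, which matches the symptom above.
cpuser=sensationalhosti
dbusers=(yscr_bbYcqN yscr_bbS4kf yscr_bbhrSZ yscr_bbBl0C yscr_bbrsKX)
for dbuser in "${dbusers[@]}"; do
  echo "${cpuser}_${dbuser}"
done
```

With the loop fixed, the same sed -i substitution would run once per user instead of once in total.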

How to save result in separate column in excel from shell script?

I am saving the execution result to an Excel sheet. The result shows up in new rows like below:
I have used the below command :
$ bash eg.sh Behatscripts.txt | egrep -w 'Executing the|scenario' >> output.xls
I want to display the result like below:
  | A                                     | B                   | C |
1 | Executing the script:cap_dutch_home   | 1 scenario(1passed) |   |
2 | Executing the script:cap_english_home | 1 scenario(1passed) |   |
One more thing: while executing, it creates output.xls as a separate file instead of using the one that already exists.
Thanks for any suggestions.
You can use this, with awk:
bash eg.sh Behatscripts.txt | egrep -w 'Executing the|scenario' | awk 'BEGIN {print "Column_A\tColumn_B"}NR%2{printf "%s \t",$0;next;}1' >> output.xls
Without egrep:
bash eg.sh Behatscripts.txt | awk '/Executing the|scenario/' | awk 'BEGIN {print "Column_A\tColumn_B"}NR%2{printf "%s \t",$0;next;}1' >> output.xls
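To see the NR%2 pairing in isolation, here is a minimal sketch where two sample lines stand in for the `bash eg.sh ... | egrep` output: odd-numbered lines end in a tab instead of a newline, so each pair of input lines becomes one tab-separated row.

```shell
# NR%2 is true on odd lines: print them with a trailing tab and skip
# the default action; even lines fall through to the bare "1" rule,
# which prints them with the usual newline, completing the row.
printf 'Executing the script:cap_dutch_home\n1 scenario(1passed)\n' |
awk 'BEGIN { print "Column_A\tColumn_B" }
     NR%2  { printf "%s\t", $0; next }
     1'
```

Tab-separated output like this opens in Excel as two columns, one pair of log lines per row.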

Oneline file-monitoring

I have a logfile continuously filling with stuff.
I wish to monitor this file, grep for a specific line and then extract and use parts of that line in a curl command.
I had a look at How to grep and execute a command (for every match)
This would work in a script, but I wonder if it is possible to achieve this with the one-liner below using xargs or something else?
Example:
Tue May 01|23:59:11.012|I|22|Event to process : [imsi=242010800195809, eventId = 242010800195809112112, msisdn=4798818181, inbound=false, homeMCC=242, homeMNC=01, visitedMCC=238, visitedMNC=01, timestamp=Tue May 12 11:21:12 CEST 2015,hlr=null,vlr=4540150021, msc=4540150021 eventtype=S, currentMCC=null, currentMNC=null teleSvcInfo=null camelPhases=null serviceKey=null gprsenabled= false APNlist: null SGSN: null]|com.uws.wsms2.EventProcessor|processEvent|139
Extract the fields I want and semi-colon separate them:
tail -f file.log | grep "Event to process" | awk -F'=' '{print $2";"$4";"$12}' | tr -cd '[[:digit:].\n.;]'
Curl command, e.g. something like:
http://user:pass@www.some-url.com/services/myservice?msisdn=...&imsi=...&vlr=...
Thanks!
Try this:
tail -f file.log | grep "Event to process" | awk -F'=' '{print $2" "$4" "$12; }' | tr -cd '[[:digit:].\n. ]' |while read msisdn imsi vlr ; do curl "http://user:pass@www.some-url.com/services/myservice?msisdn=$msisdn&imsi=$imsi&vlr=$vlr" ; done
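One caveat with counting '='-separated fields ($2, $4, $12): the positions silently shift if the application adds or reorders a key. A minimal sketch of a more position-independent variant, collecting key=value tokens into an array and picking the wanted keys by name (the log line is shortened here for illustration):

```shell
# Walk the whitespace-separated tokens; any token of the form key=value
# is stored in the array v, after stripping the surrounding [ ] , noise.
logline='Tue May 01|23:59:11.012|I|22|Event to process : [imsi=242010800195809, msisdn=4798818181, vlr=4540150021]|com.uws.wsms2.EventProcessor'
printf '%s\n' "$logline" |
awk '{
  for (i = 1; i <= NF; i++)
    if (split($i, kv, "=") == 2) {
      sub(/^\[/, "", kv[1])               # leading [ on the first key
      sub(/\].*/, "", kv[2])              # trailing ]|... on the last value
      sub(/,$/,   "", kv[2])              # trailing comma on the others
      v[kv[1]] = kv[2]
    }
  printf "%s;%s;%s\n", v["msisdn"], v["imsi"], v["vlr"]
}'
```

In the real pipeline, the while read loop would consume these three fields exactly as before.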

How to get PID with only app name

When I run ps aux on my computer, I get output like this:
myname 234 0.0 0.9 828060 76584 ?? S 9:10am 0:27.01 /RandomApp.app
If I pipe the output to grep, I can look for the name of a particular app
ps aux | grep "/RandomApp.app/"
Is there any way from there to get the PID (the value in the second column) from the result of the grep?
ps aux | awk '/RandomApp.app/ {print $2}'
With GNU grep:
ps ax -o pid,comm | grep "/RandomApp.app" | grep -o '^[^ ]*'
Or take a look at pgrep:
pgrep bash
Output (e.g.):
3006
3440
10714
16524
16603
16863
18921
23945
You can use the match operator. If RandomApp.app is in the 11th column, this code will print the second column.
$11 ~ /RandomApp.app/ { print $2 }
Put the above in a file, aboveawkfile.awk, and run it with the -f option:
ps aux | awk -f aboveawkfile.awk
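Checking the rule against the sample line from the question, as a minimal sketch with printf standing in for ps aux:

```shell
# $11 is the command column in ps aux output; when it matches the
# pattern, print $2, the PID column.
printf '%s\n' 'myname 234 0.0 0.9 828060 76584 ?? S 9:10am 0:27.01 /RandomApp.app' |
awk '$11 ~ /RandomApp.app/ { print $2 }'
```

Anchoring the match to $11 also avoids the classic self-match problem, since the awk process's own command line contains the pattern only as part of a longer argument string.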

Error calling system() within awk

I'm trying to execute a system command to find out how many unique references a csv file has in its first seven characters as part of a larger awk script that processes the same csv file. There are duplicate entries and I don't want awk to parse the whole file twice so I'm avoiding NR. The gist of this part of the script is:
#!/bin/bash
awk '
{
#do some stuff, then when finished, count the number of unique references
productFile="BusinessObjects.csv";
systemCall = sprintf( "cat %s | cut -c 1-7 | sort | uniq | wc -l", $productFile );
productCount=`system( systemCall )`-1; #subtract 1 to remove column label row
}' < BusinessObjects.csv
And the interpreter doesn't like it:
awk: cmd. line:19: ^ syntax error
./awkscript.sh: line 38: syntax error near unexpected token '('
./awkscript.sh: line 38: systemCall = sprintf( "cat %s | cut -c 1-7 | sort | uniq | wc -l", $productFile );
If I hard-code the system command
productCount=`system( "cat BusinessObjects.csv | cut -c 1-7 | sort | uniq | wc -l" )`-1;
I get:
./awkscript.sh: command substitution: line 39: syntax error near unexpected token '"cat BusinessObjects.csv | cut -c 1-7 | sort | uniq | wc -l"'
./awkscript.sh: command substitution: line 39: 'system( "cat BusinessObjects.csv | cut -c 1-7 | sort | uniq | wc -l" )'
Technically, I could do this outside of awk at the start of the shell script, store the result in a system variable, and then pass it to awk using -v, but it's not great for the readability of the awk script (it's a few hundred lines long). Do I have a space or quotes in the wrong place? I've tried fiddling, but I can't seem to present the call to system() in a way that the interpreter will accept. Finally, is there a more sensible way to do this?
Edit: the csv file is indeed semicolon-delimited, so it's best to cut using the delimiter rather than the number of chars (thanks!).
ProductRef;Data1;Data2;etc
1234567;etc;etc;etc
Edit 2:
I'm trying to parse a csv file whose first column is full of N unique product references, and create a series of associated HTML pages that include a "Page n of N" information field. It's (painfully obviously) the first time I've used awk, but it seemed like an appropriate tool for parsing csv files. I'm hence trying to count and return the number of unique references. At the shell,
cut -d\; -f1 BusinessObjects.csv | sort | uniq | wc -l
works fine, but I can't get it working inside awk by doing
#!/bin/bash
if [ -n "$1" ]
then
productFile=$1
else
echo "Missing product file argument."
exit
fi
awk -v productFile=$productFile '
BEGIN {
FS=";";
productCount = 0;
("cut -d\"\;\" -f1 " productFile " | sort | uniq | wc -l") | getline productCount;
productCount -=1; #remove the column label row
}
{
print productCount;
}'
I get a syntax error on the cut code if I don't wrap the semicolon in \"\;\", and the script just hangs without printing anything when I do.
I don't think you can use backticks in awk; that's shell command-substitution syntax, not awk.
productCount=`system( systemCall )`-1; #subtract 1 to remove column label row
You can read your output by not using system and running your command directly, and using getline instead:
systemCall | getline productCount
productCount -= 1
Or more completely
productFile = "BusinessObjects.csv"
systemCall = "cut -c 1-7 " productFile " | sort | uniq | wc -l"
systemCall | getline productCount
productCount -= 1
No need to use sprintf and include cat.
Assigning strings to variables is also optional. You can just have "xyz" | getline ....
sort | uniq can just be sort -u if supported.
Quoting may be necessary if the filename has spaces or characters that may confuse the command.
getline may alter global variables differently from expected. See https://www.gnu.org/software/gawk/manual/html_node/Getline.html.
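Putting the notes above together, a minimal self-contained sketch, with a throwaway sample file standing in for BusinessObjects.csv. Everything runs in the BEGIN block, so awk never waits on standard input; that wait is the likely cause of the hang in the second attempt, since no input file is passed to its main rule.

```shell
# Create a small sample file (hypothetical data, header plus two
# distinct product references).
printf 'ProductRef;Data1\n1234567;a\n1234567;b\n7654321;c\n' > /tmp/products.csv

awk -v productFile=/tmp/products.csv '
BEGIN {
  # Single-quoted program, so \";\" reaches the shell as -d";"
  cmd = "cut -d\";\" -f1 " productFile " | sort -u | wc -l"
  cmd | getline productCount
  close(cmd)          # close the pipe so the command can be reused
  productCount -= 1   # subtract 1 to drop the ProductRef header row
  print productCount
}'
```

For the sample file this prints 2, the number of unique non-header references.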
Could something like this be an option?
$ cat productCount.sh
#!/bin/bash
if [ -n "$1" ]
then
productCount=`cat $1 | cut -c 1-7 | sort | uniq | wc -l`
echo $productCount
else
echo "please supply a filename as parameter"
fi
$ ./productCount.sh BusinessObjects.csv
9

Resources