Awk with Printf usage - shell

I am trying to use awk to filter the output of an autorep command.
Output of the autorep command:
/tmp $ autorep -j Test_Job -q
/* ----------------- Test_Job ----------------- */
insert_job: Test_Job job_type: CMD
box_name: Test_box
command: echo
machine: machine_name
owner: ownername
permission: gx,ge,wx
date_conditions: 0
condition: s(testjob1)
description: "echo"
std_out_file: "/tmp/test_job.out"
std_err_file: "/tmp/test_job.err"
alarm_if_fail: 1
alarm_if_terminated: 1
/tmp $ autorep -j Test_Job2 -q
/* ----------------- Test_Job2 ----------------- */
insert_job: Test_Job2 job_type: CMD
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_mins: 9,19,29,39,49,59
run_window: "06:00-19:00"
description: "test discription"
std_out_file: "/tmp/Test_Job2.out"
std_err_file: "/tmp/Test_Job2.err"
alarm_if_fail: 1
alarm_if_terminated: 1
I have the below shell script to filter the data:
#!/bin/bash
TXT=/tmp/test1.txt
CSV=/tmp/test1.csv
echo "Enter the JOB_NAME or %SEARCHSTRING%"
while read -r i;
do
awk '$1 == "insert_job:" {printf "%s %s ", $2, $4}; $1 == "condition:" {printf "%s ", $2}; $1 == "days_of_week:" {printf "%s ", $2}; $1 == "date_conditions:" {printf "%s\n ", $2}' < <(autorep -j $i -q) >$TXT
echo
break
done
if [ -s $TXT ]
then
(echo "job_name,job_type,Date_Conditions,Days_of_week/Conditions"; cat test1.txt) | sed 's/ \+/,/g' > $CSV
else
echo "Please check the %SEARCHSTRING% or JOB_NAME"
fi
The output I am looking for:
Test_Job CMD 0 s(testjob1)
Test_Job2 CMD 1 mo,tu,we,th,fr 9,19,29,39,49,59 "06:00-19:00"
But the command is not working and I am getting the data like below:
Test_Job CMD 00s(testjob1) Test_Job2 CMD 1 mo,tu,we,th,fr 9,19,29,39,49,59 "06:00-19:00"
Can someone help me out with getting the correct output?
EDIT:
Let me explain what I am trying to do. I am using the above command and giving a keyword such as %Test% (which fetches all the jobs with Test in the name), so I will basically be running this query on all the jobs matching that keyword and getting a list filtered according to my query. I am getting the data, but it is all on one line rather than each job's data on its own line.
EDIT 2:
So as you can see, if date_conditions: is 0 then the job may or may not have condition: in it, and if a job has date_conditions: 1 then it will have 'days_of_week:' and may or may not have other fields like 'run_window:'.
So is there a way I can modify the script to print out, say, 'N/A' if some field is missing? And also to get each job's data on its own line?

You don't say this in your question, but I'm assuming there are multiple jobs output by autorep. You need to keep track of the transition from one job to another in the output. You also need to anchor your regexes to prevent false matches.
awk '/^insert_job/ {if (flag) printf "\n"; printf "%s %s", $2, $4; flag = 1} /^date_conditions/ {printf " %s", $2} /^condition/ {printf " %s", $2} END {if (flag) printf "\n"}' < <(autorep -j Test_Job -q)
This outputs a newline before each "insert_job" (and thus each job) after the first so each line of output is a different job.
Multiple lines for readability:
awk '
/^insert_job/      { if (flag) printf "\n"
                     printf "%s %s", $2, $4
                     flag = 1 }
/^date_conditions/ { printf " %s", $2 }
/^condition/       { printf " %s", $2 }
END                { if (flag) printf "\n" }
' < <(autorep -j Test_Job -q)
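To address the EDIT 2 part (printing N/A for missing fields), one option is to collect each job's fields and only print a line when the next job starts. This is just a sketch of the idea; the here-document below stands in for the autorep output, and in the real script the input would come from < <(autorep -j "$i" -q):

```shell
awk '
    # Print the completed job, substituting N/A for any field never seen.
    function dump() {
        if (job != "")
            print job, type, dc, (cond == "" ? "N/A" : cond), (dow == "" ? "N/A" : dow)
    }
    $1 == "insert_job:"      { dump(); job = $2; type = $4; dc = cond = dow = "" }
    $1 == "date_conditions:" { dc = $2 }
    $1 == "condition:"       { cond = $2 }
    $1 == "days_of_week:"    { dow = $2 }
    END { dump() }
' <<'EOF'
insert_job: Test_Job job_type: CMD
date_conditions: 0
condition: s(testjob1)
insert_job: Test_Job2 job_type: CMD
date_conditions: 1
days_of_week: mo,tu,we,th,fr
EOF
```

Each job now appears on its own line, with N/A wherever a field (condition: or days_of_week: here) was absent.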

Search keywords in master CSV; if a keyword exists, update the input CSV's 2nd column with true or false

The input CSV, new_param.csv, has values like:
ID
Identity
as-uid
cp_cus_id
evs
k_n
master.csv has values like:
A, xyz, id, abc
n, xyz, as-uid, abc, B, xyz, ne, abc
q, xyz, id evs, abc
3, xyz, k_n, abc, C, xyz, ad, abc
1, xyz, zd, abc
z, xyz, ID, abc
Required output, an updated new_param.csv with true or false in the 2nd column:
ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true
I tried the below code but got no output:
#!/bin/bash
declare -a keywords=(`cat new_param.csv`)
length=${#keywords[@]}
for (( j=0; j<length; j++ ));
do
a= LC_ALL=C awk -v kw="${keywords[$j]}" -F, '{for (i=1;i<=NF;i++) if ($i ~ kw) {print i}}' master.csv
b=0
if [ $a -gt $b ]
then
echo true $2 >> new_param.csv
else
echo false $2 >> new_param.csv
fi
done
Please, someone help!
I tried the above code but it is not helping me; I am getting errors like:
test.sh: line 29: [: -gt: unary operator expected
test.sh: line 33: -f2: command not found
A single awk pass can do the whole update, rewriting new_param.csv in place (the regular-expression RS requires GNU awk):
awk -v RS=', |\n' '
    NR == FNR { a[$0] = 1; next }   # first file: remember every field of master.csv
    {                               # second file: one key per record
        gsub(/,.*/, "")             # drop any previous ,true/,false suffix
        b = b $0 (a[$0] ? ",true" : ",false") "\n"
    }
    END { if (FILENAME == "new_param.csv") printf "%s", b > FILENAME }
' master.csv new_param.csv
Try this Shellcheck-clean pure Bash code:
#! /bin/bash -p
outputs=()
while read -r kw; do
if grep -q -E "(^|[[:space:],])$kw([[:space:],]|\$)" master.csv; then
outputs+=( "$kw,true" )
else
outputs+=( "$kw,false" )
fi
done <new_param.csv
printf '%s\n' "${outputs[@]}" >new_param.csv
You may need to tweak the regular expression used with grep -E depending on what exactly you want to count as a match.
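For example, the difference between a plain substring match and grep's -w word match can be seen on a single sample line (demo.csv is just a throwaway stand-in file for this illustration):

```shell
printf '%s\n' 'n, xyz, as-uid, abc' > demo.csv

# Substring match: the short key "id" also hits inside "as-uid".
if grep -q 'id' demo.csv; then echo "substring: match"; fi

# Word match (-w): "id" no longer matches inside "as-uid",
# but the full word "uid" (delimited by the hyphen) still would.
if grep -qw 'id' demo.csv; then echo "word: match"; else echo "word: no match"; fi
```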
Using grep to find exact word matches:
$ grep -owf new_param.csv master.csv | sort -u
ID
as-uid
evs
k_n
Then feed this to awk to match against new_param.csv entries:
awk '
BEGIN { OFS="," }
FNR==NR { a[$1]; next }
{ print $1, ($1 in a) ? "true" : "false" }
' <(grep -owf new_param.csv master.csv | sort -u) new_param.csv
This generates:
ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true
Once the results are confirmed as correct, the OP can redirect the output to a temporary file and move it back over new_param.csv (redirecting straight onto new_param.csv would truncate the input before awk reads it), e.g.:
awk 'BEGIN { OFS="," } FNR==NR ....' <(grep -owf ...) new_param.csv > tmp && mv tmp new_param.csv
Alternative awk option:
Use ", " (comma-space) as the field separator and concatenate the 3rd field of each record of master.csv onto the variable m. Then read each record from new_param.csv and use the index function to determine whether that record occurs anywhere in the string m. (Note that index does a plain substring search, so a short key such as id would also match inside a longer value like as-uid.)
awk -F", " '
FNR==NR{m=m $3}
FNR<NR{print $0 (index(m,$0) ? ",true" : ",false")}
' master.csv new_param.csv
Output:
ID,true
Identity,false
as-uid,true
cp_cus_id,false
evs,true
k_n,true

awk custom printf command generation

TL;DR - I have a variable ($temp) whose value looks like a printf format specifier, and I need to use it with awk's printf.
So by doing this:
awk '-v foo="$temp" {....{printf foo} else {print $_}}' tempfile1.txt > tmp && mv tmp tempfile1.txt
Bash should see this:
awk '{.....{printf "%-5s %-6s %...\n", $1, $2, $3....} else {print $_}}' tempfile1.txt > tmp && mv tmp tempfile1.txt
Sample Input:
col1 col2 col3
aourwyo5637[dfs] 67tyd 8746.0000
tomsd - 4
o938743[34* 0 834.92
.
.
.
Expected Output:
col1 col2 col3
aourwyo5637[dfs] 67tyd 8746.0000
tomsd - 4
o938743[34* 0646sggg 834.92
.
.
.
Long Version
I am new to scripting and, after over 5 hours of scouring the internet and doing what I believe is a patchwork of information, I have hit a brick wall.
Scenario:
So I have multiple random tables I need to open in a directory. I do not know anything about a given table except that I need to format all the data that is on line 4 and on all lines after line 14 of the file.
I need to build a custom printf command in awk on the fly so that the padding for each column is equal to a value (say 5 spaces) and the table looks pretty once I open it up.
This is what I have come up with so far:
awk '{
for (i=1;i<=NF;i++)
{
max_length=length($i);
if ( max_length > linesize[i] )
{
linesize[i]=max_length+5;
}
}}
END{
for (i = 1; i<=length(linesize); i++)
{
print linesize[i] >> "tempfile1.txt"
}
}' file1.txt
# remove all blank lines in tempfile1.txt
awk 'NF' tempfile1.txt > tmp && mv tmp tempfile1.txt
# Get number of entries in tempfile1.txt
n=`awk 'END {print NR}' tempfile1.txt`
# This for loop generates the pattern I need for the printf command
declare -a a
for((i=0;i<$n;i++))
do
a[i]=`awk -v i=$((i+1)) 'FNR == i {print}' tempfile1.txt`
temp+=%-${a[i]}s' '
temp2+='$'$((i+1))', '
#echo "${a[$i]}";
#echo "$sub"
done
temp='"'${temp::-2}'\n", '
# echo "$temp"
temp=$temp${temp2::-2}''
# echo "$temp"
awk <something here>
# Tried the one below and it gave an error
awk -v tem="$temp" '{printf {tem}}
So ideally what I would like is the awk command is to look like this by simply putting the bash variable temp in the awk command.
So by doing this:
awk '-v foo="$temp" {if(FNR >=14 || FNR == 4) {printf foo} else {print $_}}' tempfile1.txt > tmp && mv tmp tempfile1.txt
Bash should see this:
awk '{if(FNR >=14 || FNR == 4) {printf "%-5s %-6s %...\n", $1, $2, $3....} else {print $_}}' tempfile1.txt > tmp && mv tmp tempfile1.txt
It sounds like this MIGHT be what you want but it's still hard to tell from your question:
$ cat tst.awk
BEGIN { OFS=" " }
NR==FNR {
for (i=1;i<=NF;i++) {
width[i] = (width[i] > length($i) ? width[i] : length($i))
}
next
}
{
for (i=1;i<=NF;i++) {
printf "%-*s%s", width[i], $i, (i<NF?OFS:ORS)
}
}
$ awk -f tst.awk file file
col1 col2 col3
aourwyo5637[dfs] 67tyd 8746.0000
tomsd - 4
o938743[34* 0 834.92
I ran it against the sample input from your question after removing all the spurious .s.
# Tried the one below and it gave an error
awk -v tem="$temp" '{printf {tem}}
Several things are wrong with it:
- the closing ' at the end of the line is missing
- {tem} is wrong; just write tem
- printf's expr-list (the , $1, $2, ... arguments) is missing
- the \n is missing
Corrected:
awk -v tem="$temp" "{printf tem\"\n\", $temp2 0}"
or
awk "{printf \"$temp\n\", $temp2 0}"
(simpler).

Insert a date in a column using awk

I'm trying to format a date in a column of a csv.
The input is something like: 28 April 1966
And I'd like this output: 1966-04-28
which can be obtained with this code:
date -d "28 April 1966" +%F
So now I thought of mixing awk and this code to format the entire column, but I can't figure out how.
Edit :
Example of input : (separators "|" are in fact tabs)
1 | 28 April 1966
2 | null
3 | null
4 | 30 June 1987
Expected output :
1 | 1966-04-28
2 | null
3 | null
4 | 1987-06-30
A simple way is
awk -F '\\| ' -v OFS='| ' '{ cmd = "date -d \"" $3 "\" +%F 2> /dev/null"; cmd | getline $3; close(cmd) } 1' filename
That is:
{
cmd = "date -d \"" $3 "\" +%F 2> /dev/null" # build shell command
cmd | getline $3 # run, capture output
close(cmd) # close pipe
}
1 # print
This works because date prints nothing to its stdout if the date is invalid, so the getline reads nothing and $3 is left unchanged.
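That behaviour is easy to verify directly in the shell (this assumes GNU date, the same requirement as the answer itself):

```shell
date -d '28 April 1966' +%F 2>/dev/null    # valid date: prints 1966-04-28

if ! date -d 'null' +%F 2>/dev/null; then  # invalid date: no stdout, non-zero exit
    echo 'date produced no output'
fi
```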
Caveats to consider:
For very large files, this will spawn a lot of shells and processes in those shells (one each per line). This can become a noticeable performance drag.
Be wary of code injection. If the CSV file comes from an untrustworthy source, this approach is difficult to defend against an attacker, and you're probably better off going the long way around, parsing the date manually with gawk's mktime and strftime.
EDIT re: comment: To use tabs as delimiters, the command can be changed to
awk -F '\t' -v OFS='\t' '{ cmd = "date -d \"" $3 "\" +%F 2> /dev/null"; cmd | getline $3; close(cmd) } 1' filename
EDIT re: comment 2: If performance is a worry, as it appears to be, spawning processes for every line is not a good approach. In that case, you'll have to do the parsing manually. For example:
BEGIN {
OFS = FS
m["January" ] = 1
m["February" ] = 2
m["March" ] = 3
m["April" ] = 4
m["May" ] = 5
m["June" ] = 6
m["July" ] = 7
m["August" ] = 8
m["September"] = 9
m["October" ] = 10
m["November" ] = 11
m["December" ] = 12
}
$3 !~ /null/ {
split($3, a, " ")
$3 = sprintf("%04d-%02d-%02d", a[3], m[a[2]], a[1])
}
1
Put that in a file, say foo.awk, and run awk -F '\t' -f foo.awk filename.csv.
This should work with your given input
awk -F'\\|' -vOFS="|" '!/null/{cmd="date -d \""$3"\" +%F";cmd | getline $3;close(cmd)}1' file
Output
| 1 |1966-04-28
| 2 | null
| 3 | null
| 4 |1987-06-30
I would suggest using a language that supports parsing dates, like perl:
$ cat file
1 28 April 1966
2 null
3 null
4 30 June 1987
$ perl -F'\t' -MTime::Piece -lane 'print "$F[0]\t",
$F[1] eq "null" ? $F[1] : Time::Piece->strptime($F[1], "%d %B %Y")->strftime("%F")' file
1 1966-04-28
2 null
3 null
4 1987-06-30
The Time::Piece core module allows you to parse and format dates, using the standard format specifiers of strftime. This solution splits the input on a tab character and modifies the format if the second field is not "null".
This approach will be much faster than using system calls or invoking subprocesses, as everything is done in native perl.
Here is how you can do this in pure BASH and avoid calling system or getline from awk:
while IFS=$'\t' read -ra arr; do
[[ ${arr[1]} != "null" ]] && arr[1]=$(date -d "${arr[1]}" +%F)
printf "%s\t%s\n" "${arr[0]}" "${arr[1]}"
done < file
1 1966-04-28
2 null
3 null
4 1987-06-30
Only one date call is needed, and no code injection is possible; see the following:
This script extracts the dates into a temporary file (using awk), processes them with one date call, and merges the results back (using awk).
Code
awk -F '\t' 'match($3,/null/) { $3 = "0000-01-01" } { print $3 }' input > temp.$$
date --file=temp.$$ +%F > dates.$$
awk -F '\t' -v OFS='\t' 'BEGIN {
while ( getline < "'"dates.$$"'" > 0 )
{
f1_counter++
if ($0 == "0000-01-01") {$0 = "null"}
date[f1_counter] = $0
}
}
{$3 = date[NR]}
1' input
One-liner using bash process redirections (no temporary files):
inputfile=/path/to/input
awk -F '\t' -v OFS='\t' 'BEGIN {while ( getline < "'<(date -f <(awk -F '\t' 'match($3,/null/) { $3 = "0000-01-01" } { print $3 }' "$inputfile") +%F)'" > 0 ){f1_counter++; if ($0 == "0000-01-01") {$0 = "null"}; date[f1_counter] = $0}}{$3 = date[NR]}1' "$inputfile"
Details
Here is how it can be used:
# configuration
input=/path/to/input
temp1=temp.$$
temp2=dates.$$
output=output.$$
# create the sample file (optional)
#printf "\t%s\n" $'1\t28 April 1966' $'2\tnull' $'3\tnull' $'4\t30 June 1987' > "$input"
# Extract all dates
awk -F '\t' 'match($3,/null/) { $3 = "0000-01-01" } { print $3 }' "$input" > "$temp1"
# transform the dates
date --file="$temp1" +%F > "$temp2"
# merge csv with transformed date
awk -F '\t' -v OFS='\t' 'BEGIN {while ( getline < "'"$temp2"'" > 0 ){f1_counter++; if ($0 == "0000-01-01") {$0 = "null"}; date[f1_counter] = $0}}{$3 = date[NR]}1' "$input" > "$output"
# print the output
cat "$output"
# cleanup
rm "$temp1" "$temp2" "$output"
#rm "$input"
Caveats
Using "0000-01-01" as a temporary placeholder for invalid (null) dates
The code should be faster than other methods calling "date" a lot of times, but it reads the input file two times.

concatenate two variables from 2 different awk commands in a single echo

Job = grep 'Job:' | awk '{ print $3 }'
Status = grep 'Job Status:' | awk '{ print $3 }'
Both variables are printed correctly by using two echo statements. I want a result like 'Job name - status' on a single line. I have tried the below commands, but they print only the 2nd variable, like '- status'.
echo "$Job - $Status"
echo "${Job} - ${Status}"
echo -e "${Job} - ${Status}"
Please help!
You can do it with a single awk command:
awk '/Job:/ { job = $3 } /Job Status:/ { status = $3 } END { print job " - " status }' file
If Job: comes before Job Status:
awk '/Job:/ { job = $3 } /Job Status:/ { print job " - " $3; exit }' file
Or vice versa:
awk '/Job Status:/ { status = $3 } /Job:/ { print $3 " - " status; exit }' file
I think that should work:
echo $(awk ' /Job:/ { print $3} ' file)" - "$(awk ' /Job Status:/ { print $3} ' file)
but konsolebox's version is probably better, as there is only one awk invocation.
I think you are trying to find out how to get the result of running some command and store it in a variable. Then you want to do that twice and print both variables on the same line.
So the basic syntax is:
result=$(some command)
e.g. if
date +'%Y'
tells you the year is 2014, but you want 2014 in a variable called year, you can do
year=$(date +'%Y')
then you can echo $year like this:
echo $year
2014
So, coming to your actual question, you want two variables, one for the output of each of two commands:
job=$(grep "Job:" someFile | awk '{print $3}')
status=$(grep "Job Status:" someFile | awk '{print $3}')
then you can do:
echo $job $status
and get both things on the same line.
The other answers are saying you can avoid invoking awk twice, which is true, but doesn't explain how to capture the result of running a command into a variable. In general, you don't need to use awk and grep, because this:
grep xyz | awk ...
is equivalent to
awk '/xyz/ {...}'
but uses one fewer processes (i.e. no grep) and therefore fewer resources.
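A quick way to convince yourself of the equivalence (demo.txt is a throwaway sample file for this illustration):

```shell
printf '%s\n' 'alpha one' 'xyz beta' 'gamma two' > demo.txt

grep xyz demo.txt | awk '{ print $2 }'   # two processes: prints beta
awk '/xyz/ { print $2 }' demo.txt        # one process, same output: beta
```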
And by the way, you must not put any spaces on either side of = in bash. It is
variable=something
not
variable = something

how to parse a config file (*.conf) in shell script?

I am new to shell script. I have a file app.conf as :
[MySql]
user = root
password = root123
domain = localhost
database = db_name
port = 3306
[Logs]
level = logging.DEBUG
[Server]
port = 8080
I want to parse this file in shell script and want to extract mysql credentials from the same. How can I achieve that?
I'd do this:
pw=$(awk '/^password/{print $3}' app.conf)
user=$(awk '/^user/{print $3}' app.conf)
echo $pw
root123
echo $user
root
The $() sets the variable pw to the output of the command inside. The command inside looks through your app.conf file for a line starting password and then prints the 3rd field in that line.
EDITED
If you are going to parse a bunch of values out of your config file, I would make a variable for the config file name:
CONFIG=app.conf
pw=$(awk '/^password/{print $3}' "${CONFIG}")
user=$(awk '/^user/{print $3}' "${CONFIG}")
Here's how to do the two different ports... by setting a flag to 1 when you come to the right section and exiting when you find the port.
mport=$(awk '/^\[MySql\]/{f=1} f==1&&/^port/{print $3;exit}' "${CONFIG}")
sport=$(awk '/^\[Server\]/{f=1} f==1&&/^port/{print $3;exit}' "${CONFIG}")
You will want to search for "shell ini file parser". I would start with something like this:
ini_get () {
    awk -v section="$2" -v variable="$3" '
        $0 == "[" section "]" { in_section = 1; next }
        in_section && $1 == variable {
            $1 = ""
            $2 = ""
            sub(/^[[:space:]]+/, "")
            print
            exit
        }
        in_section && ($1 == "" || $0 ~ /^\[/) {
            # blank line or next section header reached without finding the variable
            print "not found" > "/dev/stderr"
            exit 1
        }
    ' "$1"
}
mysql_user=$( ini_get app.conf MySql user )
Using awk:
awk -F ' *= *' '$1=="user"||$1=="password"{print $2}' app.conf
root
root123
I ran into a similar problem yesterday and thought the best solution might be to end up with an associative array of "key - value" pairs after parsing the file.
If you would like to see a running example, have a look at https://github.com/philippkemmeter/set-resolution/blob/master/set-resolution.
Adapted to your problem, this might work:
function receive_assoc_declare_statement {
awk -F '=' 'BEGIN {ORS=" "}
{
gsub(/[ \t]+/, "", $1);
gsub(/[ \t]+/, "", $2);
print "[" $1 "]=" $2
}' app.conf
}
eval "declare -A CONF=( $(receive_assoc_declare_statement) )"
You then have access to for instance user via ${CONF[user]}.
The gsub is trimming keys and values, so that you can use tab etc. to format your config file.
It's lacking sections, but you could add this functionality using sed to create one config array per section:
sed -n '/\[MySql\]/, /\[/ {p}' app.conf | sed '1 d; $ d'
So answering your question in total, this script might work:
MYSQL=$(sed -n '/\[MySql\]/, /\[/ {p}' app.conf | sed '1 d; $ d' | awk -F '=' 'BEGIN {ORS=" "}
{
gsub(/[ \t]+/, "", $1);
gsub(/[ \t]+/, "", $2);
print "[" $1 "]=" $2
}')
eval "declare -A MYSQL=( $MYSQL )"
The other sections can be handled correspondingly.
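To avoid repeating that sed pipeline for every section, the section name can be made a parameter. The function name ini_section below is made up for this sketch, and the snippet recreates a trimmed app.conf so it can run on its own:

```shell
# Recreate a trimmed copy of the sample app.conf for the demonstration.
cat > app.conf <<'EOF'
[MySql]
user = root
password = root123
[Server]
port = 8080
EOF

# Print the key=value pairs of one section, stopping at the next [header].
ini_section() {
    awk -v section="$1" '
        $0 == "[" section "]" { in_sec = 1; next }  # our header: start printing
        /^\[/                 { in_sec = 0 }        # any other header: stop
        in_sec && NF          { gsub(/[ \t]/, ""); print }  # squeeze blanks (values here contain none)
    ' "$2"
}

ini_section Server app.conf    # prints port=8080
ini_section MySql app.conf     # prints user=root and password=root123
```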
