I'm trying to get the Cassandra schema version into a variable from the output of a nodetool command.
Here is some sample nodetool output:
Cluster Information:
Name: Test Cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
65e78f0e-e81e-30d8-a631-a65dff93bf82: [127.0.0.1]
When some nodes are not reachable, here's the output:
Cluster Information:
Name: Production Cluster
Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
Schema versions:
UNREACHABLE: 1176b7ac-8993-395d-85fd-41b89ef49fbb: [10.202.205.203]
Can anyone suggest how to get the schema version into a variable, regardless of whether all nodes are reachable?
I tried awk and grep commands, but they didn't work because of the UNREACHABLE prefix.
Another version of an awk script, matching only the UUID pattern, can be written with match(), which sets the internal RSTART and RLENGTH variables for use with substr().
That would be:
awk '
/Schema versions:/ {
set=1
next
}
set {
match($0,/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/)
print substr($0, RSTART, RLENGTH)
exit
}' file
Example Use/Output
$ awk '
> /Schema versions:/ {
> set=1
> next
> }
> set {
> match($0,/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/)
> print substr($0, RSTART, RLENGTH)
> exit
> }' << 'eof'
> Cluster Information:
> Name: Test Cluster
> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> Schema versions:
> 65e78f0e-e81e-30d8-a631-a65dff93bf82: [127.0.0.1]
>
> eof
65e78f0e-e81e-30d8-a631-a65dff93bf82
and
$ awk '
> /Schema versions:/ {
> set=1
> next
> }
> set {
> match($0,/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}/)
> print substr($0, RSTART, RLENGTH)
> exit
> }' << 'eof'
> Cluster Information:
> Name: Production Cluster
> Snitch: org.apache.cassandra.locator.DynamicEndpointSnitch
> Partitioner: org.apache.cassandra.dht.Murmur3Partitioner
> Schema versions:
> UNREACHABLE: 1176b7ac-8993-395d-85fd-41b89ef49fbb: [10.202.205.203]
> eof
1176b7ac-8993-395d-85fd-41b89ef49fbb
You can use the command in a command substitution in bash to capture the result in a variable.
Awk will do the job for that:
version=$(awk '/Schema versions:/ {
getline
gsub(/:/,"")
if ($1 == "UNREACHABLE") {
print $2
} else {
print $1
}
}' < <(nodetool_cmd)) # replace "nodetool_cmd" with the correct command
$ echo "$version" #when reachable
65e78f0e-e81e-30d8-a631-a65dff93bf82
$ echo "$version" # when unreachable
1176b7ac-8993-395d-85fd-41b89ef49fbb
# or in single line:
version=$(awk '/version/ {getline;gsub(/:/,"");if ($1 == "UNREACHABLE") {print $2} else {print $1}}' < <(nodetool_cmd))
I am trying to use awk to filter the output of an autorep command.
Output of the autorep command:
/tmp $ autorep -j Test_Job -q
/* ----------------- Test_Job ----------------- */
insert_job: Test_Job job_type: CMD
box_name: Test_box
command: echo
machine: machine_name
owner: ownername
permission: gx,ge,wx
date_conditions: 0
condition: s(testjob1)
description: "echo"
std_out_file: "/tmp/test_job.out"
std_err_file: "/tmp/test_job.err"
alarm_if_fail: 1
alarm_if_terminated: 1
/tmp $ autorep -j Test_Job2 -q
/* ----------------- Test_Job2 ----------------- */
insert_job: Test_Job2 job_type: CMD
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_mins: 9,19,29,39,49,59
run_window: "06:00-19:00"
description: "test discription"
std_out_file: "/tmp/Test_Job2.out"
std_err_file: "/tmp/Test_Job2.err"
alarm_if_fail: 1
alarm_if_terminated: 1
I have the below shell script to filter the data:
#!/bin/bash
TXT=/tmp/test1.txt
CSV=/tmp/test1.csv
echo "Enter the JOB_NAME or %SEARCHSTRING%"
while read -r i;
do
awk '$1 == "insert_job:" {printf "%s %s ", $2, $4}; $1 == "condition:" {printf "%s ", $2}; $1 == "days_of_week:" {printf "%s ", $2}; $1 == "date_conditions:" {printf "%s\n ", $2}' < <(autorep -j $i -q) >$TXT
echo
break
done
if [ -s $TXT ]
then
(echo "job_name,job_type,Date_Conditions,Days_of_week/Conditions"; cat test1.txt) | sed 's/ \+/,/g' > $CSV
else
echo "Please check the %SEARCHSTRING% or JOB_NAME"
fi
the output I am looking for:
Test_Job CMD 0 s(testjob1)
Test_Job2 CMD 1 mo,tu,we,th,fr 9,19,29,39,49,59 "06:00-19:00"
but the command is not working and I am getting the data like below:
Test_Job CMD 00s(testjob1) Test_Job2 CMD 1 mo,tu,we,th,fr 9,19,29,39,49,59 "06:00-19:00"
Can someone help me out with getting the correct output?
EDIT:
Let me explain what I am trying to do. I am using the below command with a keyword such as %Test% (which will fetch all the jobs with Test in their names), so I will basically be running this query on all the jobs with that keyword and in turn getting a list with the filtered options as per my query. I am getting the data, but all of it is on one line rather than each job's data on its own line:
EDIT 2:
So, as you can see, if date_conditions: 0 then the job may or may not have condition: in it, and if a job has date_conditions: 1 then it will have days_of_week: and may or may not have other fields like run_window:.
So is there a way I can modify the script to print, say, 'N/A' if some field is missing? And can I also get the data for each job on its own line?
You don't say this in your question, but I'm assuming there are multiple jobs output by autorep. You need to keep track of the transition from one job to another in the output. You also need to anchor your regexes to prevent false matches.
awk '/^insert_job/ {if (flag) {printf "\n"}; printf "%s %s ", $2, $4; flag = 1}; /^date_conditions/ {printf "%s ", $2}; /^condition/ {printf "%s ", $2}' < <(autorep -j Test_Job -q)
This outputs a newline before each "insert_job" (and thus each job) after the first so each line of output is a different job.
Multiple lines for readability:
awk '
/^insert_job/ {if (flag) {printf "\n"};
printf "%s %s ", $2, $4;
flag = 1};
/^date_conditions/ {printf "%s ", $2};
/^condition/ {printf "%s ", $2}
' < <(autorep -j Test_Job -q)
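For the 'N/A' request in EDIT 2, one possible sketch is to collect the fields per job and print defaults for the missing ones. This is untested against real autorep output; the here-document mimics the question's samples, and in practice you would pipe `autorep -j "%SEARCHSTRING%" -q` in instead:

```shell
# Print one job per line, substituting "N/A" for optional fields that are
# absent from a job's autorep output.
result=$(awk '
  function flush() { if (job != "") print job, type, dc, cond, dow }
  $1 == "insert_job:"      { flush(); job=$2; type=$4; dc=cond=dow="N/A" }
  $1 == "date_conditions:" { dc   = $2 }
  $1 == "condition:"       { cond = $2 }
  $1 == "days_of_week:"    { dow  = $2 }
  END { flush() }
' <<'EOF'
insert_job: Test_Job job_type: CMD
date_conditions: 0
condition: s(testjob1)
insert_job: Test_Job2 job_type: CMD
date_conditions: 1
days_of_week: mo,tu,we,th,fr
EOF
)
echo "$result"
```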
I have a script that reads from a file.
################################################
# IP TABLES FOR INSTALL_CONFIG #
# #
# m = master #
# k = kibana #
# d = data #
# i = ingest #
# c = coordinator #
# Format: xxx.xxx.xxx.xxx m #
################################################
#
10.1.7.93 m
10.1.7.94 k
10.1.7.95 d
This is the function that the script uses.
function readIpFile () {
initMasterVar=0
grep "^[^# ]" node_list.txt | awk '$2 ~ /m/ { print $1 }' > tmp_master_list.txt
grep "^[^# ]" node_list.txt | awk '$2 ~ /k/ { print $1 }' > tmp_kibana_list.txt
grep "^[^# ]" node_list.txt | awk '$2 ~ /i/ { print $1 }' > tmp_ingest_list.txt
grep "^[^# ]" node_list.txt | awk '$2 ~ /d/ { print $1 }' > tmp_data_list.txt
grep "^[^# ]" node_list.txt | awk '$2 !~ /k/ { print $1 }' > tmp_all_nodes.txt
}
The function's purpose is to read from a master node list; it then sorts the list into tmp files according to the role each IP or FQDN is assigned. The grep filters out all lines that begin with # or a space, and awk searches the second field for the role and prints the IPs with that role, redirected into a tmp file that is used later in the script.
My problem is that this function was working fine before. The commands individually work in my terminal, and grep is able to locate the file and filter it accordingly. However, when run from this function in this script, it breaks.
I am unsure what I am doing wrong. Running my script through shellcheck turns up no errors that would cause this.
A couple of us mentioned doing all this sorting in a single awk script instead of 5 different pipelines as an optimization - that way, the file only has to be read once. One way to do that is using in-awk output redirection:
awk '/^[# ]/ { next } # Skip lines starting with a # or space.
$2 ~ /m/ { print $1 > "/path/to/tmp_master_list.txt" }
$2 ~ /k/ { print $1 > "/path/to/tmp_kibana_list.txt" }
$2 ~ /i/ { print $1 > "/path/to/tmp_ingest_list.txt" }
$2 ~ /d/ { print $1 > "/path/to/tmp_data_list.txt" }
$2 !~ /k/ { print $1 > "/path/to/tmp_all_nodes.txt" }' /path/to/node_list.txt
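As a quick sanity check, here is the same single-pass sort run against the question's sample lines, with local file names standing in for the /path/to/ placeholders:

```shell
# Demo run of the single-pass sort; file names are local stand-ins.
cat > node_list.txt <<'EOF'
10.1.7.93 m
10.1.7.94 k
10.1.7.95 d
EOF
awk '/^[# ]/ { next }
     $2 ~ /m/  { print $1 > "tmp_master_list.txt" }
     $2 ~ /k/  { print $1 > "tmp_kibana_list.txt" }
     $2 ~ /d/  { print $1 > "tmp_data_list.txt" }
     $2 !~ /k/ { print $1 > "tmp_all_nodes.txt" }' node_list.txt
cat tmp_all_nodes.txt   # the master and data nodes, but not the kibana node
```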
I am trying to go through a list of workstations, and add data to their corresponding individual files. I have already made the individual files, but now I need help using sed or something else to replace text in the individual XML files:
For example:
workstation_list.txt has the following lines:
workstation1
workstation1.domain.com
127.0.0.1
00:00:00:00:00:00
workstation2
workstation2.domain.com
127.0.0.2
11:11:11:11:11:11
I have two files: workstation1 and workstation2 with the following XML:
< HOST_NAME >workstation3< /HOST_NAME >
< HOST_FQDN >workstation3.domain.com< /HOST_FQDN >
< IP >127.0.0.0< /IP >
< MAC >33:33:33:33:33:33< /MAC >
I can do a "while read line do" without a problem, but I've never used more than one variable.
Thank you for your help!
The whole task can be accomplished with a single call to gawk:
gawk -v RS='\n\n' '{
print "< HOST_NAME >" $1 "< /HOST_NAME >" > $1
print "< HOST_FQDN >" $2 "< /HOST_FQDN >" > $1
print "< IP >" $3 "< /IP >" > $1
print "< MAC >" $4 "< /MAC >" > $1
close($1)
}' file
Given your sample input it creates two files as follows:
workstation1:
< HOST_NAME >workstation1< /HOST_NAME >
< HOST_FQDN >workstation1.domain.com< /HOST_FQDN >
< IP >127.0.0.1< /IP >
< MAC >00:00:00:00:00:00< /MAC >
workstation2:
< HOST_NAME >workstation2< /HOST_NAME >
< HOST_FQDN >workstation2.domain.com< /HOST_FQDN >
< IP >127.0.0.2< /IP >
< MAC >11:11:11:11:11:11< /MAC >
I'm writing a small BASH script that reads a csv file with names on it and prompts the user for a name to be removed. The csv file looks like this:
Smith,John
Jackie,Jackson
The first and last name of the person to be removed from the list are saved in the bash variables $first_name and $last_name.
This is what I have so far:
cat file.csv | awk -F',' -v last="$last_name" -v first="$first_name" ' ($1 != last || $2 != first) { print } ' > tmpfile1
This works fine. However, it still outputs to tmpfile1 even if no employee matches that name. What I would like is to have something like:
if ($1 != last || $2 != first) { print } > tmpfile1 ; else { print "No Match Found." }
I'm new to awk and can't get that last part to work.
NOTE: I do not want to use something like grep -v "$last_name,$first_name"; I want to use a filtering function.
You can redirect right inside the awk script, and only output matches found.
awk -F',' -v last="$last_name" -v first="$first_name" '
$1==last && $2==first {next}
{print > "tmpfile"}
' file.csv
Here are some differences between your script and this....
This has awk reading your CSV directly, rather than having UUOC.
This actively skips the records you want to skip,
and prints everything else through a redirect.
Note that you could, if you wanted, specify the target to which to redirect in a variable you pass in using -v as well.
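For example (a sketch; the file contents mirror the question's sample CSV, and the names passed via -v are the sample values):

```shell
# Pass the redirect target in as an awk variable instead of hard-coding it.
cat > file.csv <<'EOF'
Smith,John
Jackie,Jackson
EOF
awk -F',' -v last="Smith" -v first="John" -v out="tmpfile1" '
  $1 == last && $2 == first { next }   # skip the record being removed
  { print > out }                      # the rest goes to the file named by "out"
' file.csv
cat tmpfile1
```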
If you really want the "No match found" error, you can set a flag, then use the END special condition in awk...
awk -F',' -v last="$last_name" -v first="$first_name" '
$1==last && $2==first { found=1; next }
{ print > "tmpfile" }
END { if (!found) print "No match found." > "/dev/stderr" }
' file.csv
And if you want no tmpfile to be created if a match wasn't found, you would either need to scan the file TWICE, once to verify that there's a match, and once to print, or if there's no risk that the size of the file would be too great for available memory, you could keep a buffer:
awk -F',' -v last="$last_name" -v first="$first_name" '
$1==last && $2==first { next }
{ output = (output ? output ORS : "" ) $0 }
END {
if (output)
print output > "tmpfile"
else
print "No match found." > "/dev/stderr"
}
' file.csv
Disclaimer: I haven't tested any of these. :)
You can do two passes over the file, or you can queue up all of the file so far in memory and then just fail if you reach the END block with no match.
awk -F',' -v first="$first" -v last="$last" '$1 == last && $2 == first {
    # Match found: flush everything saved so far, then skip the matching line
    for (i=1; i<=n; ++i) print a[i] >>"tempfile"; p=1; split("", a); next }
# No match yet, remember this line for later
!p { a[++n] = $0; next }
# If we get through to here, there was a match
p { print >>"tempfile" }
END { if (!p) { print "no match" >"/dev/stderr"; exit 1 } }' filename
This requires you to have enough memory to store the entire file (this will be required when there is no match).
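The two-pass variant mentioned above could be sketched like this (the file contents and names are the question's samples; only creates the output file on a match):

```shell
# Pass 1: check whether the record exists. Pass 2: filter it out.
cat > file.csv <<'EOF'
Smith,John
Jackie,Jackson
EOF
last_name=Smith first_name=John
if awk -F',' -v last="$last_name" -v first="$first_name" \
     '$1 == last && $2 == first { found=1; exit } END { exit !found }' file.csv
then
  awk -F',' -v last="$last_name" -v first="$first_name" \
    '$1 != last || $2 != first' file.csv > tmpfile1
else
  echo "No match found." >&2
fi
```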
With a bash script, you can test whether awk printed something.
If so, remove the tmpfile.
c=$(awk -F',' -v a="$last_name" -v b="$first_name" '
$1==a && $2==b {c=1;next}
{print > "tmpfile"}
END{if (!c){print "no match"}}' infile)
[ -n "$c" ] && { echo "$c"; rm tmpfile;}
I'm trying to make a program that sorts machines based on load, but I'm having a hard time parsing the ssh output. What I have so far is this:
gen_data()
{
declare -a machines=("machine1" "machine2" "machine3" "machine4" "machine5")
for i in ${machines[@]}; do
ssh $i "hostname && uptime"
done | awk ' BEGIN {cnt=0} \
{ printf("%s, ", $0)
cnt++
if(cnt % 3 == 0) {printf("\n") }
}' > ~/perf_data
}
#function check_data
# check for load averages (fields 6,7,8) which are greater than 7
check_data()
{
awk -F"," '{ if($6 < 9.0 && $7 < 9.0 && $8 < 9.0)
{print $0 }
}' ~/perf_data
}
Most of this code is a modified version of a script that checked machine loads and emailed you if a load was too high, but I can't quite get it to print the machine names or create the perf_data file correctly.
What I'm trying to get is this: for a list of machines me@machine*.network.com, the program tests each machine's load, and if it's low enough it prints the machine name:
me@machine1.network.com me@machine5.network.com me@machine10.network.com
That way I can pipe the output to another program that will use those machines.
Since I'm an awk n00b, I really need help with this.
Instead of this:
for i in ${machines[@]}; do
ssh $i "hostname && uptime"
done | awk ...
use this to make your life easier
for m in ${machines[@]}; do
ssh "$m" <<'COMMANDS'
echo "$(hostname):$(uptime)" | awk -F: '{gsub(/,/,"",$NF); print $1, $NF}'
COMMANDS
done > ~/perf_data
Then check_data can be
check_data() {
awk '$2 < 9 && $3 < 9 && $4 < 9 {print $1}' ~/perf_data
}
Rather than modifying this script, you can write a new one.
Here's a version replacing your script entirely, which fetches the load average in a Linux-specific way:
for host in machine1 machine2 machine3
do
ssh "$host" '[ "$(awk "\$1 < 9" /proc/loadavg)" ] && hostname'
done > ~/perf_data
Alternately, you can do it through uptime:
for host in machine1 machine2 machine3
do
ssh "$host" '[ "$(uptime | awk -F"[ ,]+" "\$11 < 9")" ] && hostname'
done > ~/perf_data
Both these assume that you're interested in the current load, so it checks the 1 minute average rather than also caring about the 15 minute average.
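Either way, ~/perf_data ends up with one hostname per line; to get the single space-separated line the question asks for, join the lines, e.g. with paste (sample data stands in for a real ~/perf_data produced by the loop above):

```shell
# Join one-hostname-per-line output into a single space-separated line,
# suitable for passing on to another program.
printf '%s\n' machine1.network.com machine5.network.com > perf_data
hosts=$(paste -sd' ' perf_data)
echo "$hosts"
```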