I'm trying to make a program that sorts machines based on load, but I'm having a hard time parsing the SSH output. What I have so far is this:
gen_data()
{
    declare -a machines=("machine1" "machine2" "machine3" "machine4" "machine5")
    for i in ${machines[@]}; do
        ssh $i "hostname && uptime"
    done | awk 'BEGIN {cnt=0} \
        { printf("%s, ", $0)
          cnt++
          if(cnt % 3 == 0) {printf("\n") }
        }' > ~/perf_data
}
#function check_data
# check for load averages (fields 6,7,8) that are below 9.0
check_data()
{
    awk -F"," '{ if($6 < 9.0 && $7 < 9.0 && $8 < 9.0)
                   {print $0 }
               }' ~/perf_data
}
Most of this code is adapted from a script that checked machine loads and emailed you if the load was too high, but I can't quite get it to print the machine names or build the perf_data file correctly.
What I'm trying to get is this: for a list of machines me@machine*.network.com, the program tests the load of each machine, and if it's low enough it prints the machine name:
me@machine1.network.com me@machine5.network.com me@machine10.network.com
That way I can pipe the output to another program that will use those machines.
Since I'm a n00b in awk, I really need help with this.
Instead of this:
for i in ${machines[@]}; do
    ssh $i "hostname && uptime"
done | awk ...
use this to make your life easier:
for m in "${machines[@]}"; do
    ssh "$m" <<'COMMANDS'
echo "$(hostname):$(uptime)" | awk -F: '{gsub(/,/,"",$NF); print $1, $NF}'
COMMANDS
done > ~/perf_data
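This produces lines like `machine1  0.01 0.02 0.05` (the hostname followed by the three load averages, with hypothetical numbers here), so the interesting values land in fields 2 through 4.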
Then check_data can be:
check_data() {
    awk '$2 < 9 && $3 < 9 && $4 < 9 {print $1}' ~/perf_data
}
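That prints one eligible hostname per line. If you want the single space-separated line from your question, ready to hand to the next program, you can join the lines; a small sketch:
check_data | tr '\n' ' '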
Rather than modifying your script, you could write a new one.
Here's a version replacing your script entirely, which fetches the load average in a Linux-specific way:
for host in machine1 machine2 machine3
do
    ssh "$host" '[ "$(awk "\$1 < 9" /proc/loadavg)" ] && hostname'
done > ~/perf_data
Alternatively, you can do it through uptime:
for host in machine1 machine2 machine3
do
    ssh "$host" '[ "$(uptime | awk -F"[ ,]+" "\$11 < 9")" ] && hostname'
done > ~/perf_data
Both of these assume that you're interested in the current load, so they check the 1-minute average rather than also caring about the 15-minute average.
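If you would rather gate on the 15-minute average, the same pattern works with the third field of /proc/loadavg. A sketch, keeping the same (arbitrary) threshold of 9:
for host in machine1 machine2 machine3
do
    # /proc/loadavg fields: 1-minute, 5-minute, 15-minute averages, ...
    ssh "$host" '[ "$(awk "\$3 < 9" /proc/loadavg)" ] && hostname'
done > ~/perf_data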
Related
I am trying to use awk to filter the output of an autorep command.
Output of the autorep command:
/tmp $ autorep -j Test_Job -q
/* ----------------- Test_Job ----------------- */
insert_job: Test_Job job_type: CMD
box_name: Test_box
command: echo
machine: machine_name
owner: ownername
permission: gx,ge,wx
date_conditions: 0
condition: s(testjob1)
description: "echo"
std_out_file: "/tmp/test_job.out"
std_err_file: "/tmp/test_job.err"
alarm_if_fail: 1
alarm_if_terminated: 1
/tmp $ autorep -j Test_Job2 -q
/* ----------------- Test_Job2 ----------------- */
insert_job: Test_Job2 job_type: CMD
command: echo
machine: machinename
owner: owner
permission:
date_conditions: 1
days_of_week: mo,tu,we,th,fr
start_mins: 9,19,29,39,49,59
run_window: "06:00-19:00"
description: "test discription"
std_out_file: "/tmp/Test_Job2.out"
std_err_file: "/tmp/Test_Job2.err"
alarm_if_fail: 1
alarm_if_terminated: 1
I have the below shell script to filter out the data:
#!/bin/bash
TXT=/tmp/test1.txt
CSV=/tmp/test1.csv
echo "Enter the JOB_NAME or %SEARCHSTRING%"
while read -r i;
do
awk '$1 == "insert_job:" {printf "%s %s ", $2, $4}; $1 == "condition:" {printf "%s ", $2}; $1 == "days_of_week:" {printf "%s ", $2}; $1 == "date_conditions:" {printf "%s\n ", $2}' < <(autorep -j $i -q) >$TXT
echo
break
done
if [ -s $TXT ]
then
(echo "job_name,job_type,Date_Conditions,Days_of_week/Conditions"; cat test1.txt) | sed 's/ \+/,/g' > $CSV
else
echo "Please check the %SEARCHSTRING% or JOB_NAME"
fi
the output I am looking for:
Test_Job CMD 0 s(testjob1)
Test_Job2 CMD 1 mo,tu,we,th,fr 9,19,29,39,49,59 "06:00-19:00"
but the command is not working and I am getting the data like below:
Test_Job CMD 00s(testjob1) Test_Job2 CMD 1 mo,tu,we,th,fr 9,19,29,39,49,59 "06:00-19:00"
Can someone help me out with getting the correct output?
EDIT:
Let me explain what I am trying to do. I am using the below command and giving a keyword such as %Test% (which will fetch all the jobs with Test in the name), so I will basically be running this query on all the jobs with that keyword and in turn getting a list with the filtered-out options as per my query. I am getting the data, but all of it is on one line rather than each job's data being on its own line:
EDIT 2:
So as you can see, if a job has date_conditions: 0 then it may or may not have condition: in it, and if a job has date_conditions: 1 then it will have days_of_week: and may or may not have other fields like run_window:.
So is there a way I can modify the script to print 'N/A' if some field is missing? And can I also get each job's data on its own line?
You don't say this in your question, but I'm assuming there are multiple jobs output by autorep. You need to keep track of the transition from one job to another in the output. You also need to anchor your regexes to prevent false matches.
awk '/^insert_job/ {if (flag) {printf "\n"}; printf "%s %s ", $2, $4; flag = 1}; /^date_conditions/ {printf "%s ", $2}; /^condition/ {printf "%s ", $2}; END {if (flag) {printf "\n"}}' < <(autorep -j Test_Job -q)
This outputs a newline before each "insert_job" (and thus each job) after the first so each line of output is a different job.
Multiple lines for readability:
awk '
/^insert_job/ {if (flag) {printf "\n"};
               printf "%s %s ", $2, $4;
               flag = 1};
/^date_conditions/ {printf "%s ", $2};
/^condition/ {printf "%s ", $2};
END {if (flag) {printf "\n"}}
' < <(autorep -j Test_Job -q)
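EDIT 2 also asks for 'N/A' when a field is missing. One way to handle that (a sketch, not tested against real autorep output) is to buffer each job's fields and print them, with defaults, when the next job starts:
awk '
function emit() { print job, type, dc, cond }
/^insert_job:/ {
    if (job != "") emit()           # flush the previous job
    job = $2; type = $4
    dc = "N/A"; cond = "N/A"        # defaults if the fields never appear
}
/^date_conditions:/ { dc = $2 }
/^condition:/       { cond = $2 }
/^days_of_week:/    { cond = $2 }   # jobs with date_conditions: 1 carry this instead
END { if (job != "") emit() }
' < <(autorep -j %Test% -q)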
Requirement
I have a text file in which the last column has URLs.
Some of the URL entries have IPs instead of FQDNs.
So, for entries with IPs (e.g. url=https://174.37.243.85:443*), I need to do a reverse nslookup on the IP and replace the IP with the result (the FQDN).
Text File Input
httpMethod=SSL-SNI destinationIPAddress=174.37.243.85 url=https://174.37.243.85:443*
httpMethod=SSL-SNI destinationIPAddress=183.3.226.92 url=https://pingtas.qq.com:443/*
httpMethod=SSL-SNI destinationIPAddress=184.173.136.86 url=https://v.whatsapp.net:443/*
Expected Output
httpMethod=SSL-SNI destinationIPAddress=174.37.243.85 url=https://55.f3.25ae.ip4.static.sl-reverse.com:443/*
httpMethod=SSL-SNI destinationIPAddress=183.3.226.92 url=https://pingtas.qq.com:443/*
httpMethod=SSL-SNI destinationIPAddress=184.173.136.86 url=https://v.whatsapp.net:443/*
Here's a quick and dirty attempt in pure Awk.
awk '$3 ~ /^url=https?:\/\/[0-9.]*([:\/?*].*)?$/ {
    # Parse out the hostname part
    split($3, n, /[\/:?\*]+/);
    reverse = n[2]   # fall back to the bare IP if the lookup returns nothing
    cmd = "dig +short -x " n[2]
    cmd | getline reverse;
    sub(/\.$/, "", reverse);
    close(cmd)
    # Figure out the tail after the hostname part
    match($3, /^url=https?:\/\/[0-9.]*/);  # sets RSTART/RLENGTH
    $3 = n[1] "://" reverse substr($3, RSTART+RLENGTH) } 1' file
If you don't have dig, you might need to resort to nslookup or host instead; but the only one of these which portably offers properly machine-readable output is dig so you might want to install it for that feature alone.
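For what it's worth, a rough host-based variant might look like this (a sketch only: host's human-readable output varies between implementations, so the "domain name pointer" match is an assumption):
awk '$3 ~ /^url=https?:\/\/[0-9.]*([:\/?*].*)?$/ {
    split($3, n, /[\/:?\*]+/)
    cmd = "host " n[2]
    # successful reverse lookups print "... domain name pointer FQDN."
    while ((cmd | getline line) > 0)
        if (line ~ /domain name pointer/) {
            reverse = line
            sub(/.*domain name pointer /, "", reverse)
            sub(/\.$/, "", reverse)
            match($3, /^url=https?:\/\/[0-9.]*/)
            $3 = n[1] "://" reverse substr($3, RSTART+RLENGTH)
        }
    close(cmd)
} 1' file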
Solution 1st: within a single awk, added after the discussion in the comments:
awk '
{
  if(match($0,/\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)){
    val_match=substr($0,RSTART+1,RLENGTH-1);
    system("nslookup " val_match " > temp")
    val=$0;
    while(getline < "temp"){
      if($0 ~ /name/){
        num=split($0, array," ");
        sub(/\.$/,"",array[num]);        # strip the trailing dot from the FQDN
        sub(val_match,array[num],val);
        print val}}
    close("temp")                        # so the next lookup re-reads the file
    next                                 # already printed; skip the NF fallthrough
  }
}
NF
' Input_file
Solution 2nd: my initial solution with awk and shell.
The following simple script may help you do the same:
cat script.ksh
CHECK_IP () {
  fqdn=$(echo "$1" | awk '{if(match($0,/\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)){system("nslookup " substr($0,RSTART+1,RLENGTH-1))}}')
  actual_fqdn=$(echo "$fqdn" | awk '/name/{sub(/\.$/,"",$NF);print $NF}')
  echo "$actual_fqdn"
}
while read -r line
do
val=$(CHECK_IP "$line")
if [[ -n "$val" ]]
then
echo "$line" | awk -v var="$val" '{if(match($0,/\/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)){ip_val=substr($0,RSTART+1,RLENGTH-1);sub(ip_val,var)}} 1'
else
echo "$line"
fi
done < "Input_file"
I am trying to return a random word from /usr/share/dict/words on my *NIX machine. I have written the following script using Bash, awk, and sed together to do it, but I feel like it should be writable using awk alone by using the NR and NF variables somehow.
#!/bin/bash
get_secret_word () {
awk '/^[A-Za-z]+$/ {if (length($1) > 3 && length($1) < 9)
print $1}' /usr/share/dict/words > /tmp/word_list
word_list_length=$(wc -l /tmp/word_list | awk '{print $1}')
random_number=$(( $RANDOM%$word_list_length ))
secret_word=$(sed "${random_number}!d" /tmp/word_list)
return $secret_word
}
get_secret_word
echo $secret_word
Any suggestions? I love AWK, and I'm trying to understand it better.
Try something like:
awk '
BEGIN {
srand('"$RANDOM"')
}
{
if (/^[A-Za-z]+$/ && length() > 3 && length() < 9)
words[i++] = $1
}
END {
print words[int(rand() * i)]
}' /usr/share/dict/words
Whether you store the words in memory or in a file will depend on your use case.
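If holding the whole list in memory bothers you, reservoir sampling picks a uniformly random matching line in a single pass without an array; a sketch:
awk -v seed="$RANDOM" '
BEGIN { srand(seed) }
/^[A-Za-z]+$/ && length() > 3 && length() < 9 {
    # keep the nth candidate with probability 1/n, which is uniform overall
    if (rand() < 1 / ++n) word = $0
}
END { print word }' /usr/share/dict/words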
I am looking for another approach to apply RIPEMD-160 to the second column of a CSV file.
Here is my code:
awk -F "," -v env_var="$key" '{
tmp="echo -n \047" $2 env_var "\047 | openssl ripemd160 | cut -f2 -d\047 \047"
if ( (tmp | getline cksum) > 0 ) {
$3 = toupper(cksum)
}
close(tmp)
print
}' /test/source.csv > /ziel.csv
I run it on a big CSV file (1 GB); it takes 2 days and I only get 100 MB, which means I would need to wait a month to get my whole new CSV.
Can you help me with another idea or approach to get my data faster?
Thanks in advance.
You can use GNU Parallel to speed this up by executing the awk command on chunks of the input in parallel:
cat /test/source.csv | parallel --pipe awk -F "," -v env_var="$key" '{
tmp="echo -n \047" $2 env_var "\047 | openssl ripemd160 | cut -f2 -d\047 \047"
if ( (tmp | getline cksum) > 0 ) {
$3 = toupper(cksum)
}
close(tmp)
print
}' > /ziel.csv
# prepare a batch (to avoid forking from awk)
awk -F "," -v env_var="$key" '
BEGIN {
print "if [ -r /tmp/MD160.Result ];then rm /tmp/MD160.Result;fi"
}
{
print "echo \"\$( echo -n \047" $2 env_var "\047 | openssl ripemd160 )\" >> /tmp/MD160.Result"
} ' /test/source.csv > /tmp/MD160.eval
# eval the MD for each line with batch fork (should be faster)
. /tmp/MD160.eval
# take result and adapt for output
awk '
# load MD160
FNR == NR { m[NR] = toupper($2); next }
# set FS to "," and force a re-split of the current line
FNR == 1 { FS = ","; $0 = $0 "" }
# adapt original line
{ $3 = m[FNR]; print}
' /tmp/MD160.Result /test/source.csv > /ziel.csv
Note:
not tested (so the print may need some tuning with escapes)
no error handling (assumes everything is OK). I advise running some tests (like including a line reference in the reply and checking it in the second awk).
forking at batch level will be a lot faster than forking from awk, with the piping fork and catching of the reply included
I am not a specialist in openssl ripemd160, but there is maybe another way to treat the elements in one bulk process without opening a fork every time from the same file/source
Your solution hits Cygwin where it hurts the most: spawning new programs. Cygwin is terribly slow at this.
You can make this faster by using all the cores in your computer, but it will still be very slow.
You need a program that does not start other programs to compute the RIPEMD sum. Here is a small Python script that takes the CSV on standard input and outputs the CSV on standard output with the second column replaced with the RIPEMD sum.
riper.py:
#!/usr/bin/python
import hashlib
import fileinput
import os
key = os.environ['key']
for line in fileinput.input():
    # Naive CSV reader: split on ,
col = line.rstrip().split(",")
# Compute RIPEMD on column 2
h = hashlib.new('ripemd160')
h.update(col[1]+key)
    # Update column 2 with the hex digest
    col[1] = h.hexdigest().upper()
print ','.join(col)
Now you can run:
cat source.csv | key=a python riper.py > ziel.csv
This will still only use a single core of your system. To use all cores, GNU Parallel can help. If you do not have GNU Parallel 20161222 or newer in your package system, it can be installed as:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
You will need Perl installed to run GNU Parallel:
key=a
export key
parallel --pipepart --block -1 -a source.csv -k python riper.py > ziel.csv
This will chop source.csv on the fly into one block per CPU core and run the Python script on each block. On my 8-core machine this processes a 1 GB file with 139482000 lines in 300 seconds.
If you need it faster still, you will need to convert riper.py to a compiled language (e.g. C).
Hi dear friends, I have a shell script that stores output from /proc/meminfo in some variables, and I want to sum these variables, but I just get the result kB+KB+KB and the code doesn't work. Can anybody help me fix it? Thanks.
numA=$(grep -m 1 "MemTotal" /proc/meminfo | awk '{ print $2 }')
numB=$(grep -m 1 "MemFree" /proc/meminfo | awk '{ print $3 }')
numC=$(grep -m 1 "Buffers" /proc/meminfo | awk '{ print $4 }')
numD=$(grep -m 1 "Cached" /proc/meminfo | awk '{ print $5 }')
echo "-------------------"
echo $numA $numB $numC $numD
echo " ****--------------------"
numsum=$numB+$numC+$numD
echo "numsum =MemFree+Buffers+Cached=$numsum"
echo $numsum
numminus=$mumA-$numsum
echo "numminus =MemTotal-(MemFree+Buffers+Cached)=$numminus"
numDivide=$numminus/$numA
echo "numDivide =numminus/numA=$numsum"
The whole thing should be a single Awk script. Extracting each field to a separate shell variable so you can use the shell's notoriously poor arithmetic facilities is just crazy. In particular, even if you do get some arithmetic in Bash (though the syntax is different from what you tried), it will still be integer only; so your division result will simply be zero.
awk '/MemTotal/ && !memtotal { memtotal = $2 }
/MemFree/ && !memfree { memfree = $3 }
/Buffers/ && !buffers { buffers = $4 }
/Cached/ && !cached { cached = $5 }
END {
    # Ugh, is this really necessary?
    print "-------------------"
    print memtotal, memfree, buffers, cached
    print " ****-------------------"
    numsum = memfree + buffers + cached
    print "numsum =MemFree+Buffers+Cached=" numsum
    numminus = memtotal - numsum
    print "numminus =MemTotal-(MemFree+Buffers+Cached)=" numminus
    numDivide = numminus / memtotal
    print "numDivide =numminus/memtotal=" numDivide }' /proc/meminfo
If one of the values could be zero, this may require a slightly different approach for pulling out the first match.
I renamed the first four variables; the other three should probably get sensible names instead as well, but I could not quickly understand what you are hoping to calculate.
A somewhat more idiomatic approach for capturing the result of the calculation for later use is to have the Awk script print just the computer-readable output. The following script is rich in comments -- it could be pared down to be much smaller if you remove the comments, but I suppose legibility and maintainability would trump brevity here. Incidentally, this also demonstrates the "slightly different approach" to ensure that we always get the first value of a variable.
memRatio=$(awk '# Populate an associative array with first occurrences
/MemTotal/ && !("memtotal" in i) { i["memtotal"] = $2 }
/MemFree/ && !("memfree" in i) { i["memfree"] = $3 }
/Buffers/ && !("buffers" in i) { i["buffers"] = $4 }
/Cached/ && !("cached" in i) { i["cached"] = $5 }
# Have we collected all the keys for the array? Then print and quit
("memtotal" in i) && ("memfree" in i) &&
("buffers" in i) && ("cached" in i) {
print (i["memtotal"]-i["memfree"]-i["buffers"]-i["cached"])/i["memtotal"]
exit 0 # success
}
# If we fall through to here, we never captured the variables
END { exit 1 }' /proc/meminfo)
Though on my system, all these values are in $2, not in successively increasing columns. In that case, the capturing code could be simplified somewhat: use a single regex for all four keys, and use the matched string itself (lowercased) as the array key.
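A sketch of that simplification, assuming every value really is in $2 as it is on my system:
memRatio=$(awk '
match($1, /^(MemTotal|MemFree|Buffers|Cached):/) {
    k = tolower(substr($1, 1, RLENGTH - 1))  # e.g. "memtotal"; strip the colon
    if (!(k in i)) i[k] = $2                 # keep only the first occurrence
}
("memtotal" in i) && ("memfree" in i) && ("buffers" in i) && ("cached" in i) {
    print (i["memtotal"] - i["memfree"] - i["buffers"] - i["cached"])/i["memtotal"]
    exit 0
}
END { exit 1 }' /proc/meminfo)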
I only modified the script so that it returns some values. I am not an expert, but I fixed the errors that were visible to me. Hope it works for you.
numA=$(grep -m 1 "MemTotal" /proc/meminfo | awk '{ print $2 }')
numB=$(grep -m 1 "MemFree" /proc/meminfo | awk '{ print $2 }')
numC=$(grep -m 1 "Buffers" /proc/meminfo | awk '{ print $2 }')
numD=$(grep -m 1 "Cached" /proc/meminfo | awk '{ print $2 }')
echo "-------------------"
echo $numA $numB $numC $numD
echo " ****--------------------"
numsum=$(($numB+$numC+$numD))
echo "numsum =MemFree+Buffers+Cached=$numsum"
echo $numsum
numminus=$(($numA-$numsum))
echo "numminus =MemTotal-(MemFree+Buffers+Cached)=$numminus"
numDivide=$(($numminus/$numA))   # note: shell arithmetic is integer-only
echo "numDivide =numminus/numA=$numDivide"