I am using the following code in bash on Linux to extract the time duration of each connection:
for a in folder/*.pcap
do
difference=$(echo $(tshark -r $a -T fields -e frame.time_epoch | tail -n 1) - $(tshark -r $a -T fields -e frame.time_epoch | head -n 1) | bc)
echo $difference
done
However, the processing time is very high (about 1 minute for 100 pcaps). Any ideas how to improve it?
Would this work:
myfun() {
    a="$1"
    difference=$(echo "$(tshark -r "$a" -T fields -e frame.time_epoch | tail -n 1) - $(tshark -r "$a" -T fields -e frame.time_epoch | head -n 1)" | bc)
    echo "$difference"
}
export -f myfun
parallel myfun ::: folder/*.pcap
You can install GNU Parallel simply by:
wget http://git.savannah.gnu.org/cgit/parallel.git/plain/src/parallel
chmod 755 parallel
Watch the intro videos to learn more: http://pi.dk/1
10-second installation:
wget pi.dk/3 -qO - | sh -x
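Independent of parallelization, note that the loop runs tshark twice per file, and each run reads the whole capture. A sketch that computes the duration in a single pass, with awk doing the subtraction instead of bc (same tshark fields as above):

for a in folder/*.pcap; do
    tshark -r "$a" -T fields -e frame.time_epoch |
        awk 'NR==1 {first=$1} END {printf "%.6f\n", $1-first}'
done

The same one-pass pipeline can replace the two tshark calls inside myfun, which halves the work before parallel even starts.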
Related
I have a bash script that does a pretty decent job of reporting CPU levels above 95%. The issue I am running into is that it will report even on brief "spikes". This script runs every 10 minutes and checks all of my servers. Is there a way to only report if a server stays above 95% for 3 iterations, i.e. after the 3rd time it runs (30 minutes)?
12:00 - 1st report - 98%
12:10 - 2nd report - 99%
12:20 - 3rd report - 98% (now alert the admin)
here is the section of the script:
for sn in $(cat /tmp/hosts |grep -v "#"); do
cpuuse=$(ssh -qn -o ConnectTimeout=15 -oStrictHostKeyChecking=no -o BatchMode=yes $sn "top -b -n2 -p 1 | fgrep \"Cpu(s)\" | tail -1 | awk -F'id,' -v prefix=\"\$prefix\" '{ split(\$1, vs, \",\"); v=vs[length(vs)]; sub(\"%\", \"\", v); printf \"%s%.1f%%\n\", prefix, 100 - v }' | rev | cut -c 4- | rev")
if [[ "$cpuuse" -ge 95 ]]; then
echo "CPU Alert!! $sn CPU is high - $cpuuse%" | mailx -s "CPU Alert on $sn" admin#sample.com
fi
done
AFAIK there isn't really a bash trick; you just need to store a counter somewhere. Something like this could do the trick:
for sn in $(cat /tmp/hosts |grep -v "#"); do
    cpuuse=$(ssh -qn -o ConnectTimeout=15 -oStrictHostKeyChecking=no -o BatchMode=yes $sn "top -b -n2 -p 1 | fgrep \"Cpu(s)\" | tail -1 | awk -F'id,' -v prefix=\"\$prefix\" '{ split(\$1, vs, \",\"); v=vs[length(vs)]; sub(\"%\", \"\", v); printf \"%s%.1f%%\n\", prefix, 100 - v }' | rev | cut -c 4- | rev")
    counter_file=/tmp/my-counter-file-$sn   # separate counter file for each server
    if [[ "$cpuuse" -ge 95 ]]; then
        date >> "$counter_file"   # just add a line to the counter file
        if [[ $(wc -l < "$counter_file") -ge 3 ]]; then   # 'wc -l < file' prints the count alone, without the file name
            echo "CPU Alert!! $sn CPU is high - $cpuuse%" | mailx -s "CPU Alert on $sn" admin@sample.com
            rm "$counter_file"   # message was sent, reset counter
        fi
    else
        rm -f "$counter_file"   # below limit, reset counter (-f: the file may not exist yet)
    fi
done
The trick here is to store a counter in a file. The number of lines in the file is your counter value.
I have the following line of code:
for h in "${Hosts[#]}" ; do echo "$MyLog" | grep -m 1 -B 3 -A 1 $h >> /LogOutput ; done
My Hosts variable is a large array of hosts.
Is there a better way to do this that doesn't require an echo on each iteration? Like grepping a variable instead?
No echo, no loop
#!/bin/bash
hosts=(host1 host2 host3)
MyLog="
asf host
sdflkj
sadkjf
sdlkjds
lkasf
sfal
asf host2
sdflkj
sadkjf
"
re="${hosts[#]}"
egrep -m 1 -B 3 -A 1 ${re// /|} <<< "$MyLog"
Variant with one echo
echo "$MyLog" | egrep -m 1 -B 3 -A 1 ${re// /|}
Usage
$ ./test
sdlkjds
lkasf
sfal
asf host2
sdflkj
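The trick is in how the pattern is assembled: assigning "${hosts[@]}" to a plain variable joins the array elements with spaces, and the expansion ${re// /|} then replaces every space with an alternation bar:

hosts=(host1 host2 host3)
re="${hosts[@]}"    # re is now "host1 host2 host3"
echo "${re// /|}"   # prints host1|host2|host3

Note this breaks if a host name itself contains a space.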
One echo, no loops, and all grepping done in parallel, with GNU Parallel:
echo "$MyLog" | parallel -k --tee --pipe 'grep -m 1 -B 3 -A 1 {}' ::: "${hosts[#]}"
The -k keeps the output in order.
The --tee and the --pipe ensure that the stdin is duplicated to all processes.
The command that is run in parallel is enclosed in single quotes.
printf your array one per line and use that as a pattern list for a single grep? Something like:
echo "$MyLog" | grep -m 1 -B 3 -A 1 -f <(printf '%s\n' "${Hosts[@]}") >> /LogOutput
Assuming you're on a GNU system; otherwise see info grep.
From grep --help
grep --help | head -n1
Output
Usage: grep [OPTION]... PATTERN [FILE]...
So according to that you can do:
for h in "${Hosts[#]}" ; do grep -m 1 -B 3 -A 1 "$h" "$MyLog" >> /LogOutput ; done
Hi, I have the following batch script where I submit each file for separate processing:
for file in ../Positive/*.txt_rn; do
bsub <<EOF
#BSUB -L /bin/bash
#BSUB -W 150:00
#BSUB -M 10000
#BSUB -n 3
#BSUB -e /somefolder/errors/%J.err
#BSUB -o /somefolder/errors/%J.out
while read line; do
name=`cat \$line | awk '{print $1":"$2"-"$3}'`
four=`cat \$line | awk '{print $4}' | cut -d\: -f4`
fasta=\$name".fa"
op=\$name".rs"
echo \$name | xargs samtools faidx /somefolder/rn4/Rattus_norvegicus/UCSC/rn4/Sequence/WholeGenomeFasta/genome.fa > \$fasta
Process -F \$fasta -M "list_"\$four".txt" -p 0.003 | awk '(\$5 >= 0.67)' > \$op
if [ -s "\$op" ]
then
cat "\$line" >> ../Positive_Strand/$file".cons"
fi
rm \$lne
rm \$op
rm \$fasta
done < $file
EOF
done
I am somehow unable to store the values of the columns from the line (which is in the $line variable) into the $name and $four variables, and hence unable to carry on with the further processing. Also, any suggestions for a better version of the code would be welcome.
If you change EOF to 'EOF' then you will more properly disable shell interpretation. Your problem is that your back-ticks (`) are not escaped.
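A quick way to see the difference (a minimal demo, nothing bsub-specific):

# unquoted delimiter: the shell expands $variables and `commands` in the body
cat <<EOF
$HOME
EOF

# quoted delimiter: the body is passed through literally
cat <<'EOF'
$HOME
EOF

The first prints your home directory; the second prints the literal string $HOME.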
I've fixed your indentation and cleaned up some of your code. Note that the syntax highlighting here doesn't understand cat <<'EOF'. If you paste that into vim with highlighting enabled, you'll see that block is all the same color since it's just a string.
bsub_helper() {
    # Unquoted delimiter: "$1" expands here, at submission time, baking the
    # file name into the generated job script. The #BSUB directives contain
    # nothing else the shell would expand.
    cat <<EOF
#BSUB -L /bin/bash
#BSUB -W 150:00
#BSUB -M 10000
#BSUB -n 3
#BSUB -e /somefolder/errors/%J.err
#BSUB -o /somefolder/errors/%J.out
file="$1"
EOF
    # Quoted delimiter: this body is passed through literally and only
    # expands when the job actually runs.
    cat <<'EOF'
while read line; do
    name=`cat $line | awk '{print $1":"$2"-"$3}'`
    four=`cat $line | awk '{print $4}' | cut -d: -f4`
    fasta="$name.fa"
    op="$name.rs"
    genome="/somefolder/rn4/Rattus_norvegicus/UCSC/rn4/Sequence/WholeGenomeFasta/genome.fa"
    echo $name | xargs samtools faidx "$genome" > "$fasta"
    Process -F "$fasta" -M "list_$four.txt" -p 0.003 | awk '($5 >= 0.67)' > "$op"
    if [ -s "$op" ]
    then
        cat "$line" >> "../Positive_Strand/$file.cons"
    fi
    rm "$line" "$op" "$fasta"
done < "$file"
EOF
}
for file in ../Positive/*.txt_rn; do
bsub_helper "$file" |bsub
done
I created a helper function because the script has to be assembled from two pieces: an unquoted heredoc that expands "$1" at submission time and bakes the file name into a file= assignment (I am assuming that is the only expansion you want before the job runs), and a quoted ('EOF') heredoc whose body is passed through literally. I also surrounded the variables with quotes so that the code can support file names with spaces in them.
I left your echo $name | xargs … line alone because it's so odd. Without quotes around $name, xargs will take each whitespace-separated entry as its own file. With quotes, xargs will only supply one (likely invalid) file name to samtools.
If $name is a single file, try:
samtools faidx "$genome" "$name" > "$fasta"
If $name is multiple files and none of them have spaces, try:
samtools faidx "$genome" $name > "$fasta"
The only reason to use xargs here would be if you have too much content for one command line, but if you're running echo $name | xargs then you'll run into the same problem.
I have a file 'tbook1' with a lot of numerical values (2M+). I have to perform the below in bash (Solaris / RHEL):
Do following:
Remove the 1st line and the last 2 lines
Remove (,") & (")
Substitute (, ) with (,)
I can do it using two methods:
Method1:
sed -e 1d -e 's/,"//g' -e 's/, /,/g' -e 's/"//g' -e 'N;$!P;$!D;$d' tbook1 > tbook1.3
method2:
tail -n +2 tbook1 | head -n -2 > tbook1.1
sed -e 's/,"//' -e 's/, //' tbook 1.1 > tbook1.2
I want to know which one is better, i.e. faster and more efficient (resource usage)?
Method 1 would usually be more efficient, mainly because of method 2's extra pipe and intermediate file that gets read and written.
Method one scans the file only once and writes one result (but please store the result in a file with a different name).
Method two scans the original file and the intermediate result, and writes both the intermediate and the final result. It is bound to be about twice as slow.
I think head and tail are more efficient for this line-elimination task than pure sed. But the other two answers are also right: you should avoid making several passes.
You can improve the second method by chaining them together:
tail -n +2 book.txt | head -n -2 | sed -e 's/,"//' -e 's/, //'
Then head and tail are faster. Try it yourself (on a reasonably sized file):
#!/usr/bin/env bash
target=/dev/null

test() {
    mode=$1
    start=$(date +%s)
    if [ "$mode" == 1 ]; then
        sed -e 1d -e 's/,"//g' -e 's/, /,/g' -e 's/"//g' -e 'N;$!P;$!D;$d' book.txt > $target
    elif [ "$mode" == 2 ]; then
        tail -n +2 book.txt | head -n -2 | sed -e 's/,"//' -e 's/, //' > $target
    else
        cat book.txt > /dev/null
    fi
    ((time = $(date +%s) - $start))
    echo $time "seconds"
}
echo "cat > /dev/null"
test 0
echo "sed > $target"
test 1
echo "tail/head > $target"
test 2
My results:
cat > /dev/null
0 seconds
sed > /dev/null
5 seconds
tail/head > /dev/null
3 seconds
First off, I'm new to this. I have some experience with windows scripting and apple script but not much with bash. What I'm trying to do is grab the PID and %CPU of a specific process. then compare the %CPU against a set number, and if it's higher, kill the process. I feel like I'm close, but now I'm getting the following error:
[[: 0.0: syntax error: invalid arithmetic operator (error token is ".0")
What am I doing wrong? Here's my code so far:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="50"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3'}`
if [[ ! $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Obviously I'm going to replace the echos in the if/then statement, but it's acting as if the statement is true regardless of what the CPU load actually is (I tested this by changing the -gt to -lt and it still echoed "crap").
Thank you for all the help. Oh, and this is on OS X 10.7, if that is important.
I recommend taking a look at the facilities of ps to avoid the multiple horrible things you are doing.
On my system (ps from procps on linux, GNU awk) I would do this:
ps -C "$app-name" -o pid=,pcpu= |
awk --assign maxcpu="$cpu_limit" '$2>maxcpu {print "crappy pid",$1}'
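If you then want to go on and kill the offenders, as the question asks, one possible extension of the same pipeline (xargs -r, which skips the kill when nothing matches, is a GNU extension):

ps -C "$app_name" -o pid=,pcpu= |
    awk -v maxcpu="$cpu_limit" '$2 > maxcpu {print $1}' |
    xargs -r kill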
The problem is that bash arithmetic can't handle decimals. You can just multiply them by 100 and work with plain integers instead:
#!/bin/bash
declare -i app_pid
declare -i app_cpu
declare -i cpu_limit
app_name="top"
cpu_limit="5000"
app_pid=`ps aux | grep $app_name | grep -v grep | awk {'print $2'}`
app_cpu=`ps aux | grep $app_name | grep -v grep | awk {'print $3*100'}`
if [[ $app_cpu -gt $cpu_limit ]]; then
echo "crap"
else
echo "we're good"
fi
Keep in mind that CPU percentage is a suboptimal measurement of application health. If you have two processes running infinite loops on a single core system, no other application of the same priority will ever go over 33%, even if they're thrashing around.
#!/bin/sh
PROCESS="java"
PID=`pgrep $PROCESS | tail -n 1`
CPU=`top -b -p $PID -n 1 | tail -n 1 | awk '{print $9}'`
echo $CPU
I came up with this, using top and bc.
Use it by passing in ex: ./script apache2 50 # max 50%
If there are many PIDs matching your program argument, only one will be calculated, based on how top lists them. I could have extended the script by catching them all and averaging the percentages, but this will have to do.
You can also pass in a number, ./script.sh 12345 50, which will force it to use an exact PID.
#!/bin/bash
# 1: ['command\ name' or PID number(,s)] 2: MAX_CPU_PERCENT
[[ $# -ne 2 ]] && exit 1
PID_NAMES=$1
# get all PIDS as nn,nn,nn
if [[ ! "$PID_NAMES" =~ ^[0-9,]+$ ]] ; then
PIDS=$(pgrep -d ',' -x $PID_NAMES)
else
PIDS=$PID_NAMES
fi
# echo "$PIDS $MAX_CPU"
MAX_CPU="$2"
MAX_CPU="$(echo "($MAX_CPU+0.5)/1" | bc)"
LOOP=1
while [[ $LOOP -eq 1 ]] ; do
sleep 0.3s
# Depending on your 'top' version and OS you might have
# to change head and tail line-numbers
LINE="$(top -b -d 0 -n 1 -p $PIDS | head -n 8 \
| tail -n 1 | sed -r 's/[ ]+/,/g' | \
sed -r 's/^\,|\,$//')"
# If multiple processes in $PIDS, $LINE will only match\
# the most active process
CURR_PID=$(echo "$LINE" | cut -d ',' -f 1)
# calculate cpu limits
CURR_CPU_FLOAT=$(echo "$LINE"| cut -d ',' -f 9)
CURR_CPU=$(echo "($CURR_CPU_FLOAT+0.5)/1" | bc)
echo "PID $CURR_PID: $CURR_CPU""%"
if [[ $CURR_CPU -ge $MAX_CPU ]] ; then
echo "PID $CURR_PID ($PID_NAMES) went over $MAX_CPU""%"
echo "[[ $CURR_CPU""% -ge $MAX_CPU""% ]]"
LOOP=0
break
fi
done
echo "Stopped"
Erik, I used a modified version of your code to create a new script that does something similar. Hope you don't mind it.
A bash script to get the CPU usage by process
usage:
nohup ./check_proc bwengine 70 &
bwengine is the process name we want to monitor; 70 means log only when the process is using over 70% of the CPU.
Check the logs at: /var/log/check_procs.log
The output should be like:
DATE | TOTAL CPU | CPU USAGE | Process details
Example:
03/12/14 17:11 |20.99|98| ProdPROXY-ProdProxyPA.tra
03/12/14 17:11 |20.99|100| ProdPROXY-ProdProxyPA.tra
Link to the full blog:
http://felipeferreira.net/?p=1453
It is also useful to have the app_user information available, to test whether the current user has the rights to kill/modify the running process. This information can be obtained along with the needed app_pid and app_cpu by using read, eliminating the need for awk or any other third-party parser:
read app_user app_pid tmp_cpu stuff <<< \
$( ps aux | grep "$app_name" | grep -v "grep\|defunct\|${0##*/}" )
You can then get your app_cpu * 100 with:
app_cpu=$((${tmp_cpu%.*} * 100))
Note: Including defunct and ${0##*/} in grep -v prevents against multiple processes matching $app_name.
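One caveat: ${tmp_cpu%.*} truncates the fractional part before the multiplication, so a tmp_cpu of 3.7 yields 300 rather than 370. If you need that precision while staying parser-free, a sketch that assumes the one-decimal format ps aux prints:

int_part=${tmp_cpu%.*}     # "3" from "3.7"
frac_part=${tmp_cpu#*.}    # "7" from "3.7"
app_cpu=$(( int_part * 100 + 10#$frac_part * 10 ))   # 370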
I use top to check some details. It provides a few more details like CPU time.
On Linux this would be:
top -b -n 1 | grep $app_name
On Mac, with its BSD version of top:
top -l 1 | grep $app_name
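To feed either of these into the integer comparison used earlier, pull out just the %CPU column. A sketch against the Linux procps top, where %CPU is the 9th field as in the answer above (the column number differs between top versions, so check yours):

app_cpu=$(top -b -n 1 | grep "$app_name" | head -n 1 | awk '{print $9 * 100}')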