Arithmetic operation fails in Shell script - bash

Basically, I'm trying to check whether any of the last 3 lines of the log contain an HTTP 200 response, but I'm getting the error below, and because of it the head command fails. Please help.
LINES=`cat http_access.log |wc -l`
for i in $LINES $LINES-1 $LINES-2
do
echo "VALUE $i"
head -$i http_access.log | tail -1 > holy.txt
temp=`cat holy.txt| awk '{print $9}'`
if [[ $temp == 200 ]]
then
echo "line $i has 200 code at "
cat holy.txt | awk '{print $4}'
fi
done
Output:
VALUE 18
line 18 has 200 code at [21/Jan/2018:15:34:23
VALUE 18-1
head: invalid trailing option -- - Try `head --help' for more information.

Use $((...)) to perform arithmetic.
for i in $((LINES)) $((LINES-1)) $((LINES-2))
Without it, it's attempting to run the commands:
head -18 http_access.log
head -18-1 http_access.log
head -18-2 http_access.log
The latter two are errors.
A more flexible way to write the for loop would be using C-style syntax:
for ((i = LINES - 2; i <= LINES; ++i)); do
...
done
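Putting the pieces together, here is a minimal sketch of the corrected loop. It assumes the access-log layout from the question (field 9 is the status code, field 4 the timestamp); the function name check_last_lines is made up for illustration, and sed -n "${i}p" stands in for the head | tail pair and the temporary file.

```shell
#!/bin/bash
# check_last_lines FILE: report which of the last three lines have status 200.
# Hypothetical helper; field 9 = status, field 4 = timestamp, as in the question.
check_last_lines() {
    local file=$1 total start i line status stamp
    total=$(wc -l < "$file")
    start=$(( total > 2 ? total - 2 : 1 ))    # arithmetic expansion, clamped to line 1
    for (( i = total; i >= start; --i )); do
        line=$(sed -n "${i}p" "$file")        # print only line i
        status=$(awk '{print $9}' <<< "$line")
        stamp=$(awk '{print $4}' <<< "$line")
        if [[ $status == 200 ]]; then
            echo "line $i has 200 code at $stamp"
        fi
    done
}
```

Call it as check_last_lines http_access.log. Files with fewer than three lines are handled by the clamp on start.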

You got the why from JohnKugelman's answer, I will just propose a simplified code that might work for you:
while read -ra fields; do
[[ ${fields[9]} = 200 ]] && echo "Line ${fields[0]} has 200 code: ${fields[4]}"
done < <(cat -n http_access.log | tail -n 3 | tac)
cat -n: Numbers lines of the file
tail -n 3: Prints 3 last lines. You can just change this number for more lines
tac: Prints the lines outputted by tail in reversed order
read -ra fields: Reads the fields into an array $fields
${fields[0]}: The line number
${fields[num_of_field]}: Individual fields
You can also use wc instead of numbering using cat -n. For larger inputs, this will be slightly faster:
lines=$(wc -l < http_access.log)
while read -ra fields; do
[[ ${fields[8]} = 200 ]] && echo "Line $lines has 200 code: ${fields[3]}"
((lines--))
done < <(tail -n 3 http_access.log | tac)
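If you would rather avoid the shell loop altogether, the same check can be sketched as one awk pass over the output of tail. This is an alternative, not what either answer above uses; report_200 and its optional second argument are invented for the example, and the field positions are the ones from the question.

```shell
#!/bin/bash
# report_200 FILE [N]: hypothetical helper, checks the last N lines (default 3).
report_200() {
    local file=$1 n=${2:-3} total
    total=$(wc -l < "$file")
    if (( n > total )); then
        n=$total                      # file is shorter than N lines
    fi
    # first = line number of the first line that tail prints
    tail -n "$n" "$file" |
        awk -v first="$(( total - n + 1 ))" \
            '$9 == 200 { print "Line " (first + NR - 1) " has 200 code: " $4 }'
}
```

Unlike the tac version, this reports matches in file order (oldest first).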

Related

Incremental head -n in bash for loop

How can I do an incremental for loop for the -n in the head command (head -n)?
Does this work?
for (( i = 1 ; i <= $NUMBER ; i++ ))
head -$(NUMBER) filename.txt
NUMBER=$((NUMBER+1))
done
The code is supposed to display different text from filename.txt using the -n option.
The following should work:
for (( i = 1 ; i <= `wc -l filename.txt | cut -f 1 -d ' '` ; i++ )); do
head -$i filename.txt | tail -1;
done
The wc -l filename.txt gets the number of lines in filename.txt. cut -f 1 -d ' ' takes the first field of the wc output, which is the number of lines. This is used as the upper bound for the loop.
head -$i takes the first $i lines and tail -1 takes the last line of that. This gives you one line blocks.
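A short sketch of both readings of the question, in case it helps. The function names are made up; the first one runs head with a growing -n, the second prints one line per iteration without re-reading the file each time.

```shell
#!/bin/bash
# show_growing_heads FILE: hypothetical helper that runs head -n 1, head -n 2, ...
show_growing_heads() {
    local file=$1 total i
    total=$(wc -l < "$file")
    for (( i = 1; i <= total; i++ )); do
        echo "== first $i line(s) =="
        head -n "$i" "$file"
    done
}

# print_line_by_line FILE: one line per iteration, reading the file only once.
print_line_by_line() {
    local line
    while IFS= read -r line; do
        printf '%s\n' "$line"
    done < "$1"
}
```

The second form is usually the better choice when all you need is the i-th line on the i-th pass, because head | tail rescans the file from the top on every iteration.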

Output a file in two columns in BASH

I'd like to rearrange a file in two columns after the nth line.
For example, say I have a file like this here:
This is a bunch
of text
that I'd like to print
as two
columns starting
at line number 7
and separated by four spaces.
Here are some
more lines so I can
demonstrate
what I'm talking about.
And I'd like to print it out like this:
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
How could I do that with a bash command or function?
Actually, pr can do almost exactly this:
pr --output-tabs=' 1' -2 -t tmp1
↓
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
-2 for two columns; -t to omit page headers; and without the --output-tabs=' 1', it'll insert a tab for every 8 spaces it added. You can also set the page width and length (if your actual files are much longer than 100 lines); check out man pr for some options.
If you're fixed upon “four spaces more than the longest line on the left,” then perhaps you might have to use something a bit more complex;
The following works with your test input, but is getting to the point where the correct answer would be, “just use Perl, already;”
#!/bin/sh
infile=${1:-tmp1}
longest=$(longest=0;
head -n $(( $( wc -l $infile | cut -d ' ' -f 1 ) / 2 )) $infile | \
while read line
do
current="$( echo $line | wc -c | cut -d ' ' -f 1 )"
if [ $current -gt $longest ]
then
echo $current
longest=$current
fi
done | tail -n 1 )
pr -t -2 -w$(( $longest * 2 + 6 )) --output-tabs=' 1' $infile
↓
This is a bunch and separated by four spa
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
… re-reading your question, I wonder if you meant that you were going to literally specify the nth line to the program, in which case, neither of the above will work unless that line happens to be halfway down.
Thank you chatraed and BRPocock (and your colleague). Your answers helped me think up this solution, which answers my need.
function make_cols
{
file=$1 # input file
line=$2 # line to break at
pad=$(($3-1)) # spaces between cols - 1
len=$( wc -l < $file )
max=$(( $( wc -L < <(head -$(( line - 1 )) $file ) ) + $pad ))
SAVEIFS=$IFS;IFS=$(echo -en "\n\b")
paste -d" " <( for l in $( cat <(head -$(( line - 1 )) $file ) )
do
printf "%-""$max""s\n" $l
done ) \
<(tail -$(( len - line + 1 )) $file )
IFS=$SAVEIFS
}
make_cols tmp1 7 4
Could be optimized in many ways, but does its job as requested.
Input data (configurable):
file
num of rows borrowed from file for the first column
num of spaces between columns
format.sh:
#!/bin/bash
file=$1
if [[ ! -f $file ]]; then
echo "File not found!"
exit 1
fi
spaces_col1_col2=4
rows_col1=6
rows_col2=$(($(cat $file | wc -l) - $rows_col1))
IFS=$'\n'
ar1=($(head -$rows_col1 $file))
ar2=($(tail -$rows_col2 $file))
maxlen_col1=0
for i in "${ar1[@]}"; do
if [[ $maxlen_col1 -lt ${#i} ]]; then
maxlen_col1=${#i}
fi
done
maxlen_col1=$(($maxlen_col1+$spaces_col1_col2))
if [[ $rows_col1 -lt $rows_col2 ]]; then
rows=$rows_col2
else
rows=$rows_col1
fi
ar=()
for i in $(seq 0 $(($rows-1))); do
line=$(printf "%-${maxlen_col1}s\n" ${ar1[$i]})
line="$line${ar2[$i]}"
ar+=("$line")
done
printf '%s\n' "${ar[@]}"
Output:
$ > bash format.sh myfile
This is a bunch and separated by four spaces.
of text Here are some
that I'd like to print more lines so I can
as two demonstrate
columns starting what I'm talking about.
at line number 7
$ >
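For comparison, a rough sketch of the same idea using mapfile and the * width specifier of printf. It assumes bash 4 or later; two_cols and its argument order (file, line to split at, spaces between columns) are invented here and not taken from any answer above.

```shell
#!/bin/bash
# two_cols FILE SPLIT [PAD]: hypothetical helper. Lines before SPLIT form the
# left column, lines from SPLIT onward the right column, with PAD spaces
# (default 4) after the longest left-hand line. Needs bash 4+ for mapfile.
two_cols() {
    local file=$1 split=$2 pad=${3:-4}
    local -a left right
    local width=0 rows i l
    mapfile -t left  < <(head -n "$(( split - 1 ))" "$file")
    mapfile -t right < <(tail -n "+$split" "$file")
    for l in "${left[@]}"; do
        if (( ${#l} > width )); then
            width=${#l}                       # longest line in the left column
        fi
    done
    rows=$(( ${#left[@]} > ${#right[@]} ? ${#left[@]} : ${#right[@]} ))
    for (( i = 0; i < rows; i++ )); do
        if [[ -n ${right[i]-} ]]; then
            printf '%-*s%s\n' "$(( width + pad ))" "${left[i]-}" "${right[i]}"
        else
            printf '%s\n' "${left[i]-}"       # no trailing padding on short rows
        fi
    done
}
```

two_cols tmp1 7 4 corresponds to the example in the question: split at line 7, four spaces between the columns.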

Getting error: sed: -e expression #1, char 2: unknown command: `.'

EDIT: FIXED. Now concerned with optimizing the code.
I am writing a script to separate data from one file into multiple files. When I run the script, I get the error: "sed: -e expression #1, char 2: unknown command: `.'" without any line number, making it somewhat hard to debug. I have checked the lines in which I use sed individually, and they work without problem. Any ideas? I realize that there are a lot of things that I did somewhat unconventionally and that there are faster ways of doing some things (I'm sure there's a way to avoid continuously importing somefile), but right now I'm just trying to understand this error. Here is the code:
x1=$(sed -n '1p' < somefile | cut -f1)
y1=$(sed -n '1p' < somefile | cut -f2)
p='p'
for i in 1..$(seq 1 $(cat "somefile" | wc -l))
do
x2=$(sed -n $i$p < somefile | cut -f1)
y2=$(sed -n $i$p < somefile | cut -f1)
if [ "$x1" = "$x2" ] && [ "$y1" = "$y2" ];
then
x1=$x2
y1=$x2
fi
s="$(sed -n $i$p < somefile | cut -f3) $(sed -n $i$p < somefile | cut$
echo $s >> "$x1-$y1.txt"
done
The problem is in the following line:
for i in 1..$(seq 1 $(cat "somefile" | wc -l))
If somefile were to have 3 lines, then this would result in following values of i:
1..1
2
3
Clearly, something like sed -n 1..1p < filename would result in the error you are observing: sed: -e expression #1, char 2: unknown command: '.'
You rather want:
for i in $(seq 1 $(cat "somefile" | wc -l))
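To see the word splitting for yourself, here is a small demo. Both function names are made up, and the argument stands in for the line count of somefile.

```shell
#!/bin/bash
# Both functions take the line count as $1; the names are invented for this demo.

# Reproduces the bug: the literal "1.." sticks to the first word seq prints.
broken_range() {
    local i
    for i in 1..$(seq 1 "$1"); do
        echo "$i"
    done
}

# The fix: let seq supply the whole list.
fixed_range() {
    local i
    for i in $(seq 1 "$1"); do
        echo "$i"
    done
}

broken_range 3   # prints 1..1, 2 and 3 on separate lines
fixed_range 3    # prints 1, 2 and 3 on separate lines
```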
This is the cause of the problem:
for i in 1..$(seq 1 $(cat "somefile" | wc -l))
Try just
for i in $(seq 1 $(wc -l < somefile))
However, you are reading your file many, many times too often with all those sed commands. Read it just once:
read x1 y1 < <(sed 1q somefile)
while read x2 y2 f3 f4; do
if [[ $x1 = $x2 && $y1 = $y2 ]]; then
x1=$x2
y1=$x2
fi
echo "$f3 $f4"
done < somefile > "$x1-$y1.txt"
The line where you construct the s variable is truncated -- I'm assuming you have 4 fields per line.
Note: a problem with cut-and-paste coding is that you introduce errors: you assign y2 the same field as x2

Get 20% of lines in File randomly

This is my code:
nb_lignes=`wc -l $1 | cut -d " " -f1`
for i in $(seq $nb_lignes)
do
m=`head $1 -n $i | tail -1`
//command
done
How can I change it to randomly pick 20% of the lines in the file and apply "command" to each of them?
The percentage (20%, 40%, 60%, ...) is a parameter.
Thank you.
This will randomly get 20% of the lines in the file:
awk -v p=20 'BEGIN {srand()} rand() <= p/100' filename
So something like this for the whole solution (assuming bash):
#!/bin/bash
filename="$1"
pct="${2:-20}" # specify percentage
while read line; do
: # some command with "$line"
done < <(awk -v p="$pct" 'BEGIN {srand()} rand() <= p/100' "$filename")
If you're using a shell without process substitution (the <(...) bit), you can do this instead, but the body of the loop runs in a subshell and won't be able to have any side effects in the outer script (e.g. any variables it sets won't be set anymore once the loop completes):
#!/bin/sh
filename="$1"
pct="${2:-20}" # specify percentage
awk -v p="$pct" 'BEGIN {srand()} rand() <= p/100' "$filename" |
while read line; do
: # some command with "$line"
done
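The same filter wrapped in functions, as a sketch. The names sample_lines and process_sample are hypothetical, and I use a strict rand() < p/100 comparison here (instead of <=) so that a percentage of 0 reliably selects nothing.

```shell
#!/bin/bash
# sample_lines FILE [PCT]: hypothetical wrapper around the awk filter above.
# Each line is kept independently with probability PCT/100 (default 20), so
# the number of lines returned varies from run to run.
sample_lines() {
    awk -v p="${2:-20}" 'BEGIN { srand() } rand() < p / 100' "$1"
}

# process_sample FILE [PCT]: apply a command (here just echo) to each sampled line.
process_sample() {
    local line
    while IFS= read -r line; do
        echo "processing: $line"
    done < <(sample_lines "$1" "${2:-20}")
}
```

Keep in mind that srand() with no argument seeds from the time of day, so two runs within the same second will typically pick the same lines.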
Try this:
file=$1
nb_lignes=$(wc -l $file | cut -d " " -f1)
num_lines_to_get=$((20*${nb_lignes}/100))
for (( i=0; i < $num_lines_to_get; i++))
do
line=$(head -$(( ${RANDOM} % $nb_lignes + 1 )) $file | tail -1)
echo "$line"
done
Note that ${RANDOM} only generates numbers less than 32768 so this approach won't work for large files.
If you have shuf installed, you can use the following to get a random line instead of using $RANDOM.
line=$(shuf -n 1 $file)
You can do it with awk; see below:
awk -v b=20 '{a[NR]=$0}END{val=((b/100)*NR)+1;for(i=1;i<val;i++)print a[i]}' all.log
The above command prints the first 20% of the lines, starting from the beginning of the file (it does not pick them at random).
You just have to change the value of b on the command line to get the required percentage of lines.
Tested below:
> cat temp
1
2
3
4
5
6
7
8
9
10
> awk -v b=10 '{a[NR]=$0}END{val=((b/100)*NR)+1;for(i=1;i<val;i++)print a[i]}' temp
1
> awk -v b=20 '{a[NR]=$0}END{val=((b/100)*NR)+1;for(i=1;i<val;i++)print a[i]}' temp
1
2
>
shuf will produce the file in a randomized order; if you know how many lines you want, you can give that to the -n parameter. No need to get them one at a time. So:
shuf -n $(( $(wc -l < "$file") * $PCT / 100 )) "$file" |
while read line; do
: # do something with $line
done
shuf comes standard with GNU/Linux distros afaik.
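As a function, roughly (sample_exact is a made-up name, and this assumes GNU shuf is installed). Unlike the awk rand() filter, this returns an exact number of lines, rounded down, rather than approximately the requested percentage.

```shell
#!/bin/bash
# sample_exact FILE [PCT]: hypothetical helper; prints floor(lines * PCT / 100)
# distinct lines of FILE in random order. PCT defaults to 20.
sample_exact() {
    local file=$1 pct=${2:-20} total
    total=$(wc -l < "$file")
    shuf -n "$(( total * pct / 100 ))" "$file"
}
```

Usage would be something like sample_exact myfile 40 | while read -r line; do ...; done.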

How to verify information using standard linux/unix filters?

I have the following data in a Tab delimited file:
_ DATA _
Col1 Col2 Col3 Col4 Col5
blah1 blah2 blah3 4 someotherText
blahA blahZ blahJ 2 someotherText1
blahB blahT blahT 7 someotherText2
blahC blahQ blahL 10 someotherText3
I want to make sure that the data in 4th column of this file is always an integer. I know how to do this in perl
Read each line, Store value of 4th column in a variable
check if that variable is an integer
if above is true, continue the loop
else break out of the loop with message saying file data not correct
But how would I do this in a shell script using standard linux/unix filter? My guess would be to use grep, but I am not sure how?
cut -f4 data | LANG=C grep -q '[^0-9]' && echo invalid
LANG=C for speed
-q to quit at first error in possible long file
If you need to strip the first line then use tail -n+2 or you could get hacky and use:
cut -f4 data | LANG=C sed -n '1b;/[^0-9]/{s/.*/invalid/p;q}'
awk is the tool most naturally suited for parsing by columns:
awk '{if ($4 !~ /^[0-9]+$/) { print "Error! Column 4 is not an integer:"; print $0; exit 1}}' data.txt
As you get more complex with your error detection, you'll probably want to put the awk script in a file and invoke it with awk -f verify.awk data.txt.
Edit: in the form you'd put into verify.awk:
{
if ($4 !~/^[0-9]+$/) {
print "Error! Column 4 is not an integer:"
print $0
exit 1
}
}
Note that I've made awk exit with a non-zero code, so that you can easily check it in your calling script with something like this in bash:
if awk -f verify.awk data.txt; then
# action for success
else
# action for failure
fi
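A variant of the same check as a shell function, sketched under two assumptions taken from the question: the file is tab-delimited and its first line is a header. The name column_is_integer and the optional column argument are mine.

```shell
#!/bin/bash
# column_is_integer FILE [COL]: hypothetical wrapper; succeeds only if every
# data row (header skipped) has a non-negative integer in column COL (default 4).
column_is_integer() {
    awk -F'\t' -v col="${2:-4}" '
        NR > 1 && $col !~ /^[0-9]+$/ {
            printf "line %d: column %d is not an integer: %s\n", NR, col, $col
            bad = 1
            exit              # jump to END at the first bad row
        }
        END { exit bad + 0 }  # 0 when every row passed, 1 otherwise
    ' "$1"
}
```

It can be used directly in a condition: if column_is_integer data.txt; then ...; fi.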
You could use grep, but it doesn't inherently recognize columns. You'd be stuck writing patterns to match the columns.
awk is what you need.
I can't upvote yet, but I would upvote Jefromi's answer if I could.
Sometimes you need it BASH only, because tr, cut & awk behave differently on Linux/Solaris/Aix/BSD/etc:
while read a b c d e ; do [[ "$d" =~ ^[0-9]+$ ]] || echo "$a: $d not a number" ; done < data
Edited....
#!/bin/bash
isdigit ()
{
[ $# -eq 1 ] || return 0
case $1 in
*[!0-9]*|"") return 0;;
*) return 1;;
esac
}
while read line
do
col=($line)
digit=${col[3]}
if isdigit "$digit"
then
echo "err, no digit $digit"
else
echo "hey, we got a digit $digit"
fi
done
Use this in a script foo.sh and run it like ./foo.sh < data.txt
See tldp.org for more info
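Note that isdigit above returns 0 when the value is not a number, which is the reverse of the usual shell convention. A sketch with the conventional return codes follows; is_integer and check_fourth_column are hypothetical names, and the header line is skipped on the assumption that the input looks like the sample in the question.

```shell
#!/bin/bash
# is_integer VALUE: returns 0 (success) when VALUE is a non-empty string of digits.
is_integer() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;
        *)           return 0 ;;
    esac
}

# check_fourth_column: reads tab-separated rows from stdin, skips the header,
# and stops at the first row whose fourth column is not an integer.
# Caveat: tab counts as IFS whitespace, so empty columns collapse.
check_fourth_column() {
    local c1 c2 c3 c4 rest n=0
    while IFS=$'\t' read -r c1 c2 c3 c4 rest; do
        (( ++n == 1 )) && continue            # header line
        if ! is_integer "$c4"; then
            echo "line $n: '$c4' is not an integer"
            return 1
        fi
    done
    return 0
}
```

Run it as check_fourth_column < data.txt and test the exit status.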
Pure Bash:
linenum=1; while read line; do field=($line); if ((linenum>1)); then [[ ! ${field[3]} =~ ^[[:digit:]]+$ ]] && echo "FAIL: line number: ${linenum}, value: '${field[3]}' is not an integer"; fi; ((linenum++)); done < data.txt
To stop at the first error, add a break:
linenum=1; while read line; do field=($line); if ((linenum>1)); then [[ ! ${field[3]} =~ ^[[:digit:]]+$ ]] && echo "FAIL: line number: ${linenum}, value: '${field[3]}' is not an integer" && break; fi; ((linenum++)); done < data.txt
cut -f 4 filename
will return the fourth field of each line to stdout.
Hopefully that's a good start, because it's been a long time since I had to do any major shell scripting.
Mind, this may well not be the most efficient compared to iterating through the file with something like perl.
tail -n +2 x.x | sort -n -k 4 | head -1 | cut -f 4 | egrep "^[0-9]+$"
if [ "$?" == "0" ]
then
echo "file is ok";
fi
tail -n +2 gives you all but the first line (since your sample has a header); the older tail +2 spelling is not accepted by all current versions of tail
sort -n -k 4 sorts the file numerically on the 4th column, letters will rise to the top.
head -1 gives you the first line of the file
cut -f 4 gives you the 4th column, of the first line
egrep "^[0-9]+$" checks if the value is a number (integers in this case).
If egrep finds nothing, $? is 1, otherwise it's 0.
There's also:
if [ `tail -n +2 x.x | wc -l` == `tail -n +2 x.x | cut -f 4 | egrep "^[0-9]+$" | wc -l` ]; then
echo "file is ok";
fi
This will be faster, requiring two simple scans through the file, but it's not a single pipeline.
@OP, use awk
awk '$4+0<=0{print "not ok";exit}' file

Resources