I would like to block IP addresses found in a log file; the IPs are collected from apache2's access.log. The IPs are collected correctly into the file ips.log, but when that file is read back to ban the collected IPs, the block is never applied.
#!/bin/bash
# Store bad words to search for
BADWORDS=( '/etc/passwd' 'file?' 'w00tw00t' 'fckeditor' 'ifconfig' )
# Get the number of elements in the BADWORDS array
entries=${#BADWORDS[@]}
for ((i=0; i<=entries-1; i++))
do
    setBadWord=${BADWORDS[$i]}
    tail -F /var/log/apache2/access.log | grep --line-buffered "$setBadWord" | while read -r a; do echo "$a" | awk '{ print $1 }' >> ips.log; done
done # end for
while IFS= read -r ip; do
    iptables -A INPUT -s "$ip" -j DROP
done < ips.log
Your code has many issues:
- it runs a new copy of awk for every line selected (awk is not needed at all);
- it tries to run the first loop multiple times (once for every element of "${BADWORDS[@]}");
- the first loop never finishes (because of tail -F), so the iptables loop never starts;
- the iptables command appends a new rule even if the IP has been seen before;
- it is simpler to write i<entries than i<=entries-1, and simpler still to just use for setBadWord in "${BADWORDS[@]}"; do ...
If you really want to permanently loop reading the logfile, with GNU utilities you can do something like:
#!/bin/sh
log=/var/log/apache2/access.log
words=/my/list/of/badwords/one/per/line
banned=/my/list/of/already/banned/ips

touch "$banned"   # make sure the ban list exists before we grep it
tail -F "$log" |
grep --line-buffered -Ff "$words" |
while read -r ip junk; do
    grep -qxF "$ip" "$banned" || {
        iptables -A INPUT -s "$ip" -j DROP
        echo "$ip" >> "$banned"
    }
done
# we never get here because "tail -F" never finishes
To just process the logfile once and then finish, you can feed grep from "$log" directly:
grep --line-buffered -Ff "$words" "$log" | ...
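Spelled out, that one-pass variant might look like this (a sketch, reusing the hypothetical word-list and ban-list paths from above; --line-buffered is only needed when following a live stream, so it is dropped here):
log=/var/log/apache2/access.log
words=/my/list/of/badwords/one/per/line
banned=/my/list/of/already/banned/ips

touch "$banned"
# one pass over the existing log; the loop ends when grep reaches EOF
grep -Ff "$words" "$log" |
while read -r ip junk; do
    grep -qxF "$ip" "$banned" || {
        iptables -A INPUT -s "$ip" -j DROP
        echo "$ip" >> "$banned"
    }
done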
but it is probably less error-prone to just use fail2ban, which is explicitly designed for this sort of task.
I have a log file with a lot of lines with the following format:
IP - - [Timestamp Zone] 'Command Weblink Format' - size
I want to write a script.sh that gives me the number of times each website has been clicked.
The command:
awk '{print $7}' server.log | sort -u
should give me a list which puts each unique weblink in a separate line. The command
grep 'Weblink1' server.log | wc -l
should give me the number of times Weblink1 has been clicked. I want a command that reads each line produced by the awk command above into a variable, and a loop that runs the grep command on each extracted weblink. I could use
while IFS='' read -r line || [[ -n "$line" ]]; do
echo "Text read from file: $line"
done
(source: Read a file line by line assigning the value to a variable) but I don't want to save the output of the Awk script in a .txt file.
My guess would be:
while IFS='' read -r line || [[ -n "$line" ]]; do
grep '$line' server.log | wc -l | ='$variabel' |
echo " $line was clicked $variable times "
done
But I'm not really familiar with connecting commands in a loop, as this is my first time. Would this loop work and how do I connect my loop and the Awk script?
Shell commands in a loop connect the same way they do without a loop, and you aren't very close. But yes, this can be done in a loop if you want the horribly inefficient way for some reason such as a learning experience:
awk '{print $7}' server.log |
sort -u |
while IFS= read -r line; do
    n=$(grep -cF "$line" server.log)
    echo "$line" clicked $n times
done
# you only need the read || [ -n ] idiom if the input can end with an
# unterminated partial line (is ill-formed); awk print output can't.
# you don't really need the IFS= and -r because the data here is URLs
# which cannot contain whitespace and shouldn't contain backslash,
# but I left them in as good-habit-forming.
# in general variable expansions should be double-quoted
# to prevent wordsplitting and/or globbing, although in this case
# $line is a URL which cannot contain whitespace and practically
# cannot be a glob. $n is a number and definitely safe.
# grep -cF does the count (literally, so . and ? in a URL aren't
# treated as regex metacharacters) and you don't need wc -l
or more simply
awk '{print $7}' server.log |
sort -u |
while IFS= read -r line; do
    echo "$line" clicked $(grep -cF "$line" server.log) times
done
However if you just want the correct results, it is much more efficient and somewhat simpler to do it in one pass in awk:
awk '{n[$7]++}
    END{for(i in n){
        print i,"clicked",n[i],"times"}}' server.log |
sort
# or GNU awk 4+ can do the sort itself, see the doc:
awk '{n[$7]++}
    END{PROCINFO["sorted_in"]="#ind_str_asc";
        for(i in n){
            print i,"clicked",n[i],"times"}}' server.log
The associative array n collects the values from the seventh field as keys, and on each line, the value for the extracted key is incremented. Thus, at the end, the keys in n are all the URLs in the file, and the value for each is the number of times it occurred.
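As a toy illustration (made-up lines in the question's format), the one-pass awk produces:
printf '%s\n' \
    '1.1.1.1 - - [t z] "GET /index.php HTTP/1.1" 200 10' \
    '2.2.2.2 - - [t z] "GET /about.html HTTP/1.1" 200 10' \
    '3.3.3.3 - - [t z] "GET /index.php HTTP/1.1" 200 10' |
awk '{n[$7]++} END{for(i in n) print i,"clicked",n[i],"times"}' | sort
# /about.html clicked 1 times
# /index.php clicked 2 times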
Not sure how to explain this, but what I am trying to achieve is:
- tailing a file and grepping for a pattern A
- then piping into another function, customGrepFunction, where it matches pattern B, and if B matches, echoing something out. I need the customGrepFunction in order to do some other custom stuff.
The sticky part is how to make customGrepFunction work here. In other words: when only patternA matches, echo the whole line, and when both patternA and patternB match, print out something custom:
when I only run:
tail -f file.log | grep patternA
I can see the patternA rows being printed/tailed; however, when I add the customGrepFunction nothing happens.
tail -f file.log | grep patternA | customGrepFunction
And the customGrepFunction should be available globally in my bin folder:
customGrepFunction(){
    if grep patternB
    then
        echo "True"
    fi
}
I have this setup, however it doesn't do what I need it to do: it only echoes True when I press Ctrl+C and exit the tailing.
What am I missing here?
Thanks
What's Going Wrong
The code: if grep patternB; then echo "true"; fi
...waits for grep patternB to exit, which will happen only when the input from tail -f file.log | grep patternA hits EOF. Since tail -f waits for new content forever, there will never be an EOF, so your if statement will never complete.
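You can see the EOF dependence with a finite input (a toy example):
# grep prints the matching line, the pipe then hits EOF, grep exits 0, and "True" appears:
printf 'one patternB two\n' | customGrepFunction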
How To Fix It
Don't use grep on the inside of your function. Instead, process content line-by-line and use bash's native regex support:
customGrepFunction() {
    while IFS= read -r line; do
        if [[ $line =~ patternB ]]; then
            echo "True"
        fi
    done
}
Next, make sure that grep isn't buffering content (if it were, then it would be written to your code only in big chunks, delaying until such a chunk is available). The means to do this varies by implementation, but with GNU grep, it would look like:
tail -f file.log | grep --line-buffered patternA | customGrepFunction
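If your grep lacks --line-buffered, stdbuf from GNU coreutils (an assumption about your platform) can often force line buffering on a stdio-based filter:
tail -f file.log | stdbuf -oL grep patternA | customGrepFunction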
Here is what I have, and it is not working:
for i in `cat cnames.csv`
do nslookup $i | grep -v "8.8.8.8\|=\|Non-authoritative" >> output.txt
done
Any better solutions?
This is Bash FAQ 001; you don't iterate over a file using a for loop.
while IFS= read -r i; do
    nslookup "$i"
done < cnames.csv | grep -v "8.8.8.8\|=\|Non-authoritative" > output.txt
Note that you don't need to run grep separately for each call to nslookup; you can pipe the aggregate output to a single call.
You can use the exit status of nslookup.
while IFS= read -r i; do
    if nslookup "$i" > /dev/null; then   # discard the lookup output; we only want the verdict
        echo "$i is valid"
    else
        echo "$i not found"
    fi
done < cnames.csv
Is cnames.csv a real .csv file? Wouldn't that require extracting only the column with the addresses in it? Right now the commas and any other fields are read too.
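If it really is a CSV, one way is to split off the first column in the read itself (a sketch, assuming the addresses are in column one and there are no quoted fields):
while IFS=, read -r addr _; do
    nslookup "$addr"
done < cnames.csv | grep -v "8.8.8.8\|=\|Non-authoritative" > output.txt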
You could probably get them all looked up faster in parallel and more succinctly with GNU Parallel
parallel -a cnames.csv nslookup {} | grep ...
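With the question's filter spelled out, that might look like (a sketch):
parallel -a cnames.csv nslookup {} | grep -v "8.8.8.8\|=\|Non-authoritative" > output.txt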
Protagonists
The Admin
Pipes
The Cron Daemon
A bunch of text processing utilities
netstat
>> the Scribe
Setting
The Cron Daemon is repeatedly performing the same job where he forces an innocent netstat to show the network status (netstat -n). Pipes then have to pick up the information and deliver it to bystanding text processing utilities (| grep tcp | awk '{ print $5 }' | cut -d "." -f-4). >> has to scribe the important results to a file. As his highness, The Admin, is a lazy and easily annoyed ruler, >> only wants to scribe new information to the file.
*/1 * * * * netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 >> /tmp/file
Soliloquy by >>
To append, or not append, that is the question:
Whether 'tis new information to bother The Admin with
and earn an outrageous Fortune,
Or to take Arms against `netstat` and the others,
And by opposing, ignore them? To die: to sleep;
note by the publisher: For all those who had problems understanding Hamlet, as I did, the question is: how do I check whether the string is already included in the file, and if not, append it to the file?
Unless you are dealing with a very big file, you can use the uniq command to remove the duplicate lines from the file. This means you will also end up with the file sorted; I don't know whether that is an advantage or a disadvantage for you:
netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 >> /tmp/file && sort /tmp/file | uniq > /tmp/file.uniq
This will give you the sorted results without duplicates in /tmp/file.uniq
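If you would rather deduplicate /tmp/file in place, note that sort reads all of its input before it writes the file named by -o, so it can safely reuse its input file as output:
netstat -n | grep tcp | awk '{ print $5 }' | cut -d "." -f-4 >> /tmp/file && sort -u -o /tmp/file /tmp/file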
What a piece of work is piping, how easy to reason about,
how infinite in use cases, in bash and script,
how elegant and admirable in action,
how like a vim in flexibility,
how like a gnu!
Here is a slightly different take:
netstat -n | awk -F"[\t .]+" '/tcp/ {print $9"."$10"."$11"."$12}' | sort -nu |
while read -r ip; do
    if ! grep -qF "$ip" /tmp/file; then
        echo "$ip" >> /tmp/file
    fi
done
Explanation:
awk -F"[\t .]+" '/tcp/ {print $9"."$10"."$11"."$12}'
Awk splits each input line on runs of tabs, spaces, and dots. The lines are filtered for those containing "tcp" (instead of using a separate grep invocation). Finally, the four fields of the address are joined with dots and printed.
sort -nu
Sorts the IPs numerically and creates a set of unique entries. This eliminates the need for the separate uniq command.
if ! grep -qF "$ip" /tmp/file; then echo "$ip" >> /tmp/file; fi
Greps for the IP in the file (literally, thanks to -F, so the dots are not regex wildcards); if it is not found, the IP gets appended.
Note: this solution does not remove old entries and clean up the file after each run; it merely appends, as your question implied.
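If you ever do want the file to reflect only the connections seen on the current run, write a fresh file and move it into place instead of appending (a sketch):
netstat -n | awk -F"[\t .]+" '/tcp/ {print $9"."$10"."$11"."$12}' | sort -nu > /tmp/file.new && mv /tmp/file.new /tmp/file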
I am launching a website, and I wanted to set up a Bash one-liner so that when someone hits the site it makes a beep using the internal buzzer.
So far it's working using the following.
tail -f access_log | while read x ; do echo -ne '\007' $x '\n' ; done
Tail follows the access_log and dumps to stdout; read takes it a line at a time; echo prints the line prefixed with '\007', the bell character (octal 007); and done...
This works like a beauty... Every hit shows the line from the log and beeps... However, it got annoying very quickly, so ideally I wanted to filter the tail -f access_log before it's piped into the while, so that read only gets lines I care about. I was thinking grep "/index.php" would be a good indication of visitors...
This is where the issue is...
I can do...
tail -f access_log | while read x ; do echo -ne '\007' $x '\n' ; done
beeps on everything
and I can do...
tail -f access_log | grep "/index.php"
and pages are shown with no beep, but when I do
tail -f access_log | grep "/index.php" | while read x ; do echo -ne '\007' $x '\n' ; done
Nothing happens, no line from log, no beep.
I think the grep is messing it up somewhere, but I can't figure out where.
I'd love it to be a one-liner, and I know it should really be done in a script and would be easier, but that doesn't explain why the above, which I think should work, isn't working.
Grep's output is buffered when it's used in a pipe. Use --line-buffered to force it to use line buffering so it outputs lines immediately.
tail -f access_log | grep --line-buffered "/index.php" | while read x ; do echo -ne '\007' $x '\n' ; done
You could also combine the grep and while loop into a single awk call:
tail -f access_log | awk '/\/index.php/ { print "\007" $0 }'
Grep buffers output when standard output is not a terminal. You need to pass the --line-buffered switch to grep to force it to flush standard output whenever it writes a line.
Using sed -u for unbuffered:
Lighter than awk and grep, sed can be simple, quick, and efficient:
tail -f access.log | sed -une "s#/index.php#&\o7#p"
sed replaces /index.php with the found string (&) plus a beep (\o7), then prints the lines where a replacement was made. With -u, sed reads line by line, unbuffered.
The pattern can also be parameterized:
path='/index.php'
tail -f access.log | sed -une "s#${path}#&\o7#p"