Lining up pipeline results alongside input (here, "ip" and whois grep results) - bash

I need to perform a whois lookup on a file containing IP addresses and output both the country code and the IP address into a new file. In my command so far, I find the IP addresses, take the unique ones that don't match allowed ranges, run a whois lookup to find out who the foreign addresses are, and finally pull out the country code. This works great, but I can't get it to show me the IP alongside the country code, since the IP isn't included in the whois output.
What would be the best way to include the IP address in the output?
awk '{match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/); ip = substr($0,RSTART,RLENGTH); print ip}' myInputFile \
| sort \
| uniq \
| grep -v '66.33\|66.128\|75.102\|216.106\|66.6' \
| awk -F: '{ print "whois " $1 }' \
| bash \
| grep 'country:' \
>> myOutputFile
I had thought about using tee, but am having trouble lining up the data in a way that makes sense. The output file should have both the IP address and the country code; it doesn't matter if they are in a single or double column.
Here is some sample input:
Dec 27 04:03:30 smtpfive sendmail[14851]: tBRA3HAx014842: to=, delay=00:00:12, xdelay=00:00:01, mailer=esmtp, pri=1681345, relay=redcondor.itctel.com. [75.102.160.236], dsn=4.3.0, stat=Deferred: 451 Recipient limit exceeded for this sender
Dec 27 04:03:30 smtpfive sendmail[14851]: tBRA3HAx014842: to=, delay=00:00:12, xdelay=00:00:01, mailer=esmtp, pri=1681345, relay=redcondor.itctel.com. [75.102.160.236], dsn=4.3.0, stat=Deferred: 451 Recipient limit exceeded for this sender
Thanks.

In general: Iterate over your inputs as shell variables; this then lets you print them alongside each output from the shell.
The below will work with bash 4.0 or newer (requires associative arrays):
#!/bin/bash
# ^^^^- must not be /bin/sh, since this uses bash-only features
# read things that look vaguely like IP addresses into associative array keys
declare -A addrs=( )
while IFS= read -r ip; do
  case $ip in 66.33.*|66.128.*|75.102.*|216.106.*|66.6.*) continue;; esac
  addrs[$ip]=1
done < <(grep -E -o '[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+')
# getting country code from whois for each, printing after the ip itself
for ip in "${!addrs[@]}"; do
  country_line=$(whois "$ip" | grep -i 'country:')
  printf '%s\n' "$ip $country_line"
done
An alternate version which will work with older (3.x) releases of bash, using sort -u to generate unique values rather than doing that internal to the shell:
while read -r ip; do
  case $ip in 66.33.*|66.128.*|75.102.*|216.106.*|66.6.*) continue;; esac
  printf '%s\n' "$ip $(whois "$ip" | grep -i 'country:')"
done < <(grep -E -o '[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+' | sort -u)
It's more efficient to redirect input and output for the script as a whole than to put a >> redirection after the printf itself (which would open the file before each print operation and close it again afterwards, incurring a substantial performance penalty). That's why the suggested invocation for this script looks something like:
countries_for_addresses </path/to/logfile >/path/to/output
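The whole pipeline can be exercised offline by shadowing whois with a shell function; a sketch, with invented addresses and country codes:

```shell
#!/bin/bash
# Stub whois so the dedup/filter logic can be checked without
# network access; the lookup table below is made up for illustration.
whois() {
  case $1 in
    203.0.113.7) echo 'country: AU';;
    *)           echo 'country: ZZ';;
  esac
}

countries_for_addresses() {
  declare -A addrs=( )   # declare inside a function makes it local
  local ip
  while IFS= read -r ip; do
    case $ip in 66.33.*|66.128.*|75.102.*|216.106.*|66.6.*) continue;; esac
    addrs[$ip]=1
  done < <(grep -E -o '[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+')
  for ip in "${!addrs[@]}"; do
    printf '%s %s\n' "$ip" "$(whois "$ip" | grep -i 'country:')"
  done
}

# 75.102.* is filtered out; the duplicate 203.0.113.7 collapses to one entry
countries_for_addresses <<'EOF'
relay [75.102.160.236] filtered out
relay [203.0.113.7] kept
relay [203.0.113.7] duplicate collapses
EOF
```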

Using Bash to search files for

I have two text files, 1 with a unique list of IP addresses and the other with a longer list of an IP address population. I am trying to identify what IP addresses from the unique list appear in the longer population list using bash.
So far I have:
#!/bin/bash
while read line; do
  grep -Ff "$line" population.txt
done < $uniqueIP.txt
The script runs but I get no output. I am also trying to count the number of occurrences, but I'm having trouble coming up with the logic.
Sample Input:
uniqueIP.txt
192.168.1.10
192.168.1.11
192.168.1.12
192.168.1.13
192.168.1.14
population.txt
192.168.1.12
192.168.1.14
192.168.1.15
192.168.1.16
192.168.1.17
192.168.1.18
192.168.1.19
192.168.1.22
192.168.1.23
Sample Output:
Found: 192.168.1.12
Found: 192.168.1.14
Total: 2
Here is my suggested bash function count_unique_ips for your problem:
#!/bin/bash
function count_unique_ips()
{
  local unique_ips_file=$1
  local population_file=$2
  local unique_ip_list=$(tr ' ' '\n' < "${unique_ips_file}" | sort | uniq)
  for ip in ${unique_ip_list}; do   # unquoted so the list splits into words
    count=$(grep -o -F "${ip}" "${population_file}" | wc -l)   # -F: dots are literal
    echo "'${ip}': ${count}"
  done
}
count_unique_ips uniqueIP.txt population.txt
count_unique_ips uniqueIP.txt population.txt
You should get the following output based on the two given input files uniqueIP.txt and population.txt:
'192.168.1.10': 0
'192.168.1.11': 0
'192.168.1.12': 1
'192.168.1.13': 0
'192.168.1.14': 1
A short method, using only grep, could be
grep -Fxf uniqueIP.txt population.txt
but this would be slow for large files. A faster way is
sort uniqueIP.txt population.txt | uniq -d
assuming the IPs in population.txt are also unique within this file, and one IP per line without extra characters (including blank characters).
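With the sample data above, the duplicate-reporting approach can be sketched end to end; the Found:/Total: formatting is added with awk:

```shell
#!/bin/bash
# Feed the two lists via process substitution instead of temp files;
# uniq -d prints only lines that appear in both (i.e. duplicates
# after concatenation), and awk formats and counts them.
sort <(printf '%s\n' 192.168.1.10 192.168.1.12 192.168.1.14) \
     <(printf '%s\n' 192.168.1.12 192.168.1.14 192.168.1.15) |
uniq -d |
awk '{ print "Found: " $0; n++ } END { print "Total: " n+0 }'
```

This prints a Found: line per common address followed by the total, matching the sample output in the question.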

Using sed to append to a line in a file

I have a script that appends, to each matching line of a file, the output of a command run using a string taken from that line.
for CLIENT in `cat /home/"$ID"/file.txt | awk '{print $3}'`
do
  sed "/$CLIENT/ s/$/ $(sudo bpgetconfig -g $CLIENT -L | grep -i "version name")/" /home/"$ID"/file.txt >> /home/"$ID"/updated_file.txt
done
The output prints the entire file once for each line, with the matching line in question updated.
How do I change it so that only the matching line is sent to the new file?
The input file contains lines similar to below:
"Host OS" "OS Version" "Hostname"
I want to run a script that will use the hostname to run a command and grab details about an application on the host and then print only the application version to the end of the line with the host in it:
"Host OS" "OS Version" "Hostname" "Application Version
What you're doing is very fragile (e.g. it'll break if the string in $CLIENT appears on other lines, appears multiple times on one line, appears as a substring, or contains regexp metacharacters), inefficient (you're reading file.txt once per iteration of the loop instead of once in total), and full of anti-patterns (using a for loop to read lines of input, the UUOC, deprecated backticks, etc.).
Instead, let's say the command you wanted to run was printf '%s' 'the_third_string' | wc -c to replace each third string with the count of its characters. Then you'd do:
while read -r a b c rest; do
  printf '%s %s %s %s\n' "$a" "$b" "$(printf '%s' "$c" | wc -c)" "$rest"
done < file
or if you had more to do and so it was worth using awk:
awk '{
  cmd = "printf \047%s\047 \047" $3 "\047 | wc -c"
  if ( (cmd | getline line) > 0 ) {
    $3 = line
  }
  close(cmd)
  print
}' file
For example given this input (courtesy of Rabbie Burns):
When chapman billies leave the street,
And drouthy neibors, neibors, meet;
As market days are wearing late,
And folk begin to tak the gate,
While we sit bousing at the nappy,
An' getting fou and unco happy,
We think na on the lang Scots miles,
The mosses, waters, slaps and stiles,
That lie between us and our hame,
Where sits our sulky, sullen dame,
Gathering her brows like gathering storm,
Nursing her wrath to keep it warm.
We get:
$ awk '{cmd="printf \047%s\047 \047"$3"\047 | wc -c"; if ( (cmd | getline line) > 0 ) $3=line; close(cmd)} 1' file
When chapman 7 leave the street,
And drouthy 8 neibors, meet;
As market 4 are wearing late,
And folk 5 to tak the gate,
While we 3 bousing at the nappy,
An' getting 3 and unco happy,
We think 2 on the lang Scots miles,
The mosses, 7 slaps and stiles,
That lie 7 us and our hame,
Where sits 3 sulky, sullen dame,
Gathering her 5 like gathering storm,
Nursing her 5 to keep it warm.
The immediate answer is to use sed -n to not print every line by default, and add a p command where you do want to print. But running sed in a loop is nearly always the wrong thing to do.
The following avoids the useless cat, the don't read lines with for antipattern, the obsolescent backticks, and the loop; but without knowledge of what your files look like, it's rather speculative. In particular, does command need to run for every match separately?
file=/home/"$ID"/file.txt
pat=$(awk '{ printf "\\|%s", $3 }' "$file")
sed -n "/${pat#\\|}/ s/$/ $(command)/p" "$file" >> /home/"$ID"/updated_file.txt
The main beef here is collecting all the patterns we want to match into a single regex, and then running sed only once.
If command needs to be run uniquely for each line, this will not work out of the box. Maybe then turn back to a loop after all. If your task is actually to just run a command for each line in the file, try
while read -r line; do
  # set -- $line
  # client=$3
  printf "%s " "$line"
  command
done <file >>new_file
I included but commented out commands to extract the third field into $client before you run command.
(Your private variables should not have all-uppercase names; those are reserved for system variables.)
Perhaps in fact this is all you need:
while read -r os osver host; do
  printf "%s " "$os" "$osver" "$host"
  command "$host" something something
done </home/"$ID"/file.txt >/home/"$ID"/updated_file.txt
This assumes that the output of command is a well-formed single line of output with a final newline.
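A self-contained sketch of that loop, with a stand-in function (get_app_version, invented here) in place of the real command, so the shape of the output can be seen without the actual host tooling:

```shell
#!/bin/bash
# Stand-in for the real per-host command; here it just reports the
# length of the hostname, purely for illustration.
get_app_version() { printf '%s\n' "${#1}"; }

append_version() {
  while read -r os osver host; do
    printf "%s " "$os" "$osver" "$host"   # reprint the line, no newline yet
    get_app_version "$host"               # its single-line output ends the line
  done
}

append_version <<'EOF'
Linux 5.10 web01
Linux 6.1 db02
EOF
```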
This might work for you (GNU sed, bash/dash):
echo "command () { expr length \"\$1\"; }" >> funlib
sed -E 's/^((\S+\s){2})(\S+)(.*)/. .\/funlib; echo "\1$(command "\3")\4"/e' file
As an example of a command, I create a function called command and append it to a file funlib in the current directory.
The sed invocation sources funlib and runs the command function in the RHS of the substitution command, inside an interpolated string displayed by the echo command; this is made possible by the evaluation flag e.
N.B. The evaluation uses the dash shell or whatever the /bin/sh is symlinked to.

Bash grep, awk or sed to reverse find

I am creating a script to look for commonly used patterns in a password. (Although I have security policies in the hosting panel, some servers have been left outdated due to incompatibilities.)
For example, I put the word test into the file words.txt. When I execute grep -c test123 words.txt, I need it to find that pattern, but I don't think grep will work for me here.
Script:
EMAILPASS=`/root/info.sh -c usera | grep @`
for PAR in ${EMAILPASS} ; do
  EMAIL=$(echo "${PAR}" | grep @ | cut -f1 -d:)
  PASS=$(echo "${PAR}" | cut -d: -f 2)
  PASS="${PASS,,}"
  FINDSTRING=$(grep -ic "${PASS}" /root/words.txt)
  echo -e ""
  echo -e "Validating password ${EMAIL}"
  echo -e ""
  if [ $FINDSTRING -ge 1 ] ; then
    echo "Insecure"
  else
    echo "Secure"
  fi
done
The current output of the command is as follows:
# grep -c test123 /root/words.txt
0
I think grep is not good for what I need; maybe someone can help me.
I could also use awk or sed, but I can't find an option that does what I want.
Regards
Reverse your application.
echo test123 | grep -f words.txt
Each line of the text file will be used as a pattern to test against the input.
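A minimal sketch of that reversal, using a temporary dictionary file containing just the word test:

```shell
#!/bin/bash
# Build a one-word dictionary, then use it as the pattern file (-f):
# the password is the input, the dictionary supplies the patterns.
words=$(mktemp)
printf 'test\n' > "$words"

echo test123 | grep -f "$words"                     # matches: prints test123
echo zx9kQ   | grep -qf "$words" || echo 'no dictionary word'

rm -f "$words"
```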
edit
Apparently you actually do want to see if the whole password is an actual word, rather than just checking whether it's based on a dictionary word. That's considerably less secure, but easy enough to do. The logic you have will not report test123 as insecure unless the whole password is an exact match for a word in the dictionary.
You said you were putting test in the dictionary and using test123 as the password, so I assumed you were looking for passwords based on dictionary words, which was the structure I suggested above. Will include as commented alternate lines below.
Also, since you're doing a case insensitive search, why bother to downcase the password?
declare -l pass # set as always lowercase
would do it, but there's no need.
Likewise, unless you are using it again later, it isn't necessary to put everything into a variable first, such as the grep results. Try to remove anything not needed -- less is more.
Finally, since we aren't catching the grep output in a variable and testing that, I threw it away with -q. All we need to see is whether it found anything, and the return code, checked by the if, tells us that.
/root/info.sh -c usera | grep @ |          # only lines with at signs
  while IFS="$IFS:" read email pass        # parse on read with IFS
  do printf "\n%s\n\n" "Validating password for '$email'"
     if grep -qi "$pass" /root/words.txt        # exact search (-q = quiet)
     #if grep -qif /root/words.txt <<< "$pass"  # 'based on' search
     then echo "Insecure"
     else echo "Secure" # well....
     fi
  done
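The IFS-based splitting can be checked in isolation; a sketch with a made-up credential:

```shell
#!/bin/bash
# Temporarily adding ':' to IFS makes read split "email:password"
# lines on colons as well as whitespace.
parse_line() {
  local email pass
  IFS="$IFS:" read -r email pass <<< "$1"
  printf '%s %s\n' "$email" "$pass"
}

parse_line 'user@example.com:hunter2'   # -> user@example.com hunter2
```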
I think a better paradigm might be to just report the problematic ones and be silent for those that seem ok, but that's up to you.
Questions?

Portable way to resolve host name to IP address

I need to resolve a host name to an IP address in a shell script. The code must work at least in Cygwin, Ubuntu and OpenWrt(busybox).
It can be assumed that each host will have only one IP address.
Example:
input
google.com
output
216.58.209.46
EDIT:
nslookup may seem like a good solution, but its output is quite unpredictable and difficult to filter. Here is the result of the command on my computer (Cygwin):
>nslookup google.com
Unauthorized answer:
Serwer: UnKnown
Address: fdc9:d7b9:6c62::1
Name: google.com
Addresses: 2a00:1450:401b:800::200e
216.58.209.78
I've no experience with OpenWRT or Busybox, but the following one-liner should work with a base installation of Cygwin or Ubuntu:
ipaddress=$(LC_ALL=C nslookup $host 2>/dev/null | sed -nr '/Name/,+1s|Address(es)?: *||p')
The above works with both the Ubuntu and Windows version of nslookup. However, it only works when the DNS server replies with one IP (v4 or v6) address; if more than one address is returned the first one will be used.
Explanation
LC_ALL=C nslookup sets the LC_ALL environment variable when running the nslookup command, so that the command ignores the current system locale and prints its output in the command's default language (English).
The 2>/dev/null avoids having warnings from the Windows version of nslookup about non-authoritative servers being printed.
The sed command looks for the line containing Name and then prints the following line after stripping the phrase Addresses: when there's more than one IP (v4 or 6) address -- or Address: when only one address is returned by the name server.
The -n option means lines aren't printed unless there's a p command, while the -r option means extended regular expressions are used (GNU sed is the default for Cygwin and Ubuntu).
If you want something available out-of-the-box on almost any modern UNIX, use Python:
pylookup() {
  python -c 'import socket, sys; print(socket.gethostbyname(sys.argv[1]))' "$@" 2>/dev/null
}
address=$(pylookup google.com)
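On systems where the unversioned python binary is Python 3 (or missing entirely, as on recent Ubuntu), a python3 variant of the same idea; a sketch:

```shell
#!/bin/bash
# socket.gethostbyname is IPv4-only, which conveniently sidesteps
# the mixed v4/v6 output problem that nslookup has.
pylookup3() {
  python3 -c 'import socket, sys; print(socket.gethostbyname(sys.argv[1]))' "$@" 2>/dev/null
}

pylookup3 localhost   # resolves locally, no DNS round-trip needed
```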
With respect to special-purpose tools, dig is far easier to work with than nslookup, and its short mode emits only literal answers -- in this case, IP addresses. To take only the first address, if more than one is found:
# this is a bash-specific idiom
read -r address < <(dig +short google.com | grep -E '^[0-9.]+$')
If you need to work with POSIX sh, or broken versions of bash (such as Git Bash, built with mingw, where process substitution doesn't work), then you might instead use:
address=$(dig +short google.com | grep -E '^[0-9.]+$' | head -n 1)
dig is available for Cygwin in the bind-utils package; as bind is the most widely used DNS server on UNIX, bind-utils (built from the same codebase) is available for almost all Unix-family operating systems as well.
Here's my variation that steals from earlier answers:
nslookup blueboard 2> /dev/null | awk '/Address/{a=$3}END{print a}'
This depends on nslookup returning matching lines that look like:
Address 1: 192.168.1.100 blueboard
...and only returns the last address.
Caveats: this doesn't handle non-matching hostnames at all.
TL;DR: Option 2 is my preferred choice for an IPv4 address. Adjust the regex to get IPv6, and/or the awk to get both. A slight edit to option 2's suggested use is given in the EDIT below.
Well, a terribly late answer here, but I think I'll share my solution, esp. because the accepted answer didn't work for me on OpenWRT (no Python with the minimal setup) and the other answer errored out with "no address found after comma".
Option 1 (gives the last address from last entry sent by nameserver):
nslookup example.com 2>/dev/null | tail -2 | tail -1 | awk '{print $3}'
Pretty simple and straightforward, and the piped commands don't really need an explanation.
Although, in my tests this always gave the IPv4 address (because IPv4 was always on the last line, at least in my tests). However, I read about the unexpected behavior of nslookup, so I had to find a way to make sure I get IPv4 even if the order is reversed -- thanks, regex.
Option 2 (makes sure you get IPv4):
nslookup example.com 2>/dev/null | sed 's/[^0-9. ]//g' | tail -n 1 | awk -F " " '{print $2}'
Explanation:
nslookup example.com 2>/dev/null - look up given host and ignore STDERR (2>/dev/null)
sed 's/[^0-9. ]//g' - strip everything except digits, dots and spaces, leaving only IPv4-looking text (the s command performs the substitution)
tail -n 1 - get the last line (alternatively, tail -1)
awk -F " " '{print $2}' - capture and print the second field of the line, using " " as the field separator
EDIT: A slight modification based on a comment to make it actually more generalized:
nslookup example.com 2>/dev/null | printf "%s" "$(sed 's/[^0-9. ]//g')" | tail -n 1 | printf "%s" "$(awk -F " " '{print $1}')"
In the above edit, I'm using printf command substitution to take care of any unwanted trailing newlines.

Connecting Two Bash Commands

I have Ubuntu Linux. I found one command will let me download unread message subjects from Gmail:
curl -u USERNAME:PASSWORD --silent "https://mail.google.com/mail/feed/atom" | tr -d '\n' | awk -F '<entry>' '{for (i=2; i<=NF; i++) {print $i}}' | sed -n "s/<title>\(.*\)<\/title.*name>\(.*\)<\/name>.*/\2 - \1/p"
...and then another command to let me send mail easily (once I installed the sendemail command via apt-get):
sendEmail -f EMAIL@DOMAIN.COM -v -t PHONE@SMS.COM -u "Gmail Notifier" -m test -s MAILSERVER:PORT -xu EMAIL@DOMAIN.COM -xp PASSWORD
(Note when in production I'll probably swap -v above with -q.)
So, if one command downloads one line subjects, how can I pipe these into the sendEmail command?
For instance, I tried using a pipe character between the two, where I used "$1" after the -m parameter, but what happened was that when I had no unread emails it would still send me at least one empty message.
If you help me with this, I'll use this information to share on StackOverflow how to build a Gmail Notifier that one can hook up to SMS messages on their phone.
I think if you mix viraptor & DigitalRoss' answers you get what you want. I created a sample test by creating a fake file with the following input:
File contents:
foo
bar
baz
Then I ran this command:
% cat ~/tmp/baz | while read x; do if [[ $x != "" ]]; then echo "x: '$x'"; fi; done
This will only print lines with input out. I'm not familiar with sendEmail; does it need the body to be on stdin or can you pass it on the cmdline?
You do know you can do that directly in gmail by using a filter and your SMS email gateway, right?
But back to the question...
You can get control in a shell script for command output with the following design pattern:
command1 | while read a b c restofline; do
  : execute commands here
  : command2
done
Read puts the first word in a, the second in b, and the rest of the line in restofline. If the loop consists of only a single command, the xargs program will probably just do what you want. Read in particular about the -I parameter which allows you to place the substituted argument anywhere in the command.
Sometimes the loop looks like ... | while read x; do, which puts the entire line into x.
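A quick sketch of how read distributes the words of each line across those variables:

```shell
#!/bin/bash
# First word lands in a, second in b, everything left over
# (however many words) in restofline.
printf '%s\n' 'alpha beta gamma delta epsilon' |
while read -r a b restofline; do
  printf 'a=%s b=%s rest=%s\n' "$a" "$b" "$restofline"
done
```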
Try this structure:
while read -r line
do
  sendemailcommand ... -m "$line" ...
done < <(curlcommand)
I'd look at the xargs command, which provides all the features you need (as far as I can tell).
http://unixhelp.ed.ac.uk/CGI/man-cgi?xargs
Maybe something like this:
curl_command > some_file
if [[ `wc -l some_file` != "0 some_file" ]] ; then
  email_command < some_file
fi
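The same guard can be written with test -s (true when the file exists and is non-empty), which avoids parsing the output of wc; a sketch, with the curl step stubbed out:

```shell
#!/bin/bash
# Stand-in for: curl_command > "$some_file" -- here the file is left
# empty, so the guard should take the "nothing unread" branch.
some_file=$(mktemp)

if [ -s "$some_file" ]; then
  echo 'would send mail'
else
  echo 'nothing unread'
fi

rm -f "$some_file"
```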
