Using Bash to search files for - bash

I have two text files, one with a unique list of IP addresses and the other with a longer list of IP addresses (the population). I am trying to identify which IP addresses from the unique list appear in the longer population list using bash.
So far I have:
#!/bin/bash
while read line; do
grep -Ff "$line" population.txt
done < $uniqueIP.txt
The script runs but produces no output. I am trying to count the number of occurrences but am having trouble coming up with the logic.
Sample Input:
uniqueIP.txt
192.168.1.10
192.168.1.11
192.168.1.12
192.168.1.13
192.168.1.14
population.txt
192.168.1.12
192.168.1.14
192.168.1.15
192.168.1.16
192.168.1.17
192.168.1.18
192.168.1.19
192.168.1.22
192.168.1.23
Sample Output:
Found: 192.168.1.12
Found: 192.168.1.14
Total: 2

Here is my suggested bash function count_unique_ips for your problem:
#!/bin/bash
count_unique_ips()
{
    local unique_ips_file=$1
    local population_file=$2
    local ip count
    # Read one deduplicated IP per line from the unique list
    while IFS= read -r ip; do
        # -F: fixed string (dots are literal), -x: whole line, -c: count matching lines
        count=$(grep -Fxc -- "$ip" "$population_file")
        echo "'${ip}': ${count}"
    done < <(tr ' ' '\n' < "$unique_ips_file" | sort -u)
}
count_unique_ips uniqueIP.txt population.txt
You should get the following output based on 2 given input files uniqueIP.txt and population.txt
'192.168.1.10': 0
'192.168.1.11': 0
'192.168.1.12': 1
'192.168.1.13': 0
'192.168.1.14': 1

A short method, using only grep, could be
grep -Fxf uniqueIP.txt population.txt
but this would be slow for large files. A faster way is
sort uniqueIP.txt population.txt | uniq -d
assuming each IP also appears at most once in population.txt, with one IP per line and no extra characters (including whitespace).
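To get exactly the Found:/Total: report from the question, the grep approach can be wrapped as in the sketch below. The function name report_matches is made up for illustration, and the sketch assumes one IP per line in both files. (The script in the question finds no matches because -f "$line" makes grep look for a pattern file named after each IP, and $uniqueIP.txt expands the unset variable uniqueIP, leaving just .txt.)

```shell
#!/bin/bash
# report_matches UNIQUE_FILE POPULATION_FILE   (hypothetical helper name)
# Prints "Found: <ip>" for every IP of UNIQUE_FILE that occurs as a whole
# line in POPULATION_FILE, followed by "Total: <n>".
report_matches() {
    local unique_file=$1 population_file=$2
    # -F: fixed strings, -x: whole-line match, -f: read patterns from a file.
    # sort -u collapses IPs that occur several times in the population file.
    grep -Fxf "$unique_file" "$population_file" | sort -u |
        awk '{ print "Found: " $0 } END { print "Total: " NR }'
}
```

Called as report_matches uniqueIP.txt population.txt, this should reproduce the sample output shown in the question.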

Related

Count the lines from output using pipeline

I am trying to count how many files have words with the pattern [Gg]reen.
#!/bin/bash
for File in `ls ./`
do
cat ./$File | egrep '[Gg]reen' | sed -n '$='
done
When I do this I get this output:
1
1
3
1
1
So I want to count the lines to get in total 5. I tried using wc -l after the sed but it didn't work; it counted the lines in all the files. I tried to use >file.txt but it didn't write anything on it. And when I use >> instead it writes but when I execute the shell it appends the lines again.
According to your question, you want to know how many files contain the pattern, so you are interested in the number of files, not the number of pattern occurrences.
For instance,
grep -l '[Gg]reen' * | wc -l
would print the number of files that contain green or Green somewhere as a substring.
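The difference between counting files and counting matching lines can be made explicit with two small helper functions. Both function names are hypothetical, and the sketch assumes file names that contain no newlines.

```shell
#!/bin/bash
# count_matching_files [DIR]: number of files in DIR with at least one match.
count_matching_files() {
    local dir=${1:-.}
    # -l prints each matching file name once; -s hides errors for directories.
    grep -ls -- '[Gg]reen' "$dir"/* | wc -l
}

# count_matching_lines [DIR]: number of matching lines across all files in DIR.
count_matching_lines() {
    local dir=${1:-.}
    # -h drops the file-name prefix so every output line is one matching line.
    grep -hs -- '[Gg]reen' "$dir"/* | wc -l
}
```

With the output shown in the question (1, 1, 3, 1, 1), the first function would report 5 and the second 7.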

Check IP address off a list using ksh/bash

I have a list (text file) with the following data:
app1 example1.google.com
app2 example2.google.com
dev1 device1.google.com
cell1 iphone1.google.com
I want to check the ip address of the URLs/hostnames and update the text file with the gathered ip. Example:
app1 example1.google.com 192.168.1.10
app2 example2.google.com 192.168.1.55
dev1 device1.google.com 192.168.1.53
cell1 iphone1.google.com 192.168.1.199
You can use dig to get the IP (but the domains must exist). Not tested for IPv6.
#! /bin/bash
while read -r name url ; do
ip=$(dig -4 "$url" | grep '^[^;]' | grep -o '\([0-9]*[.:]\)\+[0-9.:]*$')
printf '%s %s %s\n' "$name" "$url" "$ip"
done < data.txt
If there are only two columns in the file, this might help:
awk '{cmd = "dig +short " $2; ip = ""; cmd | getline ip; close(cmd); print $1, $2, ip}' file
For each record, awk runs "dig +short" (the shortest way I could think of to get only the IP address) in a subshell with the FQDN found in the second column, so this is not a good idea for zillions of records. It then prints the original columns followed by a new one holding the IP address. The output can be redirected to a new file with a single >; I wouldn't consider it safe to edit the original file in place.
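The same idea, writing to a new file instead of editing in place, might look like this in plain bash. This is only a sketch: resolve_with_dig and append_ips are names invented here, and the resolver is passed as a parameter so that the loop can be tried without network access.

```shell
#!/bin/bash
# resolve_with_dig HOST: print the first IPv4 address dig reports (nothing if none).
resolve_with_dig() {
    dig +short "$1" A | grep -E -m1 '^[0-9]+(\.[0-9]+){3}$'
}

# append_ips FILE [RESOLVER]: print "name host ip" for every "name host" line.
# RESOLVER is any command that takes a host name and prints an IP.
append_ips() {
    local file=$1 resolver=${2:-resolve_with_dig} name host ip
    while read -r name host _; do
        [ -n "$name" ] || continue              # skip blank lines
        # </dev/null keeps the resolver from consuming the loop's input
        ip=$("$resolver" "$host" </dev/null)
        printf '%s %s %s\n' "$name" "$host" "$ip"
    done < "$file"
}
```

A possible invocation is append_ips data.txt > data_with_ips.txt, which leaves the original file untouched.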

How to extract the IP part of a variable in bash

I have written a bash script that reads a text file containing URLs and finds its IP. Each line of this file contains a URL. I want to create a .csv file as output which has 2 columns, the first column for URL and the 2nd column for its IP. Here is the script:
#!/bin/bash
while IFS= read -r line; do
ip=$(dig +short $line)
echo "${line}, ${ip}" >> ipfile.csv
done < domains
It works fine. The problem is that sometimes when I use dig +short example.com, instead of returning only the IP of the "example.com", it returns something like: example2.com IP. In this case the 2nd column saves example2.com and the corresponding IP moves to the 1st column of the next row.
So my question is: "How can I ignore the first part (example2.com) and only extract and save the "IP" part in the second column"?
I tried to split the text by space and newline character, but unfortunately it didn't work for me.
#!/bin/bash
while IFS= read -r line; do
ip=$(dig +short $line)
if [[ $ip = *\n* ]]
then
bar=${ip##*\n}
echo "{line}, ${bar}" >> ipfile.csv
else
echo "${line}, ${ip}" >> ipfile.csv
fi
done < domain
You may extract anything that looks like an IPv4 address using the following grep expression:
echo 'hello 11.22.33.44 world' | grep -E -o "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"
Explanation:
-E: Use extended regular expressions (so + works without a backslash)
-o: Print only the matching part
[0-9]+\.: Match a sequence of digits, followed by a period (escaped because a period in regex has special meaning)
This is repeated 4 times, excluding the final period
This solution has some false positives (9999.0000.1.2 passes the pattern) but assuming dig doesn't output something seriously messed up, this will do.
Also it doesn't support IPv6, which might be a problem for you, but it is trivial to modify for IPv6 so it is left as an exercise to the reader :)
Assuming dig +short only returns two forms:
IP
name IP
we can use this script:
#!/usr/bin/env bash
while IFS= read -r line; do
read name ip <<<"$(dig +short $line)" # get name and IP
echo "${line}, ${ip:-$name}" >> ipfile.csv # If second column is absent, take the first column
done < domains
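If the alias and the address arrive on separate lines rather than on one line (dig +short usually prints one record per line, for example a CNAME target followed by the address), a line-based filter may be more robust. A minimal sketch, with first_ipv4 as a made-up helper name:

```shell
#!/bin/bash
# first_ipv4: read text on stdin and print the first line that consists
# solely of a dotted quad; alias lines such as "example2.com." are skipped.
first_ipv4() {
    grep -E -x -m1 '[0-9]{1,3}(\.[0-9]{1,3}){3}'
}

# Possible use inside the loop from the question:
#   ip=$(dig +short "$line" | first_ipv4)
#   echo "${line}, ${ip}" >> ipfile.csv
```

Like the grep expression above, this does not validate the octet range and does not handle IPv6.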

Shell for-in loop with multiple variables [duplicate]

In my previous question I learned to use loops, and with your help I finished an installation script for multiple instances of a piece of software. Thank you very much :)
Now I am trying to set up the configuration files automatically using sed. For this I need multiple variables in one loop.
I read the IP addresses and the hostnames for those IPs (PTR records) from the system:
IPADDR=`ifconfig | awk '{print $2}' | egrep -o '([0-9]+\.){3}[0-9]+'|grep -v 127.0.0.1`
for ipaddr in ${IPADDR[@]}; do echo $ipaddr; done
for iphost in ${IPADDR[@]}; do host $iphost |grep pointer | awk '{print $NF}' RS='.\n'; done
My script now knows that there are 3 IPs, and it knows the IP addresses and the hostnames.
The count of IPs (3) gives me the numbers 001, 002 and 003. This works well.
To edit the config files with sed, I need all 3 variables.
The command anyname-001 -some -parameter is, in my case, a copy to a path. My paths are now
/etc/anyname-001, /etc/anyname-002 and /etc/anyname-003
For sed I also need the 3 IP addresses and the 3 hostnames.
sed -i 's/IPADDR/'${ipaddr}'/g' /etc/anyname-${number}/config.cfg
sed -i 's/HOSTNAME/'${hostname}'/g' /etc/anyname-${number}/config.cfg
How can I write my loop so that all variables are available at the same time? I tried many things. I found nested loops, but they did not work. What I need is:
001 >> IP:a.a.a.a >> hostname aaa.aaa.aa
002 >> IP:b.b.b.b >> hostname bbb.bbb.bb
003 >> IP:c.c.c.c >> hostname ccc.ccc.cc
Thank you
Assuming your two arrays HOSTNAME and IPADDR have the same length, you can loop over their elements by index in a single pass. Both must be real bash arrays for this, for example filled with IPADDR=( $(...) ), not plain strings.
The length of an array is obtained by putting '#' in front of the array name in the expansion, for example:
echo ${#HOSTNAME[@]}
So, overall your code would look something like this:
count=${#HOSTNAME[@]}
for (( i = 0; i < count; i++ )); do
    echo "${HOSTNAME[i]}"
    echo "${IPADDR[i]}"
    # Zero-padded instance number (001, 002, 003) to match the directory names
    printf -v number '%03d' "$((i + 1))"
    sed -i "s/IPADDR/${IPADDR[i]}/g" "/etc/anyname-${number}/config.cfg"
    sed -i "s/HOSTNAME/${HOSTNAME[i]}/g" "/etc/anyname-${number}/config.cfg"
done
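The index-based pairing can be tried without touching any files under /etc. In the sketch below the sample addresses and host names are invented, and the second array is called HOSTS rather than HOSTNAME because bash itself sets a variable named HOSTNAME; list_instances is a hypothetical helper name.

```shell
#!/bin/bash
# Invented sample data; in the question these values come from ifconfig and host.
IPADDR=(10.0.0.1 10.0.0.2 10.0.0.3)
HOSTS=(aaa.example bbb.example ccc.example)

# list_instances: print "number >> IP >> hostname" for each array index.
list_instances() {
    local i number
    # "${!IPADDR[@]}" expands to the indexes of the array: 0 1 2
    for i in "${!IPADDR[@]}"; do
        printf -v number '%03d' "$((i + 1))"    # 001, 002, 003
        printf '%s >> IP:%s >> hostname %s\n' \
            "$number" "${IPADDR[i]}" "${HOSTS[i]}"
    done
}
```

Once the printed triples look right, the printf line can be swapped for the two sed commands.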

Lining up pipeline results alongside input (here, "ip" and whois grep results)

I need to perform a whois lookup on a file containing IP addresses and output both the country code and the IP address into a new file. In my command so far I find the IP addresses and get a unique copy of those that don't match the allowed ranges. Then I run a whois lookup to find out who the foreign addresses belong to. Finally it pulls the country code out. This works great, but I can't get it to show me the IP alongside the country code, since the IP isn't included in the whois output.
What would be the best way to include the IP address in the output?
awk '{match($0,/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/); ip = substr($0,RSTART,RLENGTH); print ip}' myInputFile \
| sort \
| uniq \
| grep -v '66.33\|66.128\|75.102\|216.106\|66.6' \
| awk -F: '{ print "whois " $1 }' \
| bash \
| grep 'country:' \
>> myOutputFile
I had thought about using tee, but am having trouble lining up the data in a way that makes sense. The output file should have both the IP address and the country code. It doesn't matter if they are in a single or a double column.
Here is some sample input:
Dec 27 04:03:30 smtpfive sendmail[14851]: tBRA3HAx014842: to=, delay=00:00:12, xdelay=00:00:01, mailer=esmtp, pri=1681345, relay=redcondor.itctel.com. [75.102.160.236], dsn=4.3.0, stat=Deferred: 451 Recipient limit exceeded for this sender
Dec 27 04:03:30 smtpfive sendmail[14851]: tBRA3HAx014842: to=, delay=00:00:12, xdelay=00:00:01, mailer=esmtp, pri=1681345, relay=redcondor.itctel.com. [75.102.160.236], dsn=4.3.0, stat=Deferred: 451 Recipient limit exceeded for this sender
Thanks.
In general: Iterate over your inputs as shell variables; this then lets you print them alongside each output from the shell.
The below will work with bash 4.0 or newer (requires associative arrays):
#!/bin/bash
# ^^^^- must not be /bin/sh, since this uses bash-only features
# read things that look vaguely like IP addresses into associative array keys
declare -A addrs=( )
while IFS= read -r ip; do
case $ip in 66.33.*|66.128.*|75.102.*|216.106.*|66.6.*) continue;; esac
addrs[$ip]=1
done < <(grep -E -o '[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+')
# getting country code from whois for each, printing after the ip itself
for ip in "${!addrs[@]}"; do
country_line=$(whois "$ip" | grep -i 'country:')
printf '%s\n' "$ip $country_line"
done
An alternate version which will work with older (3.x) releases of bash, using sort -u to generate unique values rather than doing that internal to the shell:
while read -r ip; do
case $ip in 66.33.*|66.128.*|75.102.*|216.106.*|66.6.*) continue;; esac
printf '%s\n' "$ip $(whois "$ip" | grep -i 'country:')"
done < <(grep -E -o '[0-9]+[.][0-9]+[.][0-9]+[.][0-9]+' | sort -u)
It's more efficient to redirect input and output for the script as a whole than to put a >> redirection after the printf itself (which would open the file before each print operation and close it again afterward, incurring a substantial performance penalty). That is why the suggested invocation for this script looks something like:
countries_for_addresses </path/to/logfile >/path/to/output
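Because whois needs network access, it can help to check the extraction and filtering stage on its own first. The sketch below covers only that stage; foreign_ips is a name chosen here, and the allowed prefixes are the ones from the question, anchored so that 66.6 does not also exclude addresses such as 66.60.x.x.

```shell
#!/bin/bash
# foreign_ips: read log text on stdin and print each distinct IPv4-looking
# string once, skipping the allowed prefixes from the question.
foreign_ips() {
    grep -E -o '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' | sort -u |
        grep -E -v '^(66\.33|66\.128|75\.102|216\.106|66\.6)\.'
}

# Possible use with the loop above:
#   while read -r ip; do
#       printf '%s\n' "$ip $(whois "$ip" | grep -i 'country:')"
#   done < <(foreign_ips < /path/to/logfile)
```

With the sample input from the question nothing is printed, since 75.102.160.236 falls inside an allowed range.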
