I have written a bash script that reads a text file containing URLs and looks up the IP address of each one. Each line of the file contains one URL. I want to create a .csv file as output with 2 columns: the first column for the URL and the second for its IP. Here is the script:
#!/bin/bash
while IFS= read -r line; do
    ip=$(dig +short $line)
    echo "${line}, ${ip}" >> ipfile.csv
done < domains
It works fine. The problem is that sometimes when I use dig +short example.com, instead of returning only the IP of the "example.com", it returns something like: example2.com IP. In this case the 2nd column saves example2.com and the corresponding IP moves to the 1st column of the next row.
So my question is: "How can I ignore the first part (example2.com) and only extract and save the "IP" part in the second column"?
I tried to split the text by space and newline character, but unfortunately it didn't work for me.
#!/bin/bash
while IFS= read -r line; do
ip=$(dig +short $line)
if [[ $ip = *\n* ]]
then
bar=${ip##*\n}
echo "{line}, ${bar}" >> ipfile.csv
else
echo "${line}, ${ip}" >> ipfile.csv
fi
done < domain
You may extract anything that looks like an IPv4 address using the following grep expression:
echo 'hello 11.22.33.44 world' | grep -E -o "[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+"
Explanation:
-E: Use extended regular expressions (so + works without a backslash)
-o: Print only the matching part
[0-9]+\.: Match a sequence of digits, followed by a period (escaped because a period in regex has special meaning)
This is repeated 4 times, excluding the final period
This solution has some false positives (9999.0000.1.2 passes the pattern) but assuming dig doesn't output something seriously messed up, this will do.
Also it doesn't support IPv6, which might be a problem for you, but it is trivial to modify for IPv6 so it is left as an exercise to the reader :)
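If the false positives matter, one option is to check each octet after the regex match. This is only a minimal sketch, assuming bash; the function name is_ipv4 is made up for illustration:

```shell
#!/usr/bin/env bash
# is_ipv4 is a hypothetical helper: it succeeds only for four dot-separated
# decimal numbers, each in the range 0-255.
is_ipv4() {
    local re='^([0-9]{1,3}\.){3}[0-9]{1,3}$'
    local octet
    local -a octets
    [[ $1 =~ $re ]] || return 1
    IFS=. read -r -a octets <<< "$1"
    for octet in "${octets[@]}"; do
        # 10# forces base 10, so an octet such as 08 is not read as octal
        (( 10#$octet <= 255 )) || return 1
    done
    return 0
}

is_ipv4 11.22.33.44   && echo "11.22.33.44 looks valid"
is_ipv4 9999.0000.1.2 || echo "9999.0000.1.2 rejected"
```

The grep -o output could be passed through such a check before it is written to the CSV.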
Assuming dig +short only returns two forms:
IP
name IP
We can use this script:
#!/usr/bin/env bash
while IFS= read -r line; do
    read name ip <<<"$(dig +short $line)"         # get name and IP
    echo "${line}, ${ip:-$name}" >> ipfile.csv    # If second column is absent, take the first column
done < domains
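Note that dig +short normally prints each record on its own line, so for a CNAME the alias and the address arrive as separate lines, and a single read only sees the first of them. Below is a sketch that keeps only lines consisting entirely of an IPv4 address. The helper name first_ipv4 is made up, and the sample output is invented (192.0.2.10 is a documentation address, not a real lookup result):

```shell
#!/usr/bin/env bash
# first_ipv4 (hypothetical helper): read dig-style output on stdin and print
# the first line that is nothing but an IPv4-looking address.
first_ipv4() {
    grep -E -x '([0-9]{1,3}\.){3}[0-9]{1,3}' | head -n 1
}

# Invented sample standing in for `dig +short example.com`:
# an alias line followed by an address line.
sample=$'example2.com.\n192.0.2.10'

printf '%s, %s\n' "example.com" "$(printf '%s\n' "$sample" | first_ipv4)"
```

In the loop this would be used as ip=$(dig +short "$line" | first_ipv4), which leaves the column empty when no address comes back.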
Related
Sorry if this a basic/stupid question.
I have no experience in shell scripting but am keen to learn and develop.
I want to create a script that reads a file, extracts an IP address from one line, extracts a port number from another line and stores both in variables so I can telnet.
My file looks kinda like this;
Server_1_ip=192.168.1.1
Server_2_ip=192.168.1.2
Server_port=7777
I want to get the IP only of server_1
And the port.
What I have now is;
cat file.txt | while read line; do
    my_ip="$(grep -oE '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' <<< "$line")"
    echo "$my_ip"
done < file.txt
This works but how do I specify server_1_ip?
Next I did the same thing to find the port but somehow the port doesn't show, instead it shows "server_port" and not the number behind it
Do i need to cat twice or can I combine the searches? And why does it pass the IP to variable but not the port?
Many thanks in advance for any input.
Awk may be a better fit:
awk -F"=" '$1=="Server_1_ip"{sip=$2}$1=="Server_port"{sport=$2}END{print sip, sport}' yourfile
This awk says:
Split each row into columns delimited by an equal sign: -F"="
If the first column has the string "Server_1_ip" then store the value in the second column to awk variable sip: $1=="Server_1_ip"{sip=$2}
If the first column has the string "Server_port" then store the value in the second column to awk variable sport: $1=="Server_port"{sport=$2}
Once the entire file has been processed, then print out the value in the two variables: END{print sip, sport}
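To get the two values into shell variables rather than just printing them, the awk output can be fed to read. This is a sketch, assuming bash (for the process substitution); the temporary file and the variable names server_ip and server_port are only for illustration:

```shell
#!/usr/bin/env bash
# Sample config written to a temporary file so the sketch is self-contained.
conf=$(mktemp)
cat > "$conf" <<'EOF'
Server_1_ip=192.168.1.1
Server_2_ip=192.168.1.2
Server_port=7777
EOF

# awk prints "ip port" on one line; read splits that line into two variables.
read -r server_ip server_port < <(
    awk -F"=" '$1=="Server_1_ip"{sip=$2} $1=="Server_port"{sport=$2} END{print sip, sport}' "$conf"
)
rm -f "$conf"

echo "telnet $server_ip $server_port"   # printed here, not executed
```

With the sample data this prints telnet 192.168.1.1 7777; dropping the echo would run the command instead.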
You could do something like:
#!/bin/bash
while IFS='=' read key value; do
    case "$key" in
        Server_1_ip) target_ip="$value";;
        Server_port) target_port="$value";;
    esac
done < input
This is almost certainly not the appropriate solution, since it requires you to statically define the string Server_1_ip, but it's not entirely clear what you are trying to do. You could eval the lines, but that is risky.
How exactly you want to determine the host name to match will greatly influence the desired solution. Perhaps you just want something like:
#!/bin/bash
target_host="${1-Server_1_ip}"
while IFS='=' read key value; do
    case "$key" in
        $target_host) target_ip="$value";;
        Server_port) target_port="$value";;
    esac
done < input
Here's a sed variant:
sed -n -e'/Server_1_ip=\(.*\)/{s//\1/;h}; /Server_port=\([0-9]*\)/{s// \1/;H;g;s/\n//p;q}' inputfile
-n stops normal output,
apply -e commands
First find the target server, i.e. Server_1_ip=, with a subexpression that remembers the value (assumed to be a well-formed IPv4 address). Apply a command block that replaces the pattern space (aka current line) with the saved subexpression and then copies the pattern space to the hold buffer; end command block.
Continue looking for port line. Apply command block that removes the prefix leaving the port number; append the port to the hold buffer (so now you have IP newline port in hold); copy the hold buffer back to pattern space; delete the newline and print result; quit.
Note: GNU and BSD sed can vary especially with trying to join lines, ie. s/\n//.
I have made a script to practice my Bash, only to realize that this script does not take tabulation into account, which is a problem since it is designed to find and replace a pattern in a Python script (which obviously needs tabulation to work).
Here is my code. Is there a simple way to get around this problem ?
pressure=1
nline=$(cat /myfile.py | wc -l) # find the line length of the file
echo $nline
for ((c=0;c<=${nline};c++))
do
res=$( tail -n $(($(($nline+1))-$c)) myfile.py | head -n 1 | awk 'gsub("="," ",$1){print $1}' | awk '{print$1}')
#echo $res
if [ $res == 'pressure_run' ]
then
echo "pressure_run='${pressure}'" >> myfile_mod.py
else
echo $( tail -n $(($nline-$c)) myfile.py | head -n 1) >> myfile_mod.py
fi
done
Basically, it finds the line that has pressure_run=something and replaces it by pressure_run=$pressure. The rest of the file should be untouched. But in this case, all tabulation is deleted.
If you want to just do the replacement as quickly as possible, sed is the way to go as pointed out in shellter's comment:
sed "s/\(pressure_run=\).*/\1$pressure/" myfile.py
For Bash training, as you say, you may want to loop manually over your file. A few remarks for your current version:
Is /myfile.py really in the root directory? Later, you don't refer to it at that location.
cat ... | wc -l is a useless use of cat and better written as wc -l < myfile.py.
Your for loop is executed one more time than you have lines.
To get the next line, you do "show me all lines, but counting from the back, don't show me c lines, and then show me the first line of these". There must be a simpler way, right?
To get the left-hand side of an assignment, you say "in the first space-separated field, replace = with a space, then show me the first space-separated field of the result". There must be a simpler way, right? This is, by the way, where you strip out the leading tabs (your first awk command does it).
To print the unchanged line, you do the same complicated thing as before.
A band-aid solution
A minimal change that would get you the result you want would be to modify the awk command: instead of
awk 'gsub("="," ",$1){print $1}' | awk '{print$1}'
you could use
awk -F '=' '{ print $1 }'
"Fields are separated by =; give me the first one". This preserves leading tabs.
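The difference between the two field separators can be checked on a single line. This is a small sketch with a made-up sample line that starts with two tabs:

```shell
#!/usr/bin/env bash
# Made-up sample: an assignment indented with two tabs.
line=$'\t\tpressure_run=two_tabs'

# Default splitting is on runs of whitespace, so leading tabs never reach $1.
default_split=$(printf '%s\n' "$line" | awk '{ print $1 }')

# With = as the separator, everything before the = (tabs included) is $1.
equals_split=$(printf '%s\n' "$line" | awk -F '=' '{ print $1 }')

printf '[%s]\n[%s]\n' "$default_split" "$equals_split"
```

The first bracket pair shows pressure_run=two_tabs without indentation, the second shows pressure_run with both tabs still in front.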
The replacements have to be adjusted a little bit as well; you now want to match something that ends in pressure_run:
if [[ $res == *pressure_run ]]
I've used the more flexible [[ ]] instead of [ ] and added a * to pressure_run (which must not be quoted): "if $res ends in pressure_run, then..."
The replacement has to use $res, which has the proper amount of tabs:
echo "$res='${pressure}'" >> myfile_mod.py
Instead of appending each line each loop (and opening the file each time), you could just redirect output of your whole loop with done > myfile_mod.py.
As in your version, this prints the value of $pressure wrapped in literal single quotes (pressure_run='1'): inside double quotes, single quotes are ordinary characters and do not prevent expansion. If you want the bare value, remove the single quotes (the braces aren't needed here, but don't hurt):
echo "$res=$pressure" >> myfile_mod.py
This fixes your example, but it should be pointed out that enumerating lines and then getting one at a time with tail | head is a really bad idea. You traverse the file for every single line twice, it's very error prone and hard to read. (Thanks to tripleee for suggesting to mention this more clearly.)
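One more thing to watch in the band-aid version: the unchanged lines are printed with an unquoted echo $( ... ), and unquoted expansions go through word splitting, which also removes leading tabs. Here is a small sketch with a made-up sample line:

```shell
#!/usr/bin/env bash
# Made-up sample: an indented line that should pass through unchanged.
line=$'\tpressure_other=5'

unquoted=$(printf '%s\n' $line)     # word splitting drops the leading tab
quoted=$(printf '%s\n' "$line")     # quoting keeps the line as it is

printf '[%s]\n[%s]\n' "$unquoted" "$quoted"
```

So the else branch would need double quotes around the command substitution to keep the indentation of lines it does not modify.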
A proper solution
This all being said, there are preferred ways of doing what you did. You essentially loop over a file, and if a line matches pressure_run=, you want to replace what's on the right-hand side with $pressure (or the value of that variable). Here is how I would do it:
#!/bin/bash
pressure=1
# Regular expression to match lines we want to change
re='^[[:space:]]*pressure_run='
# Read lines from myfile.py
while IFS= read -r line; do
    # If the line matches the regular expression
    if [[ $line =~ $re ]]; then
        # Print what we matched (with whitespace!), then the value of $pressure
        line="${BASH_REMATCH[0]}"$pressure
    fi
    # Print the (potentially modified) line
    echo "$line"
# Read from myfile.py, write to myfile_mod.py
done < myfile.py > myfile_mod.py
For a test file that looks like
blah
test
pressure_run=no_tab
blah
something
	pressure_run=one_tab
		pressure_run=two_tabs
the result is
blah
test
pressure_run=1
blah
something
	pressure_run=1
		pressure_run=1
Recommended reading
How to read a file line-by-line (explains the IFS= and -r business, which is quite essential to preserve whitespace)
BashGuide
Good day. I was reading another post regarding resolving hostnames to IPs and only using the first IP in the list.
I want to do the opposite and used the following script:
#!/bin/bash
IPLIST="/Users/mymac/Desktop/list2.txt"
for IP in 'cat $IPLIST'; do
domain=$(dig -x $IP +short | head -1)
echo -e "$domain" >> results.csv
done < domainlist.txt
I would like to give the script a list of 1000+ IP addresses collected from a firewall log, and resolve the list of destination IP's to domains. I only want one entry in the response file since I will be adding this to the CSV I exported from the firewall as another "column" in Excel. I could even use multiple responses as semi-colon separated on one line (or /,|,\,* etc). The list2.txt is a standard ascii file. I have tried EOF in Mac, Linux, Windows.
216.58.219.78
206.190.36.45
173.252.120.6
What I am getting now:
domainlist.txt ends up as an exact duplicate of list2.txt, while results.csv stays empty. No errors come up on the screen when I run the script either.
I am running Mac OS X with Macports.
Your script has a number of syntax and stylistic errors. The minimal fix is to change the quotes around the cat:
for IP in `cat $IPLIST`; do
Single quotes produce a literal string; backticks (or the much preferred syntax $(cat $IPLIST)) perform a command substitution, i.e. run the command and insert its output. But you should fix your quoting, and preferably read the file line by line instead. We can also get rid of the useless echo.
#!/bin/bash
IPLIST="/Users/mymac/Desktop/list2.txt"
while read IP; do
    dig -x "$IP" +short | head -1
done < "$IPLIST" >results.csv
Seems that in your /etc/resolv.conf you configured a nameserver which does not support reverse lookups and that's why the responses are empty.
You can pass the DNS server you want to use to the dig command. Let's say 8.8.8.8 (Google), for example:
dig @8.8.8.8 -x "$IP" +short | head -1
The command returns the domain with a . appended. If you want to remove it, you can additionally pipe to sed:
... | sed 's/\.$//'
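Since the question mentions that several answers on one semicolon-separated line would also be acceptable, here is a sketch of that variant. The helper name join_answers and the sample names are invented; the input stands in for real dig -x output:

```shell
#!/usr/bin/env bash
# join_answers (hypothetical helper): strip the trailing dot from each
# name read on stdin and join all names with semicolons on one line.
join_answers() {
    sed 's/\.$//' | paste -s -d ';' -
}

# Invented sample standing in for `dig -x "$IP" +short` output.
printf '%s,%s\n' "192.0.2.1" \
    "$(printf 'a.example.com.\nb.example.com.\n' | join_answers)"
```

Inside the loop this would become printf '%s,%s\n' "$IP" "$(dig -x "$IP" +short | join_answers)", which also keeps the IP in the first column so the rows can be matched against the firewall export.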
I'm learning sed, but I find it very difficult to understand.
I have ADSL with a dynamic IP, and I want to put the current IP in the hosts file.
The following script just tells me the current WAN IP address and nothing more:
IP=$(dig +short myip.opendns.com @resolver1.opendns.com)
echo $IP
The result:
192.42.7.73
So I have a line in the hosts file with the old IP address:
190.42.44.22 peep.strudel.com
and I want to update the hosts file like this:
192.42.7.73 peep.strudel.com
How can I do it? I think I can use the hostname as the pattern...
The reason for doing this is that my server is a client of my router, so it accesses the internet through the router's gateway and not directly. Postfix keeps logging "connect from unknown [x.x.x.x]" (where x.x.x.x is my WAN IP!) because it can't resolve that IP. I think that if I map the IP to my FQDN in the hosts file, it may work better.
Thanks
Sergio.
You can use a simple shell script:
#!/bin/bash
IP=$(dig +short myip.opendns.com @resolver1.opendns.com)
HOST="peep.strudel.com"
sed -i "/$HOST/ s/.*/$IP\t$HOST/g" /etc/hosts
Explanation:
sed -i "/$HOST/ s/.*/$IP\t$HOST/g" /etc/hosts means: on every line that contains $HOST, replace the whole line (.*) with $IP, a tab, and $HOST.
Using sed
sed -r "s/^ *[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+( +peep.strudel.com)/$IP\1/"
[0-9]+\. matches one or more digits followed by a dot. The pattern looks for four such dot-separated numbers at the start of a line, followed by peep.strudel.com. The parentheses around ( +peep.strudel.com) save that part as \1, so the whole match is replaced with the new IP from your variable followed by the saved host name.
Another approach: instead of saving the address in a variable named IP, you can run the dig command inside the sed command line to get the new IP.
sed -r "s/^ *[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+( +peep.strudel.com)/$(dig +short myip.opendns.com @resolver1.opendns.com)\1/"
Using gawk
gawk -v IP=$IP '/ *[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+( +peep.strudel.com).*/{print gensub(/ *[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+( +peep.strudel.com)/,IP"\\1","g")}'
You need to put the sed code inside double quotes so that the variable gets expanded.
sed "s/\b\([0-9]\{1,3\}\.\)\{1,3\}[0-9]\{1,3\}\b/$IP/g" file
Add the -i parameter to save the changes to the file. In basic sed syntax, \(..\) is called a capturing group and \{min,max\} is called a range quantifier.
Example:
$ IP='192.42.7.73'
$ echo '190.42.44.22 peep.strudel.com' | sed "s/\b\([0-9]\{1,3\}\.\)\{1,3\}[0-9]\{1,3\}\b/$IP/g"
192.42.7.73 peep.strudel.com
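Matching on the host name as a regex treats its dots as wildcards, and matching on any IP-looking string rewrites every address on the line. An alternative sketch compares the host name literally with awk. It assumes the host name is the second whitespace-separated field of its line; update_host_ip is a made-up helper, and the demo works on a temporary copy rather than /etc/hosts:

```shell
#!/usr/bin/env bash
# update_host_ip FILE HOST IP (hypothetical helper): rewrite the line whose
# second field equals HOST exactly; all other lines are copied unchanged.
update_host_ip() {
    local file=$1 host=$2 ip=$3 tmp status
    tmp=$(mktemp) || return 1
    awk -v host="$host" -v ip="$ip" '
        $2 == host { print ip "\t" host; next }
        { print }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo on a temporary file with the addresses from the question.
hosts=$(mktemp)
printf '127.0.0.1 localhost\n190.42.44.22 peep.strudel.com\n' > "$hosts"
update_host_ip "$hosts" peep.strudel.com 192.42.7.73
cat "$hosts"
rm -f "$hosts"
```

Writing back with cat keeps the original file's owner and permissions, which matters if the target is eventually /etc/hosts.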
I have a bash script which reads lines from a text file with 4 columns (no headers). The file has at most 4 lines. The words in each line are separated by a space character.
ab@from.com xyz@to.com;abc@to.com Sub1 MailBody1
xv@from.com abc@to.com;poy@to.com Sub2 MailBody2
mb@from.com gmc@to.com;abc@to.com Sub3 MailBody3
yt@from.com gqw@to.com;xyz@to.com Sub4 MailBody4
Currently, I am parsing the file and, after getting each line, storing each word of the line in a variable and calling mailx four times. I am wondering if there is an elegant awk/sed solution to the logic below.
find total number of lines
while read $line, store each line in a variable
parse each line as i=( $line1 ), j=( $line2 ) etc
get values from each line as ${i[0]}, ${i[1]}, ${i[2]} and ${i[3]} etc
call mailx -s ${i[2]} -t ${i[1]} -r ${i[0]} < ${i[3]}
parse next line and call mailx
do this until no more lines or max 4 lines have been reached
Do awk or sed provide an elegant solution to the above iterating/looping logic?
Give this a shot:
head -n 4 mail.txt | while read from to subject body; do
    mailx -s "$subject" -t "$to" -r "$from" <<< "$body"
done
head -n 4 reads up to four lines from your text file.
read can read multiple variables from one line, so we can use named variables for readability.
<<< is probably what you want for the redirection, rather than <. Probably.
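Before sending anything, it can help to check how read splits each line. This is a dry-run sketch, assuming bash; show_mail_fields is a made-up helper that prints the fields instead of calling mailx, and it also splits the recipient column on semicolons:

```shell
#!/usr/bin/env bash
# show_mail_fields (hypothetical dry-run helper): print what each mailx
# call would receive instead of sending anything.
show_mail_fields() {
    local from to subject body
    local -a recipients
    while read -r from to subject body; do
        # Split the recipient column on ';' into an array
        IFS=';' read -r -a recipients <<< "$to"
        printf 'from=%s to=%s subject=%s body=%s\n' \
            "$from" "${recipients[*]}" "$subject" "$body"
    done
}

head -n 4 <<'EOF' | show_mail_fields
ab@from.com xyz@to.com;abc@to.com Sub1 MailBody1
xv@from.com abc@to.com;poy@to.com Sub2 MailBody2
EOF
```

Because body is the last variable, it receives the rest of the line, so a body containing spaces is not cut off at the first word.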
The above while loop works well as a simple alternative to sed and awk if you have a lot of control over how the lines of text in the file are laid out. The read command can also work with other delimiters: in bash, the -d flag sets the character that ends a line, and IFS sets the characters that separate fields.
Another simple example:
I had used mysql to grab a list of users and hosts, putting it into a file /tmp/userlist with text as shown:
user1 host1
user2 host2
user3 host3
I passed these variables into a mysql command to get grant info for these users and hosts and append to /tmp/grantlist:
cat /tmp/userlist | while read user hostname;
do
    echo -e "\n\nGrabbing user $user for host $hostname..."
    mysql -u root -h "localhost" -e "SHOW GRANTS FOR '$user'@$hostname" >> /tmp/grantlist
done
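If the list were comma-separated instead of space-separated, setting IFS for the read call would be enough to split it. Here is a sketch that only prints the statements rather than running mysql; grants_sql is a made-up helper name and the comma-separated layout is an assumption:

```shell
#!/usr/bin/env bash
# grants_sql (hypothetical helper): read "user,host" lines on stdin and
# print one SHOW GRANTS statement per line.
grants_sql() {
    local user hostname
    while IFS=, read -r user hostname; do
        printf "SHOW GRANTS FOR '%s'@'%s';\n" "$user" "$hostname"
    done
}

grants_sql <<'EOF'
user1,host1
user2,host2
EOF
```

The IFS=, assignment applies only to that read call, so the rest of the script keeps the default field splitting.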