Remove everything after a pattern (.com) - bash

Driving myself nuts. I am trying to get just the domain name (http://www.example.com) out of access.log. What the log looks like:
tail access.log
Fri, 13 Jul 2012 20:32:03 -0700,INFO,6fgmd8fk,params,http://www.example.com/images/CIV-260.jpg|
I have tried many variations of this one-liner (with sed and awk):
tail -4 access.log |grep http |awk {'print $6'} |cut -c28- |awk '$1>".com"' |sort |uniq
http://www.example.com/2713-7807.jpg|
http://www.example.com/2713-7808.jpg|
http://barfoo.com/img/14616_20120711182527.jpg|
http://foobar.com/css/14616_20120713142151.css|
I am stuck.

Maybe just
awk -F/ '{print $3}'
as long as your lines have no more '/' than your example shows (in particular, none before the URL).
Notice this is just the domain name, as your question asks.

Using grep:
grep -Po '(?<=http://)[^/]+' access.log | sort -u
If you want to have http:// as a part of domain name,
grep -Po 'http://[^/]+' access.log | sort -u

Using sed:
sed -n 's|.*\(http://[^/]*\)/.*|\1|p' access.log | sort -u
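For reference, a minimal sketch that wraps the sed approach in a function reading log lines from stdin. The name extract_domains is made up for illustration, and accepting https:// as well as stopping at a trailing | are assumptions that go beyond the log sample in the question.

```shell
# extract_domains: hypothetical helper; reads log lines on stdin and prints
# each distinct scheme://host once.
# Assumptions: at most one URL per line; https:// may also occur.
extract_domains() {
    # '#' is the s-command delimiter so that '/' and '|' need no escaping.
    sed -nE 's#.*(https?://[^/|]+).*#\1#p' | sort -u
}

# Example with two lines shaped like the access.log sample:
printf '%s\n' \
    'Fri, 13 Jul 2012 20:32:03 -0700,INFO,6fgmd8fk,params,http://www.example.com/images/CIV-260.jpg|' \
    'Fri, 13 Jul 2012 20:32:04 -0700,INFO,6fgmd8fk,params,http://barfoo.com/img/14616_20120711182527.jpg|' |
    extract_domains
```

Unlike the sed one-liner above, this version also matches a URL that has no path after the host.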

Related

Print all the instances of a matching pattern in a file

I've been trying to print all the instances of a matching pattern from file.
Input file:
{"id":"prod123","a":1.3,"c":"xyz","q":2},
{"id":"prod456","a":1.3,"c":"xyz","q":1}]}
{"id":"prod789","a":1.3,"currency":"xyz","q":2},
{"id":"prod101112","a":1.3,"c":"xyz","q":1}]}
I'd want to print everything between "id":" and ",.
Expected output:
prod123
prod456
prod789
prod101112
I'm using the command
grep -Eo 'id\"\:\"[^"]+"\"\,*' | grep -Eo '^[^"]+'
Am I missing anything here?
What went wrong is the placement of the comma in the first grep:
grep -Eo 'id.\:.[^"]+"\,"' inputfile
You need an extra step to get the desired substring.
grep -Eo 'id.\:.[^"]+"\,"' inputfile | cut -d: -f2 | grep -Eo '[^",]+'
I used cut there; for your example input, cut alone would be enough:
cut -d'"' -f4 < inputfile
You have alternatives, like using jq, or
sed -r 's/\{"id":"([^"]*).*/\1/' inputfile
or using awk (the same idea as the cut solution, but easy to adapt):
awk -F'"' '{print $4}' inputfile
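As a sketch of the "all instances" part of the question: the cut and awk answers print one value per line, so a line holding several objects would yield only the first id. The variant below first isolates every "id":"..." pair with grep -o and only then cuts out the value. The function name extract_ids is hypothetical, and it assumes the values contain no escaped double quotes.

```shell
# extract_ids: hypothetical helper; prints every "id" value found on stdin,
# one per line, even when several objects share a line.
# Assumption: id values contain no escaped double quotes.
extract_ids() {
    # grep -o emits each match on its own line; field 4 (split on ") is the value.
    grep -o '"id":"[^"]*"' | cut -d'"' -f4
}

extract_ids <<'EOF'
{"id":"prod123","a":1.3,"c":"xyz","q":2},
{"id":"prod456","a":1.3,"c":"xyz","q":1}]}
EOF
```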

Result of awk as search pattern for grep

Here is a piece of the log:
21:36 b05808aa-c6ad-4d30-a334-198ff5726f7c new
22:21 59996d37-9008-4b3b-ab22-340955cb6019 new
21:12 2b41f358-ff6d-418c-a0d3-ac7151c03b78 new
12:36 7ac4995c-ff2c-4717-a2ac-e6870a5670f0 new
I print the second column with awk '{print $2}' st.log
and get
b05808aa-c6ad-4d30-a334-198ff5726f7c
59996d37-9008-4b3b-ab22-340955cb6019
2b41f358-ff6d-418c-a0d3-ac7151c03b78
7ac4995c-ff2c-4717-a2ac-e6870a5670f0
Now I need to pass it to grep, like this:
awk '{print $2}' |xargs -i grep -w "pattern from awk" st.log
I need to know exactly how to pass each record found by awk to grep. I do not need other solutions, because my real task is more complicated than this piece. Thank you.
With bash and grep:
grep -f <(awk '{print $2}' piece_of_log) st.log
No need for awk:
grep -Ff <(cut -d' ' -f2 st.log) st.log
It seems you're looking for the replace string option:
-I replace-str
Replace occurrences of replace-str in the initial-arguments with
names read from standard input. Also, unquoted blanks do not
terminate input items; instead the separator is the newline
character. Implies -x and -L 1.
Like this:
awk '{print $2}' st.log | xargs -I{} grep -w {} st.log
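A small sketch of the xargs -I approach wrapped in a function, in case the values come from one file and the search runs against another. The name match_ids and both file arguments are hypothetical; the -- is added only so that a value starting with a dash is not taken as a grep option.

```shell
# match_ids KEYS_FILE TARGET_FILE: hypothetical helper.
# Takes the second column of KEYS_FILE and runs one grep per value against
# TARGET_FILE, in the same way as the xargs -I answer.
match_ids() {
    awk '{print $2}' "$1" | xargs -I{} grep -w -- {} "$2"
}

# Usage (file names are placeholders):
#   match_ids st.log other.log
```

This starts one grep process per value. For many values, the grep -f variants above read the target file only once.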

Get Ports grep using netstat -v

I want to get a list of the ports of established connections using netstat -v and grep.
I am trying this:
sudo netstat -v | grep "ESTABLISHED" | cut -d: -f5
Any help?
Try this, which uses $5 (the foreign address); use $4 instead for the local address:
netstat -v | awk '/ESTABLISHED/ {split($5, array, ":"); print array[2]}'
Please try
netstat -v| grep "ESTABLISHED"| awk '{print $5}' | cut -d ":" -f2
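A hedged sketch of the same idea as a filter that reads netstat-style lines from stdin. The name established_ports is made up, and the column positions assume the Linux net-tools layout (foreign address in column 5, state in column 6). Taking the piece after the last colon, rather than the second piece, is a change from the answers above so that IPv6 addresses do not break the split.

```shell
# established_ports: hypothetical helper; reads netstat-style lines on stdin
# and prints the remote port of each ESTABLISHED connection.
# Assumes the Linux net-tools column order (foreign address in column 5).
established_ports() {
    # split() returns the number of pieces, so part[n] is the text after the last ':'.
    awk '$6 == "ESTABLISHED" { n = split($5, part, ":"); print part[n] }'
}

# Usage (-t limits output to TCP, -n keeps ports numeric):
#   netstat -tn | established_ports
```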

How to grep and cut at the same time

Having trouble with grepping and cutting at the same time
I have a file test.txt.
Inside the file is this syntax
File: blah.txt Location: /home/john/Documents/play/blah.txt
File: testing.txt Location /home/john
My command is ./delete -r (filename), say filename is blah.txt.
How would I search test.txt for blah.txt, cut out /home/john/Documents/play/blah.txt, and put it in a variable?
grep -P "^File: blah\.txt Location: .+" test.txt | cut -d: -f3
Always prefer to involve as few external commands as possible for your task.
You can achieve what you want using a single awk command:
awk '/^File: blah.txt/ { print $4 }' test.txt
Try this one ;)
filename=$(grep 'blah.txt' test.txt | grep -oP 'Location:.*' | grep -oP '[^ ]+$')
./delete "$filename"
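To get the path into a variable with an exact match on the file name (so that the dot in blah.txt is not treated as a regex wildcard), something like the following sketch could work. The function name lookup_location is hypothetical, and it assumes paths contain no spaces.

```shell
# lookup_location NAME INDEX_FILE: hypothetical helper; prints the path
# recorded for NAME. Assumes paths contain no spaces, so the path is the
# last whitespace-separated field of the line.
lookup_location() {
    # Compare field 2 as a plain string and stop at the first match.
    awk -v name="$1" '$1 == "File:" && $2 == name { print $NF; exit }' "$2"
}

# Usage:
#   filename=$(lookup_location blah.txt test.txt)
#   ./delete -r "$filename"
```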

Efficient way to get your IP address in shell scripts

Context:
On *nix systems, one may get the IP address of the machine in a shell script this way:
ifconfig | grep 'inet' | grep -v '127.0.0.1' | cut -d: -f2 | awk '{print $1}'
Or this way too:
ifconfig | grep 'inet' | grep -v '127.0.0.1' | awk '{print $2}' | sed 's/addr://'
Question:
Would there be a more straightforward, still portable, way to get the IP address for use in a shell script?
(my apologies to *BSD and Solaris users as the above command may not work; I could not test)
You can do it with just one awk command. No need for so many pipes.
$ ifconfig | awk -F':' '/inet addr/&&!/127.0.0.1/{split($2,_," ");print _[1]}'
You can name the interface directly, which saves one grep.
ifconfig eth0 | grep 'inet addr:' | cut -d: -f2 | awk '{print $1}'
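Based on this you can use the following command (it prints the word after src, because newer iproute2 versions append a uid field, so the address is no longer the last field):
ip route get 8.8.8.8 | awk '{for (i = 1; i < NF; i++) if ($i == "src") {print $(i + 1); exit}}'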
Look here at Beej's guide to networking for a simple C program that prints out the IP addresses using the getaddrinfo(...) call. Such a program can be called from the shell script to print the available IP addresses to stdout, which is easier than relying on ifconfig if you want to remain portable, as the output of ifconfig can vary.
Hope this helps,
Best regards,
Tom.
ifconfig | grep 'broadcast\|Bcast' | awk -F ' ' {'print $2'} | head -n 1 | sed -e 's/addr://g'
Maybe this could help.
grep "$(hostname)" /etc/hosts | awk '{print $1}'
# for bash/linux
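ipaddr() {
    # Interface name; defaults to eth0 when no argument is given.
    iface="${1:-eth0}"
    # Keep only the address: drop everything up to "inet " and from the "/" of the prefix length on.
    result=$(/sbin/ip -o -4 addr show dev "${iface}" | sed 's/^.*inet // ; s/\/...*$//')
    printf %s "${result}"
    # Add a trailing newline only when stdin is a terminal.
    tty -s && printf "\n"
}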
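
Since the output of ifconfig varies between systems, here is a sketch of a filter that accepts both the older Linux layout (inet addr:192.0.2.10) and the layout used by newer net-tools and the BSDs (inet 192.0.2.10). The name inet_addrs is made up, and only these two layouts are assumed; other platforms may print something different.

```shell
# inet_addrs: hypothetical helper; reads ifconfig output on stdin and prints
# every non-loopback IPv4 address.
# Handles both "inet addr:192.0.2.10" and "inet 192.0.2.10" layouts.
inet_addrs() {
    # inet6 lines are skipped because their first field is "inet6", not "inet".
    awk '$1 == "inet" { sub(/^addr:/, "", $2); if ($2 !~ /^127\./) print $2 }'
}

# Usage:
#   ifconfig | inet_addrs
```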

Resources