Use SED in order to filter a file [closed] - bash

I would like to use sed to filter a file and keep only the ID, which consists of three digits, and the domain (e.g. google.com).
Original File:
451 [04/Jan/1997:03:35:55 +0100] http://www.netvibes.com
448 [04/Jan/1997:03:36:30 +0100] www.google.com:443
450 [04/Jan/1997:03:36:48 +0100] http://84.55.151.142:8080
452 [04/Jan/1997:03:36:51 +0100] http://127.0.0.1:9010
451 [04/Jan/1997:03:36:55 +0100] http://www.netvibes.com
453 [04/Jan/1997:03:37:10 +0100] api.del.icio.us:443
453 [04/Jan/1997:03:37:33 +0100] api.del.icio.us:443
448 [04/Jan/1997:03:37:34 +0100] www.google.com:443
sed command used: sed -e 's/\[[^]]*\]//g' -e 's/http:\/\///g' -e 's/www.//g' -e 's/^.com//g' -e 's/:[0-9]*//g'
Current Output:
451 netvibes.com
448 google.com
450 84.55.151.142
452 127.0.0.1
451 netvibes.com
453 api.del.icio.us
453 api.del.icio.us
448 google.com
Desired Output:
451 netvibes.com
448 google.com
451 netvibes.com
448 google.com

Using grep:
sed ... | grep -F '.com'
or
sed ... | grep '\.com$'
or with sed -n, using p to print only the matching lines:
sed -ne 's/\[[^]]*\]//g;s/http:\/\///g;s/www\.//g;s/:[0-9]*//g;/\.com$/p'
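The sed -n approach can be tried against a few of the sample lines from the question. This is only a sketch: the function name filter_com is hypothetical, the dots are escaped so that they match literally, and the timestamp pattern also swallows the space in front of the bracket so that a single space remains between ID and domain.

```shell
#!/usr/bin/env bash
# A few sample lines taken from the question.
log='451 [04/Jan/1997:03:35:55 +0100] http://www.netvibes.com
448 [04/Jan/1997:03:36:30 +0100] www.google.com:443
450 [04/Jan/1997:03:36:48 +0100] http://84.55.151.142:8080
453 [04/Jan/1997:03:37:10 +0100] api.del.icio.us:443'

# filter_com: hypothetical helper, reads log lines on stdin.
# The sed expressions, in order:
#   1. drop the bracketed timestamp and the spaces before it
#   2. drop the http:// scheme
#   3. drop a www. prefix
#   4. drop a trailing :port
#   5. print only lines that end in .com (-n suppresses everything else)
filter_com() {
  sed -n \
    -e 's/ *\[[^]]*\]//' \
    -e 's|http://||' \
    -e 's/www\.//' \
    -e 's/:[0-9]*$//' \
    -e '/\.com$/p'
}

printf '%s\n' "$log" | filter_com
```

Lines whose host is an IP address or a non-.com domain never reach the final p command, so no separate grep is needed.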

I assume you left api.del.icio.us out of your desired output by mistake, so:
cat testfile | awk '{print $1" "$NF}' | sed -r 's/http\:\/\/*//g;s/www\.//g' | awk -F: '{print $1}' | sed -r 's/([0-9]{1,3}) [0-9].*/\1 /g' | sed -r 's/[0-9]{3} $//g' | grep -v '^$' | uniq
If you need only *.com domains, use:
cat testfile | awk '{print $1" "$NF}' | sed -r 's/http:\/\/*//g;s/www\.//g' | awk -F: '{print $1}' | sed -r 's/([0-9]{1,3}) [0-9].*/\1 /g' | sed -r 's/[0-9]{3} $//g' | grep -v '^$' | grep '\.com$' | uniq

Here's one in awk:
$ awk 'match($NF,/[^\.]+\.[a-z]+($|:)/) {
print $1,substr($NF,RSTART,RLENGTH-($NF~/:[0-9]+/?1:0))
}' file
451 netvibes.com
448 google.com
451 netvibes.com
453 icio.us
453 icio.us
448 google.com
If you want just the .coms, replace [a-z]+ in the match regex with com.
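As a sketch of that last suggestion, the regex below has [a-z]+ replaced with com. The function name com_hosts is hypothetical, and stripping the trailing colon with sub() instead of adjusting RLENGTH is a choice made here for illustration, not part of the answer above.

```shell
#!/usr/bin/env bash
# com_hosts: hypothetical helper. Reads log lines on stdin and prints
# "<id> <domain>" for hosts that end in .com.
com_hosts() {
  awk 'match($NF, /[^.]+\.com($|:)/) {
    host = substr($NF, RSTART, RLENGTH)
    sub(/:$/, "", host)   # the match may include the colon before a port
    print $1, host
  }'
}

com_hosts <<'EOF'
451 [04/Jan/1997:03:35:55 +0100] http://www.netvibes.com
448 [04/Jan/1997:03:36:30 +0100] www.google.com:443
450 [04/Jan/1997:03:36:48 +0100] http://84.55.151.142:8080
453 [04/Jan/1997:03:37:10 +0100] api.del.icio.us:443
EOF
```

As with the original regex, a URL without a www. prefix (such as http://example.com) would keep its scheme in the match, because [^.]+ also accepts the characters in http://.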

Related

Why are the RX and TX values the same when executing the network packet statistics script on CentOS 8?

##test1 on rhel8 or centos8
$ for i in 1 2; do cat /proc/net/dev | grep ens192 | awk '{print "RX:"$2"\n""TX:"$10}'; done | awk '{print $0}' | tee 111
##result:
RX:2541598118
TX:1829843233
RX:2541598118
TX:1829843233
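No answer is recorded for this question. One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is that the two loop iterations run within milliseconds of each other, so the interface counters may simply not have changed between the reads. Below is a minimal sketch that samples twice with a pause in between. The names read_counters and sample_twice and the one-second default are hypothetical, and the field positions assume the same /proc/net/dev layout as the command above.

```shell
#!/usr/bin/env bash
# read_counters: hypothetical helper.
#   $1 = interface name
#   $2 = file in /proc/net/dev format (defaults to the real one)
# Assumes whitespace between "iface:" and the first counter, so that
# field 2 is the RX byte count and field 10 is the TX byte count.
read_counters() {
  awk -v ifc="$1:" '$1 == ifc { print "RX:" $2; print "TX:" $10 }' "${2:-/proc/net/dev}"
}

# sample_twice: print the counters, wait, then print them again.
#   $1 = interface name, $2 = pause in seconds (default 1)
sample_twice() {
  read_counters "$1"
  sleep "${2:-1}"
  read_counters "$1"
}

# Possible usage on the machine from the question:
#   sample_twice ens192 1
```

If the interface carries any traffic during the pause, the second pair of values should differ from the first.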

Bash Shell: How do I sort by values on last column, but ignoring the header of a file?

file
ID First_Name Last_Name(s) Average_Winter_Grade
323 Popa Arianna 10
317 Tabarcea Andreea 5.24
326 Balan Ionut 9.935
327 Balan Tudor-Emanuel 8.4
329 Lungu Iulian-Gabriel 7.78
365 Brailean Mircea 7.615
365 Popescu Anca-Maria 7.38
398 Acatrinei Andrei 8
How do I sort it by the last column, leaving the header in place?
This is what file should look like after the changes:
ID First_Name Last_Name(s) Average_Winter_Grade
323 Popa Arianna 10
326 Balan Ionut 9.935
327 Balan Tudor-Emanuel 8.4
398 Acatrinei Andrei 8
329 Lungu Iulian-Gabriel 7.78
365 Brailean Mircea 7.615
365 Popescu Anca-Maria 7.38
317 Tabarcea Andreea 5.24
If it's always 4th column:
head -n 1 file; tail -n +2 file | sort -n -r -k 4,4
If all you know is that it's the last column:
head -n 1 file; tail -n +2 file | awk '{print $NF,$0}' | sort -n -r | cut -f2- -d' '
You'd like to just sort by the last column, but sort doesn't allow you to do that easily. So rewrite the data with the column to be sorted at the beginning of each line:
Ignoring the header for the moment (although this will often work by itself):
awk '{print $NF, $0 | "sort -nr" }' input | cut -d ' ' -f 2-
If you do need to keep the header out of the sort (e.g. it is getting mixed in with the data), you can do things like:
< input awk 'NR==1; NR>1 {print $NF, $0 | "sh -c \"sort -nr | cut -d \\\ -f 2-\"" }'
or
awk 'NR==1{ print " ", $0} NR>1 {print $NF, $0 | "sort -nr" }' OFS=\; input | cut -d \; -f 2-
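The approach described above (put the sort key at the front, sort, then strip it again) can be wrapped in a small function. This is a sketch under a few assumptions: the name sort_by_last_column is hypothetical, a tab is used as the temporary separator so that cut needs no delimiter option, and LC_ALL=C is set so that "." is treated as the decimal point.

```shell
#!/usr/bin/env bash
# sort_by_last_column: hypothetical helper. Prints the header line of "$1"
# unchanged, then the remaining lines sorted numerically (descending) on
# their last whitespace-separated field.
sort_by_last_column() {
  local file=$1
  head -n 1 "$file"
  tail -n +2 "$file" |
    awk '{ print $NF "\t" $0 }' |                  # copy the key to the front
    LC_ALL=C sort -t "$(printf '\t')" -k 1,1nr |   # sort on the key only
    cut -f 2-                                      # drop the key again
}

# Possible usage:
#   sort_by_last_column file
```

Because the header is printed by head before the pipeline starts, it cannot get mixed into the sorted data.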

Converting value to MB [bash] [closed]

I have this code:
VAL1=`ps auxf | grep httpd | grep ^apache | grep -v grep | wc -l`
VAL2=`ps auxf | grep httpd | grep ^apache | grep -v grep | awk '{s+=$6} END {print s}'`
VAL3=`expr $VAL2 / $VAL1`
echo "servers.value $VAL3"
and then I get values like servers.value 63908. How can I get it in MB?
Divide the KiB value by 1024 to get MB:
VAL3=`expr $VAL2 / $VAL1 / 1024`
echo "servers.value $VAL3 MB"
To get RSS from VAL2 in MB just use:
VAL2=`ps auxf | grep httpd | grep ^apache | grep -v grep | awk '{s+=$6} END {print s/1024}'`
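Since expr only does integer division, the result above is truncated to whole MB. A possible variation, sketched here with a hypothetical helper name avg_rss_mb, does the counting, summing and division in a single awk pass and keeps one decimal place. It assumes ps-style input in which column 6 is the RSS in KiB, as in the commands above.

```shell
#!/usr/bin/env bash
# avg_rss_mb: hypothetical helper. Reads ps-style lines on stdin, where
# column 6 is the resident set size (RSS) in KiB, and prints the average
# per process in MB with one decimal place.
avg_rss_mb() {
  awk '{ sum += $6; n++ }
       END { if (n > 0) printf "%.1f\n", sum / n / 1024; else print "0.0" }'
}

# Possible usage (the [h] keeps grep from matching its own process):
#   VAL3=$(ps aux | grep '[h]ttpd' | grep '^apache' | avg_rss_mb)
#   echo "servers.value $VAL3"
```

The check on n avoids a division by zero when no httpd process is running.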

Formatting grep output. Bash

I'm trying to format the output of grep to make it look better. The code is:
grep "$1" "$2" | grep -E -o "(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)" | sort | uniq -c
$ bash myScript.sh "Failed password for root" /home/user/auth.log
5 108.166.98.9
1426 108.53.208.61
1 113.108.211.131
1 117.79.91.195
370 122.224.49.124
3480 144.0.0.32
11 162.144.94.250
6 162.253.66.74
3 186.67.83.58
1 222.190.114.98
205 59.90.242.69
705 60.172.228.226
3 64.251.21.104
and I want it to look more like:
ip: xxx.xxx.xxx.xxx attempts: X
Add the following command to the end of your pipe in your script, after uniq:
... | awk '{print "ip: " $2 " attempts: " $1}'
The output will be
ip: 108.166.98.9 attempts: 5
ip: 108.53.208.61 attempts: 1426
...
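Building on the awk formatting step above, here is a small sketch that also orders the result by attempt count, highest first. The helper name summarize_ips is hypothetical, and it expects one IP address per line on stdin (for example, the output of the grep -E -o stage).

```shell
#!/usr/bin/env bash
# summarize_ips: hypothetical helper. Reads one IP address per line on
# stdin, counts duplicates, and prints "ip: <addr> attempts: <count>"
# with the most frequent address first.
summarize_ips() {
  LC_ALL=C sort |
    uniq -c |                         # prefix each address with its count
    LC_ALL=C sort -k 1,1nr -k 2,2 |   # highest count first, ties by address
    awk '{ print "ip: " $2 " attempts: " $1 }'
}

# Possible usage inside the script from the question:
#   grep "$1" "$2" | grep -E -o "<ip regex>" | summarize_ips
```

The second sort is optional; without it the lines stay in address order, as in the original output.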

Bash: Limit output of ls and grep

Let me present an example and then try to explain my problem:
noob@noob:~/Downloads$ ls | grep srt$
Elementary - 01x01 - Pilot.LOL.English.HI.C.orig.Addic7ed.com.srt
Haven - 01x01 - Welcome to Haven.DVDRip.SAiNTS.English.updated.Addic7ed.com.srt
Haven - 01x01 - Welcome to Haven.FQM.English.HI.C.updated.Addic7ed.com.srt
Supernatural - 08x01 - We Need to Talk About Kevin.LOL.English.HI.C.updated.Addic7ed.com.srt
The Big Bang Theory - 06x02 - The Decoupling Fluctuation.LOL.English.HI.C.orig.Addic7ed.com.srt
Torchwood - 1x01 - Everything changes.0TV.English.orig.Addic7ed.com.srt
Torchwood - 1x01 - Everything changes.divx.English.updated.Addic7ed.com.srt
Now I only want to delete the first four results of the above command. Normally if I have to delete all the files I would do ls | grep srt$ | xargs -I {} rm {} but in this case I only want to delete the top four.
So, how can I limit the output of ls and grep? Alternatively, please suggest another way to achieve this.
You can pipe your commands to head -n to limit to n lines:
ls | grep srt | head -4
Alternatively, sed -n with a line range prints only those lines:
$ for i in `seq 1 345`; do echo $i ;done | sed -n '1,4p'
1
2
3
4
$ for i in `seq 1 345`; do echo $i ;done | sed -n '335,360p'
335
336
337
338
339
340
341
342
343
344
345
If you don't have too many files, you can use a bash array:
matching_files=( *.srt )
rm "${matching_files[@]:0:4}"
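The array approach can be made a little more defensive. The sketch below uses a hypothetical helper name, delete_first_srt, and assumes that glob order (alphabetical in the current locale) is the order you want. It enables nullglob so that a directory without .srt files yields an empty array, and it calls rm only when there is something to delete.

```shell
#!/usr/bin/env bash
# delete_first_srt: hypothetical helper.
#   $1 = directory, $2 = how many of the first *.srt files to delete
delete_first_srt() {
  local dir=$1 count=$2
  local -a files doomed
  shopt -s nullglob          # an unmatched glob expands to nothing
  files=( "$dir"/*.srt )     # glob results come back sorted
  shopt -u nullglob          # note: this also turns off a caller's nullglob
  doomed=( "${files[@]:0:count}" )
  if (( ${#doomed[@]} > 0 )); then
    rm -- "${doomed[@]}"     # quoting keeps names with spaces intact
  fi
}

# Possible usage:
#   delete_first_srt ~/Downloads 4
```

Unlike ls | grep | xargs, this never splits a file name on whitespace, which matters for names such as "Haven - 01x01 - Welcome to Haven...srt".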
