Bash compare previous variable - bash
I am trying to parse a CSV file and rewrite it with an extra field. The CSV file is shown below.
The file contains 192.168 IP addresses and 10.0 IP addresses.
192.168 addresses are endpoints and 10.0 addresses are VoIP phones.
If the Switch ID, Switch Port ID, and Description are equal, the 10.0 IP is the VoIP phone for the 192.168 IP address.
For example:
192.168.205.76,189,FC3F.DB02.ED78,basement-k001a-asw1,GigabitEthernet3/1,basement-access,K022E-NB1-C1
10.0.40.46,1640,F025.7279.6DAA,basement-k001a-asw1,GigabitEthernet3/1,basement-access,K022E-NB1-C1
Both are on basement-k001a-asw1, GigabitEthernet3/1, K022E-NB1-C1.
So 10.0.40.46 is the VoIP phone for the 192.168.205.76 endpoint.
What I am trying to do: if an IP starts with 10.0 and its switch ID, port ID, and description match the previous line, I would like to append the VoIP IP to the end of the 192.168 line.
I can use global variables to hold the previous and current values and update them as the for loop continues.
Example:
#!/bin/bash
previous_ip=
current_ip=
previous_location=
current_location=
for systems in $(cat list.csv)
do
# Remember the values from the previous line before reading the current one
previous_ip=$current_ip
current_ip=$(echo "$systems" | cut -d, -f1)
previous_location=$current_location
current_location=$(echo "$systems" | cut -d, -f4,5,7)
printf '%s,%s,%s,%s\n' "$previous_ip" "$current_ip" "$previous_location" "$current_location"
done
previous_ip=$current_ip
current_ip=
previous_location=$current_location
current_location=
I hope I explained it well. Any help is appreciated.
Thanks
Here is the sample CSV I have.
192.168.205.76,189,FC3F.DB02.ED78,basement-k001a-asw1,GigabitEthernet3/1,basement-access,K022E-NB1-C1
10.0.40.46,1640,F025.7279.6DAA,basement-k001a-asw1,GigabitEthernet3/1,basement-access,K022E-NB1-C1
10.68.194.185,1189,9C93.4E2D.EE1A,basement-k001a-asw1,GigabitEthernet3/3,basement-access,K022D-NB2-C2
192.168.189.26,189,9C8E.99DD.A49F,basement-k001a-asw1,GigabitEthernet3/4,basement-access,K013-EB3-C1
10.0.40.29,1640,1CDE.A783.EA7B,basement-k001a-asw1,GigabitEthernet3/4,basement-access,K013-EB3-C1
192.168.189.230,189,EC9A.7435.2177,basement-k001a-asw1,GigabitEthernet3/6,basement-access,K024-SB1-C1
192.168.189.34,189,70F3.95C1.11F8,basement-k001a-asw1,GigabitEthernet3/8,basement-access,K020-CF7-C1
10.0.40.45,1640,0008.2FB7.6F84,basement-k001a-asw1,GigabitEthernet3/11,basement-access,K002A-NB1-C1
192.168.189.22,189,8851.FB82.5DE3,basement-k001a-asw1,GigabitEthernet3/12,basement-access,K022D-NB1-C2
10.0.40.28,1640,3CCE.73AC.ED44,basement-k001a-asw1,GigabitEthernet3/12,basement-access,K022D-NB1-C2
192.168.189.225,189,9C93.4E4D.1DDA,basement-k001a-asw1,GigabitEthernet3/13,basement-access,K022D-NB2-C1
10.68.189.182,1189,001C.9B09.0504,basement-k001a-asw1,GigabitEthernet3/15,basement-access,K006-NW1-C1
10.0.40.42,1640,1CDE.A783.B19B,basement-k001a-asw1,GigabitEthernet3/16,basement-access,K005-NB1-C1
10.68.189.181,1189,9C93.4E16.D940,basement-k001a-asw1,GigabitEthernet3/17,basement-access,K004-WB1-C2
192.168.189.233,1189,9C93.4E67.2017,basement-k001a-asw1,GigabitEthernet3/27,basement-access,K013-SB1-C1
10.68.189.52,1189,0040.580D.157E,basement-k001a-asw1,GigabitEthernet3/28,basement-access,K009HALL-EW5-C1(KRONOS)
192.168.189.31,189,984B.E17D.5BE1,basement-k001a-asw1,GigabitEthernet3/34,basement-access,K013-WB1-C1
192.168.189.222,189,68B5.9941.32CE,basement-k001a-asw1,GigabitEthernet3/35,basement-access,K004-NB3-C1
10.0.40.56,1640,0CD9.9691.B9C3,basement-k001a-asw1,GigabitEthernet3/36,basement-access,K024HALL-WW1-C1
192.168.189.223,189,3CD9.2B0F.E714,basement-k001a-asw1,GigabitEthernet3/39,basement-access,K006-EB1-C2
10.0.40.44,1640,1CDE.A782.1A7E,basement-k001a-asw1,GigabitEthernet3/41,basement-access,K011-NB1-C2
192.168.189.224,189,1458.D039.9735,basement-k001a-asw1,GigabitEthernet3/42,basement-access,K013-WB2-C2
192.168.189.23,189,D4C9.EFD8.1490,basement-k001a-asw1,GigabitEthernet3/43,basement-access,K013-WB2-C1
10.0.40.30,1640,1CDE.A783.A7CD,basement-k001a-asw1,GigabitEthernet3/43,basement-access,K013-WB2-C1
192.168.189.25,189,8851.FB81.72E4,basement-k001a-asw1,GigabitEthernet3/44,basement-access,K002A-WB1-C2
192.168.189.29,189,D4C9.EFD3.E39B,basement-k001a-asw1,GigabitEthernet3/45,basement-access,K002A-WB1-C1
10.0.40.22,1640,3820.5618.1630,basement-k001a-asw1,GigabitEthernet3/45,basement-access,K002A-WB1-C1
10.0.40.39,1640,3820.5618.169B,basement-k001a-asw1,GigabitEthernet3/46,basement-access,K002A-SB1-C2
192.168.189.221,189,001A.4B1C.F810,basement-k001a-asw1,GigabitEthernet3/46,basement-access,K002A-SB1-C2
192.168.189.27,189,F4CE.4613.FF62,basement-k001a-asw1,GigabitEthernet3/47,basement-access,K002A-SB1-C1
10.0.40.25,1640,1CDE.A783.A92C,basement-k001a-asw1,GigabitEthernet3/47,basement-access,K002A-SB1-C1
172.16.45.183,45,0040.1135.7FC6,zph-04721-asw1,GigabitEthernet1/0/15,zph-access,04740-WB1-C1(SECURITY)
10.50.10.183,1045,0040.1935.7AC2,zph-04721-asw1,GigabitEthernet1/0/15,zph-access,04740-WB1-C1(SECURITY)
172.16.45.241,45,00C0.B792.8CD1,zph-04721-asw1,GigabitEthernet1/0/25,zph-access,04721-NETBOTZ
10.50.10.241,1045,1AD1.B792.8AD1,zph-04721-asw1,GigabitEthernet1/0/25,zph-access,04721-NETBO
192.168.189.2,189,00C0.B7B6.3A1A,basement-k001a-asw1,GigabitEthernet3/48,basement-access,Connectiontobasement-k001a-ups1
192.168.x.x and 172.16.x.x are endpoints.
10.0.x.x and 10.50.x.x are phones.
The location match is on $4,$5,$7.
Expected result:
192.168.205.76,189,FC3F.DB02.ED78,basement-k001a-asw1,GigabitEthernet3/1,basement-access,K022E-NB1-C1,10.0.40.46
10.0.40.46,1640,F025.7279.6DAA,basement-k001a-asw1,GigabitEthernet3/1,basement-access,K022E-NB1-C1,N/A
10.68.194.185,1189,9C93.4E2D.EE1A,basement-k001a-asw1,GigabitEthernet3/3,basement-access,K022D-NB2-C2,N/A
192.168.189.26,189,9C8E.99DD.A49F,basement-k001a-asw1,GigabitEthernet3/4,basement-access,K013-EB3-C1,10.0.40.29
10.0.40.29,1640,1CDE.A783.EA7B,basement-k001a-asw1,GigabitEthernet3/4,basement-access,K013-EB3-C1,N/A
192.168.189.230,189,EC9A.7435.2177,basement-k001a-asw1,GigabitEthernet3/6,basement-access,K024-SB1-C1,N/A
192.168.189.34,189,70F3.95C1.11F8,basement-k001a-asw1,GigabitEthernet3/8,basement-access,K020-CF7-C1,N/A
10.0.40.45,1640,0008.2FB7.6F84,basement-k001a-asw1,GigabitEthernet3/11,basement-access,K002A-NB1-C1,N/A
192.168.189.22,189,8851.FB82.5DE3,basement-k001a-asw1,GigabitEthernet3/12,basement-access,K022D-NB1-C2,10.0.40.28
10.0.40.28,1640,3CCE.73AC.ED44,basement-k001a-asw1,GigabitEthernet3/12,basement-access,K022D-NB1-C2,N/A
192.168.189.225,189,9C93.4E4D.1DDA,basement-k001a-asw1,GigabitEthernet3/13,basement-access,K022D-NB2-C1,N/A
10.68.189.182,1189,001C.9B09.0504,basement-k001a-asw1,GigabitEthernet3/15,basement-access,K006-NW1-C1,N/A
10.0.40.42,1640,1CDE.A783.B19B,basement-k001a-asw1,GigabitEthernet3/16,basement-access,K005-NB1-C1,N/A
10.68.189.181,1189,9C93.4E16.D940,basement-k001a-asw1,GigabitEthernet3/17,basement-access,K004-WB1-C2,N/A
192.168.189.233,1189,9C93.4E67.2017,basement-k001a-asw1,GigabitEthernet3/27,basement-access,K013-SB1-C1,N/A
10.68.189.52,1189,0040.580D.157E,basement-k001a-asw1,GigabitEthernet3/28,basement-access,K009HALL-EW5-C1(KRONOS),N/A
192.168.189.31,189,984B.E17D.5BE1,basement-k001a-asw1,GigabitEthernet3/34,basement-access,K013-WB1-C1,N/A
192.168.189.222,189,68B5.9941.32CE,basement-k001a-asw1,GigabitEthernet3/35,basement-access,K004-NB3-C1,N/A
10.0.40.56,1640,0CD9.9691.B9C3,basement-k001a-asw1,GigabitEthernet3/36,basement-access,K024HALL-WW1-C1,N/A
192.168.189.223,189,3CD9.2B0F.E714,basement-k001a-asw1,GigabitEthernet3/39,basement-access,K006-EB1-C2,N/A
10.0.40.44,1640,1CDE.A782.1A7E,basement-k001a-asw1,GigabitEthernet3/41,basement-access,K011-NB1-C2,N/A
192.168.189.224,189,1458.D039.9735,basement-k001a-asw1,GigabitEthernet3/42,basement-access,K013-WB2-C2,N/A
192.168.189.23,189,D4C9.EFD8.1490,basement-k001a-asw1,GigabitEthernet3/43,basement-access,K013-WB2-C1,10.0.40.30
10.0.40.30,1640,1CDE.A783.A7CD,basement-k001a-asw1,GigabitEthernet3/43,basement-access,K013-WB2-C1,N/A
192.168.189.25,189,8851.FB81.72E4,basement-k001a-asw1,GigabitEthernet3/44,basement-access,K002A-WB1-C2,N/A
192.168.189.29,189,D4C9.EFD3.E39B,basement-k001a-asw1,GigabitEthernet3/45,basement-access,K002A-WB1-C1,10.0.40.22
10.0.40.22,1640,3820.5618.1630,basement-k001a-asw1,GigabitEthernet3/45,basement-access,K002A-WB1-C1,N/A
10.0.40.39,1640,3820.5618.169B,basement-k001a-asw1,GigabitEthernet3/46,basement-access,K002A-SB1-C2,N/A
192.168.189.221,189,001A.4B1C.F810,basement-k001a-asw1,GigabitEthernet3/46,basement-access,K002A-SB1-C2,10.0.40.39
192.168.189.27,189,F4CE.4613.FF62,basement-k001a-asw1,GigabitEthernet3/47,basement-access,K002A-SB1-C1,10.0.40.25
10.0.40.25,1640,1CDE.A783.A92C,basement-k001a-asw1,GigabitEthernet3/47,basement-access,K002A-SB1-C1,N/A
172.16.45.183,45,0040.1135.7FC6,zph-04721-asw1,GigabitEthernet1/0/15,zph-access,04740-WB1-C1(SECURITY),10.50.10.183
10.50.10.183,1045,0040.1935.7AC2,zph-04721-asw1,GigabitEthernet1/0/15,zph-access,04740-WB1-C1(SECURITY),N/A
172.16.45.241,45,00C0.B792.8CD1,zph-04721-asw1,GigabitEthernet1/0/25,zph-access,04721-NETBOTZ,10.50.10.241
10.50.10.241,1045,1AD1.B792.8AD1,zph-04721-asw1,GigabitEthernet1/0/25,zph-access,04721-NETBO,N/A
192.168.189.2,189,00C0.B7B6.3A1A,basement-k001a-asw1,GigabitEthernet3/48,basement-access,Connectiontobasement-k001a-ups1,N/A
You can use awk to simplify this:
awk 'BEGIN{FS=OFS=","} {k=$4 FS $5 FS $7}
$1~/^1[79]2\./{if (pr) print pr, "N/A"; pr=$0; pk=k}
$1~/^10\./{if (k == pk) { print pr, $1; pr=""} print $0, "N/A"}
END{if (pr) print pr, "N/A"}' file
Output:
192.168.205.76,189,FC3F.DB02.ED78,basement-k001a-asw1,GigabitEthernet3/1,basement-access,K022E-NB1-C1,10.0.40.46
10.0.40.46,1640,F025.7279.6DAA,basement-k001a-asw1,GigabitEthernet3/1,basement-access,K022E-NB1-C1,N/A
10.68.194.185,1189,9C93.4E2D.EE1A,basement-k001a-asw1,GigabitEthernet3/3,basement-access,K022D-NB2-C2,N/A
192.168.189.26,189,9C8E.99DD.A49F,basement-k001a-asw1,GigabitEthernet3/4,basement-access,K013-EB3-C1,10.0.40.29
10.0.40.29,1640,1CDE.A783.EA7B,basement-k001a-asw1,GigabitEthernet3/4,basement-access,K013-EB3-C1,N/A
192.168.189.230,189,EC9A.7435.2177,basement-k001a-asw1,GigabitEthernet3/6,basement-access,K024-SB1-C1,N/A
10.0.40.45,1640,0008.2FB7.6F84,basement-k001a-asw1,GigabitEthernet3/11,basement-access,K002A-NB1-C1,N/A
192.168.189.34,189,70F3.95C1.11F8,basement-k001a-asw1,GigabitEthernet3/8,basement-access,K020-CF7-C1,N/A
192.168.189.22,189,8851.FB82.5DE3,basement-k001a-asw1,GigabitEthernet3/12,basement-access,K022D-NB1-C2,10.0.40.28
10.0.40.28,1640,3CCE.73AC.ED44,basement-k001a-asw1,GigabitEthernet3/12,basement-access,K022D-NB1-C2,N/A
10.68.189.182,1189,001C.9B09.0504,basement-k001a-asw1,GigabitEthernet3/15,basement-access,K006-NW1-C1,N/A
10.0.40.42,1640,1CDE.A783.B19B,basement-k001a-asw1,GigabitEthernet3/16,basement-access,K005-NB1-C1,N/A
10.68.189.181,1189,9C93.4E16.D940,basement-k001a-asw1,GigabitEthernet3/17,basement-access,K004-WB1-C2,N/A
192.168.189.225,189,9C93.4E4D.1DDA,basement-k001a-asw1,GigabitEthernet3/13,basement-access,K022D-NB2-C1,N/A
10.68.189.52,1189,0040.580D.157E,basement-k001a-asw1,GigabitEthernet3/28,basement-access,K009HALL-EW5-C1(KRONOS),N/A
192.168.189.233,1189,9C93.4E67.2017,basement-k001a-asw1,GigabitEthernet3/27,basement-access,K013-SB1-C1,N/A
192.168.189.31,189,984B.E17D.5BE1,basement-k001a-asw1,GigabitEthernet3/34,basement-access,K013-WB1-C1,N/A
10.0.40.56,1640,0CD9.9691.B9C3,basement-k001a-asw1,GigabitEthernet3/36,basement-access,K024HALL-WW1-C1,N/A
192.168.189.222,189,68B5.9941.32CE,basement-k001a-asw1,GigabitEthernet3/35,basement-access,K004-NB3-C1,N/A
10.0.40.44,1640,1CDE.A782.1A7E,basement-k001a-asw1,GigabitEthernet3/41,basement-access,K011-NB1-C2,N/A
192.168.189.223,189,3CD9.2B0F.E714,basement-k001a-asw1,GigabitEthernet3/39,basement-access,K006-EB1-C2,N/A
192.168.189.224,189,1458.D039.9735,basement-k001a-asw1,GigabitEthernet3/42,basement-access,K013-WB2-C2,N/A
192.168.189.23,189,D4C9.EFD8.1490,basement-k001a-asw1,GigabitEthernet3/43,basement-access,K013-WB2-C1,10.0.40.30
10.0.40.30,1640,1CDE.A783.A7CD,basement-k001a-asw1,GigabitEthernet3/43,basement-access,K013-WB2-C1,N/A
192.168.189.25,189,8851.FB81.72E4,basement-k001a-asw1,GigabitEthernet3/44,basement-access,K002A-WB1-C2,N/A
192.168.189.29,189,D4C9.EFD3.E39B,basement-k001a-asw1,GigabitEthernet3/45,basement-access,K002A-WB1-C1,10.0.40.22
10.0.40.22,1640,3820.5618.1630,basement-k001a-asw1,GigabitEthernet3/45,basement-access,K002A-WB1-C1,N/A
10.0.40.39,1640,3820.5618.169B,basement-k001a-asw1,GigabitEthernet3/46,basement-access,K002A-SB1-C2,N/A
192.168.189.221,189,001A.4B1C.F810,basement-k001a-asw1,GigabitEthernet3/46,basement-access,K002A-SB1-C2,N/A
192.168.189.27,189,F4CE.4613.FF62,basement-k001a-asw1,GigabitEthernet3/47,basement-access,K002A-SB1-C1,10.0.40.25
10.0.40.25,1640,1CDE.A783.A92C,basement-k001a-asw1,GigabitEthernet3/47,basement-access,K002A-SB1-C1,N/A
172.16.45.183,45,0040.1135.7FC6,zph-04721-asw1,GigabitEthernet1/0/15,zph-access,04740-WB1-C1(SECURITY),10.50.10.183
10.50.10.183,1045,0040.1935.7AC2,zph-04721-asw1,GigabitEthernet1/0/15,zph-access,04740-WB1-C1(SECURITY),N/A
172.16.45.241,45,00C0.B792.8CD1,zph-04721-asw1,GigabitEthernet1/0/25,zph-access,04721-NETBOTZ,10.50.10.241
10.50.10.241,1045,1AD1.B792.8AD1,zph-04721-asw1,GigabitEthernet1/0/25,zph-access,04721-NETBOTZ,N/A
192.168.189.2,189,00C0.B7B6.3A1A,basement-k001a-asw1,GigabitEthernet3/48,basement-access,Connectiontobasement-k001a-ups1,N/A
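Note that the output above is not in input order (an endpoint line is held back until its phone or the next endpoint shows up), and an endpoint listed after its phone, such as 192.168.189.221, gets N/A. As a minimal sketch of an alternative, assuming the prefixes stated in the question (10.0. and 10.50. for phones, 192.168. and 172.16. for endpoints) and at most one phone per location, a two-pass awk can read the file twice: the first pass records the phone IP per location key, the second prints every line in its original position. The function name annotate_voip is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: append the matching VoIP IP (or N/A) to every CSV line.
# Assumes phones start with 10.0. or 10.50. and endpoints with 192.168. or 172.16.
annotate_voip() {
    awk 'BEGIN { FS = OFS = "," }
    { key = $4 FS $5 FS $7 }                       # location: switch, port, description
    NR == FNR {                                    # pass 1: remember phone IP per location
        if ($1 ~ /^10\.(0|50)\./) phone[key] = $1
        next
    }
    $1 ~ /^(192\.168|172\.16)\./ && (key in phone) {   # pass 2: endpoint with a phone
        print $0, phone[key]
        next
    }
    { print $0, "N/A" }                            # everything else
    ' "$1" "$1"
}

# Usage (the same file is read twice by awk):
#   annotate_voip list.csv > list_with_voip.csv
```

Because the lookup is by location rather than by the previous line, the order of the phone and endpoint lines in the file does not matter.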
Related
Block IPs that requested more than N times per minute from a log file
I want to block IPs that requested more than N times/min using iptables. I've sorted the log file using this script:
cat $log_path | awk '{print $1, $4}' | sort -n -k 1,4 | sed "s/\[//g"
10.200.3.120 20/May/2021:21:05:04
10.200.3.120 20/May/2021:21:05:17
10.200.3.120 20/May/2021:21:05:18
10.200.3.120 20/May/2021:21:05:19
10.200.3.120 20/May/2021:21:05:20
10.200.3.120 20/May/2021:22:05:39
104.131.19.181 20/May/2021:19:05:31
107.23.7.76 20/May/2021:20:05:16
119.252.76.162 20/May/2021:22:05:00
119.252.76.162 20/May/2021:22:05:01
119.252.76.162 20/May/2021:22:05:01
119.252.76.162 20/May/2021:22:05:04
119.252.76.162 20/May/2021:22:05:04
119.252.76.162 20/May/2021:21:05:10
119.252.76.162 20/May/2021:21:05:44
⋮
In the example log above, two IPs requested more than 4 times in a minute (10.200.3.120, 119.252.76.162) and they should be blocked.
How can I get the number of requests in a time interval for each IP and block those IPs?
You can try this solution:
awk '
{
   gsub(/\[|:[0-9]+$/, "", $4)
   ++fq[$4,$1]
}
END {
   for (i in fq)
      if (fq[i] >= 4) {
         sub(".*" SUBSEP, "", i)
         print "iptables -A INPUT -s", i, "-j DROP"
      }
}' "$log_path" | sh
Here:
The gsub function strips the starting [ and the seconds value from the timestamp.
++fq[$4,$1] increments array element fq by 1, where each element has the composite key $4,$1, i.e. the string $4 SUBSEP $1.
In the END block we loop through the fq array. When fq[i] >= 4 we remove the starting text followed by SUBSEP from the array index to leave only the IP.
Then we print the full iptables command line using the IP we just extracted.
Finally we pipe the awk output to sh to run all commands.
You can block the IP like this:
iptables -A INPUT -s <ip-address-to-block> -j DROP
Adapt your bash script to use this command whenever you see an IP requesting more than you want. The idea is to read your log file at a given frequency and count how many times each IP appears. If it appears more than you want, you drop it.
To unblock the IP, you can use this command:
iptables -D INPUT -s <ip-address-to-unblock> -j DROP
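The counting idea from both answers can be sketched as a dry run that only prints the iptables commands instead of piping them to sh, which makes it easier to review the list first. This is a sketch under assumptions: the log is an access log whose first field is the IP and whose fourth field looks like [20/May/2021:21:05:04, the threshold means strictly more than the limit within one calendar minute, and the function name list_block_commands is hypothetical. Unlike the awk above, it emits each offending IP once even if it exceeds the limit in several minutes.

```shell
#!/bin/bash
# Hypothetical helper: print (do not run) one iptables command per IP that made
# more than LIMIT requests within a single minute.
# Usage: list_block_commands LIMIT LOGFILE
list_block_commands() {
    local limit=$1 log=$2
    awk -v limit="$limit" '
    {
        minute = $4
        gsub(/\[|:[0-9]+$/, "", minute)        # drop leading "[" and trailing ":SS"
        # mark the IP the moment its per-minute count first exceeds the limit
        if (++hits[minute, $1] == limit + 1) offenders[$1] = 1
    }
    END {
        for (ip in offenders) print "iptables -A INPUT -s", ip, "-j DROP"
    }' "$log"
}
```

Once the printed commands look right, the output could be piped to sh as in the first answer.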
read from the nth column on using awk
The nmcli -c no device displays:
DEVICE          TYPE      STATE         CONNECTION
wlp3s0          wifi      connected     My Test Connection
p2p-dev-wlp3s0  wifi-p2p  disconnected  --
enp4s0f1        ethernet  unavailable   --
lo              loopback  unmanaged     --
In order to separate the info on wifi, I have this command:
wf_info="$(nmcli -c no device | grep "wifi[^-]" | awk '{print "wf_devc="$1, "wf_state="$3, "wf_conn="$4}')"
eval "$wf_info"
echo "$wf_devc" # returns wlp3s0
echo "$wf_state" # returns connected
echo "$wf_conn" # returns My (while should be My Test Connection)
The problem with the above command is that for wf_conn it gives me My while it should be the full name My Test Connection. How can I tell the command to read from the 4th column on and not just the 4th column for wf_conn?
You can "collect" the rest of the fields into a single variable and then print it:
read wf_devc wf_state wf_conn < <(nmcli -c no device | awk '/wifi[^-]/{r=""; for(i=4;i<=NF;i++){r=r (i==4 ? "":" ") $i}; print $1" "$3" "r}')
Note that the grep part is incorporated into awk: /wifi[^-]/ makes sure only those lines are printed that contain wifi followed by a char other than a - char.
The r=""; for(i=4;i<=NF;i++){r=r (i==4 ? "":" ") $i} part inits an empty string r and then all fields starting with Field 4 are concatenated using a space.
See the online demo:
#!/bin/bash
s='DEVICE TYPE STATE CONNECTION
wlp3s0 wifi connected My Test Connection
p2p-dev-wlp3s0 wifi-p2p disconnected --
enp4s0f1 ethernet unavailable --
lo loopback unmanaged --'
read wf_devc wf_state wf_conn < <(awk '
/wifi[^-]/{
  r="";
  for(i=4;i<=NF;i++){
    r=r (i==4 ? "":" ") $i
  };
  print $1" "$3" "r
}' <<< "$s")
echo "wf_devc=$wf_devc wf_state=$wf_state wf_conn=$wf_conn"
Output:
wf_devc=wlp3s0 wf_state=connected wf_conn=My Test Connection
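Since bash's read assigns everything left on the line to its last variable, the same result can be sketched without awk. The sample text below stands in for the nmcli output from the question, and wf_type is an extra variable introduced here only to absorb the second column.

```shell
#!/bin/bash
# Sample text standing in for `nmcli -c no device` output (taken from the question)
sample='DEVICE          TYPE      STATE         CONNECTION
wlp3s0          wifi      connected     My Test Connection
p2p-dev-wlp3s0  wifi-p2p  disconnected  --
enp4s0f1        ethernet  unavailable   --
lo              loopback  unmanaged     --'

# read splits on whitespace; the last variable (wf_conn) receives the rest of
# the line, so a connection name containing spaces stays in one piece.
# Only the first matching line is read.
read -r wf_devc wf_type wf_state wf_conn < <(grep 'wifi[^-]' <<< "$sample")

echo "wf_devc=$wf_devc wf_state=$wf_state wf_conn=$wf_conn"
```

With real data, the here-string would be replaced by a pipe from nmcli -c no device. This also avoids eval on command output.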
Store variables from lines in a text file using awk and cut in a for loop
I have a tab separated text file, call it input.txt
cat input.txt
Begin Annotation Diff End Begin,End
6436687 >ENST00000422706.5|ENSG00000100342.21|OTTHUMG00000030427.9|-|APOL1-205|APOL1|2901|protein_coding| 50 6436736 6436687,6436736
6436737 >ENST00000426053.5|ENSG00000100342.21|OTTHUMG00000030427.9|-|APOL1-206|APOL1|2808|protein_coding| 48 6436784 6436737,6436784
6436785 >ENST00000319136.8|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000075315.5|APOL1-201|APOL1|3000|protein_coding| 51 6436835 6436785,6436835
6436836 >ENST00000422471.5|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319151.1|APOL1-204|APOL1|561|nonsense_mediated_decay| 11 6436846 6436836,6436846
6436847 >ENST00000475519.5|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319153.1|APOL1-212|APOL1|600|retained_intron| 11 6436857 6436847,6436857
6436858 >ENST00000438034.5|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319152.2|APOL1-210|APOL1|566|protein_coding| 11 6436868 6436858,6436868
6436869 >ENST00000439680.5|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319252.1|APOL1-211|APOL1|531|nonsense_mediated_decay| 10 6436878 6436869,6436878
6436879 >ENST00000427990.5|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319154.2|APOL1-207|APOL1|624|protein_coding| 12 6436890 6436879,6436890
6436891 >ENST00000397278.8|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319100.4|APOL1-202|APOL1|2795|protein_coding| 48 6436938 6436891,6436938
6436939 >ENST00000397279.8|ENSG00000100342.21|OTTHUMG00000030427.9|-|APOL1-203|APOL1|1564|protein_coding| 28 6436966 6436939,6436966
6436967 >ENST00000433768.5|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319253.2|APOL1-209|APOL1|541|protein_coding| 11 6436977 6436967,6436977
6436978 >ENST00000431184.1|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319254.1|APOL1-208|APOL1|550|nonsense_mediated_decay| 11 6436988 6436978,6436988
Using the information in input.txt I want to obtain information from a file called Other_File.fa.
This file is an annotation file filled with ENST#'s (transcript IDs) and sequences of A's, T's, C's, and G's. I want to store the sequence in a file called Output.log (see example below) and I want to store the command used to retrieve the text in a file called Input.log (see example below). I have tried to do this using awk and cut so far using a for loop. This is the code I have tried.
for line in `awk -F "\\t" 'NR != 1 {print substr($2,2,17)"#"$5}' input.txt`
do
transcript=`cut -d "#" -f 1 $line`
range=`cut -d "#" -f 2 $line` #Range is the string location in Other_File.fa
echo "Our transcript is ${transcript} and our range is ${range}" >> Input.log
sed -n '${range}' Other_File.fa >> Output.log
done
Here is an example of the 11 lines between ENST00000433768.5 and ENST00000431184.1 in Other_File.fa.
grep -A 11 ENST00000433768.5 Other_File.fa
>ENST00000433768.5|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319253.2|APOL1-209|APOL1|541|protein_coding|
ATCCACACAGCTCAGAACAGCTGGATCTTGCTCAGTCTCTGCCAGGGGAAGATTCCTTGG
AGGAGCACACTGTCTCAACCCCTCTTTTCCTGCTCAAGGAGGAGGCCCTGCAGCGACATG
GAGGGAGCTGCTTTGCTGAGAGTCTCTGTCCTCTGCATCTGGATGAGTGCACTTTTCCTT
GGTGTGGGAGTGAGGGCAGAGGAAGCTGGAGCGAGGGTGCAACAAAACGTTCCAAGTGGG
ACAGATACTGGAGATCCTCAAAGTAAGCCCCTCGGTGACTGGGCTGCTGGCACCATGGAC
CCAGGCCCAGCTGGGTCCAGAGGTGACAGTGGAGAGCCGTGTACCCTGAGACCAGCCTGC
AGAGGACAGAGGCAACATGGAGGTGCCTCAAGGATCAGTGCTGAGGGTCCCGCCCCCATG
CCCCGTCGAAGAACCCCCTCCACTGCCCATCTGAGAGTGCCCAAGACCAGCAGGAGGAAT
CTCCTTTGCATGAGAGCAGTATCTTTATTGAGGATGCCATTAAGTATTTCAAGGAAAAAG
T
>ENST00000431184.1|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319254.1|APOL1-208|APOL1|550|nonsense_mediated_decay|
The range value in input.txt for this transcript is 6436967,6436977.
In my file Input.log for this transcript I hope to get
Our transcript is ENST00000433768.5 and our range is 6436967,6436977
And in Output.log for this transcript I hope to get
>ENST00000433768.5|ENSG00000100342.21|OTTHUMG00000030427.9|OTTHUMT00000319253.2|APOL1-209|APOL1|541|protein_coding|
ATCCACACAGCTCAGAACAGCTGGATCTTGCTCAGTCTCTGCCAGGGGAAGATTCCTTGG
AGGAGCACACTGTCTCAACCCCTCTTTTCCTGCTCAAGGAGGAGGCCCTGCAGCGACATG
GAGGGAGCTGCTTTGCTGAGAGTCTCTGTCCTCTGCATCTGGATGAGTGCACTTTTCCTT
GGTGTGGGAGTGAGGGCAGAGGAAGCTGGAGCGAGGGTGCAACAAAACGTTCCAAGTGGG
ACAGATACTGGAGATCCTCAAAGTAAGCCCCTCGGTGACTGGGCTGCTGGCACCATGGAC
CCAGGCCCAGCTGGGTCCAGAGGTGACAGTGGAGAGCCGTGTACCCTGAGACCAGCCTGC
AGAGGACAGAGGCAACATGGAGGTGCCTCAAGGATCAGTGCTGAGGGTCCCGCCCCCATG
CCCCGTCGAAGAACCCCCTCCACTGCCCATCTGAGAGTGCCCAAGACCAGCAGGAGGAAT
CTCCTTTGCATGAGAGCAGTATCTTTATTGAGGATGCCATTAAGTATTTCAAGGAAAAAG
T
But I am getting the following error, and I am unsure as to why or how to fix it.
cut: ENST00000433768.5#6436967,6436977: No such file or directory
cut: ENST00000433768.5#6436967,6436977: No such file or directory
Our transcript is and our range is
My thought was each line from the awk would be read as a string, then cut could split the string along the "#" symbol I have added, but it is reading each line as a file and throwing an error when it can't locate the file in my directory. Thanks.
EDIT2: This is a generic solution which compares the 2 files (input and other_file.fa) and, on whichever line a range is found, prints the lines in that range. E.g., if the range numbers are found on line 300 but the range says to print lines 1 to 20, it works in that case too. Also note that this calls the system command, which in turn calls sed (like you were using the range within sed). There are other ways too, such as loading the whole Input_file into an array and then printing, but I am going with this one here. Fair warning: this is not tested with huge files.
awk -F'[>| ]' '
FNR==NR{
  arr[$2]=$NF
  next
}
($2 in arr){
  split(arr[$2],lineNum,",")
  print arr[$2]
  start=lineNum[1]
  end=lineNum[2]
  print "sed -n \047" start","end"p \047 " FILENAME
  system("sed -n \047" start","end"p\047 " FILENAME)
  start=end=0
}
' file1 FS="[>|]" other_file.fa
EDIT: With OP's edited samples, please try the following to print lines based on the other file. It assumes that the range values always refer to lines after the line on which they are found (e.g., the range is found on the 3rd line and the range is 4 to 10).
awk -F'[>| ]' '
FNR==NR{
  arr[$2]=$NF
  next
}
($2 in arr){
  split(arr[$2],lineNum," ")
  start=lineNum[1]
  end=lineNum[2]
}
FNR>=start && FNR<=end{
  print
  if(FNR==end){
    start=end=0
  }
}
' file1 FS="[>|]" other_file.fa
You need not do this with a for loop that calls an awk program for each line. This can be done in a single awk, considering that you only have to print them. Written and tested with your shown samples.
awk -F'[>| ]' 'FNR>1{print "Our transcript is:"$3" and our range is:"$NF}' Input_file
NOTE: This prints the transcript and range values for each line of your Input_file. In case you want to perform some further operation with their values, please do mention it.
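As for the error itself: cut treats a trailing argument as a file name, so cut -d "#" -f 1 $line tries to open a file called ENST00000433768.5#6436967,6436977. A minimal sketch of splitting the string in the shell instead, using parameter expansion, follows. The helper name print_range is hypothetical; note also that sed needs double quotes for ${range} to expand and a trailing p to print.

```shell
#!/bin/bash
# Hypothetical helper: print the lines of file $2 covered by the "start,end" range in $1
print_range() {
    sed -n "${1}p" "$2"      # double quotes so the range expands; "p" prints it
}

line='ENST00000433768.5#6436967,6436977'
transcript=${line%%#*}       # everything before the first "#"
range=${line#*#}             # everything after the first "#"
echo "Our transcript is ${transcript} and our range is ${range}"

# Inside the original loop this would become, for example:
#   print_range "$range" Other_File.fa >> Output.log
```

Feeding the string to cut on standard input (cut -d "#" -f 1 <<< "$line") would also work in bash, at the cost of one extra process per field.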
Get package name and corr. data from file
I've been banging my head lately, trying to parse dumpsys output. Here is the output:
NotificationRecord(0x4297d448: pkg=com.android.systemui user=UserHandle{0} id=273 tag=null score=0: Notification(pri=0 icon=7f020148 contentView=com.android.systemui/0x1090069 vibrate=null sound=null defaults=0x0 flags=0x2 when=0 ledARGB=0x0 contentIntent=N deleteIntent=N contentTitle=6 contentText=15 tickerText=6 kind=[null]))
  uid=10012 userId=0
  icon=0x7f020148 / com.android.systemui:drawable/stat_sys_no_sim
  pri=0 score=0
  contentIntent=null
  deleteIntent=null
  tickerText=No SIM
  contentView=android.widget.RemoteViews#429c1f58
  defaults=0x00000000 flags=0x00000002
  sound=null
  vibrate=null
  led=0x00000000 onMs=0 offMs=0
  extras={
    android.title=No SIM
    android.subText=null
    android.showChronometer=false
    android.icon=2130837832
    android.text=Insert SIM card
    android.progress=0
    android.progressMax=0
    android.showWhen=true
    android.infoText=null
    android.progressIndeterminate=false
    android.scoreModified=false
  }
NotificationRecord(0x427e1878: pkg=jackpal.androidterm user=UserHandle{0} id=1 tag=null score=0: Notification(pri=0 icon=7f02000d contentView=jackpal.androidterm/0x1090069 vibrate=null sound=null defaults=0x0 flags=0x62 when=1456782124817 ledARGB=0x0 contentIntent=Y deleteIntent=N contentTitle=17 contentText=27 tickerText=27 kind=[null]))
  uid=10094 userId=0
  icon=0x7f02000d / jackpal.androidterm:drawable/ic_stat_service_notification_icon
  pri=0 score=0
  contentIntent=PendingIntent{42754f78: PendingIntentRecord{42802aa0 jackpal.androidterm startActivity}}
  deleteIntent=null
  tickerText=Terminal session is running
  contentView=android.widget.RemoteViews#4279b510
  defaults=0x00000000 flags=0x00000062
  sound=null
  vibrate=null
  led=0x00000000 onMs=0 offMs=0
  extras={
    android.title=Terminal Emulator
    android.subText=null
    android.showChronometer=false
    android.icon=2130837517
    android.text=Terminal session is running
    android.progress=0
    android.progressMax=0
    android.showWhen=true
    android.infoText=null
    android.progressIndeterminate=false
    android.scoreModified=false
  }
NotificationRecord(0x429381f8: pkg=com.droidsail.dsapp2sd user=UserHandle{0} id=128 tag=null score=0: Notification(pri=0 icon=7f020000 contentView=com.droidsail.dsapp2sd/0x1090069 vibrate=null sound=null defaults=0x0 flags=0x10 when=1456786729004 ledARGB=0x0 contentIntent=Y deleteIntent=N contentTitle=13 contentText=35 tickerText=35 kind=[null]))
  uid=10107 userId=0
  icon=0x7f020000 / com.droidsail.dsapp2sd:drawable/appicon
  pri=0 score=0
  contentIntent=PendingIntent{42955a60: PendingIntentRecord{4286db18 com.droidsail.dsapp2sd startActivity}}
  deleteIntent=null
  tickerText=Detected new app can be moved to SD
  contentView=android.widget.RemoteViews#42a891a8
  defaults=0x00000000 flags=0x00000010
  sound=null
  vibrate=null
  led=0x00000000 onMs=0 offMs=0
  extras={
    android.title=New app to SD
    android.subText=null
    android.showChronometer=false
    android.icon=2130837504
    android.text=Detected new app can be moved to SD
    android.progress=0
    android.progressMax=0
    android.showWhen=true
    android.infoText=null
    android.progressIndeterminate=false
    android.scoreModified=false
  }
NotificationRecord(0x423708b0: pkg=android user=UserHandle{-1} id=17041135 tag=null score=0: Notification(pri=0 icon=1080399 contentView=android/0x1090069 vibrate=null sound=null defaults=0x0 flags=0x1002 when=0 ledARGB=0x0 contentIntent=Y deleteIntent=N contentTitle=19 contentText=17 tickerText=N kind=[android.system.imeswitcher]))
  uid=1000 userId=-1
  icon=0x1080399 / android:drawable/ic_notification_ime_default
  pri=0 score=0
  contentIntent=PendingIntent{425a8960: PendingIntentRecord{426f84b0 android broadcastIntent}}
  deleteIntent=null
  tickerText=null
  contentView=android.widget.RemoteViews#428846b8
  defaults=0x00000000 flags=0x00001002
  sound=null
  vibrate=null
  led=0x00000000 onMs=0 offMs=0
  extras={
    android.title=Choose input method
    android.subText=null
    android.showChronometer=false
    android.icon=17302425
    android.text=Hacker's Keyboard
    android.progress=0
    android.progressMax=0
    android.showWhen=true
    android.infoText=null
    android.progressIndeterminate=false
    android.scoreModified=false
  }
I want to get the package name and the corresponding extras={} for each of them. For example:
pkg:com.android.systemui
extras={ ..... }
So far I've tried:
dumpsys notification | awk '/pkg=/,/\n}/'
But without any success. I'm a newbie to awk, and if possible I want to do it with awk or perl. Of course, any other tool like sed or grep is fine by me too, I just wanna parse it somehow. Can anyone help me?
If you have GNU awk, try the following:
awk -v RS='(^|\n)NotificationRecord\\([^=]+=' \
  'NF { print "pkg:" $1; print gensub(/^.*\n\s*(extras=\{[^}]+\}).*$/, "\\1", 1) }' file
-v RS='(^|\n)NotificationRecord\\([^=]+=' breaks the input into records by lines starting with NotificationRecord( up to and including the following = char. In effect, that means you get records starting with the package names (com.android.systemui, ...).
NF is a condition that only executes the following block if it evaluates to nonzero; NF is the count of fields in the record, so as long as at least 1 field is present, the block is evaluated. In effect, this skips the implied empty record before the very first line.
print "pkg:" $1 prints the package name, prefixed with literal pkg:.
gensub(/^.*\n\s*(extras=\{[^}]+\}).*$/, "\\1", 1) matches the entire record and replaces it with the extras property captured via a capture group, effectively returning the extras property only.
I would suggest perl over awk, because you'll be storing whether you're inside the extras=... block in a variable:
dumpsys notification | perl -lne '
  print $1 if /^Notif.*?: pkg=(\S+)/;
  $in_extras = 0 if /^ \}/;
  print if $in_extras;
  $in_extras = 1 if /^ extras=\{/'
Oh, if you want the extra pkg: and extras= text, slight modification:
dumpsys notification | perl -lne '
  print "pkg: $1" if /^Notif.*?: pkg=(\S+)/;
  $in_extras = 1 if /^ extras=\{/;
  print if $in_extras;
  $in_extras = 0 if /^ \}/;'
Sed version:
dumpsys notification |\
sed -n 's/.*pkg=\([^ ]*\).*/pkg:\1/p;/^ extras={$/,/^ }$/s/^ //p'
I'm assuming you always have two spaces in front of extras={ and } and you also want to remove these spaces.
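The same flag-variable idea also works in plain POSIX awk, without GNU extensions. The following is a sketch, assuming each record header starts a line with NotificationRecord( (optionally indented) and that extras={ and the closing } each sit on their own line. It strips all leading spaces from the printed lines, and the function name extract_extras is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read dumpsys-style text on stdin, print "pkg:<name>"
# followed by the extras={ ... } block of every notification record.
extract_extras() {
    awk '
    /^ *NotificationRecord[(]/ {
        # the first pkg=... token on the header line is the package name
        if (match($0, /pkg=[^ ]+/))
            print "pkg:" substr($0, RSTART + 4, RLENGTH - 4)
    }
    /^ *extras=[{]/ { inside = 1 }        # entering the extras block
    inside {
        line = $0
        sub(/^ +/, "", line)              # drop leading indentation
        print line
    }
    inside && /^ *[}] *$/ { inside = 0 }  # a lone "}" closes the block
    '
}

# Usage:
#   dumpsys notification | extract_extras
```

Bracket expressions such as [{] are used instead of \{ so the patterns do not depend on how a particular awk treats escaped braces.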
Conditional Sort using Awk or sort
Alright, so I asked a question a week or so ago about how I could use sed or awk to extract a block of text between two blank lines, as well as omit part of the extracted text. The answers I got pretty much satisfied my needs, but now I'm doing something extra for fun (and for OCD's sake).
I want to sort the output from awk in this round. I found this question & answer but it doesn't quite help me to solve the problem. I've also tried wrapping my head around a lot of awk documentation as well to try and figure out how I could do this, to no avail.
So here's the block of code in my script that does all the dirty work:
# This block of stuff fetches the nameservers as reported by the registrar and DNS zone
# Then it gets piped into awk to work some more formatting magic...
# The following is a step-for-step description since I can't put comments inside the awk block:
# BEGIN:
#   Set the record separator to a blank line
#   Set the input/output field separators to newlines
# FNR == 3:
#   The third block of dig's output is the nameservers reported by the registrar
#   Also blanks the last field & strips it since it's just a useless dig comment
dig +trace +additional $host | \
awk -v host="$host" '
BEGIN {
  RS = "";
  FS = "\n"
}
FNR == 3 {
  print "Nameservers of",host,"reported by the registrar:";
  OFS = "\n";
  $NF = "";
  sub( /[[:space:]]+$/, "" );
  print
}
'
And here's the output if I pass google.com in as the value of $host (other hostnames may produce output of differing line counts):
Nameservers of google.com reported by the registrar:
google.com. 172800 IN NS ns2.google.com.
google.com. 172800 IN NS ns1.google.com.
google.com. 172800 IN NS ns3.google.com.
google.com. 172800 IN NS ns4.google.com.
ns2.google.com. 172800 IN A 216.239.34.10
ns1.google.com. 172800 IN A 216.239.32.10
ns3.google.com. 172800 IN A 216.239.36.10
ns4.google.com. 172800 IN A 216.239.38.10
The idea is, using either the existing block of awk, or piping awk's output into a combination of more awk, sort, or whatever else, sort that block of text using a conditional algorithm:
if ( column 4 == 'NS' )
  sort by column 5
else // This will ensure that the col 1 sort includes A and AAAA records
  sort by column 1
I've pretty much got the same preferences for answers as the previous question:
Most important of all, it must be portable since I've encountered different behaviour between OS X (my home system) and Fedora (what I use at work) when using sed (had to replace it with gsed on OS X) and grep's -m flag (used in another script)
An explanation of how the solution works would be very much appreciated, as a learning opportunity moreso than anything else. I already learned quite a bit from the awk solution already provided in the previous question.
If the solution can be implemented within the same block of awk, that would also be awesome
If not, then something simple and eloquent that I can pipe awk's output through would suffice
Here's a solution based on @shellter's idea. Pipe the output of your nameserver records to this:
awk '$4 == "NS" {print $1, $5, $0} $4 == "A" {print $1, $1, $0}' | sort | cut -f3- -d' '
Explanation:
With awk, we take only the NS and A records, and re-print the same line with a prefix: primary sort column + secondary sort column
sort will sort the lines; thanks to the way we set the first and second column, the order should be as you wanted
With cut we get rid of the prefix that we used for sorting
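The decorate, sort, undecorate pattern described above can be sketched so that it also keeps record types other than NS and A (for example AAAA), as the question asked. This is a sketch under assumptions: the group prefix (0 for NS, 1 for everything else) is added here to keep NS records first, LC_ALL=C is set so that sort collates the same way on OS X and Fedora, and the function name sort_records is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read DNS records on stdin, sort NS records by column 5
# and all other records by column 1, NS records first.
sort_records() {
    awk '{
        # decorate: prepend "<group> <sort key>" to the unchanged line
        if ($4 == "NS") { group = 0; key = $5 } else { group = 1; key = $1 }
        print group, key, $0
    }' |
    LC_ALL=C sort -k1,1n -k2,2 |    # sort by group, then by key
    cut -d" " -f3-                  # undecorate: drop the two prefix fields
}

# Usage: pipe the nameserver lines (without the heading line) through it:
#   ... | sort_records
```

The two prefix fields never contain spaces, so cut can remove them again without touching the original line.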
I know you asked about an awk solution, but since you tagged it with bash too, I thought I'd provide such a version. It should also be more portable than awk ;)
# the whole line
declare -a lines
# the key to use for sorting
declare -a keys
# insert into the arrays at the appropriate position
function insert
{
  local key="$1"
  local line="$2"
  local count=${#lines[*]}
  local i
  # go from the end backwards
  for((i=count; i>0; i-=1))
  do
    # if we have the insertion point, break
    [[ "${keys[i-1]}" > "$key" ]] || break
    # shift the current item to make room for the new one
    lines[i]=${lines[i-1]}
    keys[i]=${keys[i-1]}
  done
  # insert the new item
  lines[i]=$line
  keys[i]=$key
}
# This block of stuff fetches the nameservers as reported by the registrar and DNS zone
# The third block of dig's output is the nameservers reported by the registrar
# Also blanks the last field & strips it since it's just a useless dig comment
block=0
dig +trace +additional $host |
while read f1 f2 f3 f4 f5
do
  # empty line begins new block
  if [ -z "$f1" ]
  then
    # increment block counter
    block=$((block+1))
    # and read next line
    continue
  fi
  # if we are not in block #3, read next line
  [[ $block == 3 ]] || continue
  # ;; ends the block
  if [[ "$f1" == ";;" ]]
  then
    echo "Nameservers of $host reported by the registrar:"
    # print the lines collected so far
    for((i=0; i<${#lines[*]}; i+=1))
    do
      echo ${lines[i]}
    done
    # don't bother reading the rest
    break
  fi
  # figure out what key to use for sorting
  if [[ "$f4" == "NS" ]]
  then
    key=$f5
  else
    key=$f1
  fi
  # add the line to the arrays
  insert "$key" "$f1 $f2 $f3 $f4 $f5"
done