Delete text in file after a match - bash

I have a file with the following:
/home/adversion/web/wp-content/plugins/akismet/index1.php: PHP.Mailer-7 FOUND
/home/beckydodman/web/oldshop/images/google68274020601e.php: Trojan.PHP-1 FOUND
/home/resurgence/web/Issue 272/Batch 2 for Helen/keynote_Philip Baldwin (author revise).doc: W97M.Thus.A FOUND
/home/resurgence/web/Issue 272/from Helen/M keynote_Philip Baldwin.doc: W97M.Thus.A FOUND
/home/skda/web/clients/sandbox/wp-content/themes/editorial/cache/external_dc8e1cb5bf0392f054e59734fa15469b.php: Trojan.PHP-58 FOUND
I need to clean this file up by removing everything from the colon (:) onward, so that it looks like this:
/home/adversion/web/wp-content/plugins/akismet/index1.php
/home/beckydodman/web/oldshop/images/google68274020601e.php
/home/resurgence/web/Issue 272/Batch 2 for Helen/keynote_Philip Baldwin (author revise).doc
/home/resurgence/web/Issue 272/from Helen/M keynote_Philip Baldwin.doc
/home/skda/web/clients/sandbox/wp-content/themes/editorial/cache/external_dc8e1cb5bf0392f054e59734fa15469b.php

Use awk:
$ awk -F: '{print $1}' input
/home/adversion/web/wp-content/plugins/akismet/index1.php
/home/beckydodman/web/oldshop/images/google68274020601e.php
/home/resurgence/web/Issue 272/Batch 2 for Helen/keynote_Philip Baldwin (author revise).doc
/home/resurgence/web/Issue 272/from Helen/M keynote_Philip Baldwin.doc
/home/skda/web/clients/sandbox/wp-content/themes/editorial/cache/external_dc8e1cb5bf0392f054e59734fa15469b.php
or cut
$ cut -d: -f1 input
or sed
$ sed 's/:.*$//' input
or perl in awk-mode
$ perl -F: -lane 'print $F[0]' input
finally, pure bash
#!/bin/bash
while IFS= read -r line
do
  echo "${line%%:*}"    # strip everything from the first ":" onward
done < input

This should be enough
awk -F: '{print $1}' file-name

Here is a non-sed/awk solution:
cut -d : -f 1 [filename]

pipe that through sed:
$ echo "/home/adversion/web/wp-content/plugins/akismet/index1.php: PHP.Mailer-7 FOUND" | sed 's/: .*$//'
/home/adversion/web/wp-content/plugins/akismet/index1.php
This will work as long as ': ' doesn't appear more than once. Note that the awk/cut examples above are more likely to fail, as they match ':' rather than ': '.
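If you want awk to be just as strict, a small sketch using ': ' as the field separator (POSIX awk treats a multi-character -F value as a regular expression), so a bare ':' inside a path no longer splits the line:
$ awk -F': ' '{print $1}' input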

Related

Extract a property value from a text file

I have a log file which contains lines like the following one:
Internal (reserved=1728469KB, committed=1728469KB)
I'd need to extract the value contained in "committed", so 1728469
I'm trying to use awk for that
cat file.txt | awk '{print $4}'
However that produces:
committed=1728469KB)
This is still incomplete and would still need some work. Is there a simpler solution to do that instead?
Thanks
Could you please try the following, using awk's match function.
awk 'match($0,/committed=[0-9]+/){print substr($0,RSTART+10,RLENGTH-10)}' Input_file
With GNU grep, using its \K feature:
grep -oP '.*committed=\K[0-9]*' Input_file
Output will be 1728469 in both above solutions.
1st solution explanation:
awk '                                      ##Start the awk program.
match($0,/committed=[0-9]+/){              ##Use the match function to find committed= followed by digits on the current line.
  print substr($0,RSTART+10,RLENGTH-10)    ##Print the substring starting at RSTART+10, of length RLENGTH-10, i.e. just the digits.
}
' Input_file                               ##The input file name goes here.
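If you would rather not hard-code the 10-character length of "committed=", here is a variation on the same match idea (an untested sketch, not one of the answers above):
awk 'match($0,/committed=[0-9]+/){
  v=substr($0,RSTART,RLENGTH)     ##Grab the whole "committed=1728469" match.
  sub(/committed=/,"",v)          ##Drop the key, keeping only the digits.
  print v
}' Input_file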
Sed is better at simple matching tasks:
sed -n 's/.*committed=\([0-9]*\).*/\1/p' input_file
$ awk -F'[=)]' '{print $3}' file
1728469KB
You can try this:
str="Internal (reserved=1728469KB, committed=1728469KB)"
echo $str | awk '{print $3}' | cut -d "=" -f2 | rev | cut -c4- | rev
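A shorter pipeline in the same spirit, assuming the key is always written literally as committed= (a sketch using grep -o, available in both GNU and BSD grep):
grep -o 'committed=[0-9]*' file.txt | cut -d= -f2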

Linux get data from each line of file

I have a file with many (~2k) lines similar to:
117 VALID|AUTHEN tcp:10.92.163.5:64127 uniqueID=nwCelerra
....
991 VALID|AUTHEN tcp:10.19.16.21:58332 uniqueID=smUNIX
I want only the IP address (10.19.16.21 shown above) and the value of the uniqueID (smUNIX shown above)
I am able to get close with:
cat t.txt|cut -f2- -d':'
10.22.36.69:46474 uniqueID=smwUNIX
...
I am on Linux using bash.
Using awk:
awk '{split($3,a,":"); split($4,b,"="); print a[2] " " b[2]}'
By default it splits on whitespace; with some extra code you can split the subfields.
Update:
Even easier, overriding the default delimiter:
awk -F '[:=]' '{print $2 " "$4}'
Using grep and sed:
grep -oP "^\d+ [A-Z]+\|[A-Z]+ \w+:\K(.*)" | sed "s/ uniqueID=/ /g"
outputs:
10.92.163.5:64127 nwCelerra
10.19.16.21:58332 smUNIX
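If you prefer to stay in plain bash, a rough sketch with read and parameter expansion; it assumes exactly the four-column layout shown in the sample (count, status, tcp:IP:port, uniqueID=name):
while read -r _ _ addr uid _; do
    ip=${addr#tcp:}                 # drop the "tcp:" prefix
    ip=${ip%:*}                     # drop the ":port" suffix
    echo "$ip ${uid#uniqueID=}"     # print the IP and the uniqueID value
done < t.txt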

grep 2 elements in a line and print them

Here is my issue: I have a file with entries like the one below, and I would like to get just the date plus the last command after the last "]: ".
Aug 17 14:25:17 snaper[22134]: [ip:10.1.15.245 37985 10.1.15.18 22 uid:10000 sid:21680 tty: cwd:/data/www/hybris/hybris/bin/platform filename:/bin/ps]: /bin/ps -p 6763
How can I get it when I cat the file?
I can get the date with:
awk '{print $1,$2,$3}'
and the last command with :
awk -F': ' '{print $NF}'
But how do I combine them to get it on a single line?
I'm not limited to awk; any sed, grep, or other command is OK for me :)
Thanks in advance
Just remove everything between the date and the last command:
sed 's/^\(... .. ..:..:..\).*: /\1 /'
A simple solution using awk:
$ awk '{print $1,$2,$3, $(NF-2), $(NF-1), $NF }' file
Aug 17 14:25:17 /bin/ps -p 6763
Using GNU grep
grep -oP '^.{15}|.*\]: \K.*' file | paste - -
It's possible to use tee to feed one input to two commands:
echo 'Aug 17 14:25:17 snaper[22134]: [ip:10.1.15.245 37985 10.1.15.18 22 uid:10000 sid:21680 tty: cwd:/data/www/hybris/hybris/bin/platform filename:/bin/ps]: /bin/ps -p 6763' | tee >(awk -F': ' '{print $NF}') | awk '{print $1,$2,$3}' | tr '\n' ' '
and we have the output:
Aug 17 14:25:17 /bin/ps -p 6763
$ s="Aug 17 14:25:17 snaper[22134]: [ip:10.1.15.245 37985 10.1.15.18 22 uid:10000 sid:21680 tty: cwd:/data/www/hybris/hybris/bin/platform filename:/bin/ps]: /bin/ps -p 6763"
Use sed to achieve your goal:
$ sed -r 's/(^.*:[0-9]{2}) .*]:/\1/' <<< "$s"
Try the following solutions too, assuming your Input_file has the same data format as the sample shown.
1st solution: using a simple cut command.
cut -d" " -f1,2,3,14,15,16 Input_file
2nd solution: using awk, making the strings " snaper" and "]:" the field separators.
awk -F' snaper|]:' '{print $1,$4}' Input_file
3rd solution: making the record separator a space and then printing only the records we need, as per the OP's request.
awk -v RS=" " 'NR<4||NR>13{printf("%s%s",$0,NR<3||NR<16?" ":"")}' Input_file
4th solution: substituting away everything from snaper[ through the last "]: " to get what the OP requested.
awk '{sub(/snaper\[.*\]: /,"");print}' Input_file
5th solution: using --re-interval here (as I have an old version of awk); you can drop it if you have a recent awk on your system.
awk --re-interval '{match($0,/.*[0-9]{2}:[0-9]{2}:[0-9]{2}/);print substr($0,RSTART,RLENGTH),$(NF-2),$(NF-1),$NF}' Input_file
6th solution: using sed, capturing everything before snaper and everything after the last colon, and printing only those parts.
sed 's/\(.[^s]*\)\(.*:\)\(.*\)/\1\3/' Input_file
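And for completeness, a rough plain-bash sketch (not one of the solutions above); it assumes the timestamp is always the first 15 characters and the command always follows the last "]: " on the line:
while IFS= read -r line; do
    echo "${line:0:15} ${line##*]: }"    # date prefix + text after the last "]: "
done < Input_file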

bash scripting removing optional <Integer><colon> prefix

I have a list where all of the content looks like:
1:NetworkManager-0.9.9.0-28.git20131003.fc20.x86_64
avahi-0.6.31-21.fc20.x86_64
2:irqbalance-1.0.7-1.fc20.x86_64
abrt-addon-kerneloops-2.1.12-2.fc20.x86_64
mdadm-3.3-4.fc20.x86_64
I need to remove the N: prefix but leave the rest of the strings as-is.
I have tried:
cat service-rpmu.list | sed -ne "s/#[#:]\+://p" > end.list
cat service-rpmu.list | egrep -o '#[#:]+' > end.list
both result in an empty end.list
(The N: just denotes an epoch version.)
With sed:
sed 's/^[0-9]\+://' your.file
Output:
NetworkManager-0.9.9.0-28.git20131003.fc20.x86_64
avahi-0.6.31-21.fc20.x86_64
irqbalance-1.0.7-1.fc20.x86_64
abrt-addon-kerneloops-2.1.12-2.fc20.x86_64
mdadm-3.3-4.fc20.x86_64
Btw, your list looks like the output of a grep command with the option -n. If this is true, then omit the -n option there. Also it is likely that your whole task can be done with a single sed command.
awk -F: '{ sub(/^.*:/,""); print}' sample
Here is another way with awk:
awk -F: '{print $NF}' service-rpmu.list
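A pure-bash sketch of the same substitution, assuming the prefix, when present, is always digits followed by a single colon:
while IFS= read -r line; do
    if [[ $line =~ ^[0-9]+:(.*) ]]; then
        echo "${BASH_REMATCH[1]}"    # strip the epoch prefix
    else
        echo "$line"                 # no prefix; print the line unchanged
    fi
done < service-rpmu.list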

How can I switch around the content of a line of text

I have a large file (around 39,000 lines of text) that consists of the following:
1:iowemiowe093j4384d
2:98j238d92dd2d
3:98h2d078h78dbe0c
(continues in the same manner)
and I need to reverse the order of the two sections of the lines, so the output would be:
iowemiowe093j4384d:1
98j238d92dd2d:2
98h2d078h78dbe0c:3
I've tried using cut to do this but have not been able to get it to behave properly (this is in a bash environment). What would be the best way to do this?
awk -F: '{print $2":"$1}' input-file
Or
awk -F: '{print $2,$1}' OFS=: input-file
If you may have more than 2 fields:
awk -F: '{ printf "%s", $NF; for (i = NF-1; i; i--) printf ":%s", $i; print "" }' input-file
Or
perl -F: -anE '$, = ":"; say reverse @F' input-file
or
perl -F: -anE 'say join(":", reverse @F)' input-file
(Both perl solutions are untested and, I believe, flawed; each requires a chop $F[-1] or similar to remove the newline in the input.)
One way using GNU sed:
sed -ri 's/([^:]+):(.*)/\2:\1/' file.txt
Results:
iowemiowe093j4384d:1
98j238d92dd2d:2
98h2d078h78dbe0c:3
Not strictly pure Bash (it relies on paste and cut via process substitution), but almost as fast as the awk solution from William Pursell, just not as elegant:
paste -d: <(cut -d: -f2 input-file) <(cut -d: -f1 input-file)
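If you really do want pure Bash with no external tools, a sketch using parameter expansion (assuming exactly one ':' per line, as in the sample):
while IFS= read -r line; do
    echo "${line#*:}:${line%%:*}"    # text after the colon, a colon, then text before it
done < file.txt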
