How to match this with sed? - sorting

I have a list of email addresses. I want to remove the ones that start with numbers and capital letters only. For example if the file contains:
0035EA7C#xxxx.com
A7C0035E#zzzz.com
email#yyy.com
I need to delete the first 2 lines in SSH.
Thanks!

You can use grep to get the desired result:
grep -v '^[0-9[:upper:]]\+#'
^ matches the beginning of a line. [...] is a character class containing digits and uppercase letters; \+ requires it to match one or more times. # stands for itself. The -v option inverts the match, so only the lines that do not match are printed.
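A minimal sketch of that command run against the sample lines from the question. It assumes grep -E with + in place of the GNU-specific \+, which should behave the same here; the variable names input and kept are only for illustration.

```shell
# Sample data from the question, one address per line.
input='0035EA7C#xxxx.com
A7C0035E#zzzz.com
email#yyy.com'

# -v inverts the match: keep only lines that do NOT start with a run of
# digits/uppercase letters immediately followed by '#'.
kept=$(printf '%s\n' "$input" | grep -Ev '^[0-9[:upper:]]+#')

printf '%s\n' "$kept"
```

Only the third address should survive, because its local part contains lowercase letters.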

With an awk solution:
awk '!/^[[:upper:]0-9]+#/' file.txt

This might work for you:
sed '/^[A-Z0-9]\{1,\}#/d' file

Related

Parse a string and extract a word delimited by comma and assign value from inside [] brackets

I need help to parse a string and extract a word delimited by comma and assign value from inside [] brackets.
The input string is like this:
KEEP_DFB,?(y/n),[y];
DFB_VERSION,?(1.4.2/1.7.6),[1.4.2]:
and expected output is
KEEP_DFB=y
DFB_VERSION=1.4.2
The closest I could achieve using sed is this:
echo 'KEEP_DFB,?(y/n),[y]:' | sed 's/\([^,]*,\).*,\([^,]*\):.*/\1=\2/'
but it does not give result as expected.
I also tried 'cut' but the same result as above.
Using IFS is not allowed for changing delimiter.
Can you please help?
I suggest:
sed 's/,.*\[/=/;s/].//' file
Output:
KEEP_DFB=y
DFB_VERSION=1.4.2
You were fairly close:
$ printf "%s\n" 'DFB_VERSION,?(1.4.2/1.7.6),[1.4.2]:' 'KEEP_DFB,?(y/n),[y]:' |
> sed 's/\([^,]*\),.*,\[\([^],]*\)][;:].*/\1=\2/'
DFB_VERSION=1.4.2
KEEP_DFB=y
$
The first comma is moved outside the capture. The second capture is preceded by \[ (a literal [ in the data) and followed by a ] (doesn't need a backslash escape because ] is only special when it is part of a character class, though I'd be sorely tempted to add one and it works fine with or without the backslash).
Sundeep noted that there's a semicolon instead of a colon in one of the data lines, but the example data in the echo has a colon rather than a semicolon (which is why I didn't spot the problem on the first pass; I copied the prototype command). That's trivially handled by using [;:] as a character class instead of a direct :.
The negated character class excludes ] and commas — though it isn't clear why commas need to be excluded. It means you wouldn't recognize this as valid:
VERSION_LIST,?(1.2/1.3/1.4/1.7),[1.4,1.7]:
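If values containing commas should be accepted, one possible variant (a sketch, not part of the original answer) drops the comma from the negated character class so that the second capture only stops at the closing bracket. The variable names line and result are illustrative.

```shell
# A line whose bracketed value contains a comma.
line='VERSION_LIST,?(1.2/1.3/1.4/1.7),[1.4,1.7]:'

# [^]]* matches everything up to the closing bracket, commas included.
result=$(printf '%s\n' "$line" |
    sed 's/\([^,]*\),.*,\[\([^]]*\)][;:].*/\1=\2/')

printf '%s\n' "$result"
```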
POSIX shell method, given input file 'foo':
while IFS=',[]' read a b c d e ; do echo "$a${a:+=}$d" ; done < foo
Output:
KEEP_DFB=y
DFB_VERSION=1.4.2
You didn't say what shell you are going to use, but in bash (version 4.2 or later, which accepts a negative length in a substring expansion) the following approach works:
# Drop the last two characters
x=${original:0:-2}
# Store the name part
name=${x%%,*}
# Store the value part
value=${x##*\[}
For example, if original contains DFB_VERSION,?(1.4.2/1.7.6),[1.4.2]:, name will contain DFB_VERSION and value will contain 1.4.2.
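For shells without substring expansion, the same idea can be sketched with POSIX parameter expansion only, assuming the last two characters are always the closing bracket and the terminator. Here ${original%??} removes the final two characters; the variable names follow the answer above.

```shell
original='DFB_VERSION,?(1.4.2/1.7.6),[1.4.2]:'

# Drop the last two characters ("]" and the trailing ":" or ";").
x=${original%??}
# Name: everything before the first comma.
name=${x%%,*}
# Value: everything after the last "[".
value=${x##*\[}

printf '%s=%s\n' "$name" "$value"
```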
BTW, why don't you want to modify IFS? Of course you don't want to change it permanently, but modifying it for just one statement does not affect the rest of the program.
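A small sketch of that point: an IFS assignment placed in front of read applies to that one command only. The variables line, saved_ifs, name, value and pair are illustrative, and _ is used as a throwaway variable for the fields that are not needed.

```shell
line='KEEP_DFB,?(y/n),[y];'
saved_ifs=$IFS

# IFS is changed for this read only; fields are split on ',', '[' and ']'.
# The empty field between ',' and '[' lands in the second throwaway variable.
IFS=',[]' read -r name _ _ value _ <<EOF
$line
EOF

pair="$name=$value"
printf '%s\n' "$pair"
```

After the read, IFS should still hold its previous value, which can be checked by comparing it with saved_ifs.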
awk -F'[][,]' '{print $1"="$4}' file
KEEP_DFB=y
DFB_VERSION=1.4.2
@Suresh K: Could you please try the following and let me know if it helps?
awk -F, '{match($0,/\[.*\]/);print $1"="substr($0,RSTART+1,RLENGTH-2)}' Input_file
I hope this helps.
You should try this code. It should work fine.
awk -F"," '{print $1,$3}' OFS="=" file_name | sed -e 's/\[\(.*\)\]./\1/'
This uses awk to print the first and third comma-separated fields joined by =, and then uses sed to replace the bracketed part (from [ through ] plus the one character that follows it) with the value inside the brackets.
You could also try this shorter one:
sed -e 's/,.*\[\(.*\)\]./=\1/' file
The output for both is:
KEEP_DFB=y
DFB_VERSION=1.4.2

Use sed to replace everything after match that is between two characters

I am grepping logs for the word "line1" and need to replace the text following that word that is between the characters : and ,
possible results would be:
xxx,\"line1\":\"C/O FRED FLINSTONE, MD\",xxx
xxx,\n line1: 'C/O FRED FLINSTONE, MD',xxx
xxx,\\\"line1\\\":\\\"C\\\\/O FRED FLINSTONE\\\,MD",xxx
I want to replace "C/O FRED FLINSTONE, MD" with "Redacted-Address1" so the end result would look something like:
xxx,\"line1\":Redacted-Address1,xxx
xxx,\n line1:Redacted-Address1,xxx
xxx,\\\"line1\\\":Redacted-Address1,xxx
I don't necessarily need to use sed but thought that was a good place to start. The xxx represents the rest of the line (not actual xxx), so we can't search by that, and I want to leave it untouched.
A more complete example of the data would be:
,\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":\"C/O FRED FLINSTONE\, MD\",\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}],
And the desired result would be:
,\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":Redacted-Address1,\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}],
Using sed
sed -r '/line1/{s/([\]"line1[\]":)[\]"[^"]+",/\1Redacted-Address1,/}'
example
echo ',\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":\"C/O FRED FLINSTONE\, MD\",\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}], '|sed -r '/line1/{s/([\]"line1[\]":)[\]"[^"]+",/\1Redacted-Address1,/}'
output will be
,\"object\":{\"address\":[{\"city\":\"Bedrock\",\"line1\":Redacted-Address1,\"line2\":\"55101 Main St\",\"state\":\"TX\",\"use\":\"H\",\"zip\":69162}],
It sounds like you want to match everything between the colon following line1 and the comma that immediately precedes the next colon. The following regex should accomplish that by replacing everything but the capture groups:
sed 's/\(line1[^:]*:\)[^:]*\(,[^,:]*:\)/\1Redacted-Address1\2/'
You can use sed with the greedy regex :.*, (a colon, then .*, then a comma), which matches from the first : to the last , on the line:
sed 's/:.*,/:Redacted Name,/' file
xxx,\"line1\":Redacted Name,xxx
xxx,\n line1:Redacted Name,xxx
xxx,\\"line1\\":Redacted Name,xxx
As per comments below:
sed "s/:..*['\"],/:Redacted Name,/" file
xxx,\"line1\":Redacted Name,xxx
xxx,\n line1:Redacted Name,xxx
xxx,\\"line1\\":Redacted Name,xxx
This might work for you (GNU sed):
sed 's/\(line1[^:]*:\).*,/\1Redacted Name,/' file
This matches line1, then any characters that are not a :, then a :, and then is greedy: .* consumes all characters to the end of the line and backtracks until a , is found. The match is replaced by the back reference (the text up to and including the first :), followed by the required string and a ,.
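A short sketch of that command on the first sample line from the question, plus a made-up second line (line2 below is hypothetical, not from the logs) that shows the side effect of the greedy .*: everything up to the last comma on the line is replaced.

```shell
# First sample line from the question (backslashes are literal).
line1='xxx,\"line1\":\"C/O FRED FLINSTONE, MD\",xxx'
# Hypothetical line with another field after the address.
line2='a,line1:X,keep:1,z'

redact() {
    sed 's/\(line1[^:]*:\).*,/\1Redacted Name,/'
}

out1=$(printf '%s\n' "$line1" | redact)
out2=$(printf '%s\n' "$line2" | redact)

printf '%s\n' "$out1" "$out2"
```

In the second case the keep:1 field is swallowed along with the address, so this form only suits lines where the address is the last comma-terminated field.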

terminal extract words ending with ".abc" from file

I want to do the following through the terminal. I have a file with many lines, each line containing a whole sentence. Some lines are empty. I want to read the file and extract all words that end with .abc. I want to do this through the terminal. How might I do that?
grep can be very useful
$ cat input
.abc
.abdadf
assadf.abc
adsfas.abcadf
asdf.abc
$ grep -o '\b[^\.]*\.abc\b' input
assadf.abc
asdf.abc
What it does
-o prints only the part of each line that matches the given regex
\b[^\.]*\.abc\b regex matches any word which ends with .abc
\b word boundary
[^\.] anything other than a .
* matches zero or more
\.abc\b matches .abc followed by word boundary \b
Note
If the word can contain more than one . then modify the regex as
\b.*\.abc\b
where .* would match anything including .
To find all the words that end with .abc:
grep -oP '\S*\.abc(?=\s|$)' file
\S* Zero or more non-space characters.
(?=\s|$) Positive lookahead asserts that the character following the match must be a space or end of the line anchor.
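Where grep -P is not available, a different technique can give a similar result: split the text into one word per line with tr and keep the words that end in .abc. This is only a sketch under the assumption that words are separated by whitespace; the sample sentence and the variable names are made up for illustration.

```shell
# Hypothetical input: sentences, with an empty line in between.
text='see foo.abc and bar.abcd here

then baz.abc'

# tr -s turns every run of whitespace into a single newline (one word per line);
# grep then keeps only the words that end with the literal suffix ".abc".
words=$(printf '%s\n' "$text" | tr -s '[:space:]' '\n' | grep '\.abc$')

printf '%s\n' "$words"
```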
Try awk, among various other possibilities. This prints the lines that end with .abc:
awk '/\.abc$/' file
You can also use the sed command:
sed -n '/\.abc$/ p' file

Print all characters up to a matching pattern from a file

Maybe a silly question, but I have a text file and need to display everything up to the first pattern match, which is a '/'. (All lines contain no blank spaces.)
Example.txt:
somename/for/example/
something/as/another/example
thisfile/dir/dir/example
Preferred output:
somename
something
thisfile
I know this grep code will display everything after a matching pattern:
grep -o '/[^\n]*' '/my/file.txt'
So is there any way to do the complete opposite, maybe rm everything after matching pattern or invert to display my preferred output?
Thanks.
If you're calling an external command like grep, you can get the results you require with the sed command, i.e.
echo "something/as/another/example" | sed 's:/.*::'
something
Instead of focusing on what you want to keep, think about what you want to remove, in this case everything after the first '/' char. This is what this sed command does.
The leading s means substitute, and :/.*: is the pattern to match, with /.* meaning match the first / char and all characters after it. The second half of the sed command is the replacement; because it is empty (::), the match is replaced with nothing.
The traditional idiom for sed is s/str/rep/, using / chars to delimit the search from the replacement, but you can use any character you want after the initial s (substitute) command.
Any character other than a backslash or a newline works as the delimiter of the s command, so s:/.*:: needs no special marking. A leading backslash is only needed when a custom delimiter is used in an address, e.g. \:/usr:d deletes the lines that contain /usr.
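As a quick sketch, here is the substitution applied to the Example.txt lines from the question with two different delimiters; both should give the same result. The variable names are illustrative.

```shell
input='somename/for/example/
something/as/another/example
thisfile/dir/dir/example'

# ':' as the delimiter: delete from the first '/' to the end of the line.
with_colon=$(printf '%s\n' "$input" | sed 's:/.*::')
# '|' as the delimiter: the same substitution, written differently.
with_pipe=$(printf '%s\n' "$input" | sed 's|/.*||')

printf '%s\n' "$with_colon"
```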
IHTH.
You can use a much simpler regexp:
/[^/]*/
The forward slash after the caret is the character you are matching up to.
Assuming filename as "file.txt"
cat file.txt | cut -d "/" -f 1
Here, we are cutting the input line with "/" as the delimiter (-d "/"). Then we select the first field (-f 1).
You just need to include starting anchor ^ and also the / in a negated character class.
grep -o '^[^/]*' file

Finding 4 numbers in a string bash script

I need to check for the user ID in a weird looking string. I only want the lines that have it. How do I check for 4 integers in a row in the following sample strings?
"111/S/H0110//Jake, Greenfield ServiceRequest/bin/ksh"
"740/S/H5155//Jake, Greenfield/bin/ksh"
"90/S/Customer /usr/bin/ksh"
"740/S///Jake, Greenfield/bin/ksh"
In these examples I would want these lines to pass:
111/S/H0110//Jake, Greenfield ServiceRequest/bin/ksh
740/S/H5155//Jake, Greenfield/bin/ksh
and NOT these to pass:
90/S/Customer /usr/bin/ksh
740/S///Jake, Greenfield/bin/ksh
BONUS QUESTION
The ID can be anything from,
[A-Z][A-Z][0-9][0-9][0-9][0-9]
[0-9][0-9][0-9][0-9][0-9][0-9]
[A-Z]-[0-9][0-9][0-9][0-9]
meaning, for example:
7A7777
AA7777
A77777
A-7777
(though I would settle for "just" finding "7777" in the string)
The solutions below assume each line is an entry, and each entry is made up of fields delimited by a forward slash (/) character.
awk -F/ '$3~/[[:digit:]]{4}$/' filename
Awk is pretty efficient at it.
As indicated in the comments, this should do it:
grep -E '[A-Z]{2}[0-9]{4}|[0-9]{6}|[A-Z]-[0-9]{4}'
         ^^^^^^^^^^^^^^^^ ^^^^^^^^ ^^^^^^^^^^^^^^
               (1)           (2)        (3)
This matches the requirements:
[A-Z][A-Z][0-9][0-9][0-9][0-9] --> [A-Z]{2}[0-9]{4} (1)
[0-9][0-9][0-9][0-9][0-9][0-9] --> [0-9]{6} (2)
[A-Z]-[0-9][0-9][0-9][0-9] --> [A-Z]-[0-9]{4} (3)
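A minimal sketch that runs the three alternatives against a handful of IDs, one per line. The list of IDs is made up for illustration (AA7777 and A-7777 come from the question; 123456, H0110 and abc are added to exercise each branch), and the variable names are hypothetical.

```shell
ids='AA7777
123456
A-7777
H0110
abc'

# (1) two letters + four digits, (2) six digits, (3) letter, dash, four digits.
hits=$(printf '%s\n' "$ids" | grep -E '[A-Z]{2}[0-9]{4}|[0-9]{6}|[A-Z]-[0-9]{4}')

printf '%s\n' "$hits"
```

Note that an ID such as H0110 from the sample lines (one letter followed by four digits) does not fit any of the three listed forms, so it is not matched here.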
grep is the tool you are looking for:
grep '[0-9]\{4\}'
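For example, applied to the four sample lines from the question (a sketch; the variable names are illustrative), only the two lines that contain four consecutive digits should be printed:

```shell
input='111/S/H0110//Jake, Greenfield ServiceRequest/bin/ksh
740/S/H5155//Jake, Greenfield/bin/ksh
90/S/Customer /usr/bin/ksh
740/S///Jake, Greenfield/bin/ksh'

# \{4\} is the basic-regex form of "exactly four of the preceding item".
matched=$(printf '%s\n' "$input" | grep '[0-9]\{4\}')

printf '%s\n' "$matched"
```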
This awk command checks whether the ID field contains a letter-number combination. If it does, the corresponding line is printed.
$ awk -F/ '$3~/[A-Z-]*[0-9][A-Z0-9]*/ {print}' file
"111/S/H0110//Jake, Greenfield ServiceRequest/bin/ksh"
"740/S/H5155//Jake, Greenfield/bin/ksh"
If you want only the numbers in the ID field then try this command,
$ awk -F/ '$3~/[A-Z-]*[0-9][A-Z0-9]*/ { gsub (/[A-Z-]/,"",$3); print $3}' file
0110
5155
