How do I reformat MAC addresses using sed? - mac-address

I have a MAC address which is in the standard :-separated format, such as 08:f1:ea:6d:03:3c, that I would like to print out as three sets of four hexadecimal digits separated by a ., such as 08f1.ea6d.033c
I tried echo 08:f1:ea:6d:03:3c | sed 's/://g', but that produces: 08f1ea6d033c.
Any help would be appreciated.

Take what you've got, and add 's/.\{4\}/&./g;s/.$//' to the sed expression. All together, that's:
echo 08:f1:ea:6d:03:3c | sed 's/://g;s/.\{4\}/&./g;s/.$//'
As you've noted, your sed expression removes all :s; the ;s/.\{4\}/&./g adds a . after every four digits, and the ;s/.$// removes the trailing . added by the previous expression.
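As a minimal sketch, the three substitutions can be wrapped in a small function. The name mac_to_dotted is hypothetical and only used for illustration. This version also escapes the final dot (s/\.$//), so that only a literal trailing dot is removed; that is a small deviation from the expression above.

```shell
# mac_to_dotted: hypothetical helper, not part of the original answer.
# Converts aa:bb:cc:dd:ee:ff into aabb.ccdd.eeff.
mac_to_dotted() {
    # 1) drop the colons, 2) add a dot after each group of four characters,
    # 3) remove the dot that step 2 leaves at the end of the line
    printf '%s\n' "$1" |
        sed 's/://g;s/.\{4\}/&./g;s/\.$//'
}

mac_to_dotted 08:f1:ea:6d:03:3c
```

For the address in the question this prints 08f1.ea6d.033c.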

Related

unix command to extract digits after last alphabetical string

String:"gamma021AH00999NAK41"
The last two digits may vary: there may be 3 digits, 4 digits, etc.
"NAK" in the given string can be any other string, but it contains only letters.
So my intention is to extract the trailing number (for example, 41 in the given string), reading back from the end until the first letter.
Thanks in advance
Using only shell builtins (no external commands like sed or awk, and thus much faster if you're going to be repeating this over and over, for example once per line):
s=gamma021AH00999NAK41
result=${s##*[[:alpha:]]}
echo "$result"
${var##pattern} is a parameter expansion which removes the longest possible match for pattern from the front of the value of var before returning it. *[[:alpha:]], as a wildcard followed by an alpha character, will thus remove everything up to and including the K in your string.
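A short sketch of the same expansion wrapped in a function, to show that it keeps working when the trailing number changes length. The name trailing_digits and the second sample string are made up for illustration.

```shell
# trailing_digits: hypothetical helper name.
# Strips the longest prefix that ends in a letter, which leaves
# whatever follows the last letter in the string.
trailing_digits() {
    printf '%s\n' "${1##*[[:alpha:]]}"
}

trailing_digits gamma021AH00999NAK41      # 41
trailing_digits gamma021AH00999XYZ1234    # 1234
```

If the string contains no letters at all, the pattern does not match and the whole string is returned unchanged.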
You can replace all the alphabetic characters by for example "#" and then take the last field based on the "#" separator:
echo "gamma021AH00999NAK41" | sed "s/[a-zA-Z]/#/g" | awk -F'#' '{print $NF}'
NOTE: This won't work if you have other than alphanumeric symbols in your string.
EDIT: With awk only (thanks @CharlesDuffy):
echo "gamma021AH00999NAK41" | awk -F'[[:alpha:]]' '{print $NF}'
I see no mention of varying length, so this command will work:
echo "gamma021AH00999NAK41" | cut -b '19-'
Answer : 41

Extract text between two special characters

Trying to extract the text between the special characters "\ and \" through sed
Ex: "\hell##$\"},
expected output : hell##$
You can do it quite easily with using a capture-group and backreference with basic regular-expressions:
sed 's/^["][\]\([^\]*\).*$/\1/'
Explanation
Normal substitution sed 's/find/replace/', where
find is ^["][\] a double-quote and \ before beginning the capture \(...\) which contains [^\]* (zero or more characters not a \), the closing of the capture \) and then .*$ the remainder of the string;
replace is \1 (the first backreference) containing the text captured between \(...\).
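(note: if your "\ doesn't begin the string, replace the leading '^' anchor with '.*' so that the text before it is discarded as well; simply removing the anchor would leave that text in the output)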
Example
$ echo '"\hell##$\"},' | sed 's/^["][\]\([^\]*\).*$/\1/'
hell##$
Look things over and let me know if you have questions.
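A sketch of the substitution wrapped in two small functions that read from standard input. Both function names are hypothetical. The second one is a variant for lines where "\ is not at the very start; it assumes the text in front should be thrown away, so the leading ^ is swapped for .* to consume it.

```shell
# extract_between: hypothetical helper; expects "\ at the start of the line.
extract_between() {
    sed 's/^["][\]\([^\]*\).*$/\1/'
}

# extract_between_anywhere: hypothetical variant; .* swallows any prefix.
extract_between_anywhere() {
    sed 's/.*["][\]\([^\]*\).*$/\1/'
}

printf '%s\n' '"\hell##$\"},' | extract_between
printf '%s\n' 'key: "\hell##$\"},' | extract_between_anywhere
```

Both calls print hell##$.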
This might work for you (GNU sed):
sed -nE '/"\\[^\\]*\\+([^\\"][^\\]*\\+)*"/{s/"\\/\n/;s/.*\n//;s/\\"/\n/;P;D}' file
The solution comes in two parts:
Firstly, a regexp to determine whether a pair of two characters exists. This can be tricky as a negated class is insufficient because edge cases can easily defeat a simplistic approach.
Secondly, once a pair of characters does exist, the text between them must be extracted piecemeal.

How to do command "grep -oP" in a line that contains special Characters?

How can I grep a line that contains special characters?
For example, I have a file containing this text:
ISA^G00^G ^G00^G ^G12^G14147844480 ^GZZ^G001165208 ^G160601^G1903^GU^G00401^G600038486^G0^GP^G>~GS^GTX^G14147844480^G001165208^G20160601^G1903^G600038486^GX^G004010VICS~ST^G864^G384860001~BMG^G00^G^G04~MIT^G000000591^GKohl's AS2 Certificate Change June 21, 2016~N1^GFR^GKOHL'S DEPARTMENT STORES~PER^GIC^GEDIMIO#kohls.com^GTE^G262-703-7334~MSG^GAttention Kohl's AS2 trading partners, Kohl's will be changing.
I would like to grep the line under MSG segment
using this command:
grep -oP 'MSG.\K[\w\s\d]*' < filename
Expected Result :
Attention Kohl's AS2 trading partners, Kohl's will be changing.
Actual Result:
Attention Kohl
How will I do it?
Your pattern:
grep -Po 'MSG.\K[\w\s\d]*'
is matching just Attention Kohl because you have a single quote after that which will not be matched by any of the \w, \s, \d tokens.
You also have , and . within your desired portion, so you need to match those too. Also, \d is actually a subset of \w so no need for explicit \d.
So you can do:
grep -Po 'MSG.\K[\w\s,.'\'']*'
Or if you just want to match till the end:
grep -Po 'MSG.\K.*'
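To see the difference between the character classes, here is a small sketch on a shortened sample line. It assumes that ^G in the question stands for the BEL control character (octal 007) and that your grep was built with -P (PCRE) support, as GNU grep usually is.

```shell
# Shortened sample line; \007 is the assumed ^G (BEL) separator.
sample=$(printf "N1\007FR~MSG\007Attention Kohl's AS2 trading partners, Kohl's will be changing.")

# Class from the question: stops at the first apostrophe.
printf '%s\n' "$sample" | grep -oP 'MSG.\K[\w\s]*'

# Widened class: also accepts comma, period and apostrophe.
printf '%s\n' "$sample" | grep -oP 'MSG.\K[\w\s,.'\'']*'
```

The first command prints Attention Kohl; the second prints the whole sentence.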
Do you just want grep everything after MSG? That would be simpler with other methods.
Also I see there are multiple ^G characters in your file, adjacent to MSG word as well. Not sure if you want to exclude those while grep'ing.
Back to your given regex - you can add \W which will match a non-word character and give you desired result.
grep -oP 'MSG.\K[\w\s\d\W]*' filename
Also no need to use < operator to grep here.

Dynamic delimiter in Unix

Input:-
echo "1234ABC89,234" # A
echo "0520001DEF78,66" # B
echo "46545455KRJ21,00"
From the above strings, I need to split the characters to get the alphabetic field and the number after that.
From "1234ABC89,234", the output should be:
ABC
89,234
From "0520001DEF78,66", the output should be:
DEF
78,66
I have many strings that I need to split like this.
Here is my script so far:
echo "1234ABC89,234" | cut -d',' -f1
but it gives me 1234ABC89 which isn't what I want.
Assuming that you want to discard leading digits only, and that the letters will be all upper case, the following should work:
echo "1234ABC89,234" | sed 's/^[0-9]*\([A-Z]*\)\([0-9].*\)/\1\n\2/'
This works fine with GNU sed (I have 4.2.2), but other sed implementations might not like the \n, in which case you'll need to substitute something else.
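One substitute that should be portable is a backslash followed by an actual line break inside the replacement, which POSIX sed accepts. A sketch, with split_fields as a hypothetical function name:

```shell
# split_fields: hypothetical helper name.
# The replacement contains a backslash followed by a real newline,
# instead of the \n escape.
split_fields() {
    printf '%s\n' "$1" |
        sed 's/^[0-9]*\([A-Z]*\)\([0-9].*\)/\1\
\2/'
}

split_fields 1234ABC89,234
split_fields 0520001DEF78,66
```

This prints ABC and 89,234 on separate lines, then DEF and 78,66.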
Depending on the version of sed you can try:
echo "0520001DEF78,66" | sed -E -e 's/[0-9]*([A-Z]*)([,0-9]*)/\1\n\2/'
or:
echo "0520001DEF78,66" | sed -E -e 's/[0-9]*([A-Z]*)([,0-9]*)/\1$\2/' | tr '$' '\n'
DEF
78,66
Explanation: the regular expression replaces the input with the expected output, except that it puts a "$" sign where the newline should go; the tr command then turns that "$" into a newline.
Where do the strings come from? Are they read from a file (or other source external to the script), or are they stored in the script? If they're in the script, you should simply reformat the data so it is easier to manage. Therefore, it is sensible to assume they come from an external data source such as a file or being piped to the script.
You could simply feed the data through sed:
sed 's/^[0-9]*\([A-Z]*\)/\1 /' |
while read alpha number
do
…process the two fields…
done
The only trick to watch there is that if you set variables in the loop, they won't necessarily be visible to the script after the done. There are ways around that problem — some of which depend on which shell you use. This much is the same in any derivative of the Bourne shell.
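One way around the visibility problem, sketched here for a POSIX-style shell, is to feed the loop from a here-document instead of a pipe, so that the while loop runs in the current shell. The variable names fields, count and last_alpha are illustrative only, and the three input strings are the ones from the question.

```shell
# Run the sed step first and keep its output in a variable.
fields=$(printf '%s\n' 1234ABC89,234 0520001DEF78,66 46545455KRJ21,00 |
    sed 's/^[0-9]*\([A-Z]*\)/\1 /')

count=0
last_alpha=
# The loop reads from a here-document, not a pipe, so it is not run in
# a subshell and its variables survive after "done".
while read -r alpha number; do
    count=$((count + 1))
    last_alpha=$alpha
done <<EOF
$fields
EOF

echo "$count lines read, last alpha field: $last_alpha"
```

With a pipe into the loop, count would typically still be 0 after the done; here it is 3.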
You said you have many strings like this, so I recommend if possible save them to a file such as input.txt:
1234ABC89,234
0520001DEF78,66
46545455KRJ21,00
On your command line, try this sed command reading input.txt as file argument:
$ sed -E 's/([0-9]+)([[:alpha:]]{3})(.+)/\2\t\3/g' input.txt
ABC 89,234
DEF 78,66
KRJ 21,00
How it works
uses -E for extended regular expressions to save on typing, otherwise for example for grouping we would have to escape \(
uses grouping ( and ), searches three groups:
firstly digits, where + specifies one or more of them; the command above uses [0-9], and the POSIX class [[:digit:]] can be used in its place
the next is to search for POSIX alphabetical characters, regardless if lowercase or uppercase, and {3} specifies to search for 3 of them
the last group searches for . meaning any character, + for one or more times
\2\t\3 then returns group 2 and group 3, with a tab separator
Thus you are able to extract two separate fields per line, just separated by tab, for easier manipulation later.
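Not every sed understands \t in the replacement, so here is a sketch that puts a literal tab there instead and then picks the fields apart with cut, which splits on tabs by default. The names tab and split_line are hypothetical.

```shell
# A literal tab character, produced portably with printf.
tab=$(printf '\t')

# split_line: hypothetical helper name.
# Same three groups as above; group 2 and group 3 are joined by a tab.
split_line() {
    printf '%s\n' "$1" |
        sed -E "s/([0-9]+)([[:alpha:]]{3})(.+)/\2${tab}\3/"
}

split_line 1234ABC89,234 | cut -f1    # alphabetic field
split_line 1234ABC89,234 | cut -f2    # numeric field
```

The two cut calls print ABC and 89,234 respectively.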

grep for a specific pattern in a file?

I have a file textFile.txt
abc_efg#qwe.asd
abc_aer#
#avret
afd_wer_asd#qweasd.zxcasd
wqe_a#qwea.cae
qwe.caer
I want to grep to get specific lines :
abc_efg#qwe.asd
afd_wer_asd#qweasd.zxcasd
wqe_a#qwea.cae
That is the ones that have
[a-z]_[a-z]#[a-z].[a-z]
but the part before the # can have any number of "_"
So far this is what I have :
grep "[a-z]_[a-z]#[a-z].[a-z]" textFile.txt
But I got only one line as the output.
wqe_a#qwea.cae
Could I know a better way to do this ? :)
You can simply add the _ inside the bracket expression, [a-z_], so the new command is:
grep "[a-z_]#[a-z].[a-z]" textFile.txt
or, if you want the character before that one to be something other than _, you can use
grep "[a-z][a-z_]#[a-z].[a-z]" textFile.txt
I would suggest keeping it simple by checking only one # is present in each line:
grep -E '^[^#]+#[^#]+$' file
abc_efg#qwe.asd
afd_wer_asd#qweasd.zxcasd
wqe_a#qwea.cae
The following selects lines that have at least one underline character followed by letters before the at-sign and one or more letters followed by at least one literal period after the at-sign:
$ grep '_[a-z]\+#[a-z]\+\.' textFile.txt
abc_efg#qwe.asd
afd_wer_asd#qweasd.zxcasd
wqe_a#qwea.cae
Notes
An unescaped period matches any character. If you want to match a literal period, it must be escaped, like \. .
Thus, #[a-z].[a-z] matches an at-sign, followed by a letter, followed by anything at all, followed by a letter.
[a-z] matches a single letter. Thus _[a-z]# would match only if there was only one character between the underline and the at-sign. To match one or more letters, use [a-z]\+.
#[a-z]\+\. will match an at-sign, followed by one or more letters, followed by a literal period character.
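A quick way to convince yourself of the difference between the escaped and the unescaped period, using two made-up test lines (only the second one has a literal period after the letter):

```shell
# Unescaped period: matches any character, so both lines are counted.
printf '%s\n' 'x#aXb' 'x#a.b' | grep -c '#[a-z].[a-z]'

# Escaped period: only the line with a literal "." is counted.
printf '%s\n' 'x#aXb' 'x#a.b' | grep -c '#[a-z]\.[a-z]'
```

The first command prints 2, the second prints 1.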
When you do [a-z] it only matches one character of that set. That's why you are only getting wqe_a#qwea.cae back from your grep call because there is only one character between the _ and the #.
To match more than one character, you can use + or *. + means one or more of the set and * any number of that set. As well, an unescaped . means any character.
So something like:
grep "[a-z]\+_[a-z]\+#[a-z]\+\.[a-z]\+" textFile.txt
would work for this. There are shorter, less specific ways of doing this as well (that other answers have shown).
Note the escapes before the + signs and the . .
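Since \+ is not part of POSIX basic regular expressions, here is a sketch of the same pattern written as an extended regular expression, where + needs no backslash. It is run against the six sample lines from the question, fed on standard input rather than from textFile.txt; sample_lines is a hypothetical helper.

```shell
# sample_lines: hypothetical helper that prints the lines from the question.
sample_lines() {
    printf '%s\n' \
        'abc_efg#qwe.asd' \
        'abc_aer#' \
        '#avret' \
        'afd_wer_asd#qweasd.zxcasd' \
        'wqe_a#qwea.cae' \
        'qwe.caer'
}

# letters, underscore, letters, #, letters, literal period, letters
sample_lines | grep -E '[a-z]+_[a-z]+#[a-z]+\.[a-z]+'
```

This prints the three wanted lines: abc_efg#qwe.asd, afd_wer_asd#qweasd.zxcasd and wqe_a#qwea.cae.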
This regex should match typical email-style addresses in a text file:
grep -E -o "\b[A-Za-z0-9._%+-]+#[A-Za-z0-9.-]+\.[A-Za-z]{2,6}\b" file
abc_efg#qwe.asd
afd_wer_asd#qweasd.zxcasd
wqe_a#qwea.cae
This greps for a pattern like text#text.some_more_text
