"sed" doesn't match pattern - bash

I'm trying to format the output of cut and paste, but sed is not working.
file.txt
Apple
Banana
Apple
Banana
Orange
Apple
Orange
code.sh
cut -f2 file.txt | sort | uniq | sed 's/^\|$/#/g'| paste -sd,\& -
expected output / output on ubuntu
#Apple#,#Banana#&#Orange#
getting output / output on macos
Apple,Banana&Orange
Note: The code works on Ubuntu, but on MacOS it doesn't.

This can be done in a single gnu-awk:
awk '!seen[$1]++{} END {
PROCINFO["sorted_in"]="@ind_str_asc"
for (i in seen)
s = s (s == "" ? "" : (++j==1?",":"&")) "#" i "#"
print s
}' file
#Apple#,#Banana#&#Orange#
On OSX I have GNU awk installed via Homebrew.
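For an awk without gawk's PROCINFO["sorted_in"] (such as the stock awk on macOS), a minimal sketch is to let sort -u handle ordering and deduplication and keep only the joining in awk. The function name join_marked is made up for illustration and is not part of the answer above.

```shell
# join_marked is a hypothetical helper name, not from the original answer.
# It reads lines on stdin, sorts and deduplicates them, wraps each in '#',
# and joins them with ',' before the second item and '&' before later ones.
join_marked() {
  sort -u | awk '
    { s = s (NR == 1 ? "" : (NR == 2 ? "," : "&")) "#" $0 "#" }
    END { print s }
  '
}

result=$(printf '%s\n' Apple Banana Apple Banana Orange Apple Orange | join_marked)
echo "$result"
```

The separator rule mirrors the gawk version: a comma after the first item and an ampersand after every later one.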

As mentioned elsewhere, BSD sed doesn't support \|. Instead of replacing ^ and $, you can substitute # around the whole line.
sort -u file.txt | sed 's/.*/#&#/' | paste -sd,'&' -
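As a rough check, the whole-line substitution can be compared with the two-expression form that anchors each end separately. This is only a sketch: the sample data is fed in with printf instead of file.txt, and the variable names are illustrative.

```shell
# Two sed forms that avoid \| and should behave the same on GNU and BSD sed.
input=$(printf '%s\n' Apple Banana Apple Banana Orange Apple Orange)

# Form 1: match the whole line and re-emit it via & between two # characters.
whole=$(printf '%s\n' "$input" | sort -u | sed 's/.*/#&#/' | paste -sd ',&' -)

# Form 2: anchor each end of the line in its own expression.
split=$(printf '%s\n' "$input" | sort -u | sed -e 's/^/#/' -e 's/$/#/' | paste -sd ',&' -)

echo "$whole"
echo "$split"
```

Both variables should hold the same string; paste cycles through the delimiter list, so the first join uses a comma and the second an ampersand.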

As far as I know, BSD/Mac sed doesn't support \|. See "sed not giving me correct substitute operation for newline with Mac - differences between GNU sed and BSD / OSX sed" for details.
As an alternative, you can use ERE instead of BRE. I checked it on Linux; apparently it still doesn't work on Mac (see also: "MacOS sed: match either beginning or end").
$ echo 'Apple' | sed -E 's/^|$/#/g'
#Apple#
# workaround for Mac
$ echo 'Apple' | sed -e 's/^/#/' -e 's/$/#/'
#Apple#
Instead of sort+uniq+sed, you can also use awk (but note that the awk solution shown here removes duplicates while preserving the original order; it doesn't sort the input):
$ awk '!seen[$0]++{print "#" $0 "#"}' ip.txt
#Apple#
#Banana#
#Orange#
Change $0 to $2 if you want only the second field, based on your use of cut

A simple way to do it using the sed command:
sed -E 's/[[:alnum:]]+/#&#/'
The -E option enables POSIX ERE (extended regular expressions).
[[:alnum:]]+ matches alphanumeric characters (in ASCII, equivalent to [A-Za-z0-9]); the plus (+) means one or more.
The & symbol in the replacement refers to the text that the pattern matched, which is then surrounded with #.
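A short sketch of how this substitution behaves on a few made-up inputs. Because there is no g flag, only the first alphanumeric run on each line is wrapped, which differs from wrapping the whole line when a line contains spaces or punctuation.

```shell
# Sample inputs are illustrative; only the first alphanumeric run is wrapped.
result=$(printf '%s\n' 'Apple' 'Apple pie' 'red-apple' | sed -E 's/[[:alnum:]]+/#&#/')
echo "$result"
```

For single-word lines like those in the question the result is the same as wrapping the whole line.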

Related

sed: remove all characters except for last n characters

I am trying to remove every character in a text string except for the last 11 characters. The string is Sample Text_that-would$normally~be,here--pe_-l4_mBY and what I want to end up with is just -pe_-l4_mBY.
Here's what I've tried:
$ cat food
Sample Text_that-would$normally~be,here--pe_-l4_mBY
$ cat food | sed 's/^.*(.{3})$/\1/'
sed: 1: "s/^.*(.{3})$/\1/": \1 not defined in the RE
Please note that the text string isn't really stored in a file, I just used cat food as an example.
OS is macOS High Sierra 10.13.6 and bash version is 3.2.57(1)-release
You can use this sed with a capture group:
sed -E 's/.*(.{11})$/\1/' file
-pe_-l4_mBY
Basic regular expressions (used by default by sed) require both the parentheses in the capture group and the braces in the brace expression to be escaped. ( and { are otherwise treated as literal characters to be matched.
$ cat food | sed 's/^.*\(.\{3\}\)$/\1/'
mBY
By contrast, explicitly requesting sed to use extended regular expressions with the -E option reverses the meaning, with \( and \{ being the literal characters.
$ cat food | sed -E 's/^.*(.{3})$/\1/'
mBY
Try this also:
grep -o -E '.{11}$' food
grep, like sed, accepts an arbitrary number of file name arguments, so there is no need for a separate cat. (See also useless use of cat.)
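You can use tail or parameter expansion:
string='Sample Text_that-would$normally~be,here--pe_-l4_mBY'
echo "$string" | tail -c 12    # 11 characters plus the newline that echo appends
echo "${string#"${string%???????????}"}"    # eleven ? characters, one per character to keep
-pe_-l4_mBY
-pe_-l4_mBY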
also with rev/cut/rev
$ echo abcdefghijklmnopqrstuvwxyz | rev | cut -c1-11 | rev
pqrstuvwxyz
man rev => rev - reverse lines characterwise
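Another option, sketched here with awk's substr and length functions, takes the number of characters as a parameter. The helper name last_n is hypothetical, and the sketch assumes single-byte (ASCII) text, since some awk implementations count bytes and not characters.

```shell
# last_n is a hypothetical helper: $1 is the string, $2 the number of
# trailing characters to keep.
last_n() {
  printf '%s\n' "$1" | awk -v n="$2" '{ print substr($0, length($0) - n + 1) }'
}

result=$(last_n 'Sample Text_that-would$normally~be,here--pe_-l4_mBY' 11)
echo "$result"
```

Passing the string as an argument avoids the temporary file used in the question.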

Sed regex, extracting part of a string in Mac terminal

I have sample data like "(stuff/thing)" and I'm trying to extract "thing".
I'm doing this in the terminal on OSX and I can't quite seem to get this right.
Here's the last broken attempt
echo '(stuff/thing)' | sed -n 's/\((.*)\)/\1/p'
I would say:
$ echo '(stuff/thing)' | sed -n 's#.*/\([^)]*\))#\1#p'
thing
I start with:
$ echo '(stuff/thing)' | sed -n 's#.*/##p'
thing)
Note that I use # as the sed delimiter for better readability.
Then, I want to get rid of everything from the ) onwards. For this, we capture the block with \([^)]*\), match the closing ) after it, and print the captured block back with \1.
So all together this is doing:
.*/\([^)]*\))#\1
# .*/          everything up to a /
# \( ... \)    capture group
# [^)]*        all but )
# )            the literal closing parenthesis after the group
# \1           print the captured group
To provide an awk alternative to fedorqui's helpful answer:
awk makes it easy to parse lines into fields based on separators:
$ echo '(stuff/thing)' | awk -F'[()/]' '{print $3}'
thing
-F[()/] specifies that any of the characters ( ) / should serve as a field separator when breaking each input line into fields.
$3 refers to the 3rd field (thing is the 3rd field, because the line starts with a field separator, which implies that field 1 ($1) is the empty string before it).
As for why your sed command didn't work:
Since you're not using -E, you must use basic regexes (BREs), where, counter-intuitively, parentheses must be escaped to be special - you have it the other way around.
The main problem, however, is that in order to output only part of the line, you must match ALL of it, and replace it with the part of interest.
With a BRE, that would be:
echo '(stuff/thing)' | sed -n 's/^.*\/\(.*\))$/\1/p'
With an ERE (extended regex), it would be:
echo '(stuff/thing)' | sed -En 's/^.*\/(.*)\)$/\1/p'
Also note that both commands work as-is with GNU sed, so the problem is not Mac-specific (but note that the -E option to activate EREs is an alias there for the better-known -r).
That said, regex dialects do differ across implementations; GNU sed generally supports extensions to the POSIX-mandated BREs and EREs.
I would do it in 2 easy parts - remove everything up to and including the slash and then everything from the closing parenthesis onwards:
echo '(stuff/thing)' | sed -e 's/.*\///' -e 's/).*//'
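The same two-step idea can be sketched without sed, using POSIX shell parameter expansion. The variable names s and t are illustrative only.

```shell
s='(stuff/thing)'
t=${s##*/}     # drop everything up to and including the last /   (leaves: thing) with a trailing paren)
t=${t%%')'*}   # drop the first ) and everything after it
echo "$t"
```

This avoids spawning a process, which can matter inside a loop.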

Reverse text file in Bash

How can I reverse a text file in Bash?
For example, if some.txt contains this:
book
pencil
ruler
then how can I get this? :
relur
licnep
koob
Try the combined form of tac and rev commands,
$ tac file | rev
relur
licnep
koob
From man tac
tac - concatenate and print files in reverse
From man rev
The rev utility copies the specified files to standard output, reversing
the order of characters in every line. If no files are specified, standard
input is read.
Mac OS X uses FreeBSD sed that allows escaped newlines in its replacement string.
The following version of the solution given by F. Hauri works for GNU sed 4.2.1, FreeBSD sed and minised 1.15.
escnl='\
'
sed -ne '
/../!b;
s/^.*$/'"${escnl}"'&'"${escnl}"'/;
tx;
:x;
s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/;
tx;
s/\n//g;
:;
1!G;
$p;
h
' <<<$'book\npencil\nruler'
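Where tac is not available and the sed script feels too dense, a minimal awk sketch can do both reversals. The function name reverse_all is made up for illustration, and the sketch holds the whole input in memory, so it is meant for small files.

```shell
# reverse_all is a hypothetical helper: it reverses the characters of each
# line, then prints the lines in reverse order.
reverse_all() {
  awk '
    {
      rev = ""
      for (i = length($0); i > 0; i--) rev = rev substr($0, i, 1)
      lines[NR] = rev
    }
    END { for (i = NR; i > 0; i--) print lines[i] }
  '
}

result=$(printf '%s\n' book pencil ruler | reverse_all)
echo "$result"
```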

Bash - remove all lines beginning with 'P'

I have a text file that's about 300KB in size. I want to remove all lines from this file that begin with the letter "P". This is what I've been using:
> cat file.txt | egrep -v P*
That isn't outputting anything to the console. I can use cat on the file without any other commands and it prints out fine. My final intention is to run:
> cat file.txt | egrep -v P* > new.txt
No error appears, it just doesn't print anything out and if I run the 2nd command, new.txt is empty.
I should say I'm running Windows 7 with Cygwin installed.
Explanation
use ^ to anchor your pattern to the beginning of the line ;
delete lines matching the pattern using sed and the d command.
Solution #1
cat file.txt | sed '/^P/d'
Better solution
Use sed-only:
sed '/^P/d' file.txt > new.txt
With awk:
awk '!/^P/' file.txt
Explanation
The condition starts with an ! (negation), that negates the following pattern ;
/^P/ means "match all lines starting with a capital P",
So, the pattern is negated to "ignore lines starting with a capital P".
Finally, it leverages awk's behavior when the { … } action block is missing, which is to print every record that satisfies the condition.
So, to rephrase, it ignores lines starting with a capital P and prints everything else.
Note
sed is line oriented and awk is column oriented. For your case you should use the first one; see Edouard Lopez's response.
Use sed with in-place editing (for GNU sed; this will also work with your Cygwin)
sed -i '/^P/d' file.txt
BSD (Mac) sed
sed -i '' '/^P/d' file.txt
Use start of line mark and quotes:
cat file.txt | egrep -v '^P.*'
P* means P zero or more times, which matches every line (zero times always succeeds), so together with -v it gives you no lines
^P.* means start of line, then P, and any char zero or more times
Quoting is needed to prevent shell expansion.
This can be shortened to
egrep -v ^P file.txt
because .* is not needed, so quoting is not needed either, and egrep can read the data from the file itself.
As we don't use extended regular expressions, grep will also work fine
grep -v ^P file.txt
Finally
grep -v ^P file.txt > new.txt
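The difference between the two patterns can be checked on a small made-up input. This is only a sketch; the sample words are not from the question.

```shell
input=$(printf '%s\n' Pear apple Plum kiwi)

# ^P matches only lines that start with P, so -v keeps the other lines.
kept=$(printf '%s\n' "$input" | grep -v '^P')

# P* matches the empty string on every line, so -v discards everything.
# grep exits non-zero when it prints nothing, hence the "|| true".
none=$(printf '%s\n' "$input" | grep -v 'P*' || true)

echo "$kept"
echo "lines kept by the unanchored pattern: [$none]"
```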
This works:
cat file.txt | egrep -v -e '^P'
-e indicates expression.

SED command error on MACOS X

I am trying to run this command in the Mac OS X terminal; it was originally intended to run on Linux:
sed '1 i VISPATH=/mnt/local/gdrive/public/3DVis' init.txt >> ~/.bash_profile
but it gives me the error:
command i expects \ followed by text.
Is there any way I could modify the above command to work in the Mac OS X terminal?
Shelter is right but there's another way to do it. You can use the bash $'...' quoting to interpret the escapes before passing the string to sed.
So:
sed -iold '1i\'$'\n''text to prepend'$'\n' file.txt
The pieces of that argument, from left to right:
'1i\'              a string for sed; the trailing \ tells sed to escape the next char, and the ' closes the string to sed
$'\n'              the special bash newline char to send to sed
'text to prepend'  reopens a string to send to sed
$'\n'              a final newline; no need to reopen a string to sed after it
This answer on unix.stackexchange.com led me to this solution.
Had the same problem and solved it with brew:
brew install gnu-sed
gsed YOUR_USUAL_SED_COMMAND
If you want to use the sed command, then you can set an alias:
alias sed=gsed
The OSX seds are based on older versions, so you need to be much more literal in your directions to sed. You're lucky, though: in this case, sed is telling you exactly what to do. Untested, as I don't have OSX, but try
sed '1 i\
VISPATH=/mnt/local/gdrive/public/3DVis
' init.txt >> ~/.bash_profile
The text for the i command ends at the first newline that is not preceded by a backslash. Other sed instructions can follow after that. Note: NO chars after the \ char!
Also, @StephenNiedzielski is right. Use the single quote chars to wrap your sed statements. (If you need variable expansion inside your sed and can escape other uses of $, then you can also use double quotes, but it's not recommended as a normal practice.)
edit
As I understand now that you're doing this from the command-line, and not in a script or other editor, I have tested the above, and.... all I can say is that famous line from tech support ... "It works for me". If you're getting an error message
sed: -e expression #1, char 8: extra characters after command
then you almost certainly have added some character after the \. I just tested that, and I got the above error message. (I'm using a linux version of sed, so the error messages are exactly the same). You should edit your question to include an exact cut-paste of your command line and the new error message. Using curly-single-quotes will not work.
IHTH
Here's how I worked it out on OS X. In my case, I needed to prepend text to a file. Apparently, modern sed works like this:
sed -i '1i text to prepend' file.txt
But on OS X I had to do the following:
sed -i '' '1i\
text to prepend
' file.txt
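The multi-line form of the i command should be accepted by both GNU and BSD sed, so a sketch like the following can be used to try it without touching a real file. The temporary directory and the two sample lines are assumptions made for the demonstration; only the VISPATH line and the init.txt name come from the question.

```shell
# Work on a throwaway copy so nothing real is modified.
tmpdir=$(mktemp -d)
printf '%s\n' 'line one' 'line two' > "$tmpdir/init.txt"

# "1i\" must be followed immediately by a newline, then the text to insert.
result=$(sed '1i\
VISPATH=/mnt/local/gdrive/public/3DVis
' "$tmpdir/init.txt")

echo "$result"
rm -r "$tmpdir"
```

Because the output goes to stdout, it can be appended to another file with >> as in the question, with no need for -i.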
It looks like you copied rich text. The single quotes should be straight not curly:
sed '1 i VISPATH=/mnt/local/gdrive/public/3DVis'

Resources