Remove words starting with "_" in file using sed in bash - macos

Given:
this is a_REMOVEME test_REMOVEME for the win
I want to get:
this is a test for the win
What I currently have doesn't seem to do the trick as it only removes the _:
sed -e 's/_\w*//g' myfile.txt

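\w is a GNU extension and may not be understood by your version of sed (the BSD sed on macOS, for example). Instead of \w try, say, [A-Za-z0-9_]* for word characters, or [^ ]* to match any run of non-space characters. You may also want to try sed -E (GNU sed also accepts -r) to turn on extended regexp support.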
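A minimal sketch of that suggestion, assuming a POSIX sed; the function name strip_underscore_words is made up for illustration. The bracket expression spells out what \w would match, so it should behave the same on GNU and BSD sed:

```shell
# Remove every "_" together with the word characters that follow it.
# [A-Za-z0-9_] stands in for \w, which not every sed understands.
strip_underscore_words() {
  sed 's/_[A-Za-z0-9_]*//g' "$@"
}

printf '%s\n' 'this is a_REMOVEME test_REMOVEME for the win' | strip_underscore_words
```

With no arguments the function filters standard input; with a file name (for example strip_underscore_words myfile.txt) it reads that file and leaves it unmodified.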

Try this:
sed -i.bak -e 's/_[A-Za-z0-9]*//g' here.txt

Funnily enough, the same expression works with command-line perl:
$ echo "this is a_REMOVEME test_REMOVEME for the win" | perl -pe 's/_\w*//g'
this is a test for the win

Related

Replace all unquoted characters from a file bash

Using bash, how would one replace all unquoted characters from a file?
I have a system that I can't modify that spits out CSV files such as:
code;prop1;prop2;prop3;prop4;prop5;prop6
0,1000,89,"a1,a2,a3",33,,
1,,,"a55,a10",1,1 L,87
2,25,1001,a4,,"1,5 L",
I need this to become, for a new system being added
code;prop1;prop2;prop3;prop4;prop5;prop6
0;1000;89;a1,a2,a3;33;;
1;;;a55,a10;1;1 L;87
2;25;1001;a4;;1,5 L;
If the quotes can be removed after this substitution happens in one command it would be nice :) But I prefer clarity to complicated one-liners for future maintenance.
Thank you

With sed:
sed -e 's/,/;/g' -e ':loop; s/\("\)\([^;]*\);\([^"]*"\)/\1\2,\3/; t loop'
Test:
$ sed -e 's/,/;/g' -e ':loop; s/\("\)\([^;]*\);\([^"]*"\)/\1\2,\3/; t loop' yourfile
code;prop1;prop2;prop3;prop4;prop5;prop6
0;1000;89;"a1,a2,a3";33;;
1;;;"a55,a10";1;1 L;87
2;25;1001;a4;;"1,5 L";
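If the quotes should disappear in the same pass, one possible alternative is awk rather than sed. The sketch below is an assumption-laden illustration, not a full CSV parser: the function name csv_to_semicolons is hypothetical, and it assumes that double quotes only ever wrap whole fields (no escaped "" inside a field, no line breaks inside a field).

```shell
# Split each line on double quotes: odd-numbered pieces lie outside the
# quotes, even-numbered pieces are the contents of quoted fields.
csv_to_semicolons() {
  awk -F'"' -v OFS='' '{
    for (i = 1; i <= NF; i += 2)
        gsub(/,/, ";", $i)   # commas outside quotes become semicolons
    $1 = $1                  # force awk to rebuild the line using OFS
    print                    # pieces are rejoined without the quotes
  }' "$@"
}

printf '%s\n' '0,1000,89,"a1,a2,a3",33,,' | csv_to_semicolons
```

Because the quotes themselves are the field separator, several quoted fields on one line are handled without a loop. For input that breaks the assumptions above, a real CSV parser is the safer route.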

You want to use a csv parser. Parsing csv with shell tools is hard (you will encounter regular expressions soon, and they rarely get all cases).
There is one in almost every language. I recommend python.
You can also do this using excel/openoffice variants by opening the file and then saving with ; as the separator.

You can use sed:
echo '0,1000,89,"a1,a2,a3",33,,' | sed -e "s|\"||g"
This replaces " with the empty string (deleting it), and you can pipe the result into another sed to replace each , with ; (including the commas inside the formerly quoted field):
sed -e "s|,|;|g"
$ echo '0,1000,89,"a1,a2,a3",33,,' | sed -e "s|\"||g" | sed -e "s|,|;|g"
>> 0;1000;89;a1;a2;a3;33;;
Note that you can use any separator you want instead of | inside the sed command. For example, you can rewrite the first sed as:
sed -e "s-\"--g"

How to toggle cases of characters in a string with a one-liner?

"I am Groot" should be changed to "i AM gROOT" using sed one-liner.
I've tried...
sed -e 's/(.*)/\L\1/' -e 's/(.*)/\U\1/'
..., but both expressions don't seem to run in parallel.
Any suggestions?

Using sed:
$ echo "I am Groot" | sed 'y/ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz/abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/'
i AM gROOT
tr is a little more compact (but not unicode-safe):
$ echo "I am Groot" | tr '[:upper:][:lower:]' '[:lower:][:upper:]'
i AM gROOT
Another sed solution (requires GNU sed)
The following toggles the case with help from a character that we think will never be in a sed input line. One possibility would be to choose \x00 for that character because it can never be part of a bash variable. Another is to choose \n because it is never part of a sed input line. For the following, \n was chosen.
All lower case characters in the input are tagged by putting a \n in front of them. Then, any upper-case character is converted to lower case. Finally, any character with a \n in front of it is converted to upper case:
$ echo "I am Groot" | sed -r 's/[[:lower:]]/\n&/g; s/[[:upper:]]/\L&/g; s/\n(.)/\U\1/g'
i AM gROOT

This might work for you (GNU sed):
sed "s/.*/echo '&'|tr '[:upper:][:lower:]' '[:lower:][:upper:]'/e" file
Used GNU sed's evaluation command but really why not just use tr?
N.B. Can be used in conjunction with the -i option to reverse case a file in place.

Replace all hyphens at the end of line by '?'

I have the following file:
------FGJFG----HULKJ----LKHJ-------
---JKLJLK-----UIOUOPPOIPIPIPOPIP---
GGJKHKLJK----------JKLHKLJLKJLKJLKJ
and I want this:
??????FGJFG----HULKJ----LKHJ???????
???JKLJLK-----UIOUOPPOIPIPIPOPIP???
GGJKHKLJK----------JKLHKLJLKJLKJLKJ
i.e., I want to replace all leading and trailing '-' by the same number of '?', but not the '-' between letters.
I know how to do this for leading:
sed -i ':a;s/^\(-*\)-/\1?/;ta' file
but how can I modify the command to replace '-' at the end of lines?

You can use perl:
perl -pe 's/\G-|-(?=-*$)/?/g'
Output:
cat file
??????FGJFG----HULKJ----LKHJ???????
???JKLJLK-----UIOUOPPOIPIPIPOPIP???
GGJKHKLJK----------JKLHKLJLKJLKJLKJ

Here is an awk solution:
awk -F"[^-]" '{a=b="";for (i=1;i<=length($1);i++) a="?"a;sub(/^-+/,a);for (i=1;i<=length($NF);i++) b="?"b;sub(/-+$/,b)}1' file
??????FGJFG----HULKJ----LKHJ???????
???JKLJLK-----UIOUOPPOIPIPIPOPIP???
GGJKHKLJK----------JKLHKLJLKJLKJLKJ

In Perl,
perl -pe 's/^(-+)/"?" x length($1)/e; s/(-+)$/"?" x length($1)/e' file

Through Perl.
$ perl -pe 's/(?<=[^-])-+(?=[^-\n])(*SKIP)(*F)|-/?/g' file
??????FGJFG----HULKJ----LKHJ???????
???JKLJLK-----UIOUOPPOIPIPIPOPIP???
GGJKHKLJK----------JKLHKLJLKJLKJLKJ
(?<=[^-])-+(?=[^-\n]) matches all the hyphens which are at the middle. (*SKIP)(*F) makes the match to fail and - after | will match all the hyphens from the remaining string.

The same looping mechanism that you're using for the leading dashes can be used for the trailing ones, with appropriate minor changes:
sed -i -e ':a;s/^\(-*\)-/\1?/;ta' \
-e ':b;s/-\(-*\)$/?\1/;tb' file
Note that this assumes GNU sed on several grounds; you have to do things a bit differently with the BSD (Mac OS X) version of sed.
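For a sed that is not GNU, one way to sketch the same two loops is to give every label and branch its own -e, since BSD sed reads a label up to the end of the line rather than up to a semicolon. The function name mask_edge_hyphens is hypothetical, and the sketch writes to standard output instead of using -i, whose syntax differs between GNU and BSD sed:

```shell
# Loop a: turn leading hyphens into "?" one at a time.
# Loop b: do the same for trailing hyphens.
# Each label and each branch sits in its own -e so that the label
# name ends where the -e argument ends.
mask_edge_hyphens() {
  sed -e ':a' -e 's/^\(-*\)-/\1?/' -e 'ta' \
      -e ':b' -e 's/-\(-*\)$/?\1/' -e 'tb' "$@"
}

printf '%s\n' '------FGJFG----HULKJ----LKHJ-------' | mask_edge_hyphens
```

Redirecting the output to a temporary file and moving it over the original is a portable substitute for in-place editing.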

This one-line Perl program will do as you ask
perl -pe 's/(?|^(\-+)|(\-+)$)/$1=~tr|-|?|r/eg' myfile.txt

Replace comma with newline in sed on MacOS?

I have a file of strings that are comma separated. I'm trying to replace the commas with a new line. I've tried:
sed 's/,/\n/g' file
but it is not working. What am I missing?

Use tr instead:
tr , '\n' < file

Use an ANSI-C quoted string $'string'
You need a backslash-escaped literal newline to get to sed.
In bash at least, $'' strings will replace \n with a real newline, but then you have to double the backslash that sed will see to escape the newline, e.g.
echo "a,b" | sed -e $'s/,/\\\n/g'
Note this will not work on all shells, but will work on the most common ones.
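For shells without $'...' support, a possible workaround is to keep the newline in an ordinary variable. This is a sketch under that assumption; the names nl and comma_to_newline are made up for illustration:

```shell
# A variable holding one literal newline; plain POSIX sh syntax.
nl='
'

comma_to_newline() {
  # Inside double quotes, \\ yields a single backslash, so sed receives
  # s/,/\<newline>/g : a backslash-escaped literal newline.
  sed "s/,/\\${nl}/g" "$@"
}

printf '%s\n' 'a,b' | comma_to_newline
```

The double quotes are what let the shell expand ${nl}; the price is that any backslash meant for sed has to be doubled.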

sed 's/,/\
/g'
works on Mac OS X.

If your sed usage tends to be entirely substitution expressions (as mine tends to be), you can also use perl -pe instead
$ echo 'foo,bar,baz' | perl -pe 's/,/,\n/g'
foo,
bar,
baz

macOS is different. There are three ways to solve this problem with sed on a Mac.
First, use \'$'\n'' in place of \n; this works on macOS:
sed 's/,/\'$'\n''/g' file
Second, use a backslash followed by a literal line break:
sed 's/,/\
/g' file
P.S. Pay attention to where each single-quoted range begins and ends.
Third, install gnu-sed and use it in place of the Mac's sed.

Apparently \r is the key!
$ sed 's/, /\r/g' file3.txt > file4.txt
Transformed this:
ABFS, AIRM, AMED, BOSC, CALI, ECPG, FRGI, GERN, GTIV, HSON, IQNT, JRCC, LTRE,
MACK, MIDD, NKTR, NPSP, PME, PTIX, REFR, RSOL, UBNT, UPI, YONG, ZEUS
To this:
ABFS
AIRM
AMED
BOSC
CALI
ECPG
FRGI
GERN
GTIV
HSON
IQNT
JRCC
LTRE
MACK
MIDD
NKTR
NPSP
PME
PTIX
REFR
RSOL
UBNT
UPI
YONG
ZEUS

This works on macOS Mountain Lion (10.8), Solaris 10 (SunOS 5.10) and RHEL (Red Hat Enterprise Linux Server release 5.3, Tikanga)...
$ sed 's/{pattern}/\^J/g' foo.txt > foo2.txt
... where the ^J is entered by pressing Ctrl+V and then Ctrl+J. Do mind the \ before the ^J.
PS: I know the sed in RHEL is GNU and the macOS sed is FreeBSD-based; although I'm not sure about the Solaris sed, I believe this will work with pretty much any sed. YMMV, though...

To make it complete, this also works:
echo "a,b" | sed "s/,/\\$(echo -e '\n\r')/"

Though I am late to this post, just updating my findings. This answer is only for Mac OS X.
$ sed 's/new/
> /g' m1.json > m2.json
sed: 1: "s/new/
/g": unescaped newline inside substitute pattern
In the above command I tried with Shift+Enter to add new line which didn't work. So this time I tried with "escaping" the "unescaped newline" as told by the error.
$ sed 's/new/\
> /g' m1.json > m2.json
Worked! (in Mac OS X 10.9.3)

$ echo $PATH | sed -e $'s/:/\\\n/g'
/usr/local/sbin
/Library/Oracle/instantclient_11_2/sdk
/usr/local/bin
...
Works for me on Mojave

Just to clarify: the man page of sed on OS X (10.8; Darwin Kernel Version 12.4.0) says:
[...]
Sed Regular Expressions
The regular expressions used in sed, by default, are basic regular expressions (BREs, see re_format(7) for more information), but extended
(modern) regular expressions can be used instead if the -E flag is given. In addition, sed has the following two additions to regular
expressions:
1. In a context address, any character other than a backslash (``\'') or newline character may be used to delimit the regular expression.
Also, putting a backslash character before the delimiting character causes the character to be treated literally. For example, in the
context address \xabc\xdefx, the RE delimiter is an ``x'' and the second ``x'' stands for itself, so that the regular expression is
``abcxdef''.
2. The escape sequence \n matches a newline character embedded in the pattern space. You cannot, however, use a literal newline
character in an address or in the substitute command.
[...]
so I guess one has to use tr, as mentioned above, or the nifty
sed "s/,/^M
/g"
note: you have to type <ctrl>-v,<return> to get '^M' in vi editor

The sed shipped with macOS Mojave dates from 2005, so one solution is to install gnu-sed:
brew install gnu-sed
Then gsed will do as you wish:
gsed 's/,/\n/g' file
If you prefer to type sed, run sudo sh -c 'echo /usr/local/opt/gnu-sed/libexec/gnubin > /etc/paths.d/brew', as suggested by brew info gnu-sed. Restart your terminal, and sed on the command line will be gsed.

FWIW, the following line works in windows and replaces semicolons in my path variables with a newline. I'm using the tools installed under my git bin directory.
echo %path% | sed -e $'s/;/\\n/g' | less

I have found another command that is working also.
find your_filename.txt -type f -exec sed -i 's/,/\n/g' {} \;

How to insert a newline in front of a pattern?

How to insert a newline before a pattern within a line?
For example, this will insert a newline after the regex pattern.
sed 's/regex/&\n/g'
How can I do the same but in front of the pattern?
Given this sample input file, the pattern to match on is the phone number.
some text (012)345-6789
Should become
some text
(012)345-6789

This works in bash and zsh, tested on Linux and OS X:
sed 's/regexp/\'$'\n/g'
In general, for $ followed by a string literal in single quotes bash performs C-style backslash substitution, e.g. $'\t' is translated to a literal tab. Plus, sed wants your newline literal to be escaped with a backslash, hence the \ before $. And finally, the dollar sign itself shouldn't be quoted so that it's interpreted by the shell, therefore we close the quote before the $ and then open it again.
Edit: As suggested in the comments by @mklement0, this works as well:
sed $'s/regexp/\\\n/g'
What happens here is: the entire sed command is now a C-style string, which means the backslash that sed requires to be placed before the new line literal should now be escaped with another backslash. Though more readable, in this case you won't be able to do shell string substitutions (without making it ugly again.)

Some of the other answers didn't work for my version of sed.
Switching the position of & and \n did work.
sed 's/regexp/\n&/g'
Edit: This doesn't seem to work on OS X, unless you install gnu-sed.

In sed, you can't add newlines in the output stream easily. You need to use a continuation line, which is awkward, but it works:
$ sed 's/regexp/\
&/'
Example:
$ echo foo | sed 's/.*/\
&/'
foo
See here for details. If you want something slightly less awkward you could try using perl -pe with match groups instead of sed:
$ echo foo | perl -pe 's/(.*)/\n$1/'
foo
$1 refers to the first matched group in the regular expression, where groups are in parentheses.
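Applied to the sample input from the question, the continuation-line form could look like the sketch below. The function name newline_before_phone is hypothetical, and the leading " *" is an added assumption so that the space before the number is dropped instead of being left at the end of the first line:

```shell
# Put a newline in front of a phone number such as (012)345-6789.
# In a basic regular expression "(" and ")" are literal, while \( \)
# capture the number so that \1 can re-emit it after the line break.
newline_before_phone() {
  sed 's/ *\(([0-9]\{3\})[0-9]\{3\}-[0-9]\{4\}\)/\
\1/g' "$@"
}

printf '%s\n' 'some text (012)345-6789' | newline_before_phone
```

Only POSIX features are used (intervals written as \{3\}, a backslash followed by a literal newline in the replacement), so this should not depend on GNU sed.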

On my mac, the following inserts a single 'n' instead of newline:
sed 's/regexp/\n&/g'
This replaces with newline:
sed "s/regexp/\\`echo -e '\n\r'`/g"

echo one,two,three | sed 's/,/\
/g'

You can use perl one-liners much like you do with sed, with the advantage of full perl regular expression support (which is much more powerful than what you get with sed). There is also very little variation across *nix platforms - perl is generally perl. So you can stop worrying about how to make your particular system's version of sed do what you want.
In this case, you can do
perl -pe 's/(regex)/\n$1/'
-pe puts perl into a "execute and print" loop, much like sed's normal mode of operation.
' quotes everything else so the shell won't interfere
() surrounding the regex is a grouping operator. $1 on the right side of the substitution prints out whatever was matched inside these parens.
Finally, \n is a newline.
Regardless of whether you are using parentheses as a grouping operator, you have to escape any parentheses you are trying to match. So a regex to match the pattern you list above would be something like
\(\d\d\d\)\d\d\d-\d\d\d\d
\( or \) matches a literal paren, and \d matches a digit.
Better:
\(\d{3}\)\d{3}-\d{4}
I imagine you can figure out what the numbers in braces are doing.
Additionally, you can use delimiters other than / for your regex. So if you need to match / you won't need to escape it. Either of the below is equivalent to the regex at the beginning of my answer. In theory you can substitute any character for the standard /'s.
perl -pe 's#(regex)#\n$1#'
perl -pe 's{(regex)}{\n$1}'
A couple final thoughts.
using -ne instead of -pe acts similarly, but doesn't automatically print at the end. It can be handy if you want to print on your own. E.g., here's a grep-alike (m/foobar/ is a regex match):
perl -ne 'if (m/foobar/) {print}'
If you are finding dealing with newlines troublesome, and you want it to be magically handled for you, add -l. Not useful for the OP, who was working with newlines, though.
Bonus tip - if you have the pcre package installed, it comes with pcregrep, which uses full perl-compatible regexes.

In this case, I do not use sed. I use tr.
cat Somefile |tr ',' '\012'
This takes the comma and replaces it with a newline (line feed, octal 012).

To insert a newline to output stream on Linux, I used:
sed -i "s/def/abc\\\ndef/" file1
Where file1 was:
def
Before the sed in-place replacement, and:
abc
def
After the sed in-place replacement. Please note the use of \\\n. If the patterns have a " inside it, escape using \".

Hmm, just escaped newlines seem to work in more recent versions of sed (I have GNU sed 4.2.1),
dev:~/pg/services/places> echo 'foobar' | sed -r 's/(bar)/\n\1/;'
foo
bar

echo pattern | sed -E -e $'s/^(pattern)/\\\n\\1/'
worked fine on El Capitan with () support

In my case the below method works.
sed -i 's/playstation/PS4/' input.txt
Can be written as:
sed -i 's/playstation/PS4\nplaystation/' input.txt
PS4
playstation
Consider using \\n when the command sits inside a string literal.
sed: the stream editor
-i: edits the source file in place
/: the delimiter
I hope the above information works for you 😃.

in sed you can reference groups in your pattern with "\1", "\2", ....
so if the pattern you're looking for is "PATTERN", and you want to insert "BEFORE" in front of it, you can use, sans escaping
sed 's/(PATTERN)/BEFORE\1/g'
i.e.
sed 's/\(PATTERN\)/BEFORE\1/g'

You can also do this with awk, using -v to provide the pattern:
awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
This checks whether a line contains the given pattern. If so, it inserts a newline in front of each match.
See a basic example:
$ cat file
hello
this is some pattern and we are going ahead
bye!
$ awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
hello
this is some
pattern and we are going ahead
bye!
Note that it affects every match in a line:
$ cat file
this pattern is some pattern and we are going ahead
$ awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
this
pattern is some
pattern and we are going ahead

sed -e 's/regexp/\0\n/g'
In GNU sed, \0 stands for the whole match (the same as &), so the matched text is kept and then...
\n adds the new line after it.
It doesn't work on some flavors of Unix, but I think it's the solution to your problem.
echo "Hello" | sed -e 's/Hello/\0\ntmow/g'
Hello
tmow

This works on a Mac for me:
sed -i.bak -e 's/regex/xregex/g' input.txt
sed -i.bak -e 's/qregex/\'$'\nregex/g' input.txt
I don't know whether it's the perfect one...

After reading all the answers to this question, it still took me many attempts to get the correct syntax to the following example script:
#!/bin/bash
# script: add_domain
# using fixed values instead of command line parameters $1, $2
# to show typical variable values in this example
ipaddr="127.0.0.1"
domain="example.com"
# no need to escape $ipaddr and $domain values if we use separate quotes.
sudo sed -i '$a \\n'"$ipaddr www.$domain $domain" /etc/hosts
The script appends a newline \n followed by another line of text to the end of a file using a single sed command.

In vi on Red Hat, I was able to insert line breaks using just the \r character. I believe this internally executes 'ex' instead of 'sed', but it's similar, and vi can be another way to do bulk edits such as code patches. For example, I am surrounding a search term with an if statement that insists on line breaks after the braces:
:.,$s/\(my_function(.*)\)/if(!skip_option){\r\t\1\r\t}/
Note that I also had it insert some tabs to make things align better.

Just to add to the list of many ways to do this, here is a simple python alternative. You could of course use re.sub() if a regex were needed.
python -c 'print(open("./myfile.txt", "r").read().replace("String to match", "String to match\n"))' > myfile_lines.txt

sed 's/regexp/\'$'\n/g'
works as justified and detailed by mojuba in his answer.
However, this did not work:
sed 's/regexp/\\\n/g'
It added a new line, but at the end of the original line, a \n was added.
