Replace all hyphens at the end of line by '?' - bash

I have the following file:
------FGJFG----HULKJ----LKHJ-------
---JKLJLK-----UIOUOPPOIPIPIPOPIP---
GGJKHKLJK----------JKLHKLJLKJLKJLKJ
and I want this:
??????FGJFG----HULKJ----LKHJ???????
???JKLJLK-----UIOUOPPOIPIPIPOPIP???
GGJKHKLJK----------JKLHKLJLKJLKJLKJ
i.e., I want to replace all leading and trailing '-' with the same number of '?', but not the '-' between letters.
I know how to do this for the leading ones:
sed -i ':a;s/^\(-*\)-/\1?/;ta' file
but how can I modify the command to replace '-' at the end of lines?

You can use perl:
perl -pe 's/\G-|-(?=-*$)/?/g' file
Output:
??????FGJFG----HULKJ----LKHJ???????
???JKLJLK-----UIOUOPPOIPIPIPOPIP???
GGJKHKLJK----------JKLHKLJLKJLKJLKJ

Here is an awk
awk -F"[^-]" '{a=b="";for (i=1;i<=length($1);i++) a="?"a;sub(/^-+/,a);for (i=1;i<=length($NF);i++) b="?"b;sub(/-+$/,b)}1' file
??????FGJFG----HULKJ----LKHJ???????
???JKLJLK-----UIOUOPPOIPIPIPOPIP???
GGJKHKLJK----------JKLHKLJLKJLKJLKJ

In Perl,
perl -pe 's/^(-+)/"?" x length($1)/e; s/(-+)$/"?" x length($1)/e' file

Through Perl.
$ perl -pe 's/(?<=[^-])-+(?=[^-\n])(*SKIP)(*F)|-/?/g' file
??????FGJFG----HULKJ----LKHJ???????
???JKLJLK-----UIOUOPPOIPIPIPOPIP???
GGJKHKLJK----------JKLHKLJLKJLKJLKJ
(?<=[^-])-+(?=[^-\n]) matches all the hyphens that sit in the middle. (*SKIP)(*F) makes that match fail, and the - after | then matches all the hyphens in the remaining string.

The same looping mechanism that you're using for the leading dashes can be used for the trailing ones, with appropriate minor changes:
sed -i -e ':a;s/^\(-*\)-/\1?/;ta' \
-e ':b;s/-\(-*\)$/?\1/;tb' file
Note that this assumes GNU sed on several grounds; you have to do things a bit differently with the BSD (Mac OS X) version of sed.
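For sed versions that dislike labels and commands joined with semicolons, one possible variant puts every label and command in its own -e and works as a filter instead of editing in place. This is a sketch, and the function name edge_dashes_to_qmarks is an assumption for illustration.

```shell
# Hypothetical helper: reads lines on stdin, writes the result to stdout.
edge_dashes_to_qmarks() {
    # Loop a turns the last hyphen of the leading run into '?' until none is left;
    # loop b does the same for the first hyphen of the trailing run.
    sed -e ':a' -e 's/^\(-*\)-/\1?/' -e 'ta' \
        -e ':b' -e 's/-\(-*\)$/?\1/' -e 'tb'
}

printf '%s\n' '------FGJFG----HULKJ----LKHJ-------' \
              'GGJKHKLJK----------JKLHKLJLKJLKJLKJ' | edge_dashes_to_qmarks
```

To change a file, redirect the output to a temporary file and move it over the original.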

This one-line Perl program will do as you ask
perl -pe 's/(?|^(\-+)|(\-+)$)/$1=~tr|-|?|r/eg' myfile.txt

Related

How to convert multiple parameters URLs into single parameter URLs in bash

$ cat urls.txt
http://example.com/test/test/test?apple=&bat=&cat=&dog=
https://test.com/test/test/test?aa=&bb=&cc=
http://target.com/test/test?hmm=
I want output like below 👇🏻. How can I do that in bash (as a single-line command)?
$ cat urls.txt
http://example.com/test/test/test?apple=
http://example.com/test/test/test?bat=
http://example.com/test/test/test?cat=
http://example.com/test/test/test?dog=
https://test.com/test/test/test?aa=
https://test.com/test/test/test?bb=
https://test.com/test/test/test?cc=
http://target.com/test/test?hmm=
With GNU awk:
$ awk -F'?|=&|=' '{for(i=2;i<NF;i++) print $1 "?" $i "="}' urls.txt
http://example.com/test/test/test?apple=
http://example.com/test/test/test?bat=
http://example.com/test/test/test?cat=
http://example.com/test/test/test?dog=
https://test.com/test/test/test?aa=
https://test.com/test/test/test?bb=
https://test.com/test/test/test?cc=
http://target.com/test/test?hmm=
I tried sed, but it gets complex. With perl, like this:
perl -pe 'if(/(.*\?)/){$url=$1;s#&#\n$url#g;}' urls.txt
it works well.
With GNU awk using gensub():
awk '{print gensub(/^(https?:)(.*)(\?[[:alpha:]]+=)(.*)/,"\\1\\2\\3","g")}' file
http://example.com/test/test/test?apple=
https://test.com/test/test/test?aa=
http://target.com/test/test?hmm=
gensub() lets the replacement text refer to components of the regexp, which are marked with parentheses in the regexp (four here). We print only three of them: "\\1\\2\\3", so everything after the first parameter is dropped.
This might work for you (GNU sed):
sed -E 's/(([^?]+\?)[^=]+=)&/\1\n\2/;P;D' file
Replace each & by a newline and the substring before the first parameter, print/delete the first line and repeat.
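The same idea (repeat everything up to the '?' in front of each parameter) can also be sketched without GNU extensions, using only index(), substr() and split() from standard awk. The function name split_params is hypothetical.

```shell
# Hypothetical helper: reads URLs on stdin, prints one line per query parameter.
split_params() {
    awk '{
        q = index($0, "?")                   # position of the first "?"
        if (q == 0) { print; next }          # no query string: pass the line through
        base = substr($0, 1, q)              # everything up to and including "?"
        n = split(substr($0, q + 1), params, "&")
        for (i = 1; i <= n; i++) print base params[i]
    }'
}

printf '%s\n' 'https://test.com/test/test/test?aa=&bb=&cc=' | split_params
```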

How to convert parts of a line to uppercase in a file

I have a file file.txt and it has the lines below. I want the queuename to be converted to uppercase, like this: queuename=SP00245B
# Queue name
#
queuename=sp00245b
awk '$1 == "queuename" {$2 = toupper($2)}1' FS== OFS== input-file
Note that this will fail if there are 2 = in the line, and only the values between the first 2 = will be uppercased. If that's an issue, it's an easy fix (left as an exercise for the reader).
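One way to approach that exercise, as a sketch: split at the first = only, by position, so that any further = characters stay inside the value. The function name upper_queuename is made up for illustration.

```shell
# Hypothetical helper: uppercases everything after the first "=" on queuename lines.
upper_queuename() {
    awk '{
        eq = index($0, "=")                  # position of the first "=" (0 if none)
        if (eq > 0 && substr($0, 1, eq - 1) == "queuename")
            $0 = substr($0, 1, eq) toupper(substr($0, eq + 1))
        print
    }'
}

printf '%s\n' '# Queue name' 'queuename=sp=00245b' | upper_queuename
```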
A simple Perl solution:
perl -i -pe 's/^\s*queuename=\K(.*)/\U$1/' file.txt
(Remove -i if you don't want to modify the file in place.)
With GNU sed:
sed -i 's/\(^[[:blank:]]*queuename=\)\(.*\)/\1\U\2/' file.txt
This uses two capture groups and the \U sequence to toggle uppercase substitution for the second group.
You can also use the sed conversion \U to convert portions of the matched pattern to uppercase with the substitution command. To convert everything following the '=' sign you could use, e.g.
sed '/^queuename=/s/=.*$/\U&/' filename
To edit the file in-place, include the -i option, e.g.
sed -i '/^queuename=/s/=.*$/\U&/' filename
Example Use/Output
$ echo "queuename=sp00245b" | sed '/^queuename=/s/=.*$/\U&/'
queuename=SP00245B

Remove words starting with "_" in file using sed in bash

Given:
this is a_REMOVEME test_REMOVEME for the win
I want to get:
this is a test for the win
What I currently have doesn't seem to do the trick as it only removes the _:
sed -e 's/_\w*//g' myfile.txt
\w may not be understood by your version of sed by default. Instead of \w try, say, [A-Za-z0-9] or [^ ] to match any non-space characters. You may also want to try sed -re to turn on extended regexp support.
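A minimal sketch of that suggestion, with the bracket expression spelled out so it does not depend on \w support. The underscore is included in the class because \w matches it too; the function name strip_suffix_words is hypothetical.

```shell
# Hypothetical helper: removes "_" plus any following word characters.
strip_suffix_words() {
    sed 's/_[A-Za-z0-9_]*//g'
}

printf '%s\n' 'this is a_REMOVEME test_REMOVEME for the win' | strip_suffix_words
# prints: this is a test for the win
```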
Try this:
sed -i -e 's/_[A-Za-z0-9]* / /g' here.txt
Funnily enough, if you had done that using command-line perl, it would work:
$ echo "this is a_REMOVEME test_REMOVEME for the win" | perl -pe 's/_\w*//g'
this is a test for the win

Insert line breaks in a file whenever a comma is encountered - shell script

I need to write a shell script to re-format a file by inserting line breaks. A line break should be inserted wherever a comma occurs in the file.
For example, if the file delimiter.txt contains:
this, is a file, that should, be added, with a, line break, when we find, a comma.
The output should be:
this
is a file
that should
be added
with a
line break
when we find
a comma.
Can this be done with grep or awk?
Using GNU sed:
sed 's/, /\n/g' your.file
Output:
this
is a file
that should
be added
with a
line break
when we find
a comma.
Note: \n in the replacement is a GNU sed extension, so the command above works on Linux but not with every UNIX sed (BSD sed, for example, emits a literal n).
If you need a portable solution in a script, use the following expression, which uses a literal newline instead of \n:
sed 's/,[[:space:]]/\
/g' your.file
Thanks @EdMorten for this advice.
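As a sketch of how the portable form might sit inside a script, the literal newline can be kept inside a small function. The name commas_to_lines is hypothetical, and the pattern uses [[:space:]]* (zero or more) so that commas without a following space are handled too.

```shell
# Hypothetical helper: turns each comma (plus any following whitespace) into a line break.
commas_to_lines() {
    # The backslash at the end of the first line escapes the literal newline for sed.
    sed 's/,[[:space:]]*/\
/g'
}

printf '%s\n' 'this, is a file, that should,be added' | commas_to_lines
```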
This is what tr is for
$ tr ',' '\n' <<< 'this, is a file, that should, be added, with a, line break, when we find, a comma.'
this
 is a file
 that should
 be added
 with a
 line break
 when we find
 a comma.
Or if you must use awk:
awk '{gsub(", ", "\n", $0)}1' delimiter.txt
Solution using awk:
awk 1 RS=", " file
this
is a file
that should
be added
with a
line break
when we find
a comma.
Here's the solution using perl:
perl -pe 's#,#\n#g'
Here's a sample of it working properly on OpenBSD or OS X:
% echo 'a,b,c,d,e' | perl -pe 's#,#\n#g'
a
b
c
d
e
%
Unlike the sed solutions earlier, this perl one-liner works everywhere; the same search/replace snippet does not work with the BSD sed on OpenBSD or OS X:
% echo 'a,b,c,d,e' | sed -E 's#,#\n#g'
anbncndne
%

How to insert a newline in front of a pattern?

How to insert a newline before a pattern within a line?
For example, this will insert a newline after the regex pattern.
sed 's/regex/&\n/g'
How can I do the same but in front of the pattern?
Given this sample input file, the pattern to match on is the phone number.
some text (012)345-6789
Should become
some text
(012)345-6789
This works in bash and zsh, tested on Linux and OS X. As written it replaces the match with a newline; put & after the newline to keep the matched text:
sed 's/regexp/\'$'\n/g'
In general, for $ followed by a string literal in single quotes bash performs C-style backslash substitution, e.g. $'\t' is translated to a literal tab. Plus, sed wants your newline literal to be escaped with a backslash, hence the \ before $. And finally, the dollar sign itself shouldn't be quoted so that it's interpreted by the shell, therefore we close the quote before the $ and then open it again.
Edit: As suggested in the comments by @mklement0, this works as well:
sed $'s/regexp/\\\n/g'
What happens here is: the entire sed command is now a C-style string, which means the backslash that sed requires to be placed before the new line literal should now be escaped with another backslash. Though more readable, in this case you won't be able to do shell string substitutions (without making it ugly again.)
Some of the other answers didn't work for my version of sed.
Switching the position of & and \n did work.
sed 's/regexp/\n&/g'
Edit: This doesn't seem to work on OS X, unless you install gnu-sed.
In sed, you can't add newlines in the output stream easily. You need to use a continuation line, which is awkward, but it works:
$ sed 's/regexp/\
&/'
Example:
$ echo foo | sed 's/.*/\
&/'
foo
If you want something slightly less awkward you could try using perl -pe with match groups instead of sed:
$ echo foo | perl -pe 's/(.*)/\n$1/'
foo
$1 refers to the first matched group in the regular expression, where groups are in parentheses.
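Applied to the phone-number sample from the question, the continuation-line form might look like the sketch below. The function name break_before_phone is hypothetical, the pattern assumes the exact (ddd)ddd-dddd layout shown in the question, and the leading " *" also swallows the space before the number so the first line has no trailing blank.

```shell
# Hypothetical helper: puts a (ddd)ddd-dddd phone number on its own line.
break_before_phone() {
    # In a basic regular expression, plain ( and ) are literal; \( \) form the group.
    sed 's/ *\(([0-9]\{3\})[0-9]\{3\}-[0-9]\{4\}\)/\
\1/'
}

printf '%s\n' 'some text (012)345-6789' | break_before_phone
```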
On my mac, the following inserts a single 'n' instead of newline:
sed 's/regexp/\n&/g'
This replaces with newline:
sed "s/regexp/\\`echo -e '\n\r'`/g"
echo one,two,three | sed 's/,/\
/g'
You can use perl one-liners much like you do with sed, with the advantage of full perl regular expression support (which is much more powerful than what you get with sed). There is also very little variation across *nix platforms - perl is generally perl. So you can stop worrying about how to make your particular system's version of sed do what you want.
In this case, you can do
perl -pe 's/(regex)/\n$1/'
-pe puts perl into a "execute and print" loop, much like sed's normal mode of operation.
' quotes everything else so the shell won't interfere
() surrounding the regex is a grouping operator. $1 on the right side of the substitution prints out whatever was matched inside these parens.
Finally, \n is a newline.
Regardless of whether you are using parentheses as a grouping operator, you have to escape any parentheses you are trying to match. So a regex to match the pattern you list above would be something like
\(\d\d\d\)\d\d\d-\d\d\d\d
\( or \) matches a literal paren, and \d matches a digit.
Better:
\(\d{3}\)\d{3}-\d{4}
I imagine you can figure out what the numbers in braces are doing.
Additionally, you can use delimiters other than / for your regex. So if you need to match / you won't need to escape it. Either of the below is equivalent to the regex at the beginning of my answer. In theory you can substitute any character for the standard /'s.
perl -pe 's#(regex)#\n$1#'
perl -pe 's{(regex)}{\n$1}'
A couple of final thoughts:
Using -ne instead of -pe acts similarly, but doesn't automatically print at the end. It can be handy if you want to print on your own. E.g., here's a grep-alike (m/foobar/ is a regex match):
perl -ne 'if (m/foobar/) {print}'
If you are finding dealing with newlines troublesome, and you want it to be magically handled for you, add -l. Not useful for the OP, who was working with newlines, though.
Bonus tip - if you have the pcre package installed, it comes with pcregrep, which uses full perl-compatible regexes.
In this case, I do not use sed. I use tr.
cat Somefile |tr ',' '\012'
This takes the comma and replaces it with a newline (octal 012 is the line feed character, not the carriage return).
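To insert a newline into the output stream on Linux, I used:
sed -i "s/def/abc\ndef/" file1
Where file1 was:
def
Before the sed in-place replacement, and:
abc
def
After the sed in-place replacement. Note that inside double quotes the shell passes \n through to sed unchanged, and GNU sed turns it into a newline; writing \\\n there would hand sed \\n, which it outputs as a literal backslash followed by n. If the pattern has a " inside it, escape it using \".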
Hmm, just escaped newlines seem to work in more recent versions of sed (I have GNU sed 4.2.1),
dev:~/pg/services/places> echo 'foobar' | sed -r 's/(bar)/\n\1/;'
foo
bar
echo pattern | sed -E -e $'s/^(pattern)/\\\n\\1/'
worked fine on El Capitan with () support
In my case the below method works.
sed -i 's/playstation/PS4/' input.txt
Can be written as:
sed -i 's/playstation/PS4\nplaystation/' input.txt
PS4
playstation
Consider using \\n when the command is embedded in a string literal.
sed : the stream editor
-i : edits the source file in place
/ : the delimiter
I hope the above information works for you 😃.
In sed you can reference groups in your pattern with "\1", "\2", ....
So if the pattern you're looking for is "PATTERN", and you want to insert "BEFORE" in front of it, you can use (shown first without the escaping):
sed 's/(PATTERN)/BEFORE\1/g'
i.e., with the parentheses escaped as basic regular expressions require:
sed 's/\(PATTERN\)/BEFORE\1/g'
You can also do this with awk, using -v to provide the pattern:
awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
This checks whether a line contains the given pattern. If so, it inserts a newline in front of each occurrence of it.
See a basic example:
$ cat file
hello
this is some pattern and we are going ahead
bye!
$ awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
hello
this is some
pattern and we are going ahead
bye!
Note that it affects every occurrence of the pattern in a line:
$ cat file
this pattern is some pattern and we are going ahead
$ awk -v patt="pattern" '$0 ~ patt {gsub(patt, "\n"patt)}1' file
this
pattern is some
pattern and we are going ahead
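The replacement "\n"patt re-inserts the pattern text itself, which matches the input only when patt is a literal string. If patt is a real regular expression, a variant using & (the matched text) in the replacement may be closer to what is wanted. A sketch, with the hypothetical function name newline_before_match:

```shell
# Hypothetical helper: $1 is an awk regular expression; reads stdin, writes stdout.
newline_before_match() {
    # "&" in the gsub replacement stands for whatever the regex actually matched.
    awk -v patt="$1" '{ gsub(patt, "\n&") } 1'
}

printf '%s\n' 'this pattern is some pattern' | newline_before_match 'pat+ern'
```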
sed -e 's/regexp/\0\n/g'
\0 refers to the whole match (like &) in GNU sed, so the matched text is kept and then...
\n adds the new line after it
It doesn't work on some flavors of Unix, but I think it's the solution to your problem.
echo "Hello" | sed -e 's/Hello/\0\ntmow/g'
Hello
tmow
This works on a Mac for me:
sed -i.bak -e 's/regex/xregex/g' input.txt
sed -i.bak -e 's/xregex/\'$'\nregex/g' input.txt
I don't know whether it's the perfect one...
After reading all the answers to this question, it still took me many attempts to get the correct syntax to the following example script:
#!/bin/bash
# script: add_domain
# using fixed values instead of command line parameters $1, $2
# to show typical variable values in this example
ipaddr="127.0.0.1"
domain="example.com"
# no need to escape $ipaddr and $domain values if we use separate quotes.
sudo sed -i '$a \\n'"$ipaddr www.$domain $domain" /etc/hosts
The script appends a newline \n followed by another line of text to the end of a file using a single sed command.
In vi on Red Hat, I was able to insert carriage returns using just the \r character. I believe this internally executes 'ex' instead of 'sed', but it's similar, and vi can be another way to do bulk edits such as code patches. For example, I am surrounding a search term with an if statement that insists on carriage returns after the braces:
:.,$s/\(my_function(.*)\)/if(!skip_option){\r\t\1\r\t}/
Note that I also had it insert some tabs to make things align better.
Just to add to the list of many ways to do this, here is a simple python alternative. You could of course use re.sub() if a regex were needed.
python -c 'print(open("./myfile.txt", "r").read().replace("String to match", "String to match\n"))' > myfile_lines.txt
sed 's/regexp/\'$'\n/g'
works, as justified and detailed by mojuba in his answer.
However, this did not work:
sed 's/regexp/\\\n/g'
It added a new line, but at the end of the original line, a \n was added.

Resources