Ruby split keep the delimiter before the string - ruby

I have the following string :
a = '% abc \n %% abcd \n %% efgh\n '
I would like the output to be
['% abc \n', '%% abcd \n', '%% efgh \n']
If I have
b = '%% abc \n %% efg \n %% ijk \n'
I would like the output to be
['%% abc \n', '%% efg \n', '%% ijk \n']
I use b.split('%%').collect!{|v| '%%' + v }, which works fine for case 2 but not for case 1.
I have seen posts that use scan or split to keep the delimiter when it comes after each piece.
For example: 'a; b; c' becomes ['a;', 'b;', 'c']
But I want the opposite: ['a', ';b', ';c']
There need not be a space between \n and %%, since \n denotes a new line.
A solution I came up with was:
test_string = '% asd \n %% asf sdaf \n %% adsasd asdf asd asf '
delimiter = '%%'
index_of_percent = test_string.index(delimiter)
if index_of_percent == 0
  result = test_string.split(delimiter).reject(&:empty?).collect { |v| delimiter + v }
else
  # split only the part that starts at the first delimiter, then put the leading text back
  result = test_string[index_of_percent..-1].split(delimiter).reject(&:empty?).collect { |v| delimiter + v }
  result.unshift(test_string[0...index_of_percent])
end

(?<=\\n)\s*(?=%%)
You can split on the whitespace using lookarounds. See the demo:
https://regex101.com/r/fM9lY3/7
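A minimal Ruby sketch of the lookaround idea, assuming the string holds real newlines (double quotes) rather than the two characters backslash and n. The method name split_before is made up for illustration. A lookahead matches a position without consuming the delimiter, so the delimiter stays at the front of the next piece.

```ruby
# Split so that each delimiter stays at the start of the following piece.
# A lookahead matches a position only, so nothing is removed from the string.
def split_before(text, delimiter)
  text.split(/(?=#{Regexp.escape(delimiter)})/)
end

p split_before('a;b;c', ';')
# => ["a", ";b", ";c"]

# For the strings in the question: split on the optional whitespace that sits
# between a newline (lookbehind) and the next %% (lookahead).
a = "% abc \n %% abcd \n %% efgh\n"
p a.split(/(?<=\n)\s*(?=%%)/)
# => ["% abc \n", "%% abcd \n", "%% efgh\n"]
```

Because the lookbehind requires a preceding newline, a string that already starts with %% is not split at position 0, so no empty first element appears.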

You could do it this way
def splitter(s)
#reject(&:empty) added to handle trailing space in a
s.lines.map{|n| n.lstrip.chomp(' ')}.reject(&:empty?)
end
#double quotes used to keep ruby from changing
# \n to \\n
a = "% abc \n %% abcd \n %% efgh\n "
b = "%% abc \n %% efg \n %% ijk \n"
splitter(a)
#=> ["% abc \n", "%% abcd \n", "%% efgh\n"]
splitter(b)
#=> ["%% abc \n", "%% efg \n", "%% ijk \n"]
String#lines partitions the string right after each newline character by default and returns an Array. We then call Array#map, passing each line to the block. Each line calls lstrip to remove the leading whitespace and chomp(' ') to remove a single trailing space, if there is one, without removing the \n. Finally we reject any empty strings, which occur for variable a because of its trailing space.

You can also use
a.split(/\\n\s?/).collect{|e| "#{e}\\n"}
a.split(/\\n\s?/)
# ["% abc ", "%% abcd ", "%% efgh"]
.collect{|e| "#{e}\\n"}
# will append \n
# ["% abc \\n", "%% abcd \\n", "%% efgh\\n"]

Related

How to add newline character with trailing spaces with printf?

I have the following printf command that works correctly adding 5 spaces after the string ABC and then prints string "DEF".
printf '%s%*s' "ABC" 5 '' "DEF"
I'd like to add a newline character at the end, after the string DEF, but I don't know how to do it. I've tried without success in these ways:
user /d
$ printf '%s%*s' "ABC" 5 '' "DEF\n"
ABC     DEF\n
user /d
$ printf '%s%*s\n' "ABC" 5 '' "DEF"
ABC
DEF
How should this be done? Thanks in advance.
Add another String placeholder for DEF and a new line character:
printf '%s%*s%s\n' "ABC" 5 '' "DEF"
You may use $'\n' to get a newline:
printf '%s%*s' "ABC" 5 '' $'DEF\n'
It didn't work with
printf '%s%*s\n' "ABC" 5 '' "DEF"
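because the printf format has one conversion fewer than the arguments need. The first %s is used for ABC, and %*s uses 5 and '' to output five spaces. No conversion is left for DEF, so printf reuses the whole format for the remaining argument, and \n is printed at the end of each pass: once after the spaces and once after DEF.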
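The rule of one conversion per argument can also be sketched in Ruby, whose Kernel#format accepts * as "take the width from the next argument". This is only an analogy: unlike the shell's printf, Ruby does not reuse the format string for leftover arguments.

```ruby
# %s  -> "ABC"
# %*s -> width 5 taken from the argument list, then the empty string
# %s  -> "DEF", followed by a real newline from the double-quoted "\n"
line = format("%s%*s%s\n", 'ABC', 5, '', 'DEF')
print line
# prints "ABC     DEF" followed by a newline
```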

BASH - Shuffle characters in strings from several rows

I have a file (filename.txt) with the following structure:
>line1
ABC
DEF
GHI
>line2
JKL
MNO
PQR
>line3
STU
VWX
YZ
I would like to shuffle the characters in the strings that do not start with >. The output would, for example, look like the following:
>line1
DGC
FEI
HBA
>line2
JRP
OKN
QML
>line3
SZV
YXT
UW
This is what I tried in order to shuffle the characters under each >line[number] header: ruby -lpe '$_ = $_.chars.shuffle * "" if !/^>/' filename.txt. The command works (see my post BASH - Shuffle characters in strings from file), but it shuffles line by line. How could I modify it to shuffle characters across all the strings under each >line[number] header? Using Ruby is not a requirement.
First, we need to solve the problem: how to shuffle all characters in multiple lines:
echo -e 'ABC\nDEF\nGHI' |grep -o . |shuf |tr -d '\n'
GDAFHEIBC
We also need an array that records the length of each original line.
s=GDAFHEIBC
lens=(3 3 3)
start=0
for len in "${lens[@]}"; do
echo ${s:${start}:${len}}
((start+=len))
done
GDA
FHE
IBC
So the original lines:
ABC
DEF
GHI
have been shuffled to:
GDA
FHE
IBC
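Since the question mentions Ruby, here is a hedged Ruby sketch of the same two steps: pool and shuffle all characters, then cut the pool back into pieces with the original line lengths. The method name shuffle_across_lines and the random: parameter are assumptions made for illustration, not part of the answer above.

```ruby
# Shuffle the characters of several lines as one pool, then cut the
# pool back into pieces that have the original line lengths.
def shuffle_across_lines(lines, random: Random.new)
  lengths = lines.map(&:length)                   # e.g. [3, 3, 3]
  pool = lines.join.chars.shuffle(random: random) # all characters, mixed
  lengths.map { |len| pool.shift(len).join }      # take len chars per line
end

p shuffle_across_lines(%w[ABC DEF GHI])
# e.g. ["GDA", "FHE", "IBC"] (the order varies from run to run)
```

Passing a seeded Random makes a run repeatable, which helps when checking the result by hand.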
Now we can put it all together:
lens=()
string=""
function shuffle_lines {
local start=0
local shuffled_string=$(grep -o . <<< ${string} |shuf |tr -d '\n')
for len in "${lens[@]}"; do
echo ${shuffled_string:${start}:${len}}
((start+=len))
done
lens=()
string=""
}
while read -r line; do
if [[ "${line}" =~ ^\> ]]; then
shuffle_lines
echo "${line}"
else
string+="${line}"
lens+=(${#line})
fi
done <filename.txt
shuffle_lines
Examples:
$ cat filename.txt
>line1
ABC
DEF
GHI
>line2
JKL
MNO
PQR
>line3
STU
VWX
YZ
>line4
0123
456
78
9
$ ./solution.sh
>line1
HFG
BED
AIC
>line2
JOP
KMQ
RLN
>line3
UVW
TYZ
XS
>line4
1963
245
08
7
#!/bin/bash
# echo > output.txt # uncomment to write in a file output.txt
mix(){
{
echo "$title"
line="$( fold -w1 <<< "$line" | shuf )"
echo "${line//$'\n'}" | fold -w3
} # >> output.txt # uncomment to write in a file output.txt
unset line
}
while read -r; do
if [[ $REPLY =~ ^\> ]]; then
mix
title="$REPLY"
else
line+="$REPLY"
fi
done < filename.txt
mix # final mix after loop's exit, otherwise line3 will be not mixed
exit
edited with comment of gniourf-gniourf
First create a test file.
str =<<FINI
>line1
ABC
DEF
GHI
>line2
JKL
MNO
PQR
>line3
STU
VWX
YZ
FINI
File.write('test', str)
#=> 56
Now read the file and perform the desired operations.
result = File.read('test').split(/(>line\d+)/).map do |s|
if s.match?(/\A(?:|>line\d+)\z/)
s
else
a = s.scan(/\p{Lu}/).shuffle
s.gsub(/\p{Lu}/) { a.shift }
end
end.join
# ">line1\nECF\nHIA\nGBD\n>line2\nJNP\nKLR\nOQM\n>line3\nTXY\nUZV\nSW\n"
puts result
>line1
ECF
HIA
GBD
>line2
JNP
KLR
OQM
>line3
TXY
UZV
SW
To do this from the command line, convert the code to a string with statements separated by semicolons.
ruby -e "puts (File.read('test').split(/(>line\d+)/).map do |s|; if s.match?(/\A(?:|>line\d+)\z/); s; else; a = s.scan(/\p{Lu}/).shuffle; s.gsub(/\p{Lu}/) { a.shift }; end; end).join"
The steps are as follows.
a = File.read('test')
#=> ">line1\nABC\nDEF\nGHI\n>line2\nJKL\nMNO\nPQR\n>line3\nSTU\nVWX\nYZ\n"
b = a.split(/(>line\d+)/)
#=> ["", ">line1", "\nABC\nDEF\nGHI\n", ">line2", "\nJKL\nMNO\nPQR\n",
# ">line3", "\nSTU\nVWX\nYZ\n"]
Notice that the regular expression that is split's argument places >line\d+ within a capture group. Without that, ">line1", ">line2" and ">line3" would not be included in b.
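A small sketch of that difference, using a shortened version of the file contents (the variable name text is only for illustration):

```ruby
text = ">line1\nABC\n>line2\nJKL\n"

# Without a capture group the matched headers are discarded.
without_group = text.split(/>line\d+/)
p without_group
# => ["", "\nABC\n", "\nJKL\n"]

# With a capture group the matched headers are kept as separate elements.
with_group = text.split(/(>line\d+)/)
p with_group
# => ["", ">line1", "\nABC\n", ">line2", "\nJKL\n"]
```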
c = b.map do |s|
if s.match?(/\A(?:|>line\d+)\z/)
s
else
a = s.scan(/\p{Lu}/).shuffle
s.gsub(/\p{Lu}/) { a.shift }
end
end
#=> ["", ">line1", "\nEAC\nIHB\nDGF\n", ">line2", "\nKQJ\nROL\nMPN\n",
# ">line3", "\nSUY\nXTV\nZW\n"]
c.join
#=> ">line1\nEAC\nIHB\nDGF\n>line2\nKQJ\nROL\nMPN\n>line3\nSUY\nXTV\nZW\n"
Now consider more closely the calculation of c. The first element of b is passed to the block and the block variable s is set to its value:
s = ""
We then compute
s.match?(/\A(?:|>line\d+)\z/)
#=> true
so "" is returned from the block. The regular expression can be expressed as follows.
/
\A # match the beginning of the string
(?: # begin a non-capture group
# match an empty string
| # or
>line\d+ # match '>line' followed by one or more digits
) # end non-capture group
\z # match the end of the string
/x # free-spacing regex definition mode.
Within the non-capture group an empty string was matched.
The next element of b is then passed to the block.
s = ">line1"
Again
s.match?(/\A(?:|>line\d+)\z/)
#=> true
so s is returned from the block.
Now the third element of b is passed to the block. (Finally, something interesting.)
s = "\nABC\nDEF\nGHI\n"
d = s.scan(/\p{Lu}/)
#=> ["A", "B", "C", "D", "E", "F", "G", "H", "I"]
a = d.shuffle
#=> ["D", "C", "G", "H", "B", "F", "I", "E", "A"]
s.gsub(/\p{Lu}/) { a.shift }
#=> "\nDCG\nHBF\nIEA\n"
The remaining calculations are similar.
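To see how gsub with a block consumes the replacement array one element per match, here is a deterministic sketch that reverses the letters instead of shuffling them, so the result can be checked by hand. The variable name replacements is illustrative.

```ruby
s = "\nABC\nDEF\n"

# Reversed instead of shuffled so the result is predictable.
replacements = s.scan(/\p{Lu}/).reverse # ["F", "E", "D", "C", "B", "A"]

# Each match takes the next element off the front of the array;
# the newlines are not matched, so the line structure is preserved.
result = s.gsub(/\p{Lu}/) { replacements.shift }
p result
# => "\nFED\nCBA\n"
```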

Ruby regexp for replace some sequence

How can I convert the string
text = "test test1 \n \n \n \n \n \n \n \n \n \n \n \n \n \n test2 \n"
to
test test1 \n\n\n\n\n\n\n\n\n\n\n\n\n\n test2\n
I tried the following: text.gsub(/\s\n/, '\n'), but it added an extra backslash:
test test1\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n\\n test2\\n
Use double quotes, instead of single:
text.gsub(/\s\n/, "\n")
With single quotes, \n has the meaning of \ and n, one after another. With double, it is interpreted as new line.
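A quick way to check the difference yourself (a minimal sketch; the sample text is shortened from the question):

```ruby
p '\n'.length # => 2 (a backslash and the letter n)
p "\n".length # => 1 (a single newline character)

text = "test1 \n \ntest2"

# Double quotes: each "whitespace + newline" pair becomes a real newline.
fixed = text.gsub(/\s\n/, "\n")
p fixed
# => "test1\n\ntest2"

# Single quotes: the replacement is the two characters backslash and n.
literal = text.gsub(/\s\n/, '\n')
p literal
# => "test1\\n\\ntest2"
```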
I expect that either the space after "test1" is to be removed as well or the space after "test2" is not to be removed. @ndn assumed the former was intended. If the second interpretation applies, you could do the following:
r = /
(?<=\n) # match \n in a positive lookbehind
\s # match a whitespace character
(?=\n) # match \n in a positive lookahead
/x # extended/free-spacing regex definition mode
text.gsub(r,"")
#=> "test test1 \n\n\n\n\n\n\n\n\n\n\n\n\n\n test2 \n"
or:
text.gsub(/\n\s(?=\n)/, "\n")

Check if string1 is before string2 on the same line

I am trying to match comment lines in a c#/sql code. CREATE may come before or after /*. They can be on the same line.
line6 = " CREATE /* this is ACTIVE line 6"
line5 = " charlie /* CREATE inside this is comment 5"
In the first case, it will be an active line; in the second, it will be a comment. I probably can do some kind of charindex, but maybe there is a simpler way
regex1 = /\/\*|--/
if line6 =~ regex1 then puts "Match comment___" + line6 else puts '____' end
if line5 =~ regex1 then puts "Match comment___" + line5 else puts '____' end
With the regex
r = /
\/ # match forward slash
\* # match asterisk
\s+ # match > 0 whitespace chars
CREATE # match chars
\b # match word break (to avoid matching CREATED)
/x # extended (free-spacing) mode for regex def
you can return an array of the comment lines thus:
[line6, line5].select { |l| l =~ r }
#=> [" charlie /* CREATE inside this is comment 5"]
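The "charindex" idea from the question can be sketched with String#index by comparing the positions of the first /* and the first CREATE. The method name create_in_comment? is hypothetical, and the sketch assumes only /* comments on a single line; it ignores -- comments and a closing */.

```ruby
# Returns true when the first CREATE on the line comes after the first "/*".
def create_in_comment?(line)
  comment_at = line.index('/*')
  create_at  = line.index(/\bCREATE\b/)
  return false if comment_at.nil? || create_at.nil?

  create_at > comment_at
end

line6 = ' CREATE /* this is ACTIVE line 6'
line5 = ' charlie /* CREATE inside this is comment 5'
p create_in_comment?(line6) # => false
p create_in_comment?(line5) # => true
```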

How to add string "\n" literally at the end of each line in Ruby?

Here is a string str:
str = "line1
line2
line3"
We would like to add string "\n" to the end of each line:
str = "line1 \n
line2 \n
line3 \n"
A method is defined:
def mod_line(str)
s = ""
str.each_line do |l|
s += l + '\\n'
end
end
The problem is that '\n' is a line feed and was not added to the end of the str even with escape \. What's the right way to add '\n' literally to each line?
String#gsub/String#gsub! plus a very simple regular expression can be used to achieve that:
str = "line1
line2
line3"
str.gsub!(/$/, ' \n')
puts str
Output:
line1 \n
line2 \n
line3 \n
The platform-independent solution:
str.gsub(/\R/) { " \\n#{$~}" }
It searches for line breaks (line feeds and/or carriage returns) and replaces each one with itself, preceded by a space and a literal \n.
\n needs to be interpreted as a special character. You need to put it in double quotes.
"\n"
Your attempt:
'\\n'
only escapes the backslash, which is actually redundant. With or without escaping on the backslash, it gives you a backslash followed by the letter n.
Also, your method mod_line returns the result of str.each_line, which is the original string str. You need to return the modified string s:
def mod_line(str)
...
s
end
And by the way, be aware that each line of the original string already has "\n" at the end of each line, so you are adding the second "\n" to each line (making it two lines).
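Putting these points together, one possible corrected mod_line is sketched below. It assumes the goal is the output shown in the question (a space and a literal \n at the end of every line) and that the input has no trailing newline.

```ruby
def mod_line(str)
  # chomp: true drops each line's own newline, so we can append the two
  # characters backslash + n and then re-join the lines with real newlines.
  str.each_line(chomp: true).map { |line| "#{line} \\n" }.join("\n")
end

str = "line1\nline2\nline3"
puts mod_line(str)
# line1 \n
# line2 \n
# line3 \n
```

The map/join chain is the method's last expression, so the modified string is what gets returned rather than the original str.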
This is the closest I got to it.
def mod_line(str)
s = ""
str.each_line do |l|
s += l
end
p s
end
Using p instead of puts leaves the \n on the end of each line.
