How to replace "\n" but not "\n\n" etc. with " \n"?
text1 = "Hello\nWorld"
text1.sub! "\n", " \n"
=> "Hello \nWorld"
text2 = "Hello\n\nWorld"
text2.sub! "\n", " \n"
=> "Hello \n\nWorld"
SHOULD BE: => "Hello\n\nWorld"
You can use the regular expression /(?<!\n)\n(?!\n)/, which matches a \n only when it is neither preceded nor followed by another \n.
text1 = "Hello\nWorld"
# => "Hello\nWorld"
text1.sub /(?<!\n)\n(?!\n)/, " \n"
# => "Hello \nWorld"
text2 = "Hello\n\nWorld"
# => "Hello\n\nWorld"
text2.sub /(?<!\n)\n(?!\n)/, " \n"
# => "Hello\n\nWorld"
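If the string can contain several isolated newlines, the same pattern should also work with gsub instead of sub. A minimal sketch, assuming every single newline is to be padded (the constant name and the sample text are made up for illustration):

```ruby
# Matches a newline that is neither preceded nor followed by another newline.
SINGLE_NEWLINE = /(?<!\n)\n(?!\n)/

text = "one\ntwo\n\nthree\nfour"

# gsub replaces every isolated newline; the "\n\n" run is left untouched.
result = text.gsub(SINGLE_NEWLINE, " \n")
p result
# => "one \ntwo\n\nthree \nfour"
```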
Here's another way:
r = /\n+/
"Hello\nWorld".sub(r) { |s| (s.size==1) ? " \n" : s }
#=> "Hello \nWorld"
"Hello\n\nWorld".sub(r) { |s| (s.size==1) ? " \n" : s }
#=> "Hello\n\nWorld"
and one more:
h = Hash.new { |h,k| h[k] = k }.update("\n"=>" \n")
#=> {"\n"=>" \n"}
"Hello\nWorld".sub(r,h)
#=> "Hello \nWorld"
"Hello\n\nWorld".sub(r,h)
#=> "Hello\n\nWorld"
In the latter method, each string of one or more consecutive newline characters is passed to the hash. If it's a single newline, "\n" is replaced with h["\n"] #=> " \n". If it's two or more newlines, say s = "\n\n", and h does not have a key equal to s (initially it won't), the key-value pair s=>s will be added to h (because of the default value defined for the hash) and s will be replaced by itself.
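To see the default block at work, here is a small sketch that runs gsub over a made-up string containing runs of one, two and three newlines, and then inspects which keys the hash has picked up:

```ruby
r = /\n+/
# Unknown keys map to themselves and are cached; "\n" maps to " \n".
h = Hash.new { |hash, key| hash[key] = key }.update("\n" => " \n")

out = "a\nb\n\nc\n\n\nd".gsub(r, h)
p out
# => "a \nb\n\nc\n\n\nd"

# The longer runs were added to the hash by its default block.
p h.keys
# => ["\n", "\n\n", "\n\n\n"]
```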
Another solution you could use:
string = "Hello\nWorld"
string.split("\n") == string.split("\n").reject(&:empty?) ? string.sub!("\n", " \n") : string
#=> "Hello \nWorld"
string = "Hello\n\nWorld"
string.split("\n") == string.split("\n").reject(&:empty?) ? string.sub!("\n", " \n") : string
#=> "Hello\n\nWorld"
Assuming I have the following input:
names = ["\"Петр Сергеевич\"", "\"Курсатов Роман\"", "\" \"", "\"Павел2 Олегович\"", "\"Илья иванович\"", "\" \""]
Each whitespace is actually a non-breaking space (U+00A0).
How do I remove \" in pure ruby, so the following is true:
p names
=> ["Петр Сергеевич", "Курсатов Роман", " ", "Павел2 Олегович", "Илья иванович", " "]
I tried:
names.map { |i| i.gsub(/[\"]/, "")}.map(&:inspect)
names.map { |i| i.delete('\\"')}.map(&:inspect)
names.map { |i| i.gsub('\\"', '')}.map(&:inspect)
Nothing seems to work.
string.delete("\"")
# => " "
or
string.tr("\"", "")
# => " "
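Applied to the whole array, that could look like the sketch below. The input is a shortened version of the one in the question, with the non-breaking spaces written as \u00A0 so they are visible. Note that the attempts in the question end with .map(&:inspect), and inspect wraps each string in quotes again, which is probably why nothing seemed to work.

```ruby
names = ["\"Петр\u00A0Сергеевич\"", "\"\u00A0\"", "\"Павел2\u00A0Олегович\""]

# Remove every double-quote character from each name; no inspect afterwards.
cleaned = names.map { |name| name.delete('"') }

cleaned.each { |name| puts name }
```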
Using Ruby 2.4. How do I apply an edit to the string itself? I have this method
# Removes the word from the end of the string
def remove_word_from_end_of_str(str, word)
str[0...(-1 * word.length)]
end
I want the parameter to be operated upon, but it isn't working ...
2.4.0 :001 > str = "abc def"
=> "abc def"
2.4.0 :002 > StringHelper.remove_word_from_end_of_str(str, "def")
=> "abc "
2.4.0 :003 > str
=> "abc def"
I want the string that was passed in to be equal to "abc " but that isn't happening. I don't want to set the variable to the result of the function (e.g. "str = StringHelper.remove(...)").
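Ruby's String#delete! method modifies the string in place and gives the desired result here. Be aware, though, that it deletes every character found in its argument rather than the word as a unit: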
>> str = 'abc def'
=> "abc def"
>> word = 'def'
=> "def"
>> str.delete!(word)
=> "abc "
>> str
=> "abc "
Note that this removes every d, e and f anywhere in the string, not only the trailing word:
>> str = 'def abc def'
=> "def abc def"
>> str.delete!(word)
=> " abc "
To limit the effect to only the last word, you can do:
>> str = 'def abc def'
=> "def abc def"
>> str.slice!(-word.length..-1)
=> "def"
>> str
=> "def abc "
str[range] is just shorthand for str.slice(range). You just have to use the destructive method, like this:
# Removes the word from the end of the string
def remove_word_from_end_of_str(str, word)
str.slice!((str.length - word.length)...(str.length))
end
For more information, see the documentation.
If you want your function to return the new string as well, use:
# Removes the word from the end of the string
def remove_word_from_end_of_str(str, word)
str.slice!((str.length - word.length)...(str.length))
str
end
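The versions above cut off word.length characters whether or not the string really ends with word. If that matters, a guarded variant might look like this sketch (the bang-suffixed method name is my own; end_with? and slice! are standard String methods):

```ruby
# Hypothetical variant: only mutates str when it actually ends with word.
def remove_word_from_end_of_str!(str, word)
  str.slice!(-word.length..-1) if !word.empty? && str.end_with?(word)
  str
end

s = "abc def"
remove_word_from_end_of_str!(s, "def")
p s  # => "abc "

t = "abc def"
remove_word_from_end_of_str!(t, "xyz")
p t  # => "abc def"
```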
Try:
def remove_word_from_end_of_str(str, word)
str.slice!((str.length - word.length)..str.length)
end
Also, your explanation is a little confusing. You are calling the remove_word method as a class method but it is an instance method.
chomp! modifies the string in place: it returns the string with the given record separator removed from the end (if present), and nil if nothing was removed.
def remove_word_from_end_of_str(str, word)
str.chomp!(word)
end
str = "Aurora CO"
remove_word_from_end_of_str(str, "CO")
p str #=> "Aurora "
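Because of that nil return value, it is safer to rely on the mutated string than on what chomp! returns. A quick sketch of both cases:

```ruby
str = "Aurora CO"

first  = str.chomp!("CO")  # suffix present: str is modified and returned
second = str.chomp!("CO")  # suffix already gone: returns nil, str unchanged

p first   # => "Aurora "
p second  # => nil
p str     # => "Aurora "
```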
I'm having some trouble trying to find an appropriate method for string substitution. I would like to replace every character in a string except for a selection of words or set of strings (provided in an array). I know there's a gsub method, but I guess what I'm trying to achieve is its reverse. For example...
My string: "Part of this string needs to be substituted"
Keywords: ["this string", "substituted"]
Desired output: "**** ** this string ***** ** ** substituted"
ps. It's my first question ever, so your help will be greatly appreciated!
Here's a different approach. First, do the reverse of what you ultimately want: redact what you want to keep. Then compare this redacted string to your original character by character, and if the characters are the same, redact, and if they are not, keep the original.
class String
# Returns a string with all words except those passed in as keepers
# redacted.
#
# "Part of this string needs to be substituted".gsub_except(["this string", "substituted"], '*')
# # => "**** ** this string ***** ** ** substituted"
def gsub_except keep, mark
reverse_keep = self.dup
keep.each_with_object(Hash.new(0)) { |e, a| a[e] = mark * e.length }
.each { |word, redacted| reverse_keep.gsub! word, redacted }
reverse_keep.chars.zip(self.chars).map do |redacted, original|
redacted == original && original != ' ' ? mark : original
end.join
end
end
You can use something like:
str="Part of this string needs to be substituted"
keep = ["this","string", "substituted"]
str.split(" ").map{|word| keep.include?(word) ? word : word.split("").map{|w| "*"}.join}.join(" ")
but this will work only to keep words, not phrases.
This might be a little more understandable than my last answer:
s = "Part of this string needs to be substituted"
k = ["this string", "substituted"]
tmp = s
for(key in k) {
tmp = tmp.replace(k[key], function(x){ return "*".repeat(x.length)})
}
res = s.split("")
for(charIdx in s) {
if(tmp[charIdx] != "*" && tmp[charIdx] != " ") {
res[charIdx] = "*"
} else {
res[charIdx] = s.charAt(charIdx)
}
}
var finalResult = res.join("")
Explanation:
This goes off of my previous idea about using where the keywords are in order to replace portions of the string with stars. First off:
For each of the keywords we replace it with stars, of the same length as it. So:
s.replace("this string", function(x){
return "*".repeat(x.length)
})
replaces the portion of s that matches "this string" with x.length *'s
We do this for each keyword. For completeness, you should make sure the replacement is global and not just the first match found, e.g. /this string/g. I didn't do that in the answer, but you should be able to figure out how to use new RegExp yourself.
Next up, we split a copy of the original string into an array. If you're a visual person, it should make sense to think of this as a weird sort of character addition:
"Part of this string needs to be substituted"
"Part of *********** needs to be substituted" +
---------------------------------------------
**** ** this string ***** ** ** ***********
is what we're going for. So if our tmp variable has stars, then we want to bring over the original string, and otherwise we want to replace the character with a *
This is easily done with an if statement. And to make it like your example in the question, we also bring over the original character if it's a space. Lastly, we join the array back into a string via .join("") so that you can work with a string again.
Makes sense?
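Since the question is about Ruby, here is a rough Ruby sketch of the same two-pass idea. The method name is made up, and it assumes the original text contains no literal "*" characters. Unlike the JavaScript above, gsub already replaces every occurrence of each keyword.

```ruby
# Hypothetical helper; mirrors the mask-then-compare idea described above.
def redact_except(str, keywords, mark = "*")
  # Pass 1: in a copy, mask every occurrence of each keyword.
  masked = keywords.reduce(str) do |acc, kw|
    acc.gsub(kw) { |m| mark * m.length }
  end

  # Pass 2: keep the original character where the copy was masked or where
  # the original is a space; star out everything else.
  str.each_char.zip(masked.each_char).map do |orig, m|
    (m == mark || orig == " ") ? orig : mark
  end.join
end

p redact_except("Part of this string needs to be substituted",
                ["this string", "substituted"])
# => "**** ** this string ***** ** ** substituted"
```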
You can use the following approach: collect the substrings that you need to turn into asterisks, and then perform this replacement:
str="Part of this string needs to be substituted"
arr = ["this string", "substituted"]
arr_to_remove = str.split(Regexp.new("\\b(?:" + arr.map { |x| Regexp.escape(x) }.join('|') + ")\\b|\\s+")).reject { |s| s.empty? }
arr_to_remove.each do |s|
str = str.gsub(s, "*" * s.length)
end
puts str
Output of the demo program:
**** ** this string ***** ** ** substituted
str = "Part of this string needs to be substituted"
keywords = ["this string", "substituted"]
pattern = /(#{keywords.join('|')})/
str.split(pattern).map {|i| keywords.include?(i) ? i : i.gsub(/\S/,"*")}.join
#=> "**** ** this string ***** ** ** substituted"
A more readable version of the same code
str = "Part of this string needs to be substituted"
keywords = ["this string", "substituted"]
#Use regexp pattern to split string around keywords.
pattern = /(#{keywords.join('|')})/ #pattern => /(this string|substituted)/
str = str.split(pattern) #=> ["Part of ", "this string", " needs to be ", "substituted"]
redacted = str.map do |i|
if keywords.include?(i)
i
else
i.gsub(/\S/,"*") # replace all non-whitespace characters with "*"
end
end
# redacted => ["**** ** ", "this string", " ***** ** ** ", "substituted"]
redacted.join
You can do that using the form of String#split that uses a regex with a capture group.
Code
def sub_some(str, keywords)
str.split(/(#{keywords.join('|')})/)
.map {|s| keywords.include?(s) ? s : s.gsub(/./) {|c| (c==' ') ? c : '*'}}
.join
end
Example
str = "Part of this string needs to be substituted"
keywords = ["this string", "substituted"]
sub_some(str, keywords)
#=> "**** ** this string ***** ** ** substituted"
Explanation
r = /(#{keywords.join('|')})/
#=> /(this string|substituted)/
a = str.split(r)
#=> ["Part of ", "this string", " needs to be ", "substituted"]
e = a.map
#=> #<Enumerator: ["Part of ", "this string", " needs to be ",
# "substituted"]:map>
s = e.next
#=> "Part of "
keywords.include?(s) ? s : s.gsub(/./) { |c| (c==' ') ? c : '*' }
#=> s.gsub(/./) { |c| (c==' ') ? c : '*' }
#=> "Part of ".gsub(/./) { |c| (c==' ') ? c : '*' }
#=> "**** ** "
s = e.next
#=> "this string"
keywords.include?(s) ? s : s.gsub(/./) { |c| (c==' ') ? c : '*' }
#=> s
#=> "this string"
and so on... Lastly,
["**** ** ", "this string", " ***** ** ** ", "substituted"].join
#=> "**** ** this string ***** ** ** substituted"
Note that, prior to v.1.9.3, Enumerable#map did not return an enumerator when no block is given. The calculations are the same, however.
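One caveat with the split-based answers: they interpolate the keywords straight into the regex, so a keyword containing characters such as + or ( would be read as regex syntax. A possible variant, assuming that is a concern, builds the pattern with Regexp.union, which escapes plain strings (the method name here is hypothetical):

```ruby
# Hypothetical variant of the split-based approach with escaped keywords.
def redact_with_union(str, keywords)
  # Regexp.union escapes each string; the outer group makes split keep matches.
  pattern = /(#{Regexp.union(keywords)})/
  str.split(pattern).map { |part|
    keywords.include?(part) ? part : part.gsub(/\S/, "*")
  }.join
end

p redact_with_union("Learn C++ (beta) today", ["C++ (beta)"])
# => "***** C++ (beta) *****"
```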
For microarray data processing, I need to make a list of gene names from 1 to 654, like Gene_1 ... Gene_654.
My simple Ruby code produces the following:
1.upto(654).each { |i| print "Gene" }
The result is:
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene
..................................
GeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGeneGene=> 1
irb(main):008:0>
How do I add a "postfix _#" in sequential incremental order to a printed string and put them in a column, like:
Gene_1
Gene_2
::::::
Gene_654
1.upto(654).each { |i| printf "%8s\t", "Gene_#{i}" }
Source: http://www.ruby-doc.org/core-2.0.0/Kernel.html#format-method
Edited to conform to the new requirements:
1.upto(654).each { |i| puts "Gene_#{i}" }
--output:--
Gene_1
Gene_2
...
Gene_654
I'd use:
str = 'Gene_0'
654.times { puts str.next! }
Which outputs:
Gene_1
...
Gene_654
If you need the text output to the same width, perhaps because you're going to append information to each line, use some formatting:
str = 'Gene_0'
654.times { puts '%8s ' % str.next! }
# >>   Gene_1
...
# >>   Gene_9
# >>  Gene_10
...
# >>  Gene_99
# >> Gene_100
...
# >> Gene_654
If you need columns across a page:
str = 'Gene_0'
654.times { print '%8s ' % str.next! }
puts
Which spaces them out in 8-space-wide columns.
By default %8s uses right alignment, which isn't always what you want. Instead you can use %-8s for left-alignment.
You can build an array containing the column headings:
str = 'Gene_0'
columns = []
654.times { columns << '%-8s' % str.next! }
puts columns.join(' ')
You could even use something like inject:
str = 'Gene_0'
columns = []
(1..654).inject(columns) { |a, i| a.push('%-8s' % str.next!) }
puts columns.join(' ')
But that starts to add code that doesn't really help.
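If you'd rather not mutate a string with next!, a map over the range is another option. This is only a sketch of that alternative, using String#ljust for the left-aligned padding instead of a format string:

```ruby
# Build all 654 left-aligned, 8-character-wide column headings in one pass.
columns = (1..654).map { |i| "Gene_#{i}".ljust(8) }

# Show the first few headings, separated by a single space.
puts columns.first(3).join(' ')
```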
The OP asked:
...how to add " " to the result...
The output above doesn't make it easy to see the whitespace automatically appended to the output by '%8s ', so I tweaked the format-string to make it more obvious by wrapping the output in double-quotes:
str = 'Gene_0'
654.times { puts '"%8s "' % str.next! }
And here's the corresponding output, trimmed down to show how the format string maintains the column width as the string value increments:
# >> "  Gene_1 "
...
# >> "  Gene_9 "
# >> " Gene_10 "
...
# >> " Gene_99 "
# >> "Gene_100 "
...
# >> "Gene_654 "
If you want all the white-space to occur at the end of the column, use a left-alignment:
str = 'Gene_0'
654.times { puts '"%-8s "' % str.next! }
Which outputs:
# >> "Gene_1   "
...
# >> "Gene_9   "
# >> "Gene_10  "
...
# >> "Gene_99  "
# >> "Gene_100 "
...
# >> "Gene_654 "
def name
@name || "#{self.first_name} #{self.last_name}"
end
If first name and last name are both empty name is a space " ". How do I rewrite the right-hand side so it's an empty string "" instead of a space " "?
You can just add .strip at the end:
>> ln = 'last' #=> "last"
>> fn = 'first' #=> "first"
>> "#{fn} #{ln}".strip #=> "first last"
>> fn = nil #=> nil
>> ln = nil #=> nil
>> "#{fn} #{ln}".strip #=> ""
def name
@name ||= [first_name, last_name].compact * " "
end
This solution has the advantage of not including a trailing or leading space when either name is nil, and it works in the general case (i.e. for any number of strings).
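One thing to watch: compact only drops nil, so an empty string "" would still leave a stray space. If the names can be empty strings rather than nil, something like the following sketch may be closer to what was asked (written as a standalone method with hypothetical parameters so it can be run on its own):

```ruby
# Hypothetical standalone version: ignores nil, empty and blank parts.
def full_name(first_name, last_name)
  [first_name, last_name].compact.map(&:strip).reject(&:empty?).join(" ")
end

p full_name("Ada", "Lovelace")  # => "Ada Lovelace"
p full_name("Ada", nil)         # => "Ada"
p full_name("", "Lovelace")     # => "Lovelace"
p full_name(nil, "")            # => ""
```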