gsub a backslash to display a unicode character [closed] - ruby

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I would like to gsub one of the backslashes in front of u00E9 so that it will print the Unicode character, which in this case would be an e with an acute accent. Below is the code I am using, which doesn't work.
array1 = [
["V\\u00E9tiver (1978) ", "by L'Artisan Parfumeur", "12"],
["Time for Peace for Her (1999) ", "by Kenzo", "4"],
["Time for Peace for Him (1999) ", "by Kenzo", "7"],
[" Untitled (2009) ", "by Kenzo", "1"],
[" Havana Vanille (2009) ", "by L'Artisan Parfumeur", "10"]
]
array3 = array1.each do |s,a,r|
puts s.gsub(/\\/,"")
end
So what I would like to know is the correct regex to get rid of one of the backslashes in the array. I was thinking that the one I have above would be enough. However, it is not.

You seem to not understand how escape sequences work. Take this string, for example:
s = "V\u00E9tiver (1978)"
The \u00e9 here is a representation of one character é, not a six-char string of \ u 0 0 e 9. So, if you try to replace any part of it (say, the "u"), you'll fail because there's no such character in the string.
s.gsub('u', 'U') # => "Vétiver (1978)"
Whereas in your string
s2 = "V\\u00E9tiver (1978) "
you have a totally different situation. Here the backslash does not start a Unicode escape sequence, but is instead escaped itself. This means that the following characters u00E9 are just regular characters in the string, not part of a Unicode code point escape.
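To make the difference concrete, here is a small sketch that compares the two literals by length and content. The variable names `single` and `double` are only for illustration.

```ruby
single = "V\u00E9tiver"   # the escape is resolved to one character, é
double = "V\\u00E9tiver"  # an escaped backslash followed by the plain text u00E9

p single.length           # 7
p double.length           # 12
p single.include?("u")    # false, there is no letter u in the string
p double.include?("\\")   # true, the string really contains a backslash
```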
Off the top of my head, I don't know of a way to turn "\\u00E9" into "\u00E9" (short of eval, of course). What you should do instead, is fix the source of that data, so that it does not double-escape sequences.
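If fixing the source is not possible, one approach that avoids eval is to match the literal text `\uXXXX` and rebuild the character from its hexadecimal code point. This is only a sketch: it assumes every escape is exactly four hex digits, so escapes such as `\u{1F600}` or surrogate pairs would need extra handling.

```ruby
s2 = "V\\u00E9tiver (1978) "

# Match a literal backslash, the letter u, and four hex digits,
# then convert the captured digits into the corresponding character.
decoded = s2.gsub(/\\u(\h{4})/) { [Regexp.last_match(1).hex].pack("U") }

puts decoded
```

`Array#pack("U")` turns an integer code point into a UTF-8 string, so the block replaces each six-character escape with the single character it names.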

Related

Ruby - Split a String to retrieve a number and a measurement/weight and then convert numberFo [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
I need to split a string, for food products, such as "Chocolate Biscuits 200g"
I need to extract the "200g" from the String and then split this by number and then by the measurement/weight.
So I need the "200" and "g" separately.
I have written a Ruby regex to find the "200g" in the String (sometimes there may be space between the number and measurement so I have included an optional whitespace between them):
([0-9]*[?:\s ]?[a-zA-Z]+)
And I think it works. But now that I have the result ("200g") that it matched from the entire String, I need to split this by number and measurement.
I wrote two regexes to split these:
([0-9]+)
to split by number and
([a-zA-Z]+)
to split by letters.
But the .split method is not working with these.
I get the following error:
undefined method 'split' for #<MatchData "200">
Of course I will need to convert the 200 to a number instead of a String.
Any help is greatly appreciated,
Thank you!
UPDATE:
I have tested the 3 regexes on http://www.rubular.com/.
My issue seems to be around splitting up the result from the first regex into number and measurement.
One way among many is to use String#scan with a regex. See the last sentence of the doc concerning the treatment of capture groups.
str = "Chocolate Biscuits 200g"
r = /
(\d+) # match one or more digits in capture group 1
([[:alpha:]]+) # match one or more alphabetic characters in capture group 2
/x # free-spacing regex definition mode
number, weight = str.scan(r).flatten
#=> ["200", "g"]
number = number.to_i
#=> 200
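Since the question mentions that there may be a space between the number and the unit, here is a variant sketch using String#match with named captures and optional whitespace. The helper name `split_quantity` is hypothetical, and it simply takes the first run of digits followed by letters, which may not suit product names that themselves contain such a pattern.

```ruby
# Hypothetical helper: returns [number, unit], or nil when no quantity is found.
def split_quantity(text)
  m = text.match(/(?<number>\d+)\s*(?<unit>[[:alpha:]]+)/)
  return nil unless m

  [m[:number].to_i, m[:unit]]
end

p split_quantity("Chocolate Biscuits 200g")
p split_quantity("Chocolate Biscuits 200 g")
```

The error in the question arises because match returns a MatchData object rather than a String; indexing it by group name, as above, gives the captured strings directly.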
I'm not an expert in ruby, but I guess that the following code does the deal
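my_string = "Chocolate Biscuits 200g"
weight = 0
unit = ""
# split keeps the text of the capture groups, so words and numbers
# come back as separate elements of the array
parts = my_string.split(/(?:([a-zA-Z]+)|([0-9]+))/)
parts.each do |val|
  if val =~ /\A[0-9]+\Z/
    weight = val.to_i
  elsif weight > 0 && !val.empty?
    # the first non-empty element after the number is the unit
    unit = val
  end
end
p weight
p unit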

Why does my ruby regexp match never stop? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 8 years ago.
I have trouble with a regexp I wrote in Ruby:
reg = /\([^\(|\)]{5,}*\)/i # at least 5 characters between two parentheses
one_string = "( foobarbaz, foobarbaz "
one_string.match(reg) # works fine and returns nil
one_string = "( foobarbaz, foobarbaz, foobarbaz, foobarbaz foobarbaz, foobarbaz, foobarbaz, foobarbaz foobarbaz "
one_string.match(reg) # never stops if one_string is too long
The parenthesis is not closed in one_string. If the string I want to match is long, the match function does not seem to stop. Should I write my regexp differently, or is there a problem with Ruby (the expression is simple)?
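Your regular expression is the problem here.
\( # match '('
[^\(|\)]{5,} # match any character except '(', '|', ')' (at least 5 times)
The * that follows {5,} then repeats a token that is already repeated. Ruby accepts this, but on a long string with no closing parenthesis the engine has to try a very large number of ways to divide the characters between the two quantifiers before it can give up, which is why the match seems never to stop. Also, you can drop the i flag, since the pattern contains no literal letters.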
I am not clear on what you are exactly trying to do here, but you may be looking for something like this.
reg = /\([^()]{5,}\)?/
I still don't fully understand the goal, but if you are just trying to match everything between the parentheses:
reg = /\([^()]*\)?/
Explanation:
\( # match '('
[^()]* # any character except: '(', ')' (0 or more times)
\)? # ')' (optional)
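As a quick check, the sketch below uses a single quantifier on the character class. It assumes, unlike the patterns above, that the closing parenthesis is required; the sample strings and variable names are made up for illustration.

```ruby
# One quantifier on the character class, closing parenthesis required.
reg = /\([^()]{5,}\)/

unclosed = "( " + "foobarbaz, " * 50
closed = "see (foobarbaz, foobarbaz) here"

p unclosed.match(reg)  # nil, returned promptly even for a long string
p closed[reg]          # "(foobarbaz, foobarbaz)"
p "(abc)"[reg]         # nil, fewer than 5 characters inside
```

With only one quantifier there is a single way to consume each character, so a failed match costs roughly linear time from each starting position.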

Check if a string contains a sequence of "udlr" in Ruby [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
If I have a string how can I check if the string contains any sequence of "rldu"? I am really new to ruby, sorry if this is a stupid question to ask.
r- right, l-left, d-down, u-up.
For example:
str = "udlv" #should return false
str = "lrd" #should return true
Assuming the string should entirely be composed of the given four characters in any order
str =~ /^[rldu]+$/
will return an integer or nil that you can use in a conditional. If you want a boolean, use the trick with !!:
!!str.match(/^[rldu]+$/)
If you wanted to check whether the string contains anything other than udlr, then
!("udlv" =~ /[^udlr]/) # => false
!("lrd" =~ /[^udlr]/) # => true
This one does not use a regular expression:
p "udlv".count("^rldu").zero? #=> false
p "lrd".count("^rldu").zero? #=> true
"^rldu" means "every character other than r, l, d and u"
Assuming that by 'any sequence of "rldu"' you mean you want to verify that the string is composed of only the r, l, d, u (any number of times, in any order) and nothing else, a good old regular expression should work just fine:
str =~ /^[udlr]*$/
If you strictly need that to be a boolean value (true/false), then you can prefix it with two exclamation points (double not), like so:
!!(str =~ /^[udlr]*$/)
In most cases, you shouldn't need to do that because Ruby can interpret any value as either true or false anyway.
You can view the documentation for all of String's core methods here. And here is a guide on regular expressions.
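One caveat about the patterns above: in Ruby, ^ and $ match at the start and end of each line, not of the whole string, so a multi-line string can pass if any one line qualifies. A sketch that anchors to the whole string with \A and \z, using a hypothetical helper name `only_moves?` and assuming an empty string should be rejected:

```ruby
# Hypothetical helper: true when the whole string consists of u, d, l, r only.
def only_moves?(str)
  str.match?(/\A[udlr]+\z/)
end

p only_moves?("udlv")      # false
p only_moves?("lrd")       # true
p only_moves?("lrd\nxyz")  # false, the second line is not skipped over
```

String#match? returns true or false directly, so no !! is needed.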

How can I transform array of arrays into one string in Ruby [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I have an array of arrays looking like this:
arr = [[0,0,0,0,0], [0,0,1,0,0], [0,1,0,1,0], [0,0,1,0,0], [0,0,0,0,0]]
and I want to make it look like this:
00000
00100
01010
00100
00000
I tried like this:
arr.each {|a| p a.join(',').gsub(',','')}
but it outputs it like this:
"00000"
"00100"
"01010"
"00100"
"00000"
with quotes (" ") at the beginning and the end of each row. I want it to be one single piece that starts with a quote, then the rows, and ends with a quote, without quoting every single row.
try
arr.map {|a| a.join}.join("\n")
join without an argument:
arr.each{|el| puts el.join}
puts arr.map(&:join)
Calling map goes through the array (the outer one) and for each entry calls the join method; it returns a new array where each entry has been replaced by the result.
The join method of an array calls to_s on each part of an array and concatenates them with an optional separator. Calling join with no arguments uses an empty string (no separator).
Calling puts on an array prints each entry on its own line.
Or, if you want the final results as a single string with embedded newlines:
str = arr.map(&:join).join("\n")
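Putting the pieces together, a short sketch with the array from the question shows the difference between printing the single string with puts and inspecting it with p:

```ruby
arr = [[0,0,0,0,0], [0,0,1,0,0], [0,1,0,1,0], [0,0,1,0,0], [0,0,0,0,0]]

# Join each row into a string, then join the rows with newlines.
str = arr.map(&:join).join("\n")

puts str  # prints five rows with no quotes
p str     # prints one quoted string with \n between the rows
```

The quotes in the question's output come from p, which prints the inspect form of each value; puts prints the string itself.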

Using regex backreference value as numeric value in regex [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
I've got a string that has variable length sections. The length of the section precedes the content of that section. So for example, in the string:
13JOHNSON,STEVE
The first 2 characters define the content length (13), followed by the actual content. I'd like to be able to parse this using named capture groups with a backreference, but I'm not sure it is possible. I was hoping this would work:
(?<length>\d{2})(?<name>.{\k<length>})
But it doesn't. Seems like the backreference isn't interpreted as a number. This works fine though:
(?<length>\d{2})(?<name>.{13})
No, that will not work of course. You need to recompile your regular expression after extracting the first number.
I would recommend using two different expressions: the first one extracts the number, and the second one extracts the text based on the number extracted by the first.
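A minimal sketch of that two-step idea, assuming the length is always the first two characters: the second regular expression is built by interpolating the extracted number into the quantifier.

```ruby
s = "13JOHNSON,STEVE"

# Step 1: extract the two-digit length prefix.
length = s[/\A\d{2}/].to_i

# Step 2: build a second pattern whose quantifier uses that length.
name = s[/\A\d{2}(.{#{length}})/, 1]

p length
p name
```

Interpolation creates a new Regexp each time the line runs, which is the "recompile" step described above.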
You can't do that.
>> s = '13JOHNSON,STEVE'
=> "13JOHNSON,STEVE"
>> length = s[/^\d{2}/].to_i # s[0,2].to_i
=> 13
>> s[2,length]
=> "JOHNSON,STEVE"
This really seems like you're going after this the hard way. I suspect the sample string is not as simple as you said, based on:
I've got a string that has variable length sections. The length of the section precedes the content of that section.
Instead I'd use something like:
str = "13JOHNSON,STEVE 08Blow,Joe 10Smith,John"
str.scan(/\d{2}(\S+)/).flatten # => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John"]
If the string can be split accurately, then there's this:
str.split.map{ |s| s[2..-1] } # => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John"]
If you only have length bytes followed by strings, with nothing between them something like this works:
offset = 0
str.delete!(' ') # => "13JOHNSON,STEVE08Blow,Joe10Smith,John"
str.scan(/\d+/).map{ |l| s = str[offset + 2, l.to_i]; offset += 2 + l.to_i ; s }
# => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John"]
won't work if the names have digits in them – tihom
str = "13JOHNSON,STEVE 08Blow,Joe 10Smith,John 1012345,7890"
str.scan(/\d{2}(\S+)/).flatten # => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John", "12345,7890"]
str.split.map{ |s| s[2..-1] } # => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John", "12345,7890"]
With a minor change and a minor addition, it'll continue to work correctly with strings not containing delimiters:
str.delete!(' ') # => "13JOHNSON,STEVE08Blow,Joe10Smith,John1012345,7890"
offset = 0
str.scan(/\d{2}/).map{ |l| s = str[offset + 2, l.to_i]; offset += 2 + l.to_i ; s }.compact
# => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John", "12345,7890"]
\d{2} grabs the digits in groups of two. Where the digits are a leading two-character length value, as in the OP's sample, the correct thing happens. For an all-numeric "name", scan also returns several false positives; by then the offset is past the end of the string, so those lookups return nil. compact cleans those out.
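An alternative that avoids the false positives altogether is to walk the string by offset, reading a length only where a section starts. This is a sketch with a hypothetical helper name, `parse_sections`, and it assumes the input is well formed: every section begins with exactly two decimal digits.

```ruby
# Hypothetical helper: reads "LLcontent" sections back to back.
def parse_sections(str)
  sections = []
  offset = 0
  while offset < str.length
    # Base 10 is given explicitly so that "08" is not treated as octal.
    length = Integer(str[offset, 2], 10)
    sections << str[offset + 2, length]
    offset += 2 + length
  end
  sections
end

p parse_sections("13JOHNSON,STEVE08Blow,Joe10Smith,John1012345,7890")
```

Because digits inside a name are never interpreted as a length, no compact step is needed; Integer raises ArgumentError if a prefix is not numeric.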
What about this?
a = '13JOHNSON,STEVE'
puts a.match(/(?<length>\d{2})(?<name>(.*),(.*))/)
