Check if a string contains a sequence of "udlr" in Ruby [closed] - ruby

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 9 years ago.
If I have a string how can I check if the string contains any sequence of "rldu"? I am really new to ruby, sorry if this is a stupid question to ask.
r- right, l-left, d-down, u-up.
For example:
str = "udlv" #should return false
str = "lrd" #should return true

Assuming the string should be composed entirely of the given four characters, in any order:
str =~ /^[rldu]+$/
will return an integer or nil that you can use in a conditional. If you want a boolean, use the trick with !!:
!!str.match(/^[rldu]+$/)
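One caveat worth sketching: in Ruby, ^ and $ match at the start and end of each line, so a multi-line string can slip past /^[rldu]+$/. The sketch below assumes the whole string must consist of the four letters and uses \A and \z instead. only_directions? is a hypothetical helper name, and String#match? needs Ruby 2.4 or later.

```ruby
# Hypothetical helper: true only if the entire string is made of r, l, d, u.
def only_directions?(str)
  # \A and \z anchor to the start and end of the whole string,
  # whereas ^ and $ anchor to the start and end of any line.
  str.match?(/\A[rldu]+\z/)
end

only_directions?("lrd")      # => true
only_directions?("udlv")     # => false
only_directions?("lrd\nxyz") # => false, although /^[rldu]+$/ matches the first line
```

If an empty string should also count as valid, replacing + with * would be the corresponding change.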

If you wanted to check whether the string contains anything other than udlr, then
!("udlv" =~ /[^udlr]/) # => false
!("lrd" =~ /[^udlr]/) # => true

This one does not use a regular expression:
p "udlv".count("^rldu").zero? #=> false
p "lrd".count("^rldu").zero? #=> true
"^rldu" means "every character other than r, l, d and u".
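To see what the count-based check does, here is a small sketch. moves_only? is a hypothetical helper, and rejecting the empty string is an assumption about what the asker wants, since count alone accepts it.

```ruby
# String#count with a leading ^ counts the characters NOT in the set.
outside_udlv = "udlv".count("^rldu") # => 1, only "v" falls outside the set
outside_lrd  = "lrd".count("^rldu")  # => 0

# Hypothetical helper: also rejects the empty string, which
# "".count("^rldu").zero? on its own would accept.
def moves_only?(str)
  !str.empty? && str.count("^rldu").zero?
end

moves_only?("lrd")  # => true
moves_only?("udlv") # => false
moves_only?("")     # => false
```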

Assuming that by 'any sequence of "rldu"' you mean you want to verify that the string is composed of only the r, l, d, u (any number of times, in any order) and nothing else, a good old regular expression should work just fine:
str =~ /^[udlr]*$/
If you strictly need that to be a boolean value (true/false), then you can prefix it with two exclamation points (double not), like so:
!!(str =~ /^[udlr]*$/)
In most cases, you shouldn't need to do that because Ruby can interpret any value as either true or false anyway.
The Ruby core documentation covers all of String's methods, and the Regexp documentation serves as a guide to regular expressions.

Related

Ruby - Split a String to retrieve a number and a measurement/weight and then convert numberFo [closed]

Closed 6 years ago.
I need to split a string, for food products, such as "Chocolate Biscuits 200g"
I need to extract the "200g" from the String and then split this by number and then by the measurement/weight.
So I need the "200" and "g" separately.
I have written a Ruby regex to find the "200g" in the String (sometimes there may be space between the number and measurement so I have included an optional whitespace between them):
([0-9]*[?:\s ]?[a-zA-Z]+)
And I think it works. But now that I have the result ("200g") that it matched from the entire String, I need to split this by number and measurement.
I wrote two regexes to split these:
([0-9]+)
to split by number and
([a-zA-Z]+)
to split by letters.
But the .split method is not working with these.
I get the following error:
undefined method 'split' for #<MatchData "200">
Of course I will need to convert the 200 to a number instead of a String.
Any help is greatly appreciated,
Thank you!
UPDATE:
I have tested the 3 regexes on http://www.rubular.com/.
My issue seems to be around splitting up the result from the first regex into number and measurement.
One way among many is to use String#scan with a regex. See the last sentence of the doc concerning the treatment of capture groups.
str = "Chocolate Biscuits 200g"
r = /
(\d+) # match one or more digits in capture group 1
([[:alpha:]]+) # match one or more alphabetic characters in capture group 2
/x # free-spacing regex definition mode
number, weight = str.scan(r).flatten
#=> ["200", "g"]
number = number.to_i
#=> 200
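The question also mentions an optional space between the number and the unit. A minimal sketch that covers that case with named captures follows. parse_quantity is a hypothetical helper, and anchoring the quantity to the end of the string is an assumption based on the sample input.

```ruby
# Hypothetical helper: returns [amount, unit] or nil when no quantity is found.
def parse_quantity(product)
  # Digits, at most one whitespace character, then letters, at the end of the string.
  m = product.match(/(?<amount>\d+)\s?(?<unit>[[:alpha:]]+)\z/)
  return nil unless m

  [m[:amount].to_i, m[:unit]]
end

parse_quantity("Chocolate Biscuits 200g")  # => [200, "g"]
parse_quantity("Chocolate Biscuits 200 g") # => [200, "g"]
parse_quantity("Chocolate Biscuits")       # => nil
```

Calling match on the String returns a MatchData object; the "undefined method 'split'" error in the question comes from calling split on such an object rather than on a String.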
I'm not a Ruby expert, but I think the following code does the job:
my_string = "Chocolate Biscuits 200g"
weight = 0
unit = ""
# The capture groups make split return the matched tokens as well.
string_array = my_string.split(/(?:([a-zA-Z]+)|([0-9]+))/)
string_array.each do |val|
  if val =~ /\A[0-9]+\Z/
    weight = val.to_i # an all-digit token becomes the weight
  elsif weight > 0 && val.length > 0
    unit = val # a non-empty token after the number becomes the unit
  end
end
p weight
p unit

Retrieving multiple matched tokens from a regexp [closed]

Closed 9 years ago.
Here's what I want to happen
> /x(y\d)*/.somefunction('xy1y2y3').each { |x| puts x }
y1
y2
y3
This seems like a pretty natural use of the asterisk in a regexp. I've matched a bunch of tokens and I want them printed out.
The closest I've been able to find is:
/x((y\d)*)/.match('xy1y2y3')[1].scan(/y\d/).each { |x| puts x }
Which is just abysmal.
The issue you are running into has to do with the regex rather than Ruby. You are repeating a capture group rather than capturing a repeated group. You could use
str.scan(/x((?:y\d)*)/)
However, this will capture all of the groups combined as one string. In order to do what you actually want to do (check that the string follows the pattern x followed by these groups) you unfortunately need to do two steps as you are doing in your question. Either that, or you can remove the additional requirement and search only for the pattern as other answers are suggesting.
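To make the difference concrete, the sketch below shows what each form captures and then performs the two-step approach: validate the overall shape, then scan for the tokens. y_tokens is a hypothetical helper name, and returning an empty array for non-matching input is an assumption.

```ruby
input = 'xy1y2y3'

# A repeated capture group keeps only its last repetition.
last_only = input.match(/x(y\d)*/)[1]    # => "y3"

# Capturing the repeated group returns all repetitions as one string.
combined = input.match(/x((?:y\d)*)/)[1] # => "y1y2y3"

# Hypothetical helper: check the pattern x followed by y<digit> groups,
# then extract the individual tokens with scan.
def y_tokens(str)
  return [] unless str.match?(/\Ax(?:y\d)*\z/)

  str.scan(/y\d/)
end

y_tokens('xy1y2y3') # => ["y1", "y2", "y3"]
y_tokens('xy1zz')   # => []
```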
I assume this is what you want:
'xy1y2y3'.gsub(/y\d/) { |s| puts s }
The gsub method accepts a block.
Based on your input and output, this looks about right:
'xy1y2y3'.scan(/y\d/)
# => ["y1", "y2", "y3"]
Use this if you want to print them:
puts 'xy1y2y3'.scan(/y\d/)
# >> y1
# >> y2
# >> y3
String's scan is your friend if you want to look through a string and capture repeating patterns.

Using regex backreference value as numeric value in regex [closed]

Closed 9 years ago.
I've got a string that has variable length sections. The length of the section precedes the content of that section. So for example, in the string:
13JOHNSON,STEVE
The first 2 characters define the content length (13), followed by the actual content. I'd like to be able to parse this using named capture groups with a backreference, but I'm not sure it is possible. I was hoping this would work:
(?<length>\d{2})(?<name>.{\k<length>})
But it doesn't. Seems like the backreference isn't interpreted as a number. This works fine though:
(?<length>\d{2})(?<name>.{13})
No, that will not work. A quantifier such as {13} is fixed when the regular expression is compiled, so you need to build a new regular expression after extracting the number.
I would recommend using two different expressions:
the first extracts the number, and the second extracts the text based on the number extracted by the first.
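A minimal sketch of that two-step idea, assuming a two-digit prefix and content without newlines (since . does not match a newline by default):

```ruby
s = '13JOHNSON,STEVE'

# Step 1: extract the two-digit length prefix.
length = s[/\A\d{2}/].to_i

# Step 2: interpolate the number into a second regex.
# For this input it compiles to /\A\d{2}(.{13})/.
name = s[/\A\d{2}(.{#{length}})/, 1]

puts length # prints 13
puts name   # prints JOHNSON,STEVE
```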
You can't do that.
>> s = '13JOHNSON,STEVE'
=> "13JOHNSON,STEVE"
>> length = s[/^\d{2}/].to_i # s[0,2].to_i
=> 13
>> s[2,length]
=> "JOHNSON,STEVE"
This really seems like you're going after this the hard way. I suspect the sample string is not as simple as you said, based on:
I've got a string that has variable length sections. The length of the section precedes the content of that section.
Instead I'd use something like:
str = "13JOHNSON,STEVE 08Blow,Joe 10Smith,John"
str.scan(/\d{2}(\S+)/).flatten # => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John"]
If the string can be split accurately, then there's this:
str.split.map{ |s| s[2..-1] } # => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John"]
If you only have length bytes followed by strings, with nothing between them, something like this works:
offset = 0
str.delete!(' ') # => "13JOHNSON,STEVE08Blow,Joe10Smith,John"
str.scan(/\d+/).map{ |l| s = str[offset + 2, l.to_i]; offset += 2 + l.to_i ; s }
# => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John"]
won't work if the names have digits in them – tihom
str = "13JOHNSON,STEVE 08Blow,Joe 10Smith,John 1012345,7890"
str.scan(/\d{2}(\S+)/).flatten # => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John", "12345,7890"]
str.split.map{ |s| s[2..-1] } # => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John", "12345,7890"]
With a minor change and a minor addition, it'll continue to work correctly with strings not containing delimiters:
str.delete!(' ') # => "13JOHNSON,STEVE08Blow,Joe10Smith,John1012345,7890"
offset = 0
str.scan(/\d{2}/).map{ |l| s = str[offset + 2, l.to_i]; offset += 2 + l.to_i ; s }.compact
# => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John", "12345,7890"]
\d{2} grabs the digits in groups of two. For names preceded by a two-character length value, as in the OP's sample, the correct thing happens. For an all-digit "name", scan also returns several false positives. In this sample those extra entries index past the end of the string and yield nil values, which compact cleans out.
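A different technique that avoids the false positives altogether is to walk the string sequentially and read a length only where a prefix is expected. This is a sketch, not the answerer's method. length_prefixed_fields is a hypothetical helper, and it assumes every field starts with exactly two decimal digits and that no delimiters remain in the string.

```ruby
# Hypothetical helper: splits "13JOHNSON,STEVE08Blow,Joe" style input into fields.
def length_prefixed_fields(str)
  fields = []
  offset = 0
  while offset < str.length
    # Base 10 is given explicitly so that a prefix like "08" is not read as octal.
    # Integer raises ArgumentError if the prefix is not numeric.
    length = Integer(str[offset, 2], 10)
    fields << str[offset + 2, length]
    offset += 2 + length
  end
  fields
end

length_prefixed_fields("13JOHNSON,STEVE08Blow,Joe10Smith,John1012345,7890")
# => ["JOHNSON,STEVE", "Blow,Joe", "Smith,John", "12345,7890"]
```

Because digits inside a field are never interpreted as lengths, an all-digit name does not need any cleanup afterwards.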
What about this?
a = '13JOHNSON,STEVE'
puts a.match /(?<length>\d{2})(?<name>(.*),(.*))/

ruby regular expression match string between last two delimiters [closed]

Closed 9 years ago.
I need to match everything between last two '/' in a regex
for example: for string tom/jack/sam/jill/ ---> I need to match jill
and in that case also need to match tom/jack/sam (without the last '/')
Thoughts appreciated!
1)
str = "tom/jack/sam/jill/"
*the_rest, last = str.split("/")
the_rest = the_rest.join("/")
puts last, the_rest
--output:--
jill
tom/jack/sam
2)
str = "tom/jack/sam/jill/"
md = str.match %r{
(.*) #Any character 0 or more times(greedy), captured in group 1
/ #followed by a forward slash
([^/]+) #followed by not a forward slash, one or more times, captured in group 2
}x #Ignore whitespace and comments in regex
puts md[2], md[1] if md
--output:--
jill
tom/jack/sam
If, given the string tom/jack/sam/jill/, you want to extract two groups, jill and tom/jack/sam/,
the regexp you need is: ^((?:[^\/]+\/)+)([^\/]+)\/$.
Note that this regexp does not accept a / at the beginning of the string and requires a / at the end of the string.
Take a look: http://rubular.com/r/mxBYtC31N2
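For comparison, a regex-free sketch using String#chomp and String#rpartition, assuming at most one trailing slash needs to be dropped:

```ruby
str = "tom/jack/sam/jill/"

# chomp("/") drops one trailing slash; rpartition splits at the last remaining one.
the_rest, _separator, last = str.chomp("/").rpartition("/")

puts last     # prints jill
puts the_rest # prints tom/jack/sam
```

When the string contains no slash at all, rpartition returns two empty strings followed by the whole string, so last holds the input and the_rest is empty.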

Ruby gsub / regex with several arguments [duplicate]

This question already has answers here:
Match a string against multiple patterns
(2 answers)
Closed 8 years ago.
I'm new to ruby and I'm trying to solve a problem.
I'm parsing through several text field where I want to remove the header which has different values. It works fine when the header always is the same:
variable = variable.gsub(/(^Header_1:$)/, '')
But when I put in several arguments it doesn't work:
variable = variable.gsub(/(^Header_1$)/ || /(^Header_2$)/ || /(^Header_3$)/ || /(^Header_4$)/ || /^:$/, '')
You can use Regexp.union:
regex = Regexp.union(
/^Header_1/,
/^Header_2/,
/^Header_3/,
/^Header_4/,
/^:$/
)
variable.gsub(regex, '')
Please note that ^something$ only matches when something is alone on its line; it will not match if the line contains anything else.
That is because in Ruby ^ matches at the beginning of a line and $ at the end of a line.
So I intentionally removed the $.
Also, you do not need the parentheses when you only want to remove the matched string.
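As for why the || version in the question fails: || returns its first truthy operand, and a Regexp object is always truthy, so only the first pattern ever reaches gsub. A short sketch with made-up sample text:

```ruby
text = "Header_1\nfirst body\nHeader_3\nsecond body\n"

first  = /^Header_1$/
second = /^Header_3$/

# || simply returns `first`, because a Regexp is truthy.
first_only = first || second

# Regexp.union builds one pattern that matches either alternative.
either = Regexp.union(first, second)

text.gsub(first_only, '') # => "\nfirst body\nHeader_3\nsecond body\n"
text.gsub(either, '')     # => "\nfirst body\n\nsecond body\n"
```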
You can also use it like this:
headers = %w[Header_1 Header_2 Header_3]
regex = Regexp.union(*headers.map{|s| /^#{s}/}, /^\:$/, /etc/)
variable.gsub(regex, '')
And of course you can remove headers without explicitly defining them.
Most likely there is whitespace after the header?
If so, it can be as simple as:
variable = "Header_1 something else"
puts variable.gsub(/(^Header[^\s]*)?(.*)/, '\2')
#=> something else
variable = "Header_BLAH something else"
puts variable.gsub(/(^Header[^\s]*)?(.*)/, '\2')
#=> something else
Just use a proper regexp:
variable.gsub(/^(Header_1|Header_2|Header_3|Header_4|:)$/, '')
If the header is always the same format of Header_n, where n is some integer value, then you can simplify your regex greatly:
/Header_\d+/
will find every one of these:
%w[Header_1 Header_2 Header_3].grep(/Header_\d+/)
[
[0] "Header_1",
[1] "Header_2",
[2] "Header_3"
]
Tweaking it to handle finding words, not substrings:
/^Header_\d+$/
or:
/\bHeader_\d+\b/
As mentioned, using Regexp.union is a good start, but, used blindly, can result in very slow or inefficient patterns, so think ahead and help out the engine by giving it useful sub-patterns to work with:
values = %w[foo bar]
/Header_(?:\d+|#{ values.join('|') })/
=> /Header_(?:\d+|foo|bar)/
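When the values come from data rather than from literals, they may contain regex metacharacters. A hedged sketch that escapes each value before joining follows. The value "a.b" is made up for illustration, and the \b word boundaries are an assumption about matching whole words.

```ruby
values = %w[foo bar a.b]

# Regexp.escape turns "a.b" into "a\.b" so the dot matches literally.
alternatives = values.map { |v| Regexp.escape(v) }.join('|')
pattern = /\bHeader_(?:\d+|#{alternatives})\b/

%w[Header_12 Header_foo Header_a.b Header_axb].grep(pattern)
# => ["Header_12", "Header_foo", "Header_a.b"]
```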
Unfortunately, Ruby doesn't have the equivalent to Perl's Regexp::Assemble module, which can build highly optimized patterns from big lists of words. Search here on Stack Overflow for examples of what it can do. For instance:
use Regexp::Assemble;
my @values = ('Header_1', 'Header_2', 'foo', 'bar', 'Header_3');
my $ra = Regexp::Assemble->new;
foreach (@values) {
$ra->add($_);
}
print $ra->re, "\n";
=> (?-xism:(?:Header_[123]|bar|foo))
