String comparison by splitting in jruby - ruby

I am trying to get the numeric value out of a string that looks like "ABCDEFG [1]"; here I want to get 1. I tried using the split method on the string but did not succeed. Any solution is appreciated. Thanks!

Here you go:
def get_numeric_value(str)
  # Match a number surrounded by square brackets
  matchdata = str.match(/\[\d+\]/)
  if matchdata
    # Get the first matched substring, and remove the first and last characters
    # (which will be the square brackets)
    number = matchdata[0][1..-2]
    number.to_i
  else
    nil
  end
end
get_numeric_value "ABCDEFG [1]" #=> 1
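As a variant, the brackets can be left out of the result by putting the digits in a capture group and asking String#[] for that group. This is only a sketch; the method name bracketed_number is made up for illustration.

```ruby
# Hypothetical helper (name chosen for this example only).
# Returns the integer inside the first "[digits]" group, or nil if there is none.
def bracketed_number(str)
  digits = str[/\[(\d+)\]/, 1] # capture group 1 holds only the digits
  digits && digits.to_i
end

bracketed_number("ABCDEFG [1]") #=> 1
bracketed_number("ABCDEFG")     #=> nil
```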

Related

Match returns another output

After entering the input value "https://youtu.be/KMBBjzp5hdc", the code returns the output value "https://youtu.be/".
str = gets.chomp.to_s
puts /.*[=\/]/.match(str)
I do not understand why, as I would expect https:/
Thanks for any advice!
[...] the code returns output value "https://youtu.be/" [...] I do not understand why as i would expect https:/
Your regexp /.*[=\/]/ matches:
.* zero or more characters
[=\/] followed by a = or / character
In your example string, there are 3 candidates that end with a / character: (and none ending with =)
https:/
https://
https://youtu.be/
Repetition like * is greedy by default, i.e. it matches as many characters as it can. From the 3 options above, it matches the longest one which is https://youtu.be/.
You can append a ? to make the repetition lazy, which results in the shortest match:
"https://youtu.be/KMBBjzp5hdc".match(/.*?[=\/]/)
#=> #<MatchData "https:/">
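To see the two behaviours side by side, here is a small sketch that runs the greedy and the lazy pattern against the same URL from the question (the variable names are just for illustration):

```ruby
url = "https://youtu.be/KMBBjzp5hdc"

greedy = url[/.*[=\/]/]  # * takes as many characters as it can
lazy   = url[/.*?[=\/]/] # *? takes as few characters as it can

puts greedy # prints https://youtu.be/
puts lazy   # prints https:/
```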
str = "https://youtu.be/KMBBjzp5hdc"
matches = str.match(/(https:\/\/youtu\.be\/)(.+)/)
matches[1]
Outputs:
"https://youtu.be/"

Ruby : get value between parenthesis (without those parenthesis)

I have strings like "untitled", "untitled(1)" or "untitled(2)".
I want to get the last integer value between parentheses, if there is one. So far I have tried a lot of regexes; the ones that make sense to me (I am new to regex) look like this:
number= string[/\([0-9]+\)/]
number= string[/\(([0-9]+)\)/]
but they still return the value with the parentheses.
If there is no left-then-right pair of parentheses, getting an empty string (or nil) would be nice. For a case such as "untitled($)", getting the '$' char, nil, or an empty string would do the trick. For "untitled(3) animal(4)", I want to get 4.
I have looked at a lot of topics about how to do this, but it never seems to work. What am I missing here?
/(?<=\()\w+(?=\)$)/ matches one or more word characters (letter, number, underscore) within parenthesis, right before the end of line:
words = %w[
untitled
untitled(1)
untitled(2)
untitled(foo)
unti(tle)d
]
p words.map { |word| word[/(?<=\()\w+(?=\)$)/] }
# => [nil, "1", "2", "foo", nil]
When you pass a Regexp as the parameter to String#[], you can optionally pass the index of a capture group to extract just that group. From the documentation:
If a Regexp is supplied, the matching portion of the string is returned. If a capture, which may be a capture group index or name, follows the regular expression, that component of the MatchData is returned instead.
string = "untitled(1)"
number = string[/\(([0-9]+)\)/, 1]
puts number
#=> 1
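The question also asks for the last parenthesized integer, as in "untitled(3) animal(4)". One possible sketch uses String#scan to collect every capture and then takes the last one. The helper name last_parenthesized_number is hypothetical, and it only accepts digits, so "untitled($)" gives nil.

```ruby
# Hypothetical helper (name chosen for this example only).
# Returns the digits of the last "(digits)" group as a string, or nil.
def last_parenthesized_number(string)
  # scan with one capture group returns an array of one-element arrays
  string.scan(/\((\d+)\)/).flatten.last
end

last_parenthesized_number("untitled(3) animal(4)") #=> "4"
last_parenthesized_number("untitled(1)")           #=> "1"
last_parenthesized_number("untitled")              #=> nil
```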

Finding the first duplicate character in the string Ruby

I am trying to call the first duplicate character in my string in Ruby.
I have defined an input string using gets.
How do I call the first duplicate character in the string?
This is my code so far.
string = "#{gets}"
print string
How do I call a character from this string?
Edit 1:
This is the code I have now; it prints "no duplicates" 26 times. I think my if statement is written incorrectly.
string = "abcade"
puts string
for i in ('a'..'z')
if string =~ /(.)\1/
puts string.chars.group_by{|c| c}.find{|el| el[1].size >1}[0]
else
puts "no duplicates"
end
end
My second puts statement works but with the for and if loops, it returns no duplicates 26 times whatever the string is.
The following returns the index of the first duplicate character:
the_string =~ /(.)\1/
Example:
'1234556' =~ /(.)\1/
=> 4
To get the duplicate character itself, use $1:
$1
=> "5"
Example usage in an if statement:
if my_string =~ /(.)\1/
# found duplicate; potentially do something with $1
else
# there is no match
end
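Note that /(.)\1/ only finds a character that is immediately followed by itself, which is why a string like "abcade" from the question never matches. If you would rather not rely on the $1 global, a sketch using the MatchData object could look like this (the method name is made up for illustration):

```ruby
# Hypothetical helper (name chosen for this example only).
# Returns the first character that is immediately repeated, or nil.
def first_adjacent_duplicate(str)
  m = str.match(/(.)\1/)
  m && m[1] # m is nil when nothing matched
end

first_adjacent_duplicate("1234556") #=> "5"
first_adjacent_duplicate("abcade")  #=> nil (the two a's are not adjacent)
```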
s.chars.map { |c| [c, s.count(c)] }.drop_while{|i| i[1] <= 1}.first[0]
With the refined form from Cary Swoveland :
s.each_char.find { |c| s.count(c) > 1 }
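The method below might be useful for finding the most frequently repeated word in a string (despite its name, it returns the first word with the highest count, even when no word repeats):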
def firstRepeatedWord(string)
h_data = Hash.new(0)
string.split(" ").each{|x| h_data[x] +=1}
h_data.key(h_data.values.max)
end
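If what you want is the first word that has already appeared earlier in the string, rather than the most frequent word, a Set-based sketch is one option. The method name first_repeated_word is an assumption made for this example.

```ruby
require 'set'

# Hypothetical helper (name chosen for this example only).
# Returns the first word that has been seen earlier in the string, or nil.
def first_repeated_word(string)
  seen = Set.new
  # Set#add? returns nil when the element is already present
  string.split.find { |word| !seen.add?(word) }
end

first_repeated_word("a b c b a") #=> "b"
first_repeated_word("a b c")     #=> nil
```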
I believe the question can be interpreted in either of two ways (neither involving the first pair of adjacent characters that are the same) and offer solutions to each.
Find the first character in the string that is preceded by the same character
I don't believe we can use a regex for this (but would love to be proved wrong). I would use the method suggested in a comment by @DaveNewton:
require 'set'
def first_repeat_char(str)
str.each_char.with_object(Set.new) { |c,s| return c unless s.add?(c) }
nil
end
first_repeat_char("abcdebf") #=> b
first_repeat_char("abcdcbe") #=> c
first_repeat_char("abcdefg") #=> nil
Find the first character in the string that appears more than once
r = /
    (.)  # match any character in capture group #1
    .*   # match any character zero or more times
    ?    # do the preceding lazily
    \K   # forget everything matched so far
    \1   # match the contents of capture group 1
    /x
"abcdebf"[r] #=> b
"abccdeb"[r] #=> b
"abcdefg"[r] #=> nil
This regex is fine, but produces the warning, "regular expression has redundant nested repeat operator '*'". You can disregard the warning or suppress it by doing something clunky, like:
r = /([^#{0.chr}]).*?\K\1/
where ([^#{0.chr}]) means "match any character other than 0.chr in capture group 1".
Note that a positive lookbehind cannot be used here, as they cannot contain variable-length matches (i.e., .*).
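A positive lookahead, on the other hand, may contain a variable-length pattern, so a possible alternative sketch is to match a character only if the same character occurs again later in the string. The constant name is just for illustration.

```ruby
# Match a character that is followed, anywhere later on the same line,
# by another copy of itself. The lookahead consumes nothing, so the
# match is the single captured character.
FIRST_REPEATED_CHAR = /(.)(?=.*\1)/

"abcdebf"[FIRST_REPEATED_CHAR] #=> "b"
"abccdeb"[FIRST_REPEATED_CHAR] #=> "b"
"abcdefg"[FIRST_REPEATED_CHAR] #=> nil
```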
You could probably make your string an array and use detect. This should return the first char where the count is > 1.
string.split("").detect {|x| string.count(x) > 1}
I'll use positive lookahead with String#[] method :
"abcccddde"[/(.)(?=\1)/] #=> c
As a variant:
str = "abcdeff"
p str.chars.group_by{|c| c}.find{|el| el[1].size > 1}[0]
prints "f"

Use regular expression to fetch 3 groups from string

This is my expected result:
input one string and get three strings back.
I have no idea how to do this with a regex in Ruby.
This is my rough idea:
match(/(.*?)(_)(.*?)(\d+)/)
Input and expected output
# "R224_OO2003" => R224, OO, 2003
# "R2241_OOP2003" => R2241, OOP, 2003
If the example description I gave in my comment on the question is correct, you need a very straightforward regex:
r = /(.+)_(.+)(\d{4})/
Then:
"R224_OO2003".scan(r).flatten #=> ["R224", "OO", "2003"]
"R2241_OOP2003".scan(r).flatten #=> ["R2241", "OOP", "2003"]
Assuming that your three parts consist of (R and one or more digits), then an underbar, then (one or more non-whitespace characters), before finally (a 4-digit numeric date), then your regex could be something like this:
^(R\d+)_(\S+)(\d{4})$
The ^ indicates start of string, and the $ indicates end of string. \d+ indicates one or more digits, while \S+ says one or more non-whitespace characters. The \d{4} says exactly four digits.
To recover data from the matches, you could either use the pre-defined globals that line up with your groups, or you could use named captures.
To use the match globals just use $1, $2, and $3. In general, you can figure out the number to use by counting the left parentheses of the specific group.
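A minimal sketch of the globals approach, using the anchored regex from above (the method name split_code is hypothetical):

```ruby
# Hypothetical helper (name chosen for this example only).
# Returns the three captured parts as an array, or nil if the string
# does not have the expected shape.
def split_code(str)
  if str =~ /^(R\d+)_(\S+)(\d{4})$/
    [$1, $2, $3] # $1..$3 line up with the three left parentheses
  end
end

split_code("R2241_OOP2003") #=> ["R2241", "OOP", "2003"]
split_code("R224_OO2003")   #=> ["R224", "OO", "2003"]
split_code("nonsense")      #=> nil
```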
To use named captures, include ?<name> right after the left paren of a particular group. For example:
x = "R2241_OOP2003"
match_data = /^(?<first>R\d+)_(?<second>\S+)(?<third>\d{4})$/.match(x)
puts match_data['first'], match_data['second'], match_data['third']
yields
R2241
OOP
2003
as expected.
As long as your pattern covers all possibilities, then you just need to use the match object to return the 3 strings:
my_match = "R224_OO2003".match(/(.*?)(_)(.*?)(\d+)/)
#=> #<MatchData "R224_OO2003" 1:"R224" 2:"_" 3:"OO" 4:"2003">
puts my_match[0] #=> "R224_OO2003"
puts my_match[1] #=> "R224"
puts my_match[2] #=> "_"
puts my_match[3] #=> "OO"
puts my_match[4] #=> "2003"
A MatchData object holds the capture groups starting at index [1]. As you can see, index [0] returns the entire matched string. If you don't want to capture the "_", you can leave its parentheses out.
Also, I'm not sure you are getting what you want with the part:
(.*?)
Here .* means zero or more of any character, and the trailing ? makes that repetition lazy, so it matches as few characters as possible (possibly none).
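For illustration, here is a sketch of the same pattern with the parentheses around "_" removed, using MatchData#captures to get just the three groups. The method name three_parts is made up for this example.

```ruby
# Hypothetical helper (name chosen for this example only).
# Returns the three captured strings, or nil when the pattern does not match.
def three_parts(str)
  m = str.match(/(.*?)_(.*?)(\d+)/)
  m && m.captures # captures omits index 0 (the whole match)
end

three_parts("R224_OO2003")   #=> ["R224", "OO", "2003"]
three_parts("R2241_OOP2003") #=> ["R2241", "OOP", "2003"]
```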

regex for a pattern at end of string

I have a string which looks like:
hello/world/1.9.2-some-text
hello/world/2.0.2-some-text
hello/world/2.11.0
Through regex I want to get the string after last '/' and until end of line i.e. in above examples output should be 1.9.2-some-text, 2.0.2-some-text, 2.11.0
I tried this - ^(.+)\/(.+)$ which returns me an array of which first object is "hello/world" and 2nd object is "1.9.2-some-text"
Is there a way to just get "1.9.2-some-text" as the output?
Try using a negated character class ([^…]) like this:
[^\/]+$
This will match one or more of any character other than / followed by the end of the string.
You can use a negated character class here.
'hello/world/1.9.2-some-text'.match(Regexp.new('[^/]+$'))
# => #<MatchData "1.9.2-some-text">
Meaning any character except: / (1 or more times) followed by the end of the string.
Although, the simplest way would be to split the string.
'hello/world/1.9.2-some-text'.split('/').last
# => "1.9.2-some-text"
OR
'hello/world/1.9.2-some-text'.split('/')[-1]
# => "1.9.2-some-text"
If you do not need to use a regex, the ordinary way of doing this is:
File.basename("hello/world/1.9.2-some-text")
#=> "1.9.2-some-text"
This is one way:
s = 'hello/world/1.9.2-some-text
hello/world/2.0.2-some-text
hello/world/2.11.0'
s.lines.map { |l| l[/.*\/(.*)/,1] }
#=> ["1.9.2-some-text", "2.0.2-some-text", "2.11.0"]
You said, "in above examples output should be 1.9.2-some-text, 2.0.2-some-text, 2.11.0". That's neither a string nor an array, so I assumed you wanted an array. If you want a string, tack .join(', ') onto the end.
Regexes are greedy by default, so .*\/ will match all characters up to and including the last / in each line. The second argument, 1, returns the contents of capture group 1, (.*).
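Another regex-free possibility is String#rpartition, which splits a string around the last occurrence of a separator. This is only a sketch; the method name after_last_slash is hypothetical.

```ruby
# Hypothetical helper (name chosen for this example only).
# Returns everything after the last "/", or the whole string if there is no "/".
def after_last_slash(path)
  # rpartition returns [before, separator, after]
  path.rpartition('/').last
end

after_last_slash("hello/world/1.9.2-some-text") #=> "1.9.2-some-text"
after_last_slash("hello/world/2.11.0")          #=> "2.11.0"
```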
