Regex to match exact word in string - ruby

I've looked around but haven't been able to find a working solution to my problem.
I have an array of two strings, input, and want to test which element of the array contains the exact word Test.
One thing I have tried (among numerous other attempts):
input = ["Test's string", "Test string"]
# Alternative input array that it needs to work on:
# ["Testing string", "some Test string"]
substring = "Test"
if (input[0].match(/\b#{substring}\b/))
puts "Test 0 "
# Do something...
elsif (input[1].match(/\b#{substring}\b/))
puts "Test 1"
# Do something different...
end
The desired result is a print of "Test 1". The input can be more complex but overall I am looking for a way to find an exact match of a substring in a longer string.
I feel like this should be a rather trivial regex but I haven't been able to come up with the correct pattern. Any help would be greatly appreciated!

The following code may be what you are looking for.
input = ["Testing string", "Test string"]
substring = "Test"
if input[0].match(/(?:^|\s)#{substring}(?:\s|$)/)
puts "Test 0"
elsif input[1].match(/(?:^|\s)#{substring}(?:\s|$)/)
puts "Test 1"
end
The meaning of the pattern /(?:^|\s)#{substring}(?:\s|$)/ is:
(?:^|\s) : the left side of the substring is the beginning of a line (^) or whitespace,
#{substring} : the substring is matched exactly,
(?:\s|$) : the right side of the substring is whitespace or the end of a line ($).
A group is needed for the alternation: inside a character class such as [^|\s], a leading ^ negates the class, and | and $ are literal characters.
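To see why \b alone is not enough for the original input, here is a minimal sketch of a whitespace-delimited match using lookarounds. The helper name whole_word_index is hypothetical, and treating "exact" as "delimited by whitespace or the ends of the string" is an assumption about what the asker wants.

```ruby
# Hypothetical helper: index of the first string that contains `word`
# delimited by whitespace (or the string's ends) on both sides.
def whole_word_index(strings, word)
  # (?<!\S) : not preceded by a non-whitespace character
  # (?!\S)  : not followed by a non-whitespace character
  pattern = /(?<!\S)#{Regexp.escape(word)}(?!\S)/
  strings.index { |s| s.match?(pattern) }
end

# \b sees a boundary between "t" and "'", so "Test's" still matches:
p "Test's string".match?(/\bTest\b/)  # prints true

p whole_word_index(["Test's string", "Test string"], "Test")        # prints 1
p whole_word_index(["Testing string", "some Test string"], "Test")  # prints 1
```

Whether punctuation such as a trailing comma should also count as a delimiter depends on the data; the lookarounds would need to be loosened in that case.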

One way to do that is as follows:
input = ["Testing string", "Test"]
"Test #{ input.index { |s| s[/\bTest\b/] } }"
#=> "Test 1"
input = ["Test", "Testing string"]
"Test #{ input.index { |s| s[/\bTest\b/] } }"
#=> "Test 0"
In the regex, \b denotes a word boundary.
Maybe you want a method to return the index of the first element of input that contains the word? That could be:
def matching_index(input, word)
input.index { |s| s[/\b#{word}\b/i] }
end
input = ["Testing string", "Test"]
matching_index(input, "Test") #=> 1
matching_index(input, "test") #=> 1
matching_index(input, "Testing") #=> 0
matching_index(input, "Testy") #=> nil
Then you could use it like this, for example:
word = 'Test'
puts "The matching element for '#{word}' is at index #{ matching_index(input, word) }"
#=> The matching element for 'Test' is at index 1
word = "Testing"
puts "The matching element for '#{word}' is '#{ input[matching_index(input, word)] }'"
#=> The matching element for 'Testing' is 'Testing string'

The problem is with your bounding. In your original question, the word Test matches the first string because the position before the ' counts as a \b word boundary. It's a perfect match, and the code is correctly responding with "Test 0". You need to determine how you'll terminate your search. If your input contains special characters, I don't think the regex will work properly: /\bTest my $money.*/ will never match because the $ in your substring is treated as an end-of-line anchor.
What happens if you have multiple matches in your input array? Do you want to do something to all of them or just the first one?
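The concern about special characters can be handled with Regexp.escape, which backslash-escapes regex metacharacters in a string. The sketch below shows the difference; the helper name literal_pattern and the sample strings are made up for illustration.

```ruby
# Hypothetical helper: build a pattern that treats `text` literally.
def literal_pattern(text)
  Regexp.new(Regexp.escape(text))
end

phrase = "Test my $money"
line   = "Please Test my $money now"

# Unescaped, the $ is an end-of-line anchor, so "money" can never follow it.
p line.match?(/#{phrase}/)              # prints false

# Escaped, the $ is an ordinary character.
p line.match?(literal_pattern(phrase))  # prints true
```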

Related

Searching for a string using user input

My goal is to have the user enter a string to find a string in an array. I'm using String's include? method to search, but it's returning the wrong data.
puts "Enter Artist(all or partial name):"
search_artist = gets.chomp
list.each do |x|
if x.artist.include? (search_artist)
num += 1
x.to_s
else
puts "none found"
end
end
search_artist = 'a' (because I'm looking for AARON...)
returns:
AARON KDL NOT VALID 2
ZAC CHICKEN ROCK 1289
2 records found
should be:
AARON KDL NOT VALID 2
1 record found
The problem is that both strings include 'a' somewhere in the string.
How do I search from the beginning of the string?
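There's a really easy way of doing this with grep, assuming list is an array of artist-name strings:
matches = list.grep(/#{Regexp.escape(search_artist)}/)
if matches.empty?
puts "none found"
end
Note that passing a plain string to grep would only select elements equal to that string, so the input is wrapped in a regular expression.
To count the number of matches you can just call matches.length.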
If you want a case insensitive match, then you want this:
matches = list.grep(Regexp.new(search_artist, Regexp::IGNORECASE))
The Regexp::IGNORECASE flag makes the regular expression case-insensitive, so it matches more broadly.
Edit: To anchor this search to the beginning of the string:
matches = list.grep(Regexp.new('\A' + Regexp.escape(search_artist), Regexp::IGNORECASE))
Here \A anchors the match to the beginning of the string, and Regexp.escape keeps any special characters in the input literal.
Another option, if the search is limited to the first letter, case-insensitive:
found = list.select { |x| [search_artist.downcase, search_artist.upcase].include? x[0] }
found.each { |e| puts e }
puts "Found #{found.size} records"
Without regular expressions:
puts "Enter Artist(all or partial name):"
search_artist = gets.chomp
# Use braces here: with do...end the block would bind to puts, not to select.
matches = list.select { |x| x.artist.start_with?(search_artist) }
puts matches
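Putting the pieces of this thread together, a case-insensitive prefix search over objects with an artist attribute might look like the sketch below. The Track struct, its fields, and the helper name find_by_artist_prefix are assumptions, since the question does not show how the list elements are defined.

```ruby
# Hypothetical record type standing in for the asker's objects.
Track = Struct.new(:artist, :title)

# Case-insensitive prefix search on the artist name.
def find_by_artist_prefix(list, prefix)
  needle = prefix.downcase
  list.select { |track| track.artist.downcase.start_with?(needle) }
end

list = [Track.new("AARON KDL", "NOT VALID"), Track.new("ZAC CHICKEN", "ROCK")]
matches = find_by_artist_prefix(list, "a")

if matches.empty?
  puts "none found"
else
  matches.each { |t| puts "#{t.artist} #{t.title}" }
  puts "#{matches.size} record(s) found"
end
```

Printing "none found" once after the search, rather than inside the loop, avoids repeating the message for every non-matching element.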

Regular expression to match exact substring at the beginning of string

I have one substring to be matched exactly at the beginning of source string.
source_string = "This is mat. This is cat."
substring1 = "This is"
substring2 = "That is"
source_string.match(/^(#{substring1}|#{substring2})$/)
This is what I tried. It should work like this: if exactly 'This is' or 'That is' is at the beginning of the string, it should match, no matter what comes after those substrings in source_string. My code returns nil even though 'This is' is present.
I would not use a regex:
[substring1, substring2].any? { |sub| source_string.start_with?(sub) }
#=> true
Remove $ at the end of the regular expression pattern.
source_string.match(/^(#{substring1}|#{substring2})$/)
                                                   ↑
Appending $ requires the line to end immediately after This is or That is. You only need ^ at the beginning.
source_string = "This is mat. This is cat."
substring1 = "This is"
substring2 = "That is"
source_string.match(/^(#{substring1}|#{substring2})/)
# => #<MatchData "This is" 1:"This is">
While @falsetru is right about the core problem, the regexp is actually still wrong. Since the goal is to match a pattern at the beginning of the source string, not at the beginning of each line, the \A anchor should be used (see Regexp::Anchors for details):
source_string = <<-STR
Not to be matched.
This is cat.
STR
source_string.match(/^This is/) # this should not be matched!
#⇒ #<MatchData "This is">
source_string.match(/\AThis is/)
#⇒ nil
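Combining the two answers, one possible sketch anchors with \A and lets Regexp.union build the alternation, which also escapes any special characters in the substrings. The helper name starts_with_any? is hypothetical.

```ruby
# Hypothetical helper: does `source` begin with any of `prefixes`?
def starts_with_any?(source, prefixes)
  # Regexp.union escapes plain strings and joins them with |.
  # Interpolating a Regexp wraps it in a group, so \A applies to every alternative.
  pattern = /\A#{Regexp.union(prefixes)}/
  source.match?(pattern)
end

prefixes = ["This is", "That is"]

p starts_with_any?("This is mat. This is cat.", prefixes)            # prints true
p starts_with_any?("Not to be matched.\nThis is cat.\n", prefixes)   # prints false
```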

My instance variable isn't holding its value

Okay, so I'm building something that takes a text file and breaks it up into multiple sections that are further divided into entries, and then puts <a> tags around part of each entry. I have an instance variable, @section_name, that I need to use in making the link. The problem is, @section_name seems to lose its value if I look at it wrong. Some code:
def find_entries
@sections.each do |section|
@entries = section.to_s.shatter(/(some) RegEx/)
@section_name = $1.to_s
puts @section_name
add_links
end
end
def add_links
puts "looking for #{@section_name} in #{@section_hash}"
section_link = @section_hash.fetch(@section_name)
end
If I comment out the call to add_links, it spits out the names of all the sections, but if I include it, I just get:
looking for in {"contents" => "of", "the" => "hash"}
Any help is much appreciated!
$1 is a global variable which can be used in later code. $n contains the n-th (...) capture of the last match:
"foobar".sub(/foo(.*)/, '\1\1')
puts "The matching word was #{$1}" #=> The matching word was bar
"123 456 789" =~ /(\d\d)(\d)/
p [$1, $2] #=> ["12", "3"]
So I think the line @entries = section.to_s.shatter(/(some) RegEx/) is not matching properly, so your first capture group contains nothing: $1 is nil, and $1.to_s is an empty string.
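The behaviour described above, where a failed match leaves $1 as nil, can be checked directly. The sketch also shows one way to avoid depending on $1 at all by reading the capture straight from the string; the helper name first_capture and the sample strings are hypothetical.

```ruby
# Hypothetical helper: return the first capture group, or nil when nothing matches.
def first_capture(text, pattern)
  text[pattern, 1]
end

"section one" =~ /(section)/
p $1        # prints "section"

"section one" =~ /(chapter)/   # this match fails
p $1        # prints nil
p $1.to_s   # prints ""

p first_capture("section one", /(\w+) one/)  # prints "section"
p first_capture("section one", /(chapter)/)  # prints nil
```

Storing the capture in a local or instance variable right after the match keeps later matches from changing the value being used.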

Taking a block of strings and storing them in an array

Is it possible to write a function that takes a block of strings, does something with the strings, and then returns an array with the strings?
def collect_string(&block)
# just toss them into an array and return it
return ...
end
a = collect_string {
"string 1"
"string 2"
"string 3"
}
When I print out what a is, I should get
["string 1", "string 2", "string 3"]
Now suppose I decided to change my mind and wanted to do something more with the strings first. Maybe I want to remove all of the vowels first, or just grab the first 3 characters.
This is really not what the blocks are for. You're going to make an array of strings anyway, so why not use an array to begin with?
def collect_string &block
v = block.call
# do something with v
end
# block returning array
a = collect_string {[
"string 1",
"string 2",
"string 3"
]}
If you use a block as in your example then it will return only "string 3", the last expression evaluated. Previous strings are lost.
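As a rough illustration of both points, the sketch below has the block return an array and then post-processes it; removing vowels is just one of the transformations the question mentions. This version of collect_string is an assumption about how the method might be filled in, not the asker's actual code.

```ruby
# Hypothetical version of collect_string that post-processes what the block returns.
def collect_string
  strings = Array(yield)  # accepts a single string or an array of strings
  strings.map { |s| s.delete("aeiouAEIOU") }
end

a = collect_string { ["string 1", "string 2", "string 3"] }
p a  # prints ["strng 1", "strng 2", "strng 3"]

# A block of bare string literals only returns its last expression:
b = collect_string do
  "string 1"
  "string 2"
  "string 3"
end
p b  # prints ["strng 3"]
```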

How to split a string and skip whitespace?

I have a string like " This is a test ". I want to split the string by the space character. I do it like this:
puts " This is a test ".strip.each(' ') {|s| puts s.strip}
The result is:
This
is
a
test
This is a test
Why is there the last line "This is a test"?
I also need two or more consecutive space characters between two words not to produce an extra "row".
I only want to get the words of a given string, split apart.
Does anyone have an idea?
irb(main):002:0> " This is a test ".split
=> ["This", "is", "a", "test"]
irb(main):016:0* puts " This is a test ".split
This
is
a
test
str.split(pattern=$;, [limit]) => anArray
If pattern is omitted, the value of $; is used. If $; is nil (which is the default), str is split on whitespace as if ' ' were specified.
You should do
" This is a test ".strip.each(' ') {|s| puts s.strip}
If you don't want the last "this is a test"
Because
irb>>> puts " This is a test ".strip.each(' ') {}
This is a test
The outer puts runs after the each block has finished and prints the return value of each, which is the original string.
Omit the first puts and you are done.
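On Ruby 1.9 and later, String#each is no longer available, so split is the more portable route. The sketch below contrasts the whitespace-aware forms of split with a regexp that matches single spaces; the helper name words is hypothetical.

```ruby
# Hypothetical helper: split on runs of whitespace, ignoring leading and trailing space.
def words(text)
  text.split
end

text = "  This   is a  test "

p words(text)       # prints ["This", "is", "a", "test"]
p text.split(" ")   # same result: a single-space string means "any run of whitespace"
p text.split(/ /)   # a regexp splits on every single space, leaving empty strings

words(text).each { |word| puts word }
```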

Resources