Is it possible to write a function that takes a block of strings, does something with the strings, and then returns an array with the strings?
def collect_string(&block)
# just toss them into an array and return it
return ...
end
a = collect_string {
"string 1"
"string 2"
"string 3"
}
When I print out what a is, I should get
["string 1", "string 2", "string 3"]
Now suppose I decided to change my mind and wanted to do something more with the strings first. Maybe I want to remove all of the vowels first, or just grab the first 3 characters.
This is really not what the blocks are for. You're going to make an array of strings anyway, so why not use an array to begin with?
def collect_string &block
v = block.call
# do something with v
end
# block returning array
a = collect_string {[
"string 1",
"string 2",
"string 3"
]}
If you use a block as in your example then it will return only "string 3", the last expression evaluated. Previous strings are lost.
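If the block syntax matters to you, one possible workaround is to pass a collector into the block and have the block append to it. The sketch below assumes that convention; the name collect_strings, the out parameter, and the vowel-stripping step are illustrative and not from the question.

```ruby
# Hypothetical helper: yields an empty array so the block can append strings,
# then post-processes whatever was collected.
def collect_strings
  collected = []
  yield collected
  # Example transformation (an assumption): strip lowercase vowels.
  collected.map { |s| s.delete("aeiou") }
end

a = collect_strings do |out|
  out << "string 1"
  out << "string 2"
  out << "string 3"
end
p a # => ["strng 1", "strng 2", "strng 3"]
```

Unlike a block of bare string literals, every string reaches the method here, because each one is explicitly pushed onto the collector rather than discarded.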
I have code that asks the user to type either Cats or Dogs, then searches an array for every entry that contains that word and prints them all.
print "Cats or Dogs? "
userinput = gets.chomp
lines = [["Cats are smarter than dogs"],["Dogs also like meat"], ["Cats are nice"]]
lines.each do |line|
if line =~ /(.*?)#{userinput}(.*)/
puts line
end
end
So if I were to input Cats, I should get two sentences:
Cats are smarter than dogs
Cats are nice
You could even input smarter and I'll get
Cats are smarter than dogs
I'm strictly looking for ways to use a regular expression to search through an array or string and take out the lines/sentences that match the expression.
If anyone is wondering, the lines array originally came from a file, and I turned each line into an array element.
EDIT:
Wow, how far I came in the coding world.
print "Cats or Dogs? "
userinput = gets.chomp
lines = [["Cats are smarter than dogs"],["Dogs also like meat"], ["Cats are nice"]]
lines.each do |linesInside|
linesInside.each do |line|
if line =~ /(.*?)#{userinput}(.*)/
puts line
end
end
end
Took literally 5 seconds to solve what took me ages to give up on at the time.
Try this
...
lines = ["Cats are smarter than dogs", "Dogs also like meat", "Cats are nice"]
regexp = Regexp.new(userinput)
selected_lines = lines.grep(regexp)
puts selected_lines
How does this work?
grep filters an array using pattern matching
Notice that I am using an array of strings. Your example code uses an array of single-element arrays; I assume you meant to use a plain array of strings.
You can, of course, do that without a regex.
lines = ["Dogs are smarter than cats", "Cats also like meat", "Dogs are nice"]
print "Cats or Dogs? "
input = gets.chomp.downcase
If input #=> "dogs",
lines.select { |line| line.downcase.split.include?(input) }
#=> ["Dogs are smarter than cats", "Dogs are nice"]
If input #=> "cats",
lines.select { |line| line.downcase.split.include?(input) }
#=> ["Dogs are smarter than cats", "Cats also like meat"]
Since your array is an array of arrays, you could call flatten first :
lines.flatten.grep(/#{userinput}/i)
i is for case insensitive search, so that 'Dogs' matches 'Dogs' and 'dogs'.
If you want whole-word search :
lines.flatten.grep(/\b#{userinput}\b/i)
Finally, if you don't really need an array of arrays, just read an array from your file directly, either with File.readlines(f) or File.foreach(f).
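One caveat with interpolating user input into a regex is that characters such as . or ( are treated as regex syntax. A minimal sketch of one way to guard against that, assuming you want a literal, case-insensitive substring match; the method name matching_lines is hypothetical:

```ruby
# Hypothetical helper: returns the lines that contain `term` literally,
# ignoring case. Regexp.escape neutralises regex metacharacters in the input.
def matching_lines(lines, term)
  pattern = /#{Regexp.escape(term)}/i
  lines.flatten.grep(pattern)
end

lines = [["Cats are smarter than dogs"], ["Dogs also like meat"], ["Cats are nice"]]
puts matching_lines(lines, "cats")
# Cats are smarter than dogs
# Cats are nice
```

Without the escape, an input like a.b would also match "axb", since . matches any character.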
I'd like to create a Ruby one-liner that prints some information to stdout and gets data from stdin. I've got some code:
["This should be shown first", "This second: #{gets.chomp}"].each{|i| puts "#{i}"}
...but apparently gets.chomp is evaluated when the whole array is built, before each iterates over the elements.
As a result, I'm prompted for input first, and only then is each element printed.
Can I somehow evaluate it lazily, print the array in order, and still have the whole thing on one line?
One way to achieve lazy evaluation is to use procs. Something like this (multiple lines for readability):
[
-> { puts "This should be shown first" },
-> { print "This second: "; puts gets.chomp },
].each(&:call)
I don't really see the advantage of making this a one-liner since it becomes pretty unreadable, but nevertheless:
[ ->{ "This should be shown first" },
->{ "This second: #{gets.chomp}" }
].each {|line| puts line.call }
P.S. Never do "#{foo}". Use string interpolation (#{...}) when you want to, well, interpolate strings, as on the second line above. If you want to turn a non-string into a string, do foo.to_s. If you know it's already a string (or don't care if it is) just use it directly: foo. But puts automatically calls to_s on its arguments, so just do puts foo.
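To try the lambda approach without typing at a terminal, here is a small sketch that swaps stdin and stdout for StringIO objects. The run_steps method and its input/output parameters are assumptions made for illustration only:

```ruby
require "stringio"

# Hypothetical wrapper: each step is a lambda, so input.gets is only called
# when that step's turn comes, not when the array is built.
def run_steps(input, output)
  steps = [
    -> { "This should be shown first" },
    -> { "This second: #{input.gets.chomp}" }
  ]
  steps.each { |step| output.puts step.call }
end

out = StringIO.new
run_steps(StringIO.new("hello\n"), out)
print out.string
# This should be shown first
# This second: hello
```

Calling run_steps($stdin, $stdout) should give the interactive behaviour from the question, with the first line printed before the program waits for input.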
If you don't mind the repetition of puts:
['puts "This should be shown first"', 'puts "This second: #{gets.chomp}"'].each{|i| eval i}
This is just to show you could use a method rather than a proc.
def line2
"#{["cat","dog"].sample}"
end
["Line 1", :line2, "line 3"].each { |l| puts (l.is_a? Symbol) ? method(l).call : l }
# prints "Line 1", then "cat" or "dog", then "line 3"
I know I can easily remove a substring from a string.
Now I need to remove every substring from a string, if the substring is in an array.
arr = ["1. foo", "2. bar"]
string = "Only delete the 1. foo and the 2. bar"
# some awesome function
string = string.replace_if_in?(arr, '')
# desired output => "Only delete the and the"
All of the functions that remove from or modify a string, such as sub, gsub, tr, ..., only take one word as an argument, not an array. But my array has over 20 elements, so I need a better way than using sub 20 times.
Sadly, it's not only about removing single words, but about removing whole substrings such as 1. foo
How would I attempt this?
You can use gsub which accepts a regex, and combine it with Regexp.union:
string.gsub(Regexp.union(arr), '')
# => "Only delete the  and the "
Like this:
arr = ["1. foo", "2. bar"]
string = "Only delete the 1. foo and the 2. bar"
arr.each {|x| string.slice!(x) }
string # => "Only delete the  and the "
One extra benefit: this also lets you remove text containing regexp special characters such as \ or . (Uri's answer allows this too). Note that slice! removes only the first occurrence of each substring:
string = "Only delete the 1. foo and the 2. bar and \\...."
arr = ["1. foo", "2. bar", "\\..."]
arr.each {|x| string.slice!(x) }
string # => "Only delete the  and the  and ."
Use #gsub with #join on the array elements
You can use #gsub by calling #join on the elements of the array, joining them with the regex alternation operator. For example:
arr = ["foo", "bar"]
string = "Only delete the foo and the bar"
string.gsub /#{arr.join ?|}/, ''
#=> "Only delete the  and the "
You can then deal with the extra spaces left behind in any way you see fit. This is a better method when you want to censor words. For example:
string.gsub /#{arr.join ?|}/, '<bleep>'
#=> "Only delete the <bleep> and the <bleep>"
On the other hand, split/reject/join might be a better method chain if you need to care about whitespace. There's always more than one way to do something, and your mileage may vary.
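For the original "1. foo" style substrings, the approaches above can be combined into one helper that also tidies the leftover spaces. This is only a sketch: the name remove_all is hypothetical, and collapsing runs of spaces is an assumption about what "desired output" means in the question.

```ruby
# Hypothetical helper: removes every occurrence of each substring, then
# collapses repeated spaces and trims the ends.
# Regexp.union escapes plain strings, so "." in "1. foo" matches a literal dot.
def remove_all(string, substrings)
  string.gsub(Regexp.union(substrings), "").squeeze(" ").strip
end

arr = ["1. foo", "2. bar"]
p remove_all("Only delete the 1. foo and the 2. bar", arr)
# => "Only delete the and the"
```

Building the pattern with arr.join("|") instead would not escape the dot, so "1. foo" would also match text such as "1x foo".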
I've looked around but haven't been able to find a working solution to my problem.
I have an array of two strings input and want to test which element of the array contains an exact substring Test.
One thing I have tried (among numerous other attempts):
input = ["Test's string", "Test string"]
# Alternative input array that it needs to work on:
# ["Testing string", "some Test string"]
substring = "Test"
if (input[0].match(/\b#{substring}\b/))
puts "Test 0 "
# Do something...
elsif (input[1].match(/\b#{substring}\b/))
puts "Test 1"
# Do something different...
end
The desired result is a print of "Test 1". The input can be more complex but overall I am looking for a way to find an exact match of a substring in a longer string.
I feel like this should be a rather trivial regex but I haven't been able to come up with the correct pattern. Any help would be greatly appreciated!
The following code may be what you are looking for.
input = ["Testing string", "Test string"]
substring = "Test"
if input[0].match(/(?:^|\s)#{substring}(?:\s|$)/)
puts "Test 0 "
elsif input[1].match(/(?:^|\s)#{substring}(?:\s|$)/)
puts "Test 1"
end
The meaning of the pattern /(?:^|\s)#{substring}(?:\s|$)/ is:
(?:^|\s) : the left side of the substring is the beginning of a line (^) or whitespace,
#{substring} : the substring is matched exactly,
(?:\s|$) : the right side of the substring is whitespace or the end of a line ($).
Note that alternation needs a group, not square brackets: inside a character class, [^|\s] would mean "any character other than | or whitespace".
One way to do that is as follows:
input = ["Testing string", "Test"]
"Test #{ input.index { |s| s[/\bTest\b/] } }"
#=> "Test 1"
input = ["Test", "Testing string"]
"Test #{ input.index { |s| s[/\bTest\b/] } }"
#=> "Test 0"
\b in the regex denotes a word boundary.
Maybe you want a method to return the index of the first element of input that contains the word? That could be:
def matching_index(input, word)
input.index { |s| s[/\b#{word}\b/i] }
end
input = ["Testing string", "Test"]
matching_index(input, "Test") #=> 1
matching_index(input, "test") #=> 1
matching_index(input, "Testing") #=> 0
matching_index(input, "Testy") #=> nil
Then you could use it like this, for example:
word = 'Test'
puts "The matching element for '#{word}' is at index #{ matching_index(input, word) }"
#=> The matching element for 'Test' is at index 1
word = "Testing"
puts "The matching element for '#{word}' is '#{ input[matching_index(input, word)] }'"
#=> The matching element for 'Testing' is 'Testing string'
The problem is with your word boundaries. In your original question, the word Test matches the first string because the apostrophe in "Test's" counts as a \b word boundary. It's a valid match, so the code correctly prints "Test 0". You need to decide what should terminate a match. If your input contains special characters, I don't think the regex will work properly: /\bTest my $money.*/ will never match because of the $ in your substring.
What happens if you have multiple matches in your input array? Do you want to do something to all of them or just the first one?
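One way to address both points (the apostrophe counting as a boundary, and special characters in the substring) is to require whitespace or the string edge on both sides and to escape the substring. A sketch under those assumptions; index_of_exact is a hypothetical name, and String#match? needs Ruby 2.4 or later:

```ruby
# Hypothetical helper: index of the first element containing `substring`
# delimited by whitespace or the start/end of the string.
# (?<!\S) = not preceded by a non-space; (?!\S) = not followed by a non-space.
def index_of_exact(input, substring)
  pattern = /(?<!\S)#{Regexp.escape(substring)}(?!\S)/
  input.index { |s| s.match?(pattern) }
end

p index_of_exact(["Test's string", "Test string"], "Test")        # => 1
p index_of_exact(["Testing string", "some Test string"], "Test")  # => 1
p index_of_exact(["I will Test my $money now"], "Test my $money") # => 0
```

This treats punctuation next to the word as part of it, so "Test." would not count as a match; whether that is desirable depends on the input.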
Currently I am splitting a string by a pattern, like this:
outcome_array=the_text.split(pattern_to_split_by)
The problem is that the pattern I split by always gets omitted.
How do I get the result to include the split pattern itself?
Thanks to Mark Wilkins for inspiration, but here's a shorter bit of code for doing it:
irb(main):015:0> s = "split on the word on okay?"
=> "split on the word on okay?"
irb(main):016:0> b=[]; s.split(/(on)/).each_slice(2) { |s| b << s.join }; b
=> ["split on", " the word on", " okay?"]
or:
s.split(/(on)/).each_slice(2).map(&:join)
See below the fold for an explanation.
Here's how this works. First, we split on "on", but wrap it in parentheses to make it into a match group. When there's a match group in the regular expression passed to split, Ruby will include that group in the output:
s.split(/(on)/)
# => ["split ", "on", " the word ", "on", " okay?"]
Now we want to join each instance of "on" with the preceding string. each_slice(2) helps by passing two elements at a time to its block. Let's just invoke each_slice(2) to see what results. Since each_slice, when invoked without a block, will return an enumerator, we'll apply to_a to the Enumerator so we can see what the Enumerator will enumerate over:
s.split(/(on)/).each_slice(2).to_a
# => [["split ", "on"], [" the word ", "on"], [" okay?"]]
We're getting close. Now all we have to do is join the words together. And that gets us to the full solution above. I'll unwrap it into individual lines to make it easier to follow:
b = []
s.split(/(on)/).each_slice(2) do |s|
b << s.join
end
b
# => ["split on", " the word on", " okay?"]
But there's a nifty way to eliminate the temporary b and shorten the code considerably:
s.split(/(on)/).each_slice(2).map do |a|
a.join
end
map passes each element of its input array to the block; the result of the block becomes the new element at that position in the output array. In MRI >= 1.8.7, you can shorten it even more, to the equivalent:
s.split(/(on)/).each_slice(2).map(&:join)
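The same chain can be wrapped in a small method for an arbitrary literal delimiter. This is a sketch, not part of the original answer: split_keeping is a hypothetical name, and Regexp.escape is added on the assumption that the delimiter is plain text rather than a pattern.

```ruby
# Hypothetical helper: splits `text` on a literal delimiter and keeps the
# delimiter attached to the end of the preceding piece.
def split_keeping(text, delimiter)
  pattern = /(#{Regexp.escape(delimiter)})/ # capture group keeps the delimiter
  text.split(pattern).each_slice(2).map(&:join)
end

p split_keeping("split on the word on okay?", "on")
# => ["split on", " the word on", " okay?"]
p split_keeping("a.b.c", ".")
# => ["a.", "b.", "c"]
```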
You could use a regular expression assertion to locate the split point without consuming any of the input. Below uses a positive look-behind assertion to split just after 'on':
s = "split on the word on okay?"
s.split(/(?<=on)/)
=> ["split on", " the word on", " okay?"]
Or a positive look-ahead to split just before 'on':
s = "split on the word on okay?"
s.split(/(?=on)/)
=> ["split ", "on the word ", "on okay?"]
With something like this, you might want to make sure 'on' was not part of a larger word (like 'assertion'), and also remove whitespace at the split:
"don't split on assertion".split(/(?<=\bon\b)\s*/)
=> ["don't split on", "assertion"]
If you use a pattern with groups, it will return the pattern in the results as well:
irb(main):007:0> "split it here and here okay".split(/ (here) /)
=> ["split it", "here", "and", "here", "okay"]
Edit: The additional information indicated that the goal is to include the item on which it was split with one of the halves of the split items. I would think there is a simple way to do that, but I don't know it and haven't had time today to play with it. So in the absence of the clever solution, the following is one way to brute force it. Use the split method as described above to include the split items in the array. Then iterate through the array and combine every second entry (which by definition is the split value) with the previous entry.
s = "split on the word on and include on with previous"
a = s.split(/(on)/)
# iterate through and combine adjacent items together and store
# results in a second array
b = []
a.each_index{ |i|
b << a[i] if i.even?
b[b.length - 1] += a[i] if i.odd?
}
print b
Results in this:
["split on", " the word on", " and include on", " with previous"]