Combine more than one \flags in Ruby regex (\A, \w) - ruby

I am trying to catch only cases b and d from the sample below, i.e. END should be the only word on a line (or at least a whole word, not part of a longer word), and END should be at the beginning of the line (not necessarily at ^; it could start from column 2), case-insensitive (/i).
I cannot combine all of this together in one regex. Can I have more than one flag in a regex? I also need the OR (|) in this regex.
Thanks all.
M
regexDrop = /String01|String2|\AEND/i #END\n/i
a = "the long END not begin of line"
b = "ENd" # <#><< need this one
c = "END MORE WORDs"
d =" EnD" # <#><< need this one
if a =~ regexDrop then puts "a__Match: " + a else puts 'a_' end
if b =~ regexDrop then puts "b__Match: " + b else puts 'b_' end
if c =~ regexDrop then puts "c__Match: " + c else puts 'c_' end
if d =~ regexDrop then puts "d__Match: " + d else puts 'd_' end
## \w Matches word characters.
## \A Matches beginning of string. (could be not column 1)

Note that \A is an anchor (a kind of built-in lookbehind, or "zero-width assertion") that matches the beginning of the whole string. \w is a shorthand class matching letters, digits, and the underscore (word characters).
Judging by your description and sample input and expected output, I think you are just looking for END anywhere in a string as a whole word and case-insensitive.
You can match the instances with
regexDrop = /String01|String2|\bEND\b/i
Output:
a__Match: the long END not begin of line
b__Match: ENd
c__Match: END MORE WORDs
d__Match: EnD
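If only cases b and d should match, as the question describes, a stricter pattern may be closer to the goal. The following is a minimal sketch, assuming that "beginning of line" means END may be preceded and followed only by whitespace. The constant REGEX_DROP and the helper drop? are names made up for this illustration. Note that the i flag applies to the whole pattern, while \A and \z bind only to the END alternative.

```ruby
# Hypothetical names for illustration. END must be the only word in the
# string, optionally surrounded by whitespace, in any letter case.
REGEX_DROP = /String01|String2|\A\s*END\s*\z/i

def drop?(line)
  line.match?(REGEX_DROP)
end

samples = {
  a: "the long END not begin of line",
  b: "ENd",
  c: "END MORE WORDs",
  d: " EnD"
}

samples.each do |name, line|
  puts "#{name}: #{drop?(line) ? 'match' : 'no match'}"
end
```

As for the flags themselves, several can be written together after the closing slash, for example /end/im.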


Replace specified phrase with * within text

My purpose is to accept a paragraph of text and find the specified phrase I want to REDACT, or replace.
I made a method that accepts an argument as a string of text. I break down that string into individual characters. Those characters are compared, and if they match, I replace those characters with *.
def search_redact(text)
str = ""
print "What is the word you would like to redact?"
redacted_name = gets.chomp
puts "Desired word to be REDACTED #{redacted_name}! "
#splits name to be redacted, and the text argument into char arrays
redact = redacted_name.split("")
words = text.split("")
#takes char arrays, two loops, compares each character, if they match it
#subs that character out for an asterisks
redact.each do |x|
if words.each do |y|
x == y
y.gsub!(x, '*') # sub redact char with astericks if matches words text
end # end loop for words y
end # end if statment
end # end loop for redact x
# this adds char array to a string so more readable
words.each do |z|
str += z
end
# prints it out so we can see, and returns it to method
print str
return str
end
# calling method with test case
search_redact("thisisapassword")
#current issues stands, needs to erase only if those STRING of characters are
# together and not just anywehre in the document
If I put in a phrase that shares characters with other parts of the text, for example, if I call:
search_redact("thisisapassword")
then it will replace that text too. When it accepts input from the user, I want to get rid of only the text password. But it then looks like this:
thi*i**********
Please help.
This is a classic windowing problem: finding a substring in a string. There are many ways to solve it, some much more efficient than others, but I'm going to give you a simple one that uses as much of your original code as possible:
def search_redact(text)
  print "What is the word you would like to redact?"
  redacted_name = gets.chomp
  puts "Desired word to be REDACTED #{redacted_name}! "
  # split the word to be redacted and the text argument into char arrays
  redact = redacted_name.split("")
  words = text.split("")
  words.each_index do |i|
    # use windowing to look for exact matches:
    # words[i, n] is the n-element slice starting at index i
    if words[i, redact.length] == redact
      redact.length.times do |j|
        # change the letter to an asterisk
        words[i + j] = "*"
      end
    end
  end
  words.join
end
# calling method with test case
search_redact("thisisapassword")
The idea here is that we're taking advantage of Array#==, which lets us say ["a", "b", "c"] == ["a", "b", "c"]. So we walk the input and, at each position, ask whether the sub-array starting there equals the array of characters to redact. If they match, we loop through each element of that window and replace it with a *.
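For comparison, the same result can be had without managing indices by hand. This is only a sketch, assuming the phrase is passed in as a parameter rather than read with gets; the method name redact_phrase is made up for the example. String#gsub with a string pattern matches the phrase literally, so no regex escaping is needed.

```ruby
# Hypothetical helper: replaces every occurrence of phrase in text
# with the same number of asterisks.
def redact_phrase(text, phrase)
  # An empty phrase has nothing to redact.
  return text if phrase.empty?

  text.gsub(phrase) { |match| "*" * match.length }
end

puts redact_phrase("thisisapassword", "password")
```

Only the contiguous phrase is replaced; characters it shares with the rest of the text are left alone.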

Not sure why this simple regex matching code won't work

# #!/usr/local/bin/ruby
puts "why doesn't this work??"
pi = ''
special = "[;\`'<>-]"
regex = /[#{special.gsub(/./){|char| "\\#{char}"}}]/
pi = ARGV[0].to_s #takes in console argument to test
if pi == '3.1415926535897932385'
puts "got it"
end
if pi =~ regex
puts "stop word"
else
puts "incorrect"
end
All I'm trying to do is test whether the pi variable contains any of the stop characters: if it does, print "stop word"; otherwise print "got it" or "incorrect" respectively. I've tried doing this about ten ways, with scan and include? lines, and I feel like this is the best route.
I think you may be over-thinking this. Here are a couple of ways (among many), where true means that the string contains at least one of the special characters:
#1
# keep "-" last: String#delete treats "x-y" as a character range
baddies = "[];`'<>-"
pi = '3.14'
pi.delete(baddies).size < pi.size #=> false
pi = '3.1;4'
pi.delete(baddies).size < pi.size #=> true
#2
special = %w| [ ; ` ' < > - ] |
# => ["[", ";", "`", "'", "<", ">", "-", "]"]
pi = '3.14'
(pi.chars & special).any? #=> false
pi = '3.1cat4'
(pi.chars & special).any? #=> false
pi = '3.1;4'
(pi.chars & special).any? #=> true
You don't need to escape any of the characters in your character class:
special = "[;\`'<>-]"
regex = /#{special}/
p regex
#pi = ARGV[0] #takes in console argument to test
pi = 'hello;world'
if pi == '3.1415926535897932385'
puts "got it"
end
if pi =~ regex
puts "stop word"
else
puts "incorrect"
end
--output:--
/[;`'<>-]/
stop word
And ARGV[0] is a string already. But, a shell/console also recognizes special characters when you enter them on the command line:
special = "[;\`'<>-]"
#regex = /[#{special.gsub(/./){|char| "\\#{char}"}}]/
regex = /#{special}/
p regex
pi = ARGV[0] #takes in console argument to test
if pi == '3.1415926535897932385'
puts "got it"
end
if pi =~ regex
puts "stop word"
else
puts "incorrect"
end
--output:--
~/ruby_programs$ ruby 1.rb ;
/[;`'<>-]/
incorrect
~/ruby_programs$ ruby 1.rb <
-bash: syntax error near unexpected token `newline'
If you want the shell/console to treat the special characters it recognizes as literals, you have to quote or escape them. There are various ways to do that in a shell/console:
~/ruby_programs$ ruby 1.rb \;
/[;`'<>-]/
stop word
~/ruby_programs$ ruby 1.rb \<
/[;`'<>-]/
stop word
Note you can use String#[] too:
special = "[;\`'<>-]"
regex = /#{special}/
...
...
if pi[regex]
puts "stop word"
else
puts "incorrect"
end
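As a further variant, the pattern can be built from a list of characters with Regexp.union, which escapes each entry so none of them is read as regex syntax. This is a sketch only: the names STOP_CHARS, STOP_REGEX and stop_word? are invented here, and it assumes the square brackets are themselves meant to be stop characters.

```ruby
# Hypothetical names. Each entry is matched literally; Regexp.union
# takes care of escaping characters such as "[" and "-".
STOP_CHARS = ["[", "]", ";", "`", "'", "<", ">", "-"].freeze
STOP_REGEX = Regexp.union(STOP_CHARS)

def stop_word?(input)
  # to_s guards against a missing command-line argument (nil)
  input.to_s.match?(STOP_REGEX)
end

puts stop_word?("3.1;4")
puts stop_word?("3.14")
```

Called as stop_word?(ARGV[0]), this also covers the case where the shell swallowed the argument and ARGV[0] is nil.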

Finding the first duplicate character in the string Ruby

I am trying to call the first duplicate character in my string in Ruby.
I have defined an input string using gets.
How do I call the first duplicate character in the string?
This is my code so far.
string = "#{gets}"
print string
How do I call a character from this string?
Edit 1:
This is the code I have now. The output is "no duplicates" printed 26 times. I think my if statement is written incorrectly.
string = "abcade"
puts string
for i in ('a'..'z')
if string =~ /(.)\1/
puts string.chars.group_by{|c| c}.find{|el| el[1].size >1}[0]
else
puts "no duplicates"
end
end
My second puts statement works but with the for and if loops, it returns no duplicates 26 times whatever the string is.
The following returns the index of the first duplicate character:
the_string =~ /(.)\1/
Example:
'1234556' =~ /(.)\1/
=> 4
To get the duplicate character itself, use $1:
$1
=> "5"
Example usage in an if statement:
if my_string =~ /(.)\1/
# found duplicate; potentially do something with $1
else
# there is no match
end
s.chars.map { |c| [c, s.count(c)] }.drop_while{|i| i[1] <= 1}.first[0]
With the refined form from Cary Swoveland :
s.each_char.find { |c| s.count(c) > 1 }
The method below might be useful for finding a repeated word in a string. It returns the word with the highest count (the earliest such word, if there is a tie):
def firstRepeatedWord(string)
h_data = Hash.new(0)
string.split(" ").each{|x| h_data[x] +=1}
h_data.key(h_data.values.max)
end
I believe the question can be interpreted in either of two ways (neither involving the first pair of adjacent characters that are the same) and offer solutions to each.
Find the first character in the string that is preceded by the same character
I don't believe we can use a regex for this (but would love to be proved wrong). I would use the method suggested in a comment by @DaveNewton:
require 'set'
def first_repeat_char(str)
str.each_char.with_object(Set.new) { |c,s| return c unless s.add?(c) }
nil
end
first_repeat_char("abcdebf") #=> b
first_repeat_char("abcdcbe") #=> c
first_repeat_char("abcdefg") #=> nil
Find the first character in the string that appears more than once
r = /
(.) # match any character in capture group #1
.* # match any character zero or more times
? # do the preceding lazily
\K # forget everything matched so far
\1 # match the contents of capture group 1
/x
"abcdebf"[r] #=> b
"abccdeb"[r] #=> b
"abcdefg"[r] #=> nil
This regex is fine, but produces the warning, "regular expression has redundant nested repeat operator '*'". You can disregard the warning or suppress it by doing something clunky, like:
r = /([^#{0.chr}]).*?\K\1/
where ([^#{0.chr}]) means "match any character other than 0.chr in capture group 1".
Note that a positive lookbehind cannot be used here, as they cannot contain variable-length matches (i.e., .*).
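The two interpretations can also be compared side by side without a regex. The sketch below is illustrative only; the method names first_repeated and first_with_duplicate are made up, and the second method counts characters in a plain Hash rather than relying on any particular Ruby version.

```ruby
require 'set'

# First character that has already been seen earlier in the string.
def first_repeated(str)
  seen = Set.new
  # Set#add? returns nil when the element is already present.
  str.each_char { |c| return c unless seen.add?(c) }
  nil
end

# First character (by position) that occurs more than once anywhere.
def first_with_duplicate(str)
  counts = Hash.new(0)
  str.each_char { |c| counts[c] += 1 }
  str.each_char.find { |c| counts[c] > 1 }
end

p first_repeated("abcdcbe")
p first_with_duplicate("abcdcbe")
```

For "abcdcbe" the two differ: the second c is the first character to repeat, while b is the earliest character that has a duplicate somewhere later.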
You could probably make your string an array and use detect. This should return the first char where the count is > 1.
string.split("").detect {|x| string.count(x) > 1}
I'll use positive lookahead with String#[] method :
"abcccddde"[/(.)(?=\1)/] #=> c
As a variant:
str = "abcdeff"
p str.chars.group_by{|c| c}.find{|el| el[1].size > 1}[0]
prints "f"

Pig Latin exercise works, but only for one user inputed word. Not all words

I'm new to programming and I'm working with Ruby as my starter language. The code below works, but if someone inputs more than one word, the pigatize method only works on the first word and adds the additional ay or way to the last word. How do I get it to apply to each word a user inputs?
# If the first letter is a vowel, add "way" to the end
# If the first letter is a consonant, move it to the end and add "ay"
class PigLatin
VOWELS = %w(a e i o u)
def self.pigatize(text)
if PigLatin.vowel(text[0])
pigalatin = text + 'way'
else
piglatin = text[1..-1] + text[0] + 'ay'
end
end
def self.vowel(first_letter)
VOWELS.include?(first_letter)
end
end
puts 'Please enter a word and I will translate it into Pig Latin. Ippyyay!.'
text = gets.chomp
puts "Pigatized: #{PigLatin.pigatize(text)}"
Chiefly, you need to split the input string into words with String#split, using an expression like:
text.split(' ')
That produces an array of words, which you can loop over with an .each block, running the algorithm on each word, then reassemble them by appending each result to the output string with +=, followed by a space (+ ' ').
Incorporating these things into your existing code looks like the following (with comments):
class PigLatin
VOWELS = %w(a e i o u)
def self.pigatize(text)
# Declare the output string
piglatin = ''
# Split the input text into words
# and loop with .each, and 'word' as the iterator
# variable
text.split(' ').each do |word|
if PigLatin.vowel(word[0])
# This was misspelled...
# Add onto the output string with +=
# and finish with an extra space
piglatin += word + 'way' + ' '
else
# Same changes down here...
piglatin += word[1..-1] + word[0] + 'ay' + ' '
end
end
# Use .rstrip here to get rid of the trailing space
# (.chomp would only remove a trailing newline)
piglatin.rstrip
end
def self.vowel(first_letter)
VOWELS.include?(first_letter)
end
end
puts 'Please enter a word and I will translate it into Pig Latin. Ippyyay!.'
text = gets.chomp
puts "Pigatized: #{PigLatin.pigatize(text)}"
There are other ways to handle this than adding to the string with +=. You could, for example add words onto an array with an expression like:
# piglatin declared as an array []
# .push() adds words to the array
piglatin.push(word + 'way')
Then when it's time to output it, use Array#join to connect them back with spaces:
# Reassemble the array of pigatized words into a
# string, joining the array elements by spaces
piglatin.join(' ')
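Put together, the array-and-join approach might look like the sketch below. It is one possible arrangement, not the only one: the helper pigatize_word is a name introduced here, and map is used in place of an explicit push inside each. As in the original, only lowercase vowels are recognized.

```ruby
class PigLatin
  VOWELS = %w(a e i o u).freeze

  # Translate every whitespace-separated word and rejoin with spaces.
  def self.pigatize(text)
    text.split(' ').map { |word| pigatize_word(word) }.join(' ')
  end

  # Hypothetical helper: translate a single word.
  def self.pigatize_word(word)
    if VOWELS.include?(word[0])
      word + 'way'
    else
      word[1..-1] + word[0] + 'ay'
    end
  end
end

puts PigLatin.pigatize("hello apple")
```

Because join only puts separators between elements, there is no trailing space to clean up afterwards.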
There are alternatives to .each..do for the loop. You could use a for loop like
for word in text.split(' ')
# stuff...
end
...but using the .each do is a bit more idiomatic and more representative of what you'll usually find in Ruby code, though the for loop is more like you'd find in most other languages besides Ruby.

Check if string contains one word or more

When looping through lines of text, what is the neatest way (most 'Ruby') to do an if else statement (or similar) to check if the string is a single word or not?
def check_if_single_word(string)
# code here
end
s1 = "two words"
s2 = "hello"
check_if_single_word(s1) -> false
check_if_single_word(s2) -> true
Since you're asking about the 'most Ruby' way, I'd rename the method to single_word?
One way is to check for the presence of a space character.
def single_word?(string)
!string.strip.include? " "
end
But if you want to allow a particular set of characters that meet your definition of word, perhaps including apostrophes and hyphens, use a regex:
def single_word?(string)
string.scan(/[\w'-]+/).length == 1
end
Following your definition of a word given in the comment:
[A] stripped string that doesn't [include] whitespace
the code would be
def check_if_single_word(string)
string.strip == string and string.include?(" ").!
end
check_if_single_word("two words") # => false
check_if_single_word("New York") # => false
check_if_single_word("hello") # => true
check_if_single_word(" hello") # => false
Here is some code that may help you out:
def check_if_single_word(string)
ar = string.scan(/\w+/)
ar.size == 1 ? "only one word" : "more than one word"
end
s1 = "two words"
s2 = "hello"
check_if_single_word s1 # => "more than one word"
check_if_single_word s2 # => "only one word"
def check_if_single_word(string)
string.scan(/\w+/).size == 1
end
s1 = "two words"
s2 = "hello"
check_if_single_word s1 # => false
check_if_single_word s2 # => true
I would check if a space exists in the string.
def check_if_single_word(string)
return !(string.strip =~ / /)
end
.strip will remove excess white space that may exist at the start and the end of the string.
!(myString =~ / /) means that the string does not match the regular expression of a single space.
Likewise you could also use !string.strip[/ /].
A Ruby way: extend the String class.
class String
def one?
!self.strip.include? " "
end
end
Then use "Hello world".one? to check whether the string contains exactly one word.
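One more variant, sketched under the assumption that any run of whitespace (spaces, tabs, newlines) separates words: String#split with no arguments splits on whitespace and ignores leading and trailing whitespace, so counting the pieces is enough. The name single_word? follows the renaming suggested earlier in the thread.

```ruby
# True when the string contains exactly one whitespace-delimited word.
def single_word?(string)
  string.split.size == 1
end

p single_word?("hello")
p single_word?("two words")
```

Unlike the checks based on include?(" "), this also treats tabs and newlines as separators, and it returns false for an empty or blank string.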
