I want to find the binary gap using a Ruby regex.
Say I have 1000001001010011100000000000. Going from the left, I want a regex that matches:
A. 1000001 should return 00000
B. 1001 should return 00
C. 101 should return 0
D. 1001 should return 00
My first attempt looked like this, but it misses B and D:
Update
A binary gap within a positive integer N is any maximal sequence of consecutive zeros that is surrounded by ones at both ends in the binary representation of N.
I think what you are looking for is:
/1(0+)(?=1)/
The problem with your pattern is that you consume the "closing 1". As a consequence, the next search starts after this "closing 1".
But if you use a lookahead (a zero-width assertion that doesn't consume characters and only tests what follows), the "closing 1" isn't consumed and you get the desired result, because the next search starts right after the last zero.
Note that if you don't need the zeros to be enclosed between ones, you can also simply use: /0+/
Other way: if you are sure that the string only contains 1s and 0s, you can also use the (non-)word-boundary assertion \B with this pattern: 1\K0++\B
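To see the difference in practice, here is a small sketch. The question's original pattern is not shown, so `/1(0+)1/` below is only an assumed stand-in for a pattern that consumes the closing 1; the variable names are illustrative.

```ruby
str = "1000001001010011100000000000"

# Assumed stand-in for the first attempt: the closing 1 is consumed,
# so it cannot also open the next gap.
consuming = str.scan(/1(0+)1/).flatten
#=> ["00000", "0"]

# Lookahead version: the closing 1 is only tested, not consumed.
lookahead = str.scan(/1(0+)(?=1)/).flatten
#=> ["00000", "00", "0", "00"]

# \K variant: the possessive 0++ keeps \B from matching inside a run of zeros.
keep_out = str.scan(/1\K0++\B/)
#=> ["00000", "00", "0", "00"]
```

The trailing zeros are never reported by the last two patterns, because no 1 follows them.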
R = /
(?= # start a positive lookahead
1 # match a one
(0+) # match one or more zeros in capture group 1
1 # match a one
) # end positive lookahead
/x # free-spacing regex definition mode
str = "1000001001010011100000000000"
arr = []
str.scan(R) { |m| arr << [m.first, Regexp.last_match.begin(0)+1] }
arr
#=> [["00000", 1], ["00", 7], ["0", 10], ["00", 12]]
The elements of arr correspond to all substrings of one or more "0"s of str that are preceded and followed by a 1. The first element of each pair is the substring; the second is the offset into str where the substring begins.
Here's a second example.
str = "10011001010101001110001000100101"
arr = []
str.scan(R) { |m| arr << [m.first, Regexp.last_match.begin(0)+1] }
arr
#=> [["00", 1], ["00", 5], ["0", 8], ["0", 10], ["0", 12], ["00", 14],
# ["000", 19], ["000", 23], ["00", 27], ["0", 30]]
Note that one must use a positive lookahead, rather than a positive lookbehind, as (in Ruby) the latter does not permit variable-length strings (i.e., 0+).
@Stefan, in a comment, suggested an improvement:
R = /
(?<=1) # match a one in a positive lookbehind
0+ # match one or more zeros
(?=1) # match a one in a positive lookahead
/x # free-spacing regex definition mode
str = "1000001001010011100000000000"
arr = []
str.scan(R) { |m| arr << [m, Regexp.last_match.begin(0)] }
arr
#=> [["00000", 1], ["00", 7], ["0", 10], ["00", 12]]
This is similar to what @Casimir suggests (/1(0+)(?=1)/), except that by putting the first 1 in a positive lookbehind there's no need for the capture group.
Here is another way that does not use a regex.
str = "1000001001010011100000000000"
(0..str.size-3).each_with_object([]) do |i,a|
next if str[i] == '0' || str[i+1] == '1'
ndx = str[i+2..-1].index('1')
a << [str[i+1, 1+ndx], i+1] if ndx
end
#=> [["00000", 1], ["00", 7], ["0", 10], ["00", 12]]
In order to get only the zeros between ones, you need to use a regex lookbehind and lookahead:
(?<=1)0+(?=1)
After that you only need to take the element of maximum length.
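A minimal sketch of that last step, assuming the input is a non-negative integer. The method name `longest_binary_gap` is made up for illustration, and the pattern uses a lookbehind `(?<=1)` and a lookahead `(?=1)` around the zeros.

```ruby
# Hypothetical helper: length of the longest binary gap of n (0 if none).
def longest_binary_gap(n)
  n.to_s(2)                    # binary representation as a string
   .scan(/(?<=1)0+(?=1)/)      # every run of zeros enclosed by ones
   .map(&:size)                # length of each run
   .max || 0                   # max returns nil when there is no gap
end

longest_binary_gap(0b1000001001010011100000000000) #=> 5
longest_binary_gap(9)                              #=> 2 (binary 1001)
longest_binary_gap(32)                             #=> 0 (binary 100000)
```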
I'm still coming to terms with Regex and want to formulate an expression that will let me count the number of successive consonants at the beginning of a string. E.g. 'Cherry' will return 2, 'hello' 1, 'schlepp' 4 and so on. Since the number isn't predetermined (although English probably has some upper limit on initial consonants!) I'd need some flexible expression, but I'm a bit stuck about how to write it. Any pointers would be welcome!
This would work:
'Cherry'[/\A[bcdfghjklmnpqrstvwxyz]*/i].length #=> 2
The regex matches zero or more consonants at the beginning of the string. String#[] returns the matching part and length determines its length.
You can also express the consonant character class more succinctly by intersecting [a-z] and [^aeiou] via &&:
'Cherry'[/\A[a-z&&[^aeiou]]*/i].length #=> 2
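As a rough sketch, the same idea can be wrapped in a method. The name `leading_consonants` is hypothetical, and this version downcases the string first instead of using the `i` flag, which is a deliberate simplification rather than the answer's exact code.

```ruby
# Hypothetical helper: number of consonants at the start of str.
def leading_consonants(str)
  # \A anchors at the start; * allows zero matches, so the result is
  # an empty string (never nil) when the string starts with a non-consonant.
  str.downcase[/\A[a-z&&[^aeiou]]*/].length
end

leading_consonants('Cherry')  #=> 2
leading_consonants('hello')   #=> 1
leading_consonants('schlepp') #=> 4
leading_consonants('apple')   #=> 0
```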
Something along this line would work:
>> 'Cherry'.downcase.split(/([aeiou].*)/).first.length
# => 2
>> 'hello'.downcase.split(/([aeiou].*)/).first.length
# => 1
>> 'schlepp'.downcase.split(/([aeiou].*)/).first.length
# => 4
Another way is to replace everything from the first vowel to the end of the string with nothing, then take the length:
'Cherry'.gsub(/[aeiou].*$/i,"").length #=> 2
It is not necessary to use a regular expression.
CONSONANTS = (('a'..'z').to_a - 'aeiou'.chars).join
#=> "bcdfghjklmnpqrstvwxyz"
def consecutive_consonants(str)
e, a = str.each_char.chunk { |c| CONSONANTS.include?(c.downcase) }.first
e ? a.size : 0
end
consecutive_consonants("THIS is nuts") #=> 2
consecutive_consonants("Is this ever cool?") #=> 0
consecutive_consonants("1. this is wrong") #=> 0
Note
enum = "THIS is nuts".each_char.chunk { |c| CONSONANTS.include?(c.downcase) }
#=> #<Enumerator: #<Enumerator::Generator:0x000000010e1a40>:each>
We can see the elements that will be generated by this enumerator by applying Enumerable#entries (or Enumerable#to_a):
enum.entries
#=> [[true, ["T", "H"]], [false, ["I"]], [true, ["S"]], [false, [" ", "i"]],
# [true, ["s"]], [false, [" "]], [true, ["n"]], [false, ["u"]], [true, ["t", "s"]]]
Continuing,
e, a = enum.first
#=> [true, ["T", "H"]]
e #=> true
a #=> ["T", "H"]
a.size
#=> 2
I'm using Rails 4.2.7 (Ruby 2.3). I use the following to split a string based on a regular expression:
my_string.split(/\W+/)
My question is, how can I get an equivalent array of numbers in which each number represents the index of where the splits occurred? If no splits occurred, I would expect the array to only contain an element with "-1".
So, you don't want to split, you just want the indices?
a = []
"Hello stack overflow".scan(/\W+/){a<<Regexp.last_match.begin(0)}
a
#=> [5, 11]
An empty array would mean that no split occurred.
Edit: a shorter version could be
"Hello stack overflow".enum_for(:scan, /\W+/).map{Regexp.last_match.begin(0)}
#=> [5, 11]
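The question asks for [-1] when nothing was split. A small sketch of that, with a hypothetical method name `split_indices`:

```ruby
# Hypothetical helper: start index of every separator match,
# or [-1] when the pattern never matches.
def split_indices(str, pattern)
  indices = []
  str.scan(pattern) { indices << Regexp.last_match.begin(0) }
  indices.empty? ? [-1] : indices
end

split_indices("Hello stack overflow", /\W+/) #=> [5, 11]
split_indices("Hello", /\W+/)                #=> [-1]
```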
In some cases String#split will split a string on substrings having multiple characters. The offset into the string of the beginning of each such substring may not be sufficient; both the start and end indices may be needed.
The following method returns an array of tuples, the first element of each tuple being a portion of the string returned by split, the second being the index where that substring begins. Not only does this provide a way to compute the desired indices, but it also returns the substrings returned by split.
def split_with_offsets(str, r)
indices = []
str.scan(r) { indices << Regexp.last_match.begin(0) }
str.split(r).zip(indices << str.size).map { |s,ndx| [s, ndx-s.size] }
end
With this information it is straightforward to obtain an array of ranges, each range being the offsets of a substring on which the string is split.
def gaps(str, r)
split_with_offsets(str, r).each_cons(2).map { |(s,f), (_,lp1)| f+s.size..lp1-1 }
end
Here are two examples.
#1
str = "Hello stack overflow"
r = /\W+/
split_with_offsets(str, r)
#=> [["Hello", 0], ["stack", 6], ["overflow", 12]]
gaps(str, r)
#=> [5..5, 11..11]
#2
str = "I | cannot | wait | for  |   the election |to |be |  over"
r = /\s*\|\s*/
split_with_offsets(str, r)
#=> [["I", 0], ["cannot", 4], ["wait", 13], ["for", 20], ["the election", 29],
# ["to", 43], ["be", 47], ["over", 53]]
gaps(str, r)
#=> [1..3, 10..12, 17..19, 23..28, 41..42, 45..46, 49..52]
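If only the separator ranges are needed, a shorter sketch is to read both ends of each match from MatchData#offset. The method name `separator_ranges` is made up for illustration.

```ruby
# Hypothetical helper: inclusive index range of every separator match.
def separator_ranges(str, pattern)
  ranges = []
  str.scan(pattern) do
    first, past_end = Regexp.last_match.offset(0) # end offset is exclusive
    ranges << (first..past_end - 1)
  end
  ranges
end

separator_ranges("Hello stack overflow", /\W+/) #=> [5..5, 11..11]
separator_ranges("a -- b", /\W+/)               #=> [1..4]
```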
I've read about % notation, but I could not find an explanation of the following.
Example 1: The following code with % outputs i. Obviously % changes i to a string, but I am not sure what % is actually doing.
irb(main):200:0> [[1,2,3],[4,5,6]].each{ |row| p row.map{ |i| % i } }
["i", "i", "i"]
["i", "i", "i"]
=> [[1, 2, 3], [4, 5, 6]]
irb(main):201:0> [[1,2,3],[4,5,6]].each{ |row| p row.map{ |i| i } }
[1, 2, 3]
[4, 5, 6]
=> [[1, 2, 3], [4, 5, 6]]
Example 2: It seems %2d pads a number on the left to a width of 2. Again, I am not sure what %2d is doing.
irb(main):194:0> [[1,2,3],[4,5,6],[7,8,9]].each{ |row| p row.map{|i| "%2d" % i } }
[" 1", " 2", " 3"]
[" 4", " 5", " 6"]
[" 7", " 8", " 9"]
=> [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
Where can I find the documentation about these?
Here is the doc: "You may also create strings using %".
There are two different types of % strings: %q(...) behaves like a single-quote string (no interpolation or character escaping), while %Q behaves as a double-quote string.
In your first example, p row.map{|i| % i }, % i creates the string "i", as per the above doc.
Examples:
[1, 2, 3].map { |i| % i } # => ["i", "i", "i"]
% i # => "i"
Just remember what the doc says:
Any combination of adjacent single-quote, double-quote, percent strings will be concatenated as long as a percent-string is not last.
From the wikipedia link
Any single non-alpha-numeric character can be used as the delimiter, %[including these], %?or these?,...
In your case it is %<space>i<space>: the space plays the same role as the delimiters in %[..] and %?..? from the link just above. That is why %<space>i<space> gives "i". (I wrote <space> to show where the spaces are.)
Read Kernel#format
Returns the string resulting from applying format_string to any additional arguments. Within the format string, any characters other than format sequences are copied to the result.
The syntax of a format sequence is as follows.
%[flags][width][.precision]type
A format sequence consists of a percent sign, followed by optional flags, width, and precision indicators, then terminated with a field type character. The field type controls how the corresponding sprintf argument is to be interpreted, while the flags modify that interpretation.
Your last question actually points to the method str % arg → new_str.
If IRB fooled you, as it fooled me while trying to understand % i, don't worry; have a look at "why in IRB modulo string literal(%) is behaving differently?". Matthew Kerwin gives a good answer there.
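A short sketch contrasting the two meanings of % discussed above: the String#% formatting method (which follows the Kernel#format syntax) and percent string literals. The variable names are illustrative only.

```ruby
# String#% applies a format sequence: %[flags][width][.precision]type
padded  = "%2d" % 7              # width 2, right-aligned      => " 7"
wide    = "%2d" % 42             # already 2 characters wide   => "42"
left    = "%-4s|" % "ab"         # "-" flag left-aligns        => "ab  |"
float   = "%05.1f" % 3.14159     # zero-padded, 1 decimal      => "003.1"
several = "%d x %d" % [3, 4]     # an array supplies many args => "3 x 4"
same    = format("%2d", 7)       # Kernel#format equivalent    => " 7"

# Percent literals build strings; they are unrelated to formatting.
single = %q(1 + 1 = #{1 + 1})    # like single quotes: no interpolation
double = %Q(1 + 1 = #{1 + 1})    # like double quotes          => "1 + 1 = 2"
```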
DESCRIPTION:
The purpose of my code is to take as input a sequence of R's and C's and simply store each number that comes after the character in its proper array.
For example, the input format is as follows: R1C4R2C5
Column Array: [ 4, 5 ] Row Array: [1,2]
My problem is I am getting the output like this:
[" ", 1]
[" ", 4]
[" ", 2]
[" ", 5]
How do I get all the row integers following R in one array, and all the column integers following C in another, separate array? I do not want to create multiple arrays, just two.
Help!
CODE:
puts 'Please input: '
input = gets.chomp
word2 = input.scan(/.{1,2}/)
col = []
row = []
word2.each {|a| col.push(a.split(/C/)) if a.include? 'C' }
word2.each {|a| row.push(a.split(/R/)) if a.include? 'R' }
col.each do |num|
puts num.inspect
end
row.each do |num|
puts num.inspect
end
x = "R1C4R2C5"
col = []
row = []
x.chars.each_slice(2) { |u| u[0] == "R" ? row << u[1] : col << u[1] }
p col
p row
The main problem with your code is that you replicate operations for rows and columns. You want to write "DRY" code, which stands for "don't repeat yourself".
Starting with your code as the model, you can DRY it out by writing a method like this to extract the information you want from the input string, and invoke it once for rows and once for columns:
def doit(s, c)
...
end
Here s is the input string and c is the string "R" or "C". Within the method you want to extract substrings that begin with the value of c and are followed by digits. Your decision to use String#scan was a good one, but you need a different regex:
def doit(s, c)
s.scan(/#{c}\d+/)
end
I'll explain the regex, but let's first try the method. Suppose the string is:
s = "R1C4R2C5"
Then
rows = doit(s, "R") #=> ["R1", "R2"]
cols = doit(s, "C") #=> ["C4", "C5"]
This is not quite what you want, but easily fixed. First, though, the regex. The regex first looks for the character #{c}: the interpolation #{c} inserts the value of the variable c into the regex as a literal character, which in this case will be "R" or "C". \d+ means that character must be followed by one or more digits 0-9, as many as are present before the next non-digit (here an "R" or "C") or the end of the string.
Now let's fix the method:
def doit(s, c)
a = s.scan(/#{c}\d+/)
b = a.map {|str| str[1..-1]}
b.map(&:to_i)
end
rows = doit(s, "R") #=> [1, 2]
cols = doit(s, "C") #=> [4, 5]
Success! As before, a => ["R1", "R2"] if c => "R" and a => ["C4", "C5"] if c => "C". a.map {|str| str[1..-1]} maps each element of a into a string comprised of all characters but the first (e.g., "R12"[1..-1] => "12"), so we have b => ["1", "2"] or b => ["4", "5"]. We then apply map once again to convert those strings to their integer equivalents. The expression b.map(&:to_i) is shorthand for
b.map {|str| str.to_i}
The last computed quantity is returned by the method, so if it is what you want, as it is here, there is no need for a return statement at the end.
This can be simplified, however, in a couple of ways. Firstly, we can combine the last two statements by dropping the last one and changing the one above to:
a.map {|str| str[1..-1].to_i}
which also gets rid of the local variable b. The second improvement is to "chain" the two remaining statements, which also rids us of the other temporary variable:
def doit(s, c)
s.scan(/#{c}\d+/).map { |str| str[1..-1].to_i }
end
This is typical Ruby code.
Notice that by doing it this way, there is no requirement for row and column references in the string to alternate, and the numeric values can have arbitrary numbers of digits.
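A quick check of that claim, using a made-up input string in which the references do not alternate and some numbers have several digits:

```ruby
def doit(s, c)
  s.scan(/#{c}\d+/).map { |str| str[1..-1].to_i }
end

s = "R12C3C44R5"     # illustrative input, not from the question
rows = doit(s, "R")  #=> [12, 5]
cols = doit(s, "C")  #=> [3, 44]
```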
Here's another way to do the same thing, that some may see as being more Ruby-like:
s.scan(/[RC]\d+/).each_with_object([[],[]]) {|n,(r,c)|
(n[0]=='R' ? r : c) << n[1..-1].to_i}
Here's what's happening. Suppose:
s = "R1C4R2C5R32R4C7R18C6C12"
Then
a = s.scan(/[RC]\d+/)
#=> ["R1", "C4", "R2", "C5", "R32", "R4", "C7", "R18", "C6", "C12"]
scan uses the regex /[RC]\d+/ to extract substrings that begin with 'R' or 'C' followed by one or more digits up to the next letter or end of the string.
b = a.each_with_object([[],[]]) {|n,(r,c)|(n[0]=='R' ? r : c) << n[1..-1].to_i}
#=> [[1, 2, 32, 4, 18], [4, 5, 7, 6, 12]]
The row values are given by [1, 2, 32, 4, 18]; the column values by [4, 5, 7, 6, 12].
Enumerable#each_with_object (v1.9+) is given an array comprised of two empty arrays, [[],[]], which it passes to the block on every iteration and returns at the end. The first subarray will contain the row values, the second, the column values. These two subarrays are represented by the block variables r and c, respectively.
The first element of a is "R1". This is represented in the block by the variable n. Since
"R1"[0] #=> "R"
"R1"[1..-1] #=> "1"
we execute
r << "1".to_i #=> [1]
so now
[r,c] #=> [[1],[]]
The next element of a is "C4", so we will execute:
c << "4".to_i #=> [4]
so now
[r,c] #=> [[1],[4]]
and so on.
rows, cols = "R1C4R2C5".scan(/R(\d+)C(\d+)/).flatten.partition.with_index {|_, index| index.even? }
> rows
=> ["1", "2"]
> cols
=> ["4", "5"]
Or
rows = "R1C4R2C5".scan(/R(\d+)/).flatten
=> ["1", "2"]
cols = "R1C4R2C5".scan(/C(\d+)/).flatten
=> ["4", "5"]
And to fix your code use:
word2.each {|a| col.push(a.delete('C')) if a.include? 'C' }
word2.each {|a| row.push(a.delete('R')) if a.include? 'R' }
I'm attempting to create a word generator based on a 4x4 grid of letters (below).
Here are the rules:
Letters cannot be repeated
Words must be formed by adjacent letters
Words can be formed horizontally, vertically or diagonally to the left, right or up-and-down
Currently, I take a 16-character input and loop through every word in the dictionary, determining whether that word can be spelled with the letters on the grid.
#!/usr/bin/ruby
require './scores' # alphabet and associated Scrabble scoring value (ie the wordValue() method)
require './words.rb' # dictionary of English words (ie the WORDS array)
# grab users letters
puts "Provide us the 16 letters of your grid (no spaces please)"
word = gets.chomp.downcase
arr = word.split('')
# store words that can be spelled with user's letters
success = []
# iterate through dictionary of words
WORDS.each do |w|
# create temp arrays
dict_arr = w.split('')
user_arr = arr.dup
test = true
# test whether users letters spell current word in dict
while test
dict_arr.each do |letter|
if (user_arr.include?(letter))
i = user_arr.index(letter)
user_arr.delete_at(i)
else
test = false
break
end
end
# store word in array
if test
success << w
test = false
end
end
end
# create hash for successful words and their corresponding values
SUCCESS = {}
success.each do |w|
score = wordValue(w)
SUCCESS[w] = score
end
# sort hash from lowest to highest value
SUCCESS = SUCCESS.sort_by {|word, value| value}
# print results to screen
SUCCESS.each {|k,v| puts "#{k}: #{v}"}
However, this approach doesn't take into account the positions of the tiles on the board. How would you suggest I go about finding words that can be created based on their location in the 4x4 grid?
For the board game in the image above, it takes about 1.21 seconds for my VM running Ubuntu to compute the 1185 possible words. I'm using the dictionary of words provided with Ubuntu in /usr/share/dict/words
Instead of iterating over words and searching for their presence, walk through each tile on the grid and find all words stemming from that tile.
First, compile your dictionary into a trie. Tries are efficient at performing prefix-matching string comparisons, which will be of use to us shortly.
To find the words within the board, perform the following steps for each of the 16 tiles, starting with an empty string for prefix:
1) Add the current tile's value to prefix.
2) Check if our trie contains any words starting with prefix.
3) If it does, branch the search: for each legal (unvisited) tile that is adjacent to this tile, go back to step 1 (recurse).
4) If it doesn't, stop this branch of the search, since there are no matching words.
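The steps above can be sketched roughly as follows. This is only an illustration under some assumptions: the board is a 16-character string read left to right, top to bottom, the trie is a nested Hash, and all method names (`build_trie`, `neighbors`, `search`, `find_words`) are made up here.

```ruby
# Build a nested-hash trie; the :word key marks the end of a complete word.
def build_trie(words)
  words.each_with_object({}) do |word, root|
    node = word.each_char.inject(root) { |n, ch| n[ch] ||= {} }
    node[:word] = word
  end
end

# Cells adjacent to (row, col) on a size x size grid, including diagonals.
def neighbors(row, col, size)
  [-1, 0, 1].product([-1, 0, 1])
            .reject { |dr, dc| dr.zero? && dc.zero? }
            .map { |dr, dc| [row + dr, col + dc] }
            .select { |r, c| r.between?(0, size - 1) && c.between?(0, size - 1) }
end

# Depth-first search from one cell, following only prefixes found in the trie.
def search(grid, row, col, node, visited, found)
  child = node[grid[row][col]]
  return if child.nil?                   # no word starts with this prefix
  found << child[:word] if child[:word]  # the prefix is itself a word
  visited << [row, col]
  neighbors(row, col, grid.size).each do |r, c|
    search(grid, r, c, child, visited, found) unless visited.include?([r, c])
  end
  visited.pop                            # backtrack so other paths may use the cell
end

def find_words(board, words)
  grid = board.chars.each_slice(4).to_a
  trie = build_trie(words)
  found = []
  grid.each_index do |r|
    grid[r].each_index { |c| search(grid, r, c, trie, [], found) }
  end
  found.uniq.sort
end

p find_words("ruttybsiearoghol", %w[ruby bears honey beast rusty burb bras])
# prints ["bears", "beast", "bras", "ruby"]
```

Because a branch is abandoned as soon as its prefix is missing from the trie, most of the possible paths through the grid are never explored.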
I would create a simple graph representing the whole board. Letters would be vertices. If two letters are near one another on the board I would create an edge between their vertices. It would be very easy to find out whether the input is valid. You simply would have to check whether there is a matching path in the graph.
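As a rough illustration of such a graph, the adjacency list for a square grid can be generated rather than typed by hand. The method name `grid_adjacency` and the row-major cell numbering (0 to 15 for a 4x4 board) are assumptions made for this sketch.

```ruby
# Hypothetical helper: for each cell (numbered row by row from 0),
# the sorted list of cells that touch it horizontally, vertically or diagonally.
def grid_adjacency(size)
  offsets = [-1, 0, 1].product([-1, 0, 1]) - [[0, 0]]
  (0...size * size).map do |cell|
    row, col = cell.divmod(size)
    offsets.map { |dr, dc| [row + dr, col + dc] }
           .select { |r, c| r.between?(0, size - 1) && c.between?(0, size - 1) }
           .map { |r, c| r * size + c }
  end
end

adjacent = grid_adjacency(4)
adjacent[3] #=> [2, 6, 7]
adjacent[5] #=> [0, 1, 2, 4, 6, 8, 9, 10]
```

A word is then valid if its letters can be matched along a path of edges that never revisits a vertex.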
My original answer was not what you wanted. I was creating a list of all of the
"words" in the grid, instead of searching for the words you had already identified
from the dictionary. Now I have written a function which searches the grid for a
particular word. It works recursively.
So, now the algorithm is:
1) Get the 16 letters from the user
2) Search dictionary for all words with those letters
3) Call is_word_on_board with each of those words to see if you have a match
#!/usr/bin/ruby
# This script searches a board for a word
#
# A board is represented by a string of letters, for instance, the string
# "abcdefghijklmnop" represents the board:
#
# a b c d
# e f g h
# i j k l
# m n o p
#
# The array ADJACENT lists the cell numbers that are adjacent to another
# cell. For instance ADJACENT[3] is [2, 6, 7]. If the cells are numbered
#
# 0 1 2 3
# 4 5 6 7
# 8 9 10 11
# 12 13 14 15
ADJACENT = [
[1, 4, 5],
[0, 2, 4, 5, 6],
[1, 3, 5, 6, 7],
[2, 6, 7],
[0, 1, 5, 8, 9],
[0, 1, 2, 4, 6, 8, 9, 10],
[1, 2, 3, 5, 7, 9, 10, 11],
[2, 3, 6, 10, 11],
[4, 5, 9, 12, 13],
[4, 5, 6, 8, 10, 12, 13, 14],
[5, 6, 7, 9, 11, 13, 14, 15],
[6, 7, 10, 14, 15],
[8, 9, 13],
[8, 9, 10, 12, 14],
[9, 10, 11, 13, 15],
[10, 11, 14]
]
# function: is_word_on_board
#
# parameters:
# word - word you're searching for
# board - string of letters representing the board, left to right, top to bottom
# prefix - partial word found so far
# cell - position of last letter chosen on the board
#
# returns true if word was found, false otherwise
#
# Note: You only need to provide the word and the board. The other two parameters
# have default values, and are used by the recursive calls.
# set this to true to log the recursive calls
DEBUG = false
def is_word_on_board(word, board, prefix = "", cell = -1)
if DEBUG
puts "word = #{word}"
puts "board = #{board}"
puts "prefix = #{prefix}"
puts "cell = #{cell}"
puts
end
# If we're just beginning, start word at any cell containing
# the starting letter of the word
if prefix.length == 0
0.upto(15) do |i|
if word[0] == board[i]
board_copy = board.dup
newprefix = board[i,1]
# put "*" in place of letter so we don't reuse it
board_copy[i] = ?*
# recurse, and return true if the word is found
if is_word_on_board(word, board_copy, newprefix, i)
return true
end
end
end
# we got here without finding a match, so return false
return false
elsif prefix.length == word.length
# we have the whole word!
return true
else
# search adjacent cells for the next letter in the word
ADJACENT[cell].each do |c|
# if the letter in this adjacent cell matches the next
# letter of the word, add it to the prefix and recurse
if board[c] == word[prefix.length]
newprefix = prefix + board[c, 1]
board_copy = board.dup
# put "*" in place of letter so we don't reuse it
board_copy[c] = ?*
# recurse, and return true if the word is found
if is_word_on_board(word, board_copy, newprefix, c)
return true
end
end
end
# bummer, no word found, so return false
return false
end
end
puts "Test board:"
puts
puts " r u t t"
puts " y b s i"
puts " e a r o"
puts " g h o l"
puts
board = "ruttybsiearoghol"
for word in ["ruby", "bears", "honey", "beast", "rusty", "burb", "bras", "ruttisbyearolohg", "i"]
if is_word_on_board(word, board)
puts word + " is on the board"
else
puts word + " is NOT on the board"
end
end
Running this script gives the following results:
Test board:
r u t t
y b s i
e a r o
g h o l
ruby is on the board
bears is on the board
honey is NOT on the board
beast is on the board
rusty is NOT on the board
burb is NOT on the board
bras is on the board
ruttisbyearolohg is on the board
i is on the board
I happen to have a Boggle solver I wrote a while ago. It follows Cheeken's outline. It's invoked a bit differently (you supply the word list file and a text file with a 4x4 grid as arguments), but I figured it was worth sharing. Also note that it treats "Q" as "QU", so there's some extra logic in there for that.
require 'set'
def build_dict(dict, key, value)
if key.length == 0
dict[:a] = value
else
if key[0] == "q"
first = key[0..1]
rest = key[2, key.length - 1]
else
first = key[0]
rest = key[1, key.length - 1]
end
dict[first] = {} unless dict.has_key? first
build_dict(dict[first], rest, value)
end
end
dict = {}
#parse the file into a dictionary
File.open(ARGV[0]).each_line do |line|
real_line = line.strip
build_dict(dict, real_line, real_line)
end
#parse the board
board = {}
j = 0
File.open(ARGV[1]).each_line do |line|
line.chars.each_with_index do |l, i|
board[[j, i]] = l
end
j += 1
end
#(0..3).each{|r| puts (0..3).map{|c| board[[r, c]]}.join}
#how can i get from one place to another?
def get_neighbors(slot, sofar)
r, c = slot
directions =
[
[r+1, c],
[r+1, c+1],
[r+1, c-1],
[r, c+1],
[r, c-1],
[r-1, c],
[r-1, c+1],
[r-1, c-1]
]
directions.select{|a| a.all?{|d| d >= 0 && d <= 3} && !sofar.include?(a)}
end
#actual work
def solve(board, slot, word_dict, sofar)
results = Set.new
letter = board[slot]
letter = "qu" if letter == "q"
stuff = word_dict[letter]
return results if stuff.nil?
if stuff.has_key? :a
results << stuff[:a] if stuff[:a].length > 2
end
unless stuff.keys.select{|key| key != :a}.empty?
get_neighbors(slot, sofar).each do |dir|
results += solve(board, dir, stuff, sofar.clone << slot)
end
end
results
end
#do it!
results = Set.new
all_slots = (0..3).to_a.product((0..3).to_a)
all_slots.each do |slot|
results += solve(board, slot, dict, [])
end
puts results.sort