i am working on the problem of Number shuffle https://rubymonk.com/learning/books/1-ruby-primer/problems/154-permutations#solution4802. exercises asks to:
return a sorted array of all the unique numbers that can be formed
with 3 or 4 digits.
there is a solution (See the Solution) below the exercise, that looks like this:
def number_shuffle(number)
no_of_combinations = number.to_s.size == 3 ? 6 : 24
digits = number.to_s.split(//)
combinations = []
combinations << digits.shuffle.join.to_i while
combinations.uniq.size!=no_of_combinations
combinations.uniq.sort
end
I have a few questions, can anyone explain me:
1) in 'no_of_combinations' variable what does it mean '3 ? 6 : 24'? i think 3 is amount of digits in the number. question mark ( ? ) is symbol of 'if'- if number digits are 3, the amount of numbers will be 6 in the array of combinations. colon (punctuation) is symbol sign, but i do not know why 24, there are 23 symbols considering white space in the array of combinations.
2) what does it mean << symbol after combinations? i know that it is addition sign, but what does it do here? and also, what it means exclamation mark after 'size' in the following string?
1) in 'no_of_combinations' variable what does it mean '3 ? 6 : 24'?
The expression needs to include the comparison to make sense . . .
number.to_s.size == 3 ? 6 : 24
This is the ternary if. If the comparison before the ? is true, it evaluates to the first value (6 here), if it is false it evaluates to the second value (24 here). It has nothing to do with literal Symbol values . . . in fact Ruby parser will always treat the colon as a value separator here, however you space it.
The syntax of this operator is originally from C, and you will find it copied to many other languages.
2) what does it mean << symbol after combinations? i know that it is addition sign, but what does it do here?
It is not an addition sign. This operator does different things depending on the class of the object (on the left). In this example, that is an Array, and << pushes the object on the right onto the end of the array. It is almost identical to push
and also, what it means exclamation mark after 'size' in the following string?
In this case it is part of !=, or "not equals" comparison operator. The original author could have made this clearer with a bit of whitespace.
Related
That's what I am doing:
c.scan(/[1-9]|1[0-2]/)
For some reason, it returns only numbers from 1 to 9, ignoring the second part. I tried experimenting a little bit, it seems that the method will search for 10-12 only if 1 is excluded from [1-9] part, e.g., c.scan(/[2-9]|1[0-2]/) will do. What is the reason?
P.S. I know that this method lacks lookbehinds and will search for numbers and "part of numbers" as well
Change the order of your patterns and add word boundaries if necessary.
c.scan(/\b(?:1[0-2]|[1-9])\b/)
The pattern before | is used first. So in our case, it matches all the numbers from 10 to 12. After that the next pattern, that is the one after | is used and now it matches all the remaining numbers ranges from 1 to 9. Note that this would match 9 in 59 also. So i suggest you to put your pattern inside a capturing or non-capturing group and add word boundary \b (matches between a word character and a non-word character) before and after to that group .
DEMO
| matches left to right, and the first part of the right side (1) is always matched by the left side. Reverse them:
c.scan(/1[0-2]|[1-9]/)
Here's another way you might consider extracting numbers between 1 and 12 (assuming that's what you want to do):
c = '14 0 11x 15 003 y12'
c.scan(/\d+/).map(&:to_i).select { |n| (1..12).cover?(n) }
#=> [11, 3, 12]
I've returned an array of integers, rather than strings, thinking that probably would be more useful, but if you want strings:
c.scan(/\d+/).map { |s| s.to_i.to_s }
.select { |s| ['10', '11', '12', *'1'..'9'].include?(s) }
#=> ["11", "3", "12"]
I see several advantages to this approach, versus using a single regex:
it's easy to understand;
the regex is simple;
it's easy to modify if the permissible values change; and
it can be broken into three pieces to facilitate testing.
Right now I am working on a project that issues IDs consisting of both letters and numbers, for example 345A22. I need this program to be able to tell that for example, 345B22 is greater than 345A22. I can't assume that the letters will be in the same position all the time (ie we do have some id's with 22335Q) but when I compare two numbers the letters will be in the same position.
How do I accomplish this in Ruby?
You can use the String#<=> method to compare strings. See documentation here.
>> "345B22" <=> "345A22"
=> 1
Where the 1 return value means that 345B22 is greater.
If a simple string comparison won't do the trick (e.g. different lengths, etc.), try converting the IDs (assuming they all match ^[0-9A-Z]*$) into integers by treating them as base36-encoded data.
In Ruby strings have the same comparison methods as numbers have.
2 > 1 #=> true
"2" > "1" #=> true
"B" > "A" #=> true
Not sure I understand your question, but I'm guessing that you mentally parse the ids into components (so 345B22 is 345, B, 22) and then are wishing for a numeric sort for things that are numbers (i.e., 12 > 2) and a string sort for things that are strings (AB < B).
If this is what you intend, something like the following would do the trick:
ids.sort_by do |id|
id.scan(/\d+|[a-zA-Z]+/).map {|c| c =~ /\d/ ? c.rjust(20) : c.ljust(20) }.join
end
What this does is extract out all consecutive numbers or letters and then justify them right or left based on their type, concatenates the result and then sorts based on this (expanded and canonicalized) id.
I'm working with OPE IDs. One file has them with two trailing zeros, eg, [998700, 1001900]. The other file has them with one or two leading zeros for a total length of six, eg, [009987, 010019]. I want to convert every OPE ID (in both files) to an eight-digit string with exactly two leading zeros and however many zeros at the end to get it to be eight digits long.
Try this:
a = [ "00123123", "077934", "93422", "1231234", "12333" ]
a.map { |n| n.gsub(/^0*/, '00').ljust(8, '0') }
=> ["00123123", "00779340", "00934220", "001231234", "00123330"]
If you have your data parsed and stored as strings, it could be done like this, for example.
n = ["998700", "1001900", "009987", "0010019"]
puts n.map { |i|
i =~ /^0*([0-9]+?)0*$/
"00" + $1 + "0" * [0, 6 - $1.length].max
}
Output:
00998700
00100190
00998700
00100190
This example on codepad.
I'm note very sure though, that I got the description exactly right. Please check the comments and I correct in case it's not exactly what you were looking for.
With the help of the answers given by #detunized & #nimblegorilla, I came up with:
"998700"[0..-3].rjust(6, '0').to_sym
to make the first format I described (always with two trailing zeros) equal to the second.
can you help me with this:
I want a regular expression for my Ruby program to match a word with the below pattern
Pattern has
List of letters ( For example. ABCC => 1 A, 1 B, 2 C )
N Wild Card Charaters ( N can be 0 or 1 or 2)
A fixed word (for example “XY”).
Rules:
Regarding the List of letters, it should match words with
a. 0 or 1 A
b. 0 or 1 B
c. 0 or 1 or 2 C
Based on the value of N, there can be 0 or 1 or 2 wild chars
Fixed word is always in the order it is given.
The combination of all these can be in any order and should match words like below
ABWXY ( if wild char = 1)
BAXY
CXYCB
But not words with 2 A’s or 2 B’s
I am using the pattern like ^[ABCC]*.XY$
But it looks for words with more than 1 A, or 1 B or 2 C's and also looks for words which end with XY, I want all words which have XY in any place and letters and wild chars in any postion.
If it HAS to be a regex, the following could be used:
if subject =~
/^ # start of string
(?!(?:[^A]*A){2}) # assert that there are less than two As
(?!(?:[^B]*B){2}) # and less than two Bs
(?!(?:[^C]*C){3}) # and less than three Cs
(?!(?:[ABCXY]*[^ABCXY]){3}) # and less than three non-ABCXY characters
(?=.*XY) # and that XY is contained in the string.
/x
# Successful match
else
# Match attempt failed
end
This assumes that none of the characters A, B, C, X, or Y are allowed as wildcards.
I consider myself to be fairly good with regular expressions and I can't think of a way to do what you're asking. Regular expressions look for patterns and what you seem to want is quite a few different patterns. It might be more appropriate to in your case to write a function which splits the string into characters and count what you have so you can satisfy your criteria.
Just to give an example of your problem, a regex like /[abc]/ will match every single occurrence of a, b and c regardless of how many times those letters appear in the string. You can try /c{1,2}/ and it will match "c", "cc", and "ccc". It matches the last case because you have a pattern of 1 c and 2 c's in "ccc".
One thing I have found invaluable when developing and debugging regular expressions is rubular.com. Try some examples and I think you'll see what you're up against.
I don't know if this is really any help but it might help you choose a direction.
You need to break out your pattern properly. In regexp terms, [ABCC] means "any one of A, B or C" where the duplicate C is ignored. It's a set operator, not a grouping operator like () is.
What you seem to be describing is creating a regexp based on parameters. You can do this by passing a string to Regexp.new and using the result.
An example is roughly:
def match_for_options(options)
pattern = '^'
pattern << 'A' * options[:a] if (options[:a])
pattern << 'B' * options[:b] if (options[:b])
pattern << 'C' * options[:c] if (options[:c])
Regexp.new(pattern)
end
You'd use it something like this:
if (match_for_options(:a => 1, :c => 2).match('ACC'))
# ...
end
Since you want to allow these "elements" to appear in any order, you might be better off writing a bit of Ruby code that goes through the string from beginning to end and counts the number of As, Bs, and Cs, finds whether it contains your desired substring. If the number of As, Bs, and Cs, is in your desired limits, and it contains the desired substring, and its length (i.e. the number of characters) is equal to the length of the desired substring, plus # of As, plus # of Bs, plus # of Cs, plus at most N characters more than that, then the string is good, otherwise it is bad. Actually, to be careful, you should first search for your desired substring and then remove it from the original string, then count # of As, Bs, and Cs, because otherwise you may unintentionally count the As, Bs, and Cs that appear in your desired string, if there are any there.
You can do what you want with a regular expression, but it would be a long ugly regular expression. Why? Because you would need a separate "case" in the regular expression for each of the possible orders of the elements. For example, the regular expression "^ABC..XY$" will match any string beginning with "ABC" and ending with "XY" and having two wild card characters in the middle. But only in that order. If you want a regular expression for all possible orders, you'd need to list all of those orders in the regular expression, e.g. it would begin something like "^(ABC..XY|ACB..XY|BAC..XY|BCA..XY|" and go on from there, with about 5! = 120 different orders for that list of 5 elements, then you'd need more for the cases where there was no A, then more for cases where there was no B, etc. I think a regular expression is the wrong tool for the job here.
I'm trying to create a system where I can convert RegEx values to integers and vice versa. where zero would be the most basic regex ( probably "/./" ), and any subsequent numbers would be more complex regex's
My best approach so far was to stick all the possible values that could be contained within a regex into an array:
values = [ "!", ".", "\/", "[", "]", "(", ")", "a", "b", "-", "0", "9", .... ]
and then to take from that array as follows:
def get( integer )
if( integer.zero? )
return '';
end
integer = integer - 1;
if( integer < values.length )
return values[integer]
end
get(( integer / values.length ).floor) + get( integer % values.length);
end
sample_regex = /#{get( 100 )}/;
The biggest problem with this approach is that a invalid RegExp can easily be generated.
Is there an already established algorithm to achieve what I'm trying? if not, any suggestions?
Thanx
Steve
Since regular expressions can be formally defined by recursively applying a finite number of elements, this can be done: instead of simply concatenating elements, combine them according to the rules of regular expressions. Because the regular language is also recursively enumerable, this is guaranteed to work.
However, it's quite probably overkill to implement this. What do you need this for? Would a simple dictionary of Number -> RegExp key-value pairs not be better suited to associate regular expressions with unique numbers?
I would say that // is the simplest regex (it matches anything). /./ is fairly complex since it is just shorthand for /[^\n]/, which itself is just shorthand for a much longer expression (what that expression is depends on your character set). The next simplest expression would be /a/ where a is the first character in your character set. That last statement brings up an interesting problem for your enumeration: what character set will you use? Any enumeration will be tied to a given character set. Assuming you start with // as 0, /\x{00}/ (match the nul character) as 1, /\x{01}/ as 2, etc. Then you would start to get into interesting regexes (ones that match more than one string) around 129 if you used the ASCII set, but it would take up to 1114112 for UNICODE 5.0.
All in all, I would say a better solution is treat the number as a sequence of bytes, map those bytes into whatever character set you are using, use a regex compiler to determine if that number is a valid regex, and discard numbers that are not valid.