How to parse on a hash of symbols - ruby

I tried to downcase a parsed option so that it matches only lower-case symbols, because when I pass an uppercase or mixed-case word my parser returns nil.
I don't want to have a list like [:ens, :ENS, :eNS, :enS ...]:
opts.on("-i", "--instance [INSTANCE]", [:ens, :etu], "Selectionnez l'instance de Gitlab (etu, ens)") do |instance|
# puts instance.inspect
Options[:instance] = instance
end
Example:
./gitlabCollect -t my_token -k my_keyword -i ENS
will not work because the hash returned is:
{:token=>"my_token", :keyword=>"my_keyword", :instance=>nil}

It seems you are missing something about symbols: they are case-sensitive, and two symbols are equal only if their characters match exactly, so :sss != :Sss. Try :sss.hash and :Sss.hash in irb/pry. You will see they map to two different numbers. If you need a case-insensitive comparison, you can always convert to a string and upcase it.
symbol1 = :sss
symbol2 = :Sss
symbol1.to_s.upcase == symbol2.to_s.upcase #=> true
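Applied to the option parser from the question, one way to get this behaviour is to accept the argument as a plain String, normalise its case, and check it against the allowed list yourself. This is only a sketch: the method name parse_options and the constant ALLOWED_INSTANCES are made up for illustration, and the argument is assumed to be mandatory rather than optional as in the question.

```ruby
require 'optparse'

# Hypothetical list of accepted instances, taken from the question.
ALLOWED_INSTANCES = [:ens, :etu].freeze

# Illustrative helper: parses argv and returns an options hash.
def parse_options(argv)
  options = {}
  parser = OptionParser.new do |opts|
    # Accept any string here; the case is normalised inside the block.
    opts.on("-i", "--instance INSTANCE", String,
            "Select the Gitlab instance (etu, ens)") do |raw|
      instance = raw.downcase.to_sym
      # Reject values that are still unknown after downcasing.
      unless ALLOWED_INSTANCES.include?(instance)
        raise OptionParser::InvalidArgument, raw
      end
      options[:instance] = instance
    end
  end
  parser.parse(argv) # parse (without !) leaves argv untouched
  options
end

parse_options(%w[-i ENS])[:instance] # => :ens
```

With this approach the list of allowed symbols stays lower-case only, and an unknown value raises OptionParser::InvalidArgument instead of silently producing nil.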

Related

Take an array and a letter as arguments and return a new array with words that contain that letter

I can run a search and find the element I want and can return those words with that letter. But when I start to put arguments in, it doesn't work. I tried select with include? and it throws an error saying, private method. This is my code, which returns what I am expecting:
my_array = ["wants", "need", 3, "the", "wait", "only", "share", 2]
def finding_method(source)
words_found = source.grep(/t/) #I just pick random letter
print words_found
end
puts finding_method(my_array)
# => ["wants", "the", "wait"]
I need to add the second argument, but it breaks:
def finding_method(source, x)
words_found = source.grep(/x/)
print words_found
end
puts finding_method(my_array, "t")
This doesn't work, (it returns an empty array because there isn't an 'x' in the array) so I don't know how to pass an argument. Maybe I'm using the wrong method to do what I'm after. I have to define 'x', but I'm not sure how to do that. Any help would be great.
Regular expressions support string interpolation just like strings.
/x/
looks for the character x.
/#{x}/
will first interpolate the value of the variable and produce /t/, which does what you want. Mostly.
Note that if you are trying to search for any text that might have any meaning in regular expression syntax (like . or *), you should escape it:
/#{Regexp.quote(x)}/
That's the correct answer for any situation where you are including literal strings in a regular expression that you haven't built yourself specifically for the purpose of being a regular expression, i.e. 99% of cases where you're interpolating variables into regexps.
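Putting the pieces together, the method from the question could look roughly like this. It is a sketch that reuses the question's names (finding_method, my_array); the parameter name letter is an assumption, and Regexp.escape is used here, which is the same method as Regexp.quote.

```ruby
# Returns the elements of source that contain the given text.
# Non-string elements (such as the integers in the sample array) never
# match a regexp, so grep skips them.
def finding_method(source, letter)
  source.grep(/#{Regexp.escape(letter)}/)
end

my_array = ["wants", "need", 3, "the", "wait", "only", "share", 2]

finding_method(my_array, "t")         # => ["wants", "the", "wait"]
# Thanks to the escaping, "." is treated as a literal dot, not as "any character".
finding_method(["a.b", "axb"], ".")   # => ["a.b"]
```

Returning the array instead of printing it inside the method lets the caller decide what to do with the result.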

Regex string with grouping?

I see in the documentation I'm able to do:
/\$(?<dollars>\d+)\.(?<cents>\d+)/ =~ "$3.67" #=> 0
puts dollars #=> prints 3
I was wondering if this would be possible:
string = "\$(\?<dlr>\d+)\.(\?<cts>\d+)"
/#{Regexp.escape(string)}/ =~ "$3.67"
I get:
`<main>': undefined local variable or method `dlr' for main:Object (NameError)
There are a few mistakes in your approach. First of all, let's look at your string:
string = "\$(\?<dlr>\d+)\.(\?<cts>\d+)"
You escape the dollar sign with "\$", but in a double-quoted string that is the same as just writing "$". Consider:
"\$" == "$"
#=> true
To actually end up with the string "backslash followed by dollar" you would need to write "\\$". The same thing applies to the decimal character classes, you would have to write "\\d" to end up with the correct string.
The question marks on the other hand are actually part of the regex syntax, so you do not want to escape these at all. I recommend using single quotes for your original string, because that makes the input much easier:
string = '\$(?<dlr>\d+)\.(?<cts>\d+)'
#=> "\\$(?<dlr>\\d+)\\.(?<cts>\\d+)"
The next issue is with Regexp.escape. Take a look at what regular expression it produces with the above string:
string = '\$(?<dlr>\d+)\.(?<cts>\d+)'
Regexp.escape(string)
#=> "\\\\\\$\\(\\?<dlr>\\\\d\\+\\)\\\\\\.\\(\\?<cts>\\\\d\\+\\)"
That's one level too much escaping. Regexp.escape can be used when you want to match the literal characters that are contained in the string. For example, the escaped regex above will match the source string itself:
/#{Regexp.escape(string)}/ =~ string
#=> 0 # matches at offset 0
Instead, you can use Regexp.new to treat the source as an actual regular expression.
The last issue is how you access the match result. You are getting a NameError because the named groups are not stored in local variables called dlr and cts. Ruby only creates such variables when a regex literal without interpolation appears on the left-hand side of =~, which is not the case here. You have two options to access the match data:
Use Regexp#match, which returns a MatchData object as its result
Use regexp =~ string and then access the last match data with the global variable $~
I prefer the former, because it is easier to read. The full code would then look like this:
string = '\$(?<dlr>\d+)\.(?<cts>\d+)'
regexp = Regexp.new(string)
result = regexp.match("$3.67")
#=> #<MatchData "$3.67" dlr:"3" cts:"67">
result[:dlr]
#=> "3"
result[:cts]
#=> "67"
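For completeness, the second option (=~ followed by a lookup of the last match data) might look like the sketch below. The method name parse_price is hypothetical; Regexp.last_match returns the same object as $~, and MatchData#named_captures needs Ruby 2.4 or later.

```ruby
# Builds the regexp from a source string and returns the named groups
# as a hash, or nil when the text does not match.
def parse_price(text, source = '\$(?<dlr>\d+)\.(?<cts>\d+)')
  regexp = Regexp.new(source)
  return nil unless regexp =~ text

  # Regexp.last_match is the same MatchData object as $~
  Regexp.last_match.named_captures
end

result = parse_price("$3.67")
result["dlr"] # => "3"
result["cts"] # => "67"
parse_price("no price here") # => nil
```

Note that named_captures uses string keys, whereas MatchData#[] accepts either a string or a symbol.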

How to validate that a string is a proper hexadecimal value in Ruby?

I am writing a 6502 assembler in Ruby. I am looking for a way to validate hexadecimal operands in string form. I understand that the String object provides a "hex" method to return a number, but here's a problem I run into:
"0A".hex #=> 10 - a valid hexadecimal value
"0Z".hex #=> 0 - invalid, produces a zero
"asfd".hex #=> 10 - Why 10? I guess it reads 'a' first and stops at 's'?
You will get some odd results by typing in a bunch of gibberish. What I need is a way to first verify that the value is a legit hex string.
I was playing around with regular expressions, and realized I can do this:
true if "0A" =~ /[A-Fa-f0-9]/
#=> true
true if "0Z" =~ /[A-Fa-f0-9]/
#=> true <-- PROBLEM
I'm not sure how to address this issue. I need to be able to verify that letters are only A-F and that if it is just numbers that is ok too.
I'm hoping to avoid spaghetti code riddled with "if" statements. I am hoping that someone could provide a "one-liner" or some form of elegant code.
Thanks!
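!str[/\H/] returns true when the string contains no invalid hex characters (\H matches any character that is not a hex digit). Note that it also returns true for an empty string.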
String#hex does not interpret the whole string as hex, it extracts from the beginning of the string up to as far as it can be interpreted as hex. With "0Z", the "0" is valid hex, so it interpreted that part. With "asfd", the "a" is valid hex, so it interpreted that part.
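One method (note that it rejects strings with leading zeros such as "0A", because to_s(16) does not preserve them):
str.to_i(16).to_s(16) == str.downcase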
Another:
str =~ /\A[a-f0-9]+\Z/i # or simply /\A\h+\Z/ (see hirolau's answer)
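About your regex: you have to use anchors (\A for the beginning of the string and \Z for the end of the string) to say that you want the full string to match. Also, the + repeats the match for one or more characters. Be aware that \Z still matches just before a single trailing newline; use \z if the string must end exactly there.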
Note that you could use ^ (begin of line) and $ (end of line), but this would allow strings like "something\n0A" to pass.
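As a small sketch of the anchored approach, the check can be wrapped in a predicate. The method name hex_string? is made up for illustration; it uses \h (any hex digit) together with the strict \A and \z anchors, and String#match?, which is available from Ruby 2.4.

```ruby
# True only when the whole string consists of one or more hex digits.
def hex_string?(str)
  str.is_a?(String) && str.match?(/\A\h+\z/)
end

hex_string?("0A")            # => true
hex_string?("0Z")            # => false
hex_string?("asfd")          # => false
hex_string?("something\n0A") # => false, unlike a ^...$ pattern
hex_string?("")              # => false
hex_string?(nil)             # => false
```

The is_a? guard is a design choice for this sketch so that nil and other non-string operands simply return false rather than raising.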
This is an old question, but I just had the issue myself. I opted for this in my code:
str =~ /^\h+$/
It has the added benefit of returning nil if str is nil.
Since Ruby has hex literals built in, you can eval the string and rescue the SyntaxError:
eval "0xA" #=> 10
eval "0xZ" #=> SyntaxError
You can use this in a method like:
def is_hex?(str)
  begin
    eval("0x#{str}")
    true
  rescue SyntaxError
    false
  end
end
is_hex?('0A') #=> true
is_hex?('0Z') #=> false
Of course, since you are using eval, make sure you only pass safe values to the method.

Convert matched string of UTF-8 values to UTF-8 characters in Ruby

Trying to convert output from a rest_client GET to the characters that are represented with escape sequences.
Input: ..."sub_id":"\u0d9c\u8138\u8134\u3f30\u8139\u2b71"...
(which I put in 'all_subs')
Match: m = /sub_id\"\:\"([^\"]+)\"/.match(all_subs.to_str) [1]
Print: puts m.force_encoding("UTF-8").unpack('U*').pack('U*')
But it just comes out the same way I put it in. ie, "\u0d9c\u8138\u8134\u3f30\u8139\u2b71"
However, if I convert a raw string of it:
puts "\u0d9c\u8138\u8134\u3f30\u8139\u2b71".unpack('U*').pack('U*')
The output is perfect as "ග脸脴㼰脹⭱"
What you're getting when you parse the input string is actually this:
m = "\\u0d9c\\u8138\\u8134\\u3f30\\u8139\\u2b71"
Which is not the same as:
"\u0d9c\u8138\u8134\u3f30\u8139\u2b71"
Therefore one option is to eval the string so that Ruby applies the codepoints:
puts eval("\"#{m}\"")
=> ග脸脴㼰脹⭱
However, note that there are security implications when running eval.
If the string always looks like the one in your example, you could also do something like this, which is safe:
puts m.split("\\u")[1..-1].map { |c| c.to_i(16) }.pack("U*")
=> ග脸脴㼰脹⭱
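If the escapes can be mixed with ordinary text, a gsub over the \uXXXX pattern is a slightly more general variant of the split approach. The sketch below makes two assumptions: the helper name decode_unicode_escapes is invented for illustration, and only 4-digit escapes without surrogate pairs are handled. It also shows the alternative of letting the JSON library decode the escapes, assuming the response body is valid JSON (the question only shows a fragment of it).

```ruby
require 'json'

# Replaces each literal \uXXXX sequence with the character it names.
# Assumption: 4-digit escapes only, no surrogate pairs.
def decode_unicode_escapes(text)
  text.gsub(/\\u(\h{4})/) { [$1.hex].pack("U") }
end

# Single quotes keep the backslashes literal, like the matched string m.
raw = '\u0d9c\u8138\u8134\u3f30\u8139\u2b71'
puts decode_unicode_escapes(raw)

# Assumed shape of the response body; JSON.parse decodes the escapes itself.
body = '{"sub_id":"\u0d9c\u8138\u8134\u3f30\u8139\u2b71"}'
puts JSON.parse(body)["sub_id"]
```

Both calls print the same six characters, and neither evaluates the input as Ruby code.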

can I do hash.has_key?('video' or 'video2') (ruby)

Or, even better, can I do hash.has_key?('videox') where x is either nothing ('') or a digit, so that 'video', 'video1', and 'video2' would all pass the condition?
Of course I can use two conditions, but if I need video3 in the future, things would get more complicated...
If you want the general case of video followed by a digit without explicitly listing all the combinations there are a couple of methods from Enumerable that you could use in combination with a regular expression.
hash.keys is an array of the keys from hash and ^video\d$ matches video followed by a digit.
# true if the block returns true for any element
hash.keys.any? { |k| k.match(/^video\d$/) }
or
# grep returns an array of the matching elements
hash.keys.grep(/^video\d$/).size > 0
grep would also allow you to capture the matching key(s) if you needed that information for the next bit of your code e.g.
if (matched_keys = hash.keys.grep(/^video\d$/)).size > 0
  puts "Matching keys #{matched_keys.inspect}"
end
Furthermore, if the prefix of key we're looking for is in a variable rather than a hard coded string we can do something along the lines of:
prefix = 'video'
# use an interpolated string, note the need to escape the
# \ in \d
hash.keys.any? { |k| k.match("^#{prefix}\\d$") }
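The patterns above require exactly one digit, so a plain 'video' key would not match. To cover the case from the question where the digit is optional, the digit can be made optional with \d?. The sketch below is one way to write that; the method name video_key? is hypothetical, and calling to_s on each key is an added assumption so that symbol keys work as well.

```ruby
# True when any key is the prefix alone or the prefix plus one digit.
def video_key?(hash, prefix = "video")
  # \d? makes the digit optional; \A and \z anchor the whole key.
  pattern = /\A#{Regexp.escape(prefix)}\d?\z/
  hash.keys.any? { |key| pattern.match?(key.to_s) }
end

video_key?({ "video" => 1 })    # => true
video_key?({ "video2" => 1 })   # => true
video_key?({ "video22" => 1 })  # => false
video_key?({ "audio" => 1 })    # => false
```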
One possibility:
hash.values_at(:a, :b, :c, :x).compact.length > 1
