Convert Unicode number to Natural number in ruby

I am currently having a problem with Unicode characters in my Rails 3 project.
In Khmer, the digit character "៤" is equal to 4.
I want to compare ៤ >= 3 but can't.
Can anyone suggest how to compare them? Maybe there is a method that could convert ៤ to 4 so that I can do the comparison.
Note
I can type ៤ by switching the keyboard from Eng to Khm and typing 4 as normal.
Thanks

Do the numbers behave the same way as Arabic numerals? If so, you can use this little helper method to convert a Khmer-number string to an integer (or to a float, if it contains a decimal point):
# encoding: utf-8
class String
  def to_khmer
    # Replace each Khmer digit with its index (0-9); keep other characters as-is
    num_string = chars.map { |c| %w[០ ១ ២ ៣ ៤ ៥ ៦ ៧ ៨ ៩].index(c) || c }.join
    if num_string =~ /\./
      num_string.to_f
    else
      num_string.to_i
    end
  end
end
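As a rough alternative that avoids reopening String, the same digit mapping can be sketched with String#tr. The names KHMER_DIGITS, ASCII_DIGITS and khmer_to_i are hypothetical and only for illustration; the sketch assumes the input contains whole numbers only.

```ruby
# encoding: utf-8

# Hypothetical constants: the ten Khmer digits and their ASCII counterparts,
# listed in the same order so that tr maps them one to one.
KHMER_DIGITS = "០១២៣៤៥៦៧៨៩"
ASCII_DIGITS = "0123456789"

# Hypothetical helper: translate Khmer digits to ASCII digits, then parse.
def khmer_to_i(str)
  str.tr(KHMER_DIGITS, ASCII_DIGITS).to_i
end

puts khmer_to_i("៤") >= 3
puts khmer_to_i("១២៣")
```

Once the value is an ordinary Integer, comparisons such as >= work as usual.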

Yes, you can do that
"s".ord == 115 #=> true
115.chr == "s" #=> true
4.chr.ord == 4 #=> true

Related

How can I create a method that checks if a string starts with a capitalized letter?

So far I have:
def capitalized?(str)
str[0] == str[0].upcase
end
The problem with this is that it returns true for strings like "12345", "£$%^&" and "9ball" etc. I would like it to return true only if the first character is a capital letter.
You can use match? to return true only if the first character is an uppercase letter in the range A to Z:
def capitalized?(str)
str.match?(/\A[A-Z]/)
end
p capitalized?("12345") # false
p capitalized?("fooo") # false
p capitalized?("Fooo") # true
Also you can pass a regular expression to start_with?:
p 'Foo'.start_with?(/[A-Z]/) # true
p 'foo'.start_with?(/[A-Z]/) # false
There's probably a nicer way to do it with regex, but keeping this Ruby based, you can make a range of capital letters:
capital_letters = ("A".."Z")
Then you can check if your first letter is in that range:
def capitalized?(str)
capital_letters = ("A".."Z")
capital_letters.include?(str[0])
end
Or a bit shorter:
def capitalized?(str)
("A".."Z").include?(str[0])
end
I would avoid character ranges if possible, because without knowing the encoding, you can never be sure what is in a range. In your case, it is unnecessary. A simple
/^[[:upper:]]/ =~ str
would do. See here for the definition of POSIX character classes.
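A minimal sketch of the POSIX-class approach wrapped in the capitalized? method from the question. It swaps ^ for \A, which is my assumption about the intent: ^ matches at the start of any line, while \A matches only at the start of the string. String#match? needs Ruby 2.4 or later.

```ruby
# Sketch: true only when the very first character is an uppercase letter.
# [[:upper:]] is Unicode-aware, so accented capitals count as well.
def capitalized?(str)
  str.match?(/\A[[:upper:]]/)
end

p capitalized?("Hello")
p capitalized?("hello")
p capitalized?("9ball")
p capitalized?("Éclair")
p capitalized?("foo\nBar")
```

With ^ instead of \A, the last call would match the "B" at the start of the second line.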
def capitalized?(str)
str[0] != str[0].downcase
end
capitalized? "Hello" #=> true
capitalized? "hello" #=> false
capitalized? "007, I presume" #=> false
capitalized? "$100 for that?" #=> false
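Simple solution (note that this compares the whole string, so it also returns true for "12345" and false for "Hello World", because capitalize downcases everything after the first character)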
def capitalized?(str)
str == str.capitalize
end

Simple regex - ignoring certain characters

I'm trying to use the match method with an argument of a regex to select a valid phone number, by definition, any string with nine digits.
For example:
9347584987 is valid,
(456)322-3456 is valid,
(324)5688890 is valid.
But
(340)HelloWorld is NOT valid and
456748 is NOT valid.
So far, I'm able to use \d{9} to select the example string of 9 digit characters in a row, but I'm not sure how to specifically ignore any character, such as '-' or '(' or ')' in the middle of the sequence.
What kind of Regex could I use here?
Given:
nums=['9347584987','(456)322-3456','(324)5688890','(340)HelloWorld', '456748 is NOT valid']
You can split on a NON digit and rejoin to remove non digits:
> nums.map {|s| s.split(/\D/).join}
["9347584987", "4563223456", "3245688890", "340", "456748"]
Then filter on the length:
> nums.map {|s| s.split(/\D/).join}.select {|s| s.length==10}
["9347584987", "4563223456", "3245688890"]
Or, you can grab a group of numbers that look 'phony numbery' by using a regex to grab digits and common delimiters:
> nums.map {|s| s[/[\d\-()]+/]}
["9347584987", "(456)322-3456", "(324)5688890", "(340)", "456748"]
And then process that list as above.
That would delineate:
> '123 is NOT a valid area code for 456-7890'[/[\d\-()]+/]
=> "123" # no match
vs
> '123 is NOT a valid area code for 456-7890'.split(/\D/).join
=> "1234567890" # match
I suggest using one regular expression for each valid pattern rather than constructing a single regex. It would be easier to test and debug, and easier to maintain the code. If, for example, "123-456-7890" or "123-456-7890 x231" were in future deemed valid numbers, one need only add a single, simple regex for each to the array VALID_PATTERNS below.
VALID_PATTERNS = [/\A\d{10}\z/, /\A\(\d{3}\)\d{3}-\d{4}\z/, /\A\(\d{3}\)\d{7}\z/]
def valid?(str)
  VALID_PATTERNS.any? { |r| str.match?(r) }
end
ph_nbrs = %w| 9347584987 (456)322-3456 (324)5688890 (340)HelloWorld 456748 |
ph_nbrs.each { |s| puts "#{s.ljust(15)} \#=> #{valid?(s)}" }
9347584987 #=> true
(456)322-3456 #=> true
(324)5688890 #=> true
(340)HelloWorld #=> false
456748 #=> false
String#match? made its debut in Ruby v2.4. There are many alternatives, including str.match(r) and str =~ r.
"9347584987" =~ /(?:\d.*){9}/ #=> 0
"(456)322-3456" =~ /(?:\d.*){9}/ #=> 1
"(324)5688890" =~ /(?:\d.*){9}/ #=> 1
"(340)HelloWorld" =~ /(?:\d.*){9}/ #=> nil
"456748" =~ /(?:\d.*){9}/ #=> nil
Pattern: (Rubular Demo)
^\(?\d{3}\)?\d{3}-?\d{4}$ # this makes the expected symbols optional
The next pattern ensures that an opening ( at the start of the string is followed by 3 digits and then a closing ):
^(\(\d{3}\)|\d{3})\d{3}-?\d{4}$
On principle, though, I agree with melpomene in advising that you remove all non-digit characters, test for 9 character length, then store/handle the phone numbers in a single/reliable/basic format.
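The strip-then-check approach can be sketched as follows. The helper name normalize_phone and the constant EXPECTED_DIGITS are hypothetical; the default of 10 digits is an assumption taken from the example numbers in the question, so adjust it to whatever length you actually require.

```ruby
# Assumption: the valid examples in the question all contain 10 digits.
EXPECTED_DIGITS = 10

# Hypothetical helper: drop every non-digit character, then return the
# bare digit string if it has the expected length, or nil otherwise.
def normalize_phone(str, expected = EXPECTED_DIGITS)
  digits = str.delete('^0-9')
  digits.length == expected ? digits : nil
end

samples = ['9347584987', '(456)322-3456', '(324)5688890', '(340)HelloWorld', '456748']
samples.each { |s| puts "#{s.ljust(15)} #{normalize_phone(s).inspect}" }
```

Storing the returned digit string means every later comparison or lookup works on one format, whatever punctuation the user typed.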

Add (...) to hash entries over certain character limit in Ruby

I'm very new to Ruby and searching for a solution.
Essentially I have a hash in the form of [0 > String, 1 > String] etc.
I want to run a loop that counts the characters in each string in the hash; if a string exceeds a limit, it should be cut at that point and the end replaced with '...'
Example:
Say I set my character count at 10:
Hello World!
Would shorten to:
Hello Worl...
It may be worth noting that this hash is created from an array, so if it is wiser to do this before the hash conversion, that would also be fine. Any advice is hugely appreciated.
Using ActiveSupport's Truncate Method
If you're willing to mix in methods from the ActiveSupport gem such as String#truncate, this is trivial. For example:
require 'active_support/core_ext/string/filters'
'Hello World!'.truncate 10
#=> "Hello W..."
Note that the #truncate method counts the ellipsis as three characters (one for each period in the ellipsis) towards the total character count. Bump your count by three characters (e.g. 13 instead of 10) if you actually want 10 characters before the ellipsis rather than including it.
I think this is not beautiful, but it does what you're asking:
hash = {
0 => 'hello world',
1 => 'hello'
}
hash.each_pair do |key, value|
p key: key
p value: value.length > 10 ? value[0..9] + '...' : value
end
You could also make a method to truncate the string, like this:
def truncate(string, truncate_after)
  # A string that is exactly at the limit needs no ellipsis, hence <=
  return string if string.length <= truncate_after
  string[0..(truncate_after - 1)] + '...'
end
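To apply this to every entry of the hash in one pass, a sketch along these lines may help. The helper name shorten is hypothetical, and Hash#transform_values requires Ruby 2.4 or later.

```ruby
# Hypothetical helper: keep at most `limit` characters and append '...'
# only when something was actually cut off.
def shorten(string, limit)
  return string if string.length <= limit
  string[0, limit] + '...'
end

hash = { 0 => 'Hello World!', 1 => 'hello' }

# transform_values returns a new hash and leaves the original untouched.
shortened = hash.transform_values { |value| shorten(value, 10) }
p shortened
```

Because the helper works on plain strings, it can just as well be mapped over the original array before the hash is built.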

Ruby converting letters in string to letters 13 places further in the alphabet

I'm trying to solve a problem where when given a string I convert each letter 13 places further in the alphabet. For example
a => n
b => o
c => p
Basically every letter in the string is converted 13 alphabet spaces.
If given the string 'sentence' I'd like it to convert to
'fragrapr'
I have no idea how to do it. I've tried
'sentence'.each_char.select{|x| 13.times{x.next}}
and I still couldn't solve it.
This one has been puzzling me for a while now, and I've given up trying to solve it.
I need your help
IMHO, there is a better way to achieve the same in idiomatic Ruby:
def rot13(string)
string.tr("A-Za-z", "N-ZA-Mn-za-m")
end
This works because the parameter 13 is hard-coded in the OP's question, in which case the tr function seems to be just the right tool for the job!
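If the shift were not fixed at 13, the same tr idea could still be used by building the target alphabet at run time. This is only a sketch; the method name rotate_letters is hypothetical.

```ruby
# Hypothetical generalisation: shift letters by any amount using tr.
def rotate_letters(string, shift)
  lower = ('a'..'z').to_a
  upper = ('A'..'Z').to_a
  from = (lower + upper).join
  # Array#rotate moves the element at index `shift` to the front,
  # so 'a' lines up with the letter `shift` places further on.
  to = (lower.rotate(shift) + upper.rotate(shift)).join
  string.tr(from, to)
end

puts rotate_letters('sentence', 13)
puts rotate_letters('Hello, World!', 13)
puts rotate_letters('xyz', 3)
```

Characters that are not in the from string, such as spaces and punctuation, pass through tr unchanged.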
Using String#tr as TCSGrad suggests is the ideal solution.
Some alternatives:
Using case, ord, and chr
word = 'sentence'
word.gsub(/./) do |c|
case c
when 'a'..'m', 'A'..'M' then (c.ord + 13).chr
when 'n'..'z', 'N'..'Z' then (c.ord - 13).chr
else c
end
end
Using gsub and a hash for multiple replacement
word = 'sentence'
from = [*'a'..'z', *'A'..'Z']
to = [*'n'..'z', *'a'..'m', *'N'..'Z', *'A'..'M']
cipher = from.zip(to).to_h
word.gsub(/[a-zA-Z]/, cipher)
Note, Array#to_h requires Ruby 2.1+. For older versions of Ruby, use
cipher = Hash[from.zip(to)].
From here -> How do I increment/decrement a character in Ruby for all possible values?
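you could do it like this:
def increment_char(char)
  # Wrap around at the end of each alphabet
  return 'a' if char == 'z'
  return 'A' if char == 'Z'
  char.ord.next.chr
end

def increment_by_13(str)
  str.each_char.map do |c|
    # Leave non-letters untouched; step letters forward 13 times
    next c unless c =~ /[A-Za-z]/
    13.times.reduce(c) { |tmp, _| increment_char(tmp) }
  end.join
end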
Or close.

Ruby: How to find out if a character is a letter or a digit?

I just started tinkering with Ruby earlier this week and I've run into something that I don't quite know how to code. I'm converting a scanner that was written in Java into Ruby for a class assignment, and I've gotten down to this section:
if (Character.isLetter(lookAhead))
{
return id();
}
if (Character.isDigit(lookAhead))
{
return number();
}
lookAhead is a single character picked out of the string (moving by one space each time it loops through) and these two methods determine if it is a letter or a digit, returning the appropriate token type. I haven't been able to figure out a Ruby equivalent to Character.isLetter() and Character.isDigit().
Use a regular expression that matches letters & digits:
def letter?(lookAhead)
lookAhead.match?(/[[:alpha:]]/)
end
def numeric?(lookAhead)
lookAhead.match?(/[[:digit:]]/)
end
These are called POSIX bracket expressions, and the advantage of them is that unicode characters under the given category will match. For example:
'ñ'.match?(/[A-Za-z]/) #=> false
'ñ'.match?(/\w/) #=> false
'ñ'.match?(/[[:alpha:]]/) #=> true
You can read more in Ruby’s docs for regular expressions.
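For the scanner in the question, the two checks could be folded into a single dispatch. This is a sketch only: the method name token_type and the symbols :id, :number and :other are hypothetical stand-ins for whatever id() and number() return in the original Java code.

```ruby
# Hypothetical dispatcher: classify a single look-ahead character.
def token_type(look_ahead)
  case look_ahead
  when /\A[[:alpha:]]\z/ then :id      # any Unicode letter
  when /\A[[:digit:]]\z/ then :number  # a digit character
  else :other
  end
end

p token_type('x')
p token_type('ñ')
p token_type('7')
p token_type('+')
```

The case statement relies on Regexp#===, so each when branch is tried as a pattern match against the character.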
The simplest way would be to use a Regular Expression:
def numeric?(lookAhead)
lookAhead =~ /[0-9]/
end
def letter?(lookAhead)
lookAhead =~ /[A-Za-z]/
end
A regular expression is overkill here; it's much more expensive in terms of performance. If you just need to check whether a character is a digit, there is a simpler way:
def is_digit?(s)
  code = s.ord
  # 48 is ASCII code of 0
  # 57 is ASCII code of 9
  48 <= code && code <= 57
end
is_digit?("2")
=> true
is_digit?("0")
=> true
is_digit?("9")
=> true
is_digit?("/")
=> false
is_digit?("d")
=> false

Resources