Check if string starts with number - ruby

How can I check if my string starts with a number?
I'm trying to make this work with Ruby's start_with? method, with no luck:
<% if line.start_with?("ANY NUMBER") %>
Thanks!

In regular expressions, \A matches the start of the string and \d matches any digit, so this could work for you:
if line.match?(/\A\d/)
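A minimal sketch of that check, assuming Ruby 2.4 or later for String#match? and Ruby 2.5 or later for passing a regex to start_with?. The helper name starts_with_digit? is hypothetical and only used for illustration:

```ruby
# Hypothetical helper: true when the string begins with a digit.
def starts_with_digit?(line)
  line.match?(/\A\d/)
end

puts starts_with_digit?("42 apples")   # true
puts starts_with_digit?("apples 42")   # false
puts starts_with_digit?("")            # false

# Since Ruby 2.5, start_with? also accepts a regex directly.
puts "42 apples".start_with?(/\d/)     # true
```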

Related

How to write a Ruby regexp for [A-Z0-9] repeat 3, or 4 times, no more?

I want to detect 3P4, 4SC3, etc., but not 492-as. I tried /[A-Z0-9]{3,4}/ but 492-as is still passing.
If anyone could help me with this regexp rule, I'd greatly appreciate it.
You need to use start and end of string anchors:
/\A[A-Z0-9]{3,4}\z/
See demo at Rubular
The \A forces the match at the beginning of a string, and \z matches the end of string.
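To see the effect of the anchors, here is a small sketch that runs both patterns over a few sample strings. The constant names and the extra inputs AB and ABCDE are made up for illustration:

```ruby
ANCHORED   = /\A[A-Z0-9]{3,4}\z/   # whole string must be 3 or 4 characters
UNANCHORED = /[A-Z0-9]{3,4}/       # matches anywhere inside the string

%w[3P4 4SC3 492-as AB ABCDE].each do |s|
  puts "#{s}: anchored=#{ANCHORED.match?(s)} unanchored=#{UNANCHORED.match?(s)}"
end
```

The unanchored pattern accepts 492-as because it finds 492 inside it; the anchored one rejects it, along with anything shorter than 3 or longer than 4 characters.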

Extracting numbers with regex in ruby from a numbers divided by a dot (thousand delimiter)

Trying to extract '4995' from the string '4.995,-' with regex in Ruby.
I tried with
/\d+/
Which seems to work from this Rubular screenshot: http://cl.ly/image/111c2x0N3s0C
but running it only outputs
4
You cannot match it in a single regex because it is not a single substring.
"4.995,-".gsub(/\D/, "") # => "4995"
I'm up-voting sawa's answer because it's a good answer.
But since you are new to regular expressions, you may want further explanation as to why his answer works for you.
When you are trying to match with the regexp /\d+/, what you are saying is "Match for me 1 or more consecutive digits." But your target string, 4.995,-, is not made up of only consecutive digits. It has a 4 and it has a 995. The first match of "1 or more consecutive digits" is 4. That's why what you're getting as a result is 4.
Try to look at your problem differently. Instead of saying, "Find me all the digits and extract those out," you could say, "Find me anything that's not a digit, and get rid of it." To do this, you can use Ruby's search-and-replace method, gsub. gsub searches a target string for anything that matches a given regular expression, and replaces each match with a replacement string that you also provide. See the documentation for String#gsub.
The regular expression for "non-digit" is /\D/. So, you can do a gsub that looks for any /\D/ and replaces it with an empty string.
'4.995,-'.gsub(/\D/,'')
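The difference between the two ways of thinking can be sketched side by side. The variable names here are illustrative only:

```ruby
price = "4.995,-"

first_run = price[/\d+/]          # "4": only the first run of digits
all_runs  = price.scan(/\d+/)     # ["4", "995"]: every run of digits
digits    = price.gsub(/\D/, "")  # "4995": all non-digits removed

p first_run, all_runs, digits
p all_runs.join == digits         # true
```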
Do as below using String#[] and String#tr:
"4.995,-"[/\d+\.\d+/].tr('.','') # => "4995"
# more Rubyish way using #tr method only
"4.995,-".tr("^0-9",'') # => "4995"
p '4.995,-1'.delete('.')[/\d+/] #=> "4995"
Here's another way that, like @Arup's solution, works when a digit follows the first non-digit:
'4.995,-1'.sub('.','').to_i.to_s #=> "4995"
This works because
'4.995,-1'.sub('.','') #=> "4995,-1"
and to_i converts the leading part of a string that can be read as an integer (a Fixnum in older Ruby versions) and ignores the rest.
Alternatively:
'4.995,-1'.to_f.to_s.sub('.','') #=> "4995"
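The variants in this thread only differ when a digit follows the separator. A short comparison on the input '4.995,-1', with illustrative variable names:

```ruby
input = "4.995,-1"

strip_all = input.gsub(/\D/, "")           # keeps every digit: "49951"
first_run = input.delete(".")[/\d+/]       # first digit run once dots are gone: "4995"
via_to_i  = input.sub(".", "").to_i.to_s   # to_i stops at the comma: "4995"

p strip_all, first_run, via_to_i
```

Which result is wanted depends on whether digits after the comma belong to the number.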

Regular expression to find first letter in a string

Consider this example string:
mystr ="1. moody"
I want to capitalize the first letter that occurs in mystr. I am trying this regular expression in Ruby but still returns all the letters in mystr (moody) instead of the letter m only.
puts mystr.scan(/[a-zA-Z]{1}/)
Any help appreciated!
Do as below using String#sub
(arup~>~)$ pry --simple-prompt
>> s = "1. moody"
=> "1. moody"
>> s.sub(/[a-z]/i,&:upcase)
=> "1. Moody"
>>
If you want to modify the source string in place, use s.sub!(/[a-z]/i,&:upcase).
For completeness, although it doesn’t directly answer the question as posed, consider this variation:
mystr ="1. école"
The line mystr.sub(/[a-z]/i,&:upcase) (as in Arup Rakshit’s answer) will match the second letter of the word, producing
1. éCole
The line mystr.sub /\b\s?[a-zA-Z]{1}/, &:upcase (diego.greyrobot’s answer) won’t match at all and so the line will be unchanged.
There are two problems here. The first is that [a-zA-Z] doesn’t match accented characters, so é isn’t matched. The fix for this is to use the \p{Letter} character property:
mystr.sub /\p{Letter}/, &:upcase
This will match the character in question, but on Ruby versions before 2.4 it won’t change it. This is due to the second problem: in those versions upcase (and downcase) only work on characters in the ASCII range. Ruby 2.4 and later apply Unicode case mapping, so the plain call is enough there. On older versions this is almost as easy to fix, but relies on an external library such as unicode_utils:
require 'unicode_utils'
mystr.sub(/\p{Letter}/) { |c| UnicodeUtils.upcase(c)}
This results in:
1. École
which is probably what is wanted in this case.
This may not affect you if you are sure all your data is just ASCII, but is worth knowing for other situations.
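A minimal sketch of the two patterns on the accented example, assuming Ruby 2.4 or later, where String#upcase handles non-ASCII letters without an external library. The /u flag and the \u escapes are extra precautions added here, not part of the original answers:

```ruby
mystr = "1. \u00E9cole"   # "1. école", written with escapes to sidestep source-encoding issues

ascii_only = mystr.sub(/[a-z]/i, &:upcase)        # skips é and upcases the c instead
any_letter = mystr.sub(/\p{Letter}/u, &:upcase)   # matches é; upcased on Ruby 2.4+

puts ascii_only   # 1. éCole
puts any_letter   # 1. École
```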
Your attempt returns all the letters because scan does just that: it returns every substring that matches the regex, which in your case is every letter. For your use case you should use sub, since you only want to substitute one letter.
I use http://rubular.com to practice my Ruby Regexes. Here's what I came up with http://rubular.com/r/fAQEDFVEVn
The regex is: /\b[a-z]/
It uses \b to find a word boundary, then [a-z] to ask for one lowercase letter only.
Finally we'll use sub to replace it with its upcased version:
"1. moody".sub /\b[a-z]/, &:upcase
=> "1. Moody"
Hope that helps.
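As a quick illustration of the scan versus sub distinction discussed above, here is a sketch using the question's own string:

```ruby
mystr = "1. moody"

p mystr.scan(/[a-zA-Z]/)           # ["m", "o", "o", "d", "y"]: every match
p mystr[/[a-zA-Z]/]                # "m": the first match only
p mystr.sub(/\b[a-z]/, &:upcase)   # "1. Moody": the first match replaced
```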

How do I match a UTF-8 encoded hashtag with embedded punctuation characters?

I want to extract #hashtags from a string, also those that have special characters such as #1+1.
Currently I'm using:
@hashtags ||= string.scan(/#\w+/)
But it doesn't work with those special characters. Also, I want it to be UTF-8 compatible.
How do I do this?
EDIT:
If the last character is a special character it should be removed, such as #hashtag, #hashtag. #hashtag! #hashtag? etc...
Also, the hash sign at the beginning should be removed.
The Solution
You probably want something like:
'#hash+tag'.encode('UTF-8').scan /\b(?<=#)[^#[:punct:]]+\b/
=> ["hash+tag"]
Note that the zero-width assertion at the beginning is required to avoid capturing the pound sign as part of the match.
References
String#encode
Ruby's POSIX Character Classes
This should work:
@hashtags = str.scan(/#([[:graph:]]*[[:alnum:]])/).flatten
Or if you don't want your hashtag to start with a special character:
@hashtags = str.scan(/#((?:[[:alnum:]][[:graph:]]*)?[[:alnum:]])/).flatten
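A small sketch of the first pattern on a made-up sample sentence, assuming a UTF-8 string (the accented tag is written with \u escapes). The variable names are illustrative:

```ruby
text = "Try #1+1, #hashtag! and #\u00E9t\u00E9."

# Capture everything after '#' up to the last alphanumeric character,
# so a trailing comma, exclamation mark or period is left out.
hashtags = text.scan(/#([[:graph:]]*[[:alnum:]])/).flatten

p hashtags
```

Because [[:graph:]] matches any visible character, two tags written without a space between them would be captured as one.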
How about this:
@hashtags ||=string.match(/(#[[:alpha:]]+)|#[\d\+-]+\d+/).to_s[1..-1]
Takes care of #alphabets or #2323+2323 #2323-2323 #2323+65656-67676
Also removes # at beginning
Or if you want it in array form:
@hashtags ||=string.scan(/#[[:alpha:]]+|#[\d\+-]+\d+/).collect{|x| x[1..-1]}
Wow, this took a long time, and I still don't understand why scan(/#[[:alpha:]]+|#[\d\+-]+\d+/) works but scan(/(#[[:alpha:]]+)|#[\d\+-]+\d+/) does not on my computer. The only difference is the () in the second scan call. The parentheses make no difference, as expected, when I use the match method.
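The likely explanation is how String#scan treats capture groups: when the pattern contains groups, scan returns the captured groups for each match instead of the whole match, and a group that did not take part in the match comes back as nil. A sketch on a made-up input:

```ruby
str = "#ruby and #1+1"

without_group = str.scan(/#[[:alpha:]]+|#[\d\+-]+\d+/)
with_group    = str.scan(/(#[[:alpha:]]+)|#[\d\+-]+\d+/)

p without_group   # ["#ruby", "#1+1"]
p with_group      # [["#ruby"], [nil]]
```

match(...).to_s is unaffected because it always returns the full matched text, regardless of groups.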

How to create a regexp that matches a pattern except for some strings in Ruby?

I work in Ruby and have to create a single regexp for the following task, because I'm working with someone else's gem that uses this regexp to match fields to be worked on in a text file. I need to match the beginning of the string, any set of characters, an underscore, then any integer (one or more digits) that is not 1, 2, 9, or 10, and the end of the string.
I.e., I want the following to match:
foo_4
bar_8
baz_120
BUT NOT:
foo_1
bar_9
baz_10
I tried
/^.+_(^(1|2|9|10))$/
but it did not work as apparently ^ only "negates" characters in brackets, not submatches.
Outside of a character class the ^ symbol means start of line. I think you want a negative lookahead instead:
/^.+_(?!(?:1|2|9|10)$)\d+$/
See it in action on rubular.
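A quick check of the lookahead pattern against the examples from the question. The constant name FIELD is made up for illustration:

```ruby
FIELD = /^.+_(?!(?:1|2|9|10)$)\d+$/

%w[foo_4 bar_8 baz_120 foo_1 bar_9 baz_10].each do |name|
  puts "#{name}: #{FIELD.match?(name)}"
end
```

The first three names print true and the last three print false. Note that ^ and $ match at line boundaries in Ruby; if the input could contain newlines, \A and \z would be the stricter choice, assuming the gem accepts them.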

Resources