Regexp with letters and hexadecimal - ruby

I need regexp for "EWD-eb-AEW-97-QOW" like strings.
The general pattern is:
3 uppercase letters, hex, 3 uppercase letters, hex, 3 uppercase letters.
I use:
/[A-Z]{3}-\h-[A-Z]{3}-\h-[A-Z]{3}/
but it doesn't work. Can anyone help with it and explain why it doesn't work?

\h matches only a single hex digit, so it cannot match a two-digit hex number. Use this regex instead:
/[A-Z]{3}-[A-F0-9]{2}-[A-Z]{3}-[A-F0-9]{2}-[A-Z]{3}/i

In addition to anubhava's answer: you can keep \h and add a {2} quantifier to it to achieve the same result.
/[A-Z]{3}-\h{2}-[A-Z]{3}-\h{2}-[A-Z]{3}/i
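A small sketch of why the original pattern fails and what the {2} quantifier changes. The constant names are made up for illustration, and match? assumes Ruby 2.4 or newer.

```ruby
SAMPLE = "EWD-eb-AEW-97-QOW"

# Original attempt: \h consumes exactly one hex digit, so "eb" cannot match.
ONE_DIGIT = /[A-Z]{3}-\h-[A-Z]{3}-\h-[A-Z]{3}/
# With {2}: \h already covers 0-9, a-f and A-F, so the hex parts need no /i.
TWO_DIGITS = /[A-Z]{3}-\h{2}-[A-Z]{3}-\h{2}-[A-Z]{3}/

puts ONE_DIGIT.match?(SAMPLE)                 # false
puts TWO_DIGITS.match?(SAMPLE)                # true
puts TWO_DIGITS.match?("EWD-EB-AEW-97-QOW")   # true
puts TWO_DIGITS.match?("ewd-eb-aew-97-qow")   # false: letters must be uppercase
```

Adding /i, as the answers above do, would also let the three-letter groups match lowercase letters, which may or may not be what you want.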

I'd use something like:
/(?:[A-Z]{3}-[a-f0-9]{2}-){2}[A-Z]{3}/
https://www.regex101.com/r/aY5eF6/1
(?:[A-Z]{3}-[a-f0-9]{2}-)
groups three letters, '-', two hexadecimal digits, and '-'. The {2} repeats that group twice, and three more letters follow.
Regarding using Ruby's \h Regexp extension:
I'd be careful using special escapes like \h in a mixed-language environment, or in one that supports old versions of Ruby. We use YAML to hold patterns shared among various languages, and something like this could introduce a very hard-to-track bug. I'd recommend using [a-f0-9] unless you KNOW you'll never run into that problem.
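As a rough sketch, the grouped pattern can be wrapped in a small validator. The helper name valid_code? and the constant are hypothetical, and the \A and \z anchors are an addition here (the answer's pattern is unanchored) so that surrounding text is rejected.

```ruby
# Hypothetical helper built on the grouped pattern; \A and \z are an
# assumption added so that the whole string has to be a code.
CODE_PATTERN = /\A(?:[A-Z]{3}-[a-f0-9]{2}-){2}[A-Z]{3}\z/

def valid_code?(str)
  CODE_PATTERN.match?(str)
end

puts valid_code?("EWD-eb-AEW-97-QOW")      # true
puts valid_code?("EWD-EB-AEW-97-QOW")      # false: [a-f0-9] excludes uppercase hex
puts valid_code?("xx EWD-eb-AEW-97-QOW")   # false: the anchors reject extra text
```

Unlike \h, the class [a-f0-9] is case-sensitive, so use [a-fA-F0-9] if uppercase hex digits should be accepted as well.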

Related

Use regex to find a certain instance of two characters

I am not sure how to approach this problem using regex.
Write a method that takes a string in and returns true if the letter "z" appears within three letters after an "a". You may assume that the string contains only lowercase letters.
Any insight would be greatly appreciated.
Basically, you're being asked to match any of the following patterns:
'a**z'
'a*z'
'az'
where * is a lowercase letter, a-z. In natural language (ok, English) that can be stated as "An 'a' followed by 0, 1, or 2 lowercase letters, followed by a 'z'." Regex-wise, that can be expressed as
/a[a-z]{0,2}z/
I'm not a rubyist at all, so there may or may not be some sort of Ruby specific tweaks that need to be made to that, but that should be the basic gist of it.
def foo(s)
  # true if an "a" is followed by at most two word characters and then a "z"
  !!(s =~ /a\w{,2}z/)
end
This is the site I like to use for Ruby regex validation: rubular.com. That being said, if you want to hit on a z that comes 0 to 2 letters after an a, and the text is lowercase, I would use the regex a[a-z]{0,2}z. Note that a pattern like [a]{0,2}z is not enough: it means "zero to two a's, then z", so it also hits on a bare z with no a before it.
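A minimal sketch of the method the exercise asks for, with a few spot checks. The method name z_near_a? is hypothetical, and match? assumes Ruby 2.4 or newer.

```ruby
# Hypothetical method name; assumes the input contains lowercase letters only.
def z_near_a?(str)
  # "a", then zero to two letters, then "z": the z sits at most
  # three positions after the a.
  str.match?(/a[a-z]{0,2}z/)
end

puts z_near_a?("az")      # true
puts z_near_a?("abcz")    # true
puts z_near_a?("abcdz")   # false
puts z_near_a?("z")       # false

# For comparison: [a]{0,2}z reads as "zero to two a's, then z",
# so it accepts a z that has no a in front of it.
puts "z".match?(/[a]{0,2}z/)   # true
```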

Regex for capital letters not matching accented characters

I am new to ruby and I'm trying to work with regex.
I have a text which looks something like:
HEADING
Some text which is always non capitalized. Headings are always capitalized, followed by a space or nothing more.
YOU CAN HAVE MULTIPLE WORDS IN HEADING
I'm using this regular expression to choose all headings:
^[A-Z]{2,}\s?([A-Z]{2,}\s?)*$
However, it only matches headings that do not contain characters such as Č, Š, Ž (Slovenian characters).
So I'm guessing [A-Z] only matches ASCII characters? How could I get utf8?
You are right: when you define the ASCII range A-Z, the match is made literally only for those characters. This has to do with the history of characters on computers. More and more characters have been added over time, and they are not always laid out in an encoding in ways that are easy to use.
You could make a larger character class that matches the slovenian characters you need, by listing them.
But there is a shortcut. The Unicode character data already records which characters are uppercase, so you can write a shorter match for "all uppercase characters": /[[:upper:]]/. See http://ruby-doc.org//core-2.1.4/Regexp.html for more.
Altering your regular expression with just this adjustment gives:
^[[:upper:]]{2,}\s?([[:upper:]]{2,}\s?)*$
You may need to adjust it further. For instance, it would not match the heading "I AM A HEADING", because the pattern insists that each word is at least two letters long.
Without seeing all your examples, I would probably simplify the group matching and just allow spaces anywhere:
^[[:upper:]\s]+$
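A short sketch comparing the original ASCII-only pattern with the simplified [[:upper:]] one. The sample lines and constant names are made up for illustration. Each line is tested on its own here, because \s also matches a newline and could otherwise join two consecutive headings into one match.

```ruby
LINES = [
  "HEADING",
  "Some text which is not a heading.",
  "ŠOLA IN ŽIVLJENJE",
  "I AM A HEADING"
]

ASCII_ONLY = /^[A-Z]{2,}\s?([A-Z]{2,}\s?)*$/
UNICODE    = /^[[:upper:]\s]+$/

# grep keeps the elements that match the pattern.
p LINES.grep(ASCII_ONLY)   # ["HEADING"]
p LINES.grep(UNICODE)      # ["HEADING", "ŠOLA IN ŽIVLJENJE", "I AM A HEADING"]
```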
You can use the Unicode uppercase letter property:
\p{Lu}
Your regex:
\b\p{Lu}{2,}(?:\s*\p{Lu}{2,})*\b
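A rough sketch of pulling headings out of a text with \p{Lu}. The sample text and constant names are invented, and the * after the group is an assumption made here so that headings with any number of words are covered.

```ruby
TEXT = "UVOD\nnekaj besedila\nMEŠANE ČEŠNJE IN ŽELODI\nše več besedila\n"

# \p{Lu} matches any uppercase letter, ASCII or not. The trailing * on the
# group allows any number of further words (an assumption of this sketch).
HEADING = /\b\p{Lu}{2,}(?:\s*\p{Lu}{2,})*\b/

p TEXT.scan(HEADING)   # ["UVOD", "MEŠANE ČEŠNJE IN ŽELODI"]
```

Because \s* also matches newlines, two headings on consecutive lines would come back as a single match. Swapping \s for a literal space avoids that.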

Ruby/Rails: How do I allow whitespace, dash and ÄÖÜ in my regex?

Currently my regular Expression looks like this: /\A[a-zA-Z]{2,50}\z/i
Now I would like to add acceptance of: -, whitespace and ÄÖÜ.
I tried rubular.com but I'm a total regex noob.
You probably want to consider using Unicode properties and scripts. You could then write your regex as
\A[\p{L}\s-]{2,50}\z
\p{L} is a Unicode property that matches any letter in any language.
If you want to match ÄÖÜ, you probably also want ß. With Unicode properties you don't have to think about such things.
If you want to limit the possible letters a bit, you can use a Unicode script such as Latin:
\A[\p{Latin}\s-]{2,50}\z
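A quick sketch of how the two variants differ. The sample names and constant names are made up, and match? assumes Ruby 2.4 or newer.

```ruby
ANY_LETTER = /\A[\p{L}\s-]{2,50}\z/
LATIN_ONLY = /\A[\p{Latin}\s-]{2,50}\z/

puts ANY_LETTER.match?("Jörg Müller-Weiß")   # true
puts LATIN_ONLY.match?("Jörg Müller-Weiß")   # true
puts ANY_LETTER.match?("Αθηνά")              # true: Greek letters are letters too
puts LATIN_ONLY.match?("Αθηνά")              # false: not Latin script
puts ANY_LETTER.match?("R2D2")               # false: digits are not letters
```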
Just add it in the regex as follows:
/\A[a-zA-Z\-\sÄÖÜ]{2,50}\z/i
Regex is an invaluable cross-language tool, and the sooner you learn it, the better off you will be. I suggest putting in the time to learn it.
In a regex, whitespace is represented by the shorthand character class \s.
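A hyphen/minus is a special character inside a character class (it denotes a range such as a-z), so escape it with a backslash: \- (placing it first or last in the class also makes it literal).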
Ä, Ö and Ü are normal characters, so you can just add them as they are.
/\A[a-zA-Z\s\-ÄÖÜ]{2,50}\z/i
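A small sketch that tries the pattern on a few made-up names. It also includes a variant with the hyphen placed last in the class, where it should be literal without a backslash. The constant names are hypothetical.

```ruby
ESCAPED  = /\A[a-zA-Z\s\-ÄÖÜ]{2,50}\z/i
TRAILING = /\A[a-zA-ZÄÖÜ\s-]{2,50}\z/i   # hyphen last, unescaped

["Anna-Lena", "Jürgen Öztürk", "x", "Tom_2"].each do |name|
  puts "#{name}: #{ESCAPED.match?(name)} #{TRAILING.match?(name)}"
end
# Anna-Lena: true true
# Jürgen Öztürk: true true
# x: false false
# Tom_2: false false
```

The /i flag is what lets the lowercase ü match even though only ÄÖÜ are listed. Note that ß is not covered by either class.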

How to hide output password with sharp in ruby?

I want to print '#' characters instead of the password in my Ruby code:
puts " found password: #{pass.tr('?','#')}"
I need to output as many '#' characters as there are characters in the password.
How do I do it right?
The method .tr is intended to swap specific characters; you cannot do a wild-card match with it. Even if you extended it to cover many characters, there is a risk that you miss or forget a special character that is allowed in passwords on your system.
A simple variant of what you have is to use .gsub instead:
pass.gsub(/./,'#')
This uses regular expressions to find groups of characters to swap. The simple Regexp /./ matches any single character. The Ruby core documentation on regular expressions includes a brief introduction, in case you have not used them much before.
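A brief sketch comparing the gsub approach with a string-repetition alternative. The mask helper and the sample password are hypothetical and not part of the question.

```ruby
# Hypothetical helper: one '#' per character, whatever the characters are.
def mask(pass)
  "#" * pass.length
end

pass = "p@ss?w0rd"
puts " found password: #{pass.tr('?', '#')}"      #  found password: p@ss#w0rd
puts " found password: #{pass.gsub(/./, '#')}"    #  found password: #########
puts " found password: #{mask(pass)}"             #  found password: #########
```

One difference to be aware of: /./ does not match a newline unless the /m flag is given, while the length-based version replaces every character.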

Regexp Greek chars by number

I deal with strings that contain Greek and English (Latin) text. I'd like to use a regex to catch all the Greek words that contain 4 or more characters.
From the regexp manual I figured out that I can use \p{Greek} to grab all Greek words and \w{4,} to grab words of 4 or more characters. However, in the various tests I made, these two don't work together.
Is there any way to do what I want using 1 regexp expression? Strings are UTF-8 and come out of tweets.
Regards
Are you using the UTF-8 pattern modifier?
/\p{Greek}{4,}/u
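A minimal sketch in Ruby, assuming a UTF-8 string; the sample text and constant names are invented. The idea is to put the {4,} quantifier directly on \p{Greek} instead of combining it with \w.

```ruby
TWEET = "Καλημέρα to all, και ένα hello από την Αθήνα"

# Four or more consecutive characters from the Greek script.
GREEK_WORDS = /\p{Greek}{4,}/u

p TWEET.scan(GREEK_WORDS)   # ["Καλημέρα", "Αθήνα"]
```

Shorter Greek words such as "και" and all Latin words are skipped.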

Resources