My ASG names are created with this name format:
"digital-microservice-app1-20220627062026999600000001"
"digital-microservice-app2-20220627062026999600000001"
"digital-microservice-app3-20220627062026999600000001"
How can I find all the ASG names starting with "digital-microservice-" using Ruby?
Is it possible to search ASG names based on a string?
Use Anchored Regular Expressions When You Need Post-Processing of Matches
String-based answers are generally faster, because regular expressions tend to be slower than comparable String methods. However, regular expressions offer capabilities that would otherwise take many additional parsing and transformation steps if you ever need to match something more complex than a simple prefix, or to capture elements of the match for further processing. For example, the following Regexp does the same thing as String#start_with?:
asg_names = [
"digital-microservice-app1-20220627062026999600000001",
"digital-microservice-app2-20220627062026999600000001",
"digital-microservice-app3-20220627062026999600000001",
"analog-microservice-app4-20220627062026999600000001"
]
# this will match all your digital apps, and exclude
# the one starting with "analog"
asg_names.grep(/^digital-microservice-/)
#=>
["digital-microservice-app1-20220627062026999600000001",
"digital-microservice-app2-20220627062026999600000001",
"digital-microservice-app3-20220627062026999600000001"]
Example of When Regexp Helps with Post-Processing
Unlike String methods, a Regexp lets you use capture groups and other features to work with individual portions of the match, which would otherwise require additional String processing. Named captures, pre- and post-match variables, sub-expressions, and other features not strictly needed to solve the problem as originally posted can replace a larger number of parsing and transformation steps, which may simplify the next steps in your processing workflow.
As a trivial example, consider the following pattern. It uses \K to leave the prefix (here "digital-microservice-") in the pre-match variable $` and captures the app name in $1, then transforms the resulting matches within a block into an Array of String objects suitable for logging or user-facing output:
asg_names.grep(/^digital-microservice-\K(app\d)+/) do
"#{$`.tr(?-, ?\s)}found for: #{$1}".capitalize
end
#=>
["Digital microservice found for: app1",
"Digital microservice found for: app2",
"Digital microservice found for: app3"]
Go with String methods if you don't need the additional Regexp features to solve your problem, but consider regular expressions if you're doing something more complex with your matches.
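As a further illustration of the capture features mentioned above, here is a minimal sketch using named captures. The group names (family, app, stamp) and the assumption that each name ends in a single run of digits are illustrative and not part of the original question. It relies on filter_map, so it assumes Ruby 2.7 or later.

```ruby
asg_names = [
  "digital-microservice-app1-20220627062026999600000001",
  "digital-microservice-app2-20220627062026999600000001",
  "analog-microservice-app4-20220627062026999600000001"
]

# Hypothetical pattern: the group names are illustrative only.
asg_pattern = /\A(?<family>[a-z]+)-microservice-(?<app>app\d+)-(?<stamp>\d+)\z/

parsed = asg_names.filter_map do |name|
  m = asg_pattern.match(name)
  # Keep only the digital apps, returning the captures by name.
  { app: m[:app], stamp: m[:stamp] } if m && m[:family] == "digital"
end

p parsed.map { |h| h[:app] }  #=> ["app1", "app2"]
```

If all you need is the filtered list of names, String#start_with? remains the simpler choice.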
Input
a = ["digital-microservice-app1-20220627062026999600000001",
"digital-microservice-app2-20220627062026999600000001",
"digital-microservice-app3-20220627062026999600000001",
"digita-microservice-app3-20220627062026999600000001"
]
Code
p a.filter { |x| x.start_with?('digital-microservice') }
Output
["digital-microservice-app1-20220627062026999600000001", "digital-microservice-app2-20220627062026999600000001", "digital-microservice-app3-20220627062026999600000001"]
Related
I'm trying to write a regex in Ruby that will parse various date/time formats. The entire regex looks like this:
/^(?<year>\d{4})\-(?<month>\d{2})\-(?<day>\d{2})(T(?<hour>\d{2})(:(?<minute>\d{2})(:(?<second>\d{2}(\.\d{1,3})?))?)?)?(?<offset>[+-]\d{2}:\d{2})?$/
I'm using named groups so that I can fetch the matching parts out of the match object just using the simple names like "year", "month", "day", etc. This regex is working fine, but let's focus on the "offset" at the end of this:
(?<offset>[+-]\d{2}:\d{2})?
The problem is that I'm trying to add the ability to interpret a "Z" on the end of the string to denote UTC time (aka Zulu Time). This "Z" should be mutually exclusive with the offset. Here's some of the ways I've tried it:
(?<offset>[Z([+-]\d{2}:\d{2})])?
(?<offset>[(Z)([+-]\d{2}:\d{2})])?
[(?<zulu>Z)(?<offset>[+-]\d{2}:\d{2})]?
None of these work. In the first two cases, it can interpret date strings ending in "Z", but it can no longer interpret date string ending with actual offsets like "-07:00". In the third case, the named groups "zulu" and "offset" are just totally missing from the match object.
I think this issue is because I'm trying to use square brackets to denote [(ThisGroup)(OrThisGroup)]? but I don't think the regex engine appreciates having groups inside of square brackets. How do I tell the regex engine to allow and capture "group A or group B or neither, but not both"?
Square brackets are used for "exactly one of any of these characters" -- that's not what you need here. Pattern-level alternation is done via the | operator: (hello|goodbye) world will match either hello world or goodbye world.
(?<offset>Z|[+-]\d{2}:\d{2})?
Specifically to parse a datetime, though, I suggest preferring DateTime.parse (plus to_time, if you need a Time instance). And if that isn't sufficiently flexible, consider the chronic gem.
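To see the alternation in context, here is a small sketch that drops the corrected offset group into the question's full pattern. The sample date strings are made up for illustration, and the final lines show the DateTime.parse route for comparison.

```ruby
require 'date'

# The question's pattern with the offset group changed to use alternation.
iso_pattern = /^(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})(T(?<hour>\d{2})(:(?<minute>\d{2})(:(?<second>\d{2}(\.\d{1,3})?))?)?)?(?<offset>Z|[+-]\d{2}:\d{2})?$/

zulu   = iso_pattern.match('2013-05-01T10:30:00Z')
offset = iso_pattern.match('2013-05-01T10:30:00-07:00')
bare   = iso_pattern.match('2013-05-01')

p zulu[:offset]    #=> "Z"
p offset[:offset]  #=> "-07:00"
p bare[:offset]    #=> nil

# The library route, for comparison.
parsed = DateTime.parse('2013-05-01T10:30:00-07:00')
p parsed.zone      #=> "-07:00"
```

Because "Z" and the numeric offset are alternatives inside one optional group, the match can contain either one or neither, but never both.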
We have one quite complex regular expression which checks for string structure.
I wonder if there is an easy way to find out which character in the string is causing the regular expression not to match.
For example,
string.match(reg_exp).get_position_which_fails
Basically, the idea is to get the "position" of the state machine when it gave up.
Here is an example of regular expression:
%q^[^\p{Cc}\p{Z}]([^\p{Cc}\p{Zl}\p{Zp}]{0,253}[^\p{Cc}\p{Z}])?$
The short answer is: No.
The long answer is that a regular expression is a complicated finite state machine that may be in a state trying to match several different possible paths simultaneously. There's no way of getting a partial match out of a regular expression without constructing a regular expression that allows partial matches.
If you want to allow partial matches, either re-engineer your expression to support them, or write a parser that steps through the string using a more manual method.
You could try generating one of these automatically with Ragel if you have a particularly difficult expression to solve.
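As a rough sketch of the "more manual method", the rules in the example expression can be checked one character at a time so that the first offending position is known. The helper name first_invalid_index and its max_len parameter are hypothetical, and the code only approximates the original pattern (for instance, it ignores the way $ treats a trailing newline).

```ruby
# Hypothetical helper, not a library method: returns the index of the first
# character that breaks the rules encoded by the pattern, or nil if none does.
def first_invalid_index(str, max_len: 255)
  return 0 if str.empty?               # the pattern requires at least one character

  last = str.length - 1
  str.each_char.with_index do |ch, i|
    return i if i >= max_len           # 1 + 253 + 1 characters at most
    edge = i.zero? || i == last
    # First and last characters exclude all separators; inner ones only
    # exclude line and paragraph separators. Control characters are never allowed.
    forbidden = edge ? /[\p{Cc}\p{Z}]/ : /[\p{Cc}\p{Zl}\p{Zp}]/
    return i if ch.match?(forbidden)
  end
  nil
end

p first_invalid_index("hello world")  #=> nil
p first_invalid_index("hello ")       #=> 5
p first_invalid_index("he\tllo")      #=> 2
```

This only works because the example expression constrains each character independently; patterns with real alternation or backtracking would need a proper parser.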
I am doing some localization testing and I have to test for strings in both English and Japanese. The English string might be 'Waiting time is {0} minutes.' while the Japanese string might be '待ち時間は{0}分です。' where {0} is a number that can change over the course of a test. Both of these strings come from their respective property files. How can I check for the presence of the string as well as the number, which can change depending on the test that's running?
I should have added that I'm checking these strings on a web page, which displays in the relevant language depending on the location from which it is being viewed. I'm using Watir to verify the text.
You can read elsewhere about various theories of the best way to do testing for proper language conversion.
One typical approach is to replace all hard-coded text matches in your code with constants, and then have a file that sets the constants, which can be updated based on the language in use. (I've seen that done by wrapping the require of that file in a case statement based on the language being tested.) Another approach is an array or hash for each value, indexed by a variable with a name like 'language', which lets the tests change the language on the fly. Validations would then look something like this:
b.div(:id => "wait-time-message").text.should == WAIT_TIME_MESSAGE[language]
To match text where part is expected to change but fall within a predictable pattern, use a regular expression. I'd recommend a little reading about regular expressions in Ruby, especially using Unicode regular expressions in Ruby, as well as some experimenting with a tool like Rubular to test regexes.
In the case above, a regex such as:
/Waiting time is \d+ minutes\./ or /待ち時間は\d+分です。/
would match the messages above and expect one or more digits in the middle. Note that it would fail if no digits appear; if you want zero or more digits, you would need a * in place of the +.
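Rather than writing each regex by hand, the pattern could also be derived from the property-file template. The sketch below assumes the placeholder is always the literal {0} and always stands for a run of digits; the helper name template_to_regex is hypothetical.

```ruby
# Hypothetical helper: turns a property-file template into an anchored Regexp,
# replacing the {0} placeholder with a captured run of digits.
def template_to_regex(template)
  escaped = Regexp.escape(template)         # "{0}" becomes "\{0\}"
  body = escaped.gsub('\{0\}') { '(\d+)' }  # block form keeps the backslash literal
  Regexp.new("\\A#{body}\\z")
end

english  = template_to_regex('Waiting time is {0} minutes.')
japanese = template_to_regex('待ち時間は{0}分です。')

p english.match('Waiting time is 12 minutes.')[1]  #=> "12"
p japanese.match('待ち時間は7分です。')[1]           #=> "7"
p english.match?('Waiting time is 12 minutesX')    #=> false
```

Escaping the template first means the trailing period is matched literally, and the capture group lets the test assert on the number as well as on the surrounding text.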
Don't check for the literal string. Check for some kind of intermediate form that can be used to render the final string.
Sometimes this is done by specifying a message and any placeholder data, like:
[ :waiting_time_in_minutes, 10 ]
Where that would render out as the appropriate localized text.
An alternative is to treat one of the languages as a template, something that's more limited in flexibility but works most of the time. In that case you could use the English version as the string that's returned and use a helper to render it to the final page.
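A minimal sketch of the intermediate-form idea might look like the following. The messages table and the render lambda are hypothetical; in practice the templates would come from the per-language property files.

```ruby
# Hypothetical message table standing in for the property files.
messages = {
  en: { waiting_time_in_minutes: 'Waiting time is {0} minutes.' },
  ja: { waiting_time_in_minutes: '待ち時間は{0}分です。' }
}

# Renders a [key, value] pair for a locale by filling the {0} placeholder.
render = lambda do |locale, (key, value)|
  messages.fetch(locale).fetch(key).gsub('{0}') { value.to_s }
end

p render.call(:en, [:waiting_time_in_minutes, 10])  #=> "Waiting time is 10 minutes."
p render.call(:ja, [:waiting_time_in_minutes, 10])  #=> "待ち時間は10分です。"
```

A test can then compare the intermediate pair itself, or render it for the language under test and compare against the page text.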
Using Ruby I'd like to take a Regexp object (or a String representing a valid regex; your choice) and tokenize it so that I may manipulate certain parts.
Specifically, I'd like to take a regex/string like this:
regex = /var (\w+) = '([^']+)';/
parts = ["foo","bar"]
and create a replacement string that replaces each capture with a literal from the array:
"var foo = 'bar';"
A naïve regex-based approach to parsing the regex, such as:
i = -1
result = regex.source.gsub(/\([^)]+\)/){ parts[i+=1] }
…would fail for things like nested capture groups, or non-capturing groups, or a regex that had a parenthesis inside a character class. Hence my desire to properly break the regex into semantically-valid pieces.
Is there an existing Regex parser available for Ruby? Is there a (horror of horrors) known regex that cleanly matches regexes? Is there a gem I've not found?
The motivation for this question is a desire to find a clean and simple answer to this question.
I have a JavaScript project on GitHub called Dynamic (?:Regex Highlighting)++ with Javascript! that you may want to look at. It parses PCRE compatible regular expressions written in both free-spacing and non-free-spacing modes. Since the regexes are written in the less-feature-rich JavaScript syntax, they could easily be converted to Ruby.
Note that regular expressions may contain arbitrarily nested parentheses structures and JavaScript has no recursive regex features, so the code must parse the tree of nested parens from the inside out. It's a bit tricky but works quite well. Be sure to try it out on the highlighter demo page, where you can input and dynamically highlight any regex. The JavaScript regular expressions used to parse regular expressions are documented here.
I am getting completely different results from string.scan and several regex testers.
I am just trying to grab the domain from the string; it is the last word.
The regex in question:
/([a-zA-Z0-9\-]*\.)*\w{1,4}$/
The string (1 single line, verified in Ruby's runtime btw)
str = 'Show more results from software.informer.com'
The regex testers work fine, but in Ruby:
irb(main):050:0> str.scan /([a-zA-Z0-9\-]*\.)*\w{1,4}$/
=> [["informer."]]
I would think that I would get a match on software.informer.com, which is my goal.
Your regex is correct; the result has to do with the way String#scan behaves. From the official documentation:
"If the pattern contains groups, each individual result is itself an array containing one entry per group."
Basically, if you put parentheses around the whole regex, the first element of each array in your results will be what you expect.
It does not look as if you expect more than one result (especially as the regex is anchored). In that case there is no reason to use scan.
'Show more results from software.informer.com'[ /([a-zA-Z0-9\-]*\.)*\w{1,4}$/ ]
#=> "software.informer.com"
If you do need to use scan (in which case you obviously need to remove the anchor), you can use (?:) to create non-capturing groups.
'foo.bar.baz lala software.informer.com'.scan( /(?:[a-zA-Z0-9\-]*\.)*\w{1,4}/ )
#=> ["foo.bar.baz", "lala", "software.informer.com"]
You are getting a match on software.informer.com. Check the value of $&. The return of scan is an array of the captured groups. Add capturing parentheses around the suffix, and you'll get the .com as part of the return value from scan as well.
The regex testers and Ruby are not disagreeing about the fundamental issue (the regex itself). Rather, their interfaces are differing in what they are emphasizing. When you run scan in irb, the first thing you'll see is the return value from scan (an Array of the captured subpatterns), which is not the same thing as the matched text. Regex testers are most likely oriented toward displaying the matched text.
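The difference between the captured groups and the matched text can be seen side by side in a short sketch, using the string and pattern from the question:

```ruby
str = 'Show more results from software.informer.com'
pattern = /([a-zA-Z0-9\-]*\.)*\w{1,4}$/

p str.scan(pattern)  #=> [["informer."]]  (captures only)
p str[pattern]       #=> "software.informer.com"  (whole match)

m = str.match(pattern)
p m[0]               #=> "software.informer.com"  (same text as $&)
p m[1]               #=> "informer."  (last repetition of group 1)

# With a non-capturing group, scan returns the matched text itself.
p str.scan(/(?:[a-zA-Z0-9\-]*\.)*\w{1,4}$/)  #=> ["software.informer.com"]
```

A repeated capture group keeps only its last iteration, which is why scan reports "informer." rather than "software.".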
How about doing this:
/([a-zA-Z0-9\-]*\.*\w{1,4})$/
This returns
informer.com
On your test string.
http://rubular.com/regexes/13670