Yahoo-Pipes, Best Practices: 'Loop with a String Regex' vs. 'Regex' - yahoo-pipes

What is a useful heuristic to consider when deciding between a 'Loop with a String Regex' and a 'Regex' module?

http://discuss.pipes.yahoo.com/Message_Boards_for_Pipes/threadview?m=tm&bn=pip-DeveloperHelp&tid=9666&mid=9677&tof=2&rt=2&frt=2&off=1

Related

how do I use wildcards in a gsub?

I used some brute-force code to make a Roman numeral converter. I'm seeing some opportunities to DRY things up with the 5s and 10s:
out.gsub!('IVI','V')
out.gsub!('IXI','X')
out.gsub!('IXLI','XL')
So what I'd like to do is something like...
out.gsub!(/'I'.'I'/,/./)
Where '.' is any number of characters between two 'I's
Any ideas?
What you're looking for is /I(.*)I/, which will group the string between the Is. You can access that via \1, producing out.gsub!(/I(.*)I/, '\1').
Take a look at the documentation for regular expressions. http://ruby-doc.org/core-2.1.1/Regexp.html
You can use the captures of a regex using \1, \2, etc.:
out = 'IVVVVI'
out.gsub!(/^I(.*)I$/, '\1')
# => "VVVV"
This might be worth trying:
out = 'IVVVVI'
out.tr('I','')
# => "VVVV"
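As a rough sketch of how the capture-group approach could replace the three hard-coded substitutions from the question: the helper name collapse_wrapped and the character class [VXLCDM] are assumptions made for illustration, not part of the original converter. Restricting the group to non-I numerals avoids the greedy (.*) swallowing several pairs at once; with /I(.*)I/, the input 'IVIIXI' would become 'VIIX'.

```ruby
# Hypothetical helper: the name and the character class are assumptions,
# not part of the original converter.
def collapse_wrapped(numeral)
  # '\1' in the single-quoted replacement refers to the first capture group,
  # i.e. whatever sat between the two I characters.
  numeral.gsub(/I([VXLCDM]+)I/, '\1')
end

puts collapse_wrapped('IVI')    # prints V
puts collapse_wrapped('IXLI')   # prints XL
puts collapse_wrapped('IVIIXI') # prints VX
```

Note that tr('I', '') removes every I in the string, so it only matches this behavior when the I characters appear solely as the wrapping pair.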

How to turn a .txt document to all capitalized letters?

We have a coding convention with an obscure, proprietary language called PowerOn (think of a scripty PL/1 language) that requires all code to be in capital letters. Some previous developers carried over their camel-case habits from other languages like Java. Are there any tools that would convert all of the text to uppercase?
Worst case scenario, I could make something in .NET to accomplish this. I am just trying to avoid reinventing the wheel.
tr "[a-z]" "[A-Z]" <code.poweron >newcode.poweron
If you're on Linux, you have a number of options. Two of them are:
1. dd if=input.txt of=output.txt conv=ucase
2. tr '[:lower:]' '[:upper:]' < input.txt > output.txt
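For anyone who would rather script this than shell out, the same transformation can be sketched in Ruby. The method name upcase_file is hypothetical, and the file names in the usage note are just the ones from the answer above. Like the tr and dd commands, this uppercases everything in the file, including string literals and comments.

```ruby
# Hypothetical helper; the name is an assumption for this sketch.
# Reads the whole source file, uppercases it, and writes the result
# to a separate destination file (the input is left untouched).
def upcase_file(src_path, dst_path)
  File.write(dst_path, File.read(src_path).upcase)
end

# Only runs when exactly two paths are given on the command line.
upcase_file(*ARGV) if ARGV.size == 2
```

Saved under a name of your choosing, it could be invoked with the input and output paths as arguments, for example code.poweron and newcode.poweron.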

Ruby regular expression for this string?

I'm trying to get the first word in this string: Basic (11/17/2011 - 12/17/2011)
So ultimately wanting to get Basic out of that.
Other example string: Premium (11/22/2011 - 12/22/2011)
The format is always "Single-word followed by parenthesized date range" and I just want the single word.
Use this:
str = "Premium (11/22/2011 - 12/22/2011)"
str.split.first # => "Premium"
split uses whitespace (' ') as its default separator if you don't specify one.
After that, get the first element with first.
You don't need regexp for that, you can just use
str.split(' ')[0]
I know you found the answer you needed, but in case anyone stumbles on this in the future, here is how to pull the needed value out of a larger string of unknown length:
word_you_need = s.slice(/(\b[a-zA-Z]*\b \(\d+\/\d+\/\d+ - \d+\/\d+\/\d+\))/).split[0]
This regular expression will match the first word without the trailing space:
"^\w+ ??"
If you really want a regex you can get the first group after using this regex:
(\w*) .*
"Single-word followed by parenthesized date range"
'word' and 'parenthesized date range' should be better defined
as, by your requirement statement, they should be anchors and/or delimiters.
These raw regexes are just a general guess.
\w+(?=\s*\([^)]*\))
or
\w+(?=\s*\(\s*\d+(?:/\d+)*\s*-\s*\d+(?:/\d+)*\s*\))
Actually, all you need is:
s.split[0]
...or...
s.split.first
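To make the trade-off between the two styles concrete, here is a small sketch. The method name plan_name is hypothetical. split.first returns the first whitespace-separated token no matter what follows, while the lookahead version only returns a word when an opening parenthesis actually comes after it.

```ruby
# Hypothetical helper; the name is an assumption for this sketch.
# Returns the leading word only if it is followed by "(", else nil.
def plan_name(label)
  label[/\A\w+(?=\s*\()/]
end

puts plan_name('Basic (11/17/2011 - 12/17/2011)')   # prints Basic
puts plan_name('Premium (11/22/2011 - 12/22/2011)') # prints Premium
p plan_name('Basic')                                # prints nil
p 'Basic'.split.first                               # prints "Basic"
```

If the input format is guaranteed, split.first is the simpler choice; the regex is only worth it when malformed input should be rejected.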

How to pull the email address out of this string?

Here are two possible email string scenarios:
email = "Joe Schmoe <joe#example.com>"
email = "joe#example.com"
I always only want joe#example.com.
So what would the regex or method be that would account for both scenarios?
This passes your examples:
def find_email(string)
string[/<([^>]*)>$/, 1] || string
end
find_email "Joe Schmoe <joe#example.com>" # => "joe#example.com"
find_email "joe#example.com" # => "joe#example.com"
If you know your email is always going to be inside < >, you can take a substring using those characters as the start and end indexes.
If those are the only two formats, don't use a regex. Just use simple string parsing. If you find a "<>" pair, pull the email address out from between them; if you don't find those characters, treat the whole string as the email address.
Regexes are great when you need them, but for very simple patterns the cost of compiling and running a regex can exceed that of plain string manipulation. Avoiding extra libraries beyond what is core to the language will almost always be faster than going a different route.
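The string-parsing approach described above might look something like this. The method name extract_email is hypothetical, and the sketch assumes the two input formats from the question are the only ones that occur.

```ruby
# Hypothetical helper; the name is an assumption for this sketch.
# If the text contains a "<...>" pair, return what is between the brackets;
# otherwise treat the whole (stripped) string as the address.
def extract_email(text)
  open_idx  = text.index('<')
  close_idx = text.rindex('>')
  if open_idx && close_idx && open_idx < close_idx
    text[(open_idx + 1)...close_idx]
  else
    text.strip
  end
end

puts extract_email('Joe Schmoe <joe#example.com>') # prints joe#example.com
puts extract_email('joe#example.com')              # prints joe#example.com
```

Whether this is measurably faster than the regex version would need benchmarking on real input; the main gain here is readability.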
If you are willing to load an extra library, this has already been solved in the TMail gem:
http://lindsaar.net/2008/4/13/tip-5-cleaning-up-an-verifying-an-email-address-with-ruby-on-rails
TMail::Address.parse('Mikel A. <spam#lindsaar.net>').spec
=> "spam#lindsaar.net"

Ruby - remove pattern from string

I have a string pattern that, as an example, looks like this:
WBA - Skinny Joe vs. Hefty Hal
I want to truncate the pattern "WBA - " from the string and return just "Skinny Joe vs. Hefty Hal".
Assuming that the "WBA" spot will be a sequence of any letter or number, followed by a space, dash, and space:
str = "WBA - Skinny Joe vs. Hefty Hal"
str.sub(/^\w+\s-\s/, '')
# => "Skinny Joe vs. Hefty Hal"
By the way — RegexPal is a great tool for testing regular expressions like these.
If you need a more complex string replacement, you can look into writing a more sophisticated regular expression. Otherwise:
Keep it simple! If you only need to remove "WBA - " from the beginning of the string, use String#sub.
s = "WBA - Skinny Joe vs. Hefty Hal"
puts s.sub(/^WBA - /, '')
# => Skinny Joe vs. Hefty Hal
You can also remove the first occurrence of a pattern with the following snippet:
s[/^WBA - /] = ''
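One difference between these variants is worth spelling out: sub returns a new string and leaves the original alone, whereas the s[/regex/] = '' form modifies the string in place and raises IndexError when the pattern does not match. A minimal sketch, with hypothetical method names strip_league and strip_league!:

```ruby
# Hypothetical helpers; both names are assumptions for this sketch.

# Non-mutating: returns a copy without the leading "WORD - " prefix.
def strip_league(title)
  title.sub(/\A\w+ - /, '')
end

# Mutating: removes the prefix from the string itself.
# Element assignment raises IndexError when the regex does not match,
# so that case is rescued and the string is returned unchanged.
def strip_league!(title)
  title[/\A\w+ - /] = ''
  title
rescue IndexError
  title
end

bout = 'WBA - Skinny Joe vs. Hefty Hal'.dup
puts strip_league(bout) # prints Skinny Joe vs. Hefty Hal
puts bout               # prints WBA - Skinny Joe vs. Hefty Hal
strip_league!(bout)
puts bout               # prints Skinny Joe vs. Hefty Hal
```

For a fixed literal prefix, String#delete_prefix (Ruby 2.5 and later) is another option that avoids the regex altogether.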
