Extract address from RSS Feed description - yahoo-pipes

I have an RSS feed and the items are like this:
Place: Starbucks
Address: Example Street
Date: 31/05 22:10 - 17/06 22:38
How can I extract the address from all feed items and create a new string only with the address in Yahoo Pipes?

You can do it like this:
Use the Rename operator to make a copy of the description field to a work field
Use the Regex operator to extract the address part with a regular expression, for example replacing .*Address: ([^\n]+).* with $1
I put together an example pipe to demonstrate the technique:
http://pipes.yahoo.com/pipes/pipe.info?_id=a7f3d9e85f006e3f6c021c45dbb7ed38
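For readers who want to try the regular expression outside Pipes, here is a minimal Ruby sketch of the same replacement. The description string is a made-up sample based on the question, and the /m flag (which lets . match newlines in Ruby) is an assumption about how the pattern has to behave when the description spans several lines.

```ruby
# Hypothetical item description, laid out as in the question.
description = "Place: Starbucks\nAddress: Example Street\nDate: 31/05 22:10 - 17/06 22:38"

# /m lets "." match newlines, so the leading and trailing ".*" swallow
# the other lines and only the captured address survives the replacement.
address = description.sub(/.*Address: ([^\n]+).*/m, '\1')

puts address
```

If the capture group were allowed to cross newlines, the result would include the Date line as well, which is why the group is written as [^\n]+.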

Related

Ruby Regex to extract domain from email address

I have no real previous experience with regex.
I want to extract domain names from email addresses in the format below.
richardc@mydomain.com
so that the regex returns just: mydomain
With an explanation of how/why it works if possible!
Cheers
This captures the domain name in group \1 with (...) and replaces the whole string with that capture, which leaves only the domain name.
email = 'richardc@mydomain.com'
domain = email.gsub(/.+@([^.]+).+/, '\1')
# => mydomain
.+ matches one or more of any character (except \n), so the pattern matches the whole email string, while ([^.]+) captures the domain name: one or more characters that are not a dot.
If you want to take the parsing route instead, the mail gem will do the job. Note that it returns the full domain, including the TLD:
Mail::Address.new("richardc@mydomain.com").domain
# => mydomain.com
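As a possible alternative that needs neither gsub nor a gem, the following sketch uses only core String methods. The variable names are illustrative, and it assumes the address contains exactly one well-formed domain after the "@".

```ruby
email = 'richardc@mydomain.com'

# Everything after the first "@": the full domain, including the TLD.
full_domain = email.split('@', 2).last

# String#[] with a regex and a group index returns just that capture:
# the run of non-dot characters right after the "@".
name_only = email[/@([^.]+)/, 1]

puts full_domain
puts name_only
```

Unlike the gsub version, email[/@([^.]+)/, 1] returns nil when the string has no "@", instead of handing back the unchanged input.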

Capture Filter with Wildcard in IP Address

I am trying to customize a Wireshark capture so that it captures all IP addresses (both source and destination) with the IP address format xxx.xxx.xxx.100.
I used the following Capture Filter
ip matches /.*/.*/.*/.100
but the text box remains red.
These are not IP addresses in a particular range, just the fourth octet is 100
Your regex is a little off, as you need to use a backslash to escape the periods. Try this:
ip.host matches "\.100$"
That should match .100 at the end of the string.
Source: http://ask.wireshark.org/questions/22230/filter-for-partial-ip-address
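Edit: Try using the Display Filter (Analyze->Display Filters..), not the Capture Filter. The matches operator belongs to the display filter language; capture filters use a different (BPF) syntax, which is why the text box stays red.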
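To see what the escaped pattern accepts and rejects, here is a small Ruby sketch that applies the same idea to plain strings. The host list is invented for illustration, and Ruby's \z (end of string) is used where the filter above uses $.

```ruby
# Hypothetical addresses, as strings.
hosts = ['10.0.0.100', '192.168.100.1', '172.16.5.100', '10.0.0.1100']

# "\." is a literal dot; "\z" anchors at the end of the string,
# so only addresses whose final octet is exactly 100 pass.
matching = hosts.select { |host| host.match?(/\.100\z/) }

puts matching
```

Without the backslash, the dot would match any character, so a string ending in 1100 would pass as well.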

Yahoo pipes: how can I add additional nodes/elements to RSS/feed items

I am merging two feeds using Yahoo pipes and using the output feed on a website. However, I would like to identify the "feed source" for each item in the output feed. Is it possible to manipulate the original feeds so I can add another node/element to the feed items?
Thanks
One way to do that is using the Regex operator. Let's say you want to add a new field called source. You could use Regex with parameters:
In: item.source
replace: .*
with: (the text you want)
See it in action here:
http://pipes.yahoo.com/janos/7a3b9993cfc143d414fe7b637b1bd95a
That is, I have two feeds, I added a source attribute in the first with value "Question 1" and in the second with value "Question 2".
As a bonus (an interesting undocumented Yahoo Pipes hack), I used one more Regex after the Union to make the source appear in the title.
However, this only adds the attribute to the node in the pipe debugger. You can use it for further processing, as I did here by adding it to the title, but it won't create a <source> tag in the output. That's because the RSS output of Yahoo Pipes removes all fields that are not in the RSS standard. You can still see it in the JSON output though.
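The same flow (tag each feed, merge, then fold the tag into the title) can be sketched in Ruby. The hashes below are hypothetical stand-ins for feed items and are not what Pipes uses internally.

```ruby
# Hypothetical items standing in for the two source feeds.
feed_one = [{ 'title' => 'First post' }]
feed_two = [{ 'title' => 'Second post' }]

# Tag every item with its origin before merging (the Regex step per feed).
tag = ->(items, source) { items.map { |item| item.merge('source' => source) } }

# The Union step: concatenate the tagged feeds.
merged = tag.call(feed_one, 'Question 1') + tag.call(feed_two, 'Question 2')

# Fold the source into the title, since a standard RSS rendering
# would drop the custom field.
merged.each { |item| item['title'] = "[#{item['source']}] #{item['title']}" }

merged.each { |item| puts item['title'] }
```

Tagging has to happen before the merge, because once the items are combined nothing else records which feed they came from.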

Yahoo Pipes: filter items in a feed based on words in a text file

I have a pipe that filters an RSS feed and removes any item that contains "stopwords" that I've chosen. Currently I've manually created a filter for each stopword in the pipe editor, but the more logical way is to read these from a file. I've figured out how to read the stopwords out of the text file, but how do I apply the filter operator to the feed, once for every stopword?
The documentation states explicitly that operators can't be applied within the loop construct, but hopefully I'm missing something here.
You're not missing anything - the filter operator can't go in a loop.
Your best bet might be to generate a regex out of the stopwords and filter using that. e.g. generate a string like (word1|word2|word3|...|wordN).
You may have to escape any odd characters. Also I'm not sure how long a regex can be so you might have to chunk it over multiple filter rules.
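A rough Ruby sketch of generating such a pattern is shown below. The stopwords and items are made up, case-insensitive matching is an assumption, and the chunk size of 2 is arbitrary. Regexp.union takes care of escaping odd characters.

```ruby
# Hypothetical stopwords, as if read from the text file.
stopwords = ['spam', 'c++', 'buy now']

# Regexp.union escapes each string and joins them with "|".
# Re-wrapping the source adds case-insensitive matching.
pattern = Regexp.new(Regexp.union(stopwords).source, Regexp::IGNORECASE)

items = [
  { 'title' => 'Learning C++ templates' },
  { 'title' => 'Ruby release notes' },
  { 'title' => 'BUY NOW: cheap watches' }
]

# Keep only the items whose title matches none of the stopwords.
kept = items.reject { |item| item['title'].match?(pattern) }

# If one pattern grows too long, each_slice yields smaller groups
# that could become separate filter rules.
chunks = stopwords.each_slice(2).map { |group| Regexp.union(group) }

puts pattern.source
puts kept.inspect
```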
In addition to Gavin Brock's answer, the following Yahoo Pipe filters the feed items (title, description, link and author) according to multiple stopwords:
Pipes Info
Pipes Edit
Pipes Demo
Inputs
_render=rss
feed=http://example.com/feed.rss
stopwords=word1-word2-word3
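The inputs above suggest a dash-separated stopword parameter checked against four fields. The sketch below shows one way that logic could look in Ruby. The item data is invented, and both the case-insensitive comparison and the substring matching are assumptions, since the pipe's internals are not shown here.

```ruby
# Hypothetical input value mirroring the pipe's stopwords parameter.
stopwords_param = 'word1-word2-word3'
stopwords = stopwords_param.split('-').map(&:downcase)

fields = %w[title description link author]

items = [
  { 'title' => 'Hello', 'description' => 'contains WORD2 here',
    'link' => 'http://example.com/a', 'author' => 'ann' },
  { 'title' => 'Clean', 'description' => 'nothing to see',
    'link' => 'http://example.com/b', 'author' => 'bob' }
]

# Drop an item if any of the four fields contains any stopword.
kept = items.reject do |item|
  fields.any? do |field|
    text = item[field].to_s.downcase
    stopwords.any? { |word| text.include?(word) }
  end
end

kept.each { |item| puts item['title'] }
```

Using a dash as the separator means a stopword cannot itself contain a dash.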

Extract email addresses from a block of text

How can I create an array of email addresses contained within a block of text?
I've tried
addrs = text.scan(/ .+?@.+? /).map{|e| e[1...-1]}
but (not surprisingly) it doesn't work reliably.
How about this for a (slightly) better regular expression:
\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b
You can find this here:
Email Regex
Just an FYI, the problem with your regex is that it allows only one type of separator before or after an email address. It would also match a lone "@" if it is separated by spaces.
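Plugged into scan, the suggested expression could be used roughly as follows. The sample text is made up, and the /i flag is added because the pattern as written lists only uppercase letters.

```ruby
# The suggested pattern, made case-insensitive with /i.
email_pattern = /\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,4}\b/i

# Hypothetical block of text.
text = 'Write to alice@example.com or bob.smith@mail.example.org, but not to @ alone.'

# With no capture groups, scan returns an array of the matched strings.
addrs = text.scan(email_pattern)

puts addrs.inspect
```

The {2,4} limit on the last label means addresses under longer top-level domains are missed, so the upper bound may need raising for real-world input.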
