Yahoo Pipes Only One Item per Hour - yahoo-pipes

Hello, I'm building a Yahoo Pipe to feed my Facebook fan page. I have plenty of RSS feeds which stream pictures, and I want to limit the output to one picture per hour. But I'm completely new to Pipes and can't find an understandable tutorial. The pipe looks like this:
RSS1   RSS2  ...  RSSn
  |      |         |
  +----UNION-------+
         |
    PIPE OUTPUT

You can do that using this algorithm:
Create a new field that contains the date truncated to hours
Use the Unique operator on this new field to get only one item per hour
You could implement this using pipes like this:
Copy pubDate to, say, datepart, using a Rename operator, with params:
item.pubDate
Copy As
datepart
Truncate datepart, using a Regex operator, with params:
In = item.datepart
replace = ^(.{13}).*
with = $1
That is, since date fields are represented as YYYY-mm-DDTHH:MM:ssZ, we take the first 13 characters to keep the date up to and including the hour, and discard the rest. For example, if pubDate was 2013-11-03T13:34:37 then we get 2013-11-03T13.
Use a Unique operator based on item.datepart to filter items
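Outside of Pipes, the same truncate-then-unique idea can be sketched in a few lines of Python. This is only an illustration of the algorithm, not how Yahoo Pipes implements it; the function name one_item_per_hour and the sample feed are made up for the example, and items are assumed to be dicts with an ISO-style pubDate string.

```python
def one_item_per_hour(items):
    """Keep the first item seen for each hour of its pubDate."""
    seen_hours = set()
    kept = []
    for item in items:
        # First 13 characters of the date, e.g. "2013-11-03T13"
        hour_key = item["pubDate"][:13]
        if hour_key not in seen_hours:
            seen_hours.add(hour_key)
            kept.append(item)
    return kept


# Hypothetical sample data: two items in the 13:00 hour, one in 14:00.
feed = [
    {"title": "a", "pubDate": "2013-11-03T13:34:37Z"},
    {"title": "b", "pubDate": "2013-11-03T13:50:00Z"},
    {"title": "c", "pubDate": "2013-11-03T14:02:10Z"},
]
print([item["title"] for item in one_item_per_hour(feed)])  # ['a', 'c']
```

As with the Unique operator, the item that survives for each hour is simply the first one encountered, so sorting the input first changes which one is kept.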
As a simple demo, I put together a pipe for you that shows 1 question per month tagged yahoo-pipes on stackoverflow:
http://pipes.yahoo.com/pipes/pipe.info?_id=72fea3931e145324f308f0d5f6852d93
Note that you will get different results depending on where you put these elements. For example, you could put this logic after your union, to get one image per hour from all your source feeds combined. Or you could put this logic before your union, to get one image per hour per feed.
You might also ask, in case of multiple images per hour, which one will be picked? The first one. I think the default ordering is by pubDate. To make Yahoo Pipes pick a different item, insert an appropriate Sort operator before Unique.

Related

Extract data from webpage to Excel

I tried to automate this portal, but I'm having trouble because I'm new to UiPath.
This is a URL
I have to extract CompanyName, BrokerName, Address, and Phone into Excel for a number of records given by user input.
Since the client data is in one element and separated by breaks (br), I would suggest still using the Scrape Data feature (pick the first and second data set group) and pulling in the data set as-is, so it's in block format separated by new lines.
Then iterate through the results, split each result into a string array, iterate through that array, and evaluate each line using regex. If a line matches an address, email, phone, etc., handle it from there. You could put the results into a temp data table and then write that table to Excel.
Granted, your regular expressions might need some tuning and might miss a few, but it would be a good start.
Hope that helps get you started
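To make the split-and-classify step concrete, here is a rough sketch in Python rather than UiPath. The two patterns and the sample block are assumptions chosen for illustration; real data from the portal would likely need different or additional patterns.

```python
import re

# Illustrative patterns only; tune them to the actual data.
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,}$")
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def classify_line(line):
    """Label one line of a scraped block as 'email', 'phone' or 'other'."""
    text = line.strip()
    if EMAIL_RE.match(text):
        return "email"
    if PHONE_RE.match(text):
        return "phone"
    return "other"


def split_block(block):
    """Split a newline-separated block and classify each non-empty line."""
    return [(classify_line(l), l.strip()) for l in block.splitlines() if l.strip()]


# Hypothetical scraped block, one field per line.
sample = "Acme Corp\nJane Doe\n12 Main St\n(555) 123-4567"
for kind, text in split_block(sample):
    print(kind, text)
```

Lines labelled "other" would still need to be assigned to company, broker, or address, for example by their position within the block.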

Make a group by with a where in statement Laravel

I am writing a small script that loads rows from Attendee_logs and counts the total number of prints per hour.
First I load the id's from the attendees
$allAttendees->pluck('id')->implode(',')
So I get 389832, 321321 from this (these are the ids of the attendees, based on a group).
Now I want to group them by the hour.
But I cannot find out how I add the whereIn statement
$badgesPrintedByDate = DB::table('Attendee_logs')->select(DB::raw('hour(created_at)'), DB::raw('COUNT(id)'))->whereIn('id', [$allAttendees->pluck('id')->implode(',')])->groupBy(DB::raw('hour(created_at)'));
When I do it like this, I get an empty result.
But when I remove the whereIn I get a result.
So my question: how can I count the rows by hour while also filtering on the ids?
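The whereIn matches nothing because implode(',') turns the ids into a single string, so the query compares id against the one value '389832,321321'. Pass an array of ids instead. I think this is going to work: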
$badgesPrintedByDate = DB::table('Attendee_logs')->select(DB::raw('hour(created_at)'), DB::raw('COUNT(id)'))->whereIn('id', $allAttendees->pluck('id')->all())->groupBy(DB::raw('hour(created_at)'));
Instead of:
$allAttendees->pluck('id')->all()
which returns an array of ids, you can also use:
$allAttendees->pluck('id')->values()
Or:
$allAttendees->pluck('id')->toArray();
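The difference between passing one comma-joined string and passing a list of ids can be reproduced outside Laravel. The sketch below uses Python's sqlite3 module with an in-memory table; the table layout, the sample rows, and the helper names make_demo_db and count_by_hour are assumptions made for the demonstration, and SQLite's strftime('%H', ...) stands in for MySQL's hour().

```python
import sqlite3


def make_demo_db():
    """Build an in-memory table with a few made-up log rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE attendee_logs (id INTEGER, created_at TEXT)")
    conn.executemany(
        "INSERT INTO attendee_logs VALUES (?, ?)",
        [
            (389832, "2020-01-01 09:15:00"),
            (321321, "2020-01-01 09:45:00"),
            (389832, "2020-01-01 10:05:00"),
            (111111, "2020-01-01 10:30:00"),
        ],
    )
    return conn


def count_by_hour(conn, ids):
    """Count rows per hour, restricted to the given ids."""
    placeholders = ",".join("?" for _ in ids)  # one "?" per id
    sql = (
        "SELECT strftime('%H', created_at) AS hour, COUNT(id) "
        "FROM attendee_logs "
        f"WHERE id IN ({placeholders}) "
        "GROUP BY hour ORDER BY hour"
    )
    return conn.execute(sql, ids).fetchall()


conn = make_demo_db()
print(count_by_hour(conn, [389832, 321321]))   # list of ids: rows are matched
print(count_by_hour(conn, ["389832,321321"]))  # one joined string: no rows match
```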

Iterate through items on a given date within date range rails

I have the feeling this has been asked before, but after searching I cannot find a clear description.
I have a rails app that holds items that occur on a specific date (like birthdays). Now I would like to make a view that creates a table (or something else, divs are all right as well) that states a specified date once and then iterates over the related items one by one.
Items have a date field and are, of course, not related to a date in a separate table or something.
I can of course query the database ~30 times (as I want a representation of one month's worth of items), but I think it looks ugly and would be massively repetitive. I would like the outcome to look like this (consider it a table with two columns for the time being):
Jan/1 | jan1.item1.desc
      | jan1.item2.desc
      | jan1.item3.desc
Jan/2 | jan2.item1.desc
      | etc.
So I think I need to know two things: how to construct a correct query (but it could be that this is as simple as Item.where("date > ? < ?", lower_bound, upper_bound)) and how to translate that into the view.
I have also thought about a hash with a key for each individual day and an array for the values, but I'd have to construct that like above(repetition) which I expect is not very elegant.
Using GROUP BY does not seem to get me anything different (apart from the grouping, of course, of the items) to work with than other queries. Just an array of objects, but I might do this wrong.
Sorry if it is a basic question. I am relatively new to the field (and programming in general).
If you're making a calendar, you probably want to GROUP BY date:
SELECT COUNT(*) AS instances, DATE(`date`) AS on_date FROM items GROUP BY DATE(`date`)
This is presuming your column is literally called date, which, seeing as that's a SQL reserved word, is probably a bad idea. If that's the case, you'll need to escape it whenever it's used, with backticks (`) here in MySQL notation. Postgres and others use a different approach.
For instances in a range, what you want is probably the BETWEEN operator:
@items = Item.where("`date` BETWEEN ? AND ?", lower_bound, upper_bound)
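Once the items for the range have been fetched in a single query, the per-day layout from the question comes from grouping them in memory. The sketch below shows that grouping in Python rather than Ruby; it is not part of the answer above, and the (date, description) tuples and the name group_by_day are made up for illustration.

```python
from collections import defaultdict
from datetime import date


def group_by_day(items):
    """Group (day, description) pairs into {day: [descriptions]}, sorted by day."""
    grouped = defaultdict(list)
    for day, description in items:
        grouped[day].append(description)
    return dict(sorted(grouped.items()))


# Hypothetical result of one range query.
items = [
    (date(2024, 1, 2), "jan2.item1.desc"),
    (date(2024, 1, 1), "jan1.item1.desc"),
    (date(2024, 1, 1), "jan1.item2.desc"),
]

for day, descriptions in group_by_day(items).items():
    # Print the date once, then each related item under it.
    for i, description in enumerate(descriptions):
        label = day.strftime("%b/%d") if i == 0 else " " * 6
        print(f"{label} | {description}")
```

A view can loop over the grouped structure the same way: outer loop over days, inner loop over that day's items.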

Ordering photos both by timestamp and manually

Context
I'm working on a small web app to store photos. Photos are ordered according to their timestamp (the date they've been taken), and it's working great. Here's a simplified look at the database:
+----+------------+
| id | timestamp  |
+----+------------+
| 1  | 1000000003 |
| 2  | 1000000000 |
+----+------------+
Now I'd like to add the possibility to re-order photos. And I can't find a way of doing that without any downsides.
What I did
I first added a column to the table to save a custom order.
+----+------------+-------+
| id | timestamp  | order |
+----+------------+-------+
| 1  | 1000000003 | 1     |
| 2  | 1000000000 | 2     |
+----+------------+-------+
First issue: I believe I can't order photos according to two different criteria, because it'd be hard to know which one has to be given precedence.
So I'm ordering them using the order column, and only this one. When I added the order column, I gave each photo a value so that the current order would remain. I now have photos ordered by order, in the same order as when they were ordered by timestamp.
I can now re-order some photos manually, and the other ones will stay where they belong. The first issue has been solved.
But now, I want to add a new photo.
Second issue: I know when the new photo I'm adding has been taken, but my photos aren't ordered by their timestamp anymore. This photo needs to be correctly ordered, thus it needs a correct order value.
This is the issue: a correct order value.
Here are two ways I could handle a new photo:
Give it an order value greater than others. In the previous table, a new photo would be given order = 3. This is obviously a bad idea, since it doesn't take its timestamp into account. A recent photo would still be the last one displayed.
"Insert" it where it belongs, according to its timestamp. Looking at the same table, if the timestamp of the new photo was 1000000002, the new photo would be given order = 2, and the order of every following photo would be increased by 1.
The second solution looks great, except in one case: if the order of the photo #2 had been manually changed to let's say 50, the new photo would have been given order = 50 even though it belongs among the first photos (according to its timestamp).
What I need
What I need is a way of ordering photos according to their timestamp and to their manually-set order.
Maybe you have a solution to the second issue I highlighted, or maybe you're aware of a whole other way to deal with this. Either way, thank you for your help.
At no point in your question do you mention computers or programming languages. This is OK (actually, it's a good approach, get the problem and solution worked out on paper before coding) and here's an answer which also ignores computers and programming languages.
Put all your photos into a shoebox in the order in which you get them.
Now, take three pieces of paper:
On page 1 write the numbers (one to a line) from 1 to N (the number of photos the box can hold). Whenever you put a photo in the box, write its timestamp on the line corresponding to its order in the box.
On page 2 write the timestamp of photo 1 a few lines down. Write a 1 on the same line. For the next photo, write its timestamp in the appropriate place on the paper, leaving as much space above and below as seems necessary for future photo insertions. Write a 2 on the same line. Continue until you run out of space between lines, when you need to copy all the information onto a new version of the page with more space for insertions. The information on this page is the same as the information on page 1, but with the two numbers on each line swapping positions.
On page 3 write the numbers from 1 to N again. As you collect each photo, write its number from page 1 (i.e. its number in the sequence of all photo numbers) in the correct position for your manually-set ordering. You'll probably have to do a lot of rubbing-out and re-writing on this page as you decide that latecomers ought to be inserted high on this page.
Now you have:
a store for your photos, the shoebox; you should already have realised that you can't store the photos in more than one order at a time;
three indexes (indices if you prefer); the first is fixed and simply assigns a unique sequence number to each photo; it also tells you the timestamp of each photo in the box;
the second index enables you to find the unique sequence number of a photo given its timestamp, and then find the photo in the shoebox;
the third index allows you to order photos as you wish; the first number on each line is the sequence number in the sorted order, the second number is the photo's unique sequence number from the first index.
All of this is an extremely long-winded way of telling you that, since you can't (either in a shoebox or a computerised data store) keep photos in multiple orders simultaneously, you will have to maintain indices for the orderings you wish to use. Those indices point (that's what an index does) from a number to a location in the shoebox, either directly or indirectly.
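Translated back into code, the shoebox and its three pages could look roughly like the Python sketch below. It is one possible reading of the description, not a definitive design: the class name PhotoStore and its methods are invented for the example, and where a new photo should land in the manual ordering is left as a policy decision (here it is simply appended).

```python
import bisect


class PhotoStore:
    """A store plus three indexes, mirroring the shoebox and its three pages."""

    def __init__(self):
        self.timestamps = []    # page 1: sequence number (index + 1) -> timestamp
        self.by_time = []       # page 2: sorted (timestamp, sequence number) pairs
        self.manual_order = []  # page 3: sequence numbers in the manual order

    def add(self, timestamp):
        """Store a photo and return its unique sequence number."""
        self.timestamps.append(timestamp)
        seq = len(self.timestamps)
        bisect.insort(self.by_time, (timestamp, seq))  # keep page 2 sorted
        self.manual_order.append(seq)  # assumption: new photos go last manually
        return seq

    def move(self, seq, new_position):
        """Move a photo to a new 0-based position in the manual ordering."""
        self.manual_order.remove(seq)
        self.manual_order.insert(new_position, seq)

    def in_time_order(self):
        """Sequence numbers ordered by timestamp."""
        return [seq for _, seq in self.by_time]


store = PhotoStore()
store.add(1000000003)
store.add(1000000000)
store.add(1000000002)
store.move(3, 0)
print(store.in_time_order())  # [2, 3, 1]
print(store.manual_order)     # [3, 1, 2]
```

The photos themselves are never reordered; only the indexes change, so both orderings stay available at once.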

How do I return multiple columns of data using ImportXML in Google Spreadsheets?

I'm using ImportXML in a Google Spreadsheet to access the user_timeline method in the Twitter API. I'd like to extract the created_at and text fields from the response and create a two-column display of the results.
Currently I'm doing this by calling the API twice, with
=ImportXML("http://twitter.com/status/user_timeline/matthewsim.xml?count=200","/statuses/status/created_at")
in the cell at the top of one column, and
=ImportXML("http://twitter.com/status/user_timeline/matthewsim.xml?count=200","/statuses/status/text")
in another.
Is there a way for me to create this display with a single call?
ImportXML supports the XPath | (union) operator, so you can combine as many queries as you like.
=ImportXML("http://url"; "//@author | //@catalogid | //@publisherid")
However it does not expand the results into multiple columns. You get a single column of repeating triplets (or however many attributes you've selected) as shown below in column A.
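What the spreadsheet formulas in this thread do is regroup that single flat column into rows of fixed width. As a plain illustration of that reshaping, independent of Sheets, here is a small Python sketch; the function name to_rows and the sample values are made up, and it assumes every record contributes exactly one value per selected field, in the same order.

```python
def to_rows(flat, width):
    """Split a flat list of repeating fields into rows of `width` columns."""
    return [flat[i:i + width] for i in range(0, len(flat), width)]


# Hypothetical single-column result of a two-field query.
column = ["date 1", "text 1", "date 2", "text 2", "date 3", "text 3"]
for row in to_rows(column, 2):
    print(row)
```

If some record is missing a field, the values shift and the columns no longer line up, which is a limitation of any approach that relies on the repeating pattern.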
The following is deprecated
2015.06.16: continue is not available in "the new Google Sheets" (see: The Google Documentation for continue).
However you don't need to use the automatically inserted CONTINUE() function to place your results.
=CONTINUE($A$2, (ROW()-ROW($A$2)+1)*$A$1-B$1, 1)
Placed in B2 that should cleanly fill down and right to give you sane column data.
ImportXML is in A2.
A3 and below are how the CONTINUE() functions are automatically filled in.
A1 is the number of attributes.
B1:D1 are the attribute index for their columns.
Another way to convert the rows of =CONTINUE() into columns is to use transpose():
=transpose(importxml("http://url","//a | //b | //c"))
Just concatenate your queries with "|"
=ImportXML("http://twitter.com/status/user_timeline/matthewsim.xml?count=200","/statuses/status/created_at | /statuses/status/text")
I posed this question to the Google Support Forum and this was a solution that worked for me:
=ArrayFormula(QUERY(QUERY(IFERROR(IF({1,1,0},IF({1,0,0},INT((ROW(A:A)-1)/2),MOD(ROW(A:A)-1,2)),IMPORTXML("http://example.com","//td/a | //td/a/@href"))),"select min(Col3) where Col3 <> '' group by Col1 pivot Col2",0),"offset 1",0))
Replace the contents of IMPORTXML with your data and query and see if that works for you.
Apparently, this attempts to invoke the IMPORTXML function only once. It's a solution for now, at least.
Here's the full thread.
This is the best solution (NOT MINE) posted in the comments below. To be honest, I'm not sure how it works. Perhaps @Pandora, the original poster, could provide an explanation.
=ArrayFormula(iferror(hlookup(1,{1;ARRAY},(row(A:A)+1)*2-transpose(sort(row(A1:A2)+0,1,0)))))
This is a very ugly solution and doesn't even explain how it works. At least I couldn't get it to work due to multiple errors, e.g. too many parameters for IF (because an array is used). A shorter solution can be found here: =ArrayFormula(iferror(hlookup(1,{1;ARRAY},(row(A:A)+1)*2-transpose(sort(row(A1:A2)+0,1,0))))) "ARRAY" can be replaced with the IMPORTXML function. This formula can be used for as many XPaths as one wants. – Pandora Mar 7 '19 at 15:51
In particular, it would be good to know how to modify the formula to accommodate more columns.

Resources