how to get interactions from the top klout scorers using datasift - klout

I need to retrieve the interactions from top klout scorers using datasift.
Ex:
CSDL: interaction.content contains "music" and klout.score > 50"
This is giving the interactions from users who has klout score more than 50,
But i need the top 10 or 20 interactions.

Related

Importfeed not pulling in all results (googlesheets)

I'm using =importfeed to pull in a Google Alert. It only gives me about 20 results, when I know there are more that are fed to me in my alert email (something like 50).
For example, I set an alert on Starbucks. I went to the bottom of the email I received, and copied the address of the RSS feed, stuck that in cell A1, and appended ?output=atom.
This is the formula I have:
=importfeed(A1,"items title",true)
Here's what's in A1:
https://www.google.com/alerts/feeds/10136724742523748721/15764223666146635885?output=atom
I've also added to the importfeed formula 10,20,100 after True for the desired query result. 10 gives me 10, 20 gives me 20, but 100 only gives me the 20, and no specification gives me 20.
Any suggestions?

Get all open search (point in time) in Elasticsearch

To create a new point a time we do
POST /my-index-000001/_pit?keep_alive=1m
output:
{
"id": "46ToAwMDaWR5BXV1aWQyKwZub2RlXzMAAAAAAAAAACoBYwADaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQADaWR5BXV1aWQyKgZub2RlXzIgAAAAA=="
}
It will return a PIT ID.
Now to get the total number of open search, we do
GET /_nodes/stats/indices/search
and output has
"open_contexts" : 18,
As Elastic says that we should close/delete the PIT after the work is done. Close the Point in time
Now the question:
how can we get ids of all open search or PITs?
if we can not get the Ids, is there any way to close the search after a time or it is auto-closed after keep-alive time ended.
I am not able to find any documentation regarding this.
There's no way to do retrieve the IDs of the opened search contexts, you're supposed to keep track of them in your application code.
As the link you provided states:
Point-in-time is automatically closed when its keep_alive has been elapsed.
So in your case, if you don't issue a new query after 1 minute, then the search context will be automatically closed.

How to implement queue management system in Redis

Im trying to implement queuing system using redis. So lets say we have this list:
> LPUSH listTickets 1 2 3 4 5 6 7 8 9
and as mobile app user someone was assigned number 4, and another user number 6. Now I want to display to them how many tickets are in front of them ( also to calculate the waiting estimated time). Now this may seem easy,
> LPOP listTicket
"4"
then we broadcast the result (the current ticket number to be called), then on mobile app everyone will subtract it from their own ticket number. for example my current ticket is 6 so 6-4=2 That way every user know how many tickets are ahead
However once you want to add feature to let the user delete their ticket or pushing it to the end of the queue, things get complicated. After deleting for example
> LRANGE listTicket 0 -1
1) "2"
2) "4"
3) "6"
4) "7"
5) "8"
when we LPOP listTicket we will get number 2 and the mobile app with ticket 6 will calculate 6-2 and get 4 which is the wrong calculation.
Do you have any algorithm in mind? Is getting the index of every ticket in the list every time someone delete their ticket an expensive process to calculate? Should I just send all tickets in the queue and let the users calculate their own position?
I want the system to be scale-able. So can the redis-node server handle total of 50 thousands (over different queues or lists) connected mobile apps getting their ranking in the queue they subscribed their ticket to.
I thought about using ZRANK but how will this load the server?

Unable to get results more than 100 results on google custom search api

I need to use Google Custom Search API https://developers.google.com/custom-search/v1/overview. From that page, it said:
For CSE users, the API provides 100 search queries per day for free.
If you need more, you may sign up for billing in the Developers
Console. Additional requests cost $5 per 1000 queries, up to 10k
queries per day.
I already sign up for billing inside the developer console. However, I still could not retrieve results more than 100. What things should I do more? https://www.googleapis.com/customsearch/v1?cx=CSE_INSTANCE&key=API_KEY&q=QUERY&start=100
{ error: { errors: [ { domain: "global", reason: "invalid", message:
"Invalid Value" } ], code: 400, message: "Invalid Value" } }
Query: Definition
https://support.google.com/customsearch/answer/1361951
Any actual user query from a Google Site Search engine, including but
not limited to search engines installed on your website using XML,
iFrame, or the Custom Search Element.
That means you would probably need to send eleven queries to get more than 100 results.
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=1
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=11
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=21
GET ...
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=81
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=91
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=101
Check every response and if error code is 400, you can stop - there is probably no need to send next (&start=previous+10) request.
Now you can merge responses and start building results page.
Google Custom Search and Google Site Search return up to 10 results
per query. If you want to display more than 10 results to the user,
you can issue multiple requests (using the start=0, start=11 ...
parameters) and display the results on a single page. In this case,
Google will consider each request as a separate query, and if you are
using Google Site Search, each query will count towards your limit.
There might be a better way to do this then I described above. (But, I'm not sure about batching API calls.)
And (finally) possible answer to your question: I made more than few tests, but I haven't had any luck with start greater than 100 (I was getting the same as you - <Response [400]>). I'm using "Browser key" from my billing-enabled project. That could mean we can't get 101st, 102nd, 103rd, etc. results with CSE API.
The API documentation says it never returns more than 100 items.
https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list
start
integer (uint32 format)
The index of the first result to return. The default number of results
per page is 10, so &start=11 would start at the top of the second page
of results. Note: The JSON API will never return more than 100
results, even if more than 100 documents match the query, so setting
the sum of start + num to a number greater than 100 will produce an
error. Also note that the maximum value for num is 10.

Can Cube (js metrics framework) return more than 1000 events?

The Cube software (https://github.com/square/cube) allows you to retrieve events.
I want to retrieve a lot of events. But it appears that I am capped at 1000. There are well over 9000 in mongodb in the collection and time range I am querying
Example http GET queries I issue:
# 1000 results
http://1.2.3.4:1081/1.0/event?expression=my_event_type
# 1000 results
http://1.2.3.4:1081/1.0/event?expression=my_event_type&start=2012-02-02&stop=2013-07-03
# 7 results
http://1.2.3.4:1081/1.0/event?expression=my_event_type&limit=7
# 1000 results
http://1.2.3.4:1081/1.0/event?expression=my_event_type&limit=9999
It appears that the limit is pinned:
https://github.com/square/cube/blob/28dad4af27a6680deb46077b16952590f2c21cad/lib/cube/event.js
Line 166
based on the 'batchSize=1000'
Is it possible that you can 'page' through the data in some way? Or is this just a hard limit?
Looks like there is a hard cap on results in three places that need to be updated for large domains:
event.js - line 166
metric.js - line 11
metric.js - line 12
In addition, I was unable to find any query-string apis for the parameters. Ideally, we can leave the cap at 1000 (to avoid server bloat for people not tuning their queries correctly) and allow the consumer to define override behavior.

Resources