I'm using the tweetstream gem to get sample tweets from the Twitter Streaming API:
TweetStream.configure do |config|
  config.username = 'my_username'
  config.password = 'my_password'
  config.auth_method = :basic
end

@client = TweetStream::Client.new
@client.sample do |status|
  puts status.text
end
However, the script stops printing tweets after about 100 of them (the script itself keeps running). What could be the problem?
The Twitter Search API imposes limits that look arbitrary from the outside. From the docs:
GET statuses/:id/retweeted_by Show user objects of up to 100 members who retweeted the status.
From the gem, the code for the method is:
# Returns a random sample of all public statuses. The default access level
# provides a small proportion of the Firehose. The "Gardenhose" access
# level provides a proportion more suitable for data mining and
# research applications that desire a larger proportion to be statistically
# significant sample.
def sample(query_parameters = {}, &block)
  start('statuses/sample', query_parameters, &block)
end
I checked the API docs and don't see an entry for 'statuses/sample', but judging by entries like the one above, I'm assuming you've hit a similar cap of around 100 on whatever statuses/xxx endpoint is being accessed.
Also, correct me if I'm wrong, but I believe Twitter no longer accepts basic auth; you must use OAuth. If that's the case, you're effectively unauthenticated, and the API will limit you in other ways too; see https://dev.twitter.com/docs/rate-limiting
Hope that helps.
OK, I made a mistake there: I was looking at the Search API when I should have been looking at the Streaming API (my apologies). But it's possible some of what I described is still the cause of your problems, so I'll leave it up. Twitter has definitely moved away from basic auth, so I'd try resolving that first; see:
https://dev.twitter.com/docs/auth/oauth/faq
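If you do switch TweetStream to OAuth, the configuration would look roughly like this (a sketch based on the gem's README; the credential values are placeholders from your dev.twitter.com application):

TweetStream.configure do |config|
  config.consumer_key       = 'YOUR_CONSUMER_KEY'
  config.consumer_secret    = 'YOUR_CONSUMER_SECRET'
  config.oauth_token        = 'YOUR_OAUTH_TOKEN'
  config.oauth_token_secret = 'YOUR_OAUTH_TOKEN_SECRET'
  config.auth_method        = :oauth
end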
How can I get ALL records from Route 53?
I'm referring to the code snippet here, which seemed to work for someone else but isn't clear to me: https://github.com/aws/aws-sdk-ruby/issues/620
I'm trying to get all of my records (about 7,000 of them) via resource record sets, but I can't seem to get the pagination to work with list_resource_record_sets. Here's what I have:
route53 = Aws::Route53::Client.new
response = route53.list_resource_record_sets({
  start_record_name: fqdn(name),
  start_record_type: type,
  max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
})
response.last_page?
response = response.next_page until response.last_page?
I've verified I'm pointed at the right region, and I can see the record I'm trying to get (so I can delete it later) in the AWS console, but I can't seem to get it through the API. I used the issue linked above as a starting point.
Any ideas about what I'm doing wrong? Or is there an easier way (perhaps another method in the API I'm not finding) to get just the record I need, given the hosted_zone_id, type, and name?
The issue you linked is for the Ruby AWS SDK v2, but the latest is v3. It also looks like things may have shifted around since 2014, as I'm not seeing the #next_page or #last_page? methods in either the v2 or the v3 API.
Consider using the #next_record_name and #next_record_type from the response when #is_truncated is true. That's more consistent with how other paginations work in the Ruby AWS SDK, such as with DynamoDB scans for example.
Something like the following should work (though I don't have an AWS account with records to test it out):
route53 = Aws::Route53::Client.new
hosted_zone = 'YOUR_HOSTED_ZONE_ID' # Required field according to the API docs
next_name = fqdn(name)
next_type = type

loop do
  response = route53.list_resource_record_sets(
    hosted_zone_id: hosted_zone,
    start_record_name: next_name,
    start_record_type: next_type,
    max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
  )
  records = response.resource_record_sets

  # Break here if you find the record you want
  # Also break if we've run out of pages
  break unless response.is_truncated

  next_name = response.next_record_name
  next_type = response.next_record_type
end
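If the goal is ultimately just to fetch one record given the hosted zone, name, and type, the same loop can return early on a match. A minimal sketch along the same lines (equally untested; note that Route 53 returns record names fully qualified with a trailing dot, which is presumably what the question's fqdn helper handles):

def find_record(route53, hosted_zone_id, name, type)
  next_name = name
  next_type = type
  loop do
    response = route53.list_resource_record_sets(
      hosted_zone_id: hosted_zone_id,
      start_record_name: next_name, # listing starts at this name,
      start_record_type: next_type, # so a match should surface early
      max_items: 100
    )
    match = response.resource_record_sets.find do |r|
      r.name == name && r.type == type
    end
    return match if match
    return nil unless response.is_truncated
    next_name = response.next_record_name
    next_type = response.next_record_type
  end
end

record = find_record(route53, 'YOUR_HOSTED_ZONE_ID', fqdn(name), type)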
According to Twilio's documentation here regarding "paging":
The list returned to you includes paging information. If you plan on requesting more records than will fit on a single page, you may want to use the provided next_page_uri rather than incrementing through the pages by page number.
It then gives an example:
# Initialize Twilio Client
@client = Twilio::REST::Client.new(account_sid, auth_token)

@client.calls.list.each do |call|
  puts call.direction
end
However, doing this just returns an array of all calls; there isn't any paging information, limiting of results, or any "pages".
For my purposes I'm actually filtering the query like this:
@calls = @client.calls.list(
  start_time_after: @time,
  start_time_before: @another_time
)
Because my date filter range is a one-month period and there are currently about 4.5k calls to retrieve, it's taking quite a while to process (and sometimes it never finishes).
I'm using the Twilio helper library Ruby gem twilio-ruby and running Ruby 2.5.
I've also tried PHP with its respective Twilio helper library and found the same result.
Using curl, however, does work and gives paging information; it's also incredibly fast compared to the helper libraries.
Twilio developer evangelist here.
list will paginate through, loading all the resources it can.
There are other calls that will stream the API in a lazier fashion, if that is more useful for your use case. For example, you can use each and it will load the calls lazily until they have run out.
@client.calls.each(
  start_time_after: @time,
  start_time_before: @another_time
) do |call|
  puts call.direction
end
If you do want to manually paginate yourself, you can use the page method to get a CallPage object and iterate from there.
page = @client.calls.page(
  start_time_after: @time,
  start_time_before: @another_time
)

until page.nil?
  page.each { |call| puts call.direction }
  page = page.next_page
end
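If raw speed is the concern, it may also help to raise the page size so the ~4.5k calls come back in fewer HTTP round trips. A sketch, assuming twilio-ruby 5.x's page_size parameter (Twilio caps it server-side, reportedly at 1000):

page = @client.calls.page(
  start_time_after: @time,
  start_time_before: @another_time,
  page_size: 1000 # assumed maximum; the default is much smaller
)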
Let me know if that helps at all.
I'm trying to make an app which iterates through my own posts and gets a list of users who favorited each post. Afterwards, I'd like the application to follow each of those users, if I'm not already following them. I'm using Ruby for this.
This is my code now:
@client = Twitter::REST::Client.new(config)
OpenSSL::SSL::VERIFY_PEER = OpenSSL::SSL::VERIFY_NONE

user = @client.user
tweets = @client.user_timeline(user).take(20)
num_of_tweets = tweets.length
puts "tweets found: #{tweets.length}"

tweets.each do |item|
  puts item # iterating through my posts here
end
Any suggestions?
That information isn't exposed in the Twitter API, either through a timeline collection or via the endpoint representing a single tweet. That's why the twitter gem, which provides a usable interface around the REST API, can't give you what you're after.
Third-party sites such as Favstar do display that information, but as far as I know their own API does not expose the relevant users in any manageable way.
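For what it's worth, the following half of your task is straightforward with the twitter gem, should you ever obtain the list of favoriters some other way. A sketch where candidate_ids is a hypothetical array of user IDs from such a source:

friend_ids = @client.friend_ids.to_a # IDs you already follow
to_follow  = candidate_ids - friend_ids
@client.follow(*to_follow) unless to_follow.empty?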
I've found Facebook's Graph API Explorer tool (https://developers.facebook.com/tools/explorer/) to be an incredibly easy, beginner-friendly, and effective way to use Facebook's Graph API via its GUI.
I'd like to be able to use the koala gem to pass these generated URLs to Facebook's API.
Right now, let's say I have a query like this:
url = "me?fields=id,name,posts.fields(likes.fields(id,name),comments.fields(parent,likes.fields(id,name)),message)"
I'd like to be able to pass that directly into koala as a single string.
@graph.get_connections(url)
It doesn't like that, so I separate out the UID and the ? operator the way the gem seems to want:
url = "fields=id,name,posts.fields(likes.fields(id,name),comments.fields(parent,likes.fields(id,name)),message)"
#graph.get_connections("me", url)
This, however, returns an error as well:
Koala::Facebook::AuthenticationError:
type: OAuthException, code: 2500,
message: Unknown path components: /fields=id,name,posts.fields(likes.fields(id,name),comments.fields(parent,likes.fields(id,name)),message) [HTTP 400]
Currently this is where I'm stuck. I'd like to continue using koala because I like the gem approach to working with APIs, especially when it comes to OAuth and OAuth2.
UPDATE:
I'm starting to break the request down into pieces which the koala gem can handle, for example:
posts = @graph.get_connections("me", "posts")
post_ids = posts.map { |p| p['id'] }
likes = post_ids.inject([]) { |ary, id| ary << @graph.get_connections(id, "likes") }
So that's a long way of getting two arrays, one of posts, one of like data.
But I'd burn through my API request limit in no time using this kind of approach.
I was kind of hoping I'd just be able to pass the whole string from the Graph API Explorer and just get what I wanted rather than having to manually parse all this stuff.
I don't really know about your posts.fields(likes.fields(id,name)) and similar (that syntax does not work in the Graph API Explorer), but I know you can do this:
fb_api = Koala::Facebook::API.new(access_token)
fb_api.api("/me?fields=id,name,posts")
# => {"id"=>"71170", "name"=>"My Name", "posts"=>{"paging"=>{"next"=>"https://graph.facebook.com/71170/posts?access_token=CAAEO&limit=25&until=13705022", "previous"=>"https://graph.facebook.com/711737070/posts?access_token=CAAEOTYMZD&limit=25&since=1370723&__previous=1"}, "data"=>[{"id"=>"71170_1013572471", "comments"=>{"count"=>0}, "created_time"=>"2013-06-09T08:03:43+0000", "from"=>{"id"=>"71170", "name"=>"My Name"}, "updated_time"=>"2013-06-09T08:03:43+0000", "privacy"=>{"value"=>""}, "type"=>"status", "story_tags"=>{"0"=>[{"id"=>"71170", "name"=>" ", "length"=>8, "type"=>"user", "offset"=>0}]}, "story"=>" likes a photo."}]}}
And you will receive the hash you asked for.
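The same query can also go through Koala's higher-level methods, passing the fields as an argument hash rather than splicing them into the path (a sketch against Koala's get_object signature):

fb_api.get_object("me", "fields" => "id,name,posts")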
Sometimes you must pass nil as a param to koala:
result += graph_api.batch do |batch_api|
  facebook_page_ids.each do |facebook_page_id|
    batch_api.get_connections(facebook_page_id, nil, { "fields" => "posts" })
  end
end
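Each batch like this goes out as a single HTTP round trip, which helps with the request-count worry in the question, although Facebook may still meter the inner requests individually.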
I'm not sure how to solve a big performance issue in my application. I'm using open-uri to request the most popular videos from YouTube, and when I ran perftools (https://github.com/tmm1/perftools.rb), it showed that the biggest performance issue is Timeout.timeout. Can anyone suggest how to solve this?
I'm using Ruby 1.8.7.
Edit:
This is the output from my profiler
https://docs.google.com/viewer?a=v&pid=explorer&chrome=true&srcid=0B4bANr--YcONZDRlMmFhZjQtYzIyOS00YjZjLWFlMGUtMTQyNzU5ZmYzZTU4&hl=en_US
Timeout is wrapping the function that is actually doing the work, ensuring that if the server fails to respond within a certain time, the code raises an error and stops execution. The time your profiler attributes to Timeout.timeout is mostly time spent waiting on the network inside that wrapper.
I suspect that what you are seeing is that the server is taking some time to respond. You should look at caching the response in some way.
For instance, using memcached (pseudocode)
require 'dalli'
require 'open-uri'

DALLI = Dalli::Client.new

class PopularVideos
  def self.get
    result = []
    unless result = DALLI.get("videos_#{Date.today}")
      doc = open("http://youtube/url")
      result = parse_videos(doc) # parse the doc somehow
      DALLI.set("videos_#{Date.today}", result)
    end
    result
  end
end
PopularVideos.get # calls your expensive parsing script once
PopularVideos.get # gets the result from memcached for the rest of the day
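If you'd rather not rely solely on the date-stamped key rolling over at midnight, Dalli's set also accepts a TTL in seconds as its third argument:

DALLI.set("videos_#{Date.today}", result, 60 * 60) # expire after an hour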