I'm using the Legato gem to access the total number of a certain event from Google Analytics in Ruby, but I'm getting results that are inconsistent with the web interface.
I have a model like so:
module Analytics
  class ViewedContent
    extend Legato::Model

    metrics :users, :new_users, :total_events

    filter :my_org do
      # Look up event_label=X AND event_action=Y
      [
        eql(:event_label, "My Organisation", Legato.and_join_character),
        eql(:event_action, 'Viewed_Content', Legato.and_join_character)
      ]
    end
  end
end
...then I use this by doing:
query = Analytics::ViewedContent.my_org.results(profile, {
  :start_date => start_date,
  :end_date   => end_date
})
...and looking at the totalEvents stat.
When I pass in dates in January, e.g. start_date = "2014-01-01".to_date and end_date = "2014-01-31".to_date, it works fine and returns the same number of totalEvents as the Google Analytics web interface.
However, when I use it for last month, start_date = "2014-07-01".to_date and end_date = "2014-07-31".to_date, the result is considerably lower than in the web interface (Legato returns 555 vs 662 in the web interface).
It makes me wonder if it's something to do with British Summer Time (I'm currently in UTC+1), except that even extending the date range by a day either side doesn't bring the total up to what the web interface reports, which would appear to rule that out.
Any thoughts much appreciated!
With a bit of help from Query Explorer I discovered that the problem is definitely at Google's end, rather than due to Legato. The problem was with this line here:
metrics :users, :new_users, :total_events
If I changed that to fetch just :total_events, it suddenly started returning the same value as the Google Analytics web interface. I now make a separate request for :users and :new_users by looking up a segment (rather than just this filter).
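For reference, here is a minimal sketch of the event-count-only query; the class name is illustrative, and the filter is the same as above:

module Analytics
  # Fetching only :total_events makes the API agree with the web interface.
  class ViewedContentEvents
    extend Legato::Model

    metrics :total_events

    filter :my_org do
      [
        eql(:event_label, "My Organisation", Legato.and_join_character),
        eql(:event_action, 'Viewed_Content', Legato.and_join_character)
      ]
    end
  end
end

query = Analytics::ViewedContentEvents.my_org.results(profile, {
  :start_date => start_date,
  :end_date   => end_date
})
# Read the totalEvents stat from the result as before.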
I am working on an app that allows Members to take a survey (Member has a one-to-many relationship with Response). Response holds the member_id, question_id, and their answer.
The survey is submitted all or nothing, so if there are any records in the Response table for that Member they have completed the survey.
My question is, how do I re-write the query below so that it actually works? In SQL this would be a prime candidate for the EXISTS keyword.
def surveys_completed
  members.where(responses: !nil).count
end
You can use includes and then test whether the related response(s) exist, like this:
def surveys_completed
  members.includes(:responses).where('responses.id IS NOT NULL')
end
Here is an alternative, with joins:
def surveys_completed
  members.joins(:responses)
end
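Note that a plain inner join produces one row per response, so a member with several responses is counted more than once; adding distinct (plain ActiveRecord, shown here as a sketch) avoids that:

def surveys_completed
  members.joins(:responses).distinct.count
end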
The solution using Rails 4:
def surveys_completed
  members.includes(:responses).where.not(responses: { id: nil })
end
Alternative solution using activerecord_where_assoc:
This gem does exactly what is asked here: use EXISTS to do a condition.
It works from Rails 4.1 up to the most recent.
members.where_assoc_exists(:responses)
It can also do much more!
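For example, you can also pass conditions on the association (a sketch based on the gem's README; the question_id value is illustrative):

# Members that have at least one response to question 1:
members.where_assoc_exists(:responses, question_id: 1)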
Similar questions:
How to query a model based on attribute of another model which belongs to the first model?
association named not found perhaps misspelled issue in rails association
Rails 3, has_one / has_many with lambda condition
Rails 4 scope to find parents with no children
Join multiple tables with active records
You can use the SQL EXISTS keyword in an elegant Rails-ish manner using the Where Exists gem:
members.where_exists(:responses).count
Of course you can use raw SQL as well:
members.where("EXISTS" \
"(SELECT 1 FROM responses WHERE responses.member_id = members.id)").
count
You can also use a subquery:
members.where(id: Response.select(:member_id))
In comparison to a query with includes, it will not load the associated models (which is a performance benefit if you do not need them).
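For illustration, the subquery version produces a single SQL statement along these lines (assuming conventional table and column names):

members.where(id: Response.select(:member_id)).to_sql
# => SELECT "members".* FROM "members"
#    WHERE "members"."id" IN (SELECT "responses"."member_id" FROM "responses")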
If you are on Rails 5 or above, you should use left_joins. Otherwise a manual "LEFT OUTER JOIN" will also work. This is more performant than using includes as mentioned in https://stackoverflow.com/a/18234998/3788753. includes will attempt to load the related objects into memory, whereas left_joins will build a "LEFT OUTER JOIN" query.
def surveys_completed
  members.left_joins(:responses).where.not(responses: { id: nil })
end
Even if there are no related records (like the query above, where you are finding by nil), includes still uses more memory. In my testing I found includes uses ~33x more memory on Rails 5.2.1. On Rails 4.2.x it was ~44x more memory compared to doing the joins manually.
See this gist for the test:
https://gist.github.com/johnathanludwig/96fc33fc135ee558e0f09fb23a8cf3f1
where.missing (Rails 6.1+)
Rails 6.1 introduces a new way to check for the absence of an association - where.missing.
Please have a look at the following code snippet:
# Before:
Post.left_joins(:author).where(authors: { id: nil })
# After:
Post.where.missing(:author)
And this is an example of the SQL query that is used under the hood:
Post.where.missing(:author)
# SELECT "posts".* FROM "posts"
# LEFT OUTER JOIN "authors" ON "authors"."id" = "posts"."author_id"
# WHERE "authors"."id" IS NULL
As a result, your particular case can be rewritten as follows:
def surveys_completed
  members.where.missing(:responses).count
end
Thanks.
Sources:
where.missing official docs.
Pull request.
Article from the Saeloun blog.
Notes:
where.associated - a counterpart for checking for the presence of an association is also available starting from Rails 7.
See official docs and this answer.
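A minimal sketch of that counterpart, reusing the Post/author example from above (Rails 7+):

Post.where.associated(:author)
# SELECT "posts".* FROM "posts"
# INNER JOIN "authors" ON "authors"."id" = "posts"."author_id"
# WHERE "authors"."id" IS NOT NULL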
How can I get ALL records from Route53?
Referring to the code snippet here, which seemed to work for someone but isn't clear to me: https://github.com/aws/aws-sdk-ruby/issues/620
I'm trying to get all of them (I have about 7,000 records) via resource record sets, but can't seem to get the pagination to work with list_resource_record_sets. Here's what I have:
route53 = Aws::Route53::Client.new
response = route53.list_resource_record_sets({
  start_record_name: fqdn(name),
  start_record_type: type,
  max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
})
response.last_page?
response = response.next_page until response.last_page?
I verified I'm hooked into the right region; I see the record I'm trying to get (so I can delete it later) in the AWS console, but can't seem to get it through the API. I used this as a starting point: https://github.com/aws/aws-sdk-ruby/issues/620
Any ideas on what I'm doing wrong? Or is there an easier way, perhaps another method in the API I'm not finding, for me to get just the record I need given the hosted_zone_id, type, and name?
The issue you linked is for the Ruby AWS SDK v2, but the latest is v3. It also looks like things may have changed around a bit since 2014, as I'm not seeing the #next_page or #last_page? methods in the v2 API or the v3 API.
Consider using the #next_record_name and #next_record_type from the response when #is_truncated is true. That's more consistent with how other paginations work in the Ruby AWS SDK, such as with DynamoDB scans for example.
Something like the following should work (though I don't have an AWS account with records to test it out):
route53 = Aws::Route53::Client.new
hosted_zone = ? # Required field according to the API docs
next_name = fqdn(name)
next_type = type

loop do
  response = route53.list_resource_record_sets(
    hosted_zone_id: hosted_zone,
    start_record_name: next_name,
    start_record_type: next_type,
    max_items: 100, # fyi - aws api maximum is 100 so we'll need to page
  )
  records = response.resource_record_sets

  # Break here if you find the record you want
  # Also break if we've run out of pages
  break unless response.is_truncated

  next_name = response.next_record_name
  next_type = response.next_record_type
end
According to Twilio's documentation here regarding "paging":
The list returned to you includes paging information. If you plan on requesting more records than will fit on a single page, you may want to use the provided nextpageuri rather than incrementing through the pages by page number.
It then gives an example:
# Initialize Twilio Client
@client = Twilio::REST::Client.new(account_sid, auth_token)

@client.calls.list.each do |call|
  puts call.direction
end
However, doing this just returns an array of all calls; there isn't any paging information, limiting of results, or any "pages".
For my purposes I'm actually filtering the query like this:
@calls = @client.calls.list(
  start_time_after: @time,
  start_time_before: @another_time
)
Because my date filter range is a one-month period and there are currently about 4.5k calls to retrieve, it's taking quite a while to process (and sometimes it just never finishes).
I'm using the Twilio helper library Ruby gem "twilio-ruby" and running Ruby 2.5.
I've also tried using PHP with the respective Twilio helper library and have found the same result.
Using curl, however, does work and gives paging information; it's also incredibly fast compared to the helper libraries.
Twilio developer evangelist here.
list will paginate through, loading all the resources it can.
There are other calls that will stream the API in a lazier fashion, if that is more useful for your use case. For example, you can use each and it will load the calls lazily until they have run out.
@client.calls.each(
  start_time_after: @time,
  start_time_before: @another_time
) do |call|
  puts call.direction
end
If you do want to paginate manually yourself, you can use the page method to get a CallPage object and iterate from there.
page = @client.calls.page(
  start_time_after: @time,
  start_time_before: @another_time
)

while !page.nil? do
  page.each { |call| puts call.direction }
  page = page.next_page
end
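If the bottleneck is the number of HTTP round trips, it may also help to request larger pages. The page_size parameter below is based on the library's standard pagination options and the API's documented cap of 1000 records per page; treat it as a sketch to verify against your twilio-ruby version:

# Fewer, larger pages means fewer round trips for ~4.5k calls.
page = @client.calls.page(
  start_time_after: @time,
  start_time_before: @another_time,
  page_size: 1000
)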
Let me know if that helps at all.
I've found Facebook's Graph API Explorer tool (https://developers.facebook.com/tools/explorer/) to be an incredibly easy, welcoming (for beginners), and effective way to use Facebook's Graph API via its GUI.
I'd like to be able to use the koala gem to pass these generated URLs to facebook's api.
Right now, let's say I have a query like this:
url = "me?fields=id,name,posts.fields(likes.fields(id,name),comments.fields(parent,likes.fields(id,name)),message)"
I'd like to be able to pass that directly into koala as a single string.
@graph.get_connections(url)
It doesn't like that, so I separate out the uid and the ? operator, as the gem seems to want:
url = "fields=id,name,posts.fields(likes.fields(id,name),comments.fields(parent,likes.fields(id,name)),message)"
#graph.get_connections("me", url)
This, however, returns an error as well:
Koala::Facebook::AuthenticationError:
type: OAuthException, code: 2500,
message: Unknown path components: /fields=id,name,posts.fields(likes.fields(id,name),comments.fields(parent,likes.fields(id,name)),message) [HTTP 400]
Currently this is where I am stuck. I'd like to continue using koala because I like the gem approach to working with APIs, especially when it comes to using OAuth & OAuth2.
UPDATE:
I'm starting to break down the request into pieces which the koala gem can handle, for example
posts   = @graph.get_connections("me", "posts")
postids = posts.map { |p| p['id'] }
likes   = postids.inject([]) { |ary, id| ary << @graph.get_connections(id, "likes") }
So that's a long way of getting two arrays: one of posts, one of like data.
But I'd burn through my API request limit in no time using this kind of approach.
I was kind of hoping I'd just be able to pass the whole string from the Graph API Explorer and get what I wanted, rather than having to manually parse all this stuff.
I don't really know about your posts.fields(likes.fields(id,name)) (this does not work in the Graph API Explorer) and stuff like that, but I know you can do this:
fb_api = Koala::Facebook::API.new(access_token)
fb_api.api("/me?fields=id,name,posts")
# => => {"id"=>"71170", "name"=>"My Name", "posts"=>{"paging"=>{"next"=>"https://graph.facebook.com/71170/posts?access_token=CAAEO&limit=25&until=13705022", "previous"=>"https://graph.facebook.com/711737070/posts?access_token=CAAEOTYMZD&limit=25&since=1370723&__previous=1"}, "data"=>[{"id"=>"71170_1013572471", "comments"=>{"count"=>0}, "created_time"=>"2013-06-09T08:03:43+0000", "from"=>{"id"=>"71170", "name"=>"My Name"}, "updated_time"=>"2013-06-09T08:03:43+0000", "privacy"=>{"value"=>""}, "type"=>"status", "story_tags"=>{"0"=>[{"id"=>"71170", "name"=>" ", "length"=>8, "type"=>"user", "offset"=>0}]}, "story"=>" likes a photo."}]}}
And you will receive what you asked for in a hash.
From time to time, you must pass nil as a param to koala:
result += graph_api.batch do |batch_api|
  facebook_page_ids.each do |facebook_page_id|
    batch_api.get_connections(facebook_page_id, nil, { "fields" => "posts" })
  end
end
I use tweetstream gem to get sample tweets from Twitter Streaming API:
TweetStream.configure do |config|
  config.username    = 'my_username'
  config.password    = 'my_password'
  config.auth_method = :basic
end

@client = TweetStream::Client.new

@client.sample do |status|
  puts "#{status.text}"
end
However, this script will stop printing out tweets after about 100 tweets (the script continues to run). What could be the problem?
The Twitter Search API sets certain arbitrary (from the outside) limits for things, from the docs:
GET statuses/:id/retweeted_by Show user objects of up to 100 members who retweeted the status.
From the gem, the code for the method is:
# Returns a random sample of all public statuses. The default access level
# provides a small proportion of the Firehose. The "Gardenhose" access
# level provides a proportion more suitable for data mining and
# research applications that desire a larger proportion to be statistically
# significant sample.
def sample(query_parameters = {}, &block)
  start('statuses/sample', query_parameters, &block)
end
I checked the API docs but don't see an entry for 'statuses/sample'; looking at the one above, I'm assuming you've reached 100 of whatever statuses/xxx is being accessed.
Also, correct me if I'm wrong, but I believe Twitter no longer accepts basic auth and you must use an OAuth key. If that is so, then you're unauthenticated, and the search API will also limit you in other ways; see https://dev.twitter.com/docs/rate-limiting
Hope that helps.
OK, I made a mistake there: I was looking at the search API when I should've been looking at the streaming API (my apologies), but it's possible some of the things I was talking about could be the cause of your problems, so I'll leave it up. Twitter has definitely moved away from basic auth, so I'd try resolving that first. See:
https://dev.twitter.com/docs/auth/oauth/faq
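For reference, a sketch of TweetStream's OAuth configuration; the credential values are placeholders, so check the gem's README for your version:

TweetStream.configure do |config|
  config.consumer_key       = 'my_consumer_key'
  config.consumer_secret    = 'my_consumer_secret'
  config.oauth_token        = 'my_access_token'
  config.oauth_token_secret = 'my_access_token_secret'
  config.auth_method        = :oauth
end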