Scraping Real Time Visitors from Google Analytics - ajax

I have a lot of sites and want to build a dashboard showing the number of real time visitors on each of them on a single page. (would anyone else want this?) Right now the only way to view this information is to open a new tab for each site.
Google doesn't have a real-time API, so I'm wondering if it is possible to scrape this data. Eduardo Cereto found out that Google transfers the real-time data over the realtime/bind network request. Anyone more savvy have an idea of how I should start? Here's what I'm thinking:
Figure out how to authenticate programmatically
Inspect all of the realtime/bind requests to see how they change. Does each request have a unique key? Where does that come from? Below is my breakdown of the request:
https://www.google.com/analytics/realtime/bind?VER=8
&key= [What is this? Where does it come from? 21 character lowercase alphanumeric, stays the same each request]
&ds= [What is this? Where does it come from? 21 character lowercase alphanumeric, stays the same each request]
&pageId=rt-standard%2Frt-overview
&q=t%3A0%7C%3A1%3A0%3A%2Ct%3A11%7C%3A1%3A5%3A%2Cot%3A0%3A0%3A4%2Cot%3A0%3A0%3A3%2Ct%3A7%7C%3A1%3A10%3A6%3D%3DREFERRAL%3B%2Ct%3A10%7C%3A1%3A10%3A%2Ct%3A18%7C%3A1%3A10%3A%2Ct%3A4%7C5%7C2%7C%3A1%3A10%3A2!%3Dzz%3B%2C&f
The q variable URI decodes to this (what the?):
t:0|:1:0:,t:11|:1:5:,ot:0:0:4,ot:0:0:3,t:7|:1:10:6==REFERRAL;,t:10|:1:10:,t:18|:1:10:,t:4|5|2|:1:10:2!=zz;,&f
&RID=rpc
&SID= [What is this? Where does it come from? 16 character uppercase alphanumeric, stays the same each request]
&CI=0
&AID= [What is this? Where does it come from? integer, starts at 1, increments weirdly to 150 and then 298]
&TYPE=xmlhttp
&zx= [What is this? Where does it come from? 12 character lowercase alphanumeric, changes each request]
&t=1
Inspect all of the realtime/bind responses to see how they change. How does the data come in? It looks like some altered JSON. How many times do I need to connect to get the data? Where is the active visitors on site number in there? Here is a dump of sample data:
19
[[151,["noop"]
]
]
388
[[152,["rt",[{"ot:0:0:4":{"timeUnit":"MINUTES","overTimeData":[{"values":[49,53,52,40,42,55,49,41,51,52,47,42,62,82,76,71,81,66,81,86,71,66,65,65,55,51,53,73,71,81],"name":"Total"}]},"ot:0:0:3":{"timeUnit":"SECONDS","overTimeData":[{"values":[0,1,1,1,1,0,1,0,1,1,1,0,2,0,2,2,1,0,0,0,0,0,2,1,1,2,1,2,0,5,1,0,2,1,1,1,2,0,2,1,0,5,1,1,2,0,0,0,0,0,0,0,0,0,1,1,0,3,2,0],"name":"Total"}]}}]]]
]
388
[[153,["rt",[{"ot:0:0:4":{"timeUnit":"MINUTES","overTimeData":[{"values":[52,53,52,40,42,55,49,41,51,52,47,42,62,82,76,71,81,66,81,86,71,66,65,65,55,51,53,73,71,81],"name":"Total"}]},"ot:0:0:3":{"timeUnit":"SECONDS","overTimeData":[{"values":[2,1,1,1,1,1,0,1,0,1,1,1,0,2,0,2,2,1,0,0,0,0,0,2,1,1,2,1,2,0,5,1,0,2,1,1,1,2,0,2,1,0,5,1,1,2,0,0,0,0,0,0,0,0,0,1,1,0,3,2],"name":"Total"}]}}]]]
]
388
[[154,["rt",[{"ot:0:0:4":{"timeUnit":"MINUTES","overTimeData":[{"values":[53,53,52,40,42,55,49,41,51,52,47,42,62,82,76,71,81,66,81,86,71,66,65,65,55,51,53,73,71,81],"name":"Total"}]},"ot:0:0:3":{"timeUnit":"SECONDS","overTimeData":[{"values":[0,3,1,1,1,1,1,0,1,0,1,1,1,0,2,0,2,2,1,0,0,0,0,0,2,1,1,2,1,2,0,5,1,0,2,1,1,1,2,0,2,1,0,5,1,1,2,0,0,0,0,0,0,0,0,0,1,1,0,3],"name":"Total"}]}}]]]
]
Let me know if you can help with any of the items above!
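One observation about the dump above: it appears to be length-prefixed. Each chunk starts with a number on its own line (19, 388), followed by that many characters of ordinary JSON. A minimal Python sketch of a decoder under that assumption is shown here; `frame` and `parse_chunks` are hypothetical helper names, the sample payload is shortened from the dump, and whether the prefix counts characters or bytes is a guess.

```python
import json


def frame(payload: str) -> str:
    """Prefix a payload with its length, mimicking the layout of the dump."""
    return f"{len(payload)}\n{payload}"


def parse_chunks(stream: str) -> list:
    """Decode a stream of '<length>\\n<json>' chunks into [id, message] pairs."""
    pos, messages = 0, []
    while pos < len(stream):
        newline = stream.index("\n", pos)
        length = int(stream[pos:newline])  # assumption: length counts characters
        start = newline + 1
        # Each chunk is a JSON array of [id, message] pairs.
        messages.extend(json.loads(stream[start:start + length]))
        pos = start + length
    return messages


# Shortened sample built from the dump in the question.
sample = frame('[[151,["noop"]\n]\n]\n') + frame(
    '[[152,["rt",[{"ot:0:0:4":{"timeUnit":"MINUTES",'
    '"overTimeData":[{"values":[49,53,52],"name":"Total"}]}}]]]\n]\n'
)
messages = parse_chunks(sample)
for msg_id, body in messages:
    print(msg_id, body[0])
```

The `noop` messages look like keep-alives, and the `rt` messages carry the series keyed by the same `ot:...` strings that appear in the `q` parameter.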

Google has since launched a Real Time Reporting API that covers this. With it you can retrieve the number of real-time visitors, along with other Google Analytics data, using the following dimensions and metrics: https://developers.google.com/analytics/devguides/reporting/realtime/dimsmets/
It is quite similar to the rest of the Google Analytics API. To start developing with it, see:
https://developers.google.com/analytics/devguides/reporting/realtime/v3/devguide

With Google Chrome I can see the data on the Network Panel.
The request endpoint is https://www.google.com/analytics/realtime/bind
Seems like the connection stays open for 2.5 minutes, and during this time it just keeps getting more and more data.
After about 2.5 minutes the connection is closed and a new one is opened.
On the Network panel you can only see the data for the connections that are terminated. So leave it open for 5 minutes or so and you can start to see the data.
I hope that can give you a place to start.

Having Google in the loop seems pretty redundant. I suggest you serve a common element on demand from the dashboard server and include it by absolute URL on every page to be monitored for a given site. The script that outputs the element can read the IP of the requesting browser; these IPs can all be logged into a database and filtered for uniqueness, giving a real-time head count.
<?php
$user_ip = $_SERVER["REMOTE_ADDR"];
/// Some MySQL to insert $user_ip to the database table for website XXX goes here
$file = 'tracking_image.gif';
$type = 'image/gif';
header('Content-Type:'.$type);
header('Content-Length: ' . filesize($file));
readfile($file);
?>
Addendum:
A database can also add a timestamp to every row of data it stores. This can be used to further filter results and provide the number of visitors in the last hour or minute.
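A rough sketch of the logging and counting side, in Python rather than PHP: it uses an in-memory SQLite table as a stand-in for the MySQL table mentioned above, and the table layout, function names and sample IPs are all illustrative assumptions.

```python
import sqlite3
import time


def open_db() -> sqlite3.Connection:
    """Create an in-memory table of hits (a stand-in for the MySQL table)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE hits (site TEXT, ip TEXT, ts REAL)")
    return db


def log_hit(db, site, ip, ts=None):
    """Record one request for the tracking element."""
    db.execute(
        "INSERT INTO hits VALUES (?, ?, ?)",
        (site, ip, time.time() if ts is None else ts),
    )


def active_visitors(db, site, window_seconds=60, now=None):
    """Count distinct IPs seen for a site within the last window_seconds."""
    now = time.time() if now is None else now
    row = db.execute(
        "SELECT COUNT(DISTINCT ip) FROM hits WHERE site = ? AND ts >= ?",
        (site, now - window_seconds),
    ).fetchone()
    return row[0]


db = open_db()
log_hit(db, "example.com", "203.0.113.5", ts=1000.0)
log_hit(db, "example.com", "203.0.113.5", ts=1030.0)  # same visitor again
log_hit(db, "example.com", "203.0.113.9", ts=1050.0)
log_hit(db, "example.com", "203.0.113.7", ts=900.0)   # outside the window
print(active_visitors(db, "example.com", window_seconds=60, now=1055.0))
```

Counting distinct IPs undercounts visitors who share an address, so this is only an approximation of a head count.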
Client side Javascript with AJAX for fine tuning or overkill
The onblur and onfocus JavaScript events can be used to tell whether the page is visible, and the result can be passed back to the dashboard server via Ajax. http://www.thefutureoftheweb.com/demo/2007-05-16-detect-browser-window-focus/
When a visitor closes a page this can also be detected by the javascript onunload function in the body tag and Ajax can be used to send data back to the server one last time before the browser finally closes the page.
As you may also wish to collect some information about the visitor, like Google Analytics does, this page has a lot of JavaScript that can be examined and adapted: https://panopticlick.eff.org/

I needed/wanted realtime data for personal use so I reverse-engineered their system a little bit.
Instead of binding to /bind I get data from /getData (no pun intended).
At /getData the minimum request is apparently: https://www.google.com/analytics/realtime/realtime/getData?pageId&key={{propertyID}}&q=t:0|:1
Here's a short explanation of the possible query parameters and syntax, please remember that these are all guesses and I don't know all of them:
Query Syntax: pageId&key=propertyID&q=dataType:dimensions|:page|:limit:filters
Values:
pageID: Required but seems to only be used for internal analytics.
propertyID: a{{accountID}}w{{webPropertyID}}p{{profileID}}, as specified at the Documentation link below. You can also find this in the URL of all analytics pages in the UI.
dataType:
t: Current data
ot: Overtime/Past
c: Unknown, returns only a "count" value
dimensions (| separated or alone), most values are only applicable for t:
1: Country
2: City
3: Location code?
4: Latitude
5: Longitude
6: Traffic source type (Social, Referral, etc.)
7: Source
8: ?? Returns (not set)
9: Another location code? longer.
10: Page URL
11: Visitor Type (new/returning)
12: ?? Returns (not set)
13: ?? Returns (not set)
14: Medium
15: ?? Returns "1"
page:
At first this seems to work for pagination but after further analysis it looks like it's also used to specify which of the 6 pages (Overview, Locations, Traffic Sources, Content, Events and Conversions) to return data for.
For some reason 0 returns an impossibly high metrictotal
limit: Result limit per page, maximum of 50
filters:
Syntax is as specified at the Documentation 2 link below, except that OR is specified using | instead of a comma. Example: 6==CUSTOM;1==United%20States
You can also combine multiple queries in one request by comma separating them (i.e. q=t:1|2|:1|:10,t:6|:1|:10).
Following the above "documentation", if you wanted to build a query that requests the page URL and city of the top 10 active visitors with a traffic source type of CUSTOM located in the US you would use this URL: https://www.google.com/analytics/realtime/realtime/getData?key={{propertyID}}&pageId&q=t:10|2|:1|:10:6==CUSTOM;1==United%20States
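The guessed syntax above can be captured in a small helper. This is only a sketch in Python of the string format described in this answer; `build_query` and `get_data_url` are made-up names, `a1w2p3` is a placeholder property ID, and the endpoint is undocumented, so it may change or stop working at any time.

```python
def build_query(data_type, dimensions, page, limit=None, filters=""):
    """Build one q entry: dataType:dimensions|:page|:limit:filters."""
    dims = "|".join(str(d) for d in dimensions)
    query = f"{data_type}:{dims}|:{page}"
    if limit is not None:
        query += f"|:{limit}"
    if filters:
        query += f":{filters}"
    return query


def get_data_url(property_id, *queries):
    """Join several queries with commas into a single getData URL."""
    base = "https://www.google.com/analytics/realtime/realtime/getData"
    return f"{base}?key={property_id}&pageId&q={','.join(queries)}"


# Page URL (10) and city (2) of the top 10 visitors from CUSTOM sources in the US.
top_pages = build_query("t", [10, 2], 1, 10, "6==CUSTOM;1==United%20States")
print(get_data_url("a1w2p3", top_pages))
```

Passing several `build_query` results to `get_data_url` reproduces the comma-separated form, such as `q=t:1|2|:1|:10,t:6|:1|:10`.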
Documentation
Documentation 2
I hope that my answer is readable and (although it's a little late) sufficiently answers your question and helps others in the future.

Related

Kaltura Notifications are occasionally deactivated?

We are using Kaltura to notify our CMS about changes in the videos. In the KMC under Settings->Integrations Settings we have checked all the checkboxes under "Sent by Server".
Sometimes these checkmarks disappear. It happens maybe once a week or once a month. How can we find out why these boxes are being deactivated?
Those notifications are being stored on the partner object in partner table. The actual data is stored in the custom_data field, which holds large amount of PHP-serialized data.
I suspect that in some cases an update to other fields in the custom_data object erases the notifications section.
Your best shot would be to first check the value of that field once the config has been erased. If it was actually erased in the database, try to find log messages like the following in api_v3.log (they can lead you to the actual API request that modified the field):
[2124167851][propel] */ UPDATE partner SET
`UPDATED_AT`='2017-10-04 14:11:36',
`NOTIFY`='1',
`CUSTOM_DATA`='a:79:{s:9:"firstName";s:5:"Roman";s:12:"isFirstLogin";b:0;
... tons of PHP serialized data ...
i:1;s:19:"notificationsConfig";s:42:"*=0;1=1;2=1;3=1;4=0;21=0;6=0;7=0;26=0;5=0;";
... tons of PHP serialized data ...
}' WHERE partner.ID='101' AND MD5(cast(partner.CUSTOM_DATA as char character set latin1)) = '7eb7781cc04c7f98077efc2e3c1e9426'
The key that stores the notifications config is notificationsConfig (each number represents a notification type, followed by 0 / 1 for off / on).
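To compare the value before and after a change, it can help to decode the string into a mapping. A small Python sketch, assuming the `type=flag;` layout seen in the log line above; the function name is hypothetical and the meaning of the `*` entry is a guess (possibly a default).

```python
def parse_notifications_config(raw: str) -> dict:
    """Turn '*=0;1=1;2=1;' into {'*': False, '1': True, '2': True}."""
    config = {}
    for pair in raw.split(";"):
        if not pair:
            continue  # skip the empty piece after the trailing ';'
        key, _, flag = pair.partition("=")
        config[key] = flag == "1"
    return config


logged = "*=0;1=1;2=1;3=1;4=0;21=0;6=0;7=0;26=0;5=0;"
config = parse_notifications_config(logged)
enabled = sorted(key for key, on in config.items() if on)
print(enabled)
```

Running this on the values from two log entries shows which notification types were switched off between them.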
As a side note, which CE version are you using? There might be a more reliable way to integrate with your CMS.

Unable to get more than 100 results from the Google Custom Search API

I need to use the Google Custom Search API https://developers.google.com/custom-search/v1/overview. That page says:
For CSE users, the API provides 100 search queries per day for free.
If you need more, you may sign up for billing in the Developers
Console. Additional requests cost $5 per 1000 queries, up to 10k
queries per day.
I have already signed up for billing in the Developers Console. However, I still cannot retrieve more than 100 results. What else do I need to do? https://www.googleapis.com/customsearch/v1?cx=CSE_INSTANCE&key=API_KEY&q=QUERY&start=100
{ error: { errors: [ { domain: "global", reason: "invalid", message:
"Invalid Value" } ], code: 400, message: "Invalid Value" } }
Query: Definition
https://support.google.com/customsearch/answer/1361951
Any actual user query from a Google Site Search engine, including but
not limited to search engines installed on your website using XML,
iFrame, or the Custom Search Element.
That means you would probably need to send eleven queries to get more than 100 results.
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=1
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=11
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=21
GET ...
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=81
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=91
GET https://www.googleapis.com/customsearch/v1?&q=QUERY&...&start=101
Check every response and if error code is 400, you can stop - there is probably no need to send next (&start=previous+10) request.
Now you can merge responses and start building results page.
Google Custom Search and Google Site Search return up to 10 results
per query. If you want to display more than 10 results to the user,
you can issue multiple requests (using the start=0, start=11 ...
parameters) and display the results on a single page. In this case,
Google will consider each request as a separate query, and if you are
using Google Site Search, each query will count towards your limit.
There might be a better way to do this than the one I described above. (But I'm not sure about batching API calls.)
And (finally) a possible answer to your question: I ran more than a few tests, but I haven't had any luck with start greater than 100 (I was getting the same as you: <Response [400]>). I'm using a "Browser key" from my billing-enabled project. That could mean we can't get the 101st, 102nd, 103rd, etc. results with the CSE API.
The API documentation says it never returns more than 100 items.
https://developers.google.com/custom-search/v1/reference/rest/v1/cse/list
start
integer (uint32 format)
The index of the first result to return. The default number of results
per page is 10, so &start=11 would start at the top of the second page
of results. Note: The JSON API will never return more than 100
results, even if more than 100 documents match the query, so setting
the sum of start + num to a number greater than 100 will produce an
error. Also note that the maximum value for num is 10.
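
Putting the answers in this thread together, paging can be sketched as a loop over start = 1, 11, ..., 91 that stops early on an error or a short page. The Python below is illustrative only: `fetch_page` is a hypothetical callable that you would implement with a real HTTP request to the customsearch/v1 endpoint, returning a list of items or `None` on an error such as the 400 above, and `fake_fetch` merely simulates an engine with 37 matches.

```python
def collect_results(fetch_page, page_size=10, max_results=100):
    """Merge pages of results, stopping at the 100-result ceiling."""
    results = []
    for start in range(1, max_results + 1, page_size):  # 1, 11, ..., 91
        items = fetch_page(start)
        if items is None:           # e.g. HTTP 400 "Invalid Value"
            break
        results.extend(items)
        if len(items) < page_size:  # short page: nothing more to fetch
            break
    return results


def fake_fetch(start):
    """Stand-in for the real request; pretends 37 documents match."""
    total = 37
    if start > total:
        return None
    return [f"result {n}" for n in range(start, min(start + 10, total + 1))]


results = collect_results(fake_fetch)
print(len(results), results[0], results[-1])
```

Each call to `fetch_page` counts as a separate query against the quota, as the quoted documentation notes.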

Can I reduce my amount of requests in Google Maps JavaScript API v3?

I look up 2 locations. From an XML file I get the longitude and latitude of a location: first the closest cafe, then the closest school.
$.get('https://maps.googleapis.com/maps/api/place/nearbysearch/xml?location='+home_latitude+','+home_longtitude+'&rankby=distance&types=cafe&sensor=false&key=X',function(xml)
{
verander($(xml).find("result:first").find("geometry:first").find("location:first").find("lat").text(),$(xml).find("result:first").find("geometry:first").find("location:first").find("lng").text());
}
);
$.get('https://maps.googleapis.com/maps/api/place/nearbysearch/xml?location='+home_latitude+','+home_longtitude+'&rankby=distance&types=school&sensor=false&key=X',function(xml)
{
verander($(xml).find("result:first").find("geometry:first").find("location:first").find("lat").text(),$(xml).find("result:first").find("geometry:first").find("location:first").find("lng").text());
}
);
But as you can see, I call the function verander(latitude, longitude) twice.
function verander(google_lat, google_lng)
{
var bryantPark = new google.maps.LatLng(google_lat, google_lng);
var panoramaOptions =
{
position:bryantPark,
pov:
{
heading: 185,
pitch:0,
zoom:1,
},
panControl : false,
streetViewControl : false,
mapTypeControl: false,
overviewMapControl: false ,
linksControl: false,
addressControl:false,
zoomControl : false,
}
map = new google.maps.StreetViewPanorama(document.getElementById("map_canvas"), panoramaOptions);
map.setVisible(true);
}
Would it be possible to push these 2 locations in only one request (perhaps via an array)? I know it sounds silly, but I really want to know if there isn't a backdoor to reduce these Google Maps requests.
FTR: This is what a request is for Google:
What constitutes a 'map load' in the context of the usage limits that apply to the Maps API? A single map load occurs when:
a. a map is displayed using the Maps JavaScript API (V2 or V3) when loaded by a web page or application;
b. a Street View panorama is displayed using the Maps JavaScript API (V2 or V3) by a web page or application that has not also displayed a map;
c. a SWF that loads the Maps API for Flash is loaded by a web page or application;
d. a single request is made for a map image from the Static Maps API.
e. a single request is made for a panorama image from the Street View Image API.
So I'm afraid it isn't possible, but hey, suggestions are always welcome!
You're calling the Places API twice and loading Street View twice. That's four calls, but I think the two Street Views only count as one load if you're loading them on the same page. Your Places calls are also made client side, so they won't count towards your limits.
But to answer your question: there's no loophole to get around the double load, since you want to show the user two Street Views.
What I would do is not load anything until the client asks. Instead, have a couple of call-to-action buttons like <button onclick="loadStreetView('cafe')">Click here to see Nearby Cafe</button>; when clicked, they call the nearby search and load the Street View. Since that only happens on a user's request, plain page loads never increment the usage counts, for example when your site gets crawled by search engines.
More on those usage limits
The Google Places API has different usage limits than Maps. https://developers.google.com/places/policies#usage_limits
Users with an API key are allowed 1 000 requests per 24 hour period
Users who have verified their identity through the APIs console are allowed 100 000 requests per 24 hour period. A credit card is required for verification, by enabling billing in the console. We ask for your credit card purely to validate your identity. Your card will not be charged for use of the Places API.
100,000 requests a day if you verify yourself. That's pretty decent.
As for Google Maps, https://developers.google.com/maps/faq#usagelimits
You get 25,000 map loads per day, and it says:
In order to accommodate sites that experience short term spikes in usage, the usage limits will only take effect for a given site once that site has exceeded the limits for more than 90 consecutive days.
So if you go over a bit now and then, it seems like they won't mind.
p.s. you have an extra comma after zoom:1 and zoomControl : false and they shouldn't be there. Will cause errors in some browsers like IE. You also are missing a semicolon after var panoramaOptions = { ... } and before map = new

reading EMV card using PPSE and not PSE

I'm trying to read the data off a contactless Visa Paywave card.
For the Paywave, I have to submit a SELECT using PPSE (2PAY.SYS.DDF01) instead of PSE (1PAY.SYS.DDF01).
The EMV book 1, section 11.3.4, table 43 only describes how to interpret the response for a successful SELECT command using PSE. Does anyone know or can refer me to a source that shows how to process the data returned from a successful SELECT command using PPSE?
Here's my request APDU:
00A404000e325041592e5359532e444446303100
Here's the response:
6F2F840E325041592E5359532E4444463031A51DBF0C1A61184F07A0000000031010500A564953412044454249548701019000
I understand tag 84, tag A5, and tag BF0C from the response. According to the examples for reading PSE, I should be able to just send GET PROCESSING OPTIONS (to get the AIP and AFL) with PDOL = null after this successful response, as follows: 80A80000830000.
But request 80A80000830000 returns error code 6985 - Command not allowed; conditions of use not satisfied.
I also tried reading all the files after successfully selecting the PPSE by traversing through every single SFI (0-30) and every single record (0-16) of each SFI. Yes, I also did the 3 bit shift and bitwise-OR the SFI with 0x4. But I got no data.
I'm stuck, any help that would point me into getting some info from my Paywave card would be appreciated!
Have you tried this tool from EMVLAB? http://www.emvlab.org/emvtags/
Using that tool,
http://www.emvlab.org/tlvutils/?data=6F2F840E325041592E5359532E4444463031A51DBF0C1A61184F07A0000000031010500A564953412044454249548701019000
2PAY.SYS.DDF01 is for contactless (e.g. NFC ) cards, while 1PAY.SYS.DDF01 is for contact cards.
After successfully (SW1 SW2 = 90 00) reading a PSE, you should only search for the SFI (tag 88) which is a mandatory field in the FCI template returned.
With the SFI as your start index, you would have to read the records starting from the start index until you get a 6A83 (RECORD_NOT_FOUND). E.g. if your SFI is 1, you would do a readRecord with record_number=1. That would probably be successful. Then you increment record_number to 2 and do readRecord again, then increment to 3, and so on. Repeat until you get 6A83 as your status.
The records read would be ADFs (at least 1). Then you would have to compare the read ADF Names with what your terminal supports, also taking the ASI (Application Selection Indicator) into account. At the end you would have a list of possible ADFs (the candidate list).
All the above steps (1-3) are documented in chapter 12.3.2 Book1 v4.3 of the EMV spec.
You would have to make a final selection (Chapter 12.4 Book1)
Read the spec book 1 chapter 12.3 - 12.4 for all the detailed steps.
You seem to have the flow mixed up a bit, you want to:
Send 1PAY or 2PAY, it doesn't actually matter for all of the cards I've tested. This will return a list of the AIDs available on the card. Alternately you can just select an AID straight away if you know it's there but good practice would be to check first.
Get the list of AIDs returned in response to 1PAY/2PAY, in PayWave's case this will probably be A0000000031010 if you sent 2PAY but you may get more if you send 1PAY.
Select one of the AIDs sent back (or one you already know is on there).
Then loop through the SFIs and records sending the Read Records command to get the data.
You don't have to send Get Processing Options before sending the Read Records command, even though that's how a normal transaction flow goes.
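The record loop in the last steps can be sketched as follows, in Python. It builds READ RECORD APDUs with P2 = (SFI << 3) | 4, as the question describes, and stops at 6A83. `transmit` is a hypothetical callable standing in for whatever reader library you use; it is assumed to return a `(data, status_word)` pair, and `fake_card` only simulates a card with a single record.

```python
def read_record_apdu(sfi: int, record: int) -> bytes:
    """Build READ RECORD: CLA=00 INS=B2 P1=record P2=(SFI<<3)|4 Le=00."""
    if not 1 <= sfi <= 30:
        raise ValueError("SFI must be in 1..30")
    p2 = (sfi << 3) | 0x04
    return bytes([0x00, 0xB2, record, p2, 0x00])


def read_all_records(transmit, sfi, max_records=16):
    """Read records 1..n of one SFI until the card answers 6A83."""
    records = []
    for number in range(1, max_records + 1):
        data, status = transmit(read_record_apdu(sfi, number))
        if status == 0x6A83:   # record not found: end of this file
            break
        if status == 0x9000:   # success
            records.append(data)
    return records


def fake_card(apdu: bytes):
    """Simulated card: only record 1 of SFI 1 exists."""
    if apdu == bytes.fromhex("00B2010C00"):
        return b"record-1", 0x9000
    return b"", 0x6A83


print(read_record_apdu(1, 1).hex().upper())
print(read_all_records(fake_card, sfi=1))
```

Running the same loop for every SFI from 1 to 30 after selecting the AID covers the "loop through the SFIs and records" step.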
I think the information you're looking for is available from this VISA website. But only if you're a registered and/or licensed partner of VISA.
EDIT: Looking at the resulting TLV struct under BF0C:
tag=0xBF0C, length=0x1A
tag=0x61, length=0x18
tag=0x4F, length=0x07, value=0xA0000000031010 // looks like an AID to me
tag=0x50, length=0x0A, value="VISA DEBIT"
tag=0x87, length=0x01, value=0x01
I would guess that you need to first select A0000000031010 before getting the processing options.
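For reference, here is a small Python sketch that decodes the PPSE response from the question into the structure above and builds the follow-up SELECT for the AID it finds. It is a simplified BER-TLV reader written for this example (`parse_tlv` and `find_tag` are made-up names), not a complete EMV parser.

```python
def parse_tlv(data: bytes) -> list:
    """Parse BER-TLV bytes into (tag, value) pairs; constructed tags nest."""
    items, i = [], 0
    while i < len(data):
        first = data[i]
        tag = first
        i += 1
        if first & 0x1F == 0x1F:          # tag continues in following bytes
            while True:
                byte = data[i]
                i += 1
                tag = (tag << 8) | byte
                if not byte & 0x80:
                    break
        length = data[i]
        i += 1
        if length & 0x80:                 # long-form length
            count = length & 0x7F
            length = int.from_bytes(data[i:i + count], "big")
            i += count
        value = data[i:i + length]
        i += length
        constructed = bool(first & 0x20)
        items.append((tag, parse_tlv(value) if constructed else value))
    return items


def find_tag(items, wanted):
    """Depth-first search for the first occurrence of a tag."""
    for tag, value in items:
        if tag == wanted:
            return value
        if isinstance(value, list):
            found = find_tag(value, wanted)
            if found is not None:
                return found
    return None


response = bytes.fromhex(
    "6F2F840E325041592E5359532E4444463031A51DBF0C1A61184F07"
    "A0000000031010500A564953412044454249548701019000"
)
tree = parse_tlv(response[:-2])           # drop the trailing SW1 SW2 (9000)
aid = find_tag(tree, 0x4F)
select_aid = bytes([0x00, 0xA4, 0x04, 0x00, len(aid)]) + aid + b"\x00"
print(aid.hex().upper(), find_tag(tree, 0x50))
print(select_aid.hex().upper())
```

The resulting APDU has the same shape as the PPSE SELECT in the question, with the 7-byte AID in place of the 2PAY.SYS.DDF01 name.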
I was selecting application 2PAY.SYS.DDF01 when I should have been selecting AID = 0xA0000000031010. It looks like there are no records under application 2PAY.SYS.DDF01.
But there was 1 record under application 0xA0000000031010. After I got this application, I performed a READ RECORD, and the first record gave me the PAN and all the credit card info I wanted.
Thanks everyone for chiming in.

Google Places API does not return gym info

I could not get Google Places API to return gym info. Below is an example request api.
https://maps.googleapis.com/maps/api/place/search/json?location=33.347075,-111.96318&radius=100&types=gym&sensor=false&key=AIzaSyBg8HI6sH1Rxyhn1Mno_hhgDawuF1KAfq0
I know the lat/lon in the link is valid.
If I remove the "types=gym" (see the link below), it returns some places info, but none of type gym.
https://maps.googleapis.com/maps/api/place/search/json?location=33.347075,-111.96318&radius=100&sensor=false&key=AIzaSyBg8HI6sH1Rxyhn1Mno_hhgDawuF1KAfq0
Is there a limitation on the api?
Also, can the API return a URI that takes me directly to the location?
You just need to increase your search radius a bit - you're looking for results within a 100m circle. Try this:
https://maps.googleapis.com/maps/api/place/search/json?location=33.347075,-111.96318&radius=200&types=gym&sensor=false&key=AIzaSyBg8HI6sH1Rxyhn1Mno_hhgDawuF1KAfq0
Increasing the radius to just 200m returns a result; at 1000m you get four results.
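If you want to try several radii, the request URL can be generated instead of edited by hand. A small Python sketch, using the same endpoint and parameters as the URLs in this thread; the function name is made up and `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import urlencode


def place_search_url(lat, lng, radius, place_type, key):
    """Build a Places search URL like the ones shown in this thread."""
    params = {
        "location": f"{lat},{lng}",
        "radius": radius,          # metres
        "types": place_type,
        "sensor": "false",
        "key": key,
    }
    base = "https://maps.googleapis.com/maps/api/place/search/json"
    return f"{base}?{urlencode(params, safe=',')}"


for radius in (100, 200, 1000):
    print(place_search_url(33.347075, -111.96318, radius, "gym", "YOUR_API_KEY"))
```

Fetching each URL in turn and stopping at the first non-empty result list gives the smallest radius that finds a gym.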
You can then pass the reference value to a Places Details search to get the url value, as follows:
https://maps.googleapis.com/maps/api/place/details/json?reference=CnRnAAAA99xxsFT0V-FNigzMi7GEnmkqWRYCOZG-lrQH0fpw9iI_JUp5WHrYOCcTGpeyzVdHrtk3rE2zrHleBxRw4i67K0sT_fhsSQufaAHN80Oi4OvxR-amG_W4plz5Mr8a-512584oHpfUpV87jMqyF2R8cRIQpqTgOCgZtZF0hYR4R_ZVRRoUEIS-oN1fcyVQcN5nj7DxaNK-e8o&sensor=false&key=AIzaSyBg8HI6sH1Rxyhn1Mno_hhgDawuF1KAfq0
The url links to the Google Maps Place page: http://maps.google.com/maps/place?cid=2681829493569576902
See the docs here: http://code.google.com/apis/maps/documentation/places/#PlaceDetailsRequests
Also, there is a limit on the number of requests for this particular API.
2500 IIRC, but you can read it in the docs.
Are there 20 coming back with the "types=gym" call? If so, then you are hitting the limitation of the API and it just so happens that the 20 returned are not gyms:
The Places API returns up to 20 establishment results. Additionally,
political results may be returned which serve to identify the area of
the request
