How trustworthy are query result counts? - google-cloud-logging

I just ran a query in Logs Explorer for a 24-hour period and it returned 2409575 results.
I then ran the exact same query, changing only the start time and end time to define a time window that is a subset of the previous one, and it returned 2656840 results, which is more than before.
How can this be? My only conclusion from this is that the stated number of log results cannot be trusted. Can someone please tell me what the expectations are for the log results tally? Can it be trusted?

Thank you for reporting this. The counts shown in the Log Fields panel and the Histogram are reliable. Only the total query results count appears to have an issue, due to the incremental and approximate way the count is updated as logs are read. The final value should have been accurate. An internal issue has been created and the team will be fixing this soon.
Disclaimer: I work in Cloud Logging.

Related

Generate number of search requests over a given year

Does anyone know if there is a way to generate a report that details how many search requests the GSA has handled over a given timeframe?
On the admin console: Reports > Search Logs.
Select the appropriate collection or All Collections.
Select the desired date range for the Report Timeframe.
From memory, this only has access to a maximum of 90 days of historical data, so if you haven't been regularly exporting this data then you'll need to extrapolate the values from what is available.
As pointed out by #BigMikeW, the logs only retain data for 90 days. Unless you download them every 90 days, you won't get it.
Another way is to integrate with Google Analytics and pass all search data into GA's search behavior reports. That way you can use GA to play around and export for a year or even more. This is what I use.

Google Webmaster data quality issues

I am running into a weird error.
We have a standard implementation of getting data from Search Console and storing it in a database. We cross-checked the data during the implementation and it was good.
Lately we have seen huge differences between what is reported in Search Console and the data retrieved from the API. In some cases the API data is only 10% lower than the Search Console data, but in some cases it shows 50% less than what is being reported in Search Console.
Is anyone aware of these issues, and has anyone run into this recently?
I have had this problem for about a month now and finally fixed this issue.
This was my original request:
service, flags = sample_tools.init(
    argv, 'webmasters', 'v3', __doc__, __file__,
    scope='https://www.googleapis.com/auth/webmasters.readonly')
I fixed it by removing the ".readonly" at the end of the scope; that was causing me to get sampled data.
My scope now looks like this and returns full results:
service, flags = sample_tools.init(
    argv, 'webmasters', 'v3', __doc__, __file__,
    scope='https://www.googleapis.com/auth/webmasters')
I'm having the same issue of reconciling to the console. How are you storing the data, i.e. your database table structure?
Have you read about the differences in the aggregation between page and property? These can cause discrepancies.
https://support.google.com/webmasters/answer/6155685?hl=en#urlorsite
For example, a search query that returns multiple pages from your property counts as 1 impression when aggregated by property. When you group by page, this shows up as however many of your pages appeared in the search results, e.g. 3 or 4. Therefore, aggregated by query and by date your impressions will be lower than if you aggregate by page.
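The aggregation difference described above can be sketched with a toy count. The sample searches below are hypothetical; the real numbers come from Search Console's own aggregation, and this only illustrates why the two totals diverge.

```python
# Each entry is one search event: the query and the pages from your
# property that appeared in its results (hypothetical sample data).
searches = [
    {"query": "buy widgets", "pages": ["/widgets", "/widgets/blue", "/widgets/red"]},
    {"query": "widget price", "pages": ["/pricing"]},
]

# Aggregated by property: one impression per search, no matter how many
# of your pages appeared for it.
by_property = sum(1 for s in searches if s["pages"])

# Aggregated by page: one impression per page shown per search.
by_page = sum(len(s["pages"]) for s in searches)

print(by_property, by_page)  # 2 vs 4: by-page impressions are higher
```

With the same underlying events, the property-level total is 2 while the page-level total is 4, which is the kind of discrepancy the linked help article describes.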

API User Usage Report: Inconsistent Reporting

I'm using a JVM to perform API calls to the Google Apps Administrator API.
I've noticed that with the User Usage Reports I'm not getting complete data for a field I'm interested in (num_docs_externally_visible) and the fields that form that field's calculation. I generally request a single day's usage report at a time, across my entire user base (~40k users).
According to the developer documentation, I should be able to see that field in a report 2-6 days later; however, after running reports for the first 3 weeks of February, I've only gotten it for 60% of the days. The pattern appears to be random (I have streaks of up to 4 days in a row where the item appears and 3 days in a row where it doesn't, but there is no consistency to it).
Has anyone else experienced this issue? And if so, were you able to resolve it? Or should I expect this behavior to continue if this is an issue with what the API is returning outside of my control?
I think it's only natural that the data you get is not yet complete; it takes a certain number of days to receive the complete data.
This SO question is not exactly the same as yours, but I think it will help you, especially the part about needing to use your account's time zone.
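One way to quantify the gaps described in the question is to scan each day's report for the field and compute coverage, so the missing days can be re-requested once the data has settled. This is a minimal sketch; the report dicts are hypothetical stand-ins for what the Reports API would return.

```python
# Hypothetical per-day usage report payloads; in practice these would be
# fetched from the Admin SDK Reports API, one request per day.
reports = {
    "2017-02-01": {"num_docs_externally_visible": 12},
    "2017-02-02": {},  # field missing for this day
    "2017-02-03": {"num_docs_externally_visible": 15},
    "2017-02-04": {},
    "2017-02-05": {"num_docs_externally_visible": 14},
}

# Days where the field never arrived: candidates for a later re-request.
missing = sorted(day for day, report in reports.items()
                 if "num_docs_externally_visible" not in report)

coverage = 1 - len(missing) / len(reports)

print(missing)   # ['2017-02-02', '2017-02-04']
print(coverage)  # 0.6, matching the ~60% coverage seen in the question
```

Re-running the same check a few days later shows whether the gaps are just reporting lag (they fill in) or genuinely dropped data (they persist).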

Google Analytics: incorrect number of sessions when grouping by Custom Dimension

For a while I have successfully queried the number of sessions for my website, including the number of sessions per 'Lang Code' and per 'Distribution Channel'; both Custom Dimensions I have created in Analytics with their own slot and their Scope Type set to 'Session'.
Recently the number of sessions has decreased significantly when I group by a Custom Dimension, e.g. Lang Code.
The following query gives me a number of say 900:
https://ga-dev-tools.appspot.com/query-explorer/?start-date=2015-10-17&end-date=2015-10-17&metrics=ga%3Asessions
Whereas this query returns around a quarter of that, say ~220:
https://ga-dev-tools.appspot.com/query-explorer/?start-date=2015-10-17&end-date=2015-10-17&metrics=ga%3Asessions&dimensions=ga%3Adimension14
Now, my initial reaction was that 'Lang Code' was not set on all pages, but I checked and this dimension is guaranteed to be included on all pages of my website.
Also, no changes have been made to the Analytics View I'm querying.
The same issue occurred a couple of weeks ago, and at the time I fixed it by changing the Scope Type of said Custom Dimensions to Session. But now I'm no longer sure whether this was the correct fix or just a temporary glitch, since:
the issue didn't occur before
the issue now reoccurs
Does anyone have any idea what may have caused this data discrepancy?
P.S. To make things stranger: for daily reporting we run this query every night (around 2am), and then the numbers are actually correct. So apparently it makes a difference at what time the query is executed?
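One plausible mechanism for the drop (not confirmed for this particular property) is that sessions for which the custom dimension was never sent are simply excluded when you group by it, while the plain ga:sessions total still counts them. A toy illustration with invented session records:

```python
# Hypothetical session records; lang_code=None means the session-scoped
# custom dimension (ga:dimension14 in the queries above) was never sent
# during that session.
sessions = [
    {"id": 1, "lang_code": "en"},
    {"id": 2, "lang_code": None},
    {"id": 3, "lang_code": "nl"},
    {"id": 4, "lang_code": None},
]

total = len(sessions)  # what ga:sessions alone would report

# Grouping by the dimension keeps only sessions that have a value for it.
by_lang = {}
for s in sessions:
    if s["lang_code"] is not None:
        by_lang[s["lang_code"]] = by_lang.get(s["lang_code"], 0) + 1

grouped_total = sum(by_lang.values())
print(total, grouped_total)  # 4 vs 2: grouping can report fewer sessions
```

If tagging occasionally fails to fire before the session's first hit (e.g. a race with the tracking snippet), the grouped total can drop sharply even though every page nominally sets the dimension, which would also explain why runs at quiet times of day reconcile correctly.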

Why could an SQL query take more time to execute with each subsequent execution?

I run a complex query against an Oracle 11g database underlying an eBS R12 schema.
On the first run it takes 4 seconds. If I run it again, it takes 9, then 30, etc.
If I add "and 1=1" it takes 4 seconds again, then 9, then 30 and so on.
A quick workaround was to add a randomly generated "and somestring = somestring" predicate, and now the results always come back in 4 seconds.
I have never encountered a query that behaves this way (it should be the opposite, or show no significant change between executions). We tested it on 2 copies of the same DB, with the same behaviour.
How to debug it? And what internal mechanics could be getting confused?
UPDATE 1:
EXPLAIN PLAN FOR
(my query);
SELECT * FROM table(DBMS_XPLAN.DISPLAY);
The output is exactly the same before the first run as it is for subsequent ones; see http://pastebin.com/dMsXmhtG
Check DBMS_XPLAN.DISPLAY_CURSOR. The reason could be cardinality feedback or other adaptive techniques Oracle uses. You should see multiple child cursors for the SQL_ID of your query, and you can compare their plans.
Does your query use bind variables, and do the columns used for filtering have histograms? That could be another reason.
Sounds like you might be suffering from adaptive cursor sharing or cardinality feedback. Here is an article showing how to turn them off; perhaps you could do that and see whether the issue stops happening, as well as following #OldProgrammer's suggestion of tracing what is happening.
If one of these is found to be the problem, you can then take the necessary steps to ensure that the root cause (eg. incorrect statistics, unnecessary histograms, etc.) is corrected.
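The feedback loop the answers describe can be sketched as a toy model (purely illustrative; this is not how Oracle's optimizer is implemented, and the thresholds are invented): the optimizer estimates a row count, picks a plan, observes the real count at run time, and re-optimizes with the observed value on the next execution, so the plan, and with it the run time, can change from run to run even though the SQL text is identical.

```python
# Toy model of cardinality feedback: estimate -> plan -> observe -> re-estimate.
def choose_plan(estimated_rows):
    # Small estimates favour an index plan, large ones a full scan
    # (threshold invented for illustration).
    return "INDEX" if estimated_rows < 1000 else "FULL_SCAN"

def run_with_feedback(actual_rows, initial_estimate, executions):
    estimate = initial_estimate
    plans = []
    for _ in range(executions):
        plans.append(choose_plan(estimate))
        estimate = actual_rows  # feedback: reuse the observed cardinality
    return plans

# The first execution uses the (wrong) initial estimate; later ones use
# the observed cardinality, so the plan flips after run 1 -- mirroring a
# query whose behaviour changes between identical executions.
print(run_with_feedback(actual_rows=50000, initial_estimate=10, executions=3))
```

This also suggests why the random-literal workaround in the question "helps": each distinct SQL text gets a fresh cursor with no accumulated feedback, so every execution behaves like the first one rather than picking up the re-optimized plan.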
