Generate number of search requests over a given year - google-search-appliance

Does anyone know if there is a way to generate a report that details how many search requests the GSA has handled over a given timeframe?

On the admin console: Reports > Search Logs.
Select the appropriate collection or All Collections.
Select the desired date range for the Report Timeframe.
From memory, this only gives access to a maximum of 90 days of historical data, so if you haven't been regularly exporting this data then you'll need to extrapolate the values from what is available.

As pointed out by @BigMikeW, the logs only retain data for 90 days. Unless you download them every 90 days, you won't have the history.
The other way is to integrate with Google Analytics and pass all search data through to GA's search behaviour reports. That way you can use GA to explore and export the data for a year or even more. This is what I use.
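As a rough sketch of that GA integration (assuming a Universal Analytics property and the Measurement Protocol; the tracking ID, client ID and helper name below are hypothetical), each search can be forwarded to GA as a pageview whose path carries the query, so GA's Site Search reports can aggregate it:

import requests

GA_ENDPOINT = "https://www.google-analytics.com/collect"  # Measurement Protocol (Universal Analytics)

def log_search_to_ga(tracking_id, client_id, query):
    # Send one hit per search; GA's Site Search reporting can then be
    # configured to read the "q" parameter from the page path.
    payload = {
        "v": "1",             # protocol version
        "tid": tracking_id,   # e.g. "UA-XXXXXXX-Y" (your GA property)
        "cid": client_id,     # anonymous client identifier
        "t": "pageview",      # hit type
        "dp": "/search?q=" + query,  # page path carrying the search term
    }
    requests.post(GA_ENDPOINT, data=payload, timeout=5)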

Related

Why am I missing some data when querying the Chrome UX Report API?

When querying the Chrome UX Report API I sometimes get a 404 error, "chrome ux report data not found". The documentation says that a 404 means the CrUX API doesn't have any data for the given origin.
For every URL I query I get some metrics; there is no URL for which all metrics are missing, and for most URLs I get all the data.
But there are cases where the data for a certain metric is missing. For one URL the FID data is missing (data for all other metrics exists); for other URLs FID, LCP and CLS are missing (data for FCP exists).
Is this an API glitch? What should I do to get data for all queried metrics?
PS: if I query the same URLs now and again 30 minutes later, I get different results: for the same URLs different metrics are missing - on the first query FCP is missing, on the second LCP and CLS. Why is that?
FCP is the only metric guaranteed to exist. If a user visits a page but it doesn't have an FCP, CrUX throws it away. It's theoretically possible for some users to experience FCP but not LCP, for example if they navigate away in between events. Newer metrics like CLS weren't implemented in Chrome until relatively recently (2019) so users on much older versions of Chrome will not report any CLS values. There are also periodic metric updates and Chrome may require that metrics reflect the latest implementation in order to be aggregated in CrUX.
The results should be stable for roughly 1 full day. If you're seeing changes after only 30 minutes, it's possible that you happened to catch it during the daily update.
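To illustrate, here is a minimal sketch (in Python, against the public records:queryRecord endpoint; the API key and URL are placeholders) that checks which metrics actually came back instead of assuming all four are present:

import requests

API_KEY = "YOUR_API_KEY"  # placeholder
ENDPOINT = "https://chromeuxreport.googleapis.com/v1/records:queryRecord?key=" + API_KEY

resp = requests.post(ENDPOINT, json={"url": "https://example.com/"})
if resp.status_code == 404:
    print("CrUX has no data at all for this URL")
else:
    metrics = resp.json().get("record", {}).get("metrics", {})
    for name in ("first_contentful_paint", "largest_contentful_paint",
                 "first_input_delay", "cumulative_layout_shift"):
        if name in metrics:
            print(name, "p75 =", metrics[name]["percentiles"]["p75"])
        else:
            print(name, "not reported for this URL in this dataset")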

Can you have different time ranges on different panels on the same dashboard?

I'm trying to set up a monitoring dashboard that contains two graphs. One that shows current hour transaction volumes (in 1 minute intervals from current hour start until now) and one that shows current day transaction volumes (in 10 minute intervals from 00:00 until now). I can't seem to find a way to display two different x-axis timelines on the two different panels if I create them on the same dashboard. Is there a way to do what I'm looking for?
I've tried updating the queries themselves, messing with the dashboard settings, and messing with the panel settings but I haven't found what I needed. I'm using Grafana 6.0.0
Just found the answer in the docs: Relative time. With this option you can set a time range per graph.
A bit late with this answer, but it might still help someone. You can have different time ranges for different panels on the same dashboard. I have InfluxDB as a data source, and all I did was update the Relative time field in the panel's Query options.
In my case the dashboard has a time range of 30 days, but by setting this on a specific panel I was able to show data for the last 24 hours. The value in the Relative time field should be positive (-24h will not work here). The JSON fragment below shows where this override lands in the dashboard model.
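For reference, the same override appears in the panel's JSON model as a timeFrom field (a sketch of the relevant fragment only; the panel title is just an example):

{
  "title": "Current day transaction volume",
  "timeFrom": "24h",
  "hideTimeOverride": false
}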

API User Usage Report: Inconsistent Reporting

I'm using a JVM to perform API calls to the Google Apps Administrator API.
I've noticed that with the User Usage Reports I'm not getting complete data for a field I'm interested in (num_docs_externally_visible) and the fields that feed into that field's calculation. I generally request a single day's usage report at a time, across my entire user base (~40k users).
According to the developer documentation, I should be able to see that field in a report 2-6 days afterwards; however, after running reports for the first 3 weeks of February, I've only gotten it for 60% of the days. The pattern appears to be random (I have streaks of up to 4 days in a row of the field appearing and 3 days in a row of it missing, but there is no consistency to it).
Has anyone else experienced this issue? And if so, were you able to resolve it? Or should I expect this behavior to continue if this is an issue with what the API is returning outside of my control?
I think it's only natural that the data you get is not yet complete; it takes a few days for the complete data to arrive.
This SO question is not exactly the same as yours, but I think it will help you, especially the part about needing to use your account's time zone.
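As a rough sketch of what such a daily pull looks like with the Admin SDK Reports API Python client (credential setup is omitted, and the exact parameter name for the externally-visible-docs counter is an assumption based on the field you mention), it is also worth checking the warnings array the API returns when a day's data is not yet complete:

from googleapiclient.discovery import build

# creds: a delegated admin credential with the reports usage readonly scope (setup omitted)
service = build('admin', 'reports_v1', credentials=creds)

response = service.userUsageReport().get(
    userKey='all',
    date='2017-02-01',                               # one day's report at a time
    parameters='docs:num_docs_externally_visible',   # assumed parameter name
).execute()

# The API can flag days whose data is still being assembled.
for warning in response.get('warnings', []):
    print(warning.get('code'), warning.get('message'))

for report in response.get('usageReports', []):
    for param in report.get('parameters', []):
        print(report['entity'].get('userEmail'), param.get('name'), param.get('intValue'))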

Google big query API returns "too many free query bytes scanned for this project"

I am using Google's big query API to retrieve results from their n-gram dataset. So I send multiple queries of "SELECT ngram from trigram_dataset where ngram == 'natural language processing'".
I'm basically using the same code posted here (https://developers.google.com/bigquery/bigquery-api-quickstart) replaced with my query statement.
On every program run I have to get a new authorization code and type it into the console, which authorizes my program to send queries to Google BigQuery under my project ID. However, after sending 5 queries it just returns "message": "Exceeded quota: too many free query bytes scanned for this project".
According to Google BigQuery policy, the free quota is 100 GB/month, and I don't think I've come anywhere near that quota. Someone suggested in a previous thread that I should enable billing to use the free quota, which I did, but it's still giving me the same error. Is there any way to check the leftover quota or to resolve this problem? Thank you very much!
The query you've mentioned scans 1.12 GB of data, so you should be able to run it 89 times in a month.
The way the quota works is that you start out with 100 GB of monthly quota -- if you use it up, you don't have to wait an entire month; you get roughly 3.3 GB more quota every day.
My guess (please confirm) is that you ran a bunch of queries and used up your 100 GB monthly free quota, then waited a day, and only were able to run a few queries before hitting the quota cap. If this is not the case, please let me know, and provide your project id and I can take a look in the logs.
Also, note that this isn't the most efficient usage of bigquery; an option would be to batch together multiple requests. In this case you could do something like:
SELECT ngram
FROM trigram_dataset
WHERE ngram IN (
'natural language processing',
'some other trigram',
'three more words')
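To see how many bytes a query will count against the quota before running it, a dry run works. Here is a minimal sketch using the current google-cloud-bigquery Python client (not the older quickstart client the question uses; the project ID is a placeholder):

from google.cloud import bigquery

client = bigquery.Client(project="your-project-id")  # placeholder project

sql = """
SELECT ngram
FROM trigram_dataset
WHERE ngram IN ('natural language processing', 'some other trigram', 'three more words')
"""

# dry_run estimates the bytes scanned without executing (or billing) the query
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(sql, job_config=job_config)
print("This query would scan {:.2f} GB".format(job.total_bytes_processed / 1e9))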

(ASP.NET) How would you go about creating a real-time counter which tracks database changes?

Here is the issue.
The site I've recently taken over tracks the "miles" you ran in a day. A user can log into the site and add that they ran 5 miles, and this is then added to the database.
At the end of the day, around 1am, a service runs which calculates all the miles, all the users ran in the day and outputs a text file to App_Data. That text file is then displayed in flash on the home page.
I think this is kind of ridiculous. I was told they had to do this due to massive performance issues. They won't tell me exactly how they were doing it before or what the major performance issue was.
So what approach would you guys take? The first thing that popped into my mind was a web service which gets the data via an AJAX call. Perhaps every time a new "mile" entry is added, a trigger is fired and updates the "GlobalMiles" table.
I'd appreciate any info or tips on this.
Thanks so much!
Answering this question is a bit difficult since we don't know all of your requirements and something clearly didn't work before. So here are some different ideas.
First, revisit your assumptions. Generating a static report once a day is a perfectly valid solution if all you need is daily reports. Why hit the database multiple times throughout the day if all that's needed is a snapshot (for instance, lots of blog software used to write html files when a blog was posted rather than serving up the entry from the database each time -- many still do as an optimization)? Is the "real-time" feature something you are adding?
I wouldn't jump to AJAX right away. Use the same input method, just move the report from static to dynamic. Doing too much at once is a good way to get yourself buried. When changing existing code I try to find areas that I can change in isolation with the least amount of impact to the rest of the application. Then once you have the dynamic report you can add AJAX (and please use progressive enhancement).
As for the dynamic report itself you have a few options.
Of course you can just SELECT SUM(), but it sounds like that would cause the performance problems if each user has a large number of entries.
If your database supports it, I would look at using an indexed view (sometimes called a materialized view). It allows fast access to the sum data, which SQL Server keeps up to date in real time:
CREATE VIEW vw_Miles WITH SCHEMABINDING AS
SELECT SUM([Count]) AS TotalMiles,
COUNT_BIG(*) AS [EntryCount],
UserId
FROM dbo.Miles   -- SCHEMABINDING requires two-part object names
GROUP BY UserId
GO
CREATE UNIQUE CLUSTERED INDEX ix_Miles ON vw_Miles(UserId)
If the overhead of that is too much, @jn29098's solution is a good one: roll it up using a scheduled task. If there are a lot of entries for each user, you could add only the delta since the last time the task ran.
UPDATE GlobalMiles SET [TotalMiles] = [TotalMiles] +
(SELECT SUM([Count])
FROM Miles
WHERE UserId = @id
AND EntryDate > @lastTaskRun
GROUP BY UserId)
WHERE UserId = @id
If you don't care about storing the individual entries but only the total you can update the count on the fly:
UPDATE Miles SET [Count] = [Count] + @newCount WHERE UserId = @id
You could use this method in conjunction with the SPROC that adds the entry and get the best of both worlds.
Finally, your trigger method would work as well. It's an alternative to the indexed view where you do the update yourself on a table instead of SQL Server doing it automatically. It's also similar to the previous option in that it moves the global update out of the sproc and into a trigger.
The last three options make it more difficult to handle the situation when an entry is removed, although if that's not a feature of your application then you may not need to worry about that.
Now that you've got materialized, real-time data in your database, you can dynamically generate your report. Then you can add the fancy AJAX on top.
If they are truly having performance issues due to too many hits on the database, then I suggest you take all the input and push it into a message queue (MSMQ). Then you can have a service on the other end that picks up the messages and does a bulk insert of the data. This way you have fewer db hits. Then you can output to the text file on the update too.
I would create a summary table that's rolled up once/hour or nightly which calculates total miles run. For individual requests you could pull from the nightly summary table plus any additional logged miles for the period between the last rollup calculation and when the user views the page to get the total for that user.
How many users are you talking about and how many log records per day?
