Wrong count of users Google Reporting API v3

I use the Google Analytics Core Reporting API v3 to request data from Google Analytics.
Request:
metrics = "ga:sessions,ga:users"
dimensions = "ga:landingPagePath"
filter = "ga:channelGrouping=@Organic"
precision = "higher_precision"
Results
{.... // 7 rows with landing pages
....
ga:users: "39"
....
containsSampledData: false}
But on the Google Analytics website I see 34 users for the same period.
If I remove dimensions = "ga:landingPagePath" from the request, I get the same user count as the UI (34 users).
How can I get data with dimensions = "ga:landingPagePath" and a user count that matches the UI in a single request?

The Google Analytics database is a multidimensional database. It is NOT a relational database.
You cannot compare things that are not exactly the same. Create a report on the Google Analytics website that has exactly the same dimensions, metrics, and date range, and the data should come close to what you get from the API. There will always be small discrepancies, especially with calculated columns.
You can't look at two different requests and expect to see the same numbers; it doesn't work that way.
From the Google analytics website
From query explorer
Google Analytics website report
If, for example, you want to see this report, you will have to include all of the dimensions and metrics you can see here in order to get the same results back from the API. You can't just take landing pages and sessions and expect the numbers to match up. Again, this is not a relational database; it's a multidimensional database.
Also make sure that the dates you are checking are at least 48 hours old. Data less than 48 hours old has not finished processing and may cause your numbers to be off.
For the report you linked in the comments, you would need something like this:
ga:landingPagePath
ga:users
ga:newUsers
ga:sessions
ga:bounceRate
ga:pageviewsPerSession
ga:avgSessionDuration
ga:goal1ConversionRate
ga:goal1Completions
ga:goal1Value
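
One common reason why per-row user counts add up to more than the overall figure, offered here as an assumption rather than a diagnosis of this particular property, is that ga:users is a unique count: a user whose sessions start on two different landing pages appears in both rows, so summing the rows counts that user twice. The toy data and function names below are invented for illustration.

```python
from collections import defaultdict

# Invented sessions as (user_id, landing_page) pairs. Not real Analytics data.
sessions = [
    ("u1", "/home"),
    ("u1", "/pricing"),  # u1 landed on two different pages
    ("u2", "/home"),
    ("u3", "/blog"),
    ("u3", "/home"),     # so did u3
]


def users_per_landing_page(rows):
    """Unique users for each landing page, as a report split by that dimension shows."""
    users = defaultdict(set)
    for user_id, page in rows:
        users[page].add(user_id)
    return {page: len(ids) for page, ids in users.items()}


def total_users(rows):
    """Unique users over the whole period, with no dimension applied."""
    return len({user_id for user_id, _ in rows})


per_page = users_per_landing_page(sessions)
sum_of_rows = sum(per_page.values())  # counts u1 and u3 twice
overall = total_users(sessions)       # counts every user once
```

In this sketch the rows sum to more than the overall unique count, which is the same shape of mismatch as 39 versus 34 in the question.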

Related

(Google Analytics 4). Collect custom events sorted by "event_timestamp"

I have a custom event created directly on the web page, as follows:
gtag('event', <action>, {'value': <value>, 'event_timestamp': Date.now()});
Google Analytics 4 records them and I can see them in the reports. My problem is that the event timestamps have a precision of 1 second. When two events occur in a row within the same second, GA4 records them with the same timestamp and sometimes interleaves them in the timeline, putting the first one after the second one and breaking the sequence order.
It is very important to maintain the order of events in the Google Analytics 4 User Interface.
That's why the 'event_timestamp': Date.now() parameter has been added in gtag(): to get millisecond accuracy.
Now, is it possible to sort the events in GA4's UI according to 'event_timestamp'?
Regards and thanks.
NOTE: I don't use Google Tag Manager or BigQuery.

How to perform a "from" in Elasticsearch scroll context?

I have a large dataset to query and display in a table on a website.
I built a pagination system on top of a scroll, but I can only display a maximum of 100 items at a time, so I run into trouble when I want to display page 200 and beyond: I have to scroll all the way to that page, and it takes too long.
I have checked the other parts of my code and found no other performance issue; it is only the scroll queries that make my API call too slow. I tried changing the request size from 100 to 10000, but it doesn't change anything.
I don't think sliced scroll can be a solution, or else I didn't understand how it works.
I'm desperately looking for a way to skip the scroll queries that come before the data I'm looking for, even if the method is not precise.
Hoping someone has a solution, or at least a clue.
Edit:
More details about what i'm trying to achieve.
I log some of my users' actions, such as calls, in Elasticsearch indices. They perform millions of actions per month, so Elasticsearch seems like a good option to store them, given that I don't have to update them once they are stored.
I'm creating a page where my users can search for actions they've performed, but they build the "query" themselves. I mean they can select the period and many other parameters, sort by many parameters, etc. The number of results can be anywhere from 1 to 100,000 items, but I can't show 100,000 items on my page for UI reasons, so I have to paginate and send only part of the result to the page.
For now I use a scroll query with a size of 1000, and I scroll until I reach the current page of my pagination. I tried varying the size, but the results are not really conclusive because I can't know the number of results before the query is run.
And the deeper my user goes into the pagination, the longer the query takes.
I could raise index.max_result_window to a value that will never be reached (though I don't know what that implies), run a simple query with a from, and keep a second scroll query for the export case. But I wonder whether there is a way to skip some steps in a scroll when I know I'm going to take 100 items after the 1,000,000th item?
Edit: I looked at how Google designs its pagination and noticed that if you want to go deep into the search results, you can only get there step by step. You can't go directly to the 500th page.
This is how I did mine.
So I just redesigned my pagination to do the same as Google and force my users to use more precise filters to get fewer results. Thank you @Val for getting me to ask the right questions :)
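
A minimal sketch of the step-by-step pagination described above, assuming the UI only offers links to pages close to the current one. The function name and the default window size are hypothetical; the original post does not show its implementation.

```python
def reachable_pages(current, total_pages, window=5):
    """Return the page numbers a user may jump to from `current`.

    Only pages within `window` of the current page are offered, so a
    deep page can only be reached step by step, never in a single jump.
    """
    if total_pages < 1:
        return []
    current = max(1, min(current, total_pages))  # clamp into the valid range
    first = max(1, current - window)
    last = min(total_pages, current + window)
    return list(range(first, last + 1))
```

With a small window, each request only ever has to move a bounded number of pages away from the previous one, which pairs naturally with asking users to narrow their filters instead of paging deep.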

REST API - Retrieve previous query in dynamoDB

I have 100 rows of data in DynamoDB and an API with the path api/get/{number}.
When I pass number=1, the API should return the first 10 values. When I pass number=2, it should return the next 10 values. I did something like this with query, lastEvaluatedKey, and sorting by createdOn. Now the use case is this: if the user passes number=10 after number=2, the lastEvaluatedKey is still that of page 2, so the result would be the data of page 3. How can I get the requested page directly? Likewise, if the user goes from number=3 to number=1, the data will still not be that of page 1.
I am using this to make API calls based on pagination in an HTML page.
I am using Java 1.8 and aws-java-sdk-dynamodb.

Non-sequential pagination in DynamoDB is tough - you have to design your data model around it, if it's an operation that needs to be efficient at all times. For a recommendation in your specific case I'd need more details about the data and access patterns.
In general you have the option of setting the ExclusiveStartKey attribute in the query call, which is similar to an offset in relational databases, but only similar and not identical. The ExclusiveStartKey is the key after which the query will continue, meaning data from your table and not just a number.
That means you usually can't guess it, unless it's a sequential number - which isn't ideal.
For sequential pagination, i.e. the user goes from page 1 to page 2, page 2 to page 3 etc. you can pass that along in the request as a token, but that won't work if the user moves in the other direction page 3 to page 2 or just randomly navigates to page 14.
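
For the sequential case, one way to pass the key along as a token is to serialize the LastEvaluatedKey returned by a query into an opaque string and decode it again for ExclusiveStartKey on the next call. The sketch below is in Python rather than the asker's Java, the two helper names are hypothetical, and it assumes the key contains only JSON-serializable values.

```python
import base64
import json


def encode_page_token(last_evaluated_key):
    """Turn a LastEvaluatedKey dict into an opaque, URL-safe string."""
    if not last_evaluated_key:
        return None  # no key means there is no next page
    raw = json.dumps(last_evaluated_key, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_page_token(token):
    """Recover the dict to pass as ExclusiveStartKey on the next query."""
    if not token:
        return None  # first page: start from the beginning
    raw = base64.urlsafe_b64decode(token.encode("ascii"))
    return json.loads(raw.decode("utf-8"))
```

The client simply echoes the token back when asking for the next page. As the answer notes, this only helps when moving forward one page at a time.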
In your case you only have a limited amount of data (100 items), so my solution for your specific case would be to query all items and limit the number of items in the response to n * 10, where n is the requested page. Then you return the last 10 items from that result to your client.
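
The approach above can be sketched as follows, in Python rather than the asker's Java. Here `query_first` is a hypothetical stand-in for a DynamoDB query that returns the first `limit` items in createdOn order (a real wrapper might need to follow LastEvaluatedKey until enough items are collected), and the in-memory `table` list is invented test data.

```python
def get_page(query_first, page_number, page_size=10):
    """Return the items of `page_number` (1-based).

    `query_first(limit)` is a hypothetical callable returning the first
    `limit` items in createdOn order.
    """
    if page_number < 1:
        raise ValueError("page_number must be >= 1")
    items = query_first(page_number * page_size)
    # Keep only the slice that belongs to the requested page. Slicing from
    # the page start also handles a short or missing last page.
    start = (page_number - 1) * page_size
    return items[start:start + page_size]


# Stand-in for the table: 100 items already sorted by createdOn.
table = [{"id": i, "createdOn": i} for i in range(1, 101)]


def query_first(limit):
    return table[:limit]


page_3 = get_page(query_first, 3)
```

Because every request starts from the beginning, the result does not depend on which page was requested before, so jumping from page 2 to page 10 or back to page 1 works.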
This solution would get expensive at scale (time + cost). Fortunately, not many people will use the pagination to go to page 7 or 8 (you could bury a body on page 2 of the Google search results).
Yan Cui has written an interesting post on this problem on Hackernoon, you might want to check it out.

Web Scraping returning empty data table UiPath

I’m using Data Scraping to scrape product information (i.e. Product Name, URL, Price, Model) from a shopping website.
When I search for a product, I want to scrape the data of whichever item comes first, so I have set the maximum number of results to 1. The problem is that it sometimes returns an empty data table, and I cannot figure out why.
My guess is that if the current search result matches the elements I selected in the Data Scraping wizard, it returns the data table, and if it doesn’t match, it returns an empty data table.
For example, while I was selecting elements in the Data Scraping wizard, the search results were Samsung monitors. When I ran the project and searched for Dell monitors, it returned a data table, but when I searched for Samsung series or Dell series, it returned an empty data table. What is wrong here?

You need to say what output you actually need.
But if your output is empty, the reason is most likely one of the following:
make sure the timeout is high enough; set it to 30000 if you are unsure
use a robust selector that keeps working even when the website changes for some reason
For me it works properly with a sufficient timeout and a flexible selector that uses a * wildcard.

RSS functioning problem

I need to create an RSS feed for our information system, which is written in PHP.
I had no problems with the RSS 2.0 specification, nor with creating an RSS feed generator. Items for the feed are to be fetched from a large table containing lots of records, so it would take a lot of time to get all the necessary information from this table. Therefore, it is necessary to implement the following scheme:
Show the 5 latest items to new subscribers.
For existing subscribers, show only those items which have been added since their last view of the feed.
I have no problems with the first condition: I can simply use the LIMIT clause to limit the number of fetched rows. Something like this:
$items = function_select("SELECT * FROM some_table ORDER BY date DESC LIMIT 5");
But this creates the following problem: suppose there are real feed subscribers who have already read items 1 through 10. While they were away for some time, new items were created; say, 10 new items.
At their next check-in we want them to see all 10 new items, but not all at once. As it stands, they will see only the last 5 (items 16 through 20), not all 10; items 11 through 15 will be omitted.
I suppose that solving this problem requires some kind of flag to be sent to the feed, for example the pubDate of the last fetched item. Twitter's feed uses something similar. However, that link is hand-made. How could it be done another way?
Please let me know if you have any ideas. If you have any example ready (no matter in what language) just share a link with me. I would appreciate it greatly.
Thank you in advance.

Standard RSS feeds don't render different content to different users. They simply always provide the most recent few items (often 10), and rely on the RSS reader to poll them often enough that they don't miss any updates. Unless you have a particularly compelling reason not to do this, this is the simplest and most effective mechanism.
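
As a rough illustration of that mechanism, the sketch below builds an RSS 2.0 document containing only the most recent items, identically for every subscriber. It is written in Python rather than the asker's PHP; the function name, the item keys (title, link, published), and the example.com channel values are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(items, limit=10):
    """Build an RSS 2.0 document with the `limit` most recent items.

    `items` is a list of dicts with hypothetical keys: title, link and
    published (a timezone-aware datetime).
    """
    latest = sorted(items, key=lambda it: it["published"], reverse=True)[:limit]
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Example feed"
    ET.SubElement(channel, "link").text = "https://example.com/"
    ET.SubElement(channel, "description").text = "Most recent items"
    for it in latest:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = it["title"]
        ET.SubElement(item, "link").text = it["link"]
        # The guid lets a reader recognize items it has already seen.
        ET.SubElement(item, "guid").text = it["link"]
        ET.SubElement(item, "pubDate").text = format_datetime(it["published"])
    return ET.tostring(rss, encoding="unicode")
```

In a real system the sorting and limiting would normally be done by the database query (as with the LIMIT clause in the question), so the large table is never read in full.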
