Creating graphs with Firebase Analytics in BigQuery - google-api

I am trying to create custom graphs from data I fetched from the Firebase console via event logging.
1- I found BigQuery and Data Studio for generating graphs, but my requirement is to generate the graphs automatically and update them on a daily basis.
2- I also want to know about an API that will help me display these graphs (generated through BigQuery) in a front-end web app built with React.js.
SELECT
  *
FROM (
  SELECT
    (
      SELECT x.value
      FROM UNNEST(user_properties) x
      WHERE x.key='restaurantName'
        AND x.value IS NOT NULL ).string_value AS restaurantName,
    event_name AS event,
    (
      SELECT x.value
      FROM UNNEST(user_properties) x
      WHERE x.key='restaurantId'
        AND x.value IS NOT NULL).string_value AS restaurantId,
    event_date AS date,
    (
      SELECT x.value
      FROM UNNEST(event_params) x
      WHERE x.key="allergens"
        AND x.value IS NOT NULL).string_value AS allergens,
    (
      SELECT x.value
      FROM UNNEST(event_params) x
      WHERE x.key="dishes"
        AND x.value IS NOT NULL).string_value AS dishes,
    (
      SELECT x.value
      FROM UNNEST(event_params) x
      WHERE x.key='vegan'
        AND x.value IS NOT NULL).string_value AS vegan,
    (
      SELECT x.value
      FROM UNNEST(event_params) x
      WHERE x.key="vegetarian"
        AND x.value IS NOT NULL).string_value AS vegetarian,
    (
      SELECT x.value
      FROM UNNEST(event_params) x
      WHERE x.key="orderTotal"
        AND x.value IS NOT NULL).string_value AS orderTotal,
    app_info.version AS version
  FROM
    `reference`
  WHERE
    event_name="ConfirmOrderBtn"
    AND app_info.id = "abc"
  ORDER BY
    event_date ASC )
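To make the repeated subqueries in the query above easier to follow, here is a minimal Python sketch of what each one does: it scans a repeated key/value field and returns the string_value for one key. The event dictionary and the param_string helper are hypothetical and only mimic the shape of an exported event row. One difference to keep in mind: the sketch returns the first match, whereas a scalar subquery in BigQuery raises an error if more than one row matches.

```python
from typing import Optional

# Hypothetical event row shaped like an exported analytics record: every entry
# in event_params is a key plus a value struct holding typed fields.
event = {
    "event_name": "ConfirmOrderBtn",
    "event_params": [
        {"key": "dishes", "value": {"string_value": "soup,salad"}},
        {"key": "vegan", "value": {"string_value": "true"}},
        {"key": "orderTotal", "value": {"string_value": "23.50"}},
    ],
}


def param_string(params: list, key: str) -> Optional[str]:
    """Return string_value of the first entry matching key, or None.

    Rough equivalent of:
    (SELECT x.value FROM UNNEST(event_params) x
     WHERE x.key = key AND x.value IS NOT NULL).string_value
    """
    for entry in params:
        value = entry.get("value")
        if entry["key"] == key and value is not None:
            return value.get("string_value")
    return None


# Build one flat output row, as the outer SELECT does.
row = {
    "event": event["event_name"],
    "dishes": param_string(event["event_params"], "dishes"),
    "vegan": param_string(event["event_params"], "vegan"),
    "orderTotal": param_string(event["event_params"], "orderTotal"),
    # The key is absent in the sample data, so this becomes None (NULL in SQL).
    "allergens": param_string(event["event_params"], "allergens"),
}
print(row)
```

A missing key simply yields None, which corresponds to the NULL the query would return for that column.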

The refresh rate at the backend depends on the connector you are using. In this case that is the BigQuery connector, which has the following data refresh options:
Every 1 hour
Every 4 hours
Every 12 hours* (default)
Refresh times for other connectors, together with further useful information, are described at the following link; its section "Set data freshness for a data source" shows an example of the freshness options available per connector.
At the frontend, by contrast, the data coming from the backend is updated in your browser according to the cache refresh rate. The cache can be refreshed with the "Refresh data" button in the upper right-hand side of the UI. This process can be automated either with a browser console command or with a plugin, as specified in this question.
At the moment I am not aware of any Data Studio API. As I understand it, the ease of using Data Studio comes precisely from its ready-made front-end components and data integration tools. Therefore I am not sure I fully understand your question.
Please note that the minimum refresh rate for combined sources is equal to the minimum refresh rate among the sources. Therefore, in your case the data would update every 12 hours, even though at the front end it would be refreshed daily. Also, refreshing data more often triggers more query executions to update the data, and therefore results in higher billing costs.

What I am currently doing is this: I created a data source in Data Studio at this link: https://datastudio.google.com/u/2/datasources/create (create data store)
1- Connect the project and choose your data set
2- Write a custom query for it
3- Connect the query
4- Explore with a graph, name it and save it
5- Whenever you visit the above-mentioned link you will see a list of data sources and explorers for reaching your graph. Click the refresh icon and it will update.

Related

SSIS Google Analytics - Reporting dimensions & metrics are incompatible

I'm trying to extract data from Google Analytics, using KingswaySoft - SSIS Integration Toolkit, in Visual Studio.
I've set the metrics and dimensions, but I get this error message:
Please remove transactions to make the request compatible. The request's dimensions & metrics are incompatible. To learn more, see https://ga-dev-tools.web.app/ga4/dimensions-metrics-explorer/
I've tried removing the transactions metric and then it works, but this metric is really necessary.
Metrics: sessionConversionRate, sessions, totalUsers, transactions
Dimensions: campaignName, country, dateHour, deviceCategory, sourceMedium
Any idea on how to solve it?
I'm not sure how helpful this suggestion is, but a possible workaround could be to use two queries.
Query 1: Existing query without transactions
Query 2: The same dimensions with transactionId included
The idea would be to use the SSIS Aggregate component to group by the original dimensions and count the transactions. You could then merge the queries together via a merge join.
Would that work?
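As a rough illustration of the two-query idea above, here is a minimal Python sketch of the aggregate and merge steps. The row data is made up, only two of the five dimensions are used for brevity, and the names query1_rows, query2_rows and dim_key are hypothetical. It counts distinct transactionId values per dimension group and attaches the count to the first query's rows, defaulting to 0 when a group has no transactions (a merge join in SSIS would need the equivalent outer-join setting for that).

```python
from collections import defaultdict

DIMENSIONS = ("campaignName", "country")  # shortened list, for illustration only

# Hypothetical result of Query 1: dimensions plus the compatible metrics.
query1_rows = [
    {"campaignName": "spring", "country": "US", "sessions": 40},
    {"campaignName": "summer", "country": "DE", "sessions": 12},
    {"campaignName": "autumn", "country": "FR", "sessions": 5},
]

# Hypothetical result of Query 2: the same dimensions plus transactionId.
query2_rows = [
    {"campaignName": "spring", "country": "US", "transactionId": "t1"},
    {"campaignName": "spring", "country": "US", "transactionId": "t2"},
    {"campaignName": "summer", "country": "DE", "transactionId": "t3"},
]


def dim_key(row: dict) -> tuple:
    """Build the grouping key from the shared dimension columns."""
    return tuple(row[d] for d in DIMENSIONS)


# Aggregate step: collect the distinct transaction ids per dimension group.
ids_by_group = defaultdict(set)
for r in query2_rows:
    ids_by_group[dim_key(r)].add(r["transactionId"])

# Merge step: attach the transaction count to each row of Query 1.
merged = [
    {**r, "transactions": len(ids_by_group.get(dim_key(r), ()))}
    for r in query1_rows
]

for r in merged:
    print(r)
```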
The API supports what it supports. So if you've attempted to pair things that are incompatible, you won't get any data back. Things that seem like they should totally work go together like orange juice and milk.
While I worked on the GA stuff through Python, an approach we found helped us work through incompatible metrics and total metrics was to make multiple pulls using the same dimensions. As the data sets are at the same level of grain, as long as you match up each dimension in the set, you can have all the metrics you want.
In your case, I'd have 2 data flows, followed by an Execute SQL Task that brings the data together for the final table:
DFT1: Query1 -> Derived Column -> Stage.Table1
DFT2: Query2 -> Derived Column -> Stage.Table2
Execute SQL Task:
SELECT
    T1.*, T2.Metric_A, T2.Metric_B, ... T2.Metric_Z
INTO
    #T
FROM
    Stage.Table1 AS T1
    INNER JOIN
        Stage.Table2 AS T2
        ON T2.Dim1 = T1.Dim1 /* etc */ AND T2.Dim7 = T1.Dim7;
-- Update rows once you have solid data, i.e. once
-- isDataGolden exists in the "data" section of the response.
-- Usually within 7? days but possibly sooner.
UPDATE
    X
SET
    metric1 = T.metric1 /* etc */
FROM
    dbo.X AS X
    INNER JOIN #T AS T
        ON T.Dim1 = X.Dim1 /* etc */
WHERE
    X.isDataGolden IS NULL
    AND T.isDataGolden IS NOT NULL;
-- Add new data but be aware that not all nodes might have
-- reported in.
INSERT INTO
    dbo.X
SELECT
    *
FROM
    #T AS T
WHERE
    NOT EXISTS (SELECT * FROM dbo.X AS X WHERE X.Dim1 = T.Dim1 /* etc */);
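Since this approach was originally worked out in Python, here is a minimal sketch of the same logic on plain lists of dictionaries: join two pulls that share the same dimensions, refresh rows that were not yet golden, and add rows not seen before. The function names join_pulls and upsert, the sample rows and the two-column dimension list are all hypothetical; isDataGolden is treated as a nullable flag, as in the SQL above.

```python
DIMS = ("country", "deviceCategory")  # shortened dimension list, for illustration


def join_pulls(pull_a: list, pull_b: list, dimensions: tuple) -> list:
    """Inner-join two pulls that were requested with the same dimensions."""
    index = {tuple(r[d] for d in dimensions): r for r in pull_b}
    joined = []
    for row in pull_a:
        match = index.get(tuple(row[d] for d in dimensions))
        if match is not None:
            joined.append({**row, **match})
    return joined


def upsert(final: list, staged: list, dimensions: tuple) -> list:
    """Refresh rows that were not golden yet and append unseen rows."""
    by_key = {tuple(r[d] for d in dimensions): r for r in final}
    for row in staged:
        key = tuple(row[d] for d in dimensions)
        existing = by_key.get(key)
        if existing is None:
            new_row = dict(row)
            final.append(new_row)
            by_key[key] = new_row
        elif existing.get("isDataGolden") is None and row.get("isDataGolden") is not None:
            existing.update(row)
    return final


pull_a = [
    {"country": "US", "deviceCategory": "mobile", "sessions": 40, "isDataGolden": True},
    {"country": "DE", "deviceCategory": "desktop", "sessions": 12, "isDataGolden": None},
]
pull_b = [
    {"country": "US", "deviceCategory": "mobile", "transactions": 3},
    {"country": "DE", "deviceCategory": "desktop", "transactions": 1},
]
# Rows already stored from an earlier, not yet golden, pull.
final_table = [
    {"country": "US", "deviceCategory": "mobile", "sessions": 35,
     "transactions": 2, "isDataGolden": None},
]

staged = join_pulls(pull_a, pull_b, DIMS)
upsert(final_table, staged, DIMS)
for r in final_table:
    print(r)
```

Because both pulls are at the same grain, matching on every dimension gives at most one partner row per key, which is what keeps the join from duplicating metrics.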

Field's Max and Min values as Default values in Control

I have 2 controls, Start Date and End Date. I would like the min and max of a particular field to be selected as the default values of the controls. Is there any way to do this? I tried creating a calculated field, max or min({field},[],pre_filter), but later realized that we can't add a calculated field to a parameter. I'm using Standard Edition. Any help or idea is much appreciated.
I encountered a similar question recently and developed a workaround for this by connecting to my Redshift cluster which required 2 things:
A table housing all users for the dashboard in question
A table that houses the metrics I'm setting defaults on
I created a separate dataset for setting default parameters, which contained a complete list of my users along with the min/max values obtained by querying the second table. Something like:
SELECT USER_NAME
     , MIN_METRIC
     , MAX_METRIC
FROM USERS A
CROSS JOIN (SELECT MIN(METRIC_VALUE) MIN_METRIC
                 , MAX(METRIC_VALUE) MAX_METRIC
            FROM METRIC_TABLE) B
Once you've built this new data set, you'd add it to your existing analysis and utilize it for setting default parameters, adding the controls, and setting the filters to key off of them.
The downside to this approach is that it does require an exhaustive user list as any null users would see whatever the non-dynamic defaults are, but with an appropriate user table, this shouldn't be an issue.
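To see what the defaults dataset looks like, here is a small sketch that runs the same CROSS JOIN against an in-memory SQLite database. SQLite is only a stand-in for Redshift here, and the sample users and date values are made up; the table and column names follow the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE USERS (USER_NAME TEXT);
    CREATE TABLE METRIC_TABLE (METRIC_VALUE TEXT);
    INSERT INTO USERS VALUES ('alice'), ('bob');
    -- ISO-formatted dates sort correctly as text, so MIN/MAX work on them.
    INSERT INTO METRIC_TABLE VALUES ('2023-01-05'), ('2023-03-20'), ('2023-02-11');
""")

# Every user row is paired with the single row holding the overall min and max.
defaults = conn.execute("""
    SELECT USER_NAME, MIN_METRIC, MAX_METRIC
    FROM USERS A
    CROSS JOIN (SELECT MIN(METRIC_VALUE) AS MIN_METRIC,
                       MAX(METRIC_VALUE) AS MAX_METRIC
                FROM METRIC_TABLE) B
    ORDER BY USER_NAME
""").fetchall()

for user_name, min_metric, max_metric in defaults:
    print(user_name, min_metric, max_metric)
```

Each user gets one row carrying the same min and max, which is the shape needed for per-user dynamic defaults.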

BIRT Report Designer - split actual and budget data stored in one table into columns and add a variance

I have financial data in the following format in a SQL database and I have to live with this format unfortunately (example dummy data below).
I have however been struggling to get it into the following layout in a BIRT report.
I have tried creating a data cube with Package, Flow and Account as Dimensions and Balance as a Measure, but that groups actual PER and actual YTD next to each other, and budget PER and YTD next to each other, and so on, so it is not quite what I need.
The other idea I had was to create four new calculated columns: the first would only have a value for an actual PER line, the next only for an actual YTD line, and so on. However, I could not get the IF function working in the calculated column.
What are the options? Can someone point me in the direction of how to best create the above layout from this data structure so I can take it from there?
Thanks in advance.
I am not sure what DB you are using in the back end, but this is how I did it with SQL Server.
The important bit happens in the Data Set. Here is the SQL for my Data Set:
SELECT
    Account,
    Package,
    Flow,
    Balance
FROM data
UNION
SELECT DISTINCT
    Account,
    'VARIANCE',
    Flow,
    (SELECT COALESCE(SUM(Balance), 0) FROM data
     WHERE Account = d.Account AND Flow = d.Flow AND Package = 'ACTUAL')
    - (SELECT COALESCE(SUM(Balance), 0) FROM data
       WHERE Account = d.Account AND Flow = d.Flow AND Package = 'BUD') AS Balance
FROM data d
This gives me a table like:
Then I created a DataCube that contained:
Groups/Dimensions: Account, Flow, Package
Summary Fields/Measures: Balance
Then I created a CrossTab Report that was based on that DataCube
And this produces the result of:
Hopefully this helps.
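For anyone who wants to check the variance rows before wiring up the cube, here is a small sketch that runs the Data Set query from this answer against an in-memory SQLite database and then pivots the result in Python the way the crosstab would. SQLite stands in for SQL Server, and the four sample rows are invented; only the table and column names come from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE data (Account TEXT, Package TEXT, Flow TEXT, Balance INTEGER);
    INSERT INTO data VALUES
        ('Sales', 'ACTUAL', 'PER', 100),
        ('Sales', 'BUD',    'PER',  80),
        ('Sales', 'ACTUAL', 'YTD', 300),
        ('Sales', 'BUD',    'YTD', 350);
""")

rows = conn.execute("""
    SELECT Account, Package, Flow, Balance
    FROM data
    UNION
    SELECT DISTINCT
        Account,
        'VARIANCE',
        Flow,
        (SELECT COALESCE(SUM(Balance), 0) FROM data
         WHERE Account = d.Account AND Flow = d.Flow AND Package = 'ACTUAL')
        - (SELECT COALESCE(SUM(Balance), 0) FROM data
           WHERE Account = d.Account AND Flow = d.Flow AND Package = 'BUD') AS Balance
    FROM data d
""").fetchall()

# Pivot to one dict per account, keyed by (Package, Flow), like the crosstab.
crosstab = {}
for account, package, flow, balance in rows:
    crosstab.setdefault(account, {})[(package, flow)] = balance

for account, cells in crosstab.items():
    print(account, dict(sorted(cells.items())))
```

The VARIANCE package ends up as just another value of the Package dimension, so the crosstab needs no special handling for it.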

OBIEE 11g Sort Pivot Prompt

I have created a query that selects user base data from two different weeks, uses an MSUM to work out the difference between the two weeks, and then creates a projection of base size across different verticals based on the net change.
This requires the use of a pivot table with prompts to display just the data from the most recent financial week (in format YYYY-MM), however, every time a new week rolls around, it resets the ordering in the pivot prompt to show the least recent week, which makes the calculations redundant.
I can't re-order the weeks in the base data, as the MSUM calc requires a specific order to be used across multiple dimensions.
Whilst this is very easily fixed by the end user each time by changing the drop down, or by the support team by editing the pivot table and changing the prompt before saving, (which then persists until the next week), it is either going to be a poor customer experience, or extra work for the support group.
Is there a method that I'm missing to create a sort on the pivot prompt options from within the pivot table options?
The equation follows this kind of logic...
"Metrics"."Base Size" + (
(
(
"Metrics"."Base Size" - (
MSUM ("Metrics"."Base Size", 2) - "Metrics"."Base Size"
)
) / [days in time period]
) * 365
)
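To make the arithmetic concrete, here is a small Python sketch of the formula above. It assumes MSUM(x, 2) is the moving sum of the current row and the one before it, so MSUM minus the current value gives the previous week's base size. The function names and sample numbers are made up for illustration.

```python
def msum(values: list, window: int) -> list:
    """Moving sum over the current row and the (window - 1) rows before it."""
    return [
        sum(values[max(0, i - window + 1): i + 1])
        for i in range(len(values))
    ]


def project_base(base_sizes: list, days_in_period: int) -> float:
    """Project the latest base size forward by 365 days of the latest net change."""
    current = base_sizes[-1]
    # MSUM(base, 2) - base leaves only the previous row's value.
    previous = msum(base_sizes, 2)[-1] - current
    daily_change = (current - previous) / days_in_period
    return current + daily_change * 365


# Two weeks of base size, ordered oldest to newest.
weekly_base = [1000, 1070]
projection = project_base(weekly_base, days_in_period=7)
print(projection)  # 1070 + (70 / 7) * 365 = 4720.0
```

The sketch also shows why the row order matters: swap the two weeks and the net change flips sign.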
OBI will order the data as defined by the sort order in the RPD, but ascending is probably the best choice for it at that level.
In your case you could put the Analysis on a dashboard and use a dashboard prompt instead. For that you have the ability, in the options, to change the "Choice List Options" to SQL Results. This should put in a default query, to which you could add an ORDER BY clause. You can also set that to default to the most recent/current period no matter the sort order of the column.
SELECT "Date"."Financial Week"
FROM "My Subject Area"
ORDER BY "Date"."Financial Week" DESC
Instead of using the MSUM() function, you may also be better off using one of the built-in time-series functions, which can get the value of a previous period for you without having to rely on any ordering. Have a look at the Ago() function to get the previous period.

MySQL get rows, but for any matching get the latest version

I'm developing a CMS, and implementing versioning by adding a new entry to the database with the current timestamp.
My table is set up as follows:
id | page | section | timestamp | content
"Page" is the page being accessed, which is either the path to the page ($page_name below), or '/' (to indicate 'global' fields).
"Section" is the section of the page being edited.
I want to be able to select all rows for a given page, but each section should only be selected once, the one with the latest timestamp being selected.
I've tried using the following CodeIgniter Active Record code:
$this->db->select('DISTINCT(section), content');
$this->db->where_in('page', array('/', $page_name));
$this->db->order_by('timestamp', 'desc');
$query = $this->db->get('cms_content');
Which is producing the following SQL:
SELECT DISTINCT(section), `content`
FROM (`cms_content`)
WHERE `page` IN ('/', 'index.html')
AND `enabled` = 1
ORDER BY `timestamp` desc
Which is returning both test rows (rows have all same fields except id, timestamp and content).
Any ideas as to where I'm going wrong?
Thanks!
Your mistake is thinking that DISTINCT applies only to section - an easy mistake to make as the parentheses are misleading here. In fact the DISTINCT applies to the entire row whether or not you have parentheses. It is therefore best to omit the parentheses to avoid confusion.
Your problem is a classic 'max per group' problem. There are many, many ways to write this query and it is probably one of the most popular SQL questions on this site so you can search Stack Overflow to find ways to solve it. One way to get you started is to only select rows which hold the maximum timestamp for that section:
SELECT section, content
FROM cms_content T1
WHERE page IN ('/', 'index.html')
AND enabled = 1
AND timestamp = (
SELECT MAX(timestamp)
FROM cms_content T2
WHERE page IN ('/', 'index.html')
AND enabled = 1
AND T1.section = T2.section
)
I'm sorry but I do not know how to convert this SQL code into CodeIgniter Active Record. If another user more familiar with Active Record wishes to use this as a starting point for their own answer, they are welcome.
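If you want to try this outside the CMS first, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The sample rows are invented. It shows that DISTINCT keeps both versions of a section because their content differs, while the correlated subquery keeps only the newest row per section.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cms_content (
        id INTEGER PRIMARY KEY,
        page TEXT, section TEXT, timestamp INTEGER, content TEXT, enabled INTEGER
    );
    INSERT INTO cms_content (page, section, timestamp, content, enabled) VALUES
        ('index.html', 'intro',  100, 'old intro',   1),
        ('index.html', 'intro',  200, 'new intro',   1),
        ('/',          'footer', 150, 'footer text', 1);
""")

# DISTINCT compares whole rows, so both 'intro' versions survive.
distinct_rows = conn.execute("""
    SELECT DISTINCT section, content
    FROM cms_content
    WHERE page IN ('/', 'index.html') AND enabled = 1
""").fetchall()

# Max per group: keep a row only if it holds the latest timestamp of its section.
latest_rows = conn.execute("""
    SELECT section, content
    FROM cms_content T1
    WHERE page IN ('/', 'index.html')
      AND enabled = 1
      AND timestamp = (
          SELECT MAX(timestamp)
          FROM cms_content T2
          WHERE page IN ('/', 'index.html')
            AND enabled = 1
            AND T1.section = T2.section
      )
    ORDER BY section
""").fetchall()

print(len(distinct_rows), latest_rows)
```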
DISTINCT is for all columns selected, and because "content" differs you will get two different rows.
You only want to order by timestamp and limit 1 because you always want the latest.
But may I suggest that you keep a cross reference to the "active" page? That way, you are able to revert to a previous revision without dumping the new ones.
Meaning:
page
----
id
info
active_page_id
page_revisions
--------------
id
page_id
content
timestamp
...
Meaning, you have a one-to-many relationship between page and page_revisions, as well as a one-to-one relationship between page and page_revisions to keep track of the "current" revision. With this approach you are able to just join in the active revision.
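Here is a minimal sketch of that layout, again using Python's sqlite3 module rather than MySQL. The column types, sample rows and the active_content helper are assumptions for illustration. Reverting to an older revision is a single update of active_page_id, and no revision is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (
        id INTEGER PRIMARY KEY,
        info TEXT,
        active_page_id INTEGER          -- points at the current revision
    );
    CREATE TABLE page_revisions (
        id INTEGER PRIMARY KEY,
        page_id INTEGER REFERENCES page(id),
        content TEXT,
        timestamp INTEGER
    );
    INSERT INTO page (id, info) VALUES (1, 'index.html');
    INSERT INTO page_revisions (id, page_id, content, timestamp) VALUES
        (1, 1, 'first draft',  100),
        (2, 1, 'second draft', 200);
    UPDATE page SET active_page_id = 2 WHERE id = 1;
""")


def active_content(connection, page_id):
    """Join the page to its active revision and return that revision's content."""
    row = connection.execute(
        """
        SELECT r.content
        FROM page AS p
        JOIN page_revisions AS r ON r.id = p.active_page_id
        WHERE p.id = ?
        """,
        (page_id,),
    ).fetchone()
    return row[0] if row else None


current = active_content(conn, 1)

# Revert to the first revision without touching the newer one.
conn.execute("UPDATE page SET active_page_id = 1 WHERE id = 1")
reverted = active_content(conn, 1)
revision_count = conn.execute("SELECT COUNT(*) FROM page_revisions").fetchone()[0]

print(current, reverted, revision_count)
```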
This will do the job in Codeigniter, without temporary tables:
$this->db->query( "SELECT c1.*
FROM cms_content AS c1
LEFT JOIN cms_content AS c2
ON c1.page=c2.page
AND c1.section=c2.section
AND c1.timestamp < c2.timestamp
WHERE c2.timestamp IS NULL AND c1.page=?", $page );
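The self-join technique can be checked in isolation with a small sqlite3 sketch (SQLite standing in for MySQL, with invented sample rows). Each row is joined to any newer row of the same page and section; rows with no newer partner are the latest ones. The sketch qualifies page as c1.page because both aliases expose that column, and it selects only c1's columns so the NULLs from c2 do not end up in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cms_content (
        id INTEGER PRIMARY KEY,
        page TEXT, section TEXT, timestamp INTEGER, content TEXT
    );
    INSERT INTO cms_content (page, section, timestamp, content) VALUES
        ('index.html', 'intro', 100, 'old intro'),
        ('index.html', 'intro', 200, 'new intro'),
        ('index.html', 'body',  120, 'body v1'),
        ('about.html', 'intro', 300, 'about intro');
""")

# Anti-join: a row is kept only when no newer row exists for its page and section.
latest = conn.execute(
    """
    SELECT c1.section, c1.content
    FROM cms_content AS c1
    LEFT JOIN cms_content AS c2
      ON c1.page = c2.page
     AND c1.section = c2.section
     AND c1.timestamp < c2.timestamp
    WHERE c2.timestamp IS NULL
      AND c1.page = ?
    ORDER BY c1.section
    """,
    ("index.html",),
).fetchall()

print(latest)
```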
