AWS Quicksight aggregate data - amazon-quicksight

I have a dataset like this:
Order
id    expected date
1     11-04-2022
2     10-04-2022
2     14-04-2022
Order Event
Id    Order Id    Order status    Date
1     1           created         01-04-2022
2     1           completed       12-04-2022
3     2           created         01-04-2022
4     2           in progress     07-04-2022
5     2           completed       10-04-2022
6     3           created         10-04-2022
I need to create a graph that shows, for all orders with completed status, the difference between the expected date and the actual order date.
How can I achieve that?

First, you have to join the two tables into one dataset, because QuickSight can only work with multiple data files if they are merged. You can apply an inner join on the order ID.
Then, you can calculate the difference between the expected date and the event date, and add an if-statement to handle the orders that are not completed yet. You do this by adding a calculated field to your dataset with the following code:
ifelse(
    {Order_status}="completed",
    dateDiff({expected_date},{Date},"DD"),
    0
)
You can also modify this field. Here, I wrote "DD" to get the date difference in days; you can also select hours and other units. If the order is not completed, I chose 0 as the default value. To find out more about the functions used in this calculated field, see these AWS docs pages:
If-Else Command
Date-Diff Command
Now that the calculated field is created, you can plot it together with the order ID.
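To check what the join and the calculated field should produce, here is a minimal Python sketch of the same logic on the sample data from the question. It is not QuickSight code: the function names are made up for illustration, and it assumes dateDiff counts from the first date to the second. Note that the Order table in the question lists id 2 twice, so the inner join yields two rows for that order.

```python
from datetime import datetime

# Sample data copied from the question: (id, expected date)
ORDERS = [
    (1, "11-04-2022"),
    (2, "10-04-2022"),
    (2, "14-04-2022"),
]
# (id, order id, order status, date)
EVENTS = [
    (1, 1, "created", "01-04-2022"),
    (2, 1, "completed", "12-04-2022"),
    (3, 2, "created", "01-04-2022"),
    (4, 2, "in progress", "07-04-2022"),
    (5, 2, "completed", "10-04-2022"),
    (6, 3, "created", "10-04-2022"),
]


def parse(day):
    """Parse a DD-MM-YYYY string into a date."""
    return datetime.strptime(day, "%d-%m-%Y").date()


def days_after_expected(status, expected, actual):
    """Mirror of the calculated field: day difference if completed, else 0."""
    if status == "completed":
        return (parse(actual) - parse(expected)).days
    return 0


def joined_rows():
    """Inner join of orders and events on the order id."""
    return [
        (order_id, status, days_after_expected(status, expected, date))
        for (order_id, expected) in ORDERS
        for (_event_id, event_order_id, status, date) in EVENTS
        if event_order_id == order_id
    ]


# Keep only the completed orders, which is what the graph should show.
completed = [(oid, diff) for oid, status, diff in joined_rows() if status == "completed"]
print(completed)
```

A positive value means the order was completed after the expected date, a negative value means it was completed early.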
BR mylosf

Related

I would like to create an efficient Bigtable row key

I would like to create an optimal row key in Bigtable. I have a table channel_data with 3 columns: channel_id,date,fan_count.
channel_id    date          fan_count
1             2022-03-01    5000
1             2022-03-02    6000
2             2022-03-01    200
2             2022-03-02    300
3             2022-03-03    1000
Users of our application can set up brands/buckets by adding multiple channels. Users can choose any random channel_id.
I want to design an efficient row key to fetch aggregated fan_count in a date range for a brand.
Let's say the user creates a brand with channel_id 1 and 3 and wishes to see the sum of all fans for the time period 2022-03-01 to 2022-03-03.
The result should be 5000+6000+1000=12000
You have a few options here. Because you're looking to do queries based on date, you should probably make the date the end part of your row key so you can scope down by channel first. You could also use timestamped cells to store multiple values for each channel in one row, for example a week or month of data, so that it is grouped together, but this isn't necessary.
Perhaps a row key like channel_id/yyyy-mm-dd is what you'd want. You can choose to store the date and channel info in the table as well, but it isn't necessary since you'd already have it in your row keys. You can just treat Bigtable like a key/value store in this instance, which might be more optimal depending on your scenario.
If you choose to store a month of data per row, you would just make the rowkey something like channel_id/yyyy-mm and just timestamp each value for the day.
Either way for your queries, if you need multiple channels, then you could just do multiple reads or a multi-prefix scan. Let me know if this helps clarify the schema design and if you have more questions.
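As a rough illustration of the channel_id/yyyy-mm-dd idea, here is a small Python sketch that models the table as a sorted list of row keys and does one range read per channel. It does not use the Bigtable client; the dictionary, `row_key`, and `sum_fans` are stand-ins invented for this example, and the data is the sample from the question.

```python
from bisect import bisect_left, bisect_right


def row_key(channel_id, day):
    """Build a row key of the form channel_id/yyyy-mm-dd."""
    return f"{channel_id}/{day}"


# In-memory stand-in for the table: row key -> fan_count.
ROWS = {
    row_key(1, "2022-03-01"): 5000,
    row_key(1, "2022-03-02"): 6000,
    row_key(2, "2022-03-01"): 200,
    row_key(2, "2022-03-02"): 300,
    row_key(3, "2022-03-03"): 1000,
}
# Bigtable keeps rows sorted by key, which is what makes range reads cheap.
SORTED_KEYS = sorted(ROWS)


def sum_fans(channel_ids, start_day, end_day):
    """Sum fan_count over one key range per channel (both days inclusive)."""
    total = 0
    for channel_id in channel_ids:
        lo = bisect_left(SORTED_KEYS, row_key(channel_id, start_day))
        hi = bisect_right(SORTED_KEYS, row_key(channel_id, end_day))
        for key in SORTED_KEYS[lo:hi]:
            total += ROWS[key]
    return total


print(sum_fans([1, 3], "2022-03-01", "2022-03-03"))
```

Because the date is the last part of the key and is written as yyyy-mm-dd, the string order of the keys matches the date order within each channel, so each channel needs only one contiguous range.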

GroupBy doesn't show all the data

I have a list of sellers, each of whom pays a service charge. I want to show the service charges grouped by year, with the groups in descending order of year. For example, if I entered data for 2023, 2020, and 2021, the grouped data should show 2023 first, then 2021, then 2020. First I tried with
$infos = Commision::all()->groupBy('country');
If I order first, it shows an error. I have a previous question here. Then I tried with
$infos = DB::table('service_charges')->groupBy('year')->orderBy('year','DESC')->get();
dd($infos);
But it shows only one row for each group. I have 3 rows saved here: 2 are from 2021 and 1 is from 2020. But the query returns only one row from each group.
The GROUP BY SQL statement is used to aggregate rows that have the same value for a given column. That means you can only get a single aggregated result for each distinct value in that column (i.e. one value per year in your case).
The collection groupBy method puts all items that share the same value into their own collection. The difference is that groupBy on the collection runs after the query, so you can no longer use the query's orderBy at that point. Here are two ways you might be able to solve your issue:
Order in the query, then group the resulting collection:
$infos = DB::table('service_charges')->orderBy('year','DESC')->get()->groupBy('year');
Group the resulting collection, then sort the groups by key:
$infos = Commision::all()->groupBy('country')->sortKeysDesc();
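The "order first, then group" idea is independent of Laravel. Here is a minimal Python sketch of it, using three made-up rows that match the situation in the question (two from 2021, one from 2020); the field names and amounts are assumptions for illustration only.

```python
from itertools import groupby

# Hypothetical rows: two charges from 2021 and one from 2020.
rows = [
    {"year": 2021, "charge": 100},
    {"year": 2020, "charge": 50},
    {"year": 2021, "charge": 75},
]


def group_by_year_desc(rows):
    """Sort by year descending, then collect every row of each year."""
    ordered = sorted(rows, key=lambda r: r["year"], reverse=True)
    # groupby only merges adjacent items, so the sort has to come first.
    return {
        year: list(items)
        for year, items in groupby(ordered, key=lambda r: r["year"])
    }


grouped = group_by_year_desc(rows)
for year, items in grouped.items():
    print(year, [item["charge"] for item in items])
```

Unlike SQL GROUP BY, this keeps every row inside its group instead of collapsing each group to one result.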

Cognos 11 Crosstab - need a value that doesn't have a reference to the column values

Crosstab report works 99%.
About 20 rows, all but one are ok.
5 columns - Company Division.
The rows are things like cost, revenue, revenue 2, etc.
All the rows that work have three attributes I'm using to select them:
Fiscal Year
Period
Solution.
The problem is that there is a table that lists a YTD rate for each period. This table is not division-specific; it's company-wide.
All the tables are linked to the accounting period table that has fiscal year and period. So the overall query limits data to fiscal year (?pFiscalYear?) and period <= ?pPeriod?, based on prompt page results.
The source table has this:
FY_CD PD_NO ACT_CURR_RT ACT_YTD_RT
2018 1 0.36121715 0.36121715
2018 2 0.32471476 0.34255512
2018 3 0.25240906 0.31210183
2018 4 0.33154745 0.31925874
Note the YTD rate is not an average of any of the other numbers.
When I select the ACT_YTD_RT, as a row, I want the ACT_YTD_RT that matches the selected period.
What I get is the average if I set the aggregation to average, or the lowest if I set it to other aggregations. So sometimes it looks right (if I run for period 1, 2, 3, as the rate kept falling), and sometimes it's wrong (period 4 returns .3121 instead of .3192).
I've tried a number of different methods and can generate garbage data (totals, min, max, average) and crossjoins but can't figure out how to get the value I'm looking for.
I want YTD_RT where fiscal year =?pFiscal? and period = ?pPeriod?.
I tried a straight if then clause:
if (sourcetable.fiscalYear = ?pFiscalYear?) and (sourcetable.Period = ?pPeriod?) then (ACT_YTD_RT)
but I get an error like this:
'ACT_YTD_RT' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. (SQLSTATE=42000, SQLERRORCODE=8120)
If I create another query that generates the right response and try to include it, I get a crossjoin error that the query I'm referencing is trying to crossjoin several other items in the crosstab query.
A union doesn't work (different number of columns).
Not sure how a join would work since the division doesn't exist in the rate table.
I maybe could create a view in the database that did a crossjoin of the division table and the rate table, add that to the framework and then I wouldn't have a crossjoin since the solution would be in the rate "table" (really view), but that seems wrong somehow.
If I could just write a freaking parameterized query direct to the database I'd be done. But in Cognos 11 crosstabs I can't find a place for a SQL query object. And that shouldn't be necessary.
I've spent hours and hours chasing this in circles.
Anybody have any ideas?
Thanks
Paul
So the earlier problem was that this:
if (sourcetable.fiscalYear = ?pFiscalYear?) and (sourcetable.Period = ?pPeriod?) then (ACT_YTD_RT)
Generated an error like this:
'ACT_YTD_RT' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause. (SQLSTATE=42000, SQLERRORCODE=8120)
To fix the above, I had to add a cross join of the division table and the rate table as a view in the database, then add that view to the framework, and then build the data item this way:
total (
    if (sourcetable.fiscalYear = ?pFiscalYear?) and (sourcetable.Period = ?pPeriod?) then (ACT_YTD_RT)
)
Now the "total" supplies the aggregate function that the error was asking for, and the cross join in the database provides the division information, so the crosstab is happy.
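To show why this works, here is a rough Python sketch of the same idea: cross join the rates with the divisions, then total a value that is only non-zero for the selected period. It is not Cognos code. The division names are hypothetical, and rows that fail the condition contribute 0.0 here, whereas in the report expression they would simply have no value.

```python
# (FY_CD, PD_NO, ACT_YTD_RT) taken from the source table in the question.
RATES = [
    (2018, 1, 0.36121715),
    (2018, 2, 0.34255512),
    (2018, 3, 0.31210183),
    (2018, 4, 0.31925874),
]
DIVISIONS = ["Division A", "Division B"]  # hypothetical division names


def ytd_rate_by_division(fiscal_year, period):
    """Total the YTD rate per division, counting only the selected period."""
    # Cross join: every division gets a copy of every rate row.
    view = [(d, fy, pd, rate) for d in DIVISIONS for (fy, pd, rate) in RATES]
    totals = {}
    for division, fy, pd, rate in view:
        # Report-level filter: fiscal year matches and period <= prompt value.
        if fy == fiscal_year and pd <= period:
            # Conditional inside the total: other periods add nothing.
            contribution = rate if pd == period else 0.0
            totals[division] = totals.get(division, 0.0) + contribution
    return totals


print(ytd_rate_by_division(2018, 4))
```

Each division ends up with the single rate for the selected period, which is the value the crosstab row needs.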
I still think there should have been an easier way to do this, but I have a functioning hammer at the moment.

MAX() SQL Equivalent on Redis

I'm new to Redis, and I'm trying to improve my stats application. The current SQL that generates the statistic is:
SELECT MIN(created_at), MAX(created_at) FROM table ORDER BY id DESC limit 10000
It returns the MIN and MAX values of the created_at field.
I have read about ranges and scores in Redis, and it seems they can be used to solve this problem. But I am still confused about scoring for the last 10000 records. Can they be used to solve this problem, or is there another way to solve it using Redis?
Regards
Your target appears to be somewhat unclear - are you looking to store all the records in Redis? If so, what other columns does the table table have and what other queries do you run against it?
I'll take your question at face value, but note that in most NoSQL databases (Redis included) you need to store your data according to how you plan on fetching it. Assuming that you want to get the min/max creation dates of the last 10K records, I suggest that you keep them in a Sorted Set. The Sorted Set's members will be the unique id and their scores will be the creation date (use the epoch value), for example, rows with ids 1, 2 & 3 were created at dates 10, 100 & 1000 respectively:
ZADD table 10 1 100 2 1000 3 ...
Getting the minimal creation date is easy now - just do ZRANGE table 0 0 WITHSCORES - and the max is just a ZRANGE table -1 -1 WITHSCORES away. The only "tricky" part is making sure that the Sorted Set is kept updated, so for every new record you'll need to remove the lowest id from the set and add the new one. In pseudo Python code this would look something like the following:
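import redis  # assumes the redis-py client
r = redis.Redis()

def update_min_max_sorted_set(record_id, created_at):
    # Member is the record id; score is its creation date (epoch value).
    r.zadd('table', {record_id: created_at})
    # Past 10,000 members, drop the oldest id (assumes sequential integer ids).
    if r.zcard('table') > 10000:
        r.zrem('table', record_id - 10000)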
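Reading the min and max back could look something like the sketch below. It assumes a redis-py style client whose zrange accepts withscores=True and returns (member, score) pairs; the function name and the default key are made up for this example, and the client is passed in so the function does not depend on a running server.

```python
def min_max_created_at(client, key="table"):
    """Return (min, max) creation scores of the sorted set, or None if empty.

    `client` is assumed to be a redis-py style client, for example
    redis.Redis(), whose zrange supports withscores=True.
    """
    # Index 0 is the member with the lowest score, -1 the highest.
    lowest = client.zrange(key, 0, 0, withscores=True)
    highest = client.zrange(key, -1, -1, withscores=True)
    if not lowest:
        return None
    return lowest[0][1], highest[0][1]
```

Both reads are by rank, so they stay cheap no matter how the ids are distributed inside the set.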

Multiple group by for dataset query

I am currently working on report generation using the BIRT tool. Consider the table below:
TaskId Status SLAMiss
----------------------------------------------------------
1 Completed Yes
2 In Progress No
3 Completed No
I need to create a table that shows the counts of Completed and In Progress tasks along with the counts of SLA-adherent and SLA-missed tasks, like below:
Tasks Completed Tasks InProgress SLA Adherence SLA Miss
---------------------------------------------------------------------------
2 1 2 1
Now I need to create the dataset using a SQL query. For the first two columns I have to group by 'Status', and for the last two columns I have to group by 'SLA Miss'. So:
1. Is it possible to achieve this using a single dataset?
2. If yes, what will be the SQL query for the dataset?
3. If not, I can create 4 datasets, one for each column, and apply them to the table.
Will that be a good idea?
Thanks in advance.
The easiest way to do this is to use computed columns. You would use some JavaScript like the following in a new column named "CompletedCount" of type Integer. Then, when you build your report, you sum the values with an "Aggregation" item from the palette.
// Computed column "CompletedCount": 1 for completed tasks, 0 otherwise.
if (row["Status"] == "Completed") {
    1;
} else {
    0;
}
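The same "flag each row, then sum the flags" approach can be checked quickly outside BIRT. Below is a minimal Python sketch over the three sample tasks from the question; the function name is made up, and it assumes "SLA Adherence" means SLAMiss is "No".

```python
# (TaskId, Status, SLAMiss) copied from the question's sample table.
TASKS = [
    (1, "Completed", "Yes"),
    (2, "In Progress", "No"),
    (3, "Completed", "No"),
]


def summarize(tasks):
    """Count each category by summing a 1/0 flag per row, in a single pass."""
    summary = {
        "Tasks Completed": 0,
        "Tasks InProgress": 0,
        "SLA Adherence": 0,
        "SLA Miss": 0,
    }
    for _task_id, status, sla_miss in tasks:
        # Each flag plays the role of one computed column.
        summary["Tasks Completed"] += 1 if status == "Completed" else 0
        summary["Tasks InProgress"] += 1 if status == "In Progress" else 0
        summary["SLA Adherence"] += 1 if sla_miss == "No" else 0
        summary["SLA Miss"] += 1 if sla_miss == "Yes" else 0
    return summary


print(summarize(TASKS))
```

Since every count comes from the same pass over the rows, a single dataset is enough and no GROUP BY is needed.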

Resources