change table details group when text box is clicked - sorting

Is there a way to dynamically change the table details grouping when a text box is clicked?
For example, I have a "Date" labelled textbox. When I clicked the Date textbox, the table will group by Date
Original Layout
| Column1 | Column2 | Column3 | Column4
-Primary Colors
| Blue | Data | Data | Data
| Red | Data | Data | Data
| Yellow | Data | Data | Data
-Secondary Colors
| Green | Data | Data | Data
| Orange | Data | Data | Data
| Violet | Data | Data | Data
After I click the interactive sort of Column1, the groupings will change to this:
| Column1 | Column2 | Column3 | Column4
-Colors
| Blue | Data | Data | Data
| Green | Data | Data | Data
| Orange | Data | Data | Data
| Red | Data | Data | Data
| Violet | Data | Data | Data
| Yellow | Data | Data | Data

Related

query hive json serde table having nested ARRAY & STRUCT combination

Trying to query a json hive table built on top of json data. Using json2Hive was able to generate DDL and was able to create table after removing unnecessary fields.
create external table user_tables.sample_json_table (
`apps` struct<
`app`: array<struct<
`id`: string,
`queue`: string,
`finalstatus`: string,
`trackingurl`: string,
`applicationtype`: string,
`applicationtags`: string,
`startedtime`: string,
`launchtime`: string,
`finishedtime`: string,
`memoryseconds`: string,
`vcoreseconds`: string,
`resourcesecondsmap`: struct<
`entry`: struct<
`key`: string,
`value`: string
>
>
>
>
>
)
row format serde 'org.apache.hadoop.hive.serde2.JsonSerDe'
location '/xyz/location/;
Now, stuck trying to figure out how to query each field from the below schema ?
checked several articles but all of them are case specific, and need a generic explanation or example how to query each field under array/struct :)
I only care about the multiple 'app' subsection entries and would like them to be imported onto another table with separate fields for each fields.
Sample json data:
{"apps":{"app":[{"id":"application_282828282828_12717","user":"xyz","name":"xyz-4b6bdae2-1a0c-4772-bd8e-0d7454268b82","queue":"root.users.dummy","state":"finished","finalstatus":"succeeded","progress":100.0,"trackingui":"history","trackingurl":"http://dang:8088/proxy/application_282828282828_12717/","diagnostics":"session stats:submitteddags=1, successfuldags=1, faileddags=0, killeddags=0\n","clusterid":282828282828,"applicationtype":"aquaman","applicationtags":"ABC,xyz_20221107070124_2beb5d90-24c7-4b1b-b977-3c9af1397195,userid=dummy","priority":0,"startedtime":1667822485626,"launchtime":1667822485767,"finishedtime":1667822553365,"elapsedtime":67739,"amcontainerlogs":"http://dingdong:8042/node/containerlogs/container_e65_282828282828_12717_01_000001/xyz","amhosthttpaddress":"dingdong:8042","amrpcaddress":"dingdong:46457","masternodeid":"dingdong:8041","allocatedmb":-1,"allocatedvcores":-1,"reservedmb":-1,"reservedvcores":-1,"runningcontainers":-1,"memoryseconds":1264304,"vcoreseconds":79,"queueusagepercentage":0.0,"clusterusagepercentage":0.0,"resourcesecondsmap":{"entry":{"key":"memory-mb","value":"1264304"},"entry":{"key":"vcores","value":"79"}},"preemptedresourcemb":0,"preemptedresourcevcores":0,"numnonamcontainerpreempted":0,"numamcontainerpreempted":0,"preemptedmemoryseconds":0,"preemptedvcoreseconds":0,"preemptedresourcesecondsmap":{},"logaggregationstatus":"succeeded","unmanagedapplication":false,"amnodelabelexpression":"","timeouts":{"timeout":[{"type":"lifetime","expirytime":"unlimited","remainingtimeinseconds":-1}]}},{"id":"application_282828282828_12724","user":"xyz","name":"xyz-94962a3e-d230-4fd0-b68b-01b59dd3299d","queue":"root.users.dummy","state":"finished","finalstatus":"succeeded","progress":100.0,"trackingui":"history","trackingurl":"http://dang:8088/proxy/application_282828282828_12724/","diagnostics":"session stats:submitteddags=1, successfuldags=1, faileddags=0, killeddags=0\n","clusterid":282828282828,"applicationtype":"aquaman","applicationtags":"ZZZ_,xyz_20221107070301_e6f788db-e39c-49b6-97d5-6a02ff994c00,userid=dummy","priority":0,"startedtime":1667822585231,"launchtime":1667822585437,"finishedtime":1667822631435,"elapsedtime":46204,"amcontainerlogs":"http://ding:8042/node/containerlogs/container_e65_282828282828_12724_01_000002/xyz","amhosthttpaddress":"ding:8042","amrpcaddress":"ding:46648","masternodeid":"ding:8041","allocatedmb":-1,"allocatedvcores":-1,"reservedmb":-1,"reservedvcores":-1,"runningcontainers":-1,"memoryseconds":5603339,"vcoreseconds":430,"queueusagepercentage":0.0,"clusterusagepercentage":0.0,"resourcesecondsmap":{"entry":{"key":"memory-mb","value":"5603339"},"entry":{"key":"vcores","value":"430"}},"preemptedresourcemb":0,"preemptedresourcevcores":0,"numnonamcontainerpreempted":0,"numamcontainerpreempted":0,"preemptedmemoryseconds":0,"preemptedvcoreseconds":0,"preemptedresourcesecondsmap":{},"logaggregationstatus":"time_out","unmanagedapplication":false,"amnodelabelexpression":"","timeouts":{"timeout":[{"type":"lifetime","expirytime":"unlimited","remainingtimeinseconds":-1}]}},{"id":"application_282828282828_12736","user":"xyz","name":"xyz-1a9c73ef-2992-40a5-aaad-9f0688bb04f4","queue":"root.users.dummy","state":"finished","finalstatus":"succeeded","progress":100.0,"trackingui":"history","trackingurl":"http://dang:8088/proxy/application_282828282828_12736/","diagnostics":"session stats:submitteddags=1, successfuldags=1, faileddags=0, killeddags=0\n","clusterid":282828282828,"applicationtype":"aquaman","applicationtags":"BLAHBLAH,xyz_20221107070609_8d261352-3efa-46c5-a5a0-8a3cd745d180,userid=dummy","priority":0,"startedtime":1667822771170,"launchtime":1667822773663,"finishedtime":1667822820351,"elapsedtime":49181,"amcontainerlogs":"http://dong:8042/node/containerlogs/container_e65_282828282828_12736_01_000001/xyz","amhosthttpaddress":"dong:8042","amrpcaddress":"dong:34266","masternodeid":"dong:8041","allocatedmb":-1,"allocatedvcores":-1,"reservedmb":-1,"reservedvcores":-1,"runningcontainers":-1,"memoryseconds":1300011,"vcoreseconds":89,"queueusagepercentage":0.0,"clusterusagepercentage":0.0,"resourcesecondsmap":{"entry":{"key":"memory-mb","value":"1300011"},"entry":{"key":"vcores","value":"89"}},"preemptedresourcemb":0,"preemptedresourcevcores":0,"numnonamcontainerpreempted":0,"numamcontainerpreempted":0,"preemptedmemoryseconds":0,"preemptedvcoreseconds":0,"preemptedresourcesecondsmap":{},"logaggregationstatus":"succeeded","unmanagedapplication":false,"amnodelabelexpression":"","timeouts":{"timeout":[{"type":"lifetime","expirytime":"unlimited","remainingtimeinseconds":-1}]}},{"id":"application_282828282828_12735","user":"xyz","name":"xyz-d5f56a0a-9c6b-4651-8f88-6eaff5953777","queue":"root.users.dummy","state":"finished","finalstatus":"succeeded","progress":100.0,"trackingui":"history","trackingurl":"http://dang:8088/proxy/application_282828282828_12735/","diagnostics":"session stats:submitteddags=1, successfuldags=1, faileddags=0, killeddags=0\n","clusterid":282828282828,"applicationtype":"aquaman","applicationtags":"HAHAHA_,xyz_20221107070605_a082d9d8-912f-4278-a2ef-5dfe66089fd7,userid=dummy","priority":0,"startedtime":1667822766897,"launchtime":1667822766999,"finishedtime":1667822796759,"elapsedtime":29862,"amcontainerlogs":"http://dung:8042/node/containerlogs/container_e65_282828282828_12735_01_000001/xyz","amhosthttpaddress":"dung:8042","amrpcaddress":"dung:42765","masternodeid":"dung:8041","allocatedmb":-1,"allocatedvcores":-1,"reservedmb":-1,"reservedvcores":-1,"runningcontainers":-1,"memoryseconds":669695,"vcoreseconds":44,"queueusagepercentage":0.0,"clusterusagepercentage":0.0,"resourcesecondsmap":{"entry":{"key":"memory-mb","value":"669695"},"entry":{"key":"vcores","value":"44"}},"preemptedresourcemb":0,"preemptedresourcevcores":0,"numnonamcontainerpreempted":0,"numamcontainerpreempted":0,"preemptedmemoryseconds":0,"preemptedvcoreseconds":0,"preemptedresourcesecondsmap":{},"logaggregationstatus":"succeeded","unmanagedapplication":false,"amnodelabelexpression":"","timeouts":{"timeout":[{"type":"lifetime","expirytime":"unlimited","remainingtimeinseconds":-1}]}}]}}
sample query output :
id | queue | finalStatus | trackingurl |....
-----------------------------------------------------------
application_282828282828_12717 | root.users.dummy | succeeded | ...
application_282828282828_12724 | root.users.dummy2 | failed | ....
For anyone looking to perform something similar ,I found this article very helpful with clear explanation: https://community.cloudera.com/t5/Support-Questions/Complex-Json-transformation-using-Hive-functions/m-p/236476
Below is the query to parse using LATERAL VIEW EXPLODE in case people on the same boat:
select ex1.* from user_tables.sample_json_table cym LATERAL VIEW OUTER inline(cym.apps.app) ex1;
| id | queue | finalstatus | trackingurl | applicationtype | applicationtags | startedtime | launchtime | finishedtime | memoryseconds | vcoreseconds | resourcesecondsmap |
| ------------------------------- | ----------------- | ----------- | ------------------------------------------------------- | --------------- | --------------------------------------------------------------------------------------- | ------------- | ------------- | ------------- | ------------- | ------------ | ---------------------------------------- |
| application_1667627410794_12717 | root.users.dummy2 | succeeded | http://dang:8088/proxy/application_1667627410794_12717/ | tez | \_xyz,test-app-24c7-4b1b-b977-3c9af1397195,userid=dummy1 | 1667822485626 | 1667822485767 | 1667822553365 | 1264304 | 79 | {"entry":{"key":"vcores","value":"79"}} |
| application_1667627410794_12724 | root.users.dummy3 | succeeded | http://dang:8088/proxy/application_1667627410794_12724/ | tez | \_generate_stuff,hive_20221107070301_e6f788db-e39c-49b6-97d5-6a02ff994c00,userid=dummy3 | 1667822585231 | 1667822585437 | 1667822631435 | 5603339 | 430 | {"entry":{"key":"vcores","value":"430"}} |
| application_1667627410794_12736 | root.users.dummy1 | succeeded | http://dang:8088/proxy/application_1667627410794_12736/ | tez | \_sample_job,test-zzz-3efa-46c5-a5a0-8a3cd745d180,userid=dummy1 | 1667822771170 | 1667822773663 | 1667822820351 | 1300011 | 89 | {"entry":{"key":"vcores","value":"89"}} |
| application_1667627410794_12735 | root.users.dummy2 | succeeded | http://dang:8088/proxy/application_1667627410794_12735/ | tez | \_mixed_article,placebo_2-912f-4278-a2ef-5dfe66089fd7,userid=dummy2 | 1667822766897 | 1667822766999 | 1667822796759 | 669695 | 44 | {"entry":{"key":"vcores","value":"44"}} |
Add. Note: Although my requirement no longer needs it, but If anyone can suggest how to further parse the last field resourcesecondsmap to populate map key value would be great to know! basically use key value as field and value as actual value in field:
Desired Output:
| id | queue | finalstatus | trackingurl | applicationtype | applicationtags | startedtime | launchtime | finishedtime | memoryseconds | vcoreseconds | vcores-value |
| ------------------------------- | ----------------- | ----------- | ------------------------------------------------------- | --------------- | --------------------------------------------------------------------------------------- | ------------- | ------------- | ------------- | ------------- | ------------ | ------------ |
| application_1667627410794_12717 | root.users.dummy2 | succeeded | http://dang:8088/proxy/application_1667627410794_12717/ | tez | \_xyz,test-app-24c7-4b1b-b977-3c9af1397195,userid=dummy1 | 1667822485626 | 1667822485767 | 1667822553365 | 1264304 | 79 | 79 |
| application_1667627410794_12724 | root.users.dummy3 | succeeded | http://dang:8088/proxy/application_1667627410794_12724/ | tez | \_generate_stuff,hive_20221107070301_e6f788db-e39c-49b6-97d5-6a02ff994c00,userid=dummy3 | 1667822585231 | 1667822585437 | 1667822631435 | 5603339 | 430 | 430 |

Power Query Transform/Lookup Data From Rows To Columns

I have a set of data ("My Data") shown below, how to shift the data from rows to columns in Power Query?
("My Preferred Answer") would be the final output.
My Data:
| FruitName | Price | Quantity |
| --------- | ----- | -------- |
| Apple | 1 | 1 |
| Banana | 2 | 1 |
| Orange | 3 | 1 |
| Colour | *null* | *null* |
| Apple | Red | *null* |
| Banana | Yellow | *null* |
| Orange | Orange | *null* |
My Preferred Answer:
| FruitName | Price | Quantity | Colour |
| --------- | ----- | -------- | ------ |
| Apple | 1 | 1 | Red |
| Banana | 2 | 1 | Yellow |
| Orange | 3 | 1 | Orange |
Read the code comments to better understand the algorithm.
Split the table at the "colour" line
Join the colours with the inventory table
M Code
let
//Change Table name in next line to the actual table name in your workbook
Source = Excel.CurrentWorkbook(){[Name="Table28"]}[Content],
//Separate table for Color and Inventory
splitTable = Table.SplitAt(Source, List.PositionOf(Source[FruitName],"Colour")),
//Remove the row with the table split word "colour"
//Remove the Quantity column
//Rename the Price column
//Set the data types
colourTable = Table.TransformColumnTypes(
Table.RenameColumns(
Table.RemoveColumns(
Table.RemoveFirstN(splitTable{1},1),"Quantity"),{"Price","Colour"}),
{{"FruitName",type text},{"Colour",type text}}),
//Set the data types
inventoryTable = Table.TransformColumnTypes(splitTable{0},{
{"FruitName", type text},
{"Price",Currency.Type},
{"Quantity", Int64.Type}
}),
//Join the colour column
joined = Table.NestedJoin(inventoryTable,"FruitName",colourTable,"FruitName","Joined",JoinKind.LeftOuter),
#"Expanded Joined" = Table.ExpandTableColumn(joined, "Joined", {"Colour"}, {"Colour"})
in
#"Expanded Joined"

sqlite: wide v. long performance

I'm considering whether I should format a table in my sqlite database in "wide or "long" format. Examples of these formats are included at the end of the question.
I anticipate that the majority of my requests will be of the form:
SELECT * FROM table
WHERE
series in (series1, series100);
or the analog for selecting by columns in wide format.
I also anticipate that there will be a large number of columns, even enough to need to increase the column limit.
Are there any general guidelines for selecting a table layout that will optimize query performance for this sort of case?
(Examples of each)
"Wide" format:
| date | series1 | series2 | ... | seriesN |
| ---------- | ------- | ------- | ---- | ------- |
| "1/1/1900" | 15 | 24 | 43 | 23 |
| "1/2/1900" | 15 | null | null | 23 |
| ... | 15 | null | null | 23 |
| "1/2/2019" | 12 | 12 | 4 | null |
"Long" format:
| date | series | value |
| ---------- | ------- | ----- |
| "1/1/1900" | series1 | 15 |
| "1/2/1900" | series1 | 15 |
| ... | series1 | 43 |
| "1/2/2019" | series1 | 12 |
| "1/1/1900" | series2 | 15 |
| "1/2/1900" | series2 | 15 |
| ... | series2 | 43 |
| "1/2/2019" | series2 | 12 |
| ... | ... | ... |
| "1/1/1900" | seriesN | 15 |
| "1/2/1900" | seriesN | 15 |
| ... | seriesN | 43 |
| "1/2/2019" | seriesN | 12 |
The "long" format is the preferred way to go here, for so many reasons. First, if you use the "wide" format and there is ever a need to add more series, then you would have to add new columns to the database table. While this is not too much of a hassle, in general once you put a schema into production, you want to avoid further schema changes.
Second, the "long" format makes reporting and querying much easier. For example, suppose you wanted to get a count of rows/data points for each series. Then you would only need something like:
SELECT series, COUNT(*) AS cnt
FROM yourTable
GROUP BY series;
To get this report with the "wide" format, you would need a lot more code, and it would be as verbose as your sample data above.
The thing to keep in mind here is that SQL databases are built to operate on sets of records (read: across rows). They can also process things column wise, but they are not generally setup to do this.

Get duplicate rows based on one column using BIRT

I have one table in BIRT Report :
| Name | Amount |
| A | 200 |
| B | 100 |
| A | 150 |
| C | 80 |
| C | 100 |
I need to summarize this table in to another table as : I name is same and add corresponding values.
Summarized table would be :
| A | 350 |
| B | 100 |
| C | 180 |
Here A = 200 + 150 , B = 100 , C = 80 + 100
How I can summarize table from another table present in BIRT Report ?
That is quite easy. Just add another table to your report, select the same datasource as the first table (on the tab binding)
Go to the tab groups and add a group on the your 'Name' column.
You'll see the table change. It added group header row and group footer row. The header will also have an element on which you grouped (in this case name)
Now right click next to name in the amount column. Select Insert->Aggregation.
Select function SUM, expression should be amount, Aggregate On should be your newly created group.
Now you can see the results but it will be something like:
| A | 350 |
| A | 200 |
| A | 150 |
| B | 100 |
| B | 100 |
| C | 180 |
| C | 100 |
| C | 80 |
If you delete the detail row from the table, you'll have the result your after.
For you information:
Have a play with this, its good excersise. Move the new aggregation to the group footer, add a top border to that cell, put a label total in front if it and you'll have something like this:
| A | |
| A | 200 |
| A | 150 |
----------
| total | 350 |
| B | |
| B | 100 |
----------
| total | 100 |
| C | |
| C | 100 |
| C | 80 |
----------
| total | 180 |
Also, you don't have to select the datasource as the binding, you can also select your first table for the bindings:
select the table, open the tab biding, select report item and pick your first table from the dropdown.
This can create very complex situations, therefor I usually try to work from the original dataset.

Birt-Crosstab with empty columns

so I'm a BIRT beginner, and I just tried to get a real simple report from one of my tables of a postgres DB.
So I defined a flat table as datasource which looks like:
+----------------+--------+----------+-------+--------+
| date | store | product | value | color |
+----------------+--------+----------+-------+--------+
| 20160101000000 | store1 | productA | 5231 | red |
| 20160101000000 | store1 | productB | 3213 | green |
| 20160101000000 | store2 | productX | 4231 | red |
| 20160101000000 | store3 | productY | 3213 | green |
| 20160101000000 | store4 | productZ | 1223 | green |
| 20160101000000 | store4 | productK | 3113 | yellow |
| 20160101000000 | store4 | productE | 213 | green |
| .... | | | | |
| 20160109000000 | store1 | productA | 512 | green |
+----------------+--------+----------+-------+--------+
So I would like to add a table / crosstab to my birt report which creates a table (and after that a page break) for EVERY store which looks like:
**Store 1**
+----------------+----------+----------+----------+-----+
| | productA | productB | productC | ... |
+----------------+----------+----------+----------+-----+
| 20160101000000 | 3120 | 1231 | 6433 | ... |
| 20160102000000 | 6120 | 1341 | 2121 | ... |
| 20160103000000 | 1120 | 5331 | 1231 | ... |
+----------------+----------+----------+----------+-----+
--- PAGE BREAK ---
....
So what I tried in first was: Getting to work the standard CrossTab tutorial-template of BIRT.
I defined the DataSource, and created a datacube with dimension-group of 'store' and 'product' , and as SUM / detail -data the 'value' and for this example I just selected ONE day.
But the result looks like this:
+--------+----------+----------+----------+----------+-----+----------+
| | productA | productC | productD | productE | ... | productZ |
+--------+----------+----------+----------+----------+-----+----------+
| Store1 | 213 | | 3234 | 897 | ... | 6767 |
| Store2 | 513 | 2213 | 1233 | | ... | 845 |
| Store3 | 21 | | | 32 | ... | |
| Store4 | 123 | 222 | 142 | | ... | |
+--------+----------+----------+----------+----------+-----+----------+
It's because not every product is selled in every store, but the crosstab creates the columns by selecting ALL products available.
So, I just have no idea how to generate dynamicly different tables with different (but also dynamic) amount of columns.
The second step then would be to get the dates (days) to work.
But thanks in advance for every hint ot tutorial link to question one ;-)
You can just add a table with the complete datasource. Select the table and a group. Group by StoreID. You can set the pagebreak options for each grouping. Set the property for after to "always exluding last".
BIRT will add a group header. You can add multiple groupheader rows get the layout you're after.
For crosstabs it works in a similar way. After you added the crosstab to your page and set the info for the groups on rows and columns and added summaries. You can view the data. Select the crosstab and View the Row Area properties, select the pagegroup settings and add a new pagebreak. You can select on which group you want to break, choose your storeID group and select after: "always excluding last"

Resources