How to pivot duplicate rows to distinct columns in Excel (Power Query)

I have Excel data in which I need to pivot the duplicate rows into columns so it is suitable for analysis. How can I do this?
For example, my Excel data looks like this:
id metrics date value
1 A 20190812 100
1 A 20190813 100
1 A 20190814 100
1 B 20190812 200
1 B 20190813 130
2 A 20190812 100
2 B 20190813 106
2 C 20190814 104
The result I am looking for is:
id A B C date
1 100 200 null 20190812
1 100 130 null 20190813
1 100 null null 20190814
2 100 null null 20190812
2 null 106 null 20190813
2 null null 104 20190814

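Try the query below in Power Query. Table.Pivot turns each distinct value of metrics into its own column and fills it with the summed value:
let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    // One new column per distinct metric, filled with the summed value
    Pivot =
        Table.Pivot(
            Source,
            List.Distinct(Source[metrics]),
            "metrics",
            "value",
            List.Sum)
in
    Pivot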
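If it helps to see what the pivot does outside Power Query, here is a minimal Python sketch of the same idea, using the sample rows from the question. The function name pivot_sum is made up for illustration; it groups by the remaining columns (id, date), makes one column per distinct metric, and sums value, which is roughly what Table.Pivot with List.Sum does.

```python
from collections import defaultdict

# Sample rows from the question: (id, metrics, date, value)
rows = [
    (1, "A", 20190812, 100),
    (1, "A", 20190813, 100),
    (1, "A", 20190814, 100),
    (1, "B", 20190812, 200),
    (1, "B", 20190813, 130),
    (2, "A", 20190812, 100),
    (2, "B", 20190813, 106),
    (2, "C", 20190814, 104),
]


def pivot_sum(rows):
    """Pivot 'metrics' into columns, summing 'value' per (id, date)."""
    # Distinct metric names in first-seen order
    metrics = list(dict.fromkeys(metric for _, metric, _, _ in rows))
    cells = defaultdict(dict)
    for id_, metric, date, value in rows:
        key = (id_, date)
        cells[key][metric] = cells[key].get(metric, 0) + value
    # Combinations with no data become None (null in Power Query)
    return [
        {"id": id_, "date": date, **{m: vals.get(m) for m in metrics}}
        for (id_, date), vals in sorted(cells.items())
    ]


pivoted = pivot_sum(rows)
for row in pivoted:
    print(row)
```

Each (id, date) pair ends up as exactly one row, so the duplicates in the id column disappear once the metrics are spread across columns.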

Related

Group by datatable using integer range using Linq

I'm trying to group a set of data by ranges of an integer age column using LINQ.
For example, I have this DataTable:
id name age
1 abc 20
2 pqr 45
3 jkl 34
5 xyz 39
6 lmn 65
I want this result:
age range count
18-29 1
30-39 2
40-49 1
50-59 0
60-69 1
.
.
.
I would like to group the DataTable by age range and display the count for each range.
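Not LINQ, but the bucketing logic itself can be sketched in a few lines of Python. This assumes the ranges are listed explicitly (the first one, 18-29, is wider than the others, so a simple divide-by-ten would not reproduce it); the name count_by_range is made up for illustration. Empty ranges are kept with a count of 0, as in the desired output.

```python
# Ages from the sample DataTable in the question
ages = [20, 45, 34, 39, 65]

# Inclusive (low, high) bounds, listed explicitly because 18-29 is irregular
ranges = [(18, 29), (30, 39), (40, 49), (50, 59), (60, 69)]


def count_by_range(ages, ranges):
    """Count how many ages fall in each inclusive range, keeping empty ranges."""
    counts = {f"{low}-{high}": 0 for low, high in ranges}
    for age in ages:
        for low, high in ranges:
            if low <= age <= high:
                counts[f"{low}-{high}"] += 1
                break  # ranges do not overlap, so stop at the first match
    return counts


range_counts = count_by_range(ages, ranges)
for label, count in range_counts.items():
    print(label, count)
```

Ages outside every listed range are simply ignored in this sketch.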

Get Total count

I want to merge two columns (Sender and Receiver) into one, get the count of each transaction Type per number, and then join another table on that Sender/Receiver number.
Sender Receiver Type Amount Date
773787639 777611388 1 300 2/1/2019
773631898 776806843 4 450 8/20/2019
773761571 777019819 6 369 2/11/2019
774295511 777084440 34 1000 1/22/2019
774263079 776816905 45 678 6/27/2019
774386894 777202863 12 2678 2/10/2019
773671537 777545555 14 38934 9/29/2019
774288117 777035194 18 21 4/22/2019
774242382 777132939 21 1275 9/30/2019
774144715 777049859 30 6309 7/4/2019
773911674 776938987 10 3528 5/1/2019
773397863 777548054 15 35892 7/6/2019
776816905 772345091 6 1234 7/7/2019
777035194 775623065 4 453454 7/20/2019
Second Table
Mobile_number Age
773787639 34
773787632 23
774288117 65
I am trying to get a table like this:
Sender/Receiver Type_1 Type_4 Type_12...... Type_45 Age
773787639 3 2 0 0 23
773631898 1 0 1 2 56
773397863 2 2 0 0 65
772345091 1 1 0 3 32
OK, I have seen your old question; you just need an inner join in the subquery, as follows:
SELECT
    SenderReceiver,
    COUNT(CASE WHEN Type = 1 THEN 1 END) AS Type_1,
    COUNT(CASE WHEN Type = 2 THEN 1 END) AS Type_2,
    COUNT(CASE WHEN Type = 3 THEN 1 END) AS Type_3,
    ...
    COUNT(CASE WHEN Type = 45 THEN 1 END) AS Type_45,
    Age -- changes here
FROM
    ( SELECT sr.SenderReceiver, sr.Type, st.Age -- changes here
      FROM
          (SELECT Sender AS SenderReceiver, Type FROM yourTable
           UNION ALL
           SELECT Receiver, Type FROM yourTable) sr
      JOIN <second_table> st ON st.Mobile_number = sr.SenderReceiver -- changes here
    ) t
GROUP BY
    SenderReceiver,
    Age; -- changes here
The changes to your previous query are marked with the comment -- changes here.
Please replace <second_table> with the actual name of your table.
Cheers!!
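To try the query shape above without a server, here is a small sketch that runs it in SQLite through Python's sqlite3 module. The table names transactions and subscribers are placeholders (the thread does not give the real names), only three Type columns are spelled out, and the data is a handful of rows taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE transactions (Sender INTEGER, Receiver INTEGER, Type INTEGER);
CREATE TABLE subscribers (Mobile_number INTEGER, Age INTEGER);
INSERT INTO transactions VALUES
    (773787639, 777611388, 1),
    (774288117, 777035194, 18),
    (777035194, 775623065, 4);
INSERT INTO subscribers VALUES
    (773787639, 34), (773787632, 23), (774288117, 65);
""")

# UNION ALL stacks senders and receivers into one column, then each
# COUNT(CASE ...) counts one transaction type per number.
query = """
SELECT sr.SenderReceiver,
       COUNT(CASE WHEN sr.Type = 1 THEN 1 END) AS Type_1,
       COUNT(CASE WHEN sr.Type = 4 THEN 1 END) AS Type_4,
       COUNT(CASE WHEN sr.Type = 18 THEN 1 END) AS Type_18,
       st.Age
FROM (SELECT Sender AS SenderReceiver, Type FROM transactions
      UNION ALL
      SELECT Receiver, Type FROM transactions) sr
JOIN subscribers st ON st.Mobile_number = sr.SenderReceiver
GROUP BY sr.SenderReceiver, st.Age
ORDER BY sr.SenderReceiver
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Note that the inner join drops every number that has no row in the second table; a LEFT JOIN would keep those numbers with a NULL Age instead.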

SQL Server - Filter only rows which have values in all columns

How do I keep only the rows that have values in all columns (i.e. exclude a row if any of its columns is missing/NULL)?
For example:
id name age height
------------------------
1 abc 19 NULL
2 fds 34 2.3
3 grt NULL NULL
The output should be only row 2. How do I do this?
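One common way to do this is an explicit IS NOT NULL check on every column. Here is a minimal sketch, run in SQLite through Python's sqlite3 module rather than SQL Server, with a placeholder table name people; the WHERE clause itself is plain SQL that should carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, age INTEGER, height REAL)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [(1, "abc", 19, None), (2, "fds", 34, 2.3), (3, "grt", None, None)],
)

# A row survives only if every listed column is non-NULL
complete_rows = conn.execute("""
    SELECT id, name, age, height
    FROM people
    WHERE name IS NOT NULL
      AND age IS NOT NULL
      AND height IS NOT NULL
""").fetchall()
print(complete_rows)
```

The column list has to be written out by hand here, so a newly added column needs its own IS NOT NULL condition.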

Sum multiple columns using PIG

I have multiple files with the same columns, and I am trying to aggregate the values in two columns using SUM.
The column structure is shown below:
ID first_count second_count name desc
1 10 10 A A_Desc
1 25 45 A A_Desc
1 30 25 A A_Desc
2 20 20 B B_Desc
2 40 10 B B_Desc
How can I sum first_count and second_count per ID to get the following?
ID first_count second_count name desc
1 65 80 A A_Desc
2 60 30 B B_Desc
Below is the script I wrote, but when I execute it I get the error "Could not infer matching function for SUM as multiple or none of them fit. Please use an explicit cast."
A = LOAD '/output/*/part*' AS (id:chararray,first_count:chararray,second_count:chararray,name:chararray,desc:chararray);
B = GROUP A BY id;
C = FOREACH B GENERATE group as id,
SUM(A.first_count) as first_count,
SUM(A.second_count) as second_count,
A.name as name,
A.desc as desc;
Your LOAD statement is the problem: first_count and second_count are loaded as chararray, and SUM cannot add strings. If you are sure these columns will only contain numbers, load them as int. Try this:
A = LOAD '/output/*/part*' AS (id:chararray,first_count:int,second_count:int,name:chararray,desc:chararray);
It should work.
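The same point, summing only works once the strings are converted to numbers, can be seen in a small Python sketch of the group-and-sum step. The data is the sample from the question, the values are deliberately kept as strings to mimic chararray, and the name sum_counts is made up for illustration.

```python
# Sample records from the question, with counts as strings (like chararray)
records = [
    ("1", "10", "10", "A", "A_Desc"),
    ("1", "25", "45", "A", "A_Desc"),
    ("1", "30", "25", "A", "A_Desc"),
    ("2", "20", "20", "B", "B_Desc"),
    ("2", "40", "10", "B", "B_Desc"),
]


def sum_counts(records):
    """Group by id and sum both count columns after casting them to int."""
    totals = {}
    for id_, first, second, name, desc in records:
        # name and desc are taken from the first record seen for each id
        entry = totals.setdefault(id_, [0, 0, name, desc])
        entry[0] += int(first)   # explicit cast, the analogue of loading as int
        entry[1] += int(second)
    return [(id_, *values) for id_, values in totals.items()]


summed = sum_counts(records)
for row in summed:
    print(row)
```

This assumes name and desc are constant within each ID, as they are in the sample.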

Hive: Joining tables with different scenarios

I have a question about joining tables in a particular scenario. Please find the sample tables below.
In the expected table, rows 3-5 should repeat the Capacity value from Table 1, because Table 2 does not have those rows.
Could anyone please help me get the expected table?
Table 1:
No ProjectID Capacity
1 514 4
2 418 10
3 418 30
4 401 40
5 502 41
Table2:
NO ProjectID Capacity1 Capacity2
1 514 4 10
2 418 10 20
Expected Table:
NO ProjectID Capacity1 Capacity2
1 514 4 10
2 418 10 20
3 418 30 30
4 401 40 40
5 502 41 41
1. Do a left outer join.
2. For the rows with no match in table 2, take the values from table 1 with an if condition.
select t1.no, t1.projectid, t1.capacity as capacity1, if(t2.capacity2 is null, t1.capacity, t2.capacity2) as capacity2
from table1 t1 left outer join table2 t2 on t1.no = t2.no
I think the above query meets your requirement; let me know if you need any more help.
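The left-join-with-fallback idea can be checked against the sample tables with a small sketch in SQLite, run through Python's sqlite3 module. Two things differ from the Hive version and are assumptions of this sketch: the key column is renamed row_no (to avoid any keyword clash on "no"), and COALESCE is used for the fallback in place of Hive's if().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (row_no INTEGER, projectid INTEGER, capacity INTEGER);
CREATE TABLE table2 (row_no INTEGER, projectid INTEGER,
                     capacity1 INTEGER, capacity2 INTEGER);
INSERT INTO table1 VALUES
    (1, 514, 4), (2, 418, 10), (3, 418, 30), (4, 401, 40), (5, 502, 41);
INSERT INTO table2 VALUES
    (1, 514, 4, 10), (2, 418, 10, 20);
""")

# Unmatched rows get NULLs from table2, so COALESCE falls back to table1
joined = conn.execute("""
    SELECT t1.row_no,
           t1.projectid,
           COALESCE(t2.capacity1, t1.capacity) AS capacity1,
           COALESCE(t2.capacity2, t1.capacity) AS capacity2
    FROM table1 t1
    LEFT OUTER JOIN table2 t2 ON t1.row_no = t2.row_no
    ORDER BY t1.row_no
""").fetchall()
for row in joined:
    print(row)
```

Rows 3-5 have no partner in table2, so both capacity columns repeat the table1 value, which matches the expected table in the question.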
