Which column has the max value in a Power Query table

There is a table:
ID | Value1 | Value2 | Value3
A | 1 | 2 | 3
B | 2 | 4 | 3
C | 3 | 2 | 1
D | 1 | 2 | 3
Is there a function or set of functions returning which column has the max value?
ID | Value1 | Value2 | Value3 | Max_in
A | 1 | 2 | 3 | Value3
B | 2 | 4 | 3 | Value2
C | 3 | 2 | 1 | Value1
D | 1 | 3 | 3 | Value2
or
D | 1 | 3 | 3 | Value2;Value3

To do this without unpivoting, write a custom column like this:
= Table.AddColumn(
    #"Changed Type",
    "Max_In",
    (r) =>
        Text.Combine(
            List.Select(
                {"Value1", "Value2", "Value3"},
                each Record.Field(r, _) = List.Max({r[Value1], r[Value2], r[Value3]})
            ),
            ";"
        ),
    type text
)
Full query you can paste into the advanced editor:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WclTSUTIEYiMgNlaK1YlWcoLyTOAizmAWRNQQLOKCocsVKmIMEYkFAA==", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type nullable text) meta [Serialized.Text = true]) in type table [ID = _t, Value1 = _t, Value2 = _t, Value3 = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value1", Int64.Type}, {"Value2", Int64.Type}, {"Value3", Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Changed Type", "Max_In", (r) => Text.Combine(List.Select({"Value1", "Value2", "Value3"}, each Record.Field(r, _) = List.Max({r[Value1], r[Value2], r[Value3]})),";"), type text)
in
#"Added Custom"
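For readers who want to check the logic outside Power Query, here is a minimal Python sketch of the same idea: for each row, collect every column whose value equals the row maximum and join the names with ";". The function name max_columns and the sample rows are illustrative assumptions, not part of the original query; row D uses the tied values (1, 3, 3) from the expected output.

```python
def max_columns(row, columns, sep=";"):
    """Return the names of the columns holding the row's maximum, joined by sep."""
    top = max(row[c] for c in columns)
    # Keep every column that ties for the maximum, in the given column order.
    return sep.join(c for c in columns if row[c] == top)


# Hypothetical sample rows mirroring the table in the question.
rows = [
    {"ID": "A", "Value1": 1, "Value2": 2, "Value3": 3},
    {"ID": "B", "Value1": 2, "Value2": 4, "Value3": 3},
    {"ID": "C", "Value1": 3, "Value2": 2, "Value3": 1},
    {"ID": "D", "Value1": 1, "Value2": 3, "Value3": 3},
]
value_columns = ["Value1", "Value2", "Value3"]
max_in = {r["ID"]: max_columns(r, value_columns) for r in rows}
```

As in the M version, a tie produces all tied column names rather than just the first.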

You can unpivot, group to find the max value and its column name, then merge the result back in, as below. This works for any number of columns. On a tie, only one of the tied columns is returned.
let Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
#"Changed Type" = Table.TransformColumnTypes(Source,{{"ID", type text}, {"Value1", Int64.Type}, {"Value2", Int64.Type}, {"Value3", Int64.Type}}),
// unpivot and get column name of max value
#"Unpivoted Other Columns" = Table.UnpivotOtherColumns(#"Changed Type", {"ID"}, "Attribute", "Value"),
#"Grouped Rows" = Table.Group(#"Unpivoted Other Columns", {"ID"}, {{"Data", each _, type table}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "Custom", each Table.FirstN(Table.Sort([Data],{{"Value", Order.Descending}}),1)),
#"Expanded Custom" = Table.ExpandTableColumn(#"Added Custom", "Custom", {"Attribute"}, {"Max_in"}),
// merge into original data set
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"ID"},#"Expanded Custom",{"ID"},"zz",JoinKind.LeftOuter),
#"Expanded Table1" = Table.ExpandTableColumn(#"Merged Queries", "zz", {"Max_in"}, {"Max_in"})
in #"Expanded Table1"
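The unpivot, group, sort and merge steps of the query above can be sketched in Python as follows. This is only an illustration of the flow under assumed sample data; the function name add_first_max_column is hypothetical. Python's sort is stable, so on a tie the leftmost column wins here, which Power Query's Table.Sort does not promise.

```python
def add_first_max_column(rows, id_col="ID"):
    # Step 1: unpivot every non-ID column into (id, attribute, value) triples.
    unpivoted = [
        (row[id_col], name, value)
        for row in rows
        for name, value in row.items()
        if name != id_col
    ]
    # Step 2: group the triples by ID.
    groups = {}
    for row_id, name, value in unpivoted:
        groups.setdefault(row_id, []).append((name, value))
    # Step 3: sort each group by value, descending, and keep the first attribute.
    best = {
        row_id: sorted(pairs, key=lambda p: p[1], reverse=True)[0][0]
        for row_id, pairs in groups.items()
    }
    # Step 4: merge the winning column name back onto the original rows.
    return [{**row, "Max_in": best[row[id_col]]} for row in rows]


# Hypothetical sample rows; any number of value columns would work.
table = [
    {"ID": "A", "Value1": 1, "Value2": 2, "Value3": 3},
    {"ID": "B", "Value1": 2, "Value2": 4, "Value3": 3},
    {"ID": "C", "Value1": 3, "Value2": 2, "Value3": 1},
    {"ID": "D", "Value1": 1, "Value2": 3, "Value3": 3},
]
result = add_first_max_column(table)
```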

Related

Pull conditional statement results based on multiple table joins

I have 3 tables to join to get the output in the below format.
My table 1 is like:
--------------------------------------------------------
T1_ID1 | T1_ID2 | NAME
--------------------------------------------------------
123 | T11231 | TestName11
123 | T11232 | TestName12
234 | T1234 | TestName13
345 | T1345 | TestName14
--------------------------------------------------------
My table 2 is like:
--------------------------------------------------------
T2_ID1 | T2_ID2 | NAME
--------------------------------------------------------
T11231 | T21231 | TestName21
T11232 | T21232 | TestName21
T1234 | T2234 | TestName22
--------------------------------------------------------
My table 3 is like:
----------------------------------------------------------
T3_ID1 | TYPE | REF
----------------------------------------------------------
T21231 | 1 | 123456
T21232 | 2 | 1234#test.com
T2234 | 2 | 123#test.com
----------------------------------------------------------
My desired output is:
------------------------------------------------------
T1_ID1 | PHONE | EMAIL
------------------------------------------------------
123 | 123456 | 1234#test.com
234 | | 123#test.com
345 | |
------------------------------------------------------
Requirements:
T1_ID2 of table 1 left joins with T2_ID1 of table 2.
T2_ID2 of table 2 left joins with T3_ID1 of table 3.
TYPE of table 3 is 1 if the value is a phone number and 2 if the value is an email.
My output should contain T1_ID1 of table 1 and its corresponding value of REF in table 3, with the REF in the same row.
That is, in this case, T1_ID1 with value 123 has both phone and email. So, it is displayed in the same row in output.
If phone alone is available for corresponding value of T1_ID1, then phone should be populated in the result with email as null and vice versa.
If neither phone nor email is available, nothing should be populated.
I tried the SQL statements below, but in vain. What am I missing? Please help.
Option 1:
SELECT DISTINCT
t1.t1_id1,
t3.ref
|| (
CASE
WHEN t3.type = 1 THEN
1
ELSE
0
END
) phone,
t3.ref
|| (
CASE
WHEN t3.type = 2 THEN
1
ELSE
0
END
) email
FROM
table1 t1
LEFT JOIN table2 t2 ON t1.t1_id2 = t2.t2_id1
LEFT JOIN table3 t3 ON t2.t2_id2 = t3.t3_id1;
Option 2:
SELECT DISTINCT
t1.t1_id1,
t3.ref,
(
CASE
WHEN t3.type = 1 THEN
1
ELSE
0
END
) phone,
t3.ref,
(
CASE
WHEN t3.type = 2 THEN
1
ELSE
0
END
) email
FROM
table1 t1
LEFT JOIN table2 t2 ON t1.t1_id2 = t2.t2_id1
LEFT JOIN table3 t3 ON t2.t2_id2 = t3.t3_id1;
Option 3:
SELECT DISTINCT
t1.t1_id1,
(
CASE
WHEN t3.type = 1 THEN
1
ELSE
0
END
) phone,
(
CASE
WHEN t3.type = 2 THEN
1
ELSE
0
END
) email
FROM
table1 t1
LEFT JOIN table2 t2 ON t1.t1_id2 = t2.t2_id1
LEFT JOIN table3 t3 ON t2.t2_id2 = t3.t3_id1;
select t1_id1, max(t3.ref )phone, max(t33.ref) email
from table1
left outer join
table2 on t1_id2=t2_id1
left outer join table3 t3 on t3.t3_id1=t2_id2 and t3.type=1
left outer join table3 t33 on t33.t3_id1=t2_id2 and t33.type=2
group by t1_id1
This works if you have at most one phone and one email in table3 for each t2_id2 entry in table2.
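To see what the query returns on the sample data, here is a small sketch that runs it in an in-memory SQLite database from Python. SQLite stands in for Oracle purely as an assumption for the demo; the column types are guesses, and the NAME columns are left out because the query never reads them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (t1_id1 INTEGER, t1_id2 TEXT);
CREATE TABLE table2 (t2_id1 TEXT, t2_id2 TEXT);
CREATE TABLE table3 (t3_id1 TEXT, type INTEGER, ref TEXT);

INSERT INTO table1 VALUES (123, 'T11231'), (123, 'T11232'),
                          (234, 'T1234'), (345, 'T1345');
INSERT INTO table2 VALUES ('T11231', 'T21231'), ('T11232', 'T21232'),
                          ('T1234', 'T2234');
INSERT INTO table3 VALUES ('T21231', 1, '123456'),
                          ('T21232', 2, '1234#test.com'),
                          ('T2234', 2, '123#test.com');
""")

# table3 is joined twice: once restricted to phones (type 1) and once to
# emails (type 2). MAX collapses the rows of each t1_id1 group and ignores NULLs.
query = """
SELECT t1_id1, MAX(t3.ref) AS phone, MAX(t33.ref) AS email
FROM table1
LEFT OUTER JOIN table2 ON t1_id2 = t2_id1
LEFT OUTER JOIN table3 t3 ON t3.t3_id1 = t2_id2 AND t3.type = 1
LEFT OUTER JOIN table3 t33 ON t33.t3_id1 = t2_id2 AND t33.type = 2
GROUP BY t1_id1
ORDER BY t1_id1
"""
contacts = conn.execute(query).fetchall()
```

The ORDER BY is added only to make the result order predictable in the demo.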

Can I apply a WHERE clause if COUNT condition is met?

I'm having a hard time trying to add a WHERE clause that filters null values from a column ONLY when there are other rows that return data for that same column.
If all rows have null values for that column, keep them all.
If any row has data for that column, remove the rows with null values and just keep the rows with data.
I'm working on an Oracle database.
In my SELECT statement, I'm currently using a LEFT JOIN to pull data from table B even if the values for column B.info are null.
The actual query goes as follows:
SELECT A.id as A_ID, A.name as A_NAME,
B.id as B_ID, B.name as B_NAME, B.info as B_INFO
FROM A
LEFT JOIN B ON B.id = A.id_B
WHERE A.filename = 'file1.txt'
I have 2 possible scenarios in the business I'm working on:
For a given "filename", the query returns some rows with the B.info column with null values and some others with the B.info column filled with data. I want the query to return only the rows with B.info != null.
Scenario 1 - Actual output:
+-------+--------+------+--------+-----------+
| A_ID | A_NAME | B_ID | B_NAME | B_INFO |
+-------+--------+------+--------+-----------+
| 1 | John | null | null | null |
+-------+--------+------+--------+-----------+
| 2 | John | 3 | Julia | Age is 35 |
+-------+--------+------+--------+-----------+
| 3 | John | null | null | null |
+-------+--------+------+--------+-----------+
Scenario 1 - Desired output:
+-------+--------+------+--------+-----------+
| A_ID | A_NAME | B_ID | B_NAME | B_INFO |
+-------+--------+------+--------+-----------+
| 2 | John | 3 | Julia | Age is 35 |
+-------+--------+------+--------+-----------+
For a given "filename", the query returns all of the rows with the B.info column with null values.
I want the query to keep returning those rows.
Scenario 2 - Actual output = desired output:
+-------+--------+------+--------+--------+
| A_ID | A_NAME | B_ID | B_NAME | B_INFO |
+-------+--------+------+--------+--------+
| 1 | Mark | null | null | null |
+-------+--------+------+--------+--------+
| 2 | Mark | null | null | null |
+-------+--------+------+--------+--------+
| 3 | Mark | null | null | null |
+-------+--------+------+--------+--------+
I tried adding the condition B.info is not null in the where clause but, although it returns the desired output for the scenario 1, the output for the scenario 2 returns no rows:
SELECT A.id as A_ID, A.name as A_NAME,
B.id as B_ID, B.name as B_NAME, B.info as B_INFO
FROM A
LEFT JOIN B ON B.id = A.id_B
WHERE A.filename = 'file1.txt'
AND B.info is not null
Scenario 1 - Output
+-------+--------+------+--------+-----------+
| A_ID | A_NAME | B_ID | B_NAME | B_INFO |
+-------+--------+------+--------+-----------+
| 2 | John | 3 | Julia | Age is 35 |
+-------+--------+------+--------+-----------+
Scenario 2 - Output
+-------+--------+------+--------+--------+
| A_ID | A_NAME | B_ID | B_NAME | B_INFO |
+-------+--------+------+--------+--------+
+-------+--------+------+--------+--------+
I also tried adding a CASE in the WHERE clause but it throws an error (ORA-00934: group function is not allowed here)
SELECT A.id as A_ID, A.name as A_NAME,
B.id as B_ID, B.name as B_NAME, B.info as B_INFO
FROM A
LEFT JOIN B ON B.id = A.id_B
WHERE A.filename = 'file1.txt'
AND B.info = CASE WHEN count(B.info) > 0 THEN null
ELSE B.info
END
I'm sorry I can't use the real example for confidentiality issues. I hope my example is clear enough. I would appreciate any help!
Count all rows and the rows with nulls. Use an analytic count, because you need the details. Then show only the rows containing data, or the null rows if both counts are equal:
select id, a_name, b_name, info
from (
select a.id, b.id b_id, a.name a_name, b.name b_name, b.info,
count(case when b.id is null then 1 end) over (partition by a.filename) c1,
count(1) over (partition by a.filename) c2
from a left join b on b.id = a.id_b )
where b_id is not null or c1 = c2
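The two-counts rule from the query above can be checked with a short Python sketch. It assumes each joined row is a dict with a filename key and a b_id key that is None when the left join found no match; the function name keep_rows and the sample rows are illustrative only.

```python
from collections import defaultdict


def keep_rows(rows):
    """Keep matched rows; keep unmatched rows only when the whole file is unmatched."""
    null_count = defaultdict(int)  # plays the role of c1
    total = defaultdict(int)       # plays the role of c2
    for r in rows:
        total[r["filename"]] += 1
        if r["b_id"] is None:
            null_count[r["filename"]] += 1
    return [
        r for r in rows
        if r["b_id"] is not None
        or null_count[r["filename"]] == total[r["filename"]]
    ]


# Scenario 1: mixed rows, only the matched one should survive.
scenario1 = [
    {"filename": "file1.txt", "a_id": 1, "b_id": None},
    {"filename": "file1.txt", "a_id": 2, "b_id": 3},
    {"filename": "file1.txt", "a_id": 3, "b_id": None},
]
# Scenario 2: nothing matched, so every row is kept.
scenario2 = [
    {"filename": "file2.txt", "a_id": 1, "b_id": None},
    {"filename": "file2.txt", "a_id": 2, "b_id": None},
]
kept1 = keep_rows(scenario1)
kept2 = keep_rows(scenario2)
```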
You can consider this problem a ranking problem: You want to show the best rows only, with non-null rows being considered "better" than null rows.
Ranking can be achieved with an appropriate ORDER BY clause. As of Oracle 12c:
select
a.id as a_id, a.name as a_name,
b.id as b_id, b.name as b_name, b.info as b_info
from a
left join b on b.id = a.id_b
where a.filename = 'file1.txt'
order by case when b.id is null then 2 else 1 end
fetch first rows with ties;
In older versions:
select a_id, a_name, b_id, b_name, b_info
from
(
select
a.id as a_id, a.name as a_name,
b.id as b_id, b.name as b_name, b.info as b_info,
rank() over (order by case when b.id is null then 2 else 1 end) as rnk
from a
left join b on b.id = a.id_b
where a.filename = 'file1.txt'
)
where rnk = 1;

Selecting only distinct record from table in oracle

I have a table with the following records:
ID | NN | MBL | IC | OTHER
---+-----+------+----+------
1 | 123 | | | ac
2 | | 544 | | dc
3 | | | 524| df
4 |527 | | 124| ff
5 |123 | | | tr // duplicate NN of ID 1
6 | | 544 | | op // duplicate MBL of ID 2
7 | | | 124| ii // duplicate for IC ID 4
When querying with select, I need just one record per value, skipping any second occurrence:
select
ID, NN, MBL, IC, OTHER
from
TABLE1 // this should return only one entry of any NN, MBL and IC
How do I get this? I cannot use DISTINCT on multiple columns, and I also need the ID and OTHER columns in the select query.
Expecting result like this:
1 | 123 | | | ac
2 | | 544 | | dc
3 | | | 524| df
4 |527 | | 124| ff
You can use the analytic function ROW_NUMBER() to rank the rows within each value of every column you care about, then keep only the rows whose rank is 1 in all of them.
Here is an example:
WITH testdata AS (
SELECT 1 AS ID, 123 AS NN, NULL AS MBL, NULL AS IC, 'ac' AS OTHER FROM DUAL UNION ALL
SELECT 2, NULL, 544 , NULL, 'dc' FROM DUAL UNION ALL
SELECT 3, NULL, NULL, 524 , 'df' FROM DUAL UNION ALL
SELECT 4, 527, NULL, 124, 'ff' FROM DUAL UNION ALL
SELECT 5, 123, NULL, NULL, 'tr' FROM DUAL UNION ALL
SELECT 6, NULL, 544, NULL, 'op' FROM DUAL UNION ALL
SELECT 7, NULL, NULL , 124, 'ii' FROM DUAL
)
SELECT *
FROM(SELECT ID,
NN,
CASE WHEN NN IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY NN ORDER BY ID) END AS NN_RANG,
MBL,
CASE WHEN MBL IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY MBL ORDER BY ID) END AS MBL_RANG,
IC,
CASE WHEN IC IS NULL THEN 1 ELSE ROW_NUMBER() OVER (PARTITION BY IC ORDER BY ID) END AS IC_RANG,
OTHER
FROM testdata
)
WHERE NN_RANG = 1
AND MBL_RANG = 1
AND IC_RANG = 1
;
Hope it helps.
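The same filter can be sketched in Python to make the ranking rule explicit. A row is kept only if each of its non-null key values appears there for the first time in ID order; nulls always count as rank 1. The function name first_occurrences is an assumption for illustration. Note that a value counts as seen even when the row carrying it is later dropped, which matches ROW_NUMBER() ranking over all rows.

```python
def first_occurrences(rows, key_columns):
    """Keep rows whose non-null key values are all first occurrences by ID."""
    seen = {c: set() for c in key_columns}
    kept = []
    for row in sorted(rows, key=lambda r: r["ID"]):
        is_first = True
        for c in key_columns:
            value = row[c]
            if value is None:
                continue  # NULLs are treated as rank 1, as in the CASE expression
            if value in seen[c]:
                is_first = False
            seen[c].add(value)  # record the value even if the row is dropped
        if is_first:
            kept.append(row)
    return kept


testdata = [
    {"ID": 1, "NN": 123, "MBL": None, "IC": None, "OTHER": "ac"},
    {"ID": 2, "NN": None, "MBL": 544, "IC": None, "OTHER": "dc"},
    {"ID": 3, "NN": None, "MBL": None, "IC": 524, "OTHER": "df"},
    {"ID": 4, "NN": 527, "MBL": None, "IC": 124, "OTHER": "ff"},
    {"ID": 5, "NN": 123, "MBL": None, "IC": None, "OTHER": "tr"},
    {"ID": 6, "NN": None, "MBL": 544, "IC": None, "OTHER": "op"},
    {"ID": 7, "NN": None, "MBL": None, "IC": 124, "OTHER": "ii"},
]
unique_rows = first_occurrences(testdata, ["NN", "MBL", "IC"])
```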

Complex SQL query to join two tables

Problem:
Given two tables, TableA and TableB, where TableA has a one-to-many relationship with TableB, I want to retrieve all records in TableB where the search criteria match a certain column in TableB, and return NULL for the TableA records that have no match for that attribute.
Table Structures:
Table A
ID(Primary Key) | Name | City
1 | ABX | San Francisco
2 | ASDF | Oakland
3 | FDFD | New York
4 | GFGF | Austin
5 | GFFFF | San Francisco
Table B
ATTR_ID |Attr_Type | Attr_Name | Attr_Value
1 | TableA | Attr_1 | Attr_Value_1
2 | TableD | Attr_1 | Attr_Value_2
1 | TableA | Attr_2 | Attr_Value_3
3 | TableA | Attr_4 | Attr_Value_4
9 | TableC | Attr_2 | Attr_Value_5
Table B holds attribute names and values and is a common table used across multiple tables. Each table is identified by Attr_Type and ATTR_ID (which maps to the IDs of different tables).
For instance, the record in Table A with ID 1 has two attributes in Table B with Attr_Names: Attr_1 and Attr_2 and so on.
Expected Output
ID | Name | City | TableB.Attr_Value
1 | ABX | San Francisco | Attr_Value_1
2 | ASDF | Oakland | Attr_Value_2
3 | FDFD | New York | NULL
4 | GFGF | Austin | NULL
5 | GFFFF | San Francisco | NULL
Search Criteria:
Get rows from Table B for each record in Table A with ATTR_NAME Attr_1. If a particular TableA record doesn't have Attr_1, return null.
My Query
select id, name, city,
b.attr_value from table_A
join table_B b on
table_A.id =b.attr_id and b.attr_name='Attr_1'
This is a strange data structure. You need a left outer join with the conditions in the on clause:
select a.id, a.name, a.city, b.attr_value
from table_A a left join
table_B b
on a.id = b.attr_id and b.attr_name = 'Attr_1' and b.attr_type = 'TableA';
I added the attr_type condition, because that seems logical with this data structure.
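Why the conditions belong in the ON clause rather than in WHERE can be seen in a small experiment. The sketch below uses an in-memory SQLite database from Python as a stand-in for the real server (an assumption for the demo; column types are guessed). With the filters in ON, unmatched TableA rows survive with NULL; with the same filters in WHERE, they are discarded after the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_A (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE table_B (attr_id INTEGER, attr_type TEXT,
                      attr_name TEXT, attr_value TEXT);

INSERT INTO table_A VALUES (1, 'ABX', 'San Francisco'), (2, 'ASDF', 'Oakland'),
                           (3, 'FDFD', 'New York'), (4, 'GFGF', 'Austin'),
                           (5, 'GFFFF', 'San Francisco');
INSERT INTO table_B VALUES (1, 'TableA', 'Attr_1', 'Attr_Value_1'),
                           (2, 'TableD', 'Attr_1', 'Attr_Value_2'),
                           (1, 'TableA', 'Attr_2', 'Attr_Value_3'),
                           (3, 'TableA', 'Attr_4', 'Attr_Value_4'),
                           (9, 'TableC', 'Attr_2', 'Attr_Value_5');
""")

# Filters in ON: every table_A row is kept, unmatched ones get NULL.
on_rows = conn.execute("""
SELECT a.id, b.attr_value
FROM table_A a LEFT JOIN table_B b
  ON a.id = b.attr_id AND b.attr_name = 'Attr_1' AND b.attr_type = 'TableA'
ORDER BY a.id
""").fetchall()

# Same filters in WHERE: the NULL-extended rows fail the test and disappear.
where_rows = conn.execute("""
SELECT a.id, b.attr_value
FROM table_A a LEFT JOIN table_B b ON a.id = b.attr_id
WHERE b.attr_name = 'Attr_1' AND b.attr_type = 'TableA'
ORDER BY a.id
""").fetchall()
```

With the attr_type condition in place, ID 2 comes back as NULL, because its Attr_1 row belongs to TableD.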
I don't have a SQL server to test the command, but what you want is an outer join query. You could do something like this:
select a.id, a.name, a.city,
b.attr_value from table_A a
left join table_B b on
a.id = b.attr_id and b.attr_name = 'Attr_1'
Something like this should do the trick for you

Oracle update with subquery - performance issue

I'm having some trouble with an update statement in my Oracle database.
The query takes too much time and the temp tablespace is running out of space, but it provides the correct data.
I tried to convert the subqueries to joins but I couldn't figure out how to do it correctly.
If someone knows how to improve the statement or how to convert it into a join, I would be really grateful.
UPDATE table1 t1
SET t1.inxdc = (SELECT sda_x
FROM table2 t2
WHERE t1.c1 = t2.c1
AND t1.c2 = t2.c2
AND t1.c3 = t2.c3
AND t1.c4 = t2.c4
AND t1.c5 = t2.c5
AND t1.c6 = t2.c6
AND t2.ident = 'K_SDA_W'
AND rownum=1)
WHERE EXISTS
(SELECT 1
FROM table2 t2
WHERE t1.c1 = t2.c1
AND t1.c2 = t2.c2
AND t1.c3 = t2.c3
AND t1.c4 = t2.c4
AND t1.c5 = t2.c5
AND t1.c6 = t2.c6
AND t2.ident = 'K_SDA_W');
edit1:
Some information for the tables
table1 PKs = c1,c2,c3,c4,c5,c6
table2 PKs = ident,c4,c5,c6, and 3 others not mentioned in the statement (c7,c8,c9)
index: besides the PKs only on table2 c1
table1 data: 12466 rows
table2 data: 194827 rows
edit2:
Execution Plan
--------------------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------------------
| 0 | UPDATE STATEMENT | |
| 1 | UPDATE | table1 |
| 2 | NESTED LOOPS SEMI | |
| 3 | TABLE ACCESS FULL | table1 |
| 4 | TABLE ACCESS BY INDEX ROWID| table2 |
| 5 | INDEX RANGE SCAN | t2.c1 |
| 6 | COUNT STOPKEY | |
| 7 | TABLE ACCESS BY INDEX ROWID| table2 |
| 8 | INDEX RANGE SCAN | t2.PK |
--------------------------------------------------------------
Since there are very few rows in table1, just drop the WHERE clause in this particular situation and wrap the value returned from the subquery in NVL:
UPDATE table1 t1
SET t1.inxdc = NVL((SELECT sda_x
FROM table2 t2
WHERE t1.c1 = t2.c1
AND t1.c2 = t2.c2
AND t1.c3 = t2.c3
AND t1.c4 = t2.c4
AND t1.c5 = t2.c5
AND t1.c6 = t2.c6
AND t2.ident = 'K_SDA_W'
AND rownum=1), t1.inxdc);
In general your update should be quick. Have you checked the performance of the subquery? Check whether an index (and which one) is used on table2 for the subquery (best of all, show us the execution plan).
I think table t2 should have an index on c1, c2, c3, c4, c5, c6, ident.
In that case the update of t1 should be much faster.
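The two suggestions above (replacing the EXISTS filter with a fallback to the current value, and a composite index on the lookup columns) can be tried together on a toy scale. The sketch below is an assumption-heavy stand-in, not the Oracle statement: it uses SQLite from Python, COALESCE in place of NVL, LIMIT 1 in place of rownum=1, a key reduced to c1 and c2 for brevity, and a hypothetical index name idx_t2_lookup. It says nothing about Oracle performance; it only shows that unmatched rows keep their old value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (c1 INTEGER, c2 INTEGER, inxdc TEXT);
CREATE TABLE table2 (ident TEXT, c1 INTEGER, c2 INTEGER, sda_x TEXT);

INSERT INTO table1 VALUES (1, 1, 'old1'), (2, 2, 'old2');
INSERT INTO table2 VALUES ('K_SDA_W', 1, 1, 'new1'),
                          ('OTHER',   2, 2, 'ignored');

-- Hypothetical composite index covering the correlated lookup.
CREATE INDEX idx_t2_lookup ON table2 (c1, c2, ident);
""")

# No WHERE EXISTS: when the subquery finds nothing it yields NULL,
# and COALESCE falls back to the row's current inxdc value.
conn.execute("""
UPDATE table1
SET inxdc = COALESCE(
    (SELECT sda_x
     FROM table2 t2
     WHERE table1.c1 = t2.c1
       AND table1.c2 = t2.c2
       AND t2.ident = 'K_SDA_W'
     LIMIT 1),
    inxdc)
""")
updated = conn.execute(
    "SELECT c1, c2, inxdc FROM table1 ORDER BY c1"
).fetchall()
```

One trade-off of this form is that every row of table1 is written, including rows whose value does not change, which is acceptable only because the table is small.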

Resources