Sorting a dataset on the basis of more than one column - sorting

I have a sample dataset as below.
+---------+--------+---------+---------+---------+
| Col1 | Col2 | NumCol1 | NumCol2 | NumCol3 |
+---------+--------+---------+---------+---------+
| Value 1 | Value2 | 6 | 2 | 9 |
| Value 3 | Value4 | 8 | 3 | 12 |
| Value 5 | Value6 | 1 | 11 | 8 |
| Value 7 | Value8 | 4 | 10 | 5 |
+---------+--------+---------+---------+---------+
I need to sort this dataset based on the values of the columns (NumCol1, NumCol2, NumCol3). That is, if I sort this dataset in ascending order, I need to get the result below.
+---------+--------+---------+---------+---------+
| Col1 | Col2 | NumCol1 | NumCol2 | NumCol3 |
+---------+--------+---------+---------+---------+
| Value 5 | Value6 | 1 | 11 | 8 |
| Value 1 | Value2 | 6 | 2 | 9 |
| Value 3 | Value4 | 8 | 3 | 12 |
| Value 7 | Value8 | 4 | 10 | 5 |
+---------+--------+---------+---------+---------+
The row with Value 5, Value6, 1, 11, 8 comes first because it contains the lowest value, 1; the other rows follow similarly.
If in descending order, result would be:
+---------+--------+---------+---------+---------+
| Col1 | Col2 | NumCol1 | NumCol2 | NumCol3 |
+---------+--------+---------+---------+---------+
| Value 3 | Value4 | 8 | 3 | 12 |
| Value 5 | Value6 | 1 | 11 | 8 |
| Value 7 | Value8 | 4 | 10 | 5 |
| Value 1 | Value2 | 6 | 2 | 9 |
+---------+--------+---------+---------+---------+
Is it possible to do this in Spark? How can I achieve it?

Use least and greatest to calculate the minimum and maximum among the three columns, and then order by the result. In PySpark:
Ascending by the least value:
import pyspark.sql.functions as f
df.orderBy(f.least(f.col('NumCol1'), f.col('NumCol2'), f.col('NumCol3'))).show()
+-------+------+-------+-------+-------+
| Col1| Col2|NumCol1|NumCol2|NumCol3|
+-------+------+-------+-------+-------+
|Value 5|Value6| 1| 11| 8|
|Value 1|Value2| 6| 2| 9|
|Value 3|Value4| 8| 3| 12|
|Value 7|Value8| 4| 10| 5|
+-------+------+-------+-------+-------+
Descending by the greatest value:
df.orderBy(f.greatest(f.col('NumCol1'), f.col('NumCol2'), f.col('NumCol3')).desc()).show()
+-------+------+-------+-------+-------+
| Col1| Col2|NumCol1|NumCol2|NumCol3|
+-------+------+-------+-------+-------+
|Value 3|Value4| 8| 3| 12|
|Value 5|Value6| 1| 11| 8|
|Value 7|Value8| 4| 10| 5|
|Value 1|Value2| 6| 2| 9|
+-------+------+-------+-------+-------+
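To see why these two orderings come out as they do, here is a minimal plain-Python sketch (no Spark involved) that sorts the same four rows by the row-wise minimum and maximum. The `rows` list and the `numeric_part` helper are illustrative names, not part of the answer above. It only mirrors the sort keys, and unlike Spark's least and greatest it assumes there are no null values.

```python
# Sample rows from the question: (Col1, Col2, NumCol1, NumCol2, NumCol3).
rows = [
    ("Value 1", "Value2", 6, 2, 9),
    ("Value 3", "Value4", 8, 3, 12),
    ("Value 5", "Value6", 1, 11, 8),
    ("Value 7", "Value8", 4, 10, 5),
]


def numeric_part(row):
    """Return the three numeric columns of a row (hypothetical helper)."""
    return row[2:]


# Ascending by the smallest of the three numeric values (the role of `least`).
ascending = sorted(rows, key=lambda r: min(numeric_part(r)))

# Descending by the largest of the three numeric values (the role of `greatest`).
descending = sorted(rows, key=lambda r: max(numeric_part(r)), reverse=True)

for row in ascending:
    print(row)
```

The sort keys here are 2, 3, 1, 4 for the minimum and 9, 12, 11, 10 for the maximum, which gives the same row order as the two outputs shown above.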

Related

Combinations of values in a table to get the sum of each combination

I have a table with numeric data, and I need to make different combinations of its values.
For example:
| A |
|---|
| 1 |
| 2 |
| 3 |
| 4 |
I need to combine this single column to get the next result:
| A | B | C | D |
| - | - | - | - |
| 1 | | | |
| 1 | 2 | | |
| 1 | 2 | 3 | |
| 1 | 2 | 3 | 4 |
| 1 | 2 | | 4 |
| 1 | | 3 | |
| 1 | | 3 | 4 |
| 1 | | | 4 |
| | 2 | | |
| | 2 | 3 | |
| | 2 | 3 | 4 |
| | 2 | | 4 |
| | | 3 | |
| | | 3 | 4 |
| | | | 4 |
At the end of the table, I have to create a column with the count of the columns that have data, and another column that contains the sum of the numbers in those columns.
Maybe it sounds very difficult or impossible, but I haven't found a way to make it work.
I have tried a "Cross Join" in SQL but didn't get the expected result.
Help!
In this case, you can solve this by counting in binary, where the number of digits is the number of values in the set. For example, for the starting set 2, 5, 6, 8 the count would end at 1111. Each binary digit decides whether the corresponding number is shown in that row. Here's a table of how it would work.
| A |
|---|
| 2 |
| 5 |
| 6 |
| 8 |
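| A | B | C | D | Binary | Row number |
|---|---|---|---|--------|------------|
| | | | 8 | 0001 | 1 |
| | | 6 | | 0010 | 2 |
| | | 6 | 8 | 0011 | 3 |
| | 5 | | | 0100 | 4 |
| | 5 | | 8 | 0101 | 5 |
| | 5 | 6 | | 0110 | 6 |
| | 5 | 6 | 8 | 0111 | 7 |
| 2 | | | | 1000 | 8 |
| 2 | | | 8 | 1001 | 9 |
| 2 | | 6 | | 1010 | 10 |
| 2 | | 6 | 8 | 1011 | 11 |
| 2 | 5 | | | 1100 | 12 |
| 2 | 5 | | 8 | 1101 | 13 |
| 2 | 5 | 6 | | 1110 | 14 |
| 2 | 5 | 6 | 8 | 1111 | 15 |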
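The binary-counting idea can be sketched in Python as follows. This is only an illustration under the assumption that the values fit in memory as a list; the function name `subsets_by_binary_counting` is made up for this example. It also computes the count and sum columns the question asks for.

```python
def subsets_by_binary_counting(values):
    """Enumerate every non-empty combination of `values` by counting in binary.

    Returns a list of (row, bits, count, total) tuples, where `row` has one
    slot per input value (None when that value is not selected).
    """
    n = len(values)
    result = []
    # Count from 1 to 2**n - 1; 0 is skipped because it selects nothing.
    for mask in range(1, 2 ** n):
        bits = format(mask, f"0{n}b")  # e.g. 5 -> "0101" when n == 4
        row = [v if bit == "1" else None for v, bit in zip(values, bits)]
        chosen = [v for v in row if v is not None]
        result.append((row, bits, len(chosen), sum(chosen)))
    return result


table = subsets_by_binary_counting([2, 5, 6, 8])
for row, bits, count, total in table:
    print(row, bits, count, total)
```

For four input values this yields 15 rows, one per binary number from 0001 to 1111, in the same order as the row numbers in the table.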

Pivot Table in Hive and Create Multiple Columns for Unique Combinations

I want to pivot the following table
| ID | Code | date | qty |
| 1 | A | 1/1/19 | 11 |
| 1 | A | 2/1/19 | 12 |
| 2 | B | 1/1/19 | 13 |
| 2 | B | 2/1/19 | 14 |
| 3 | C | 1/1/19 | 15 |
| 3 | C | 3/1/19 | 16 |
into
| ID | Code | mth_1(1/1/19) | mth_2(2/1/19) | mth_3(3/1/19) |
| 1 | A | 11 | 12 | 0 |
| 2 | B | 13 | 14 | 0 |
| 3 | C | 15 | 0 | 16 |
I am new to Hive and am not sure how to implement it.
NOTE: I don't want to do mapping because my month values change over time.

Oracle - Does filter predicate order change the execution plan?

We are troubleshooting a performance problem. The application uses a VIEW with filter predicates applied as below. In the first case, the result is retrieved in seconds, but the second one runs for more than an hour.
The columns referred to in the view actually point to table D. Please note that this view was written for some other purpose and is now used for an enhancement (don't know why; management decision).
I can see that the problem is the FTS (full table scan) on the partitioned tables E and F, but I am not able to understand why or how the order of the predicates changes the execution plan. This uses the cost-based optimizer only, and statistics are up to date.
This is not a well-formed query (the leading % rules out index use, etc.), but what puzzles me is why the plan changes when the statement merely swaps the filter predicates.
select COL2 from VIEW where COL3 like '%180%' and COL4 LIKE '%Solu%'
Plan hash value: 1618822878
------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | Pstart| Pstop |
------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | | | 1379K(100)| | | |
|* 1 | FILTER | | | | | | | |
| 2 | NESTED LOOPS OUTER | | 116K| 8568K| 1077K (19)| 02:47:19 | | |
|* 3 | HASH JOIN RIGHT OUTER | | 97611 | 6481K| 266K (1)| 00:41:25 | | |
|* 4 | INDEX FAST FULL SCAN | IDX1_A | 45156 | 352K| 25 (8)| 00:00:01 | | |
| 5 | NESTED LOOPS | | 97611 | 5719K| 266K (1)| 00:41:25 | | |
| 6 | NESTED LOOPS | | 98629 | 5719K| 266K (1)| 00:41:25 | | |
| 7 | VIEW | VW_NSO_1 | 98629 | 674K| 82 (3)| 00:00:01 | | |
| 8 | HASH UNIQUE | | 98629 | 1338K| 82 (3)| 00:00:01 | | |
| 9 | UNION-ALL | | | | | | | |
| 10 | TABLE ACCESS FULL | B | 97591 | 667K| 80 (3)| 00:00:01 | | |
| 11 | TABLE ACCESS FULL | C | 1038 | 2076 | 2 (0)| 00:00:01 | | |
|* 12 | INDEX UNIQUE SCAN | PK_D | 1 | | 2 (0)| 00:00:01 | | |
|* 13 | TABLE ACCESS BY GLOBAL INDEX ROWID| D | 1 | 53 | 3 (0)| 00:00:01 | ROWID | ROWID |
| 14 | VIEW | | 1 | 7 | 8 (25)| 00:00:01 | | |
| 15 | UNION ALL PUSHED PREDICATE | | | | | | | |
| 16 | SORT UNIQUE | | 1 | 9 | 4 (25)| 00:00:01 | | |
|* 17 | INDEX RANGE SCAN | IDX1_E | 1 | 9 | 3 (0)| 00:00:01 | | |
| 18 | SORT UNIQUE | | 1 | 9 | 5 (20)| 00:00:01 | | |
|* 19 | INDEX RANGE SCAN | F | 2 | 18 | 4 (0)| 00:00:01 | | |
|* 20 | TABLE ACCESS BY INDEX ROWID | G | 1 | 36 | 3 (0)| 00:00:01 | | |
|* 21 | INDEX UNIQUE SCAN | UK_G | 1 | | 2 (0)| 00:00:01 | | |
------------------------------------------------------------------------------------------------------------------------------------
select COL2 from VIEW where COL4 LIKE '%Solu%' AND COL3 like '%180%' ;
Plan hash value: 2380952204
-----------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Pid | Ord | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | Pstart| Pstop |
-----------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | | 24 | SELECT STATEMENT | | | | | 2598K(100)| | | |
|* 1 | 0 | 23 | FILTER | | | | | | | | |
|* 2 | 1 | 20 | HASH JOIN OUTER | | 116K| 8568K| 7632K| 2295K (2)| 05:56:38 | | |
|* 3 | 2 | 11 | HASH JOIN OUTER | | 97611 | 6481K| 6864K| 266K (1)| 00:41:27 | | |
| 4 | 3 | 9 | NESTED LOOPS | | 97611 | 5719K| | 266K (1)| 00:41:25 | | |
| 5 | 4 | 7 | NESTED LOOPS | | 98629 | 5719K| | 266K (1)| 00:41:25 | | |
| 6 | 5 | 5 | VIEW | VW_NSO_1 | 98629 | 674K| | 82 (3)| 00:00:01 | | |
| 7 | 6 | 4 | HASH UNIQUE | | 98629 | 1338K| | 82 (3)| 00:00:01 | | |
| 8 | 7 | 3 | UNION-ALL | | | | | | | | |
| 9 | 8 | 1 | TABLE ACCESS FULL | B | 97591 | 667K| | 80 (3)| 00:00:01 | | |
| 10 | 8 | 2 | TABLE ACCESS FULL | C | 1038 | 2076 | | 2 (0)| 00:00:01 | | |
|* 11 | 5 | 6 | INDEX UNIQUE SCAN | PK_D | 1 | | | 2 (0)| 00:00:01 | | |
|* 12 | 4 | 8 | TABLE ACCESS BY GLOBAL INDEX ROWID| D | 1 | 53 | | 3 (0)| 00:00:01 | ROWID | ROWID |
|* 13 | 3 | 10 | TABLE ACCESS FULL | A | 45156 | 352K| | 75 (3)| 00:00:01 | | |
| 14 | 2 | 19 | VIEW | | 127M| 851M| | 1988K (3)| 05:08:49 | | |
| 15 | 14 | 18 | UNION-ALL | | | | | | | | |
| 16 | 15 | 14 | HASH UNIQUE | | 21M| 184M| 412M| 235K (4)| 00:36:33 | | |
| 17 | 16 | 13 | PARTITION RANGE ALL | | 21M| 184M| | 164K (4)| 00:25:30 | 1 | 32 |
| 18 | 17 | 12 | TABLE ACCESS FULL | E | 21M| 184M| | 164K (4)| 00:25:30 | 1 | 32 |
| 19 | 15 | 17 | HASH UNIQUE | | 106M| 910M| 4569M| 1752K (2)| 04:32:17 | | |
| 20 | 19 | 16 | PARTITION RANGE ALL | | 238M| 2048M| | 1286K (1)| 03:19:54 | 1 | 32 |
| 21 | 20 | 15 | TABLE ACCESS FULL | F | 238M| 2048M| | 1286K (1)| 03:19:54 | 1 | 32 |
|* 22 | 1 | 22 | TABLE ACCESS BY INDEX ROWID | G | 1 | 36 | | 3 (0)| 00:00:01 | | |
|* 23 | 22 | 21 | INDEX UNIQUE SCAN | UK_G | 1 | | | 2 (0)| 00:00:01 | | |
-----------------------------------------------------------------------------------------------------------------------------------------------------

How to get the max value for elements having the same id under Laravel

Using Laravel/Eloquent, I would like to retrieve the max value for each week_id in the following table.
+---------+-----------+
| week_id | value |
+---------+-----------+
| 5 | |
| 6 | 1 |
| 6 | |
| 6 | |
| 7 | 3 |
| 7 | 4 |
| 7 | |
+---------+-----------+
With MySQL I would do it like this:
SELECT week_id, max(value) as max_value FROM foo_table GROUP BY week_id
=>
+---------+-----------+
| week_id | max_value |
+---------+-----------+
| 5 | |
| 6 | 1 |
| 7 | 4 |
+---------+-----------+
How could I achieve the same under Laravel?
Try this:
DB::table('foo_table')
->select('week_id', DB::raw('max(value) as max_value'))
->groupBy('week_id')
->get();

Sort values in an associative array pl/sql

If the ID is even, I must sort the values that correspond to that ID in descending order; if the ID is odd, I must sort them in ascending order. This is the table, called Grades.
ID|COL1|COL2|COL3|COL4|COL5|COL6|COL7|
1 | 6 | 3 | 8 | 4 | 7 | 8 | 4 |
2 | 5 | 7 | 9 | 2 | 1 | 7 | 8 |
3 | 2 | 7 | 4 | 8 | 1 | 5 | 9 |
4 | 8 | 4 | 7 | 9 | 4 | 1 | 4 |
5 | 7 | 5 | 2 | 5 | 2 | 6 | 4 |
The result must be this:
ID|COL1|COL2|COL3|COL4|COL5|COL6|COL7|
1 | 3 | 4 | 4 | 6 | 7 | 8 | 8 |
2 | 9 | 8 | 7 | 7 | 5 | 2 | 1 |
3 | 1 | 2 | 4 | 5 | 7 | 8 | 9 |
4 | 9 | 8 | 7 | 4 | 4 | 4 | 1 |
5 | 2 | 2 | 4 | 5 | 5 | 6 | 7 |
As you can see, ID=1 is an odd number, so its values must be sorted ASC.
This is the code so far:
declare
type grades_array is table of grades%rowtype index by pls_integer;
grades_a grades_array;
cnt number;
begin
Select count(id) into cnt from grades;
For i in 1..cnt loop
--I used an associative array
Select * into grades_a(i) from grades where grades.id=i;
end loop;
For i in grades_a.FIRST..grades_a.LAST loop
if (mod(grades_a(i).id,2)=1)then .......
--I don't know how to sort the specific rows, in this case ASC
--dbms_output.put_line(grades_a(i).col1);
end if;
end loop;
--Also it is specified in the exercise that the table can change, e.g add more columns
end;
I would simply use PIVOT/UNPIVOT for this.
First UNPIVOT the table and assign a rank to each column value in ascending/descending order.
Query 1:
SELECT id,
colval,
ROW_NUMBER () OVER (
PARTITION BY id
ORDER BY CASE MOD (id, 2) WHEN 1 THEN colval END,
CASE MOD (id, 2) WHEN 0 THEN colval END DESC) r
FROM x UNPIVOT (colval FOR colname
IN (col1 AS 'col1', col2 AS 'col2', col3 AS 'col3', col4 AS 'col4',
col5 AS 'col5', col6 AS 'col6', col7 AS 'col7')
)
Results:
| ID | COLVAL | R |
|----|--------|---|
| 1 | 3 | 1 |
| 1 | 4 | 2 |
| 1 | 4 | 3 |
| 1 | 6 | 4 |
| 1 | 7 | 5 |
| 1 | 8 | 6 |
| 1 | 8 | 7 |
| 2 | 9 | 1 |
| 2 | 8 | 2 |
| 2 | 7 | 3 |
| 2 | 7 | 4 |
| 2 | 5 | 5 |
| 2 | 2 | 6 |
| 2 | 1 | 7 |
| 3 | 1 | 1 |
| 3 | 2 | 2 |
| 3 | 4 | 3 |
| 3 | 5 | 4 |
| 3 | 7 | 5 |
| 3 | 8 | 6 |
| 3 | 9 | 7 |
| 4 | 9 | 1 |
| 4 | 8 | 2 |
| 4 | 7 | 3 |
| 4 | 4 | 4 |
| 4 | 4 | 5 |
| 4 | 4 | 6 |
| 4 | 1 | 7 |
| 5 | 2 | 1 |
| 5 | 2 | 2 |
| 5 | 4 | 3 |
| 5 | 5 | 4 |
| 5 | 5 | 5 |
| 5 | 6 | 6 |
| 5 | 7 | 7 |
Then PIVOT the result based on the rank.
Query 2:
WITH pivoted AS (
SELECT id,
colval,
ROW_NUMBER () OVER (
PARTITION BY id
ORDER BY CASE MOD (id, 2) WHEN 1 THEN colval END,
CASE MOD (id, 2) WHEN 0 THEN colval END DESC) r
FROM x UNPIVOT (colval FOR colname
IN (col1 AS 'col1', col2 AS 'col2', col3 AS 'col3', col4 AS 'col4',
col5 AS 'col5', col6 AS 'col6', col7 AS 'col7')
)
)
SELECT * FROM pivoted
PIVOT (MAX (colval)
FOR r
IN (1 AS col1, 2 AS col2, 3 AS col3, 4 AS col4,
5 AS col5, 6 AS col6, 7 AS col7))
Results:
| ID | COL1 | COL2 | COL3 | COL4 | COL5 | COL6 | COL7 |
|----|------|------|------|------|------|------|------|
| 1 | 3 | 4 | 4 | 6 | 7 | 8 | 8 |
| 2 | 9 | 8 | 7 | 7 | 5 | 2 | 1 |
| 3 | 1 | 2 | 4 | 5 | 7 | 8 | 9 |
| 4 | 9 | 8 | 7 | 4 | 4 | 4 | 1 |
| 5 | 2 | 2 | 4 | 5 | 5 | 6 | 7 |
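For readers who want to check the logic outside the database, the same unpivot, rank, and pivot steps can be mirrored in a short Python sketch. This is not the PL/SQL solution the exercise asks for. The `grades` list simply copies the sample table from the question, and the variable names are illustrative.

```python
from collections import defaultdict

# Sample data from the question: (ID, COL1 .. COL7).
grades = [
    (1, 6, 3, 8, 4, 7, 8, 4),
    (2, 5, 7, 9, 2, 1, 7, 8),
    (3, 2, 7, 4, 8, 1, 5, 9),
    (4, 8, 4, 7, 9, 4, 1, 4),
    (5, 7, 5, 2, 5, 2, 6, 4),
]

# Step 1 (UNPIVOT): one (id, value) pair per cell.
unpivoted = [(row[0], value) for row in grades for value in row[1:]]

# Step 2 (PARTITION BY id): collect the values belonging to each id.
by_id = defaultdict(list)
for row_id, value in unpivoted:
    by_id[row_id].append(value)

# Step 3 (ROW_NUMBER + PIVOT): order ascending for odd ids and descending
# for even ids; position r in the sorted list becomes column r.
result = {
    row_id: sorted(values, reverse=(row_id % 2 == 0))
    for row_id, values in by_id.items()
}

for row_id, values in result.items():
    print(row_id, values)
```

Because the values are gathered per ID before sorting, the sketch does not depend on the number of columns, which matches the note in the question that the table may gain more columns.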
