Very strange results in MySQL 5.7 (specifically 5.7.13-0ubuntu0.16.04.2 ).
I suspect this could be a bug in MySQL.
DROP TABLE IF EXISTS `test_grids_1`;
CREATE TABLE `test_grids_1` (
`unq_id` int(11) NOT NULL DEFAULT '0',
`var_fld` int(11) DEFAULT '0'
) ENGINE=InnoDB;
INSERT INTO `test_grids_1` VALUES
(1,4500),
(2,6000);
DROP TABLE IF EXISTS `test_grid_dtl_1`;
CREATE TABLE `test_grid_dtl_1` (
`dtl_id` int(11) NOT NULL DEFAULT '0',
`unq_id` int(11) DEFAULT '0',
`dtl_var` decimal(14,2) DEFAULT '0.00'
) ENGINE=InnoDB;
INSERT INTO `test_grid_dtl_1` VALUES
(1,1,2.00),
(2,1,2.40),
(3,2,2.30);
SELECT
( g.calc_var * d.dtl_var ) new_var,
g.calc_var
FROM
(
SELECT
unq_id,
IF ( var_fld > 5000, ( 1 / var_fld ) , 5000 ) calc_var
FROM
test_grids_1
) g
INNER JOIN
test_grid_dtl_1 d
ON d.unq_id = g.unq_id;
+--------------+----------+
| new_var | calc_var |
+--------------+----------+
| 10000.000000 | 5000 |
| 12000.000000 | 5000 |
| 0.000383 | 0.0002 |
+--------------+----------+
SELECT
( g.calc_var * d.dtl_var ) new_var,
g.calc_var
FROM
(
SELECT
unq_id,
IF ( var_fld > 5000, ( 1 / var_fld ) , 5000 ) calc_var
FROM
test_grids_1
) g
INNER JOIN
test_grid_dtl_1 d
ON d.unq_id = g.unq_id
ORDER BY
1;
+--------------+----------+
| new_var | calc_var |
+--------------+----------+
| 0.000383 | 0.0002 |
| 10000.000000 | 99.9999 |
| 12000.000000 | 99.9999 |
+--------------+----------+
3 rows in set (0.00 sec)
When the sort is included it causes the returned values for certain criteria to be completely incorrect.
Values that are expected to be 5000 are suddenly 99.9999.
If anyone could please check and confirm similar behaviour on other 5.7 installations, it would be great.
Thanks
What's going on?
The query is being bitten by MySQL's implicit casting when the join is combined with the ordering.
Solution
Let's solve the problem first and then discuss how we got there. Notice the change of 1/var_fld to 1.0/var_fld and 5000 to 5000.0.
SELECT g.calc_var * d.dtl_var as new_var, g.calc_var
FROM (
SELECT unq_id, IF (var_fld > 5000, 1.0/var_fld, 5000.0) calc_var
FROM test_grids_1
) g
INNER JOIN test_grid_dtl_1 d ON d.unq_id = g.unq_id
ORDER BY 1
+---------------+------------+
| new_var | calc_var |
+---------------+------------+
| 0.0003833 | 0.00017 |
| 10000.0000000 | 5000.00000 |
| 12000.0000000 | 5000.00000 |
+---------------+------------+
You could also rewrite the query slightly differently by using cast. Notice that I have included the hexadecimal value in the last column. It will be useful as you read on:
SELECT g.calc_var * d.dtl_var as new_var, g.calc_var, hex(g.calc_var) as hcalc_var
FROM (
SELECT unq_id, IF (var_fld > 5000, cast(1/var_fld as decimal(15,5)), 5000.0) calc_var
FROM test_grids_1
) g
INNER JOIN test_grid_dtl_1 d ON d.unq_id = g.unq_id
ORDER BY 1
+---------------+------------+-----------+
| new_var | calc_var | hcalc_var |
+---------------+------------+-----------+
| 0.0003910 | 0.00017 | 0 |
| 10000.0000000 | 5000.00000 | 1388 |
| 12000.0000000 | 5000.00000 | 1388 |
+---------------+------------+-----------+
Other solution
Notice the replacement of if with case statement.
SELECT g.calc_var * d.dtl_var as new_var, g.calc_var
FROM (
SELECT unq_id, case when var_fld > 5000 then 1/var_fld else 5000 end calc_var
FROM test_grids_1
) g
INNER JOIN test_grid_dtl_1 d ON d.unq_id = g.unq_id
ORDER BY 1
+--------------+-----------+
| new_var | calc_var |
+--------------+-----------+
| 0.000383 | 0.0002 |
| 10000.000000 | 5000.0000 |
| 12000.000000 | 5000.0000 |
+--------------+-----------+
Notice how the case statement didn't need any kind of casting to achieve almost the same result. However, to get exactly the same result as the first query, you'd have to do something like this:
Yet another one
Notice the 1.0/var_fld and 5000.0 along with case instead of if
SELECT g.calc_var * d.dtl_var as new_var, g.calc_var
FROM (
SELECT unq_id, case when var_fld > 5000 then 1.0/var_fld else 5000.0 end calc_var
FROM test_grids_1
) g
INNER JOIN test_grid_dtl_1 d ON d.unq_id = g.unq_id
ORDER BY 1
How did this come to light?
Let's look at the original query; I've added a new field hex(g.calc_var) that is a hexadecimal representation of g.calc_var.
SELECT g.calc_var * d.dtl_var as new_var, g.calc_var, hex(g.calc_var) as hcalc_var
FROM (
SELECT unq_id, IF (var_fld > 5000, 1/var_fld, 5000) calc_var
FROM test_grids_1
) g
INNER JOIN test_grid_dtl_1 d ON d.unq_id = g.unq_id
ORDER BY 1
+--------------+----------+-----------+
| new_var | calc_var | hcalc_var |
+--------------+----------+-----------+
| 0.000383 | 0.0002 | 0 |
| 10000.000000 | 99.9999 | 1388 |
| 12000.000000 | 99.9999 | 1388 |
+--------------+----------+-----------+
Compare the results with the very first query in the solution part
SELECT g.calc_var * d.dtl_var as new_var, g.calc_var, hex(g.calc_var) as hcalc_var
FROM (
SELECT unq_id, IF (var_fld > 5000, 1.0/var_fld, 5000.0) calc_var
FROM test_grids_1
) g
INNER JOIN test_grid_dtl_1 d ON d.unq_id = g.unq_id
ORDER BY 1
+---------------+------------+-----------+
| new_var | calc_var | hcalc_var |
+---------------+------------+-----------+
| 0.0003833 | 0.00017 | 0 |
| 10000.0000000 | 5000.00000 | 1388 |
| 12000.0000000 | 5000.00000 | 1388 |
+---------------+------------+-----------+
Notice that the hex value is exactly the same in both the queries but the decimal values are different.
How can 5000 end up as 99.9999?
select cast(5000 as decimal(6,4)) as test;
+---------+
| test |
+---------+
| 99.9999 |
+---------+
1 row in set, 1 warning (0.00 sec)
show warnings;
+---------+------+-----------------------------------------------+
| Level | Code | Message |
+---------+------+-----------------------------------------------+
| Warning | 1264 | Out of range value for column 'test' at row 1 |
+---------+------+-----------------------------------------------+
Like that! When 5000 is cast to a decimal with length 6 including 4 decimals, the result is the maximum that will fit decimal(6,4). Ouch.
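To see where 99.9999 comes from outside MySQL, here is a minimal Python sketch that mimics the clamping. The helper name clamp_to_decimal is made up for illustration, and it only assumes the clamp-on-overflow behaviour shown by the warning above; it is not MySQL's actual implementation.

```python
from decimal import Decimal, ROUND_HALF_UP

def clamp_to_decimal(value, precision, scale):
    """Clamp value into the range of DECIMAL(precision, scale).

    Hypothetical helper: mimics the out-of-range clamping seen above,
    it is not MySQL's actual implementation.
    """
    quantum = Decimal(1).scaleb(-scale)                       # smallest step, e.g. 0.0001
    max_value = Decimal(10) ** (precision - scale) - quantum  # largest value, e.g. 99.9999
    rounded = Decimal(value).quantize(quantum, rounding=ROUND_HALF_UP)
    return max(-max_value, min(max_value, rounded))

clamped = clamp_to_decimal(5000, 6, 4)
print(clamped)            # 99.9999
print(format(5000, "X"))  # 1388, the hex value seen in hcalc_var
```

DECIMAL(6,4) leaves only two digits before the decimal point, so anything at or above 100 collapses to 99.9999, while the hex column still shows the untouched 5000.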
In this case a warning is thrown, which is good. One could catch it during testing. However, the query in question does not throw any warning. And that's not good.
This leads to multiple questions
Why is casting happening correctly without order by?
What in the order by is causing casting to happen the way we noticed?
Why is case...end showing better results than if(...) even when no casting is done?
Why is warning not thrown when casting like this is done?
You might want to file a bug report with the MySQL folks. I didn't have MariaDB installed at first, so I couldn't say whether the issue exists there too. I got curious and installed 10.0.25-MariaDB-0ubuntu0.16.04.1. It appears the same issue is present in MariaDB 10.0.25 as well.
Is this preventable?
Yes. Whenever dealing with int to float/double/decimal conversions, cast explicitly to get predictable results. I still expect MySQL to look into this corner case. If someone comes across documentation that explains this behavior, please add a comment to this answer so that I can educate myself.
Related
I encountered a very weird result while trying to filter my data using RAND() function.
Suppose I have a table filled with some data:
CREATE TABLE `status_log` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`rank` int(11) DEFAULT 50,
PRIMARY KEY (`id`)
);
Then I do the following simple select:
select id,rank as rank,(rand()*100) as thres
from status_log
where rank = 50
and have a clear and expected output:
<...skip...>
| 6575476 | 50 | 34.51090244065123 |
| 6575511 | 50 | 67.84258230388404 |
| 6575589 | 50 | 35.68020727083106 |
| 6575644 | 50 | 74.87329251586766 |
| 6575723 | 50 | 67.32584384020961 |
| 6575771 | 50 | 12.009344726809621 |
| 6575863 | 50 | 58.06919518678374 |
+---------+------+-----------------------+
66169 rows in set (2.502 sec).
So, I generate a random value from 0 to 100 for each row of the table, around 66000 rows in total.
Then I want only a (random) part of the data to be shown. It doesn't serve any production purpose, by the way; it's just an artificial test, so let's not discuss that.
select *
from (
select id,rank as rank,(rand()*100) as thres
from status_log
where rank = 50) t
where thres>rank
order by thres;
After that I get the following:
<...skip...>
| 4396732 | 50 | 99.97966075314177 |
| 4001782 | 50 | 99.98002871869134 |
| 1788580 | 50 | 99.98064143581375 |
| 5300286 | 50 | 99.98275954274717 |
| 146401 | 50 | 99.98552389441573 |
| 4744748 | 50 | 99.98644758014609 |
+---------+------+--------------------+
16449 rows in set (2.188 sec)
It's obvious that for the mean of 50 the expected number of results should be around 33000 out of total 66000. So it seems that the distribution of rand() is biased, correct?
Let's then change > to <:
select *
from (
select id,rank as rank,(rand()*100) as thres
from status_log
where rank = 50) t
where thres<rank
order by thres;
<...skip...>
| 4653786 | 50 | 49.98035016467827 |
| 6041489 | 50 | 49.980370281245904 |
| 5064204 | 50 | 49.989308742796354 |
| 1699741 | 50 | 49.991373205549436 |
| 3234039 | 50 | 49.99390454030959 |
| 806791 | 50 | 49.99575274996064 |
| 3713581 | 50 | 49.99814410693771 |
+---------+------+----------------------+
16562 rows in set (2.373 sec)
Again about 16000! So a quarter of the rows is returned, not half.
It seems that the output of rand() inside the subquery is somehow influenced by the expression outside it. How is this possible?
I can also union the two halves:
select * from (select id,rank as rank,(rand()*100) as thres from status_log where rank = 50) t where thres<50
UNION ALL
select * from (select id,rank as rank,(rand()*100) as thres from status_log where rank = 50) t where thres>=50;
The expected number of results has to be somewhere around 66000, but it returns only 33000 or so.
I observe this behavior only when rand() is non-deterministic and is generated dynamically each time. If I do ...select id,rank as rank,(rand(id)*100)... (i.e. make the output of rand() dependent on id), I start getting the expected number of results (33000-ish). The same happens if I precalculate the value and store it in a temporary field in the table.
I also tried filtering with rank=30, and the results were ~6000 and ~32000 for < and > respectively.
Version 10.5.8-MariaDB-3, InnoDB
Using a single query with HAVING instead of a subquery with WHERE in the main query seems to work around it.
select id,rank as rank,(rand()*100) as thres
from status_log
where rank = 50
having thres > rank
order by thres
This appears to be this bug:
RAND() evaluated and filtered twice with subquery
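The numbers in the question are consistent with the filter being applied twice with independent random draws. A small Python simulation (a sketch of that hypothesis, not of MariaDB's internals; count_passing is a made-up name) reproduces the one-quarter ratio:

```python
import random

def count_passing(n_rows, rank, evaluations, seed=0):
    """Count rows that pass `rand()*100 > rank` on every evaluation.

    evaluations=1 models the filter being checked once per row;
    evaluations=2 models it being checked twice with fresh random draws.
    """
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_rows):
        if all(rng.random() * 100 > rank for _ in range(evaluations)):
            passed += 1
    return passed

n = 66169  # row count from the question
once = count_passing(n, 50, evaluations=1)
twice = count_passing(n, 50, evaluations=2)
print(once / n, twice / n)  # roughly 0.5 and 0.25
```

Under the same model, rank = 30 predicts about 0.7 × 0.7 = 49% of rows for > and 0.3 × 0.3 = 9% for <, which is in line with the ~32000 and ~6000 reported, assuming a similar row count.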
I have two tables like this:
first table
+------------+---------------+--------+
| pk | user_one |user_two|
+------------+---------------+--------+
second table
+------------+---------------+--------+----------------+----------------+
| pk | sender |receiver|fk of firsttable|content |
+------------+---------------+--------+----------------+----------------+
First and second table have one to many(1:N) relations.
There are many records in second table:
| pk | sender | receiver | fk of firsttable | content |
| 120 | car224 | car223 | 1 | test message1 to 223 |
| 121 | car224 | car223 | 1 | test message2 to 223 |
| 122 | car224 | car225 | 21 | test message1 to 225 |
| 123 | car224 | car225 | 21 | test message2 to 225 |
| 124 | car224 | car225 | 21 | test message3 to 225 |
| 125 | car224 | car225 | 21 | test message4 to 225 |
Among the rows that share the same fk value, I want the one with the largest pk.
I've changed the above column name to make it easier to understand.
Here is the actual sql I've tried so far:
select *
from (select rownum rn,
mr.mrno,
mr.user_one,
mr.user_two,
m.mno,
m.content
from tbl_messagerelation mr,
tbl_message m
where (mr.user_one = 'car224' or
mr.user_two='car224') and
m.rowid in (select max(rowid)
from tbl_message
group by m.mno) and
rownum <= 1*20)
where rn > (1-1) * 20
And this is the result:
+---------+-------+----------+----------+-------------------------+----------------------+
| rn | mrno | user_one | user_two | mno(pk of second table) | content |
+---------+-------+----------+----------+-------------------------+----------------------+
| 1 | 1 | car224 | car223 | 125 | test message4 to 225 |
| 2 | 21 | car224 | car225 | 125 | test message4 to 225 |
+---------+-------+----------+----------+-------------------------+----------------------+
My desired result is something like this:
+---------+---------+----------+--------------------+----------------------+
| fk | sender | receiver | pk of second table | content |
+---------+---------+----------+--------------------+----------------------+
| 1 | car224 | car223 | 121 | test message2 to 223 |
| 21 | car224 | car225 | 125 | test message4 to 225 |
+---------+---------+----------+--------------------+----------------------+
Your table description is confusing when compared to your query. However, as far as I can tell, you are probably looking for row_number().
An important piece of advice: use standard explicit JOIN syntax rather than the outdated a,b syntax for joins. The join keys were not clear to me, so replace the placeholders appropriately in your final query.
select * from
(
select mr.*, m.*, row_number() over ( partition by m.fk order by m.pk desc ) as rn
from tbl_messagerelation mr join tbl_message m on mr.? = m.?
) where rn =1
Or perhaps you don't need that join at all
select * from
(
select m.*, row_number() over ( partition by m.fk order by m.pk desc ) as rn
from tbl_message m
) where rn =1
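For reference, this is what the "rn = 1" filter computes, sketched in plain Python on the sample rows from the question. The function name latest_per_fk and the tuple layout are made up for illustration.

```python
# (pk, sender, receiver, fk, content) rows copied from the question
messages = [
    (120, "car224", "car223", 1, "test message1 to 223"),
    (121, "car224", "car223", 1, "test message2 to 223"),
    (122, "car224", "car225", 21, "test message1 to 225"),
    (123, "car224", "car225", 21, "test message2 to 225"),
    (124, "car224", "car225", 21, "test message3 to 225"),
    (125, "car224", "car225", 21, "test message4 to 225"),
]

def latest_per_fk(rows):
    """Keep, for each fk, the row with the largest pk
    (the row that row_number() ... order by pk desc ranks first)."""
    latest = {}
    for row in rows:
        pk, fk = row[0], row[3]
        if fk not in latest or pk > latest[fk][0]:
            latest[fk] = row
    return [latest[fk] for fk in sorted(latest)]

for row in latest_per_fk(messages):
    print(row)
```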
I have a query with bind variables that comes from an external application.
The optimizer uses an unwanted index, and I want to force it to use another plan.
So I generated the good plan using an index hint, then created a baseline containing both plans,
attached the wanted plan to the query's sql_id, and changed its fixed attribute to 'YES'.
I executed the DBMS_XPLAN.DISPLAY_SQL_PLAN_BASELINE function,
and the output shows that the wanted plan is marked as fixed=yes.
So why does the query still run with the bad plan?
The code:
-- Query
SELECT DISTINCT t_01.puid
FROM PWORKSPACEOBJECT t_01 , PPOM_APPLICATION_OBJECT t_02
WHERE ( ( UPPER(t_01.pobject_type) IN ( UPPER( :1 ) , UPPER( :2 ) )
AND ( t_02.pcreation_date >= :3 ) ) AND ( t_01.puid = t_02.puid ) )
-- get the text
select sql_fulltext
from v$sqlarea
where sql_id = '21pts328r2nb7' and rownum = 1;
-- prepare the explain plan
explain plan for
SELECT DISTINCT t_01.puid
FROM PWORKSPACEOBJECT t_01 , PPOM_APPLICATION_OBJECT t_02
WHERE ( ( UPPER(t_01.pobject_type) IN ( UPPER( :1 ) , UPPER( :2 ) )
AND ( t_02.pcreation_date >= :3 ) ) AND ( t_01.puid = t_02.puid ) ) ;
-- we can see that there is no use of index - PIPIPWORKSPACEO_2
select * from table(dbms_xplan.display);
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10382 | 517K| 61553 |
| 1 | HASH UNIQUE | | 10382 | 517K| 61553 |
| 2 | HASH JOIN | | 158K| 7885K| 61549 |
| 3 | INLIST ITERATOR | | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| PWORKSPACEOBJECT | 158K| 4329K| 52689 |
| 5 | INDEX RANGE SCAN | PIPIPWORKSPACEO_3 | 158K| | 534 |
| 6 | INDEX RANGE SCAN | DBTAO_IX1_PPOM | 3402K| 74M| 2911 |
------------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
-- generate plan with the wanted index
explain plan for
select /*+ index(t_01 PIPIPWORKSPACEO_2)*/ distinct t_01.puid
from pworkspaceobject t_01 , ppom_application_object t_02
where ( ( upper(t_01.pobject_type) in ( upper( :1 ) , upper( :2 ) )
and ( t_02.pcreation_date >= :3 ) ) and ( t_01.puid = t_02.puid ) ) ;
-- the index working - the index used
select * from table(dbms_xplan.display);
-----------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10382 | 517K| 223K|
| 1 | HASH UNIQUE | | 10382 | 517K| 223K|
| 2 | HASH JOIN | | 158K| 7885K| 223K|
| 3 | TABLE ACCESS BY INDEX ROWID| PWORKSPACEOBJECT | 158K| 4329K| 214K|
| 4 | INDEX FULL SCAN | PIPIPWORKSPACEO_2 | 158K| | 162K|
| 5 | INDEX RANGE SCAN | DBTAO_IX1_PPOM | 3402K| 74M| 2911 |
-----------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
-- get the sql_id of the query with the good index
-- 7t72qvghr0zqh
select sql_id from v$sqlarea where sql_text like 'select /*+ index(t_01 PIPIPWORKSPACEO_2)%';
-- get the plan hash value of the good plan by the sql_id
--4040955653
select plan_hash_value from v$sql_plan where sql_id = '7t72qvghr0zqh';
-- get the plan hash value of the bad plan by the sql_id
--1044780890
select plan_hash_value from v$sql_plan where sql_id = '21pts328r2nb7';
-- load the source plan
begin
dbms_output.put_line(
dbms_spm.load_plans_from_cursor_cache
( sql_id => '21pts328r2nb7' )
);
END;
-- the new base line created with the bad plan
select * from dba_sql_plan_baselines;
-- load the good plan of the second sql_id (with the wanted index)
-- and bind it to the sql_handle of the source query
begin
dbms_output.put_line(
DBMS_SPM.LOAD_PLANS_FROM_CURSOR_CACHE
( sql_id => '7t72qvghr0zqh',
plan_hash_value => 4040955653,
sql_handle => 'SQL_4afac4211aa3317d' )
);
end;
-- now there are 2 plans bound to the same sql_handle and sql_text
select * from dba_sql_plan_baselines;
-- alter the good one to be fixed
begin
dbms_output.put_line(
dbms_spm.alter_sql_plan_baseline
( sql_handle =>
'SQL_4afac4211aa3317d',
PLAN_NAME => 'SQL_PLAN_4pyq444da6cbxf7c97cc7',
ATTRIBUTE_NAME => 'fixed',
ATTRIBUTE_VALUE => 'YES'
)) ;
end;
-- check the good plan - fixed = yes
select * from table(
dbms_xplan.display_sql_plan_baseline (
sql_handle => 'SQL_4afac4211aa3317d',
plan_name => 'SQL_PLAN_4pyq444da6cbxf7c97cc7',
format => 'ALL'));
--------------------------------------------------------------------------------
SQL handle: SQL_4afac4211aa3317d
SQL text: SELECT DISTINCT t_01.puid FROM PWORKSPACEOBJECT t_01 ,
PPOM_APPLICATION_OBJECT t_02 WHERE ( ( UPPER(t_01.pobject_type) IN (
UPPER( :1 ) , UPPER( :2 ) ) AND ( t_02.pcreation_date >= :3 ) ) AND (
t_01.puid = t_02.puid ) )
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
Plan name: SQL_PLAN_4pyq444da6cbxf7c97cc7 Plan id: 4157177031
Enabled: YES Fixed: YES Accepted: YES Origin: MANUAL-LOAD
--------------------------------------------------------------------------------
Plan hash value: 4040955653
-----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
-----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10382 | 517K| | 223K (1)| 00:44:37 |
| 1 | HASH UNIQUE | | 10382 | 517K| | 223K (1)| 00:44:37 |
|* 2 | HASH JOIN | | 158K| 7885K| 6192K| 223K (1)| 00:44:37 |
| 3 | TABLE ACCESS BY INDEX ROWID| PWORKSPACEOBJECT | 158K| 4329K| | 214K (1)| 00:42:50 |
|* 4 | INDEX FULL SCAN | PIPIPWORKSPACEO_2 | 158K| | | 162K (1)| 00:32:25 |
|* 5 | INDEX RANGE SCAN | DBTAO_IX1_PPOM | 3402K| 74M| | 2911 (1)| 00:00:35 |
-----------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$1
3 - SEL$1 / T_01#SEL$1
4 - SEL$1 / T_01#SEL$1
5 - SEL$1 / T_02#SEL$1
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("T_01"."PUID"="T_02"."PUID")
4 - filter(UPPER("POBJECT_TYPE")=UPPER(:1) OR UPPER("POBJECT_TYPE")=UPPER(:2))
5 - access("T_02"."PCREATION_DATE">=:3)
Column Projection Information (identified by operation id):
-----------------------------------------------------------
1 - (#keys=1) "T_01"."PUID"[VARCHAR2,15]
2 - (#keys=1) "T_01"."PUID"[VARCHAR2,15]
3 - "T_01"."PUID"[VARCHAR2,15]
4 - "T_01".ROWID[ROWID,10]
5 - "T_02"."PUID"[VARCHAR2,15]
Note
-----
- 'PLAN_TABLE' is old version
-- run explain plan for the query
-- need to use the new plan
declare
v_string clob;
begin
select sql_fulltext
into v_string
from v$sqlarea
where sql_id = '21pts328r2nb7' and rownum = 1;
execute immediate 'explain plan for ' || v_string using '1','1',sysdate;
end;
-- check the plan - still the unwanted index and plan
select * from table(dbms_xplan.display);
------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 10382 | 517K| 61553 |
| 1 | HASH UNIQUE | | 10382 | 517K| 61553 |
| 2 | HASH JOIN | | 158K| 7885K| 61549 |
| 3 | INLIST ITERATOR | | | | |
| 4 | TABLE ACCESS BY INDEX ROWID| PWORKSPACEOBJECT | 158K| 4329K| 52689 |
| 5 | INDEX RANGE SCAN | PIPIPWORKSPACEO_3 | 158K| | 534 |
| 6 | INDEX RANGE SCAN | DBTAO_IX1_PPOM | 3402K| 74M| 2911 |
------------------------------------------------------------------------------------
Note
-----
- 'PLAN_TABLE' is old version
From a read through of your test case, I suspect the problem is that you're interpreting the FIXED attribute incorrectly.
If you list all the plans for your baseline, you will probably find the original and the loaded cursor plan are both ENABLED and ACCEPTED at the moment. I think what you need to do (based on my own usage of these calls) is use the ENABLED attribute. Set ENABLED to NO for the unwanted plan.
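Try the following (alter_sql_plan_baseline is a function, so its return value has to be captured):
declare
n pls_integer;
begin
n := dbms_spm.alter_sql_plan_baseline(
sql_handle=>'SQL_...' -- baseline to update
,plan_name=>'SQL_PLAN_...' -- unwanted plan signature to disable
,attribute_name=>'ENABLED',attribute_value=>'NO');
end;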
I want to pivot multiple columns. How can I do this with Oracle's PIVOT?
SQL:
SELECT * FROM
(
SELECT *
FROM IRO_SIM A
WHERE A.COM_CODE = 'AAQ'
AND A.PCODE = 'AKIOP'
)
PIVOT
(
LISTAGG(SIMTYPE,',')
WITHIN GROUP (ORDER BY SIMTYPE)
FOR SIMTYPE IN ('H','V')
)
Sample Data:
COM_CODE | PCODE | L_VALUE | A_SIM | AMT_SIM | SIMTYPE
A | AKIOP | 1700 | TOTAL | 50 | H
A | AKIOP | 500 | EACH | 100 | V
A | BHUIO | 200 | TOTAL | 500 | H
A | BHUIO | 600 | TOTAL | 400 | V
Desired result:
COM_CODE | PCODE | H_VALUE | H_ASIM | H_AMTSIM | V_VALUE | V_ASIM | V_AMTSIM
A | AKIOP | 1700 | TOTAL | 50 | 500 | EACH | 100
A | BHUIO | 200 | TOTAL | 500 | 600 | TOTAL | 400
Thanks in advance :)
Just list the multiple columns. Every expression in your PIVOT clause will be matched with every value in the FOR clause. So, what you want is this:
SELECT * FROM d
PIVOT ( sum(l_value) as value, max(a_sim) as asim, sum(amt_sim) as amtsim
FOR simtype in ('H' AS "H", 'V' AS "V") )
With data...
with d as (
SELECT 'A' com_code, 'AKIOP' pcode, 1700 l_value, 'TOTAL' a_sim, 50 amt_sim, 'H' simtype FROM DUAL UNION ALL
SELECT 'A' com_code, 'AKIOP' pcode, 500 l_value, 'EACH' a_sim, 100 amt_sim, 'V' simtype FROM DUAL UNION ALL
SELECT 'A' com_code, 'BHUIO' pcode, 200 l_value, 'TOTAL' a_sim, 500 amt_sim, 'H' simtype FROM DUAL UNION ALL
SELECT 'A' com_code, 'BHUIO' pcode, 600 l_value, 'TOTAL' a_sim, 400 amt_sim, 'V' simtype FROM DUAL)
SELECT * FROM d
PIVOT ( sum(l_value) as value, max(a_sim) as asim, sum(amt_sim) as amtsim
FOR simtype in ('H' AS "H", 'V' AS "V") )
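To make the "every expression is matched with every FOR value" rule concrete, here is a rough Python sketch of the same pivot on the sample data. The names pivot and aggregates are made up for illustration, and the grouping on (com_code, pcode) reflects the columns not mentioned in the PIVOT clause.

```python
from collections import defaultdict

# (com_code, pcode, l_value, a_sim, amt_sim, simtype) rows from the question
rows = [
    ("A", "AKIOP", 1700, "TOTAL", 50, "H"),
    ("A", "AKIOP", 500, "EACH", 100, "V"),
    ("A", "BHUIO", 200, "TOTAL", 500, "H"),
    ("A", "BHUIO", 600, "TOTAL", 400, "V"),
]

# One entry per aggregate in the PIVOT clause: alias -> (column index, function)
aggregates = {"VALUE": (2, sum), "ASIM": (3, max), "AMTSIM": (4, sum)}

def pivot(rows, for_values=("H", "V")):
    """Build one output row per (com_code, pcode) with a column for
    every combination of FOR value and aggregate alias."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[1])].append(row)
    out = []
    for (com_code, pcode), members in sorted(groups.items()):
        rec = {"COM_CODE": com_code, "PCODE": pcode}
        for v in for_values:
            matching = [m for m in members if m[5] == v]
            for alias, (idx, agg) in aggregates.items():
                rec[f"{v}_{alias}"] = agg(m[idx] for m in matching) if matching else None
        out.append(rec)
    return out

for rec in pivot(rows):
    print(rec)
```

Three aggregates times two FOR values give the six pivoted columns H_VALUE through V_AMTSIM.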
I'm wondering if it is possible to create a calculated member to obtain the sum of distinct values for a fact. I will try to explain it with the following example:
I have a fact where the primary key is related with two dimensions (one to many cardinality). The fact contains a measure and its value is the same for all members of each distinct combination of FACT_ID and DIM_1_ID. For the total, I don't want to consider multiple times the same values. So, with the following values the total should be 450 and not 850 (default Mondrian behavior).
| FACT_ID | DIM_1_ID | DIM_2_ID | MEASURE |
|---------|----------|----------|---------|
| 1 | A | D | 100 |
| 1 | A | E | 100 |
| 1 | B | F | 50 |
| 2 | A | D | 300 |
| 2 | A | E | 300 |
|---------|----------|----------|---------|
TOTAL | 450 |
Is it possible? How can it be done with Mondrian?
Thanks in advance
UPDATE - Current status
As described in one of the comments below, based on @whytheq's answer, I managed to calculate the right value for the total, using the following MDX formula for the measure:
Sum(
Order(
[dActivity.hActivity].[lActivity].MEMBERS*[dFacility.hFacility].[lFacility].MEMBERS,
[dActivity.hActivity].[lActivity].currentmember.name
) as [m_set] ,
iif(
[m_set].currentordinal = 0
OR
not(
[m_set]
.item([m_set].currentordinal)
.item(0).NAME
=
[m_set]
.item([m_set].currentordinal-1)
.item(0).NAME
) ,
[Measures].[mBudget]
,
0
)
)
However, this expression evaluates the complete set for every single row, so the result overrides the real value of the measure on the individual fact rows.
| FACT_ID | DIM_1_ID | DIM_2_ID | MEASURE |
|---------|----------|----------|---------|
| 1 | A | D | 450 |
| 1 | A | E | 450 |
| 1 | B | F | 450 |
| 2 | A | D | 450 |
| 2 | A | E | 450 |
|---------|----------|----------|---------|
TOTAL | 450 |
Great question - really tricky to do in MDX.
If we do the following then there are 158 rows returned - a handful have duplicate values for [Measures].[Internet Sales Amount]:
SELECT
[Measures].[Internet Sales Amount] ON 0
,NON EMPTY
Order
(
[Product].[Product].[Product]
,[Measures].[Internet Sales Amount]
,bdesc
) ON 1
FROM [Adventure Works];
This only counts them if the member above is different for the respective measure:
WITH
SET [x] AS
Order
(
NonEmpty
(
[Product].[Product].[Product]
,[Measures].[Internet Sales Amount]
)
,[Measures].[Internet Sales Amount]
,bdesc
)
SET [FILTERED] AS
Filter
(
[x]
,
(
[x].Item(
[x].CurrentOrdinal - 1)
,[Measures].[Internet Sales Amount]
)
<>
(
[x].Item(
[x].CurrentOrdinal)
,[Measures].[Internet Sales Amount]
)
)
MEMBER [Measures].[distCount] AS
Count([FILTERED])
SELECT
[Measures].[distCount] ON 0
FROM [Adventure Works];
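The idea behind the [FILTERED] set, comparing each element of an ordered set with its neighbour, can be sketched in Python as follows. This is an illustration of the technique only; count_distinct_sorted is a made-up name and the sketch makes no claim about how the MDX engine evaluates the set.

```python
def count_distinct_sorted(values):
    """Count distinct values by sorting and comparing neighbours.

    An element is counted only if it differs from the next one,
    so each run of equal values contributes exactly once.
    """
    ordered = sorted(values, reverse=True)
    count = 0
    for i, v in enumerate(ordered):
        is_last = i == len(ordered) - 1
        if is_last or v != ordered[i + 1]:
            count += 1
    return count

print(count_distinct_sorted([5, 3, 5, 1, 3]))  # 3
```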
Maybe try adding the EXISTING keyword into your calculation:
Sum
(
Order
(
EXISTING //<<<
[dActivity.hActivity].[lActivity].MEMBERS
*
[dFacility.hFacility].[lFacility].MEMBERS
,[dActivity.hActivity].[lActivity].CurrentMember.Name
) AS [m_set]
,IIF
(
[m_set].CurrentOrdinal = 0
OR
(NOT
[m_set].Item(
[m_set].CurrentOrdinal).Item(0).Name
=
[m_set].Item(
[m_set].CurrentOrdinal - 1).Item(0).Name)
,[Measures].[mBudget]
,0
)
)
You could try to obtain the average over the set. The code is a bit complex.
WITH SET SomeSet AS
{
Fact.FactID.FactID.MEMBERS
*
Fact.DimID1.DimID1.MEMBERS
*
Fact.DimID2.DimID2.MEMBERS
}
MEMBER Measures.AvgVal AS
AVG
(
{Fact.FactID.CURRENTMEMBER}
*
{Fact.DimID1.CURRENTMEMBER}
*
NonEmpty
(
Fact.DimID2.DimID2.MEMBERS,
{{Fact.FactID.CURRENTMEMBER} *
{Fact.DimID1.CURRENTMEMBER}} *
[Measures].[TheMeasure]
)
,
[Measures].[TheMeasure]
)
SELECT NON EMPTY SomeSet ON 1,
NON EMPTY {
[Measures].[TheMeasure],
Measures.AvgVal
} on 0
from [YourCube]
What I am doing is, for the current FactID-DimID1 combination on the axis, getting the list of all possible DimID2s and then, over the internally generated non-empty tuples of FactID-DimID1-DimID2, deriving the average value of the measure TheMeasure.
So, for example, a value of (100+100)/2 = 100 would be displayed for the combination of FactID = 1 and DimID1 = A.
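The arithmetic can be checked outside MDX with the sample rows from the question. This is a minimal sketch; the function name and tuple layout are made up for illustration.

```python
from collections import defaultdict

# (FACT_ID, DIM_1_ID, DIM_2_ID, MEASURE) rows from the question
facts = [
    (1, "A", "D", 100),
    (1, "A", "E", 100),
    (1, "B", "F", 50),
    (2, "A", "D", 300),
    (2, "A", "E", 300),
]

def averages_by_fact_and_dim1(rows):
    """Average the measure over DIM_2 for each (FACT_ID, DIM_1_ID) pair."""
    groups = defaultdict(list)
    for fact_id, dim1, _dim2, measure in rows:
        groups[(fact_id, dim1)].append(measure)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}

averages = averages_by_fact_and_dim1(facts)
total = sum(averages.values())
naive_total = sum(row[3] for row in facts)
print(averages)            # one value per (FACT_ID, DIM_1_ID)
print(total, naive_total)  # 450.0 850
```

Because the measure is constant within each FACT_ID and DIM_1_ID combination, the average equals the single underlying value, and summing the averages gives the 450 the question asks for instead of 850.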