A very strange thing has happened in our OBIEE. There were no modifications to the RPD or the database, but every analysis that contains a measure column now shows null values for that column. In all of them.
Here is one example that had been working fine until now.
Criteria:
Corresponding Result:
I checked the physical query generated for this simple analysis, and it is different:
WITH SAWITH0 AS
(select distinct T5520.CAL_DAY as c1, T3160.CODE as c2
from DM_FILIALS_V T3160 /* D04 Filials */,
DM_CALENDAR_V T5520 /* D03 Calendar */,
DM_FACT_DATA_V T74769 /* F44 Dm Fact Data */
where (T3160.CODE = T74769.FILIAL_CODE and T5520.CAL_DAY = T74769.PERIOD and
T5520.CAL_DAY = TO_DATE('2021-06-11', 'YYYY-MM-DD') and
T74769.PERIOD = TO_DATE('2021-06-11', 'YYYY-MM-DD')))
select D1.c1 as c1,
D1.c2 as c2,
D1.c3 as c3,
D1.c4 as c4,
D1.c5 as c5,
D1.c6 as c6
from (select D1.c1 as c1,
D1.c2 as c2,
D1.c3 as c3,
D1.c4 as c4,
D1.c5 as c5,
D1.c6 as c6
from (select 0 as c1,
D1.c1 as c2,
D1.c2 as c3,
cast(NULL as DOUBLE PRECISION) as c4,
cast(NULL as DOUBLE PRECISION) as c5,
cast(NULL as DOUBLE PRECISION) as c6,
ROW_NUMBER() OVER(PARTITION BY D1.c1, D1.c2 ORDER BY D1.c1 ASC, D1.c2 ASC) as c7
from SAWITH0 D1) D1
where (D1.c7 = 1)
order by c2, c3) D1
where rownum <= 10000000
Can anybody tell me what is going on here? I tried restarting the BI services from EM, but that didn't help.
If the underlying model isn't valid, you will always run into issues. It isn't "strange": you modeled something which - to the model - implied that the fact has no valid relationship with the dimension, i.e. that the fact can't be analyzed by that dimension. Think in terms of conformed and non-conformed dimensions; yours had become a non-conformed dimension.
Never forget that logically you model "relationships", not technical "joins".
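For contrast, when the BI Server considers the fact conformed to the selected dimensions, the physical SQL normally joins the fact and aggregates the measure instead of padding it with NULL. A rough sketch reusing the objects from the query above (FACT_VALUE is a placeholder measure column, not taken from your RPD):
select T5520.CAL_DAY as c1, T3160.CODE as c2,
sum(T74769.FACT_VALUE) as c3 /* aggregated measure instead of cast(NULL as DOUBLE PRECISION) */
from DM_FILIALS_V T3160 /* D04 Filials */,
DM_CALENDAR_V T5520 /* D03 Calendar */,
DM_FACT_DATA_V T74769 /* F44 Dm Fact Data */
where T3160.CODE = T74769.FILIAL_CODE
and T5520.CAL_DAY = T74769.PERIOD
and T5520.CAL_DAY = TO_DATE('2021-06-11', 'YYYY-MM-DD')
group by T5520.CAL_DAY, T3160.CODE
When the model instead tells the BI Server that the dimension is non-conformed for that measure, it stops joining the fact for those columns and returns the cast(NULL as DOUBLE PRECISION) placeholders you see.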
The whole point is to get peaks per period (e.g. 5-minute peaks) for a value that accumulates. So it needs to be summed per period, and then the peak (maximum) can be found among those sums: (select max(s) from (select sum(v) as s from t group by t1, a2)).
I have a base table t.
Data are inserted into t; consider two attributes (a time t1 and some string a2) and one numeric value v.
The value accumulates, so it needs to be summed to get the total volume over a certain period. Example of inserted rows:
t1 | a2 | v
----------------
date1 | b | 1
date2 | c | 20
I'm using an MV to compute sumState(), and from that I get the peaks using sumMerge() and then max().
I only need the max values, so I was wondering whether I could use maxState() directly.
So this is what I do now: I use an MV that computes a 5-minute sum, and from that I read max().
CREATE TABLE IF NOT EXISTS sums_table ON CLUSTER '{cluster}' (
t1 DateTime,
a2 String,
v AggregateFunction(sum, UInt32)
)
ENGINE = ReplicatedAggregatingMergeTree(
'...',
'{replica}'
)
PARTITION BY toDate(t1)
ORDER BY (a2, t1)
PRIMARY KEY (a2);
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_a
ON CLUSTER '{cluster}'
TO sums_table
AS
SELECT toStartOfFiveMinute(t1) AS t1, a2,
sumState(toUInt32(v)) AS v
FROM t
GROUP BY t1, a2
From that I'm able to read the max of the 5-minute sums per a2 using:
SELECT
a2,
max(sum) AS max
FROM (
SELECT
t1,
a2,
sumMerge(v) AS sum
FROM sums_table
WHERE t1 BETWEEN :fromDateTime AND :toDateTime
GROUP BY t1, a2
)
GROUP BY a2
ORDER BY max DESC
That works perfectly.
So I wanted to achieve the same using maxState and maxMerge():
CREATE TABLE IF NOT EXISTS max_table ON CLUSTER '{cluster}' (
t1 DateTime,
a2 String,
max_v AggregateFunction(max, UInt32)
)
ENGINE = ReplicatedAggregatingMergeTree(
'...',
'{replica}'
)
PARTITION BY toDate(t1)
ORDER BY (a2, t1)
PRIMARY KEY (a2);
CREATE MATERIALIZED VIEW IF NOT EXISTS mv_b
ON CLUSTER '{cluster}'
TO max_table
AS
SELECT
t1,
a2,
maxState(v) AS max_v
FROM (
SELECT
toStartOfFiveMinute(t1) AS t1,
a2,
toUInt32(sum(v)) AS v
FROM t
GROUP BY t1, a2
)
GROUP BY t1, a2
I thought that if I get a max per time (t1) and a2, and then select the max of that per a2, I'd get the maximum value for each a2. But I'm getting totally different max values using this query compared to the max of sums mentioned above:
SELECT
a2,
max(max) AS max
FROM (
SELECT
t1,
a2,
maxMerge(max_v) AS max
FROM max_table
WHERE t1 BETWEEN :fromDateTime AND :toDateTime
GROUP BY t1, a2
) maxs_per_time_and_a2
GROUP BY a2
What did I do wrong? Do I misunderstand MVs? Is it possible to use maxState() with maxMerge() on 2+ attributes to compute the max over a longer period, say a year?
SELECT
t1,
a2,
maxState(v) AS max_v
FROM (
SELECT
toStartOfFiveMinute(t1) AS t1,
a2,
toUInt32(sum(v)) AS v
FROM t
GROUP BY t1, a2
)
GROUP BY t1, a2
This is incorrect, and in fact impossible, because an MV is an insert trigger: it never reads the real table t.
You are getting the max of the sums of the rows in each insert buffer.
If you insert one row with v = 10, you will get max_v = 10. The materialized view does not "know" that a previous insert already added some rows; their sum is not taken into account.
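A minimal sketch of the effect, assuming t has the columns described in the question (timestamps and values are made up):
-- two separate inserts that land in the same 5-minute bucket for a2 = 'b'
INSERT INTO t (t1, a2, v) VALUES ('2021-06-11 10:00:10', 'b', 10);
INSERT INTO t (t1, a2, v) VALUES ('2021-06-11 10:00:20', 'b', 10);
-- sums_table path: sumMerge() combines the two partial states at query time,
-- so the 5-minute sum is 20 and the max of sums is 20
-- max_table path: each insert block computed maxState() over its own sum (10 each),
-- so the merged max is 10, not 20
SELECT a2, maxMerge(max_v) AS max FROM max_table GROUP BY a2;
Keeping the sumState() MV and taking max() of sumMerge() at query time, as in your first approach, is the reliable way to get the peaks over any window.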
I have 4 tables named A1, A2, B1, B2.
To fulfill a requirement, I have two ways to write SQL queries. The first one is:
(A1 UNION ALL A2) A JOIN (B1 UNION ALL B2) B ON A.id = B.a_id WHERE ...
And the second one is:
(A1 JOIN B1 on A1.id = B1.a_id WHERE ...) UNION ALL (A2 JOIN B2 on A2.id = B2.a_id WHERE ... )
I tried both approaches and noticed that they give the same execution time and query plans in some specific cases, but I'm unsure whether they will always perform the same.
So my question is: when is the first or the second one better in terms of performance?
In terms of coding, I prefer the first one because I can create two views on (A1 UNION ALL A2) and (B1 UNION ALL B2) and treat them like two tables, as sketched below.
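For reference, a rough version of that view-based first form (the view names a_all and b_all are made up, and A1/A2 and B1/B2 are assumed to share the same columns):
CREATE VIEW a_all AS SELECT * FROM A1 UNION ALL SELECT * FROM A2;
CREATE VIEW b_all AS SELECT * FROM B1 UNION ALL SELECT * FROM B2;
SELECT ...
FROM a_all A
JOIN b_all B ON A.id = B.a_id
WHERE ...;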
The second one is better:
(A1 JOIN B1 on A1.id = B1.a_id WHERE ...) UNION ALL (A2 JOIN B2 on A2.id = B2.a_id WHERE ... )
It gives the Oracle CBO more information about how your tables are related to each other, so it can estimate the costs of the potential plans more precisely. It's all about cardinality, column statistics, etc.
Purely functionally, and without knowing what's in the tables, the first seems better: if data in A1 matches data in B2, your second query won't join it.
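A minimal sketch of that functional difference, using made-up one-row inline views in place of the tables (here A1's id only matches B2's a_id, and A2's only matches B1's):
-- form 1: the unions are joined as wholes, so both ids find their match -> 2 rows
select a.id
from (select 42 as id from dual union all select 7 as id from dual) a
join (select 7 as a_id from dual union all select 42 as a_id from dual) b
on a.id = b.a_id;
-- form 2: A1 is only ever joined to B1, and A2 to B2 -> 0 rows
select a1.id
from (select 42 as id from dual) a1
join (select 7 as a_id from dual) b1 on a1.id = b1.a_id
union all
select a2.id
from (select 7 as id from dual) a2
join (select 42 as a_id from dual) b2 on a2.id = b2.a_id;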
We have a query in our Prod environment that fails with ORA-01427: single-row subquery returns more than one row.
This is an Oracle 11g database. The query is below. It runs fine until we add the final left outer join with SQ3; once that is added, it fails with ORA-01427 after some time.
select c1,c2..c8 from
t1 left join
(subquery with joins)SQ1
left join
(subquery with joins)SQ2
left join
(subquery with joins)SQ4
left join
(subquery with joins)SQ5
left join
(SELECT DISTINCT MAX(c1) c1, c2, c3, c4, c5,c6
FROM s1.t1 WHERE c2='NY' AND c7<'2' AND c8='Y'
GROUP BY c1, c2, c3, c4, c5,c6) SQ3 ON sq3.c3=t1.c3
AND sq3.c8=t1.c8
AND sq3.c7=t2.c6
AND sq3.c6 <'2'
AND sq3.c4='Y'
When I rewrite this query using a WITH clause, it runs fine; see below. Any idea why the first query fails while the second one below executes, with no change to the logic?
with SQ3 as
(SELECT DISTINCT MAX(c1) c1, c2, c3, c4, c5,c6
FROM s1.t1 WHERE c2='NY' AND c7<'2' AND c8='Y'
GROUP BY c1, c2, c3, c4, c5,c6)
select c1,c2..c8 from
t1 left join
(subquery with joins)SQ1
left join
(subquery with joins)SQ2
left join
(subquery with joins)SQ4
left join
(subquery with joins)SQ5
left join
SQ3 ON sq3.c3=t1.c3
AND sq3.c8=t1.c8
AND sq3.c7=t2.c6
AND sq3.c6 <'2'
AND sq3.c4='Y'
Take a look at this grouping by the aggregated column; it doesn't seem correct:
(SELECT DISTINCT MAX(c1) c1, c2, c3, c4, c5,c6
FROM s1.t1 WHERE c2='NY' AND c7<'2' AND c8='Y'
GROUP BY c1
You don't need to group by the column you are using in your aggregate function. So change your last query to:
SELECT MAX(c1) c1, c2, c3, c4, c5,c6
FROM s1.t1
WHERE c2 = 'NY'
AND c7 < '2'
AND c8 = 'Y'
GROUP BY c2, c3, c4, c5,c6
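A tiny illustration, with made-up rows, of why grouping by the aggregated column defeats MAX(): if two rows share the same c2 but have c1 = 1 and c1 = 2, adding c1 to the GROUP BY just echoes each row back.
-- grouping by c1 too: two rows, (1,'NY') and (2,'NY'), MAX(c1) does nothing
SELECT MAX(c1) c1, c2
FROM (SELECT 1 c1, 'NY' c2 FROM dual UNION ALL SELECT 2 c1, 'NY' c2 FROM dual)
GROUP BY c1, c2;
-- grouping only by c2: one row, (2,'NY')
SELECT MAX(c1) c1, c2
FROM (SELECT 1 c1, 'NY' c2 FROM dual UNION ALL SELECT 2 c1, 'NY' c2 FROM dual)
GROUP BY c2;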
I wish to create a program in Java which will ask the user a number of questions and report some results. It is pretty much like a survey. In order to explain the problem better, consider the following example:
Let's say that there are currently 4 questions available, e.g. Qa, Qb, Qc and Qd. Each question has a number of possible options:
=> Question A has 4 possible options a1, a2, a3 and a4.
=> Question B has 3 possible options b1, b2 and b3
=> Question C has 5 possible options c1, c2, c3, c4 and c5
=> Question D has 2 possible options d1 and d2
Moreover there are some results available which will be reported based on the user’s answers in the above questions. Let’s assume that there are 5 such results called R1, R2, R3, R4 and R5. Each result has a number of characteristics. These characteristics are really answers to the above questions. More precisely:
=> The characteristics of R1 are the set {Qa = a4, Qb = b2, Qc = c2, Qd = d1}
This says that R1 is related to Qa via the a4 option, to Qb via the b2 option, and so on
=> R2: {Qa = a3, Qb = b3, Qc = c3, Qd = d2}
=> R3: {Qa = a4, Qb = b1, Qc = c1, Qd = d2}
=> R4: {Qa = a2, Qb = b2, Qc = c5, Qd = d1}
=> R5: {Qa = a1, Qb = b3, Qc = c4, Qd = d2}
Let’s say that a user U provides the following answers to the questions
{Qa = a4, Qb = b1, Qc = c1, Qd = d1}
The purpose of the program is to report the result which is closest to the user's answers, along with a percentage of how close it is. For instance, since there is no result which matches the user's answers 100%, the program should report the results which match as many answers as possible (above a certain threshold, e.g. 50%). In that specific case the program should report the following results:
=> R3 with 75% (since there are 3 matches on the 4 questions)
=> R1 with 50% (since there are 2 matches on the 4 questions)
Notice that R4 has one match (so 25%) whereas R2 and R5 have no matches at all (so 0%).
The main issue in implementing the above program is that there are a lot of questions (approximately 30), each with a number of answers (3-4 each). I am not aware of an efficient algorithm which can retrieve the results closest to the user's answers. Note that the way these results are stored is not important at all. You can assume that the results are stored in a relational database and that an SQL query is used to retrieve them.
The only solution I can think of is to perform an exhaustive search, but this is not efficient at all. In other words, I am thinking of doing the following:
=> First try to retrieve results which match exactly the user answers:
{Qa = a4, Qb = b1, Qc = c1, Qd = d1}
=> If no results exist then change the option of a question (eg Qa) and try again. For example try:
{Qa = a1, Qb = b1, Qc = c1, Qd = d1}
=> If there is still nothing then try the rest of the possible options for Qa, e.g. a2, a3
=> If there is still nothing then give Qa the initial user answer (that is, a4) and move to Qb to do something similar. For example try something like: {Qa = a4, Qb = b2, Qc = c1, Qd = d1}
=> If, after trying all the possible options for all questions one by one, there still aren't any results, then try changing the options of COMBINATIONS of questions. For example try changing the options of two questions at the same time (e.g. Qa and Qb): {Qa = a1, Qb = b2, Qc = c1, Qd = d1}
=> Then try combinations of three questions and so on...
Clearly the above algorithm would be extremely slow on a large number of questions. Is there any known algorithm or heuristic which is more efficient than the above algorithm?
Thanks in advance
"Only" 30 Questions?
Then the following "stupid" algorithm will probably be faster than any highly "intelligent" and complicated algorithm.
iterate over characteristics
    score = 0
    iterate over questions
        if the question's answer matches the current characteristic
            score++
Then add a variable which keeps track of the maximum value and matching characteristic and you are set.
Runtime is (number of characteristics) * (number of questions), whereas the algorithm you are describing can have exponential runtime, and on top of that it is much more complicated both to program and to execute (due to effects such as branch misprediction).
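Since the question allows the results to live in a relational database, the same linear scan can also be pushed into a single aggregate query. A hedged sketch, assuming hypothetical tables result_characteristics(result_id, question_id, option_id) and user_answers(question_id, option_id) holding the current user's answers:
SELECT rc.result_id,
100.0 * COUNT(*) / (SELECT COUNT(*) FROM user_answers) AS match_pct
FROM result_characteristics rc
JOIN user_answers ua
ON ua.question_id = rc.question_id
AND ua.option_id = rc.option_id
GROUP BY rc.result_id
HAVING 100.0 * COUNT(*) / (SELECT COUNT(*) FROM user_answers) >= 50
ORDER BY match_pct DESC;
For the example above this returns R3 with 75 and R1 with 50, and it is still a single pass over the stored characteristics.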
In my database, I have a user table and a workgroup table, and a many-to-many relationship. A user can belong to one or more workgroups. I am using entity framework for my ORM (EF 4.1 Code First).
User Table has users:
1,2,3,4,5,6,7,8,9,10
Workgroup table has workgroups:
A,B,C, D
WorkgroupUser table has entries
A1, A2, A3, A4, A5
B1, B3, B5, B7, B9
C2, C4, C6, C8, C10
D1, D2, D3, D9, D10
What I would like to do is the following:
Given user 4, which belongs to workgroups A and C,
find the users it has in common with those workgroups:
1,2,3,4,5 (from workgroup A) and
2,4,6,8,10 (from workgroup C),
so the distinct set of users in common is 1,2,3,4,5,6,8,10.
How do I write a LINQ statement (preferably in fluent API) for this?
Thank you,
Here's the general idea (since I don't know the exact properties of your User and WorkGroup entities):
// all users who share at least one workgroup with user 4 (user 4 itself included)
var result = context.users.Where(u => u.ID == 4)
    .SelectMany(u => u.WorkGroups.SelectMany(wg => wg.Users))
    .Distinct();