I have a table with three columns: Worker ID, Task Name and Task Status. Each worker has four tasks, Task-1 through Task-4. Each task has one of two statuses: Completed or Open. The four tasks are dependent, i.e., a later task cannot be Completed until all preceding tasks are Completed. I am writing an SSRS report to pull this information out. In the report, I want to display Worker ID/Task Name/Task Status for every worker with at least one Open task (i.e., hide workers whose four tasks have all been completed). I also want to tally the number of workers that have at least one Open task (I figured this out with CountDistinct(WorkerID)), as well as the number of workers that have only Task-4 open (I could not figure this out). I have attached a sketch of the report I want in the picture.
Please note: I DO NOT want worker 3 to show, since all four tasks have been completed for that worker. If I filter the dataset to allow only Open tasks, the 'Completed' rows for the other two workers are filtered out as well, which is not what I want. Thanks.
If you are using SQL Server, you could do this in the dataset and then simply show the output in your report.
The idea is to group your dataset by workerid and task status and keep the groups where count < 4 or task status = 'Open' (the second condition covers the case where all four tasks are open).
Have a look at this SQL Fiddle:
http://sqlfiddle.com/#!18/e051c/1/0
This should give you the output you desire.
In case the Fiddle fails:
create table workers
(
workerid int
,taskname varchar (10)
,taskstatus varchar(10)
) ;
insert into workers (workerid,taskname,taskstatus )
values
(1,'Task-1','Completed'),
(1 , 'Task-2' ,'Open' ),
(1 , 'Task-3' ,'Open' ),
(1 , 'Task-4' ,'Open' ),
(2 , 'Task-1' ,'Completed' ),
(2 , 'Task-2' ,'Completed' ),
(2 , 'Task-3' ,'Completed' ),
(2 , 'Task-4' ,'Open' ),
(3 , 'Task-1' ,'Completed' ),
(3 , 'Task-2' ,'Completed' ),
(3 , 'Task-3' ,'Completed' ),
(3 , 'Task-4' ,'Completed' ),
(4 , 'Task-1' ,'Open' ),
(4 , 'Task-2' ,'Open' ),
(4 , 'Task-3' ,'Open' ),
(4 , 'Task-4' ,'Open' )
;with valid_workers as (
Select distinct
workerID
from workers
group by
workerID
,taskstatus
having count(*) <4 or taskstatus = 'Open'
)
Select w.*
from workers w
inner join valid_workers v
on v.workerID = w.workerID ;
-- Result
| workerid | taskname | taskstatus |
|----------|----------|------------|
| 1 | Task-1 | Completed |
| 1 | Task-2 | Open |
| 1 | Task-3 | Open |
| 1 | Task-4 | Open |
| 2 | Task-1 | Completed |
| 2 | Task-2 | Completed |
| 2 | Task-3 | Completed |
| 2 | Task-4 | Open |
| 4 | Task-1 | Open |
| 4 | Task-2 | Open |
| 4 | Task-3 | Open |
| 4 | Task-4 | Open |
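The same selection, plus the two tallies asked for in the question, can be sketched outside SQL. This is only a minimal Python illustration over the sample rows above; the names `open_by_worker`, `shown_rows` and `only_task4_open` are made up for the example, and in the real report the counts would still have to be expressed in the dataset query or as SSRS expressions.

```python
from collections import defaultdict

# Sample rows from the table above: (workerid, taskname, taskstatus).
rows = [
    (1, "Task-1", "Completed"), (1, "Task-2", "Open"),
    (1, "Task-3", "Open"), (1, "Task-4", "Open"),
    (2, "Task-1", "Completed"), (2, "Task-2", "Completed"),
    (2, "Task-3", "Completed"), (2, "Task-4", "Open"),
    (3, "Task-1", "Completed"), (3, "Task-2", "Completed"),
    (3, "Task-3", "Completed"), (3, "Task-4", "Completed"),
    (4, "Task-1", "Open"), (4, "Task-2", "Open"),
    (4, "Task-3", "Open"), (4, "Task-4", "Open"),
]

# Collect the names of the open tasks per worker.
open_by_worker = defaultdict(set)
for workerid, taskname, taskstatus in rows:
    if taskstatus == "Open":
        open_by_worker[workerid].add(taskname)

# Workers to display: those with at least one open task.
shown_workers = sorted(open_by_worker)

# Keep every row (Completed ones included) of the displayed workers.
shown_rows = [r for r in rows if r[0] in open_by_worker]

# Workers whose only open task is Task-4.
only_task4_open = sorted(
    w for w, tasks in open_by_worker.items() if tasks == {"Task-4"}
)

print(shown_workers)       # [1, 2, 4]
print(len(shown_workers))  # 3
print(only_task4_open)     # [2]
```

Because the tasks are dependent, "only Task-4 open" is the same as "exactly one open task" in this data, so counting workers with a single Open row would presumably give the same tally.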
How can I convert the SQL Server recursive query below to Vertica? I know that Vertica does not support recursive queries. I tried using SUM() OVER with LAG, but I am still not able to achieve the final expected output.
with Product as (
select * from (
VALUES
(1, '2018-12-25','2019-01-05' ),
(1, '2019-03-01','2019-03-10' ),
(1, '2019-03-15','2019-03-19' ),
(1, '2019-03-22','2019-03-28' ),
(1, '2019-03-30','2019-04-02' ),
(1, '2019-04-10','2019-04-15' ),
(1, '2019-04-18','2019-04-25' )
) as a1 (ProductId ,ProductStartDt ,ProductEndDt)
), OrderedProduct as (
select *, ROW_NUMBER() over (order by ProductStartDt) as RowNum
from Product
), DateGroupsInterim (RowNum, GroupNum, GrpStartDt, Indx) as (
select RowNum, 1, ProductEndDt,1
from OrderedProduct
where RowNum=1
union all
select OrderedProduct.RowNum,
CASE WHEN OrderedProduct.ProductStartDt <= dateadd(day, 15, dgi.GrpStartDt)
THEN dgi.GroupNum
ELSE dgi.GroupNum + 1
END,
CASE WHEN OrderedProduct.ProductStartDt <= dateadd(day, 15, dgi.GrpStartDt)
THEN dgi.GrpStartDt
ELSE OrderedProduct.ProductEndDt
END,
CASE WHEN OrderedProduct.ProductStartDt <= dateadd(day, 15, dgi.GrpStartDt)
THEN 0
ELSE 1
END
from DateGroupsInterim dgi
join OrderedProduct on OrderedProduct.RowNum=dgi.RowNum+1
) select OrderedProduct.ProductId, OrderedProduct.ProductStartDt, OrderedProduct.ProductEndDt, DateGroupsInterim.GrpStartDt, DateGroupsInterim.GroupNum, Indx
from DateGroupsInterim
JOIN OrderedProduct on OrderedProduct.RowNum = DateGroupsInterim.RowNum
order by 2
Below is what the expected output looks like.
The operation you want is also called "sessionization": splitting a time series into groups (sub-series) that belong together.
The way you describe it, it does not seem to be possible:
The next group depends on both the start of the previous group (15 days later than the start of the first row of the previous group) and the end of the previous group's last row. This needs a loop or a recursion, which Vertica does not offer.
I managed to join the table with itself and get a session id for consecutive rows within 15 days. But, as of now, the groups overlap, and I found no way to determine which groups to keep.
Like so:
WITH product(productid ,productstartdt ,productenddt) AS (
SELECT 1, DATE '2018-12-25',DATE '2019-01-05'
UNION ALL SELECT 1, DATE '2019-03-01',DATE '2019-03-10'
UNION ALL SELECT 1, DATE '2019-03-15',DATE '2019-03-19'
UNION ALL SELECT 1, DATE '2019-03-22',DATE '2019-03-28'
UNION ALL SELECT 1, DATE '2019-03-30',DATE '2019-04-02'
UNION ALL SELECT 1, DATE '2019-04-10',DATE '2019-04-15'
UNION ALL SELECT 1, DATE '2019-04-18',DATE '2019-04-25'
)
,
groups AS (
SELECT
a.productstartdt AS in_productstartdt
, b.*
, CONDITIONAL_CHANGE_EVENT(a.productstartdt) OVER(PARTITION BY a.productid ORDER BY a.productstartdt) AS grp
FROM product a
LEFT JOIN product b
ON a.productid = b.productid
AND a.productstartdt <= b.productstartdt
AND (a.productstartdt=b.productstartdt OR b.productstartdt <= a.productenddt + 15)
)
SELECT * FROM groups;
-- out in_productstartdt | productid | productstartdt | productenddt | grp
-- out -------------------+-----------+----------------+--------------+-----
-- out 2018-12-25 | 1 | 2018-12-25 | 2019-01-05 | 0
-- out 2019-03-01 | 1 | 2019-03-01 | 2019-03-10 | 1
-- out 2019-03-01 | 1 | 2019-03-22 | 2019-03-28 | 1
-- out 2019-03-01 | 1 | 2019-03-15 | 2019-03-19 | 1
-- out 2019-03-15 | 1 | 2019-03-15 | 2019-03-19 | 2
-- out 2019-03-15 | 1 | 2019-03-22 | 2019-03-28 | 2
-- out 2019-03-15 | 1 | 2019-03-30 | 2019-04-02 | 2
-- out 2019-03-22 | 1 | 2019-03-22 | 2019-03-28 | 3
-- out 2019-03-22 | 1 | 2019-03-30 | 2019-04-02 | 3
-- out 2019-03-22 | 1 | 2019-04-10 | 2019-04-15 | 3
-- out 2019-03-30 | 1 | 2019-04-10 | 2019-04-15 | 4
-- out 2019-03-30 | 1 | 2019-03-30 | 2019-04-02 | 4
-- out 2019-04-10 | 1 | 2019-04-10 | 2019-04-15 | 5
-- out 2019-04-10 | 1 | 2019-04-18 | 2019-04-25 | 5
-- out 2019-04-18 | 1 | 2019-04-18 | 2019-04-25 | 6
-- out (15 rows)
-- out
-- out Time: First fetch (15 rows): 35.454 ms. All rows formatted: 35.503 ms
The next difficulty is how to get rid of groups 2, 3, and 5.
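For reference, the grouping rule of the recursive SQL Server query in the question can be written as a plain sequential loop, which shows why each row depends on the state left by the previous one. This is a minimal Python sketch, not a Vertica solution; `assign_groups` and `anchor_end` are names invented here, and `anchor_end` plays the role of the query's `GrpStartDt` column (which actually holds the end date of the first row of the current group).

```python
from datetime import date, timedelta

# (ProductStartDt, ProductEndDt) pairs from the question.
rows = [
    (date(2018, 12, 25), date(2019, 1, 5)),
    (date(2019, 3, 1), date(2019, 3, 10)),
    (date(2019, 3, 15), date(2019, 3, 19)),
    (date(2019, 3, 22), date(2019, 3, 28)),
    (date(2019, 3, 30), date(2019, 4, 2)),
    (date(2019, 4, 10), date(2019, 4, 15)),
    (date(2019, 4, 18), date(2019, 4, 25)),
]


def assign_groups(rows, gap_days=15):
    """Return (start, end, group_num, anchor_end) per row, in start order."""
    result = []
    group_num = 0
    anchor_end = None  # end date of the first row of the current group
    for start, end in sorted(rows):
        # A row opens a new group when it starts more than gap_days
        # after the anchor; otherwise it joins the current group.
        if anchor_end is None or start > anchor_end + timedelta(days=gap_days):
            group_num += 1
            anchor_end = end
        result.append((start, end, group_num, anchor_end))
    return result


group_numbers = [g for _, _, g, _ in assign_groups(rows)]
print(group_numbers)  # [1, 2, 2, 2, 3, 3, 4]
```

The groups anchored at 2018-12-25, 2019-03-01, 2019-03-30 and 2019-04-18 correspond to grp 0, 1, 4 and 6 in the self-join output above.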
I have a column in a database that contains multiple values, which I need as separate rows.
The column contains comma-delimited parts, but some parts contain commas inside brackets. I don't need to split those parts (only split on commas that are NOT inside brackets).
Versions
Oracle 11g
Example:
**ID | Kategory**
1 | "ATD 5(2830),ATO 4(510),EDI 1,EH A1,SCI 2,SS 1,STO-SE 1(oral, CNS, blood),STO-SE 2(oral, respiratory effects)"
I need this string as:
- 1 => ATD 5(2830)
- 1 => ATO 4(510)
- 1 => EDI 1
- 1 => EH A1
- 1 => SCI 2
- 1 => SS 1
- 1 => STO-SE 1(oral, CNS, blood)
- 1 => STO-SE 2(oral, respiratory effects)
Parts like (oral, CNS, blood), which contain commas inside brackets, must not be split.
You can use the regular expression (([^(]*?(\(.*?\))?)*)(,|$) to match:
[^(]*? Zero-or-more (but as few as possible) non-opening-bracket characters
(\(.*?\))? Then, optionally, an opening bracket and as few characters as possible until the closing bracket.
( )* Wrapped in a capturing group repeated zero-or-more times
( ) Wrapped in a capturing group to be able to reference the entire matched item
(,|$) Followed by either a comma or the end-of-string.
Like this:
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( ID, Kategory ) AS
SELECT 1, 'ATD 5(2830),ATO 4(510),EDI 1,EH A1,SCI 2,SS 1,STO-SE 1(oral, CNS, blood),STO-SE 2(oral, respiratory effects)' FROM DUAL;
Query 1:
SELECT ID,
l.COLUMN_VALUE AS item,
REGEXP_SUBSTR(
Kategory,
'(([^(]*?(\(.*?\))?)*)(,|$)',
1,
l.COLUMN_VALUE,
NULL,
1
) AS value
FROM table_name t
CROSS JOIN
TABLE(
CAST(
MULTISET(
SELECT LEVEL
FROM DUAL
CONNECT BY LEVEL < REGEXP_COUNT( t.Kategory, '(([^(]*?(\(.*?\))?)*)(,|$)' )
)
AS SYS.ODCINUMBERLIST
)
) l
Results:
| ID | ITEM | VALUE |
|----|------|-------------------------------------|
| 1 | 1 | ATD 5(2830) |
| 1 | 2 | ATO 4(510) |
| 1 | 3 | EDI 1 |
| 1 | 4 | EH A1 |
| 1 | 5 | SCI 2 |
| 1 | 6 | SS 1 |
| 1 | 7 | STO-SE 1(oral, CNS, blood) |
| 1 | 8 | STO-SE 2(oral, respiratory effects) |
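To cross-check the expected split outside the database, the rule "split on commas that are not inside brackets" can be sketched in Python. This deliberately uses a simple bracket-depth counter rather than the Oracle regular expression above, so it illustrates the rule, not the query; `split_outside_brackets` is a name made up for this example.

```python
def split_outside_brackets(text, sep=","):
    """Split text on sep, ignoring separators inside round brackets."""
    parts = []
    current = []
    depth = 0  # how many "(" are currently open
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch == sep and depth == 0:
            # Separator at top level: close the current part.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


kategory = (
    "ATD 5(2830),ATO 4(510),EDI 1,EH A1,SCI 2,SS 1,"
    "STO-SE 1(oral, CNS, blood),STO-SE 2(oral, respiratory effects)"
)
items = split_outside_brackets(kategory)
for item in items:
    print(item)
```

For the sample string this yields the same eight values as the query result above.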
I am searching for a smart Oracle SQL solution to distribute data into a number of buckets. The order of x is important. I know there are a lot of algorithms, but I am pretty sure there must be a smart SQL (analytic function) solution, e.g. NTILE(3), but I don't get it.
x|quantity
1|7
2|4
3|9
4|2
5|10
6|3
8|7
9|7
10|4
11|9
12|2
13|10
16|3
17|7
The result should look something like this:
x_from|x_to|sum(quantity)
1|4|22
...and so on
Thanks in advance
Tim
This example divides the table into 4 buckets (ntile( 4 )):
SELECT min( "x" ) as "From",
max( "x" ) as "To",
sum("quantity")
FROM (
SELECT t.*,
ntile( 4 ) over (order by "x" ) as group_no
FROM table1 t
)
GROUP BY group_no
ORDER BY 1;
| From | To | SUM("QUANTITY") |
|------|----|-----------------|
| 1 | 4 | 22 |
| 5 | 9 | 27 |
| 10 | 12 | 15 |
| 13 | 17 | 20 |
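What NTILE does with these 14 rows can be reproduced in a few lines of Python. This is a sketch for illustration only; the helper name `ntile` is made up here and mirrors the usual NTILE rule that, when the rows do not divide evenly, the first buckets each get one extra row.

```python
def ntile(n_rows, buckets):
    """Return the 1-based bucket number for each row position."""
    base, extra = divmod(n_rows, buckets)
    numbers = []
    for b in range(1, buckets + 1):
        # The first `extra` buckets receive one additional row.
        size = base + (1 if b <= extra else 0)
        numbers.extend([b] * size)
    return numbers


# (x, quantity) rows from the question, ordered by x.
data = [
    (1, 7), (2, 4), (3, 9), (4, 2), (5, 10), (6, 3), (8, 7),
    (9, 7), (10, 4), (11, 9), (12, 2), (13, 10), (16, 3), (17, 7),
]
data.sort()

# Aggregate min(x), max(x) and sum(quantity) per bucket.
summary = {}
for (x, qty), bucket in zip(data, ntile(len(data), 4)):
    lo, hi, total = summary.get(bucket, (x, x, 0))
    summary[bucket] = (min(lo, x), max(hi, x), total + qty)

result = [summary[b] for b in sorted(summary)]
print(result)  # [(1, 4, 22), (5, 9, 27), (10, 12, 15), (13, 17, 20)]
```

Note that NTILE balances the number of rows per bucket, not the sum of quantity, which is why the sums above differ between buckets.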
I'm wondering if it is possible to create a calculated member to obtain the sum of distinct values for a fact. I will try to explain it with the following example:
I have a fact where the primary key is related with two dimensions (one to many cardinality). The fact contains a measure and its value is the same for all members of each distinct combination of FACT_ID and DIM_1_ID. For the total, I don't want to consider multiple times the same values. So, with the following values the total should be 450 and not 850 (default Mondrian behavior).
| FACT_ID | DIM_1_ID | DIM_2_ID | MEASURE |
|---------|----------|----------|---------|
| 1 | A | D | 100 |
| 1 | A | E | 100 |
| 1 | B | F | 50 |
| 2 | A | D | 300 |
| 2 | A | E | 300 |
|---------|----------|----------|---------|
TOTAL | 450 |
Is it possible? How can it be done with Mondrian?
Thanks in advance
UPDATE - Current status
As described in one of the comments below, based on @whytheq's answer, I managed to calculate the right value for the total, using the following MDX formula for the measure:
Sum(
Order(
[dActivity.hActivity].[lActivity].MEMBERS*[dFacility.hFacility].[lFacility].MEMBERS,
[dActivity.hActivity].[lActivity].currentmember.name
) as [m_set] ,
iif(
[m_set].currentordinal = 0
OR
not(
[m_set]
.item([m_set].currentordinal)
.item(0).NAME
=
[m_set]
.item([m_set].currentordinal-1)
.item(0).NAME
) ,
[Measures].[mBudget]
,
0
)
)
However, this expression uses the complete set for every single row, so the result overrides the measure's real value for the individual fact rows.
| FACT_ID | DIM_1_ID | DIM_2_ID | MEASURE |
|---------|----------|----------|---------|
| 1 | A | D | 450 |
| 1 | A | E | 450 |
| 1 | B | F | 450 |
| 2 | A | D | 450 |
| 2 | A | E | 450 |
|---------|----------|----------|---------|
TOTAL | 450 |
Great question - really tricky to do in MDX.
If we do the following then there are 158 rows returned - a handful have duplicate values for [Measures].[Internet Sales Amount]:
SELECT
[Measures].[Internet Sales Amount] ON 0
,NON EMPTY
Order
(
[Product].[Product].[Product]
,[Measures].[Internet Sales Amount]
,bdesc
) ON 1
FROM [Adventure Works];
This only counts them if the member above is different for the respective measure:
WITH
SET [x] AS
Order
(
NonEmpty
(
[Product].[Product].[Product]
,[Measures].[Internet Sales Amount]
)
,[Measures].[Internet Sales Amount]
,bdesc
)
SET [FILTERED] AS
Filter
(
[x]
,
(
[x].Item(
[x].CurrentOrdinal - 1)
,[Measures].[Internet Sales Amount]
)
<>
(
[x].Item(
[x].CurrentOrdinal)
,[Measures].[Internet Sales Amount]
)
)
MEMBER [Measures].[distCount] AS
Count([FILTERED])
SELECT
[Measures].[distCount] ON 0
FROM [Adventure Works];
Maybe try adding the EXISTING keyword to your calculation:
Sum
(
Order
(
EXISTING //<<<
[dActivity.hActivity].[lActivity].MEMBERS
*
[dFacility.hFacility].[lFacility].MEMBERS
,[dActivity.hActivity].[lActivity].CurrentMember.Name
) AS [m_set]
,IIF
(
[m_set].CurrentOrdinal = 0
OR
(NOT
[m_set].Item(
[m_set].CurrentOrdinal).Item(0).Name
=
[m_set].Item(
[m_set].CurrentOrdinal - 1).Item(0).Name)
,[Measures].[mBudget]
,0
)
)
You could try to obtain the average over the set. The code is a bit complex.
WITH SET SomeSet AS
{
Fact.FactID.FactID.MEMBERS
*
Fact.DimID1.DimID1.MEMBERS
*
Fact.DimID2.DimID2.MEMBERS
}
MEMBER Measures.AvgVal AS
AVG
(
{Fact.FactID.CURRENTMEMBER}
*
{Fact.DimID1.CURRENTMEMBER}
*
NonEmpty
(
Fact.DimID2.DimID2.MEMBERS,
{{Fact.FactID.CURRENTMEMBER} *
{Fact.DimID1.CURRENTMEMBER}} *
[Measures].[TheMeasure]
)
,
[Measures].[TheMeasure]
)
SELECT NON EMPTY SomeSet ON 1,
NON EMPTY {
[Measures].[TheMeasure],
Measures.AvgVal
} on 0
from [YourCube]
What I am doing is this: for the current FactID-DimID1 combination on the axis, I get the list of all possible DimID2 members and then, over the internally generated non-empty tuples of FactID-DimID1-DimID2, derive the average value of the measure TheMeasure.
So, for example, the value (100+100)/2 = 100 would be displayed for the combination of FactID = 1 and DimID1 = A.
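The arithmetic behind the averaging idea can be checked outside MDX with the sample rows from the question. This is a small Python sketch, not Mondrian code; it assumes, as the question states, that the measure is identical for every row sharing the same FACT_ID and DIM_1_ID, so the average of each such group equals its single distinct value.

```python
from collections import defaultdict

# (FACT_ID, DIM_1_ID, DIM_2_ID, MEASURE) rows from the question.
facts = [
    (1, "A", "D", 100),
    (1, "A", "E", 100),
    (1, "B", "F", 50),
    (2, "A", "D", 300),
    (2, "A", "E", 300),
]

# Group the measure values by (FACT_ID, DIM_1_ID).
values = defaultdict(list)
for fact_id, dim1, _dim2, measure in facts:
    values[(fact_id, dim1)].append(measure)

# Average within each combination, then sum the averages.
avg_per_combo = {key: sum(v) / len(v) for key, v in values.items()}
total = sum(avg_per_combo.values())

print(avg_per_combo[(1, "A")])  # 100.0
print(total)                    # 450.0
```

Summing the raw rows instead gives 850, the default total the question wants to avoid.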
table looks kind of like:
create table taco (
taco_id int primary key not null,
taco_name varchar(255),
taco_prntid int,
meat_id int,
meat_inht char(1) -- inherit meat
)
data looks like:
insert into taco values (1, '1', null, 1, 'N');
insert into taco values (2, '1.1', 1, null, 'Y');
insert into taco values (3, '1.1.1', 2, null, 'N');
insert into taco values (4, '1.2', 1, 2, 'N');
insert into taco values (5, '1.2.1', 4, null, 'Y');
insert into taco values (6, '1.1.2', 2, null, 'Y');
or...
- 1 has a meat_id=1
- 1.1 has a meat_id=1 because it inherits from its parent via taco_prntid=1
- 1.1.1 has a meat_id of null because it does NOT inherit from its parent
- 1.2 has a meat_id=2 and it does not inherit from its parent
- 1.2.1 has a meat_id=2 because it does inherit from its parent via taco_prntid=4
- 1.1.2 has a meat_id=1 because it does inherit from its parent via taco_prntid=2
Now... how in the world do I query what the meat_id is for each taco_id? What is below did work until I realized that I wasn't using the inheritance flag and some of my data was messing up.
select x.taco_id,
x.taco_name,
to_number(substr(meat_id,instr(rtrim(meat_id), ' ', -1)+1)) as meat_id
from ( select taco_id,
taco_name,
level-1 "level",
sys_connect_by_path(meat_id, ' ') meat_id
from taco
start with taco_prntid is null
connect by prior taco_id = taco_prntid
) x
I can post some failed attempts to modify my query above but they're rather embarrassing failures. I haven't worked with hierarchical queries at all before beyond the basics so I'm hoping there is some keyword or concept I'm not aware I should be searching for.
I posted an answer myself down at the bottom to show what I ended up with ultimately. I'm leaving the other answer as accepted because they were able to make the data more clear for me and without it, I wouldn't have gotten anywhere.
Your inner query is correct. All you need is to pick only the rightmost number from the meat_id column of the inner query when the flag is Y.
I have used REGEXP_SUBSTR function to get the rightmost number and CASE statement to check the flag.
SQL Fiddle
Query 1:
select taco_id,
taco_name,
taco_prntid,
case meat_inht
when 'N' then meat_id
when 'Y' then to_number(regexp_substr(meat_id2,'\d+\s*$'))
end meat_id,
meat_inht
from ( select taco_id,
taco_name,
taco_prntid,
meat_id,
meat_inht,
level-1 "level",
sys_connect_by_path(meat_id, ' ') meat_id2
from taco
start with taco_prntid is null
connect by prior taco_id = taco_prntid
)
order by 1
Results:
| TACO_ID | TACO_NAME | TACO_PRNTID | MEAT_ID | MEAT_INHT |
|---------|-----------|-------------|---------|-----------|
| 1 | 1 | (null) | 1 | N |
| 2 | 1.1 | 1 | 1 | Y |
| 3 | 1.1.1 | 2 | (null) | N |
| 4 | 1.2 | 1 | 2 | N |
| 5 | 1.2.1 | 4 | 2 | Y |
| 6 | 1.1.2 | 2 | 1 | Y |
Query 2:
select taco_id,
taco_name,
taco_prntid,
meat_id,
meat_inht,
level-1 "level",
sys_connect_by_path(meat_id, ' ') meat_id2
from taco
start with taco_prntid is null
connect by prior taco_id = taco_prntid
Results:
| TACO_ID | TACO_NAME | TACO_PRNTID | MEAT_ID | MEAT_INHT | LEVEL | MEAT_ID2 |
|---------|-----------|-------------|---------|-----------|-------|----------|
| 1 | 1 | (null) | 1 | N | 0 | 1 |
| 2 | 1.1 | 1 | (null) | Y | 1 | 1 |
| 3 | 1.1.1 | 2 | (null) | N | 2 | 1 |
| 6 | 1.1.2 | 2 | (null) | Y | 2 | 1 |
| 4 | 1.2 | 1 | 2 | N | 1 | 1 2 |
| 5 | 1.2.1 | 4 | (null) | Y | 2 | 1 2 |
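The inheritance rule described in the question can also be written procedurally, which may help when checking query results by hand. This is a minimal Python sketch over the sample rows; `effective_meat_id` is a name invented for the example. It walks up the parent chain while the inherit flag is 'Y' and stops at the nearest ancestor that does not inherit, even if that ancestor's meat_id is null.

```python
# taco_id -> (taco_name, taco_prntid, meat_id, meat_inht), from the inserts above.
tacos = {
    1: ("1", None, 1, "N"),
    2: ("1.1", 1, None, "Y"),
    3: ("1.1.1", 2, None, "N"),
    4: ("1.2", 1, 2, "N"),
    5: ("1.2.1", 4, None, "Y"),
    6: ("1.1.2", 2, None, "Y"),
}


def effective_meat_id(taco_id):
    """Resolve meat_id by following parents while meat_inht is 'Y'."""
    _name, parent_id, meat_id, inherit = tacos[taco_id]
    while inherit == "Y" and parent_id is not None:
        # Move to the parent and use its values instead.
        _name, parent_id, meat_id, inherit = tacos[parent_id]
    return meat_id


resolved = {taco_id: effective_meat_id(taco_id) for taco_id in tacos}
print(resolved)  # {1: 1, 2: 1, 3: None, 4: 2, 5: 2, 6: 1}
```

One difference worth noting: taking the rightmost number of the path skips ancestors with a null meat_id, so a 'Y' row under a non-inheriting parent with no meat would pick up a value from further up the tree, whereas this loop returns null in that case.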
This is what I've ended up with so far, after applying the logic in the accepted answer. I added a few more things so that I can join the result against my meat table. The CASE expression at the top could be optimized a little, but I am so over this part of the query that it's going to have to stay for now.
select x.taco_id,
x.taco_name,
x.taco_prntname,
meat_id
,case when to_number(regexp_substr(meat_id,'\d+\s*$'))=0 then null else
to_number(regexp_substr(meat_id,'\d+\s*$')) end as meat_id
from ( select taco_id,
taco_name,
taco_prntname,
level-1 "level",
sys_connect_by_path(
case when meat_inht='N' then nvl(to_char(meat_id),'0') else '' end
,' ') meat_id
from taco join jobdtl on jobdtl.jobdtl_id=taco.jobdtl_id
start with taco_prntid is null
connect by prior taco_id = taco_prntid
) x
(Do you ever wonder, when you read questions like this, what the real schema is? Obviously I am not working on a taco project. Or does it even matter, as long as the general relationships and concepts are preserved?)