I have a table with order number and product code. One order can have multiple lines. I would like to count the number of orders which have BOTH product code A AND product code B.
My table looks like this:
OrderNumber ProductCode
Order1 A
Order1 B
Order2 B
Order3 A
Order3 B
Order4 C
So for this result set, the answer would be 2, as both Order1 and Order3 contain both A and B.
I would very much like to create this as a measure in DAX.
Thank you.
Try this:
SUMX(
    CALCULATETABLE(
        VALUES(MyTable[OrderNumber]),
        MyTable[ProductCode] = "A"
    ),
    IF(CALCULATE(COUNTROWS(MyTable), MyTable[ProductCode] = "B") > 0, 1)
)
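To check the logic outside Power BI, the same count can be sketched in Python. This only illustrates what the measure computes and is not DAX; the `rows` list and the `count_orders_with_both` helper are hypothetical names introduced for the example.

```python
# Sample rows from the question: (OrderNumber, ProductCode).
rows = [
    ("Order1", "A"), ("Order1", "B"),
    ("Order2", "B"),
    ("Order3", "A"), ("Order3", "B"),
    ("Order4", "C"),
]


def count_orders_with_both(rows, first, second):
    """Count distinct orders that contain both product codes."""
    # Orders containing `first`: plays the role of VALUES(OrderNumber)
    # filtered to ProductCode = "A" in the measure.
    with_first = {order for order, code in rows if code == first}
    # Orders containing `second`: plays the role of the inner "B" check.
    with_second = {order for order, code in rows if code == second}
    return len(with_first & with_second)


print(count_orders_with_both(rows, "A", "B"))  # 2
```

The set intersection plays the role of the IF inside SUMX: an order only contributes 1 when it appears in both filtered sets.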
I have a table like the one below in Power BI with two columns, Category and Subcategory. I'm trying to get the count of Subcategory = "S2" for each category into a calculated column (like S2_count).
Category Subcategory S2_count
A S1 1
A S2 1
A S1 1
B S1 2
B S3 2
B S2 2
B S2 2
C S2 2
C S3 2
C S2 2
Is there a way to get this using DAX? I tried the formula below, but I have no idea how to apply both the filter and the group by:
s2_count =
CALCULATE(
COUNT(Test01[subcategory]),
GROUPBY(Test01,Test01[subcategory]))
Try this:
s2_count =
COUNTROWS (
    FILTER (
        'Test01',
        'Test01'[Category] = EARLIER ( 'Test01'[Category] )
            && 'Test01'[Subcategory] = "S2"
    )
)
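The EARLIER function returns the value of 'Test01'[Category] from the outer row context, that is, the row for which the calculated column is being evaluated, rather than from the inner row context that FILTER creates while it scans the table.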
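The two nested row contexts can be pictured as two nested loops. The following Python sketch is only an analogy for the calculated column above, not DAX; `s2_count_column` is a hypothetical helper, and it returns 0 where DAX would return a blank for a category without any S2 rows.

```python
# Sample rows from the question: (Category, Subcategory).
rows = [
    ("A", "S1"), ("A", "S2"), ("A", "S1"),
    ("B", "S1"), ("B", "S3"), ("B", "S2"), ("B", "S2"),
    ("C", "S2"), ("C", "S3"), ("C", "S2"),
]


def s2_count_column(rows, target="S2"):
    """Return, for every row, how many rows of its category have `target`."""
    column = []
    # Outer loop: the row the calculated column is evaluated for.
    # `outer_category` is what EARLIER('Test01'[Category]) refers to.
    for outer_category, _ in rows:
        # Inner loop: the scan that FILTER performs over the whole table.
        count = sum(
            1
            for category, subcategory in rows
            if category == outer_category and subcategory == target
        )
        column.append(count)
    return column


print(s2_count_column(rows))  # [1, 1, 1, 2, 2, 2, 2, 2, 2, 2]
```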
You can also do this using CALCULATE.
s2_count =
CALCULATE (
    COUNTROWS ( Test01 ),
    Test01[Subcategory] = "S2",
    ALLEXCEPT ( Test01, Test01[Category] )
)
CALCULATE turns the current row into filters on every column (context transition). The ALLEXCEPT function then removes those filters from all columns except Category.
Note: If there are no other columns in your table, you don't need the ALLEXCEPT argument and you can just use this instead:
s2_count = CALCULATE( COUNTROWS( Test01 ), Test01[Subcategory] = "S2" )
If you do have other columns, though, they are passed from row context to filter context along with Category, and you won't get the right result.
I have three tables:
products purchased (RecordEntered as A)
products sold in the country (SoldInCountry as B)
products sold outside the country (SoldOutCountry as C)
Each record in A could be:
entered and not yet sold
entered and sold only in the country
entered and sold only out of the country
entered and sold in the country and also outside the country
I started grouping the pieces in table B like so:
SELECT
A.IdRecord, A.Qty, sum(isnull(B.Qty,0)) AS Expr1
FROM
RecordEntered AS A
LEFT OUTER JOIN
SoldInCountry AS B ON A.IdRecord = B.IdRecord
group by A.IdRecord, A.Qty
But I do not know how to go on.
I would like a query to show me how many pieces I still have in stock.
Like this:
A.Qty - (SUM(ISNULL(B.Qty, 0)) + SUM(ISNULL(C.Qty, 0)))
I wrote an example in SQL, but the goal is LINQ:
from a in _ctx.....
where .....
select...
thanks
It isn't easy to do a full outer join in LINQ (see my answer here: https://stackoverflow.com/a/43669055/2557128) but you don't need that to solve this:
var numInStock = from item in RecordEntered
                 select new {
                     item.Code,
                     Qty = item.Qty
                           - (from sic in SoldInCountry where sic.IdRecord == item.IdRecord select sic.Qty).SingleOrDefault()
                           - (from soc in SoldOutCountry where soc.IdRecord == item.IdRecord select soc.Qty).SingleOrDefault()
                 };
I assumed there would only be one sold record of each type for an item. If there could be more than one, you would need to Sum the matching records:
var numInStock = from item in RecordEntered
                 select new {
                     item.Code,
                     Qty = item.Qty
                           - (from sic in SoldInCountry where sic.IdRecord == item.IdRecord select sic.Qty).DefaultIfEmpty().Sum()
                           - (from soc in SoldOutCountry where soc.IdRecord == item.IdRecord select soc.Qty).DefaultIfEmpty().Sum()
                 };
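The arithmetic behind the Sum version can be checked with a small Python sketch. The sample quantities and the helper names `total_by_record` and `stock_remaining` are hypothetical and only illustrate the calculation A.Qty - (sum of B.Qty + sum of C.Qty) per record.

```python
from collections import defaultdict

# Hypothetical sample data: (IdRecord, Qty) pairs for each table.
record_entered = [(1, 10), (2, 5), (3, 8)]
sold_in_country = [(1, 3), (1, 2), (3, 8)]
sold_out_country = [(1, 1), (2, 4)]


def total_by_record(sales):
    """Sum the sold quantity per IdRecord; records without sales count as 0."""
    totals = defaultdict(int)
    for id_record, qty in sales:
        totals[id_record] += qty
    return totals


def stock_remaining(entered, sold_in, sold_out):
    """Return {IdRecord: entered quantity minus everything sold}."""
    in_totals = total_by_record(sold_in)
    out_totals = total_by_record(sold_out)
    return {
        id_record: qty - in_totals[id_record] - out_totals[id_record]
        for id_record, qty in entered
    }


print(stock_remaining(record_entered, sold_in_country, sold_out_country))
# {1: 4, 2: 1, 3: 0}
```

The `defaultdict(int)` plays the same role as DefaultIfEmpty() in the LINQ query: a record with no matching sales contributes 0 instead of failing.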
select account_no, amount, customer from transactions where branch = 'Pennywell'
select c.customer_name, c.cust_street, c.cust_city, b.branch_name, b.branch_city, a.account_no, a.balance from customer c, transactions t, accounts a, branch b
where t.customer = c.customer_name and a.account_no = t.account_no and b.branch_name = a.branch_name
select customer_name, cust_city from customer where customer_name not in (select customer from transactions)
The first one is just selection on Pennywell followed by projection on account_no, amount, customer:
\pi_{account_no, amount, customer}(\sigma_{branch = 'Pennywell'}(transactions))
The second one follows the same principle:
1. List all your tables:
customer, transactions, accounts, branch
2. Rename each one of them using \rho:
\rho_c(customer), \rho_t(transactions), \rho_a(accounts), \rho_b(branch)
3. Calculate the Cartesian product:
\rho_c(customer) x \rho_t(transactions) x \rho_a(accounts) x \rho_b(branch)
4. Perform the selection ("where") on the result of step 3, replacing "and" with conjunctions, "or" with disjunctions, and "not" with negations:
\sigma_{t.customer = c.customer_name /\ a.account_no = t.account_no /\ b.branch_name = a.branch_name}(\rho_c(customer) x \rho_t(transactions) x \rho_a(accounts) x \rho_b(branch))
5. Finally, perform the projection:
\pi_{c.customer_name, c.cust_street, c.cust_city, b.branch_name, b.branch_city, a.account_no, a.balance}(\sigma_{t.customer = c.customer_name /\ a.account_no = t.account_no /\ b.branch_name = a.branch_name}(\rho_c(customer) x \rho_t(transactions) x \rho_a(accounts) x \rho_b(branch)))
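The rename, product, selection, and projection steps for the second query can be tried out with a small Python sketch. The sample rows and the helper functions `rename`, `cartesian`, `select`, and `project` are hypothetical; they model each relation as a list of dictionaries and are meant as an illustration, not an efficient implementation.

```python
from itertools import product

# Hypothetical sample relations, each a list of rows (dicts).
customer = [
    {"customer_name": "Ann", "cust_street": "Main St", "cust_city": "Leeds"},
    {"customer_name": "Bob", "cust_street": "High St", "cust_city": "York"},
]
transactions = [{"account_no": 1, "customer": "Ann", "amount": 50}]
accounts = [{"account_no": 1, "branch_name": "Pennywell", "balance": 200}]
branch = [{"branch_name": "Pennywell", "branch_city": "Sunderland"}]


def rename(relation, alias):
    """rho: prefix every attribute with the alias, e.g. c.customer_name."""
    return [{f"{alias}.{name}": value for name, value in row.items()}
            for row in relation]


def cartesian(*relations):
    """x: combine one row from each relation in every possible way."""
    return [{name: value for row in combo for name, value in row.items()}
            for combo in product(*relations)]


def select(relation, predicate):
    """sigma: keep only the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, attributes):
    """pi: keep only the listed attributes and drop duplicate rows."""
    result = []
    for row in relation:
        projected = {name: row[name] for name in attributes}
        if projected not in result:
            result.append(projected)
    return result


joined = cartesian(rename(customer, "c"), rename(transactions, "t"),
                   rename(accounts, "a"), rename(branch, "b"))
matched = select(joined, lambda r: r["t.customer"] == r["c.customer_name"]
                 and r["a.account_no"] == r["t.account_no"]
                 and r["b.branch_name"] == r["a.branch_name"])
result = project(matched, ["c.customer_name", "c.cust_street", "c.cust_city",
                           "b.branch_name", "b.branch_city",
                           "a.account_no", "a.balance"])
print(result)
```

With this sample data the product has two rows (one per customer), and only the row for Ann survives the selection.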
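The last query is a bit trickier and takes a little more thought. Note that in transactions the customer column is called customer, not customer_name, so it has to be renamed (written here as \rho_{new/old}) before the two projections can be subtracted.
\rho_{customer_name/customer}(\pi_{customer}(transactions))
are all the customers we want to ignore and
\pi_{customer_name}(customer)
are all customers. Hence,
\pi_{customer_name}(customer) - \rho_{customer_name/customer}(\pi_{customer}(transactions))
are all those we want to keep. Finally we need to find their cities (for the sake of simplicity I'm using the natural join operator |x|):
\pi_{customer_name, cust_city}((\pi_{customer_name}(customer) - \rho_{customer_name/customer}(\pi_{customer}(transactions))) |x| customer)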
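The difference-then-join idea maps directly onto Python sets. The sample customers below are hypothetical, and the sketch assumes that customer names are unique.

```python
# Hypothetical sample data.
customer = [("Ann", "Leeds"), ("Bob", "York"), ("Cy", "Hull")]  # (customer_name, cust_city)
transactions = [(1, "Ann", 50), (2, "Ann", 20)]                 # (account_no, customer, amount)

# Projection of each relation onto the customer name.
all_names = {name for name, _city in customer}
names_with_transactions = {cust for _acc, cust, _amount in transactions}

# Set difference: customers that never appear in transactions.
names_to_keep = all_names - names_with_transactions

# Join back to customer to recover the city of each remaining name.
result = sorted((name, city) for name, city in customer if name in names_to_keep)
print(result)  # [('Bob', 'York'), ('Cy', 'Hull')]
```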
I have a question about a LINQ select statement. I am new to LINQ, so any help would be appreciated. I did a lot of research, but I still haven't managed to write a correct LINQ statement.
I have these two tables and attributes:
Table Titles(title_id(PK), title) and
Table Sales(title_id(PK), qty)
where title_id and title are string values and qty is a number that represents a quantity.
I need to write a select that returns the five best-selling titles from these two tables.
So I need to sum qty (there can be several records with the same Sales.title_id), group by title_id, order by sum(qty) descending, and then return the title and title_id attributes.
How can I write a suitable query for this?
Regards,
Dahakka
You can do a group join of the tables by title_id (each group g represents all sales of the joined title). Then select the title and the total sales for that title, order the result by the totals, select the title, and take the required number of top-selling titles:
var query = (from t in db.Titles
join s in db.Sales on t.title_id equals s.title_id into g
select new { Title = t.title, Total = g.Sum(x => x.qty) } into ts
orderby ts.Total descending
select ts.Title).Take(5);
Resulting SQL will look like:
SELECT TOP (5) [t2].[title] AS [Title], [t2].[value] AS [Total]
FROM (
SELECT [t0].[title], (
SELECT SUM([t1].[qty])
FROM [Sales] AS [t1]
WHERE [t0].[title_id] = [t1].[title_id]
) AS [value]
FROM [Titles] AS [t0]
) AS [t2]
ORDER BY [t2].[value] DESC
The following is the LINQ query in method syntax:
sales.GroupBy(s => s.title_id)
     .Select(g => new
     {
         Title_id = g.Key,
         Sales = g.Sum(s => s.qty)   // total quantity sold for this title
     })
     .OrderByDescending(x => x.Sales)
     .Take(5)
     .Join(titles,
           sale => sale.Title_id,
           title => title.title_id,
           (sale, title) => new
           {
               Title = title.title,
               TotalSales = sale.Sales
           });
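The group, sum, order, take, and join sequence can be checked on a few rows with a Python sketch. The titles, quantities, and the `top_titles` helper are hypothetical and only illustrate the order of operations; they are not part of either LINQ answer.

```python
from collections import Counter

# Hypothetical sample data.
titles = {"T1": "Alpha", "T2": "Beta", "T3": "Gamma",
          "T4": "Delta", "T5": "Epsilon", "T6": "Zeta"}   # title_id -> title
sales = [("T1", 5), ("T1", 7), ("T2", 3), ("T3", 20),
         ("T4", 1), ("T5", 9), ("T6", 4), ("T6", 4)]      # (title_id, qty)


def top_titles(titles, sales, n=5):
    """Return (title_id, title, total qty) for the n best-selling titles."""
    totals = Counter()
    for title_id, qty in sales:          # group by title_id and sum qty
        totals[title_id] += qty
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]
    # Join the ranked totals back to the titles to pick up the title text.
    return [(title_id, titles[title_id], total) for title_id, total in ranked]


for row in top_titles(titles, sales):
    print(row)
```

With this data the title T4 (total 1) is the one that falls outside the top five.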
Following on from the question "SQL query that gives distinct results that match multiple columns",
which had a very neat solution, I was wondering what the next step would look like:
DOCUMENT_ID | TAG
----------------------------
1 | tag1
1 | tag2
1 | tag3
2 | tag2
3 | tag1
3 | tag2
4 | tag1
5 | tag3
So, to get all the document_ids that have tag 1 and 2 we would perform a query like this:
SELECT document_id
FROM table
WHERE tag = 'tag1' OR tag = 'tag2'
GROUP BY document_id
HAVING COUNT(DISTINCT tag) = 2
Now, what would be interesting to know is how we would get all the distinct document_ids that have tags 1 and 2, and in addition to that the ids that have tag 3.
We could imagine making the same query and performing a union between them:
SELECT document_id
FROM table
WHERE tag = 'tag1' OR tag = 'tag2'
GROUP BY document_id
HAVING COUNT(DISTINCT tag) = 2
UNION
SELECT document_id
FROM table
WHERE tag = 'tag3'
GROUP BY document_id
But I was wondering if with that condition added, we could think of another initial query. I am imagining having many "unions" like that with different tags and tag counts.
Wouldn't it be very bad in terms of performance to create chains of unions like that?
This still uses unions of sorts but may be easier to read and control. I am really interested in the speed of this query on a large data set, so please let me know how fast it is. When I put in your small data set it took 0.0001 secs.
SELECT DISTINCT (dt1.document_id)
FROM
document_tag dt1,
(SELECT document_id
FROM document_tag
WHERE tag = 'tag1'
) AS t1s,
(SELECT document_id
FROM document_tag
WHERE tag = 'tag2'
) AS t2s,
(SELECT document_id
FROM document_tag
WHERE tag = 'tag3'
) AS t3s
WHERE
(dt1.document_id = t1s.document_id
AND dt1.document_id = t2s.document_id
)
OR dt1.document_id = t3s.document_id
This will make it easy to add new parameters because you have already specified the result set for each tag.
For example adding:
OR dt1.document_id = t2s.document_id
to the end will also pick up document_id 2
It's possible to do this within a single query; however, you'll need to move the conditions from your WHERE clause into the HAVING clause in order to use a disjunction.
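One way such a single query might look is sketched below, run against an in-memory SQLite database from Python so it can be tried directly. The table name `document_tag` and the exact HAVING expressions are assumptions made for the example; other databases may need slightly different syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE document_tag (document_id INTEGER, tag TEXT)")
conn.executemany(
    "INSERT INTO document_tag VALUES (?, ?)",
    [(1, "tag1"), (1, "tag2"), (1, "tag3"), (2, "tag2"),
     (3, "tag1"), (3, "tag2"), (4, "tag1"), (5, "tag3")],
)

# No WHERE clause: each tag condition becomes a conditional aggregate
# in HAVING, so the conditions can be combined with OR.
query = """
SELECT document_id
FROM document_tag
GROUP BY document_id
HAVING COUNT(DISTINCT CASE WHEN tag IN ('tag1', 'tag2') THEN tag END) = 2
    OR SUM(CASE WHEN tag = 'tag3' THEN 1 ELSE 0 END) > 0
ORDER BY document_id
"""
matching_ids = [row[0] for row in conn.execute(query)]
print(matching_ids)  # [1, 3, 5]
```

Each additional tag combination becomes one more OR branch in HAVING rather than one more UNION.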
You're correct, that will get slower and slower as you add new tags you want to look for in additional UNION clauses. Each UNION clause is an additional query that needs to be planned and executed. Plus you won't be able to sort when you are done.
You're looking for a basic data warehousing technique. First, let me recreate your schema with one additional table.
create table a (document_id int, tag varchar(10));
insert into a values (1, 'tag1'), (1, 'tag2'), (1, 'tag3'), (2, 'tag2'),
(3, 'tag1'), (3, 'tag2'), (4, 'tag1'), (5, 'tag3');
create table b (tag_group_id int, tag varchar(10));
insert into b values (1, 'tag1'), (1, 'tag2'), (2, 'tag3');
Table b contains "tag groups". Group 1 includes tag1 and tag2, while group 2 contains tag3.
Now you can modify table b to represent the query you are interested in. When you are ready to query, you create temp tables to store aggregate data:
create temporary table c
(tag_group_id int, count_tags_in_group int, tags_in_group varchar(255));
insert into c
select
tag_group_id,
count(tag),
group_concat(tag)
from b
group by tag_group_id;
create temporary table d (document_id int, tag_group_id int, document_tag_count int);
insert into d
select
a.document_id,
b.tag_group_id,
count(a.tag) as document_tag_count
from a
inner join b on a.tag = b.tag
group by a.document_id, b.tag_group_id;
Now c contains the number of tags in each tag group, and d contains the number of tags each document has for each tag group. If a row in c matches a row in d, that means the document has all of the tags in that tag group.
select
d.document_id as "Document ID",
c.tags_in_group as "Matched Tag Group"
from d
inner join c on d.tag_group_id = c.tag_group_id
and d.document_tag_count = c.count_tags_in_group
One cool thing about this approach is that you could run reports like 'How many documents have 50% or more of the tags in each of these tag groups?'
select
d.document_id as "Document ID",
c.tags_in_group as "Matched Tag Group"
from d
inner join c on d.tag_group_id = c.tag_group_id
and d.document_tag_count >= 0.5 * c.count_tags_in_group
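The whole tag-group approach can be tried end to end with SQLite from Python. This is a sketch under a few assumptions: it uses `CREATE TEMPORARY TABLE ... AS SELECT` instead of separate create and insert statements, and it returns `tag_group_id` rather than the concatenated tag list so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (document_id INTEGER, tag TEXT);
INSERT INTO a VALUES (1, 'tag1'), (1, 'tag2'), (1, 'tag3'), (2, 'tag2'),
                     (3, 'tag1'), (3, 'tag2'), (4, 'tag1'), (5, 'tag3');
CREATE TABLE b (tag_group_id INTEGER, tag TEXT);
INSERT INTO b VALUES (1, 'tag1'), (1, 'tag2'), (2, 'tag3');

-- c: number of tags in each tag group
CREATE TEMPORARY TABLE c AS
SELECT tag_group_id, COUNT(tag) AS count_tags_in_group
FROM b GROUP BY tag_group_id;

-- d: number of matching tags per document and tag group
CREATE TEMPORARY TABLE d AS
SELECT a.document_id, b.tag_group_id, COUNT(a.tag) AS document_tag_count
FROM a INNER JOIN b ON a.tag = b.tag
GROUP BY a.document_id, b.tag_group_id;
""")

# {condition} is filled with a fixed string below, never with user input.
base_query = """
SELECT d.document_id, d.tag_group_id
FROM d INNER JOIN c ON d.tag_group_id = c.tag_group_id
AND {condition}
ORDER BY d.document_id, d.tag_group_id
"""

# Documents that have every tag of a group.
full_matches = conn.execute(base_query.format(
    condition="d.document_tag_count = c.count_tags_in_group")).fetchall()
# Documents that have at least half of the tags of a group.
half_matches = conn.execute(base_query.format(
    condition="d.document_tag_count >= 0.5 * c.count_tags_in_group")).fetchall()

print(full_matches)  # [(1, 1), (1, 2), (3, 1), (5, 2)]
print(half_matches)  # [(1, 1), (1, 2), (2, 1), (3, 1), (4, 1), (5, 2)]
```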