Hibernate, Order by count of subquery on ManyToMany using Predicates - spring

I have a query written using Predicates in Hibernate and I need to add a subquery on a join table to count the number of joins and order by the number of joins where they exist in an array of ids.
The join table is a ManyToMany relation.
I am using flyway to setup the table schema, so while the join table exists in the database, a join model is not needed in Hibernate to join the 2 related models therefore no join model exists.
I don't care about retrieving these related models, I just want the number of joins so that I can order by them.
The following is PostGreSQL, which works. I need to convert the following PSQL into a Predicate based query:
SELECT u.*, COUNT(jui.interest_id) AS juiCount
FROM "user" u
LEFT JOIN (
SELECT ui.user_id, ui.interest_id
FROM user_interest ui
WHERE ui.interest_id IN (?)
) AS jui ON u.id = jui.user_id
GROUP BY u.id
ORDER BY juiCount DESC
Where the ids provided for the IN condition are passed into the subquery. The above query is in PostGreSQL.
Working with what I have so far:
CriteriaBuilder b = em.getCriteriaBuilder();
CriteriaQuery<User> q = b.createQuery(User.class);
Root<User> u = q.from(User.class);
// This doesn't make sense because this is not a join table
// Subquery<Interest> sq = q.subquery(Interest.class)
// Root<Interest> squi = sq.from(Interest.class);
// sq.select(squi);
// sq.where(b.in("interest_id", new LiteralExpression<Long[]>((CriteriaBuilderImpl) b, interestIds)));
q.orderBy(
// b.desc(b.tuple(u, b.count(squi))),
b.asc(u.get(User_.id))
);
q.where(p);
return em
.createQuery(q)
.getResultList();
Everything I have managed to find doesn't quite seem to fit right given that they are not using ManyToMany in the use of q.subquery() in their example.
Anyone can help fill in the blanks on this please?

Related

Using query hints to use a index in an inner table

I have a query which uses the view a as follows and the query is extremely slow.
select *
from a
where a.id = 1 and a.name = 'Ann';
The view a is made up another four views b,c,d,e.
select b.id, c.name, c.age, e.town
from b,c,d,e
where c.name = b.name AND c.id = d.id AND d.name = e.name;
I have created an index on the table of c named c_test and I need to use it when executing the first query.
Is this possible?
Are you really using this deprecated 1980s join syntax? You shouldn't. Use proper explicit joins (INNER JOIN in your case).
You are joining the two tables C and D on their IDs. That should mean they are 1:1 related. If not, "ID" is a misnomer, because an ID is supposed to identify a row.
Now let's look at the access route: You have the ID from table B and the name from tables B and C. We can tell from the column name that b.id is unique and Oracle guarantees this with a unique index, if the database is set up properly.
This means the DBMS will look for the B row with ID 1, find it instantly in the index, find the row instantly in the table, see the name and see whether it matches 'Ann'.
The only thing that can be slow hence is joining C, D, and E. Joining on unique IDs is extremely fast. Joining on (non-unigue?) names is only fast, if you provide indexes on the names. I'd recommend the following indexes accordingly:
create index idx_c on c (name);
create index idx_e on e (name);
To get this faster still, use covering indexes instead:
create index idx_b on b (id, name);
create index idx_c on c (name, id, age);
create index idx_d on d (id, name);
create index idx_e on e (name, town);

How to return values from query when joining multiple tables in Linq?

I have a Linq queries that have tables join and couple of tables inner join together. Sometimes I got an error from the query when table is empty. What I trying to do is I am tryting to get a value from table even if other table is empty.
Thanks in Advance.
You need to do left join
Assuming left join between customer and order table.
var query =
from customer in dc.Customers
from order
in dc.Orders
.Where(o => customer.CustomerId == o.CustomerId)
.DefaultIfEmpty()
select new { Customer = customer, Order = order }
Also refer below link
http://forums.asp.net/t/1792428.aspx/1

How to improve LINQ to EF performance

I have two classes: Property and PropertyValue. A property has several values where each value is a new revision.
When retrieving a set of properties I want to include the latest revision of the value for each property.
in T-SQL this can very efficiently be done like this:
SELECT
p.Id,
pv1.StringValue,
pv1.Revision
FROM dbo.PropertyValues pv1
LEFT JOIN dbo.PropertyValues pv2 ON pv1.Property_Id = pv2.Property_Id AND pv1.Revision < pv2.Revision
JOIN dbo.Properties p ON p.Id = pv1.Property_Id
WHERE pv2.Id IS NULL
ORDER BY p.Id
The "magic" in this query is to join on the lesser than condition and look for rows without a result forced by the LEFT JOIN.
How can I accomplish something similar using LINQ to EF?
The best thing I could come up with was:
from pv in context.PropertyValues
group pv by pv.Property into g
select g.OrderByDescending(p => p.Revision).FirstOrDefault()
It does produce the correct result but is about 10 times slower than the other.
Maybe this can help. Where db is the database context:
(
from pv1 in db.PropertyValues
from pv2 in db.PropertyValues.Where(a=>a.Property_Id==pv1.Property_Id && pv1.Revision<pv2.Revision).DefaultIfEmpty()
join p in db.Properties
on pv1.Property_Id equals p.Id
where pv2.Id==null
orderby p.Id
select new
{
p.Id,
pv1.StringValue,
pv1.Revision
}
);
Next to optimizing a query in Linq To Entities, you also have to be aware of the work it takes for the Entity Framework to translate your query to SQL and then map the results back to your objects.
Comparing a Linq To Entities query directly to a SQL query will always result in lower performance because the Entity Framework does a lot more work for you.
So it's also important to look at optimizing the steps the Entity Framework takes.
Things that could help:
Precompile your query
Pre-generate views
Decide for yourself when to open the database connection
Disable tracking (if appropriate)
Here you can find some documentation with performance strategies.
if you want to use multiple conditions (less than expression) in join you can do this like
from pv1 in db.PropertyValues
join pv2 in db.PropertyValues on new{pv1.Property_ID, Condition = pv1.Revision < pv2.Revision} equals new {pv2.Property_ID , Condition = true} into temp
from t in temp.DefaultIfEmpty()
join p in db.Properties
on pv1.Property_Id equals p.Id
where t.Id==null
orderby p.Id
select new
{
p.Id,
pv1.StringValue,
pv1.Revision
}

Writing a LINQ query to traverse many tables

I wanted to write a LINQ query based on the SQL below.
Basically this strategy seems really confusing - why start from MerchantGroupMerchant and do 2 'from' statements?
Problem: Is there a simpler way to write this LINQ query?
var listOfCampaignsMerchantIsInvolvedIn =
(from merchantgroupactivity in uow.MerchantGroupActivities
from merchantgroupmerchant in uow.MerchantGroupMerchants
where merchantgroupmerchant.MerchantU.Id == merchantUIDGuid
select new
{
merchantgroupactivity.ActivityU.CampaignU.Id
}).Distinct();
Here is the table structure:
and the SQL:
SELECT DISTINCT Campaign.ID
FROM Campaign
INNER JOIN Activity
ON ( Campaign.CampaignUID = Activity.CampaignUID )
INNER JOIN MerchantGroupActivity
ON ( Activity.ActivityUID = MerchantGroupActivity.ActivityUID )
INNER JOIN MerchantGroup
ON ( MerchantGroup.MerchantGroupUID = MerchantGroupActivity.MerchantGroupUID )
INNER JOIN MerchantGroupMerchant
ON ( MerchantGroupMerchant.MerchantGroupUID = MerchantGroup.MerchantGroupUID )
INNER JOIN Merchant
ON ( Merchant.MerchantUID = MerchantGroupMerchant.MerchantUID )
WHERE Merchant.ID = 'M1'
No, not really, even if you use views to partially or completely reduce query size your execution plan will still look the same in the end (and execute just as fast/slow). If you have to traverse 5 joins then you have to traverse 5 joins, the only cure is "shorting" the model by introducing links between say merchant and activity or merchant and campaign. You can accomplish this by either introducing the M2M table between them (at the cost of manual maintenance), but I would not recommend it unless retrieval is really an issue. If this query is too slow you should check for existence of indexes on all join FK fields.

Linq To Entity Framework selecting whole tables

I have the following Linq statement:
(from order in Orders.AsEnumerable()
join component in Components.AsEnumerable()
on order.ORDER_ID equals component.ORDER_ID
join detail in Detailss.AsEnumerable()
on component.RESULT_ID equals detail.RESULT_ID
where orderRestrict.ORDER_MNEMONIC == "MyOrderText"
select new
{
Mnemonic = detail.TEST_MNEMONIC,
OrderID = component.ORDER_ID,
SeqNumber = component.SEQ_NUM
}).ToList()
I expect this to put out the following query:
select *
from Orders ord (NoLock)
join Component comp (NoLock)
on ord .ORDER_ID = comp.ORDER_ID
join Details detail (NoLock)
on comp.RESULT_TEST_NUM = detail .RESULT_TEST_NUM
where res.ORDER_MNEMONIC = 'MyOrderText'
but instead I get 3 seperate queries that select all rows from the tables. I am guessing that Linq is then filtering the values because I do get the correct values in the end.
The problem is that it takes WAY WAY too long because it is pulling down all the rows from all three tables.
Any ideas how I can fix that?
Remove the .AsEnumerable()s from the query as these are preventing the entire query being evaluated on the server.

Resources