I have a generic repository with the following method:
public List<employee> List()
{
return _dbContext.Set<employee>().Where(x => x.Status == "Active").ToList();
}
This call takes 2 to 5 minutes when I have 2000 records.
If I use EF directly, without the repository, it is very fast:
var p = from emp in context.Inboxes where emp.Status == "Active" select emp;
How can I improve the performance of the repository query?
I created a class:
class Employee { Integer id; String name; String departments; }
and in a SQL Server database I have the records shown below.
Departments are stored as a ";"-separated string, for example departments = Computer;Civil:
1,Chaitanya,Computer;Civil
2,Tom,Physics;Chemistry
3,Harry,Economics;Commerce
4,Henry,Computer;Civil;Mechanical
5,Ravi,null
Now I want to filter the data by department. Say the frontend has a multiselect listing the departments and I select two of them, for example Computer and Civil. The backend then receives them as a List<String> departmentFilter parameter containing Computer and Civil.
For this selection, the Spring Boot controller has to return two records:
1,Chaitanya,Computer;Civil
4,Henry,Computer;Civil;Mechanical
Right now I execute a query that fetches all the records and then apply the logic below:
List<Employee> employeesToBeRemoved = new ArrayList<>();
if (!departmentNames.isEmpty()) {
allEmployees.forEach(employee -> {
if (employee.getDepartment() != null) {
Set<String> departmentNamesResult = new HashSet<>(Arrays.asList(employee.getDepartment().
split(";")));
Boolean isExist = Collections.disjoint(departmentNamesResult, departmentNames);
if (Boolean.TRUE.equals(isExist)) {
employeesToBeRemoved.add(employee);
}
} else {
employeesToBeRemoved.add(employee);
}
});
}
allEmployees.removeAll(employeesToBeRemoved);
I tried to move this into predicates but was not able to. This solution takes a long time to execute.
Please suggest a better (optimized) way to improve performance.
Is there any way to add this filter to the predicates?
Another approach I am considering (12/05/2022):
Say I have a table employee_department_mapping with the columns employeeId and departmentName. Is this the correct way to add the predicate?
CriteriaQuery<Object> subQuery1 = criteriaBuilder.createQuery();
Root<EmployeeDepartmentMapping> subQueryEmpDptMp = subQuery1.from(EmployeeDepartmentMapping.class);
predicates1.add(subQueryEmpDptMp.get("departmentName").in(departmentNames));
You might achieve better performance by splitting your table and using join:
class Employee { Integer id; String name; Integer departmentsId; }
class EmployeeDepartments { Integer departmentsId; String department; }
You may use JPA's @ElementCollection to achieve this.
Now, instead of having the following row:
1,Chaitanya,Computer;Civil
You will have the following:
table1:
1,Chaitanya,123
table2:
123,Computer
123,Civil
Join table2 with table1 and filter on the department column to get your result.
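As a rough illustration of what that join computes, here is a minimal plain-Java sketch (no JPA, no database). The class and method names, and the ids 101 to 104 used in the usage note, are hypothetical and only stand in for table1 and table2. It keeps an employee when at least one of its departments is in the filter, which mirrors the Collections.disjoint check in the question.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java stand-ins for table1 and table2; every name here is illustrative.
class DepartmentJoinSketch {
    static final class Employee {
        final int id;
        final String name;
        final Integer departmentsId; // null when the employee has no departments

        Employee(int id, String name, Integer departmentsId) {
            this.id = id;
            this.name = name;
            this.departmentsId = departmentsId;
        }
    }

    static final class EmployeeDepartment {
        final int departmentsId;
        final String department;

        EmployeeDepartment(int departmentsId, String department) {
            this.departmentsId = departmentsId;
            this.department = department;
        }
    }

    // Keeps employees that have at least one table2 row whose department is wanted.
    static List<Employee> filterByDepartments(List<Employee> employees,
                                              List<EmployeeDepartment> rows,
                                              Set<String> wanted) {
        // The "WHERE department IN (...)" part: collect the matching group ids.
        Set<Integer> matchingIds = new HashSet<>();
        for (EmployeeDepartment row : rows) {
            if (wanted.contains(row.department)) {
                matchingIds.add(row.departmentsId);
            }
        }
        // The join back to table1; employees without departments never match.
        List<Employee> result = new ArrayList<>();
        for (Employee e : employees) {
            if (e.departmentsId != null && matchingIds.contains(e.departmentsId)) {
                result.add(e);
            }
        }
        return result;
    }
}
```

With the five sample employees from the question and the filter Computer, Civil, this keeps Chaitanya and Henry. In a real application the database would do the same work through the join, so only the matching rows are transferred.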
I found that adding .First() after Take(1) in the following LINQ query
{
var qty= (from ii in Inventory
where ii.Part == "abc" & ii.Zone == "xyz"
select ii.Qty).Take(1);
}
increases execution time several thousand times. Same with .Single(). Wondering why. Note that even without First the result already has only one record.
Full code:
namespace ConsoleApp1
{
class SurroundingClass
{
class part
{
public string id { get; set; }
public string zone { get; set; }
public int qty { get; set; }
}
public static void Main()
{
List<part> inventory = new List<part>();
for (var i = 1; i <= 50000; i++)
inventory.Add(new part() { id = System.Convert.ToString(i), zone = System.Convert.ToString(i), qty = 3 });
object qty1;
DateTime d0 = DateTime.Now;
for (var i = 1; i <= 20000; i++)
qty1 = (from x in inventory
where x.id == "40000" & x.zone == "40000"
select x.qty).Take(1).First();
DateTime d1 = DateTime.Now;
Console.WriteLine(((TimeSpan)(d1 - d0)).Seconds);
}
}
}
In the first code example you are not executing the query, only creating it.
First(), Single(), ToArray() and some other methods trigger query execution / enumeration.
Following @Vladimir's answer, you need to be aware of the difference between deferred and immediate query execution in LINQ.
From that point of view, .First(), .Single(), and even a foreach (For Each in VB) loop cause immediate query execution.
That is why your query appears several thousand times slower: without First() the loop only builds query objects and never runs them, while with First() every iteration actually scans the list.
Some tips on which one to use:
Immediate Query Execution
If you want to cache the results of a query.
If you want to get the final result without re-executing the query.
Deferred Query Execution
If you want to build up a complex query in several steps by separating query construction from query execution.
If you want to fetch the latest information each time the query is enumerated.
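The distinction is not specific to LINQ. As a hedged illustration in Java rather than the C# of the question, streams behave in a comparable way: building the pipeline inspects nothing, and only a terminal operation such as findFirst() runs it. The class name and the counter below are made up for this sketch.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Stream;

// Illustrative analogue of deferred execution; not the LINQ API itself.
class DeferredExecutionSketch {
    // Counts how many elements the filter actually inspects.
    static final AtomicInteger inspected = new AtomicInteger();

    // Only describes the query; no element is inspected until a terminal operation runs.
    static Stream<Integer> buildQuery(List<Integer> source) {
        return source.stream()
                .filter(x -> {
                    inspected.incrementAndGet();
                    return x == 40000;
                })
                .limit(1);
    }
}
```

Calling buildQuery(data) alone leaves the counter at zero. Calling findFirst() on the returned stream is what walks the list until the match is found, which is the cost that First() adds in the question.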
I'm using LINQ to execute a query on a List type variable with a large amount of data (over a million). For performance purposes I'm using IEnumerable to store the results but when I try to access it there is a slight delay.
Specifically I want to see if the query produced any results, but when I use the .Count() or .Any() functions the performance drops.
I read that for IEnumerable types the execution of the query happens at the time of need, hence the delay. Is there a way to see if the IEnumerable has elements inside it without having that much delay?
This is what I'm trying to run.
IEnumerable<Entity> matchingEntities = entities.Where(e => e.Names.Any(n => myEntity.Names.Any(entityName => entityName.CompareNameObjects(n))));
and here are my classes
public class Entity
{
public string EntityIdentifier { get; set; }
public List<Name> Names { get; set; }
}
public class Name
{
public string FullName { get; set; }
public string NameType { get; set; }
public bool CompareNameObjects(Name name2)
{
return FullName == name2.FullName &&
NameType == name2.NameType;
}
}
entities is a list of all my objects and I want to check if myEntity has any Names identical with another entity in the set.
EDITED:
The data structure is similar to the 2 classes (Entity and Name). The entities are created by selecting all the entities, along with their names, from the database in XML format and then I convert the XML to a List as such:
List<Entity> entities = new List<Entity>();
using (SqlConnection conn = new SqlConnection(ConfigurationManager.ConnectionStrings["myCS"].ConnectionString))
{
conn.Open();
SqlCommand cmd = new SqlCommand("GetAllEntities", conn);
cmd.CommandType = CommandType.StoredProcedure;
string entitiesXml = "";
using (SqlDataReader rdr = cmd.ExecuteReader())
{
while (rdr.Read())
{
entitiesXml += rdr["XmlString"].ToString();
}
}
using (TextReader reader = new StringReader(entitiesXml))
entities = (List<Entity>)xmlSerializer.Deserialize(reader);
conn.Close();
}
GetAllEntities (Stored Procedure):
declare @xmlString nvarchar(max) =(
select e.EntityIdentifier,
(
select n.[Full Name] as 'FullName',
n.[Name Type] as 'NameType'
from tblNames n
where e.EntityID=n.[Entity_ID]
for xml path('Name'), type
)
from tblEntities e
order by e.EntityID
for xml path('Entity')
)
select @xmlString as XmlString
In general, you should avoid fetching all the data from your database and then filtering it in C# code; loading and scanning everything in memory is expensive.
As a quick fix, however, you can improve performance by first putting the names you are looking for into a Dictionary.
// Let's say you have myEntity here
var myEntity = new Entity();
var entities = new List<Entity>();
// Prepare the lookup of names you want to find once, up front, so that it is not rebuilt on every iteration
var names = myEntity.Names.Select(p=> p.FullName + p.NameType ).ToDictionary(p=>p, p=>p);
IEnumerable<Entity> matchingEntities = entities.Where(e => e.Names.Any(n => names.ContainsKey(n.FullName + n.NameType)));
This is just an example to give you the idea; it can be improved much further. I hope it helps.
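One caveat with a concatenated key is that different pairs can collide ("ab" + "c" equals "a" + "bc"). A minimal sketch of the same lookup idea with a two-part key, written in Java rather than C# and using hypothetical class names, might look like this:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative stand-ins for the Entity and Name classes from the question.
class NameLookupSketch {
    static final class Name {
        final String fullName;
        final String nameType;

        Name(String fullName, String nameType) {
            this.fullName = fullName;
            this.nameType = nameType;
        }

        // Two-element list as composite key: both parts are compared separately,
        // so ("ab", "c") and ("a", "bc") stay distinct.
        List<String> key() {
            return Arrays.asList(fullName, nameType);
        }
    }

    static final class Entity {
        final String id;
        final List<Name> names;

        Entity(String id, List<Name> names) {
            this.id = id;
            this.names = names;
        }
    }

    // Returns the entities sharing at least one (fullName, nameType) pair with mine.
    static List<Entity> matching(Entity mine, List<Entity> all) {
        // Build the lookup once, outside the loop over all entities.
        Set<List<String>> wanted = new HashSet<>();
        for (Name n : mine.names) {
            wanted.add(n.key());
        }
        List<Entity> result = new ArrayList<>();
        for (Entity e : all) {
            for (Name n : e.names) {
                if (wanted.contains(n.key())) {
                    result.add(e);
                    break; // one shared name is enough
                }
            }
        }
        return result;
    }
}
```

Each name of each entity is then checked with one hash lookup instead of a nested scan over myEntity.Names.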
I am using Entity Framework inside my ASP.NET MVC web application, but I cannot understand how it handles multiple transactions accessing the same data.
For example, I have the following action method that deletes a collection of records and then loops through a collection and adds the new records:
[HttpPost]
public ActionResult AssignPermisionLevel2(ICollection<SecurityroleTypePermision> list, int id)
{
repository.DeleteSecurityroleTypePermisions(id);
foreach (var c in list)
{
repository.InsertOrUpdateSecurityroleTypePermisions(c,User.Identity.Name);
}
repository.Save();
return RedirectToAction("AssignPermisionLevel", new { id = id });
}
This calls the following repository methods:
public void DeleteSecurityroleTypePermisions(int securityroleID)
{
var r = tms.SecurityroleTypePermisions.Where(a => a.SecurityRoleID == securityroleID);
foreach (var c in r) {
tms.SecurityroleTypePermisions.Remove(c);
}
}
and
public void InsertOrUpdateSecurityroleTypePermisions(SecurityroleTypePermision role, string username)
{
var auditinfo = IntiateAdminAudit(tms.AuditActions.SingleOrDefault(a => a.Name.ToUpper() == "ASSIGN PERMISION").ID, tms.SecurityTaskTypes.SingleOrDefault(a => a.Name.ToUpper() == "SECURITY ROLE").ID, username, tms.SecurityRoles.SingleOrDefault(a=>a.SecurityRoleID == role.SecurityRoleID).Name, tms.PermisionLevels.SingleOrDefault(a=>a.ID== role.PermisionLevelID).Name + " --> " + tms.TechnologyTypes.SingleOrDefault(a=>a.AssetTypeID == role.AssetTypeID).Name);
tms.SecurityroleTypePermisions.Add(role);
InsertOrUpdateAdminAudit(auditinfo);
}
So let's say two users access the same action method at the same time. Will their transactions conflict with each other, or will all of one transaction's actions (deletion and addition) execute before the other transaction starts?
UPDATE
Inside my controller class I instantiate the repository as follows:
[Authorize]
public class SecurityRoleController : Controller
{
Repository repository = new Repository();
My second question: you mentioned that EF marks the entities for deletion or insertion, and that the SQL then executes inside the database. But what if one SQL statement deletes some entities while another SQL statement from the second transaction deletes other entities? Could this conflict happen at the database level, or does the first transaction's SQL statement, once it starts executing, prevent other transactions from executing? Can you advise?
This entirely depends on how you implement your DbContext. If your context is instantiated within a controller then each transaction will be contained within that context, i.e.
public class SomeController : Controller
{
Repository repository = new Repository();
[HttpPost]
public ActionResult AssignPermisionLevel2(ICollection<SecurityroleTypePermision> list, int id)
{
repository.DeleteSecurityroleTypePermisions(id);
foreach (var c in list)
{
repository.InsertOrUpdateSecurityroleTypePermisions(c,User.Identity.Name);
}
repository.Save();
return RedirectToAction("AssignPermisionLevel", new { id = id });
}
}
Each request creates its own instance of the repository, so the two will not conflict at the application level. When SaveChanges is called on a DbContext, the work is done in a single transaction, and because the repository object is created per request, each request runs in its own transaction.
Unfortunately, Entity Framework does not delete as you might expect: it deletes individual entities rather than the entire table in one go. What actually happens when you remove the entities in the first step and add them in the second is as follows:
Load Entities X,Y, and Z
Mark X,Y, and Z for deletion
Insert new rows A, B and C
Run SQL which deletes X, Y and Z, and inserts A, B and C
Now, if two requests come in at the same time, X, Y and Z may be loaded in step 1 by both request contexts. Both contexts mark them for deletion, and each queues its own set of A, B and C for insertion. The first transaction executes fine, but when the second transaction commits it cannot find X, Y and Z because they no longer exist.
You may be able to use a lock over the critical section so that the entities are not loaded before they are deleted by another request. The lock would have to be static so something such as:
public class SecurityRoleController : Controller
{
Repository repository = new Repository();
private static readonly object REQUEST_LOCK = new object();
[HttpPost]
public ActionResult AssignPermisionLevel2(ICollection<SecurityroleTypePermision> list, int id)
{
lock(REQUEST_LOCK)
{
repository.DeleteSecurityroleTypePermisions(id);
foreach (var c in list)
{
repository.InsertOrUpdateSecurityroleTypePermisions(c,User.Identity.Name);
}
repository.Save();
}
return RedirectToAction("AssignPermisionLevel", new { id = id });
}
}
Update 2
There are two sides to your problem, the way SQL handles transactions and the way Entity Framework performs deletes. Without going into massive detail on threading you basically have to lock the action so that the same method cannot execute twice at exactly the same time. This will prevent the context from reading potentially stale/already deleted data.
You can read more on SQL/EF race conditions with this question: Preventing race condition of if-exists-update-else-insert in Entity Framework
Let's say I have a generic list of the following objects:
public class Supermarket
{
public string Brand { get; set; }
public string Suburb { get; set; }
public string State { get; set; }
public string Country { get; set; }
}
Using a List<Supermarket> populated with many of these objects with different values, I am trying to:
Select the distinct Suburb properties from a superset of Supermarket objects contained in a List<Supermarket> (say this superset contains 20 distinct Suburbs).
Join the distinct list of Suburbs above to another set of aggregated and counted Suburbs obtained by a LINQ query against a different, smaller List<Supermarket>.
The distinct items in my superset are:
"Blackheath"
"Ramsgate"
"Penrith"
"Vaucluse"
"Newtown"
And the results of my aggregate query are:
"Blackheath", 50
"Ramsgate", 30
"Penrith", 10
I want to join them to get
"Blackheath", 50
"Ramsgate", 30
"Penrith", 10
"Vaucluse", 0
"Newtown", 0
Here is what I have tried so far:
var results = from distinctSuburb in AllSupermarkets.Select(x => x.Suburb).Distinct()
select new
{
Suburb = distinctSuburb,
Count = (from item in SomeSupermarkets
group item by item.Suburb into aggr
select new
{
Suburb = aggr.Key,
Count = aggr.Count()
} into merge
where distinctSuburb == merge.Suburb
select merge.Count).DefaultIfEmpty(0)
} into final
select final;
This is the first time I have had to post on Stack Overflow, as it's such a great resource, but I can't seem to cobble together a solution for this.
Thanks for your time
EDIT: OK, so I solved this a short while after the initial post. The only thing I was missing was chaining a call to .ElementAtOrDefault(0) after the call to .DefaultIfEmpty(0). I also verified that chaining .First() after .DefaultIfEmpty(0), as Ani pointed out, works. The correct query is as follows:
var results = from distinctSuburb in AllSupermarkets.Select(x => x.Suburb).Distinct()
select new
{
Suburb = distinctSuburb,
Count = (from item in SomeSupermarkets
group item by item.Suburb into aggr
select new
{
Suburb = aggr.Key,
Count = aggr.Count()
} into merge
where distinctSuburb == merge.Suburb
select merge.Count).DefaultIfEmpty(0).ElementAtOrDefault(0)
} into final
select final;
LASTLY: I ran Ani's code snippet and confirmed that it ran successfully, so both approaches work and solve the original question.
I don't really understand the assumed equivalence between State and Suburb (where distinctSuburb == merge.State), but you can fix your query by adding a .First() after the DefaultIfEmpty(0) call.
Here is how I would write your query, using a GroupJoin:
var results = from distinctSuburb in AllSupermarkets.Select(x => x.Suburb).Distinct()
join item in SomeSupermarkets
on distinctSuburb equals item.Suburb
into suburbGroup
select new
{
Suburb = distinctSuburb,
Count = suburbGroup.Count()
};
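The GroupJoin above behaves like a left outer join followed by a count per key. A rough sketch of the same idea in Java (not the C# of this thread), working on plain suburb strings instead of Supermarket objects, is shown below; the class and method names are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative left-join-and-count over suburb names.
class SuburbCountSketch {
    // For every distinct suburb in allSuburbs, reports how often it occurs in
    // someSuburbs, using 0 when it does not occur at all.
    static Map<String, Long> countsPerSuburb(List<String> allSuburbs, List<String> someSuburbs) {
        // Group and count the smaller list once.
        Map<String, Long> counts = someSuburbs.stream()
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
        // LinkedHashMap keeps first-seen order and drops duplicate suburbs.
        Map<String, Long> result = new LinkedHashMap<>();
        for (String suburb : allSuburbs) {
            result.putIfAbsent(suburb, counts.getOrDefault(suburb, 0L));
        }
        return result;
    }
}
```

Suburbs that appear only in the superset, such as Vaucluse and Newtown in the example data, come out with a count of 0, which is the result the question asks for.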