How to retrieve element from IQueryable in Parallel Loop - linq

I get my records here:
IQueryable<EmployeeItem> dtEmployee = GetAll();
After that, I loop over dtEmployee.
This is my normal loop which is working fine.
for (int i = 0; i < dtEmployee.Count(); i++)
{
var drEmployee = dtEmployee.AsEnumerable().ElementAt(i);
}
This is the parallel loop that I want to try.
System.Threading.Tasks.Parallel.For(0, dtEmployee.Count(), i => {
var drEmployee = dtEmployee.AsEnumerable().ElementAt(i);
});
I don't get any compile errors, but when I run it in Visual Studio, I get this error:

"This is my normal loop which is working fine."
I wouldn't say fine; it's incredibly inefficient. dtEmployee represents a query, and AsEnumerable() turns it from a database query into a lazy in-memory sequence, so each ElementAt() call (and each Count() call) causes a separate database query, which likely retrieves all the employees.
To fix this, use foreach:
foreach (var drEmployee in dtEmployee)
{
}
"This is the parallel loop that I want to try."
Since dtEmployee is not thread-safe, you can't access it from multiple threads at the same time. The fix here is actually pretty much the same as above: use Parallel.ForEach():
Parallel.ForEach(dtEmployee, drEmployee => {
});
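If you would rather run the query exactly once before any parallel work starts, one option is to materialize it first and hand the resulting list to Parallel.ForEach. The following is only a sketch: the GetAll() here is a hypothetical in-memory stand-in for the question's method, EmployeeItem's properties are invented for illustration, and it assumes the whole result set fits in memory.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class EmployeeItem
{
    // Hypothetical properties; the question does not show the real class.
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

public static class ParallelDemo
{
    // Hypothetical stand-in for the question's GetAll(); a real one would query the database.
    public static IQueryable<EmployeeItem> GetAll() =>
        Enumerable.Range(1, 100)
                  .Select(i => new EmployeeItem { Id = i, Name = "Employee " + i })
                  .AsQueryable();

    public static int ProcessAll()
    {
        // Execute the query once, on the calling thread.
        List<EmployeeItem> employees = GetAll().ToList();

        // ConcurrentBag is safe to add to from several threads at once.
        var processedIds = new ConcurrentBag<int>();

        Parallel.ForEach(employees, employee =>
        {
            // Per-employee work goes here; it must itself be thread-safe.
            processedIds.Add(employee.Id);
        });

        return processedIds.Count;
    }

    public static void Main()
    {
        Console.WriteLine(ProcessAll());
    }
}
```

Whether this is worth it depends on the per-item work: if the loop body is trivial, a plain foreach over the list is likely just as fast.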

Related

Can somebody tell me what error I have in this BPM?

This code is meant to auto-generate a new part number. It is a post-processing BPM for BO GetNewPart.
int iPartnum = 0;
string cPartid = string.Empty;
Erp.Tables.Company Company;
foreach (var ttpart_xRow in ttPart)
{
var ttpartRow = ttpart_xRow;
Company = (from Company_Row in Db.Company
where Company_Row.Company == Session.CompanyID
select Company_Row).FirstOrDefault();
iPartnum = (decimal)Company["AutoGenerate_c"] + 1;
cPartid = System.Convert.ToString(iPartnum);
ttpartRow.PartNum = cPartid;
Services.Lib.UpdateTableBuffer._UpdateTableBuffer(Company,"AutoGenerate_c", iPartnum);
}
Is it just not working or is there an error message?
Services.Lib.UpdateTableBuffer._UpdateTableBuffer(Company,"AutoGenerate_c", iPartnum);
I have personally never used or even seen this Lib item so I can't vouch for it. I would update the object manually inside of a transaction scope because I doubt GetNewPart ever touches that database and therefore probably doesn't create a transaction.
using (System.Transactions.TransactionScope txScope = IceDataContext.CreateDefaultTransactionScope())//start the transaction
{
//Your Logics go here
Db.Validate();
txScope.Complete();//commit the transaction
}
As a side note, I try to keep these sorts of things off of the company record because nearly every process in the system touches it and I don't want a process to lock it up or cause weird race conditions. I generally like to reserve a record that will only get touched for this specific purpose so I have a UDCodeType/UDCode for this sort of thing.

Why is the second LINQ query faster?

In this code:
static bool Spin(int WaitTime)
{
Console.WriteLine("Running task {0} : thread {1}]",
Task.CurrentId, Thread.CurrentThread.ManagedThreadId);
Thread.Sleep(WaitTime);
return true;
}
public void DemoPLINQLong()
{
var SomeBigNumber = 1000000;
var sequence = Enumerable.Range(0, SomeBigNumber);
var sw = new Stopwatch();
sw.Start();
sequence.Where(i => Spin(SomeBigNumber));
sw.Stop();
var synchTime = sw.Elapsed;
sw.Restart();
sequence.Where(i => Spin(SomeBigNumber));
sw.Stop();
var asynchTime = sw.Elapsed;
Console.WriteLine("Synchronous: {0} Asynchronous: {1}",
synchTime.ToString(), asynchTime.ToString());
}
The results are consistent:
Synchronous: 00:00:00.0021800 Asynchronous: 00:00:00.0000076
Why is the second LINQ query hundreds of times faster? Is there some kind of caching going on? How?
.NET compiles code to native instructions the first time it is executed; this is known as Just-In-Time (JIT) compilation. Subsequent calls to the same code reuse the already-compiled result, which is why you'll frequently see the first run of nearly anything being much slower than later runs of the same code.
A couple of side notes about the posted code:
Not sure what the "Synchronous" and "Asynchronous" terms are referring to; both examples are the exact same thing and there is nothing Asynchronous about them.
If you're not aware, none of the LINQ is being evaluated in the example, because LINQ uses deferred execution. You can see this by changing sequence.Where(i => Spin(SomeBigNumber)) to sequence.Where(i => Spin(SomeBigNumber)).ToList(). ToList() forces evaluation of the LINQ predicate, so the Console.WriteLine in the Spin method will write to the console.
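To make the deferred-execution point concrete, here is a minimal sketch that counts how often the predicate runs instead of sleeping. The Check method and the Calls counter are names made up for this illustration; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Counts how many times the predicate has actually been invoked.
    public static int Calls;

    static bool Check(int i)
    {
        Calls++;
        return true;
    }

    public static (int before, int after) Run()
    {
        Calls = 0;

        // Building the query does not invoke the predicate.
        IEnumerable<int> query = Enumerable.Range(0, 5).Where(i => Check(i));
        int before = Calls;

        // ToList() enumerates the query, invoking the predicate once per element.
        List<int> results = query.ToList();
        int after = Calls;

        return (before, after);
    }

    public static void Main()
    {
        var (before, after) = Run();
        Console.WriteLine($"Before ToList: {before}, after ToList: {after}");
    }
}
```

The count stays at zero until ToList() is called, which is why both timings in the question measure only the cost of constructing the query object.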

How can I get the "actual" count of element in a IEnumerable?

If I wrote :
for (int i = 0; i < Strutture.Count(); i++)
{
}
and Strutture is an IEnumerable with 200 elements, IIS crashes. That's because, as far as I can see, every time I call Strutture.Count() it executes all the LINQ queries linked with that IEnumerable.
So, how can I get the "current" number of elements? Do I need a list?
"That's because I see every time I do Strutture.Count() it executes all LINQ queries linked with that IEnumerable."
Without doing such, how is it going to know how many elements there are?
For example:
Enumerable.Range(0,1000).Where(i => i % 2==0).Skip(100).Take(5).Count();
Without executing the LINQ, how could you know how many elements there are?
If you want to know how many elements there are in the source (e.g. Enumerable.Range) then I suggest you use a reference to that source and query it directly. E.g.
var numbers = Enumerable.Range(0,1000);
numbers.Count();
Also keep in mind some data sources don't really have a concept of 'Count' or if they do it involves going through every single item and counting them.
Lastly, if you're using .Count() repetitively [and you don't expect the value to actually change] it can be a good idea to cache:
var count = numbers.Count();
for (int i =0; i<count; i++) // Do Something
Supplemental:
"At first Count(), LINQ queries are executes. Than, for the next, it just "check" the value :) Not "execute the LINQ query again..." :)" - Markzzz
Then why don't we do that?
var query = Enumerable.Range(0,1000).Where(i => i % 2==0).Skip(100).Take(5);
var result = query.ToArray(); //Gets and stores the result!
var count = result.Length; //Reads the stored length; the query does not run again
:)
"But when I do the first "count", it should store (after the LINQ queries) the new IEnumerable (the state is changed). If I do again .Count(), why LINQ need to execute again ALL queries." - Markzzz
Because you're creating a query that gets compiled down into X,Y,Z. You're running the same query twice however the result may vary.
For example, check this out:
static void Main(string[] args)
{
var dataSource = Enumerable.Range(0, 100).ToList();
var query = dataSource.Where(i => i % 2 == 0);
//Run the query once and return the count:
Console.WriteLine(query.Count()); //50
//Now lets modify the datasource - remembering this could be a table in a db etc.
dataSource.AddRange(Enumerable.Range(100, 100));
//Run the query again and return the count:
Console.WriteLine(query.Count()); //100
Console.ReadLine();
}
This is why I recommended storing the results of the query above!
Materialize the number:
int number = Strutture.Count();
for (int i = 0; i < number; i++)
{
}
or materialize the list:
var list = Strutture.ToList();
for (int i = 0; i < list.Count; i++)
{
}
or use a foreach
foreach(var item in Strutture)
{
}

What are the benefits of a Deferred Execution in LINQ?

LINQ uses a deferred execution model, which means that the resulting sequence is not returned at the time the LINQ operators are called; instead, these operators return an object which yields the elements of a sequence only when we enumerate it.
While I understand how deferred queries work, I'm having some trouble understanding the benefits of deferred execution:
1) I've read that executing a deferred query only when you actually need the results can be of great benefit. So what is this benefit?
2) Another advantage of deferred queries is that if you define a query once, then each time you enumerate the results, you will get different results if the data changes.
a) But as seen from the code below, we're able to achieve the same effect (each time we enumerate the resource, we get a different result if the data changed) even without using deferred queries:
List<string> sList = new List<string>( new[]{ "A","B" });
foreach (string item in sList)
Console.WriteLine(item); // Q1 outputs AB
sList.Add("C");
foreach (string item in sList)
Console.WriteLine(item); // Q2 outputs ABC
3) Are there any other benefits of deferred execution?
The main benefit is that this allows filtering operations, the core of LINQ, to be much more efficient. (This is effectively your item #1).
For example, take a LINQ query like this:
var results = collection.Select(item => item.Foo).Where(foo => foo < 3).ToList();
With deferred execution, the above iterates your collection one time, and each time an item is requested during the iteration, performs the map operation, filters, then uses the results to build the list.
If you were to make LINQ fully execute each time, each operation (Select / Where) would have to iterate through the entire sequence. This would make chained operations very inefficient.
Personally, I'd say your item #2 above is more of a side effect rather than a benefit - while it's, at times, beneficial, it also causes some confusion at times, so I would just consider this "something to understand" and not tout it as a benefit of LINQ.
In response to your edit:
"In your particular example, in both cases Select would iterate collection and return an IEnumerable I1 of type item.Foo. Where() would then enumerate I1 and return IEnumerable<> I2 of type item.Foo. I2 would then be converted to List."
This is not true - deferred execution prevents this from occurring.
In my example, the return type is IEnumerable<T>, which means that it's a collection that can be enumerated, but, due to deferred execution, it isn't actually enumerated.
When you call ToList(), the entire collection is enumerated. The result ends up looking conceptually something more like (though, of course, different):
List<Foo> results = new List<Foo>();
foreach(var item in collection)
{
// "Select" does a mapping
var foo = item.Foo;
// "Where" filters
if (!(foo < 3))
continue;
// "ToList" builds results
results.Add(foo);
}
Deferred execution causes the sequence itself to only be enumerated (foreach) one time, when it's used (by ToList()). Without deferred execution, it would look more like (conceptually):
// Select
List<Foo> foos = new List<Foo>();
foreach(var item in collection)
{
foos.Add(item.Foo);
}
// Where
List<Foo> foosFiltered = new List<Foo>();
foreach(var foo in foos)
{
if (foo < 3)
foosFiltered.Add(foo);
}
List<Foo> results = new List<Foo>();
foreach(var item in foosFiltered)
{
results.Add(item);
}
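The single-pass behaviour described above can be observed directly by logging from inside the lambdas. This is a small sketch on a plain int array rather than the answer's collection of items with a Foo property; the Trace method and the log list are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PipelineDemo
{
    public static List<string> Trace()
    {
        var log = new List<string>();

        // Each element flows through Select and then Where before the next element is read.
        List<int> results = new[] { 1, 5, 2 }
            .Select(x => { log.Add("Select " + x); return x; })
            .Where(x => { log.Add("Where " + x); return x < 3; })
            .ToList();

        log.Add("Result count " + results.Count);
        return log;
    }

    public static void Main()
    {
        foreach (string line in Trace())
        {
            Console.WriteLine(line);
        }
    }
}
```

The log alternates "Select 1", "Where 1", "Select 5", "Where 5", "Select 2", "Where 2" rather than listing all the Select entries first, and no intermediate list is built between the two operators.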
Another benefit of deferred execution is that it allows you to work with infinite series. For instance:
public static IEnumerable<ulong> FibonacciNumbers()
{
yield return 0;
yield return 1;
ulong previous = 0, current = 1;
while (true)
{
ulong next = checked(previous + current);
yield return next;
previous = current;
current = next;
}
}
(Source: http://chrisfulstow.com/fibonacci-numbers-iterator-with-csharp-yield-statements/)
You can then do the following:
var firstTenOddFibNumbers = FibonacciNumbers().Where(n=>n%2 == 1).Take(10);
foreach (var num in firstTenOddFibNumbers)
{
Console.WriteLine(num);
}
Prints:
1
1
3
5
13
21
55
89
233
377
Without deferred execution, you would get an OverflowException; or, if the addition weren't checked, the generator would run forever because the value wraps around (and calling ToList() on it would eventually cause an OutOfMemoryException).
An important benefit of deferred execution is that you receive up-to-date data. This may be a hit on performance (especially if you are dealing with absurdly large data sets) but equally the data might have changed by the time your original query returns a result. Deferred execution makes sure you will get the latest information from the database in scenarios where the database is updated rapidly.

What do they mean when they say LINQ is composable?

What does it mean and why (if at all) is it important?
It means you can add additional "operators" to a query. It's important because you can do it extremely efficiently.
For example, let's say you have a method that returns a list (enumerable) of employees:
var employees = GetEmployees();
and another method that uses that one to return all managers:
IEnumerable<Employee> GetManagers()
{
return GetEmployees().Where(e => e.IsManager);
}
You can call that function to get managers that are approaching retirement and send them an email like this:
foreach (var manager in GetManagers().Where(m => m.Age >= 65) )
{
SendPreRetirementMessage(manager);
}
Pop quiz: How many times will that iterate over your employees list? The answer is exactly once; the entire operation is still just O(n)!
Also, I don't need to have separate methods for this. I could compose a query with these steps all in one place:
IEnumerable<Employee> retiringManagers = GetEmployees();
retiringManagers = retiringManagers.Where(e => e.IsManager);
retiringManagers = retiringManagers.Where(m => m.Age >= 65);
foreach (var manager in retiringManagers)
{
SendPreRetirementMessage(manager);
}
One cool thing about this is that I can change it at run time: I can put one part of the composition inside an if block, so that the decision to use a specific filter is made at run time, and everything still comes out nice and pretty.
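A minimal sketch of that run-time composition might look like the following. The Employee class, the sample data, and the BuildQuery method with its two parameters are all assumptions made for illustration; only GetEmployees, IsManager and Age come from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public string Name { get; set; } = "";
    public bool IsManager { get; set; }
    public int Age { get; set; }
}

public static class ComposeDemo
{
    // Hypothetical in-memory data source standing in for GetEmployees().
    public static IEnumerable<Employee> GetEmployees() => new List<Employee>
    {
        new Employee { Name = "Ann", IsManager = true,  Age = 66 },
        new Employee { Name = "Bob", IsManager = true,  Age = 40 },
        new Employee { Name = "Cid", IsManager = false, Age = 67 },
    };

    // Each filter is added only if the caller asks for it; nothing runs until enumeration.
    public static IEnumerable<Employee> BuildQuery(bool managersOnly, int? minAge)
    {
        IEnumerable<Employee> query = GetEmployees();

        if (managersOnly)
        {
            query = query.Where(e => e.IsManager);
        }

        if (minAge.HasValue)
        {
            query = query.Where(e => e.Age >= minAge.Value);
        }

        return query;
    }

    public static void Main()
    {
        foreach (Employee e in BuildQuery(managersOnly: true, minAge: 65))
        {
            Console.WriteLine(e.Name);
        }
    }
}
```

Because each Where only wraps the previous query, the employees are still enumerated a single time no matter how many filters end up being applied.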
I think it means that you can daisy chain your queries, like this
var peterJacksonsTotalBoxOffice
= movies.Where(movie => movie.Director == "Peter Jackson")
.Sum(movie => movie.BoxOffice);

Resources