Recreation or reinitialization? - performance

Many times I have come across a situation where there is a loop, and a new object is constructed at the beginning of each iteration and added to a collection. For example, in pseudocode:
iterating over a resultset do
create an object
set instance data in object to some resultset data
put object in collection
next
How about this approach instead?
create an object
iterating over a resultset do
set instance data in object to some resultset data
put object in collection
next
What are the pros and cons of the two approaches? Which is faster? Is there a better way than either?
P.S.: I don't know what tags to put. Pardon me.

Depending on the language you use to implement this, you will get different results.
Some languages return a reference to an object. The first option will then do what you expect, because a new object is created on each iteration and appended to the collection with its own values.
iterating over a resultset do
create an object
set instance data in object to some resultset data
put object in collection
next
But if the language simply returns a reference to the object and you try to use the second method:
create an object (x)
iterate over resultset do
set instance data in object (x) to resultset data (returns a reference to x with updated data)
put object in collection (puts a reference to x in collection)
next
then after iterating the resultset you will end up with a collection full of references to the same object, holding whatever values were assigned last.
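In a reference-based language such as C#, for example, the broken version of the second pattern looks roughly like this (a minimal sketch; Person is a hypothetical class used only for illustration):
using System.Collections.Generic;

var rows = new[] { "Alice", "Bob", "Carol" };   // stand-in for the resultset
var people = new List<Person>();
var person = new Person();                      // created once, outside the loop

foreach (var row in rows)
{
    person.Name = row;       // mutates the same instance every time
    people.Add(person);      // stores yet another reference to that same instance
}

// people[0].Name, people[1].Name and people[2].Name are all "Carol"

class Person { public string Name; }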

Your final pseudocode should be more like:
create an object
iterating over a resultset do
set instance data in object to some resultset data
**clone** object into collection
next
Otherwise you would simply end up with many references to your final object, since you kept modifying the same object whose reference you added each time. Note that cloning means you don't really save any object space or creation time either, since a new copy is still created each iteration. The second method is still perfectly valid if for some reason you would prefer not to re-initialize on every iteration, for example when you only need to change a few members each time instead of initializing all of them. Just make sure your object's copy constructor is robust.
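In C#, the two safe variants would look roughly like this (a sketch, reusing the hypothetical Person class and rows array from above):
// Variant 1: create a fresh object on each iteration (the first pattern in the question)
var people1 = new List<Person>();
foreach (var row in rows)
    people1.Add(new Person { Name = row });

// Variant 2: reuse one template object, but add a copy ("clone") of it each time
var people2 = new List<Person>();
var template = new Person();
foreach (var row in rows)
{
    template.Name = row;
    people2.Add(new Person { Name = template.Name });   // shallow copy of the template
}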

Related

How can LINQ be so incredibly fast? (C#)

Let's say I have 100 000 objects of type Person which have a date property with their birthday in them.
I place all the objects in a List<Person> (or an array) and also in a dictionary where the date is the key and every value is an array/list of persons that share the same birthday.
Then I do this:
DateTime date = new DateTime(); // Just some date
var personsFromList = personList.Where(person => person.Birthday == date);
var personsFromDictionary = dictionary[date];
If I run that 1000 times, the LINQ .Where lookup will end up being significantly faster than the dictionary. Why is that? It does not seem logical to me. Are the results being cached (and used again) behind the scenes?
From Introduction to LINQ Queries (C#) (The Query)
... the important point is that in LINQ, the query variable itself takes no action and returns no data. It just stores the information that is required to produce the results when the query is executed at some later point.
This is known as deferred execution. (And later down the same page:)
As stated previously, the query variable itself only stores the query commands. The actual execution of the query is deferred until you iterate over the query variable in a foreach statement. This concept is referred to as deferred execution...
Some LINQ methods must iterate the IEnumerable and therefore execute immediately - methods like Count, Max, Average etc., i.e. all the aggregation methods.
Another way to force immediate execution is to use ToArray or ToList, which will execute the query and store its results in an array or list.
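To tie that back to the code in the question: the .Where line only builds a query object, so the timing loop never pays for the scan unless the query is materialized. A sketch, reusing personList, dictionary and date from the question:
var personsFromList = personList.Where(person => person.Birthday == date); // builds the query, does no work yet
var personsFromDictionary = dictionary[date];                              // performs the actual hash lookup now

// Forcing execution makes the comparison fair: this iterates all 100 000 persons
var materializedFromList = personList.Where(person => person.Birthday == date).ToList();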

EF core 2 first query slow

I'm using EF core 2 as ORM in my project.
I faced this problem while executing this query:
var query = (from droitsGeo in _entities.DroitsGeos
             join building in _entities.Batiments
                 on droitsGeo.IdPerimetre equals building.IdBatiment
             where droitsGeo.IdUtilisateur == idUser &&
                   droitsGeo.IdClient == idClient &&
                   building.Valide == true &&
                   droitsGeo.IdNiveauPerimetre == geographicalLevel
             orderby sort ascending
             select new GeographicalModel
             {
                 Id = building.IdBatiment,
                 IdParent = building.IdEtablissement,
                 Label = building.LibBatiment,
             });
The first execution took about 5 seconds and subsequent ones less than one second, as shown below:
First execution of query :
Time elapsed EF: 00:00:04.8562419
After first execution of query :
Time elapsed EF: 00:00:00.5496862
Time elapsed EF: 00:00:00.6658079
Time elapsed EF: 00:00:00.6176030
I get the same result using a stored procedure.
When I execute the SQL query generated by EF directly in SQL Server, the result is returned in less than a second.
What is wrong with EF Core 2, or did I miss something in the configuration?
By default, EF tracks all the entities you run queries against.
When you run a query for the first time, the change-tracking mechanism kicks in... that's why it takes a little bit longer.
You can avoid this, especially when retrieving collections, by using .AsNoTracking() when composing the query.
Take a look:
var items = DbContext.MyDbSet
    .Include(x => x.SecondObject)
    .AsNoTracking()
    .ToList();
EF Core needs to compile LINQ queries using reflection, therefore first queries are always slow. There is already a GitHub issue about this here.
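If that first-hit compilation cost matters, one possible mitigation (not part of the original answer, just a sketch) is EF Core's explicitly compiled queries via EF.CompileQuery, available since EF Core 2.0. The context and entity names below are made up for illustration:
// Compile the LINQ query translation once and reuse it (MyContext, MyEntities and Level are hypothetical names)
private static readonly Func<MyContext, int, IEnumerable<MyEntity>> _entitiesByLevel =
    EF.CompileQuery((MyContext ctx, int level) =>
        ctx.MyEntities.Where(e => e.Level == level));

// Usage: later calls skip the query-compilation step
var results = _entitiesByLevel(context, someLevel).ToList();
Note that this only removes the per-query compilation work; the one-time model building on the first use of the context still happens, so the very first query will still be somewhat slower.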
I have a simple idea to resolve this issue with the help of stored procedures and, after that, AutoMapper.
Create a stored procedure that returns all the columns that you want, no matter if they come from different tables. Once the data from the stored procedure has been received into one of your model classes, you can then use AutoMapper to map only the relevant attributes to other classes. Please note that I am not giving you a tutorial on how to use stored procedures; I am giving you an example that might explain it better:
A stored procedure is created which returns results from three tables named A, B and C.
A model class named SP_Result.cs is created for the stored procedure, to map the result set it returns (this is required when working with stored procedures in EF Core).
ViewModels are created having the same attributes as those returned from each of the tables A, B and C.
Thereafter, mapping configurations are created from SP_Result to the ViewModels of class A, class B and class C, e.g. CreateMap<SP_Result, ViewModel_A>(); CreateMap<SP_Result, ViewModel_B>();. I suppose you already have request and response objects which can be used instead of ViewModels. Name the properties accordingly in the stored procedure using the AS keyword, e.g. SELECT std_Name AS 'Name'.
This mapping will map the individual properties to each class; AutoMapper ignores properties which do not exist in either of the classes mentioned in the mapping configuration (a minimal sketch of this mapping step follows below).
If you are selecting a list of objects where each object has its own list of child objects, this scenario will generally create N + 1 queries in EF. In fact, if you try to achieve this using stored procedures, you will have to create multiple queries or run the stored procedure multiple times (in a loop, maybe), or you will end up receiving a Cartesian product.
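A minimal sketch of that mapping step, using AutoMapper's configuration API (SP_Result and ViewModel_A are hypothetical stand-ins for the classes described above):
using AutoMapper;

// Configure the maps once at startup
var config = new MapperConfiguration(cfg =>
{
    cfg.CreateMap<SP_Result, ViewModel_A>();
    // cfg.CreateMap<SP_Result, ViewModel_B>(); ... one map per target class
});
var mapper = config.CreateMapper();

// Map a row returned by the stored procedure; source properties with no
// matching destination property (City here) are simply not mapped
var row = new SP_Result { Name = "Std 1", City = "Paris" };
ViewModel_A vm = mapper.Map<ViewModel_A>(row);

// Hypothetical stand-ins for the stored-procedure result class and one ViewModel
public class SP_Result   { public string Name { get; set; } public string City { get; set; } }
public class ViewModel_A { public string Name { get; set; } }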

Retrieving values from SQLite3 resultset seems inconsistent?

If I run execute("SELECT * FROM users WHERE id = 1")[0][0], I'll get back the first field from the first row of that resultset.
If I run prepare("SELECT * FROM users WHERE id = ?").execute(1)[0][0], which to me seems like it should return an identical result, I get a NoMethodError for [].
I can't for the life of me figure out why and the documentation seems really sparse. What's going on?
Your two execute methods are not identical and are returning different things.
In the first instance, you are calling execute on Database, which returns a simple array. Then you correctly index into it with [].
However, prepare returns a Statement object which you then call execute on. This instead returns a ResultSet, which doesn't have the same semantics as an array.
You might be looking for execute!.

Data Entity Framework and LINQ - Get giant data set and execute a command one at a time on each object

We are pulling in a giant dataset of records (in the hundreds of thousands) and then need to update a field on each one, one at a time, in an atomic transaction. The records are unrelated to each other, and we don't want to do a blind update to the whole couple hundred thousand (there are views and indexes on this table that make that very prohibitive). The ONLY way that I could get this to work without doing a giant transaction was as follows (container is a reference to a custom ObjectContext):
var expiredWorkflows = from iw in container.InitiatedWorkflows
                       where iw.InitiationStatusID != 1 && iw.ExpirationDate < DateTime.Now
                       select iw.ID;

foreach (int expiredWorkflow in expiredWorkflows)
    container.ExecuteStoreCommand(
        "UPDATE dbo.InitiatedWorkflow SET InitiationStatusID = 7 WHERE ID = @ID",
        new SqlParameter() { ParameterName = "@ID", Value = expiredWorkflow.ToString() });
We tried looping through each one and just updating the field via the container and then calling SaveChanges(), but that runs everything as one transaction. We tried calling SaveChanges() in the foreach loop, but that threw transaction exceptions. Is there any way to do what we are trying to do using the ObjectContext, so it would do something like this (the above select would be changed to return the full object, not just the ID):
foreach (var expiredWorkflow in expiredWorkflows)
expiredWorkflow.InitiationStatusID = 7
container.SaveChanges(SaveOptions.OneAtATime);
Speaking generally, if the operation you need to carry out is as simple as the sort of UPDATE your code above suggests, this is the sort of operation that will run far better on the back end database--assuming, of course, there's some clear way to select only the rows that need to be changed. Entity Framework is intended more for manipulating small to medium sets of objects that can easily be loaded into memory and twiddled there, not large bulk-processing operations for which stored procedures are often best. EF can certainly perform those big operations, but it will take a lot longer to execute one SQL statement per row.
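For example, if expiring these workflows really only requires that one column change, a single set-based command (or an equivalent stored procedure) pushes the whole operation to the server. A rough sketch, reusing the ExecuteStoreCommand call and filter from the question:
// One set-based UPDATE instead of one round trip per row; same filter as the LINQ query above
container.ExecuteStoreCommand(
    "UPDATE dbo.InitiatedWorkflow " +
    "SET InitiationStatusID = 7 " +
    "WHERE InitiationStatusID <> 1 AND ExpirationDate < GETDATE()");

// If one big transaction is a concern, the same statement can be wrapped in a stored
// procedure that updates in batches (e.g. UPDATE TOP (1000) ... in a loop) instead.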

Iterating over LINQ entity columns

I need to insert a record with LINQ.
I have a NameValueCollection with the data from a form post,
so it started out in the name=value&name2=value2 type format.
The thing is, I need to insert all these values into the table, but of course the table fields are typed, and I need to convert the data to the right types before inserting it.
I could of course explicitly do
linqtableobj.columnproperty = convert.toWhatever(value);
but I have many columns in the table, and the data coming back from the form doesn't always contain all the fields in the table.
I thought I could iterate over the LINQ object's columns, getting their data types to use when converting the appropriate values from the form data.
Fine, all good, but then I'm still stuck with doing
linqtableobj.columnproperty = convertedValue
... one line for every column in the table.
foreach(col in newlinqrowobj)
{
newlinqobj[col] = convert.changetype(namevaluecollection[col.name],col.datatype)
}
Clearly I can't do that, but is anything like that possible? Or
is it possible to loop over the columns of the new 'record', setting the values as I go, and I guess grabbing the types at that point to do the conversion?
Stumped I am.
thanks
nat
If you have some data type with a hundred different properties, and you want to copy those into a completely different data type with a hundred different properties, then somehow somewhere in your code you are going to have to define a hundred different "mapping" instructions. It doesn't matter what framework you are using, or whether the "mapping" instructions are lines of C# code, XML elements, lambda functions, proprietary "stuff", or whatever. There's no getting away from it.
Bearing that in mind, having one line of code per property looks to me like the fastest, simplest, most readable and maintainable solution.
If I understood your problem correctly, you could use reflection (or dynamic code generation if it is performance-sensitive) to circumvent your typing problems.
There is a pretty good description of how to do something like this at CodeProject.
Basically you get a PropertyInfo for the property you want to set (if it's not a property, I think you would need dynamic code generation) and use its SetValue method (after calling the appropriate Convert.ChangeType, of course). This will basically circumvent the whole static typing, so there you are.
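A rough sketch of that reflection approach, assuming the form field names match the entity property names (Populate, entity and formValues are placeholder names):
using System;
using System.Collections.Specialized;
using System.Reflection;

static void Populate(object entity, NameValueCollection formValues)
{
    foreach (string key in formValues.AllKeys)
    {
        // Find a writable property whose name matches the posted form field
        PropertyInfo prop = entity.GetType().GetProperty(key);
        if (prop == null || !prop.CanWrite)
            continue;                                   // no matching column for this field

        // Unwrap nullable column types before converting the posted string
        Type targetType = Nullable.GetUnderlyingType(prop.PropertyType) ?? prop.PropertyType;
        object converted = Convert.ChangeType(formValues[key], targetType);
        prop.SetValue(entity, converted, null);
    }
}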
