Over the past few weeks I have been learning GORM as a database ORM. Looking at the source, every command (limit, order, where, or, select, etc.) returns a new instance by cloning the current DB.
Does anyone here know the main purpose of cloning the DB instead of using the current instance?
When I chain select, where, limit, order and join, that means cloning the DB instance 5 times. AFAIK, creating objects in memory is expensive.
The purpose is to let you store a "temporary" instance of your query and derive further queries from it later. That is, if you have a number of queries which share some part of the sequence, you should be able to do something like
q := gorm.Select(...).Limit(...).Order(...)
q1 := q.Where(...)
q2 := q.Where(...)
(This is a rough example that probably doesn't even map to the GORM API, as I don't use it myself.)
Now, I believe that cloning short-lived objects in memory costs very little compared to doing a SQL query, which implies a network round-trip…
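To make the idea concrete, here is a minimal sketch in Python of a query builder whose methods return a modified copy instead of mutating the receiver. The Query class, its fields and to_sql are made up for illustration; this is not how GORM is implemented, only the general pattern described above.

```python
from dataclasses import dataclass, replace
from typing import Optional, Tuple


@dataclass(frozen=True)
class Query:
    """Hypothetical immutable query builder (not GORM's API)."""
    table: str
    conditions: Tuple[str, ...] = ()
    limit_n: Optional[int] = None

    def where(self, condition: str) -> "Query":
        # Return a copy with one more condition; self is left untouched.
        return replace(self, conditions=self.conditions + (condition,))

    def limit(self, n: int) -> "Query":
        # Return a copy with the limit set; self is left untouched.
        return replace(self, limit_n=n)

    def to_sql(self) -> str:
        sql = f"SELECT * FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        if self.limit_n is not None:
            sql += f" LIMIT {self.limit_n}"
        return sql


# A shared prefix can be stored once and extended in different directions.
base = Query("users").limit(10)
active = base.where("active = 1")
admins = base.where("role = 'admin'")
```

Because where and limit never modify base, the two derived queries cannot leak conditions into each other. That is the property that would be lost if every call mutated a single shared instance.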
If I'm using .Find() instead of .Where() to query an object, and someone updates the database so that the in-memory model is out of sync, does Entity Framework know about the change, or is it alerted to it, so that it updates the model in memory?
Does .Find() expose me to the risk of missing data?
is Entity alerted to the change so that it updates the model in memory?
No. I've never seen an ORM that does that. It is possible, but not trivial. You can read more about it in Query Notifications in SQL Server. And that's not even the whole story, because once you can listen to database events you'd have to decide what to do with them client-side. For example, what should happen to changed values that were also changed in the client?
But the Find method is designed to do almost the opposite. It always tries to return an object from the local cache. It only queries the database if the object isn't there yet. So it's designed to return stale data, if you like. It is perfect for relatively complex operations in which you're going to need an object multiple times, but don't want to get it from the database all the time.
LINQ query statements (Find isn't LINQ) are somewhere in the middle. They do query the database, but they don't update objects that are in the cache already. If you changed an object locally, the changes won't be erased by a Select statement.
You can refresh the local cache, but the DbContext API, which was an improvement of the former ObjectContext API, even makes that a bit less easy than before. The message is: don't do it. If you want fresh data: create a new context.
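The three behaviours described above (Find prefers the cache, queries hit the database but keep cached objects, a new context sees fresh data) can be sketched with a toy identity map in Python. This is only a model of the behaviour, not Entity Framework code; the Context class and the dict standing in for the database are assumptions for illustration.

```python
class Context:
    """Toy identity map; `db` is a plain dict standing in for the database."""

    def __init__(self, db):
        self.db = db
        self.cache = {}

    def find(self, key):
        # Return the cached object if there is one; read the database only on a miss.
        if key not in self.cache:
            row = self.db.get(key)
            if row is None:
                return None
            self.cache[key] = dict(row)
        return self.cache[key]

    def query(self, predicate):
        # Always evaluates against the database, but objects that are
        # already cached are returned as they are, without being refreshed.
        results = []
        for key, row in self.db.items():
            if predicate(row):
                if key not in self.cache:
                    self.cache[key] = dict(row)
                results.append(self.cache[key])
        return results


db = {1: {"id": 1, "name": "old"}}
ctx = Context(db)
ctx.find(1)                # loads the row into the cache
db[1]["name"] = "new"      # someone else updates the database

stale = ctx.find(1)["name"]                                # served from the cache
queried = ctx.query(lambda r: r["name"] == "new")[0]["name"]  # DB matched, cached object returned
fresh = Context(db).find(1)["name"]                        # a new context reads the database
```

In this model the only way to see the new value is the fresh context, which mirrors the advice above.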
Does .Find() expose me to the risk of missing data?
Sure, but so do First() and Where(). Any time you load data into memory you risk the data behind it changing without your knowledge. You can minimize that risk in EF by not hanging on to entities for long periods of time, and by using a new context for every DB operation (or group of operations).
In Meteor, I am a little confused about the difference between Session and a local collection.
I know that Session is a temporary reactive key-value store, client-side only, and is cleaned on page refresh.
A local collection seems to be the same: reactive, temporary client-side storage, cleaned on page refresh, but with more flexible functions such as insert, update and remove queries, like a server-side Mongo collection.
So I guess I could manage everything in a local collection without Session, or everything in Session without a local collection.
But what is the best and most efficient way to use Session and/or a local collection?
Simply put, when should I use Session and when not?
And when should I use a local collection and when not?
As I read your question I told myself that this is a very easy question, but then I found myself scratching my head. I tried to come up with an example that can only be accomplished with Session or only with collections, but I didn't find any such use case. So let's start from the beginning. Basically you already answered the question on your own, because it is the little bit of sugar that makes collections something special.
When to use a collection?
Basically a collection is a database artifact. Imagine you have a client-server application in which all the data is persisted in server-side storage. You can use a collection on the client to give the user a small subset of the server's collection, so a client collection is a database with a reduced amount of data. The advantage is that you can access the collection with queries, and you can use the same queries on server and client. In addition, a collection always contains multiple objects of the same type. Sometimes you produce data on the client for the client, with no server interaction needed. Then you can use a local collection. A local collection provides the same functionality as a normal collection without server communication. It should be used if you have multiple objects with the same structure, and especially if you'd like to use query operators.
You can also save the data inside a Session object, and a Session value can contain multiple objects as well. But imagine you want to find an object with a particular id in an array of objects. Then you need to iterate through the whole array to find it. You have to write additional logic that a collection handles for you like magic. Further, collections return cursors. A cursor is a reactive object that only changes if the selected data changes. That means that if you use find with an id, the result only rerenders when the object with that id changes. With Session you can't do that: when a Session value changes, you need to rerender all dependent objects.
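The two points above (lookup by id and how much gets rerendered) can be illustrated with a small Python model. ToySession, ToyCollection and find_in_session are invented stand-ins, not Meteor's API and not how Tracker works internally; the callbacks simply record what would be rerendered.

```python
from collections import defaultdict


class ToySession:
    """Key-value store: changing a key notifies every watcher of that key."""

    def __init__(self):
        self.values = {}
        self.watchers = defaultdict(list)

    def set(self, key, value):
        self.values[key] = value
        for callback in self.watchers[key]:
            callback()


class ToyCollection:
    """Documents indexed by _id: watchers are registered per document."""

    def __init__(self):
        self.docs = {}
        self.watchers = defaultdict(list)

    def upsert(self, doc):
        self.docs[doc["_id"]] = doc
        for callback in self.watchers[doc["_id"]]:
            callback()

    def find_one(self, doc_id):
        return self.docs.get(doc_id)


def find_in_session(session, key, doc_id):
    # With a plain list in the session, lookup by id is a hand-written linear scan.
    for doc in session.values.get(key, []):
        if doc["_id"] == doc_id:
            return doc
    return None


rerendered = []
session = ToySession()
items = ToyCollection()
session.set("items", [{"_id": "a", "n": 1}, {"_id": "b", "n": 2}])
items.upsert({"_id": "a", "n": 1})
items.upsert({"_id": "b", "n": 2})

session.watchers["items"].append(lambda: rerendered.append("session: whole list"))
items.watchers["a"].append(lambda: rerendered.append("collection: a"))
items.watchers["b"].append(lambda: rerendered.append("collection: b"))

# Change only item "b" in both stores.
session.set("items", [{"_id": "a", "n": 1}, {"_id": "b", "n": 3}])
items.upsert({"_id": "b", "n": 3})
```

After the change, the session watcher fires for the whole list, while only the watcher for document "b" fires on the collection side.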
When to use a session?
For everything else. Session values are often just small objects that contain some configuration. It is basically a single object rather than multiple occurrences of similar objects. I don't have time to go into detail now, but if your data does not fit the collection use cases, you can use Session.
Have a look at this post that describes why sessions should not be overused.
I assume that by local collection you mean: new Mongo.Collection(null)
The difference is that local collections do not survive hot code pushes. A refresh will erase Session, but a hot code push will not: there's special code in Meteor to persist the values of Session variables in the case of a hot code push.
You would use Session whenever you're storing temporary values that do NOT need to be persisted to the database.
Trivial examples could include a user's selection of filters or the item in an index view that is currently selected.
Manipulated data in minimongo (insert, update, delete, etc.) is intended to be sent back to the server and stored in the database. For example, this could be updating a user's profile information.
I've got a database setup that is a bit on the complicated side, with several many-to-many tables.
I'm trying to generate an XML document from this data. There's a bit of checking, like if a name is not defined in one language, try to get the name from another language (instead of showing null).
The problem I have is that there are a lot of queries within loops.
Are there any guidelines for this, like what stuff to stay away from and what to use, to improve the performance?
cfoutput cfloop cfquery ?
If the looping logic is basically doing data processing, e.g. deciding, based on the values from the first query, what to go back to the database with for the next query, the best thing you can do for performance is to take all that logic out of your CF code and put it into the DB. Use the DB for data processing, and use CF for handling the data once it's been processed and converting it into output.
The only time CF should be doing data manipulation is if you need to process data from differing sources, e.g. the database, some remote service, the file system, a different database, etc. Basically, you should involve ColdFusion only if the database can't do the data processing itself.
Regarding "like if a name is not defined in one language try to get the name from another language (instead of showing null)":
You should be able to do this in your query. Pretty much every db out there has a coalesce function. They all support case constructs as well. You just have to pick the most appropriate method for your situation.
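As a rough illustration of the coalesce approach, here is a self-contained Python sketch using the standard sqlite3 module. The product and product_name tables, their columns and the 'fr'/'en' language codes are all made up for the example; your schema and SQL dialect will differ, and inside ColdFusion the SELECT would simply go in a single cfquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (id INTEGER PRIMARY KEY);
    CREATE TABLE product_name (product_id INTEGER, lang TEXT, name TEXT);
    INSERT INTO product VALUES (1), (2);
    INSERT INTO product_name VALUES
        (1, 'en', 'Chair'),
        (1, 'fr', 'Chaise'),
        (2, 'en', 'Table');   -- product 2 has no French name
""")

# One query does the fallback: prefer the French name, else the English one.
rows = conn.execute("""
    SELECT p.id, COALESCE(fr.name, en.name) AS name
    FROM product p
    LEFT JOIN product_name fr ON fr.product_id = p.id AND fr.lang = 'fr'
    LEFT JOIN product_name en ON en.product_id = p.id AND en.lang = 'en'
    ORDER BY p.id
""").fetchall()
```

The language fallback happens in the database in one statement, so no per-row follow-up query is needed in application code.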
Is there a way to know the ID of the identity column of a record inserted via InsertOnSubmit beforehand, i.e. before calling the DataContext's SubmitChanges?
Imagine I'm populating some kind of hierarchy in the database, but I wouldn't want to submit changes on each recursive call of each child node (e.g. if I had Directories table and Files table and am recreating my filesystem structure in the database).
I'd like to do it this way: I create a Directory object, set its name and attributes, then InsertOnSubmit it into the DataContext.Directories collection, then reference Directory.ID in its child Files. Currently I need to call SubmitChanges after InsertOnSubmit so that the directory is inserted into the database and the database mapping fills in its ID column. But this creates a lot of transactions and database accesses, and I imagine that if I did the inserting in a batch, the performance would be better.
What I'd like to do is to somehow use Directory.ID before committing changes, create all my File and Directory objects in advance, and then do one big submit that puts everything into the database. I'm also open to solving this problem via a stored procedure; I assume the performance would be even better if all operations were done directly in the database.
One way to get around this is to not use an identity column. Instead build an IdService that you can use in the code to get a new Id each time a Directory object is created.
You can implement the IdService by having a table that stores the last id used. When the service starts up have it grab that number. The service can then increment away while Directory objects are created and then update the table with the new last id used at the end of the run.
Alternatively, and a bit safer, when the service starts up have it grab the last id used and then update the last id used in the table by adding 1000 (for example). Then let it increment away. If it uses 1000 ids then have it grab the next 1000 and update the last id used table. Worst case is you waste some ids, but if you use a bigint you aren't ever going to care.
Since the Directory id is now controlled in code you can use it with child objects like Files prior to writing to the database.
Simply putting a lock around id acquisition makes this safe to use across multiple threads. I've been using this in a situation like yours. We're generating a ton of objects in memory across multiple threads and saving them in batches.
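A minimal sketch of the block-reserving variant described above, written in Python. LastIdTable is an in-memory stand-in for the "last id used" table, and all names here are hypothetical; in a real setup the reserve step would be a single atomic read-and-update against the database.

```python
import threading


class LastIdTable:
    """Stand-in for the table that stores the last id used (hypothetical)."""

    def __init__(self, last_id=0):
        self.last_id = last_id

    def reserve(self, count):
        # Against a real database this read-and-update must be atomic.
        start = self.last_id + 1
        self.last_id += count
        return start


class IdService:
    """Hands out ids from a reserved block and reserves a new block when it runs out."""

    def __init__(self, table, block_size=1000):
        self._table = table
        self._block_size = block_size
        self._lock = threading.Lock()
        self._next = 0
        self._end = -1  # no block reserved yet

    def next_id(self):
        # The lock makes id acquisition safe across threads.
        with self._lock:
            if self._next > self._end:
                self._next = self._table.reserve(self._block_size)
                self._end = self._next + self._block_size - 1
            value = self._next
            self._next += 1
            return value


table = LastIdTable(last_id=41)
service = IdService(table, block_size=3)
ids = [service.next_id() for _ in range(4)]  # the fourth call reserves a second block
```

With ids handed out in code like this, a Directory can be given its id and referenced from its child Files before anything is written to the database. Unused ids from the last block are simply wasted.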
This blog post will give you a good start on saving batches in Linq to SQL.
Not sure off the top of my head if there is a way to run a straight SQL query in LINQ, but this query will return the current identity value of the specified table.
USE [database];
GO
DBCC CHECKIDENT ("schema.table", NORESEED);
GO
I am trying to develop my first web project using the Entity Framework. While I love the way that you can use LINQ instead of writing SQL, I do have some severe performance issues. I have a lot of unprocessed data in a table, which I would like to transform a little and then insert into another table. I run through all the objects and then insert them into my new table. I need to do some small comparisons (which is why I need to insert the data into another table), but for performance tests I have removed them. The following code (which sets approximately 12-15 properties) took 21 seconds, which is quite a long time. Is it usually this slow, and what might I be doing wrong?
DataLayer.MotorExtractionEntities mee = new DataLayer.MotorExtractionEntities();
List<DataLayer.CarsBulk> carsBulkAll = ((from c in mee.CarsBulk select c).Take(100)).ToList();
foreach (DataLayer.CarsBulk carBulk in carsBulkAll)
{
    DataLayer.Car car = new DataLayer.Car();
    car.URL = carBulk.URL;
    car.color = carBulk.SellerCity.ToString();
    // car.year = ... (more properties are set this way)
    mee.AddToCar(car);
}
mee.SaveChanges();
You cannot create batch updates using Entity Framework.
Imagine you need to update rows in a table with a SQL statement like this:
UPDATE table SET col1 = @a WHERE col2 = @b
Using SQL this is just one roundtrip to the server. Using Entity Framework, you have (at least) one roundtrip to the server loading all the data, then you modify the rows on the client, then it will send it back row by row.
This will slow things down especially if your network connection is limited, and if you have more than just a couple of rows.
So for this kind of update a stored procedure is still a lot more efficient.
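The difference in round trips can be made visible with a small Python sketch that counts statements. It uses an in-memory SQLite database, so there is no real network involved; the statement count is only a proxy for round trips, and the table t, its columns and the CountingConnection wrapper are assumptions for illustration.

```python
import sqlite3


class CountingConnection:
    """Wraps a connection and counts statements as a proxy for round trips."""

    def __init__(self, conn):
        self.conn = conn
        self.statements = 0

    def execute(self, sql, params=()):
        self.statements += 1
        return self.conn.execute(sql, params)


def make_db():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)")
    conn.executemany(
        "INSERT INTO t (col1, col2) VALUES (?, ?)",
        [("x", "b")] * 50 + [("x", "other")] * 50,
    )
    return CountingConnection(conn)


def update_row_by_row(db, a, b):
    # Load the matching rows first, then send one UPDATE per row.
    ids = [row[0] for row in db.execute("SELECT id FROM t WHERE col2 = ?", (b,))]
    for row_id in ids:
        db.execute("UPDATE t SET col1 = ? WHERE id = ?", (a, row_id))


def update_set_based(db, a, b):
    # A single statement updates every matching row on the server side.
    db.execute("UPDATE t SET col1 = ? WHERE col2 = ?", (a, b))


row_db = make_db()
update_row_by_row(row_db, "a", "b")
set_db = make_db()
update_set_based(set_db, "a", "b")
```

Both approaches end with the same rows updated, but the row-by-row version issues one statement per row plus the initial load. Over a slow connection, each of those statements pays the network latency.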
I have been experimenting with the entity framework quite a lot and I haven't seen any real performance issues.
Which line of your code is causing the big delay? Have you tried debugging it and measuring which method takes the most time?
Also, the complexity of your database structure could slow down the entity framework a bit, but not to the speed you are saying. Are there some 'infinite loops' in your DB structure? Without the DB structure it is really hard to say what's wrong.
Can you try the same in straight SQL?
The problem might be related to your database and not the Entity Framework. For example, if you have massive indexes and lots of check constraints, inserting can become slow.
I've also seen insert problems with databases which had never been backed up. The transaction log could not be reclaimed and was growing insanely, causing a single insert to take a few seconds.
Trying this in SQL directly would tell you if the problem is indeed with EF.
I think I solved the problem. I have been running the app locally, and the database is in another country (a neighboring one, but nevertheless). I deployed the application to the server and ran it from there, and it then took only 2 seconds instead of 20. I then tried to transfer 1000 records, which took 26 seconds. That is quite an improvement, though I don't know if this is the "regular" speed for saving 1000 records to the database?