Magento register models for performance gain

I have a large application in Magento which is pretty "heavy" in terms of data collections and actions that are performed. Currently I'm trying to optimize the performance and I noticed that during a regular page load, around 900 Mage::getModel calls are being performed. The call itself is quite fast, but when 900 calls are made this affects performance, as all the calls together take around 2 seconds.
I was wondering if it's safe to use Magento's registry functionality to register models that have no constructor arguments. Basically, if Mage::getModel('sales/quote') is called, after loading the class I intend to register the instance under a unique key (like 'model_register_'.className), and all subsequent Mage::getModel('sales/quote') calls will no longer create a new instance of the model but return the one in the registry, which should improve performance. This, of course, would only be used for calls that have no $constructArguments in the Mage::getModel call.
Has anyone done this before? I am interested in whether this approach is safe or whether it might cause other issues.

Apparently this does not work because the registry keeps a reference to the object. So when retrieving the model from the registry, you would not get a "clean" instance; you would get the same object, including any data that earlier callers set on it.
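To make the problem concrete, here is a minimal, language-neutral sketch in Python (not Magento code). The Quote class, get_model_fresh and get_model_registered are hypothetical stand-ins for a model, for the normal factory behaviour, and for the proposed registry shortcut; only the 'model_register_' key prefix comes from the question.

```python
class Quote:
    """Hypothetical stand-in for a model such as 'sales/quote'."""

    def __init__(self):
        self.data = {}

    def set_data(self, key, value):
        self.data[key] = value
        return self


_registry = {}


def get_model_fresh(cls):
    """Plain factory behaviour: a new, empty instance on every call."""
    return cls()


def get_model_registered(cls):
    """Proposed shortcut: reuse one registered instance per class."""
    key = "model_register_" + cls.__name__
    if key not in _registry:
        _registry[key] = cls()
    return _registry[key]


# The first caller loads some state into "its" model.
first = get_model_registered(Quote).set_data("customer_id", 1)

# A later, unrelated caller expects an empty model but gets the same object.
second = get_model_registered(Quote)
fresh = get_model_fresh(Quote)

print(second is first)  # True
print(second.data)      # {'customer_id': 1}
print(fresh.data)       # {}
```

Any code that calls the factory and then loads or modifies the model would leak that state into every other caller, which is why the shortcut is only safe for objects that are effectively stateless.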

What's the best practice for NSPersistentContainer newBackgroundContext?

I'm familiarizing myself with NSPersistentContainer. I wonder if it's better to spawn an instance of the private context with newBackgroundContext every time I need to insert/fetch some entities in the background or create one private context, keep it and use for all background tasks through the lifetime of the app.
The documentation also offers a convenience method, performBackgroundTask. I'm just trying to figure out the best practice here.
I generally recommend one of two approaches. (There are other setups that work, but these are two that I have used, and tested and would recommend.)
The Simple Way
You read from the viewContext and you write to the viewContext, and you only use the main thread. This is the simplest approach and avoids a lot of the multithreading issues that are common with Core Data. The problem is that the disk access happens on the main thread, and if you are doing a lot of it, it could slow down your app.
This approach is suitable for small, lightweight applications. Any app that has fewer than a few thousand total entities and no bulk changes at once would be a good candidate for this. A simple todo list would be a good example.
The Complex Way
The complex way is to only read from the viewContext on the main thread and do all your writing using performBackgroundTask inside a serial queue. Every block inside performBackgroundTask refetches any managedObjects that it needs (using objectIDs), and all managedObjects that it creates are discarded at the end of the block. Each performBackgroundTask is transactional, and saveContext is called at the end of the block. A fuller description can be found here: NSPersistentContainer concurrency for saving to core data
This is a robust and functional core-data setup that can manage data at any reasonable scale.
The problem is that you must always make sure that the managedObjects are from the context you expect and are accessed on the correct thread. You also need a serial queue to make sure you don't get write conflicts. And you often need to use fetchedResultsController to make sure entities are not deleted while you are holding pointers to them.
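The shape of the complex approach (serial writes, refetch by identifier inside each block, save at the end) can be sketched outside Core Data. The following is a rough Python analogy, not Core Data code: the store dictionary, perform_background_task, mark_done and rename are all hypothetical names, and a single-worker thread pool stands in for the serial queue.

```python
from concurrent.futures import ThreadPoolExecutor

# Stands in for the persistent store; keys play the role of object IDs.
store = {1: {"title": "buy milk", "done": False}}

# One worker thread means submitted tasks run one at a time, in order.
write_queue = ThreadPoolExecutor(max_workers=1)


def perform_background_task(block):
    """Run block against a private working copy, then commit it in one step."""

    def transaction():
        # Private copy: objects fetched here are never shared with readers.
        scratch = {key: dict(value) for key, value in store.items()}
        block(scratch)
        # "Save" at the end of the block; skipped if block raised.
        store.clear()
        store.update(scratch)

    return write_queue.submit(transaction)


def mark_done(object_id):
    def block(context):
        item = context[object_id]  # refetch by identifier inside the block
        item["done"] = True

    return perform_background_task(block)


def rename(object_id, title):
    def block(context):
        context[object_id]["title"] = title

    return perform_background_task(block)


futures = [mark_done(1), rename(1, "buy oat milk")]
for future in futures:
    future.result()  # wait for completion and surface any exception
write_queue.shutdown()

print(store[1])  # {'title': 'buy oat milk', 'done': True}
```

Because both writes go through the same single-worker queue, the second one always sees the result of the first, which is the property the serial queue is there to guarantee.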

Async read approach for large datasets

OK, so Realm (.NET) doesn't support async queries in its current version.
In case the underlying table for a certain RealmObject contains a lot of records, say in the hundreds of thousands or millions, what is the preferred approach (given the current no async limitation)?
My current options (none tested thus far):
1) On the UI thread, use Realm.GetInstance().All<T> and filter it (and then enumerate the IEnumerable). My assumption is that the UI thread will block waiting for this possibly lengthy operation.
2) Do the previous on a worker thread. The downside would be that all RealmObjects need to be mapped to some auxiliary domain model (or even the same model, but disconnected from Realm) because Realm objects cannot be shared/marshaled between threads.
Is there any recommended approach (by the Realm creators, of course)? I'm aware this doesn't completely fit the question model for this site, but so be it.
Realm enumerators are truly lazy and the All<T> is a further special case, so it is certainly fast enough to do on the UI thread.
Even queries are so fast, most of the time we recommend people do them on the UI thread.
To enlarge on my comment on the question, RealmObject subclasses are woven at compile time with the property getters and setters being mapped to call directly through to the C++ core, getting memory-mapped data.
That keeps updates between threads lightning fast, as well as delivering our incredible column-scanning speed. Most cases do not require indexes nor do they need running on separate threads.
If you create a standalone RealmObject subclass, e.g. new Dog(), it has a flag IsManaged==false, which means the getter and setter methods still use the backing field, as generated by the compiler.
If you create an object with CreateObject or you take a standalone into the Realm with Realm.Manage then IsManaged==true and the backing field is ignored.

EF5 (entity framework) memory leak and doesn't release after dispose

So I'm using Web API to expose data services. Initially I created my DbContext as a static member, and each time I open up my project under IIS Express, the process balloons to over 100MB of memory. I understand that it isn't recommended to use static, per the accepted answer here:
Entity framework context as static
So I went ahead and converted my application to using regular non-static dbcontext and included a dispose method on my api:
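protected override void Dispose(bool disposing)
{
    // Only release managed resources when called from Dispose(), not from a finalizer.
    if (disposing && provider != null && provider.Context != null)
    {
        provider.Context.Dispose();
        // provider is declared readonly (see below), so it is not reassigned here.
    }
    base.Dispose(disposing);
}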
Now every time I make a call, it goes through this method and disposes. But when I open the application, it still balloons to about 100MB, and each time I make a call I watch the memory of my iisexpress process keep going up. It does not come back down after the dispose; it keeps increasing to almost 200MB+.
So static or not, memory explodes whenever I use it.
Initially I thought it was my web api that was causing it, until I removed all my services and just created the EF object in my api (I'm using breezejs, so this code is trivial, the actual implementation is down below, but makes no diff to memory consumption):
private DistributorLocationEntities context = new DistributorLocationEntities();
And bam, 110MB immediately.
Are there any helpful tips and tweaks on how I can release memory when I use it? Should I add a garbage collection call to my Dispose()? Are there any pitfalls to allocating and deallocating memory rapidly like that? For example, I make calls to the service each time I make a keystroke to accomplish an "autocomplete" feature.
I'm also not certain what will happen if I put this in production, and we have dozens of users accessing the db; I wouldn't want the users to increase the memory to 1 or 2GB and it doesn't get released.
Side note: All my data services for now are searches, so there are no save changes or updates, though there may be later on. Also, I don't return any LINQ queries as an array or enumerable; they remain queryables throughout the service call.
One more thing, I do use breezejs, so I wrap up my context as such:
readonly EFContextProvider<DistributorLocationEntities> provider = new EFContextProvider<DistributorLocationEntities>();
and the tidbits that go along with this:
Doc for Breeze's EFContextProvider
proxycreationenabled = false
lazyloadingenabled = false
idispose is not included
but I still dispose the context anyways, which makes no difference.
I don't know what you're doing. I do know that you should not have any static resources of any kind in your Web API controllers (breeze-flavored or not).
I strongly suspect you've violated that rule.
Adding a Dispose method makes no difference if the object is never disposed ... which it won't be if it is held in a static variable.
I do not believe that Breeze has any role in your problem whatsoever. You've already shown that it doesn't.
I suggest you start from a clean slate, forget Breeze for now, and get a simple Web API controller working that creates a DbContext per request. When you've figured that out, proceed to add some Breeze.
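The per-request lifetime recommended here can be illustrated with a small Python sketch (an analogy, not Web API or Entity Framework code). FakeContext, request_scope and handle_request are hypothetical names; the counter only exists to show that nothing stays open between requests.

```python
from contextlib import contextmanager


class FakeContext:
    """Hypothetical stand-in for a DbContext; tracks how many are still open."""

    open_count = 0

    def __init__(self):
        FakeContext.open_count += 1
        self.disposed = False

    def dispose(self):
        if not self.disposed:
            self.disposed = True
            FakeContext.open_count -= 1


@contextmanager
def request_scope():
    """Create one context per request and always dispose it afterwards."""
    context = FakeContext()
    try:
        yield context
    finally:
        context.dispose()


def handle_request(term):
    # The context lives exactly as long as this one request.
    with request_scope() as context:
        assert not context.disposed
        return "results for " + term


# Simulate an autocomplete box firing one request per keystroke.
responses = [handle_request(term) for term in ["a", "ab", "abc"]]

print(responses[-1])           # results for abc
print(FakeContext.open_count)  # 0
```

A context held in a static field never reaches the dispose step, so anything it has accumulated stays reachable for the life of the process.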
As mentioned in Ward's comment, statics are a big no-no, so I spent time on moving my EF objects out of static. Dispose method didn't really help either.
I gave this article a good read:
http://msdn.microsoft.com/en-us/data/hh949853.aspx
There are quite a few performance options EF provides (that don't come out of the box). So here are a few things I've done:
Added pre-generated views to EF: T4 templates for generating views for EF4/EF5. The nice thing about this is that it abstracts away from the DB and pre-generates the view to decrease model load time
Next, I read this post on Contains in EF: Why does the Contains() operator degrade Entity Framework's performance so dramatically?. I saw an attractive answer there suggesting converting my IEnumerable.Contains into a HashSet.Contains. This boosted my performance considerably.
Finally, reading the Microsoft article, I realized there is an AsNoTracking() method that you can apply to a query on the DbContext; it turns off automatic caching (change tracking) for that specific query. So you can do something like this:
var query = from t in db.Context.Table1.AsNoTracking() select new { ... };
Something I didn't have to worry about was compiling queries in EF5, since it does it for you automatically, so you don't have to add CompiledQuery.Compile(). Also, if you're using EF 6 alpha 2, you don't need to worry about Contains or pre-generating views, since this is fixed in that version.
So when I first start up EF, that is a "cold" query execution and my memory goes high, but after recycling IIS, memory is cut in half and it uses "warm" query execution. So that explains a lot!

Does Wordpress load-cache functions such as wp_get_nav_menu_object?

I've been asking myself this question for quite a while. Maybe someone has already done some digging (or is involved in WP) to know the answer.
I'm talking about storing objects from WP-functions in PHP variables, for the duration of a page load, e.g. to avoid having to query the database twice for the same result set.
I don't mean caching in the sense of pre-rendering dynamic pages and saving them in HTML format for faster retrieval.
Quite a few "template tags" (Wordpress functions) may be used multiple times in a theme during one page load. When a theme or plugin calls such a function, does WP run a database query every time to retrieve the necessary data, and does it parse this data every time to return the desired object?
Or does the function store its result in a PHP variable the first time it runs, and check whether it already exists before it queries the database or parses?
Examples include:
wp_get_nav_menu_object()
wp_get_nav_menu_items()
wp_list_categories()
wp_tag_cloud()
wp_list_authors()
...but also such important functions as bloginfo() or wp_nav_menu().
Of course, it wouldn't make much sense to cache any and all queries like post-related ones. But for the above examples (there are more), I believe it would.
So far, I've been caching these generic functions myself when a theme required the same function to be called more than once on a page, by writing my own functions or classes and caching in global or static variables. I don't see why I should add to the server load by running the exact same generic query more than a single time.
Does this sort of caching already exist in Wordpress?
Yes, for some queries and functions. See WP Object Cache. The relevant functions are wp_cache_get, wp_cache_set, wp_cache_add, and wp_cache_delete. You can find these functions being used in many places throughout the WordPress code to do exactly what you are describing.
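The get-then-set pattern behind such an object cache looks roughly like the following Python sketch. This is an illustration of the idea, not WordPress code: cache_get and cache_set only loosely mirror wp_cache_get and wp_cache_set, and run_query, get_nav_menu and the "nav_menu" group name are hypothetical.

```python
# In-memory cache that lives only for the duration of one "page load".
_cache = {}
query_count = 0


def cache_get(key, group="default"):
    """Return the cached value, or None when nothing is stored."""
    return _cache.get((group, key))


def cache_set(key, value, group="default"):
    _cache[(group, key)] = value


def run_query(slug):
    """Hypothetical expensive database lookup; counts how often it runs."""
    global query_count
    query_count += 1
    return {"slug": slug, "items": ["Home", "About"]}


def get_nav_menu(slug):
    """Check the cache first and only query on a miss."""
    menu = cache_get(slug, "nav_menu")
    if menu is None:
        menu = run_query(slug)
        cache_set(slug, menu, "nav_menu")
    return menu


# A theme asking for the same menu in the header and in the footer.
header_menu = get_nav_menu("main")
footer_menu = get_nav_menu("main")

print(query_count)                 # 1
print(header_menu is footer_menu)  # True
```

Whether a particular template tag goes through such a cache has to be checked function by function, which is why the answer says "for some queries and functions".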

Cache Management with Numerous Similar Database Queries

I'm trying to introduce caching into an existing server application because the database is starting to become overloaded.
Like many server applications we have the concept of a data layer. This data layer has many different methods that return domain model objects. For example, we have an employee data access object with methods like:
findEmployeesForAccount(long accountId)
findEmployeesWorkingInDepartment(long accountId, long departmentId)
findEmployeesBySearch(long accountId, String search)
Each method queries the database and returns a list of Employee domain objects.
Obviously, we want to try and cache as much as possible to limit the number of queries hitting the database, but how would we go about doing that?
I see a couple possible solutions:
1) We create a cache for each method call. E.g. for findEmployeesForAccount we would add an entry with a key account-employees-accountId. For findEmployeesWorkingInDepartment we could add an entry with a key department-employees-accountId-departmentId and so on. The problem I see with this is when we add a new employee into the system, we need to ensure that we add it to every list where appropriate, which seems hard to maintain and bug-prone.
2) We create a more generic query for findEmployeesForAccount (with more joins and/or queries because more information will be required). For other methods, we use findEmployeesForAccount and remove entries from the list that don't fit the specified criteria.
I'm new to caching so I'm wondering what strategies people use to handle situations like this? Any advice and/or resources on this type of stuff would be greatly appreciated.
I've been struggling with the same question myself for a few weeks now... so consider this a half-answer at best. One bit of advice that has been working out well for me is to use the Decorator Pattern to implement the cache layer. For example, here is an article detailing this in C#:
http://stevesmithblog.com/blog/building-a-cachedrepository-via-strategy-pattern/
This allows you to literally "wrap" your existing data access methods without touching them. It also makes it easy to swap the cached version of your DAL for the direct access version at runtime (which can be useful for unit testing).
I'm still struggling to manage my cache keys, which seem to spiral out of control when there are numerous parameters involved. Inevitably, something ends up not being properly cleared from the cache and I have to resort to heavy-handed ClearAll() approaches that just wipe out everything. If you find a solution for cache key management, I would be interested, but I hope the decorator pattern layer approach is helpful.
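As a rough illustration of the decorator approach, here is a Python sketch (the question's methods are Java-style; the names below are hypothetical adaptations). Grouping cache entries per account is an assumption of this sketch, not something the answer proposes: it is one possible way to keep invalidation coarse but targeted, short of a global ClearAll().

```python
class EmployeeDao:
    """Hypothetical direct-access DAO; counts how often the 'database' is hit."""

    def __init__(self):
        self.rows = [
            {"id": 1, "account": 10, "department": 5, "name": "Ann"},
            {"id": 2, "account": 10, "department": 6, "name": "Bob"},
        ]
        self.queries = 0

    def find_for_account(self, account_id):
        self.queries += 1
        return [r for r in self.rows if r["account"] == account_id]

    def find_in_department(self, account_id, department_id):
        self.queries += 1
        return [
            r for r in self.rows
            if r["account"] == account_id and r["department"] == department_id
        ]

    def add(self, row):
        self.rows.append(row)


class CachedEmployeeDao:
    """Decorator: same methods as the wrapped DAO, with results cached per account."""

    def __init__(self, inner):
        self.inner = inner
        self.cache = {}  # account_id -> {query key: result list}

    def _cached(self, account_id, key, loader):
        bucket = self.cache.setdefault(account_id, {})
        if key not in bucket:
            bucket[key] = loader()
        return bucket[key]

    def find_for_account(self, account_id):
        return self._cached(
            account_id, ("account",),
            lambda: self.inner.find_for_account(account_id),
        )

    def find_in_department(self, account_id, department_id):
        return self._cached(
            account_id, ("department", department_id),
            lambda: self.inner.find_in_department(account_id, department_id),
        )

    def add(self, row):
        self.inner.add(row)
        # Evict every cached list for this account instead of patching each one.
        self.cache.pop(row["account"], None)


dao = CachedEmployeeDao(EmployeeDao())
dao.find_for_account(10)
dao.find_for_account(10)       # served from the cache
dao.find_in_department(10, 5)
dao.add({"id": 3, "account": 10, "department": 5, "name": "Cy"})
names = [r["name"] for r in dao.find_in_department(10, 5)]

print(dao.inner.queries)  # 3
print(names)              # ['Ann', 'Cy']
```

The trade-off is that adding one employee throws away every cached list for that account, including lists the new employee would not appear in; in exchange, there is no per-list bookkeeping to get wrong.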
