Questions
Does the user option preload refer to caching on the client or on the server?
Are there any ways to make this occur asynchronously so that users don't take a large performance hit when first requesting data from a table?
More Info
In Dynamics AX 2012, under File > User Options > Preload, a user can select which tables are preloaded the first time they're accessed.
I've not found anything to say whether this behaviour relates to caching on the client or the AOS.
The fact it's a user setting implies that it's the client.
But it could be an AOS setting where users with this option take the initial hit of preloading the entire table, whilst those without would benefit from any caching caused by other users, but wouldn't trigger the load themselves.
If it's the latter we could improve performance by removing this option from all (human) users, leaving it enabled only on our batch user account, having scheduled jobs on each AOS to request a record from each table, thus triggering the preload without any user being negatively impacted.
Ref: http://dynamicbusinesssolutions.ru/axshared.en/html/9cd36702-2fa7-470c-a627-08
If a table is large or frequently changed it is not a candidate for entire table cache. This applies to ordinary users and batch users alike.
The EntireTable cache is located on the server, but the load is initiated by the user: the first user doing the select takes the performance hit.
To successfully disable a table from preload, you can disable it using the Admin user; it will then apply to all users. Or you can let all users disable it themselves.
Personally I never change the user setup. If a table is large I change the table CacheLookup property as a customization.
See Set-based Caching:
When you set a table's CacheLookup property to EntireTable, all the
records in the table are placed in the cache after the first select.
This type of caching follows the rules of single record caching. This
means that the SELECT statement WHERE clause must include equality
tests on all fields of the unique index that is defined in the table's
PrimaryIndex property.
The EntireTable cache is located on the server
and is shared by all connections to the Application Object Server
(AOS). If a select is made on the client tier to a table that is
EntireTable cached, it first looks in its own cache and then searches
the server-side EntireTable cache.
An EntireTable cache is created for
each table for a given company. If you have two selects on the same
table for different companies the entire table is cached twice.
Note: Avoid using EntireTable caches for large tables because once
the cache size reaches 128 KB the cache is moved from memory to disk.
A disk search is much slower than an in-memory search.
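To make the quoted behaviour concrete, here is a minimal Python sketch of a server-side cache that loads a whole table on the first select, with one cache per table per company. It is only a toy model, not how the AOS is implemented: the class name, the loader function and the warm() helper (which mimics the scheduled-job idea from the question) are all hypothetical.

```python
from typing import Callable, Dict, Hashable, List, Tuple

Row = Dict[str, object]


class EntireTableCache:
    """Toy model of a shared cache that loads a whole table on first select."""

    def __init__(self, load_table: Callable[[str, str], List[Row]], key_field: str):
        self._load_table = load_table   # hypothetical loader: (table, company) -> rows
        self._key_field = key_field     # field of the unique primary index
        self._tables: Dict[Tuple[str, str], Dict[Hashable, Row]] = {}
        self.loads = 0                  # number of full-table loads performed

    def select(self, table: str, company: str, key: Hashable):
        cache_key = (table, company)    # one cache per table per company
        if cache_key not in self._tables:
            # The first caller for this table/company pays for the full load.
            rows = self._load_table(table, company)
            self._tables[cache_key] = {row[self._key_field]: row for row in rows}
            self.loads += 1
        return self._tables[cache_key].get(key)

    def warm(self, tables, companies):
        """Trigger the loads ahead of time, e.g. from a scheduled job."""
        for table in tables:
            for company in companies:
                self.select(table, company, None)


def fake_loader(table, company):
    # Stand-in for reading every record of the table from the database.
    return [{"Id": i, "Name": f"{table}-{company}-{i}"} for i in range(3)]


cache = EntireTableCache(fake_loader, key_field="Id")
cache.select("Currency", "CEU", 1)            # first select: full load
cache.select("Currency", "CEU", 2)            # served from the cache
cache.select("Currency", "DAT", 1)            # other company: second load
row = cache.select("Currency", "CEU", 1)
```

In this model, calling warm() from a background job before users arrive would move the cost of the first load away from interactive users.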
Related
I am using CachedDbAuthManager as the Auth Manager in Yii. Once the SQL executes for a specific role against a user, it caches the result. The next time, the records are fetched from the cache. Now, once an admin deletes the role for a particular user, it still remains in the cache.
What is the solution to this problem?
Have a look at Yii's Cache Dependency Implementation.
You could, e.g., invalidate the cache when the admin edits an auth table; see also the database cache dependency. Often this is done just by looking at the latest timestamp, e.g. a modified_at column, but this column is not part of the standard auth tables.
From the database cache man page:
CDbCacheDependency represents a dependency based on the query result of a SQL statement.
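The idea behind a query-based dependency can be sketched in a few lines of Python. This is not Yii's implementation, just a model of the technique: the cached value is stored together with the result of a dependency query, and it is recomputed whenever that result changes. The table layout and the COUNT(*) dependency are assumptions for illustration; a row count only notices inserts and deletes, not updates.

```python
import sqlite3
from typing import Any, Callable, Dict, Tuple


class DependencyCache:
    """Cache whose entries are recomputed when a dependency query result changes."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn
        self._entries: Dict[str, Tuple[Any, Any]] = {}  # key -> (value, dependency result)

    def get(self, key: str, dependency_sql: str, compute: Callable[[], Any]) -> Any:
        current = self._conn.execute(dependency_sql).fetchall()
        entry = self._entries.get(key)
        if entry is not None and entry[1] == current:
            return entry[0]              # dependency unchanged: cache hit
        value = compute()                # missing or stale: recompute
        self._entries[key] = (value, current)
        return value


conn = sqlite3.connect(":memory:")
# Illustrative auth assignment table; the column names are assumptions.
conn.execute("CREATE TABLE AuthAssignment (userid TEXT, itemname TEXT)")
conn.execute("INSERT INTO AuthAssignment VALUES ('alice', 'editor')")

# Assumed dependency query: cheap, but blind to in-place updates.
DEPENDENCY_SQL = "SELECT COUNT(*) FROM AuthAssignment"


def roles_for(user: str):
    rows = conn.execute(
        "SELECT itemname FROM AuthAssignment WHERE userid = ? ORDER BY itemname",
        (user,),
    ).fetchall()
    return [r[0] for r in rows]


cache = DependencyCache(conn)
before = cache.get("roles:alice", DEPENDENCY_SQL, lambda: roles_for("alice"))
conn.execute("DELETE FROM AuthAssignment WHERE userid = 'alice'")  # admin removes the role
after = cache.get("roles:alice", DEPENDENCY_SQL, lambda: roles_for("alice"))
```

The price is one small dependency query per cache read, which is usually far cheaper than rebuilding the auth data.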
There is another extension, SingleDbAuthManager, which does nearly the same thing. It reads the whole auth tree at once and caches it.
The performance of SingleDbAuthManager and CachedDbAuthManager varies. CachedDbAuthManager takes less time but fails to update the cache in my case.
Let me start by describing the scenario. I have an MVC 3 application with SQL Server 2008. In one of the pages we display a list of Products that is returned from the database and is UNIQUE per logged in user.
The SQL query (actually a VIEW) used to return the list of products is VERY expensive.
It is based on very complex business requirements which cannot be changed at this stage.
The database schema cannot be changed or redesigned as it is used by other applications.
There are 50k products and 5k users (each user may have access to 1 up to 50k products).
In order to display the Products page for the logged in user we use:
SELECT TOP X * FROM [VIEW] WHERE UserID = @UserId -- where 'X' is the size of the page
The query above returns a maximum of 50 rows (maximum page size). The WHERE clause restricts the number of rows to a maximum of 50k (products that the user has access to).
The page is taking about 5 to 7 seconds to load and that is exactly the time the SQL query above takes to run in SQL.
Problem:
The user goes to the Products page and very likely uses paging, re-sorts the results, goes to the details page, etc and then goes back to the list. And every time it takes 5-7s to display the results.
That is unacceptable, but at the same time the business team has accepted that the first time the Products page is loaded it can take 5-7s. Therefore, we thought about CACHING.
We now have two options to choose from, the most "obvious" one, at least to me, is using .Net Caching (in memory / in proc). (Please note that Distributed Cache is not allowed at the moment for technical constraints with our provider / hosting partner).
But I'm not very comfortable with this. We could end up with lots of products in memory (when there are 50 or 100 users logged in simultaneously) which could cause other issues on the server, like .Net constantly removing cache items to free up space while our code inserts new items.
The SECOND option:
The main problem here is that it is very EXPENSIVE to generate the User x Product x Access view, so we thought we could create a flat table (or in other words a CACHE of all products x users in the database). This table would be exactly the result of the view.
However the results can change at any time if new products are added, user permissions are changed, etc. So we would need to constantly refresh the table (which could take a few seconds) and this started to get a little bit complex.
Similarly, we thought we could implement some sort of Cache Provider and, upon request from a user, we would run the original SQL query and select the products from the view (5-7s, acceptable only once) and save that result in a flat table called ProductUserAccessCache in SQL. On the next request, we would get the values from this cached table (as we could easily identify that the results were cached for that particular user) with a fast query without calculations in SQL.
Any time a product was added or a permission changed, we would truncate the cached-table and upon a new request the table would be repopulated for the requested user.
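A minimal in-memory Python sketch of that populate-on-first-request, truncate-on-change idea is shown below. It only models the logic; in the plan described above the rows would live in the ProductUserAccessCache table instead of a dict. The slow_view function and the row shape are hypothetical stand-ins for the expensive view.

```python
from typing import Callable, Dict, List


class ProductUserAccessCache:
    """Per-user cache of the expensive view result, cleared on any change."""

    def __init__(self, run_view: Callable[[int], List[dict]]):
        self._run_view = run_view                      # stands in for the 5-7s view query
        self._rows_by_user: Dict[int, List[dict]] = {}

    def page(self, user_id: int, page: int, page_size: int = 50, sort_key: str = "name"):
        if user_id not in self._rows_by_user:
            # Slow path, paid once per user until the next truncate().
            self._rows_by_user[user_id] = self._run_view(user_id)
        rows = sorted(self._rows_by_user[user_id], key=lambda r: r[sort_key])
        start = page * page_size
        return rows[start:start + page_size]

    def truncate(self) -> None:
        """Call when a product is added or a permission changes."""
        self._rows_by_user.clear()


calls = []


def slow_view(user_id: int) -> List[dict]:
    calls.append(user_id)                              # record each expensive run
    return [{"id": i, "name": f"product-{i:03d}"} for i in range(120)]


cache = ProductUserAccessCache(slow_view)
first = cache.page(user_id=7, page=0)
second = cache.page(user_id=7, page=1)                 # paging reuses the cached rows
cache.truncate()                                       # e.g. a permission changed
third = cache.page(user_id=7, page=0)                  # repopulated on the next request
```

Paging and re-sorting then work on the cached rows, and only the first request after a change pays for the view.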
It doesn't seem too complex to me, but what we are doing here basically is creating a NEW cache "provider".
Does anyone have any experience with this kind of issue?
Would it be better to use .Net Caching (in proc)?
Any suggestions?
We were facing a similar issue some time ago, and we were thinking of using EF caching in order to avoid the delay in retrieving the information. Our problem was a 1-2 second delay. Here is some info that might help on how to cache a table by extending EF. One of the drawbacks of caching is how fresh you need the information to be, so you set your cache expiration accordingly. Depending on that expiration, users might need to wait longer than they would like to get the fresh info, but if your users can accept that they might be seeing outdated info in order to avoid the delay, then the tradeoff would be worth it.
In our scenario, we decided that having fresh info was better than having quick info, but as I said before, our waiting period wasn't that long.
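The freshness tradeoff can be illustrated with a small expiring cache in Python. This is a generic sketch, not the EF caching mentioned above; the class name and the injectable clock are assumptions made so the behaviour can be shown without waiting.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlCache:
    """Cache whose entries are reloaded once they are older than ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}  # key -> (stored at, value)

    def get_or_load(self, key: str, load: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]              # still within the expiration window
        value = load()                   # expired or missing: pay for the slow load
        self._entries[key] = (now, value)
        return value


fake_now = [0.0]                         # controllable clock for the demonstration
source = {"value": "v1"}                 # stands in for the database
cache = TtlCache(ttl_seconds=60, clock=lambda: fake_now[0])

a = cache.get_or_load("products", lambda: source["value"])
source["value"] = "v2"                   # the data changes behind the cache
fake_now[0] = 30.0
b = cache.get_or_load("products", lambda: source["value"])  # within TTL: outdated value
fake_now[0] = 61.0
c = cache.get_or_load("products", lambda: source["value"])  # expired: fresh value
```

A longer expiration means fewer slow loads but a longer window in which users can see outdated data.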
Hope it helps
We have a fantasy football application that uses memcached and the classic memcached-object-read-with-sql-server-fallback. This works fairly well, but recently I've been contemplating the overhead involved and whether or not this is the best approach.
Case in point - we need to generate a drop down list of the user's teams, so we follow this pattern:
1) Get a list of the user's teams from memcached.
2) If not available, get the list from SQL Server and store it in memcached.
3) Do a multiget to get the team objects.
4) Fall back to loading the missing objects from SQL and store these in memcached.
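For reference, the pattern above can be sketched in Python roughly as follows. The FakeMemcached class is a dict-backed stand-in that only counts round trips, and the TEAMS and USER_TEAMS dicts stand in for the SQL tables; all of these names are hypothetical.

```python
from typing import Dict, List


class FakeMemcached:
    """Dict-backed stand-in for a memcached client; counts round trips."""

    def __init__(self):
        self.data: Dict[str, object] = {}
        self.round_trips = 0

    def get(self, key):
        self.round_trips += 1
        return self.data.get(key)

    def get_multi(self, keys):
        self.round_trips += 1
        return {k: self.data[k] for k in keys if k in self.data}

    def set(self, key, value):
        self.round_trips += 1
        self.data[key] = value


# Hypothetical "database": team id -> team object, and user id -> team ids.
TEAMS = {i: {"id": i, "name": f"Team {i}", "roster": []} for i in range(1, 21)}
USER_TEAMS = {42: list(TEAMS)}


def load_user_teams(cache: FakeMemcached, user_id: int) -> List[dict]:
    list_key = f"user:{user_id}:team_ids"
    team_ids = cache.get(list_key)              # step 1: list of team ids from the cache
    if team_ids is None:
        team_ids = USER_TEAMS[user_id]          # step 2: fall back to the database
        cache.set(list_key, team_ids)
    keys = {f"team:{tid}": tid for tid in team_ids}
    found = cache.get_multi(list(keys))         # step 3: multiget the team objects
    for key, tid in keys.items():
        if key not in found:                    # step 4: load misses and store them
            found[key] = TEAMS[tid]             # stands in for one IN (...) query
            cache.set(key, found[key])
    return [found[k] for k in keys]


cache = FakeMemcached()
cold = load_user_teams(cache, 42)
cold_trips = cache.round_trips
warm = load_user_teams(cache, 42)
warm_trips = cache.round_trips - cold_trips
```

In this sketch a cold cache costs one cache write per team on top of the reads, while a warm cache needs only the two reads.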
This is all very well - each cached piece of data is relatively easily cached and invalidated, but there are two major downsides to this:
1) Because we are operating on objects we are incurring a rather large overhead - a single team occupies some hundred bytes in memcached and what we really just need for this case is a list of team names and ids - not all the other stuff in the team objects.
2) Due to the fallback to loading individual objects, the number of SQL queries generated on an empty cache or when the items expire can be massive:
1 x Memcached multiget (which misses and causes)
1 x SELECT ... FROM Team WHERE Id IN (...)
20 x Store in memcached
So that's 21 network requests just for this one query, and also the IN query is slower than a specific join.
Obviously we could just do a simple
SELECT Id, Name FROM Teams WHERE UserId = XYZ
And cache that result, but this would mean that this data would need to be specifically invalidated whenever the user creates a new team. In this case it might seem relatively simple, but we have many of these types of queries, and many of them operate on axes that are not easily invalidated (like a list of ids and names of the teams that your friends have created in a specific game).
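A sketch of that simpler variant in Python: cache only the (id, name) pairs per user and drop the entry when the user creates a team. The class and callback names are hypothetical, and the lambda stands in for the SELECT Id, Name query.

```python
from typing import Callable, Dict, List, Tuple

IdName = Tuple[int, str]


class TeamNameCache:
    """Caches only the projected (id, name) pairs, one entry per user."""

    def __init__(self, query_names: Callable[[int], List[IdName]]):
        self._query_names = query_names          # stands in for the SELECT Id, Name query
        self._by_user: Dict[int, List[IdName]] = {}

    def names_for(self, user_id: int) -> List[IdName]:
        if user_id not in self._by_user:
            self._by_user[user_id] = self._query_names(user_id)
        return self._by_user[user_id]

    def on_team_created(self, user_id: int) -> None:
        # Explicit invalidation: must be called from every code path that adds a team.
        self._by_user.pop(user_id, None)


teams = {42: [(1, "Rovers")]}                    # hypothetical Teams table
cache = TeamNameCache(lambda uid: list(teams.get(uid, [])))

before = cache.names_for(42)
teams[42].append((2, "United"))                  # the user creates a new team
stale = cache.names_for(42)                      # still the old list
cache.on_team_created(42)
fresh = cache.names_for(42)                      # reloaded after invalidation
```

The stale read in the middle shows why every such query needs its own invalidation hook, which is the maintenance cost described above.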
Sooo.. My question is - do any of you have ideas for resolving the mentioned drawbacks, or should I just accept that there is an overhead and that cache misses are bad, live with it?
First, cache what you need, maybe just those two fields, not a complete record.
Second, cache what you need again: break the result set into records and cache them separately.
about caching:
You generally use caching to offload the slower disk-based storage, in this case MySQL. The memory cache scales up rather easily; MySQL scales less easily.
Given that, even if you double the CPU/network/memory usage of the cache by splitting the data and putting it all together again, it will still offload the db. Adding another nodejs instance or another memcached server is easy.
back to your question
You say it's a user's team, so you could fetch it when the user logs in and keep it updated in the cache while the user changes it throughout their session.
I presume the team members' names do not change; if so, you can load all team members by id and name and store those in the cache, or even locally on nodejs, using the same fallback strategy as you do now. Only steps 1, 2 and 4 will be left then.
Personally, I usually try to split the SQL results into smaller ready-made pieces and cache those, and keep the cache updated as long as possible, ultimately trying to use MySQL only as storage and never read from it.
Usually you will run some logic on the rows returned from MySQL anyway; there's no need to keep repeating that.
Oracle's database change notification feature sends ROWIDs (physical row addresses) on row inserts, updates and deletes. As indicated in Oracle's documentation, this feature can be used by the application to build a middle-tier cache. But this seems contradictory when we look in detail at how ROWIDs work.
ROWIDs (physical row addresses) can change when various database operations are performed, as indicated by this stackoverflow thread. In addition to this, as Tom mentions in this thread, clustered tables can have the same ROWIDs.
Based on the above research, it doesn't seem safe to use the ROWID sent in the database change notification as the key in the application cache, right? This also raises a question: should the database change notification feature be used to build an application server cache? Or is the recommendation to restart all the application server clusters (to reload/refresh the cache) when the tables of the cached objects undergo any operations that cause ROWIDs to change? Would that be a good assumption to make for production environments?
It seems to me that none of the operations that can potentially change the ROWID is an operation that would be carried out in a production environment while the application is running. Furthermore, I've seen a lot of production software that uses the ROWID across transactions (usually just for a few seconds or minutes). That software would probably fail before your cache if the ROWID changed. So creating a database cache based on change notification seems reasonable to me. Just provide a small disclaimer regarding the ROWID.
The only somewhat problematic operation is an update causing a movement to another partition. But that's something that rarely happens because it defeats the purpose of the partitioning, at least if it occurred regularly. The designer of a particular database schema will be able to tell you whether such an operation can occur and is relevant for caching. If none of the tables has ENABLE ROW MOVEMENT set, you don't even need to ask the designer.
As to duplicate ROWIDs: ROWIDs aren't unique globally, they are unique within a table. And you are given both the ROWID and the table name in the change notification. So the tuple of ROWID and table name is a perfect unique key for building a reliable cache.
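A minimal Python sketch of a cache keyed by that tuple follows. It does not use Oracle's client API; the on_change method stands in for whatever callback receives the notification, and the fetch function, table names and ROWID strings are made up for illustration.

```python
from typing import Callable, Dict, Optional, Tuple

CacheKey = Tuple[str, str]   # (table name, ROWID)


class NotificationCache:
    """Row cache keyed by (table name, ROWID), evicted on change notifications."""

    def __init__(self, fetch_row: Callable[[str, str], Optional[dict]]):
        self._fetch_row = fetch_row            # hypothetical: (table, rowid) -> row or None
        self._rows: Dict[CacheKey, dict] = {}

    def get(self, table: str, rowid: str) -> Optional[dict]:
        key = (table, rowid)
        if key not in self._rows:
            row = self._fetch_row(table, rowid)
            if row is None:
                return None
            self._rows[key] = row
        return self._rows[key]

    def on_change(self, table: str, rowid: str, operation: str) -> None:
        """Handle one notified row change ('INSERT', 'UPDATE' or 'DELETE')."""
        # Evict in every case; the next get() re-reads the current row, if any.
        self._rows.pop((table, rowid), None)


# Fake database in which two tables happen to share the same ROWID value.
db = {
    ("EMP", "AAA1"): {"name": "Smith"},
    ("DEPT", "AAA1"): {"name": "Sales"},
}
cache = NotificationCache(lambda table, rowid: db.get((table, rowid)))

emp = cache.get("EMP", "AAA1")
dept = cache.get("DEPT", "AAA1")               # same ROWID, different table, no clash

db[("EMP", "AAA1")] = {"name": "Jones"}        # row updated in the database
cache.on_change("EMP", "AAA1", "UPDATE")       # notification arrives
emp_after = cache.get("EMP", "AAA1")
```

Evicting on every notification keeps the handler trivial; the row is simply re-read on the next access.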
I have a user object represented in JPA which has specific sub-types. Eg, think of User and then a subclass Admin, and another subclass Power User.
Let's say I have 100k users. I have successfully implemented the second level cache using Ehcache in order to increase performance and have validated that it's working.
http://docs.jboss.org/hibernate/core/3.3/reference/en/html/performance.html#performance-cache
I know it does work (ie, you load the object from the cache rather than invoke an sql query) when you call the load method. I've verified this via logging at the hibernate level and also verifying that it's quicker.
However, I actually want to select a subset of all the users. For example, let's say I want to do a count of how many Power Users there are.
Furthermore, my users have an associated ZipCode object, and the ZipCode objects are also second level cached. What I'd like to do is to be able to ask queries like: how many Power Users do I have in New York state?
However, my question is: how do I write a query to do this that will hit the second level cache and not the database? Note that my second level cache is configured to be read/write, so as new users are added to the system they should automatically be added to the cache. Also note that I have investigated the Query cache briefly, but I'm not sure it's applicable, as this is for queries that are run multiple times. My problem is more a case of: the data should be in the second level cache anyway, so what do I have to do so that the database doesn't get hit when I write my query?
cheers,
Brian
(...) the data should be in the second level cache anyway so what do I have to do so that the database doesn't get hit when I write my query.
If the entities returned by your query are cached, have a look at Query#iterate(). This will trigger a first query to retrieve a list of IDs and then subsequent queries for each ID... that would hit the L2 cache.
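To show what that two-step access pattern implies, here is a small Python model of it. It is not Hibernate code: the class is a hypothetical stand-in in which the ID query always goes to the database and each entity is then resolved through a cache.

```python
from typing import Callable, Dict, Iterator, List


class SecondLevelCache:
    """Toy model: an ID query against the database, then per-ID loads via a cache."""

    def __init__(self, db_rows: Dict[int, dict]):
        self._db = db_rows
        self._cache: Dict[int, dict] = {}
        self.db_queries = 0

    def query_ids(self, predicate: Callable[[dict], bool]) -> List[int]:
        self.db_queries += 1                   # the ID query always hits the database
        return [i for i, row in self._db.items() if predicate(row)]

    def load(self, entity_id: int) -> dict:
        if entity_id not in self._cache:
            self.db_queries += 1               # cache miss: one query for this entity
            self._cache[entity_id] = self._db[entity_id]
        return self._cache[entity_id]

    def iterate(self, predicate: Callable[[dict], bool]) -> Iterator[dict]:
        for entity_id in self.query_ids(predicate):
            yield self.load(entity_id)


users = {
    1: {"type": "PowerUser", "state": "NY"},
    2: {"type": "Admin", "state": "NY"},
    3: {"type": "PowerUser", "state": "CA"},
}
store = SecondLevelCache(users)
for uid in users:
    store.load(uid)                            # warm the cache, as earlier loads would

queries_before = store.db_queries
ny_power = list(store.iterate(lambda u: u["type"] == "PowerUser" and u["state"] == "NY"))
extra_queries = store.db_queries - queries_before
```

In this model the database is still hit once for the IDs, so the approach pays off only when the entities themselves are already cached; with a cold cache it degrades to one query per entity.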