Can anyone point me in the direction of how to troubleshoot why a Tabular model I have built does not seem to cache query results?
It is my understanding that MDX queries against a Tabular model will be cached; however, with our model they never seem to be, and I can't figure out why.
My best guess is that it's memory pressure and the system is clearing down the RAM, but even that is a guess.
Are there any counters, DMVs, or other perfmon stats I can use to actually see what is going on and check?
Thanks.
Plenty of places to look, but I'd recommend starting with a Profiler/xEvent trace. Below is an example of 2 runs of the same MDX query.
The first run is on a cold cache...
The second run is on a warm cache, and you can see that it resolves the query from cache...
This is much easier to see if you can isolate the query on a non-production server (e.g. a test/dev environment). There are quite a few reasons why a particular query may not be taking advantage of the cache, but you first need to confirm that it is not using the cache.
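To cover the counters/DMVs part of the question as well, here is a rough sketch of a DMV query you can run in an MDX window in SSMS alongside the trace; the exact columns returned vary by version:

    -- DMV queries use a restricted SQL-like dialect; run this against the Tabular instance.
    -- It shows per-object memory usage, which helps you spot whether memory pressure
    -- is evicting objects between runs.
    SELECT * FROM $SYSTEM.DISCOVER_OBJECT_MEMORY_USAGE

In Performance Monitor, the Memory and Cache counter sets for the SSAS instance (memory limits, cache hits/misses, cleaner activity) are the usual place to check the memory-pressure theory.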
We are hoping to convert to MassTransit, but due to the size of the data we need to store in the message data repository, we need to keep it in memory. The data is sometimes over 15GB, and the SQL query to fetch it takes up to 300 seconds. We don't want to have to fetch the data potentially hundreds of times (which is how the consumers would be logically broken up); we need to continue to fetch it from SQL once and keep it in memory, or performance will be crippled. We would ultimately like to stop needing so much data in memory, but that is not something we could feasibly fix in the short term. We are hoping to convert to MassTransit now, though, as we see significant benefits in doing so, and the InMemoryMessageDataRepository seems like the only path that would allow us to do this right now. However, the documentation says only to use that for unit testing.
I am trying to find out if this is something we can use in production code. What are the pitfalls in using the InMemoryMessageDataRepository in production? I can't find much about this, other than this blog post, https://kevsoft.net/2017/05/18/transferring-big-payloads-with-masstransit-and-azure-blob-storage.html, which says
It is however advised not to use in-memory storage due to potential data loss.
Why would there be potential data loss? What is MassTransit doing with the InMemoryMessageDataRepository that would make using it in production risky?
I am new to Sphinx and want to make it index a 2 million row table (~1.5GB in size). I will use plain indexes.
At the moment, I don't know how much memory I should put in the mem_limit config. My idea is that I could simply keep the default and then see how many results are being swapped (stay on disk) or expired (how often frequently used results in memory go to disk).
I'm not yet sure exactly how Sphinx works, but that is my understanding for now. How can I see stats like these, just as we can see the STATS output for Memcached?
Having some kind of stats would definitely help me know how to better tune Sphinx for my application.
In case it's relevant, I use MariaDB and PHP on CentOS.
In case it's not clear, mem_limit is ONLY used by the indexer program, i.e. while building the index.
... frankly the setting isn't all that critical. Just set it as high as your available memory allows.
It's not applicable to searchd, which is what actually answers queries.
There is a 'SHOW STATUS' command, but it doesn't really have anything about memory:
http://sphinxsearch.com/docs/current.html#sphinxql-show-status
... memory usage (and there are no variables to control it!) can be obtained from general OS commands. On Linux, for example, possibly something like memstat.
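On the SHOW STATUS side, here is a quick sketch of what you can run against searchd over its SphinxQL listener (commonly port 9306) using the ordinary mysql client; the exact counter names vary by version, and the index name is made up:

    -- Server-wide counters: uptime, queries, connections, command counts, etc.
    SHOW STATUS;

    -- Per-query stats for the last SELECT on this connection
    -- (total matches, query time, per-keyword docs/hits).
    SELECT id FROM myindex WHERE MATCH('test') LIMIT 10;
    SHOW META;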
I recently added the wonderful MiniProfiler package to my project and it helped me a lot to improve page render speed.
Now I notice the following: every first request to a page takes significantly longer in SQL than subsequent visits.
Here's an example:
First visit:
Second and later visits:
Is this caused by some sort of caching in LINQ or on SQL Server? I'm using .NET 4 and LINQ-to-SQL with default settings in my dbml file.
There are a lot of things that can affect the performance of a first hit. The JIT compiler might have to do some work, and various levels of caching might come into play.
That said, SQL Server has very advanced caching features. It's not at all unusual for repeat queries against the server to be much faster than the initial query.
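If you want to confirm the SQL Server side of that on a test box (never in production), here is a sketch of comparing a cold-cache versus warm-cache run of a query MiniProfiler flagged; the table and column names are placeholders:

    -- Test/dev server only: flush the caches to simulate a cold start.
    CHECKPOINT;
    DBCC DROPCLEANBUFFERS;   -- clears cached data pages
    DBCC FREEPROCCACHE;      -- clears cached query plans

    SET STATISTICS TIME ON;
    SET STATISTICS IO ON;

    -- Run the flagged query twice; the second run should show far fewer
    -- physical reads and a lower elapsed time.
    SELECT TOP 50 * FROM dbo.Orders ORDER BY CreatedAt DESC;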
I'm currently using CakePHP for my training plan application but I have rendering times of 800 ms per page.
Do you have any tips on improving the performance?
Thanks in advance.
br, kms
In general the tool that will give you the greatest speed boost is caching, and CakePHP has a dedicated helper. The second most useful thing to do is optimizing the queries, only asking for the data you need - this can be done with the containable behavior.
You can find more detailed tips in this post.
Install APC if you don't have it; that alone should get you under 500 ms without touching a single line of code.
Make sure your tables have all the proper indexes so that queries are as fast as they can be (see the sketch at the end of this answer).
Next up, look at caching routes/URLs, as that is a huge drain.
Those will give you the most speed boost for the least amount of work.
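For the index point, here is a sketch of how you might check and fix a slow query in MySQL/MariaDB; the table and column names are made up:

    -- See whether the query can use an index: look for type "ref" or "range"
    -- rather than "ALL" (a full table scan) in the output.
    EXPLAIN SELECT * FROM training_sessions WHERE user_id = 42 ORDER BY session_date;

    -- Add an index covering the columns you filter and sort on.
    ALTER TABLE training_sessions ADD INDEX idx_user_date (user_id, session_date);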
Have you tried any of the CSS/JS asset combiners for CakePHP? They combine/compress/minify your CSS/JS scripts and cache them where applicable. Here's one that's quite recent.
Not specific to CakePHP, but you could go through all the factors in Google Page Speed; it will help you speed up your page loading times by suggesting which scripts you could combine and advising on how to reduce requests.
Other than that, look into the containable behaviour and see if you can cut out any unnecessary info/queries by only selecting what you need at any time.
This question is well-populated with information on speeding Cake up.
First of all, I know:
Premature optimization is the root of all evil
But I think a badly implemented autocomplete can really blow up your site.
I would like to know if there are any libraries out there which can do autocomplete efficiently (server-side) and which preferably fit into RAM (for best performance). So no browser-side JavaScript autocomplete (YUI/jQuery/Dojo); I think there are enough topics about that on Stack Overflow. But I could not find a good thread about this on Stack Overflow (maybe I did not look hard enough).
For example, autocompleting names:
names:[alfred, miathe, .., ..]
What I can think of:
A simple SQL LIKE, for example: SELECT name FROM users WHERE name LIKE 'al%'.
I think this implementation will blow up with a lot of simultaneous users or a large data set, but maybe I am wrong, so numbers (on what it can handle) would be cool.
Using something like the Solr terms component, for example: http://localhost:8983/solr/terms?terms.fl=name&terms.sort=index&terms.prefix=al&wt=json&omitHeader=true.
I don't know the performance of this, so users with big sites, please tell me.
Maybe something like an in-memory Redis trie, which I also haven't tested the performance of.
I also read in this thread about how to implement this in Java (Lucene and a library created by shilad).
What I would like to hear about are implementations used by real sites and numbers on how well they handle load, preferably with:
A link to the implementation or code.
Numbers to which you know it can scale.
It would be nice if it could be accessed via HTTP or sockets.
Many thanks,
Alfred
Optimising for Auto-complete
Unfortunately, the resolution of this issue will depend heavily on the data you are hoping to query.
LIKE queries will not put too much strain on your database, as long as you spend time using 'EXPLAIN' or the profiler to show you how the query optimiser plans to perform your query.
Some basics to keep in mind:
Indexes: Ensure that you have indexes set up (yes, in many cases LIKE does use indexes; there is an excellent article on the topic at myitforum: "SQL Performance - Indexes and the LIKE clause"). See the sketch after this list.
Joins: Ensure your JOINs are in place and are optimized by the query planner. SQL Server Profiler can help with this. Look out for full index or full table scans.
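As a sketch of that indexing point, assuming an index on the name column:

    -- With an index on name, a prefix pattern can be answered with an index
    -- range scan/seek:
    SELECT name FROM users WHERE name LIKE 'al%';

    -- A leading wildcard cannot seek the index and typically falls back to a scan:
    SELECT name FROM users WHERE name LIKE '%al%';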
Auto-complete sub-sets
Auto-complete queries are a special case, in that they usually work as ever-decreasing sub-sets.
'name' LIKE 'a%' (may return 10000 records)
'name' LIKE 'al%' (may return 500 records)
'name' LIKE 'ala%' (may return 75 records)
'name' LIKE 'alan%' (may return 20 records)
If you return the entire result set for the first query, then there is no need to hit the database again for the following result sets, as they are a subset of your original query.
Depending on your data, this may open a further opportunity for optimisation.
I will not fully comply with your requirements, and obviously the scaling numbers will depend on hardware, the size of the DB, the architecture of the app, and several other factors; you must test it yourself.
But I will tell you the method I've used with success:
Use a simple SQL LIKE, for example SELECT name FROM users WHERE name LIKE 'al%', but add TOP 100 to limit the number of results (see the sketch after this list).
Cache the results and maintain a list of the terms that are cached.
When a new request comes in, first check the list to see whether you have the term (or a prefix of the term) cached.
Keep in mind that your cached results are capped, so you may still need to run a SQL query for a longer term: if the last cached row still matches the new term, the cached list was probably truncated and there may be more matches in the database.
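A sketch of the query described in the first step, using the users/name example from the question (SQL Server syntax):

    -- Cap the result set so each cached entry stays small.
    SELECT TOP 100 name
    FROM users
    WHERE name LIKE 'al%'
    ORDER BY name;

    -- If this returns fewer than 100 rows, the cached list is complete and any
    -- longer prefix ('ala%', 'alan%', ...) can be answered by filtering the
    -- cached rows; if it returns exactly 100 rows, a longer prefix may need to
    -- go back to SQL.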
Hope it helps.
Using SQL versus Solr's terms component is really not a comparison. At their core they solve the problem the same way by making an index and then making simple calls to it.
What I would want to know is: what are you trying to auto-complete?
Ultimately, the easiest and most surefire way to scale a system is to make a simple solution and then scale it by replicating data. Trying to cache calls or predict results just makes things complicated and doesn't get to the root of the problem (i.e. you can only take them so far, for example if each request misses the cache).
Perhaps a little more info about how your data is structured and how you want to see it extracted would be helpful.