Can I use LIMIT to speed up a SPARQL query?

I have a large number of results from a query that users can refine by typing a search term.
However, when there are many, many results, I don't need to show all of them.
I notice, though, that when I use LIMIT in my SPARQL query, it takes just as long to run. Is there a way to use LIMIT in an "interrupt" fashion, so the engine stops early and shortens the processing time?
Thank you.

No. The implementation of LIMIT, like any other part of the query, is up to the underlying query engine.
Some query engines may implement LIMIT in such a way that it performs faster than retrieving all the results, but this doesn't necessarily apply to every query (nor to every query engine).
Depending on the framework you use to make queries and process results, you may be able to consume only the portion of the results you care about, but that likely doesn't solve your problem: the engine may still compute the full result set before you see the first row.
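If it helps, here is a minimal sketch in Python using the SPARQLWrapper library (the endpoint URL, prefixes, and query are placeholders, not from your question). Keeping LIMIT in the query at least caps how much data crosses the wire, even when it doesn't cap the engine's work:

    from SPARQLWrapper import SPARQLWrapper, JSON

    # Hypothetical endpoint URL; substitute your own.
    sparql = SPARQLWrapper("http://example.org/sparql")

    # LIMIT stays in the query: even if the engine materializes every
    # result internally, only the first 100 rows come over the wire.
    sparql.setQuery("""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?item ?label WHERE {
            ?item rdfs:label ?label .
            FILTER(CONTAINS(LCASE(STR(?label)), "search term"))
        }
        LIMIT 100
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()

    for binding in results["results"]["bindings"]:
        print(binding["item"]["value"], binding["label"]["value"])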

Related

Parse Platform / MongoDB: Are aggregate queries more efficient than normal queries?

Is using query.aggregate(pipeline) in MongoDB more efficient than using normal queries such as query.equalTo or query.greaterThan?
Aggregate queries definitely require much less code, but that alone doesn't seem to justify the complexity they bring with all the additional parentheses and abbreviations.
Normal queries seem more straightforward, but are they inferior in performance? What is a good use case for aggregate queries vs normal ones?
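For what it's worth, the same distinction exists one level down in MongoDB itself. A rough pymongo sketch (the connection string, collection, and field names are made up for illustration):

    from pymongo import MongoClient

    client = MongoClient("mongodb://localhost:27017")  # hypothetical connection
    coll = client["shop"]["products"]                  # hypothetical collection

    # "Normal" query: a plain filter, the equivalent of equalTo/greaterThan.
    cheap = list(coll.find({"price": {"$lt": 10}}))

    # Aggregate query: a pipeline, which earns its keep once you need to
    # group, reshape, or compute server-side, not merely filter.
    per_category = list(coll.aggregate([
        {"$match": {"price": {"$lt": 10}}},
        {"$group": {"_id": "$category", "count": {"$sum": 1}}},
    ]))

For a filter-only query, both typically resolve to the same index scan; the pipeline's advantage appears when later stages would otherwise force you to post-process results on the client.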

Strategies to compare performance of two Elasticsearch queries?

Since actual query runtime varies, it's not always useful to just check the runtime of two queries to determine which is generally faster. What are some ways to generally test whether one query is more efficient than another?
As an example of what I'm after, in MongoDB I can run explain on a query to get the number of documents iterated vs. returned. If the documents iterated is several orders of magnitude higher than what it's actually returning, I know I have an inefficient query. I know that since Elasticsearch indexes data much differently than other dbs, this may not translate well, but I'm wondering if there's some rough equivalent.
I'm looking at the Profile API, which looks like a good starting place. Are fields like next_doc and next_doc_count what I'm after? Are there any others I should look for? Thanks!
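In case a concrete example helps, profiling is opt-in per request. A minimal sketch with the official Python client (the endpoint, index, and query are placeholders); next_doc and next_doc_count show up in each query node's breakdown:

    from elasticsearch import Elasticsearch

    es = Elasticsearch("http://localhost:9200")  # hypothetical endpoint

    resp = es.search(
        index="my-index",  # hypothetical index
        body={
            "profile": True,  # ask each shard for a timing breakdown
            "query": {"match": {"title": "chocolate"}},
        },
    )

    # next_doc is time spent advancing to the next matching document;
    # next_doc_count is how many times that happened.
    for shard in resp["profile"]["shards"]:
        for query in shard["searches"][0]["query"]:
            breakdown = query["breakdown"]
            print(query["type"], query["time_in_nanos"],
                  breakdown["next_doc"], breakdown["next_doc_count"])

A query whose next_doc_count dwarfs the number of hits it contributes is doing work comparable to the "iterated vs. returned" gap you describe in MongoDB.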

MongoDB text index search slow for common words in large table

I am hosting a MongoDB database for a service that supports full-text searching on a collection with 6.8 million records.
Its text index includes ten fields with varying weights.
Most searches take less than a second. Some take two to three seconds. However, some searches take 15-60 seconds! Those cases are unacceptable for my application, and I need to find a way to speed them up.
Searching takes 15-60 seconds when words that are very common in the index are used in the search query.
It seems that the text search feature does not support lazy evaluation of search terms. My first thought was to cache a list of the 50 most common words in my text index and then ask MongoDB to evaluate those last (lazily), on top of the filtered results returned by the less common terms. Hopefully people are still with me. For example, say I have the query "products chocolate", where "products" is common and "chocolate" is uncommon. I would like to be able to ask MongoDB to evaluate "chocolate" first, and then filter those results with the "products" term. Does anyone know of a way to achieve this?
I can achieve the above scenario by omitting the most common words (i.e. "products") from the db query and then reapplying the common-term filter on the application side after it has received the records found by the db. It is preferable for all query logic to happen on the database, but I am open to application-side processing for a speed payoff.
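To make that concrete, here is a rough sketch of what I mean in Python with pymongo (the collection, field names, and common-word list are all hypothetical, and the tokenization is deliberately naive):

    import re
    from pymongo import MongoClient

    client = MongoClient()               # hypothetical local instance
    coll = client["catalog"]["records"]  # hypothetical collection with a text index

    COMMON_WORDS = {"products", "item", "new"}  # hypothetical cached top-50 list

    def search(raw_query):
        terms = raw_query.split()
        rare = [t for t in terms if t.lower() not in COMMON_WORDS]
        common = [t for t in terms if t.lower() in COMMON_WORDS]

        # Hit the text index with only the selective terms; if every term
        # is common, there is no choice but to send them all.
        cursor = coll.find({"$text": {"$search": " ".join(rare or terms)}})

        # Reapply the common terms as an application-side filter.
        if rare and common:
            patterns = [re.compile(re.escape(t), re.IGNORECASE) for t in common]
            return [doc for doc in cursor
                    if all(p.search(str(doc)) for p in patterns)]
        return list(cursor)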
There are still some holes in this design. If a user searches only common terms, I have no choice but to hit the database with all of them. From preliminary reading, I gather that it is not recommended (or not supported) to have multiple text indexes (with different names) on the same collection. My plan is to create two identical tables, each with my 6.8M records, but with different indexes - one for common words and one for uncommon words. This feels kludgy and clunky, but I am willing to do it for a speed increase.
Does anyone have any insight and/or advice on how to speed up this system? I'd like as much processing as possible to happen on the database to keep it fast. I'm sure my little 6.8M record table is not the largest that MongoDB has seen. Thanks!
Well, I worked around these performance issues by letting MongoDB full-text search run in OR-based mode. I'm prioritizing my results by fine-tuning the weights on my indexed fields and simply ordering by rank. I do get more results than desired, but that's not a huge problem, because the weighted results that appear at the top will most likely be consumed before a user reaches the less relevant results at the bottom.
If anyone is struggling with MongoDB text search performance when using AND-only searching, just switch back to OR and control your results using weights. It performs leaps and bounds better.
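In pymongo terms, that OR behaviour is simply the default for space-separated terms; sorting by text score keeps the heavily weighted matches on top (the collection and search string are hypothetical):

    from pymongo import MongoClient

    coll = MongoClient()["catalog"]["records"]  # hypothetical collection

    # Space-separated terms are OR'ed by default; sort by relevance so
    # the heavily weighted matches surface first.
    results = coll.find(
        {"$text": {"$search": "products chocolate"}},
        {"score": {"$meta": "textScore"}},
    ).sort([("score", {"$meta": "textScore"})])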
hth
This is the exact same issue as $all versus $in. $all only uses the index for the first keyword in the array. I believe you're seeing the same thing here, which is why the OR (i.e. $in-style) search works for you.
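A hypothetical illustration of that difference (the array field and values are made up):

    from pymongo import MongoClient

    coll = MongoClient()["catalog"]["records"]  # hypothetical collection

    # $all must match every element; per the above, only the first
    # keyword can use the index, so a common first term is expensive.
    coll.find({"tags": {"$all": ["products", "chocolate"]}})

    # $in matches any element - the same OR semantics that proved faster.
    coll.find({"tags": {"$in": ["products", "chocolate"]}})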

What are the deciding factors for the order of Tables when joining amongst them?

I know that when joining across multiple tables, performance is dependent upon the order in which they are joined. What factors should I consider when joining tables?
Most modern RDBMSs optimize the query based on which tables are joined, the indexes available, table statistics, etc. They rarely, if ever, produce a different final execution plan based on the order of the joins in the query.
SQL is designed to be declarative; you specify what you want, not (in most cases) how to get it. While there are things like index hints that can allow you to direct the optimizer to use or avoid specific indexes, by and large you can leave that work to the engine and be about the business of writing your queries.
In the end, running different versions of your queries within SQL Server Management Studio and viewing the actual execution plans is the only way to tell if order can truly make a difference.
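For instance, here is a rough way to automate that comparison from Python with pyodbc (the connection string, tables, and columns are all placeholders), using SQL Server's SET SHOWPLAN_XML to fetch the estimated plan without executing the query:

    import pyodbc

    # Hypothetical connection string; adjust for your environment.
    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 18 for SQL Server};"
        "SERVER=localhost;DATABASE=shop;Trusted_Connection=yes;"
    )
    cursor = conn.cursor()

    def estimated_plan(sql):
        """Return the estimated execution plan XML without running the query."""
        cursor.execute("SET SHOWPLAN_XML ON")
        try:
            cursor.execute(sql)
            return cursor.fetchone()[0]
        finally:
            cursor.execute("SET SHOWPLAN_XML OFF")

    # The same join written in two different orders (placeholder schema).
    q1 = """SELECT o.id FROM orders o
            JOIN customers c ON c.id = o.customer_id
            JOIN line_items i ON i.order_id = o.id"""
    q2 = """SELECT o.id FROM line_items i
            JOIN orders o ON o.id = i.order_id
            JOIN customers c ON c.id = o.customer_id"""

    # If the optimizer ignored the written order, the plans come back identical.
    print(estimated_plan(q1) == estimated_plan(q2))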
As far as I know, the join order has no effect on query performance. The query engine parses the query and executes it in the way it believes is most efficient. If you want, try writing the query with different join orders and look at the execution plans - they should be the same.
See this article: http://sql-4-life.blogspot.com/2009/03/order-of-inner-joins.html

SSAS aggregation not being used

So I have a fairly hefty cube that won't be much good without aggregations. I'm still in the dev phase, so I'm manually attempting usage-based aggregation design, aggregating some of the main queries that we've designed. However, every time I run these, it looks like the engine reads through every partition it hits (the biggest measure groups are partitioned monthly).
I decided I'd try to narrow it down. After all, may just be the queries, or a blip, or what have you. So, using SQL Server Profiler and BIDS Helper, I created one and only one aggregation on one of my measure groups. I then ran said query and looked at the profiler, and it again hit every single partition, and didn't grab a thing out of an aggregation.
My only guess is that this is due to the fact that the measure being pulled back has a measure expression (currency conversion). Anybody got any ideas?
As pointed out in the Identifying Bottlenecks whitepaper, measure expressions invalidate aggregations. Once I removed all measure expressions from the measure group, the aggregations were again in use. Hoorah!
