Hibernate criteria query with restrictions takes long execution time - spring-boot

In our application we are using a Criteria query with about 15 restrictions to filter some results.
The query looks like this:
SELECT column1, column2, column3, ... FROM table WHERE column1 = ?1 AND column2 = ?2 ...
Depending on the input, the list of AND clauses grows. I really can't find where the issue lies.
criteria.add(Restrictions...) is what we use to build this kind of dynamic query. We validate each input with a simple if clause testing input != null. All the inputs are strings. Also, we don't have indexes on every column; I know that without indexes a full table scan will be performed, but I want to know what is causing the long execution time for this query.
Any help is appreciated. Thanks.
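For reference, a minimal sketch of how such a dynamic filter is typically built with the JPA Criteria API. MyEntity, the column names, and findFiltered are illustrative placeholders, not from the original code; javax.persistence imports are shown, while newer Spring Boot versions use jakarta.persistence instead:

import java.util.ArrayList;
import java.util.List;
import javax.persistence.EntityManager;
import javax.persistence.Tuple;
import javax.persistence.criteria.CriteriaBuilder;
import javax.persistence.criteria.CriteriaQuery;
import javax.persistence.criteria.Predicate;
import javax.persistence.criteria.Root;

// Hypothetical sketch: one predicate per non-null input, so the WHERE clause grows with the input
List<Tuple> findFiltered(EntityManager em, String input1, String input2) {
    CriteriaBuilder cb = em.getCriteriaBuilder();
    CriteriaQuery<Tuple> query = cb.createTupleQuery();
    Root<MyEntity> root = query.from(MyEntity.class);  // MyEntity is a placeholder entity

    List<Predicate> predicates = new ArrayList<>();
    if (input1 != null) {
        predicates.add(cb.equal(root.get("column1"), input1));
    }
    if (input2 != null) {
        predicates.add(cb.equal(root.get("column2"), input2));
    }
    // ... one if-block per filter input, up to ~15

    query.multiselect(root.get("column1"), root.get("column2"), root.get("column3"))
         .where(predicates.toArray(new Predicate[0]));
    return em.createQuery(query).getResultList();
}

With this shape, each non-null input adds one equality predicate, which matches the growing AND clause described above; without indexes on the filtered columns, each added predicate is still evaluated during a full table scan.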

Related

How to optimize filtering: RethinkDB

Introduction
I have been working with RethinkDB using the Data Explorer tab. I am new to RethinkDB.
I have created this query to filter records by date. I need to optimize the query so that it takes less time on a large number of records.
r.db('test').table('usrz').filter(function(test) {
  return test("createdDate").date().during(
    r.time(2016, 12, 20, 'Z'),
    r.time(2016, 12, 30, 'Z'))
}).orderBy(r.desc('createdDate'))
Any help or reference will be appreciated. Thanks for your time.
RethinkDB queries can be optimized by using indexes (see https://www.rethinkdb.com/docs/secondary-indexes/javascript).
To create an index:
r.table('usrz').indexCreate('createdDate')
Your query can be converted to use that index by transforming the filter/during combination into a between and by adding an index argument to orderBy.
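For example, the converted query might look like this (a sketch, assuming the createdDate index created above):

r.db('test').table('usrz')
  .between(r.time(2016, 12, 20, 'Z'), r.time(2016, 12, 30, 'Z'),
           {index: 'createdDate'})
  .orderBy({index: r.desc('createdDate')})

The between selects the date range directly from the index instead of scanning every row, and the indexed orderBy avoids an in-memory sort.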

RethinkDB OrderBy Before Filter, Performance

The data table is the biggest table in my db. I would like to query the db and then order the results by the entries' timestamps. Common sense would be to filter first and then manipulate the data.
queryA = r.table('data').filter(filter).filter(r.row('timestamp').minutes().lt(5)).orderBy('timestamp')
But this is not possible, because the filter creates a side table, and the command would throw an error (https://github.com/rethinkdb/rethinkdb/issues/4656).
So I was wondering: if I put the orderBy first, would this hurt performance as the database gets huge over time?
queryB = r.table('data').orderBy('timestamp').filter(filter).filter(r.row('timestamp').minutes().lt(5))
Currently I sort after querying, but databases are usually quicker at these operations.
queryA.run (err, entries) ->
  ...
  entries = _.sortBy(entries, 'timestamp').reverse() # this takes ~2000 ms on my local machine
Question:
What is the best approach (performance-wise) to query these entries ordered by timestamp?
Edit:
The db is run with one shard.
Using an index is often the best way to improve performance.
For example, an index on the timestamp field can be created:
r.table('data').indexCreate('timestamp')
It can be used to sort documents:
r.table('data').orderBy({index: 'timestamp'})
Or to select a given range, for example the past hour:
r.table('data').between(r.now().sub(60*60), r.now(), {index: 'timestamp'})
The last two operations can be combined into one:
r.table('data').between(r.now().sub(60*60), r.maxval, {index: 'timestamp'}).orderBy({index: 'timestamp'})
Additional filters can also be added. A filter should always be placed after an indexed operation:
r.table('data').orderBy({index: 'timestamp'}).filter({colour: 'red'})
This restriction on filters is only for indexed operations. A regular orderBy can be placed after a filter:
r.table('data').filter({colour: 'red'}).orderBy('timestamp')
For more information, see the RethinkDB documentation: https://www.rethinkdb.com/docs/secondary-indexes/python/

Is querying by id faster than other queries in Solr (Lucene)?

Id is the primary key; another_field is some string field.
http://localhost:8983/solr/select?q=id:c2c32773-1691-11df-97a5-7038c432aabf
http://localhost:8983/solr/select?q=another_field:c2c32773-1691-11df-97a5-7038c432aabf
Is the first query faster?
Querying on id is not faster than any other query, unless the fields are text fields, which can take a bit longer depending on the analysis performed.
Also, if you want to query on id or other fixed-value fields, it is better to use a filter query, which can be much faster because of the filter cache.
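For example, following the URL style above (the id value is the same illustrative one):

http://localhost:8983/solr/select?q=*:*&fq=id:c2c32773-1691-11df-97a5-7038c432aabf

The fq clause restricts the result set without affecting scoring, and its result is cached, so repeated filters on the same value are served from the filter cache.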

Efficient way of sorting items by different parameters?

Suppose you have millions of items (say, search results) and different parameters for sorting these items (as on e-commerce sites). We will be showing the items using pagination.
Let us say they can be sorted by date, popularity, and relevance, and the results are paginated. How would you implement this functionality? Generally I would create different compare functions for the parameters and get results accordingly.
Is there any other efficient way to get this kind of functionality, instead of sorting the search results every time? Also, do we generally run the SQL query every time with the relevant order parameter, or should we sort the search results of the previous query to save re-searching time?
"If there any other efficient way to have this kind of functionality instead of sorting the search results every time?"
I would say you do not need sort every time but execute SQL query with appropriate OrderBy parameter, paginate it and show to the user
"Also, do we generally run sql query every time using relevant order parameter or should we sort the search result of previous query to save us from re-searching time?"
For sure you need to generate a new SQL query, as the first page data based on a new order parameter can contain completely different set of data from previouse one.
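A minimal sketch of such a paginated query (table and column names are illustrative):

-- page 3 with 20 items per page, sorted by popularity
SELECT id, title, popularity, created_at
FROM items
ORDER BY popularity DESC, id  -- secondary sort key keeps page boundaries stable
LIMIT 20 OFFSET 40;

Switching the sort parameter only changes the ORDER BY clause; the database can use an index on the sort column to avoid re-sorting the whole result set.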

SQLite - how to return rows containing a text field that contains one or more strings?

I need to query a table in an SQLite database to return all the rows in a table that match a given set of words.
To be more precise: I have a database with ~80,000 records in it. One of the fields is a text field with around 100-200 words per record. What I want to be able to do is take a list of 200 single word keywords {"apple", "orange", "pear", ... } and retrieve a set of all the records in the table that contain at least one of the keyword terms in the description column.
The immediately obvious way to do this is with something like this:
SELECT stuff FROM table
WHERE (description LIKE '% apple %') OR (description LIKE '% orange %') OR ...
If I have 200 terms, I end up with a big and nasty looking SQL statement that seems to me to be clumsy, smacks of bad practice, and not surprisingly takes a long time to process - more than a second per 1000 records.
This answer, Better performance for SQLite Select Statement, seemed close to what I need, and as a result I created an index, but according to http://www.sqlite.org/optoverview.html SQLite doesn't apply any optimisations if the LIKE operator is used with a leading % wildcard.
Not being an SQL expert, I am assuming I'm doing this the dumb way. I was wondering if someone with more experience could suggest a more sensible and perhaps more efficient way of doing this?
Alternatively, is there a better approach I could use to the problem?
Using SQLite's full-text search would be faster than a LIKE '%...%' query. I don't think any database can use an index for a pattern beginning with %, since if the database doesn't know what the value starts with, it can't use the index to look it up.
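A minimal sketch of the full-text approach (the docs table and its columns are illustrative names; FTS4 syntax shown, which must be enabled in your SQLite build):

-- Build a full-text index over the description column
CREATE VIRTUAL TABLE docs_fts USING fts4(description);
INSERT INTO docs_fts(rowid, description)
  SELECT id, description FROM docs;

-- Match any of the keywords (OR must be uppercase in the FTS query syntax)
SELECT rowid FROM docs_fts
WHERE docs_fts MATCH 'apple OR orange OR pear';

The MATCH query consults the inverted index instead of scanning each description, so it stays fast even with 200 keyword terms.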
An alternative approach is to put the keywords in a separate table instead, and to make an intermediate table that records which row in your main table has which keywords. If you index all the relevant columns that way, it can be queried very quickly, as in the sketch below.
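A sketch of that normalized layout (all names are illustrative):

CREATE TABLE keywords (
  id   INTEGER PRIMARY KEY,
  word TEXT UNIQUE
);
CREATE TABLE doc_keywords (  -- which document contains which keyword
  doc_id     INTEGER REFERENCES docs(id),
  keyword_id INTEGER REFERENCES keywords(id)
);
CREATE INDEX idx_doc_keywords_kw ON doc_keywords(keyword_id);

-- All documents containing at least one of the given keywords
SELECT DISTINCT doc_id
FROM doc_keywords
JOIN keywords ON keywords.id = doc_keywords.keyword_id
WHERE keywords.word IN ('apple', 'orange', 'pear');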
Sounds like you might want to have a look at Full Text Search. It was contributed to SQLite by someone from Google. The description:
allows the user to efficiently query the database for all rows that contain one or more words (hereafter "tokens"), even if the table contains many large documents.
This is the same problem as full-text search, right? In which case, you need some help from the DB to construct indexes into these fields if you want to do this efficiently. A quick search for SQLite full text search yields this page.
The solution you correctly identify as clumsy will do up to 200 pattern matches per document in the worst case (i.e. when a document doesn't match), where each match has to traverse the entire field. Using the index approach means your search speed will be independent of the size of each document.
