How do I get all the rows returned from Solr instead of only the default 10?
You can define how many rows you want (see Pagination in SolrNet), but you can't get all documents. Solr is not a database, and it rarely makes sense to fetch every document from it; if you feel you need to, you may be using the wrong tool for the job.
This is also explained in detail in the Solr FAQ.
As per the Solr wiki, regarding the number of rows a query returns:
The default value is "10", which is used if the parameter is not specified. If you want to tell Solr to return all possible results from the query without an upper bound, specify rows to be 10000000 or some other ridiculously large value that is higher than the possible number of rows that are expected.
Refer to https://wiki.apache.org/solr/CommonQueryParameters
You can set rows=x in the query URL, where x is the desired number of documents.
You can also fetch the documents in pages of 10 by looping over the found docs, increasing the start value on each request while leaving rows=10, as in the sketch below.
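As a rough sketch of that loop in Python, assuming a Solr core reachable at http://localhost:8983/solr/mycore (hypothetical host and core name):

import requests

SOLR_URL = "http://localhost:8983/solr/mycore/select"  # hypothetical core
PAGE_SIZE = 10

def fetch_all(query="*:*"):
    start = 0
    docs = []
    while True:
        resp = requests.get(SOLR_URL, params={
            "q": query,
            "wt": "json",
            "rows": PAGE_SIZE,
            "start": start,
        })
        resp.raise_for_status()
        body = resp.json()["response"]
        docs.extend(body["docs"])
        start += PAGE_SIZE
        if start >= body["numFound"]:  # no more pages left
            break
    return docs

print(len(fetch_all()))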
Technically it is possible to get all results from a Solr search. All you need to do is specify the limit (rows) as -1.
Related
I have a requirement where I need to page through a big result set. Now, I understand why primitive paging is not a good idea here: if I have to fetch, say, the 10th page, Elasticsearch would have to load all 10 pages into memory, sort them, and then aggregate to give the results, which is not ideal.
However, when using search_after, we provide the last sort value from the previous page and basically tell Elasticsearch: "Give me results which are beyond this field." My question is: how is this performant? As I understand it, it is the same as adding a new filter to your original query which says that the next set of results should have a sort value greater than the last page's. Is that all there is to it, or is there something more that I am missing?
To make it short: normal pagination with from/size always needs to collect and keep track of all hits up to from + size, counting from the very first result (i.e. the work increases with each new page).
Whereas with search_after that is not necessary, since the amount of data to keep track of is only as big as the size parameter (i.e. constant with each page). A minimal sketch of the loop follows.
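Here is a minimal sketch, assuming Elasticsearch at http://localhost:9200, an index named my-index, and a timestamp field plus a unique id keyword field for a stable total sort order (all hypothetical names):

import requests

ES = "http://localhost:9200/my-index/_search"  # hypothetical index

def pages(size=100):
    search_after = None
    while True:
        body = {
            "size": size,
            "query": {"match_all": {}},
            # The unique "id" field is a tiebreaker that keeps the
            # sort order total and stable across requests.
            "sort": [{"timestamp": "asc"}, {"id": "asc"}],
        }
        if search_after:
            body["search_after"] = search_after
        hits = requests.post(ES, json=body).json()["hits"]["hits"]
        if not hits:
            break
        yield hits
        # Feed the sort values of the last hit into the next request.
        search_after = hits[-1]["sort"]

for page in pages():
    print(len(page))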
If you want to dig into more details, I suggest you have a look at the following tickets:
#4940: Improve scroll search by using Lucene's IndexSearcher#searchAfter
#8192: Search: Expose Lucene's searchAfter in the search API
#16125: Add search_after parameter in the SearchAPI
As part of decommissioning a Kibana server, I want to get a list of indices that never had a single document, and a list of those that did.
How can I achieve this using Kibana only?
I tried this, but it doesn't give the list based on document count:
GET /_cat/indices
Also, getting the count for each index individually to check whether documents are there is time-consuming:
GET index-pattern*/_count
You can try this; v is for verbose output and s stands for sort:
GET /_cat/indices?v&s=store.size:desc
From the docs:
These metrics are retrieved directly from Lucene, which Elasticsearch uses internally to power indexing and search. As a result, all document counts include hidden nested documents.
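Outside of Kibana, the same check can be scripted against the REST API by post-processing the JSON output of _cat/indices. A minimal Python sketch, assuming Elasticsearch at http://localhost:9200 (hypothetical host); note that docs.count only reflects the current count, so an index that once had documents and was later emptied will also show 0:

import requests

resp = requests.get(
    "http://localhost:9200/_cat/indices",
    params={"format": "json", "h": "index,docs.count"},
)
resp.raise_for_status()

empty, non_empty = [], []
for row in resp.json():
    # docs.count comes back as a string; it can be null for closed indices.
    count = int(row["docs.count"] or 0)
    (empty if count == 0 else non_empty).append(row["index"])

print("no documents:", empty)
print("has documents:", non_empty)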
In Elasticsearch 7.9, I have an index with 1 shard and 1 replica. I use a simple datetime filter to get docs between a start time and an end time, but I often get the same result set in a different order. I do not want to use a sort clause and compute scores; I just want to get the results in the same order every time.
Is there any way to do this without using sort?
It may be happening due to the fact that you have 1 replica for your index; the replica can differ from the primary at the segment level (e.g. deleted documents not yet merged away), so equally-matching docs can come back in a different order depending on which copy serves the request. You can use the preference param to make sure your search results are always returned from the same shard copy.
Refer to the Elasticsearch blog post on the bouncing results issue for more info.
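As a minimal sketch, assuming Elasticsearch at http://localhost:9200, an index my-index, and a timestamp field (all hypothetical names), pinning the request with preference looks like this:

import requests

body = {
    "query": {
        "range": {
            "timestamp": {  # hypothetical datetime field
                "gte": "2021-01-01T00:00:00",
                "lte": "2021-01-02T00:00:00",
            }
        }
    }
}

# Any stable string works as the preference value;
# a per-user or per-session id is typical.
resp = requests.post(
    "http://localhost:9200/my-index/_search",
    params={"preference": "my-session-id"},
    json=body,
)
print([hit["_id"] for hit in resp.json()["hits"]["hits"]])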
I need to send a large bunch of ids in a terms query. I tried with approximately 2000 GUIDs, but I found that the data was not being posted to Elasticsearch; the JSON array was empty. Is there a limit on the number of values in a terms query? And is there any config setting that can increase the maximum query length for a terms query?
I first tried to find out whether it is the json_encode function that cannot encode such a large array, but that is not the case, so the second thing that came to mind is whether the Elasticsearch terms query supports this or not.
Any help or guidance will be highly appreciated.
If you are using a bool filter or query, it looks like there is a limit of 1024 clauses. See this:
https://groups.google.com/forum/#!topic/elasticsearch/LqywKHKWbeI
Based on that same link, it also appears that you can raise the limit via an option in your elasticsearch.yml (the indices.query.bool.max_clause_count setting).
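Rather than raising the limit, a common workaround is to split the id list into chunks and issue one terms query per chunk. A rough sketch, assuming Elasticsearch at http://localhost:9200, an index my-index, and an id keyword field (all hypothetical names):

import requests

ES = "http://localhost:9200/my-index/_search"  # hypothetical index
MAX_TERMS = 1024  # stay under the default clause limit

def search_by_ids(ids):
    hits = []
    for i in range(0, len(ids), MAX_TERMS):
        chunk = ids[i:i + MAX_TERMS]
        body = {"size": len(chunk), "query": {"terms": {"id": chunk}}}
        resp = requests.post(ES, json=body)
        resp.raise_for_status()
        hits.extend(resp.json()["hits"]["hits"])
    return hits

print(len(search_by_ids([f"guid-{n}" for n in range(2000)])))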
When I apply the ToFacets("facets/CameraFacets") extension to the IQueryable that comes from my query, I find that the count of one of the IEnumerable collections against a facet in the dictionary is 1024. I know for sure there are more, so how do I retrieve them? Will increasing the safe limit automatically give me all the values, and is there another way of doing this without having to increase that limit?
Yes, if you change the safe limit it will pull in more facets; take a look at HandleTermsFacet(..) in the code.
However, I wouldn't recommend it. It's a performance issue, because 1024 facets means you are doing 1024 separate queries.
If you need to deal with this many facets, you are better off using a Map/Reduce index; also see this blog post.