Jest Client Size Parameter Ignored - elasticsearch

I am using the Jest Client to query Elasticsearch from my Java program. Everything works correctly except that when I add the "size" parameter it is ignored. Building and execution of the Search is like so:
Search search = new Search.Builder(query)
.setSearchType(SearchType.QUERY_THEN_FETCH)
.addIndex(index)
.addType(type)
.setParameter(Parameters.SIZE, 1)
.build();
jestClient.execute(search);
This query always returns 10 results, instead of the 1 result expected. In case it is relevant, there are only 5 shards, so it is not returning a result per shard.
Is there any particular reason this parameter is ignored? When running the same query with the same parameters on the command line with 'curl -XGET' or when simply putting it in the browser the query runs correctly and the size parameter is taken in to account.

It turns out I had a misunderstanding of how the size parameter was working. I believe Jest was handling it correctly when sizing the query results, but what I was actually interested in were the aggregation results.
Adding a "size" parameter to my aggregation worked, as can be seen in the Terms Aggregation docs (copied below).
{
"aggs" : {
"products" : {
"terms" : {
"field" : "product",
"size" : 5
}
}
}
}

Related

What is the difference between `constant_score + filter` and `term` query?

I have two queries in Elasticsearch:
{
"term" : {
"price" : 20
}
}
and
"constant_score" : {
"filter" : {
"term" : {
"price" : 20
}
}
}
They are returning the same query result. I wonder what the main difference between them. I read some articles about scoring document. And I believe both queries are scoring document. The constant_score will use default score 1.0 to match the document's score. So I don't see much difference between these two.
The results would be exactly the exact.
However, the biggest difference is that the constant_score/filter version will cache the results of the term query since it's run in a filter context. All future executions will leverage that cache. Also, one feature of the constant_score query is that the returned score is always equal to the given boost value (which defaults to 1)
The first query will be run outside of the filter context and hence not benefit from the filter cache.

Using term or terms with one value in Elasticsearch queries

I am querying an Elasticsearch index using the values of a field. Sometimes, I have to extract all the documents having a field set to exactly one value; Some other times I have to retrieve all the documents having a field, set with one of the values in a list of values.
The latter use case contains the former. Can I use a single query using the terms construct?
POST /_search
{
"query": {
"terms" : { "user" : ["kimchy", "elasticsearch"]}
}
}
Or, in cases I know I need to search only for a unique value, it is better to use the term construct?
POST _search
{
"query": {
"term" : { "user" : "kimchy" }
}
}
Which approach is better regarding performance? Does Elasticsearch perform any optimization if the value in the terms construct is unique?
Thanks to all.
See this link. Terms query is automatically cached while term query is not . So, the next you run the same query, the took time for query for execution will be faster. So if you have a case where you need to run the same query again and again, terms query is a good choice. If not, there is not much of difference between the two.

Elasticsearch error when same search query

Now i am working with Elastic search.But when i run same query for search,i can not get same results.
First query:"GET /mega/employee/_search?q=last_name:Smith"
Result:I only get 2 results that "last_name==Smith"
Second query:
"GET /mega/employee/_search
{
"query" : {
"match" : {
"last_name" : "Smith"
}
}
}"
I get 3 results,with last result:last_name==Fer
Someones can explain for me?
You probably need to POST to _search. GET with a body is ill-supported (and, I'd argue, meaningless according to the spec), and is rather abused by Elasticsearch. Your request is almost certainly being interpreted at a GET to your _search endpoint with no query attached.

How to view the response for multiple indices for a single query

I have created multiple indices in elasticsearch and have passed a single query to all of them. Is there any way to know,how many results came from each index?
Here is the screenshot of my elasticsearch head,showing a single aggregation applied to two indices
screenshot:
Here as in the figure you can see I have done an aggregation named "posted_time" on the indices foodfind and comics (red box 1).
But in the response window,to the right,only the results for the index "comics" is shown. How can I see the results for the other index too?
You can use terms aggregation on the field _index for this.
Lets say you need to run the same on index-a , index-b and index-c.
You need to make the request in this pattern -
curl -XPOST 'http://localhost:9200/index-a,index-b,index-c/_search' -d '{
"aggs" : {
"indexStats" : {
"terms" : {
"field" : "_index"
}
}
}
}'

How to enable fuzziness for phrase queries in ElasticSearch

We're using ElasticSearch for searching through millions of tags. Our users should be able to include boolean operators (+, -, "xy", AND, OR, brackets). If no hits are returned, we fall back to a spelling suggestion provided by ES and search again. That's our query:
$ curl -XGET 'http://127.0.0.1:9200/my_index/my_type/_search' -d '
{
"query" : {
"query_string" : {
"query" : "some test query +bools -included",
"default_operator" : "AND"
}
},
"suggest" : {
"text" : "some test query +bools -included",
"simple_phrase" : {
"phrase" : {
"field" : "my_tags_field",
"size" : 1
}
}
}
}
Instead of only providing a fallback to spelling suggestions, we'd like to enable fuzzy matching. If, for example, a user searches for "stackoverfolw", ES should return matches for "stackoverflow".
Additional question: What's the better performing method for "correcting" spelling errors? As it is now, we have to perform two subsequent requests, first with the original search term, then with the by ES suggested term.
The query_string does support some fuzziness but only when using the ~ operator, which I think doesn't your usecase. I would add a fuzzy query then and put it in or with the existing query_string. For instance you can use a bool query and add the fuzzy query as a should clause, keeping the original query_string as a must clause.
As for your additional question about how to correct spelling mistakes: I would use fuzzy queries to automatically correct them and two subsequent requests if you want the user to select the right correction from a list (e.g. Did you mean), but your approach sounds good too.

Resources