How to create an Elasticsearch query for an OR condition - elasticsearch

I want to create an Elasticsearch query in Java.
The query is:
Select * from table_name where (col1==1 && col2==2) || (col3==3 && col4==4);

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
A combination of "should" and "must" in an Elasticsearch bool query will get you what you are looking for.
must
The clause (query) must appear in matching documents and will contribute to the score.
filter
The clause (query) must appear in matching documents. However unlike must the score of the query will be ignored. Filter clauses are executed in filter context, meaning that scoring is ignored and clauses are considered for caching.
should
The clause (query) should appear in the matching document.
must_not
The clause (query) must not appear in the matching documents. Clauses are executed in filter context meaning that scoring is ignored and clauses are considered for caching. Because scoring is ignored, a score of 0 for all documents is returned.
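As a minimal sketch of that combination, the snippet below builds the request body as a plain Python dict. It assumes the fields col1..col4 from the question hold exact values suited to term queries (analyzed text fields would likely need match instead). The helper name and_group is hypothetical. Each AND group becomes a nested bool with must, and the two groups sit in an outer should.

```python
import json


def and_group(conditions):
    """Build a bool query in which every field/value pair must match (logical AND)."""
    return {
        "bool": {
            "must": [{"term": {field: value}} for field, value in conditions.items()]
        }
    }


# (col1 == 1 AND col2 == 2) OR (col3 == 3 AND col4 == 4)
or_query = {
    "query": {
        "bool": {
            "should": [
                and_group({"col1": 1, "col2": 2}),
                and_group({"col3": 3, "col4": 4}),
            ],
            # Stated explicitly so that at least one of the two groups has to match.
            "minimum_should_match": 1,
        }
    }
}

print(json.dumps(or_query, indent=2))
```

The same nesting should carry over to a Java bool query builder by passing one inner bool query per group to should.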

You can use Elasticsearch SQL if you want to query in SQL style and get the results back as JSON objects.
Example of the request:
POST _sql?format=json
{
"query": """select "host.name", avg("system.fsstat.total_size.used")/avg(system.fsstat.total_size.total) * 100 as used_percent from "metricbeat*" where "@timestamp" >= NOW() - INTERVAL 60 MINUTE group by host.name"""
}
Link: https://www.elastic.co/what-is/elasticsearch-sql

Related

How to prevent slow match / match_phrase queries for keywords in Kibana?

How can I achieve that a match query for certain fields is equivalent to a term query?
I have a larger index in Elastic covering events. Each event has an eventid field consisting of a random hex string (e.g. f4fc38c993c1a8273f9c40eedc9050b7) as well as some other fields. The eventid is indexed as keyword in Elastic.
If I query based on this field in Kibana, the query often runs into timeouts, because Kibana automatically generates a match query for eventid:f4fc38c993c1a8273f9c40eedc9050b7.
If I set a manual filter using { "query": { "term": { "eventid": "f4fc38c993c1a8273f9c40eedc9050b7" } } } (so a term instead of match query) I get a response quite quickly.
From my understanding, these should be pretty much equivalent, as keyword fields aren't analyzed, so the match query should be equivalent to a term query.
What am I missing?

The boolean fuzzy query in elasticsearch is not returning expected result

I am trying to build a fuzzy bool query on first and last names in elasticsearch 7.2.0. I have a document with "asim" and "banskota" as first and last name respectively. But when I query with "asi" or "asimmm" and the exact last name, elasticsearch returns no result. However, when queried with exact first name or "asimm", it returns me the intended result from the document.
I also wrote a "fuzzy" query instead of "match". I experimented with different fuzziness parameters, but the outcome is the same. Both first and last names are analyzed, and I queried the _analyze API to see how it analyzes 'asim': with the standard analyzer, the document is indexed with 'asim' as a single token.
EDIT: It turns out that the fuzzy query works with 'Substitution' case, for example, it returns the result for 'asim' when queried with 'asmi' but not for deletion. It is surprising to me as the edit distance in the substitution is greater than in the deletion case. When the string length is greater, for instance with the last name 'Banskota', fuzzy matching works for 'deletion' case as well. What should I do to make the fuzzy search work in 'deletion' case with string length of 4 or 5?
fuzzy_body = {
    "size": 10,
    "query": {
        "bool": {
            "must": [
                {"match": {"FIRST_NAME_N": {"query": "asi", "fuzziness": "AUTO"}}},
                {"fuzzy": {"LAST_NAME_N": "banskota"}},
            ]
        }
    },
}
It turns out that if the name fields are indexed as keyword type, the query returns the expected results with "AUTO" fuzziness.
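As background for the "AUTO" setting used above: the Elasticsearch documentation describes AUTO as deriving the allowed edit distance from the length of the query term, with default thresholds of 3 and 6. The sketch below only restates that documented rule in Python; the function name auto_fuzziness is hypothetical and this is not Elasticsearch's own code. Other parameters such as prefix_length and max_expansions also affect which terms are returned, so the edit distance alone may not explain a missing hit.

```python
def auto_fuzziness(term, low=3, high=6):
    """Maximum edit distance allowed for a term under AUTO:low,high (defaults 3 and 6)."""
    length = len(term)
    if length < low:
        return 0  # very short terms must match exactly
    if length < high:
        return 1  # one edit allowed
    return 2      # two edits allowed


for term in ("as", "asi", "asim", "asimm", "asimmm", "banskota"):
    print(term, auto_fuzziness(term))
```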

How can I find the true score from Elasticsearch query string with a wildcard?

My ElasticSearch 2.x NEST query string search contains a wildcard:
Using NEST in C#:
var results = _client.Search<IEntity>(s => s
    .Index(Indices.AllIndices)
    .AllTypes()
    .Query(qs => qs
        .QueryString(qsq => qsq.Query("Micro*")))
    .From(pageNumber)
    .Size(pageSize));
Comes up with something like this:
$ curl -XGET 'http://localhost:9200/_all/_search?q=Micro*'
This code was derived from the Elasticsearch page on covariant search results. The results are covariant; they are of mixed types coming from multiple indices. The problem I am having is that all of the hits come back with a score of 1.
This is regardless of type or boosting. Can I boost by type or, alternatively, is there a way to reveal or "explain" the search result so I can order by score?
Multi-term queries such as the wildcard query are given a constant score equal to the boost by default. You can change this behaviour using .Rewrite().
var results = client.Search<IEntity>(s => s
    .Index(Indices.AllIndices)
    .AllTypes()
    .Query(qs => qs
        .QueryString(qsq => qsq
            .Query("Micro*")
            .Rewrite(RewriteMultiTerm.ScoringBoolean)
        )
    )
    .From(pageNumber)
    .Size(pageSize)
);
With RewriteMultiTerm.ScoringBoolean, the rewrite method first translates each term into a should clause in a bool query and keeps the scores as computed by the query.
Note that this can be CPU intensive, and there is a default limit of 1024 bool query clauses that is easily hit on a large document corpus; running your query on the complete Stack Overflow data set (questions, answers and users), for example, hits the clause limit for questions. As an alternative, you may want to analyze the text with an analyzer that uses an edge n-gram token filter, so that prefixes can be matched without a wildcard.
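For readers not using NEST, here is a rough sketch of the request body such a search is expected to correspond to, built as a Python dict. It assumes the query_string query's rewrite parameter with the value scoring_boolean, as named in the Query DSL documentation; the helper name wildcard_query_body and its defaults are hypothetical.

```python
import json


def wildcard_query_body(pattern, rewrite="scoring_boolean", offset=0, size=10):
    """Build a search body whose query_string wildcard keeps per-term scores."""
    return {
        "from": offset,
        "size": size,
        "query": {
            "query_string": {
                "query": pattern,
                # Expands the wildcard into should clauses of a bool query and
                # keeps their scores instead of a constant score.
                "rewrite": rewrite,
            }
        },
    }


body = wildcard_query_body("Micro*")
print(json.dumps(body, indent=2))
```

Adding "explain": true at the top level of the body is one way to see how each hit's score was computed.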
By default, wildcard searches return a constant score of 1.
You can boost by a particular type. See this question:
How to boost index type in elasticsearch?

BoolFilter and BoolQuery in ElasticSearch

I am applying a filter and a semantically identical query to two search requests, like so:
static FilterBuilder filter(String field1Value, String field2Value) {
    return FilterBuilders.boolFilter()
        .must(FilterBuilders.termFilter("field1", field1Value))
        .should(FilterBuilders.termFilter("field2", field2Value));
}
static QueryBuilder query(String field1Value, String field2Value) {
    return QueryBuilders.boolQuery()
        .must(QueryBuilders.termQuery("field1", field1Value))
        .should(QueryBuilders.termQuery("field2", field2Value));
}
client.prepareSearch(indexName).setPostFilter(filter("hello", "world")).setTypes("mytype");
client.prepareSearch(indexName).setQuery(query("hello", "world")).setTypes("mytype");
However, while the search with the query returns results, the search with the filter doesn't return any. Aren't the two supposed to behave identically, and if not, why?
They are not exactly the same.
In a bool query with a must clause, a document can match even if none of the should clauses match, provided there is no explicit minimum_should_match in the query.
In a bool filter, at least one should clause needs to be satisfied for a document to be considered a match. Filters have no minimum_should_match option; it can be treated as always set to one.
That is, for filters it can be viewed as follows:
[must_clause] && [should_clause_1 || should_clause_2]
For the example in the OP:
1) Documents pass the filter if and only if they match the field1 criterion in the must clause and the field2 criterion in the should clause.
2) For the bool query, it suffices for a document to satisfy the must clause, i.e. to match field1.
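The difference can be modelled with two small predicates. This is only a sketch of the semantics described in this answer, not Elasticsearch code; the function names are hypothetical, and each list holds booleans saying whether a given clause matched the document.

```python
def bool_filter_matches(must, should):
    """Old-style bool filter: every must clause, plus at least one should clause if any are given."""
    return all(must) and (any(should) if should else True)


def bool_query_matches(must, should, minimum_should_match=None):
    """bool query: should clauses are optional when a must clause is present."""
    if minimum_should_match is None:
        # Default described above: 0 when there is a must clause, otherwise 1.
        minimum_should_match = 0 if (must or not should) else 1
    return all(must) and sum(should) >= minimum_should_match


# A document that matches field1 ("hello") but not field2 ("world").
must_results = [True]
should_results = [False]

print(bool_filter_matches(must_results, should_results))  # False
print(bool_query_matches(must_results, should_results))   # True
```

Passing minimum_should_match=1 to bool_query_matches makes the query behave like the filter for this document.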

Elasticsearch difference between MUST and SHOULD bool query

What is the difference between MUST and SHOULD bool query in ES?
If I ONLY want results that contain my terms, should I then use must?
I have a query that should only contain certain values, and also no results with a date/timestamp earlier than the current date/time (now).
Also, can I use multiple filters inside a must, like the code below:
"filtered": {
  "filter": {
    "bool": {
      "must": [
        { "term": { "type": 1 } },
        { "term": { "totals": 14 } },
        { "term": { "groupId": 3 } },
        { "range": { "expires": { "gte": "now" } } }
      ]
    }
  }
}
must means: The clause (query) must appear in matching documents. These clauses must match, like logical AND.
should means: At least one of these clauses must match, like logical OR.
Basically they are used like the logical operators AND and OR.
Now in a bool query:
must means: Clauses that must match for the document to be included.
should means: If these clauses match, they increase the _score; otherwise, they have no effect. They are simply used to refine the relevance score for each document.
Yes, you can use multiple filters inside must.
Since this is a popular question, I would like to add that in Elasticsearch version 2 things changed a bit.
Instead of the filtered query, one should use a bool query at the top level.
If you don't care about the score of the must parts, put those parts under the filter key. No scoring means faster search. Elasticsearch will also figure out automatically whether to cache them. must_not clauses are equally eligible for caching.
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
Also, mind that "gte": "now" cannot be cached, because of its millisecond granularity. Use two ranges: one rounded down to the hour (now/h) and another with plain now, so that the first can be cached for a while and the second does the precise filtering on an already reduced result set.
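Putting this answer's advice together, a request body might look roughly like the sketch below, written as a Python dict. The field names come from the question; placing every clause under filter rather than must assumes no scoring is needed. Whether the rounded range is actually cached is decided by Elasticsearch.

```python
import json

query = {
    "query": {
        "bool": {
            "filter": [
                {"term": {"type": 1}},
                {"term": {"totals": 14}},
                {"term": {"groupId": 3}},
                # Coarse bound, rounded down to the hour: the resolved value
                # stays the same for up to an hour, so it is cache-friendly.
                {"range": {"expires": {"gte": "now/h"}}},
                # Precise bound: changes on every request, applied to the
                # documents that already passed the clauses above.
                {"range": {"expires": {"gte": "now"}}},
            ]
        }
    }
}

print(json.dumps(query, indent=2))
```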
As said in the documentation:
Must: The clause (query) must appear in matching documents.
Should: The clause (query) should appear in the matching document. In a boolean query with no must clauses, one or more should clauses must match a document. The minimum number of should clauses to match can be set using the minimum_should_match parameter.
In other words, results have to be matched by all the queries present in the must clause (or match at least one of the should clauses if there is no must clause).
Since you want your results to satisfy all the queries, you should use must.
You can indeed use filters inside a boolean query.
