How to write ElasticSearch query with AND condition - elasticsearch

I am trying to write an Elasticsearch query that searches the data with two conditions, something like the one below:
{
"query": {
"match": {
"trackingId": "track4324234234244",
"log_message": "downstream request-response"
}
}
}
The above query won't work because the [match] query doesn't support multiple fields. Is there a way I can achieve this?

You can use a bool query with a must clause.
must means: the clause (query) must appear in matching documents. All must clauses have to match, like a logical AND.
For the difference between must and should, refer to this SO answer.
Here is a working example with sample docs and a search query.
Index Sample Data:
{
"trackingId":"track4324234234244",
"log_message":"downstream request-response"
}
{
"trackingId":"track4324234234244",
"log_message":"downstream"
}
{
"trackingId":"tracks4324234234244",
"log_message":"downstream request-response"
}
Search query:
{
"query": {
"bool": {
"must": [
{
"match": {
"trackingId": "track4324234234244"
}
},
{
"match": {
"log_message": {
"query": "downstream request-response",
"operator": "and"
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "my_index",
"_type": "_doc",
"_id": "1",
"_score": 1.8570712,
"_source": {
"trackingId": "track4324234234244",
"log_message": "downstream request-response"
}
}
]
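The same bool/must structure can also be assembled in code. Below is a minimal Python sketch that only builds the request body as a dict. The helper name build_and_query is hypothetical, and sending the body to a cluster is left out.

```python
import json


def build_and_query(conditions):
    """Build a bool query whose must clauses all have to match (logical AND).

    `conditions` maps a field name to the text to search for. Each clause
    uses operator "and" so every term of a multi-word value is required.
    """
    must = [
        {"match": {field: {"query": text, "operator": "and"}}}
        for field, text in conditions.items()
    ]
    return {"query": {"bool": {"must": must}}}


body = build_and_query({
    "trackingId": "track4324234234244",
    "log_message": "downstream request-response",
})
print(json.dumps(body, indent=2))
```

Adding a third condition is then just one more entry in the dict passed to the helper.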

Apart from bool, you can also use simple_query_string, as shown below:
POST <your_index_name>/_search
{
"query": {
"simple_query_string": {
"fields": ["trackingId", "log_message"],
"query": "track4324234234244 downstream request-response",
"default_operator": "AND"
}
}
}
Note how I've just listed all the terms and set default_operator to AND so that it returns only documents that contain all the terms in the listed fields.
There is also query_string; however, I would recommend the one above, because query_string is strict: it throws an error if the query string has a syntax error, while simple_query_string does not.
POST <your_index_name>/_search
{
"query": {
"query_string": {
"fields": ["trackingId", "log_message"],
"query": "(track4324234234244) AND (downstream request-response)",
"default_operator": "AND"
}
}
}
As for when to use simple_query_string: mostly when you want to expose the query string or its terms to end users. That is where it is useful.
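If the search text comes straight from a search box, the request body can be built around it as in this minimal Python sketch. build_user_search is a hypothetical helper name, and nothing is sent to a cluster here.

```python
import json


def build_user_search(user_text, fields):
    """Wrap raw user input in a simple_query_string query.

    default_operator "AND" requires every term to match. Invalid syntax in
    user_text is ignored by simple_query_string rather than raising an error.
    """
    return {
        "query": {
            "simple_query_string": {
                "query": user_text,
                "fields": list(fields),
                "default_operator": "AND",
            }
        }
    }


body = build_user_search(
    "track4324234234244 downstream request-response",
    ["trackingId", "log_message"],
)
print(json.dumps(body, indent=2))
```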
Hope that helps!

Related

Ignore 'AND', 'OR' constructions with query_string in elastic search and do literally search

In elasticsearch it's possible to execute the following query:
GET /_search
{
"query": {
"query_string": {
"query": "(apple) OR (banana)"
}
}
}
This results in all documents having any field with the value 'apple' or 'banana'. I'm looking for a way to prevent the user from writing queries like "(apple) OR (banana)" in the search box. This should be converted to a literal search for "(apple) OR (banana)" (so returning any document with a value set to "(apple) OR (banana)"). What's the best way to do this?
To give a bit more context: "query_string" was chosen to be able to perform 'contains' queries on entire documents using wildcards.
Thank you in advance!
[Edit] To be a bit more clear:
Example:
Doc 1: { "snack": "apple" }
Doc 2: {"snack": "banana"}
Doc 3: {"snack": "(apple) OR (banana)"}
If the user would search for "(apple) OR (banana)" this normally results in Doc 1 and Doc 2, but I would want it to match only with Doc 3.
Solved thanks to @Bhavya and @TreffnonX:
Summary: I took @Bhavya's solution but wrapped my search string in extra double quotes:
GET _search
{
"query": {
"query_string": {
"query": "\"\\(apple\\) OR \\(banana\\)\""
}
}
}
or
GET _search
{
"query": {
"query_string": {
"query": "\"(apple) OR (banana)\""
}
}
}
Adding a working example with index data, search query, and search result
Index Data:
{ "snack": "apple" }
{"snack": "banana"}
{"snack": "(apple) OR (banana)"}
Search Query:
{
"query": {
"query_string": {
"query": "\\(apple\\) OR \\(banana\\)"
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64352271",
"_type": "_doc",
"_id": "1",
"_score": 4.0350027,
"_source": {
"snack": "(apple) OR (banana)"
}
}
]
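When the text comes from a search box, the escaping and quoting can be done in code before the query is sent. The following is a minimal Python sketch under some assumptions: the helper names are hypothetical, and the reserved-character set is taken from the query_string documentation, which also notes that < and > cannot be escaped at all (this sketch leaves them untouched).

```python
import json

# Characters the query_string syntax treats as operators. "<" and ">" are
# deliberately not in this set: they cannot be escaped and would have to be
# removed from the input instead, which this sketch does not do.
RESERVED = set('+-=&|!(){}[]^"~*?:\\/')


def escape_query_string(text):
    """Prefix every reserved character with a backslash."""
    return "".join("\\" + ch if ch in RESERVED else ch for ch in text)


def literal_phrase(text):
    """Escape the input and wrap it in double quotes so it is searched as a phrase."""
    return '"' + escape_query_string(text) + '"'


user_input = "(apple) OR (banana)"
body = {"query": {"query_string": {"query": literal_phrase(user_input)}}}
# json.dumps adds the JSON-level escaping on top of the query-level escaping.
print(json.dumps(body))
```

Serializing with json.dumps takes care of the second layer of backslashes, so there is no need to write the doubled escapes by hand.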

Elastic search query using python list

How do I pass a list as query string to match_phrase query?
This works:
{"match_phrase": {"requestParameters.bucketName": {"query": "xxx"}}},
This does not:
{
"match_phrase": {
"requestParameters.bucketName": {
"query": [
"auditloggingnew2232",
"config-bucket-123",
"web-servers",
"esbck-essnap-1djjegwy9fvyl",
"tempexpo"
]
}
}
}
match_phrase simply does not support multiple values.
You can either use a should query:
GET _search
{
"query": {
"bool": {
"should": [
{
"match_phrase": {
"requestParameters.bucketName": {
"query": "auditloggingnew2232"
}
}
},
{
"match_phrase": {
"requestParameters.bucketName": {
"query": "config-bucket-123"
}
}
}
]
},
...
}
}
or, as @Val pointed out, a terms query:
{
"query": {
"terms": {
"requestParameters.bucketName": [
"auditloggingnew2232",
"config-bucket-123",
"web-servers",
"esbck-essnap-1djjegwy9fvyl",
"tempexpo"
]
}
}
}
which functions like an OR on exact terms.
I'm assuming that 1) the bucket names in question are unique and 2) you're not looking for partial matches. If that's the case, and if there are barely any analyzers set on the field bucketName, match_phrase may not even be needed; terms will do just fine. The difference between term and match_phrase queries is nicely explained here.
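Since the question starts from a Python list, here is a minimal sketch of both options built from that list. The helper names are hypothetical and only the request bodies are constructed.

```python
import json


def should_match_phrase(field, values):
    """OR together one match_phrase clause per value."""
    return {
        "query": {
            "bool": {
                "should": [{"match_phrase": {field: {"query": v}}} for v in values],
                # Stated explicitly so at least one clause is required even
                # if must or filter clauses are added to this bool later.
                "minimum_should_match": 1,
            }
        }
    }


def terms_query(field, values):
    """Exact-term alternative: the values are matched as-is, without analysis."""
    return {"query": {"terms": {field: list(values)}}}


buckets = ["auditloggingnew2232", "config-bucket-123", "web-servers"]
print(json.dumps(should_match_phrase("requestParameters.bucketName", buckets), indent=2))
print(json.dumps(terms_query("requestParameters.bucketName", buckets), indent=2))
```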

Query string query with keyword and text fields in the same search

We are upgrading from Elasticsearch 5.x to 6.x. We make extensive use of query string queries and commonly construct queries that use fields of different types.
In 5.x, the following query worked correctly and without error:
{
"query": {
"query_string": {
"query": "my_keyword_field:\"Exact Phrase Here\" my_text_field:(any words) my_other_text_field:\"Another phrase here\" date_field:[2018-01-01 TO 2018-05-01]",
"default_operator": "AND",
"analyzer": "custom_text"
}
}
}
In 6.x, this query will return the following error:
{
"type": "illegal_state_exception",
"reason": "field:[my_keyword_field] was indexed without position data; cannot run PhraseQuery"
}
If I wrap the phrase in parentheses instead of quotes, the search will return 0 results:
{
"query": {
"query_string": {
"query": "my_keyword_field:(Exact Phrase Here)",
"default_operator": "AND",
"analyzer": "custom_text"
}
}
}
I guess this is because there is a conflict between the way the analyzer stems the incoming query and how the data is stored in the keyword field, but the phrase approach (my_keyword_field:"Exact Phrase Here") did work in 5.x.
Is this no longer supported in 6.x? And if not, what is the migration path and/or a good workaround?
It would be better to restructure the query using the query type intended for each use case. For example, use a term query for exact matches on a keyword field, a range query for ranges, and so on.
You can rewrite the query as below:
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "my_text_field:(any words) my_other_text_field:\"Another phrase here\"",
"default_operator": "AND",
"analyzer": "custom_text"
}
},
{
"term": {
"my_keyword_field": "Exact Phrase Here"
}
},
{
"range": {
"date_field": {
"gte": "2018-01-01",
"lte": "2018-05-01"
}
}
}
]
}
}
}
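If such queries are generated in code, the split into typed clauses can be done by a small builder. A minimal Python sketch, assuming the caller already knows which fields are keyword fields; build_typed_query is a hypothetical helper, and the custom analyzer from the example above is left out for brevity.

```python
import json


def build_typed_query(text_query, keyword_terms, date_field, start, end):
    """Combine free text, exact keyword matches and a date range with AND."""
    # Free-text part stays a query_string query over the text fields.
    must = [{"query_string": {"query": text_query, "default_operator": "AND"}}]
    # One term clause per keyword field: exact, unanalyzed match.
    must += [{"term": {field: value}} for field, value in keyword_terms.items()]
    # Inclusive date range.
    must.append({"range": {date_field: {"gte": start, "lte": end}}})
    return {"query": {"bool": {"must": must}}}


body = build_typed_query(
    'my_text_field:(any words) my_other_text_field:"Another phrase here"',
    {"my_keyword_field": "Exact Phrase Here"},
    "date_field",
    "2018-01-01",
    "2018-05-01",
)
print(json.dumps(body, indent=2))
```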

Search multiple indices for an ID efficiently?

I need to get a document but I have no idea what index it is in. I have a bunch of indices for different days; all prefixed with "mydocs-". I've tried:
GET /mydocs-*/adoc/my_second_doc
returns "index_not_found_exception"
GET /mydocs-*/adoc/_search
{
"query": {
"bool":{
"filter": [{
"term":{
"_id": ["my_second_doc"]
}
}]
}
}
}
returns all the docs in the index.
Now, if I search the specific index I can get the doc. Problem is that I don't always know the index it is in beforehand. So, I'd have to search many, many indices for it (thousands of indices).
GET /mydocs-12/adoc/my_second_doc
returns the desired doc.
Any ideas on how to do an efficient Get/Search for the doc?
Have you tried:
GET mydocs-*/adoc/_search
{
"query": {
"term": {
"_id": "my_second_doc"
}
}
}
Or, more specifically:
GET mydocs-*/_search
{
"query": {
"bool": {
"must": [
{
"term": {
"_id": "my_second_doc"
}
},
{
"term": {
"_type": "adoc"
}
}
]
}
}
}
The above two queries will find all documents whose index name starts with mydocs-, whose type is adoc, and whose id is my_second_doc.
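Another option that may be worth trying is the ids query, which looks documents up by _id. This is a minimal Python sketch that only builds the request body; the helper name is hypothetical, and whether it is faster than the term query across thousands of indices is not something measured here.

```python
import json


def build_ids_query(doc_ids):
    """Build an ids query that matches documents by their _id values."""
    return {"query": {"ids": {"values": list(doc_ids)}}}


body = build_ids_query(["my_second_doc"])
print(json.dumps(body))
```

The body would be sent to the same wildcard search endpoint as the queries above, for example mydocs-*/_search.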

Elastic Search Query (a like x and y) or (b like x and y)

Some background info: In the example below, the user searched for "HTML CSS". I split the search string into words and created the SQL query shown below.
Now I am trying to write an Elasticsearch query that has the same logic as the following SQL query:
SELECT
title, description
FROM `classes`
WHERE
(`title` LIKE '%html%' AND `title` LIKE '%css%') OR
(description LIKE '%html%' AND description LIKE '%css%')
Currently I am halfway there, but I can't seem to get it right yet.
{
"query": {
"bool": {
"must": [
{
"term": {
"title": "html"
}
},
{
"term": {
"title": "css"
}
}
]
}
},
"_source": [
"title"
],
"size": 30
}
Now I need to find out how to add the following logic:
OR (description LIKE '%html%' AND description LIKE '%css%')
One important point is that I only need to fetch documents that have both words in either the title or the description. I don't want to fetch documents that have only one of the words.
I will update questions as I find more info.
Update: The chosen answer also provides a way to boost scoring based on the field.
Can you try the following query? You can use should to express an OR operation.
{
"query": {
"bool": {
"should": [
{
"bool": {
"must": [
{
"match": { // Use term instead if your field is not analyzed
"title": {
"query": "html css",
"operator": "and",
"boost" : 2
}
}
}
]
}
},
{
"bool": {
"must": [
{
"match": {
"description": {
"query": "html css",
"operator": "and"
}
}
}
]
}
}
],
"minimum_should_match": 1
}
},
"_source": [
"title",
"description"
]
}
Hope this helps!!
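Since the search string is already being split into words in application code, the nested bool structure can be generated from those words. A minimal Python sketch; all_words_in_any_field is a hypothetical helper. Note that match works on analyzed tokens, so this is not a substring match the way LIKE '%html%' is.

```python
import json


def all_words_in_any_field(search, fields):
    """(field1 has every word) OR (field2 has every word) OR ..."""
    words = search.split()
    per_field = [
        # Inner bool: every word must match in this one field (AND).
        {"bool": {"must": [{"match": {field: word}} for word in words]}}
        for field in fields
    ]
    # Outer bool: at least one field has to satisfy its inner bool (OR).
    return {"query": {"bool": {"should": per_field, "minimum_should_match": 1}}}


body = all_words_in_any_field("HTML CSS", ["title", "description"])
print(json.dumps(body, indent=2))
```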
I feel the most appropriate query to use in this case is multi_match.
The multi_match query is a convenient way of running the same query on multiple fields.
So your query can be written as:
GET /_search
{
"_source": ["title", "description"],
"query": {
"multi_match": {
"query": "html css",
"fields": ["title^2", "description"],
"operator":"and"
}
}
}
_source filters the response so that only the fields listed in the array are returned.
^2 boosts the title field by a factor of 2.
operator: and makes sure that all terms in the query must match in either field.
From the elasticsearch 5.2 doc:
One option is to use the nested datatype instead of the object datatype.
More details here: https://www.elastic.co/guide/en/elasticsearch/reference/5.2/nested.html
Hope this helps

Resources