Elasticsearch: search query that ignores term order

I'm using a query like
{"bool": {"must": [{"match": {"name": "Cat Dog"}}]}}
This gives me records whose name is e.g. "Cat Dog Cow", but not e.g. "Cat Cow Dog".
From what I have read, span_near can be used to solve this. Is it the only way?
I tried a query such as:
{"query":{"bool":{"must":[],"must_not":[],"should":[{"span_near":{"slop":12,"in_order":false,"clauses":[{"span_term":{"name":"Cat"}},{"span_term":{"name":"Dog"}}]}}]}}}
But this gives me 0 hits. What could be the issue?

The match query analyzes the provided text before matching and, by default, returns documents that contain any of the resulting terms, regardless of their order.
Adding a working example.
Index mapping:
{
"mappings": {
"properties": {
"name": {
"type": "text"
}
}
}
}
Search Query:
{
"query": {
"match": {
"name": {
"query": "Cat Dog"
}
}
}
}
Search Result:
"hits": [
{
"_index": "65230619",
"_type": "_doc",
"_id": "1",
"_score": 0.36464313,
"_source": {
"name": "Cat Dog Cow"
}
},
{
"_index": "65230619",
"_type": "_doc",
"_id": "2",
"_score": 0.36464313,
"_source": {
"name": "Cat Cow Dog"
}
}
]
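If both terms must be present in any order, one option besides span_near is the match query's operator parameter set to "and". The following is a minimal Python sketch that only builds the request body; the helper name match_all_terms is hypothetical and no client call is made.

```python
import json

def match_all_terms(field, text):
    """Build a match query that requires every analyzed term, in any order."""
    return {"query": {"match": {field: {"query": text, "operator": "and"}}}}

body = match_all_terms("name", "Cat Dog")
print(json.dumps(body, indent=2))
```

With the default operator ("or"), a document containing only one of the two terms would also match.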
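Search query using span_near. Unlike match, span_term does not analyze its input, so the terms must be lowercase ("cat", "dog") to match the tokens indexed by the standard analyzer; "Cat" and "Dog" return 0 hits: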
{
"query": {
"span_near" : {
"clauses" : [
{ "span_term" : { "name" : "cat" } },
{ "span_term" : { "name" : "dog" } }
],
"slop" : 12,
"in_order" : false
}
}
}
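The span_near query above can also be generated from raw user input. This is a rough Python sketch under the assumption that lowercasing and whitespace splitting are close enough to the standard analyzer for simple names; simple_analyze and span_near_query are illustrative names, not part of any Elasticsearch client.

```python
import json

def simple_analyze(text):
    """Rough stand-in for the standard analyzer: split on whitespace and lowercase."""
    return [token.lower() for token in text.split()]

def span_near_query(field, text, slop=12, in_order=False):
    """Build a span_near query with one span_term clause per analyzed token."""
    clauses = [{"span_term": {field: token}} for token in simple_analyze(text)]
    return {"query": {"span_near": {"clauses": clauses, "slop": slop, "in_order": in_order}}}

query = span_near_query("name", "Cat Dog")
print(json.dumps(query, indent=2))
```

Because span_term is a term-level clause, the tokens are lowercased here before they go into the query.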

Related

search first element of a multivalue text field in elasticsearch

I want to search on the first element of an array in my Elasticsearch documents, but I can't find a way to do it.
For testing, I created a new index with fielddata=true, but I still didn't get the response I wanted.
Document
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
Values
name : ["John", "Doe"]
My request
{
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"source": "doc['name'][0]=params.param1",
"params" : {
"param1" : "john"
}
}
}
}
}
}
}
Incoming Response
"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
You can use the following script in a search request to return the first element as a script field:
{
"script_fields": {
"firstElement": {
"script": {
"lang": "painless",
"source": "params._source.name[0]"
}
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64391432",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"fields": {
"firstElement": [
"John" <-- note this
]
}
}
]
You can use a Painless script to create a script field that returns a customized value for each document in the results of a query.
To compare two values in the script, use the equality operator '==' (a single '=' is assignment). The result is a boolean: true if the two values are equal, false otherwise.
Adding a working example with index data, mapping, search query, and search result.
Index Mapping:
{
"mappings":{
"properties":{
"name":{
"type":"text",
"fielddata":true
}
}
}
}
Index data:
{
"name": [
"John",
"Doe"
]
}
Search Query:
{
"script_fields": {
"my_field": {
"script": {
"lang": "painless",
"source": "params['_source']['name'][0] == params.params1",
"params": {
"params1": "John"
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"fields": {
"my_field": [
true <-- note this
]
}
}
]
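Both examples read the array from _source, which keeps the original order. The Python model below sketches why that matters, under the simplifying assumption that doc values for a multi-valued keyword field hold the unique values in sorted order; the function names are made up for illustration.

```python
def source_first(values):
    """First element as stored in _source (original array order is preserved)."""
    return values[0]

def doc_values_first(values):
    """Simplified model of keyword doc values: unique values in sorted order."""
    return sorted(set(values))[0]

names = ["John", "Doe"]
print(source_first(names))      # John
print(doc_values_first(names))  # Doe
```

Under this model, a script that reads index 0 from doc values would not necessarily see the element that was first in the original array.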
Arrays of objects do not work as you would expect: you cannot query each object independently of the other objects in the array. If you need to be able to do this, you should use the nested data type instead of the object data type.
You can use the script shown in my other answer if you just want to compare the value of the first element of the array to some other value. But based on your comments, it looks like your use case is quite different.
If you want to search the first element of the array, you need to convert your data into nested form. With plain arrays of objects, you cannot refer to "the first element" or "the last element" at search time.
Adding a working example with index data, mapping, search query, and search result.
Index Mapping:
{
"mappings": {
"properties": {
"name": {
"type": "nested"
}
}
}
}
Index Data:
{
"booking_id": 2,
"name": [
{
"first": "John Doe",
"second": "abc"
}
]
}
{
"booking_id": 1,
"name": [
{
"first": "Adam Simith",
"second": "John Doe"
}
]
}
{
"booking_id": 3,
"name": [
{
"first": "John Doe",
"second": "Adam Simith"
}
]
}
Search Query:
{
"query": {
"nested": {
"path": "name",
"query": {
"bool": {
"must": [
{
"match_phrase": {
"name.first": "John Doe"
}
}
]
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 0.9400072,
"_source": {
"booking_id": 2,
"name": [
{
"first": "John Doe",
"second": "abc"
}
]
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "3",
"_score": 0.9400072,
"_source": {
"booking_id": 3,
"name": [
{
"first": "John Doe",
"second": "Adam Simith"
}
]
}
}
]
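The nested query above can be built programmatically. This is a small Python sketch that only constructs the request body; nested_phrase_query is a hypothetical helper, and the bool/must wrapper is omitted because there is a single clause.

```python
import json

def nested_phrase_query(path, field, phrase):
    """Wrap a match_phrase on `path.field` in a nested query for `path`."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match_phrase": {f"{path}.{field}": phrase}},
            }
        }
    }

body = nested_phrase_query("name", "first", "John Doe")
print(json.dumps(body, indent=2))
```

Searching "name.second" instead would only require changing the field argument.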

Filter elastic search data when fields contain ~

I have a bunch of documents like the one below. I want to filter the data where projectKey starts with ~.
I read some articles that say ~ is an operator in Elasticsearch query syntax, so I cannot simply filter on it.
Can someone help me form the search query for the /branch/_search API?
{
"_index": "branch",
"_type": "_doc",
"_id": "GAz-inQBJWWbwa_v-l9e",
"_version": 1,
"_score": null,
"_source": {
"branchID": "refs/heads/feature/12345",
"displayID": "feature/12345",
"date": "2020-09-14T05:03:20.137Z",
"projectKey": "~user",
"repoKey": "deploy",
"isDefaultBranch": false,
"eventStatus": "CREATED",
"user": "user"
},
"fields": {
"date": [
"2020-09-14T05:03:20.137Z"
]
},
"highlight": {
"projectKey": [
"~#kibana-highlighted-field#user#/kibana-highlighted-field#"
],
"projectKey.keyword": [
"#kibana-highlighted-field#~user#/kibana-highlighted-field#"
],
"user": [
"#kibana-highlighted-field#user#/kibana-highlighted-field#"
]
},
"sort": [
1600059800137
]
}
UPDATE:
I followed prerana's answer below and used prefix in my query.
Something is still wrong when I combine prefix and range. I get the error below. What am I missing?
GET /branch/_search
{
"query": {
"prefix": {
"projectKey": "~"
},
"range": {
"date": {
"gte": "2020-09-14",
"lte": "2020-09-14"
}
}
}
}
{
"error": {
"root_cause": [
{
"type": "parsing_exception",
"reason": "[prefix] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 6,
"col": 5
}
],
"type": "parsing_exception",
"reason": "[prefix] malformed query, expected [END_OBJECT] but found [FIELD_NAME]",
"line": 6,
"col": 5
},
"status": 400
}
If I understood your issue correctly, I suggest creating a custom analyzer to search for the special character ~.
I ran a local test that replaces ~ with __SPECIAL__:
I created an index with a custom char_filter and added a sub-field (multi-field) named special_characters to the projectKey field.
Here is the mapping:
PUT wildcard-index
{
"settings": {
"analysis": {
"char_filter": {
"special-characters-replacement": {
"type": "mapping",
"mappings": [
"~ => __SPECIAL__"
]
}
},
"analyzer": {
"special-characters-analyzer": {
"tokenizer": "standard",
"char_filter": [
"special-characters-replacement"
]
}
}
}
},
"mappings": {
"properties": {
"projectKey": {
"type": "text",
"fields": {
"special_characters": {
"type": "text",
"analyzer": "special-characters-analyzer"
}
}
}
}
}
}
Then I ingested the following contents in the index:
"projectKey": "content1 ~"
"projectKey": "This ~ is a content"
"projectKey": "~ cars on the road"
"projectKey": "o ~ngram"
Then, the query was:
GET wildcard-index/_search
{
"query": {
"match": {
"projectKey.special_characters": "~"
}
}
}
The response was:
"hits" : [
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "h1hKmHQBowpsxTkFD9IR",
"_score" : 0.43250346,
"_source" : {
"projectKey" : "content1 ~"
}
},
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "iFhKmHQBowpsxTkFFNL5",
"_score" : 0.3034693,
"_source" : {
"projectKey" : "This ~ is a content"
}
},
{
"_index" : "wildcard-index",
"_type" : "_doc",
"_id" : "-lhKmHQBowpsxTkFG9Kg",
"_score" : 0.3034693,
"_source" : {
"projectKey" : "~ cars on the road"
}
}
]
Please let me know if you have any issues; I will be glad to help.
Note: This method only works if ~ is followed by a blank space. You can see from the response that the 4th document ("o ~ngram") was not returned.
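To see why the 4th document is missed, the analysis chain can be approximated in Python. This is only a sketch: the standard tokenizer is replaced here by plain whitespace splitting, which is an assumption, and analyze and matches are illustrative names.

```python
def analyze(text, mappings=None):
    """Approximate the custom analyzer: apply the char_filter mapping, then split on whitespace."""
    mappings = mappings or {"~": "__SPECIAL__"}
    for src, dst in mappings.items():
        text = text.replace(src, dst)
    return text.split()

def matches(doc_text, query_text):
    """A doc matches if it shares at least one token with the analyzed query."""
    return bool(set(analyze(doc_text)) & set(analyze(query_text)))

docs = ["content1 ~", "This ~ is a content", "~ cars on the road", "o ~ngram"]
hits = [d for d in docs if matches(d, "~")]
print(hits)
```

In this model "o ~ngram" becomes the tokens "o" and "__SPECIAL__ngram", neither of which equals the query token "__SPECIAL__".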
While @hansley's answer would work, it requires you to create a custom analyzer. Also, you mentioned that you want only the docs that start with ~, but his result contains all the docs containing ~. So I am providing an answer that requires very little configuration and works as required.
The index mapping is the default: just index the docs below and ES will create a default mapping with a .keyword sub-field for every text field.
Index sample docs
{
"title" : "content1 ~"
}
{
"title" : "~ staring with"
}
{
"title" : "in between ~ with"
}
The search query should fetch only the 2nd doc from the sample docs
{
"query": {
"prefix" : { "title.keyword" : "~" }
}
}
And search result
"hits": [
{
"_index": "pre",
"_type": "_doc",
"_id": "2",
"_score": 1.0,
"_source": {
"title": "~ staring with"
}
}
]
Please refer to the prefix query documentation for more info.
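The behaviour of a prefix query on a keyword field can be modelled in a few lines of Python. This is a simplified illustration and not how Elasticsearch evaluates the query internally; prefix_keyword is a made-up name.

```python
def prefix_keyword(values, prefix):
    """Model of a prefix query on a keyword field: the whole, unanalyzed value must start with the prefix."""
    return [v for v in values if v.startswith(prefix)]

titles = ["content1 ~", "~ staring with", "in between ~ with"]
print(prefix_keyword(titles, "~"))  # ['~ staring with']
```

Because the keyword sub-field stores the entire string as a single term, values that merely contain ~ somewhere else do not match.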
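Update 1:
The parsing error occurs because the query object accepts only one top-level query clause. To combine prefix and range, wrap both in a bool query.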
Index Mapping:
{
"mappings": {
"properties": {
"date": {
"type": "date"
}
}
}
}
Index Data:
{
"date": "2015-02-01",
"title" : "in between ~ with"
}
{
"date": "2015-01-01",
"title": "content1 ~"
}
{
"date": "2015-02-01",
"title" : "~ staring with"
}
{
"date": "2015-02-01",
"title" : "~ in between with"
}
Search Query:
{
"query": {
"bool": {
"must": [
{
"prefix": {
"title.keyword": "~"
}
},
{
"range": {
"date": {
"lte": "2015-02-05",
"gte": "2015-01-11"
}
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_63924930",
"_type": "_doc",
"_id": "2",
"_score": 2.0,
"_source": {
"date": "2015-02-01",
"title": "~ staring with"
}
},
{
"_index": "stof_63924930",
"_type": "_doc",
"_id": "4",
"_score": 2.0,
"_source": {
"date": "2015-02-01",
"title": "~ in between with"
}
}
]
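The same combination can be assembled in code. The sketch below uses a hypothetical combine helper that wraps any number of clauses in bool/must so that the request body keeps a single top-level clause under "query".

```python
import json

def combine(*clauses):
    """Wrap several query clauses in bool/must so the body has one top-level clause."""
    return {"query": {"bool": {"must": list(clauses)}}}

prefix_clause = {"prefix": {"title.keyword": "~"}}
range_clause = {"range": {"date": {"gte": "2015-01-11", "lte": "2015-02-05"}}}
body = combine(prefix_clause, range_clause)
print(json.dumps(body, indent=2))
```

Placing prefix and range side by side directly under "query" is what triggers the parsing_exception shown in the question.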

elasticSearch: bool query with multiple values on one field

This works:
GET /bitbucket$$pull-request-activity/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"prid": "12343"
}
},
{
"match": {
"repoSlug": "com.xxx.vserver"
}
}
]
}
}
}
But I would like to capture multiple prids in one call.
This does not work however:
GET /bitbucket$$pull-request-activity/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"prid": "[12343, 11234, 13421]"
}
},
{
"match": {
"repoSlug": "com.xxx.vserver"
}
}
]
}
}
}
any hints?
Because you are using must in your bool query, the clauses are combined with a logical AND: every document that matches on the prid field must also match "repoSlug": "com.xxx.vserver".
If none of the documents match "repoSlug": "com.xxx.vserver", no results are returned.
And if only 2 documents match both clauses, only those 2 are returned in the search result, not all the documents.
Adding a working example with sample docs, search query, and search result
Index Sample Data :
{
"id":"1",
"message":"hello"
}
{
"id":"2",
"message":"hello"
}
{
"id":"3",
"message":"hello-bye"
}
Search Query:
{
"query": {
"bool": {
"must": [
{
"match": {
"id": "[1, 2, 3]"
}
},
{
"match": {
"message": "hello"
}
}
]
}
}
}
Search Result :
"hits": [
{
"_index": "foo14",
"_type": "_doc",
"_id": "1",
"_score": 1.5924306,
"_source": {
"id": "1",
"message": "hello"
}
},
{
"_index": "foo14",
"_type": "_doc",
"_id": "3",
"_score": 1.4903541,
"_source": {
"id": "3",
"message": "hello-bye"
}
},
{
"_index": "foo14",
"_type": "_doc",
"_id": "2",
"_score": 1.081605,
"_source": {
"id": "2",
"message": "hello"
}
}
]
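The match on "[1, 2, 3]" works in the example because the text is analyzed into separate tokens that are ORed together. An alternative that is not part of the answer above is a terms clause listing each value explicitly. The Python sketch below shows both ideas, assuming prid is mapped as a numeric or keyword field; rough_tokens is only a rough stand-in for the standard analyzer, and both helper names are hypothetical.

```python
import json
import re

def rough_tokens(text):
    """Rough stand-in for the standard analyzer: keep alphanumeric runs, lowercased."""
    return re.findall(r"\w+", text.lower())

def prids_query(prids, repo_slug):
    """Alternative sketch: a terms clause listing each prid, ANDed with the repoSlug match."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"terms": {"prid": list(prids)}},
                    {"match": {"repoSlug": repo_slug}},
                ]
            }
        }
    }

print(rough_tokens("[1, 2, 3]"))  # ['1', '2', '3']
body = prids_query([12343, 11234, 13421], "com.xxx.vserver")
print(json.dumps(body, indent=2))
```

The terms clause matches documents whose prid equals any of the listed values, while the surrounding must still requires the repoSlug match.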

ElasticSearch: why it is not possible to get suggest by criteria?

I want to get suggestions from some text for a specific user.
As I understand it, Elasticsearch provides suggestions based on the whole dictionary (inverted index), which contains all the terms in the index.
So if user1 posts some text, that text can be suggested to user2. Am I right?
Is it possible to add a filter by some criterion (by user, for example) to reduce the set of terms to be suggested?
Yes, that's very much possible. Let me show you with an example that uses a query with filter context:
Index def
{
"mappings": {
"properties": {
"title": {
"type": "text" --> inverted index for storing suggestions on title field
},
"userId" : {
"type" : "keyword" --> like in your example
}
}
}
}
Index sample doc
{
"title" : "foo baz",
"userId" : "katrin"
}
{
"title" : "foo bar",
"userId" : "opster"
}
Search query without userId filter
{
"query": {
"bool": {
"must": {
"match": {
"title": "foo"
}
}
}
}
}
Search results (both docs are returned)
"hits": [
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "1",
"_score": 0.18232156,
"_source": {
"title": "foo bar",
"userId": "opster" --> note another user
}
},
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "foo baz",
"userId": "katrin" -> note user
}
}
]
Now let's narrow the suggestions by filtering to the docs created by user katrin
Search query
{
"query": {
"bool": {
"must": {
"match": {
"title": "foo"
}
},
"filter": { --> note filter on userId field
"term": {
"userId": "katrin"
}
}
}
}
}
Search result
"hits": [
{
"_index": "so_suggest",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "foo baz",
"userId": "katrin"
}
}
]
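The filtered query can be parameterised per user. This is a minimal Python sketch that only builds the request body; titles_for_user is a hypothetical helper name.

```python
import json

def titles_for_user(text, user_id):
    """Match on title, restricted to one user's docs via a term filter."""
    return {
        "query": {
            "bool": {
                "must": {"match": {"title": text}},
                "filter": {"term": {"userId": user_id}},
            }
        }
    }

body = titles_for_user("foo", "katrin")
print(json.dumps(body, indent=2))
```

Clauses in filter context do not contribute to the score, which is consistent with the _score for katrin's doc being the same in both result sets above.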

I want to get all Entities from nested JSON Data where the "ai_id" has the Value = 0

I have the JSON data below, and I want to write an Elasticsearch query for it:
(give me all entities where "ai_id" has the value 0).
The JSON data is:
{
"_index": "try1",
"_type": "_doc",
"_id": "2",
"_score": 1,
"_source": {
"target": {
"br_id": 0,
"an_id": 0,
"ai_id": 0,
"explanation": [
"element 1",
"element 2"
]
},
"process": {
"an_id": 1311,
"pa_name": "micha"
},
"text": "hello world"
}
},
{
"_index": "try1",
"_type": "_doc",
"_id": "1",
"_score": 1,
"_source": {
"target": {
"br_id": 0,
"an_id": 1,
"ai_id": 1,
"explanation": [
"element 3",
"element 4"
]
},
"process": {
"an_id": 1311,
"pa_name": "luca"
},
"text": "the all People are good"
}
}
]
}
}
I tried this, but it does not seem to work. I would be thankful for any help.
GET try1\_search
{
"query":{
{ "match_all": { "ai_id": 0}}
}
}
This did not work either:
GET try1/_search
{
"query": {
"nested" : {
"query" : {
"must" : [
{ "match" : {"ai_id" : 0} }
]
}
}
}
}
Any suggestions, please?
Thanks
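You need to run a nested query on your target object, like this (this assumes target is mapped with type nested; if it is a plain object field, a regular match on target.ai_id without the nested wrapper is enough):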
GET /try1/_search
{
"query": {
"nested" : {
"path" : "target",
"query" : {
"bool" : {
"must" : [
{ "match" : {"target.ai_id" : 0} }
]
}
}
}
}
}
Ref. https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-nested-query.html
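Whether the nested wrapper is needed depends on how target is mapped. The Python sketch below builds either form of the request body; ai_id_query and the target_is_nested flag are hypothetical names, and the actual mapping of the try1 index is not shown in the question.

```python
import json

def ai_id_query(value, target_is_nested):
    """Build a query on target.ai_id, with or without the nested wrapper."""
    clause = {"match": {"target.ai_id": value}}
    if target_is_nested:
        # Fields of type nested must be queried through a nested query with a path.
        return {"query": {"nested": {"path": "target", "query": clause}}}
    # Plain object fields are addressed directly with dot notation.
    return {"query": clause}

print(json.dumps(ai_id_query(0, target_is_nested=True), indent=2))
print(json.dumps(ai_id_query(0, target_is_nested=False), indent=2))
```

In both forms the field is referenced by its full path, target.ai_id, rather than by ai_id alone as in the attempts in the question.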
