Set elasticsearch match precendence - elasticsearch

So if a user search with the word covid, i want all those results first, where title of the sentence starts with word covid and then I want all those items where in other parts of the title have the word have word covid. How can I achieve this?
I want more specific answer, how to do that with searchkick.

If you are using the default mapping, You can use the bool query with two should clause, one with match on the text and another is prefix query on .keyword subfield as shown in below example.
Index sample documents
{
"name" : "foo bar"
}
{
"name" : "bar foo"
}
Search query
{
"query": {
"bool": {
"should": [
{
"match": {
"name" : "foo"
}
},
{
"prefix": {
"name.keyword": "foo"
}
}
]
}
}
}
Search results
"hits": [
{
"_index": "71998426",
"_id": "1",
"_score": 1.1823215,
"_source": {
"name": "foo bar"
}
},
{
"_index": "71998426",
"_id": "2",
"_score": 0.18232156,
"_source": {
"name": "bar foo"
}
}
]
Note: first result, having foo bar is scored much higher and comes first in the search hits.

you can even use boost i guess little modification to #Amit code
"query": {
"bool": {
"should": [
{
"match": {
"name" : "covid",
"boost" : 0.5
}
},
{
"prefix": {
"name.keyword": "foo",
"boost" : 1.0
}
}
]
}
}
}```

Related

Elasticsearch case-insensitive partial match over multiple fields

I'm implementing a search box in Elasticsearch and I have an Elasticsearch index with the following mappings:
{
"mappings": {
"properties": {
"name": {
"type": "text"
},
"brand": {
"type": "text"
}
}
}
}
And I'd like, quite simply, to do a query such as (in SQL):
SELECT * FROM <table> WHERE brand ILIKE '%test%' OR name ILIKE '%test%';
I've tried a query such as:
{
"query": {
"query_string": {
"query": "*test*",
"fields": ["brand", "name"]
}
}
}
and that gives me my desired result, however, I've noticed that the docs recommend not using query_string for a search box as it can lead to performance issues.
I then tried a multi_match query:
{
"query": {
"multi_match" : {
"query": "test"
}
}
}
But that yielded no results. Further, when I used an ngram tokenizer, it returned all documents all the time.
I've consulted countless resources on this and even on StackOverflow there are countless unanswered questions regarding this topic. Could somebody explain how this is achieved in the Elasticsearch world, or am I simply using the wrong tool for the job? Thanks.
Since you have not provided the sample documents, I have created complete example, what you are trying to do is very much possible in Elasticsearch, with simple boolean should wildcard queries as shown below
{
"query": {
"bool": {
"should": [
{
"wildcard": {
"name.keyword": {
"value": "*test*"
}
}
},
{
"wildcard": {
"brand.keyword": {
"value": "*test*"
}
}
}
],
"minimum_should_match": 1,
"boost": 1.0
}
}
}
You can test above query on below sample documents
{
"brand" : "test",
"name" : "name foo according to use"
}
{
"brand" : "barand name is foo",
"name" : "name foo according to use"
}
{
"brand" : "barand name is test",
"name" : "name tested according to use"
}
{
"brand" : "barand name is testing",
"name" : "test the name"
}
on above 4 sample documents, query returns below documents
"hits": [
{
"_index": "73885469",
"_id": "1",
"_score": 2.0,
"_source": {
"brand": "barand name is testing",
"name": "test the name"
}
},
{
"_index": "73885469",
"_id": "2",
"_score": 2.0,
"_source": {
"brand": "barand name is test",
"name": "name tested according to use"
}
},
{
"_index": "73885469",
"_id": "4",
"_score": 1.0,
"_source": {
"brand": "test",
"name": "name foo according to use"
}
}
]
Which is i believe your expected documents

Elastic search partial match but strict phrase matching

I'm looking for a way to fuzzy partial match against a field where the words match, however I want to also add in strict phrase matching.
i.e. say I have fields such as
foo bar
bar foo
I would like to achieve the following search behaviour:
If I search foo, I would like to return back both results.
If I search ba, I would like to return back both results.
If I search bar foo, I would like to only return back one result.
If I search bar foo foo, I don't want to return any results.
I would also like to add in single character fuzziness matching, so if a foo is mistyped as fbo then it would return back both results.
My current search and index analyzer uses an edge_gram tokenizer and is working fairly well, except if any gram matches, it will return the results regardless if the following words match. i.e. my search would return the back the following result for the search bar foo buzz
foo bar
bar foo
My tokenzier:
ngram_tokenizer: {
type: "edge_ngram",
min_gram: "2",
max_gram: "15",
token_chars: ['letter', 'digit', 'punctuation', 'symbol'],
},
My analyzer:
nGram_analyzer: {
filter: [
lowercase,
"asciifolding"
],
type: "custom",
tokenizer: "ngram_tokenizer"
},
My field mapping:
type: "search_as_you_type",
doc_values: false,
max_shingle_size: 3,
analyzer: "nGram_analyzer"
One way to achieve all your requirements is to use span_near query
Span near query are much longer, but these are suitable for doing phrase match along with fuzziness parameter
Adding a working example with index data, search queries and search results
Index Mapping:
{
"mappings": {
"properties": {
"title": {
"type": "text"
}
}
}
}
Index Data:
{
"title":"bar foo"
}
{
"title":"foo bar"
}
Search Queries:
If I search foo, I would like to return back both results.
{
"query": {
"bool": {
"must": [
{
"span_near": {
"clauses": [
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "foo",
"fuzziness": 2
}
}
}
}
}
],
"slop": 0,
"in_order": true
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "67205552",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "bar foo"
}
},
{
"_index": "67205552",
"_type": "_doc",
"_id": "1",
"_score": 0.18232156,
"_source": {
"title": "foo bar"
}
}
]
If I search ba, I would like to return back both results.
{
"query": {
"bool": {
"must": [
{
"span_near": {
"clauses": [
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "ba",
"fuzziness": 2
}
}
}
}
}
],
"slop": 0,
"in_order": true
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "67205552",
"_type": "_doc",
"_id": "2",
"_score": 0.18232156,
"_source": {
"title": "bar foo"
}
},
{
"_index": "67205552",
"_type": "_doc",
"_id": "1",
"_score": 0.18232156,
"_source": {
"title": "foo bar"
}
}
]
If I search bar foo foo, I don't want to return any results.
{
"query": {
"bool": {
"must": [
{
"span_near": {
"clauses": [
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "bar",
"fuzziness": 2
}
}
}
}
},
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "foo",
"fuzziness": 2
}
}
}
}
},
{
"span_multi": {
"match": {
"fuzzy": {
"title": {
"value": "foo",
"fuzziness": 2
}
}
}
}
}
],
"slop": 0,
"in_order": true
}
}
]
}
}
}
Search Result will be empty

search first element of a multivalue text field in elasticsearch

I want to search first element of array in documents of elasticsearch, but I can't.
I don't find it that how can I search.
For test, I created new index with fielddata=true, but I still didn't get the response that I wanted
Document
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
Values
name : ["John", "Doe"]
My request
{
"query": {
"bool" : {
"must" : {
"script" : {
"script" : {
"source": "doc['name'][0]=params.param1",
"params" : {
"param1" : "john"
}
}
}
}
}
}
}
Incoming Response
"reason": "Fielddata is disabled on text fields by default. Set fielddata=true on [name] in order to load fielddata in memory by uninverting the inverted index. Note that this can however use significant memory. Alternatively use a keyword field instead."
You can use the following script that is used in a search request to return a scripted field:
{
"script_fields": {
"firstElement": {
"script": {
"lang": "painless",
"inline": "params._source.name[0]"
}
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64391432",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"fields": {
"firstElement": [
"John" <-- note this
]
}
}
]
You can use a Painless script to create a script field to return a customized value for each document in the results of a query.
You need to use equality equals operator '==' to COMPARE two
values where the resultant boolean type value is true if the two
values are equal and false otherwise in the script query.
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"mappings":{
"properties":{
"name":{
"type":"text",
"fielddata":true
}
}
}
}
Index data:
{
"name": [
"John",
"Doe"
]
}
Search Query:
{
"script_fields": {
"my_field": {
"script": {
"lang": "painless",
"source": "params['_source']['name'][0] == params.params1",
"params": {
"params1": "John"
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 1.0,
"fields": {
"my_field": [
true <-- note this
]
}
}
]
Arrays of objects do not work as you would expect: you cannot query
each object independently of the other objects in the array. If you
need to be able to do this then you should use the nested data type
instead of the object data type.
You can use the script as shown in my another answer if you want to just compare the value of the first element of the array to some other value. But based on your comments, it looks like your use case is quite different.
If you want to search the first element of the array you need to convert your data, into nested form. Using arrays of object at search time you can’t refer to “the first element” or “the last element”.
Adding a working example with index data, mapping, search query, and search result
Index Mapping:
{
"mappings": {
"properties": {
"name": {
"type": "nested"
}
}
}
}
Index Data:
{
"booking_id": 2,
"name": [
{
"first": "John Doe",
"second": "abc"
}
]
}
{
"booking_id": 1,
"name": [
{
"first": "Adam Simith",
"second": "John Doe"
}
]
}
{
"booking_id": 3,
"name": [
{
"first": "John Doe",
"second": "Adam Simith"
}
]
}
Search Query:
{
"query": {
"nested": {
"path": "name",
"query": {
"bool": {
"must": [
{
"match_phrase": {
"name.first": "John Doe"
}
}
]
}
}
}
}
}
Search Result:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 0.9400072,
"_source": {
"booking_id": 2,
"name": [
{
"first": "John Doe",
"second": "abc"
}
]
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "3",
"_score": 0.9400072,
"_source": {
"booking_id": 3,
"name": [
{
"first": "John Doe",
"second": "Adam Simith"
}
]
}
}
]

How to search on queried documents in Elasticsearch?

I am a newbie in Elasticsearch and I am facing a problem. My task is searching on a set of documents. For example, I have data with struct like this:
type Doc struct{
id string
project_id string
code string
name string
status string
}
But the difficult thing is I what to get all the documents with project_id=abc then search on them by any other fields (code,name,status) that match the keyword 'test' (for example). How can I do that in Elasticsearch query, please help me!
Thanks.
You can use a boolean query that matches documents matching
boolean combinations of other queries.
must is the same as logical AND operator and should is the same as logical OR operator
Adding a working example with index data, search query, and search result
Index Data:
{
"id": 2,
"project_id":"abc",
"code": "a",
"name":"bhavya",
"status":"engineer"
}
{
"id": 1,
"project_id":"abc",
"code": "a",
"name":"bhavya",
"status":"student"
}
{
"id": 3,
"project_id":"def",
"code": "a",
"name":"deeksha",
"status":"engineer"
}
Search Query:
The given query satisfies the condition that "project_id" = "abc" AND "name" : "bhavya" AND "status":"student"
{
"query": {
"bool": {
"must": [
{
"match": {
"project_id": "abc"
}
},
{
"match": {
"name": "bhavya"
}
},
{
"match": {
"status": "student"
}
}
]
}
}
}
Search Result:
"hits": [
{
"_index": "stof_64274465",
"_type": "_doc",
"_id": "1",
"_score": 1.7021472,
"_source": {
"id": 1,
"project_id": "abc",
"code": "a",
"name": "bhavya",
"status": "student"
}
}
]
I think maybe using Logstash and filtering some data can help you.
Here is my solution
GET /[index_name]/_search
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "some text here"
}
}
],
"filter": [
{
"term": {
"project_id.keyword": "abc"
}
}
]
}
}
}

Returning documents that match multiple wildcard string queries

I'm new to Elasticsearch and would greatly appreciate help on this
In the query below I only want the first document to be returned, but instead both documents are returned. How can I write a query to search for two wildcard strings on two separate fields, but only return documents that match?
I think what's being returned currently is score dependent, but I don't need the score.
POST /pr/_doc/1
{
"type": "Type ONE",
"currency":"USD"
}
POST /pr/_doc/2
{
"type": "Type TWO",
"currency":"USD"
}
GET /pr/_search
{
"query": {
"bool": {
"must": [
{
"simple_query_string": {
"query": "Type ON*",
"fields": ["type"],
"analyze_wildcard": true
}
},
{
"simple_query_string": {
"query": "US*",
"fields": ["currency"],
"analyze_wildcard":true
}
}
]
}
}
}
Use below query which uses the default_operator: AND and query string for in depth information and further reading.
Search query
{
"query": {
"query_string": {
"query": "(Type ON*) AND (US*)",
"fields" : ["type", "currency"],
"default_operator" : "AND"
}
}
}
Index your sample docs and it returns your expected doc only:
"hits": [
{
"_index": "multiplequery",
"_type": "_doc",
"_id": "1",
"_score": 2.1823215,
"_source": {
"type": "Type ONE",
"currency": "USD"
}
}
]

Resources