Elasticsearch complex proximity query - filter

Given that I have a query like below:
council* W/5 (tip OR tips)
The above query can be translated as: Find anything that has council* and (tip OR tips) no more than 5 words apart.
So the following texts will match:
Shellharbour City Council Tip
council best tip
councils top 10 tips
But this one should not match:
... City Council at Shellharbour. There is not any good tip at all.
I need help building an Elasticsearch query for that. I was thinking about a regexp query, but I'm not sure whether there are better alternatives. Thanks

You can use a combination of the span_near query, span_multi and span_or. We can use the query below to perform the same search.
{
"query": {
"span_near": {
"clauses": [
{
"span_multi":
{
"match":
{
"prefix": { "text": "council"}
}
}
},
{
"span_or": {
"clauses": [
{
"span_term": {
"text": {
"value": "tip"
}
}
},
{
"span_term": {
"text": {
"value": "tips"
}
}
}
]
}
}
],
"slop": 5,
"in_order": true
}
}
}
The important things to look out for are the span_term clauses, which hold the text you're searching for. In this example I only had one field, called "text". slop is the number of words we allow between the terms, and in_order indicates that the order of the words matters. So "tip council" will not match, whereas "council tip" will.
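If the same kind of query is needed for other words or distances, the body above can be generated rather than written by hand. The following is only a sketch: the helper name proximity_query and its parameters are made up for illustration, and it produces nothing beyond the JSON structure shown in the answer.

```python
import json


def proximity_query(field, prefix, alternatives, slop=5, in_order=True):
    """Build a span_near body: a prefix term near any of the given exact terms.

    Hypothetical helper; the field name and the terms are whatever your index uses.
    """
    # council* -> a prefix query wrapped in span_multi so it can be used as a span
    prefix_clause = {"span_multi": {"match": {"prefix": {field: prefix}}}}
    # (tip OR tips) -> one span_term per alternative, combined with span_or
    alternatives_clause = {
        "span_or": {
            "clauses": [
                {"span_term": {field: {"value": term}}} for term in alternatives
            ]
        }
    }
    return {
        "query": {
            "span_near": {
                "clauses": [prefix_clause, alternatives_clause],
                "slop": slop,
                "in_order": in_order,
            }
        }
    }


if __name__ == "__main__":
    # Prints the same structure as the hand-written query in the answer.
    print(json.dumps(proximity_query("text", "council", ["tip", "tips"]), indent=2))
```

Passing in_order=False would also accept the terms in the reverse order, which may be closer to what W/5 means in some query languages.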

Related

Match query fuzzily to an array of candidates

I have an index in elastic with the following document structure:
{
"questions": [
"What is your name?",
"How are you called?",
"What should I call you?",
...
],
"answer": "<answer>"
}
I would like to match queries to one of the entries in the questions array.
For example, the query "What's your name?"
The returning document should be the one with the closest matching entry of questions in all the documents in the index.
I have tried:
{
"query": {
"match": { "questions": { "query": "<question>", "fuzziness": "auto" } }
}
}
But that sometimes returns a "wrong" document, even when the query is exactly equal to one of the entries of questions in one of the documents.
I've also tried
{
"query": {
"match_phrase": { "questions": "<query>" }
}
}
But that doesn't allow fuzziness, and since the queries are human input, it doesn't catch enough cases.
And lastly I tried
{
"query": {
"span_near": {
"clauses": [
{ "span_multi": {
"match": {
"fuzzy": {
"questions": { "fuzziness": "auto", "value": "<first word of the query>" }
}
}
} },
{ "span_multi": {
"match": {
"fuzzy": {
"questions": { "fuzziness": "auto", "value": "<second word of the query>" }
}
}
} },
...
]
}
}
}
But that (at least as far as I seem to notice) only matches questions exactly with fuzzy words.
What I would like (at least as far as I understand), is a fuzzy TF-IDF across all entries of questions, get the best match and then rank the documents according to the best matches of one of the entries of questions (not the entirety of the questions array)
I'm a pretty inexperienced novice when it comes to Elastic, so I appreciate any tips and tricks or outright solutions you might have for me, thank you!

Counting the SEARCH term/phrase in a specific field in Elasticsearch

I have this type of data
{
"name_id": 2145,
"address": "Antartica",
"characteristics" : "He is a very nice person with very nice personality. the nicest thing about him is his nice dog"
}
Now I am running this query:
GET friends/_search
{
"query": {
"bool": {
"must": [
{"term": {
"name_id.keyword": "B08F2BWX2V"
}
},
{
"match_phrase": {
"characteristics": "nice"
}
}
]
}
}
}
Is there a way I can get the results along with the word count, i.e.,
nice : 4
There is an Elasticsearch API that can return the token count information you need: the Term vectors API.
I'm not sure if it will be exactly what you need, but the post below answers a question similar to yours:
https://stackoverflow.com/a/69734423/18778181
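As a rough sketch of how the count could be read out: a Term vectors response nests per-term statistics under term_vectors, then the field name, then terms, and each term carries a term_freq. The helper name term_frequency and the sample response below are assumptions written by hand for illustration, not output from a real cluster. Note that the counts are per token as produced by the field's analyzer, so with a non-stemming analyzer "nicest" would be counted separately from "nice".

```python
def term_frequency(termvectors_response, field, term):
    """Return how often `term` occurs in `field`, or 0 if it is absent.

    Hypothetical helper that only walks the parsed JSON of a Term vectors response.
    """
    terms = (
        termvectors_response.get("term_vectors", {})
        .get(field, {})
        .get("terms", {})
    )
    return terms.get(term, {}).get("term_freq", 0)


# Hand-written sample shaped like a Term vectors response (not real output).
sample_response = {
    "term_vectors": {
        "characteristics": {
            "terms": {
                "nice": {"term_freq": 3},
                "nicest": {"term_freq": 1},
            }
        }
    }
}

# Look up the count for one term in the sample.
print("nice :", term_frequency(sample_response, "characteristics", "nice"))
```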

Elasticsearch ordering by field value which is not in the filter

Can somebody please help me write a query that orders the result items by the value of a field when that field is not part of the query in the request? I have this query:
{
"_source": [
"ico",
"name",
"city",
"status"
],
"sort": {
"_score": "desc",
"status": "asc"
},
"size": 20,
"query": {
"bool": {
"should": [
{
"match": {
"normalized": {
"query": "idona",
"analyzer": "standard",
"boost": 3
}
}
},
{
"term": {
"normalized2": {
"value": "idona",
"boost": 2
}
}
},
{
"match": {
"normalized": "idona"
}
}
]
}
}
}
The result is sorted by the status field in ascending alphabetical order. Status contains a few values like [active, canceled, old....], and I need something like a boost for each possible value in the query, e.g. active boost 5, canceled boost 4, old boost 3, and so on. Is it possible to do this? Thanks.
You would need a custom sort using a script to achieve what you want.
I've used a generic match_all query here; you can replace it with your own query logic. The part you are looking for is the sort section of the query below.
Make sure that status is of type keyword.
Custom Sorting Based on Values
POST <your_index_name>/_search
{
"query":{
"match_all":{
}
},
"sort":[
{ "_score": "desc" },
{
"_script":{
"type":"number",
"script":{
"lang":"painless",
"source":"if(params.scores.containsKey(doc['status'].value)) { return params.scores[doc['status'].value];} return 100000;",
"params":{
"scores":{
"active":5,
"old":4,
"cancelled":3
}
}
},
"order":"desc"
}
}
]
}
In the above query, add your values to the scores section. For example, if you have a status new and you want it to have the value 2, your scores would look like this:
{
"scores":{
"active":5,
"old":4,
"cancelled":3,
"new":2
}
}
So basically the documents are first sorted by _score, and documents with equal _score are then ordered by the script sort.
Note that the script sort uses desc order, as I understand that you want to show active documents at the top, followed by the other values. Feel free to play around with it.
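If the weights come from application code, the sort section can be assembled from a plain mapping. This is a sketch under a few assumptions: the helper name status_sort and the fallback parameter are made up for illustration, the script uses the source key (which newer Elasticsearch versions expect where older ones used inline), and unknown statuses get a low fallback weight here so they sort last, which differs from the 100000 used in the query above.

```python
def status_sort(weights, fallback=0, field="status"):
    """Build a sort list: relevance first, then a per-status weight, highest first.

    Hypothetical helper; `weights` maps status values to numbers, and `fallback`
    is the weight used for statuses that are not listed.
    """
    # Painless script: look the status up in params.scores, else use the fallback.
    source = (
        "if (params.scores.containsKey(doc['{f}'].value)) "
        "{{ return params.scores[doc['{f}'].value]; }} "
        "return params.fallback;"
    ).format(f=field)
    return [
        {"_score": "desc"},
        {
            "_script": {
                "type": "number",
                "script": {
                    "lang": "painless",
                    "source": source,
                    "params": {"scores": dict(weights), "fallback": fallback},
                },
                "order": "desc",
            }
        },
    ]


# Example: the weights from the answer, embedded in a search body.
body = {
    "query": {"match_all": {}},
    "sort": status_sort({"active": 5, "old": 4, "cancelled": 3}),
}
```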
Hope this helps!

Must match multiple values

I have a query that works fine when I need the property of a document
to match just one value.
However I also need to be able to search with must with two values.
So if a banana has id 1 and a lemon has id 2 and I search for yellow
I will get both if I have 1 and 2 in the must clause.
But if I have just 1, I will only get the banana.
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [
{ "match":
{ "fruit.color": "yellow" }}
],
"must" : [
{ "match": { "fruit.id" : "1" } }
]
}
}
}
I haven't found a way to search with two values with must.
Is that possible?
If the document "must" be returned only if the id is 1 or 2, that sounds like another should clause. If I'm understanding your question properly, you want documents with either id 1 OR id 2. Additionally, if the color is yellow, give it a higher score.
Here's one way you might achieve what you're looking for:
{
"query": {
"bool": {
"should": {
"match": {
"fruit.color": "yellow"
}
},
"must": {
"bool": {
"should": [
{
"match": {
"fruit.id": "1"
}
},
{
"match": {
"fruit.id": "2"
}
}
]
}
}
}
}
}
Here I put the two match queries in the should clause of a separate bool query. This achieves the OR behavior you are looking for.
Have another look at the Bool Query documentation and take note of the nuances of should. It behaves differently by default depending on whether or not there is a sibling must clause and whether or not the bool query is being executed in filter context.
Another key option that is adjustable and can help you achieve your expected results is the minimum_should_match parameter. Have a look at this documentation page.
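To make the OR explicit rather than relying on the defaults of should, minimum_should_match can be set on the inner bool query. The sketch below only builds the request body; the helper name any_id_query and its parameters are assumptions for illustration, and the field names are taken from the question.

```python
def any_id_query(ids, color, id_field="fruit.id", color_field="fruit.color"):
    """Require at least one of `ids`; score documents with `color` higher.

    Hypothetical helper that returns a search body as a plain dict.
    """
    id_clauses = [{"match": {id_field: value}} for value in ids]
    return {
        "query": {
            "bool": {
                # The inner bool acts as "id 1 OR id 2": one should clause must match.
                "must": [
                    {"bool": {"should": id_clauses, "minimum_should_match": 1}}
                ],
                # Optional clause next to must: it only influences the score.
                "should": [{"match": {color_field: color}}],
            }
        }
    }


body = any_id_query(["1", "2"], "yellow")
```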
Instead of a match query, you could simply try the terms query for ORing between multiple terms.
Match queries are generally used for analyzed fields. For exact matching, you should use term queries.
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [
{ "match": { "fruit.color": "yellow" } }
],
"must" : [
{ "terms": { "fruit.id": ["1","2"] } }
]
}
}
}
A term or terms query is the right way to fetch an exact text or id; a match query analyzes the search text and searches inside the analyzed id or text.
Example:
id = '4'
id = '44'
Depending on how the field is analyzed, a match query for id = 4 can return both 4 and 44, since 4 matches in both. This is where the terms query comes into play.
The same search using a terms query will return only 4.
So the accepted answer is absolutely wrong. Use @Rahul's answer. There is just one more thing you need to do: index the field as a keyword instead of text.
Here is an example of indexing a field as both text and keyword (the mapping is for a flat field; for nested fields, change it accordingly).
{
"index_patterns": [ "test" ],
"mappings": {
"kb_mapping_doc": {
"_source": {
"enabled": true
},
"properties": {
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
}
Using @Rahul's answer as-is may not work, because your field might be analyzed as text.
id - accesses the text field
id.keyword - accesses the keyword field
So the query would be:
{
"from": 0,
"size": 20,
"query": {
"bool": {
"should": [{
"match": {
"color": "yellow"
}
}],
"must": [{
"terms": {
"id.keyword": ["1", "2"]
}
}]
}
}
}
So I would say the accepted answer will return false results. Please use @Rahul's answer with the corresponding mapping.

Elasticsearch query to search two word combination with space

I have an Elasticsearch query to search the data based on name.
My query is
$http.post(elasticSearchURL,{ "filter": { "and": [{ "term": { "Name": "allan" } } ] } })
The above query works fine for a single-word search, but when I give two words separated by a space it doesn't pick up any data.
My query does not work for the scenario below.
{ "filter": { "and": [{ "term": { "Name": "allan edward" } } ] } }
I don't know which keyword I have to add to cover this search scenario.
Thanks in advance
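A phrase match query is what you are looking for. A term query does not analyze the search text, so if Name is an analyzed text field, "allan edward" is compared as a whole against single indexed tokens such as allan and edward and finds nothing.
A query like the one below should work fine: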
{
"query": {
"match_phrase": {
"Name": "allan edward"
}
}
}
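Why the term filter finds nothing for two words while match_phrase does can be shown with a toy model. This is not Elasticsearch's analyzer: the analyze function below is a stand-in that assumes the Name field is lowercased and split on whitespace, and all function names are made up for illustration.

```python
def analyze(text):
    """Toy stand-in for an analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def term_matches(indexed_text, value):
    """term: the value is compared as-is against each indexed token."""
    return value in analyze(indexed_text)


def phrase_matches(indexed_text, phrase):
    """match_phrase: the phrase is analyzed too, and its tokens must be adjacent."""
    tokens = analyze(indexed_text)
    wanted = analyze(phrase)
    if not wanted:
        return False
    # Slide a window over the indexed tokens and compare it with the phrase tokens.
    return any(
        tokens[i:i + len(wanted)] == wanted
        for i in range(len(tokens) - len(wanted) + 1)
    )


name = "Allan Edward"
print(term_matches(name, "allan"))           # single token: found
print(term_matches(name, "allan edward"))    # no single token equals the whole string
print(phrase_matches(name, "allan edward"))  # both tokens found next to each other
```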
