Elasticsearch newbie here. I'm trying to look up documents that have foo in their name, but I want to prioritize the ones that also have bar, i.e. those with bar should be at the top of the list. The result doesn't have the ones with bar at the top. boost doesn't seem to have any effect here; likely I'm not understanding how boost works. Appreciate any help.
query: {
bool: {
should: [
{
query_string: {
query: `name:foo*bar*`,
boost: 5
}
},
{
query_string: {
query: `name:*foo*`,
}
}
]
}
}
Sample document structure:
{
"name": "foos, one two three",
"type": "car",
"age": 10
}
{
"name": "foos, one two bar three",
"type": "train",
"age": 30
}
Index mapping
{
"detail": {
"mappings": {
"properties": {
"category": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"servings": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
Try switching the order for the query like so:
query: {
bool: {
should: [
{
query_string: {
query: `name:*foo*`,
}
},
{
query_string: {
query: `name:foo*bar*`,
boost: 5
}
}
]
}
}
It should work, but if not, you might need to do a nested search.
Search against the keyword field.
If you run only the first part of the query ("query": "name:foo*bar*"), you will see that it returns nothing. The pattern is matched against the individual tokens produced by the analyzer rather than against the whole string.
The text "foos, one two bar three" generates tokens like ["foos","one","two","bar","three"], and the query looks for "foo*bar*" in each individual token, hence no result. Keyword fields are stored as-is, so the search runs against the entire text.
{
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "name.keyword:foo*bar*",
"boost": 5
}
},
{
"query_string": {
"query": "name.keyword:*foo*"
}
}
]
}
}
}
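To see why the boost has no visible effect on the text field, here is a rough Python sketch, not Elasticsearch itself. It assumes the standard analyzer can be approximated by lowercasing and splitting on non-word characters, and that each matching wildcard clause contributes a constant score equal to its boost. The doc labels "car" and "train" are just the type values from the sample documents.

```python
import re
from fnmatch import fnmatchcase

# Sample documents from the question, keyed by their "type" for readability
DOCS = {
    "car": "foos, one two three",
    "train": "foos, one two bar three",
}

# (wildcard pattern, boost) pairs mirroring the two should clauses
CLAUSES = [("foo*bar*", 5.0), ("*foo*", 1.0)]

def tokens(text):
    # Assumption: crude stand-in for the standard analyzer
    return re.findall(r"\w+", text.lower())

def score(indexed_terms):
    # A should clause adds its boost if any indexed term matches its pattern
    return sum(
        boost
        for pattern, boost in CLAUSES
        if any(fnmatchcase(term, pattern) for term in indexed_terms)
    )

# text field: the patterns are tested against each token
text_scores = {doc: score(tokens(name)) for doc, name in DOCS.items()}
# keyword field: the patterns are tested against the whole string
keyword_scores = {doc: score([name]) for doc, name in DOCS.items()}

print(text_scores)
print(keyword_scores)
```

On the text field both documents only match the second clause and tie, while on the keyword field the document containing bar also picks up the boosted clause.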
Wildcard queries can use a lot of memory and don't scale well, so it is better to avoid them. If foo and bar appear at the start of words, you can use a prefix query:
{
"query": {
"bool": {
"should": [
{
"prefix": {
"name": "foo"
}
},
{
"prefix": {
"name": "bar"
}
}
]
}
}
}
You can also explore ngrams.
Related
Hi, my Elasticsearch index has the following mapping:
"userId": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"userId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"userName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
Also my search query looks like this
GET : http://localhost:5000/questions/_search
Body is
{
"query": {
"bool": {
"filter": [
{ "term": { "userId.userId": "testuser#demo.com"
}}
]
}
}
}
I am always getting 0 hits. Is there a better way to query multi-value JSON?
The userId.userId field is of type text. If no analyzer is defined, Elasticsearch uses the standard analyzer by default. This tokenizes testuser#demo.com into:
{
"tokens": [
{
"token": "testuser",
"start_offset": 0,
"end_offset": 8,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "demo.com",
"start_offset": 9,
"end_offset": 17,
"type": "<ALPHANUM>",
"position": 1
}
]
}
You need to query the "userId.userId.keyword" sub-field instead of userId.userId (notice the ".keyword" suffix). A keyword field is not analyzed, so the whole value is indexed as a single term.
You are getting 0 hits because the term query does not analyze its input and always looks for an exactly matching term. Since the text field was indexed with the standard analyzer (the default), the index holds the tokens testuser and demo.com rather than the full string testuser#demo.com, so nothing matches.
{
"query": {
"bool": {
"filter": [
{
"term": {
"userId.userId.keyword": "testuser#demo.com"
}
}
]
}
}
}
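As a rough illustration of the difference, the following Python sketch models each field as a set of indexed terms and a term query as an exact set lookup. The regular expression is only an approximation of the standard analyzer, chosen so that it reproduces the two tokens shown above; it is not how Elasticsearch tokenizes in general.

```python
import re

def standard_tokens(text):
    # Assumption: lowercase, split on '#', keep dots between alphanumerics
    return re.findall(r"[a-z0-9]+(?:\.[a-z0-9]+)*", text.lower())

value = "testuser#demo.com"

text_terms = set(standard_tokens(value))  # what userId.userId indexes
keyword_terms = {value}                   # what userId.userId.keyword indexes

def term_query(indexed_terms, term):
    # A term query does not analyze its input: it is an exact lookup
    return term in indexed_terms

print(standard_tokens(value))
print(term_query(text_terms, value))
print(term_query(keyword_terms, value))
```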
If you want to search for multiple values, use the terms query:
{
"query": {
"bool": {
"filter": [
{
"terms": {
"userId.userId.keyword": [
"testuser#demo.com",
"abc.com"
]
}
}
]
}
}
}
Update 1:
You can use the must_not clause along with the term query to get all records that have userId not equal to testuser#demo.com
{
"query": {
"bool": {
"must_not": {
"term": {
"userId.userId.keyword": "testuser#demo.com"
}
}
}
}
}
The terms query returns documents that contain one or more exact terms in a provided field. It is the same as the term query, except that you can search for multiple values.
{
"query": {
"terms": {
"userId.userId.keyword": [ "testuser#demo.com", "other#demo.com" ],
"boost": 1.0
}
}
}
{
"myindex": {
"mappings": {
"properties": {
"city": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
I tried to update it using the PUT request below on the index, but I still get the above output from _mapping:
{
"_doc" : {
"properties" : {
"city" : {"type" : "text"}
}
}
}
I am not able to query with inexact words because its type is "keyword". In the query below, the actual value in the record is "Mumbai":
{
"query": {
"bool": {
"must": {
"match": {
"city": {
"query": "Mumbi",
"minimum_should_match": "10%"
}
}
}
}
}
}
The mapping below (the one shared in the question) stores 'city' as text and 'city.keyword' as a keyword.
{
"myindex": {
"mappings": {
"properties": {
"city": {
"type": "text", // ==========> Store city as text
"fields": {
"keyword": {
"type": "keyword", // =========> store city.keyword as a keyword
"ignore_above": 256
}
}
}
}
}
}
}
Yours is a use case for fuzzy search, not minimum_should_match.
Elastic blog post on fuzzy search: https://www.elastic.co/blog/found-fuzzy-search
Try the query below:
{
"query": {
"match": {
"city": {
"query": "mubai",
"fuzziness": "AUTO"
}
}
}
}
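To get a feel for why fuzziness helps here, this is a small Python sketch of the edit-distance idea. It assumes the documented AUTO thresholds (terms of 0 to 2 characters must match exactly, 3 to 5 allow one edit, longer terms allow two) and uses plain Levenshtein distance. Elasticsearch by default also counts a transposition as a single edit, which this sketch does not model.

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance (insert, delete, substitute)
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]

def auto_max_edits(term):
    # Assumption: the default AUTO thresholds (3 and 6)
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2

def fuzzy_matches(query_term, indexed_term):
    return levenshtein(query_term, indexed_term) <= auto_max_edits(query_term)

# The text field indexes "Mumbai" lowercased as "mumbai"
print(fuzzy_matches("mubai", "mumbai"))
print(fuzzy_matches("mumbi", "mumbai"))
```

Both misspellings are one insertion away from mumbai, which is within the single edit that AUTO allows for a five-character term.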
minimum_should_match
The minimum number of clauses that must match for a document to be returned.
It is a percentage of the query's clauses, not a percentage of the characters in the string. Go through the minimum_should_match documentation to frame the query so that it returns the expected results. A query that does not express what you mean will not return what you expect.
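A quick sketch of the arithmetic, assuming a positive percentage and that the computed number is rounded down as the documentation describes. The helper name required_clauses is made up for this illustration.

```python
def required_clauses(optional_clauses: int, percent: int) -> int:
    # Assumption: positive percentage, result rounded down
    return (optional_clauses * percent) // 100

one_term = required_clauses(1, 10)    # a single-term query such as "Mumbi"
four_terms = required_clauses(4, 75)  # a four-term query

print(one_term, four_terms)
```

With a single-term query there is only one clause, so 10% does not loosen anything: the term either matches or it doesn't.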
I am trying to update a particular field in documents based on some condition. In SQL terms, I want to do the following:
Update index indexname
set name = "XXXXXX"
where source: file and name : "YYYYYY"
I am using the request below to update all the documents, but I am not able to add any condition.
POST indexname/_update_by_query
{
"query": {
"term": {
"name": "XXXXX"
}
}
}
Here is the template, I am using:
{
"indexname": {
"mappings": {
"idxname123": {
"_all": {
"enabled": false
},
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"date1": {
"type": "date",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"source": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
Could someone guide me on how to add a condition on source and name, as mentioned above?
Thanks,
Babu
You can use the query below to do what you are looking for. I'm assuming name and source are fields in your index.
POST <your_index_name>/_update_by_query
{
"script": {
"inline": "ctx._source.name = 'XXXXX'",
"lang": "painless"
},
"query": {
"bool": {
"must": [
{
"term": {
"name.keyword": {
"value": "YYYYY"
}
}
},
{
"term": {
"source.keyword": {
"value": "file"
}
}
}
]
}
}
}
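Conceptually, the request combines a WHERE part (the bool/must clauses, which all have to match) with a SET part (the script). The Python sketch below mimics that on an in-memory list. The sample documents and the helper update_by_query are made up for illustration and do not call Elasticsearch.

```python
# Hypothetical documents standing in for the index contents
docs = [
    {"name": "YYYYY", "source": "file"},
    {"name": "YYYYY", "source": "db"},
    {"name": "other", "source": "file"},
]

def update_by_query(docs, conditions, new_name):
    # "must": every condition has to hold, like AND in the SQL WHERE clause
    updated = 0
    for doc in docs:
        if all(doc.get(field) == value for field, value in conditions.items()):
            doc["name"] = new_name  # plays the role of ctx._source.name = '...'
            updated += 1
    return updated

updated = update_by_query(docs, {"name": "YYYYY", "source": "file"}, "XXXXX")
print(updated, docs)
```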
You can use any of the Full Text Queries or Term Queries inside the Bool Query for searching, updating, or deleting.
Do spend some time going through them.
Note: Use Term Queries only if your field's datatype is keyword. With the mapping in the question, that means name.keyword and source.keyword rather than the analyzed text fields name and source.
Hope this helps!
I have a document with a nested field and I'm having some trouble getting highlighting to work. Why am I not getting highlighting when my term query contains pointy brackets (<>)?
We have two fields in a nested mapping containing similar data:
"value": {
"type": "keyword",
"normalizer": "lowercase"
},
"valueWithQualifier": {
"type": "keyword",
"normalizer": "lowercase"
}
The lowercase normalizer uses the filters ["asciifolding", "lowercase"]
The value is generally an alphanumeric string but the valueWithQualifier takes the form value<qualifier>. When I execute a term query on the value field, it generally returns highlighting information. When I execute a term query on the valueWithQualifier field, I never get highlighting info.
{
"query": {
"nested": {
"path": "assoc",
"query": {
"term": {
"assoc.value": "123abc"
}
},
"inner_hits": {
"highlight": {
"fields": {
"assoc.value*": {}
}
}
}
}
}
}
This returns an inner hit with a highlight:
"highlight": {
"assoc.value": [
"<em>123abc</em>"
]
}
However, this query returns the inner_hit but no highlighting:
{
"query": {
"nested": {
"path": "assoc",
"query": {
"term": {
"assoc.valueWithQualifier": "123abc<qual>"
}
},
"inner_hits": {
"highlight": {
"fields": {
"assoc.value*": {}
}
}
}
}
}
}
However, this does return the highlighting (but I'd rather use a term query due to efficiency):
{
"query": {
"nested": {
"path": "assoc",
"query": {
"prefix": {
"assoc.valueWithQualifier": "123abc"
}
},
"inner_hits": {
"highlight": {
"fields": {
"assoc.value*": {}
}
}
}
}
}
}
"highlight": {
"assoc.valueWithQualifier": [
"<em>123abc<qual></em>"
]
}
And before someone asks, I have tried adding "encoder": "html" to the highlight.
It turns out this is a bug that was fixed in ES 6.2 (https://github.com/elastic/elasticsearch/pull/27604).
I have the following mapping:
PUT /test_products
{
"mappings": {
"_doc": {
"properties": {
"type": {
"type": "keyword"
},
"name": {
"type": "text"
},
"entity_id": {
"type": "integer"
},
"weighted": {
"type": "integer"
},
"product_relation": {
"type": "join",
"relations": {
"window": "simple"
}
}
}
}
}
}
I want to get "window" products with all their "simple"s, but only where one or more "simple"s have the property "weighted" = 1.
I wrote the following query:
GET test_products/_search
{
"query": {
"has_child": {
"type": "simple",
"query": {
"term": {
"weighted": 1
}
},
"inner_hits": {}
}
}
}
But I get "window"s with only the "simple"s that match the term. In other words, I want to filter the "window" list by a property of their "simple"s and get all matching "window"s with all of their "simple"s. Is it possible in one query without "nested"? Or do I have to run several queries?
OK. Luckily, I only need to get one "window" product with all its children by its ID, so I found the parent_id query, which can help me with this task.
Now I have following query:
GET test_products/_search
{
"query": {
"parent_id": {
"type": "simple",
"id": "window-1"
}
}
}
Unfortunately, I have to execute 2 queries (has_child and then parent_id) instead of one but it's OK for me.
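For anyone wondering why two queries are needed, here is a rough in-memory Python sketch of what each one selects. The documents and the IDs simple-1, simple-2, window-2 and simple-3 are hypothetical, and the two functions only imitate the selection logic of has_child and parent_id.

```python
# Hypothetical parent/child documents using the window -> simple relation
docs = [
    {"id": "window-1", "relation": "window", "parent": None},
    {"id": "simple-1", "relation": "simple", "parent": "window-1", "weighted": 1},
    {"id": "simple-2", "relation": "simple", "parent": "window-1", "weighted": 0},
    {"id": "window-2", "relation": "window", "parent": None},
    {"id": "simple-3", "relation": "simple", "parent": "window-2", "weighted": 0},
]

def has_child(docs, child_type, predicate):
    # Parents that have at least one child of child_type matching the predicate
    parent_ids = {
        d["parent"] for d in docs
        if d["relation"] == child_type and predicate(d)
    }
    return [d["id"] for d in docs if d["id"] in parent_ids]

def parent_id(docs, child_type, pid):
    # All children of child_type under the given parent, matching or not
    return [
        d["id"] for d in docs
        if d["relation"] == child_type and d["parent"] == pid
    ]

windows = has_child(docs, "simple", lambda d: d["weighted"] == 1)
children = parent_id(docs, "simple", "window-1")
print(windows, children)
```

The first call finds the qualifying parents, the second fetches every child of one parent, which matches the has_child then parent_id sequence described above.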