Is it impossible to index a document where a property has multiple fields, one of them being a completion type with contexts? - elasticsearch

Here is my mapping (some fields renamed/removed), I'm using ES 6.0
{
"mappings": {
"_doc" :{
"properties" : {
"username" : {
"type": "keyword",
"fields": {
"suggest" : {
"type" : "completion",
"contexts": [
{
"name": "user_id",
"type": "category"
}
]
}
}
},
"user_id": {
"type": "integer"
}
}
}
}
}
Now when I try to index a document with
PUT usernames/_doc/1
{
"username" : "JOHN",
"user_id": 1
}
OR
PUT usernames/_doc/1
{
"username" : {
"input": "JOHN",
"contexts: {
"user_id": 1
}
}
"user_id": 1
}
The first doesn't index with context and the second just fails. I've attempted to add a path like so,
{
"mappings": {
"_doc" :{
"properties" : {
"username" : {
"type": "keyword",
"fields": {
"suggest" : {
"type" : "completion",
"contexts": [
{
"name": "user_id",
"type": "category",
"path": "user_id",
}
]
}
}
},
"user_id": {
"type": "integer"
}
}
}
}
}
And attempting indexing again
PUT usernames/_doc/1
{
"username" : "JOHN",
"user_id": 1
}
But it just throws a context must be a keyword or text error. Do I have to give up and make a totally new property username-autocomplete instead? Or is there some magical way where I can have a context completion suggester and another field on the same property, and be able to index like I would other multifield properties?

The second approach is the right one (i.e. with the path inside the context), but you need to set the user_id field as a keyword and it will work:
{
"mappings": {
"_doc" :{
"properties" : {
"username" : {
"type": "keyword",
"fields": {
"suggest" : {
"type" : "completion",
"contexts": [
{
"name": "user_id",
"type": "category",
"path": "user_id",
}
]
}
}
},
"user_id": {
"type": "keyword" <--- change this
}
}
}
}
}
Then you can index your document without creating an additional field, like this:
PUT usernames/_doc/1
{
"username" : "JOHN",
"user_id": "1" <--- wrap in double quotes
}

Related

Range query on Nested Documents for Keyword Type

I have a document structure like this. For this below two documents, we have nested documents called interaction info. I just need to get only the documents that have title duration and their value is greater than 60
Here whats the catch the value field is keyword, not an integer. I know that only for the Integer Range query will get executed. Is there any possible way to find the documents that have a duration greater than 60 ( Painless Query or Script Query ). Like converting the value Field into Integer and then searching the document.
{
"key": "f07ff9ba-36e4-482a-9c1c-d888e89f926e",
"interactionInfo": [
{
"title": "duration",
"value": "11"
},
{
"title": "timetaken",
"value": "9"
},
{
"title": "talk_time",
"value": "145"
}
]
},
{
"key": "f07ff9ba-36e4-482a-9c1c-d888e89f926e",
"interactionInfo": [
{
"title": "duration",
"value": "120"
},
{
"title": "timetaken",
"value": "9"
},
{
"title": "talk_time",
"value": "60"
}
]
}
I have added script to get interactionInfo.value>"somevalue". Scripts are slow and it is better to resolve this at index time and use a range query.
Index:
{
"index15" : {
"mappings" : {
"properties" : {
"interactionInfo" : {
"type" : "nested",
"properties" : {
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"value" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"key" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
Query:
{
"query": {
"nested": {
"path": "interactionInfo",
"query": {
"bool": {
"must": [
{
"term": {
"interactionInfo.title.keyword": {
"value": "duration"
}
}
},
{
"script": {
"script": {
"source":"def val=Integer.parseInt(doc['interactionInfo.value.keyword'].value); if(val>params.value) return true; else return false;",
"params": {
"value":10
}
}
}
}
]
}
},
"inner_hits": {}
}
}
}

How to Query elasticsearch index with nested and non nested fields

I have an elastic search index with the following mapping:
PUT /student_detail
{
"mappings" : {
"properties" : {
"id" : { "type" : "long" },
"name" : { "type" : "text" },
"email" : { "type" : "text" },
"age" : { "type" : "text" },
"status" : { "type" : "text" },
"tests":{ "type" : "nested" }
}
}
}
Data stored is in form below:
{
"id": 123,
"name": "Schwarb",
"email": "abc#gmail.com",
"status": "current",
"age": 14,
"tests": [
{
"test_id": 587,
"test_score": 10
},
{
"test_id": 588,
"test_score": 6
}
]
}
I want to be able to query the students where name like '%warb%' AND email like '%gmail.com%' AND test with id 587 have score > 5 etc. The high level of what is needed can be put something like below, dont know what would be the actual query, apologize for this messy query below
GET developer_search/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "abc"
}
},
{
"nested": {
"path": "tests",
"query": {
"bool": {
"must": [
{
"term": {
"tests.test_id": IN [587]
}
},
{
"term": {
"tests.test_score": >= some value
}
}
]
}
}
}
}
]
}
}
}
The query must be flexible so that we can enter dynamic test Ids and their respective score filters along with the fields out of nested fields like age, name, status
Something like that?
GET student_detail/_search
{
"query": {
"bool": {
"must": [
{
"wildcard": {
"name": {
"value": "*warb*"
}
}
},
{
"wildcard": {
"email": {
"value": "*gmail.com*"
}
}
},
{
"nested": {
"path": "tests",
"query": {
"bool": {
"must": [
{
"term": {
"tests.test_id": 587
}
},
{
"range": {
"tests.test_score": {
"gte": 5
}
}
}
]
}
},
"inner_hits": {}
}
}
]
}
}
}
Inner hits is what you are looking for.
You must make use of Ngram Tokenizer as wildcard search must not be used for performance reasons and I wouldn't recommend using it.
Change your mapping to the below where you can create your own Analyzer which I've done in the below mapping.
How elasticsearch (albiet lucene) indexes a statement is, first it breaks the statement or paragraph into words or tokens, then indexes these words in the inverted index for that particular field. This process is called Analysis and that this would only be applicable on text datatype.
So now you only get the documents if these tokens are available in inverted index.
By default, standard analyzer would be applied. What I've done is I've created my own analyzer and used Ngram Tokenizer which would be creating many more tokens than just simply words.
Default Analyzer on Life is beautiful would be life, is, beautiful.
However using Ngrams, the tokens for Life would be lif, ife & life
Mapping:
PUT student_detail
{
"settings": {
"analysis": {
"analyzer": {
"my_analyzer": {
"tokenizer": "my_tokenizer"
}
},
"tokenizer": {
"my_tokenizer": {
"type": "ngram",
"min_gram": 3,
"max_gram": 4,
"token_chars": [
"letter",
"digit"
]
}
}
}
},
"mappings" : {
"properties" : {
"id" : {
"type" : "long"
},
"name" : {
"type" : "text",
"analyzer": "my_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"email" : {
"type" : "text",
"analyzer": "my_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"age" : {
"type" : "text" <--- I am not sure why this is text. Change it to long or int. Would leave this to you
},
"status" : {
"type" : "text",
"analyzer": "my_analyzer",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"tests":{
"type" : "nested"
}
}
}
}
Note that in the above mapping I've created a sibling field in the form of keyword for name, email and status as below:
"name":{
"type":"text",
"analyzer":"my_analyzer",
"fields":{
"keyword":{
"type":"keyword"
}
}
}
Now your query could be as simple as below.
Query:
POST student_detail/_search
{
"query": {
"bool": {
"must": [
{
"match": {
"name": "war" <---- Note this. This would even return documents having "Schwarb"
}
},
{
"match": {
"email": "gmail" <---- Note this
}
},
{
"nested": {
"path": "tests",
"query": {
"bool": {
"must": [
{
"term": {
"tests.test_id": 587
}
},
{
"range": {
"tests.test_score": {
"gte": 5
}
}
}
]
}
}
}
}
]
}
}
}
Note that for exact matches I would make use of Term Queries on keyword fields while for normal searches or LIKE in SQL I would make use of simple Match Queries on text Fields provided they make use of Ngram Tokenizer.
Also note that for >= and <= you would need to make use of Range Query.
Response:
{
"took" : 233,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 3.7260926,
"hits" : [
{
"_index" : "student_detail",
"_type" : "_doc",
"_id" : "1",
"_score" : 3.7260926,
"_source" : {
"id" : 123,
"name" : "Schwarb",
"email" : "abc#gmail.com",
"status" : "current",
"age" : 14,
"tests" : [
{
"test_id" : 587,
"test_score" : 10
},
{
"test_id" : 588,
"test_score" : 6
}
]
}
}
]
}
}
Note that I observe the document you've mentioned in your question, in my response when I run the query.
Please do read the links I've shared. It is vital that you understand the concepts. Hope this helps!

Simple elasticsearch input - Rejecting mapping update final mapping would have more than 1 type: [_doc, doc]

I'm trying to send data to elasticsearch but running into an issue where my number field only comes up as a string. These are the steps I took.
Step 1. Add index & map
PUT http://123.com:5101/core_060619/
{
"mappings": {
"properties": {
"date": {
"type": "date",
"format": "HH:mm yyyy-MM-dd"
},
"data": {
"type": "integer"
}
}
}
}
Result:
{
"acknowledged": true,
"shards_acknowledged": true,
"index": "core_060619"
}
Step 2. Add data
PUT http://123.com:5101/core_060619/doc/1
{
"test" : [ {
"data" : "119050300",
"date" : "00:00 2019-06-03"
} ]
}
Result:
{
"error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "Rejecting mapping update to [zyxnewcoreyxbl_060619] as the final mapping would have more than 1 type: [_doc, doc]"
}
],
"type": "illegal_argument_exception",
"reason": "Rejecting mapping update to [zyxnewcoreyxbl_060619] as the final mapping would have more than 1 type: [_doc, doc]"
},
"status": 400
}
You can not have more than one type of document in Elasticsearch 6.0.0+. If you set your document type to doc, then you can add another document by simply PUT http://123.com:5101/core_060619/doc/1, PUT http://123.com:5101/core_060619/doc/2 etc.
Elasticsearch 6.+
PUT core_060619/
{
"mappings": {
"doc": { //type of documents in index is 'doc'
"properties": {
"date": {
"type": "date",
"format": "HH:mm yyyy-MM-dd"
},
"data": {
"type": "integer"
}
}
}
}
}
Since we created mapping to have doc type of documents, now we can add new documents by simply adding /doc/_id:
PUT core_060619/doc/1
{
"test" : [ {
"data" : "119050300",
"date" : "00:00 2019-06-03"
} ]
}
PUT core_060619/doc/2
{
"test" : [ {
"data" : "111120300",
"date" : "10:15 2019-06-02"
} ]
}
Elasticsearch 7.+
Types are removed, but you can use custom like field(s):
PUT twitter
{
"mappings": {
"_doc": {
"properties": {
"type": { "type": "keyword" },
"name": { "type": "text" },
"user_name": { "type": "keyword" },
"email": { "type": "keyword" },
"content": { "type": "text" },
"tweeted_at": { "type": "date" }
}
}
}
}
PUT twitter/_doc/user-kimchy
{
"type": "user",
"name": "Shay Banon",
"user_name": "kimchy",
"email": "shay#kimchy.com"
}
PUT twitter/_doc/tweet-1
{
"type": "tweet",
"user_name": "kimchy",
"tweeted_at": "2017-10-24T09:00:00Z",
"content": "Types are going away"
}
GET twitter/_search
{
"query": {
"bool": {
"must": {
"match": {
"user_name": "kimchy"
}
},
"filter": {
"match": {
"type": "tweet"
}
}
}
}
}
Removal of mapping types

Elasticsearch - Can't search using suggestion field (“is not a completion suggest field”)

I'm completely new to elasticsearch and I'm trying to use elasticsearch completion suggester on an existing field called "identity.full_name", index = "search" and type = "person".
I followed the below index to change the mappings of the field.
1)
POST /search/_close
2)
POST search/person/_mapping
{
"person": {
"properties": {
"identity.full_name": {
"type": "text",
"fields":{
"suggest":{
"type":"completion"
}
}
}
}
}
}
3)
POST /search/_open
When I check the mappings at this point, using
GET search/_mapping/person/field/identity.full_name
I get the result,
{
"search": {
"mappings": {
"person": {
"identity.full_name": {
"full_name": "identity.full_name",
"mapping": {
"full_name": {
"type": "text",
"fields": {
"completion": {
"type": "completion",
"analyzer": "simple",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50
},
"keyword": {
"type": "keyword",
"ignore_above": 256
},
"suggest": {
"type": "completion",
"analyzer": "simple",
"preserve_separators": true,
"preserve_position_increments": true,
"max_input_length": 50
}
}
}
}
}
}
}
}
}
Which is suggesting that it has been updated to be a completion field.
However, when I'm querying to check if this works using,
GET search/person/_search
{
"suggest": {
"person-suggest" : {
"prefix" : "EMANNUEL",
"completion" : {
"field" : "identity.full_name"
}
}
}
}
It is giving me the error "Field [identity.full_name] is not a completion suggest field"
I'm not sure why I'm getting this error. Is there anything else I can try?
sample data:
{
"_index": "search",
"_type": "person",
"_id": "3106105149",
"_score": 1,
"_source": {
"identity": {
"id": "3106105149",
"first_name": "FLORENT",
"last_name": "TEBOUL",
"full_name": "FLORENT TEBOUL"
}
}
}
{
"_index": "search",
"_type": "person",
"_id": "125296353",
"_score": 1,
"_source": {
"identity": {
"id": "125296353",
"first_name": "CHRISTINA",
"last_name": "BHAN",
"full_name": "CHRISTINA K BHAN"
}
}
}
so when I do a GET based on prefix "CHRISTINA"
GET search/person/_search
{
"suggest": {
"person-suggest" : {
"prefix" : "CHRISTINA",
"completion" : {
"field" : "identity.full_name.suggest"
}
}
}
}
I'm getting all the results like a match_all query.
You should use it like
GET search/person/_search
{
"suggest": {
"person-suggest" : {
"prefix" : "EMANNUEL",
"completion" : {
"field" : "identity.full_name.suggest"
}
}
}
}
Mapping for GET search/_mapping/person/field/identity.full_name
{
"search" : {
"mappings" : {
"person" : {
"identity.full_name" : {
"full_name" : "identity.full_name",
"mapping" : {
"full_name" : {
"type" : "text",
"fields" : {
"suggest" : {
"type" : "completion",
"analyzer" : "simple",
"preserve_separators" : true,
"preserve_position_increments" : true,
"max_input_length" : 50
}
}
}
}
}
}
}
}
}

Elasticsearch context suggester, bool on contexts

I'm using context suggester and am wondering if we can set the scope of the context to be used for suggestions rather that using all contexts.
Currently the query needs to match all contexts. Can we add an "OR" operation on the contexts and/or specify which context to use for a particular query?
Taking the example from here :
Mapping :
PUT /venues/poi/_mapping
{
"poi" : {
"properties" : {
"suggest_field": {
"type": "completion",
"context": {
"type": {
"type": "category"
},
"location": {
"type": "geo",
"precision" : "500m"
}
}
}
}
}
}
Then I index a document :
{
"suggest_field": {
"input": ["The Shed", "shed"],
"output" : "The Shed - fresh sea food",
"context": {
"location": {
"lat": 51.9481442,
"lon": -5.1817516
},
"type" : "restaurant"
}
}
}
Query:
{
"suggest" : {
"text" : "s",
"completion" : {
"field" : "suggest_field",
"context": {
"location": {
"value": {
"lat": 51.938119,
"lon": -5.174051
}
}
}
}
}
}
If I query using only one Context ("location" in the above example) it gives an error, I need to pass both the contexts, is it possible to specify which context to use? Or pass something like a "Context_Operation" parameter set to "OR".
You have 2 choices:
First you add all available type values as default in your mapping (not scalable)
{
"poi" : {
"properties" : {
"suggest_field": {
"type": "completion",
"context": {
"type": {
"type": "category",
"default": ["restaurant", "pool", "..."]
},
"location": {
"type": "geo",
"precision" : "500m"
}
}
}
}
}
}
Second option, you add a default value to every indexed document, and you add only this value as default
Mapping:
{
"poi" : {
"properties" : {
"suggest_field": {
"type": "completion",
"context": {
"type": {
"type": "category",
"default": "any"
},
"location": {
"type": "geo",
"precision" : "500m"
}
}
}
}
}
}
Document:
{
"suggest_field": {
"input": ["The Shed", "shed"],
"output" : "The Shed - fresh sea food",
"context": {
"location": {
"lat": 51.9481442,
"lon": -5.1817516
},
"type" : ["any", "restaurant"]
}
}
}

Resources