elasticsearch highlight not working - elasticsearch

Elasticsearch 5.5
Using the examples from the documentation, I cannot get the highlight field in the result.
The documentation says that the field must be stored, but the 'title' field is already stored.
mapping
PUT my_index
{
"mappings": {
"user": {
"properties": {
"title": {
"type": "text",
"store": true
},
"date": {
"type": "date",
"store": true
},
"content": {
"type": "text"
}
}
}
}
}
indexing
PUT my_index/user/1
{
"title": "Some short title",
"date": "2015-01-01",
"content": "A very long content field..."
}
query
GET my_index/_search
{
"query" : {
"match" : {
"_all" : "short"
}
},
"highlight": {
"fields" : {
"_all" : {}
}
}
}
output
{
"took": 11,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.24257512,
"hits": [
{
"_index": "my_index",
"_type": "user",
"_id": "1",
"_score": 0.24257512,
"_source": {
"title": "Some short title",
"date": "2015-01-01",
"content": "A very long content field..."
}
}
]
}
}
There is no highlight field in the output JSON.
Something must be wrong. Please point it out; thanks in advance.

Related

ElasticSearch _parent query

Elastic documentation states that one can use the _parent field in a query (see https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-parent-field.html).
However, I have been unable to get it to work. Here's the simple test:
PUT /company
{
"mappings": {
"branch": {},
"employee": {
"_parent": {
"type": "branch"
}
}
}
}
POST /company/branch/_bulk
{ "index": { "_id": "london" }}
{ "name": "London Westminster", "city": "London", "country": "UK" }
{ "index": { "_id": "liverpool" }}
{ "name": "Liverpool Central", "city": "Liverpool", "country": "UK" }
{ "index": { "_id": "paris" }}
{ "name": "Champs Élysées", "city": "Paris", "country": "France" }
PUT /company/employee/1?parent=london
{
"name": "Alice Smith",
"dob": "1970-10-24",
"hobby": "hiking"
}
Verifying that the employees have a _parent field:
GET /company/employee/_search
{
"query": {
"match_all": {}
}
}
returns
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "company",
"_type": "employee",
"_id": "1",
"_score": 1,
"_routing": "london",
"_parent": "london",
"_source": {
"name": "Alice Smith",
"dob": "1970-10-24",
"hobby": "hiking"
}
}
]
}
}
But the following:
GET /company/employee/_search
{
"query": {
"term": {
"_parent":"london"
}
}
}
returns:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
Using "has_parent" works, but why doesn't _parent work, as the docs state it should?
Here's the query using has_parent that works:
GET /company/employee/_search
{
"query": {
"has_parent": {
"parent_type":"branch",
"query":{
"match_all": {}
}
}
}
}
What am I missing? Using ElasticSearch 5.0.2.
It's a documentation bug. According to the breaking changes in 5.0, the _parent field is no longer indexed and hence it is not possible to run a term query on that field. You either need to use the has_parent query or the new parent_id query to find that child document:
POST /company/employee/_search
{
"query": {
"parent_id": {
"type": "employee",
"id": "london"
}
}
}
For those who want to follow, I've filed an issue to report this and it got fixed. The updated documentation will soon be available.
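The two working alternatives named above can be sketched as plain request bodies. This is a minimal sketch that only builds the JSON; the helper names `parent_id_query` and `has_parent_query` are hypothetical, and the `match` on `city` is an illustrative inner query, not something from the original post. Sending the body to the cluster is left out.

```python
import json


def parent_id_query(child_type: str, parent_id: str) -> dict:
    """Body for a parent_id query: children of `child_type` whose parent is `parent_id`."""
    return {"query": {"parent_id": {"type": child_type, "id": parent_id}}}


def has_parent_query(parent_type: str, inner_query: dict) -> dict:
    """Body for a has_parent query: children whose parent matches `inner_query`."""
    return {"query": {"has_parent": {"parent_type": parent_type, "query": inner_query}}}


if __name__ == "__main__":
    # Same request as the parent_id example above: note that "type" is the CHILD type.
    print(json.dumps(parent_id_query("employee", "london"), indent=2))
    # Illustrative has_parent variant that narrows the parents by a field value.
    print(json.dumps(has_parent_query("branch", {"match": {"city": "London"}}), indent=2))
```

The point to notice is that `parent_id` takes the child type ("employee"), while `has_parent` takes the parent type ("branch").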

Elastic Search : Restricting the search result in array

My index metadata :
{
"never": {
"aliases": {},
"mappings": {
"userDetails": {
"properties": {
"Residence_address": {
"type": "nested",
"include_in_parent": true,
"properties": {
"Address_type": {
"type": "string",
"analyzer": "standard"
},
"Pincode": {
"type": "string",
"analyzer": "standard"
},
"address": {
"type": "string",
"analyzer": "standard"
}
}
}
}
}
},
"settings": {
"index": {
"creation_date": "1468850158519",
"number_of_shards": "5",
"number_of_replicas": "1",
"version": {
"created": "1060099"
},
"uuid": "v2njuC2-QwSau4DiwzfQ-g"
}
},
"warmers": {}
}
}
My setting :
POST never
{
"settings": {
"number_of_shards" : 5,
"analysis": {
"analyzer": {
"standard": {
"tokenizer": "keyword",
"filter" : ["lowercase","reverse"]
}
}
}
}
}
My data :
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.375,
"hits": [
{
"_index": "never",
"_type": "userDetails",
"_id": "1",
"_score": 0.375,
"_source": {
"Residence_address": [
{
"address": "Omega Residency",
"Address_type": "Owned",
"Pincode": "500004"
},
{
"address": "Collage of Engineering",
"Address_type": "Rented",
"Pincode": "411005"
}
]
}
}
]
}
}
My query :
POST /never/_search?pretty
{
"query": {
"match": {
"Residence_address.address": "Omega"
}
}
}
My Result :
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.375,
"hits": [
{
"_index": "never",
"_type": "userDetails",
"_id": "1",
"_score": 0.375,
"_source": {
"Residence_address": [
{
"address": "Omega Residency",
"Address_type": "Owned",
"Pincode": "500004"
},
{
"address": "Collage of Engineering",
"Address_type": "Rented",
"Pincode": "411005"
}
]
}
}
]
}
}
Is there any way to restrict my result to only object containing address = Omega Residency and NOT the other object having address = Collage of Engineering?
You can only do this with a nested query and inner_hits. I see that you have include_in_parent: true and are not using nested queries, though. If you only want the matched nested objects, you need to use inner_hits in a nested query:
GET /never/_search?pretty
{
"_source": false,
"query": {
"nested": {
"path": "Residence_address",
"query": {
"match": {
"Residence_address.address": "Omega Residency"
}
},
"inner_hits" : {}
}
}
}
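With "_source": false, the matched objects come back under each hit's inner_hits section rather than in the top-level _source. The following is a minimal sketch of pulling them out on the client side; the function name `matched_nested_objects` is hypothetical and `sample_response` is a hand-written, abbreviated response whose shape is assumed, not output captured from a real cluster.

```python
def matched_nested_objects(response: dict, path: str) -> list:
    """Collect the _source of every nested object reported under inner_hits[path]."""
    matched = []
    for hit in response.get("hits", {}).get("hits", []):
        inner = hit.get("inner_hits", {}).get(path, {})
        for inner_hit in inner.get("hits", {}).get("hits", []):
            matched.append(inner_hit["_source"])
    return matched


# Hypothetical, abbreviated response for the nested query above (assumed shape).
sample_response = {
    "hits": {
        "hits": [
            {
                "_index": "never",
                "_type": "userDetails",
                "_id": "1",
                "inner_hits": {
                    "Residence_address": {
                        "hits": {
                            "total": 1,
                            "hits": [
                                {
                                    "_source": {
                                        "address": "Omega Residency",
                                        "Address_type": "Owned",
                                        "Pincode": "500004",
                                    }
                                }
                            ],
                        }
                    }
                },
            }
        ]
    }
}

if __name__ == "__main__":
    for obj in matched_nested_objects(sample_response, "Residence_address"):
        print(obj["address"])
```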

What is the query required for fetching full-text with delimiter in elasticsearch

Assuming I have a document like this in Elasticsearch:
{
"videoName": "taylor.mp4",
"type": "long"
}
I tried full-text search using the DSL query:
{
"query": {
"match":{
"videoName": "taylor"
}
}
}
I need to get the above document, but it is not returned. If I specify taylor.mp4, the document is returned.
So I would like to know how to do a full-text search that handles delimiters.
Edit after KARTHEEK's answer:
The regexp query fetches the taylor.mp4 document. Now take the situation where the documents in the video index look like this:
{
"videoName": "Akon - smack that.mp4",
"type": "long"
}
So the query for retrieving this document can be:
{
"query": {
"match":{
"videoName": "smack that"
}
}
}
In this case the document is retrieved, because the query string contains smack; match does a full-text search and finds the document. But if I only know the keyword that, match does not return the document, and I would need to use regexp instead:
{
"query": {
"regexp":{
"videoName": "smack.* that.*"
}
}
}
On the other hand, if I switch to regexp and turn all my query strings into patterns like smack.* that.*, that does not retrieve any documents either. We also don't know which word will carry the .mp4 suffix. So my question is: I need a full-text search with match that also handles the delimiters. Is there any other way?
Edit after Richa asked for the mapping of the index
for http://localhost:9200/example/videos/_mapping
{
"example": {
"mappings": {
"videos": {
"properties": {
"query": {
"properties": {
"match": {
"properties": {
"videoName": {
"type": "string"
}
}
}
}
},
"type": {
"type": "string"
},
"videoName": {
"type": "string"
}
}
}
}
}
}
Based on the query you mentioned, we can use a regular expression to get the result. Please find the result below, and let me know if you need anything else.
curl -XGET "http://localhost:9200/test/sample/_search" -d'
{
"query": {
"regexp":{
"videoName": "taylor.*"
}
}
}'
Result:
{
"took": 22,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "sample",
"_id": "1",
"_score": 1,
"_source": {
"videoName": "taylor.mp4",
"type": "long"
}
}
]
}
}
Please use this mapping:
PUT /test_index
{
"settings": {
"number_of_shards": 1
},
"mappings": {
"doc": {
"properties": {
"videoName": {
"type": "string",
"term_vector": "yes"
}
}
}
}
}
After that you need to index a document that you mentioned earlier:
PUT test_index/doc/1
{
"videoName": "Akon - smack that.mp4",
"type": "long"
}
Output:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.15342641,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 0.15342641,
"_source": {
"videoName": "Akon - smack that.mp4",
"type": "long"
}
}
]
}
}
Query to get results:
GET /test_index/doc/1/_termvector?fields=videoName
Results:
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_version": 1,
"found": true,
"took": 1,
"term_vectors": {
"videoName": {
"field_statistics": {
"sum_doc_freq": 3,
"doc_count": 1,
"sum_ttf": 3
},
"terms": {
"akon": {
"term_freq": 1
},
"smack": {
"term_freq": 1
},
"that.mp4": {
"term_freq": 1
}
}
}
}
}
With this mapping we can search for "smack":
POST /test_index/_search
{
"query": {
"match": {
"_all": "smack"
}
}
}
Result:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.15342641,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 0.15342641,
"_source": {
"videoName": "Akon - smack that.mp4",
"type": "long"
}
}
]
}
}
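The term vectors above show why a match on "that" finds nothing: the indexed terms are akon, smack and that.mp4, so "that" is never a term on its own. The sketch below imitates that behaviour in plain Python. It is a rough approximation, not the real analyzer: `approx_standard_tokens` only reproduces the three terms seen above, and `split_on_non_alnum` stands in for a hypothetical analyzer that also splits on "." (for instance one built on a pattern-based tokenizer, which is an assumption here and not something the answers tested).

```python
import re


def approx_standard_tokens(text: str) -> list:
    """Rough stand-in for the analysis seen in the term vectors:
    lowercase, split on whitespace and hyphens, keep an inner dot."""
    return [token for token in re.split(r"[\s\-]+", text.lower()) if token]


def split_on_non_alnum(text: str) -> list:
    """Hypothetical alternative: lowercase and split on every non-alphanumeric character."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(query_word: str, tokens: list) -> bool:
    """A match-style lookup: the lowercased query word must equal an indexed term."""
    return query_word.lower() in tokens


if __name__ == "__main__":
    name = "Akon - smack that.mp4"
    for tokenizer in (approx_standard_tokens, split_on_non_alnum):
        tokens = tokenizer(name)
        print(tokenizer.__name__, tokens, "that ->", matches("that", tokens))
```

Under these assumptions, "smack" is found either way, while "that" is only found once the dot is treated as a separator.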

Elasticsearch - searching across multiple types of an index

I have indexed data in Elasticsearch. The index name is "demo".
It has two types (mappings): "user" and "blog".
The "user" type has the fields name, city, country and others; "blog" has "title", "description", "author_name", etc.
Now I want to search across the whole "demo" index. If I search for "java", it should return all documents that contain "java" in any field of any type, whether "user" or "blog".
You can use the "_all" field for that index. By default each of your fields will be included in the "_all" field for each type. Then you can just run a match query against the "_all" field. Also, when searching the index, just don't specify a type and all types will be searched.
Here is an example:
DELETE /test_index
PUT /test_index
{
"settings": {
"number_of_shards": 1
},
"mappings": {
"user": {
"properties": {
"name" : { "type": "string" },
"city" : { "type": "string" },
"country" : { "type": "string" }
}
},
"blog": {
"properties": {
"title" : { "type": "string" },
"description" : { "type": "string" },
"author_name" : { "type": "string" }
}
}
}
}
POST /test_index/_bulk
{"index":{"_index":"test_index","_type":"user"}}
{"name":"Bob","city":"New York","country":"USA"}
{"index":{"_index":"test_index","_type":"user"}}
{"name":"John","city":"Jakarta","country":"Java/Indonesia"}
{"index":{"_index":"test_index","_type":"blog"}}
{"title":"Python/ES","description":"using Python with Elasticsearch","author_name":"John"}
{"index":{"_index":"test_index","_type":"blog"}}
{"title":"Java/ES","description":"using Java with Elasticsearch","author_name":"Bob"}
POST /test_index/_search
{
"query": {
"match": {
"_all": "Java"
}
}
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.68289655,
"hits": [
{
"_index": "test_index",
"_type": "blog",
"_id": "hNJ-AOG2SbS0nw4IPBuXGQ",
"_score": 0.68289655,
"_source": {
"title": "Java/ES",
"description": "using Java with Elasticsearch",
"author_name": "Bob"
}
},
{
"_index": "test_index",
"_type": "user",
"_id": "VqfowNx8TTG69buY9Vd_MQ",
"_score": 0.643841,
"_source": {
"name": "John",
"city": "Jakarta",
"country": "Java/Indonesia"
}
}
]
}
}
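Conceptually, the "_all" field behaves like one extra field per document holding the values of all other fields, analyzed together. The following is a simplified in-memory sketch of that idea using the four documents from the bulk request above; the helper names are hypothetical, the tokenizer is a crude lowercase word split, and no scoring is attempted.

```python
import re

# The four documents from the bulk request above, as (type, source) pairs.
docs = [
    ("user", {"name": "Bob", "city": "New York", "country": "USA"}),
    ("user", {"name": "John", "city": "Jakarta", "country": "Java/Indonesia"}),
    ("blog", {"title": "Python/ES", "description": "using Python with Elasticsearch", "author_name": "John"}),
    ("blog", {"title": "Java/ES", "description": "using Java with Elasticsearch", "author_name": "Bob"}),
]


def all_field_tokens(source: dict) -> set:
    """Simplified "_all": join every field value, lowercase, split into words."""
    combined = " ".join(str(value) for value in source.values())
    return set(re.findall(r"\w+", combined.lower()))


def search_all(term: str) -> list:
    """Return (type, source) pairs whose combined tokens contain the term, across all types."""
    needle = term.lower()
    return [(doc_type, source) for doc_type, source in docs if needle in all_field_tokens(source)]


if __name__ == "__main__":
    for doc_type, source in search_all("Java"):
        print(doc_type, source)
```

As in the real response, the search for "Java" finds one "user" and one "blog" document, because no type restriction is applied.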

Issue with a basic elasticsearch "terms" query

I am trying to run a simple elasticsearch terms query as follows (using the sense chrome extension):
GET _search
{
"query": {
"terms": {
"childcareTypes": [
"SHARED_CHARGE",
"OUT_OF_SCHOOL"
],
"minimum_match": 2
}
}
}
This returns 0 hits:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}
I am not sure why, because a match_all query shows that two of the three records should match:
GET _search
{
"query": {
"match_all": {}
}
}
yields:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "bignibou",
"_type": "advertisement",
"_id": "1",
"_score": 1,
"_source": {
"id": 1,
"childcareWorkerType": "AUXILIAIRE_PARENTALE",
"childcareTypes": [
"SHARED_CHARGE",
"OUT_OF_SCHOOL"
],
"giveBath": "YES"
}
},
{
"_index": "bignibou",
"_type": "advertisement",
"_id": "2",
"_score": 1,
"_source": {
"id": 2,
"childcareWorkerType": "AUXILIAIRE_PARENTALE",
"childcareTypes": [
"SHARED_CHARGE",
"OUT_OF_SCHOOL"
],
"giveBath": "EMPTY"
}
},
{
"_index": "bignibou",
"_type": "advertisement",
"_id": "3",
"_score": 1,
"_source": {
"id": 3,
"childcareWorkerType": "AUXILIAIRE_PARENTALE",
"childcareTypes": [
"SHARED_CHARGE"
],
"giveBath": "YES"
}
}
]
}
}
and my mapping does show that the field childcareTypes is analyzed:
{
"advertisement": {
"dynamic": "false",
"properties": {
"id": {
"type": "long",
"store": "yes"
},
"childcareWorkerType": {
"type": "string",
"store": "yes",
"index": "analyzed"
},
"childcareTypes": {
"type": "string",
"store": "yes",
"index": "analyzed"
},
"giveBath": {
"type": "string",
"store": "yes",
"index": "analyzed"
}
}
}
}
Can someone please explain why my terms query returns 0 hits?
This happens because a terms query does not analyze its input, so it searches for exactly SHARED_CHARGE and OUT_OF_SCHOOL (in capital letters). Your field, however, is mapped with "index": "analyzed", which means ES uses the standard analyzer when indexing the data.
For SHARED_CHARGE ES stores shared_charge.
For OUT_OF_SCHOOL ES stores out_of_school.
You wrote that your mapping shows the field childcareTypes is analyzed.
This is exactly where your problem is: the field is analyzed, but a terms query looks up its terms directly, without analyzing them.
To be more precise, the indexed values look like this:
shared_charge
out_of_school
And your terms query searches for:
SHARED_CHARGE
OUT_OF_SCHOOL
You can check this behavior by trying this query...
POST /bignibou/_search
{
"query": {
"terms": {
"childcareTypes": [
"shared_charge",
"out_of_school"
]
}
}
}
...you will find your docs.
You should either run your previous query against a not_analyzed version of the field, or use a query from the match family.
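The mismatch described in both answers can be reproduced in a few lines. This is a simplified in-memory sketch, not Elasticsearch itself: `analyze` reduces the standard analyzer to lowercasing (the only visible effect on these particular values), and the function names `terms_query` and `match_query` are hypothetical stand-ins for the two query families.

```python
def analyze(value: str) -> str:
    """Simplified analysis step: for these values only lowercasing matters."""
    return value.lower()


# childcareTypes of the three documents from the match_all output above.
sources = {
    1: ["SHARED_CHARGE", "OUT_OF_SCHOOL"],
    2: ["SHARED_CHARGE", "OUT_OF_SCHOOL"],
    3: ["SHARED_CHARGE"],
}

# What ends up in the index for an analyzed field: the analyzed terms.
indexed_terms = {doc_id: {analyze(v) for v in values} for doc_id, values in sources.items()}


def terms_query(values: list) -> list:
    """terms-style lookup: values are compared to indexed terms as given, with no analysis."""
    wanted = set(values)
    return sorted(doc_id for doc_id, terms in indexed_terms.items() if terms & wanted)


def match_query(values: list) -> list:
    """match-style lookup: the input is analyzed first, then compared."""
    return terms_query([analyze(v) for v in values])


if __name__ == "__main__":
    print(terms_query(["SHARED_CHARGE", "OUT_OF_SCHOOL"]))   # upper-case input, no analysis
    print(terms_query(["shared_charge", "out_of_school"]))   # input already in indexed form
    print(match_query(["SHARED_CHARGE", "OUT_OF_SCHOOL"]))   # input analyzed before lookup
```

On a not_analyzed field the indexed terms would keep their original case, and the upper-case terms query would then match as written.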
