I have an index that stores nested documents. To see the matching nested documents I added "inner_hits" to the request, but Elasticsearch returns a NullPointerException. Has anyone run into this problem?
Request to elasticsearch using Postman:
GET http://localhost/my-index/_search
{
"query": {
"nested": {
"path": "address_object",
"query": {
"bool": {
"must": {
"term": {"address_object.city": "Paris"}
}
}
},
"inner_hits" : {}
}
}
}
Response with status code 200:
{
"took": 161,
"timed_out": false,
"_shards": {
"total": 2,
"successful": 1,
"skipped": 0,
"failed": 1,
"failures": [
{
"shard": 0,
"index": "my-index",
"node": "DWdD83KaTmUiodENQkGDww",
"reason": {
"type": "null_pointer_exception",
"reason": null
}
}
]
},
"hits": {
"total": 6500039,
"max_score": 2.1761138,
"hits": []
}
}
Elasticsearch version: 6.2.4
Lucene version: 7.2.1
Update:
Mapping:
{
"my-index": {
"mappings": {
"mytype": {
"dynamic": "false",
"_source": {
"enabled": false
},
"properties": {
"adverts_count": {
"type": "integer",
"store": true
},
...
"address_object": {
"type": "nested",
"properties": {
"adverts_count": {
"type": "integer",
"store": true
},
"city": {
"type": "keyword",
"store": true
}
}
},
...
Sample document:
{
"_index": "my-index",
"_type": "mytype",
"_id": "XDWrGncBdwNBWGEagAM2",
"_score": 2.1587489,
"fields": {
"is_target_page_shown": [
0
],
"updated_at": [
1612264276
],
"is_shown": [
0
],
"nb_queries": [
1
],
"search_query": [
"phone"
],
"target_category": [
15
],
"adverts_count": [
1
]
}
}
Extra information:
If I remove "inner_hits": {} from the search request, Elasticsearch returns the matching documents (_index, _type, _id, _score), but none of the other fields (e.g. city).
Also, as suggested in the comments, I tried setting ignore_unmapped to true, but it didn't help: I get the same NullPointerException.
I tried to reproduce your issue, but you have not provided a suitable sample document (the one you posted has no address_object properties), so I used your mapping with the sample documents below.
PUT index-name/_doc/1
{
"address_object" :{
"adverts_count" : 1,
"city": "paris"
}
}
PUT index-name/_doc/2
{
"address_object" :{
"adverts_count" : 1,
"city": "blr"
}
}
Then I ran the same search you provided:
POST 71907588/_search
{
"query": {
"nested": {
"path": "address_object",
"query": {
"bool": {
"must": {
"term": {
"address_object.city": "paris"
}
}
}
},
"inner_hits": {}
}
}
}
I get a proper response that matches the document with paris as the city, as the search response shows:
"hits": [
{
"_index": "71907588",
"_id": "1",
"_score": 0.6931471,
"_source": {
"address_object": {
"adverts_count": 1,
"city": "paris"
}
},
"inner_hits": {
"address_object": {
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0.6931471,
"hits": [
{
"_index": "71907588",
"_id": "1",
"_nested": {
"field": "address_object",
"offset": 0
},
"_score": 0.6931471,
"_source": {
"city": "paris",
"adverts_count": 1
}
}
]
}
}
}
}
]
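One difference between this reproduction and the question is that the question's mapping sets "_source": {"enabled": false}, while the test index above keeps _source. Whether that is what triggers the exception on 6.2.4 is an assumption and is not confirmed anywhere in this thread. As a minimal sketch of something to try, the Python helper below (the function name is hypothetical) builds the same nested query but asks inner_hits to skip _source and return doc values instead, which keyword and integer fields have by default.

```python
import json


def nested_inner_hits_query(path, field, value, docvalue_fields):
    """Build a nested term query whose inner_hits do not load _source."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"term": {field: value}},
                "inner_hits": {
                    # Assumption: _source is disabled in the mapping, so skip it.
                    "_source": False,
                    # Read values from doc values rather than from _source.
                    "docvalue_fields": list(docvalue_fields),
                },
            }
        }
    }


body = nested_inner_hits_query(
    "address_object",
    "address_object.city",
    "Paris",
    ["address_object.city", "address_object.adverts_count"],
)
print(json.dumps(body, indent=2))
```

The printed body can be pasted into Postman or the Kibana console as the payload of the _search request.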
I'm using ElasticSearch 7.0
Given the mapping:
{
"searchquestion": {
"mappings": {
"properties": {
"server": {
"properties": {
"hostname": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
I have put the following documents into this index:
{
"server": {
"hostname": "server1-windows.loc2.uk"
}
}
{
"server": {
"hostname": "server1-windows.loc2.uk"
}
}
{
"server": {
"hostname": "server1-linux.loc1.uk"
}
}
I would like to query the exact text of the hostname. Luckily, this can be done because there is an additional keyword type field on this field.
Successful query :
{
"query": {
"bool": {
"must": [
{
"match": {
"server.hostname.keyword": {
"query": "server1-windows.loc2.uk"
}
}
}
]
}
}
}
However, I would like to extend this query string, to include another hostname to search for. In my results, I expect to have both documents returned.
My attempt:
{
"query": {
"bool": {
"must": [
{
"match": {
"server.hostname.keyword": {
"query": "server1-windows.loc2.uk server1-linux.loc1.uk",
"operator": "or"
}
}
}
]
}
}
}
This returns no hits. I suspect the default analyzer splits the query into sections, while the keyword field I'm searching holds the full string. I can't add analyzer: keyword to the query either, because server1-windows.loc2.uk server1-linux.loc1.uk as one exact string won't match anything.
How can I search for both these strings, as their complete selves?
i.e. "query": ["server1-windows.loc2.uk", "server1-linux.loc1.uk"]
I would also like to use wildcards to match any loc. I would expect
"query": ["server1-windows.*.uk"] to match both windows servers, but I get no hits.
What am I missing?
You can use a query_string query to get the result you want.
Case 1:
Query:
GET server/_search
{
"query": {
"query_string": {
"query": "(server1-windows.loc2.uk) OR (server1-linux.loc1.uk)",
"default_field": "server.hostname.keyword"
}
}
}
Output:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 3,
"relation": "eq"
},
"max_score": 0.9808291,
"hits": [
{
"_index": "server",
"_id": "3",
"_score": 0.9808291,
"_source": {
"server": {
"hostname": "server1-linux.loc1.uk"
}
}
},
{
"_index": "server",
"_id": "1",
"_score": 0.4700036,
"_source": {
"server": {
"hostname": "server1-windows.loc2.uk"
}
}
},
{
"_index": "server",
"_id": "2",
"_score": 0.4700036,
"_source": {
"server": {
"hostname": "server1-windows.loc2.uk"
}
}
}
]
}
}
Case 2: with wildcard(*)
Query:
GET server/_search
{
"query": {
"query_string": {
"query": "server1-windows.*.uk",
"default_field": "server.hostname.keyword"
}
}
}
Output:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1,
"hits": [
{
"_index": "server",
"_id": "1",
"_score": 1,
"_source": {
"server": {
"hostname": "server1-windows.loc2.uk"
}
}
},
{
"_index": "server",
"_id": "2",
"_score": 1,
"_source": {
"server": {
"hostname": "server1-windows.loc2.uk"
}
}
}
]
}
}
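As an alternative to query_string (this is not the approach used in the answer above), the same two cases can also be written with the term-level terms and wildcard queries, which take the values as given and need no query-string escaping. The sketch below builds both request bodies in Python. The helper names are made up for illustration, and the field name is the one from the question.

```python
import json

FIELD = "server.hostname.keyword"


def exact_hostnames_query(field, hostnames):
    """Match documents whose keyword field equals any of the given values."""
    return {"query": {"terms": {field: list(hostnames)}}}


def wildcard_hostname_query(field, pattern):
    """Match documents whose keyword field fits a * / ? wildcard pattern."""
    return {"query": {"wildcard": {field: {"value": pattern}}}}


either_host = exact_hostnames_query(
    FIELD, ["server1-windows.loc2.uk", "server1-linux.loc1.uk"]
)
any_location = wildcard_hostname_query(FIELD, "server1-windows.*.uk")

print(json.dumps(either_host, indent=2))
print(json.dumps(any_location, indent=2))
```

Both bodies can be sent to GET server/_search. Because terms does not score by term frequency the way query_string does, the ordering of hits may differ from the output shown above.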
I've set up a normalizer on an index field to support case-insensitive searches, but I can't seem to get it to work.
GET users/
Returns the following mapping:
{
"users": {
"aliases": {},
"mappings": {
"user": {
"properties": {
"active": {
"type": "boolean"
},
"first_name": {
"type": "keyword",
"fields": {
"normalize": {
"type": "keyword",
"normalizer": "search_normalizer"
}
}
},
},
"settings": {
"index": {
"number_of_shards": "5",
"provided_name": "users",
"creation_date": "1567936315432",
"analysis": {
"normalizer": {
"search_normalizer": {
"filter": [
"lowercase"
],
"type": "custom"
}
}
},
"number_of_replicas": "1",
"uuid": "5SknFdwJTpmF",
"version": {
"created": "6040299"
}
}
}
}
}
Although first_name is normalized to lowercase, queries on the first_name field are case sensitive.
Using the following query for a user with first name Dave
GET users/_search
{
"query": {
"bool": {
"should": [
{
"regexp": {
"first_name": {
"value": ".*dave.*"
}
}
}
]
}
}
}
GET users/_analyze
{
"analyzer" : "standard",
"text": "Dave"
}
returns
{
"tokens": [
{
"token": "dave",
"start_offset": 0,
"end_offset": 4,
"type": "<ALPHANUM>",
"position": 0
}
]
}
Although "Dave" is tokenized to "dave" the following query
GET users/_search
{
"query": {
"match": {
"first_name": "dave"
}
}
}
Returns no hits.
Is there an issue with my current mapping? or the query?
I think you have missed first_name.normalize in the query.
Indexing Records
{"index": {}}
{"first_name": "Daveraj"}
{"index": {}}
{"first_name": "RajdaveN"}
{"index": {}}
{"first_name": "Dave"}
Query
{
"query": {
"bool": {
"should": [
{
"regexp": {
"first_name.normalize": {
"value": ".*dave.*"
}
}
}
]
}
}
}
Result
{
"took": 10,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1.0,
"hits": [
{
"_index": "test3",
"_type": "test3_type",
"_id": "M8-lEG0BLCpzI1hbBWYC",
"_score": 1.0,
"_source": {
"first_name": "Dave"
}
},
{
"_index": "test3",
"_type": "test3_type",
"_id": "Mc-lEG0BLCpzI1hbBWYC",
"_score": 1.0,
"_source": {
"first_name": "Daveraj"
}
},
{
"_index": "test3",
"_type": "test3_type",
"_id": "Ms-lEG0BLCpzI1hbBWYC",
"_score": 1.0,
"_source": {
"first_name": "RajdaveN"
}
}
]
}
}
You have created a normalized multi-field, first_name.normalize, but you are searching on the original field first_name. That field is a plain keyword field with no normalizer, so its values are indexed verbatim and matched case-sensitively.
The examples given here might help:
https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html
You need to explicitly specify the multi-field you want to search on. Note that even though a multi-field can't have its own content, it may index different terms from its parent (although not always), because it can be analyzed with different analyzers and char/token filters.
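To make the difference between the two fields concrete, here is a small Python simulation. It is only a model of the behaviour described above and is not Elasticsearch code. It assumes the search_normalizer from the question's settings does nothing but lowercase, and that the normalizer is also applied to the query text when the normalized field is searched.

```python
def search_normalizer(value):
    """Stand-in for the 'lowercase' normalizer defined in the index settings."""
    return value.lower()


def indexed_terms(value):
    """Terms stored for each field when a first_name value is indexed."""
    return {
        "first_name": value,  # plain keyword: stored verbatim
        "first_name.normalize": search_normalizer(value),
    }


def match(doc_value, field, query):
    """Model of a match query against one of the two keyword fields."""
    if field == "first_name.normalize":
        # The normalizer is applied to the query text as well.
        query = search_normalizer(query)
    return indexed_terms(doc_value)[field] == query


print(match("Dave", "first_name", "dave"))            # the query from the question
print(match("Dave", "first_name.normalize", "dave"))  # same query on the sub-field
print(match("Dave", "first_name.normalize", "DAVE"))
```

In this model only the queries against first_name.normalize find "Dave", which is why the match query in the question needs to target that sub-field.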
I am trying to search for all the unique names in the index test_nested.
GET test_nested/_mappings
{
"test_nested": {
"mappings": {
"my_type": {
"properties": {
"group": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"user": {
"type": "nested",
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
}
GET test_nested/_search
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 4,
"max_score": 1,
"hits": [
{
"_index": "test_nested",
"_type": "my_type",
"_id": "AWG5iVBz4bQsVnslc9gL",
"_score": 1,
"_source": {
"group": "fans",
"user": [
{
"name": "Linux"
},
{
"name": "Android (operating system)"
},
{
"name": "Widows 10"
}
]
}
},
{
"_index": "test_nested",
"_type": "my_type",
"_id": "AWG5ieKW4bQsVnslc9gM",
"_score": 1,
"_source": {
"group": "fans",
"user": [
{
"name": "Bitcoin"
},
{
"name": "PHP"
},
{
"name": "Microsoft Windows"
}
]
}
},
{
"_index": "test_nested",
"_type": "my_type",
"_id": "AWG5irrV4bQsVnslc9gN",
"_score": 1,
"_source": {
"group": "fans",
"user": [
{
"name": "Windows XP"
}
]
}
},
{
"_index": "test_nested",
"_type": "my_type",
"_id": "1",
"_score": 1,
"_source": {
"group": "fans",
"user": [
{
"name": "iOS"
},
{
"name": "Android (operating system)"
},
{
"name": "Widows 10"
},
{
"name": "Widows XP"
}
]
}
}
]
}
}
I want all the unique names matching a term, i.e. if I search for "wi*" then I should get [Microsoft Windows, Widows 10, Windows XP].
I don't know exactly what you mean, but I use this query to list all statuses:
GET order/default/_search
{
"size": 0,
"aggs": {
"status_terms": {
"terms": {
"field": "status.keyword",
"missing": "N/A",
"min_doc_count": 0,
"order": {
"_key": "asc"
}
}
}
}
}
My model has a status field, and this query lists all of its statuses.
This is a bucket aggregation.
One of the fields in the result is:
sum_other_doc_count - Elasticsearch returns only the top unique terms, so if you have many different terms, some of them will not appear in the results. This field is the sum of the document counts for the terms that are not part of the response.
For nested objects, try reading the Nested Query docs.
I found the solution. Hope it helps someone.
GET record_new/_search
{
"size": 0,
"query": {
"term": {
"software_tags": {
"value": "windows"
}
}
},
"aggs": {
"software_tags": {
"terms": {
"field": "software_tags.keyword",
"include" : ".*Windows.*",
"size": 10000,
"order": {
"_count": "desc"
}
}
}
}
}
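The solution above runs against a different index (record_new, with a flat software_tags field). For the test_nested mapping from the question, where name lives inside the nested user objects, a comparable request would presumably wrap the terms aggregation in a nested aggregation. The Python sketch below builds such a body and, so the filter can be checked without a cluster, applies the same pattern locally to the sample documents. The helper names and the aggregation names "users" and "names" are made up for illustration. The include pattern has to match the whole term and is case-sensitive, hence ".*[Ww]i.*" rather than "wi*".

```python
import re

PATTERN = ".*[Ww]i.*"


def unique_nested_names_body(pattern, size=100):
    """Request body: terms aggregation on user.name.keyword inside a nested agg."""
    return {
        "size": 0,
        "aggs": {
            "users": {
                "nested": {"path": "user"},
                "aggs": {
                    "names": {
                        "terms": {
                            "field": "user.name.keyword",
                            "include": pattern,
                            "size": size,
                        }
                    }
                },
            }
        },
    }


def unique_names_locally(docs, pattern):
    """Apply the same whole-term pattern to already fetched documents."""
    regex = re.compile(pattern)
    names = {u["name"] for d in docs for u in d["user"] if regex.fullmatch(u["name"])}
    return sorted(names)


# The four sample documents from the question (names only).
docs = [
    {"user": [{"name": "Linux"}, {"name": "Android (operating system)"}, {"name": "Widows 10"}]},
    {"user": [{"name": "Bitcoin"}, {"name": "PHP"}, {"name": "Microsoft Windows"}]},
    {"user": [{"name": "Windows XP"}]},
    {"user": [{"name": "iOS"}, {"name": "Android (operating system)"}, {"name": "Widows 10"}, {"name": "Widows XP"}]},
]

body = unique_nested_names_body(PATTERN)
matches = unique_names_locally(docs, PATTERN)
print(matches)
```

Python's re module is used here only as an approximation of the regular expression syntax Elasticsearch accepts. For a simple pattern like this one the two should agree.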
My mappings are shown below, and I am running a bool should query on Name and other properties (also shown below). What I need is to filter CustomerPrices by CustomerId in the response.
Every product has the same CustomerIds, so for example:
product1 -CustomerPrice( CustomerId :1234 -Price:4)
CustomerPrice( CustomerId :567-Price:5)
.
.
Product2 - CustomerPrice(CustomerId :1234 -Price:8)
CustomerPrice(CustomerId :567-Price:10)
.
.
So when I query Product1, the response should contain only the CustomerPrice for CustomerId 1234.
{
"Product": {
"properties": {
"CustomerPrices": {
"type": "nested",
"properties": {
"Price": {
"store": true,
"type": "float"
},
"CustomerId": {
"type": "integer"
}
}
},
"Name": {
"index": "not_analyzed",
"store": true,
"type": "string"
}
}
}
}
I tried the following query, but it does not filter the nested objects. I guess it filters product objects instead, which makes sense because all products have CustomerId 1234.
"query":{
"bool":{
"should":[
{
"multi_match":{
"type":"best_fields",
"query":"product 1",
"fields":[
"Name^7"]
}
},
{
"multi_match":{
"type":"best_fields",
"query":"product 1",
"operator":"and",
"fields":[
"Code^10",
"ShortDescription^6"]
}
},
{
"nested":{
"query":{
"term":{
"CustomerPrices.CustomerId":{
"value":1234
}
}
},
"path":"CustomerPrices"
}
}]
}
},
I spent some time on your question because I was curious how this could be achieved. The only solution I have found so far relies on inner_hits, which returns the exact nested object that matched. I've also disabled _source in the response, since it is no longer needed.
So given your mapping and having 2 products like:
PUT product/Product/product1
{
"CustomerPrices": [
{
"CustomerId": 1234,
"Price": 4
},
{
"CustomerId": 567,
"Price": 5
}
],
"Name": "John"
}
PUT product/Product/product2
{
"CustomerPrices": [
{
"CustomerId": 1234,
"Price": 8
},
{
"CustomerId": 567,
"Price": 10
}
],
"Name": "Bob"
}
When running the following query: (Used must just to see 1 result, works with should as well)
GET product/_search
{
"_source": false,
"query": {
"bool": {
"must": [
{ "match": { "Name": "Bob"}}
],
"filter": [
{
"nested" : {
"path" : "CustomerPrices",
"score_mode" : "avg",
"query" : {
"bool" : {
"should" : [
{ "match" : {"CustomerPrices.CustomerId" : 1234}}
]
}
},
"inner_hits": {}
}
}
]
}
}
}
I was able to get the result where only "Price" from customer with id 1234 was present:
{
"took": 5,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "product",
"_type": "Product",
"_id": "product2",
"_score": 0.2876821,
"inner_hits": {
"CustomerPrices": {
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "product",
"_type": "Product",
"_id": "product2",
"_nested": {
"field": "CustomerPrices",
"offset": 0
},
"_score": 1,
"_source": {
"CustomerId": 1234,
"Price": 8
}
}
]
}
}
}
}
]
}
}
I couldn't find an official way to return a partial document containing only the matched nested object. Maybe this is something to suggest to the Elasticsearch team for a future release. Hope this helps.
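Since the matched nested objects come back under inner_hits rather than in the top-level _source, client code has to pick them out of the response. A minimal Python sketch of that step follows. The function name is hypothetical, and the response literal is a trimmed copy of the result shown above.

```python
def matched_nested_objects(response, path):
    """Map each hit's _id to the _source of its inner hits for a nested path."""
    results = {}
    for hit in response["hits"]["hits"]:
        inner = (
            hit.get("inner_hits", {})
            .get(path, {})
            .get("hits", {})
            .get("hits", [])
        )
        results[hit["_id"]] = [h["_source"] for h in inner]
    return results


# Trimmed copy of the search response shown above.
response = {
    "hits": {
        "total": 1,
        "hits": [
            {
                "_index": "product",
                "_id": "product2",
                "inner_hits": {
                    "CustomerPrices": {
                        "hits": {
                            "total": 1,
                            "hits": [
                                {
                                    "_nested": {"field": "CustomerPrices", "offset": 0},
                                    "_source": {"CustomerId": 1234, "Price": 8},
                                }
                            ],
                        }
                    }
                },
            }
        ],
    }
}

prices = matched_nested_objects(response, "CustomerPrices")
print(prices)
```

Hits without an inner_hits section simply map to an empty list, so the helper also works when the nested clause is one of several should clauses.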
When I use the "fields" option of a query I get a separate array for each field. Is it possible to get back the "complete" nested objects rather than just the field?
In the following example if I try to do "fields": ["cast"] it tells me that cast is not a leaf node. And if I do "fields": ["cast.firstName", "cast.middleName", "cast.lastName"] it returns 3 arrays.
Is there another way of retrieving just a partial amount of the document? Or is there a way to "reassemble" the separate fields into a complete "cast" object?
Example Index and Data:
POST /movies
{
"mappings": {
"movie": {
"properties": {
"cast": {
"type": "nested"
}
}
}
}
}
POST /movies/movie
{
"title": "The Matrix",
"cast": [
{
"firstName": "Keanu",
"lastName": "Reeves",
"address": {
"street": "somewhere",
"city": "LA"
}
},
{
"firstName": "Laurence",
"middleName": "John",
"lastName": "Fishburne",
"address": {
"street": "somewhere else",
"city": "NYC"
}
}
]
}
Example Query:
GET /movies/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "cast",
"filter": {
"bool": {
"must": [
{ "term": { "firstName": "laurence"} },
{ "term": { "lastName": "fishburne"} }
]
}
}
}
}
}
},
"fields": [
"cast.address.city",
"cast.firstName",
"cast.middleName",
"cast.lastName"
]
}
Result of example query:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "movies",
"_type": "movie",
"_id": "AU1JeyBseLgwMCOuOLsZ",
"_score": 1,
"fields": {
"cast.firstName": [
"Keanu",
"Laurence"
],
"cast.lastName": [
"Reeves",
"Fishburne"
],
"cast.address.city": [
"LA",
"NYC"
],
"cast.middleName": [
"John"
]
}
}
]
}
}
I think this is what you're looking for:
POST /movies/_search
{
"_source": {
"include": [
"cast.address.city",
"cast.firstName",
"cast.middleName",
"cast.lastName"
]
},
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"nested": {
"path": "cast",
"filter": {
"bool": {
"must": [
{
"term": {
"firstName": "laurence"
}
},
{
"term": {
"lastName": "fishburne"
}
}
]
}
}
}
}
}
}
}
Result:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "movies",
"_type": "movie",
"_id": "AU1PIJgBA_0Cyshym7-m",
"_score": 1,
"_source": {
"cast": [
{
"lastName": "Reeves",
"address": {
"city": "LA"
},
"firstName": "Keanu"
},
{
"middleName": "John",
"lastName": "Fishburne",
"address": {
"city": "NYC"
},
"firstName": "Laurence"
}
]
}
}
]
}
}
You can also exclude fields instead of including them, or do both; see the documentation here: http://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-source-filtering.html
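The request above uses the filtered query, which was removed in Elasticsearch 5.0. On newer versions a roughly equivalent request would use a bool query with a filter clause. The source-filter keys are spelled includes and excludes, and fields inside the nested query are referenced by their full path. The Python sketch below builds such a body. The function name is hypothetical, and the result has not been checked against the index from the question.

```python
import json


def cast_search_body(first_name, last_name, includes, excludes=()):
    """Nested filter on cast plus _source filtering, in post-5.0 syntax."""
    return {
        "_source": {"includes": list(includes), "excludes": list(excludes)},
        "query": {
            "bool": {
                "filter": {
                    "nested": {
                        "path": "cast",
                        "query": {
                            "bool": {
                                "must": [
                                    {"term": {"cast.firstName": first_name}},
                                    {"term": {"cast.lastName": last_name}},
                                ]
                            }
                        },
                    }
                }
            }
        },
    }


body = cast_search_body(
    "laurence",
    "fishburne",
    includes=["cast.firstName", "cast.middleName", "cast.lastName", "cast.address.city"],
    excludes=["cast.address.street"],
)
print(json.dumps(body, indent=2))
```

As in the original answer, the source filter trims fields but still returns every cast member of a matching movie, not only the one that matched the nested filter.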