Searching on fields of a nested object on elasticsearch - spring

I have this mapping on ES 1.7.3:
{
"customer": {
"aliases": {},
"mappings": {
"customer": {
"properties": {
"addresses": {
"type": "nested",
"include_in_parent": true,
"properties": {
"address1": {
"type": "string"
},
"address2": {
"type": "string"
},
"address3": {
"type": "string"
},
"country": {
"type": "string"
},
"latitude": {
"type": "double",
"index": "not_analyzed"
},
"longitude": {
"type": "double",
"index": "not_analyzed"
},
"postcode": {
"type": "string"
},
"state": {
"type": "string"
},
"town": {
"type": "string"
},
"unit": {
"type": "string"
}
}
},
"companyNumber": {
"type": "string"
},
"id": {
"type": "string",
"index": "not_analyzed"
},
"name": {
"type": "string"
},
"status": {
"type": "string"
},
"timeCreated": {
"type": "date",
"format": "dateOptionalTime"
},
"timeUpdated": {
"type": "date",
"format": "dateOptionalTime"
}
}
}
},
"settings": {
"index": {
"refresh_interval": "1s",
"number_of_shards": "5",
"creation_date": "1472372294516",
"store": {
"type": "fs"
},
"uuid": "RxJdXvPWSXGpKz8pdcF91Q",
"version": {
"created": "1050299"
},
"number_of_replicas": "1"
}
},
"warmers": {}
}
}
The Spring application generates this query:
{
  "query": {
    "bool": {
      "should": {
        "query_string": {
          "query": "(addresses.\\*:sample* AND NOT status:ARCHIVED)",
          "fields": [
            "type",
            "name",
            "companyNumber",
            "status",
            "addresses.unit",
            "addresses.address1",
            "addresses.address2",
            "addresses.address3",
            "addresses.town",
            "addresses.state",
            "addresses.postcode",
            "addresses.country"
          ],
          "default_operator": "or",
          "analyze_wildcard": true
        }
      }
    }
  }
}
Here "addresses.*:sample*" is the only user input.
By comparison, the following query string works, but it searches all fields of the customer object:
"query": "(sample* AND NOT status:ARCHIVED)"
Since I want to search only the address fields, I used "addresses.*".
That query worked while every field of the address object was of type string, before I added the latitude and longitude fields of type double. Now it fails because of these two new fields.
Error:
Parse Failure [Failed to parse source [{
"query": {
"bool": {
"should": {
"query_string": {
"query": "(addresses.\\*:sample* AND NOT status:ARCHIVED)",
"fields": [
"type",
"name",
"companyNumber","country",
"state",
"status",
"addresses.unit",
"addresses.address1",
"addresses.address2",
"addresses.address3",
"addresses.town",
"addresses.state",
"addresses.postcode",
"addresses.country",
],
"default_operator": "or",
"analyze_wildcard": true
}
}
}
}
}
]]
NumberFormatException[For input string: "sample"
Is there a way to search "String" fields within a nested object using addresses.* only?

The solution was to add "lenient": true to the query_string query. Per the documentation (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html):
lenient - If set to true, format-based failures (like providing text to a numeric field) are ignored.
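
As a rough illustration of where the flag goes, here is a minimal Python sketch that rebuilds the request body from the question and adds "lenient": true to the query_string clause. The helper name build_address_query is hypothetical, the field list is copied from the question, and sending the body to the cluster is left out on purpose.

```python
import json

# Field list copied from the question's generated query.
FIELDS = [
    "type", "name", "companyNumber", "status",
    "addresses.unit", "addresses.address1", "addresses.address2",
    "addresses.address3", "addresses.town", "addresses.state",
    "addresses.postcode", "addresses.country",
]


def build_address_query(term):
    """Hypothetical helper: build the request body with lenient enabled."""
    return {
        "query": {
            "bool": {
                "should": {
                    "query_string": {
                        # The backslash escapes the wildcard in the field name.
                        "query": "(addresses.\\*:" + term + " AND NOT status:ARCHIVED)",
                        "fields": FIELDS,
                        "default_operator": "or",
                        "analyze_wildcard": True,
                        # Ignore format-based failures, such as text sent
                        # to the double fields latitude and longitude.
                        "lenient": True,
                    }
                }
            }
        }
    }


body = build_address_query("sample*")
print(json.dumps(body, indent=2))
```

With the flag set, the numeric fields matched by addresses.* should be skipped for non-numeric input rather than failing the whole request.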

Related

Filter elasticsearch by range of date on a string property

I need to query Elasticsearch and filter the results to a range of dates.
The problem is that the date property is mapped as a string.
Is it possible to do so?
This is the search query I'm using:
{
  "size": 1,
  "from": 0,
  "query": {
    "bool": {
      "must": [
        { "match": { "status": "active" }},
        { "match": { "last_action_state": "accepted" }}
      ],
      "filter": [
        { "missing": { "field": "store_id" }},
        { "range": { "list_time": { "gte": "2017/01/01 00:00:00", "lte": "2017/03/01 23:59:59", "format": "yyyy/MM/dd HH:mm:ss" }}}
      ]
    }
  }
}
The problem is that I have no control over the mapping, since it is created automatically by another program that indexes the documents, and I can't change the mapping once it's created.
P.S. Elasticsearch version: 2.3
UPDATE:
Index info:
{
"avindex_v3": {
"aliases": {
"avindex": {}
},
"mappings": {
"ads": {
"properties": {
"account_id": {
"type": "long"
},
"ad_id": {
"type": "long"
},
"ad_params": {
"type": "string"
},
"body": {
"type": "string"
},
"category": {
"type": "long"
},
"city": {
"type": "long"
},
"company_ad": {
"type": "boolean"
},
"email": {
"type": "string"
},
"images": {
"type": "string"
},
"lang": {
"type": "string"
},
"last_action_state": {
"type": "string"
},
"list_date": {
"type": "long"
},
"list_id": {
"type": "long"
},
"list_time": {
"type": "string"
},
"modified_at": {
"type": "string"
},
"modified_ts": {
"type": "double"
},
"name": {
"type": "string"
},
"orig_date": {
"type": "long"
},
"orig_list_time": {
"type": "string"
},
"phone": {
"type": "string"
},
"phone_hidden": {
"type": "boolean"
},
"price": {
"type": "long"
},
"region": {
"type": "long"
},
"status": {
"type": "string"
},
"store_id": {
"type": "long"
},
"subject": {
"type": "string"
},
"type": {
"type": "string"
},
"user_id": {
"type": "long"
}
}
}
},
"settings": {
"index": {
"creation_date": "1493216710928",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "WEHGLF8iRyGk3Xgbmo7H8Q",
"version": {
"created": "2040499"
}
}
},
"warmers": {}
}
}
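You can try a range query on a keyword sub-field, as shown below. Note that this assumes the mapping defines list_time.keyword; the mapping shown above maps list_time as a plain string with no sub-fields.
{
  "range": {
    "list_time.keyword": {
      "gte": "2020-08-12 22:24:55.56",
      "lte": "2020-08-12 22:24:56.56"
    }
  }
}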
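
A range query on a string field compares values as text, not as dates. The Python sketch below shows why that can still give chronological results. It assumes the field is indexed as a single un-analyzed term and that every value uses the same zero-padded yyyy/MM/dd HH:mm:ss layout as the question's query. Both are assumptions, and the sample values are made up for illustration.

```python
def in_string_range(value, gte, lte):
    """Plain lexicographic comparison, as a range on string terms would do."""
    return gte <= value <= lte


GTE = "2017/01/01 00:00:00"
LTE = "2017/03/01 23:59:59"

# Made-up sample values in the fixed-width, zero-padded layout.
samples = [
    "2016/12/31 23:59:59",
    "2017/01/01 00:00:00",
    "2017/02/15 08:30:00",
    "2017/03/02 00:00:00",
]

# With fixed-width values, text order and chronological order agree.
matches = [v for v in samples if in_string_range(v, GTE, LTE)]
print(matches)

# Without zero padding the text order breaks: this February date is
# chronologically inside the range but sorts after the upper bound.
unpadded_ok = in_string_range("2017/2/15 08:30:00", GTE, LTE)
print(unpadded_ok)
```

If the stored values are not fixed-width, or the field is analyzed into several tokens, a text range is unlikely to behave like a date range.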

Elasticsearch geo_distance in combination with other queries

Hello, I have a problem combining multiple queries in Elasticsearch.
The problem only occurs when I combine a multi_match query with a geo_distance query. Each query returns the results I expect when it is executed without the other.
Both result sets contain the documents that I would expect to receive when the two queries are executed together. But whenever I execute them together, I receive 0 results.
When I combine the geo_distance query with a simple term query, the search works. So I presume the problem lies in this particular combination of queries.
I would appreciate any ideas.
My query is the following:
{
  "query": {
    "bool": {
      "must": {
        "bool": {
          "should": {
            "multi_match": {
              "query": "CompanyName GmbH",
              "fields": [
                "originalName",
                "legalName"
              ],
              "type": "cross_fields",
              "operator": "AND"
            }
          }
        }
      },
      "filter": {
        "bool": {
          "should": {
            "geo_distance": {
              "location": [
                9.87107,
                51.69915
              ],
              "distance": "30.0km",
              "distance_type": "arc"
            }
          }
        }
      }
    }
  }
}
The mapping behind all of that is:
{
"customer": {
"aliases": {
},
"mappings": {
"customer-entity": {
"properties": {
"communication": {
"properties": {
"domain": {
"type": "string"
},
"email": {
"type": "string"
},
"landline": {
"type": "string"
},
"mobile": {
"type": "string"
}
}
},
"id": {
"type": "long"
},
"legalName": {
"type": "string",
"store": true
},
"location": {
"type": "geo_point"
},
"operatingModes": {
"type": "string"
},
"originalName": {
"type": "string",
"store": true
}
}
},
"homepage-entity": {
"_parent": {
"type": "customer-entity"
},
"_routing": {
"required": true
},
"properties": {
"customerId": {
"type": "string",
"store": true
},
"id": {
"type": "long"
},
"metas": {
"type": "string",
"store": true
}
}
},
"person-entity": {
"_parent": {
"type": "customer-entity"
},
"_routing": {
"required": true
},
"properties": {
"customerId": {
"type": "string",
"store": true
},
"firstName": {
"type": "string",
"store": true
},
"id": {
"type": "long"
},
"lastName": {
"type": "string",
"store": true
},
"personId": {
"type": "string",
"store": true
}
}
}
},
"settings": {
"index": {
"refresh_interval": "-1",
"number_of_shards": "1",
"creation_date": "1488920698118",
"store": {
"type": "fs"
},
"number_of_replicas": "0",
"uuid": "ZcLN5sxASXGUnKZMg8mBpw",
"version": {
"created": "2040499"
}
}
},
"warmers": {
}
}
}

How to search with a nested filter in Elasticsearch

I want to use the [nested filter] feature, because I want to filter on nested data only.
For example, suppose Elasticsearch contains the data below.
I want to retrieve it, but when I use [nested filter] I can't get this data.
Do you know a good solution?
Specifically, I want to apply two conditions to the nested type.
In SQL, it would look like this:
select * from document inner join document_comments on document_comments.document_id = document.id where document_comments.deleted_at is null and document_comments.comment like 'test'
[data]
"_source": {
"id": 4,
"name": "hogehoge.csv",
"deleted_at": null,
"hard_delete": null,
"document_comments": [
{
"id": 8,
"comment": "test",
"document_id": 4,
"deleted_at": "2016-03-03T13:43:10"
}
,
{
"id": 11,
"comment": "test",
"document_id": 4,
"deleted_at": null
}
]
}
[mapping]
"documents": {
"search_analyzer": "default_search",
"dynamic_templates": [
{
"string_template": {
"mapping": {
"type": "multi_field",
"fields": {
"ja": {
"analyzer": "ja_analyzer",
"index": "analyzed",
"type": "string"
},
"{name}": {
"analyzer": "ngram_analyzer",
"index": "analyzed",
"type": "string"
},
"yomi": {
"analyzer": "yomi_analyzer",
"index": "analyzed",
"type": "string"
},
"full": {
"index": "not_analyzed",
"type": "string"
}
}
},
"match_mapping_type": "string",
"match": "*"
}
}
],
"properties": {
"#timestamp": {
"format": "dateOptionalTime",
"type": "date"
},
"document_comments": {
"type": "nested",
"properties": {
"deleted_at": {
"format": "dateOptionalTime",
"type": "date"
},
"document_id": {
"type": "integer"
},
"comment": {
"index": "no",
"type": "string",
"fields": {
"ja": {
"analyzer": "ja_analyzer",
"type": "string"
},
"yomi": {
"analyzer": "yomi_analyzer",
"type": "string"
},
"ngram": {
"analyzer": "ngram_analyzer",
"type": "string"
},
"full": {
"index": "not_analyzed",
"type": "string"
}
}
},
"id": {
"type": "long"
},
"deleted_at": {
"format": "dateOptionalTime",
"type": "date"
}
}
},
"name": {
"index": "no",
"type": "string",
"fields": {
"ja": {
"analyzer": "ja_analyzer",
"type": "string"
},
"yomi": {
"analyzer": "yomi_analyzer",
"type": "string"
},
"ngram": {
"analyzer": "ngram_analyzer",
"type": "string"
},
"full": {
"index": "not_analyzed",
"type": "string"
}
}
},
"delete_at": {
"format": "dateOptionalTime",
"type": "date"
},
"id": {
"type": "integer"
},
"hard_delete": {
"format": "dateOptionalTime",
"type": "date"
}
}
}
[query]
{
"_source": [
"id",
"deleted_at",
"document_comments.comment",
"document_comments.deleted_at"
],
"min_score": 0.05,
"query": {
"filtered": {
"query": {
"bool": {
"must": [],
"should": [
{
"multi_match": {
"query": "test",
"type": "cross_fields",
"fields": [
"document.name.ja"
]
}
},
{
"nested": {
"path": "document_comments",
"query": {
"bool": {
"must": [
{
"multi_match": {
"query": "test",
"type": "cross_fields",
"fields": [
"document_comments.comment.ja"
]
}
}
],
"must_not": [
{
"filter": {
"exists": {
"field": "document_comments.deleted_at"
}
}
}
]
}
}
}
}
]
}
},
"filter": {
"bool": {
"must": [
[
{
"missing": {
"field": "deleted_at",
"existence": "true",
"null_value": "true"
}
},
{
"missing": {
"field": "hard_delete",
"existence": "true",
"null_value": "true"
}
}
],
{
"type": {
"value": "document"
}
},
{
"term": {
"id": "3"
}
}
]
}
}
}
},
"sort": [
{
"id": "desc"
},
{
"_score": "asc"
}
]
}

ElasticSearch term query vs query_string?

When I query my index with query_string, I get results.
But when I query using a term query, I don't get any results.
{
"query": {
"bool": {
"must": [],
"must_not": [],
"should": [
{
"query_string": {
"default_field": "Printer.Name",
"query": "HL-2230"
}
}
]
}
},
"from": 0,
"size": 10,
"sort": [],
"aggs": {}
}
I know that a term query is not analyzed and query_string is analyzed, but Name is already stored as "HL-2230", so why doesn't it match with the term query? I also tried searching with "hl-2230" and still got no results.
EDIT: The mapping looks like the one below. Printer is the child of Product. I'm not sure whether this makes a difference.
{
"state": "open",
"settings": {
"index": {
"creation_date": "1453816191454",
"number_of_shards": "5",
"number_of_replicas": "1",
"version": {
"created": "1070199"
},
"uuid": "TfMJ4M0wQDedYSQuBz5BjQ"
}
},
"mappings": {
"Product": {
"properties": {
"index": "not_analyzed",
"store": true,
"type": "string"
},
"ProductName": {
"type": "nested",
"properties": {
"Name": {
"store": true,
"type": "string"
}
}
},
"ProductCode": {
"type": "string"
},
"Number": {
"index": "not_analyzed",
"store": true,
"type": "string"
},
"id": {
"index": "no",
"store": true,
"type": "integer"
},
"ShortDescription": {
"store": true,
"type": "string"
},
"Printer": {
"_routing": {
"required": true
},
"_parent": {
"type": "Product"
},
"properties": {
"properties": {
"RelativeUrl": {
"index": "no",
"store": true,
"type": "string"
}
}
},
"PrinterId": {
"index": "no",
"store": true,
"type": "integer"
},
"Name": {
"store": true,
"type": "string"
}
}
},
"aliases": []
}
}
As per the mapping you provided above:
"Name": {
  "store": true,
  "type": "string"
}
Name is analyzed, so HL-2230 is split into two tokens, hl and 2230 (the default standard analyzer also lowercases them). That is why the term query does not work while query_string does: a term query searches for the exact term HL-2230, which is not in the index.
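
The difference can be sketched in a few lines of Python. The tokenizer below is only a rough stand-in for the default analyzer (it splits on non-alphanumeric characters and lowercases), and all function names are made up for illustration. It does not call Elasticsearch.

```python
import re


def rough_standard_analyze(text):
    """Rough approximation: split on non-alphanumerics, then lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


# Terms that end up in the index for the analyzed Name field.
indexed_terms = set(rough_standard_analyze("HL-2230"))


def term_query_matches(term):
    """A term query looks up the input as-is, without analysis."""
    return term in indexed_terms


def analyzed_query_matches(text):
    """An analyzed query tokenizes the input first (OR semantics here)."""
    return any(t in indexed_terms for t in rough_standard_analyze(text))


print(sorted(indexed_terms))
print(term_query_matches("HL-2230"))
print(term_query_matches("hl-2230"))
print(analyzed_query_matches("HL-2230"))
```

Neither "HL-2230" nor "hl-2230" exists as a single term in this model, which matches the behaviour described in the question. A term lookup for "hl" or "2230" would succeed.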

elasticsearch "having not" query

Some documents have a category field, and in some of those the value is "-1". I need a query that returns the documents that have a category field whose value is not equal to "-1".
I tried this:
GET webproxylog/_search
{
  "query": {
    "filtered": {
      "filter": {
        "not": {
          "filter": {
            "and": {
              "filters": [
                {
                  "term": {
                    "category": "-1"
                  }
                },
                {
                  "missing": {
                    "field": "category"
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
}
But it does not work: it also returns documents that do not have a category field.
EDIT
Mapping:
{
"webproxylog": {
"mappings": {
"accesslog": {
"properties": {
"category": {
"type": "string",
"index": "not_analyzed"
},
"clientip": {
"type": "string",
"index": "not_analyzed"
},
"clientmac": {
"type": "string",
"index": "not_analyzed"
},
"clientname": {
"type": "string",
"index": "not_analyzed"
},
"duration": {
"type": "long"
},
"filetype": {
"type": "string",
"index": "not_analyzed"
},
"hierarchycode": {
"type": "string",
"index": "not_analyzed"
},
"loggingdate": {
"type": "date",
"format": "dateOptionalTime"
},
"reqmethod": {
"type": "string",
"index": "not_analyzed"
},
"respsize": {
"type": "long"
},
"resultcode": {
"type": "string",
"index": "not_analyzed"
},
"url": {
"type": "string",
"analyzer": "slash_analyzer"
},
"user": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
If your category field is a string and is analyzed by default, then -1 will be indexed as 1 (the minus sign is stripped).
You will need that field to be not_analyzed, or to add a sub-field that is not analyzed (as in my solution below).
Something like this:
DELETE test
PUT /test
{
"mappings": {
"test": {
"properties": {
"category": {
"type": "string",
"fields": {
"notAnalyzed": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
POST /test/test/1
{"category": "-1"}
POST /test/test/2
{"category": "2"}
POST /test/test/3
{"category": "3"}
POST /test/test/4
{"category": "4"}
POST /test/test/5
{"category2": "-1"}
GET /test/test/_search
{
  "query": {
    "bool": {
      "must_not": [
        {
          "term": {
            "category.notAnalyzed": {
              "value": "-1"
            }
          }
        },
        {
          "filtered": {
            "filter": {
              "missing": {
                "field": "category"
              }
            }
          }
        }
      ]
    }
  }
}
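
To see why the must_not form returns the wanted documents while the not/and form from the question does not, here is a small Python sketch that evaluates both conditions on the five test documents above. It models only the boolean logic, not Elasticsearch itself, and the function names are made up for illustration.

```python
# The five test documents indexed above, keyed by id.
docs = {
    1: {"category": "-1"},
    2: {"category": "2"},
    3: {"category": "3"},
    4: {"category": "4"},
    5: {"category2": "-1"},
}


def is_minus_one(doc):
    """Models the term filter on category."""
    return doc.get("category") == "-1"


def is_missing(doc):
    """Models the missing filter on category."""
    return "category" not in doc


# Question's form: NOT (term AND missing). No document can satisfy both
# inner conditions at once, so the negation keeps every document.
question_hits = [
    doc_id for doc_id, doc in sorted(docs.items())
    if not (is_minus_one(doc) and is_missing(doc))
]

# Answer's form: must_not [term, missing] = NOT term AND NOT missing.
answer_hits = [
    doc_id for doc_id, doc in sorted(docs.items())
    if not is_minus_one(doc) and not is_missing(doc)
]

print(question_hits)
print(answer_hits)
```

In this model the question's condition keeps all five documents, including the one without a category field, while the must_not form keeps only documents 2, 3 and 4.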
