How to highlight date fields in Elasticsearch?

My mapping:
"mappings": {
"my_type": {
"properties": {
"birthDate": {
"type": "date",
"format": "dateOptionalTime"
},
"name": {
"type": "string"
}
}
}
}
My search query:
GET my_index/_search
{
"query": {
"bool": {
"should": [
{
"match": {
"name": "babken"
}
},
{
"term": {
"birthDate": {
"value": "1999-01-01"
}
}
}
]
}
},
"highlight": {
"fields": {
"*": {}
}
}
}
However, in the response body only the name field is highlighted, even though the birthDate field has matched as well:
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "1a82fbb4-1268-42b9-9999-ef932f67a114",
"_score": 12.507131,
"_source": {
"name": "babken",
"birthDate": "1999-01-01",
},
"highlight": {
"name": [
"<em>babken</em>"
]
}
}
...
How can I make the birthDate field appear in "highlight" results as well if it has matched?
I'm using Elasticsearch 1.6

You would need to change the type to string to enable highlighting.
The bare minimum requirement for a field to be highlighted is that it is of string type.
The following issue has a little more discussion about it.
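A minimal sketch of that mapping change, written as Python dictionaries that could be sent as the request bodies. It assumes the 1.x string type with "index": "not_analyzed", so that the date text stays a single term for the term query. That setting and the helper names are illustrative assumptions, not part of the answer above.

```python
import json


def string_date_mapping(type_name="my_type"):
    """Index body with birthDate declared as a string instead of a date."""
    return {
        "mappings": {
            type_name: {
                "properties": {
                    # Assumption: not_analyzed keeps "1999-01-01" as one term.
                    "birthDate": {"type": "string", "index": "not_analyzed"},
                    "name": {"type": "string"},
                }
            }
        }
    }


def highlight_search(name, birth_date):
    """Search body from the question, parameterised but otherwise unchanged."""
    return {
        "query": {
            "bool": {
                "should": [
                    {"match": {"name": name}},
                    {"term": {"birthDate": {"value": birth_date}}},
                ]
            }
        },
        "highlight": {"fields": {"*": {}}},
    }


if __name__ == "__main__":
    print(json.dumps(string_date_mapping(), indent=2))
    print(json.dumps(highlight_search("babken", "1999-01-01"), indent=2))
```

The trade-off is that a string field no longer gets date parsing or date math, so this is worth doing only if highlighting matters more than date-specific querying on that field.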

Related

ElasticSearch: Fetch records from nested Array that "only" include given element/s and filter-out the rest with mixed values

I am stuck on one of my tasks.
Overview:
There are some records in Elasticsearch that include information about candidates and their employment.
One field stores the statuses with which the candidate was submitted to jobs.
{
"submittedJobs": [
{
"status": "PendingPM", "jobId": "ABC", ...
},
{
"status": "PendingClient", "jobId": "XYZ", ...
},
{
"status": "PendingPM", "jobId": "WXY", ...
},
...
]
}
I want to write an ES query that fetches all the records in which the submittedJobs array has "only" "PendingPM" statuses and no other statuses.
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "submittedJobs",
"query": {
"bool": {
"must": [
{
"term": {
"submittedJobs.status.keyword": "PendingPM"
}
}
]
}
}
}
}
]
}
}
I tried this query, but it also returns the records that include "PendingPM" along with other statuses, as if it used contains() logic.
Here is the mapping:
"submittedJobs": {
"type": "nested",
"properties": {
"statusId": {
"type": "long"
},
"status": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256,
"normalizer": "lowercase_normalizer"
}
}
},
"jobId": {
"type": "keyword"
}
}
}
For example, let's suppose there are two documents.
document #1:
{
"submittedJobs": [
{
"status": "PendingPM", "jobId": "ABC", ...
},
{
"status": "PendingClient", "jobId": "XYZ", ...
},
{
"status": "PendingPM", "jobId": "WXY", ...
},
...
]
},
document #2:
{
"submittedJobs": [
{
"status": "PendingPM", "jobId": "ABC", ...
},
{
"status": "PendingPM", "jobId": "WXY", ...
},
...
]
}
Only document #2 should be returned, as the entire array contains only "PendingPM" and no other statuses.
Document #1 will be filtered-out since it includes mixed statuses.
Any help will be appreciated.

Try this.
It returns only the documents in which every item of the array has the status PendingPM:
{
"query": {
"bool": {
"must_not": [
{
"nested": {
"path": "submittedJobs",
"query": {
"bool": {
"must_not": [
{
"match": {
"submittedJobs.status": {
"query": "PendingPM"
}
}
}
]
}
}
}
}
]
}
}
}
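The "every element has this status" condition is a double negation: no nested element may fail to match the wanted status. Below is a minimal sketch that builds such a query body and mirrors its logic in plain Python against the two sample documents from the question. The function names are hypothetical, and the extra must clause (at least one matching element, so that documents with an empty array are not returned) is an addition of this sketch rather than part of the answer.

```python
def only_status_query(status, path="submittedJobs", field="status"):
    """Query body: every nested element under `path` has the given status."""
    matches = {"match": {f"{path}.{field}": {"query": status}}}
    return {
        "query": {
            "bool": {
                # At least one element has the wanted status.
                "must": [{"nested": {"path": path, "query": matches}}],
                # No element exists that does NOT have the wanted status.
                "must_not": [
                    {
                        "nested": {
                            "path": path,
                            "query": {"bool": {"must_not": [matches]}},
                        }
                    }
                ],
            }
        }
    }


def has_only_status(doc, status, path="submittedJobs", field="status"):
    """Plain-Python mirror of the query logic (exact, case-sensitive compare)."""
    items = doc.get(path, [])
    any_match = any(item.get(field) == status for item in items)
    any_other = any(item.get(field) != status for item in items)
    return any_match and not any_other


# Sample documents from the question, limited to the fields shown there.
DOC_1 = {"submittedJobs": [
    {"status": "PendingPM", "jobId": "ABC"},
    {"status": "PendingClient", "jobId": "XYZ"},
    {"status": "PendingPM", "jobId": "WXY"},
]}
DOC_2 = {"submittedJobs": [
    {"status": "PendingPM", "jobId": "ABC"},
    {"status": "PendingPM", "jobId": "WXY"},
]}

if __name__ == "__main__":
    for label, doc in (("document #1", DOC_1), ("document #2", DOC_2)):
        print(label, has_only_status(doc, "PendingPM"))
```

Note that the real match query on an analyzed text field is case-insensitive, whereas the Python mirror compares strings exactly. It is only meant for checking expectations.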

You can use inner_hits along with the nested query to get only the matching nested objects from each document.
Here is a working example.
Index Mapping:
{
"mappings": {
"properties": {
"submittedJobs": {
"type": "nested"
}
}
}
}
Search Query:
{
"query": {
"bool": {
"filter": [
{
"nested": {
"path": "submittedJobs",
"query": {
"bool": {
"must": [
{
"term": {
"submittedJobs.status.keyword": "PendingPM"
}
}
]
}
},
"inner_hits": {}
}
}
]
}
}
}
Search Result would be:
"hits": [
{
"_index": "73062439",
"_id": "1",
"_score": 0.0,
"_source": {
"submittedJobs": [
{
"status": "PendingPM",
"jobId": "ABC"
},
{
"status": "PendingClient",
"jobId": "XYZ"
},
{
"status": "PendingPM",
"jobId": "WXY"
}
]
},
"inner_hits": { // note this
"submittedJobs": {
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 0.4700036,
"hits": [
{
"_index": "73062439",
"_id": "1",
"_nested": {
"field": "submittedJobs",
"offset": 0
},
"_score": 0.4700036,
"_source": {
"jobId": "ABC",
"status": "PendingPM"
}
},
{
"_index": "73062439",
"_id": "1",
"_nested": {
"field": "submittedJobs",
"offset": 2
},
"_score": 0.4700036,
"_source": {
"jobId": "WXY",
"status": "PendingPM"
}
}
]
}
}
}
}
]

Searching Elasticsearch document by existing field not found but the field exists

First of all, I must say I'm on Elasticsearch 5.6.16.
I'm trying to figure out what's happening here. I have several documents indexed with this mapping (I copied the document directly from Kibana):
{
"_index": "my_index",
"_type": "doc",
"_id": "Outbreak_10346",
"_version": 1,
"_score": 1,
"_source": {
"outbreakId": 10346,
"reference": "XX-AD-2021-00003",
"countryCode": "BE",
"adisNotificationReasonType": {
"code": "TERRESTRIAL"
},
"approximateLocation": false,
"latitude": 50.93766,
"longitude": 3.97156,
"adminZoneLevelOne": {
"zoneId": 40,
"zoneCode": "BE2"
},
"affectedSpecies": [
{
"speciesId": 16703,
"name": "Swine",
"measuringUnit": "ANIMAL",
"casesQuantity": 10,
"deadQuantity": 1,
"susceptibleQuantity": 100,
"isAquatic": false
}
],
"affectedSpeciesTotalSusceptible": 100,
"affectedSpeciesTotalCases": 10
}
}
If I do this query in Kibana:
GET my_index/_search
{
"query": {
"exists": {
"field": "adminZoneLevelOne"
}
}
}
I don't get any results. But if I change the field to any of the others, I find the documents.
Also, when I retrieve the documents I can access the adminZoneLevelOne field.
How is this possible? Why doesn't Elasticsearch find any document with that field?
The index mapping for the adminZoneLevelOne field is:
"adminZoneLevelOne": {
"type": "nested",
"properties": {
"zoneCode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
},
"analyzer": "WHITESPACE"
},
"zoneId": {
"type": "long"
}
}
}
And the mapping for adisNotificationReasonType, which works fine, is:
"adisNotificationReasonType": {
"properties": {
"code": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
},
"analyzer": "LOWERCASE_KEYWORD"
}
}
}

Since adminZoneLevelOne is of nested type, you need to wrap the exists query in a nested query, as follows:
{
"query": {
"nested": {
"path": "adminZoneLevelOne",
"query": {
"bool": {
"must": [
{
"exists": {
"field": "adminZoneLevelOne"
}
}
]
}
}
}
}
}
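The same wrapping can be sketched as a small Python helper that builds the request body. The helper name and the optional leaf parameter (for checking a specific sub-field such as zoneId instead of the nested object itself) are assumptions made for illustration.

```python
def nested_exists_query(path, leaf=None):
    """Build a nested query that checks for existence inside `path`.

    If `leaf` is given, the exists clause targets `path.leaf`;
    otherwise it targets the nested object field itself.
    """
    field = f"{path}.{leaf}" if leaf else path
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"must": [{"exists": {"field": field}}]}},
            }
        }
    }


if __name__ == "__main__":
    print(nested_exists_query("adminZoneLevelOne"))
    print(nested_exists_query("adminZoneLevelOne", leaf="zoneId"))
```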

Select documents by array of objects when at least one object doesn't contain necessary field Elasticsearch

I have documents in Elasticsearch and can't work out how to write a search script that returns documents if any attachment doesn't contain a uuid or its uuid is null. The Elasticsearch version is 5.2.
Mapping of the documents:
"mappings": {
"documentType": {
"properties": {
"attachment": {
"properties": {
"uuid": {
"type": "text"
},
"path": {
"type": "text"
},
"size": {
"type": "long"
}
}
}}}
In Elasticsearch the documents look like this:
{
"_index": "documents",
"_type": "documentType",
"_id": "1",
"_score": 1.0,
"_source": {
"attachment": [
{
"uuid": "21321321",
"path": "../uploads/somepath",
"size":1231
},
{
"path": "../uploads/somepath",
"size":1231
},
]},
{
"_index": "documents",
"_type": "documentType",
"_id": "2",
"_score": 1.0,
"_source": {
"attachment": [
{
"uuid": "223645641321321",
"path": "../uploads/somepath",
"size":1231
},
{
"uuid": "22341424321321",
"path": "../uploads/somepath",
"size":1231
},
]},
{
"_index": "documents",
"_type": "documentType",
"_id": "3",
"_score": 1.0,
"_source": {
"attachment": [
{
"uuid": "22789789341321321",
"path": "../uploads/somepath",
"size":1231
},
{
"path": "../uploads/somepath",
"size":1231
},
]}
As a result, I want to get the documents with _id 1 and 3, but instead I get a script error.
I tried the following script:
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "attachment"
}
},
{
"script": {
"script": {
"inline": "for (item in doc['attachment'].value) { if (item['uuid'] == null) { return true}}",
"lang": "painless"
}
}
}
]
}
}
}
The error is as follows:
"root_cause": [
{
"type": "script_exception",
"reason": "runtime error",
"script_stack": [
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:77)",
"org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:36)",
"for (item in doc['attachment'].value) { ",
" ^---- HERE"
],
"script": "for (item in doc['attachment'].value) { if (item['uuid'] == null) { return true}}",
"lang": "painless"
}
],
Is it possible to select documents when even one attachment object doesn't contain a uuid?

Iterating arrays of objects is not as trivial as one would expect. I've written extensively about it here and here.
Since your attachments are not defined as nested, ES will internally represent them as flattened lists of values (also called "doc values"). For instance, attachment.uuid in doc#2 will become ["223645641321321", "22341424321321"], and attachment.size will turn into [1231, 1231].
This means that you can simply compare the .length of these flattened representations! I assume attachment.size will always be present and can thus be taken as the comparison baseline.
One more thing. Taking advantage of these optimized doc values for textual fields requires one small mapping change:
PUT documents/documentType/_mappings
{
"properties": {
"attachment": {
"properties": {
"uuid": {
"type": "text",
"fielddata": true <---
},
"path": {
"type": "text"
},
"size": {
"type": "long"
}
}
}
}
}
When that's done and you've reindexed your docs, which can be done with this little Update by query trick:
POST documents/_update_by_query
You can then use the following script query:
POST documents/_search
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "attachment"
}
},
{
"script": {
"script": {
"inline": "def size_field_length = doc['attachment.size'].length; def uuid_field_length = doc['attachment.uuid'].length; return uuid_field_length < size_field_length",
"lang": "painless"
}
}
}
]
}
}
}
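The length comparison can be illustrated outside Elasticsearch with a plain-Python sketch that flattens each sub-field across the attachment array and compares the counts, using the three documents from the question. The function names are hypothetical. As a simplification, it ignores the fact that doc values for textual fields are, to my understanding, deduplicated, so it assumes the uuids within one document are distinct.

```python
def flattened_values(source, path, field):
    """Collect the non-null values of `path.field` across the object array."""
    return [
        obj[field]
        for obj in source.get(path, [])
        if obj.get(field) is not None
    ]


def has_attachment_without_uuid(source):
    """True when fewer uuids than sizes are present, i.e. some uuid is missing."""
    sizes = flattened_values(source, "attachment", "size")
    uuids = flattened_values(source, "attachment", "uuid")
    return len(uuids) < len(sizes)


# The three documents from the question (only _id and _source kept).
DOCS = [
    {"_id": "1", "_source": {"attachment": [
        {"uuid": "21321321", "path": "../uploads/somepath", "size": 1231},
        {"path": "../uploads/somepath", "size": 1231},
    ]}},
    {"_id": "2", "_source": {"attachment": [
        {"uuid": "223645641321321", "path": "../uploads/somepath", "size": 1231},
        {"uuid": "22341424321321", "path": "../uploads/somepath", "size": 1231},
    ]}},
    {"_id": "3", "_source": {"attachment": [
        {"uuid": "22789789341321321", "path": "../uploads/somepath", "size": 1231},
        {"path": "../uploads/somepath", "size": 1231},
    ]}},
]

if __name__ == "__main__":
    matching = [d["_id"] for d in DOCS if has_attachment_without_uuid(d["_source"])]
    print(matching)
```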

Just to supplement this answer: if the mapping for the uuid field was created automatically, Elasticsearch adds it in this way:
"uuid": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
Then the script could look like this:
POST documents/_search
{
"query": {
"bool": {
"must": [
{
"exists": {
"field": "attachment"
}
},
{
"script": {
"script": {
"inline": "doc['attachment.size'].length > doc['attachment.uuid.keyword'].length",
"lang": "painless"
}
}
}
]
}
}
}

Source filtering not working on kibana

I am trying to exclude some field with source filtering.
I create an index:
put testindex
{
"mappings": {
"type1": {
"properties":{
"name": { "type": "text" },
"age": { "type": "integer" }
}
}
}
}
insert a document:
put testindex/type1/a
{
"name":"toto",
"age":23
}
and try a filtered query:
get testindex/_search
{
"_source": {
"excludes": [ "age" ]
},
"query": {
"bool": {
"should": []
}
}
}
the result is:
"hits": [
{
"_index": "testindex",
"_type": "type1",
"_id": "a",
"_score": 1,
"_source": {
"name": "toto",
"age": 23
}
}
]
I don't understand why it does not hide the "age" field in _source.
Setting _source to false gives the same result.
I used Elasticsearch and Kibana 5.6.

OK, I found it.
It's probably due to Kibana.
When I use lowercase for the "get", it does not work.
get testindex/_search
{
"_source": {
"excludes": [ "age" ]
},
"query": {
"bool": {
"should": []
}
}
}
When I use uppercase, it works. I don't really know why, but that's it.
GET testindex/_search
{
"_source": {
"excludes": [ "name" ]
},
"query": {
"bool": {
"should": []
}
}
}

ElasticSearch query sub-objects

I wandered through the docs a lot today, but can't find the answer; probably because I'm new to Elastic and don't really know the entire ES-terminology yet.
Say I have a books type containing a bunch of, well - books. Each book has a nested author.
{
"name": "Me and Jane",
"rating": "10",
"author": {
"name": "John Doe",
"alias":"Mark Twain"
}
}
Now, I know we can query the author's fields like this:
"match": {
"author.name": "Doe"
}
But what if I want to search across all the author fields? I tried author._all, which doesn't work.

Another approach is multi_match with wildcard field names: https://www.elastic.co/guide/en/elasticsearch/guide/current/multi-match-query.html#_using_wildcards_in_field_names
Something like this, I think:
"query": {
"nested": {
"path": "author",
"query": {
"multi_match": {
"query": "doe",
"fields": [
"author.*"
]
}
}
}
}
UPDATE: full sample provided
PUT /books
{
"mappings": {
"paper": {
"properties": {
"author": {
"type": "nested",
"properties": {
"name": {
"type": "string"
},
"alias": {
"type": "string"
}
}
}
}
}
}
}
POST /books/paper/_bulk
{"index":{"_id":1}}
{"author":[{"name":"john doe","alias":"doe"},{"name":"mark twain","alias":"twain"}]}
{"index":{"_id":2}}
{"author":[{"name":"mark doe","alias":"john"}]}
{"index":{"_id":3}}
{"author":[{"name":"whatever","alias":"whatever"}]}
GET /books/paper/_search
{
"query": {
"nested": {
"path": "author",
"query": {
"multi_match": {
"query": "john",
"fields": [
"author.*"
]
}
}
}
}
}
Result is:
"hits": {
"total": 2,
"max_score": 0.5906161,
"hits": [
{
"_index": "books",
"_type": "paper",
"_id": "2",
"_score": 0.5906161,
"_source": {
"author": [
{
"name": "mark doe",
"alias": "john"
}
]
}
},
{
"_index": "books",
"_type": "paper",
"_id": "1",
"_score": 0.5882852,
"_source": {
"author": [
{
"name": "john doe",
"alias": "doe"
},
{
"name": "mark twain",
"alias": "twain"
}
]
}
}
]
}
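To make the effect of the "author.*" pattern concrete, here is a rough Python illustration of how a wildcard field name selects among mapped field names. It uses shell-style matching from the standard library as a stand-in. It is not Elasticsearch's own expansion logic, and the field list is a hypothetical one based on the book document in the question.

```python
from fnmatch import fnmatchcase


def expand_field_pattern(pattern, mapped_fields):
    """Return the mapped field names that a wildcard pattern would select."""
    return [name for name in mapped_fields if fnmatchcase(name, pattern)]


# Hypothetical flat list of field names for the books example.
MAPPED_FIELDS = ["name", "rating", "author.name", "author.alias"]

if __name__ == "__main__":
    print(expand_field_pattern("author.*", MAPPED_FIELDS))
```

The top-level name field is not selected, so a search for "john" through "author.*" looks only at the author's name and alias.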

You can use the Query String Query. For example:
{
"query": {
"query_string": {
"fields": ["author.*"],
"query": "doe",
"use_dis_max": true
}
}
}
