Elastic Search: Different results for query string when using fields

We have an Elasticsearch 5.5 setup and use NEST to run our queries from C#.
When executing the following query:
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "00917751"
}
}
]
}
}
}
We get the desired result: one hit with that number as its identifier.
When using the following query:
{
"query": {
"bool": {
"must": [
{
"query_string": {
"query": "00917751",
"fields": [
"searchReference",
"searchIdentifier",
"searchObjectNo",
"searchBrand",
"searchExtSerNo"
]
}
}
]
}
}
}
We get no results.
The value we are searching for is in the field searchIndentifier, and has the value "1-00917751".
We have a custom analyzer called "final":
.Custom("final", cu => cu
.Tokenizer("keyword").Filters(new List<string>() { "lowercase" }))
The field searchIndentifier has no custom analyzer set on it. We tried adding the whitespace tokenizer to it, but that made no difference.
Another field, "searchObjectNo", does work: searching with the query "S328" finds the value "S328-25". The two fields are mapped exactly the same way.
Any ideas here?
Another question: in the first query, when we search for 1-00917751 (without the quotes), we get a lot of results. We think that is because of the keyword tokenizer. Is that right?
Thank you
Schoof
Index settings and mappings:
{
"inventoryitems": {
"aliases": {},
"mappings": {
"inventoryobject": {
"properties": {
"articleGroups": {
"type": "nested",
"properties": {
"id": {
"type": "long"
}
}
},
"articleId": {
"type": "long"
},
"articleNumber": {
"type": "text",
"boost": 1.5,
"analyzer": "final"
},
"brand": {
"type": "text",
"analyzer": "final"
},
"catalogues": {
"type": "nested",
"properties": {
"articleGroupId": {
"type": "long"
},
"articleGroupName": {
"type": "text",
"analyzer": "final",
"fielddata": true
},
"id": {
"type": "long"
},
"name": {
"type": "text",
"analyzer": "final",
"fielddata": true
}
}
},
"details": {
"type": "nested",
"properties": {
"actualState": {
"type": "double"
},
"allocation": {
"type": "text",
"analyzer": "final",
"fielddata": true
},
"available": {
"type": "double"
},
"batch": {
"type": "text",
"analyzer": "final"
},
"calibrationDate": {
"type": "date"
},
"expected": {
"type": "double"
},
"externalSerialNumber": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"inReturn": {
"type": "double"
},
"inventory": {
"type": "double"
},
"isInMobileCarrier": {
"type": "boolean"
},
"locationDetail": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"locationId": {
"type": "long"
},
"locationName": {
"type": "text",
"analyzer": "final",
"fielddata": true
},
"locationType": {
"type": "text",
"analyzer": "final",
"fielddata": true
},
"lotId": {
"type": "long"
},
"mobileCarrierCode": {
"type": "text",
"analyzer": "final",
"fielddata": true
},
"mobileCarrierId": {
"type": "long"
},
"ownerCode": {
"type": "text",
"analyzer": "final"
},
"requested": {
"type": "double"
},
"reserved": {
"type": "double"
},
"storeLocationId": {
"type": "long"
},
"thicknessCode": {
"type": "text",
"analyzer": "final"
},
"weldedMark": {
"type": "text",
"analyzer": "final"
}
}
},
"docNo": {
"type": "long"
},
"hasStock": {
"type": "boolean"
},
"id": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"identifier": {
"type": "text",
"boost": 1.5,
"analyzer": "final"
},
"inventoryItemType": {
"properties": {
"name": {
"type": "text",
"analyzer": "final",
"fielddata": true
}
}
},
"mobileCarrierId": {
"type": "long"
},
"name": {
"type": "text",
"boost": 1.5,
"analyzer": "final"
},
"objectNumber": {
"type": "text",
"boost": 1.5,
"analyzer": "final"
},
"quantity": {
"type": "double"
},
"reference": {
"type": "text",
"boost": 1.5,
"analyzer": "final"
},
"searchBrand": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"searchExtSerNo": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"searchIndentifier": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"searchName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"searchObjectNo": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"searchReference": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"sortNumber": {
"type": "long"
},
"stockUnit": {
"type": "text",
"boost": 1.5,
"analyzer": "final"
}
}
}
},
"settings": {
"index": {
"number_of_shards": "3",
"provided_name": "inventoryitems",
"creation_date": "1539253308319",
"analysis": {
"analyzer": {
"final": {
"filter": [
"lowercase"
],
"type": "custom",
"tokenizer": "keyword"
}
}
},
"number_of_replicas": "1",
"uuid": "Kb5KuYEiR5GQqgBPVYjJfA",
"version": {
"created": "5050299"
}
}
}
}
}

The answer is pretty simple: in your mapping your field is named searchIndentifier and in your query you're using a field called searchIdentifier which doesn't exist ;-)
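A mismatch like this is easy to miss by eye, so it can help to compare the field names in the query against the mapping programmatically. The following is only a minimal sketch: the helper name unknown_fields is made up for illustration, the mapping is a hand-copied subset of the one in the question, and it only handles top-level field names (no multi-fields or nested paths).

```python
def unknown_fields(mapping_properties, query_fields):
    """Return the query fields that are not defined in the mapping properties."""
    return [field for field in query_fields if field not in mapping_properties]


# Hand-copied subset of the mapping from the question
# (note the spelling "searchIndentifier").
mapping_properties = {
    "searchBrand": {"type": "text"},
    "searchExtSerNo": {"type": "text"},
    "searchIndentifier": {"type": "text"},
    "searchName": {"type": "text"},
    "searchObjectNo": {"type": "text"},
    "searchReference": {"type": "text"},
}

# Fields listed in the failing query_string query.
query_fields = [
    "searchReference",
    "searchIdentifier",
    "searchObjectNo",
    "searchBrand",
    "searchExtSerNo",
]

missing = unknown_fields(mapping_properties, query_fields)
print(missing)
```

In a real setup the mapping_properties dictionary would come from the index's mapping response rather than being typed in by hand.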

Related

Elasticsearch query for all values of field with group by

I am having trouble forming a query that fetches all values, grouped the way a SQL GROUP BY would.
Below is my data structure.
Product index:
{
"createdBy" : "61c1fcdd88dbad1920da8caf",
"creationTime" : "2021-12-22T11:58:53.576932Z",
"lastModifiedBy" : "61c1fcdd88dbad1920da8caf",
"lastModificationTime" : "2021-12-22T11:58:53.576932Z",
"id" : "61c312fdc6aa620a609db0b2",
"title" : "string",
"brand" : "string",
"longDesc" : "string",
"categoryId" : "string",
"imageUrls" : [
"string",
"string"
],
"keySpecs" : [
"string",
"string"
],
"facets" : [
{
"name" : "color",
"value" : "red"
},
{
"name" : "storage",
"value" : "16 GB"
},
{
"name" : "brand",
"value" : "Intex"
}
],
"categoryName" : "handsets"
}
Now I want to fetch all the facets together with their distinct values and the count for each value. For example:
productA has color blue, productB has color red
productA has brand ABC, productB has brand XYZ
So I want data that lists all facets like this:
color: blue (200 count), red (12 count)
brand: ABC (13 count), XYZ (99 count)
Also, different products will have different facets: an iPhone will have color, memory, brand and size, but a pen will have only color and brand (no memory or size).
Note: I'm using the latest version of Elasticsearch.
=================
UPDATE 1:
Below are the ES mapping details:
{
"settings": {
"analysis": {
"filter": {
"english_stop": {
"type": "stop",
"stopwords": "_english_"
},
"english_keywords": {
"type": "keyword_marker",
"keywords": [
"example"
]
},
"english_stemmer": {
"type": "stemmer",
"language": "english"
},
"english_possessive_stemmer": {
"type": "stemmer",
"language": "possessive_english"
}
},
"analyzer": {
"lalashree_standard_analyzer": {
"tokenizer": "standard",
"filter": [
"english_possessive_stemmer",
"lowercase",
"english_stop",
"english_keywords",
"english_stemmer"
]
},
"html_standard_analyzer": {
"char_filter": [
"html_strip"
],
"tokenizer": "standard",
"filter": [
"english_possessive_stemmer",
"lowercase",
"english_stop",
"english_keywords",
"english_stemmer"
]
}
}
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"createdBy": {
"type": "keyword"
},
"creationTime": {
"type": "date"
},
"lastModifiedBy": {
"type": "keyword"
},
"lastModificationTime": {
"type": "date"
},
"deleted": {
"type": "boolean"
},
"deletedBy": {
"type": "keyword"
},
"deletionTime": {
"type": "date"
},
"title": {
"type": "text",
"analyzer": "lalashree_standard_analyzer",
"fields": {
"suggest": {
"type": "completion"
}
}
},
"shortDesc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"longDesc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"categoryId": {
"type": "keyword"
},
"searchDetails": {
"type": "object",
"properties": {
"desc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"keywords": {
"type": "text",
"analyzer": "lalashree_standard_analyzer",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"imageUrls": {
"type": "keyword",
"index": false
},
"keySpecs": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"sections": {
"type": "object",
"properties": {
"name": {
"type": "text",
"index": false
},
"shortDesc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"longDesc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"htmlContent": {
"type": "text",
"analyzer": "html_standard_analyzer"
}
}
},
"facets": {
"type": "nested",
"properties": {
"name": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
},
"specificationItems": {
"type": "object",
"properties": {
"key": {
"type": "text",
"analyzer": "lalashree_standard_analyzer",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"values": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
}
}
},
"categoryName": {
"type": "keyword"
},
"productFamily": {
"type": "nested",
"properties": {
"id": {
"type": "keyword"
},
"familyVariantOptions": {
"type": "nested",
"properties": {
"name": {
"type": "keyword"
},
"values": {
"type": "keyword"
}
}
},
"productFamilyItems": {
"type": "nested",
"properties": {
"baseProductId": {
"type": "keyword"
},
"itemVariantInfoSet": {
"type": "nested",
"properties": {
"name": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
}
}
}
}
},
"rating": {
"type": "float"
},
"totalReviewsCount": {
"type": "long"
},
"stores": {
"type": "nested",
"properties": {
"id": {
"type": "keyword"
},
"logo": {
"type": "keyword",
"index": false
},
"active": {
"type": "boolean"
},
"name": {
"type": "text"
},
"quantity": {
"type": "long"
},
"rating": {
"type": "float"
},
"totalReviewsCount": {
"type": "long"
},
"price.mrp": {
"type": "float"
},
"price.sp": {
"type": "float"
},
"location.geoPoint": {
"type": "geo_point"
},
"oos": {
"type": "boolean"
}
}
}
}
}
}
This query first groups by facet name and then groups each name's values. The two size settings control how many facets are returned and how many values are returned per facet. I think it does what you need.
Note that if you have a very large number of documents and performance matters, this query may perform badly.
{
"size": 0,
"aggs": {
"facets": {
"nested": {
"path": "facets"
},
"aggs": {
"names": {
"terms": {
"field": "facets.name",
"size": 10
},
"aggs": {
"values": {
"terms": {
"field": "facets.value",
"size": 10
}
}
}
}
}
}
}
}
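To get from the response of this aggregation to the "color: blue (200), red (12)" shape asked for in the question, the nested buckets have to be flattened on the client side. Here is a minimal sketch of that step. The sample response is hypothetical (the counts are simply the ones from the question) and is trimmed to the keys the function reads; flatten_facets is an illustrative name, not part of any client library.

```python
def flatten_facets(response):
    """Turn the nested names -> values buckets into {facet name: {value: count}}."""
    name_buckets = response["aggregations"]["facets"]["names"]["buckets"]
    return {
        name_bucket["key"]: {
            value_bucket["key"]: value_bucket["doc_count"]
            for value_bucket in name_bucket["values"]["buckets"]
        }
        for name_bucket in name_buckets
    }


# Hypothetical response, trimmed to the parts the function reads.
sample_response = {
    "aggregations": {
        "facets": {
            "doc_count": 324,
            "names": {
                "buckets": [
                    {"key": "color", "doc_count": 212, "values": {"buckets": [
                        {"key": "blue", "doc_count": 200},
                        {"key": "red", "doc_count": 12},
                    ]}},
                    {"key": "brand", "doc_count": 112, "values": {"buckets": [
                        {"key": "XYZ", "doc_count": 99},
                        {"key": "ABC", "doc_count": 13},
                    ]}},
                ]
            },
        }
    }
}

facet_counts = flatten_facets(sample_response)
for name, values in facet_counts.items():
    print(name, values)
```

Because the facets field is nested, each doc_count here counts nested facet entries rather than products, which is the same number as long as a product lists each facet name only once.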

Regex query from Kibana discovery tab over logs from logstash

I am trying to find a way to match the line below with either a KQL or a Lucene query from Kibana's Discover tab. The value I am trying to match is in the field "message", which is of type "text".
message: Starting <app_name> v1.7.0-SNAPSHOT on ...
I tried the following query:
message: /Starting\s[a-ZA-Z] v/
Application Stack:
logstash-7.10.0
Kibana 7.6.1
elasticsearch 7.6.1
Index Name: logstash-filebeat-7.10.0
Index Mappings:
{
"mapping": {
"_doc": {
"dynamic": "true",
"_meta": {},
"_source": {
"includes": [],
"excludes": []
},
"dynamic_date_formats": [
"strict_date_optional_time",
"yyyy/MM/dd HH:mm:ss Z||yyyy/MM/dd Z"
],
"dynamic_templates": [
{
"message_field": {
"path_match": "message",
"match_mapping_type": "string",
"mapping": {
"norms": false,
"type": "text"
}
}
},
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"norms": false,
"type": "text"
}
}
}
],
"date_detection": true,
"numeric_detection": false,
"properties": {
"@timestamp": {
"type": "date"
},
"@version": {
"type": "keyword"
},
"agent": {
"properties": {
"hostname": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"type": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"version": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"classname": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"container": {
"properties": {
"id": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"ecs": {
"properties": {
"version": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"geoip": {
"dynamic": "true",
"properties": {
"ip": {
"type": "ip"
},
"latitude": {
"type": "half_float"
},
"location": {
"type": "geo_point"
},
"longitude": {
"type": "half_float"
}
}
},
"input": {
"properties": {
"type": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"log": {
"properties": {
"file": {
"properties": {
"path": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"flags": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"offset": {
"type": "long"
}
}
},
"loglevel": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"message": {
"type": "text",
"norms": false
},
"thread": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}

ElasticSearch Terms Aggregation not working with custom Analyzer and Pattern Tokenizer

I am trying the Terms Aggregation for the first time and there seems to be an issue with the custom pattern tokenizer I am using.
Here is the Mapping:
{
"mappings": {
"properties": {
"contentItemType": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"analyzer": "patternAnalyzer"
},
"theme": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"analyzer": "patternAnalyzer"
}
}
},
"settings": {
"analysis": {
"analyzer": {
"patternAnalyzer": {
"tokenizer": "patternTokenizer"
}
},
"tokenizer": {
"patternTokenizer": {
"type": "pattern",
"pattern": ";"
}
}
}
}
}
When I search with the aggregation API (http://my_server/index_name/_search), this is the result:
{
"aggregations": {
"group_by_contentItemType": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Correspondence; Reports",
"doc_count": 3
},
{
"key": "Correspondence",
"doc_count": 2
},
{
"key": "Meeting Minutes; Administrative Records; Reports",
"doc_count": 2
},
{
"key": "Correspondence; Legal and Treaty Material; Reports",
"doc_count": 1
},
{
"key": "Correspondence; Memoranda",
"doc_count": 1
},
{
"key": "Memoranda",
"doc_count": 1
},
{
"key": "Reports",
"doc_count": 1
}
]
},
"group_by_theme": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "International Relations",
"doc_count": 2
},
{
"key": "Key Events; Dissent; Dissent; Resistance; Human Rights",
"doc_count": 2
},
{
"key": "Border Security and Migration; Key Events",
"doc_count": 1
},
{
"key": "Border Security and Migration; Second World War Aftermath",
"doc_count": 1
},
{
"key": "Domestic Politics",
"doc_count": 1
},
{
"key": "Domestic Politics; Border Security and Migration",
"doc_count": 1
},
{
"key": "Economics and Trade; International Relations",
"doc_count": 1
},
{
"key": "Embassy and Consulate Administration; Industry and Agriculture; International Relations",
"doc_count": 1
},
{
"key": "Populations and Social Policy; Second World War Aftermath; International Relations",
"doc_count": 1
}
]
}
}
}
As you can see, the aggregation buckets contain the whole semicolon-separated strings instead of the individual values. I have been stuck on this problem for quite a few days; I have looked at many examples but still cannot solve it.
Please help. Thanks in advance!
EDIT:
Here is the full mapping after @CatalinM's answer:
{
"local_cwee": {
"mappings": {
"dynamic": "false",
"properties": {
"author": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"commentaries": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"contentDateEndMonth": {
"type": "integer"
},
"contentDateEndSpecified": {
"type": "boolean"
},
"contentDateEndYear": {
"type": "integer"
},
"contentDateMonth": {
"type": "integer"
},
"contentDateMonthSpecified": {
"type": "boolean"
},
"contentDateStartMonth": {
"type": "integer"
},
"contentDateStartSpecified": {
"type": "boolean"
},
"contentDateStartYear": {
"type": "integer"
},
"contentDateYear": {
"type": "integer"
},
"contentDoi": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"contentItemType": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"contentItemTypeFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"contentTitle": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"copyrightNotices": {
"type": "nested",
"properties": {
"imageName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"text": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"countries": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"country": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"coverDateEndMonth": {
"type": "integer"
},
"coverDateEndSpecified": {
"type": "boolean"
},
"coverDateEndYear": {
"type": "integer"
},
"coverDateMonth": {
"type": "integer"
},
"coverDateMonthSpecified": {
"type": "boolean"
},
"coverDateStartMonth": {
"type": "integer"
},
"coverDateStartSpecified": {
"type": "boolean"
},
"coverDateStartYear": {
"type": "integer"
},
"coverDateYear": {
"type": "integer"
},
"displayName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"documentDoi": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"documentLevel": {
"type": "integer"
},
"keyEvents": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"language": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"languageFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"languages": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"languagesFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"moduleNumber": {
"type": "integer"
},
"notes": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"pageTranscript": {
"type": "text",
"term_vector": "with_positions",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
},
"analyzer": "whiteSpaceAnalyzer"
},
"people": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"publicationDate": {
"type": "integer"
},
"publicationDateEndMonth": {
"type": "integer"
},
"publicationDateEndSpecified": {
"type": "boolean"
},
"publicationDateEndYear": {
"type": "integer"
},
"publicationDateMonth": {
"type": "integer"
},
"publicationDateMonthSpecified": {
"type": "boolean"
},
"publicationDateStartMonth": {
"type": "integer"
},
"publicationDateStartSpecified": {
"type": "boolean"
},
"publicationDateStartYear": {
"type": "integer"
},
"publicationDateYear": {
"type": "integer"
},
"publicationDoi": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"publicationId": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"publicationIdFacet": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"publicationTitle": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"publicationType": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"publicationTypeFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"publicationYear": {
"type": "integer"
},
"publisherName": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"publisherNameFacet": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subject": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subjectAreas": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subjectAreasFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subjectCountries": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subjectCountriesFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subjectKeyword": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subjectKeywordFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subthemeFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"subthemes": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"theme": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"themeFacets": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
},
"themes": {
"type": "text",
"analyzer": "patternAnalyzer",
"fielddata": true
}
}
}
}
}
With your custom tokenizer, the tokens in the text field are "Correspondence", "Meeting Minutes", "Administrative Records", and so on, so I don't think you need the keyword sub-field.
To make aggregations work on the text field, you have to add "fielddata": true to its mapping. Fielddata is disabled by default because aggregating on large text fields is expensive and rarely what you want, but in your case the tokens are exactly the values you want to aggregate on.
Here's the simplified configuration:
{
"mappings": {
"properties": {
"contentItemType": {
"type": "text",
"fielddata": true,
"analyzer": "patternAnalyzer"
}
}
},
"settings": {
"analysis": {
"analyzer": {
"patternAnalyzer": {
"tokenizer": "patternTokenizer"
}
},
"tokenizer": {
"patternTokenizer": {
"type": "pattern",
"pattern": ";"
}
}
}
}
}
The query:
{
"aggregations" : {
"test" : {
"terms" : { "field" : "contentItemType" }
}
}
}
And the result:
"aggregations": {
"test": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": " Administrative Records",
"doc_count": 1
},
{
"key": "Meeting Minutes",
"doc_count": 1
},
{
"key": " Reports",
"doc_count": 1
}
]
}
}
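Note the leading spaces in some of the keys above (" Administrative Records", " Reports"): the pattern ";" splits on the semicolon only, so the space that follows it stays in the token. The sketch below approximates this behaviour with Python's re.split. It is a stand-in for illustration, not Elasticsearch's implementation, and the alternative pattern ";\s*" is a suggestion of mine rather than part of the configuration above.

```python
import re


def pattern_tokenize(text, pattern):
    """Split text on a regex and drop empty tokens, roughly like a pattern tokenizer."""
    return [token for token in re.split(pattern, text) if token]


value = "Meeting Minutes; Administrative Records; Reports"

# Splitting on ";" alone keeps the space that follows each semicolon.
as_configured = pattern_tokenize(value, r";")

# Also consuming the whitespace after the semicolon gives clean tokens.
trimmed = pattern_tokenize(value, r";\s*")

print(as_configured)
print(trimmed)
```

If the clean tokens are what you want as bucket keys, changing the tokenizer pattern (or adding a trim token filter) and reindexing would be one way to get them.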

How to find objects with inner objects having multiple fields by specific values in Elastic Search

I have an index whose documents contain objects named "DynamicFields", and each of them has inner objects named "Fields", like this:
{
"DynamicFields": [
{
"Fields": [
{
"DFieldVal": "Value1",
"Owned": 0,
"DFieldRelCode": 181254,
"DFieldCode": 1835
},
{
"DFieldVal": "Value2",
"Owned": 0,
"DFieldRelCode": 181255,
"DFieldCode": 1836
},
{
"DFieldVal": "Value3",
"Owned": 1,
"DFieldRelCode": 181256,
"DFieldCode": 1837
},
{
"DFieldVal": "Value4",
"Owned": 0,
"DFieldRelCode": 181257,
"DFieldCode": 1838
}
]
}
]
}
I need to find "DynamicFields" objects that have an inner "Fields" object with exactly these values:
"DFieldCode": 1837
and
"Owned": 0
I'm using the query below, but it gives the wrong result. It should return nothing, because no single inner "Fields" object has both values:
{
"from":0,
"size":10,
"query": {
"bool":{
"must":[
{ "terms": { "DynamicFields.Fields.Owned" : [0] } },
{ "terms": { "DynamicFields.Fields.DFieldCode" : [1837] } }
]
}
}
}
I think the problem is that Elasticsearch treats the inner objects' properties as ordinary properties of the root object, so it returns documents where the requested values appear anywhere among the inner objects, not necessarily in the same inner object.
EDIT:
I have trimmed the sample data above to keep it simple.
The mapping below is the full mapping of the data:
{
"marketplace": {
"mappings": {
"object": {
"properties": {
"Addresses": {
"properties": {
"AddrID": {
"type": "long"
},
"AddressText": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"AddressTree": {
"properties": {
"AddrFieldRelID": {
"type": "long"
},
"AddrTitleName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"AddrTitlePersianName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"AddrValName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Latitude": {
"type": "float"
},
"Longitude": {
"type": "float"
}
}
},
"Latitude": {
"type": "float"
},
"Longitude": {
"type": "float"
},
"Tel": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"DelFlag": {
"type": "long"
},
"DynamicFields": {
"properties": {
"DynamicDefCode": {
"type": "long"
},
"DynamicDefDataTypeName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"DynamicDefName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"DynamicValKind": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Fields": {
"properties": {
"DFieldCode": {
"type": "long"
},
"DFieldRelCode": {
"type": "long"
},
"DFieldVal": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Owned": {
"type": "boolean"
}
}
}
}
},
"GFRefCode": {
"type": "long"
},
"GoodsDesc": {
"properties": {
"FName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"GoodsFullName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"Supplier": {
"properties": {
"Barcode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"GPackDayPrice": {
"type": "long"
},
"GoodsEnterDate": {
"type": "date"
},
"GoodsFinalCode": {
"type": "long"
},
"GoodsFullName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"GoodsWHStock": {
"type": "long"
},
"StoreName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"UserName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"WHName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"WareHouseCode": {
"type": "long"
}
}
},
"UserName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"GoodsFinalCode": {
"type": "long"
},
"Images": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"IsMainObject": {
"type": "boolean"
},
"ObjectDetailPackID": {
"type": "long"
},
"ObjectKind": {
"type": "long"
},
"Prices": {
"properties": {
"Barcode": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"GPWeight": {
"type": "float"
},
"GpackDayPrice": {
"type": "long"
},
"PackingName": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"RefGoodsFinalCode": {
"type": "long"
},
"TreePath": {
"properties": {
"DFieldCode": {
"type": "long"
},
"DFieldRelCode": {
"type": "long"
},
"DFieldVal": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
}
}
}
thanks.
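As your mapping shows, "Fields" is stored as a plain object rather than as nested. You can read more about this in the Elasticsearch documentation on arrays of objects.
Basically, unless told otherwise, Elasticsearch flattens arrays of objects when it indexes them, so the objects in the array lose their structure and the link between the values of one inner object is lost.
You should define the type of "Fields" as nested to avoid this.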
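To see why the original query matches, it may help to model the two behaviours in plain Python. This is only an in-memory illustration using the sample data from the question, not Elasticsearch code; the function names are made up. The nested_query at the end is a sketch of what the search could look like once "Fields" is mapped as nested, assuming the path is "DynamicFields.Fields".

```python
def flatten(objects):
    """Model object-type indexing: collect each property's values across all inner objects."""
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def matches_flattened(objects, criteria):
    """Object mapping: each criterion may be satisfied by a different inner object."""
    flat = flatten(objects)
    return all(value in flat.get(key, []) for key, value in criteria.items())


def matches_nested(objects, criteria):
    """Nested mapping: all criteria must hold within one and the same inner object."""
    return any(all(obj.get(k) == v for k, v in criteria.items()) for obj in objects)


fields = [
    {"DFieldVal": "Value1", "Owned": 0, "DFieldRelCode": 181254, "DFieldCode": 1835},
    {"DFieldVal": "Value2", "Owned": 0, "DFieldRelCode": 181255, "DFieldCode": 1836},
    {"DFieldVal": "Value3", "Owned": 1, "DFieldRelCode": 181256, "DFieldCode": 1837},
    {"DFieldVal": "Value4", "Owned": 0, "DFieldRelCode": 181257, "DFieldCode": 1838},
]
criteria = {"DFieldCode": 1837, "Owned": 0}

flattened_hit = matches_flattened(fields, criteria)  # the unwanted match
nested_hit = matches_nested(fields, criteria)        # the expected "no match"
print(flattened_hit, nested_hit)

# Sketch of the query body after remapping "Fields" as nested (assumed path).
nested_query = {
    "query": {
        "nested": {
            "path": "DynamicFields.Fields",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"DynamicFields.Fields.Owned": 0}},
                        {"term": {"DynamicFields.Fields.DFieldCode": 1837}},
                    ]
                }
            },
        }
    }
}
```

Keep in mind that the type of an existing field cannot be changed in place, so switching "Fields" to nested means creating a new index with the new mapping and reindexing the data.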

How to update Mapping

How can I update the index mapping to include the following field: doc_as_upsert : true
My Logstash pipeline, which ingests CloudTrail logs from S3, is logging the following error:
Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"cloudtrail-2018.10.08", :_type=>"doc", :_routing=>nil}, #<LogStash::Event:0x251f932>], :response=>{"index"=>{"_index"=>"cloudtrail-2018.10.08", "_type"=>"doc", "_id"=>"t2mmVWYBVQr-RbWuAQIS", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse [requestParameters.disableApiTermination]", "caused_by"=>{"type"=>"json_parse_exception", "reason"=>"Current token (START_OBJECT) not of boolean type\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@133a6c; line: 1, column: 1509]"}}}}}
The mapping is dynamic and very long, so I can't fit it all here, but here is what I can fit:
{
"cloudtrail-2018.10.08": {
"mappings": {
"_default_": {
"dynamic_templates": [
{
"message_field": {
"path_match": "message",
"match_mapping_type": "string",
"mapping": {
"norms": false,
"type": "text"
}
}
},
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"norms": false,
"type": "text"
}
}
}
],
"properties": {
"@timestamp": {
"type": "date"
},
"@version": {
"type": "keyword"
},
"geoip": {
"dynamic": "true",
"properties": {
"ip": {
"type": "ip"
},
"latitude": {
"type": "half_float"
},
"location": {
"type": "geo_point"
},
"longitude": {
"type": "half_float"
}
}
}
}
},
"doc": {
"dynamic_templates": [
{
"message_field": {
"path_match": "message",
"match_mapping_type": "string",
"mapping": {
"norms": false,
"type": "text"
}
}
},
{
"string_fields": {
"match": "*",
"match_mapping_type": "string",
"mapping": {
"fields": {
"keyword": {
"ignore_above": 256,
"type": "keyword"
}
},
"norms": false,
"type": "text"
}
}
}
],
"properties": {
"@timestamp": {
"type": "date"
},
"@version": {
"type": "keyword"
},
"additionalEventData": {
"properties": {
"configRuleArn": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"configRuleInputParameters": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"configRuleName": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"managedRuleIdentifier": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"notificationJobType": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"vpcEndpointId": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"apiVersion": {
"type": "date"
},
"awsRegion": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"errorCode": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"errorMessage": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"eventID": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"eventName": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"eventSource": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"eventType": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"eventVersion": {
"type": "text",
"norms": false,
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
I get the following error when I try to update the mapping with this request:
PUT cloudtrail-*/_mapping/_doc
{
"properties": {
"doc_as_upsert": true
}
}
The error:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "Expected map for property [fields] on field [doc_as_upsert] but got a class java.lang.String"
}
],
"type": "mapper_parsing_exception",
"reason": "Expected map for property [fields] on field [doc_as_upsert] but got a class java.lang.String"
},
"status": 400
}
doc_as_upsert is a flag you pass to the update API to tell Elasticsearch to use the content of doc as the upsert value, that is, to insert doc as a new document if the document does not exist yet. It has nothing to do with updating the index mapping.
For example, assume you want to update the document with id 1 in the index test (changing its name):
POST test/_doc/1/_update
{
"doc" : {
"name" : "new_name"
},
"doc_as_upsert" : true
}
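The semantics of the flag can be illustrated with a small in-memory model. This is a simplified sketch and not the Elasticsearch client: the index is just a Python dict, the update function is made up for illustration, and the merge is shallow.

```python
def update(index, doc_id, doc, doc_as_upsert=False):
    """Model a partial update: merge doc into an existing document,
    or insert doc as a new document when doc_as_upsert is set."""
    if doc_id in index:
        index[doc_id].update(doc)
        return "updated"
    if doc_as_upsert:
        index[doc_id] = dict(doc)
        return "created"
    raise KeyError(f"document {doc_id!r} is missing")


test_index = {}

# The document does not exist yet, so doc is inserted as-is.
first = update(test_index, "1", {"name": "new_name"}, doc_as_upsert=True)

# The document now exists, so doc is merged into it.
second = update(test_index, "1", {"name": "newer_name"}, doc_as_upsert=True)

print(first, second, test_index)
```

Without the flag, updating a document that does not exist fails, which the model represents by raising KeyError.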
