Related
i am having trouble forming query to fetch all values with sql group by kind of thing.
so below is my data structure:
product index:
{
"createdBy" : "61c1fcdd88dbad1920da8caf",
"creationTime" : "2021-12-22T11:58:53.576932Z",
"lastModifiedBy" : "61c1fcdd88dbad1920da8caf",
"lastModificationTime" : "2021-12-22T11:58:53.576932Z",
"id" : "61c312fdc6aa620a609db0b2",
"title" : "string",
"brand" : "string",
"longDesc" : "string",
"categoryId" : "string",
"imageUrls" : [
"string",
"string"
],
"keySpecs" : [
"string",
"string",
],
"facets" : [
{
"name" : "color",
"value" : "red"
},
{
"name" : "storage",
"value" : "16 GB"
},
{
"name" : "brand",
"value" : "Intex"
}
],
"categoryName" : "handsets"
}
Now, i want to fetch all the facets with their different values and count as well. Let's say
productA has color blue, productB has color red
productA has brand ABC, productB has brand XYZ
so, i want data which list all facets like:
color: blue(200 count), red (12 count)
brand: ABC(13 count), XYZ (99 count)
Also, different product will have different type of facet, like iphone will have color memory brand size, but a pen will have color and brand only (not memory/size).
Note: i'm using latest version of elastic
=================
UPDATE 1:
Below is the es mapping details
{
"settings": {
"analysis": {
"filter": {
"english_stop": {
"type": "stop",
"stopwords": "_english_"
},
"english_keywords": {
"type": "keyword_marker",
"keywords": [
"example"
]
},
"english_stemmer": {
"type": "stemmer",
"language": "english"
},
"english_possessive_stemmer": {
"type": "stemmer",
"language": "possessive_english"
}
},
"analyzer": {
"lalashree_standard_analyzer": {
"tokenizer": "standard",
"filter": [
"english_possessive_stemmer",
"lowercase",
"english_stop",
"english_keywords",
"english_stemmer"
]
},
"html_standard_analyzer": {
"char_filter": [
"html_strip"
],
"tokenizer": "standard",
"filter": [
"english_possessive_stemmer",
"lowercase",
"english_stop",
"english_keywords",
"english_stemmer"
]
}
}
}
},
"mappings": {
"properties": {
"id": {
"type": "keyword"
},
"createdBy": {
"type": "keyword"
},
"creationTime": {
"type": "date"
},
"lastModifiedBy": {
"type": "keyword"
},
"lastModificationTime": {
"type": "date"
},
"deleted": {
"type": "boolean"
},
"deletedBy": {
"type": "keyword"
},
"deletionTime": {
"type": "date"
},
"title": {
"type": "text",
"analyzer": "lalashree_standard_analyzer",
"fields": {
"suggest": {
"type": "completion"
}
}
},
"shortDesc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"longDesc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"categoryId": {
"type": "keyword"
},
"searchDetails": {
"type": "object",
"properties": {
"desc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"keywords": {
"type": "text",
"analyzer": "lalashree_standard_analyzer",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
},
"imageUrls": {
"type": "keyword",
"index": false
},
"keySpecs": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"sections": {
"type": "object",
"properties": {
"name": {
"type": "text",
"index": false
},
"shortDesc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"longDesc": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
},
"htmlContent": {
"type": "text",
"analyzer": "html_standard_analyzer"
}
}
},
"facets": {
"type": "nested",
"properties": {
"name": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
},
"specificationItems": {
"type": "object",
"properties": {
"key": {
"type": "text",
"analyzer": "lalashree_standard_analyzer",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"values": {
"type": "text",
"analyzer": "lalashree_standard_analyzer"
}
}
},
"categoryName": {
"type": "keyword"
},
"productFamily": {
"type": "nested",
"properties": {
"id": {
"type": "keyword"
},
"familyVariantOptions": {
"type": "nested",
"properties": {
"name": {
"type": "keyword"
},
"values": {
"type": "keyword"
}
}
},
"productFamilyItems": {
"type": "nested",
"properties": {
"baseProductId": {
"type": "keyword"
},
"itemVariantInfoSet": {
"type": "nested",
"properties": {
"name": {
"type": "keyword"
},
"value": {
"type": "keyword"
}
}
}
}
}
}
},
"rating": {
"type": "float"
},
"totalReviewsCount": {
"type": "long"
},
"stores": {
"type": "nested",
"properties": {
"id": {
"type": "keyword"
},
"logo": {
"type": "keyword",
"index": false
},
"active": {
"type": "boolean"
},
"name": {
"type": "text"
},
"quantity": {
"type": "long"
},
"rating": {
"type": "float"
},
"totalReviewsCount": {
"type": "long"
},
"price.mrp": {
"type": "float"
},
"price.sp": {
"type": "float"
},
"location.geoPoint": {
"type": "geo_point"
},
"oos": {
"type": "boolean"
}
}
}
}
}
}
This query first group by names then groups each name's values. By setting sizes, you can arrange number of facets you want and number of items in each facet. I think it does what you need.
Note that if you have too many documents and if performance matters, this query may perform bad.
{
"size": 0,
"aggs": {
"facets": {
"nested": {
"path": "facets"
},
"aggs": {
"names": {
"terms": {
"field": "facets.name",
"size": 10
},
"aggs": {
"values": {
"terms": {
"field": "facets.value",
"size": 10
}
}
}
}
}
}
}
}
I'm trying to setup a mapping for this kind of logs (the automatic mapping didn't work).
here's the log that I've to analyse thanks to kibana (found on the internet):
{"index":
{"_index":"logstash-2015.05.18","_type":"log"
}
}
{"#timestamp":"2015-05-18T09:03:25.877Z","ip":"185.124.182.126","extension":"gif","response":"404",
"geo":{
"coordinates":{
"lat":36.518375,"lon":-86.05828083
},
"src":"PH","dest":"MM","srcdest":"PH:MM"
},
"#tags":["success","info"],"utc_time":"2015-05-18T09:03:25.877Z","referer":"http://twitter.com/error/william-shepherd","agent":"Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1","clientip":"185.124.182.126","bytes":804,"host":"motion-media.theacademyofperformingartsandscience.org","request":"/canhaz/gemini-7.gif","url":"https://motion-media.theacademyofperformingartsandscience.org/canhaz/gemini-7.gif","#message":"185.124.182.126 - - [2015-05-18T09:03:25.877Z] \"GET /canhaz/gemini-7.gif HTTP/1.1\" 404 804 \"-\" \"Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1\"","spaces":"this is a thing with lots of spaces wwwwoooooo","xss":"<script>console.log(\"xss\")</script>","headings":["<h3>f-i-j-nl-ng</h5>","http://facebook.com/success/lodewijk-van-den-berg"],"links":["daniel-tani#facebook.com","http://nytimes.com/security/kathryn-sullivan","www.nytimes.com"],
"relatedContent":[
{"url":"http://www.laweekly.com/news/cbs-crew-rat-fink-2368032","og:type":"article","og:title":"CBS Crew Rat Fink","og:description":"Near a couple of auto body shops (and a sharp new Space Invader mosaic that we'll post soon) near Temple and Westmoreland is a CBS wall with a nice Rat ...","og:url":"http://www.laweekly.com/news/cbs-crew-rat-fink-2368032","article:published_time":"2008-01-14T08:05:26-08:00","article:modified_time":"2014-10-28T14:59:52-07:00","article:section":"News","article:tag":"Mark Mauer","og:image":"http://IMAGES1.laweekly.com/imager/cbs-crew-rat-fink/u/original/2430299/img_2049.jpg","og:image:height":"360","og:image:width":"480","og:site_name":"LA Weekly","twitter:title":"CBS Crew Rat Fink","twitter:description":"Near a couple of auto body shops (and a sharp new Space Invader mosaic that we'll post soon) near Temple and Westmoreland is a CBS wall with a nice Rat ...","twitter:card":"summary","twitter:image":"http://IMAGES1.laweekly.com/imager/cbs-crew-rat-fink/u/original/2430299/img_2049.jpg","twitter:site":"#laweekly"
},
{"url":"http://www.laweekly.com/news/push-and-retna-in-koreatown-2368043","og:type":"article","og:title":"Push and Retna in Koreatown","og:description":"Yeah, I originally had this posted this morning as Push & Ayer - Sorry. It looked like a Retna piece, but I saw the Ayer in there and thought that must ...","og:url":"http://www.laweekly.com/news/push-and-retna-in-koreatown-2368043","article:published_time":"2008-01-29T07:28:32-08:00","article:modified_time":"2014-10-28T14:59:54-07:00","article:section":"News","article:tag":"Shelley Leopold","og:image":"http://IMAGES1.laweekly.com/imager/push-and-retna-in-koreatown/u/original/2430376/img_3671.jpg","og:image:height":"360","og:image:width":"480","og:site_name":"LA Weekly","twitter:title":"Push and Retna in Koreatown","twitter:description":"Yeah, I originally had this posted this morning as Push & Ayer - Sorry. It looked like a Retna piece, but I saw the Ayer in there and thought that must ...","twitter:card":"summary","twitter:image":"http://IMAGES1.laweekly.com/imager/push-and-retna-in-koreatown/u/original/2430376/img_3671.jpg","twitter:site":"#laweekly"
},
{"url":"http://www.laweekly.com/news/asylm-ruets-pdb-on-santa-monica-2368012","og:type":"article","og:title":"Asylm, Ruets, PDB on Santa Monica","og:description":"Not a new piece, but a well-hidden gem a little south of Santa Monica Blvd. in an alley off of Heliotrope or Edgemont. I've been sitting on this for a w...","og:url":"http://www.laweekly.com/news/asylm-ruets-pdb-on-santa-monica-2368012","article:published_time":"2008-04-22T15:11:15-07:00","article:modified_time":"2014-10-28T14:59:48-07:00","article:section":"News","article:tag":"Culture and Lifestyle","og:image":"http://images1.laweekly.com/imager/asylm-ruets-pdb-on-santa-monica/u/original/2430137/img_5027.jpg","og:image:height":"360","og:image:width":"480","og:site_name":"LA Weekly","twitter:title":"Asylm, Ruets, PDB on Santa Monica","twitter:description":"Not a new piece, but a well-hidden gem a little south of Santa Monica Blvd. in an alley off of Heliotrope or Edgemont. I've been sitting on this for a w...","twitter:card":"summary","twitter:image":"http://images1.laweekly.com/imager/asylm-ruets-pdb-on-santa-monica/u/original/2430137/img_5027.jpg","twitter:site":"#laweekly"
},
{"url":"http://www.laweekly.com/news/laurence-tribe-tangles-with-cbs-and-la-city-hall-2396867","og:type":"article","og:title":"Laurence Tribe Tangles with CBS and L.A. City Hall","og:description":"The United States Court of Appeals for the Ninth Circuit’s Courtroom 3 - a miniature auditorium with comfortable, smoked salmon-colored seats - wa...","og:url":"http://www.laweekly.com/news/laurence-tribe-tangles-with-cbs-and-la-city-hall-2396867","article:published_time":"2008-06-04T14:16:10-07:00","article:modified_time":"2014-11-26T14:43:59-08:00","article:section":"News","og:site_name":"LA Weekly","twitter:title":"Laurence Tribe Tangles with CBS and L.A. City Hall","twitter:description":"The United States Court of Appeals for the Ninth Circuit’s Courtroom 3 - a miniature auditorium with comfortable, smoked salmon-colored seats - wa...","twitter:card":"summary","twitter:site":"#laweekly"
}
],
"machine":{
"os":"win xp","ram":3221225472
},
"#version":"1"
}
and here's the mapping I've put in the Kibana's dev tool:
PUT logstash-2019.05.09
{
"mappings": {
"doc": {
"properties": {
"index": {
"_index": {
"type": "keyword"
},
"_type": {
"type": "text"
}
},
"#timestamp": {
"type": "date"
},
"ip": {
"type": "ip"
},
"extension": {
"type": "text"
},
"response": {
"type": "text"
},
"geo": {
"coordinates": {
"type": "geo_point"
},
"src": {
"type": "text"
},
"dest": {
"type": "text"
},
"srcdest": {
"type": "text"
}
},
"tags": {
"type": "text"
},
"utc_time": {
"type": "date"
},
"referer": {
"type": "text"
},
"agent": {
"type": "text"
},
"clientip": {
"type": "ip"
},
"bytes": {
"type": "integer"
},
"host": {
"type": "text"
},
"request": {
"type": "text"
},
"url": {
"type": "text"
},
"#message": {
"type": "text"
},
"spaces": {
"type": "text"
},
"xss": {
"type": "text"
},
"links": {
"type": "text"
},
"relatedContent": {
"url": {
"type": "text"
},
"og:type": {
"type": "text"
},
"og:title": {
"type": "text"
},
"og:description": {
"type": ""
},
"og:url": {
"type": ""
},
"article:published_time": {
"type": "date"
},
"article:modified_time": {
"type": "date"
},
"article:section": {
"type": "keyword"
},
"article:tag": {
"type": "text"
},
"og:image": {
"type": "text"
},
"og:image:height": {
"type": "integer"
},
"og:image:width": {
"type": "integer"
},
"og:site_name": {
"type": "text"
},
"twitter:title": {
"type": "text"
},
"twitter:description": {
"type": "text"
},
"twitter:card": {
"type": "keyword"
},
"twitter:image": {
"type": "text"
},
"twitter:site": {
"type": "keyword"
}
},
"machine": {
"os": {
"type": "text"
},
"ram": {
"type": "integer"
}
},
"#version": {
"type": "integer"
}
}
}
}
}
and here's the error:
{
"error": {
"root_cause": [
{
"type": "mapper_parsing_exception",
"reason": "No type specified for field [index]"
}
],
"type": "mapper_parsing_exception",
"reason": "Failed to parse mapping [doc]: No type specified for field [index]",
"caused_by": {
"type": "mapper_parsing_exception",
"reason": "No type specified for field [index]"
}
},
"status": 400
}
I've already search on internet to find some solutions but I didn't found anything that can help me.
You're missing the properties keyword for all your object fields. Use this mapping instead
PUT logstash-2019.05.09
{
"mappings": {
"doc": {
"properties": {
"#timestamp": {
"type": "date"
},
"ip": {
"type": "ip"
},
"extension": {
"type": "text"
},
"response": {
"type": "text"
},
"geo": {
"properties": {
"coordinates": {
"type": "geo_point"
},
"src": {
"type": "text"
},
"dest": {
"type": "text"
},
"srcdest": {
"type": "text"
}
}
},
"tags": {
"type": "text"
},
"utc_time": {
"type": "date"
},
"referer": {
"type": "text"
},
"agent": {
"type": "text"
},
"clientip": {
"type": "ip"
},
"bytes": {
"type": "integer"
},
"host": {
"type": "text"
},
"request": {
"type": "text"
},
"url": {
"type": "text"
},
"#message": {
"type": "text"
},
"spaces": {
"type": "text"
},
"xss": {
"type": "text"
},
"links": {
"type": "text"
},
"relatedContent": {
"properties": {
"url": {
"type": "text"
},
"og:type": {
"type": "text"
},
"og:title": {
"type": "text"
},
"og:description": {
"type": ""
},
"og:url": {
"type": ""
},
"article:published_time": {
"type": "date"
},
"article:modified_time": {
"type": "date"
},
"article:section": {
"type": "keyword"
},
"article:tag": {
"type": "text"
},
"og:image": {
"type": "text"
},
"og:image:height": {
"type": "integer"
},
"og:image:width": {
"type": "integer"
},
"og:site_name": {
"type": "text"
},
"twitter:title": {
"type": "text"
},
"twitter:description": {
"type": "text"
},
"twitter:card": {
"type": "keyword"
},
"twitter:image": {
"type": "text"
},
"twitter:site": {
"type": "keyword"
}
}
},
"machine": {
"properties": {
"os": {
"type": "text"
},
"ram": {
"type": "integer"
}
}
},
"#version": {
"type": "integer"
}
}
}
}
}
Having read all the tutorials on ES website, I cannot achieve my goal.
We are using ES 1.7.6 and I want to have only one single instance of documents matching my criteria. But what I get from ES is all the data matching the filter and the aggregation statistics.
GET _search
{
"size":1000,
"query":{
"bool":{
"should":[
{
"match":{
"isoMessage.fields.39":{
"query":"00"
}
}
}
]
}
},
"aggs":{
"group_by_CATI":{
"terms":{
"field":"isoMessage.fields.41"
}
}
}
}
Note that the index of isoMessage.fields.41 is set to not_analyzed.
Thanks for any help;
UPDATE: The mapping
{
"cm": {
"mappings": {
"Event": {
"properties": {
"deleted": {
"type": "boolean"
},
"id": {
"type": "string"
},
"isoMessage": {
"properties": {
"fields": {
"properties": {
"2": {
"type": "string"
},
"3": {
"type": "string"
},
"4": {
"type": "string"
},
"7": {
"type": "string"
},
"11": {
"type": "string"
},
"12": {
"type": "string"
},
"13": {
"type": "string"
},
"14": {
"type": "string"
},
"22": {
"type": "string"
},
"25": {
"type": "string"
},
"32": {
"type": "string"
},
"37": {
"type": "string"
},
"39": {
"type": "string"
},
"41": {
"type": "string"
},
"42": {
"type": "string"
},
"48": {
"type": "string"
},
"49": {
"type": "string"
},
"60": {
"type": "string"
},
"63": {
"type": "string"
},
"100": {
"type": "string"
},
"128": {
"type": "string"
}
}
},
"isReversal": {
"type": "boolean"
},
"isReversalDone": {
"type": "boolean"
},
"messageSpec": {
"type": "string"
},
"mti": {
"type": "string"
},
"request": {
"type": "boolean"
},
"response": {
"type": "boolean"
}
}
},
"msg": {
"type": "string"
},
"occurDate": {
"type": "long"
},
"receiver": {
"type": "string"
},
"rrn": {
"type": "string"
},
"sender": {
"type": "string",
"index": "not_analyzed"
},
"txId": {
"type": "string"
},
"version": {
"type": "long"
}
}
}
}
}
}
If I understand correctly, and you want to get only the aggregation result:
Change your size field from:
"size":1000
to
"size":0
in order to set the _search query displayed results limit. It won't affect the aggregation result, though.
I need to query elasticsearch & filter the result to be in a range of dates.
the thing is the date property is mapped as a string.
is it possible to do so ?
this is the search query i'm using:
{
"size": 1,
"from": 0,
"query": {
"bool": {
"must": [
{ "match": { "status": "active" }},
{ "match": { "last_action_state": "accepted" }}
],
"filter": [
{"missing" : { "field" : "store_id" }},
{ "range": { "list_time": { "gte": "2017/01/01 00:00:00", "lte": "2017/03/01 23:59:59", "format": "yyyy/MM/dd HH:mm:ss"}}}
]
}
}
}
the thing is i have no control over the mapping since it's created automatically by another program which index the documents, and i can't change the mapping once it's created.
ps: elasticsearch version: 2.3
UPDATE:
index info:
{
"avindex_v3": {
"aliases": {
"avindex": {}
},
"mappings": {
"ads": {
"properties": {
"account_id": {
"type": "long"
},
"ad_id": {
"type": "long"
},
"ad_params": {
"type": "string"
},
"body": {
"type": "string"
},
"category": {
"type": "long"
},
"city": {
"type": "long"
},
"company_ad": {
"type": "boolean"
},
"email": {
"type": "string"
},
"images": {
"type": "string"
},
"lang": {
"type": "string"
},
"last_action_state": {
"type": "string"
},
"list_date": {
"type": "long"
},
"list_id": {
"type": "long"
},
"list_time": {
"type": "string"
},
"modified_at": {
"type": "string"
},
"modified_ts": {
"type": "double"
},
"name": {
"type": "string"
},
"orig_date": {
"type": "long"
},
"orig_list_time": {
"type": "string"
},
"phone": {
"type": "string"
},
"phone_hidden": {
"type": "boolean"
},
"price": {
"type": "long"
},
"region": {
"type": "long"
},
"status": {
"type": "string"
},
"store_id": {
"type": "long"
},
"subject": {
"type": "string"
},
"type": {
"type": "string"
},
"user_id": {
"type": "long"
}
}
}
},
"settings": {
"index": {
"creation_date": "1493216710928",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "WEHGLF8iRyGk3Xgbmo7H8Q",
"version": {
"created": "2040499"
}
}
},
"warmers": {}
}
}
You can try to give it as a keyword like this :
{
"range": {
"list_time.keyword": {
"gte": "2020-08-12 22:24:55.56",
"lte": "2020-08-12 22:24:56.56"
}
}
}
I have this mapping on ES 1.7.3:
{
"customer": {
"aliases": {},
"mappings": {
"customer": {
"properties": {
"addresses": {
"type": "nested",
"include_in_parent": true,
"properties": {
"address1": {
"type": "string"
},
"address2": {
"type": "string"
},
"address3": {
"type": "string"
},
"country": {
"type": "string"
},
"latitude": {
"type": "double",
"index": "not_analyzed"
},
"longitude": {
"type": "double",
"index": "not_analyzed"
},
"postcode": {
"type": "string"
},
"state": {
"type": "string"
},
"town": {
"type": "string"
},
"unit": {
"type": "string"
}
}
},
"companyNumber": {
"type": "string"
},
"id": {
"type": "string",
"index": "not_analyzed"
},
"name": {
"type": "string"
},
"status": {
"type": "string"
},
"timeCreated": {
"type": "date",
"format": "dateOptionalTime"
},
"timeUpdated": {
"type": "date",
"format": "dateOptionalTime"
}
}
}
},
"settings": {
"index": {
"refresh_interval": "1s",
"number_of_shards": "5",
"creation_date": "1472372294516",
"store": {
"type": "fs"
},
"uuid": "RxJdXvPWSXGpKz8pdcF91Q",
"version": {
"created": "1050299"
},
"number_of_replicas": "1"
}
},
"warmers": {}
}
}
The spring application generates this query:
{
"query": {
"bool": {
"should": {
"query_string": {
"query": "(addresses.\\*:sample* AND NOT status:ARCHIVED)",
"fields": [
"type",
"name",
"companyNumber",
"status",
"addresses.unit",
"addresses.address1",
"addresses.address2",
"addresses.address3",
"addresses.town",
"addresses.state",
"addresses.postcode",
"addresses.country"
],
"default_operator": "or",
"analyze_wildcard": true
}
}
}
}
}
on which "addresses.*:sample*" is the only input.
"query": "(sample* AND NOT status:ARCHIVED)"
Code above works but searches all fields of the customer object.
Since I want to search only on address fields I used the "addresses.*"
Query works only if the fields of the address object are of String type and before I added longitude and latitude fields of double type on address object. Now the error occurs because of these two new fields.
Error:
Parse Failure [Failed to parse source [{
"query": {
"bool": {
"should": {
"query_string": {
"query": "(addresses.\\*:sample* AND NOT status:ARCHIVED)",
"fields": [
"type",
"name",
"companyNumber","country",
"state",
"status",
"addresses.unit",
"addresses.address1",
"addresses.address2",
"addresses.address3",
"addresses.town",
"addresses.state",
"addresses.postcode",
"addresses.country",
],
"default_operator": "or",
"analyze_wildcard": true
}
}
}
}
}
]]
NumberFormatException[For input string: "sample"
Is there a way to search "String" fields within a nested object using addresses.* only?
The solution was to add "lenient": true. As per the documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html
lenient - If set to true will cause format based failures (like providing text to a numeric field) to be ignored.