bucket Terms aggregation Elasticsearch - elasticsearch

Elasticsearch version:
{
"name" : "abc-Inspiron-5521",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "2vLvphpURJOtfAZSGDDX5w",
"version" : {
"number" : "7.10.2",
"build_flavor" : "default",
"build_type" : "deb",
"build_hash" : "747e1cc71def077253878a59143c1f785afa92b9",
"build_date" : "2021-01-13T00:42:12.435326Z",
"build_snapshot" : false,
"lucene_version" : "8.7.0",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
Document mapping
"user_data" : {
"aliases" : { },
"mappings" : {
"properties" : {
"experience" : {
"properties" : {
"brand" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"brand_segment" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"company" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"duration" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"property_type" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"real_estate_type" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
The document structure is correct; if the brackets below are mismatched, please fix them accordingly.
Sample document:
{
"_index" : "user_data",
"_type" : "_doc",
"_id" : "dONuEXgBU9vYaZRqY8Jo",
"_score" : 1.0,
"_source" : {
"experience" : [
{
"brand" : "Hilton",
"company" : "Hilton LLC",
"brand_segment" : "Luxury",
"property_type" : "All-Inclusive",
"duration" : "2 years",
"real_estate_type" : "Institutional"
},
{
"brand" : "Mantis",
"company" : "Accor LLC",
"brand_segment" : "Upper-Upscale",
"property_type" : "Condo",
"duration" : "2 years",
"real_estate_type" : "Family Office"
},
{
"brand" : "Marriott",
"company" : "Marriott LLC",
"brand_segment" : "Independent",
"property_type" : "Convention",
"duration" : "2 years",
"real_estate_type" : "Family Office"
}
]
}
}
My terms aggregation query on brand_segment:
GET user_data/_search
{
"aggs": {
"experience": {
"terms": { "field": "experience.brand_segment" }
}
}
}
I have two problems with this terms aggregation:
First, the value 'Upper-Upscale' is supposed to be treated as a single unit and counted accordingly, but it is currently split into 'upper' and 'upscale' (see the result below).
Second, I want to count the number of times a brand_segment value such as 'Luxury' occurs across all documents. The query above instead returns the number of documents in which 'Luxury' occurs, so multiple occurrences within one document are counted only once.
Wrong result:
"aggregations" : {
"experience" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "independent",
"doc_count" : 15
},
{
"key" : "luxury",
"doc_count" : 15
},
{
"key" : "upper",
"doc_count" : 14
},
{
"key" : "upscale",
"doc_count" : 14
}
]
}
}
The desired output should have Upper-Upscale as one value. The counts above come from multiple sample documents.
Feel free to use this as a sample document when creating the index:
{
"id": 1,
"name": "abcs",
"source": "csv_status",
"profile_complition": "70%",
"creation_date": "2020-04-02",
"current_position": [
{
"position": "Financial Reporting",
"position_category": "Finance",
"position_level": 2
}
],
"seeking_position": [
{
"position": "Financial Planning and Analysis",
"position_category": "Finance",
"position_level": 3
}
],
"last_updation_date": "2021-02-02",
"experience": [
{
"brand": "Hilton",
"company": "Hilton LLC",
"brand_segment": "Luxury",
"property_type": "All-Inclusive",
"duration": "2 years",
"real_estate_type": "Institutional"
},
{
"brand": "Accor",
"company": "Accor LLC",
"brand_segment": "Luxury",
"property_type": "Condo",
"duration": "2 years",
"real_estate_type": "Family Office"
},
{
"brand": "Marriott",
"company": "Marriott LLC",
"brand_segment": "Independent",
"property_type": "Convention",
"duration": "2 years",
"real_estate_type": "Family Office"
}
]
}
Other values that occur in brand_segment = ['Economy', 'Upscale', 'Midscale', 'Upper-Upscale', 'Luxury', 'Independent', 'Extended Stay']
PS: every brand_segment value should be treated as a single entity: 'Upper-Upscale' should not become 'Upper' and 'Upscale', and the same applies to 'Extended Stay'.
Let me know if further clarification is required.

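For the first issue, you need to run your aggregation on the keyword subfield. The brand_segment field itself is of type text, so its values are analyzed into separate lowercase tokens, while brand_segment.keyword keeps each value as a single term: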
GET user_data/_search
{
"aggs": {
"experience": {
"terms": { "field": "experience.brand_segment.keyword" }
}
}
}
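To see why the two fields behave differently, here is a minimal Python sketch. It is only a rough approximation of the default analysis of a text field (split on non-alphanumeric characters, then lowercase), not Elasticsearch's actual analyzer, and the function names are made up for illustration.

```python
import re


def approx_text_tokens(value):
    """Rough stand-in for default text analysis: split on non-alphanumerics, lowercase."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", value)]


def keyword_terms(value):
    """A keyword field indexes the whole value as one unchanged term."""
    return [value]


# Values taken from the question's brand_segment list.
for segment in ["Luxury", "Upper-Upscale", "Extended Stay"]:
    print(segment, "-> text:", approx_text_tokens(segment),
          "| keyword:", keyword_terms(segment))
```

The terms aggregation buckets on indexed terms, so the text field yields separate "upper" and "upscale" buckets, while the keyword subfield yields a single "Upper-Upscale" bucket.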
To solve the second issue, you need to make your experience field nested, which means your mapping needs to look as follows:
"user_data" : {
"aliases" : { },
"mappings" : {
"properties" : {
"experience" : {
"type": "nested", <--- add this
"properties" : {
"brand" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
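Once experience is mapped as nested (which means creating a new index and reindexing, since an existing field cannot be switched to nested in place), the terms aggregation would typically be wrapped in a nested aggregation. The sketch below builds such a request body as a Python dict and then imitates, in plain Python, the difference between counting documents and counting array entries. The inner aggregation name brand_segment and the helper functions are illustrative assumptions, not part of the original answer.

```python
import json
from collections import Counter

# Body for GET user_data/_search, assuming `experience` is mapped as nested.
nested_terms_query = {
    "size": 0,
    "aggs": {
        "experience": {
            "nested": {"path": "experience"},
            "aggs": {
                "brand_segment": {  # aggregation name chosen for this example
                    "terms": {"field": "experience.brand_segment.keyword"}
                }
            },
        }
    },
}

# The second sample document from the question, reduced to the relevant field.
docs = [
    {"experience": [
        {"brand_segment": "Luxury"},
        {"brand_segment": "Luxury"},
        {"brand_segment": "Independent"},
    ]}
]


def count_per_document(documents):
    """Object mapping: a value counts once per document, however often it repeats."""
    counts = Counter()
    for doc in documents:
        counts.update({entry["brand_segment"] for entry in doc["experience"]})
    return counts


def count_per_entry(documents):
    """Nested mapping: every array entry is counted on its own."""
    counts = Counter()
    for doc in documents:
        counts.update(entry["brand_segment"] for entry in doc["experience"])
    return counts


print(json.dumps(nested_terms_query, indent=2))
print("per document:", dict(count_per_document(docs)))
print("per entry:", dict(count_per_entry(docs)))
```

With the nested aggregation, each bucket's doc_count refers to nested experience entries rather than to top-level documents, which is the count the question asks for.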

Related

ElasticSearch - how to get aggregation of aggregation

I am very new to Elasticsearch.
I work for a dating website that has the following data:
Single - with fields: name, signUpDate, state, and other data fields.
Encounter - with fields: state, encounterDate, singlesInvolved, and other data fields.
These are my two indexes.
Now I have to write a query that returns the following:
For every state: how many singles, how many encounters, the longest time a single has been part of our website, and the average time a single has been part of our website.
It should also return one result holding the same averages across all states.
Like this example:
[
{ //this one is the average of all states
"singles": 45,
"dates": 18,
"minWaitingTime": 1644677979530,
"avgWaitingTime": 15603
},
{ //these are the averages of each state
"state": "MA",
"singles": 50,
"dates": 23,
"minWaitingTime": 1644677979530,
"avgWaitingTime": 15603
},
{
"state": "NY",
"singles": 39,
"dates": 13,
"minWaitingTime": 1644850558872,
"avgWaitingTime": 6033
}
]
I've been working on the query for each state individually, but I don't know how to get an average across all states.
So far, this is what I have:
GET /single,encounter/_search
{
"size": 0,
"aggs": {
"bystate": {
"terms": {
"field": "state",
"size": 59
},
"aggs": {
"group-by-index": {
"terms": {
"field": "_index"
}
},
"min_date": {
"min": {
"field": "signedUpAt"
}
},
"avg_date": {
"avg": {
"field": "signedUpAt"
}
}
}
}
}
}
I don't know if there is a better way to do this, and I don't know how to calculate the averages (singles, encounters, min_date and avg_date) across all states from this result.
Each bucket in the result of the previous query looks like this:
{
"key" : "MA",
"doc_count" : 164,
"avg_date" : {
"value" : 1.6457900076508965E12,
"value_as_string" : "2022-02-25T11:53:27.650"
},
"min_date" : {
"value" : 1.64467797953E12,
"value_as_string" : "2022-02-12T14:59:39.530"
},
"group-by-index" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "single",
"doc_count" : 135
},
{
"key" : "encounter",
"doc_count" : 29
}
]
}
},
I would really appreciate help on this one.
Addition: index mappings.
Encounter:
{
"encounter" : {
"aliases" : { },
"mappings" : {
"properties" : {
"_class" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"avgAge" : {
"type" : "integer",
"index" : false,
"doc_values" : false
},
"application" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"createdAt" : {
"type" : "date",
"format" : "date_hour_minute_second_millis"
},
"encounterId" : {
"type" : "keyword"
},
"locationType" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"singleOneId" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"singleTwoId" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"serviceLine" : {
"type" : "keyword"
},
"state" : {
"type" : "keyword"
},
"rating" : {
"type" : "keyword"
}
}
},
"settings" : {
"index" : {
"refresh_interval" : "1s",
"number_of_shards" : "1",
"provided_name" : "encounter",
"creation_date" : "1643704661932",
"number_of_replicas" : "1",
"uuid" : "MliXQL_bRBKDN7_d8G_BYw",
"version" : {
"created" : "7100299"
}
}
}
}
}
And Single:
{
"single" : {
"aliases" : { },
"mappings" : {
"properties" : {
"_class" : {
"type" : "keyword",
"index" : false,
"doc_values" : false
},
"id" : {
"type" : "keyword"
},
"singleId" : {
"type" : "keyword"
},
"state" : {
"type" : "keyword"
},
"preferedGender" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"settings" : {
"index" : {
"refresh_interval" : "1s",
"number_of_shards" : "1",
"provided_name" : "single",
"creation_date" : "1643704662136",
"number_of_replicas" : "1",
"uuid" : "Js_tqZfRRx-IxbjVRRN4wQ",
"version" : {
"created" : "7100299"
}
}
}
}
}
You can use the avg_bucket pipeline aggregation: you give it a buckets_path that points at a metric inside a multi-bucket aggregation, and it calculates the average of that metric across all buckets.
Below is a sample query:
{
"size": 0,
"aggs": {
"bystate": {
"terms": {
"field": "state",
"size": 59
},
"aggs": {
"group-by-index": {
"terms": {
"field": "_index"
}
},
"min_date": {
"min": {
"field": "signedUpAt"
}
},
"avg_date": {
"avg": {
"field": "signedUpAt"
}
}
}
},
"avg_all_state": {
"avg_bucket": {
"buckets_path": "bystate>avg_date"
}
}
}
}
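As a rough illustration of what avg_bucket computes for "bystate>avg_date", the Python sketch below averages one metric across buckets and skips buckets without a value, which mirrors the default gap_policy of skip. The bucket values are borrowed from the question's example output, the empty VT bucket is hypothetical, and the function is a stand-in for the aggregation, not client code.

```python
def avg_bucket(buckets, metric):
    """Unweighted mean of one metric across buckets, ignoring missing values."""
    values = [
        bucket[metric]["value"]
        for bucket in buckets
        if bucket.get(metric, {}).get("value") is not None
    ]
    return sum(values) / len(values) if values else None


state_buckets = [
    {"key": "MA", "avg_date": {"value": 15603}},
    {"key": "NY", "avg_date": {"value": 6033}},
    {"key": "VT", "avg_date": {"value": None}},  # hypothetical state with no data
]

print(avg_bucket(state_buckets, "avg_date"))
```

Note that this is the average of the per-state averages: every state carries the same weight regardless of how many documents it contains, so the result can differ from an average computed over all documents directly.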

Elastic Search and product nomenclature: hyphens and spaces

I'm having a hard time figuring out how to set up Elasticsearch for the typical product model nomenclature. For instance, a product called "Shure SM7B" should appear as a result when searching for SM7B, SM 7B, SM 7, SM-7... and vice versa: searching for SM7B should give results like SM-7, SM7...
For now, I'm getting this kind of results: if I search for "Roland D 50", I get Roland D 50, Roland D-50, Roland D-550, Roland D-20 and so on... but if I search for "Roland D50", I get only "Roland D50" results.
This is my current mapping/settings:
{
"products" : {
"mappings" : {
"Product" : {
"properties" : {
"article_reviews" : {
"type" : "integer"
},
"brand_id" : {
"type" : "integer"
},
"category" : {
"type" : "text"
},
"category_id" : {
"type" : "integer"
},
"date" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"description" : {
"type" : "text"
},
"has_image" : {
"type" : "integer"
},
"id" : {
"type" : "integer"
},
"last_review_date" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
},
"min_price" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text"
},
"name_order" : {
"type" : "keyword"
},
"price_history" : {
"type" : "integer"
},
"rating" : {
"type" : "float"
},
"reviews" : {
"type" : "integer"
},
"shops" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"widget" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
Also, I'd need to autocomplete my searches, so for instance, "Shure SM" should show results like Shure SM-7, Shure SM7, Shure SM58, Shure SM 57, etc... narrowing them down as I type.
Any clues? Thank you!

How to get an exact match using the query DSL

What role does the mapping play in search?
GET courses/_search
The response is below:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0226655,
"hits" : [
{
"_index" : "courses",
"_type" : "classroom",
"_id" : "7",
"_score" : 1.0226655,
"_source" : {
"name" : "Computer Internals 250",
"room" : "C8",
"professor" : {
"name" : "Gregg Va",
"department" : "engineering",
"facutly_type" : "part-time",
"email" : "payneg#onuni.com"
},
"students_enrolled" : 33,
"course_publish_date" : "2012-08-20",
"course_description" : "cpt Int 250 gives students an integrated and rigorous picture of applied computer science, as it comes to play in the construction of a simple yet powerful computer system. "
}
},
{
"_index" : "courses",
"_type" : "classroom",
"_id" : "4",
"_score" : 0.2876821,
"_source" : {
"name" : "Computer Science 101",
"room" : "C12",
"professor" : {
"name" : "Gregg Payne",
"department" : "engineering",
"facutly_type" : "full-time",
"email" : "payneg#onuni.com"
},
"students_enrolled" : 33,
"course_publish_date" : "2013-08-27",
"course_description" : "CS 101 is a first year computer science introduction teaching fundamental data structures and algorithms using python. "
}
}
]
}
}
The mapping is below:
{
"courses" : {
"mappings" : {
"properties" : {
"course_description" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"course_publish_date" : {
"type" : "date"
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"professor" : {
"properties" : {
"department" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"email" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"facutly_type" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"name" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"room" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"students_enrolled" : {
"type" : "long"
}
}
}
}
}
I need to return only the exact match for the phrase professor.name = Gregg Payne.
I tried the query below, following https://www.elastic.co/guide/en/elasticsearch/guide/current/_finding_exact_values.html
GET courses/_search
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"professor.name" : "Gregg Payne"
}
}
}
}
}
Based on your mapping, this query should work for you:
POST http://localhost:9200/courses/_search
{
"query" : {
"constant_score" : {
"filter" : {
"term" : {
"professor.name.keyword" : "Gregg Payne"
}
}
}
}
}
To answer your question from the comments: search always depends on the mapping :) Here you are using a term query, which looks for exact values and therefore needs a keyword field. Text fields get analyzed, and as the documentation puts it:
"Avoid using the term query for text fields.
By default, Elasticsearch changes the values of text fields as part of
analysis. This can make finding exact matches for text field values
difficult.
To search text field values, use the match query instead."
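The difference can be imitated with a small Python sketch. It assumes a simplified analysis step (split on non-alphanumerics, lowercase) and treats each field as a plain set of indexed terms; the helper names are invented for this example and the match helper only models the default OR behaviour.

```python
import re


def analyze(text):
    """Simplified stand-in for the default analysis of a text field."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


def term_query(indexed_terms, value):
    """term: the query value is looked up as-is, with no analysis."""
    return value in indexed_terms


def match_query(indexed_terms, value):
    """match: the query value is analyzed; any matching token is a hit (default OR)."""
    return any(tok in indexed_terms for tok in analyze(value))


payne_text = set(analyze("Gregg Payne"))   # professor.name
payne_keyword = {"Gregg Payne"}            # professor.name.keyword
va_text = set(analyze("Gregg Va"))         # professor.name of the other course

print(term_query(payne_text, "Gregg Payne"))     # text field holds "gregg", "payne"
print(term_query(payne_keyword, "Gregg Payne"))  # keyword field holds the full value
print(match_query(va_text, "Gregg Payne"))       # "gregg" alone is enough for match
```

This is why the term query on professor.name finds nothing, the term query on professor.name.keyword finds exactly "Gregg Payne", and a plain match query on professor.name would also return "Gregg Va".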

How to sort an Elasticsearch query result by a determined field in DESC?

Let's say I have the following query:
curl -XGET 'localhost:9200/library/document/_search?pretty=true'
That returns me the following example results:
{
"took" : 108,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 5,
"max_score" : 1.0,
"hits" : [
{
"_index" : "library",
"_type" : "document",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"page content" : [
"Page 0:",
"Page 1: something"
],
"publish date" : "2015-12-05",
"keywords" : "sample, example, article, alzheimer",
"author" : "Author name",
"language" : "",
"title" : "Sample article",
"number of pages" : 2
}
},
{
"_index" : "library",
"_type" : "document",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"page content" : [
"Page 1: eBay",
"Page 2: Paypal",
"Page 3: Google"
],
"publish date" : "2017-08-03",
"keywords" : "something, another, thing",
"author" : "Alex",
"language" : "english",
"title" : "Microsoft Word - TL0032.doc",
"number of pages" : 21
}
},
...
I want to order by publish date and by id (in separate queries) so that the most recent one shows first in the list. Is that possible? I know I have to use Elasticsearch's sort option together with the desc order, but somehow it is not working for me.
EDIT: Mapping of the fields:
curl -XGET 'localhost:9200/library/_mapping/document?pretty'
{
"library" : {
"mappings" : {
"document" : {
"properties" : {
"author" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"keywords" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"language" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"number of pages" : {
"type" : "long"
},
"page content" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"publish date" : {
"type" : "date"
},
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
First, you need a mapping in which the field you sort on is a date, like this:
PUT my_index
{
"mappings": {
"documents": {
"properties": {
"post_date" : {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
}
}
}
}
}
And then the search:
GET my_index/_search
{
"sort": [
{
"post_date": {
"order": "desc"
}
}
]
}
Thank you, everyone. I managed to get it working with this query:
curl -XGET 'localhost:9200/library/document/_search?pretty=true' -d '{"query": {"match_all": {}},"sort": [{"publish date": {"order": "desc"}}]}'
I didn't need any additional mapping.
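For reference, the body of the working request can be built programmatically. The Python sketch below constructs the same JSON and, as a plain-Python illustration, sorts the two sample hits from the question the way a descending sort on "publish date" would order them. The helper name is an assumption made for this example.

```python
import json


def sort_desc_body(field):
    """Build a match_all search body sorted by one field in descending order."""
    return {"query": {"match_all": {}}, "sort": [{field: {"order": "desc"}}]}


body = sort_desc_body("publish date")
print(json.dumps(body))

# The two sample hits from the question, reduced to id and publish date.
hits = [
    {"_id": "5", "publish date": "2015-12-05"},
    {"_id": "2", "publish date": "2017-08-03"},
]

# ISO yyyy-MM-dd strings order chronologically when compared as text.
newest_first = sorted(hits, key=lambda hit: hit["publish date"], reverse=True)
print([hit["_id"] for hit in newest_first])
```

The sort works without extra mapping here because "publish date" is already mapped as a date in the index.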

ElasticSearch Illegal Argument Exception

I'm using the latest version of Elasticsearch on Ubuntu 16.04 and I'm having a small issue putting data into it.
Here is my JSON document (the relevant part of it):
{ "products" : {
"232CDFDW89ENUXRB" : {
"sku" : "232CDFDW89ENUXRB",
"productFamily" : "Compute Instance",
"attributes" : {
"servicecode" : "AmazonEC2",
"location" : "US East (N. Virginia)",
"locationType" : "AWS Region",
"instanceType" : "d2.8xlarge",
"currentGeneration" : "Yes",
"instanceFamily" : "Storage optimized",
"vcpu" : "36",
"physicalProcessor" : "Intel Xeon E5-2676v3 (Haswell)",
"clockSpeed" : "2.4 GHz",
"memory" : "244 GiB",
"storage" : "24 x 2000 HDD",
"networkPerformance" : "10 Gigabit",
"processorArchitecture" : "64-bit",
"tenancy" : "Host",
"operatingSystem" : "Linux",
"licenseModel" : "No License required",
"usagetype" : "HostBoxUsage:d2.8xlarge",
"operation" : "RunInstances",
"enhancedNetworkingSupported" : "Yes",
"preInstalledSw" : "NA",
"processorFeatures" : "Intel AVX; Intel AVX2; Intel Turbo" }
}
}
}
And here's the response from ES when I try "PUT http://localhost:9200/aws":
{ "error": {
"root_cause": [
{
"type": "illegal_argument_exception",
"reason": "unknown setting [index.products.232CDFDW89ENUXRB.attributes.clockSpeed] please check that any required plugins are installed, or check the breaking changes documentation for removed settings"
}
],
"type": "illegal_argument_exception",
"reason": "unknown setting [index.products.232CDFDW89ENUXRB.attributes.clockSpeed] please check that any required plugins are installed, or check the breaking changes documentation for removed settings" }, "status": 400 }
It seems to me that ES thinks "clockSpeed" is some sort of setting...?
I was hoping to use dynamic mapping to speed up the process, instead of first mapping the whole document and then importing it into ES.
Any suggestions?
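The issue is that the document type and document id are missing when you index the document with PUT http://localhost:9200/aws. A PUT to the bare index name is the create-index request, so the body is read as index settings, hence the "unknown setting" error.
The proper way to index a document is: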
POST my-index/my-type/my-id-1
{
"name": "kibana"
}
That is, you have to provide the document type (here my-type) and the document id (here my-id-1). Note that the document id is optional with POST: if you don't provide one, Elasticsearch generates an alphanumeric id for you.
A couple of other ways to index a doc:
POST my-index/my-type
{
"name": "kibana"
}
//if you want to index a document through PUT, you must provide the document id
PUT my-index/my-type/my-id-1
{
"name": "kibana"
}
Note: If automatic index creation is disabled, you have to create the index before indexing documents.
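The rule of thumb behind these examples can be written down as a tiny Python helper. This is a hypothetical illustration of the typed endpoints used in this answer (index/type/id), not part of any Elasticsearch client, and it deliberately ignores every other API.

```python
def classify_request(method, path):
    """Guess what a typed, pre-7.x style request does from its method and path depth."""
    parts = [part for part in path.strip("/").split("/") if part]
    if method == "PUT" and len(parts) == 1:
        # Only the index name: the body is treated as settings/mappings.
        return "create index"
    if method == "POST" and len(parts) == 2:
        # index/type: Elasticsearch generates the document id.
        return "index document (auto-generated id)"
    if method in ("PUT", "POST") and len(parts) == 3:
        # index/type/id: the document is stored under the given id.
        return "index document (explicit id)"
    return "other"


print(classify_request("PUT", "/aws"))
print(classify_request("POST", "/my-index/my-type"))
print(classify_request("PUT", "/my-index/my-type/my-id-1"))
```

The first call corresponds to the request from the question, which explains why the product attributes were rejected as unknown settings.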
Given a clean mapping, a POST (curl -XPOST) works perfectly for me on Elasticsearch 5.1.1:
$ curl -XPOST localhost:9200/productsapp/productdocs -d '
{ "products" : {
"sku1" : {
"sku" : "SKU-Name",
"productFamily" : "Compute Instance",
"attributes" : {
"servicecode" : "AmazonEC2",
"location" : "US East (N. Virginia)",
"locationType" : "AWS Region",
"instanceType" : "d2.8xlarge",
"currentGeneration" : "Yes",
"instanceFamily" : "Storage optimized",
"vcpu" : "36",
"physicalProcessor" : "Intel Xeon E5-2676v3 (Haswell)",
"clockSpeed" : "2.4 GHz",
"memory" : "244 GiB",
"storage" : "24 x 2000 HDD",
"networkPerformance" : "10 Gigabit",
"processorArchitecture" : "64-bit",
"tenancy" : "Host",
"operatingSystem" : "Linux",
"licenseModel" : "No License required",
"usagetype" : "HostBoxUsage:d2.8xlarge",
"operation" : "RunInstances",
"enhancedNetworkingSupported" : "Yes",
"preInstalledSw" : "NA",
"processorFeatures" : "Intel AVX; Intel AVX2; Intel Turbo" }
}
}
}'
{"_index":"productsapp","_type":"productdocs","_id":"AVuhXdYYUiSguAb0FsSX","_version":1,"result":"created","_shards":{"total":2,"successful":1,"failed":0},"created":true}
Get the inserted doc:
curl -XGET localhost:9200/productsapp/productdocs/_search
{"took":11,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"productsapp","_type":"productdocs","_id":"AVuhXdYYUiSguAb0FsSX","_score":1.0,"_source":{ "products" : {
"sku1" : {
"sku" : "SKU-Name",
"productFamily" : "Compute Instance",
"attributes" : {
"servicecode" : "AmazonEC2",
"location" : "US East (N. Virginia)",
"locationType" : "AWS Region",
"instanceType" : "d2.8xlarge",
"currentGeneration" : "Yes",
"instanceFamily" : "Storage optimized",
"vcpu" : "36",
"physicalProcessor" : "Intel Xeon E5-2676v3 (Haswell)",
"clockSpeed" : "2.4 GHz",
"memory" : "244 GiB",
"storage" : "24 x 2000 HDD",
"networkPerformance" : "10 Gigabit",
"processorArchitecture" : "64-bit",
"tenancy" : "Host",
"operatingSystem" : "Linux",
"licenseModel" : "No License required",
"usagetype" : "HostBoxUsage:d2.8xlarge",
"operation" : "RunInstances",
"enhancedNetworkingSupported" : "Yes",
"preInstalledSw" : "NA",
"processorFeatures" : "Intel AVX; Intel AVX2; Intel Turbo" }
}
}
}}]}}
The mapping it creates is as below, with clockSpeed as text type.
curl -XGET localhost:9200/productsapp/productdocs/_mapping?pretty=true
{
"productsapp" : {
"mappings" : {
"productdocs" : {
"properties" : {
"products" : {
"properties" : {
"232CDFDW89ENUXRB" : {
"properties" : {
"attributes" : {
"properties" : {
"clockSpeed" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"currentGeneration" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"enhancedNetworkingSupported" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"instanceFamily" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"instanceType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"licenseModel" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"location" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"locationType" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"memory" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"networkPerformance" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"operatingSystem" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"operation" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"physicalProcessor" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"preInstalledSw" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"processorArchitecture" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"processorFeatures" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"servicecode" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"storage" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"tenancy" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"usagetype" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"vcpu" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
},
"productFamily" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
},
"sku" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
}
}
}
}
}
Can you check your mapping for attributes.clockSpeed and make sure it's not broken?
If you want to update the document, do a PUT (curl -XPUT) on the id of the first document (which is AVuhXdYYUiSguAb0FsSX).
In the following example, I am updating the sku field to "sku name updated":
curl -XPUT localhost:9200/productsapp/productdocs/AVuhXdYYUiSguAb0FsSX -d '
{
"products" : {
"sku1" : {
"sku" : "sku name updated",
"productFamily" : "Compute Instance",
"attributes" : {
"servicecode" : "AmazonEC2",
"location" : "US East (N. Virginia)",
"locationType" : "AWS Region",
"instanceType" : "d2.8xlarge",
"currentGeneration" : "Yes",
"instanceFamily" : "Storage optimized",
"vcpu" : "36",
"physicalProcessor" : "Intel Xeon E5-2676v3 (Haswell)",
"clockSpeed" : "2.4 GHz",
"memory" : "244 GiB",
"storage" : "24 x 2000 HDD",
"networkPerformance" : "10 Gigabit",
"processorArchitecture" : "64-bit",
"tenancy" : "Host",
"operatingSystem" : "Linux",
"licenseModel" : "No License required",
"usagetype" : "HostBoxUsage:d2.8xlarge",
"operation" : "RunInstances",
"enhancedNetworkingSupported" : "Yes",
"preInstalledSw" : "NA",
"processorFeatures" : "Intel AVX; Intel AVX2; Intel Turbo"
}
}
}}'
{"_index":"productsapp","_type":"productdocs","_id":"AVu5OLfHPw6Pv_3O38-V","_version":2,"result":"updated","_shards":{"total":2,"successful":1,"failed":0},"created":false}

Resources