Nested query not working on Elasticsearch 1.7 if mapping with same name exists - elasticsearch

I just downgraded my local ES from 2.1.8 to 1.7.5 to match AWS Elasticsearch and now my nested queries aren't working. I have to admit I'm baffled and couldn't find anything helpful online.
I've abbreviated the following for clarity and changed some of the names but otherwise these are real outputs from my local ES. The final nested result correctly returned file documents with the matching package on 2.1 but nothing on 1.7.
Update: I actually have another nested field that is not exhibiting this problem. The difference is the value for that is a single nested object instead of an array. Known issue?
Update #2: Changing the value to a single value made no difference. However, changing the nested property name from package to packages made the problem go away. The only thing I can think of is that I also have a mapping (type) called package. Would that cause a problem?
Mapping
"file": {
"dynamic": "strict",
"_all": {
"enabled": false
},
"properties": {
"name": {
"type": "string"
},
"type": {
"type": "string",
"index": "not_analyzed"
},
"package": {
"type": "nested",
"dynamic": "strict",
"properties": {
"name": {
"type": "string",
"index": "not_analyzed"
},
"path": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
Document
Search
{ "query": {"term": {"type": "file"}} }
Result
{
"_index": "blah",
"_type": "file",
"_id": "slkdfjsdfjsoijfoisjfisdjf",
"_score": 7.8872123,
"_source": {
"name": "foo",
"type": "file",
"package": [
{
"name": "the_package",
"path": "the_package!path"
}
]
}
}
Term Vectors
localhost:9200/blah/file/slkdfjsdfjsoijfoisjfisdjf/_termvector?pretty=true&fields=package.name
{
"_index": "blah",
"_type": "file",
"_id": "slkdfjsdfjsoijfoisjfisdjf",
"_version": 1,
"found": true,
"took": 1,
"term_vectors": {
"package.name": {
"field_statistics": {
"sum_doc_freq": 1040,
"doc_count": 1040,
"sum_ttf": 1040
},
"terms": {
"the_package": {
"term_freq": 1,
"tokens": [
{
"position": 0,
"start_offset": 0,
"end_offset": 7
}
]
}
}
}
}
}
Nested Query
{
"query": {
"nested":{
"path": "package",
"query": {
"term": {
"package.name": "the_package"
}
}
}
}
}
Result
{
"took": 8,
"timed_out": false,
"_shards": {
"total": 10,
"successful": 10,
"failed": 0
},
"hits": {
"total": 0,
"max_score": null,
"hits": []
}
}

Following update #2 I tried deleting the package mapping and sure enough the nested query now works as expected. I'll update my mappings to avoid this issue.
Nothing in the ES nested object documentation suggests this should be an issue and it has obviously been fixed between 1.7.5 and 2.1.8 so if anyone knows of such documentation or a link to a fixed bug feel free to add. Posting this as an answer in case anyone else hits this.
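
For anyone who wants to try the workaround, here is a minimal sketch in Python that only builds the nested query body; it does not talk to a cluster. The helper name nested_term_query and the renamed field packages are illustrative (the rename comes from Update #2 in the question). One plausible explanation, which I have not confirmed against a specific bug report, is that 1.x allowed a field path to be prefixed with a type name, so package.name could be read as the name field of the package type; as far as I know that prefix resolution was dropped in 2.0.

```python
import json


def nested_term_query(path, field, value):
    """Build a search body that runs a term query inside the nested `path`."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"term": {f"{path}.{field}": value}},
            }
        }
    }


# Field name from the question; on 1.7 it clashed with the `package` mapping.
original = nested_term_query("package", "name", "the_package")

# Renamed field from Update #2 (assumption: the nested property is now `packages`).
renamed = nested_term_query("packages", "name", "the_package")

if __name__ == "__main__":
    print(json.dumps(renamed, indent=2))
```

The printed body can be posted to the _search endpoint of the index once the mapping uses the renamed property.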

Related

Source to destination Key Field mapping in Elastic Search

I have an Elasticsearch index with source data arriving in the following form:
"_source": {
"email": "smithamber#example.com",
"time": "2022-09-08T13:52:50.347861",
"message": "Pattern thank talk mention. Manage nearly tell beat. Difficult husband feel talk radio however.",
"sIp": "192.168.11.156",
"dIp": "80.254.211.60",
"ts": "2022-09-08T13:52:50"
}
Now I want a way to dynamically map the #timestamp [destination key] field of the ES doc to time [source key]. For this I am using:
"runtime_mappings": {
"#timestamp": {
"type": "date",
"format": "yyyyMMdd'T'HHmmss.SSSZ",
"script": {
"source": "if (doc[\"time\"].size() == 0) {return} else {return doc[\"time\"].value;}",
"lang": "painless"
}
}
}
However, this does not work. Is there a better way to map a source key field to a destination key field in Elasticsearch? I am open to static mapping as well, if it can be set once before creating the index for one kind of source data.
I am looking for correct syntax for mapping my field.
Edited:
When I add the query -
{ "query": {
"range": {
"#timestamp": {
"gte": "now-5d",
"lte": "now"
}
}
}
}
I see no hits.
{
"took": 20,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": []
}
}
However, same query on field time gets me all filtered docs.
{
"took": 27,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 10000,
"relation": "gte"
},
"max_score": 1.0,
"hits": [
{
"_index": "topic-indexer-xxx",
"_id": "c28sIYMB0xJUJru8c47O",
"_score": 1.0,
"_source": {
"email": "albertthompson#example.com",
"time": "2022-09-07T15:25:33.672016",
"message": "Candidate future staff ever former run. Like quality personal specific trouble cell money move. Available majority memory model thing TV wrong. Summer anyone light key.",
"sIp": "192.168.103.75",
"dIp": "191.27.68.163"
}
},
....
}
For mapping I have also tried dynamic templates; but still no results on query for #timestamp field:
{
"dynamic_templates": [
{
"#timestamp": {
"match": "time",
"mapping": {
"type": "date",
"format": "strict_date_optional_time",
"copy_to": "#timestamp"
}
}
}
]
}
With #paulo's response, I did a little fine-tuning to resolve the issue. The mapping below (as set on the index) works, and I can now run range queries on the #timestamp field:
{
"runtime": {
"#timestamp": {
"type": "date",
"script": {
"source": "if (doc['time'].size() != 0){ emit(doc['time'].value.toEpochMilli());}",
"lang": "painless"
}
}
},
"properties": {
"#timestamp": {
"type": "date"
}
}
}
TL;DR
I think you got mixed up in your Painless script: a runtime field script has to call emit() rather than return a value.
Below is an example you should be able to reproduce on your side.
The time field is already mapped as a date on my side; Elasticsearch detected it automatically.
On another note, runtime fields are very flexible but are evaluated at query time, so they may lead to performance issues in the long run.
You may want to look into an ingest pipeline instead.
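
To make the ingest pipeline suggestion a bit more concrete, here is a rough sketch in Python that only builds a pipeline body using the date processor. The function name build_timestamp_pipeline is hypothetical, the target field is written as #timestamp to mirror the question, and I am assuming the ISO8601 format parses your time values; check that with the simulate API before relying on it.

```python
import json


def build_timestamp_pipeline(source_field="time", target_field="#timestamp"):
    """Return an ingest pipeline body that parses `source_field` into `target_field`."""
    return {
        "description": "Parse the source time field into the destination date field",
        "processors": [
            {
                "date": {
                    "field": source_field,
                    "target_field": target_field,
                    "formats": ["ISO8601"],
                    # Documents without a usable `time` value are indexed unchanged.
                    "ignore_failure": True,
                }
            }
        ],
    }


if __name__ == "__main__":
    # The body would be sent with PUT _ingest/pipeline/<name>; the name is up to you.
    print(json.dumps(build_timestamp_pipeline(), indent=2))
```

With a pipeline like this the destination field is written once at index time, so range queries do not need to run a script per document.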
Solution
POST /73684302/_doc
{
"email": "smithamber#example.com",
"time": "2022-09-08T13:52:50.347861",
"message": "Pattern thank talk mention. Manage nearly tell beat. Difficult husband feel talk radio however.",
"sIp": "192.168.11.156",
"dIp": "80.254.211.60",
"ts": "2022-09-08T13:52:50"
}
POST /73684302/_doc
{
"email": "smithamber#example.com",
"message": "Pattern thank talk mention. Manage nearly tell beat. Difficult husband feel talk radio however.",
"sIp": "192.168.11.156",
"dIp": "80.254.211.60",
"ts": "2022-09-08T13:52:50"
}
GET /73684302/_search
{
"runtime_mappings": {
"#timestamp": {
"type": "date",
"script": {
"source": """
if (doc["time"].size() != 0){
emit(doc["time"].value.toEpochMilli());
}
""",
"lang": "painless"
}
}
},
"_source": false,
"fields": ["#timestamp"]
}
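
If it helps to see what the script computes, here is a small Python emulation of the emit logic for the two sample documents above. It is only an illustration, not Painless: emit_timestamp is a made-up helper, and it assumes a time value without an offset is interpreted as UTC.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def emit_timestamp(source):
    """Return the list of values the runtime field would emit for one document."""
    raw = source.get("time")
    if raw is None:
        # Mirrors the size() != 0 check: nothing is emitted for this document.
        return []
    parsed = datetime.fromisoformat(raw).replace(tzinfo=timezone.utc)
    # Integer milliseconds since the epoch, like toEpochMilli().
    return [(parsed - EPOCH) // timedelta(milliseconds=1)]


with_time = {"time": "2022-09-08T13:52:50.347861", "sIp": "192.168.11.156"}
without_time = {"sIp": "192.168.11.156"}

if __name__ == "__main__":
    print(emit_timestamp(with_time))
    print(emit_timestamp(without_time))
```

The second document emits nothing, which is why it gets no #timestamp value and cannot match a range query on that field.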

Some Elastic fields DSL query searchable and some not

I'm using Elasticsearch 6.8.1 with dynamic mapping. I have one document in the index now, and am testing out searching on various fields. I POST to http://localhost:9200/documents/_search and send a DSL query
{
"query":
{"bool":{"must":{"term":{"name": "item2"}}} }
}
and I get the document I expect:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "documents",
"_type": "document",
"_id": "nRMOs5DZg",
"_score": 0.2876821,
"_source": {
"freeform": "DEF",
"name": "item2",
"url": "s3://mybucket/key",
"visible": true
}
}
]
}
}
Now, I want to make sure that I can search on the "freeform" field by changing the query to
{
"query":
{"bool":{"must":{"term":{"freeform": "DEF"}}} }
}
This results in no hits and I can't understand why.
[EDIT]
Here is the dynamic mapping
{
"documents": {
"aliases": {},
"mappings": {
"document": {
"properties": {
"freeform": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"url": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
},
"visible": {
"type": "boolean"
}
}
}
},
"settings": {
"index": {
"creation_date": "1564776393764",
"number_of_shards": "5",
"number_of_replicas": "1",
"uuid": "2er2TF-ySEKgk6gd32K6Ig",
"version": {
"created": "6080199"
},
"provided_name": "documents"
}
}
}
}
It's hard to answer without seeing your mapping, but my guess would be this:
The dynamic mapping tries to guess the data type to assign to your fields; the default for string fields is the "text" data type, which means their value is analyzed and stored as a list of normalized terms, which is useful for free-text search. The string "item2" happens to survive this analysis unchanged, but "DEF" would be analyzed to "def".
Since you're using a term query, the queried term doesn't go through the same analysis process, so you have to query using the analyzed term in order to match the document.
Try searching for "def" instead of "DEF" to test this hypothesis. Also, take a look at the automatically-generated mapping for your index and you'll see which data type each field was mapped to.
If this is indeed the case, you can do one of several things:
If you want exact-string matching: change the mapping from text to keyword (you can control dynamic mapping using Dynamic Templates); or alternatively use the keyword sub-field that is created automatically for you, by searching against freeform.keyword instead of freeform.
If you want "free-text" matching: use a match query instead of a term query so both the input and the document value undergo the same analysis (but make sure you understand how analysis and match queries work).
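
Here is a toy illustration in Python of why the term query misses while a match query hits. The analyze function is a very rough stand-in for the standard analyzer (lowercase, then split on non-word characters), not the real thing, and all function names are made up for this sketch.

```python
import re


def analyze(text):
    """Rough approximation of the standard analyzer: lowercase and split into word tokens."""
    return re.findall(r"\w+", text.lower())


def term_query_matches(indexed_value, query_term):
    """term query: the query term is compared as-is against the indexed terms."""
    return query_term in analyze(indexed_value)


def match_query_matches(indexed_value, query_text):
    """match query: the query text is analyzed first (default OR operator)."""
    indexed_terms = set(analyze(indexed_value))
    return any(token in indexed_terms for token in analyze(query_text))


if __name__ == "__main__":
    print(term_query_matches("item2", "item2"))
    print(term_query_matches("DEF", "DEF"))
    print(term_query_matches("DEF", "def"))
    print(match_query_matches("DEF", "DEF"))
```

The only difference between the two helpers is whether the query side goes through analyze, which is the whole point of the explanation above.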

How to perform an exact match query on an analyzed field in Elasticsearch?

This is probably a very commonly asked question; however, the answers I've found so far aren't satisfactory.
Problem:
I have an es index that is composed of nearly 100 fields. Most of the fields are string type and set as analyzed. However, the query can be both partial (match) or exact (more like term). So, if my index contains a string field with value super duper cool pizza, there can be partial query like duper super and will match with the document, however, there can be exact query like cool pizza which should not match the document. On the other hand, Super Duper COOL PIzza again should match with this document.
So far, the partial match part is easy: I used the AND operator in a match query. However, I can't get the exact type working.
I have looked into other posts related to this problem and this post contains the closest solution:
Elasticsearch exact matches on analyzed fields
Out of the three solutions, the first one feels very complex, as I have a lot of fields and I do not use the REST API; I am creating queries dynamically using QueryBuilders with NativeSearchQueryBuilder from their Java API. It also generates lots of possible patterns, which I think will cause performance issues.
The second one is a much easier solution, but again, I would have to maintain a lot more (almost) redundant data, and I don't think using term queries is ever going to solve my problem.
The last one has a problem I think, it will not prevent super duper to be matched with super duper cool pizza which is not the output I want.
So is there any other way I can achieve the goal? I can post some sample mapping if required to clarify the question further. I am already keeping the source as well (in case that can be used). Please feel free to suggest any improvements as well.
Thanks in advance.
[UPDATE]
Finally, I used multi_field, keeping a raw field for exact queries. When I insert I use some custom modification on data, and during searching, I used the same modification routines on input text. This part is not handled by Elasticsearch. If you want to do that, you have to design appropriate analyzers as well.
Index settings and mapping queries:
PUT test_index
POST test_index/_close
PUT test_index/_settings
{
"index": {
"analysis": {
"analyzer": {
"standard_uppercase": {
"type": "custom",
"char_filter": ["html_strip"],
"tokenizer": "keyword",
"filter": ["uppercase"]
}
}
}
}
}
PUT test_index/doc/_mapping
{
"doc": {
"properties": {
"text_field": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"analyzer": "standard_uppercase"
}
}
}
}
}
}
POST test_index/_open
Inserting some sample data:
POST test_index/doc/_bulk
{"index":{"_id":1}}
{"text_field":"super duper cool pizza"}
{"index":{"_id":2}}
{"text_field":"some other text"}
{"index":{"_id":3}}
{"text_field":"pizza"}
Exact query:
GET test_index/doc/_search
{
"query": {
"bool": {
"must": {
"bool": {
"should": {
"term": {
"text_field.raw": "PIZZA"
}
}
}
}
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.4054651,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "3",
"_score": 1.4054651,
"_source": {
"text_field": "pizza"
}
}
]
}
}
Partial query:
GET test_index/doc/_search
{
"query": {
"bool": {
"must": {
"bool": {
"should": {
"match": {
"text_field": {
"query": "pizza",
"operator": "AND",
"type": "boolean"
}
}
}
}
}
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "3",
"_score": 1,
"_source": {
"text_field": "pizza"
}
},
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 0.5,
"_source": {
"text_field": "super duper cool pizza"
}
}
]
}
}
PS: These are generated queries, that's why there are some redundant blocks, as there would be many other fields concatenated into the queries.
Sad part is, now I need to rewrite the whole mapping again :(
I think this will do what you want (or at least come as close as is possible), using the keyword tokenizer and lowercase token filter:
PUT /test_index
{
"settings": {
"analysis": {
"analyzer": {
"lowercase_analyzer": {
"type": "custom",
"tokenizer": "keyword",
"filter": ["lowercase_token_filter"]
}
},
"filter": {
"lowercase_token_filter": {
"type": "lowercase"
}
}
}
},
"mappings": {
"doc": {
"properties": {
"text_field": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
},
"lowercase": {
"type": "string",
"analyzer": "lowercase_analyzer"
}
}
}
}
}
}
}
I added a couple of docs for testing:
POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"text_field":"super duper cool pizza"}
{"index":{"_id":2}}
{"text_field":"some other text"}
{"index":{"_id":3}}
{"text_field":"pizza"}
Notice we have the outer text_field set to be analyzed by the standard analyzer, then a sub-field raw that's not_analyzed (you may not want this one, I just added it for comparison), and another sub-field lowercase that creates tokens exactly the same as the input text, except that they have been lowercased (but not split on whitespace). So this match query returns what you expected:
POST /test_index/_search
{
"query": {
"match": {
"text_field.lowercase": "Super Duper COOL PIzza"
}
}
}
...
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.30685282,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 0.30685282,
"_source": {
"text_field": "super duper cool pizza"
}
}
]
}
}
Remember that the match query will use the field's analyzer against the search phrase as well, so in this case searching for "super duper cool pizza" would have exactly the same effect as searching for "Super Duper COOL PIzza" (you could still use a term query if you want an exact match).
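
As a rough picture of what the lowercase sub-field does, here is a small Python emulation using the three test documents. It only mimics the keyword tokenizer plus lowercase filter (one token: the whole input, lowercased); the function names are made up for this sketch.

```python
DOCS = {1: "super duper cool pizza", 2: "some other text", 3: "pizza"}


def lowercase_keyword_analyze(text):
    """Keyword tokenizer plus lowercase filter: a single token holding the whole lowercased input."""
    return [text.lower()]


def match_lowercase_field(query_text):
    """Emulate a match query on text_field.lowercase: the query is analyzed the same way."""
    query_tokens = lowercase_keyword_analyze(query_text)
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if lowercase_keyword_analyze(text) == query_tokens
    )


if __name__ == "__main__":
    print(match_lowercase_field("Super Duper COOL PIzza"))
    print(match_lowercase_field("cool pizza"))
```

Because the input is never split on whitespace, a partial phrase such as cool pizza produces a different token from the stored one and therefore finds nothing.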
It's useful to take a look at the terms generated in each field by the three documents, since this is what your search queries will be working against (in this case raw and lowercase have the same tokens, but that's only because all the inputs were lower-case already):
POST /test_index/_search
{
"size": 0,
"aggs": {
"text_field_standard": {
"terms": {
"field": "text_field"
}
},
"text_field_raw": {
"terms": {
"field": "text_field.raw"
}
},
"text_field_lowercase": {
"terms": {
"field": "text_field.lowercase"
}
}
}
}
...
{
"took": 26,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 0,
"hits": []
},
"aggregations": {
"text_field_raw": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "pizza",
"doc_count": 1
},
{
"key": "some other text",
"doc_count": 1
},
{
"key": "super duper cool pizza",
"doc_count": 1
}
]
},
"text_field_lowercase": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "pizza",
"doc_count": 1
},
{
"key": "some other text",
"doc_count": 1
},
{
"key": "super duper cool pizza",
"doc_count": 1
}
]
},
"text_field_standard": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "pizza",
"doc_count": 2
},
{
"key": "cool",
"doc_count": 1
},
{
"key": "duper",
"doc_count": 1
},
{
"key": "other",
"doc_count": 1
},
{
"key": "some",
"doc_count": 1
},
{
"key": "super",
"doc_count": 1
},
{
"key": "text",
"doc_count": 1
}
]
}
}
}
Here's the code I used to test this out:
http://sense.qbox.io/gist/cc7564464cec88dd7f9e6d9d7cfccca2f564fde1
If you also want to do partial word matching, I would encourage you to take a look at ngrams. I wrote up an introduction for Qbox here:
https://qbox.io/blog/an-introduction-to-ngrams-in-elasticsearch

Elasticsearch aggregation turns results to lowercase

I've been playing with ElasticSearch a little and found an issue when doing aggregations.
I have two endpoints, /A and /B. In the first one I have parents for the second one. So, one or many objects in B must belong to one object in A. Therefore, objects in B have an attribute "parentId" with parent index generated by ElasticSearch.
I want to filter parents in A by children attributes of B. In order to do it, I first filter children in B by attributes and get its unique parent ids that I'll later use to get parents.
I send this request:
POST http://localhost:9200/test/B/_search
{
"query": {
"query_string": {
"default_field": "name",
"query": "derp2*"
}
},
"aggregations": {
"ids": {
"terms": {
"field": "parentId"
}
}
}
}
And get this response:
{
"took": 91,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "child",
"_id": "AU_fjH5u40Hx1Kh6rfQG",
"_score": 1,
"_source": {
"parentId": "AU_ffvwM40Hx1Kh6rfQA",
"name": "derp2child2"
}
},
{
"_index": "test",
"_type": "child",
"_id": "AU_fjD_U40Hx1Kh6rfQF",
"_score": 1,
"_source": {
"parentId": "AU_ffvwM40Hx1Kh6rfQA",
"name": "derp2child1"
}
},
{
"_index": "test",
"_type": "child",
"_id": "AU_fjKqf40Hx1Kh6rfQH",
"_score": 1,
"_source": {
"parentId": "AU_ffvwM40Hx1Kh6rfQA",
"name": "derp2child3"
}
}
]
},
"aggregations": {
"ids": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "au_ffvwm40hx1kh6rfqa",
"doc_count": 3
}
]
}
}
}
For some reason, the key in the aggregation bucket is returned in lowercase, so I am not able to request the parent from ElasticSearch with it:
GET http://localhost:9200/test/A/au_ffvwm40hx1kh6rfqa
Response:
{
"_index": "test",
"_type": "A",
"_id": "au_ffvwm40hx1kh6rfqa",
"found": false
}
Any ideas on why this is happening?
The difference between the hits and the results of the aggregations is that the aggregations work on the created terms. They will also return the terms. The hits return the original source.
How are these terms created? Based on the chosen analyser, which in your case is the default one, the standard analyser. One of the things this analyser does is lowercasing all the characters of the terms. Like mentioned by Andrei, you should configure the field parentId to be not_analyzed.
PUT test
{
"mappings": {
"B": {
"properties": {
"parentId": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
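
To illustrate the difference, here is a toy terms aggregation in Python over the three child documents from the question. The analyzed_terms function is only a crude approximation of the standard analyzer, and terms_buckets is a made-up helper, so treat it as a sketch of the idea rather than of Elasticsearch internals.

```python
import re
from collections import Counter

CHILDREN = [
    {"parentId": "AU_ffvwM40Hx1Kh6rfQA", "name": "derp2child1"},
    {"parentId": "AU_ffvwM40Hx1Kh6rfQA", "name": "derp2child2"},
    {"parentId": "AU_ffvwM40Hx1Kh6rfQA", "name": "derp2child3"},
]


def analyzed_terms(value):
    """Crude stand-in for the standard analyzer: lowercase and split into word tokens."""
    return re.findall(r"\w+", value.lower())


def terms_buckets(docs, field, analyzed):
    """Count documents per term, using analyzed terms or the raw value."""
    counts = Counter()
    for doc in docs:
        value = doc[field]
        terms = analyzed_terms(value) if analyzed else [value]
        counts.update(set(terms))  # each document counts once per distinct term
    return dict(counts)


if __name__ == "__main__":
    print(terms_buckets(CHILDREN, "parentId", analyzed=True))
    print(terms_buckets(CHILDREN, "parentId", analyzed=False))
```

With analyzed=False the bucket key keeps its original case, so it can be used directly as the parent _id.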
I am late to the party, but I had the same issue and found that it is caused by normalization (analysis) of the field.
You have to change the mapping of the index if you want to prevent normalization from changing the aggregated values to lowercase.
You can check the current mapping in the DevTools console by typing
GET /A/_mapping
GET /B/_mapping
Once you have the structure of the index, look at how the parentId field is mapped.
If you don't want to change the behaviour of the field but also want to avoid normalization during the aggregation, you can add a sub-field to the parentId field.
To change the mapping, you have to delete the index and recreate it with the new mapping:
creating the index
Adding multi-fields to an existing field
In your case it looks like this (it contains only the parentId field)
PUT /B/_mapping
{
"properties": {
"parentId": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
then you have to use the subfield in the query:
POST http://localhost:9200/test/B/_search
{
"query": {
"query_string": {
"default_field": "name",
"query": "derp2*"
}
},
"aggregations": {
"ids": {
"terms": {
"field": "parentId.keyword",
"order": {"_key": "desc"}
}
}
}
}

How to use _timestamp in a scripted update

I was trying to come up with an elegant answer to this question and ran into an unexpected problem. The basic idea is to update a document based on its current timestamp. Seems straightforward enough, but I seem to be missing something. At the bottom of the Update API page, the ES docs say:
It also allows to update the ttl of a document using ctx._ttl and timestamp using ctx._timestamp. Note that if the timestamp is not updated and not extracted from the _source it will be set to the update date.
The ES documentation is often enigmatic at best, especially when it comes to scripting, but I took this to mean that I could use the _timestamp field in an update script.
So I set up a simple index with a timestamp:
PUT /test_index
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"doc": {
"_timestamp": {
"enabled": true,
"store": true,
"path": "doc_date",
"format" : "YYYY-MM-dd"
},
"properties": {
"doc_date": {
"type": "date",
"format" : "YYYY-MM-dd"
},
"doc_text": {
"type": "string"
}
}
}
}
}
and added some docs:
POST /test_index/_bulk
{"index":{"_index":"test_index","_type":"doc","_id":1}}
{"doc_text":"doc1", "doc_date":"2015-2-5"}
{"index":{"_index":"test_index","_type":"doc","_id":2}}
{"doc_text":"doc2", "doc_date":"2015-2-10"}
{"index":{"_index":"test_index","_type":"doc","_id":3}}
{"doc_text":"doc3", "doc_date":"2015-2-15"}
If I query for the first doc, I get back what I expect:
POST /test_index/_search
{
"query": {
"match": {
"doc_text": "doc1"
}
},
"fields": [
"_timestamp",
"_source"
]
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.4054651,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 1.4054651,
"_source": {
"doc_text": "doc1",
"doc_date": "2015-2-5"
},
"fields": {
"_timestamp": 1423094400000
}
}
]
}
}
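
For reference, the _timestamp value in this response and the new_ts parameter used in the update requests are epoch milliseconds for midnight UTC on the given doc_date. Here is a quick Python sketch of that arithmetic; doc_date_to_millis is a made-up helper, and it assumes dates without a time zone are treated as UTC.

```python
from datetime import datetime, timezone


def doc_date_to_millis(doc_date):
    """Convert a doc_date such as '2015-2-5' to epoch milliseconds at midnight UTC."""
    year, month, day = (int(part) for part in doc_date.split("-"))
    midnight = datetime(year, month, day, tzinfo=timezone.utc)
    return int(midnight.timestamp()) * 1000


if __name__ == "__main__":
    print(doc_date_to_millis("2015-2-5"))   # stored timestamp of doc 1
    print(doc_date_to_millis("2015-2-10"))  # value passed as new_ts
```

So the stored timestamp of doc 1 is lower than new_ts, and a working "less than" comparison should have let the update through.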
So far so good. Now I want to conditionally update the first doc, based on its timestamp. First I tried this, and got an error:
POST /test_index/doc/1/_update
{
"script": "if(ctx._timestamp < new_ts){ctx._source.doc_date=new_date;ctx._source.doc_text=new_text}",
"params": {
"new_ts": 1423526400000,
"new_date": "2015-2-10",
"new_text": "doc1-updated"
}
}
...
{
"error": "ElasticsearchIllegalArgumentException[failed to execute script]; nested: PropertyAccessException[[Error: could not access: _timestamp; in class: java.util.HashMap]\n[Near : {... if(ctx._timestamp < new_ts){ctx._ ....}]\n ^\n[Line: 1, Column: 4]]; ",
"status": 400
}
Then I tried this:
POST /test_index/doc/1/_update
{
"script": "if(ctx[\"_timestamp\"] < new_ts){ctx._source.doc_date=new_date;ctx._source.doc_text=new_text}",
"params": {
"new_ts": 1423526400000,
"new_date": "2015-2-10",
"new_text": "doc1-updated"
}
}
...
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_version": 2
}
I didn't get an error, but the update didn't happen:
POST /test_index/_search
{
"query": {
"match": {
"doc_text": "doc1"
}
},
"fields": [
"_timestamp",
"_source"
]
}
...
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1.287682,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 1.287682,
"_source": {
"doc_text": "doc1",
"doc_date": "2015-2-5"
},
"fields": {
"_timestamp": 1423094400000
}
}
]
}
}
Just out of curiosity, I inverted the conditional:
POST /test_index/doc/1/_update
{
"script": "if(ctx[\"_timestamp\"] > new_ts){ctx._source.doc_date=new_date;ctx._source.doc_text=new_text}",
"params": {
"new_ts": 1423526400000,
"new_date": "2015-2-10",
"new_text": "doc1-updated"
}
}
with the same result: no update.
Okay, so as a sanity check I tried to set the timestamp, and got an error:
POST /test_index/doc/1/_update
{
"script": "ctx._source.doc_date=new_date;ctx._source.doc_text=new_text;ctx._timestamp=new_ts",
"params": {
"new_ts": 1423526400000,
"new_date": "2015-2-10",
"new_text": "doc1-updated"
}
}
...
{
"error": "ClassCastException[java.lang.Long cannot be cast to java.lang.String]",
"status": 500
}
I also tried it with "ctx[\"_timestamp\"]=new_ts;", and got the same error.
So it seems that the _timestamp field is not available to the script, even though the documentation says it is. What am I doing wrong?
I also tried updating without the conditional or updating the timestamp, and it worked as expected.
I used Elasticsearch version 1.3.4 (with dynamic scripting enabled, obviously), running on an Ubuntu 12 VM.
Here is the code I used to set this up:
http://sense.qbox.io/gist/ca2b3c6b84572e5f87d57d22f8c38252fa4ee216
