ELASTICSEARCH - Include date automatically without a predefined date field

Is it possible to include a "date and time" field in a document that Elasticsearch receives, without that field being previously defined in the mapping? The date and time should correspond to the moment the JSON is received by Elasticsearch.
This is the mapping:
{
  "mappings": {
    "properties": {
      "entries": {
        "type": "nested"
      }
    }
  }
}
Can something be defined in the mapping so that Elasticsearch includes the current date automatically?

What you can do is define an ingest pipeline that automatically adds a date field when your documents are indexed.
First, create a pipeline like this (_ingest.timestamp is a built-in field that you can access):
PUT _ingest/pipeline/add-current-time
{
  "description" : "automatically add the current time to the documents",
  "processors" : [
    {
      "set" : {
        "field": "@timestamp",
        "value": "{{{_ingest.timestamp}}}"
      }
    }
  ]
}
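You can check what the pipeline produces before indexing anything by simulating it; a quick sketch using a made-up sample document:
POST _ingest/pipeline/add-current-time/_simulate
{
  "docs": [
    {
      "_source": {
        "my_field": "test"
      }
    }
  ]
}
The response shows each document as it would be stored, i.e. with the @timestamp field added.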
Then when you index a new document, you need to reference the pipeline, like this:
PUT test-index/_doc/1?pipeline=add-current-time
{
  "my_field": "test"
}
After indexing, the document would look like this:
GET test-index/_doc/1
=>
{
  "@timestamp": "2020-08-12T15:48:00.000Z",
  "my_field": "test"
}
UPDATE:
Since you're using index templates, it's even easier, because you can define a default pipeline to be run for each indexed document.
In your index template, you need to add this to the index settings:
{
  "order": 1,
  "index_patterns": [
    "attom"
  ],
  "aliases": {},
  "settings": {
    "index": {
      "number_of_shards": "5",
      "number_of_replicas": "1",
      "default_pipeline": "add-current-time" <--- add this
    }
  },
  ...
Then you can keep indexing documents without referencing the pipeline; it will be applied automatically.
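For example, with the template above in place, a plain index request into a matching index (attom, per the template's pattern) goes through the pipeline without any ?pipeline= parameter:
PUT attom/_doc/1
{
  "my_field": "test"
}
The stored document will contain the @timestamp field exactly as in the earlier example.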

"value": "{{{_ingest.timestamp}}}"
Source

Related

Elastic search get the index template

I want to find out, with a REST call, which template an index is bound to. Basically, pass the index name and get the template it belongs to.
I know I can list all the templates and see, by their patterns, which indices will bind to each template, but we have so many templates, and so many orderings on them, that it's hard to tell.
You can use the _meta mapping field for this, in order to attach any custom information to your indices.
So let's say you have an index template like this:
PUT _index_template/template_1
{
  "index_patterns": ["index*"],
  "template": {
    "settings": {
      "number_of_shards": 1
    },
    "mappings": {
      "_meta": {                    <---- add this
        "template": "template_1"    <---- add this
      },                            <---- add this
      "_source": {
        "enabled": true
      },
      "properties": {
        "host_name": {
          "type": "keyword"
        },
        "created_at": {
          "type": "date",
          "format": "EEE MMM dd HH:mm:ss Z yyyy"
        }
      }
    },
    "aliases": {}
  },
  "_meta": {
    "description": "my custom template"
  }
}
Once you create an index that matches that template's pattern, the _meta field will also make it into the new index you're creating.
PUT index1
Then if you get that new index's mapping, you'll see from which template it was created:
GET index1?filter_path=**._meta.template
=>
{
  "index1" : {
    "mappings" : {
      "_meta" : {
        "template" : "template_1" <---- you get this
      }
    }
  }
}
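If you adopt this convention in all your templates, you can also list the originating template of every index in a single call; a sketch, assuming each template writes its own name into _meta.template as above:
GET _all/_mapping?filter_path=**._meta.template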

How to restrict a field in an Elasticsearch document from being updated?

I have an Elasticsearch index into which I am saving documents.
Is there any way to make Elasticsearch throw an exception when I index the same document (same _id) again with a new/updated value, but only if one particular field is being updated? For other fields it can keep the default behavior.
For example, I have an index template as below:
PUT /_index_template/example_template
{
  "index_patterns": [
    "example*"
  ],
  "priority": 1,
  "template": {
    "aliases": {
      "example": {}
    },
    "mappings": {
      "dynamic": "strict",
      "_source": {
        "enabled": false
      },
      "properties": {
        "SomeID": {
          "type": "keyword"
        },
        "AnotherInfo": {
          "type": "keyword"
        }
      }
    }
  }
}
Then I create an index based on this template:
PUT example01
After that I save a document into this index:
POST example01/_doc/1
{
  "SomeID": "abcdedf",
  "AnotherInfo": "xyze"
}
Now the next time I try to save the document with a different "SomeID" value:
POST example01/_doc/1
{
  "SomeID": "uiiuiiu",
  "AnotherInfo": "xyze"
}
I want it to fail with something like "Sorry, the SomeID field cannot be updated". Basically, I want to prevent a document field from being updated in Elasticsearch.
Thanks in advance!
Elasticsearch supports revisions on documents by default, meaning it traces changes to indexed documents through their _id: each time you manipulate the document with, say, id 17, it increments the value of the _version field. So you cannot have two duplicate documents with the same _id if you don't use custom routing; if you do use custom routing, always be careful about duplication of the _id field, because that field is not just an identifier, it also determines which shard the document is located on.
Moreover, Elasticsearch has no way to enforce restrictions at the field level within a document; you have to control restrictions on updating fields at the application level, or use field-level security based on roles.
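One coarse workaround at the API level, if rejecting any overwrite of an existing document is acceptable (it blocks all updates, not just updates to SomeID), is to index with op_type=create:
PUT example01/_doc/1?op_type=create
{
  "SomeID": "abcdedf",
  "AnotherInfo": "xyze"
}
A second call with the same _id then fails with a 409 version_conflict_engine_exception instead of silently overwriting the document.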
As an example of field-level security, consider the role definition below, which grants read access only to the category, @timestamp, and message fields in all the events-* data streams and indices:
POST /_security/role/test_role1
{
  "indices": [
    {
      "names": [ "events-*" ],
      "privileges": [ "read" ],
      "field_security": {
        "grant": [ "category", "@timestamp", "message" ]
      }
    }
  ]
}

ElasticSearch Indexing, Adding Fields

I would like to use Elasticsearch to index JSON documents with the schema provided below:
{
  "data": "etc",
  "metadata": {
    "foo": "bar",
    "baz": "etc"
  }
}
However, the metadata can vary and I do not know all the fields that could be present. Is there a way to tell Elasticsearch that if it sees a value in the metadata object, it should index it in a certain way? (I do know that all the values will be strings.)
Thanks
Yes, you can do that using dynamic templates, basically like this:
PUT my_index
{
  "mappings": {
    "_doc": {
      "dynamic_templates": [
        {
          "full_name": {
            "path_match": "metadata.*",
            "mapping": {
              "type": "text" <---- add your desired mapping here
            }
          }
        }
      ]
    }
  }
}
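As a quick check (a sketch with made-up field names), index a document containing a previously unseen metadata key and then look at the mapping:
PUT my_index/_doc/1
{
  "data": "etc",
  "metadata": {
    "foo": "bar",
    "something_new": "also a string"
  }
}
GET my_index/_mapping
Both metadata.foo and metadata.something_new will show up typed as text, per the dynamic template.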

Elasticsearch: Duplicate properties in a single record

I have to find every document in Elasticsearch that has duplicate properties. My mapping looks something like this:
"type": {
"properties": {
"thisProperty": {
"properties" : {
"id":{
"type": "keyword"
},
"other_id":{
"type": "keyword"
}
}
}
The documents I have to find have a pattern like this:
"thisProperty": [
{
"other_id": "123",
"id": "456"
},
{
"other_id": "123",
"id": "456"
},
{
"other_id": "4545",
"id": "789"
}]
So, I need to find any document of this type that has repeated property values. I also cannot search by term, because I do not know what the value of either id field is. So far the API hasn't shown a clear way to do this via a query, and the programmatic approach is possible but cumbersome. Is it possible to get this result set with an Elasticsearch query? If so, how?
(The version of Elasticsearch is 5.3)

Range filter not working

I'm using the Elasticsearch search engine, and when I run the code below, the results returned don't match the range criteria (I get items with a published date below the desired limit):
#!/bin/bash
curl -X GET 'http://localhost:9200/newsidx/news/_search?pretty' -d '{
  "fields": [
    "art_text",
    "title",
    "published",
    "category"
  ],
  "query": {
    "bool": {
      "should": [
        {
          "fuzzy": {"art_text": {"boost": 89, "value": "google"}}
        },
        {
          "fuzzy": {"art_text": {"boost": 75, "value": "twitter"}}
        }
      ],
      "minimum_number_should_match": 1
    }
  },
  "filter": {
    "range": {
      "published": {
        "from": "2013-04-12 00:00:00"
      }
    }
  }
}
'
I also tried putting the range clause in a must clause inside the bool query, but the results were the same.
Edit: I use Elasticsearch to search a MongoDB database through a river plugin. This is the script I ran to set up the river that indexes the MongoDB database into ES:
#!/bin/bash
curl -X PUT localhost:9200/_river/mongodb/_meta -d '{
  "type": "mongodb",
  "mongodb": {
    "db": "newsful",
    "collection": "news"
  },
  "index": {
    "name": "newsidx",
    "type": "news"
  }
}'
Besides this, I didn't create any other indices.
Edit 2:
A view of the ES mappings (http://localhost:9200/newsidx/news/_mapping):
"published": {
  "type": "string"
}
The reason is in your mapping. The published field, which you are using as a date, is indexed as a string. That's probably because the date format you are using is not the default one in Elasticsearch, so the field type is not auto-detected and it's indexed as a plain string.
You should change your mapping using the put mapping API: define the published field as a date there, specifying the format(s) you're using, and reindex your data.
After that your range filter should work!
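Since an existing field cannot be changed from string to date in place, a minimal sketch of the fix, assuming the date format used in the question, is to drop and recreate the index with an explicit mapping and then let the river reindex the data:
#!/bin/bash
curl -X DELETE 'http://localhost:9200/newsidx'
curl -X PUT 'http://localhost:9200/newsidx' -d '{
  "mappings": {
    "news": {
      "properties": {
        "published": {
          "type": "date",
          "format": "yyyy-MM-dd HH:mm:ss"
        }
      }
    }
  }
}'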
