Does the Elasticsearch Curator rollover action not support date math in the name? - elasticsearch

I'm trying to use date math in the Elasticsearch Curator rollover action, but it seems like it doesn't support an alias name written as date math, such as '<indexname-{now/d}>':
---
# Remember, leave a key empty if there is no value. None will be a string,
# not a Python "NoneType"
#
# Also remember that all examples have 'disable_action' set to True. If you
# want to use this action as a template, be sure to set this to False after
# copying it.
actions:
  1:
    action: rollover
    description: >-
      Rollover the index associated with alias 'indexname-{now/d}', index
      should be in the format of indexname-{now/d}-000001.
    options:
      disable_action: False
      name: '<indexname-{now/d}>'
      conditions:
        max_age: 1d
        max_docs: 1000000
        max_size: 50g
      extra_settings:
        index.number_of_shards: 3
        index.number_of_replicas: 1
It is taking that name '<indexname-{now/d}>' as a literal string/alias name and gives an error:
Failed to complete action: rollover. <class 'ValueError'>: Unable to perform index rollover with alias "<indexname-{now/d}>".
I'd suggest adding support for date math in the alias name for the rollover action in Elasticsearch Curator.

What it appears you are trying to do is to rollover an alias named indexname-2021.10.28. Is that correct? I mention this because the name directive is for the alias name rather than the index name. Additionally, using this pattern would be looking for an alias with today's date {now/d}, but the rollover conditions appear to be looking for something older than 1 day (or 1M docs, or over 50g). If that alias is older than 24 hours, the lookup will fail because it's looking for something that has likely not been created yet.
I presume you are more likely looking for an alias named indexname that points to indices that look like indexname-YYYY.MM.dd. Did you know that this behavior is automatic if the original index and alias combination are created with date math?
For example, if I had created this index + alias combination yesterday (it's URL-encoded for use in the dev tools console):
# PUT <my-index-{now/d}-000001>
PUT %3Cmy-index-%7Bnow%2Fd%7D-000001%3E
{
  "aliases": {
    "my-index": {
      "is_write_index": true
    }
  }
}
The results would say:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "my-index-2021.10.27-000001"
}
And if I forced a rollover today:
POST my-index/_rollover
{
  "conditions": {
    "max_age": "1d"
  }
}
This is the resulting output:
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "old_index" : "my-index-2021.10.27-000001",
  "new_index" : "my-index-2021.10.28-000002",
  "rolled_over" : true,
  "dry_run" : false,
  "conditions" : {
    "[max_age: 1d]" : true
  }
}
With this behavior, it's very simple to get a date in the index name while still using default rollover behavior.
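For completeness, the same bootstrap-and-rollover flow can be driven from Python. Here is a minimal sketch with a 7.x-style elasticsearch-py client (the host is an assumption; the alias and index names mirror the console example above):
from elasticsearch import Elasticsearch

# Assumed local cluster; adjust the host for your environment.
es = Elasticsearch(["http://localhost:9200"])

# Bootstrap the first backing index with date math in its name and point the
# write alias "my-index" at it (same as the dev tools example above); the
# client URL-encodes the date-math name on the way out.
es.indices.create(
    index="<my-index-{now/d}-000001>",
    body={"aliases": {"my-index": {"is_write_index": True}}},
)

# Later, roll the alias over; Elasticsearch re-resolves the date math, so the
# new backing index picks up the current date automatically.
response = es.indices.rollover(
    alias="my-index",
    body={"conditions": {"max_age": "1d"}},
)
print(response["old_index"], "->", response["new_index"])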

Related

What is the reason for this Logstash error? ("error"=>{"type"=>"illegal_argument_exception", "reason"=>"mapper ...)

Below is my Filebeat config, where I added a logID:
- type: log
  fields:
    source: 'filebeat2'
    logID: debugger
  fields_under_root: true
  enabled: true
  paths:
    - /var/log/path/*
And below is the output section of my Logstash conf:
if "debugger" in [logID] and ("" not in [Exeption]) {
elasticsearch {
user => ""
password => ""
hosts => ["https://ip:9200"]
index => "debugger"
}
}
I put some log files in the path (10 files) and randomly got this error in logstash-plain.log:
{"index"=>{"_index"=>"debugger", "_type"=>"_doc", "_id"=>"9-DmvoIBPs8quoIM7hCa",
"status"=>400, "error"=>{"type"=>"illegal_argument_exception", "reason"=>"mapper
[request.dubugeDate] cannot be changed from type [text] to [long]"}}}}
And also this:
"error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field
[debug.registrationDate] of type [long] in document with id 'Bt_YvoIBPs8quoIMXfwd'.
Preview of field's value: '2022-08-1707:37:08.256'", "caused_by"=>
{"type"=>"illegal_argument_exception", "reason"=>"For input string: \"2022-08-
1707:37:08.256\""}}}}}
Can anybody help me?
It looks like, in the first case, the field request.dubugeDate is already mapped as text in the index, and an incoming document is trying to redefine it as long (numeric data); an existing mapping cannot be changed on the fly, hence the error.
In the second case, the field debug.registrationDate is mapped as long, and you are trying to ingest a string (the date '2022-08-1707:37:08.256'), which cannot be parsed as a number.
You can check the mapping of your index with the GET /YOUR_INDEX/_mapping command from Kibana, or the same via curl.
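For example, a minimal sketch of pulling the mapping with elasticsearch-py (the host comes from the Logstash output section above; the credentials are placeholders):
from elasticsearch import Elasticsearch

# Placeholder credentials; host taken from the Logstash output section above.
es = Elasticsearch(["https://ip:9200"], http_auth=("user", "password"), verify_certs=False)

# Fetch the mapping of the "debugger" index and inspect the offending fields.
mapping = es.indices.get_mapping(index="debugger")
properties = mapping["debugger"]["mappings"]["properties"]
print(properties.get("request"))  # look at request.dubugeDate
print(properties.get("debug"))    # look at debug.registrationDate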

Conditional indexing not working in ingest node pipelines

I am trying to implement an index template with data stream enabled and then set conditions in ingest node pipelines, so that I could get metrics with the below-mentioned index format:
.ds-metrics-kubernetesnamespace
I had tried this some time back, doing the things mentioned above, and it was giving metrics in that format, but now when I implement the same thing it's not changing anything in my index. I cannot see any error logs in the OpenShift cluster, so ingest seems to be working fine (when I add a doc and test it, it works fine).
PUT _ingest/pipeline/metrics-index
{
  "processors": [
    {
      "set": {
        "field": "_index",
        "value": "metrics-{{kubernetes.namespace}}",
        "if": "ctx.kubernetes?.namespace==\"dev\""
      }
    }
  ]
}
This is the ingest node condition I have used for indexing. Below is my Metricbeat configuration:
metricbeatConfig:
  metricbeat.yml: |
    metricbeat.modules:
      - module: kubernetes
        enabled: true
        metricsets:
          - state_node
          - state_daemonset
          - state_deployment
          - state_replicaset
          - state_statefulset
          - state_pod
          - state_container
          - state_job
          - state_cronjob
          - state_resourcequota
          - state_service
          - state_persistentvolume
          - state_persistentvolumeclaim
          - state_storageclass
          - event
Since you're using Metricbeat, you have another way to do this which is much better.
Simply configure your elasticsearch output like this:
output.elasticsearch:
  hosts: ["http://<host>:<port>"]
  indices:
    - index: "%{[kubernetes.namespace]}"
      mappings:
        dev: "metrics-dev"
      default: "metrics-default"
or like this:
output.elasticsearch:
  hosts: ["http://<host>:<port>"]
  indices:
    - index: "metrics-%{[kubernetes.namespace]}"
      when.equals:
        kubernetes.namespace: "dev"
      default: "metrics-default"
Or simply like this, which would also work if you have plenty of different namespaces and don't want to manage different mappings:
output.elasticsearch:
  hosts: ["http://<host>:<port>"]
  index: "metrics-%{[kubernetes.namespace]}"
Steps to create data streams in the Elastic Stack (a minimal sketch of the first two steps follows this list):
1. Create an ILM policy.
2. Create an index template whose index pattern matches the index pattern of the metrics/logs (set the number of primary/replica shards and the mapping in the index template).
3. Set a condition in the ingest pipeline (make sure no index with that name already exists).
If these conditions are met, a data stream is created, and the logs/metrics get a backing index starting with .ds-, which is hidden in index management.
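A minimal sketch of the ILM policy and index template steps with a 7.x-style elasticsearch-py client (the policy/template names, shard counts, retention values, and the metrics-* pattern are illustrative assumptions):
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])  # assumed host

# 1. ILM policy that rolls backing indices over and eventually deletes them.
es.ilm.put_lifecycle(
    policy="metrics-custom-policy",
    body={
        "policy": {
            "phases": {
                "hot": {"actions": {"rollover": {"max_age": "1d", "max_size": "50gb"}}},
                "delete": {"min_age": "7d", "actions": {"delete": {}}},
            }
        }
    },
)

# 2. Index template with data_stream enabled whose pattern matches the
#    indices the ingest pipeline writes to (e.g. metrics-dev).
es.indices.put_index_template(
    name="metrics-custom-template",
    body={
        "index_patterns": ["metrics-*"],
        "data_stream": {},
        "template": {
            "settings": {
                "number_of_shards": 1,
                "number_of_replicas": 1,
                "index.lifecycle.name": "metrics-custom-policy",
            }
        },
    },
)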
In my case the issue was that I did not have enough permissions to create a custom index. When I checked my OpenShift logs, I found that Metricbeat was complaining about the missing privilege. So I gave it superuser permission and then used an ingest pipeline to set conditional indexing:
PUT _ingest/pipeline/metrics-index
{
  "processors": [
    {
      "set": {
        "field": "_index",
        "value": "metrics-{{kubernetes.namespace}}",
        "if": "ctx.kubernetes?.namespace==\"dev\""
      }
    }
  ]
}
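If the pipeline logic itself is ever in doubt, it can be exercised with the simulate API before indexing real data. A minimal sketch with elasticsearch-py (the sample document and the host are made up for illustration):
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])  # assumed host

# Run a fake metrics document through the "metrics-index" pipeline to verify
# that the conditional set processor rewrites _index as expected.
result = es.ingest.simulate(
    id="metrics-index",
    body={"docs": [{"_source": {"kubernetes": {"namespace": "dev"}}}]},
)
print(result["docs"][0]["doc"]["_index"])  # expected: "metrics-dev"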

Individually update a large amount of documents with the Python DSL Elasticsearch UpdateByQuery

I'm trying to use UpdateByQuery to update a property of a large number of documents. But as each document will have a different value, I need to execute it one by one. I'm traversing a large collection of documents, and for each document I call this function:
def update_references(self, query, script_source):
    try:
        ubq = UpdateByQuery(using=self.client, index=self.index).update_from_dict(query).script(source=script_source)
        ubq.execute()
    except Exception as err:
        return False
    return True
Some example values are:
query = {'query': {'match': {'_id': 'VpKI1msBNuDimFsyxxm4'}}}
script_source = 'ctx._source.refs = [\'python\', \'java\']'
The problem is that when I do that, I got an error: "Too many dynamic script compilations within, max: [75/5m]; please use indexed, or scripts with parameters instead; this limit can be changed by the [script.max_compilations_rate] setting".
If I change the max_compilations_rate using Kibana, it has no effect:
PUT _cluster/settings
{
  "transient": {
    "script.max_compilations_rate": "1500/1m"
  }
}
Anyway, it would be better to use a parametrized script. I tried:
def update_references(self, query, script_source, script_params):
    try:
        ubq = UpdateByQuery(using=self.client, index=self.index).update_from_dict(query).script(source=script_source, params=script_params)
        ubq.execute()
    except Exception as err:
        return False
    return True
So, this time:
script_source = 'ctx._source.refs = params.value'
script_params = {'value': ['python', 'java']}
But as I have to update the query and the parameters each time, I need to create a new instance of the UpdateByQuery for each document in the large collection, and the result is the same error.
I also tried to traverse and update the large collection with:
es.update(
    index=kwargs["index"],
    doc_type="paper",
    id=paper["_id"],
    body={"doc": {
        "refs": paper["refs"]  # e.g. ['python', 'java']
    }}
)
But I'm getting the following error: "Failed to establish a new connection: [Errno 99] Cannot assign requested address juil. 10 18:07:14 bib gunicorn[20891]: POST http://localhost:9200/papers/paper/OZKI1msBNuDimFsy0SM9/_update [status:N/A request:0.005s"
So, please, if you have any idea on how to solve this it will be really appreciated.
Best,
You can try it like this.
PUT _cluster/settings
{
  "persistent" : {
    "script.max_compilations_rate" : "1500/1m"
  }
}
The version update is causing these errors.
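Another option, since each document gets its own value anyway, is to skip scripting entirely and send partial-document updates in bulk, which sidesteps the compilation limit altogether. A minimal sketch with the elasticsearch-py bulk helper (the index name papers, the refs field, and the IDs come from the question; everything else is an assumption):
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://localhost:9200"])  # assumed host

def bulk_update_refs(papers, index="papers"):
    # One partial-document update per paper, batched into bulk requests.
    actions = (
        {
            "_op_type": "update",
            "_index": index,
            "_id": paper["_id"],
            "doc": {"refs": paper["refs"]},
        }
        for paper in papers
    )
    # helpers.bulk batches the actions and reuses the HTTP connection,
    # which also avoids exhausting local ports (the Errno 99 error above).
    return helpers.bulk(es, actions, raise_on_error=False)

# Example usage (IDs taken from the question):
papers = [
    {"_id": "VpKI1msBNuDimFsyxxm4", "refs": ["python", "java"]},
    {"_id": "OZKI1msBNuDimFsy0SM9", "refs": ["elasticsearch"]},
]
print(bulk_update_refs(papers))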

Sematext Logagent Elasticsearch - Indexes not being created?

I'm trying to send data to Elasticsearch using Logagent; while there doesn't seem to be any error sending the data, the index isn't being created in ELK. I'm trying to find the index by creating a new index pattern via the Kibana GUI, but the index does not seem to exist. This is my logagent.conf right now:
input:
  # bro-start:
  #   module: command
  #   # store BRO logs in /tmp/bro in JSON format
  #   command: mkdir /tmp/bro; cd /tmp/bro; /usr/local/bro/bin/bro -i eth0 -e 'redef LogAscii::use_json=T;'
  #   sourceName: bro
  #   restart: 1
  # read the BRO logs from the file system ...
  files:
    - '/usr/local/bro/logs/current/*.log'
parser:
  json:
    enabled: true
    transform: !!js/function >
      function (sourceName, parsed, config) {
        var src = sourceName.split('/')
        // generate Elasticsearch _type out of the log file sourceName
        // e.g. "dns" from /tmp/bro/dns.log
        if (src && src[src.length-1]) {
          parsed._type = src[src.length-1].replace(/\.log/g,'')
        }
        // store log file path in each doc
        parsed.logSource = sourceName
        // convert Bro timestamps to JavaScript timestamps
        if (parsed.ts) {
          parsed['#timestamp'] = new Date(parsed.ts * 1000)
        }
      }
output:
  stdout: false
  elasticsearch:
    module: elasticsearch
    url: http://10.10.10.10:9200
    index: bro_logs
Maybe I have to create the index mappings manually? I don't know.
Thank you for any advice or insight!
I found out that there actually was an error. I was trying to send authentication via a field called "auth", but that field doesn't exist. I can put the credentials in the URL instead, though: url: https://USERNAME:PASSWORD@10.10.10.10:9200.
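As a quick sanity check after fixing the output, you can confirm the index exists and is receiving documents directly, rather than going through the Kibana index-pattern screen. A minimal elasticsearch-py sketch (host and index name from logagent.conf above; the credentials are placeholders):
from elasticsearch import Elasticsearch

# Placeholder credentials embedded in the URL; host and index from logagent.conf.
es = Elasticsearch(["https://USERNAME:PASSWORD@10.10.10.10:9200"], verify_certs=False)

if es.indices.exists(index="bro_logs"):
    print("bro_logs exists,", es.count(index="bro_logs")["count"], "docs")
else:
    print("bro_logs has not been created yet")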

Is it possible with Elastic Curator to delete indexes matching field value

We have several indexes on our Elasticsearch. They come from a Fluentd plugin sending logs from our Docker containers. We would like to delete old indexes not only when they are older than a specific number of days based on the index name, but by applying different delete rules depending on log fields.
Here is an example of log:
{
  "_index": "fluentd-2018.03.28",
  "_type": "fluentd",
  "_id": "o98123bcbd_kqpowkd",
  "_version": 1,
  "_score": null,
  "_source": {
    "container_id": "bbd72ec5e46921ab8896a05684a7672ef113a79e842285d932f",
    "container_name": "/redis-10981239d5",
    "source": "stdout",
    "log": "34:M 28 Mar 15:07:51.086 * 10 changes in 300 seconds. Saving...\r34:M 28 Mar 15:07:51.188 * Background saving terminated with success\r",
    "#timestamp": "2018-03-28T15:07:56.217739954+00:00",
    "#log_name": "docker.redis"
  },
  "fields": {
    "#timestamp": [
      "2018-03-28T15:07:56.217Z"
    ]
  }
}
In that case, we would like to delete all logs matching #log_name = docker.redis older than 7 days.
Is it possible to define a Curator action which deletes indexes filtered by such a field value?
We tried different filters without any success. The only action we managed to perform successfully is based on the index name:
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 30 days
    options:
      ignore_empty_list: True
      disable_action: True
    filters:
      - filtertype: pattern
        kind: prefix
        value: fluentd-
      - filtertype: age
        source: name
        direction: older
        timestring: '%Y.%m.%d'
        unit: days
        unit_count: 30
Curator offers only index-level retention configuration. If you need retention at the document level, you can try a script that executes a delete-by-query (see the sketch below).
Otherwise, using Curator, you need to separate your data into different indexes in order to apply different retention policies.
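For the document-level route, a delete-by-query along these lines should work. A minimal sketch with a 7.x-style elasticsearch-py client (the index pattern, field names, and 7-day cutoff come from the question; the host is assumed, and depending on your mapping you may need to query #log_name.keyword instead):
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://localhost:9200"])  # assumed host

# Delete every docker.redis log document older than 7 days across the
# fluentd-* indices; run this periodically (e.g. from cron).
response = es.delete_by_query(
    index="fluentd-*",
    body={
        "query": {
            "bool": {
                "filter": [
                    {"term": {"#log_name": "docker.redis"}},
                    {"range": {"#timestamp": {"lt": "now-7d"}}},
                ]
            }
        }
    },
    conflicts="proceed",
)
print("deleted", response["deleted"], "documents")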
