I'm stuck on a problem similar to the one in this post, but can't find a solution: https://github.com/elastic/curator/issues/1513
I use Curator (5.8) to take a daily snapshot of all indices in my Elasticsearch cluster (7.7.1).
I realised today that only my indices starting with "." are being snapshotted by Curator.
If I use curator_cli, all indices are indeed seen by Curator and snapshotted.
I tried removing all filters from my action file and replacing them with:
filters:
- filtertype: none
Nothing seems to work; my dry runs always end up listing only the indices beginning with a dot.
This is my action file:
---
actions:
  1:
    action: snapshot
    description: >-
      Snapshot all indices
    options:
      repository: backup
      name: testbackup6
      ignore_unavailable: False
      include_global_state: True
      partial: False
      wait_for_completion: True
      skip_repo_fs_check: False
      disable_action: False
    filters:
    - filtertype: none
Curator logs (I have anonymized some results):
2021-01-08 18:34:44,021 INFO DRY-RUN: snapshot: testbackup6 in repository backup with arguments: {'ignore_unavailable': False, 'include_global_state': True, 'partial': False, 'indices': '.apm-XXX,.apm-customXXX,.async-sXXX,.kibana_1,.kibana_task_manager_1,.monitoring-alerts-7,.monitoring-es-7-2021.01.02,.monitoring-es-7-2021.01.03,.monitoring-es-7-2021.01.04
...
,.triggered_watches,.watches'}
I looked at the DEBUG logs, and index lifecycle management (ILM) seems to be the problem.
Here are some accepted/rejected indices:
2021-01-08 19:54:07,925 DEBUG curator.indexlist __not_actionable:39 Index XXXX_supervision-server_logs-2020.12.31-000014 is not actionable, removing from list.
2021-01-08 19:54:07,925 DEBUG curator.indexlist __excludify:58 **Removed** from actionable list: XXX_supervision-server_logs-2020.12.31-000014 has index.lifecycle.name XXX_supervision-server_logs-policy
2021-01-08 19:54:07,925 DEBUG curator.indexlist __actionable:35 Index .monitoring-es-7-2021.01.05 is actionable and remains in the list.
2021-01-08 19:54:07,925 DEBUG curator.indexlist __excludify:58 **Remains** in actionable list: index.lifecycle.name is not set for index .monitoring-es-7-2021.01.05
2021-01-08 19:54:07,925 DEBUG curator.indexlist __not_actionable:39 Index XXX_logs-2021.01.05-000019 is not actionable, removing from list.
Has anyone experienced this?
I can't see the link between indices having ILM policies and Curator not matching them.
I can't find a regex workaround that matches all my indices. With the same "filtertype: none" in curator_cli, everything is OK.
Thanks a lot
I just found it ><
"allow_ilm_indices: True" must be added to the options of the action in the action file so that indices managed by an ILM policy are included as well.
curator_cli has this option set to True by default, which is not the case for curator itself.
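The effect seen in the DEBUG log can be pictured with a small sketch. This is not Curator's code; it only mimics the behaviour the log describes, where an index with index.lifecycle.name set is dropped from the actionable list unless allow_ilm_indices is true. The function name and the input shape (a dict of index name to settings) are assumptions made for the illustration.

```python
def actionable_indices(index_settings, allow_ilm_indices=False):
    """Return the index names that remain actionable.

    index_settings maps an index name to a dict of its settings
    (hypothetical shape, chosen only for this illustration).
    """
    kept = []
    for name, settings in index_settings.items():
        ilm_policy = settings.get("index.lifecycle.name")
        # An ILM-managed index is skipped unless explicitly allowed.
        if ilm_policy and not allow_ilm_indices:
            continue
        kept.append(name)
    return kept


# Index names and the policy name are taken from the DEBUG log excerpt.
indices = {
    ".monitoring-es-7-2021.01.05": {},
    "XXX_supervision-server_logs-2020.12.31-000014": {
        "index.lifecycle.name": "XXX_supervision-server_logs-policy"
    },
}

print(actionable_indices(indices))
print(actionable_indices(indices, allow_ilm_indices=True))
```

With the default, only the dot-prefixed index survives, which matches the dry-run output in the question; with the flag enabled, both indices stay in the list.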
I have been trying to make use of the keyword needs (following the doc) to control the order of installation of the releases.
Here is my helmfile:
helmDefaults:
  createNamespace: false
  timeout: 600
helmBinary: /usr/local/bin/helm
releases:
  - name: dev-sjs-pg
    chart: ../helm_charts/sjs-pg
  - name: dev-sjs
    chart: ../helm_charts/sjs
    needs: ['dev-sjs-pgg']
Regarding versions:
helmfile version v0.139.9
helm version.BuildInfo{Version:"v3.5.4", GitCommit:"1b5edb69df3d3a08df77c9902dc17af864ff05d1", GitTreeState:"clean", GoVersion:"go1.15.11"}
When I run helmfile sync, both releases are installed simultaneously. In particular, there is no error caused by my spelling mistake (dev-sjs-pgg instead of dev-sjs-pg). It is as if needs is simply not read.
Could you help me understand what I am doing wrong, please?
I tried to reproduce this. When executing helmfile --log-level=debug sync I see in the debug log:
processing 2 groups of releases in this order:
GROUP RELEASES
1 dev-sjs-pg
2 dev-sjs
I also see that these are deployed one after another (with just a few seconds' difference, because I am deploying a fast nginx chart).
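The grouping shown in the debug log can be illustrated with a small dependency-layering sketch. This is not helmfile's implementation; it only shows the general idea of turning needs entries into ordered groups. Raising an error for a needs entry that names no known release (such as the misspelled dev-sjs-pgg) is a choice made in this sketch, not a statement about what helmfile v0.139.9 does.

```python
def group_releases(releases):
    """Order releases into groups so that every release comes after its needs.

    releases maps a release name to the list of release names it needs.
    """
    unknown = {
        need
        for needs in releases.values()
        for need in needs
        if need not in releases
    }
    if unknown:
        # Sketch-only behaviour: fail loudly on a misspelled dependency.
        raise ValueError(f"unknown releases in needs: {sorted(unknown)}")

    remaining = {name: set(needs) for name, needs in releases.items()}
    groups = []
    while remaining:
        # Releases whose needs are all satisfied form the next group.
        ready = sorted(name for name, needs in remaining.items() if not needs)
        if not ready:
            raise ValueError("dependency cycle between releases")
        groups.append(ready)
        for name in ready:
            del remaining[name]
        for needs in remaining.values():
            needs.difference_update(ready)
    return groups


print(group_releases({"dev-sjs-pg": [], "dev-sjs": ["dev-sjs-pg"]}))
```

For the two releases in the question (with the name spelled correctly), this yields dev-sjs-pg in the first group and dev-sjs in the second, the same order as in the debug log.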
I am trying to index my custom log file using Filebeat. I am successfully running Filebeat with pre-built modules like mysql, nginx, etc. But when I try to use it with my application-specific log file, the index is created with 0 documents.
I could not find anything in the Filebeat documentation about specific steps needed to ensure that custom log files get indexed.
I did not get any error when I set up Filebeat or when I ran it after setup.
Below is the filebeat.yml:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /Applications/MAMP/htdocs/247around-adminp-aws/application/logs/log-2020-12-21.log
  include_lines: ['^INFO', '^ERROR']
  fields:
    app_id: crm
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
setup.template.settings:
  index.number_of_shards: 1
setup.kibana:
output.elasticsearch:
  hosts: ["localhost:9200"]
processors:
As can be seen, it is mostly the default .yml file with very minor changes.
My custom log file log-2020-12-21.php is:
INFO - 2020-12-21 15:10:26 --> index Logging details have been captured for employee. Details are : Array
INFO - 2020-12-21 15:10:36 --> editpartner partner_id:1
INFO - 2020-12-21 15:10:36 --> SELECT DISTINCT service_id, brand, active
ERROR - 2020-12-21 15:10:36 --> Query error: Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'boloaaka.collateral.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
INFO - 2020-12-21 15:10:36 --> Database Error: A Database Error Occurred<br/>Array
ERROR - 2020-12-21 15:10:54 --> Query error: Expression #5 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'boloaaka.service_centres.district' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
INFO - 2020-12-21 15:10:54 --> Database Error: A Database Error Occurred<br/>Array
INFO - 2020-12-21 23:53:21 --> Loginindex
INFO - 2020-12-21 23:54:50 --> Loginindex
INFO - 2020-12-21 23:55:42 --> Loginindex
INFO - 2020-12-21 23:56:24 --> Loginindex
The index is created with 0 documents.
Log output from the Filebeat setup and from running Filebeat:
https://pastebin.com/TK6uYXuq
Please help:
Why are there no error messages when something is wrong and documents are not being indexed? I would expect some error if things are not right.
How should I index my log file?
Where should I add a pattern for my log file (for example key-value pairs) that would help me search the documents for relevant values later on?
Thanks for your help.
In your Filebeat configuration, are you sure you are referring to the exact file where your logs are stored? The 'paths' entry in your filebeat.yml refers to a file with a .log extension, while the custom log file you've pasted is log-2020-12-21.php. Try changing the path to match the .php extension instead.
If Filebeat picks this file up correctly, you should see something like the line below in your Filebeat logs:
INFO log/harvester.go:287 Harvester started for file: /Applications/MAMP/htdocs/247around-adminp-aws/application/logs/log-2020-12-21.php
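A quick way to rule out both causes (a path that matches no file, or include_lines patterns that match no line) is a small check outside Filebeat. The sketch below is only an approximation: it uses Python's re module rather than Filebeat's own regular expression engine, which behaves the same for simple anchored patterns such as ^INFO and ^ERROR. The helper names are made up for this example.

```python
import glob
import re


def is_included(line, patterns):
    """True if the line matches at least one include_lines pattern."""
    return any(re.search(pattern, line) for pattern in patterns)


def check_input(path_glob, patterns, sample_lines):
    """Return the files the path matches and the sample lines that would be kept."""
    files = glob.glob(path_glob)
    kept = [line for line in sample_lines if is_included(line, patterns)]
    return files, kept


patterns = ["^INFO", "^ERROR"]
samples = [
    "INFO - 2020-12-21 15:10:36 --> editpartner partner_id:1",
    "ERROR - 2020-12-21 15:10:36 --> Query error: ...",
]
files, kept = check_input(
    "/Applications/MAMP/htdocs/247around-adminp-aws/application/logs/log-2020-12-21.log",
    patterns,
    samples,
)
print("files matched:", files)
print("lines kept:", len(kept))
```

An empty list of matched files while the sample lines are kept points at the path, not at the patterns.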
I want to set the loggingService field of an existing container.v1.cluster through deployment-manager.
I have the following config:
resources:
- name: px-cluster-1
  type: container.v1.cluster
  properties:
    zone: europe-west1-b
    cluster:
      description: "dev cluster"
      initialClusterVersion: "1.13"
      nodePools:
      - name: cluster-pool
        config:
          machineType: "n1-standard-1"
          oauthScopes:
          - https://www.googleapis.com/auth/compute
          - https://www.googleapis.com/auth/devstorage.read_only
          - https://www.googleapis.com/auth/logging.write
          - https://www.googleapis.com/auth/monitoring
        management:
          autoUpgrade: true
          autoRepair: true
        initialNodeCount: 1
        autoscaling:
          enabled: true
          minNodeCount: 3
          maxNodeCount: 10
      ipAllocationPolicy:
        useIpAliases: true
      loggingService: "logging.googleapis.com/kubernetes"
      masterAuthorizedNetworksConfig:
        enabled: false
      locations:
      - "europe-west1-b"
      - "europe-west1-c"
When I try to run gcloud deployment-manager deployments update ..., I get the following error:
ERROR: (gcloud.deployment-manager.deployments.update) Error in Operation [operation-1582040492957-59edb819a5f3c-7155f798-5ba37285]: errors:
- code: NO_METHOD_TO_UPDATE_FIELD
message: No method found to update field 'cluster' on resource 'px-cluster-1' of
type 'container.v1.cluster'. The resource may need to be recreated with the new
field.
The same succeeds if I remove loggingService.
Is there a way to update loggingService using deployment-manager without deleting the cluster?
The error NO_METHOD_TO_UPDATE_FIELD is caused by "initialClusterVersion" being part of the update call sent to GKE. This field is only used when the cluster is created, and the type definition doesn't currently allow it to be updated later. Either keep it at its original value, where it has no effect on later deployments, or delete or comment out that line.
Even with that resolved, there is also no method to update the logging service; Deployment Manager doesn't have many update methods. Try using the gcloud command to update the cluster directly. Keep in mind that you have to set the monitoring service together with the logging service, so the command would look like:
gcloud container clusters update px-cluster-1 --logging-service=logging.googleapis.com/kubernetes --monitoring-service=monitoring.googleapis.com/kubernetes --zone=europe-west1-b
We're getting this message:
[2017-08-11T04:00:02,908][WARN ][r.suppressed ] path: /_snapshot/s3_currently/curator-20170811040002, params: {repository=s3_currently, wait_for_completion=true, snapshot=curator-20170811040002}
org.elasticsearch.snapshots.ConcurrentSnapshotExecutionException: [s3_currently:curator-20170811040002]a snapshot is already running
We've configured x-pack curator with two action files:
/home/curator/actions/currently.yml
---
actions:
  1:
    action: snapshot
    description: Create snapshot every 30 minutes.
    options:
      repository: s3_currently
      wait_for_completion: true
    filters:
    - filtertype: alias
      aliases: living
  2:
    action: delete_snapshots
    description: Remove recently snapshots
    options:
      repository: s3_currently
      retry_interval: 120
      retry_count: 3
    filters:
    - filtertype: count
      count: 48
And /home/curator/actions/currently-dev.yml:
---
actions:
  1:
    action: snapshot
    description: Create snapshot every hour for development.
    options:
      repository: s3_currently_dev
      wait_for_completion: true
    filters:
    - filtertype: alias
      aliases: living
  2:
    action: delete_snapshots
    description: Remove recently snapshots
    options:
      repository: s3_currently_dev
      retry_interval: 120
      retry_count: 3
    filters:
    - filtertype: count
      count: 24
We've added two cron jobs:
0 * * * * -> currently_dev
0,30 * * * * -> currently
Any ideas? It seems that Elasticsearch doesn't allow two snapshots to run concurrently, does it?
Elasticsearch does not allow for more than one snapshot to run at a time. The reason for this is that it is compelled to freeze the Lucene segments for the selected indices for the duration of the snapshot. It would be extremely taxing to the cluster to do this for multiple concurrent snapshots, not in terms of processing, but in terms of how it has to track all segments at all times. It must allow for new data to be indexed into new segments while others are locked/frozen for snapshotting. This could create a situation where there are too many open segments, which could deprive one or more nodes of needed memory resources. As a result, it's safer for Elasticsearch to only permit a single snapshot at a time.
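Since both cron jobs fire at minute 0, one way to avoid the collision is to check for a running snapshot before starting the next one (staggering the cron schedules is another). The sketch below assumes the cluster is reachable at http://127.0.0.1:9200, which is not stated in the question, and relies on GET _snapshot/_status returning a "snapshots" list of the snapshots currently running. The function names are made up for this example.

```python
import json
import urllib.request


def fetch_snapshot_status(base_url="http://127.0.0.1:9200"):
    """Fetch the status of currently running snapshots (base_url is an assumption)."""
    with urllib.request.urlopen(f"{base_url}/_snapshot/_status") as response:
        return json.load(response)


def snapshot_in_progress(status):
    """True if the parsed _snapshot/_status response lists any running snapshot."""
    return len(status.get("snapshots", [])) > 0


# Example with an already-parsed response, so no cluster is needed to try it.
busy = {"snapshots": [{"snapshot": "curator-20170811040002", "state": "STARTED"}]}
idle = {"snapshots": []}
print(snapshot_in_progress(busy), snapshot_in_progress(idle))
```

A wrapper script around the Curator call could skip or delay its run while snapshot_in_progress(fetch_snapshot_status()) is true.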
I need to automate restoring two or more snapshots to an Elasticsearch cluster.
I have to detect that the previous restore operation has completed before the next API call: _snapshot/<repository>/<snapshot>/_restore.
If I make the call while a snapshot is still being restored, the cluster responds with 503.
I tried to use the thread pool API while a restore operation was running:
curl -XGET 'http://127.0.0.1:9200/_cat/thread_pool?h=snapshot.active'
But it returns 0 anyway.
What is the proper way to get information about a restore operation that is currently running?
UPDATE:
An example of how I got it to work with Ansible:
- name: shell | restore latest snapshot
  uri:
    url: "http://127.0.0.1:9200/_snapshot/{{ es_snapshot_repository }}/snapshot_name/_restore"
    method: "POST"
    body: '{"index_settings":{"index.number_of_replicas": 0}}'
    body_format: json
- name: shell | get state of active recovering operations | log indices
  uri:
    url: "http://127.0.0.1:9200/_recovery?active_only"
    method: "GET"
  register: response
  until: "response.json == {}"
  retries: 6
  delay: 10
You can monitor the status of indices being restored using the Indices Recovery API.
The easiest way of doing this is to look at the stage property:
init: Recovery has not started
index: Reading index meta-data and copying bytes from source to destination
start: Starting the engine; opening the index for use
translog: Replaying transaction log
finalize: Cleanup
done: Complete
The active_only parameter returns info only about shards that are not in the done state:
http://127.0.0.1:9200/_recovery?active_only
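The same polling loop as the Ansible task can be sketched in Python. This is a minimal example under the assumptions already used in the thread (cluster at http://127.0.0.1:9200, an empty JSON object from _recovery?active_only meaning that nothing is recovering); the function names and the injectable fetch and sleep parameters are choices made for this sketch so that it can be tried without a cluster.

```python
import json
import time
import urllib.request


def fetch_active_recoveries(base_url="http://127.0.0.1:9200"):
    """Return the parsed response of the recovery API, limited to active shards."""
    with urllib.request.urlopen(f"{base_url}/_recovery?active_only") as response:
        return json.load(response)


def wait_for_restore(fetch=fetch_active_recoveries, retries=6, delay=10, sleep=time.sleep):
    """Poll until no recovery is active; return False if retries run out."""
    for _ in range(retries):
        if fetch() == {}:
            return True
        sleep(delay)
    return False


# Dry run with canned responses instead of a live cluster.
responses = iter([{"some_index": {"shards": [{"stage": "INDEX"}]}}, {}])
finished = wait_for_restore(fetch=lambda: next(responses), sleep=lambda seconds: None)
print(finished)
```

The defaults of 6 retries and a 10 second delay mirror the Ansible task; a large restore may need more of both before the next _restore call is issued.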