Here is my filebeat.yml:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - ../typescript/rate-limit-test/logs/*.log
  json.message_key: "message"
  json.keys_under_root: true
  json.overwrite_keys: true
  scan_frequency: 1s
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 1
logging.level: debug
output.elasticsearch:
  hosts: ["34.97.108.113:9200"]
  index: "filebeat-%{+yyyy-MM-dd}"
setup.template:
  name: 'filebeat'
  pattern: 'filebeat-*'
  enabled: true
setup.template.overwrite: true
setup.template.append_fields:
- name: time
  type: date
processors:
- drop_fields:
    fields: ["agent","host","ecs","input","log"]
setup.ilm.enabled: false
I changed scan_frequency, but Elasticsearch still doesn't receive the logs any faster.
How can I get logs into Elasticsearch instantly?
Please help me.
There will never be an 'instantly' available log line in Elasticsearch. The file first has to be checked for changes, then the newly added lines have to be sent to Elasticsearch in a bulk request and indexed into the appropriate shard on the correct cluster node. Network latency, TLS, authentication and authorization, and concurrent write/search load all affect how 'instant' this feels.
The speed of log ingestion and NRT (near-real-time) search depends on many factors and configuration options in both Elasticsearch and Filebeat.
Regarding tuning Elasticsearch for indexing speed, have a look at this documentation and apply whatever you have not tried yet. A brief overview:
Disable swapping and enable memory locking (bootstrap.memory_lock: true)
Consider reducing index.refresh_interval (defaults to 1s) for the index so that new documents become searchable sooner (at the cost of more IO in the cluster)
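A minimal sketch of those two settings, assuming bootstrap.memory_lock is permitted by your OS limits and that a 500ms refresh (a placeholder value, not a recommendation) is acceptable for your cluster:

# elasticsearch.yml
bootstrap.memory_lock: true

# per-index setting, applied here to the Filebeat indices from the question
PUT filebeat-*/_settings
{
  "index": { "refresh_interval": "500ms" }
}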
For Filebeat, there is also good documentation about tuning, but in general, I see the following options:
Try different output.elasticsearch.bulk_max_size values (defaults to a batch size of 50) and monitor the ingestion speed. Each cluster configuration has its own optimal settings.
In high-load scenarios, when logs are written fast, consider increasing the number of workers via output.elasticsearch.worker (defaults to 1)
In the opposite scenario, with only a few log lines being written, consider increasing the close_inactive and scan_frequency values for the harvester. Specifying a more suitable backoff also affects how aggressively Filebeat checks files for updates; see the sketch below.
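A hedged filebeat.yml sketch of these knobs, reusing the paths and host from the question; every numeric value is a placeholder to experiment with, not a recommendation:

output.elasticsearch:
  hosts: ["34.97.108.113:9200"]
  bulk_max_size: 200   # placeholder: try several batch sizes and monitor ingestion speed
  worker: 2            # placeholder: more parallel bulk requests under high load

filebeat.inputs:
- type: log
  paths:
    - ../typescript/rate-limit-test/logs/*.log
  scan_frequency: 1s
  backoff: 500ms       # placeholder: wait between re-reads of a file being harvested
  close_inactive: 5m   # placeholder: when to close a file that has gone quiet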
Related
With the Fluent Bit configuration below, we are getting errors from OpenSearch under heavy load.
HTTP bulk requests to OpenSearch from Fluent Bit (chart showing the 429 errors as a spike)
Fluent Bit config:
[INPUT]
    Name              tail
    Tag               kube.*
    Path              /var/log/containers/*.log
    DB                /var/log/flb_kube.db
    Mem_Buf_Limit     400M
    storage.type      filesystem
    Skip_Long_Lines   On
    Refresh_Interval  1
    Rotate_Wait       600
[OUTPUT]
    Name                      es
    Match                     kube.*
    Host                      ${ES_HOST}
    Port                      ${PORT}
    Buffer_Size               False
    AWS_Auth                  Off
    AWS_Role_ARN              ${ES_ARN}
    AWS_External_ID           ${ES_IAMROLE}
    HTTP_User                 ${ES_USER}
    HTTP_Passwd               ${ES_PASSWD}
    tls                       On
    tls.verify                Off
    Trace_Output              ${TRACE_OUTPUT}
    Trace_Error               On
    Replace_Dots              On
    Index                     fluentbit
    Type                      flb
    AWS_Region                ${AWS_REGION}
    Logstash_Format           On
    Logstash_Prefix           ${ES_LOGSTASHPREFIX}_app_log
    Logstash_DateFormat       %Y.%m.%d
    Retry_Limit               10
    storage.total_limit_size  1G
To resolve this, we upgraded our OpenSearch instance type from r5.xlarge.search (4 nodes) to r5.2xlarge.search (3 nodes), but that did not solve the issue.
We also increased the index refresh_interval to 60s, but that did not help.
We read that Fluent Bit's output to ES can be controlled via buffering, so we lowered Mem_Buf_Limit to 400M, but that did not help either.
Can someone suggest anything else we can try, or point out what we are missing?
The issue here is not with Fluent Bit but with OpenSearch/Elasticsearch.
The HTTP 429 errors (es_request_rejected_exception) occur when more requests are sent to the cluster than its thread pools can handle. Thread pools in OpenSearch are sized differently for different tasks, with search operations getting a larger share, and the option to manually modify thread pool allocation is not available in versions 5.1 and later.
You can try to resolve this in a few ways:
1: Refresh rate (you already did that and it didn't help).
2: Change the indexing speed: try to send logs at an interval greater than your current one.
3: Upscale (you did, and it didn't work either).
You can get an idea with the following formula for thread pools.
Number of thread pools allocated for writes = Number of Virtual CPUs (your case)
Number of thread pools allocated for search = ((3 * Number of virtual CPUs)/2) + 1
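As a worked example, assuming each r5.2xlarge.search data node exposes 8 vCPUs (an assumption about that instance type; check your actual node specs):

Number of threads for writes  = 8
Number of threads for search  = ((3 * 8) / 2) + 1 = 13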
So, I am guessing your issue here is a large number of shards. You can either decrease the number of shards per index, or, if you only hit this occasionally under extra load, set the replica count to 0 for that period and change it back to the original afterwards.
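A minimal sketch of that temporary replica change; my-app-log-* is a placeholder for whatever index pattern your Logstash_Prefix actually produces:

PUT my-app-log-*/_settings
{
  "index": { "number_of_replicas": 0 }
}

and, once the load has passed, back to your original replica count (1 here as an example):

PUT my-app-log-*/_settings
{
  "index": { "number_of_replicas": 1 }
}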
Check these two links to find out more about optimizing your ES domain.
indexing performance
Best practices
Is there a way to stop curator deleting the last index when deleting by time?
actions:
  1:
    action: delete_indices
    description: Delete kube- indices older than 14 days. Ignore the error if there are none and exit cleanly.
    options:
      disable_action: False
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: kube-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 14
I have this, and it works great for keeping the logs of a currently running K8s cluster in check. However, when we move AWS region, the index name changes, e.g. from kube-eu-west-1-<date> to kube-eu-west-2-<date>.
Curator diligently cleans up all the data after 14 days. What I'd like is to prevent it from removing the last entry for a particular index pattern, so there is always a record of what happened the last time the cluster was in that region.
(It would also "fix" some less well-written pieces of code that throw errors when the data they expect to be there has legitimately gone away.)
You could use the count filter:
filters:
  # your existing filters go BEFORE the count filter...
  - filtertype: count
    count: 1
This example should exclude the most recent index from the list of actionable indices, preserving it (for the count filter, exclude defaults to true). If this is not the index you want to keep, experiment with exclude: true/false and/or reverse: true/false (both default to true) until it excludes the index you want.
NOTE: Always use the --dry-run flag to test your filters before deploying them on actual data. Iterate until it looks right. Use of loglevel: DEBUG in your client settings will show how filters make their decisions, if that helps.
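For illustration, a sketch of the action file from the question with the count filter appended, so the single most recent index that the other filters matched is preserved; treat it as a starting point and verify it with --dry-run:

actions:
  1:
    action: delete_indices
    description: Delete kube- indices older than 14 days, but keep the most recent match.
    options:
      disable_action: False
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: kube-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 14
    - filtertype: count
      count: 1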
I have ZooKeeper and a single Kafka broker running, and I want to collect metrics with Metricbeat, index them with Elasticsearch and display them with Kibana.
However, Metricbeat only gets data from the partition metricset, and nothing comes from the consumergroup metricset.
Since the kafka module is defined as periodic in metricbeat.yml, shouldn't it send some data on its own, not just wait for user interaction (for example, a write to a topic)?
To make sure, I tried to create a consumer group and write to and consume from a topic, but still no data was collected by the consumergroup metricset.
consumergroup is defined in both metricbeat.template.json and metricbeat.template-es2x.json.
While metricbeat.full.yml is completely commented out, this is the kafka module definition in my metricbeat.yml:
- module: kafka
  metricsets: ["partition", "consumergroup"]
  enabled: true
  period: 10s
  hosts: ["localhost:9092"]
  client_id: metricbeat1
  retries: 3
  backoff: 250ms
  topics: []
In the /logs directory of Metricbeat, lines like this show up:
INFO Non-zero metrics in the last 30s:
libbeat.es.published_and_acked_events=109
libbeat.es.publish.write_bytes=88050
libbeat.publisher.messages_in_worker_queues=109
libbeat.es.call_count.PublishEvents=5
fetches.kafka-partition.events=106
fetches.kafka-consumergroup.success=2
libbeat.publisher.published_events=109
libbeat.es.publish.read_bytes=2701
fetches.kafka-partition.success=2
fetches.zookeeper-mntr.events=3
fetches.zookeeper-mntr.success=3
With ZooKeeper's mntr and Kafka's partition I can see both events= and success= values, but for consumergroup there is only success. It looks like no events are fired.
partition and mntr data are properly visible in Kibana, while consumergroup is missing.
The data stored in Elasticsearch is not human-readable; there are some internal strings used as directory names, and the logs do not contain any useful information.
Can anybody help me understand what is going on and fix it (probably Metricbeat) so that the data is sent to Elasticsearch? Thanks :)
You need to have an active consumer consuming from the topics in order to generate events for the consumergroup metricset.
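For example (a sketch: the topic name and group id are placeholders, and the broker address is taken from the question's config), keeping a console consumer attached creates the consumer group that the metricset reports on:

bin/kafka-console-consumer.sh --bootstrap-server localhost:9092 \
  --topic my-test-topic \
  --consumer-property group.id=metricbeat-demo-group

While this consumer is running and committing offsets, the consumergroup metricset should start emitting events on the next collection periods.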
How can I configure filebeat to only ship a percentage of logs (a sample if you will) to logstash?
In my application's log folder the logs are chunked into files of about 20 MB each. I want Filebeat to ship only about 1/300th of that log volume to Logstash.
I need to pare down the log volume before I send it over the wire, so I cannot do this filtering in Logstash; it needs to happen on the endpoint before it leaves the server.
I asked this question in the ES forum and someone said it was not possible with filebeat: https://discuss.elastic.co/t/ship-only-a-percentage-of-logs-to-logstash/77393/2
Is there really no way I can extend Filebeat to do this? Can nxlog or another product do this?
To the best of my knowledge, there is no way to do that with FileBeat. You can do it with Logstash, though.
filter {
  drop {
    percentage => 99.7
  }
}
This may be a use-case where you would use Logstash in shipping mode on the server, rather than FileBeat.
input {
  file {
    path => "/var/log/hugelogs/*.log"
    tags => [ 'sampled' ]
  }
}
filter {
  drop {
    percentage => 99.7
  }
}
output {
  tcp {
    host => 'logstash.prod.internal'
    port => '3390'
  }
}
It means installing Logstash on your servers. However, you configure it as minimally as possible: just an input, enough filters to get your desired effect, and a single output (TCP in this case, but it could be anything). Full filtering happens further down the pipeline.
There's no way to configure Filebeat to drop arbitrary events based on a probability, but Filebeat does have the ability to drop events based on conditions. There are two ways to filter events.
Filebeat has a way to specify lines to include or exclude when reading the file. This is the most efficient place to apply the filtering because it happens early. It is done using include_lines and exclude_lines in the config file.
filebeat.prospectors:
- paths:
    - /var/log/myapp/*.log
  exclude_lines: ['^DEBUG']
All Beats have "processors" that allow you to apply an action based on a condition. One action is drop_event, and the available conditions include regexp, contains, equals, and range.
processors:
- drop_event:
    when:
      regexp:
        message: '^DEBUG'
Apparently my Logstash OnDemand account does not work, so I could not post an issue there.
Anyway, I have a Logstash setup with Redis, Elasticsearch, and Kibana. My Logstash instances are collecting logs from several files and putting them into Redis just fine.
Logstash version 1.3.3
Elasticsearch version 1.0.1
The only thing I have set in elasticsearch_http for Logstash is the host name. The whole setup seems to glue together just fine.
The problem is that elasticsearch_http is not consuming the Redis entries as they come. Running it in debug mode, I see that it flushes about 100 entries every minute (the default values of flush_size and idle_flush_time). From what I understand, the documentation says it should force a flush even when the flush_size of 100 is not reached (for example, if we only had 10 messages in the last minute). But it seems to work the other way around: it only flushes about 100 messages every minute. I changed the size to 2000 and it flushes about 2000 every minute or so.
Here is my logstash-indexer.conf
input {
  redis {
    host => "1xx.xxx.xxx.93"
    data_type => "list"
    key => "testlogs"
    codec => json
  }
}
output {
  elasticsearch_http {
    host => "1xx.xxx.xxx.93"
  }
}
Here is my elasticsearch.yml
cluster.name: logger
node.name: "logstash"
transport.tcp.port: 9300
http.port: 9200
discovery.zen.ping.unicast.hosts: ["1xx.xxx.xxx.93:9300"]
discovery.zen.ping.multicast.enabled: false
#discovery.zen.ping.unicast.enabled: true
network.bind_host: 1xx.xxx.xxx.93
network.publish_host: 1xx.xxx.xxx.93
The indexer, Elasticsearch, Redis, and Kibana are on the same server. The log collection from files is done on another server.
So I'm going to suggest a couple of different approaches to solve your problem. Logstash, as you are discovering, can be a bit quirky, so I've found these approaches useful when dealing with unexpected behavior from it.
Use the elasticsearch output instead of elasticsearch_http. You can get the same functionality by using the elasticsearch output with protocol set to http. The elasticsearch output is more mature (milestone 2 vs milestone 3) and I've seen this change make a difference before.
Set idle_flush_time and flush_size explicitly. There have been issues with Logstash defaults previously, and I've found it a lot safer to set them explicitly. idle_flush_time is in seconds, flush_size is the number of records to flush; see the sketch after this list.
Upgrade to a more recent version of Logstash. There is enough of a change in how Logstash is deployed with version 1.4.x (http://logstash.net/docs/1.4.1/release-notes) that I'd bite the bullet and upgrade. It is also significantly easier to get attention if you still have a problem on the most recent stable major release.
Make certain your Redis version matches those supported by your Logstash version.
Experiment with setting the batch, batch_events and batch_timeout values for the Redis output. You are using the list data_type; list supports various batch options, and as with some other parameters it's best not to assume the defaults are always set correctly.
Do all of the above. In addition to trying the first suggestions individually, I'd try them together in various combinations. Keep careful records of each test run. It seems obvious, but with all the variations above it's easy to lose track, so change only one variable at a time.
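A minimal sketch of the first two suggestions combined, assuming an upgrade to Logstash 1.4.x (where the elasticsearch output gained a protocol option); the flush_size and idle_flush_time values are placeholders to tune, not recommendations:

output {
  elasticsearch {
    host => "1xx.xxx.xxx.93"
    protocol => "http"      # use HTTP instead of joining the cluster as a node
    flush_size => 500       # placeholder: flush after this many buffered events
    idle_flush_time => 1    # placeholder: also flush after this many idle seconds
  }
}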