How to configure GCS as filebeat input - elasticsearch

We are storing our audit logs in a GCS bucket. We would like to ingest them into Elasticsearch on demand - not on a regular schedule - using Filebeat. I have checked the S3 option, which lets us use S3-compatible storage as an input via the provider/endpoint settings.
I'm using the following configuration, but it is not writing any data. When I test the Filebeat configuration it passes, yet the input stops working.
Here is the warning from the logs:
WARN [aws-s3] awss3/config.go:54 neither queue_url nor bucket_arn were provided, input aws-s3 will stop
INFO [crawler] beater/crawler.go:141 Starting input (ID: 17738867761700079737)
INFO [crawler] beater/crawler.go:108 Loading and starting Inputs completed. Enabled inputs: 1
INFO [input.aws-s3] compat/compat.go:111 Input aws-s3 starting {"id": "F62D1E3EA5C30879"}
INFO [input.aws-s3] compat/compat.go:124 Input 'aws-s3' stopped {"id": "F62D1E3EA5C30879"}
I suspect my input configuration is wrong in some way. Please check the following and help me understand what's wrong:
filebeat.inputs:
- type: aws-s3
  non_aws_bucket_name: test-bucket
  number_of_workers: 5
  bucket_list_interval: 300s
  access_key_id: xxxxx
  secret_access_key: xxxxxxxx
  endpoint: https://storage.googleapis.com
output.elasticsearch:
  hosts: "https://es-test-xxx.aivencloud.com"
  username: "avnadmin"
  password: "xxxxx"
  indices:
    - index: 'restore-test'

Related

Filebeat Kubernetes cannot output to ElasticSearch

Filebeat on Kubernetes cannot output to Elasticsearch, although Elasticsearch itself is OK.
Filebeat runs as a DaemonSet, and the relevant environment variables have been added.
filebeat.yml
filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      hints.default_config:
        enabled: false
        type: container
        paths:
          - /var/log/containers/*-${data.container.id}.log
output.elasticsearch:
  hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
  username: ${ELASTICSEARCH_USERNAME}
  password: ${ELASTICSEARCH_PASSWORD}
Kubernetes
Use an nginx app to test:
image=nginx:latest
Deployment annotations have been added.
co.elastic.logs/enabled: "true"
pod.yaml (in node1)
But nothing is output to Elasticsearch; no logs or indexes for the related input are seen.
filebeat pod (node1) logs
I expect Filebeat to collect the logs of the specified container (Pod) and ship them to Elasticsearch.
#baymax first off, you don't need to explicitly define the property anywhere:
co.elastic.logs/enabled: "true"
since filebeat, by default, reads all the container log files on the node.
Secondly, you are disabling hints.default_config, which ensures Filebeat will only read the log files of pods that are annotated as above; however, you haven't provided any template config to be used for reading such log files (see the sketch below for one common shape).
For more info, read: https://www.elastic.co/guide/en/beats/filebeat/current/configuration-autodiscover-hints.html
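As an illustration only (not taken verbatim from the question), a hints-based setup along the lines of the linked docs could look like this, with the default template only activated for annotated pods:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      node: ${NODE_NAME}
      hints.enabled: true
      # With enabled: false, this template is only applied to pods carrying
      # the co.elastic.logs/enabled: "true" annotation.
      hints.default_config:
        enabled: false
        type: container
        paths:
          - /var/log/containers/*-${data.container.id}.log

On the workload side, that annotation goes into the Deployment's pod template, e.g.:

metadata:
  annotations:
    co.elastic.logs/enabled: "true"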
Thirdly, in your Filebeat logs, do you see any harvesters being started, handles created and events published? Posting a snapshot of logs doesn't give a clear picture. Maybe try starting Filebeat in debug mode for a few minutes and paste the logs here with proper formatting.

Filebeat reads all logs, not only the one defined in the configuration

I am trying to configure Filebeat version 7.17.5 (amd64), libbeat 7.17.5, to read Spring Boot logs and send them via Logstash to Elasticsearch. Everything works - logs are sent and I can read them in Kibana - but the problem is that I configured Filebeat in /etc/filebeat/filebeat.yml and defined only one source of logs there, yet Filebeat still picks up all the logs from /var/log.
This is my only input configuration:
filebeat.inputs:
- type: filestream
  id: some_id
  enabled: true
  paths:
    - "/var/log/dir_with_logs/application.log"
But when I check the status of Filebeat I see the following:
[input] log/input.go:171 Configured paths: [/var/log/auth.log* /var/log/secure*]
And I also get logs from the auth and secure files in Kibana, which I don't want.
What am I doing wrong, or what am I missing?
Based on the configured paths of /var/log/auth.log* and /var/log/secure*, I think this is the Filebeat system module. You can disable the system module by renaming /etc/filebeat/modules.d/system.yml to /etc/filebeat/modules.d/system.yml.disabled.
Alternatively you can run the filebeat modules command to disable the module (it simply renames the file for you).
filebeat modules disable system
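If it helps, you can also check which modules are currently enabled before and after the change; the restart step below assumes a systemd-based install:

filebeat modules list              # shows enabled and disabled modules
filebeat modules disable system    # same effect as renaming system.yml
sudo systemctl restart filebeat    # pick up the change (systemd assumed)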

How to decode JSON in ElasticSearch load pipeline

I set up Elasticsearch on AWS and I am trying to load an application log into it. The twist is that each application log entry is in JSON format, like
{"EventType":"MVC:GET:example:6741/Common/GetIdleTimeOut","StartDate":"2021-03-01T20:46:06.1207053Z","EndDate":"2021-03-01","Duration":5,"Action":{"TraceId":"80001266-0000-ac00-b63f-84710c7967bb","HttpMethod":"GET","FormVariables":null,"UserName":"ZZZTHMXXN"} ...}
So, I am trying to unwrap it. The Filebeat docs suggest that there is a decode_json_fields processor; however, I am getting the message field in Kibana as a single JSON string; nothing is unwrapped.
I am new to Elasticsearch, but I am not using that as an excuse to skip doing my own analysis first - only as an explanation of why I am not sure which information is helpful for answering the question.
Here is filebeat.yml:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/opt/logs/**/*.json

processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
  - decode_json_fields:
      fields: ["message"]

output.logstash:
  hosts: ["localhost:5044"]
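As an aside, decode_json_fields accepts a few options beyond fields; here is a minimal sketch of a more explicit processor block (the target and overwrite_keys values are illustrative, not taken from the question):

processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""              # decode into the root of the event
      overwrite_keys: true    # replace existing keys on conflict
      add_error_key: true     # add an error field if decoding fails
      max_depth: 1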
And here is Logstash configuration file:
input {
  beats {
    port => "5044"
  }
}
output {
  elasticsearch {
    hosts => ["https://search-blah-blah.us-west-2.es.amazonaws.com:443"]
    ssl => true
    user => "user"
    password => "password"
    index => "my-logs"
    ilm_enabled => false
  }
}
I am still trying to understand the filtering and grok parts of Logstash, but it seems that it should work the way it is. Also, I am not sure where the actual message tag comes from (probably from Logstash or Filebeat), but it seems irrelevant as well.
UPDATE: the AWS documentation doesn't give an example of loading through Filebeat alone, without Logstash.
If I don't use Logstash (just Filebeat) and have the following section in filebeat.yml:
output.elasticsearch:
  hosts: ["https://search-bla-bla.us-west-2.es.amazonaws.com:443"]
  protocol: "https"
  #index: "mylogs"
  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  username: "username"
  password: "password"
I am getting the following errors:
If I use index: "mylogs", I get: setup.template.name and setup.template.pattern have to be set if index name is modified
And if I don't use index (where would it go in ES then?) -
Failed to connect to backoff(elasticsearch(https://search-bla-bla.us-west-2.es.amazonaws.com:443)): Connection marked as failed because the onConnect callback failed: cannot retrieve the elasticsearch license from the /_license endpoint, Filebeat requires the default distribution of Elasticsearch. Please make the endpoint accessible to Filebeat so it can verify the license.: unauthorized access, could not connect to the xpack endpoint, verify your credentials
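For context, the first error points at the template settings that must accompany a custom index name; a hedged sketch of what they can look like in filebeat.yml (the name and pattern are illustrative):

output.elasticsearch:
  hosts: ["https://search-bla-bla.us-west-2.es.amazonaws.com:443"]
  index: "mylogs-%{+yyyy.MM.dd}"

setup.template.name: "mylogs"
setup.template.pattern: "mylogs-*"
setup.ilm.enabled: false    # may also be needed so ILM does not override the custom index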
If transmitting via Logstash works in general, add a filter block as Val proposed in the comments and use the json plugin/filter: elastic.co/guide/en/logstash/current/plugins-filters-json.html - it automatically parses the JSON into Elasticsearch fields.
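A minimal sketch of such a filter block, assuming the JSON document arrives in the message field:

filter {
  json {
    source => "message"
    # target => "app"   # optional: put the parsed fields under a sub-field instead of the event root
  }
}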

Filebeat harvesting problems

I've been facing a problem for a while now. My Filebeat instance does not harvest the files that I ask it to harvest in my config file. I'm using Filebeat 7.6.0.
My conf:
filebeat.inputs:
- type: log
  paths:
    - /var/log/user.log
    - /var/log/slapd.log

output.kafka:
  hosts: ["kafka1:9092"]
  topic: 'log'
  partition.round_robin:
    reachable_only: false
  required_acks: 1
  compression: gzip
  max_message_bytes: 10000000
The log output:
|2020-02-14T07:55:58.664Z|INFO|crawler/crawler.go:72|Loading Inputs: 1,
|2020-02-14T07:55:58.665Z|INFO|log/input.go:152|Configured paths: [/var/log/user.log /var/log/slapd.log],|
|2020-02-14T07:55:58.665Z|INFO|input/input.go:114|Starting input of type: log; ID: 6297130742941599674 ,|
|2020-02-14T07:55:58.665Z|INFO|crawler/crawler.go:106|Loading and starting Inputs completed. Enabled inputs: 1,
|2020-02-14T07:56:00.664Z INFO [monitoring] log/log.go:145 Non-zero metrics in the last 30s {"monitoring": {"metrics": {"beat":{"cpu":{"system":{"ticks":250,"time":{"ms":8}},"total":{"ticks":390,"time":{"ms":16},"value":390},"user":{"ticks":140,"time":{"ms":8}}},"handles":{"limit":{"hard":1048576,"soft":1048576},"open":6},"info":{"ephemeral_id":"a601336e-8252-460f-9a25-f05dad5851b2","uptime":{"ms":480275}},"memstats":{"gc_next":8594432,"memory_alloc":5169696,"memory_total":17158072},"runtime":{"goroutines":20}},"filebeat":{"harvester":{"open_files":0,"running":0}},"libbeat":{"config":{"module":{"running":0}},"pipeline":{"clients":1,"events":{"active":0}}},"registrar":{"states":{"current":0}},"system":{"load":{"1":0.21,"15":1.06,"5":0.88,"norm":{"1":0.105,"15":0.53,"5":0.44}}}}}}
I've been trying to solve it by pointing it at various files, but without success so far. I always get this: "filebeat":{"harvester":{"open_files":0,"running":0}}
Thanks!
First, rule out that the issue is with the Kafka output. For testing, set up the Filebeat file output and check whether you are getting any data or not.
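A minimal sketch of such a test, temporarily replacing the Kafka output (Filebeat allows only one output at a time); the path and filename here are illustrative:

output.file:
  path: "/tmp/filebeat"
  filename: filebeat-test
  #rotate_every_kb: 10000
  #number_of_files: 7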

Filebeat is processing all the logs instead of the specified application logs

I have an app server where I have configured Filebeat (through Chef) to extract the logs and publish them to Logstash (a separate ELK server), and subsequently to Elasticsearch and Kibana.
I have configured Filebeat to process logs only from /opt/app_logs/*.log, but it seems to be reading logs from other locations too: in the /etc/filebeat configuration directory I have filebeat.full.yml and other yml files that were generated automatically, and they contain all those other file locations. Because of the huge volume of logs, the Logstash service runs out of memory within minutes (as seen in logstash.log). How can I stop the other yml files from being auto-generated?
I tried to remove this file and also tried to comment out all the /var/log paths from the prospectors, but then Filebeat itself does not start.
filebeat.yml file:
filebeat:
  prospectors: []
  registry_file: "/var/lib/filebeat/registry"
  config_dir: "/etc/filebeat"
output:
  logstash:
    hosts:
      - elk_host:5044
    index: logstash-filebeat
shipper:
  name: serverA
  tags:
    - A
logging:
  to_files: 'true'
  files:
    path: "/var/log/filebeat"
    name: filebeat_log
    rotateeverybytes: '10485760'
  level: info
prospectors:
  - paths:
      - "/opt/app_logs/*.log"
    encoding: plain
    input_type: log
    ignore_older: 24h
The main problem with your configuration is that for Filebeat 1.2.3 you have the prospectors list defined twice, and the second one is not in the correct location.
The second problem is that you have defined config_dir as /etc/filebeat. config_dir is used to specify an additional directory in which to look for config files. It should never be set to /etc/filebeat because this is where the main config file should be located. See https://stackoverflow.com/a/39987501/503798 for usage information.
A third problem is that you have used string types in to_files and rotateeverybytes. They should be boolean and integer types respectively.
Here's how the config should look for Filebeat 1.x.
filebeat:
  registry_file: "/var/lib/filebeat/registry"
  config_dir: "/etc/filebeat/conf.d"
  prospectors:
    - paths:
        - "/opt/app_logs/*.log"
      encoding: plain
      input_type: log
      ignore_older: 24h
output:
  logstash:
    hosts:
      - elk_host:5044
    index: logstash-filebeat
shipper:
  name: serverA
  tags:
    - A
logging:
  to_files: true
  files:
    path: "/var/log/filebeat"
    name: filebeat_log
    rotateeverybytes: 10485760
  level: info
I highly recommend that you upgrade to Filebeat 5.x because it has better configuration validation using filebeat -configtest.
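For reference, a quick validation run on an older Filebeat might look like this (the config path is the usual default and is assumed here):

filebeat -c /etc/filebeat/filebeat.yml -configtest -e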
