Duplicate and missing log entries with FluentBit and ES - elasticsearch

We're using FluentBit to ship microservice logs into ES and recently found an issue in one of the environments: some log entries are duplicated (up to several hundred times) while other entries are missing in ES/Kibana but can be found in the microservice's container (kubectl logs my-pod -c my-service).
Each duplicate log entry has a unique _id and _fluentBitTimestamp so it really looks like the problem is on FluentBit's side.
The FluentBit version is 1.5.6, and the configuration is:
[SERVICE]
    Flush            1
    Daemon           Off
    Log_Level        info
    Log_File         /fluent-bit/log/fluent-bit.log
    Parsers_File     /fluent-bit/etc/parsers.conf
    Parsers_File     /fluent-bit/etc/parsers_java.conf

[INPUT]
    Name             tail
    Path             /home/xng/log/*.log
    Exclude_Path     /home/xng/log/*.zip
    Parser           json
    Buffer_Max_Size  128k

[FILTER]
    Name             record_modifier
    Match            *
    Record           hostname ${HOSTNAME}

[OUTPUT]
    Name             es
    Match            *
    Host             es-logging-service
    Port             9210
    Type             flink-logs
    Logstash_Format  On
    Logstash_Prefix  test-env-logstash
    Time_Key         _fluentBitTimestamp
Any help would be much appreciated.

We had the same problem. Can you try adding this to your output configuration?
Write_operation upsert
If a log has a duplicate _id it will then be updated instead of created.
Please note that Id_Key or Generate_ID is required in the update and upsert scenarios.
https://docs.fluentbit.io/manual/pipeline/outputs/elasticsearch#write_operation
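For reference, a minimal sketch of how the [OUTPUT] section could look with this change (the last two lines are the additions; these options may require a newer Fluent Bit version than 1.5.6, so check the documentation page above for your version):

[OUTPUT]
    Name             es
    Match            *
    Host             es-logging-service
    Port             9210
    Type             flink-logs
    Logstash_Format  On
    Logstash_Prefix  test-env-logstash
    Time_Key         _fluentBitTimestamp
    Generate_ID      On
    Write_Operation  upsert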

Related

Kubernetes adds a timestamp to every docker container log entry

I have 2 clusters:
Server Version: version.Info{Major:"1", Minor:"20+", GitVersion:"v1.20.11-eks-f17b81", GitCommit:"f17b810c9e5a82200d28b6210b458497ddfcf31b", GitTreeState:"clean", BuildDate:"2021-10-15T21:46:21Z", GoVersion:"go1.15.15", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.6-gke.1500", GitCommit:"7ce0f9f1939dfc1aee910732e84cba03840df91e", GitTreeState:"clean", BuildDate:"2021-11-17T09:30:26Z", GoVersion:"go1.16.9b7", Compiler:"gc", Platform:"linux/amd64"}
I use fluent-bit to tail container log files and push the logs to elasticsearch.
In the 1st k8s cluster, the container log format is:
{"log":"{\"method\":\"GET\",\"path\":\"/healthz\",\"format\":\"*/*\",\"controller\":\"Api::ApplicationController\",\"action\":\"healthz\",\"status\":204,\"duration\":0.61,\"view\":0.0,\"request_id\":\"4d54cc06-08d2-4487-b2d9-fabfb2286e89\",\"headers\":{\"SCRIPT_NAME\":\"\",\"QUERY_STRING\":\"\",\"SERVER_PROTOCOL\":\"HTTP/1.1\",\"SERVER_SOFTWARE\":\"puma 5.4.0 Super Flight\",\"GATEWAY_INTERFACE\":\"CGI/1.2\",\"REQUEST_METHOD\":\"GET\",\"REQUEST_PATH\":\"/healthz\",\"REQUEST_URI\":\"/healthz\",\"HTTP_VERSION\":\"HTTP/1.1\",\"HTTP_HOST\":\"192.168.95.192:80\",\"HTTP_USER_AGENT\":\"kube-probe/1.20+\",\"HTTP_ACCEPT\":\"*/*\",\"HTTP_CONNECTION\":\"close\",\"SERVER_NAME\":\"192.168.95.192\",\"SERVER_PORT\":\"80\",\"PATH_INFO\":\"/healthz\",\"REMOTE_ADDR\":\"192.168.79.131\",\"ROUTES_19640_SCRIPT_NAME\":\"\",\"ORIGINAL_FULLPATH\":\"/healthz\",\"ORIGINAL_SCRIPT_NAME\":\"\"},\"params\":{\"controller\":\"api/application\",\"action\":\"healthz\"},\"response\":{},\"custom\":{},\"#version\":\"dutycast-b2c-backend-v1.48.0-rc.5\",\"#timestamp\":\"2022-03-04T11:16:14.236Z\",\"message\":\"[204] GET /healthz (Api::ApplicationController#healthz)\"}\n","stream":"stdout","time":"2022-03-04T11:16:14.238067813Z"}
It is in JSON format and I can parse it easily using a fluent-bit parser.
I do the same for the 2nd k8s cluster, but the container log format is:
2022-03-04T11:19:24.050132912Z stdout F {"method":"GET","path":"/healthz","format":"*/*","controller":"Public::PublicPagesController","action":"healthz","status":204,"duration":0.52,"view":0.0,"request_id":"bcc799bb-5e5c-4758-9169-ecebb04b801f","headers":{"SCRIPT_NAME":"","QUERY_STRING":"","SERVER_PROTOCOL":"HTTP/1.1","SERVER_SOFTWARE":"puma 5.6.2 Birdie's Version","GATEWAY_INTERFACE":"CGI/1.2","REQUEST_METHOD":"GET","REQUEST_PATH":"/healthz","REQUEST_URI":"/healthz","HTTP_VERSION":"HTTP/1.1","HTTP_HOST":"10.24.0.22:3000","HTTP_USER_AGENT":"kube-probe/1.21","HTTP_ACCEPT":"*/*","HTTP_CONNECTION":"close","SERVER_NAME":"10.24.0.22","SERVER_PORT":"3000","PATH_INFO":"/healthz","REMOTE_ADDR":"10.24.0.1","ROUTES_71860_SCRIPT_NAME":"","ORIGINAL_FULLPATH":"/healthz","ORIGINAL_SCRIPT_NAME":"","ROUTES_71820_SCRIPT_NAME":""},"params":{"controller":"public/public_pages","action":"healthz"},"custom":null,"request_time":"2022-03-04T11:19:24.048+00:00","process_id":8,"#version":"vcam-backend-v0.1.0-rc24","response":"#\u003cActionDispatch::Response:0x00007f9d1f600888 #mon_data=#\u003cMonitor:0x00007f9d1f600838\u003e, #mon_data_owner_object_id=144760, #header={\"X-Frame-Options\"=\u003e\"ALLOW-FROM https://vietcapital.com.vn\", \"X-XSS-Protection\"=\u003e\"0\", \"X-Content-Type-Options\"=\u003e\"nosniff\", \"X-Download-Options\"=\u003e\"noopen\", \"X-Permitted-Cross-Domain-Policies\"=\u003e\"none\", \"Referrer-Policy\"=\u003e\"strict-origin-when-cross-origin\"}, #stream=#\u003cActionDispatch::Response::Buffer:0x00007f9d1f6045a0 #response=#\u003cActionDispatch::Response:0x00007f9d1f600888 ...\u003e, #buf=[\"\"], #closed=false, #str_body=nil\u003e, #status=204, #cv=#\u003cMonitorMixin::ConditionVariable:0x00007f9d1f600720 #monitor=#\u003cMonitor:0x00007f9d1f600838\u003e, #cond=#\u003cThread::ConditionVariable:0x00007f9d1f6006f8\u003e\u003e, #committed=false, #sending=false, #sent=false, #cache_control={}, #request=#\u003cActionDispatch::Request GET \"http://10.24.0.22:3000/healthz\" for 10.24.0.1\u003e\u003e","#timestamp":"2022-03-04T11:19:24.049Z","message":"[204] GET /healthz (Public::PublicPagesController#healthz)"}
In both cases we have the same logging config for the service and use the same fluent-bit version and the same elasticsearch version; only the k8s cluster is different. In the 2nd case, something like a timestamp has been inserted at the beginning of every log entry, and I cannot parse this format because it is not JSON.
I think kubernetes adds a default option to the docker container's logs (https://docs.docker.com/engine/reference/commandline/logs/).
How can I get the log parsed as JSON in the 2nd case?
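The 2nd format looks like the containerd/CRI log layout (time, stream, a tag such as F, then the message) rather than Docker's JSON log driver, so the line needs a regex parser before the JSON payload can be decoded. A minimal sketch of such a parser (assuming the 2nd cluster uses a CRI runtime; the parser name is a placeholder to adapt):

[PARSER]
    Name        cri
    Format      regex
    Regex       ^(?<time>[^ ]+) (?<stream>stdout|stderr) (?<logtag>[^ ]*) (?<message>.*)$
    Time_Key    time
    Time_Format %Y-%m-%dT%H:%M:%S.%L%z

Using this as the tail input's Parser, the captured message field can then be decoded as JSON, for example with a parser filter (Key_Name message, Parser json).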

Filebeat index is getting created but with 0 documents

I am trying to index my custom log file using filebeat. I am successfully running filebeat with pre-built modules like mysql, nginx etc., but when I try to use it with my application-specific log file, the index is created with 0 documents.
I could not find anywhere in the filebeat documentation whether any specific steps need to be taken to ensure indexing takes place for custom log files.
I did not get any errors when I set up filebeat or when I ran filebeat after setup.
Below is the filebeat.yml:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /Applications/MAMP/htdocs/247around-adminp-aws/application/logs/log-2020-12-21.log
  include_lines: ['^INFO', '^ERROR']
  fields:
    app_id: crm

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml

setup.template.settings:
  index.number_of_shards: 1

setup.kibana:

output.elasticsearch:
  hosts: ["localhost:9200"]

processors:
As can be seen, it is mostly the default .yml file with very minor changes.
My custom log file log-2020-12-21.php is:
INFO - 2020-12-21 15:10:26 --> index Logging details have been captured for employee. Details are : Array
INFO - 2020-12-21 15:10:36 --> editpartner partner_id:1
INFO - 2020-12-21 15:10:36 --> SELECT DISTINCT service_id, brand, active
ERROR - 2020-12-21 15:10:36 --> Query error: Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'boloaaka.collateral.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
INFO - 2020-12-21 15:10:36 --> Database Error: A Database Error Occurred<br/>Array
ERROR - 2020-12-21 15:10:54 --> Query error: Expression #5 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'boloaaka.service_centres.district' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
INFO - 2020-12-21 15:10:54 --> Database Error: A Database Error Occurred<br/>Array
INFO - 2020-12-21 23:53:21 --> Loginindex
INFO - 2020-12-21 23:54:50 --> Loginindex
INFO - 2020-12-21 23:55:42 --> Loginindex
INFO - 2020-12-21 23:56:24 --> Loginindex
The index is getting created, but with 0 documents:
Log file showing the output from filebeat setup and from filebeat running:
https://pastebin.com/TK6uYXuq
Please help:
Why are there no error messages if something is wrong that prevents documents from being indexed? I should be getting some error if things are not right.
How should I index my log file?
Where should I add a pattern for my log file (such as key-value pairs) that would help me search the documents for relevant values later on?
Thanks for your help.
In your filebeat configuration, are you sure you are referring to the exact file where your logs are stored? The 'paths' entry in your filebeat.yml refers to a .log file extension, while the custom log file you've pasted is log-2020-12-21.php. Try changing your paths to match the .php extension instead.
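For example, the input section could look like this (a sketch based on the path in your config; the wildcard variant is only an option if a new file is written each day):

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /Applications/MAMP/htdocs/247around-adminp-aws/application/logs/log-2020-12-21.php
    # or, for daily rotated files:
    # - /Applications/MAMP/htdocs/247around-adminp-aws/application/logs/log-*.php
  include_lines: ['^INFO', '^ERROR']
  fields:
    app_id: crm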
If filebeat correctly picks this file up, you should see something like the line below in your filebeat logs:
INFO log/harvester.go:287 Harvester started for file: /Applications/MAMP/htdocs/247around-adminp-aws/application/logs/log-2020-12-21.php

Loading logs on one machine into elasticsearch set up on another machine using logstash

I have my logs and logstash running on one EC2 machine (M1), so I read the logs stored locally on that machine with this config:
input {
  file {
    path => "/path/to/logs/in/M1"
    start_position => "beginning"
  }
}
Now, we have elasticsearch running on a different EC2 machine (M2) and I need to transfer the logs from M1 to elasticsearch on M2 using logstash. I used the following output config:
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => "http://<M2 ip address>:9200"
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
When I run the config file, I get the following error:
04:18:57.640 [[main]>worker0] WARN logstash.outputs.elasticsearch - UNEXPECTED POOL ERROR {:e=>#<LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError: No Available connections>}
04:18:57.646 [[main]>worker0] ERROR logstash.outputs.elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>2}
04:18:59.682 [[main]>worker0] WARN logstash.outputs.elasticsearch - UNEXPECTED POOL ERROR {:e=>#<LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError: No Available connections>}
04:18:59.686 [[main]>worker0] ERROR logstash.outputs.elasticsearch - Attempted to send a bulk request to elasticsearch, but no there are no living connections in the connection pool. Perhaps Elasticsearch is unreachable or down? {:error_message=>"No Available connections", :class=>"LogStash::Outputs::ElasticSearch::HttpClient::Pool::NoConnectionAvailableError", :will_retry_in_seconds=>4}
04:19:01.109 [Ruby-0-Thread-17: /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-5.4.0-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:188] WARN logstash.outputs.elasticsearch - Attempted to resurrect connection to dead ES instance, but got an error. {:url=>#<URI::HTTP:0x1d08c988 URL:http://10.60.40.120:9200>, :error_type=>LogStash::Outputs::ElasticSearch::HttpClient::Pool::HostUnreachableError, :error=>"Elasticsearch Unreachable: [http://10.60.40.120:9200][Manticore::ConnectTimeout] connect timed out"}
04:19:02.111 [Ruby-0-Thread-17: /usr/share/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-5.4.0-java/lib/logstash/outputs/elasticsearch/http_client/pool.rb:188] INFO logstash.outputs.elasticsearch - Running health check to see if an Elasticsearch connection is working {:url=>#<URI::HTTP:0x55444fcf URL:http://10.60.40.120:9200>, :healthcheck_path=>"/"}
I am new to logstash. Any help is appreciated.
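The errors indicate that logstash on M1 cannot reach elasticsearch on M2 at all. A quick connectivity check from M1 (a sketch, assuming curl is installed) is:

curl -v http://<M2 ip address>:9200

If this does not return the elasticsearch banner JSON, the security group or the network.host binding on M2 needs to be fixed before any logstash change will help.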
UPDATE:
So I looked around in forums and found one solution which told me to update the logstash elasticsearch output plugin using the command:
sudo /usr/share/logstash/bin/logstash-plugin update logstash-output-elasticsearch
I also updated the logstash config file to include username and password:
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["<M2 ip address>"]
    user => 'username'
    password => 'changeme'
    index => "logstash-%{+YYYY.MM.dd}"
    manage_template => false
  }
}
Now I'm getting a different error. Please help:
09:16:21.305 [[main]>worker0] WARN logstash.outputs.elasticsearch - Could not index event to Elasticsearch. {:status=>404, :action=>["index", {:_id=>nil, :_index=>"logstash-2017.04.17", :_type=>"Messagelog", :_routing=>nil}, 2017-04-17T10:06:11.348Z ip-10-60-40-201 No valid licenses found for COLL], :response=>{"index"=>{"_index"=>"logstash-2017.04.17", "_type"=>"Messagelog", "_id"=>nil, "status"=>404, "error"=>{"type"=>"index_not_found_exception", "reason"=>"no such index and [action.auto_create_index] ([.security,.monitoring*,.watches,.triggered_watches,.watcher-history*]) doesn't match", "index_uuid"=>"_na_", "index"=>"logstash-2017.04.17"}}}}
Thanks.
It looks like you have disabled auto-creation of indices on elasticsearch. By default elasticsearch allows auto-creation of indices.
Remove
action.auto_create_index: -b*,+a*,-*
(whatever the pattern is) from your elasticsearch.yml and you will be good.
Furthermore, if you only want to allow auto-creation of indices whose names start with l, use the pattern +l*, i.e. add
action.auto_create_index: +l*
Read this for additional information.
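Given the error above, which shows the existing whitelist of X-Pack system indices, a variant that keeps those entries and also allows the logstash indices could look like this in elasticsearch.yml (a sketch; adjust the pattern list to your setup):

action.auto_create_index: ".security,.monitoring*,.watches,.triggered_watches,.watcher-history*,+logstash-*"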

WARN Error while fetching metadata with correlation id 1 : {MY_TOPIC?=INVALID_TOPIC_EXCEPTION} (org.apache.kafka.clients.NetworkClient)

When I run the following command with kafka 0.9.0.1, I get these warnings [1]. Can you please tell me what is wrong with my topic? (I'm talking to a kafka broker running in EC2.)
./kafka-console-consumer.sh --new-consumer --bootstrap-server kafka.xx.com:9092 --topic MY_TOPIC?
[1]
[2016-04-06 10:57:45,839] WARN Error while fetching metadata with correlation id 1 : {MY_TOPIC?=INVALID_TOPIC_EXCEPTION} (org.apache.kafka.clients.NetworkClient)
[2016-04-06 10:57:46,066] WARN Error while fetching metadata with correlation id 3 : {MY_TOPIC?=INVALID_TOPIC_EXCEPTION} (org.apache.kafka.clients.NetworkClient)
[2016-04-06 10:57:46,188] WARN Error while fetching metadata with correlation id 5 : {MY_TOPIC?=INVALID_TOPIC_EXCEPTION} (org.apache.kafka.clients.NetworkClient)
[2016-04-06 10:57:46,311] WARN Error while fetching metadata with correlation id 7 : {MY_TOPIC?=INVALID_TOPIC_EXCEPTION} (org.apache.kafka.clients.NetworkClient)
Your topic name is not valid because it contains the character '?', which is not a legal character for topic names.
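Kafka topic names may only contain letters, digits, '.', '_' and '-', so in this case dropping the trailing '?' from the command should be enough:

./kafka-console-consumer.sh --new-consumer --bootstrap-server kafka.xx.com:9092 --topic MY_TOPIC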
I got the same error. In my case the problem was a space between the comma-separated topics in my code:
#source(type='kafka',
        topic.list="p1, p2, p3",
        partition.no.list='0',
        threading.option='single.thread',
        group.id="group",
        bootstrap.servers='kafka:9092',
        #map(type='json')
)
I finally found the solution:
#source(type='kafka',
        topic.list="p1,p2,p3",
        partition.no.list='0',
        threading.option='single.thread',
        group.id="group",
        bootstrap.servers='kafka:9092',
        #map(type='json')
)
It also happens when the producer is not able to produce to the respective address. Check the value of advertised.listeners in /kafka/config/server.properties:
if it is commented out, there are other issues;
but if it is not, put your IP address in place of localhost and then restart both zookeeper and kafka.
Then try starting the console producer; hopefully it will work.
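For example, the relevant line in server.properties could look like this (a sketch; substitute the broker machine's actual IP or hostname for <broker-ip>):

# /kafka/config/server.properties
advertised.listeners=PLAINTEXT://<broker-ip>:9092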
Just in case anyone is having this issue related to a comma "," in a logstash output to kafka or a calculated topic name:
In the topic_id of the logstash kafka output, we tried to build the topic name by appending a field we had calculated in the filter.
The problem was that this field was already present in the source document and we later added it "again" in the logstash filter, converting the string field into an array/list.
So, since we used this in the logstash output:
topic_id => ["topicName_%{field}"]
we ended up with:
topic_id : "topicName_fieldItem1,FieldItem2"
which caused the exception in the logstash logs:
[WARN ][org.apache.kafka.clients.NetworkClient] [Producer clientId=logstash] Error while fetching metadata with correlation id 3605264 : {topicName_fieldItem1,FieldItem2=INVALID_TOPIC_EXCEPTION}
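One way to avoid this (a sketch, assuming the calculated value should overwrite the existing field rather than be appended to it) is to force a single string value in the filter before it is interpolated into topic_id:

filter {
  mutate {
    # replace overwrites the field instead of appending, so it stays a plain string
    replace => { "field" => "calculatedValue" }
  }
}
output {
  kafka {
    topic_id => "topicName_%{field}"
  }
}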

Logstash error message when using ElasticSearch output=>"Failed to flush outgoing items"

I'm using ES 1.4.4, LS 1.5 and Kibana 4 on Debian.
I start logstash and it works fine for a couple of minutes, then I get a fatal error.
In order to shut down logstash I have to delete the recent data stored in ES; that's the only way I found.
One more relevant fact is that Elasticsearch looks OK: I can see old data in kibana and the head plugin works fine.
My output config: output { elasticsearch { port => 9200 protocol => http host => "127.0.0.1" } }
Any help will be appreciated :)
Here is the full error message:
Got error to send bulk of actions to elasticsearch server at 127.0.0.1 : Read timed out {:level=>:error}
Failed to flush outgoing items {:outgoing_count=>1362, :exception=>#, :backtrace=>["/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.3.5-java/lib/manticore/response.rb:35:in initialize'", "org/jruby/RubyProc.java:271:incall'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.3.5-java/lib/manticore/response.rb:61:in call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.3.5-java/lib/manticore/response.rb:224:incall_once'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/manticore-0.3.5-java/lib/manticore/response.rb:127:in code'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/http/manticore.rb:50:inperform_request'", "org/jruby/RubyProc.java:271:in call'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/base.rb:187:inperform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/transport/http/manticore.rb:33:in perform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-transport-1.0.7/lib/elasticsearch/transport/client.rb:115:inperform_request'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/elasticsearch-api-1.0.7/lib/elasticsearch/api/actions/bulk.rb:80:in bulk'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.1.18-java/lib/logstash/outputs/elasticsearch/protocol.rb:82:inbulk'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.1.18-java/lib/logstash/outputs/elasticsearch.rb:413:in submit'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.1.18-java/lib/logstash/outputs/elasticsearch.rb:412:insubmit'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.1.18-java/lib/logstash/outputs/elasticsearch.rb:438:in flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.1.18-java/lib/logstash/outputs/elasticsearch.rb:436:inflush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:219:in buffer_flush'", "org/jruby/RubyHash.java:1341:ineach'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:216:in buffer_flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:193:inbuffer_flush'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/stud-0.0.19/lib/stud/buffer.rb:159:in buffer_receive'", "/opt/logstash/vendor/bundle/jruby/1.9/gems/logstash-output-elasticsearch-0.1.18-java/lib/logstash/outputs/elasticsearch.rb:402:inreceive'", "/opt/logstash/lib/logstash/outputs/base.rb:88:in handle'", "(eval):1070:ininitialize'", "org/jruby/RubyArray.java:1613:in each'", "org/jruby/RubyEnumerable.java:805:inflat_map'", "(eval):1067:in initialize'", "org/jruby/RubyProc.java:271:incall'", "/opt/logstash/lib/logstash/pipeline.rb:279:in output'", "/opt/logstash/lib/logstash/pipeline.rb:235:inoutputworker'", "/opt/logstash/lib/logstash/pipeline.rb:163:in `start_outputs'"], :level=>:warn}
Your elasticsearch has run out of storage and is unable to write new documents coming from logstash. Try deleting old indices and then:
PUT your_index/_settings
{
  "index": {
    "blocks.read_only": false
  }
}
I hope this works for you. Thanks!
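Before deleting, the cat indices API can help identify old or large indices (a sketch; the index pattern in the DELETE request is only an example):

GET _cat/indices?v
DELETE /logstash-2015.01.*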
