Unavailable_shards_exception, reason: primary shard is not active - Elasticsearch

I am trying to send data through filebeat --> logstash --> Elasticsearch cluster --> Kibana.
I have a cluster with 3 nodes: two are master-eligible nodes and one is a client node.
I have checked the health of the cluster using the command below:
curl -XGET "http://132.186.102.61:9200/_cluster/state?pretty"
I can see the output properly, with an elected master.
When Filebeat pushes data to Logstash, I see the following error:
logstash.outputs.elasticsearch - retrying failed action with response code: 503 ({"type"=>"unavailable_shards_exception", "reason"=>"[logstash-2017.06.05][1] primary shard is not active Timeout: [1m], request: [BulkShardRequest [[logstash-2017.06.05][1]] containing [3] requests]"})
This is my logstash.conf:
input {
  beats {
    port => "5043"
    #ssl => true
    #ssl_certificate_authorities => "D:/Softwares/ELK/ELK_SSL_Certificates/testca/cacert.pem"
    #ssl_certificate => "D:/Softwares/ELK/ELK_SSL_Certificates/server/cert.pem"
    #ssl_key => "D:/Softwares/ELK/ELK_SSL_Certificates/server/pkcs8.key"
    #ssl_key_passphrase => "MySecretPassword"
    #ssl_verify_mode => "force_peer"
  }
}
filter {
  grok {
    match => { "message" => "%{IP:client} %{NUMBER:duration} %{GREEDYDATA:messageFromClient}" }
  }
}
#filter {
#  if "_grokparsefailure" in [tags] {
#    drop { }
#  }
#}
output {
  elasticsearch { hosts => ["132.186.189.127:9200","132.186.102.61:9200","132.186.102.43:9200"] }
  stdout { codec => rubydebug }
}
May I please know the reason for this issue?
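A 503 unavailable_shards_exception means Elasticsearch could not allocate (or has lost) the primary shard of the index the bulk request targets, so writes time out. Some diagnostic commands that may help narrow this down, using the same host as above (the _cluster/allocation/explain API assumes Elasticsearch 5.x or later):

```shell
# Overall health: "red" means at least one primary shard is unassigned
curl -XGET "http://132.186.102.61:9200/_cluster/health?pretty"

# List the shards of the failing index and their state (look for UNASSIGNED)
curl -XGET "http://132.186.102.61:9200/_cat/shards/logstash-2017.06.05?v"

# Ask the cluster why a shard is not allocated (Elasticsearch 5.x+)
curl -XGET "http://132.186.102.61:9200/_cluster/allocation/explain?pretty"
```

Common causes include a data node being down or disk watermarks blocking allocation; note also that with only two master-eligible nodes, losing one can leave the cluster unable to elect a master.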

Related

Logstash delay of log sending

I'm forwarding application logs to Elasticsearch, applying some grok filters along the way.
The application has its own timestamp field, and there is also Logstash's @timestamp field.
We regularly check the difference between those timestamps, and in many cases the delay is very large, meaning the log took a long time to be shipped to Elasticsearch.
I'm wondering how I can isolate the issue, to find out whether the delay comes from Logstash or Elasticsearch.
Example Logstash config:
input {
  file {
    path => "/app/app-core/_logs/app-core.log"
    codec => multiline {
      pattern => "(^[a-zA-Z.]+(?:Error|Exception).+)|(^\s+at .+)|(^\s+... \d+ more)|(^\t+)|(^\s*Caused by:.+)"
      what => "previous"
    }
  }
}
filter {
  if "multiline" not in [tags] {
    json {
      source => "message"
      remove_field => ["[request][body]","[response][body][response][items]"]
    }
  }
  else {
    grok {
      pattern_definitions => { APPJSON => "{.*}" }
      match => { "message" => "%{APPJSON:appjson} %{GREEDYDATA:stack_trace}" }
      remove_field => ["message"]
    }
    json {
      source => "appjson"
      remove_field => ["appjson"]
    }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch-logs.internal.app.io:9200"]
    index => "logstash-core-%{+YYYY.MM.dd}"
    document_type => "logs"
  }
}
We tried adjusting the number of workers and the batch size, but no value we tried reduced the delay:
pipeline.workers: 9
pipeline.output.workers: 9
pipeline.batch.size: 600
pipeline.batch.delay: 5
Nothing was done on the Elasticsearch side, because I think the issue is with Logstash, but I'm not sure.
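One way to separate the Logstash-side delay from the Elasticsearch-side delay is to stamp each event with the time Logstash received it. A sketch of a hypothetical addition to the filter block; the `[lag][received_at]` field name is illustrative:

```conf
filter {
  # Sketch: record when Logstash picked the event up, so the
  # app -> Logstash lag (received_at minus the app's timestamp) can be
  # compared in Kibana with the Logstash -> Elasticsearch lag.
  ruby {
    code => "event.set('[lag][received_at]', Time.now.to_f)"
  }
}
```

If `received_at` is already far behind the application's timestamp, the delay is on the Logstash (or file-reading) side; if it is close but documents still appear late, look at Elasticsearch indexing instead.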

How to send logs from multiple servers to ELK server

I have a server with ELK installed. On the other end I have two source servers that send logs to the ELK server through Filebeat. The issue is that both servers' logs show up on the same page in Kibana, which makes it hard to identify which log is coming from which server. How can each server's logs be shown separately in Kibana?
This is my logstash.conf:
input {
  beats {
    port => 5044
  }
}
# Used to parse syslog messages and send them to Elasticsearch for storing
filter {
  if [type] == "syslog" {
    grok {
      match => { "message" => "%{SYSLOGLINE}" }
    }
    date {
      match => [ "timestamp", "MMM d HH:mm:ss", "MMM dd HH:mm:ss" ]
    }
  }
}
# Specify an Elasticsearch instance
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
  }
}
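Filebeat already attaches the source hostname to every event (as [beat][hostname] in older versions, [host][name] in newer ones), so one option is simply to filter on that field in Kibana. Another option, sketched below under the assumption of an older Filebeat, is to route each host into its own index:

```conf
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # One index per source server; use [host][name] on newer Beats versions
    index => "%{[beat][hostname]}-%{+YYYY.MM.dd}"
  }
}
```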

Why does Logstash stop processing logs?

Logstash stops processing logs after some hours. When log processing stops, the Logstash service consumes a large amount of CPU (about 25 cores of 32 total).
When the Logstash service works normally, it consumes about 4-5 cores in total.
The pipeline generates about 50k events per minute.
Logstash config (non-default):
pipeline.workers: 15
pipeline.batch.size: 100
JVM config:
-Xms15g
-Xmx15g
input {
  tcp {
    port => 5044
    type => syslog
  }
  udp {
    port => 5044
    type => syslog
  }
}
filter {
  if [type] == "syslog" {
    grok {
      match => [ "message", "%{SYSLOG5424PRI}%{NOTSPACE:syslog_timestamp} %{NOTSPACE:syslog_hostname} %{DATA:syslog_program}(?:\[%{POSINT:syslog_pid}\])?: %{GREEDYDATA:syslog_message}" ]
    }
    kv {
      id => "logs_kv"
      source => "syslog_message"
      trim_key => " "
      trim_value => " "
      value_split => "="
      field_split => " "
    }
    mutate {
      remove_field => [ "syslog_message", "syslog_timestamp" ]
    }
    # now check if the source IP is a private IP; if so, tag it
    cidr {
      address => [ "%{srcip}" ]
      add_tag => [ "src_internalIP" ]
      network => [ "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16" ]
    }
    # don't run geoip if it's an internal IP, otherwise find the GeoIP location
    if "src_internalIP" not in [tags] {
      geoip {
        add_tag => [ "src_geoip" ]
        source => "srcip"
        database => "/usr/share/elasticsearch/modules/ingest-geoip/GeoLite2-City.mmdb"
      }
      geoip {
        source => "srcip"
        database => "/usr/share/elasticsearch/modules/ingest-geoip/GeoLite2-ASN.mmdb"
      }
    }
    else {
      # check the DST IP now; if it is a private IP, tag it
      cidr {
        add_tag => [ "dst_internalIP" ]
        address => [ "%{dstip}" ]
        network => [ "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16" ]
      }
      # don't run geoip if it's an internal IP, otherwise find the GeoIP location
      if "dst_internalIP" not in [tags] {
        geoip {
          add_tag => [ "dst_geoip" ]
          source => "dstip"
          database => "/usr/share/elasticsearch/modules/ingest-geoip/GeoLite2-City.mmdb"
        }
        geoip {
          source => "dstip"
          database => "/usr/share/elasticsearch/modules/ingest-geoip/GeoLite2-ASN.mmdb"
        }
      }
    }
  }
}
output {
  if [type] == "syslog" {
    elasticsearch {
      hosts => ["127.0.0.1:9200"]
      index => "sysl-%{syslog_hostname}-%{+YYYY.MM.dd}"
    }
    #stdout { codec => rubydebug }
  }
}
When Logstash stops processing, I don't see any errors in the log file (log level: trace). I only see these messages:
[2019-04-19T00:00:12,004][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2019-04-19T00:00:17,011][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2019-04-19T00:00:17,012][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2019-04-19T00:00:22,015][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2019-04-19T00:00:22,015][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2019-04-19T00:00:27,023][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2019-04-19T00:00:27,024][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2019-04-19T00:00:32,030][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2019-04-19T00:00:32,030][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
Event format:
[2019-04-22T13:04:27,190][DEBUG][logstash.pipeline ] filter received {"event"=>{"type"=>"syslog", "@version"=>"1", "@timestamp"=>2019-04-22T10:04:27.159Z, "port"=>50892, "message"=>"<30>2019:04:22-13:05:08 msk ulogd[18998]: id=\"2002\" severity=\"info\" sys=\"SecureNet\" sub=\"packetfilter\" name=\"Packet accepted\" action=\"accept\" fwrule=\"6\" initf=\"eth2\" outitf=\"eth1\" srcmac=\"70:79:b3:ab:e0:e8\" dstmac=\"00:1a:8c:f0:89:02\" srcip=\"10.0.134.138\" dstip=\"10.0.131.134\" proto=\"17\" length=\"66\" tos=\"0x00\" prec=\"0x00\" ttl=\"126\" srcport=\"63936\" dstport=\"53\" ", "host"=>"10.0.130.235"}}
Please help me debug this problem.
According to the internet, the ParNew garbage collector is "stop the world". If it takes 5 seconds to resume and you're getting a GC every 5 seconds, you'll get no throughput from Logstash, because it's always blocked.
Solved.
The problem was in the kv filter, which stalled Logstash when it tried to parse unstructured data going through the pipeline.
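Given that diagnosis, a defensive variant of the kv filter from the config above might look like this sketch: only run kv on messages that actually look like key=value pairs, and bound the parsing time. The guard regex is illustrative, and timeout_millis exists only in recent versions of the kv filter:

```conf
filter {
  # Only run kv when the message looks like key="value" pairs,
  # so unstructured lines cannot stall the pipeline
  if [syslog_message] =~ /\w+="/ {
    kv {
      id => "logs_kv"
      source => "syslog_message"
      trim_key => " "
      trim_value => " "
      value_split => "="
      field_split => " "
      timeout_millis => 10000  # available in newer kv filter versions
    }
  }
}
```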

Use logstash to copy index on same cluster

I have an ES 1.7.1 cluster and I wanted to "reindex" an index. Since I cannot use _reindex, I came across this tutorial for using Logstash to copy an index onto a different cluster. However, in my case I want to copy it over to the same cluster. This is my logstash.conf:
input {
  # We read from the "old" cluster
  elasticsearch {
    hosts => "myhost"
    port => "9200"
    index => "my_index"
    size => 500
    scroll => "5m"
    docinfo => true
  }
}
output {
  # We write to the "new" cluster
  elasticsearch {
    host => "myhost"
    port => "9200"
    protocol => "http"
    index => "logstash_test"
    index_type => "%{[@metadata][_type]}"
    document_id => "%{[@metadata][_id]}"
  }
  # We print dots to see it in action
  stdout {
    codec => "dots"
  }
}
filter {
  mutate {
    remove_field => [ "@timestamp", "@version" ]
  }
}
This errors out with the following error message:
A plugin had an unrecoverable error. Will restart this plugin.
Plugin: <LogStash::Inputs::Elasticsearch hosts=>[{:scheme=>"http", :user=>nil, :password=>nil, :host=>"myhost", :path=>"", :port=>"80", :protocol=>"http"}], index=>"my_index", scroll=>"5m", query=>"{\"query\": { \"match_all\": {} } }", docinfo_target=>"@metadata", docinfo_fields=>["_index", "_type", "_id"]>
Error: Connection refused - Connection refused {:level=>:error}
Failed to install template: http: nodename nor servname provided, or not known {:level=>:error}
Any help would be appreciated.
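The error shows the input plugin resolved the host to port 80 (:port=>"80"): the separate port setting appears not to be applied to the hosts entry, which then defaults to 80. A hedged fix is to put the port into the hosts value itself:

```conf
input {
  elasticsearch {
    # Include the port directly in hosts; the standalone port setting
    # appears to be ignored (the error shows :port=>"80")
    hosts => "myhost:9200"
    index => "my_index"
    size => 500
    scroll => "5m"
    docinfo => true
  }
}
```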

Filebeat -> Logstash indexing documents twice

I have Nginx logs being sent from Filebeat to Logstash, which indexes them into Elasticsearch.
Every entry gets indexed twice: once with the correct grok filter, and then again with no fields found except for the "message" field.
This is the Logstash configuration.
02-beats-input.conf
input {
  beats {
    port => 5044
    ssl => false
  }
}
11-nginx-filter.conf
filter {
  if [type] == "nginx-access" {
    grok {
      patterns_dir => ['/etc/logstash/patterns']
      match => { "message" => "%{NGINXACCESS}" }
    }
    date {
      match => [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z", "d/MMM/YYYY:HH:mm:ss Z" ]
    }
  }
}
Nginx Patterns
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IPORHOST:clientip}\s+%{NGUSER:ident}\s+%{NGUSER:auth}\s+\[%{HTTPDATE:timestamp}\]\s+\"%{WORD:verb}\s+%{URIPATHPARAM:request}\s+HTTP/%{NUMBER:httpversion}\"\s+%{NUMBER:response}\s+(?:%{NUMBER:bytes}|-)\s+(?:\"(?:%{URI:referrer}|-)\"|%{QS:referrer})\s+%{QS:agent}
30-elasticsearch-output.conf
output {
  elasticsearch {
    hosts => ["elastic00:9200", "elastic01:9200", "elastic02:9200"]
    manage_template => false
    index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
    document_type => "%{[@metadata][type]}"
  }
}
Check your Filebeat configuration!
During setup I had accidentally un-commented and configured the output.elasticsearch section of filebeat.yml.
I then also configured the output.logstash section, but forgot to comment out the Elasticsearch output section.
This caused one copy of each entry to be sent to Logstash, where it was grok'd, and another to be sent directly to Elasticsearch.
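For reference, the relevant part of filebeat.yml should end up with exactly one active output, along these lines (hosts are illustrative):

```yaml
# Leave the Elasticsearch output commented out ...
#output.elasticsearch:
#  hosts: ["localhost:9200"]

# ... and keep only the Logstash output active
output.logstash:
  hosts: ["localhost:5044"]
```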
