For ELK, Logstash sometimes reports "no such index"; how can I make ES create the index automatically? - elasticsearch

I've run into a problem with ELK; can anyone help me?
logstash 2.4.0
elasticsearch 2.4.0
3 Elasticsearch instances in a cluster
Sometimes Logstash warns:
"status"=>404, "error"=>{"type"=>"index_not_found_exception", "reason"=>"no such index", ...}
and indexing stops working. When I curl -XGET the ES indices, the index really does not exist.
When this happens, I have to kill -9 Logstash and start it again; it then creates the index in ES and everything works again.
So my question is: how can I make Elasticsearch create the index automatically when it reports "no such index"?
My logstash conf is:
input {
  tcp {
    port => 10514
    codec => "json"
  }
}
output {
  elasticsearch {
    hosts => [ "9200.xxxxxx.com:9200" ]
    index => "log001-%{+YYYY.MM.dd}"
  }
}
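
For reference, Elasticsearch controls whether missing indices are created automatically with the action.auto_create_index setting, which defaults to true; a minimal elasticsearch.yml sketch (not from the original post) to check that it has not been disabled on your cluster:

# elasticsearch.yml
action.auto_create_index: true

The setting also accepts allow/deny patterns, e.g. +log001-*,-* to permit auto-creation only for your Logstash indices, so a restrictive pattern here could explain the index_not_found_exception.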

Related

Timeout reached in KV filter with value entry too large

I'm trying to build a new ELK project. I'm a newbie here, so I'm not sure what I'm missing. I'm trying to move very large logs into ELK, and while doing so it times out in the KV filter with the error "Timeout reached in KV filter with value entry too large".
My Logstash filter is in the format below:
grok {
  match => [ "message", "(?<timestamp>%{MONTHDAY:monthday} %{MONTH:month} %{YEAR:year} %{TIME:time}) \[%{LOGLEVEL:loglevel}\] %{DATA:requestId} \(%{DATA:thread}\) %{JAVAFILE:className}: %{GREEDYDATA:logMessage}" ]
}
kv {
  source => "logMessage"
}
Is there a way I can skip the kv filter when the logs are huge? If so, can someone guide me on how that can be done?
Thank you
I have tried multiple things but nothing seemed to work.
I solved this by using dissect.
The filter was something along the lines of:
dissect {
  mapping => {
    "message" => "%{[@metadata][timestamp]} %{+[@metadata][timestamp]} %{+[@metadata][timestamp]} %{+[@metadata][timestamp]} %{loglevel} %{requestId} %{thread} %{classname} %{logMessage}"
  }
}
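If you would rather keep the kv filter and only bypass it for oversized events, one possible sketch (the 10000-character threshold and the kv_skip field are illustrative assumptions, and the event.get/event.set API assumes Logstash 5 or later):

filter {
  # Mark events whose logMessage is very large so the kv filter can be bypassed.
  ruby {
    code => "event.set('kv_skip', true) if event.get('logMessage').to_s.length > 10000"
  }
  if ![kv_skip] {
    kv {
      source => "logMessage"
    }
  }
}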

Create a new index in elasticsearch for each log file by date

Currently
I have completed the above task by using one log file and passing the data with Logstash to one index in Elasticsearch:
yellow open logstash-2016.10.19 5 1 1000807 0 364.8mb 364.8mb
What I actually want to do
If I have the following log files, which are named according to year, month, and date:
MyLog-2016-10-16.log
MyLog-2016-10-17.log
MyLog-2016-10-18.log
MyLog-2016-11-05.log
MyLog-2016-11-02.log
MyLog-2016-11-03.log
I would like to tell Logstash to read them by year, month, and date and create the following indices:
yellow open MyLog-2016-10-16.log
yellow open MyLog-2016-10-17.log
yellow open MyLog-2016-10-18.log
yellow open MyLog-2016-11-05.log
yellow open MyLog-2016-11-02.log
yellow open MyLog-2016-11-03.log
Please could I have some guidance as to how I need to go about doing this?
Thank you
It is as simple as this:
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "MyLog-%{+YYYY-MM-dd}.log"
  }
}
If the lines in the file contain datetime info, you should be using the date{} filter to set @timestamp from that value. If you do this, you can use the output format that @Renaud provided, "MyLog-%{+YYYY.MM.dd}".
If the lines don't contain the datetime info, you can use the input's path for your index name, e.g. "%{path}". To get just the basename of the path:
mutate {
  gsub => [ "path", ".*/", "" ]
}
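Putting the two pieces together, a minimal sketch (the timestamp field name and date pattern are assumptions; adjust them to your log format):

filter {
  # Parse the event's own timestamp into @timestamp so a date-based index follows the log date.
  date {
    match => [ "timestamp", "yyyy-MM-dd HH:mm:ss" ]
  }
  # Strip the directory portion so "path" holds only the file name, e.g. MyLog-2016-10-16.log.
  mutate {
    gsub => [ "path", ".*/", "" ]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Use either a date-based name ("MyLog-%{+YYYY-MM-dd}.log") or one index per source file ("%{path}").
    index => "%{path}"
  }
}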
Won't this configuration in the output section be sufficient for your purpose?
output {
  elasticsearch {
    embedded => false
    host => "localhost"
    port => 9200
    protocol => "http"
    cluster => "elasticsearch"
    index => "syslog-%{+YYYY.MM.dd}"
  }
}

Elasticsearch Bulk Write is slow using Scan and Scroll

I am currently running into an issue on which I am really stuck.
I am trying to work on a problem where I have to output Elasticsearch documents and write them to CSV. The docs range from 50,000 to 5 million.
I am experiencing serious performance issues, and I get the feeling that I am missing something here.
Right now I have a dataset of 400,000 documents on which I am trying to scan and scroll, and which would ultimately be formatted and written to CSV. But the time taken just to output them is 20 minutes! That is insane.
Here is my script:
import elasticsearch
import elasticsearch.exceptions
import elasticsearch.helpers as helpers
import time

es = elasticsearch.Elasticsearch(['http://XX.XXX.XX.XXX:9200'], retry_on_timeout=True)
scanResp = helpers.scan(client=es, scroll="5m", index='MyDoc', doc_type='MyDoc', timeout="50m", size=1000)

resp = {}
start_time = time.time()
for resp in scanResp:
    data = resp
    print data.values()[3]
print("--- %s seconds ---" % (time.time() - start_time))
I am using a hosted AWS m3.medium server for Elasticsearch.
Can anyone please tell me what I might be doing wrong here?
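For the pure-Python route, a sketch that streams the scan results straight into a CSV file instead of printing each document (the field names and output path are illustrative assumptions):

import csv
import time

import elasticsearch
import elasticsearch.helpers as helpers

es = elasticsearch.Elasticsearch(['http://XX.XXX.XX.XXX:9200'], retry_on_timeout=True)
fields = ['field1', 'field2', 'field3']  # replace with the fields you actually need

start_time = time.time()
with open('output.csv', 'w') as f:
    writer = csv.writer(f)
    writer.writerow(fields)  # header row
    # Requesting only the needed fields keeps the scroll payloads small.
    for hit in helpers.scan(client=es, query={"_source": fields},
                            index='MyDoc', doc_type='MyDoc', size=1000, scroll="5m"):
        src = hit['_source']
        writer.writerow([src.get(name, '') for name in fields])
print("--- %s seconds ---" % (time.time() - start_time))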
A simple solution to output ES data to CSV is to use Logstash with an elasticsearch input and a csv output with the following es2csv.conf config:
input {
  elasticsearch {
    host => "localhost"
    port => 9200
    index => "MyDoc"
  }
}
filter {
  mutate {
    remove_field => [ "@version", "@timestamp" ]
  }
}
output {
  csv {
    fields => ["field1", "field2", "field3"]   # specify the field names you want
    path => "/path/to/your/file.csv"
  }
}
You can then export your data easily with bin/logstash -f es2csv.conf

Stop pushing data in elasticsearch initiate by logstash "exec" plugin

I am very new to Elasticsearch and stuck on a problem. I have made a Logstash configuration file named test.conf, which is as follows:
input {
  exec {
    command => "free"
    interval => 1
  }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
  }
}
Now I execute this config file so that it starts pushing data into Elasticsearch every second, using the following command:
$ /opt/logstash/bin/logstash -f test.conf
I'm using Kibana to display the data inserted into Elasticsearch.
Since data keeps being added to Elasticsearch every second, I do not know how to stop this insertion job. Please help me out.
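Since the exec input keeps running for as long as the Logstash process is alive, stopping the insertion comes down to stopping that process; a small sketch (not from the original post):

# Press Ctrl-C if Logstash is running in the foreground, otherwise find its PID and stop it:
pgrep -f 'logstash.*test.conf'   # locate the Logstash process started with test.conf
kill <PID>                       # ask Logstash to shut down gracefully

Once the process exits, no new documents are pushed; the data already indexed stays in Elasticsearch until you delete the index.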

Can anyone give a list of REST APIs to query elasticsearch?

I am trying to push my logs to Elasticsearch through Logstash.
My logstash.conf has two log files as input, Elasticsearch as output, and grok as the filter. Here is my grok match:
grok {
  match => [ "message", "(?<timestamp>[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2},[0-9]{3}) (?:\[%{GREEDYDATA:caller_thread}\]) (?:%{LOGLEVEL:level}) (?:%{DATA:caller_class})(?:\-%{GREEDYDATA:message})" ]
}
When Elasticsearch is started, all my logs are added to the Elasticsearch server under the separate index names mentioned in logstash.conf.
My doubt is: how are my logs stored in Elasticsearch? I only know that they are stored under the index name mentioned in logstash.conf.
The 'http://164.99.178.18:9200/_cat/indices?v' API gave me the following:
health status index      pri rep docs.count docs.deleted store.size pri.store.size
yellow open   tomcat-log 5   1   6478       0            1.9mb      1.9mb
yellow open   apache-log 5   1   212        0            137kb      137kb
But how are 'documents' and 'fields' created in Elasticsearch for my logs?
I read that Elasticsearch is a REST-based search engine. So are there any REST APIs that I could use to analyze my data in Elasticsearch?
Indeed.
curl localhost:9200/tomcat-log/_search
will give you back the first 10 documents, plus the total number of docs in your index.
curl localhost:9200/tomcat-log/_search -d '{
  "query": {
    "match": {
      "level" : "error"
    }
  }
}'
should give you all the docs in tomcat-log which have level equal to error.
Have a look at this section of the book. It will help.
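To see how the fields themselves were created, the mapping API is also worth knowing (a standard Elasticsearch endpoint, not part of the original answer):

curl localhost:9200/tomcat-log/_mapping?pretty

It returns the field names and data types that were generated for your index, which answers the "how are documents and fields created" part of the question.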
