Create a new index in elasticsearch for each log file by date - elasticsearch

Currently
So far I have done this with a single log file, which Logstash ships to one index in Elasticsearch:
yellow open logstash-2016.10.19 5 1 1000807 0 364.8mb 364.8mb
What I actually want to do
If I have the following log files, which are named by year, month and day:
MyLog-2016-10-16.log
MyLog-2016-10-17.log
MyLog-2016-10-18.log
MyLog-2016-11-05.log
MyLog-2016-11-02.log
MyLog-2016-11-03.log
I would like to tell Logstash to read them by year, month and day and create the following indexes:
yellow open MyLog-2016-10-16.log
yellow open MyLog-2016-10-17.log
yellow open MyLog-2016-10-18.log
yellow open MyLog-2016-11-05.log
yellow open MyLog-2016-11-02.log
yellow open MyLog-2016-11-03.log
Could I please have some guidance on how to go about doing this?
Thank you.

It is as simple as this:
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # Elasticsearch index names must be lowercase.
    # "dd" is day of month; "DD" would be day of year.
    index => "mylog-%{+YYYY-MM-dd}.log"
  }
}

If the lines in the file contain datetime info, you should use the date{} filter to set @timestamp from that value. If you do this, you can use an output format like the one @Renaud provided, "MyLog-%{+YYYY.MM.dd}" (lowercased in practice, since Elasticsearch rejects index names that contain uppercase letters).
If the lines don't contain datetime info, you can use the input's path for your index name, e.g. "%{path}". To get just the basename of the path:
mutate {
gsub => [ "path", ".*/", "" ]
}
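The two options above could be sketched roughly as follows. This is an untested outline, not a definitive configuration: it assumes each line starts with an ISO8601 timestamp (option 1), that the file name is available in a field called path as in the mutate example (newer Logstash versions may expose it under a different name), and log_time and file_date are hypothetical field names chosen for illustration. Use one option or the other, not both.

```
filter {
  # Option 1: the lines carry their own timestamp.
  # Assumes an ISO8601 timestamp at the start of each line.
  grok {
    match => { "message" => "^%{TIMESTAMP_ISO8601:log_time}" }
  }
  date {
    match => [ "log_time", "ISO8601" ]   # sets @timestamp on success
  }

  # Option 2: take the date from the file name instead.
  grok {
    match => { "path" => "MyLog-(?<file_date>%{YEAR}-%{MONTHNUM}-%{MONTHDAY})\.log$" }
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    # With option 1 this would be "mylog-%{+YYYY-MM-dd}" instead.
    index => "mylog-%{file_date}"
  }
}
```

If the grok on path does not match, file_date is never set and the index name keeps the literal text %{file_date}, so it is worth checking the pattern against a real path first.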

Won't this configuration in the output section be sufficient for your purpose?
output {
elasticsearch {
embedded => false
host => localhost
port => 9200
protocol => http
cluster => 'elasticsearch'
index => "syslog-%{+YYYY.MM.dd}"
}
}
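For reference, the host, port, protocol, embedded and cluster settings above belong to the older elasticsearch output. As far as I know, Logstash 2.0 and later replaced them with a single hosts list and use HTTP by default. A minimal sketch of the same output for a newer version, assuming Elasticsearch listens on localhost:9200:

```
output {
  elasticsearch {
    # One list of host:port entries replaces host, port and protocol.
    hosts => ["localhost:9200"]
    # One index per day, named from the event's @timestamp.
    index => "syslog-%{+YYYY.MM.dd}"
  }
}
```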

Related

How to resolve parsing error for CSV file in Logstash

I am using Filebeat to send a CSV file to Logstash and then up to Kibana, however I am getting a parsing error when the CSV file is picked up by Logstash.
This is the contents of the CSV file:
time version id score type
May 6, 2020 @ 11:29:59.863 1 2 PPy_6XEBuZH417wO9uVe _doc
The logstash.conf:
input {
beats {
port => 5044
}
}
filter {
csv {
separator => ","
columns =>["time","version","id","index","score","type"]
}
}
output {
elasticsearch {
hosts => ["http://localhost:9200"]
index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{+YYYY.MM.dd}"
}
}
Filebeat.yml:
filebeat.inputs:
# Each - is an input. Most options can be set at the input level, so
# you can use different inputs for various configurations.
# Below are the input specific configurations.
- type: log
# Change to true to enable this input configuration.
enabled: true
# Paths that should be crawled and fetched. Glob based paths.
paths:
- /etc/test/*.csv
#- c:\programdata\elasticsearch\logs\*
and the error in Logstash:
[2020-05-27T12:28:14,585][WARN ][logstash.filters.csv ][main] Error parsing csv {:field=>"message", :source=>"time,version,id,score,type,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,", :exception=>#<TypeError: wrong argument type String (expected LogStash::Timestamp)>}
[2020-05-27T12:28:14,586][WARN ][logstash.filters.csv ][main] Error parsing csv {:field=>"message", :source=>"\"May 6, 2020 @ 11:29:59.863\",1,2,PPy_6XEBuZH417wO9uVe,_doc,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,", :exception=>#<TypeError: wrong argument type String (expected LogStash::Timestamp)>}
I do get some data in Kibana but not what I want to see.
I have managed to get it to work locally. The mistakes I have noticed so far were:
Using ES reserved fields like @timestamp, @version, and more.
The timestamp was not in ISO8601 format. It had an @ sign in the middle.
Your filter sets the separator to "," but the real separator in your CSV is "\t".
According to the error, Logstash is also trying to parse your header line. I suggest you remove it from the CSV or use the skip_header option.
Below is the logstash.conf file I used:
input {
file {
path => "C:/work/elastic/logstash-6.5.0/config/test.csv"
start_position => "beginning"
}
}
filter {
csv {
separator => ","
columns =>["time","version","id","score","type"]
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "csv-test"
}
}
The CSV file I used:
May 6 2020 11:29:59.863,1,PPy_6XEBuZH417wO9uVe,_doc
May 6 2020 11:29:59.863,1,PPy_6XEBuZH417wO9uVe,_doc
May 6 2020 11:29:59.863,1,PPy_6XEBuZH417wO9uVe,_doc
May 6 2020 11:29:59.863,1,PPy_6XEBuZH417wO9uVe,_doc
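To go one step further, the time column can be parsed into @timestamp with a date filter, and the header row can be dropped by the csv filter itself. The following is only a sketch under assumptions: it reuses the column names from the config above, assumes English month names, and relies on my understanding that skip_header only drops a row whose values match the configured column names.

```
filter {
  csv {
    separator => ","
    columns => ["time","version","id","score","type"]
    # Drops a header row; it is expected to match the column names above.
    skip_header => true
  }
  date {
    # Matches values such as "May 6 2020 11:29:59.863".
    match => ["time", "MMM d yyyy HH:mm:ss.SSS"]
    locale => "en"
    # Applied only when parsing succeeds; @timestamp then holds the value.
    remove_field => ["time"]
  }
}
```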

Issue in reading log file that contains date in its name

I have two Linux boxes. The first runs a component that generates logs, with Logstash installed to ship them. The second runs Redis, Elasticsearch and Logstash; there Logstash acts as the indexer and groks the data.
My problem is that the component on the first box generates a new log file every day, and the file names differ only by date,
like
counters-20151120-0.log
counters-20151121-0.log
counters-20151122-0.log
and so on. I have included the following in my Logstash shipper conf file:
file {
path => "/opt/data/logs/counters-%{YEAR}%{MONTHNUM}%{MONTHDAY}*.log"
type => "rg_counters"
}
And in my Logstash indexer, I have the following to catch those log files:
if [type] == "rg_counters" {
grok{
match => ["message", "%{YEAR}%{MONTHNUM}%{MONTHDAY}\s*%{HOUR}:%{MINUTE}:%{SECOND}\s*(?<counters_raw_data>[0-9\-A-Z]*)\s*(?<counters_operation_type>[\-A-Z]*)\s*%{GREEDYDATA:counters_extradata}"]
}
}
output {
elasticsearch { host => ["elastichost1","elastichost1" ] port => "9200" protocol => "http" }
stdout { codec => rubydebug }
}
Please note that this is a working setup and other types of log files are being transferred and processed successfully, so the setup itself is not the issue.
The problem is how to process this log file, which contains the date in its file name.
Any help here?
Thanks in advance!!
Based on the comments...
Instead of trying to use regexp patterns in your path:
path => "/opt/data/logs/counters-%{YEAR}%{MONTHNUM}%{MONTHDAY}*.log"
just use glob patterns:
path => "/opt/data/logs/counters-*.log"
Logstash will remember which files (inodes) it has seen before.
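Put together, the shipper input might look like the sketch below. The sincedb_path value is a hypothetical location chosen for illustration; it is the file in which the file input records how far it has read each log, and it can be omitted to use the default.

```
input {
  file {
    # The glob matches counters-20151120-0.log, counters-20151121-0.log, and so on.
    path => "/opt/data/logs/counters-*.log"
    type => "rg_counters"
    # Hypothetical path for the read-position bookkeeping.
    sincedb_path => "/var/lib/logstash/sincedb-counters"
  }
}
```

New daily files that match the glob should be picked up as they appear, without restarting Logstash.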

Stop pushing data in elasticsearch initiate by logstash "exec" plugin

I am very new to Elasticsearch and am stuck on a problem. I have made a Logstash configuration file named test.conf, which is as follows:
input {
  exec {
    command => "free"
    interval => 1
  }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
  }
}
Now I run this config file with the following command, so that it starts pushing data into Elasticsearch every second:
$ /opt/logstash/bin/logstash -f test.conf
I am using Kibana to display the data inserted into Elasticsearch.
Since data keeps being added to Elasticsearch every second, I don't know how to stop this data insertion job. Please help me out.

Cannot load index to elasticsearch from external file, using logstash 1.4.2 on Windows 7

When I try to load a file into Elasticsearch using Logstash with the config file below, I get the following output messages in Elasticsearch and no file is loaded (when the input is configured as stdin, everything seems to work just fine):
[2014-08-20 10:51:10,957][INFO ][cluster.service ] [Max] added {[logstash-GURWB02038-5480-4002][dstQagpWTfGkSU5Ya-sUcQ][GURWB02038][inet[/10.203.152.139:9301]]{client=true, data=false},}, reason: zen-disco-receive(join from node[[logstash-GURWB02038-5480-4002][dstQagpWTfGkSU5Ya-sUcQ][GURWB02038][inet[/10.203.152.139:9301]]{client=true, data=false}])
The Logstash config file that I used is below:
input {
file {
path => "D:/example.log"
}
}
output {
elasticsearch {
host => "localhost"
}
}
You might be missing start_position.
Try with something like this.
input {
file {
path => "D:/example.log"
start_position => "beginning"
}
}
Also take the "first contact" restriction into account, according to the documentation.
start_position
Value can be any of: "beginning", "end"
Default value is "end"
Choose where Logstash starts initially reading files: at the beginning or at the end.
The default behavior treats files like live streams and thus starts at the end.
If you have old data you want to import, set this to ‘beginning’
This option only modifies “first contact” situations where a file is new and not seen
before. If a file has already been seen before, this option has no effect.
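Because of this "first contact" rule, a file that Logstash has already recorded will not be re-read just because start_position is set. A common workaround while testing is to point the sincedb at the null device so that no read position is kept between runs. This is a sketch for experimentation only, assuming you really do want the whole file re-read on every start:

```
input {
  file {
    path => "D:/example.log"
    start_position => "beginning"
    # Assumption: throw away the read-position bookkeeping, so the file
    # always counts as new. "NUL" is the Windows null device; on Linux
    # the equivalent would be "/dev/null".
    sincedb_path => "NUL"
  }
}
```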
Hope this helps.
From all the examples it seems that the syntax is:
output {
elasticsearch {
host => localhost
}
}

Email alert after threshold crossed, logstash?

I am using logstash, elasticsearch and kibana to analyze my logs.
I am alerting via email when a particular string comes into the log via email output in logstash:
email {
match => [ "Session Detected", "logline,*Session closed*" ]
...........................
}
This works fine.
Now, I want to alert on the count of a field (when a threshold is crossed):
E.g. if user is a field, I want to alert when the number of unique users exceeds 5.
Can this be done via the email output in Logstash?
Please help.
EDIT:
As @Alcanzar suggested, I did this:
config file:
if [server] == "Server2" and [logtype] == "ABClog" {
grok{
match => ["message", "%{TIMESTAMP_ISO8601:timestamp} %{HOSTNAME:server-name} abc\[%{INT:id}\]:
\(%{USERNAME:user}\) CMD \(%{GREEDYDATA:command}\)"]
}
metrics {
meter => ["%{user}"]
add_tag => "metric"
}
}
So according to above, for server2 and abclog I have a grok pattern for parsing my file and on the user field parsed by grok I want the metric applied.
I did that in the config file as above, but I get strange behaviour when I check logstash console with -vv.
So if there are 9 log lines in the file, it parses those 9 first. After that the metrics part starts, but there the message field is not a log line from the file; it is the name of my PC, so it gives _grokparsefailure. Something like this:
output received {
:event=>{"@version"=>"1", "@timestamp"=>"2014-06-17T10:21:06.980Z", "message"=>"my-pc-name",
"root.count"=>2, "root.rate_1m"=>0.0, "root.rate_5m"=>0.0, "root.rate_15m"=>0.0,
"abc.count"=>2, "abc.rate_1m"=>0.0, "abc.rate_5m"=>0.0, "abc.rate_15m"=>0.0, "tags"=>["metric",
"_grokparsefailure"]}, :level=>:debug, :file=>"(eval)", :line=>"137"
}
Any help is appreciated.
I believe what you need is http://logstash.net/docs/1.4.1/filters/metrics.
You'd want to use the metrics filter to calculate the rate of your events, and then use thing.rate_1m or thing.rate_5m in an if statement around your email output.
For example:
filter {
if [message] =~ /whatever_message_you_want/ {
metrics {
meter => "user"
add_tag => "metric"
}
}
}
output {
if "metric" in [tags] and [user.rate_1m] > 1 {
email { ... }
}
}
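A slightly fuller sketch of the same idea follows, with two assumptions labeled. First, the events emitted by the metrics filter carry the "metric" tag and the host name rather than a log line, so the grok is guarded to skip them; this is a defensive measure against the _grokparsefailure shown in the question. Second, as far as I know newer Logstash versions nest the meter values (for example [user_events][rate_1m]) where the 1.4-era example above uses dotted names. The meter name user_events is hypothetical, and stdout stands in for the email output.

```
filter {
  # Metric events have no log line in "message", so keep them away from grok.
  if "metric" not in [tags] {
    grok {
      match => { "message" => "\(%{USERNAME:user}\) CMD \(%{GREEDYDATA:command}\)" }
    }
    metrics {
      # Hypothetical meter name; counts every event that reaches this filter.
      meter => [ "user_events" ]
      add_tag => "metric"
    }
  }
}
output {
  # Replace stdout with your email output once the condition behaves as expected.
  if "metric" in [tags] and [user_events][rate_1m] > 1 {
    stdout { codec => rubydebug }
  }
}
```

Note that this alerts on an event rate, not on the number of unique users; the metrics filter does not count distinct values.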
Aggregating on the logstash side is fairly limited. It also increases the state size thus memory consumption may grow. Alerts that run on the Elasticsearch layer offer more freedom and possibilities.
Logz.io alerts on top of ELK are described in this blog post: http://logz.io/blog/introducing-alerts-for-elk/
