CSV file input processing using Logstash stops working after 2/3 days - elasticsearch

I am using logstash-1.5.1 to process a CSV file and upload it into elasticsearch-1.5.1. This process should happen every day, so I brought my Logstash and Elasticsearch instances up once and left them running, expecting the CSV file to be processed and uploaded into Elasticsearch each day. Every day one new CSV file is downloaded from the internet and stored in a local folder from which Logstash reads. But surprisingly, Logstash stops processing the CSV file after 2-3 days. I don't know the reason; please help me. The Logstash input file configuration is as follows.
input {
  file {
    type => "csv"
    path => "D:/Tools/logstash-1.5.1/data/**/*"
    start_position => "beginning"
    sincedb_path => "D:/Tools/logstash-1.5.1/sincedb/.sincedb"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch {
    host => "localhost"
    cluster => "Test"
    node_name => "data"
    index => "client"
    template => "D:/Tools/logstash-1.5.1/lib/elasticsearch-template.json"
    template_overwrite => true
  }
}

So, try to use logstash-forwarder and please post the result; I am really interested in it.
You can install logstash-forwarder with, for example, this configuration:
{
  "network": {
    "servers": [ "$YOUR_SERVER:$PORT" ],
    "timeout": 20,
    "ssl ca": "/path/to/logstash/*.crt_file"
  },
  "files": [
    {
      "paths": ["D:/Tools/logstash-1.5.1/data/**/*"],
      "fields": { "type": "csv" },
      "dead time": "5m"
    }
  ]
}
And in your logstash server you can use this input:
input {
  lumberjack {
    port => "$PORT"
    ssl_key => "/path/to/your/*.key_file"
    ssl_certificate => "/path/to/your/*.crt_file"
  }
}

Related

Logstash - Send output from log files to elk

I have an index in Elasticsearch that has a field named locationCoordinates. It's being sent to Elasticsearch from Logstash.
The data in this field looks like this...
-38.122, 145.025
When this field appears in ElasticSearch it is not coming up as a geo point.
I know if I do this below it works.
{
  "mappings": {
    "logs": {
      "properties": {
        "http_request.locationCoordinates": {
          "type": "geo_point"
        }
      }
    }
  }
}
But what I would like to know is how I can change my logstash.conf file so that it does this at startup.
At the moment my logstash.conf looks a bit like this...
input {
  # Default GELF input
  gelf {
    port => 12201
    type => "gelf"
  }
  # Default TCP input
  tcp {
    port => 5000
    type => "syslog"
  }
  # Default UDP input
  udp {
    port => 5001
    type => "prod"
    codec => json
  }
  file {
    path => [ "/tmp/app-logs/*.log" ]
    codec => json {
      charset => "UTF-8"
    }
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  json {
    source => "message"
  }
}
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
  }
}
And I end up with this in Kibana (without the little Geo sign).
You simply need to modify your elasticsearch output to configure an index template in which you can add your additional mapping.
output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    template_overwrite => true
    template => "/path/to/template.json"
  }
}
And then in the file at /path/to/template.json you can add your additional geo_point mapping
{
  "template": "logstash-*",
  "mappings": {
    "logs": {
      "properties": {
        "http_request.locationCoordinates": {
          "type": "geo_point"
        }
      }
    }
  }
}
If you want to keep the official logstash template, you can download it and add your specific geo_point mapping to it.
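For example, a minimal sketch of how you might pull the template that Logstash has already installed into Elasticsearch so you can merge your mapping into it (the host and the default template name "logstash" are assumptions; adjust to your setup):
curl -XGET 'http://elasticsearch:9200/_template/logstash?pretty' > /path/to/template.json
# The response is wrapped as {"logstash": { ... }}: strip that outer key, add your
# "http_request.locationCoordinates": {"type": "geo_point"} property under the mapping,
# and keep "template": "logstash-*" so it still applies to new logstash-* indices.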

How to parse an XML file with logstash filters

I'm trying to index some simple XML files with Elasticsearch and Logstash. So far I have the ELK stack set up, and logstash-forwarder. I am trying to use the documentation to set up an xml filter, but I just can't seem to get it right.
My XML format is pretty straightforward:
<Recording>
  <DataFile description="desc" fileName="test.wav" Source="mic" startTime="2014-12-12_121212" stopTime="2014-12-12_131313"/>
</Recording>
I just want each file to be an entry in elasticsearch, and every parameter in the DataFile-tag to be a key-value that I can search. Since the documentation is getting me nowhere, how would such a filter look? I have also tried to use the answers in this and this without any luck.
Add the below to your logstash-forwarder configuration, and change the Logstash server IP, certificate path, and log path accordingly.
{
  "network": {
    "servers": [ "x.x.x.x:5043" ],
    "ssl ca": "/cert/server.crt",
    "timeout": 15
  },
  "files": [
    {
      "paths": [
        "D:/ELK/*.log"
      ],
      "fields": { "type": "log" }
    }
  ]
}
Add the below input plugin to your Logstash server configuration. Change the certificate and key paths and names accordingly.
lumberjack {
  port => 5043
  type => "lumberjack"
  ssl_certificate => "/cert/server.crt"
  ssl_key => "D:/ELK/logstash/cert/server.key"
  codec => multiline {
    pattern => "(\/Recording>)"
    what => "previous"
    negate => true
  }
}
Now add the below grok filter under your logstash filter section
grok {
  match => ["message", "(?<content>(<Recording(.)*?</Recording>))"]
  tag_on_failure => [ ]
}
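As an optional extra step (not part of the original answer), the DataFile attributes can be broken out into searchable key-value fields, which is what the question asks for. A hedged sketch using the xml filter plugin, assuming it is installed and reusing the "content" field captured by the grok above:
xml {
  source => "content"     # the raw <Recording>...</Recording> string captured by grok
  target => "recording"   # parsed elements and attributes are stored under this field
  store_xml => true
}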
Finally, in the Logstash output section, add:
elasticsearch {
  host => "127.0.0.1"
  port => "9200"
  protocol => "http"
  index => "recording-%{+YYYY.MM.dd}"   # index names must be lowercase
  index_type => "log"
}
Now when you add your XML messages to your log file, each entry will be processed and stored in your Elasticsearch server.
Thanks,

Changing the elasticsearch host in logstash 1.3.3 web interface

I followed the steps in this document and I was able to get some reports on the Shakespeare data.
I want to do the same thing with Elasticsearch installed remotely. I tried configuring the "host" in the config file, but the queries still run against the local host as opposed to the remote one. This is my config file:
input {
  stdin {
    type => "stdin-type"
  }
  file {
    type => "accessLog"
    path => [ "/Users/akushe/Downloads/requests.log" ]
  }
}
filter {
  grok {
    match => ["message","%{COMMONAPACHELOG} (?:%{INT:responseTime}|-)"]
  }
  kv {
    source => "request"
    field_split => "&?"
  }
  if [lng] {
    kv {
      add_field => [ "location" , ["%{lng}","%{lat}"]]
    }
  } else if [lon] {
    kv {
      add_field => [ "location" , ["%{lon}","%{lat}"]]
    }
  }
}
output {
  elasticsearch {
    host => "slc-places-qa-es3001.slc.where.com"
    port => 9200
  }
}
You need to add protocol => "http" to make it use the HTTP transport rather than joining the cluster using multicast.
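In other words, the output block would look something like this (a sketch, assuming your Logstash version's elasticsearch output supports the protocol option):
output {
  elasticsearch {
    host => "slc-places-qa-es3001.slc.where.com"
    port => 9200
    protocol => "http"   # use the HTTP transport instead of joining the cluster as a node
  }
}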

Issue using grok filter with logstash and a windows file

I am attempting to filter a SQL Server error log using Logstash and grok. Logstash 1.3.3 is running as a Windows service using NSSM and JRE6. My config file is below:
input {
  file {
    path => "c:\program files\microsoft sql server\mssql10_50.mssqlserver\mssql\log\errorlog"
    type => "SQLServerLog"
    start_position => "beginning"
    codec => plain {
      charset => "UTF-8"
    }
  }
}
filter {
  grok {
    type => "SQLServerLog"
    match => [ "message", "%{DATESTAMP:DateStamp} %{WORD:Process} %{GREEDYDATA:Message}" ]
    named_captures_only => true
    singles => true
    remove_tag => [ "_grokparsefailure" ]
    add_tag => [ "GrokFilterWorked" ]
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    embedded => true
  }
}
A sample of the log file content is below.
2014-01-31 00:00:38.73 spid21s This instance of SQL Server has been using a process ID of 14632 since 28/01/2014 13:09:24 (local) 28/01/2014 13:09:24 (UTC). This is an informational message only; no user action is required.
Events are visible in Kibana but when collapsed the message is displayed like {"message":"\u00002\u00000\u00001\u00004...
When expanded the table view shows the event message as text instead. The raw data for the event when viewed is as below.
{
"_index": "logstash-2014.01.31",
"_type": "SQLServerLog",
"_id": "NpvKSf4eTFSHkBdoG3zw6g",
"_score": null,
"_source": {
"message": "\u00002\u00000\u00001\u00004\u0000-\u00000\u00001\u0000-\u00003\u00000\u0000 \u00000\u00000\u0000:\u00000\u00000\u0000:\u00002\u00001\u0000.\u00006\u00004\u0000 \u0000s\u0000p\u0000i\u0000d\u00002\u00004\u0000s\u0000 \u0000 \u0000 \u0000 \u0000 \u0000T\u0000h\u0000i\u0000s\u0000 \u0000i\u0000n\u0000s\u0000t\u0000a\u0000n\u0000c\u0000e\u0000 \u0000o\u0000f\u0000 \u0000S\u0000Q\u0000L\u0000 \u0000S\u0000e\u0000r\u0000v\u0000e\u0000r\u0000 \u0000h\u0000a\u0000s\u0000 \u0000b\u0000e\u0000e\u0000n\u0000 \u0000u\u0000s\u0000i\u0000n\u0000g\u0000 \u0000a\u0000 \u0000p\u0000r\u0000o\u0000c\u0000e\u0000s\u0000s\u0000 \u0000I\u0000D\u0000 \u0000o\u0000f\u0000 \u00001\u00004\u00006\u00003\u00002\u0000 \u0000s\u0000i\u0000n\u0000c\u0000e\u0000 \u00002\u00008\u0000/\u00000\u00001\u0000/\u00002\u00000\u00001\u00004\u0000 \u00001\u00003\u0000:\u00000\u00009\u0000:\u00002\u00004\u0000 \u0000(\u0000l\u0000o\u0000c\u0000a\u0000l\u0000)\u0000 \u00002\u00008\u0000/\u00000\u00001\u0000/\u00002\u00000\u00001\u00004\u0000 \u00001\u00003\u0000:\u00000\u00009\u0000:\u00002\u00004\u0000 \u0000(\u0000U\u0000T\u0000C\u0000)\u0000.\u0000 \u0000T\u0000h\u0000i\u0000s\u0000 \u0000i\u0000s\u0000 \u0000a\u0000n\u0000 \u0000i\u0000n\u0000f\u0000o\u0000r\u0000m\u0000a\u0000t\u0000i\u0000o\u0000n\u0000a\u0000l\u0000 \u0000m\u0000e\u0000s\u0000s\u0000a\u0000g\u0000e\u0000 \u0000o\u0000n\u0000l\u0000y\u0000;\u0000 \u0000n\u0000o\u0000 \u0000u\u0000s\u0000e\u0000r\u0000 \u0000a\u0000c\u0000t\u0000i\u0000o\u0000n\u0000 \u0000i\u0000s\u0000 \u0000r\u0000e\u0000q\u0000u\u0000i\u0000r\u0000e\u0000d\u0000.\u0000\r\u0000",
"#version": "1",
"#timestamp": "2014-01-31T08:55:03.373Z",
"type": "SQLServerLog",
"host": "MyMachineName",
"path": "C:\\Program Files\\Microsoft SQL Server\\MSSQL10_50.MSSQLSERVER\\MSSQL\\Log\\ERRORLOG"
},
"sort": [
1391158503373,
1391158503373
]
}
I am unsure whether the encoding of the message is preventing Grok from filtering it properly.
I would like to be able to filter these events using Grok and am unsure how to proceed.
Further info:
I created a copy of the log file as UTF-8 and the filter worked fine. So it's definitely a charset issue. I guess I need to determine what the correct charset for the log file is, and then it should work.
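(For reference, a hedged sketch of that charset override; SQL Server ERRORLOG files are typically UTF-16 little-endian, so adjust if your instance differs.)
input {
  file {
    path => "c:/program files/microsoft sql server/mssql10_50.mssqlserver/mssql/log/errorlog"
    type => "SQLServerLog"
    start_position => "beginning"
    codec => plain {
      charset => "UTF-16LE"   # assumption: the log is UTF-16LE rather than UTF-8
    }
  }
}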
So I had the same issue with reading SQL Server log file.
Then I realised that SQL Server will log the same entries to the Windows Event Log, which logstash supports as an input.
SQL Server logs entries with the 'MSSQLSERVER' source on my systems. You will need the logstash-contrib package; simply extract its contents over the base Logstash files on your Windows box (wherever you run Logstash to collect data).
I have my logstash agent configured to simply ship the entries to another logstash instance on a linux box that does some other stuff not relevant to this question ;)
Example logstash.conf:
input {
  eventlog {
    type => "Win32-EventLog"
    logfile => ["Application", "Security", "System"]
  }
}
filter {
  if "MSSQLSERVER" in [SourceName] {
    # Track logon failures
    grok {
      match => ["Message", "Login failed for user '%{DATA:username}'\..+CLIENT: %{IP:client_ip}"]
    }
    dns {
      action => "append"
      resolve => "client_ip"
    }
  }
}
output {
  stdout { codec => rubydebug }
  tcp {
    host => "another-logstash-instance.local"
    port => "5115"
    codec => "json_lines"
  }
}
Hope this helps.

Logstash not importing files due to missing index error

I am having a difficult time trying to get the combination of Logstash, Elasticsearch & Kibana working in my Windows 7 environment.
I have set all 3 up and they all seem to be running fine; Logstash and Elasticsearch are running as Windows services, and Kibana as a website in IIS.
Logstash is running from http://localhost:9200
I have a web application creating log files in .txt with the format:
Datetime=[DateTime], Value=[xxx]
The log files get created in this directory:
D:\wwwroot\Logs\Errors\
My logstash.conf file looks like this:
input {
  file {
    format => ["plain"]
    path => ["D:\wwwroot\Logs\Errors\*.txt"]
    type => "testlog"
  }
}
output {
  elasticsearch {
    embedded => true
  }
}
My Kibana config.js file looks like this:
define(['settings'],
function (Settings) {
  return new Settings({
    elasticsearch: "http://localhost:9200",
    kibana_index: "kibana-int",
    panel_names: [
      'histogram',
      'map',
      'pie',
      'table',
      'filtering',
      'timepicker',
      'text',
      'fields',
      'hits',
      'dashcontrol',
      'column',
      'derivequeries',
      'trends',
      'bettermap',
      'query',
      'terms'
    ]
  });
});
When I view Kibana I see the error:
No index found at http://localhost:9200/_all/_mapping. Please create at least one index. If you're using a proxy ensure it is configured correctly.
I have no idea how to create the index, so if anyone can shed some light on what I am doing wrong, that would be great.
It seems like nothing is making it to Elasticsearch currently.
For the current version of ES (0.90.5), I had to use the elasticsearch_http output. The elasticsearch output seemed to be too closely tied to 0.90.3.
e.g., here is how my config looks for getting log4j-format logs into Elasticsearch:
input {
  file {
    path => "/srv/wso2/wso2am-1.4.0/repository/logs/wso2carbon.log"
    path => "/srv/wso2/wso2as-5.1.0/repository/logs/wso2carbon.log"
    path => "/srv/wso2/wso2is-4.1.0/repository/logs/wso2carbon.log"
    type => "log4j"
  }
}
output {
  stdout { debug => true debug_format => "ruby" }
  elasticsearch_http {
    host => "localhost"
    port => 9200
  }
}
For my file format, I have a grok filter as well - to parse it properly.
filter {
  if [message] !~ "^[ \t\n]+$" {
    # if the line is a log4j type
    if [type] == "log4j" {
      # parse out fields from log4j line
      grok {
        match => [ "message", "TID:%{SPACE}\[%{BASE10NUM:thread_name}\]%{SPACE}\[%{WORD:component}\]%{SPACE}\[%{TIMESTAMP_ISO8601:timestamp}\]%{SPACE}%{LOGLEVEL:level}%{SPACE}{%{JAVACLASS:java_file}}%{SPACE}-%{SPACE}%{GREEDYDATA:log_message}" ]
        add_tag => ["test"]
      }
      if "_grokparsefailure" not in [tags] {
        mutate {
          replace => ["message", " "]
        }
      }
      multiline {
        pattern => "^TID|^ $"
        negate => true
        what => "previous"
        add_field => {"additional_log" => "%{message}"}
        remove_field => ["message"]
        remove_tag => ["_grokparsefailure"]
      }
      mutate {
        strip => ["additional_log"]
        remove_tag => ["test"]
        remove_field => ["message"]
      }
    }
  } else {
    drop {}
  }
}
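For the asker's own line format (Datetime=..., Value=...), the grok stage would need a different pattern. A hedged sketch, assuming the timestamp is ISO-8601-like and the value is free text:
filter {
  grok {
    # assumption: lines look like "Datetime=2013-11-21 14:32:01, Value=42"
    match => [ "message", "Datetime=%{TIMESTAMP_ISO8601:datetime}, Value=%{GREEDYDATA:value}" ]
  }
}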
Also, I would get the elasticsearch-head plugin to monitor your content in Elasticsearch, to easily verify the data and the state it is in.
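A sketch of installing it on an ES 0.90.x node (the plugin command and URL layout differ on newer versions):
bin/plugin -install mobz/elasticsearch-head
# then browse to http://localhost:9200/_plugin/head/ to inspect indices and documents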
