Collect log files from FTP into Logstash/Elasticsearch

I have been investigating the Elastic stack for collecting log files. As I understand it, Elasticsearch is used for storage and indexing, and Logstash for parsing the logs. There is also Filebeat, which can send the files to the Logstash server.
But it seems like this entire stack assumes that you have root access to the server that is producing the logs. In my case, I don't have root access, but I do have FTP access to the files. I looked at various input plugins for Logstash, but couldn't find anything suitable.
Is there a component of the Elastic system that can help with this setup, without requiring me to write (error-prone) custom code?

Maybe you can use the exec input plugin with curl. Something like:
input {
  exec {
    codec => plain { }
    command => "curl ftp://server/logs.log"
    interval => 3000
  }
}

Related

How to output logs in a containerized application?

Normally, in Docker/Kubernetes, it's recommended to output the logs directly to stdout.
Then we can use kubectl logs or docker logs to see the logs.
For example: time=123 action=write msg=hello world; on a TTY it might be colorized for human friendliness.
However, if we want to export the logs to a log-processing center such as EFK (Elasticsearch-Fluentd-Kibana), we need a JSON-format log file,
like: {"time":123,"action":"write","msg":"hello world"}
What I want
Is there a logging method that can take both human friendliness and JSON format into account?
I'm looking for a way such that if I use docker logs I get human-readable logs, while at the same time the log collector can still get the logs in JSON format.
Conclusion
Thanks to the answers below, I now have two methods (a sketch of method 1 follows this list):
1. Use a different log format in each environment:
   1.1 Use text format in development: docker logs will print colorized, human-readable logs.
   1.2 Use JSON format in production: EFK can process JSON format well.
2. Convert the format in the log collector:
   2.1 We use text format, but in a log collector like Fluentd we can define some scripts to translate text-format key-value pairs into JSON-format key-value pairs.
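A hypothetical sketch of method 1 in Python, switching the formatter on an assumed LOG_FORMAT environment variable (the variable name and the JSON fields are assumptions, not taken from the question):

"""Sketch: pick the log format from an environment variable so `docker logs`
stays human-readable in development while production emits JSON for EFK."""

import json
import logging
import os
import sys
import time


class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        # Assumed field names; adjust to whatever the collector expects.
        return json.dumps({
            "time": int(time.time()),
            "level": record.levelname,
            "msg": record.getMessage(),
        })


def build_logger() -> logging.Logger:
    handler = logging.StreamHandler(sys.stdout)        # always log to stdout for docker/k8s
    if os.getenv("LOG_FORMAT", "text") == "json":      # production: LOG_FORMAT=json
        handler.setFormatter(JsonFormatter())
    else:                                              # development: key=value text
        handler.setFormatter(logging.Formatter("time=%(asctime)s level=%(levelname)s msg=%(message)s"))
    logger = logging.getLogger("app")
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


if __name__ == "__main__":
    build_logger().info("hello world")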
Kubernetes has such an option of structured logging for its system components.
The klog library allows the --logging-format=json flag, which changes the format of the logs to JSON output; more information about it here and here.
Yes, you can do that with Fluentd. Below are the basic action items you need to take to finalize this setup (a sketch follows below):
1. Configure the Docker container to log to stdout (you can use any format you like).
2. Configure Fluentd to tail the Docker log files from /var/lib/docker/containers/*/*-json.log.
3. Parse the logs with Fluentd and change the format to JSON.
4. Output the logs to Elasticsearch.
This article shows exactly how to do this setup, and this one explains how to parse key-value logs.
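A minimal Fluentd sketch of steps 2 to 4, repurposing the built-in ltsv parser for the space-separated key=value example above (the host name, file paths, and tag are assumptions):

<source>
  @type tail
  path /var/lib/docker/containers/*/*-json.log
  pos_file /var/log/fluentd-docker.pos       # assumed position-file path
  tag docker.*
  <parse>
    @type json                               # the Docker json-file driver wraps each line in JSON
  </parse>
</source>

<filter docker.**>
  @type parser
  key_name log                               # the raw application line lives in the "log" field
  reserve_data true
  <parse>
    @type ltsv                               # repurposed: split on spaces and '='; values containing spaces will be cut
    delimiter " "
    label_delimiter "="
  </parse>
</filter>

<match docker.**>
  @type elasticsearch                        # requires fluent-plugin-elasticsearch
  host elasticsearch.example.local           # assumed host
  port 9200
  logstash_format true
</match>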

Filebeat: modify data, enriching JSON from other sources

The log format consists of JSON encoded line by line.
Each line is:
{data,payload:/local/path/to/file}
{data,payload:/another/file}
{data,payload:/a/different/file}
The initial idea is to configure Logstash to use the http input and write a Java (or anything) daemon that gets the file, parses it line by line, replaces the payload with the content of the file, and sends the data to Logstash.
I can't modify how the server works, so the log format can't be changed.
The Logstash machine is a different host, so there is no direct access to the files.
Logstash can't mount a shared folder from the server_host.
I can't open any port apart from a single port for Logstash, due to compliance requirements for the solution, which has to respect some silly rules that aren't under my control.
Now, to save some time and have something more reliable than a custom-made solution, is it possible to configure Filebeat to process every line of JSON before sending it to Logstash, turning each line into
{data,payload:content_of_the_file}
Filebeat won't be able to do advanced transformations of this kind, as it is only meant to forward logs; it can't even do the basic string processing that Logstash does. I suggest you write a custom script that does this transformation and writes the output to a different file (a rough sketch follows).
You can then use Filebeat to send the contents of this new file to Logstash.
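For example, a rough sketch of such a script in Python, assuming each line is valid JSON with a field really named "payload" holding a local path; the input and output paths are invented for illustration:

"""Enrich each JSON log line by inlining the file referenced in its "payload"
field, then write the result to a new file for Filebeat to ship to Logstash."""

import json

SRC = "/var/log/app/events.log"               # original line-by-line JSON log (assumed path)
DST = "/var/log/app/events-enriched.log"      # file Filebeat will tail (assumed path)


def enrich_line(line: str) -> str:
    event = json.loads(line)
    path = event.get("payload")
    if path:
        try:
            with open(path, "r", encoding="utf-8", errors="replace") as f:
                event["payload"] = f.read()   # replace the path with the file's content
        except OSError:
            event["payload_error"] = "could not read " + path
    return json.dumps(event)


def main() -> None:
    with open(SRC, "r", encoding="utf-8") as src, open(DST, "a", encoding="utf-8") as dst:
        for line in src:
            line = line.strip()
            if line:
                dst.write(enrich_line(line) + "\n")


if __name__ == "__main__":
    main()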

Logstash seemingly changes the Elasticsearch output URL

I have my Logstash configured with the following output:
output {
  elasticsearch {
    hosts => ["http://myhost/elasticsearch"]
  }
}
This is a valid URL, as I can issue cURL commands to Elasticsearch with it; for example,
curl "http://myhost/elasticsearch/_cat/indices?v"
returns my created indices.
However, when Logstash attempts to create a template, it uses the following URL:
http://myhost/_template/logstash
when I would expect it to use
http://myhost/elasticsearch/_template/logstash
It appears that the /elasticsearch portion of my URL is being chopped off. What's going on here? Is "elasticsearch" a reserved word in the URL that is removed? As far as I can tell, when I issue http://myhost/elasticsearch/elasticsearch, it attempts to find an index named "elasticsearch" which leads me to believe it isn't reserved.
Upon changing the endpoint URL to be
http://myhost/myes
Logstash is still attempting to access
http://myhost/_template/logstash
What might be the problem?
EDIT
Both Logstash and Elasticsearch are v5.0.0
You have not specified which version of Logstash you are using. If you are using one of the 2.x versions, you need to use the path => '/myes/' parameter to specify that your ES instance is behind a proxy; in 2.x, the hosts parameter was just a list of hosts, not URIs, so the path prefix has to be given separately (a sketch follows).
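A minimal sketch of that 2.x-style output, assuming the /elasticsearch prefix from the original question rather than /myes/:

output {
  elasticsearch {
    hosts => ["myhost"]           # 2.x: plain host names, not URIs
    path  => "/elasticsearch"     # proxy path prefix in front of Elasticsearch
  }
}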

Logstash not creating index on Elasticsearch

I'm trying to set up an ELK stack on an EC2 Ubuntu 14.04 instance. Everything installs and works just fine, except for one thing.
Logstash is not creating an index on Elasticsearch. Whenever I try to access Kibana, it wants me to choose an index from Elasticsearch.
Logstash is on the ES node, but the index is missing. Here's the message I get:
"Unable to fetch mapping. Do you have indices matching the pattern?"
Am I missing something? I followed this tutorial: Digital Ocean
EDIT: I've attached screenshots of the error I'm facing.
I got identical results on an Amazon AMI (a CentOS/RHEL clone).
In fact, exactly as above… until I injected some data into Elasticsearch, which creates the first day's index, and then Kibana starts working. My simple .conf is:
input {
  stdin {
    type => "syslog"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    host => "localhost"
    port => 9200
    protocol => "http"
  }
}
then
cat /var/log/messages | logstash -f your.conf
Why stdin, you ask? Well, it's not super clear anywhere (I'm also a new Logstash user and found this very unclear) that Logstash never terminates when using, for example, the file plugin; it's designed to keep watching.
But using stdin, Logstash will run, send the data to Elasticsearch (which creates the index), and then exit.
If I did the same thing above with the file input plugin, it would never create the index; I don't know why that is.
I finally managed to identify the issue. For some reason, port 5000 is being used by another service, which prevents Logstash from accepting any incoming connections. So all you have to do is edit the logstash.conf file and change the port from 5000 to 5001, or anything convenient.
Make sure all of your logstash-forwarders are sending the logs to the new port, and you should be good to go. If you have generated the logstash-forwarder.crt using the FQDN method, then the logstash-forwarder should be pointing to the same FQDN and not an IP (a sketch of the changed input follows).
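A minimal sketch of that port change, assuming the lumberjack input and the certificate paths used in the Digital Ocean tutorial:

input {
  lumberjack {
    port => 5001                                                    # moved off 5000, which another service was using
    type => "logs"
    ssl_certificate => "/etc/pki/tls/certs/logstash-forwarder.crt"  # assumed path from the tutorial
    ssl_key => "/etc/pki/tls/private/logstash-forwarder.key"        # assumed path from the tutorial
  }
}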
Is this Kibana 3 or 4?
If it's Kibana 4, can you click on Settings in the top-line menu, choose Indices, and then make sure that the index name contains 'logstash-*'; then click on the 'Time-field name' and choose '@timestamp'.
I've added a screenshot of my settings below; be careful which options you tick.

Can't ship log from local server [duplicate]

This question already has answers here: Old logs are not imported into ES by logstash (2 answers). Closed 8 years ago.
I have a problem related to Logstash and Elasticsearch.
When I try to ship logs via Logstash from a remote machine to my Elasticsearch server, there is no problem: indices are created.
But when I try to ship logs via Logstash from the server that is hosting Elasticsearch, no index is created; nothing happens.
The Logstash logging shows that Logstash sees the logs I'm trying to ship without any problem.
I can't figure out why this is happening.
Any idea?
Thanks a lot
ES version : 1.0.1
Logstash version : 1.4.0
Logstash config file:
input {
  file {
    type => "dmprocess"
    path => "/logs/mysql.log*"
  }
}
filter {
  grok {
    type => "dmprocess"
    match => [ "message", "%{DATESTAMP:processTimeStamp} %{GREEDYDATA} Extraction done for %{WORD:alias} took %{NUMBER:milliseconds:int} ms for %{NUMBER:rows:int} rows",
               "message", "%{DATESTAMP:processTimeStamp} %{GREEDYDATA} : %{GREEDYDATA} %{WORD:alias} took %{NUMBER:milliseconds:int} ms" ]
  }
  date {
    match => [ "processTimeStamp", "YY/MM/dd HH:mm:ss" ]
  }
}
output {
  elasticsearch {
    host => "devmonxde"
    cluster => "devcluster"
  }
}
UPDATE:
It seems that I am not able to ship logs via the file input to an Elasticsearch instance (remote or local) from a Linux host.
However, I am able to send data to ES via the stdin input, so there is no connection/port problem.
It works like a charm if I run Logstash with the same config, but from a Windows host.
The default behaviour on Windows seems to be "beginning", which looks like a contradiction with the doc http://logstash.net/docs/1.4.0/inputs/file#start_position
It seems that Logstash does not import old logs, even with start_position => "beginning" on the file input (see the sketch after the link below).
The problem is that my old logs are not imported into ES.
I'm creating another post for this.
Thanks
Old logs are not imported into ES by logstash
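For reference, a minimal sketch of the file input with start_position and an explicit sincedb_path (the sincedb file is what normally prevents already-seen files from being re-read; its path below is an assumption):

input {
  file {
    type => "dmprocess"
    path => "/logs/mysql.log*"
    start_position => "beginning"              # only applies to files Logstash has not seen before
    sincedb_path => "/var/tmp/.sincedb_mysql"  # assumed path; delete this file to force a full re-read
  }
}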
My first try would be to change the host to localhost, 127.0.0.1, or the external IP address, to make sure the hostname is not the problem.
Another thing would be to add an output that logs everything to the console, an easy way to check whether the messages are coming in and are parsed the right way (a sketch combining both follows).
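A minimal sketch combining both suggestions, reusing the question's output block with an IP address and a console output added:

output {
  stdout { codec => rubydebug }   # print every parsed event to the console for debugging
  elasticsearch {
    host => "127.0.0.1"           # try the IP (or localhost) instead of the hostname
    cluster => "devcluster"
  }
}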
