How to debug input records in Elasticsearch cluster? - elasticsearch

where can I find all possible keys for logger.org.elasticsearch.*?
For example:
"logger.org.elasticsearch.transport"
"logger.org.elasticsearch.discovery"
I don't have access to source of logs agent (cannot configure debug). I would like to debug input raw records on elasticsearch side. But I can't find which keys I have to enable.
And of course I am aware about impact of debug on cluster perfomance.
I've already read this question.
How to set up logging level in Elasticsearch?

Related

How can I get statistics about what clients search for when querying Elasticsearch?

I'm using Elasticsearch to drive a "search website" feature. I'd like to collect statistics about what people search for (and which search queries are popular).
Elasticsearch is currently running behind Nginx, so I could extract this information from the Nginx access logs - but maybe Elasticsearch can be made to track this iinformation itself?
I found the Index stats API but that seems to be more abstract. It can be used to determne the average time needed to answer a query and such things, but it does not keep track of individual queries.
I am using a similar configuration (ES behind nginx), and I up to now I always just checked nginx' logfiles directly. However, thinking about your question, it makes much sense to route the nginx log files through the Elastic stack to Elastic Search using logstash, this seems to be the cleanest way.
Apparently in deprecated version there were some security auditing options using a plugin termed Shield or Security, but as I said, configuring logstash to ingest nginx logfiles directly seems most endurable way for your purposes.
Further reading and detailed instructions
discuss.elastic.co: How to get elaticsearch access logs
https://sysadmins.co.za/how-to-ingest-nginx-access-logs-to-elasticsearch-using-filebeat-and-logstash/
Elasticsearch Access Log
how to enable ElasticSearch http access log

Difference between using Filebeat and Logstash to push log file to Elasticsearch

I am trying out the ELK to visualise my log file. I have tried different setups:
Logstash file input plugin https://www.elastic.co/guide/en/logstash/current/plugins-inputs-file.html
Logstash Beats input plugin https://www.elastic.co/guide/en/logstash/current/plugins-inputs-beats.html with Filebeat Logstash output https://www.elastic.co/guide/en/beats/filebeat/current/logstash-output.html
Filebeat Elasticsearch output https://www.elastic.co/guide/en/beats/filebeat/current/elasticsearch-output.html
Can someone list out their differences and when to use which setup? If it is not for here, please point me to the right place like Super User or DevOp or Server Fault.
1) To use logstash file input you need a logstash instance running on the machine from where you want to collect the logs, if the logs are on the same machine that you are already running logstash this is not a problem, but if the logs are on remote machines, a logstash instance is not always recommended because it needs more resources than filebeat.
2 and 3) For collecting logs on remote machines filebeat is recommended since it needs less resources than a logstash instance, you would use the logstash output if you want to parse your logs, add or remove fields or make some enrichment on your data, if you don't need to do anything like that you can use the elasticsearch output and send the data directly to elasticsearch.
This is the main difference, if your logs are on the same machine that you are running logstash, you can use the file input, if you need to collect logs from remote machines, you can use filebeat and send it to logstash if you want to make transformations on your data, or send directly to elasticsearch if you don't need to make transformations on your data.
Another advantage of using filebeat, even on the logstash machine, is that if your logstash instance is down, you won't lose any logs, filebeat will resend the events, using the file input you can lose events in some cases.
An additional point for large scale application is that if you have a lot of Beat (FileBeat, HeartBeat, MetricBeat...) instances, you would not want them altogether open connection and sending data directly to Elasticsearch instance at the same time.
Having too many concurrent indexing connections may result in a high bulk queue, bad responsiveness and timeouts. And for that reason in most cases, the common setup is to have Logstash placed between Beat instances and Elasticsearch to control the indexing.
And for larger scale system, the common setup is having a buffering message queue (Apache Kafka, Rabbit MQ or Redis) between Beats and Logstash for resilency to avoid congestion on Logstash during event spikes.
Figures are captured from Logz.io. They also have a good
article on this topic.
Not really familiar with (2).
But,
Logstash(1) is usually a good choice to take a content play around with it using input/output filters, match it to your analyzers, then send it to Elasticsearch.
Ex.
You point the Logstash to your MySql which takes a row modify the data (maybe do some math on it, then Concat some and cut out some words then send it to ElasticSearch as processed data).
As for Logbeat(2), it's a perfect choice to pick up an already processed data and pass it to elasticsearch.
Logstash (as the name clearly states) is mostly good for log files and stuff like that. usually you can do tiny changes to those.
Ex. I have some log files in my servers (incl errors, syslogs, process logs..)
Logstash listens to those files, automatically picks up new lines added to it and sends those to Elasticsearch.
Then you can filter some things in elasticsearch and find what's important to you.
p.s: logstash has a really good way of load balancing too many data to ES.
You can now use filebeat to send logs to elasticsearch directly or logstash (without a logstash agent, but still need a logstash server of course).
Main advantage is that logstash will allow you to custom parse each line of the logs...whereas filebeat alone will simply send the log and there is not much separation of fields.
Elasticsearch will still index and store the data.

How to push performance test logs to kibana via elastic search

Is there a possibility to push the analysis report taken from the Performance Center to Logstash and visualize them in Kibana? I just wanted to automate the task of checking each vuser log file and then push errors to ELK stack. How can I retrieve the files by script and automate this. I can't get any direction on this because I need to automate the task of automatically reading from each vuser_log file.
Filebeat should be your tool to get done what you mentioned.
To automatically read entries you write in a file (could be a log file) you simply need a shipper tool which can be Filebeat (It integrates well with ELK stack. Logstash can also do the same thing though but that's heavy and requires JVM )
To do this in ELK stack you need following :
Filebeat should be setup on "all" instances where your main application is running- and generating logs.
Filebeat is simple lightweight shipper tool that can read your logs entries and then send them to Logstash.
Setup one instance of Logstash (that's L of ELK) which will receive events from Filebeat. Logstash will send data to Elastic Search
Setup one instance of Elastic Search (that's E of ELK) where your data will be stored
Setup one instance of Kibana (that's K of ELK). Kibana is the front end tool to view and interact with Elastic search via Rest calls
Refer following link for setting up above mentioned:
https://logz.io/blog/elastic-stack-windows/

Should I use elastic search for logging without logstash

I'm planning on using Elasticsearch to log all my application activities (like an audit log).
Considering how I have direct control over the application, should I directly push the data into Elasticsearch using their REST APIs or should I somehow use Logstash to feed data into Elasticsearch?
Is there any reason I should use Logstash when I can directly push data into Elasticsearch? It's an additional layer to manage.
If you need to parse different log formats (eventlog, syslog and so on), support different transports (UDP, TCP and so on) and log outputs use Logstash. If http is good for you and you collect logs only from one application use ES directly. Logstash is an additional tool. Details are here.

How to Analyze logs from multiple sources in ELK

I have started working on ELK recently and have a doubt regarding handling of multiple types of logs.
I have two sets of logs on my server that I want to analyse, one from my android application and the other from my website. I have successfully transferred logs from this server via filebeat to the ELK server.
I have created two filters for either types of logs and have successfully imported these logs into logstash and then Kibana.
This link helped do the above stuff.
https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-centos-7
The above link directs to use the logs in the filebeat index in Kibana and start analysing(I successfully did for one type of logs). But the problem that I am facing is that since both these logs are very different, they need to be analysed differently. How do I do this in Kibana. Should I create multiple filebeat indexes there and import them, or should it be just one single index, or some other way. I am not very clear on this(could not find much documentation), hence would request to please help and guide me here.
Elasticsearch organizes by index and type. Elastic used to compare these to SQL concepts, but now offers a new explanation.
Since you say that the logs are very different, Elastic is saying that you should use different indexes.
In Kibana, the visualization is tied to an index. If you had one panel from each index, you can show them both on the same dashboard.

Resources