Use Clever Cloud drain with Elasticsearch target

I'm using Clever Cloud to host my application and I want to store its logs in an Elasticsearch index. I read the documentation and created a drain as it explains. The drain was created successfully, but no data has appeared in my Elasticsearch.
Does anybody have an idea?

I found the solution: I couldn't see the data because I was looking at the wrong ES index. Even if you specify an index in your URL, the logs are sent in Logstash format, so by default a new index is created per day, named logstash-YYYY-MM-DD. The data was in those indices.
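For anyone hitting the same problem, a quick way to spot where the logs went is to list all indices and look for the daily ones. Below is a minimal Python sketch, assuming you have already fetched and parsed the JSON from GET /_cat/indices?format=json. The helper name logstash_indices and the sample rows are made up for illustration.

```python
def logstash_indices(cat_indices_rows, prefix="logstash-"):
    """Return sorted (index name, document count) pairs for indices starting with prefix."""
    found = []
    for row in cat_indices_rows:
        name = row["index"]
        if name.startswith(prefix):
            # The JSON output of _cat/indices reports counts as strings.
            found.append((name, int(row.get("docs.count") or 0)))
    return sorted(found)


# Hypothetical sample of parsed rows; real output has more columns per row.
sample_rows = [
    {"index": "my-app-logs", "docs.count": "0"},
    {"index": "logstash-2020-04-08", "docs.count": "340"},
    {"index": "logstash-2020-04-07", "docs.count": "1250"},
]

for name, count in logstash_indices(sample_rows):
    print(name, count)
# logstash-2020-04-07 1250
# logstash-2020-04-08 340
```

If the index you named in the URL shows zero documents while the logstash- indices keep growing, you are in the situation described above.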

Related

How is data getting mapped in Elasticsearch in ELK?

I am new to ELK and in the process of learning it. In my project, data is imported from Amazon S3 -> Filebeat -> Logstash -> Elasticsearch -> Kibana.
The Logstash config file imports the data and sends it straight to Elasticsearch, something like below, and no index is mentioned in the config file:
    output {
      elasticsearch {
        hosts => ["http://localhost:9200"]
      }
    }
In Amazon S3, we have logs from Salesforce, and in future we are going to add logs from multiple sources.
In Elasticsearch, I can see that 41 indices are present (listed with a curl GET request). If we keep the same Logstash setup, logs from all sources will be sent to Elasticsearch in the same manner. I would like to know how the data gets mapped to a particular index in Elasticsearch.
In many tutorials, an index is given in the Logstash config file, so in Kibana we can see the index name along with a timestamp. I tried placing a sample MuleSoft log file in Amazon S3, but I am not able to find that data in Kibana. Do I need to create a new index named Mule, along with mappings?
There is no ELK expert on my project, so please guide me on how to approach this; any references would be helpful.
This page (https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html) documents Logstash's Elasticsearch output plugin.
As you can see in the Configuration Options section, the index option is not mandatory. If it is not specified, its default value is logstash-%{+YYYY.MM.dd}.
With that being said, the documents will get indexed into indices with the prefix 'logstash-' followed by the date of ingestion. For example:
logstash-2020.04.07
logstash-2020.04.08
Since someone in your organization has chosen to go with the default value, this option can be left out. This explains why you can't find a particular index name in the Logstash configuration. If you need to index documents into different indices, then you'd have to set a particular value for the index option.
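To make the default concrete, here is a rough Python sketch of how a name like logstash-2020.04.07 falls out of an event's timestamp. It only imitates the date pattern and is not Logstash code. The function name daily_index_name and the "mule" prefix are hypothetical, and the sketch assumes the date is taken from the event timestamp in UTC.

```python
from datetime import datetime, timedelta, timezone


def daily_index_name(event_time, prefix="logstash"):
    """Imitate the logstash-%{+YYYY.MM.dd} pattern for a timezone-aware timestamp."""
    utc_time = event_time.astimezone(timezone.utc)
    return f"{prefix}-{utc_time:%Y.%m.%d}"


late_evening = datetime(2020, 4, 7, 23, 30, tzinfo=timezone.utc)
print(daily_index_name(late_evening))          # logstash-2020.04.07
print(daily_index_name(late_evening, "mule"))  # mule-2020.04.07

# 01:00 at UTC+2 is still 23:00 on the previous day in UTC.
local_time = datetime(2020, 4, 8, 1, 0, tzinfo=timezone(timedelta(hours=2)))
print(daily_index_name(local_time))            # logstash-2020.04.07
```

Setting the index option to something like a per-source prefix plus the same date pattern would give each source its own daily indices. Note that index names have to be lowercase, so "Mule" would need to be "mule".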
Elasticsearch will automatically create these indices with a dynamic mapping (https://www.elastic.co/guide/en/elasticsearch/reference/current/dynamic-mapping.html) if you haven't set up an explicit mapping via index templates in advance. In order to see the data in Kibana, you first need to create an index pattern matching the index name.
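As a rough illustration of what dynamic mapping does, the Python sketch below guesses a field type from each JSON value, following the basic rules in the linked documentation. It is a simplification: date and numeric detection on strings are deliberately left out, nested objects are not descended into, and the function names are made up for this example.

```python
def guess_dynamic_type(value):
    """Guess the type dynamic mapping would pick for one JSON value (simplified)."""
    if value is None:
        return None  # null values do not add a field
    if isinstance(value, bool):  # checked before int, since bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):
        # Arrays take their type from the first non-null element.
        for item in value:
            if item is not None:
                return guess_dynamic_type(item)
        return None
    return "text"  # strings; Elasticsearch also adds a .keyword sub-field


def guess_mapping(document):
    """Map each top-level field of a document to its guessed type."""
    mapping = {}
    for field, value in document.items():
        field_type = guess_dynamic_type(value)
        if field_type is not None:
            mapping[field] = field_type
    return mapping


# Hypothetical log event
event = {"message": "Login failed", "status": 401, "duration": 0.25,
         "retry": False, "tags": [None, "auth"], "user": {"id": 7}, "trace": None}
print(guess_mapping(event))
```

If the guessed types are not what you want, that is the case for defining an explicit mapping in an index template before the first document arrives.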
I hope I could help you.

Index existing documents on startup

I'm new to Elasticsearch and this is a question I've been trying to find an answer to. Basically, I have around a thousand documents that I would like Elasticsearch to index for me. Do I have to write a bash/Python script that uses curl to PUT/POST all these documents to my Elasticsearch server, or can I configure the server so that it automatically indexes documents in a specific folder/location on disk when I start it up for the first time?
As far as I know, Elasticsearch does not have any option for pulling in documents to index by itself. As you mentioned, you need to write a script and push your documents to ES yourself.
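Such a script can be short. Here is a minimal Python sketch using only the standard library and the _bulk endpoint, assuming the documents are one JSON file each in a folder and that Elasticsearch listens on http://localhost:9200 without authentication. The function names, the folder layout and the index name "docs" are assumptions for illustration.

```python
import json
import urllib.request
from pathlib import Path


def load_documents(folder):
    """Read every *.json file in folder as one document."""
    return [json.loads(path.read_text(encoding="utf-8"))
            for path in sorted(Path(folder).glob("*.json"))]


def build_bulk_body(documents, index_name):
    """Build the newline-delimited body expected by the _bulk endpoint."""
    lines = []
    for doc in documents:
        lines.append(json.dumps({"index": {"_index": index_name}}))  # action line
        lines.append(json.dumps(doc))                                # document source
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


def send_bulk(body, base_url="http://localhost:9200"):
    """POST the bulk body and return the parsed response (assumed URL, no auth)."""
    request = urllib.request.Request(
        f"{base_url}/_bulk",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-ndjson"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Building the body needs no server, so it can be checked offline.
sample_body = build_bulk_body([{"title": "first"}, {"title": "second"}], "docs")
print(sample_body, end="")
```

With a running server you would call send_bulk(build_bulk_body(load_documents("some/folder"), "docs")) and check the "errors" field of the response. For around a thousand small documents a single bulk request is usually fine; for much larger sets you would send them in batches.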

Can Beats update existing documents in Elasticsearch?

Consider the following use case:
I want the information from one particular log line to be indexed into Elasticsearch, as a document X.
I want the information from some log line further down the log file to be indexed into the same document X (not overriding the original, just adding more data).
The first part, I can obviously achieve with filebeat.
For the second, does anyone have any idea about how to approach it? Could I still use filebeat + some pipeline on an ingest node for example?
Clearly, I can use the ES API to update the said document, but I was looking for some solution that doesn't require changes to my application - rather, it is all possible to achieve using the log files.
Thanks in advance!
No, this is not something that Beats were intended to accomplish. Enrichment like you describe is one of the things that Logstash can help with.
Logstash has an Elasticsearch input that would allow you to retrieve data from ES and use it in the pipeline for enrichment. And the Elasticsearch output supports upsert operations (update if exists, insert new if not). Using both those features you can enrich and update documents as new data comes in.
You might want to consider ingesting the log lines as is into Elasticsearch. Then, using Logstash, build a separate index that is entity-specific and driven by data from the logs.
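To show what "update if exists, insert new if not" means for the two log lines in the question, here is a small Python sketch. It builds the kind of request the Elasticsearch Update API accepts (a partial doc with doc_as_upsert) and imitates the merge with an in-memory dict instead of a real cluster. The index name, the document ID "req-42" and the field names are hypothetical.

```python
import json


def update_request(index_name, doc_id, partial_doc):
    """Return (path, body) for an update that inserts the document if it is missing."""
    path = f"/{index_name}/_update/{doc_id}"
    body = json.dumps({"doc": partial_doc, "doc_as_upsert": True})
    return path, body


def merge(target, partial):
    """Merge partial into target: nested objects are merged, other values replaced."""
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            merge(target[key], value)
        else:
            target[key] = value


def apply_upsert(store, doc_id, partial_doc):
    """Imitate an upsert against an in-memory store keyed by document ID."""
    existing = store.setdefault(doc_id, {})
    merge(existing, partial_doc)
    return existing


store = {}
# First log line creates document X, a later line adds to it without overwriting.
apply_upsert(store, "req-42", {"started_at": "10:00:01", "user": {"id": 7}})
apply_upsert(store, "req-42", {"finished_at": "10:00:05", "user": {"role": "admin"}})
print(store["req-42"])
print(update_request("requests", "req-42", {"finished_at": "10:00:05"}))
```

The key point is that both log lines must yield the same document ID, for example a request or session ID parsed from the line, so that the second event lands on the first one.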

Where / How ElasticSearch stores logs received from Logstash?

Disclaimer: I am very new to ELK Stack, so this question can be very basic.
I am setting up the ELK stack now and have some basic questions about Elasticsearch.
1. What storage model does Elasticsearch follow? For example, Oracle uses a relational model, Alfresco uses a "document model" and Apache Jackrabbit uses a "hierarchical model".
2. Is log data stored in Elasticsearch persistent/permanent, or does Elasticsearch delete log data after a certain period?
3. How do we manage/back up this data?
4. Are the log/data files in Elasticsearch human-readable?
Any help or pointer to documentation will be appreciated.
1. The storage model is a document model. Everything is a document. Documents are of a particular type and are stored in an index.
2. Data sent to ES is stored on disk. It can then be read, searched or deleted through a REST API.
3. The data is managed through the REST API. For log centralisation, logs are usually stored in date-based indices (one index for today, one for yesterday and so on), so to delete the logs from one day you delete the relevant index. Curator can help in this case. ES also offers a backup and restore module.
4. To access the data in ES, you'll have to use the REST API or the Kibana client.
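The date-based cleanup can be sketched in a few lines of Python. This is not how Curator is implemented, only an illustration of the idea: parse the date out of each index name and pick the ones older than a retention window. The prefix, the date format and the function name expired_indices are assumptions; adjust them to your own index naming.

```python
from datetime import date, datetime, timedelta


def expired_indices(index_names, today, keep_days,
                    prefix="logstash-", date_format="%Y.%m.%d"):
    """Return the date-based indices whose day is older than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # not one of the daily log indices
        try:
            index_day = datetime.strptime(name[len(prefix):], date_format).date()
        except ValueError:
            continue  # suffix is not a date, leave the index alone
        if index_day < cutoff:
            expired.append(name)
    return sorted(expired)


names = ["logstash-2020.04.01", "logstash-2020.04.07", "logstash-2020.04.08", ".kibana"]
print(expired_indices(names, today=date(2020, 4, 8), keep_days=3))
# ['logstash-2020.04.01']
```

Each name returned could then be removed with a DELETE request on that index through the REST API, which drops the whole day of logs at once.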
Documentation:
https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html

Does ElasticSearch store a duplicate copy of each record?

I started looking into ElasticSearch, and most examples of creating and reading involve POSTing data to the ElasticSearch server and then doing a GET to retrieve them.
Is the data that is POSTed stored separately by the ElasticSearch server? So, if I want to use ElasticSearch with MongoDB, does the raw data, not including the search indices, get stored twice (one copy for MongoDB and one for ElasticSearch)?
In conjunction with an answer to this question, a description or a link to a description of how ElasticSearch and the primary data store interact would be very helpful.
Yes, ElasticSearch can only search within its own data store, so a separate copy will be there.
You can use the mongodb connector to keep the data in elastic in sync with the mongo database: https://github.com/mongodb-labs/mongo-connector
