How to index a csv document in elasticsearch? - elasticsearch

I'm trying to upload some csv files in elastic search. I don't want to mess it up , so I'm writing for some guidance. Can someone help with a video/tutorial/documentation , of how to index a document in elastic search? I've read the official documentation, but I feel a bit lost as a begginer. It will be fine If you'll recommend me a video tutorial , or you'll describe me some steps. Hope you are all doing well! Thank you for your time !

The best way is to use Logstash, which is official and very fast pipeline for elastic ,you can download it from here
First of all create a configuration file as below example and save it as logstashExample.conf in bin directory of logstash.
With assuming that elastic server and kibana console are up and running , run the configuration file with this command "./logstash -f logstashExample.conf" .
I've also added a suitable example of related configuration file for Logstash , please change the index name in output and your file path in input with respect of your need, you can also disable filtering by removing csv components in below example.
input {
file {
path => "/home/timo/bitcoin-data/*.csv"
start_position => "beginning"
sincedb_path => "/dev/null"
}
}
filter {
csv {
separator => ","
#Date,Open,High,Low,Close,Volume (BTC),Volume (Currency),Weighted Price
columns => ["Date","Open","High","Low","Close","Volume (BTC)", "Volume (Currency)" ,"Weighted Price"]
}
}
output {
elasticsearch {
hosts => "http://localhost:9200"
index => "bitcoin-prices"
}
stdout {}
}

Related

Logstash doesn't import last line in txt file

I am trying to load a txt file into Elasticsearch with Logstash. The txt file is a simple text which consists of 3 lines, and it looks like this:
text file
My conf file looks like this:
input {
file {
path => "C:/Users/dinar/Desktop/myfolder/mytest.txt"
start_position => "beginning"
sincedb_path => "NULL"
}
}
output {
elasticsearch {
hosts => "localhost:9200"
index => "mydemo"
document_type => "intro"
}
stdout {}
}
After I run this and I go to Kibana, I can see the index being created. However, the only messages I see are the first two lines, and the last line is not displayed. This is what I see:
Kibana page
Does anybody know why the last line is not being imported and how I can fix this?
Thank you all for your help.
Based on the screenshot, my assumption is that you are editing the file manually. In that case please verify, there is a newline after the last log entry: Just hit Enter & Save.

Elasticsearch is slow to update mapping

Elasticsearch has become extremely slow when receiving input from Logstash, particularly when using file input. I have had to wait for 10+ minutes to get results. Oddly enough, standard input is mapped very quickly, which leads me to believe that my config file may be too complex, but it really isn't.
My config file uses file input, my filter is grok with 4 fields, and my output is to Elasticsearch. The file I am inputting is a .txt with 5 lines in it.
Any suggestions? Elasticsearch and Logstash newbie here.
input {
file {
type => "type"
path => "insert_path"
start_position = > "beginning'
}
}
filter {
grok { match => {"message" => "%{IP:client} %{WORD:primary} %{WORD:secondary} %{NUMBER:speed}" }
}
output {
elasticsearch {
hosts => ["localhost:9200"]
}
}
Thought: Is it a sincedb_path issue? I often try to reparse files without changing the filename.

Create index in kibana without using kibana

I'm very new to the elasticsearch-kibana-logstash and can't seem to find solution for this one.
I'm trying to create index that I will see in kibana without having to use the POST command in Dev Tools section.
I have set test.conf-
input {
file {
path => "/home/user/logs/server.log"
type => "test-type"
start_position => "beginning"
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "new-index"
}
}
and then
bin/logstash -f test.conf from logstash directory
what i get is that I can't find the new-index in kibana (index patterns section), when I use elasticsearch - http://localhost:9200/new-index/ it presents an error and when I go to http://localhost:9600/ (the port it's showing) it doesn't seem to have any errors
Thanks a lot for the help!!
It's obvious that you won't be able to find the index which you've created using logstash in Kibana, unless you're manually creating it there within the Managemen section of Kibana.
Make sure, that you have the same name of the indice which you created using logstash. Have a look at the doc, which conveys:
When you define an index pattern, indices that match that pattern must
exist in Elasticsearch. Those indices must contain data.
which pretty much says that the indice should exist for you to create the index in Kibana. Hope it helps!
I have actually succeeded to create index even without first creating it in Kibana
I used the following config file -
input {
file {
path => "/home/hadar/tmp/logs/server.log"
type => "test-type"
id => "NEWTRY"
start_position => "beginning"
}
}
filter {
grok {
match => { "message" => "%{YEAR:year}-%{MONTHNUM:month}-%{MONTHDAY:day} %{HOUR:hour}:%{MINUTE:minute}:%{SECOND:second} - %{LOGLEVEL:level} - %{WORD:scriptName}.%{WORD:scriptEND} - " }
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "new-index"
codec => line { format => "%{year}-%{month}-%{day} %{hour}:%{minute}:%{second} - %{level} - %{scriptName}.%{scriptEND} - \"%{message}\"" }
}
}
I made sure that the index wasn't already in Kibana (I tried with other indexes names too just to be sure...) and eventually I did see the index with the log's info in both Kibana (I added it in the index pattern section) and Elasticsearch when I went to http://localhost:9200/new-index
The only thing I should have done was to erase the .sincedb_XXX files which are created under data/plugins/inputs/file/ after every Logstash run
OR
the other solution (for tests environment only) is to add sincedb_path=>"/dev/null" to the input file plugin which indicates to not create the .sincedb_XXX file
You can create directly index in elastic search using https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-create-index.html
and these indices can be used in Kibana.

JSON parser in logstash ignoring data?

I've been at this a while now, and I feel like the JSON filter in logstash is removing data for me. I originally followed the tutorial from https://www.digitalocean.com/community/tutorials/how-to-install-elasticsearch-logstash-and-kibana-elk-stack-on-ubuntu-14-04
I've made some changes, but it's mostly the same. My grok filter looks like this:
uuid #uuid and fingerprint to avoid duplicates
{
target => "#uuid"
overwrite => true
}
fingerprint
{
key => "78787878"
concatenate_sources => true
}
grok #Get device name from the name of the log
{
match => { "source" => "%{GREEDYDATA}%{IPV4:DEVICENAME}%{GREEDYDATA}" }
}
grok #get all the other data from the log
{
match => { "message" => "%{NUMBER:unixTime}..." }
}
date #Set the unix times to proper times.
{
match => [ "unixTime","UNIX" ]
target => "TIMESTAMP"
}
grok #Split up the message if it can
{
match => { "MSG_FULL" => "%{WORD:MSG_START}%{SPACE}%{GREEDYDATA:MSG_END}" }
}
json
{
source => "MSG_END"
target => "JSON"
}
So the bit causing problems is the bottom, I think. My gork stuff should all be correct. When I run this config, I see everything in kibana displayed correctly, except for all the logs which would have JSON code in them (not all of the logs have JSON). When I run it again without the JSON filter it displays everything.
I've tried to use a IF statement so that it only runs the JSON filter if it contains JSON code, but that didn't solve anything.
However, when I added a IF statement to only run a specific JSON format (So, if MSG_START = x, y or z then MSG_END will have a different json format. In this case lets say I'm only parsing the z format), then in kibana I would see all the logs that contain x and y JSON format (not parsed though), but it won't show z. So i'm sure it must be something to do with how I'm using the JSON filter.
Also, whenever I want to test with new data I started clearing old data in elasticsearch so that if it works I know it's my logstash that's working and not just running of memory from elasticsearch. I've done this using XDELETE 'http://localhost:9200/logstash-*/'. But logstash won't make new indexes in elasticsearch unless I provide filebeat with new logs. I don't know if this is another problem or not, just thought I should mention it.
I hope that all makes sense.
EDIT: I just check the logstash.stdout file, it turns out it is parsing the json, but it's only showing things with "_jsonparsefailure" in kibana so something must be going wrong with Elastisearch. Maybe. I don't know, just brainstorming :)
SAMPLE LOGS:
1452470936.88 1448975468.00 1 7 mfd_status 000E91DCB5A2 load {"up":[38,1.66,0.40,0.13],"mem":[967364,584900,3596,116772],"cpu":[1299,812,1791,3157,480,144],"cpu_dvfs":[996,1589,792,871,396,1320],"cpu_op":[996,50]}
MSG_START is load, MSG_END is everything after in the above example, so MSG_END is valid JSON that I want to parse.
The log bellow has no JSON in it, but my logstash will try to parse everything after "Inf:" and send out a "_jsonparsefailure".
1452470931.56 1448975463.00 1 6 rc.app 02:11:03.301 Inf: NOSApp: UpdateSplashScreen not implemented on this platform
Also this is my output in logstash, since I feel like that is important now:
elasticsearch
{
hosts => ["localhost:9200"]
document_id => "%{fingerprint}"
}
stdout { codec => rubydebug }
I experienced a similar issue and found that some of my logs were using a UTC time/date stamp and others were not.
Fixed the code to use exclusively UTC and sorted the issue for me.
I asked this question: Logstash output from json parser not being sent to elasticsearch
later on, and it has more relevant information on it, maybe a better answer if anyone ever has a similar problem to me you can check out that link.

To copy an index from one machine to another in elasticsearch

I have some indexes in one of my machines. I need to copy them to another machine, how can i do that in elasticsearch.
I did get some good documentation here, but since im an newbie to elasticsearch ecosystem and since im toying with lesser data indices, I thought I would use some plugins or ways which would be less time consuming.
I would use Logstash with an elasticsearch input plugin and an elasticsearch output plugin.
After installing Logstash, you can create a configuration file copy.conf that looks like this:
input {
elasticsearch {
hosts => "localhost:9200" <--- source ES host
index => "source_index"
}
}
filter {
mutate {
remove_field => [ "#version", "#timestamp" ] <--- remove added junk
}
}
output {
elasticsearch {
host => "localhost" <--- target ES host
port => 9200
protocol => "http"
manage_template => false
index => "target_index"
document_id => "%{id}" <--- name of your ID field
workers => 1
}
}
And then after setting the correct values (source/target host + source/target index), you can run this with bin/logstash -f copy.conf
I can see 3 options here
Snapshot/Restore - You can move your data across geographical locations.
Logstash reindex - As pointed out by Val
Stream2ES - This is a more simpler solution
You can use Snapshot and restore feature as well, where you can take snapshot (backup) of one index and then can Restore to somewhere else.
Just have a look at
https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-snapshots.html

Resources