Appending Elasticsearch Data

I have Elasticsearch indices with the same name (logstash-2015.12.10) on different servers, each holding different data. Now I want to keep only one Elasticsearch cluster, so I need to append the data from both servers into a single index.
Is it possible to do it?

You could copy the index from one host into the same-named index on your other host using Logstash. With the configuration below, make sure to replace the source and target hosts to match your actual host names.
File: copylogs.conf
input {
  elasticsearch {
    hosts => "server1:9200"          # the host you want to copy from
    index => "logstash-2015.12.10"
  }
}
filter {
  mutate {
    remove_field => [ "@version", "@timestamp" ]
  }
}
output {
  elasticsearch {
    host => "server2"                # the host you want to copy to
    port => 9200
    protocol => "http"
    manage_template => false
    index => "logstash-2015.12.10"
  }
}
And then you can simply launch it with
bin/logstash -f copylogs.conf
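Once the run completes, one way to sanity-check the copy is to compare document counts on both hosts (assuming the default HTTP port and that both nodes are reachable from where you run this):

curl -s 'http://server1:9200/logstash-2015.12.10/_count'
curl -s 'http://server2:9200/logstash-2015.12.10/_count'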

Related

How does hit_cache_size in the logstash dns filter work?

I am using the dns filter in Logstash for my CSV file. In my CSV file, I have two fields: website and count.
Here's the sample content of my csv file:
website,n
www.google.com,n1
www.yahoo.com,n2
www.bing.com,n3
www.stackoverflow.com,n4
www.smackcoders.com,n5
www.zoho.com,n6
www.quora.com,n7
www.elastic.co,n8
Here's my logstash config file:
input {
  file {
    path => "/home/paulsteven/log_cars/cars_dns.csv"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  csv {
    separator => ","
    columns => ["website","n"]
  }
  dns {
    resolve => [ "website" ]
    action => "replace"
    hit_cache_size => 8000
    hit_cache_ttl => 300
    failed_cache_size => 1000
    failed_cache_ttl => 10
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "dnsfilter03"
    document_type => "details"
  }
  stdout {}
}
Here's the sample data passing through logstash:
{
      "@version" => "1",
          "path" => "/home/paulsteven/log_cars/cars_dns.csv",
       "website" => "104.28.5.86",
             "n" => "n21",
          "host" => "smackcoders",
       "message" => "www.smackcoders.com,n21",
    "@timestamp" => 2019-04-23T10:41:15.680Z
}
In the logstash config file, I want to know about hit_cache_size: what is it used for? I read the dns filter guide on the Elastic website but was unable to figure it out. I added the option to my Logstash config but nothing seemed to happen. Can I get an example of what hit_cache_size does in the dns filter?
The hit_cache_size allows you to store the result of a successful request, so when the filter needs to resolve the same host again, it will look into the cache first and only do a DNS lookup if the host is not cached.
If your data contains only unique hosts, then there is no reason to set hit_cache_size, since each host appears only once.
The hit_cache_ttl works together with hit_cache_size and specifies how many seconds a result will be kept in the cache.
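To make that concrete, here is the dns filter from the question again with the cache options annotated (the comments are my reading of the documented behaviour, not output from the plugin):

filter {
  dns {
    resolve => [ "website" ]
    action => "replace"
    hit_cache_size => 8000      # keep up to 8000 successfully resolved names in memory
    hit_cache_ttl => 300        # drop a cached result after 300 seconds and resolve again
    failed_cache_size => 1000   # also remember names that failed to resolve
    failed_cache_ttl => 10      # but retry those after only 10 seconds
  }
}

With only 8 distinct websites in the sample file, a hit_cache_size of 8 would already be enough; the cache only shows a visible effect when the same host appears many times in the input, where it saves one network lookup per repeat.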

How to send different logstash event to different output

There are many fields that are extracted from the message field in the Logstash filter section, like below:
match => ["message", "%{type1:f1} %{type2:f2} %{type3:f3}"]
The purpose is to send f1, f2 and f3 to one output, and only f1 and f3 to another output plugin, such that:
output {
  elasticsearch {
    action => "index"
    hosts => "localhost"
    index => "indx1-%{+YYYY-MM}"
    .
  }
}
output {
  elasticsearch {
    action => "index"
    hosts => "localhost"
    index => "indx2-%{+YYYY-MM}"
  }
}
The problem is that all events go to every output plugin, but I want to control which events go to which output plugin. Is it possible to do this?
I found a solution by using Filebeat to forward data to Logstash.
When running two instances of Filebeat and one instance of Logstash, each Filebeat forwards its input data to the same Logstash, but with a different type, like:
document_type: type1
In Logstash, the appropriate filter and output are then executed using an if clause:
filter {
  if [type] == "type1" {
  }
  else {
  }
}
output {
  if [type] == "type1" {
    elasticsearch {
      action => "index"
      hosts => "localhost"
      index => "%{type}-%{+YYYY.MM}"
    }
  }
  else {
    elasticsearch {
      action => "index"
      hosts => "localhost"
      index => "%{type}-%{+YYYY.MM}"
    }
  }
}
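For completeness, the Filebeat side of this setup would look roughly like the following (a sketch in the Filebeat 5.x configuration format, where document_type was still supported; paths and ports are placeholders):

filebeat.prospectors:
  - input_type: log
    paths:
      - /var/log/app1/*.log    # placeholder path for this Filebeat instance
    document_type: type1       # becomes [type] on the Logstash side

output.logstash:
  hosts: ["localhost:5044"]

The second instance would use document_type: type2, and the if clauses above then route each stream.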
If you have two distinct matching patterns in the "filter" section, then you can add specific "tags" for each match. Then in the output section use something like this:
if "matchtype1" in [tags] {
elasticsearch {
hosts => "localhost"
index => "indxtype1-%{+YYYY.MM}"
}
}
if "matchtype2" in [tags]{
elasticsearch {
hosts => "localhost"
index => "indxtype2-%{+YYYY.MM}"
}
}
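The tags themselves would be added in the filter section, for example via grok's add_tag option (a sketch; the patterns are placeholders for your two real match patterns, and add_tag only fires when the match succeeds):

filter {
  grok {
    match => ["message", "%{WORD:f1} %{WORD:f2} %{WORD:f3}"]
    add_tag => ["matchtype1"]   # tagged only when this pattern matches
  }
  grok {
    match => ["message", "%{WORD:f1} %{WORD:f3}"]
    add_tag => ["matchtype2"]
  }
}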

I want to delete documents via Logstash, but it throws an exception

I've run into a problem. My Logstash configuration file is as follows:
input {
  redis {
    host => "127.0.0.1"
    port => 6379
    db => 10
    data_type => "list"
    key => "local_tag_del"
  }
}
filter {
}
output {
  elasticsearch {
    action => "delete"
    hosts => ["127.0.0.1:9200"]
    codec => "json"
    index => "mbd-data"
    document_type => "localtag"
    document_id => "%{album_id}"
  }
  file {
    path => "/data/elasticsearch/result.json"
  }
  stdout {}
}
I want to read IDs from Redis via Logstash and have it tell Elasticsearch to delete the matching documents.
Please excuse my English; I hope someone can help me.
Thanks.
I can't help you in detail, because your problem is spelled out in your error message: Logstash couldn't connect to your Elasticsearch instance.
That usually means one of:
Elasticsearch isn't running
Elasticsearch isn't bound to localhost
That has nothing to do with your Logstash config. Using Logstash to delete documents is a bit unusual though, so I'm not entirely sure this isn't an XY problem.
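A quick way to rule out both causes is to hit the HTTP endpoint Logstash is trying to use (assuming the default port from your config):

curl -s http://127.0.0.1:9200

If that returns the cluster banner JSON, Elasticsearch is up and reachable on localhost; if it fails, check that the service is running and that network.host in elasticsearch.yml allows connections from 127.0.0.1.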

How to move data from one Elasticsearch index to another using the Bulk API

I am new to Elasticsearch. How can I move data from one Elasticsearch index to another using the Bulk API?
I'd suggest using Logstash for this, i.e. you use one elasticsearch input plugin to retrieve the data from your index and another elasticsearch output plugin to push the data to your other index.
The Logstash config file would look like this:
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "source_index"           # the name of your source index
  }
}
filter {
  mutate {
    remove_field => [ "@version", "@timestamp" ]
  }
}
output {
  elasticsearch {
    host => "localhost"
    port => 9200
    protocol => "http"
    manage_template => false
    index => "target_index"           # the name of your target index
    document_type => "your_doc_type"  # make sure to set the appropriate type
    document_id => "%{id}"
    workers => 5
  }
}
After installing Logstash, you can run it like this:
bin/logstash -f logstash.conf
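For reference, the elasticsearch output batches documents into Bulk API calls under the hood. If you wanted to issue an equivalent request by hand, each document becomes an action line plus a source line in the newline-delimited body, roughly like this (a sketch using the classic typed bulk format; field names and IDs are placeholders):

POST /_bulk
{ "index" : { "_index" : "target_index", "_type" : "your_doc_type", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "target_index", "_type" : "your_doc_type", "_id" : "2" } }
{ "field1" : "value2" }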

How to efficiently move data from elasticsearch index (with one shard) to another index (with 5 shards)?

I have an Elasticsearch index which contains around 5 GB of data on a single node, in a single shard. Now I have created another index with the same settings as the older one, but with number_of_shards set to 5 instead of 1.
I am looking for the most efficient approach to copy the data from the older index to the newer one without any downtime.
I would suggest using Logstash for this. You could use the following configuration. Make sure to replace the source and target hosts, as well as the index and type names to match your local environment.
File: reindex.conf
input {
  elasticsearch {
    hosts => "localhost:9200"   # your source host
    index => "my_source_index"
  }
}
filter {
  mutate {
    remove_field => [ "@version", "@timestamp" ]
  }
}
output {
  elasticsearch {
    host => "localhost"         # your target host
    port => 9200
    protocol => "http"
    manage_template => false
    index => "my_target_index"
    document_type => "my_type"
    workers => 5
  }
}
And then you can simply launch it with
bin/logstash -f reindex.conf
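Note that manage_template => false keeps Logstash from managing mappings, so the five-shard target index must exist before the run. The question says it was already created; if you ever need to recreate it, the call would be roughly as follows (a sketch assuming the target host from the config; the Content-Type header is only required on newer Elasticsearch versions):

curl -XPUT 'http://localhost:9200/my_target_index' \
     -H 'Content-Type: application/json' \
     -d '{ "settings": { "number_of_shards": 5 } }'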
