Elasticsearch - Duplicating Types?

I created an index in Elasticsearch, with a type t1 and documents doc1-docN. Is there a way, via an API call, to create a new type, t2, that contains the same documents as t1 (doc1 - docN)?

No magic API call for this; you need to reindex those documents. I suggest this blog post from one of the Elastic developers: http://david.pilato.fr/blog/2015/05/20/reindex-elasticsearch-with-logstash/
You'd need something along these lines:
input {
  elasticsearch {
    hosts => [ "localhost:9200" ]
    index => "test_index"
    size => 500
    scroll => "5m"
    docinfo => true
    query => '{"query":{"term":{"_type":{"value":"test_type_1"}}}}'
  }
}
filter {
  mutate {
    remove_field => [ "@timestamp", "@version" ]
  }
}
output {
  elasticsearch {
    host => "localhost"
    port => "9200"
    protocol => "http"
    index => "test_index"
    document_type => "test_type_2"
    document_id => "%{[@metadata][_id]}"
  }
}
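Assuming you save the pipeline above as, say, reindex.conf (the file name is just an example), you then run it once and let it scroll through the whole source type. Depending on your Logstash version that is something like:
bin/logstash -f reindex.conf
(older 1.x releases use bin/logstash agent -f reindex.conf)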

Related

Use non-whitelisted fields in a Logstash if statement

I have a Logstash configuration where I used prune to whitelist a few fields:
prune {
  whitelist_names => ["id","name"]
}
The problem is that I need to use an if condition in the output on a field other than id, e.g. "type". But since I have not whitelisted "type", the if condition is not working:
if ([type] in ["abc","efg"]) {
  elasticsearch {
    action => "update"
    hosts => [ "localhost:9200" ]
    index => "index"
    document_id => "%{id}"
    doc_as_upsert => true
  }
}
How can I use a non-whitelisted field in an if condition?
Before your prune filter, add a mutate filter to copy the value of the field you're going to delete (type) into a new metadata field. Then, prune. Then, use the new metadata field in your output condition.
...
filter {
  ...
  mutate {
    add_field => {
      "[@metadata][type]" => "%{type}"
    }
  }
  prune {
    whitelist_names => ["id","name"]
  }
  ...
}
output {
  if [@metadata][type] in ["abc","efg"] {
    elasticsearch {
      action => "update"
      hosts => [ "localhost:9200" ]
      index => "index"
      document_id => "%{id}"
      doc_as_upsert => true
    }
  }
}
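A nice property of this approach is that [@metadata] fields are never sent to outputs, so the copied value won't show up as an extra field in Elasticsearch. If you want to confirm that the metadata field is still populated after the prune, a temporary stdout output with the rubydebug codec can print metadata fields (a debugging sketch only, not part of the fix):
output {
  stdout { codec => rubydebug { metadata => true } }
}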

No mapping found for [@timestamp] in order to sort logstash

I am getting this error: "No mapping found for [@timestamp] in order to sort".
My conf file:
input {
  elasticsearch {
    hosts => ["localhost"]
    index => "employees_data"
    query => '{ "query": { "match_all": { } } }'
    scroll => "5m"
    docinfo => true
  }
}
filter {
  elasticsearch {
    hosts => ["localhost"]
    index => "transaction_data"
    query => "code:1"
    fields => {
      "code" => "Code"
      "payment" => "Payment"
      "moth" => "Month"
    }
  }
}
output {
  elasticsearch {
    hosts => ["localhost"]
    index => "join"
  }
}
This is because of the sort parameter of the elasticsearch filter plugin. If unspecified, it defaults to @timestamp:desc and you probably don't have that field.
Just make the following change and you should be good to go:
filter {
  elasticsearch {
    hosts => ["localhost"]
    index => "transaction_data"
    query => "code:1"
    sort => "code:asc"        # <--- add this line
    fields => {
      "code" => "Code"
      "payment" => "Payment"
      "moth" => "Month"
    }
  }
}

logstash 5.0.1: set up multiple elasticsearch index outputs for multiple kafka input topics

I have a Logstash input set up as:
input {
  kafka {
    bootstrap_servers => "zookeper_address"
    topics => ["topic1","topic2"]
  }
}
I need to feed the topics into two different indexes in Elasticsearch. Can anyone help me with how the output should be set up for such a task? At this time I am only able to set up:
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "my_index"
    codec => "json"
    document_id => "%{id}"
  }
}
I need two indexes on the same Elasticsearch instance, say index1 and index2, which will be fed by messages coming in on topic1 and topic2.
First, you need to add decorate_events to your kafka input in order to know which topic each message is coming from:
input {
  kafka {
    bootstrap_servers => "zookeper_address"
    topics => ["topic1","topic2"]
    decorate_events => true
  }
}
Then you have two options, both involving conditional logic. The first is to introduce a filter that adds the correct index name depending on the topic name. For this you need to add:
filter {
  if [kafka][topic] == "topic1" {
    mutate {
      add_field => {"[@metadata][index]" => "index1"}
    }
  } else {
    mutate {
      add_field => {"[@metadata][index]" => "index2"}
    }
  }
  # remove the field containing the decorations, unless you want them to land into ES
  mutate {
    remove_field => ["kafka"]
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "%{[@metadata][index]}"
    codec => "json"
    document_id => "%{id}"
  }
}
The second option is to do the if/else directly in the output section, like this (but the additional kafka field will land in ES):
output {
  if [kafka][topic] == "topic1" {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "index1"
      codec => "json"
      document_id => "%{id}"
    }
  } else {
    elasticsearch {
      hosts => ["localhost:9200"]
      index => "index2"
      codec => "json"
      document_id => "%{id}"
    }
  }
}

logstash elasticsearch output configuration based on inputs

Is there any way I can use the Logstash configuration file to vary the output for different types/indexes? For example:
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "index_resources"
    if(%{some_field_id}==kb){
      document_type => "document_type"
      document_id => "%{some_id}"
    }
    else {
      document_type => "other_document_type"
      document_id => "%{some_other_id}"
    }
  }
}
Yes, you can route your documents to multiple indexes within Logstash itself. Note that a conditional cannot go inside an elasticsearch block as in your example; it has to wrap the whole plugin block. Your output could look something like this:
output {
  stdout { codec => rubydebug }
  if [some_field_id] == "kb" {      # <---- insert your condition here
    elasticsearch {
      host => "localhost"
      protocol => "http"
      index => "index1"
      document_type => "document_type"
      document_id => "%{some_id}"
    }
  } else {
    elasticsearch {
      host => "localhost"
      protocol => "http"
      index => "index2"
      document_type => "other_document_type"
      document_id => "%{some_other_id}"
    }
  }
}
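An alternative that avoids repeating the whole elasticsearch block is to compute the routing values in a filter, stash them in @metadata fields, and reference them from a single output. This is only a sketch: some_field_id, some_id and some_other_id are the placeholders from the question, and [@metadata][target_index], [@metadata][doc_type] and [@metadata][doc_id] are made-up names.
filter {
  if [some_field_id] == "kb" {
    mutate {
      add_field => {
        "[@metadata][target_index]" => "index1"
        "[@metadata][doc_type]" => "document_type"
        "[@metadata][doc_id]" => "%{some_id}"
      }
    }
  } else {
    mutate {
      add_field => {
        "[@metadata][target_index]" => "index2"
        "[@metadata][doc_type]" => "other_document_type"
        "[@metadata][doc_id]" => "%{some_other_id}"
      }
    }
  }
}
output {
  elasticsearch {
    host => "localhost"
    protocol => "http"
    index => "%{[@metadata][target_index]}"
    document_type => "%{[@metadata][doc_type]}"
    document_id => "%{[@metadata][doc_id]}"
  }
}
Since @metadata is never sent to outputs, none of these helper fields end up in the indexed documents.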

Can I use the mutate filter in Logstash to convert some fields of a genjdbc input to integers?

I am using the genjdbc input plugin for Logstash to get data from a DB2 database. It works perfectly: I get all the database columns as fields in Kibana.
The problem I have is that in Kibana all fields are of string type, and I want the numeric fields to be integers. I have tried the following code, but the result is the same as if no filter clause existed.
Can someone help me solve this? Thanks in advance!
The logstash.conf code:
input {
  genjdbc {
    jdbcHost => "XXX.XXX.XXX.XXX"
    jdbcPort => "51260"
    jdbcTargetDB => "db2"
    jdbcDBName => "XXX"
    jdbcUser => "XXX"
    jdbcPassword => "XXX"
    jdbcDriverPath => "C:\...\db2jcc4.jar"
    jdbcSQLQuery => "SELECT * FROM XXX1"
    jdbcTimeField => "LOGSTAMP"
    jdbcPStoreFile => "C:\elk\logstash\bin\db2.pstore"
    jdbcURL => "jdbc:db2://XXX.XXX.XXX.XXX:51260/XXX"
    type => "table1"
  }
  genjdbc {
    jdbcHost => "XXX.XXX.XXX.XXX"
    jdbcPort => "51260"
    jdbcTargetDB => "db2"
    jdbcDBName => "XXX"
    jdbcUser => "XXX"
    jdbcPassword => "XXX"
    jdbcDriverPath => "C:\...\db2jcc4.jar"
    jdbcSQLQuery => "SELECT * FROM XXX2"
    jdbcTimeField => "LOGSTAMP"
    jdbcPStoreFile => "C:\elk\logstash\bin\db2.pstore"
    jdbcURL => "jdbc:db2://XXX.XXX.XXX.XXX:51260/XXX"
    type => "table2"
  }
}
filter {
  mutate {
    convert => [ "T1", "integer" ]
    convert => [ "T2", "integer" ]
    convert => [ "T3", "integer" ]
  }
}
output {
  if [type] == "table1" {
    elasticsearch {
      host => "localhost"
      protocol => "http"
      index => "db2_1-%{+YYYY.MM.dd}"
    }
  }
  if [type] == "table2" {
    elasticsearch {
      host => "localhost"
      protocol => "http"
      index => "db2_2-%{+YYYY.MM.dd}"
    }
  }
}
What you have should work, as long as the fields you are trying to convert to integer are named T1, T2 and T3 and you are inserting into an index that doesn't have any data yet. If you already have data in the index, you'll need to delete the index so that Logstash can recreate it with the correct mapping.
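For example, assuming the daily index names from the config above and a default local Elasticsearch, you could drop the existing indices with the delete index API before re-running Logstash (destructive, so only do this if you can afford to lose and reload that data):
curl -XDELETE 'http://localhost:9200/db2_1-*'
curl -XDELETE 'http://localhost:9200/db2_2-*'
(If wildcard deletes are disabled on your cluster, list the exact index names instead.)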
