Delete data or documents from Elasticsearch using Logstash

I am trying to delete Elasticsearch data (documents) using a Logstash configuration, but the delete does not seem to be working.
I am using Logstash version 5.6.8.
Below is the Logstash configuration file:
```
input {
  jdbc {
    # db configuration
    # ...
    statement => "select * from table"
  }
}
output {
  elasticsearch {
    action => "delete"
    hosts => "localhost"
    index => "myindex"
    document_type => "doctype"
    document_id => "%{id}"
  }
  stdout { codec => json_lines }
}
```
But the above configuration only deletes the IDs that are present in my DB table; it does not delete the IDs that are no longer present.
When I sync from the DB to Elasticsearch using Logstash, I expect rows deleted from the DB to be synced as well, so that the two stay consistent.
I also tried the configuration below, but I get an error:
```
input {
  jdbc {
    # db configuration
    # ...
    statement => "select * from table"
  }
}
output {
  elasticsearch {
    action => "delete"
    hosts => "localhost"
    index => "myindex"
    document_type => "doctype"
  }
  stdout { codec => json_lines }
}
```
Error in the Logstash console:
```
"current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/stud-0.0.23/lib/stud/interval.rb:89:in `sleep'"}]}}
[2019-12-27T16:30:16,087][WARN ][logstash.shutdownwatcher ] {"inflight_count"=>9, "stalling_thread_info"=>{"other"=>[{"thread_id"=>22, "name"=>"[main]>worker0", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/stud-0.0.23/lib/stud/interval.rb:89:in `sleep'"}]}}
[2019-12-27T16:30:18,623][ERROR][logstash.outputs.elasticsearch] Encountered a retryable error. Will Retry with exponential backoff {:code=>400, :url=>"http://localhost:9200/_bulk"}
[2019-12-27T16:30:21,086][WARN ][logstash.shutdownwatcher ] {"inflight_count"=>9, "stalling_thread_info"=>{"other"=>[{"thread_id"=>22, "name"=>"[main]>worker0", "current_call"=>"[...]/vendor/bundle/jruby/1.9/gems/stud-0.0.23/lib/stud/interval.rb:89:in `sleep'"}]}}
```
Can someone tell me how to delete documents and keep the DB data in sync, i.e. how to handle deleted records in Elasticsearch?
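One common way to handle this (a minimal sketch, not a definitive solution: it assumes deletions are tracked on the database side, e.g. with a soft-delete flag column hypothetically named `deleted`, and that the primary key column is `id`) is to run a second pipeline that selects only the deleted rows and issues delete actions:
```
input {
  jdbc {
    # db configuration as above
    # select only the rows that were marked as deleted; the "deleted" flag
    # and "id" column names are hypothetical and must match your schema
    statement => "select id from table where deleted = 1"
  }
}
output {
  elasticsearch {
    action => "delete"
    hosts => "localhost"
    index => "myindex"
    document_type => "doctype"
    document_id => "%{id}"
  }
  stdout { codec => json_lines }
}
```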

Related

Logstash Elasticsearch plugin. Compare results from two sources

I have two deployed Elasticsearch clusters. The data should supposedly be the same in both clusters. My main aim is to compare the _source field of each Elasticsearch document between the source and target ES clusters.
I created a Logstash config in which I define an Elasticsearch input plugin that runs over each document in the source cluster, then an elasticsearch filter that looks up the target Elasticsearch cluster, queries it for the document by the _id taken from the source cluster, and matches the _source field of both documents.
Could you please help me implement such a config?
```
input {
  elasticsearch {
    hosts => ["source_cluster:9200"]
    ssl => true
    user => "user"
    password => "password"
    index => "my_index_pattern"
  }
}
filter {
  mutate {
    remove_field => ["@version", "@timestamp"]
  }
  elasticsearch {
    hosts => ["target_cluster:9200"]
    ssl => true
    user => "user"
    password => "password"
    query => ???????
    match _source field ????
  }
}
output {
  stdout { codec => rubydebug }
}
```
Maybe print some results of comparison...
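A rough sketch of what the filter section could look like, assuming `docinfo => true` is enabled on the elasticsearch input (so the source document's _id is available under @metadata) and using `field1` / `target_field1` as placeholder field names; this compares individual fields rather than the whole _source:
```
filter {
  mutate {
    remove_field => ["@version", "@timestamp"]
  }
  elasticsearch {
    hosts => ["target_cluster:9200"]
    ssl => true
    user => "user"
    password => "password"
    index => "my_index_pattern"
    # look up the same document in the target cluster by the _id taken
    # from the source cluster (requires docinfo => true on the input)
    query => "_id:%{[@metadata][_id]}"
    # copy the fields to be compared from the target document into the
    # event under new names (field names here are placeholders)
    fields => { "field1" => "target_field1" }
  }
  ruby {
    # flag events whose source and target values differ
    code => "event.set('mismatch', event.get('field1') != event.get('target_field1'))"
  }
}
```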

Logstash JDBC input plugin fetches data from MySQL multiple times

The Logstash jdbc input plugin fetches data from MySQL multiple times and keeps creating documents in Elasticsearch.
For 600 rows in MySQL, it created 8,581,812 documents in Elasticsearch.
I created multiple config files, one to fetch data from each table in MySQL, and put them in the /etc/logstash/conf.d folder.
I start the Logstash service with sudo systemctl start logstash.
Then I run the following command to execute a file:
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/spt_audit_event.conf
The data is fetched successfully.
```
input {
  jdbc {
    jdbc_driver_library => "/usr/share/jdbc_driver/mysql-connector-java-5.1.47.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://:3306/"
    jdbc_user => ""
    jdbc_password => ""
    statement => "select * from spt_identity"
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => ""
  }
  stdout {}
}
```
Actual results:
The number of documents in Elasticsearch keeps increasing and has reached 8,581,812, even though there are only 600 rows in the MySQL table.
Is it a bug in the plugin or am I doing something wrong?
You need to specify a unique id for Elasticsearch.
To avoid duplication issues in Elasticsearch, you need to give the documents a unique id.
Modify the logstash.conf by adding "document_id" => "%{studentid}" to the output, like below.
```
output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:9200"
    index => "test-migrate"
    document_id => "%{studentid}"
  }
}
```
In your case it won't be studentid but something else; find the corresponding field and add it to your configuration.
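For example, a complete minimal config might look like the sketch below, assuming the table's primary key column is named id (the connection details are placeholders):
```
input {
  jdbc {
    jdbc_driver_library => "/usr/share/jdbc_driver/mysql-connector-java-5.1.47.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://<host>:3306/<database>"
    jdbc_user => "<user>"
    jdbc_password => "<password>"
    statement => "select * from spt_identity"
  }
}
output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:9200"
    index => "test-migrate"
    # use the table's primary key (assumed here to be named "id") as the document id,
    # so repeated runs update existing documents instead of creating new ones
    document_id => "%{id}"
  }
}
```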

Creating/updating array of objects in elasticsearch logstash output

I am facing an issue using the Elasticsearch output with Logstash. Here is my sample event:
```
{
  "guid": "someguid",
  "nestedObject": {
    "field1": "val1",
    "field2": "val2"
  }
}
```
I expect the document with this id to already be present in Elasticsearch when the update happens.
Here is what I want my Elasticsearch document to look like after two upserts:
```
{
  "oldField": "Some old field from original document before upserts.",
  "nestedObjects": [
    {
      "field1": "val1",
      "field2": "val2"
    },
    {
      "field3": "val3",
      "field4": "val4"
    }
  ]
}
```
Here is my current Elasticsearch output setting:
```
elasticsearch {
  index => "elastictest"
  action => "update"
  document_type => "summary"
  document_id => "%{guid}"
  doc_as_upsert => true
  script_lang => "groovy"
  script_type => "inline"
  retry_on_conflict => 3
  script => "
    if (ctx._source.nestedObjects) {
      ctx._source.nestedObjects += event.nestedObject
    } else {
      ctx._source.nestedObjects = [event.nestedObject]
    }
  "
}
```
Here is the error I am getting:
```
response=>{"update"=>{"_index"=>"elastictest", "_type"=>"summary",
"_id"=>"64648dd3-c1e9-45fd-a00b-5a4332c91ee9", "status"=>400,
"error"=>{"type"=>"mapper_parsing_exception",
"reason"=>"failed to parse [event.nestedObject]",
"caused_by"=>{"type"=>"illegal_argument_exception",
"reason"=>"unknown property [field1]"}}}}
```
The issue turned out to be the mapping Elasticsearch had already generated from other documents with the same document_type, which had a conflicting type for the field nestedObject. This caused Elasticsearch to throw a mapper parsing exception; fixing the mapping conflict resolved the issue.

I want to delete documents with Logstash, but it throws an exception

I have a question. My Logstash configuration file is as follows:
```
input {
  redis {
    host => "127.0.0.1"
    port => 6379
    db => 10
    data_type => "list"
    key => "local_tag_del"
  }
}
filter {
}
output {
  elasticsearch {
    action => "delete"
    hosts => ["127.0.0.1:9200"]
    codec => "json"
    index => "mbd-data"
    document_type => "localtag"
    document_id => "%{album_id}"
  }
  file {
    path => "/data/elasticsearch/result.json"
  }
  stdout {}
}
```
I want to read the IDs from Redis with Logstash and tell Elasticsearch to delete the corresponding documents.
Excuse me, my English is poor; I hope someone can help me. Thanks.
I can't help you specifically, because your problem is spelled out in your error message: Logstash couldn't connect to your Elasticsearch instance.
That usually means one of:
Elasticsearch isn't running
Elasticsearch isn't bound to localhost
That has nothing to do with your Logstash config. Using Logstash to delete documents is a bit unusual though, so I'm not entirely sure this isn't an XY problem.

How to move data from one Elasticsearch index to another using the Bulk API

I am new to Elasticsearch. How can I move data from one Elasticsearch index to another using the Bulk API?
I'd suggest using Logstash for this, i.e. you use one elasticsearch input plugin to retrieve the data from your index and another elasticsearch output plugin to push the data to your other index.
The Logstash config file would look like this:
```
input {
  elasticsearch {
    hosts => "localhost:9200"
    index => "source_index"            # the name of your source index
  }
}
filter {
  mutate {
    remove_field => [ "@version", "@timestamp" ]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    manage_template => false
    index => "target_index"            # the name of your target index
    document_type => "your_doc_type"   # make sure to set the appropriate type
    document_id => "%{id}"
    workers => 5
  }
}
```
After installing Logstash, you can run it like this:
bin/logstash -f logstash.conf
