Import MySQL entries into Elasticsearch via Logstash - Spring

I wanted to know how I can easily import data from my MySQL database into my Elasticsearch server using Logstash.
I have a Spring Boot app and want to import the data there.

In order to import data from a MySQL database into your Elasticsearch index, you should use the jdbc input plugin, as #hurb suggested above.
Your Logstash jdbc input could look like this:
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://yourhost:3306/yourdb"
    jdbc_user => "root"
    jdbc_password => "root"
    jdbc_validate_connection => true
    jdbc_driver_library => "/pathtojar/mysql-connector-java-5.1.39-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # schedule makes the query run repeatedly (cron-style); omit it to run the query only once
    schedule => "* * * * *"
    # change the query to your need
    statement => "SELECT * FROM yourtable"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}
The above is just a sample for you to reproduce and adapt. Hope it helps!
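If the goal is to get those rows into your Elasticsearch index (rather than just to stdout), a matching output section could look like the minimal sketch below, assuming Elasticsearch runs on localhost:9200 and using a made-up index name:

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "yourindex"       # made-up name, change to your index
    document_id => "%{id}"     # optional: assumes your table has an id column; reusing it lets repeated runs update documents instead of duplicating them
  }
}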

Related

Only one Elasticsearch jdbc input is executing

I have two jdbc inputs in my logstash.conf file. The file validates and starts fine and I can see the pipeline running.
The second query shows up in the log and processes fine, but the first jdbc input query never even tries to run (at least there are no references to it in the log).
I use an identical template for all of the jdbc settings, so I know that is correct. The only difference is the name of the statement_filepath, but both of those files execute fine in Toad and return data.
input {
  jdbc {
    jdbc_driver_library => "/iappl/confluent-4.1.1/share/java/kafka-connect-jdbc/ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "..."
    jdbc_user => "..."
    jdbc_password => "..."
    schedule => "*/30 * * * * *"
    statement_filepath => "/iappl/log_conf/current/configs/scania/sql/V02_INBOUNDLOAD.sql"
    type => "V02_INBOUND"
  }
  jdbc {
    jdbc_driver_library => "/iappl/confluent-4.1.1/share/java/kafka-connect-jdbc/ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "..."
    jdbc_user => "..."
    jdbc_password => "..."
    schedule => "*/30 * * * * *"
    statement_filepath => "/iappl/log_conf/current/configs/scania/sql/V02_OUTBOUNDLOAD.sql"
    type => "V02_OUTBOUND"
  }
}
In the log, the second query shows up on schedule, but the first one never does, and there is no mention of it failing in the log.
Ideas?
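Not an answer as such, but one thing that may be worth ruling out: by default every jdbc input stores its run state in the same $HOME/.logstash_jdbc_last_run file. Giving each input its own last_run_metadata_path keeps the two inputs from sharing state, along these lines (the path below is made up):

jdbc {
  # ... same settings as your existing input ...
  statement_filepath => "/iappl/log_conf/current/configs/scania/sql/V02_INBOUNDLOAD.sql"
  type => "V02_INBOUND"
  last_run_metadata_path => "/iappl/log_conf/.jdbc_last_run_v02_inbound"  # hypothetical per-input state file
}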

Logstash input plugin for PostgreSQL issue - duplication (ignoring last run state)

I am using the jdbc plugin to fetch data from a PostgreSQL database. A full export works fine and I am able to pull the data, but it does not respect the saved state: every run queries all of the data again and produces a lot of duplicates.
I checked .logstash_jdbc_last_run and the metadata state is updated as required, yet the plugin still imports the entire table on every run. Is anything wrong in the config?
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://x.x.x.x:5432/dodb"
    jdbc_user => "myuser"
    jdbc_password => "passsword"
    jdbc_validate_connection => true
    jdbc_driver_library => "/opt/postgresql-9.4.1207.jar"
    jdbc_driver_class => "org.postgresql.Driver"
    statement => "select id,timestamp,distributed_query_id,distributed_query_task_id, \"columns\"->>'uid' as uid, \"columns\"->>'name' as name from distributed_query_result;"
    schedule => "* * * * *"
    use_column_value => true
    tracking_column => "id"
    tracking_column_type => "numeric"
    clean_run => true
  }
}
output {
  kafka {
    topic_id => "psql-logs"
    bootstrap_servers => "x.x.x.x:9092"
    codec => "json"
  }
}
Any help is appreciated! Thanks in advance. I used the doc below for reference:
https://www.elastic.co/guide/en/logstash/current/plugins-inputs-jdbc.html
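For what it's worth, the jdbc input does not add an incremental WHERE clause on its own: the statement itself has to reference :sql_last_value, and clean_run => true additionally resets the saved value on every start. A minimal sketch of an incremental variant (query trimmed for clarity; adjust it to the real columns):

jdbc {
  jdbc_connection_string => "jdbc:postgresql://x.x.x.x:5432/dodb"
  jdbc_user => "myuser"
  jdbc_password => "passsword"
  jdbc_driver_library => "/opt/postgresql-9.4.1207.jar"
  jdbc_driver_class => "org.postgresql.Driver"
  # the plugin substitutes :sql_last_value with the value saved in .logstash_jdbc_last_run
  statement => "select id, timestamp, distributed_query_id, distributed_query_task_id from distributed_query_result where id > :sql_last_value"
  schedule => "* * * * *"
  use_column_value => true
  tracking_column => "id"
  tracking_column_type => "numeric"
  clean_run => false  # true would wipe the saved state before each run
}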

How to load data into AWS Elasticsearch using Logstash

I didn't find any proper documentation in the Logstash output plugins for loading data into AWS ES. I did find that the
output plugin only speaks the HTTP protocol. Can we load data into AWS ES without specifying port 9200?
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost/elasticsearch"
    jdbc_user => "root"
    jdbc_password => "empower"
    #jdbc_validate_connection => true
    jdbc_driver_library => "/home/wtc082/Documents/com.mysql.jdbc_5.1.5.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    statement => "SELECT * FROM index_part_content_local LIMIT 10;"
    schedule => "1 * * * *"
    #codec => "json"
  }
}
output {
  elasticsearch {
    index => "mysqltest"
    document_type => "mysqltest_type"
    document_id => "%{partnum}"
    hosts => "AWSURI"
  }
}
Can we do it like this?
Actually, I was using Logstash 2.4 to load data from MySQL into ES 5.x. When I switched to Logstash 5.x it solved my issue; I didn't get any errors while running the conf file.
Thanks Val
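For reference, an AWS Elasticsearch domain endpoint listens on HTTPS (port 443) rather than 9200, so the plain elasticsearch output can reach it as long as the domain's access policy permits unsigned requests from your client; for IAM-signed requests there is a separate logstash-output-amazon_es plugin. A minimal sketch with a hypothetical endpoint:

output {
  elasticsearch {
    hosts => ["https://search-mydomain-abc123.us-east-1.es.amazonaws.com:443"]  # hypothetical AWS ES endpoint
    index => "mysqltest"
    document_id => "%{partnum}"
  }
}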

Logstash indexes only one row of a SELECT query from MySQL to Elasticsearch

I am trying to index data from a MySQL database into Elasticsearch using Logstash. Logstash runs without errors, but the problem is that it indexes only one row from my SELECT query.
Below are the versions of the software I am using:
Elasticsearch: 2.4.1
Logstash: 5.1.1
MySQL: 5.7.17
jdbc_driver_library: mysql-connector-java-5.1.40-bin.jar
I am not sure if this is because the Logstash and Elasticsearch versions are different.
Below is my pipeline configuration:
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.40-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_password => "password"
    schedule => "* * * * *"
    statement => "SELECT * FROM employee"
    use_column_value => true
    tracking_column => "id"
  }
}
output {
  elasticsearch {
    index => "logstash"
    document_type => "sometype"
    document_id => "%{uid}"
    hosts => ["localhost:9200"]
  }
}
It seems like the tracking_column (id) which you're using in the jdbc input and the document_id (uid) in the output are different. Since your SELECT presumably returns an id column but no uid field, %{uid} is never substituted, so every event gets the same literal document ID and keeps overwriting the previous one, which would explain why only one document shows up. What if you make both of them the same? It is easier to fetch the records by id and push them into ES using that same id, which also reads more clearly:
document_id => "%{id}" # make sure you've got the exact spelling
Also, please try adding the following line to your jdbc input after tracking_column:
tracking_column_type => "numeric"
Additionally, to make sure that a leftover .logstash_jdbc_last_run file doesn't interfere when you run Logstash, include the following line as well:
clean_run => true
So this is how your jdbc input should look:
jdbc {
  jdbc_driver_library => "mysql-connector-java-5.1.40-bin.jar"
  jdbc_driver_class => "com.mysql.jdbc.Driver"
  jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
  jdbc_user => "user"
  jdbc_password => "password"
  schedule => "* * * * *"
  statement => "SELECT * FROM employee"
  use_column_value => true
  tracking_column => "id"
  tracking_column_type => "numeric"
  clean_run => true
}
Other than that the conf seems fine, unless you also want to use :sql_last_value in your statement so that only newly added records in your database table are fetched. Hope it helps!
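For completeness, here is a sketch of the matching output with the document ID aligned to that tracking column (assuming id is the primary key of the employee table):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "logstash"
    document_type => "sometype"
    document_id => "%{id}"  # same column as tracking_column above
  }
}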

Connect to multiple databases dynamically using Logstash JDBC Input plugin

I am using the Logstash JDBC input plugin to read data from a database and index it into Elasticsearch.
I have a separate database for each customer and I want to connect to them one by one dynamically to fetch data.
Is there any provision or parameter in the JDBC input plugin or Logstash to connect to multiple databases?
e.g.
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # MYDB will be set dynamically
    jdbc_connection_string => "jdbc:mysql://localhost:3306/MYDB"
    jdbc_user => "mysql"
    parameters => { "favorite_artist" => "Beethoven" }
    schedule => "* * * * *"
    statement => "SELECT * from songs where artist = :favorite_artist"
  }
}
The only solution I can think of is writing a script that points the Logstash config at each database in turn and runs Logstash with it, for example along the lines of the sketch below.
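As a sketch of that idea: recent Logstash versions support ${VAR} environment-variable references in the pipeline config, so one hypothetical approach is a single config that takes the database name from the environment plus a small wrapper that runs Logstash once per customer database (database names and file names below are made up):

# pipeline.conf - one config, database name injected via environment variable
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/${CUSTOMER_DB}"
    jdbc_user => "mysql"
    statement => "SELECT * from songs"
  }
}

# run_all.sh - wrapper script, one Logstash run per customer database
for db in customer1 customer2 customer3; do
  CUSTOMER_DB="$db" bin/logstash -f pipeline.conf
done

Without a schedule the jdbc input runs its statement once and Logstash exits, so the loop simply moves on to the next database.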
Let me update this:
for the same kind of purpose, I used two jdbc input sections, but only the first section is considered.
input {
  jdbc {
    jdbc_connection_string => "XXXX"
    jdbc_user => "XXXX"
    jdbc_password => "XXXX"
    statement => "select * from product"
    jdbc_driver_library => "/usr/share/logstash/ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
  }
  jdbc {
    jdbc_connection_string => "YYYY"
    jdbc_user => "YYYYY"
    jdbc_password => "YYYY"
    statement => "select * from product"
    jdbc_driver_library => "/usr/share/logstash/ojdbc7.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    user => "XXX"
    password => "XXXX"
    index => "XXXX"
    document_type => "XXXX"
  }
}