Logstash Amazon Output Elasticsearch Config Error - elasticsearch

I'm trying to send data from MySQL to Amazon Elasticsearch Service using the Logstash JDBC input, and I'm getting an error. My config db.conf is as follows:
input {
jdbc {
# MySQL JDBC connection string to our database
jdbc_connection_string => "jdbc:mysql://awsmigration.XXXXXXXX.ap-southeast-1.rds.amazonaws.com:3306/admin_slurp?zeroDateTimeBehavior=convertToNull"
# The user we wish to execute our statement as
jdbc_user => "root"
jdbc_password => "XXXXXX"
# The path to our downloaded jdbc driver
jdbc_driver_library => "/opt/logstash/drivers/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar"
# The name of the driver class for MySQL
jdbc_driver_class => "com.mysql.jdbc.Driver"
# our query
statement => "SELECT *, id as _id from Receipt"
jdbc_paging_enabled => true
jdbc_page_size => 200
}
}
output {
amazon_es {
hosts => ["https://search-XXXXXXXX.ap-southeast-1.es.amazonaws.com:443"]
region => "ap-southeast-1"
# aws_access_key_id, aws_secret_access_key optional if instance profile is configured
aws_access_key_id => 'XXXXXXXX'
aws_secret_access_key => 'XXXXXXXX'
index => "slurp_receipt"
}
}
The error:
fetched an invalid config {:config=>" jdbc {\n # Postgres jdbc connection string to our database, mydb\n jdbc_connection_string => \"jdbc:mysql://awsmigration.XXXXXXXX.ap-southeast-1.rds.amazonaws.com:3306/admin_slurp?zeroDateTimeBehavior=convertToNull\"\n # The user we wish to execute our statement as\n jdbc_user => \"dryrun\"\n jdbc_password => \"dryruntesting\"\n # The path to our downloaded jdbc driver\n jdbc_driver_library => \"/opt/logstash/drivers/mysql-connector-java-5.1.39/mysql-connector-java-5.1.39-bin.jar\"\n # The name of the driver class for Postgresql\n jdbc_driver_class => \"com.mysql.jdbc.Driver\"\n # our query\n statement => \"SELECT *, id as _id from Receipt\"\n\n jdbc_paging_enabled => true\n jdbc_page_size => 200\n }\n}\noutput {\n amazon_es {\n hosts => [\"https://search-XXXXXXXX-southeast-1.es.amazonaws.com:443\"]\n region => \"ap-southeast-1\"\n # aws_access_key_id, aws_secret_access_key optional if instance profile is configured\n aws_access_key_id => 'XXXXXXXX'\n aws_secret_access_key => 'XXXXXXXX'\n index => \"slurp_receipt\"\n }\n}\n\n\n", :reason=>"Expected one of #, input, filter, output at line 1, column 5 (byte 5) after ", :level=>:error}
I'm using Ubuntu 14 and Logstash 2.3.4.
How can I solve this?
Thank you

Related

Logstash JDBC adapter: Varbinary to UTF-8? (mysql to elastic import)

I'm trying to import a mysql table into elasticsearch via logstash. One column is of type "varbinary", which causes the following error:
[2018-10-10T12:35:54,922][ERROR][logstash.outputs.elasticsearch] An unknown error occurred sending a bulk request to Elasticsearch. We will retry indefinitely {:error_message=>"\"\\xC3\" from ASCII-8BIT to UTF-8", :error_class=>"LogStash::Json::GeneratorError", :backtrace=>["/usr/share/logstash/logstash-core/lib/logstash/json.rb:27:in `jruby_dump'", "/usr/share/logstash/vendor/$
My logstash config:
input {
jdbc {
jdbc_connection_string => "jdbc:mysql://localhost:3306/xyz"
# The user we wish to execute our statement as
jdbc_user => "test"
jdbc_password => "test"
# The path to our downloaded jdbc driver
jdbc_driver_library => "/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
# our query
statement => "SELECT * FROM x"
}
}
output {
stdout { codec => json_lines }
elasticsearch {
"hosts" => "localhost:9200"
"index" => "x"
"document_type" => "data"
}
}
How can I convert the varbinary to UTF-8? Do I have to use a special filter?
Alright... after spending hours on this, I found the solution right after posting this question:
columns_charset => { "column0" => "UTF8" }
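For reference, columns_charset is an option of the jdbc input, so it goes inside the jdbc { } block. A minimal sketch based on the config above ("column0" is a placeholder for the actual varbinary column name):
input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/xyz"
    jdbc_user => "test"
    jdbc_password => "test"
    jdbc_driver_library => "/mysql-connector-java-5.1.47/mysql-connector-java-5.1.47.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    statement => "SELECT * FROM x"
    # decode the problematic varbinary column instead of leaving it as ASCII-8BIT
    columns_charset => { "column0" => "UTF8" }
  }
}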
Alternatively, try adding the optional characterEncoding=utf8 parameter to the connection string:
jdbc_connection_string => "jdbc:mysql://localhost:3306/xyz?useSSL=false&useUnicode=true&characterEncoding=utf8&zeroDateTimeBehavior=convertToNull&autoReconnect=true"

Exception: LogStash::ConfigurationError

I am trying to connect to an Oracle database via Logstash and am getting the error below.
Error: oracle.jdbc.OracleDriver not loaded. Are you sure you've included the correct jdbc driver in :jdbc_driver_library?
Exception: LogStash::ConfigurationError
Stack: D:/softwares/logstash-6.2.4/logstash-6.2.4/vendor/bundle/jruby/2.3.0/gems/logstash-input-jdbc-4.3.9/lib/logstash/plugin_mixins/jdbc.rb:162:in `open_jdbc_connection'
Please find my logstash config file:
input {
jdbc {
jdbc_driver_library => "D:\data\ojdbc14.jar"
jdbc_driver_class => "oracle.jdbc.OracleDriver"
jdbc_connection_string => "jdbc:oracle:thin:#localhost:1521:xe"
jdbc_user => "user_0ne"
jdbc_password => "xxxyyyzzz"
statement => "SELECT * FROM PRODUCT"
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "my_index"
}
}
Corrected logstash config file:
input {
jdbc {
jdbc_driver_library => "D:\Karthikeyan\data\ojdbc14.jar"
jdbc_driver_class => "Java::oracle.jdbc.OracleDriver" // problem in this line is corrected
jdbc_connection_string => "jdbc:oracle:thin:#localhost:1521:xe"
jdbc_user => "vb"
jdbc_password => "123456"
statement => "SELECT * FROM VB_PRODUCT"
}
}
output {
elasticsearch {
hosts => ["localhost:9200"]
index => "my_index"
}
}
You can validate the configuration file using:
/usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/sample.conf --config.test_and_exit

Logstash is indexing only one row of select query from mysql to elastic search

I am trying to index data from a mysql db to elasticsearch using logstash. Logstash is running without errors, but the problem is that it indexes only one row from my SELECT query.
Below are the versions of the software I am using:
elasticsearch: 2.4.1
logstash: 5.1.1
mysql: 5.7.17
jdbc_driver_library: mysql-connector-java-5.1.40-bin.jar
I am not sure if this is because logstash and elasticsearch versions are different.
Below is my pipeline configuration:
input {
jdbc {
jdbc_driver_library => "mysql-connector-java-5.1.40-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
jdbc_user => "user"
jdbc_password => "password"
schedule => "* * * * *"
statement => "SELECT * FROM employee"
use_column_value => true
tracking_column => "id"
}
}
output {
elasticsearch {
index => "logstash"
document_type => "sometype"
document_id => "%{uid}"
hosts => ["localhost:9200"]
}
}
It seems the tracking_column (id) you're using in the jdbc input and the document_id (uid) in the output are different. Try making them the same, so records are fetched by id and pushed into ES under that same id, which is also easier to follow:
document_id => "%{id}" <-- make sure you've got the exact spellings
Also try adding the following line to your jdbc input, after tracking_column:
tracking_column_type => "numeric"
Additionally, to make sure a leftover .logstash_jdbc_last_run file doesn't interfere when you run Logstash, include the following line as well:
clean_run => true
So this is how your jdbc input should look:
jdbc {
jdbc_driver_library => "mysql-connector-java-5.1.40-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
jdbc_user => "user"
jdbc_password => "password"
schedule => "* * * * *"
statement => "SELECT * FROM employee"
use_column_value => true
tracking_column => "id"
tracking_column_type => "numeric"
clean_run => true
}
Other than that the conf seems fine, unless you want to use :sql_last_value so that only newly added records in your table are indexed (see the sketch below). Hope it helps!
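If you do want incremental loading later, here is a sketch of how :sql_last_value is typically combined with the tracking column (the employee table and id column are taken from the question; treat the query itself as an assumption about your schema):
jdbc {
  jdbc_driver_library => "mysql-connector-java-5.1.40-bin.jar"
  jdbc_driver_class => "com.mysql.jdbc.Driver"
  jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
  jdbc_user => "user"
  jdbc_password => "password"
  schedule => "* * * * *"
  # only fetch rows added since the last run
  statement => "SELECT * FROM employee WHERE id > :sql_last_value"
  use_column_value => true
  tracking_column => "id"
  tracking_column_type => "numeric"
  # leave clean_run at its default (false) so the last value persists between runs
}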

multiple inputs on logstash jdbc

I am using logstash jdbc to keep things synced between mysql and elasticsearch. It's working fine for one table, but now I want to do it for multiple tables. Do I need to open multiple terminals and run
logstash agent -f /Users/logstash/logstash-jdbc.conf
each with its own select query, or is there a better way so that multiple tables can be kept updated?
My config file:
input {
jdbc {
jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
jdbc_user => "root"
jdbc_password => "password"
schedule => "* * * * *"
statement => "select * from table1"
}
}
output {
elasticsearch {
index => "testdb"
document_type => "table1"
document_id => "%{table_id}"
hosts => "localhost:9200"
}
}
You can definitely have a single config with multiple jdbc inputs and then parametrize the index and document_type in your elasticsearch output depending on which table the event is coming from.
input {
jdbc {
jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
jdbc_user => "root"
jdbc_password => "password"
schedule => "* * * * *"
statement => "select * from table1"
type => "table1"
}
jdbc {
jdbc_driver_library => "/Users/logstash/mysql-connector-java-5.1.39-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/database_name"
jdbc_user => "root"
jdbc_password => "password"
schedule => "* * * * *"
statement => "select * from table2"
type => "table2"
}
# add more jdbc inputs to suit your needs
}
output {
elasticsearch {
index => "testdb"
document_type => "%{type}" # <- use the type from each input
hosts => "localhost:9200"
}
}
This will not create duplicate data, and it is compatible with Logstash 6.x.
# YOUR_DATABASE_NAME : test
# FIRST_TABLE : place
# SECOND_TABLE : things
# SET_DATA_INDEX : test_index_1, test_index_2
input {
jdbc {
# The path to our downloaded jdbc driver
jdbc_driver_library => "/mysql-connector-java-5.1.44-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
# MySQL JDBC connection string to our database, YOUR_DATABASE_NAME
jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
# The user we wish to execute our statement as
jdbc_user => "root"
jdbc_password => ""
schedule => "* * * * *"
statement => "SELECT #slno:=#slno+1 aut_es_1, es_qry_tbl.* FROM (SELECT * FROM `place`) es_qry_tbl, (SELECT #slno:=0) es_tbl"
type => "place"
add_field => { "queryFunctionName" => "getAllDataFromFirstTable" }
use_column_value => true
tracking_column => "aut_es_1"
}
jdbc {
# The path to our downloaded jdbc driver
jdbc_driver_library => "/mysql-connector-java-5.1.44-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
# MySQL JDBC connection string to our database, YOUR_DATABASE_NAME
jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
# The user we wish to execute our statement as
jdbc_user => "root"
jdbc_password => ""
schedule => "* * * * *"
statement => "SELECT #slno:=#slno+1 aut_es_2, es_qry_tbl.* FROM (SELECT * FROM `things`) es_qry_tbl, (SELECT #slno:=0) es_tbl"
type => "things"
add_field => { "queryFunctionName" => "getAllDataFromSecondTable" }
use_column_value => true
tracking_column => "aut_es_2"
}
}
# install uuid plugin 'bin/logstash-plugin install logstash-filter-uuid'
# The uuid filter allows you to generate a UUID and add it as a field to each processed event.
filter {
mutate {
add_field => {
"[#metadata][document_id]" => "%{aut_es_1}%{aut_es_2}"
}
}
uuid {
target => "uuid"
overwrite => true
}
}
output {
stdout {codec => rubydebug}
if [type] == "place" {
elasticsearch {
hosts => "localhost:9200"
index => "test_index_1_12"
#document_id => "%{aut_es_1}"
document_id => "%{[#metadata][document_id]}"
}
}
if [type] == "things" {
elasticsearch {
hosts => "localhost:9200"
index => "test_index_2_13"
document_id => "%{[#metadata][document_id]}"
# document_id => "%{aut_es_2}"
# you can set document_id explicitly; otherwise ES will generate a unique id.
}
}
}
If you need to run more than one pipeline in the same process, Logstash provides a way to do this through a configuration file called pipelines.yml, using multiple pipelines.
Using multiple pipelines is especially useful if your current configuration has event flows that don’t share the same inputs/filters and outputs and are being separated from each other using tags and conditionals.
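A minimal pipelines.yml sketch, assuming one config file per table (the pipeline ids and paths are placeholders; adjust them to your installation):
- pipeline.id: table1
  path.config: "/etc/logstash/conf.d/table1.conf"
- pipeline.id: table2
  path.config: "/etc/logstash/conf.d/table2.conf"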

Connect to multiple databases dynamically using Logstash JDBC Input plugin

I am using Logstash JDBC input plugin to read data from database and index it into Elastic Search.
I have a separate database for each customer, and I want to connect to them one by one dynamically to fetch data.
Is there any provision or parameter in JDBC-Input Plugin or Logstash to connect to multiple databases?
e.g.
input {
jdbc {
jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
jdbc_driver_class => "com.mysql.jdbc.Driver"
jdbc_connection_string => "jdbc:mysql://localhost:3306/MYDB"
# MYDB will be set dynamically.
jdbc_user => "mysql"
parameters => { "favorite_artist" => "Beethoven" }
schedule => "* * * * *"
statement => "SELECT * from songs where artist = :favorite_artist"
}
}
The only solution I can think of is writing a script that rewrites the Logstash config to point at each database in turn and then runs Logstash with it.
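One possible variation on that idea (not part of the original answer): Logstash supports ${VAR} environment-variable substitution in config files, so the database name could be parameterized and Logstash launched once per customer database. A rough sketch, reusing the settings from the question:
input {
  jdbc {
    jdbc_driver_library => "mysql-connector-java-5.1.36-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # ${CUSTOMER_DB} is read from the environment when Logstash starts
    jdbc_connection_string => "jdbc:mysql://localhost:3306/${CUSTOMER_DB}"
    jdbc_user => "mysql"
    parameters => { "favorite_artist" => "Beethoven" }
    schedule => "* * * * *"
    statement => "SELECT * from songs where artist = :favorite_artist"
  }
}
You would then run something like CUSTOMER_DB=customer1 bin/logstash -f mydb.conf once per database (the config filename is a placeholder), or loop over the database names in a wrapper script.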
Let me add an update: for the same kind of purpose, I used two jdbc input sections, but only the first section was considered.
input {
jdbc {
jdbc_connection_string => "XXXX"
jdbc_user => "XXXX"
jdbc_password => "XXXX"
statement => "select * from product"
jdbc_driver_library => "/usr/share/logstash/ojdbc7.jar"
jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
}
jdbc {
jdbc_connection_string => "YYYY"
jdbc_user => "YYYYY"
jdbc_password => "YYYY"
statement => "select * from product"
jdbc_driver_library => "/usr/share/logstash/ojdbc7.jar"
jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
}
}
output {
elasticsearch {
hosts => "localhost:9200"
user => "XXX"
password => "XXXX"
index => "XXXX"
document_type => "XXXX"
}
}