Logstash JDBC input plugin for MySQL

I am using Logstash on Windows. I was not able to install the JDBC input plugin, so I downloaded the zip file manually and placed the logstash folder from the plugin into my logstash-1.5.2 folder.
The folder structure: "D:\elastic search\logstash-1.5.2\lib\logstash\inputs\jdbc.rb".
My conf file:
input {
  jdbc {
    jdbc_driver_library => "D:/elastic search/logstash-1.5.2/lib/mysql-connector-java-5.1.13-bin.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/test"
    jdbc_user => "root"
    jdbc_password => ""
    statement => "SELECT * from data"
    jdbc_paging_enabled => "true"
    jdbc_page_size => "50000"
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    embedded => true
    index => "bike"
    type => "bikeapp"
    cluster => "trailcluster"
    protocol => "http"
    port => "9200"
  }
}
When I run Logstash I get this error:
D:\elastic search\logstash-1.5.2\bin>logstash -f logtest.conf
io/console not supported; tty will not be manipulated
jdbc plugin doesn't have a version. This plugin isn't well supported by the community and likely has no maintainer. {:level=>:warn}
You are using a deprecated config setting "type" set in elasticsearch. Deprecated settings will continue to work, but are scheduled for removal from logstash in the future. You can achieve this same behavior with the new conditionals, like: `if [type] == "sometype" { elasticsearch { ... } }`. If you have any questions about this, please visit the #logstash channel on freenode irc. {:name=>"type", :plugin=><LogStash::Outputs::ElasticSearch -->, :level=>:warn}
LoadError: no such file to load -- sequel
require at org/jruby/RubyKernel.java:1072
require at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/polyglot-0.3.5/lib/polyglot.rb:65
prepare_jdbc_connection at D:/elastic search/logstash-1.5.2/lib/logstash/plugin_mixins/jdbc.rb:65
register at D:/elastic search/logstash-1.5.2/lib/logstash/inputs/jdbc.rb:144
start_inputs at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.2.2-java/lib/logstash/pipeline.rb:148
each at org/jruby/RubyArray.java:1613
start_inputs at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.2.2-java/lib/logstash/pipeline.rb:147
run at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.2.2-java/lib/logstash/pipeline.rb:80
synchronize at org/jruby/ext/thread/Mutex.java:149
run at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.2.2-java/lib/logstash/pipeline.rb:80
execute at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.2.2-java/lib/logstash/agent.rb:150
run at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.2.2-java/lib/logstash/runner.rb:91
call at org/jruby/RubyProc.java:271
run at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/logstash-core-1.5.2.2-java/lib/logstash/runner.rb:96
call at org/jruby/RubyProc.java:271
initialize at D:/elastic search/logstash-1.5.2/vendor/bundle/jruby/1.9/gems/stud-0.0.20/lib/stud/task.rb:12

After adding the jar file to the plugin folder, go to that folder path in a command prompt and install the plugin into Logstash with the commands below.
Run in an installed Logstash:
Build your plugin gem:
gem build logstash-input-jdbc.gemspec
Install the plugin from the Logstash home:
bin/plugin install /your/local/plugin/logstash-input-jdbc.gem
Finally, start Logstash and test the plugin with the configuration you are using.
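The LoadError: no such file to load -- sequel in the trace is the symptom of the manual copy: the plugin's Ruby files are in place, but its gem dependencies (such as sequel) were never installed. Installing the plugin as a gem, as above, pulls those in. If the machine has internet access, a simpler variant (a sketch, assuming the default RubyGems source is reachable) is to install straight from RubyGems instead of from a local .gem file:

rem run from the Logstash home; on Windows the wrapper script is bin\plugin.bat
cd "D:\elastic search\logstash-1.5.2"
bin\plugin.bat install logstash-input-jdbc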

Related

Logstash not performing task

I want some data in a PostgreSQL database to be indexed into an Elasticsearch index. To do so I decided to use Logstash.
I installed Logstash and the JDBC input plugin.
I use the following config:
input {
  jdbc {
    jdbc_connection_string => "jdbc:postgresql://localhost:5432/product_data"
    jdbc_user => "postgres"
    jdbc_password => "<my_password>"
    jdbc_driver_class => "org.postgresql.Driver"
    schedule => "* * * * *" # cron schedule format (see "Helpful Links")
    statement => "SELECT * FROM public.vendor_product" # the PG query for retrieving the documents. IMPORTANT: no semicolon!
    jdbc_paging_enabled => "true"
    jdbc_page_size => "300"
  }
}
output {
  # used to output the values in the terminal (DEBUGGING)
  # once everything is working, comment out this line
  stdout { codec => "json" }
  # used to output the values into elasticsearch
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "vendorproduct"
    document_id => "document_%id"
    doc_as_upsert => true # upserts documents (e.g. if the document does not exist, creates a new record)
  }
}
As a test I scheduled this to run every minute. To run my test I did:
logstash.bat -f logstash_postgre_ES.conf --debug
On my console I get:
...
[2022-04-04T16:10:07,065][DEBUG][logstash.agent ] Starting puma
[2022-04-04T16:10:07,081][DEBUG][logstash.agent ] Trying to start WebServer {:port=>9600}
[2022-04-04T16:10:07,190][DEBUG][logstash.api.service ] [api-service] start
[2022-04-04T16:10:07,834][INFO ][logstash.agent ] Successfully started Logstash API endpoint {:port=>9600}
C:/Users/Admin/Desktop/Elastic_search/logstash-6.8.23/vendor/bundle/jruby/2.5.0/gems/rufus-scheduler-3.0.9/lib/rufus/scheduler/cronline.rb:77: warning: constant ::Fixnum is deprecated
[2022-04-04T16:10:09,577][DEBUG][logstash.instrument.periodicpoller.cgroup] One or more required cgroup files or directories not found: /proc/self/cgroup, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpu
[2022-04-04T16:10:09,948][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2022-04-04T16:10:09,951][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2022-04-04T16:10:11,717][DEBUG][logstash.pipeline ] Pushing flush onto pipeline {:pipeline_id=>"main", :thread=>"#<Thread:0xaf6cee2 sleep>"}
[2022-04-04T16:10:14,595][DEBUG][logstash.instrument.periodicpoller.cgroup] One or more required cgroup files or directories not found: /proc/self/cgroup, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpu
[2022-04-04T16:10:14,960][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
[2022-04-04T16:10:14,961][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
[2022-04-04T16:10:16,742][DEBUG][logstash.pipeline ] Pushing flush onto pipeline {:pipeline_id=>"main", :thread=>"#<Thread:0xaf6cee2 sleep>"}
[2022-04-04T16:10:19,604][DEBUG][logstash.instrument.periodicpoller.cgroup] One or more required cgroup files or directories not found: /proc/self/cgroup, /sys/fs/cgroup/cpuacct, /sys/fs/cgroup/cpu
The last part gets printed every 2 seconds or so; it seems to be waiting to start, though I let it run for several minutes and it kept printing the same lines. In Kibana I checked whether my index got created, but that wasn't the case.
The logstash-plain.log gives the same output as the console.
Why is no index created and filled with the PostgreSQL data?
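A quick way to double-check the index outside Kibana is Elasticsearch's cat indices API (a sketch, assuming the localhost:9200 host from the config):

curl "localhost:9200/_cat/indices?v"

If no vendorproduct row appears in the listing, no documents ever reached Elasticsearch, which points at the input side rather than at Kibana.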

Migrating 3 million records from Oracle to Elasticsearch using Logstash

We are trying to migrate around 3 million records from Oracle to Elasticsearch using Logstash.
We apply a couple of jdbc_streaming filters in our Logstash script: one to load connected nested objects and another to run a hierarchical query that loads data into a second nested object in the index.
We are able to index 0.4 million records in 24 hours; those 0.4 million records occupy around 300 MB.
We tried multiple approaches to move the data from Oracle into Elastic more quickly, but were not able to achieve the desired results.
Please find below the approaches we tried (the settings named here are sketched concretely after the list):
1. In the Logstash script we used the jdbc_fetch_size, jdbc_page_size, jdbc_paging_enabled and clean_run parameters, set the pipeline workers to 20, and set the pipeline batch size to 125 in the logstash.yml file.
2. On the Elastic side we set the number of replicas to 0 and the refresh interval to -1, tried increasing the value of the indices.memory.index_buffer_size parameter, and increased the number of watcher queues in the elasticsearch.yml file.
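For concreteness, the Logstash-side and index-side settings named above look roughly like this; the values are the ones from the question, while the host and index name are placeholders:

# logstash.yml
pipeline.workers: 20
pipeline.batch.size: 125

# Elasticsearch index settings for the duration of the bulk load
curl -X PUT 'http://<es_host>:9200/<index_name>/_settings' -H 'Content-Type: application/json' -d '{"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}'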
We basically googled around and followed various suggestions from this site and others, but nothing has worked so far.
We are using a single-node Elastic setup, and neither the DB nor the Elastic node is on the machine from which we run the Logstash script.
Please find below the Logstash config file:
input {
  jdbc {
    jdbc_driver_library => "LIB"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "connection url"
    jdbc_user => "user"
    jdbc_password => "pwd"
    statement => "select * from "
  }
}
filter {
  jdbc_streaming {
    jdbc_driver_library => "LIB"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "connection url"
    jdbc_user => "user"
    jdbc_password => "pwd"
    #statement => "select claimnumber,claimtype,is_active from claim where policynumber = :policynumber"
    parameters => {"policynumber" => "policynumber"}
    target => "nested node"
  }
}
filter {
  jdbc_streaming {
    jdbc_driver_library => "LIB"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "connection url"
    jdbc_user => "user"
    jdbc_password => "pwd"
    statement => "select listagg(column name,'/' ) within group(order by column name) from
                  where LEVEL > 1
                  start with =:
                  connect by prior = "
    parameters => {"p1" => "p1"}
    target => "nested node1"
  }
}
output {
  # stdout is an output plugin, so it belongs here rather than in a filter block
  stdout { codec => json }
  elasticsearch {
    hosts => [""]
    index => "<index_name>"
    document_id => "%{doc_id}"
  }
}
Can you please help us identify the bottlenecks and suggest how to increase indexing performance?
Thank you

Logstash not updating last run metadata file

In my Logstash I want to pull the most recent data from a database using :sql_last_value in the query and the tracking_column option in the conf file. I've set last_run_metadata_path because I have 2 pipelines for the same table, but Logstash saved the last date only once or stopped saving new dates, and now I can see in the logs that it runs queries with the same :sql_last_value from the metadata file.
This is how my conf file looks; it has many jdbc inputs and one of them is below:
jdbc {
  jdbc_driver_library => "/opt/logstash/lib/ojdbc8.jar"
  jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
  jdbc_connection_string => ""
  jdbc_user => ""
  jdbc_password => ""
  schedule => "*/15 * * * *"
  statement_filepath => "/etc/logstash/queries/UAT/transactions_UAT.sql"
  use_column_value => true
  tracking_column => "sys_created_on"
  tracking_column_type => "timestamp"
  last_run_metadata_path => "/etc/logstash/conf.d/lastrun_metadata/transactions_uat_metadata"
  tags => ["transactions_uat"]
}
Content of the metadata file:
--- 2018-05-26 08:41:55.000000000 -04:00
I can see in the logs that Logstash always uses the same date from the metadata file and never updates it:
select * from snc_uat.syslog_transaction0007
where "sys_created_on" >= TIMESTAMP '2018-05-26 08:41:55.000000 -04:00'
Logstash is working and is downloading recent data, but it unnecessarily reprocesses data that already exists. Why is Logstash not updating the metadata?
This is because your comparison operator is greater than or equal to, i.e. >=. Change it to > and it will fix your problem.
Hope it helps.
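Concretely, the change goes in the statement file referenced by statement_filepath; a sketch, using the table and tracking column from the question:

-- /etc/logstash/queries/UAT/transactions_UAT.sql
-- strictly greater than, so rows carrying exactly the saved
-- timestamp are not fetched again on the next scheduled run
select * from snc_uat.syslog_transaction0007
where "sys_created_on" > :sql_last_value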

Logstash: Error org.postgres.Driver not loaded

I need to get data from a PostgreSQL DB and index it into Elasticsearch. I am following this guide:
https://www.elastic.co/blog/logstash-jdbc-input-plugin
When I run /opt/logstash-2.3.3/bin/logstash -v -f es_table.logstash.conf I receive the following error:
Pipeline aborted due to error
{:exception=>#<LogStash::ConfigurationError: org.postgres.Driver not loaded.
Are you sure you've included the correct jdbc driver in :jdbc_driver_library?>, :backtrace=>["/opt/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-input-jdbc-3.0.2/lib/logstash/plugin_mixins/jdbc.rb:156:in `prepare_jdbc_connection'", "/opt/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-input-jdbc-3.0.2/lib/logstash/plugin_mixins/jdbc.rb:148:in `prepare_jdbc_connection'", "/opt/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-input-jdbc-3.0.2/lib/logstash/inputs/jdbc.rb:167:in `register'", "/opt/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:330:in `start_inputs'", "org/jruby/RubyArray.java:1613:in `each'", "/opt/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:329:in `start_inputs'", "/opt/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:180:in `start_workers'", "/opt/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/pipeline.rb:136:in `run'", "/opt/logstash-2.3.3/vendor/bundle/jruby/1.9/gems/logstash-core-2.3.3-java/lib/logstash/agent.rb:473:in `start_pipeline'"], :level=>:error}
Here is a piece of my Logstash configuration:
input {
  jdbc {
    jdbc_user => 'user'
    jdbc_driver_class => 'org.postgresql.Driver'
    jdbc_connection_string => 'jdbc:postgresql://1.1.1.1:5432/db'
    lowercase_column_names => false
    clean_run => false
    jdbc_driver_library => '/usr/share/java/postgresql-jdbc4.jar'
    jdbc_password => 'pass'
    jdbc_validate_connection => true
    jdbc_page_size => 1000
    jdbc_paging_enabled => true
    statement => 'SELECT * FROM "table"'
    type => 'table'
  }
...
The jdbc4 driver exists. I tried jdbc3 too without success.
ls /usr/share/java | grep postgresql-jdbc
postgresql-jdbc3-9.2.jar
postgresql-jdbc3.jar
postgresql-jdbc4-9.2.jar
postgresql-jdbc4.jar
The Driver class is inside:
jar tf /usr/share/java/postgresql-jdbc4.jar | grep -i driver
org/postgresql/Driver$1.class
org/postgresql/Driver$ConnectThread.class
org/postgresql/Driver.class
org/postgresql/util/PSQLDriverVersion.class
META-INF/services/java.sql.Driver
The port 5432 is open:
telnet 192.168.109.108 5432
Trying 192.168.109.108...
Connected to 192.168.109.108.
Escape character is '^]'.
Authentication to the DB works.
The problem was that I had made a mistake in the driver name.
I wrote jdbc_driver_class => 'org.postgres.Driver'
and the correct name is jdbc_driver_class => 'org.postgresql.Driver'
I resolved this issue by following the workaround suggested in this issue.
Reason:
This is a known problem with the module changes in JDK 9 (Jigsaw). The classloaders have seen some changes, and a workaround we added earlier for some driver loading now fails. The jdbc input fails the same way on JDK 11 (9+). We are working on a fix.
Workaround that worked for me:
An "extreme" workaround is to copy the driver file into the logstash-core/lib/jars/ directory. These jars get added to the correct JDK classpath as Logstash is started via Java.

Pass "yes" to command given in a text file

I'm using the rbvmomi gem to automate vSphere in Ruby. I'm using the VMware API StartProgramInGuest to run commands. The commands are given in a text file which is passed as an argument to GuestProgramSpec. One of the commands in the file requires a confirmation. Since the commands are passed in a text file, I'm not sure how to pass "yes" to that command. Any help would be appreciated.
gom = vim.serviceContent.guestOperationsManager
guest_auth = RbVmomi::VIM::NamePasswordAuthentication(
  :interactiveSession => false,
  :username => "user",
  :password => "pass"
)
prog_spec = RbVmomi::VIM::GuestProgramSpec(
  :programPath => "/opt/system/bin/ssh",
  :arguments => "-s /opt/system/etc/cli/default/main.par -f /home/admin/local.txt"
)
id = gom.processManager.StartProgramInGuest(
  :vm => vm, :auth => guest_auth, :spec => prog_spec
)
Contents of local.txt:
show version > /home/admin/version-1.txt
application upgrade appbundle.tar.gz local
show version > /home/admin/version-2.txt
