Kafka JDBC Connect (Source and Sink) and Informix

Is there any way to convert ":" to "." in the query constructed by the JDBC connector? The generated query looks like this:
"SELECT * FROM database : user : table WHERE database : user : table "
Connector configuration:
{
  "name": "jdbc_source_connector",
  "config": {
    "connector.class" : "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url" : "jdbc:informix-sqli://IP:PORT/databasa:informixserver=oninit;user=user;password=password",
    "topic.prefix" : "table-",
    "poll.interval.ms" : "100000",
    "mode" : "incrementing",
    "table.whitelist" : "table",
    "query.suffix" : ";",
    "incrementing.column.name" : "lp"
  }
}
ERROR:
[2021-02-03 14:03:02,809] INFO Begin using SQL query: SELECT * FROM database : user : table WHERE database : user : table : lp > ? ORDER BY database : user : table : lp ASC ; (io.confluent.connect.jdbc.source.TableQuerier:164)
[2021-02-03 14:03:02,853] ERROR Failed to run query for table TimestampIncrementingTableQuerier{table="database "." user "." table", query='null', topicPrefix='database-', incrementingColumn='lp', timestampColumns=[]}: {} (io.confluent.connect.jdbc.source.JdbcSourceTask:404)
java.sql.SQLSyntaxErrorException: A syntax error has occurred.
Connector session:
informix database:/usr/informix$ onstat -g ses 544271
IBM Informix Dynamic Server Version 12.10.FC13 -- On-Line -- Up 60 days 18:26:06 -- 4985440 Kbytes

session           effective                            #RSAM    total    used     dynamic
id       user     user      tty  pid      hostname     threads  memory   memory   explain
544271   user     -         -    1266352  kafkahost    1        172032   102672   off

Program :
Thread[id:175, name:task-thread-jdbc_source_connector_database, path:/app/kafka_2.13-2.7.0/plugins/kafka-connect-jdbc-10.0.1/lib/jdbc-4.50.4.1.jar]

tid      name     rstcb            flags    curstk   status
583042   sqlexec  7000000437451a8  Y--P---  6224     cond wait netnorm -

Memory pools    count 2
name       class  addr             totalsize  freesize  #allocfrag  #freefrag
544271     V      70000004f6c7040  167936     68592     113         35
544271*O0  V      700000065ad0040  4096       768       1           1

name         free   used      name         free   used
overhead     0      6656      scb          0      144
opentable    0      11352     filetable    0      1040
log          0      16536     temprec      0      22688
keys         0      816       gentcb       0      1592
ostcb        0      3472      sqscb        0      25128
hashfiletab  0      552       osenv        0      2056
sqtcb        0      9336      fragman      0      640
sapi         0      144       udr          0      520

sqscb info
scb              sqscb            optofc  pdqpriority  optcompind  directives
7000000343d3360  700000039194028  0       0            0           1

Sess    SQL        Current   Iso  Lock      SQL  ISAM  F.E.
Id      Stmt type  Database  Lvl  Mode      ERR  ERR   Vers  Explain
544271  -          database  CR   Not Wait  0    0     9.28  Off
Last parsed SQL statement :
SELECT * FROM database : user : table WHERE database : user : table :
lp > ? ORDER BY database : user : table : lp ASC

So the only workaround is to add the query option to the connector configuration; with an explicit query, the connector uses the statement as given instead of constructing the fully qualified table name itself.
Config after modifications:
"name": "jdbc_source_connector",
"config": {
"connector.class" : "io.confluent.connect.jdbc.JdbcSourceConnector",
"connection.url" : "jdbc:informix-sqli://IP:PORT/databasa:informixserver=oninit;user=user;password=password",
"topic.prefix" : "table-tablename",
"poll.interval.ms" : "100000",
"mode" : "incrementing",
"query" : "select * from table ",
"incrementing.column.name" : "lp"

Related

Interpreting the output of elasticsearch GET query

The output of the curl -X GET "localhost:9200/_cat/shards?v" is as follows:
index       shard  prirep  state       docs  store   ip         node
test_index  1      p       STARTED     0     283b    127.0.0.1  Deepaks-MacBook-Pro-2.local
test_index  1      r       UNASSIGNED  0
test_index  1      r       UNASSIGNED  0
test_index  0      p       STARTED     1     12.5kb  127.0.0.1  Deepaks-MacBook-Pro-2.local
test_index  0      r       UNASSIGNED  0
test_index  0      r       UNASSIGNED  0
And the output of the query curl -X GET "localhost:9200/test_index/_search?size=1000" | json_pp is as follows:
{
  "_shards" : {
    "failed" : 0,
    "skipped" : 0,
    "successful" : 2,
    "total" : 2
  },
  "hits" : {
    "hits" : [
      {
        "_id" : "101",
        "_index" : "test_index",
        "_score" : 1,
        "_source" : {
          "in_stock" : -4,
          "name" : "pizza maker",
          "prize" : 10
        },
        "_type" : "_doc"
      }
    ],
    "max_score" : 1,
    "total" : {
      "relation" : "eq",
      "value" : 1
    }
  },
  "timed_out" : false,
  "took" : 2
}
MY QUESTION: As you can see from the first query's output, only primary shard 0 of test_index holds the data. Why does the successful key inside _shards have the value 2?
Also, there is only one document, so why is the value of the total key inside _shards 2?
test_index has two primary shards and four unassigned replica shards (probably because you have a single node). Since a primary shard is a partition of your index, a document can only be stored in a single primary shard, in your case primary shard 0.
total: 2 means that the search was run over the two primary shards of your index (i.e. 100% of the data) and successful: 2 means that all primary shards responded. So you know you can trust the response to have searched over all your test_index data.
There's nothing wrong here.
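If you want to confirm that the unassigned shards are only replicas, the cluster health API shows this directly; a minimal sketch against the same local node:
# "yellow" means all primaries are active but some replicas are unassigned
curl -X GET "localhost:9200/_cluster/health/test_index?pretty"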

convert output of a query to json in oracle 12c

I have a query whose output is like this:
1 2 3 4 5 6 7 8 9 10 11 12 13
- - - - - - - - - - - - -
40 20 22 10 0 0 0 0 0 0 0 0 0
I want to convert the output to a single column, like below:
output
-----------
{"1":40,"2":20,"3":22,"4":10,"5":0,"6":0,"7":0,"8":0,"9":0,"10":0,"11":0,"12":0,"13":0}
You can use the JSON formatting 'trick' in SQL Developer.
Full scenario:
CREATE TABLE JSON_SO (
  "1" INTEGER,
  "2" INTEGER,
  "3" INTEGER,
  "4" INTEGER,
  "5" INTEGER,
  "6" INTEGER
);

INSERT INTO JSON_SO VALUES (40, 20, 22, 10, 0, 0);

select /*json*/ * from json_so;
And the output when executing with F5 (Execute as Script):
{
  "results": [
    {
      "columns": [
        { "name": "1", "type": "NUMBER" },
        { "name": "2", "type": "NUMBER" },
        { "name": "3", "type": "NUMBER" },
        { "name": "4", "type": "NUMBER" },
        { "name": "5", "type": "NUMBER" },
        { "name": "6", "type": "NUMBER" }
      ],
      "items": [
        {
          "1": 40,
          "2": 20,
          "3": 22,
          "4": 10,
          "5": 0,
          "6": 0
        }
      ]
    }
  ]
}
Note that the JSON output happens client-side via SQL Developer (this also works in SQLcl); I formatted the JSON output for display here using https://jsonformatter.curiousconcept.com/
This trick works with any version of Oracle Database that SQL Developer supports. If you want the database itself to format the result set as JSON, the JSON_OBJECT() function was introduced in Oracle Database 12cR2.
If you want the Oracle database server to return the result as JSON, you can run a query like this:
SELECT JSON_OBJECT('1' VALUE col1, '2' VALUE col2, '3' VALUE col3) FROM table
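For example, against the JSON_SO table created above, the server-side variant can be scripted from the shell through SQLcl; a minimal sketch (the connect string is a placeholder, and the quoted identifiers match the numeric column names used earlier):
# assumes SQLcl ("sql") on the PATH and Oracle Database 12cR2 or later
sql user/password@//dbhost:1521/pdb <<'EOF'
SELECT JSON_OBJECT(
         '1' VALUE "1", '2' VALUE "2", '3' VALUE "3",
         '4' VALUE "4", '5' VALUE "5", '6' VALUE "6"
       ) AS output
FROM json_so;
EOF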

Elasticsearch: deleting and recreating the same index results in incorrect data

I use elasticsearch-2.3.2
I created my index http://localhost:9200/github_inactivusr-2017.03.21
The command
curl http://localhost:9200/github_inactivusr-2017.03.21/_search
indicates I have a total of 7650 entries in my index
{
  "took" : 40,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 7650,
    "max_score" : 1.0,
    ...
}
I delete this index
curl -X DELETE http://localhost:9200/github_inactivusr-2017.03.21
I do get the message
{"acknowledged":true}
When I execute
curl http://localhost:9200/_cat/indices
I get
red open stats-new_format_membership-2016.08.12 5 1
red open stats-json3-jira-users-2017.07.13 5 1
yellow open github_activusr-2017.03.21 5 1 80495 0 16.6mb 16.6mb
yellow open github_activusr-2017.07.24 5 1 34697 0 9.3mb 9.3mb
The index github_inactivusr-2017.03.21 is no longer listed
I then recreate the index "github_inactivusr-2017.03.21" (exactly the same name and same mapping) again with 2550 entries
However, when I use the command curl http://localhost:9200/github_inactivusr-2017.03.21/_search,
I still get a total of 7650 entries.
After recreating the index, if I execute the command :
curl http://localhost:9200/_cat/indices
I get
red open stats-new_format_membership-2016.08.12 5 1
red open stats-json3-jira-users-2017.07.13 5 1
yellow open github_activusr-2017.03.21 5 1 80495 0 16.6mb 16.6mb
yellow open github_activusr-2017.07.24 5 1 34697 0 9.3mb 9.3mb
yellow open github_inactivusr-2017.03.21 5 1 7650 0 1.6mb 1.6mb
It is as if the index was not properly removed; even stopping and restarting Elasticsearch before recreating the index does not change this behaviour. It looks as though some cache is holding on to the old data.
Does anyone have any idea?
I found the root cause.
The problem was in the Logstash configuration file I used: it was injecting the same data three times, which explains the 7650 documents (2550 * 3).
Sorry about the time, and thanks @Val.
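For future reference, a quick way to cross-check how many documents actually landed in the index is the _count API; a minimal sketch (index name as in the question):
# should report 2550 after a single clean injection
curl 'http://localhost:9200/github_inactivusr-2017.03.21/_count?pretty'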

Get rid of unassigned shard

I have an ELK stack with two Elasticsearch nodes running, and the cluster state has turned red due to some unassigned shards which I can't get rid of. Looking up the unassigned shard, i.e. the incomplete index, with:
# curl -s elastic01.local:9200/_cat/shards | grep "logstash-2014.09.29"
Shows:
logstash-2014.09.29 4 p STARTED 745489 481.3mb 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 4 r STARTED 745489 481.3mb 10.165.98.106 Glenn Talbot
logstash-2014.09.29 0 p STARTED 781110 502.3mb 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 0 r STARTED 781110 502.3mb 10.165.98.106 Glenn Talbot
logstash-2014.09.29 3 p INITIALIZING 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 3 r UNASSIGNED
logstash-2014.09.29 1 p STARTED 762991 490.1mb 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 1 r STARTED 762991 490.1mb 10.165.98.106 Glenn Talbot
logstash-2014.09.29 2 p STARTED 761811 491.3mb 10.165.98.107 Crimson and the Raven
logstash-2014.09.29 2 r STARTED 761811 491.3mb 10.165.98.106 Glenn Talbot
My attempt to assign the shard to the other node fails:
curl -XPOST -s 'http://elastic01.local:9200/_cluster/reroute?pretty=true' -d '{
  "commands" : [ {
    "allocate" : {
      "index" : "logstash-2014.09.29",
      "shard" : 3,
      "node" : "Glenn Talbot",
      "allow_primary" : 1
    }
  } ]
}'
It fails with:
NO(primary shard is not yet active)]
I can't really find an API to push the shard state any further. How should I proceed here?
Just for a complete picture, that what the system health looks like:
{
  "cluster_name" : "logstash_es",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 114,
  "active_shards" : 228,
  "relocating_shards" : 0,
  "initializing_shards" : 1,
  "unassigned_shards" : 1
}
Thank you for your time and help
I actually ran into this situation with Elasticsearch 1.5 just the other day. After initially getting the same error, I simply repeated the /_cluster/reroute request the next day for lack of other ideas, and it worked, putting the cluster back into a green state immediately.
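If the reroute keeps being rejected while the primary is still INITIALIZING, it can also help to watch recovery progress and retry once the primary is active; a minimal sketch (hostname as in the question; verify the _cat/recovery endpoint exists in your Elasticsearch version):
# shard-level recovery progress for the affected index
curl -s 'http://elastic01.local:9200/_cat/recovery?v' | grep logstash-2014.09.29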

Elasticsearch jdbc river eats up entire memory

I am trying to index 16 million docs (47 GB) from a MySQL table into an Elasticsearch index. I am using jprante's elasticsearch-river-jdbc to do this. But after creating the river and waiting for about 15 minutes, the entire heap memory gets consumed, without any sign of the river running or docs getting indexed. The river used to run fine when I had around 10-12 million records to index. I have tried running the river 3-4 times, but in vain.
Heap memory preallocated to the ES process: 10 GB
elasticsearch.yml
cluster.name: test_cluster
index.cache.field.type: soft
index.cache.field.max_size: 50000
index.cache.field.expire: 2h
cloud.aws.access_key: BBNYJC25Dij8JO7YM23I(fake)
cloud.aws.secret_key: GqE6y009ZnkO/+D1KKzd6M5Mrl9/tIN2zc/acEzY(fake)
cloud.aws.region: us-west-1
discovery.type: ec2
discovery.ec2.groups: sg-s3s3c2fc(fake)
discovery.ec2.any_group: false
discovery.zen.ping.timeout: 3m
gateway.recover_after_nodes: 1
gateway.recover_after_time: 1m
bootstrap.mlockall: true
network.host: 10.111.222.33(fake)
river.sh
curl -XPUT 'http://--address--:9200/_river/myriver/_meta' -d '{
  "type" : "jdbc",
  "jdbc" : {
    "driver" : "com.mysql.jdbc.Driver",
    "url" : "jdbc:mysql://--address--:3306/mydatabase",
    "user" : "USER",
    "password" : "PASSWORD",
    "sql" : "select * from mytable order by creation_time desc",
    "poll" : "5d",
    "versioning" : false
  },
  "index" : {
    "index" : "myindex",
    "type" : "mytype",
    "bulk_size" : 500,
    "bulk_timeout" : "240s"
  }
}'
System properties:
16 GB RAM
200 GB disk space
Depending on your version of elasticsearch-river-jdbc (find out with ls -lrt plugins/river-jdbc/), this bug may already be fixed: https://github.com/jprante/elasticsearch-river-jdbc/issues/45
Otherwise, file a bug report on GitHub.
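Independent of that bug, note that MySQL Connector/J buffers the entire result set in memory by default; it only streams rows when the statement fetch size is set to Integer.MIN_VALUE. Later river versions expose a fetch-size option for exactly this; the "fetchsize" key below is an assumption, so verify the name against your river version's README:
# hypothetical sketch: recreate the river with a streaming-friendly fetch size
# ("fetchsize" : "min" is assumed to map to Integer.MIN_VALUE for MySQL -- verify)
curl -XPUT 'http://--address--:9200/_river/myriver/_meta' -d '{
  "type" : "jdbc",
  "jdbc" : {
    "driver" : "com.mysql.jdbc.Driver",
    "url" : "jdbc:mysql://--address--:3306/mydatabase",
    "user" : "USER",
    "password" : "PASSWORD",
    "sql" : "select * from mytable order by creation_time desc",
    "fetchsize" : "min",
    "poll" : "5d",
    "versioning" : false
  },
  "index" : { "index" : "myindex", "type" : "mytype", "bulk_size" : 500, "bulk_timeout" : "240s" }
}'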
