I have the following JSON on a topic that the JDBC connector publishes to:
{"APP_SETTING_ID":9,"USER_ID":10,"APP_SETTING_NAME":"my_name","SETTING_KEY":"my_setting_key"}
Here's my connector file
name=data.app_setting
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
poll.interval.ms=500
tasks.max=4
mode=timestamp
query=SELECT APP_SETTING_ID, USER_ID, APP_SETTING_NAME, SETTING_KEY FROM MY_TABLE with (nolock)
timestamp.column.name=LAST_MOD_DATE
topic.prefix=data.app_setting
key.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter=org.apache.kafka.connect.json.JsonConverter
value.converter.schemas.enable=false
I now want to add a key to this message by multiplying the two integer fields, APP_SETTING_ID and USER_ID, so the key for this message becomes 9*10 = 90.
Is this transformation possible through Connect, and if so, could someone please shed light on it?
I would try seeing how far you can get with
query=SELECT APP_SETTING_ID, APP_SETTING_NAME, SETTING_KEY, (APP_SETTING_ID*USER_ID) as _key FROM MY_TABLE with (nolock)
Then chain ValueToKey and ExtractField transforms:
transforms=AddKeys,ExtractKey
# this makes the key a map containing the _key field
transforms.AddKeys.type=org.apache.kafka.connect.transforms.ValueToKey
transforms.AddKeys.fields=_key
# this gets one field from the map
transforms.ExtractKey.type=org.apache.kafka.connect.transforms.ExtractField$Key
transforms.ExtractKey.field=_key
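With the sample message above, the record key then becomes the raw integer 90 (9*10). One caveat: ValueToKey only copies the field into the key, so _key also stays in the value. If you want it removed there, a ReplaceField transform can drop it (a sketch; the property is called blacklist on older Connect versions and exclude on newer ones):
transforms=AddKeys,ExtractKey,DropKeyField
transforms.DropKeyField.type=org.apache.kafka.connect.transforms.ReplaceField$Value
transforms.DropKeyField.exclude=_key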
Related
I have written the code snippet below. The values stored in the DB are correct, but when fetching carId after the insert, it always returns 1 instead of the actual value. I can't use order="AFTER" because I already use one order for generating the sequence, and I cannot use the annotation-based approach because our org's code structure does not allow it. Can someone please identify what I am doing wrong in the XML-based way of inserting and fetching data?
Note: using Oracle DB.
<insert id="insertIntoCar" parameterType="CarEntity" useGeneratedKeys="true" keyColumn="id" keyProperty="carId">
  <selectKey keyProperty="carId" resultType="long" order="BEFORE">
    SELECT CAR_ID_SEQ.nextval FROM DUAL
  </selectKey>
  INSERT INTO CAR (CAR_ID, CAR_TYPE, CAR_STATUS)
  VALUES (#{carId}, #{carType}, #{carStatus})
</insert>
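One thing worth checking, since the calling code isn't shown (so this is an assumption): a MyBatis insert method returns the number of affected rows, which for a single-row insert is always 1. With order="BEFORE", the sequence value is written into the keyProperty on the parameter object before the insert runs, so it has to be read from the entity afterwards, not from the method's return value. Also, useGeneratedKeys="true" is typically unnecessary when a selectKey is present. A minimal sketch, assuming a hypothetical CarMapper interface bound to this XML:
// Hypothetical mapper and entity; names follow the XML above
val car = new CarEntity()
car.setCarType("SUV")
car.setCarStatus("ACTIVE")
val rowsInserted: Int = carMapper.insertIntoCar(car) // returns 1 = affected row count, not the key
val newCarId: Long = car.getCarId                    // the sequence value from selectKey lands here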
I am using a Kafka source connector configuration to produce data from a source table (MariaDB) into a Kafka topic. For some tables, I am using mode='timestamp' in the connect config.
I have two fields, created_timestamp and updated_timestamp. I want to produce records based on the updated_timestamp field, but there is a chance that updated_timestamp might contain NULL values.
Currently, I am using the IFNULL function in the query parameter to select created_timestamp whenever updated_timestamp is NULL, but records are not flowing from the source table to the topic.
Kafka source connector configuration
curl -k -X POST https://:8083/connectors -H "Content-Type: application/json" -d '{
  "name": "TEST_SOURCECONNECT",
  "config": {
    "topic.prefix": "topic-name",
    "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
    "connection.url": "jdbc:mysql://db ip",
    "connection.user": "",
    "connection.password": "",
    "mode": "timestamp",
    "query": "SELECT t.* from (SELECT *, IFNULL(updated_timestamp, created_timestamp) as custom_timestamp FROM TABLE) t",
    "timestamp.column.name": "custom_timestamp",
    "validate.non.null": "false",
    "poll.interval.ms": 60000
  }
}'
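Since records are not flowing, it can help to run (an approximation of) the query the connector actually issues. In query mode with mode=timestamp, the JDBC source connector appends a WHERE clause on the timestamp column to the supplied query, roughly:
SELECT t.* from (
  SELECT *, IFNULL(updated_timestamp, created_timestamp) as custom_timestamp FROM TABLE
) t
WHERE custom_timestamp > ? AND custom_timestamp < ?
ORDER BY custom_timestamp ASC
If this returns rows when run manually with a recent lower bound but nothing reaches the topic, a timezone mismatch between the database session and the Connect worker (see the connector's db.timezone setting) is a common cause.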
I am sure this is the most common problem with Cassandra.
Nevertheless:
I have this example table:
CREATE TABLE test.test1 (
    a text,
    b text,
    c timestamp,
    id uuid,
    d timestamp,
    e decimal,
    PRIMARY KEY ((a), c, b, id)
) WITH CLUSTERING ORDER BY (c ASC, b ASC, id ASC);
My query:
select b, (toUnixTimestamp(d) - toUnixTimestamp(c))/1000/60/60/24/365.25 as age from test.test1 where a = 'x' and c > -2208981600000 ;
This works fine, but I can't get the data sorted by column b, which I need. I need all the entries in column b and their corresponding ages.
eg:
select b, (toUnixTimestamp(d) - toUnixTimestamp(c))/1000/60/60/24/365.25 as age from test.test1 where a = 'x' and c > -2208981600000 order by b;
gives the error:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Order by currently only supports the ordering of columns following their declared order in the PRIMARY KEY"
I have tried different orders in the clustering columns and different options in the partition key, but I keep getting caught by some restriction and just can't seem to outwit Cassandra to get what I want. If I get the sort order I want, I lose the ability to filter on column 'c'.
Is there some logic I am not applying here? Alternatively, what must I omit to get a list of entries in column b with the corresponding age?
Short answer - it's impossible to sort data on an arbitrary column using CQL, even if it's part of the primary key. Cassandra sorts data first by the first clustering column, then within each of its values by the second, etc. (see this answer).
So the only workaround right now is to fetch all data & sort on the client side.
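A minimal sketch of that client-side sort, using the DataStax Java driver from Scala (connection settings are assumed to come from the driver's configuration):
import com.datastax.oss.driver.api.core.CqlSession
import scala.jdk.CollectionConverters._

val session = CqlSession.builder().build()
val rs = session.execute(
  "select b, (toUnixTimestamp(d) - toUnixTimestamp(c))/1000/60/60/24/365.25 as age " +
  "from test.test1 where a = 'x' and c > -2208981600000")

// Materialize all rows in memory, then sort by b on the client
val sorted = rs.all().asScala.sortBy(_.getString("b"))
sorted.foreach(row => println(row.getString("b") + " -> " + row.getBigDecimal("age")))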
Can someone explain or show how Nifi's ExecuteSQLRecord would work with parameters? The documentation says:
If it is triggered by an incoming FlowFile, then attributes of that FlowFile will be available when evaluating the select query, and the query may use the ? to escape parameters. In this case, the parameters to use must exist as FlowFile attributes with the naming convention sql.args.N.type and sql.args.N.value,
where N is a positive integer. The sql.args.N.type is expected to be a number indicating the JDBC Type.
I've been able to use HandleHttpRequest and ExtractText to make this query work: curl -d "select * from MY_TABLE WHERE NAME = '1234'" http://localhost:5555
I'm unsure how I would update the ExecuteSQLRecord to make it work with parameters, to avoid SQL injection.
Would I replace the '1234' with a ? and extract the attributes with another processor? I wish there was an example.
The query should be select * from MY_TABLE where NAME = ? (the placeholder is not quoted), and then incoming flowfiles will need to have the following attributes (from your example), where the type is the numeric JDBC type code (12 = VARCHAR):
sql.args.1.type: 12
sql.args.1.value: 1234
For multiple parameters, it would follow this general pattern:
Query: select * from MY_TABLE where NAME = ? and OTHER_COL = ? ...
Flowfile attributes:
sql.args.1.type: 12
sql.args.1.value: First Last
sql.args.2.type: 4
sql.args.2.value: 1234
...
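A sketch of one possible flow (the attribute name here is hypothetical): after ExtractText captures the incoming value into an attribute, an UpdateAttribute processor before ExecuteSQLRecord can map it onto the expected names:
sql.args.1.type: 12
sql.args.1.value: ${name.param}   (hypothetical attribute set by ExtractText)
with the ExecuteSQLRecord query set to select * from MY_TABLE where NAME = ?. The JDBC driver then binds the value as a parameter, so user input is never concatenated into the SQL string.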
I am using Scala JDBC to check whether a partition exists for an Oracle table. It is returning wrong results when an aggregate function like count(*) is used.
I have checked the DB connectivity, and other queries are working fine. I have tried to extract the value of count(*) using an alias, but it failed. I also tried using getString, but that failed too.
Class.forName(jdbcDriver)
val connection = DriverManager.getConnection(jdbcUrl, dbUser, pswd)
val statement = connection.createStatement()
try {
  val sqlQuery =
    s"""SELECT COUNT(*) FROM USER_TAB_PARTITIONS
       |WHERE TABLE_NAME = '$tableName' AND PARTITION_NAME = '$partitionName'""".stripMargin
  val resultSet1 = statement.executeQuery(sqlQuery)
  while (resultSet1.next()) {
    val cnt = resultSet1.getInt(1)
    println("Count=" + cnt)
    if (cnt == 0) {
      // Code to add partition and insert data
    } else {
      // Code to insert data in existing partition
    }
  }
} catch {
  case e: Exception => // ...
}
The value of cnt always prints as 0 even though the oracle partition already exists. Can you please let me know what is the error in the code? Is this giving wrong results because I am using scala jdbc to get the result of an aggregate function like count(*)? If yes, then what would be the correct code? I need to use scala jdbc to check whether the partition already exists in oracle and then insert data accordingly.
This is just a suggestion or might be the solution in your case.
Whenever you search the metadata tables of Oracle, always use UPPER or LOWER on both sides of the equals sign.
Oracle converts every object name to upper case and stores it in the metadata, unless you specifically provided a case-sensitive object name in double quotes when creating it.
So take the following example:
-- 1
CREATE TABLE "My_table_name1" ... -- CASE-SENSITIVE
-- 2
CREATE TABLE My_table_name2 ... -- CASE-INSENSITIVE
In the first statement, we used double quotes, so the name is stored in Oracle's metadata as a case-sensitive name.
In the second statement, we did not use double quotes, so the table name is converted to upper case and stored in Oracle's metadata.
So if you want to write a query against Oracle metadata that covers both of the above cases, you can apply UPPER or LOWER to both the column and the value, as follows:
SELECT * FROM USER_TABLES WHERE UPPER(TABLE_NAME) = UPPER('<YOUR TABLE NAME>');
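Applied to the query from the question, that would look like (same variables as the Scala snippet above):
val sqlQuery =
  s"""SELECT COUNT(*) FROM USER_TAB_PARTITIONS
     |WHERE UPPER(TABLE_NAME) = UPPER('$tableName')
     |AND UPPER(PARTITION_NAME) = UPPER('$partitionName')""".stripMargin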
Hope this helps you solve the issue.
Cheers!