Kafka Connect JDBC Source running even when query fails

I'm running a JDBC source connector and trying to monitor its status via the exposed JMX metrics and a Prometheus exporter. However, the connector and all of its tasks still report the RUNNING state when the query fails or the DB can't be reached.
In earlier versions it seems that no value for source-record-poll-total in the source-task-metrics group was exported when the query failed; in the versions I use (connect-runtime-6.2.0-ccs, confluentinc-kafka-connect-jdbc-10.2.0, jmx_prometheus_javaagent-0.14.0) the metric is exported with the value 0.0 even when the query fails.
Any ideas how I could detect such a failing query or DB connection?

This is resolved in version 10.2.4 of the JDBC connector. Tasks now fail when a SQLNonTransientException occurs, and this can be detected using the exported metrics. See https://github.com/confluentinc/kafka-connect-jdbc/pull/1096
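If you want an explicit check in addition to the JMX/Prometheus metrics, the Connect worker's REST API reports the connector and task states directly. Below is a minimal sketch, assuming the REST endpoint is at http://localhost:8083 and the connector is named jdbc-source (both placeholders):

import requests

CONNECT_URL = "http://localhost:8083"  # assumed Connect worker REST endpoint
CONNECTOR = "jdbc-source"              # hypothetical connector name

def failed_tasks(connector):
    """Return the task IDs that the worker reports as FAILED."""
    status = requests.get(f"{CONNECT_URL}/connectors/{connector}/status").json()
    return [t["id"] for t in status.get("tasks", []) if t["state"] == "FAILED"]

bad = failed_tasks(CONNECTOR)
if bad:
    print(f"Connector {CONNECTOR} has failed tasks: {bad}")

With the 10.2.4 fix, a SQLNonTransientException moves the task to FAILED, so both this REST check and the exported task status will reflect the failure.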

Related

Elastic Cloud APM not showing logs in Transactions Page

What makes Kibana not show Docker container logs on the APM "Transactions" page under the "Logs" tab?
I verified that the logs are being generated successfully with the trace.id attached for proper linking.
I have the exact same environment and configs (7.16.2) running locally via docker-compose and it works perfectly.
I could not figure out why this feature works locally but does not show up in the Elastic Cloud deployment.
UPDATE with solution:
I just solved the problem.
It's related to the Filebeat version.
From 7.16.0 onward, the transaction/log linking stops working.
I reverted Filebeat back to version 7.15.2 and it started working again.
If you are not using Filebeat: we rolled our own logging implementation that sends logs from a queue in batches using the Bulk API.
We have our own "ElasticLog" class and use attributes to match the logs-* schema for the log stream.
In particular, we had to make sure that trace.id was the same as the trace.id property of the actual traces. Then the logs started to show up there (it sometimes takes a few minutes).
Some more info on how to get the IDs:
We use the OpenTelemetry exporter for traces and ILoggerProvider for logs. They fire off batches independently of each other.
We populate the trace IDs as default values at the time the class is instantiated, so that this happens while still in the context of the Activity. It also helps to set the timestamp to exactly when the log was created.
This LogEntry then gets passed into the ElasticLogger processor and mapped, as described above, to the ElasticLog entry with the attributes Elasticsearch needs.
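For illustration only, here is a rough Python equivalent of that approach (our actual implementation is .NET): ship log documents through the Bulk API into a logs-* data stream with a trace.id that matches the APM trace. The endpoint, index name, and service name are placeholders; the important part is that trace.id equals the id of the active trace.

from datetime import datetime, timezone
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch("https://<cloud-endpoint>:9243", api_key="<api-key>")  # placeholders

def as_bulk_actions(entries):
    for e in entries:
        yield {
            "_op_type": "create",             # data streams only accept "create"
            "_index": "logs-myapp-default",   # hypothetical logs-* data stream
            "@timestamp": e["timestamp"],
            "message": e["message"],
            "service": {"name": "myapp"},     # placeholder service name
            "trace": {"id": e["trace_id"]},   # must equal the APM trace.id
        }

batch = [{
    "timestamp": datetime.now(timezone.utc).isoformat(),
    "message": "order processed",
    "trace_id": "0af7651916cd43dd8448eb211c80319c",  # captured from the active trace/Activity
}]
helpers.bulk(es, as_bulk_actions(batch))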

Kafka Streams 2.3.0 store get-rate metric not found in JMX

We've upgraded our Kafka Streams application from 2.2.0 to 2.3.0 (for the InMemoryWindowStore). We were monitoring some metrics via JMX, and after upgrading to 2.3.0, reading the state store metric get-rate throws an AttributeNotFoundException. Has anyone else seen this issue? The documentation makes no mention of it having been removed.
Thank you
There are some inconsistencies in the metric names for read operations in key-value, window, and session stores. Key-value stores record read operations in metrics prefixed with get-, whereas window and session stores record read operations in metrics prefixed with fetch-. Unfortunately, the documentation about the metrics does not mention fetch- at all.
In your case, you should look for fetch-rate, since you are using a window store.
I opened the following ticket to document these inconsistencies: https://issues.apache.org/jira/browse/KAFKA-8906
This issue also existed in 2.2, so I am wondering why you did not get this error before.
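If it helps, here is one way to confirm the attribute name from Python. This is only a sketch: it assumes a Jolokia agent is attached to the Streams JVM (not part of the original setup) and that the store MBeans match a kafka.streams:type=stream-*-state-metrics pattern; verify the exact object names with a JMX browser such as jconsole first.

import requests

JOLOKIA = "http://localhost:8778/jolokia/"  # assumed Jolokia agent endpoint

# Find the state-store MBeans, then read fetch-rate (not get-rate) from each.
search = requests.post(JOLOKIA, json={
    "type": "search",
    "mbean": "kafka.streams:type=stream-*-state-metrics,*",  # assumed pattern
}).json()

for mbean in search.get("value", []):
    reply = requests.post(JOLOKIA, json={
        "type": "read",
        "mbean": mbean,
        "attribute": "fetch-rate",
    }).json()
    print(mbean, "->", reply.get("value"))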

Adding extra jar to H2O when connecting to external instance using h2o.init()

I'm using h2o.init to specify the Snowflake JDBC driver via extra_classpath when connecting to an external H2O instance; however, I get the following error when attempting to access the Snowflake DB (H2O itself connects successfully to the external instance):
H2OServerError: HTTP 500 Server Error:
Server error java.lang.RuntimeException:
Error: SQLException: No suitable driver found for jdbc:snowflake:..
It works fine when starting a standalone H2O instance with nothing else changed.
Here is the init code:
h2o.init(ip='<ip>',
         port=54321,
         username='**',
         password='**',
         extra_classpath=["snowflake-jdbc-3.8.0.jar"])
H2O version: 3.22.1.1
Python 3
extra_classpath is for use when starting H2O from Python. When you are connecting to an H2O instance that is running on another machine, it has to already be started, so it is up to whoever started it to have included that extra classpath as part of the java command used to start it. (And if it is a cluster, you have to make sure every node of the cluster uses the exact same command.)
The Snowflake jar has to be available, at the path you give, on the server. In fact, there is no need for it to be on the client at all, unless you are also using it directly from your Python script (i.e. outside of H2O).
BTW, see https://github.com/h2oai/h2o-3/blob/master/h2o-py/h2o/h2o.py#L147 for the code. If you search for uses of extra_classpath, you will see it is only used when starting the local server.
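To make that concrete, here is a rough sketch (paths, URL, and credentials are placeholders, not from the original post): the driver jar goes on the java command line that starts the remote H2O node, and the Python client then only needs the JDBC URL.

# On the server that runs H2O (every node, if clustered), the driver jar is
# added to the classpath of the java command that starts H2O, e.g.:
#   java -cp /opt/h2o/h2o.jar:/opt/drivers/snowflake-jdbc-3.8.0.jar water.H2OApp
#
# On the client, no extra_classpath is needed when connecting to that instance:
import h2o

h2o.init(ip="<ip>", port=54321, username="**", password="**")

# Hypothetical Snowflake connection details, for illustration only:
frame = h2o.import_sql_select(
    connection_url="jdbc:snowflake://<account>.snowflakecomputing.com/?db=<db>&warehouse=<wh>",
    select_query="SELECT * FROM my_table",
    username="<snowflake-user>",
    password="<snowflake-password>",
)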

PyHive ignoring Hive config

I'm intermittently getting the error message
DAG did not succeed due to VERTEX_FAILURE.
when running Hive queries via PyHive. Hive is running on an EMR cluster where hive.vectorized.execution.enabled is set to false in the hive-site.xml file for exactly this reason.
I can set the above property through the configuration on the Hive connection, and the query has run successfully every time I've executed it since; however, I want to confirm that this really fixed the issue and that hive-site.xml is indeed being ignored.
Can anyone confirm whether this is the expected behaviour? Alternatively, is there any way to inspect the Hive configuration via PyHive? I've not been able to find one.
Thanks!
PyHive is a thin client that connects to HiveServer2, just like a Java or C client (via JDBC or ODBC). It does not use any Hadoop configuration files on your local machine. The HS2 session starts with whatever properties are set server-side.
The same goes for Impyla, by the way.
So it's your responsibility to set custom session properties from your Python code, e.g. by executing this statement...
SET hive.vectorized.execution.enabled=false
... before running your SELECT.
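A minimal sketch of both options, assuming HiveServer2 at <host>:10000 (placeholders): pass the property via the connection's configuration argument, or run the SET statement on the cursor before the query. Running SET <property> with no value should also echo the current session setting back as a result row, which addresses the inspection question.

from pyhive import hive

# Option 1: session properties set when the connection is opened.
conn = hive.connect(
    host="<host>",          # placeholder HiveServer2 host
    port=10000,
    username="<user>",
    configuration={"hive.vectorized.execution.enabled": "false"},
)
cur = conn.cursor()

# Option 2: set it explicitly before the query.
cur.execute("SET hive.vectorized.execution.enabled=false")

# Inspect the effective value for this session.
cur.execute("SET hive.vectorized.execution.enabled")
print(cur.fetchall())

# cur.execute("SELECT ...")  # your actual query goes here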

Configuration Issue for IBM Filenet 5.2

I installed IBM FileNet Content Engine 5.2 on my machine. I am running into a problem while configuring the GCD data sources for a new profile.
Let me first explain the steps I did, then I will describe the problem I am getting.
First, I created the GCD database in DB2, then I created the data sources required for the profile configuration in the WAS Admin Console. I created a J2C authentication alias for the user which has access to the GCD database and configured it with the data sources. The test database connection succeeds, but when I run the task of configuring the GCD data sources, it fails with the following error:
Starting to run Configure GCD JDBC Data Sources
Configure GCD JDBC Data Sources ******
Finished running Configure GCD JDBC Data Sources
An error occurred while running Configure GCD JDBC Data Sources
Running the task failed with the following message: The data source configuration failed:
WASX7209I: Connected to process "server1" on node Poonam-PcNode01 using SOAP connector; The type of process is: UnManagedProcess
testing Database connection
DSRA8040I: Failed to connect to the DataSource. Encountered java.sql.SQLException: [jcc][t4][2013][11249][3.62.56] Connection authorization failure occurred. Reason: User ID or Password invalid. ERRORCODE=-4214, SQLSTATE=28000 DSRA0010E: SQL State = 28000, Error Code = -4,214.
It looks like a simple error of the user ID or password being invalid. I am using the same alias for other data sources as well and they are working fine, so I am not sure why I am getting this error. I have also tried changing the scope of the data sources, but with no success. Can somebody please help?
running "FileNet Configuration Manager" task of configuring GCD datasources will create all the needs things in WAS (including Alias), do not created it before manually.
I suspect it had an issue with exciting JDBC data sources/different names Alias
Seems from your message that you are running it from Filene configuration manager. Could you please double check from your database whether user id is authorised to execute query in GCD database. It is definitely do it with permission issue.
