I am creating a JDBC connection to Hive using javax.sql.DataSource and passing the ZooKeeper service discovery string (obtained from Ambari) to Hive.
ZooKeeper Hive URL: jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;transportMode=http;httpPath=cliservice
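For reference, the connection is built roughly like this (a minimal sketch; the actual code uses the Tomcat JDBC pool visible in the stack trace below, with pool tuning and Kerberos settings omitted):

import java.sql.Connection;
import org.apache.tomcat.jdbc.pool.DataSource;

public class HiveZkConnectionTest {
    public static void main(String[] args) throws Exception {
        DataSource ds = new DataSource();                          // Tomcat JDBC pool, as in the stack trace below
        ds.setDriverClassName("org.apache.hive.jdbc.HiveDriver");
        ds.setUrl("jdbc:hive2://localhost:2181/;serviceDiscoveryMode=zooKeeper;"
                + "zooKeeperNamespace=hiveserver2;transportMode=http;httpPath=cliservice");
        try (Connection conn = ds.getConnection()) {               // fails here with the exception below
            System.out.println("Connected: " + !conn.isClosed());
        }
    }
}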
If I make a direct JDBC connection with the HiveServer2 host and port, the connection works properly, but it fails with the ZooKeeper string.
After that I tested the ZooKeeper string with Beeline, and it worked fine.
Below is the exception thrown when the connection is made.
Caused by: java.sql.SQLException: Could not open client transport for any of the Server URI's in ZooKeeper: Unable to read HiveServer2 uri from ZooKeeper
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:205)
at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:163)
at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
at org.apache.tomcat.jdbc.pool.PooledConnection.connectUsingDriver(PooledConnection.java:307)
at org.apache.tomcat.jdbc.pool.PooledConnection.connect(PooledConnection.java:200)
at org.apache.tomcat.jdbc.pool.ConnectionPool.createConnection(ConnectionPool.java:710)
at org.apache.tomcat.jdbc.pool.ConnectionPool.borrowConnection(ConnectionPool.java:644)
at org.apache.tomcat.jdbc.pool.ConnectionPool.init(ConnectionPool.java:466)
at org.apache.tomcat.jdbc.pool.ConnectionPool.<init>(ConnectionPool.java:143)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.pCreatePool(DataSourceProxy.java:115)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.createPool(DataSourceProxy.java:102)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:126)
at org.apache.tomcat.jdbc.pool.DataSourceProxy.getConnection(DataSourceProxy.java:85)
at com.thinkbiganalytics.kerberos.KerberosUtil.getConnectionWithOrWithoutKerberos(KerberosUtil.java:60)
at com.thinkbiganalytics.hive.service.RefreshableDataSource.getConnectionForValidation(RefreshableDataSource.java:113)
at com.thinkbiganalytics.hive.service.RefreshableDataSource.testAndRefreshIfInvalid(RefreshableDataSource.java:133)
at com.thinkbiganalytics.hive.service.RefreshableDataSource.getConnection(RefreshableDataSource.java:145)
at com.thinkbiganalytics.kerberos.KerberosUtil.getConnectionWithOrWithoutKerberos(KerberosUtil.java:60)
at com.thinkbiganalytics.schema.DBSchemaParser.listCatalogs(DBSchemaParser.java:80)
... 118 more
Caused by: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 uri from ZooKeeper
at org.apache.hive.jdbc.ZooKeeperHiveClientHelper.getNextServerUriFromZooKeeper(ZooKeeperHiveClientHelper.java:86)
at org.apache.hive.jdbc.Utils.updateConnParamsFromZooKeeper(Utils.java:506)
at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:203)
... 136 more
Caused by: org.apache.hive.jdbc.ZooKeeperHiveClientException: Tried all existing HiveServer2 uris from ZooKeeper.
at org.apache.hive.jdbc.ZooKeeperHiveClientHelper.getNextServerUriFromZooKeeper(ZooKeeperHiveClientHelper.java:73)
... 138 more
Did anyone encounter this?
After spending two days on this, I figured out the problem: I had a Hive 0.14 dependency in the code where this problem was occurring. To fix it, I updated the two Hive Maven dependencies below.
Hive Services - https://mvnrepository.com/artifact/org.apache.hive/hive-service/1.2.1000.2.4.2.10-1
Hive JDBC - https://mvnrepository.com/artifact/org.apache.hive/hive-jdbc/1.2.1000.2.4.2.10-1
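In the pom.xml, the updated coordinates look roughly like this (versions taken from the links above):

<!-- versions as listed in the mvnrepository links above -->
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-service</artifactId>
    <version>1.2.1000.2.4.2.10-1</version>
</dependency>
<dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>1.2.1000.2.4.2.10-1</version>
</dependency>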
Related
Using the kafka-lenses-dev image, I ran into problems connecting Kafka to a local Elasticsearch 7.1.1 instance. As you can see below, the connection is refused, although the instance is up and running and it's possible to curl documents into it. Is the Elasticsearch version problematic for the connector, or is the config wrong?
Config
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
type.name=kafka-connect
key.converter.schemas.enable=false
topics=cc_data
name=elastic-sink
value.converter.schemas.enable=false
connection.url=http://localhost:9200
key.ignore=true
Error
org.apache.kafka.connect.errors.ConnectException: Couldn't start ElasticsearchSinkTask due to connection error:
at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.<init>(JestElasticsearchClient.java:147)
at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.<init>(JestElasticsearchClient.java:114)
at io.confluent.connect.elasticsearch.ElasticsearchSinkTask.start(ElasticsearchSinkTask.java:120)
...
Caused by: io.searchbox.client.config.exception.CouldNotConnectException: Could not connect to http://localhost:9200
at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:60)
at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.getServerVersion(JestElasticsearchClient.java:168)
at io.confluent.connect.elasticsearch.jest.JestElasticsearchClient.<init>(JestElasticsearchClient.java:145)
Caused by: org.apache.http.conn.HttpHostConnectException: Connect to localhost:9200 [localhost/127.0.0.1] failed: Connection refused (Connection refused)
...
Caused by: java.net.ConnectException: Connection refused (Connection refused)
...
You've specified Elasticsearch as being at localhost:9200. Unless Elasticsearch is running in the same container as Kafka Connect, this hostname is wrong. You need to specify the hostname of Elasticsearch in a form that is reachable from where Kafka Connect is running.
As an example, you can see how to configure this with Docker Compose here.
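If Kafka Connect and Elasticsearch both run as Docker Compose services, the sink would typically point at the Elasticsearch service name rather than localhost, for example:

# assumption: 'elasticsearch' is the service name in your Compose file; use whatever yours defines
connection.url=http://elasticsearch:9200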
I am working on streaming data from PostgreSQL to HDFS. I have set up the Confluent environment on an HDP 2.6 sandbox. My JDBC source configs for PostgreSQL are:
name=jdbc_1
connector.class=io.confluent.connect.jdbc.JdbcSourceConnector
tasks.max=1
connection.url=jdbc:postgresql://host:port/db?currentSchema=schema&user=user&password=password
mode=timestamp
timestamp.column.name=col1
validate.non.null=false
topic.prefix=psql-
All other connection properties are also fine, and I am running it with:
./bin/connect-standalone ./etc/kafka/connect-standalone.properties ./etc/kafka-connect-jdbc/source.properties
It's working fine and creating topics based on the tables in the database, such as:
psql-table1
psql-table2
Now I want to run HDFS sinks on all the topics to create a separate directory for every table in the PostgreSQL database.
But when I run the HDFS sink with the command
./bin/connect-standalone ./etc/kafka/connect-standalone.properties ./etc/kafka-connect-hdfs/hdfs-postGres.properties
while the source is still running, I get the error:
ERROR Stopping after connector error (org.apache.kafka.connect.cli.ConnectStandalone:113)
org.apache.kafka.connect.errors.ConnectException: Unable to start REST server
at org.apache.kafka.connect.runtime.rest.RestServer.start(RestServer.java:214)
at org.apache.kafka.connect.runtime.Connect.start(Connect.java:53)
at org.apache.kafka.connect.cli.ConnectStandalone.main(ConnectStandalone.java:95)
Caused by: java.net.BindException: Address already in use
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at org.eclipse.jetty.server.ServerConnector.openAcceptChannel(ServerConnector.java:331)
at org.eclipse.jetty.server.ServerConnector.open(ServerConnector.java:299)
at org.eclipse.jetty.server.AbstractNetworkConnector.doStart(AbstractNetworkConnector.java:80)
at org.eclipse.jetty.server.ServerConnector.doStart(ServerConnector.java:235)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.eclipse.jetty.server.Server.doStart(Server.java:398)
at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:68)
at org.apache.kafka.connect.runtime.rest.RestServer.start(RestServer.java:212)
... 2 more
If I stop the source connector and start the sink, it works fine.
Can anyone help me with how to set up multiple connectors?
Kafka Connect starts a REST server on port 8083.
If you run more than one standalone Connect worker on a single machine, you need to change it with the rest.port property.
Or you can run connect-distributed and POST your source and sink configurations individually as JSON payloads to a single Connect server; then you wouldn't have this Address already in use issue.
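As a rough sketch of the first option, the second worker's properties file would override the port:

# 8084 is just an example value; any free port works
rest.port=8084

For distributed mode, each connector config would be submitted to the worker's REST API, along these lines:

# hdfs-sink.json is a hypothetical JSON version of your sink properties file
curl -X POST -H "Content-Type: application/json" --data @hdfs-sink.json http://localhost:8083/connectors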
When running the Spark example spark-hive-tables, I get errors in the Hadoop UI:
User class threw exception: java.lang.RuntimeException:
java.lang.RuntimeException: Unable to instantiate
org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
and the warning:
executor.CoarseGrainedExecutorBackend: An unknown (x.x.x.x:x) driver
disconnected.
But I have started the Hive metastore on my Spark-on-YARN cluster. What should I do?
It means you haven't started your metastore service, so start the metastore service where you installed Hive, or on the remote host if your metastore is remote.
To start the metastore, use hive --service metastore
What output did you get after starting the metastore service?
I found out that I am using the Thrift server. After starting Thrift with the command /SPARKPATH/sbin/start-thriftserver.sh, another error appears: "java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory", which shows errors like the one in my title. It can be fixed by adding --jars /SPARKPATH/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar,/SPARKPATH/lib_managed/jars/datanucleus-core-3.2.10.jar,/SPARKPATH/lib_managed/jars/datanucleus-rdbms-3.2.9.jar, as shown below.
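Put together, the start command would look roughly like this (with /SPARKPATH standing in for the actual Spark install path):

/SPARKPATH/sbin/start-thriftserver.sh --jars /SPARKPATH/lib_managed/jars/datanucleus-api-jdo-3.2.6.jar,/SPARKPATH/lib_managed/jars/datanucleus-core-3.2.10.jar,/SPARKPATH/lib_managed/jars/datanucleus-rdbms-3.2.9.jar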
On a RedHat test server I installed Hadoop 2.7, and I ran Hive, Pig, and Spark without issues. But when I tried to access the Hive metastore from Spark, I got errors, so I thought of adding a hive-site.xml. (After extracting the 'apache-hive-1.2.1-bin.tar.gz' file, I just added $HIVE_HOME to .bashrc as per the tutorial, and everything was working except this integration with Spark.) On the Apache site I found that I need to put hive-site.xml in place as the metastore configuration.
I created the file as below
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby://localhost:1527/metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
</configuration>
I used localhost as the host since it is a single-node machine. After that, I am not able to connect even to Hive. It is throwing the error:
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
....
Caused by: javax.jdo.JDOFatalDataStoreException: Unable to open a test connection to the given database. JDBC url = jdbc:derby://localhost:1527/metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: No suitable driver found for jdbc:derby://localhost:1527/metastore_db;create=true
There are many more error logs pointing to the same thing. If I remove hive-site.xml from the conf folder, Hive works without issues. Can anyone point me to the right path for the default metastore configuration?
Thanks
Anoop R
Derby is used as an embedded database. Try using
jdbc:derby:metastore_db;create=true
as the JDBC URL. See also
https://cwiki.apache.org/confluence/display/Hive/AdminManual+MetastoreAdmin#AdminManualMetastoreAdmin-EmbeddedMetastore
To make the metastore fully functional (and thereby be able to access it from different services), try setting it up with MySQL as described in the document above; a sketch follows below.
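A MySQL-backed hive-site.xml would look roughly like this (host, database name, and credentials are placeholders):

<!-- placeholders: dbhost, metastore, hiveuser, hivepassword -->
<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://dbhost:3306/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>hiveuser</value>
</property>
<property>
  <name>javax.jdo.option.ConnectionPassword</name>
  <value>hivepassword</value>
</property>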
As you are setting up an embedded metastore database, use the property below as the JDBC URL:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:derby:metastore_db;create=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
I was also facing a similar kind of exception while installing Hive. The thing that worked for me was to initialize the Derby DB. To solve the problem, go to $HIVE_HOME/bin and run the command schematool -initSchema -dbType derby
You can follow the link http://www.edureka.co/blog/apache-hive-installation-on-ubuntu
It will work if you put derbyclient.jar in the lib folder of Hive.
I have added the ojdbc.jar file to /usr/lib/sqoop/lib and am trying to connect Oracle to Hadoop using Sqoop, but I'm facing an error.
I am using the following command:
sqoop list-tables --connect jdbc:oracle:thin://#192.162.2.8:1521:orcl --username hr --password abc
But I get the following error:
15/05/05 09:21:31 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
15/05/05 09:21:32 ERROR manager.OracleManager: Failed to rollback transaction
java.lang.NullPointerException
at com.cloudera.sqoop.manager.OracleManager.listTables(OracleManager.java:596)
at com.cloudera.sqoop.tool.ListTablesTool.run(ListTablesTool.java:49)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:218)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:228)
15/05/05 09:21:32 ERROR manager.OracleManager: Failed to list tables
java.sql.SQLRecoverableException: IO Error: The Network Adapter could not establish the connection
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:489)
at oracle.jdbc.driver.PhysicalConnection.<init>(PhysicalConnection.java:553)
at oracle.jdbc.driver.T4CConnection.<init>(T4CConnection.java:254)
at oracle.jdbc.driver.T4CDriverExtension.getConnection(T4CDriverExtension.java:32)
at oracle.jdbc.driver.OracleDriver.connect(OracleDriver.java:528)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at com.cloudera.sqoop.manager.OracleManager.makeConnection(OracleManager.java:275)
at com.cloudera.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:51)
at com.cloudera.sqoop.manager.OracleManager.listTables(OracleManager.java:585)
at com.cloudera.sqoop.tool.ListTablesTool.run(ListTablesTool.java:49)
at com.cloudera.sqoop.Sqoop.run(Sqoop.java:144)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:180)
at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:218)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:228)
Caused by: oracle.net.ns.NetException: The Network Adapter could not establish the connection
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:439)
at oracle.net.resolver.AddrResolution.resolveAndExecute(AddrResolution.java:454)
at oracle.net.ns.NSProtocol.establishConnection(NSProtocol.java:693)
at oracle.net.ns.NSProtocol.connect(NSProtocol.java:251)
at oracle.jdbc.driver.T4CConnection.connect(T4CConnection.java:1140)
at oracle.jdbc.driver.T4CConnection.logon(T4CConnection.java:340)
... 16 more
Caused by: java.net.ConnectException: Connection refused
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:529)
at oracle.net.nt.TcpNTAdapter.connect(TcpNTAdapter.java:149)
at oracle.net.nt.ConnOption.connect(ConnOption.java:133)
at oracle.net.nt.ConnStrategy.execute(ConnStrategy.java:405)
Is there anything wrong with the sqoop command?
The error "network adaptor could not establish connection" is coming because of incorrect jdbc url. Jdbc url in your sqoop command should be in this format: jdbc:oracle:thin:#192.162.2.8:1521:orcl
The connection refused error may occur in the following scenarios, as far as I know:
The Oracle service might not be running on the specified host on the given port number.
The firewall in between might restrict client access to the Oracle server through the given port number.
So I suggest you first confirm the Oracle host, the port, and any firewall restriction in between.
You can easily check access by using telnet as below:
telnet 192.162.2.8 1521
See if the listener and the database have been started. I just started the listener (lsnrctl start) and the database (sqlplus / as sysdba, then startup), and it worked.
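On the database host, that amounts to something like:

lsnrctl start
sqlplus / as sysdba
SQL> startup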