I tried to run a Sqoop eval command through the AWS EMR CLI against a Teradata connection, but got this error:
Error loading ManagerFactory information from file /usr/lib/sqoop/conf/managers.d/td_connector.txt: java.io.IOException: Could not load jar $SQOOP_HOME/lib/teradata-connector-1.6.5.jar into JVM. (Could not find class org.apache.sqoop.teradata.TeradataConnManager.)
Steps I have followed:
Logged in over SSH to an EMR cluster, version emr-6.2.0, configured with Hadoop 3 and Sqoop 1.4.7.
Downloaded the Teradata Hadoop connector 3.x from Teradata Downloads.
Moved the Teradata Hadoop connector to $SQOOP_HOME/lib and installed it.
Created the text file td_connect at /usr/lib/sqoop/conf/managers.d/ containing the line org.apache.sqoop.teradata.TeradataConnManager=$SQOOP_HOME/lib/teradata-connector-1.6.5.jar
Ran the command:
sqoop eval --connection-manager org.apache.sqoop.teradata.TeradataConnManager --connect jdbc:teradata://host/database= --username username --password password --query 'select top 5 * from table'
Could you please help identify the issue?
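One detail stands out in the error: the jar path is printed with a literal $SQOOP_HOME, which suggests the variable in the managers.d file was never expanded. Below is a minimal sketch, assuming Sqoop reads that file as plain text without shell expansion and that the install root is /usr/lib/sqoop (taken from the path in the error message). The helper name write_manager_entry and the SQOOP_ROOT variable are hypothetical.

```shell
# Hypothetical helper: write a managers.d entry that uses an absolute jar path.
# Arguments: <managers.d directory> <connection manager class> <jar path>
write_manager_entry() {
  mkdir -p "$1"
  # Plain "class=jar" line; no shell variables are left inside the file.
  printf '%s=%s\n' "$2" "$3" > "$1/td_connector.txt"
}

# Assumed install root, taken from the path shown in the error message.
SQOOP_ROOT="${SQOOP_ROOT:-/usr/lib/sqoop}"

# Example call (commented out so the sketch has no side effects):
# write_manager_entry "$SQOOP_ROOT/conf/managers.d" \
#   org.apache.sqoop.teradata.TeradataConnManager \
#   "$SQOOP_ROOT/lib/teradata-connector-1.6.5.jar"
```

If the class still cannot be found with an absolute path, it may be worth listing the jar contents to confirm that it actually contains org.apache.sqoop.teradata.TeradataConnManager.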
I have a Hadoop cluster running on another server. I am able to ssh into that server and use Hive to run queries. I'm trying to determine if I can query that server remotely, using Hive or Beeline; would prefer Beeline, since it's not being deprecated.
I used Homebrew to install Hadoop and Hive. However, Beeline complains about a missing environment variable or path entry, even though both appear to be set, so I must not have configured something correctly. What steps do I need to go through to execute queries on a remote Hadoop cluster from my Mac? Do I have to set up a full local Hadoop instance just to query a remote one?
~ (master) 10:24:30
# next line is from the docs
$ beeline -u jdbc:hive2://localhost:10000/default -n scott -w password_file
Cannot find hadoop installation: $HADOOP_HOME or $HADOOP_PREFIX must be set or hadoop must be in the path
~ (master) 10:25:05
$ which hadoop
/usr/local/bin/hadoop
~ (master) 10:25:18
$ echo $HADOOP_HOME
/usr/local/Cellar/hadoop/2.7.3/bin
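The transcript shows HADOOP_HOME set to a path that ends in /bin. Below is a minimal sketch of a sanity check, assuming the launcher resolves Hadoop as $HADOOP_HOME/bin/hadoop (an assumption about the Hive scripts, not something visible in the transcript). The function name has_hadoop_launcher is hypothetical.

```shell
# Hypothetical check: succeeds only if <dir>/bin/hadoop exists and is executable.
has_hadoop_launcher() {
  [ -x "$1/bin/hadoop" ]
}

# With the value from the transcript, this would look for
# /usr/local/Cellar/hadoop/2.7.3/bin/bin/hadoop.
if has_hadoop_launcher "${HADOOP_HOME:-}"; then
  echo "HADOOP_HOME looks usable: $HADOOP_HOME"
else
  echo "no bin/hadoop under HADOOP_HOME='${HADOOP_HOME:-}'"
fi
```

If the check fails, HADOOP_HOME probably needs to point at the directory that contains bin/, not at bin/ itself. On a Homebrew install that directory may be a libexec folder under the Cellar path, which is an assumption to verify locally.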
Is it possible to connect to Hive via Beeline using a (Kerberos) keytab file, similar to the approach used for JDBC at
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-UsingKerberoswithaPre-AuthenticatedSubject
PS: Beeline does support connecting to a Kerberos-secured Hive server with a username and password, but I am looking for a way to connect with a keytab file.
http://doc.mapr.com/display/MapR40x/Configuring+Hive+on+a+Secure+Cluster#ConfiguringHiveonaSecureCluster-UsingBeelinewithKerberos
I don't think you can pass a keytab file to Beeline directly, but you can obtain a ticket from the keytab with kinit and then pass the Hive server principal in Beeline's JDBC connection string.
kinit -k -t keytab principal
Connection string to connect with Beeline:
!connect jdbc:hive2://hostname:10000/default;principal=hive/_HOST@REALM
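The two steps above can be sketched as a small script. This is an illustration only: the helper name kerberos_url is hypothetical, and the host, port, database, and realm are the placeholders from this answer, not real values.

```shell
# Hypothetical helper: build a HiveServer2 JDBC URL that carries the server principal.
# Arguments: <host> <port> <database> <server principal>
kerberos_url() {
  printf 'jdbc:hive2://%s:%s/%s;principal=%s\n' "$1" "$2" "$3" "$4"
}

url="$(kerberos_url hostname 10000 default 'hive/_HOST@REALM')"
echo "$url"

# After obtaining a ticket with: kinit -k -t keytab principal
# pass the URL in quotes, because an unquoted ';' would end the shell command:
# beeline -u "$url"
```

The principal in the URL is the Hive server's principal, not the user's. The user identity comes from the ticket that kinit placed in the credential cache.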
This is a bug, but not a critical one.
Even though you provided the Kerberos details, Beeline will still ask for a username and password. Just press Enter at both prompts and it will connect.
Example:
!connect jdbc:hive2://:10000/default;principal=hive/_HOST@REALM.COM
Connecting to jdbc:hive2://:10000/default;principal=hive/_HOST@REALM.COM
Enter username for jdbc:hive2://:10000/default;principal=hive/_HOST@REALM.COM: press enter
Enter password for jdbc:hive2://:10000/default;principal=hive/_HOST@REALM.COM: press enter
Connected to: Apache Hive (version 0.13.1-cdh5.3.7-SNAPSHOT)
Driver: Hive JDBC (version 0.13.1-cdh5.3.7-SNAPSHOT)
Transaction isolation: TRANSACTION_REPEATABLE_READ
I am new to Hive and hopefully this is going to be an easy thing to solve
for someone with more experience, but I am having trouble doing it on my
own.
On my EC2 app server I am running the following command with no error:
beeline -u jdbc:hive2://master
This is working on Hive 13 which was installed through a bootstrap action
using the latest AMI version. 'master' is pointing to my EMR cluster
Then I downloaded the source for Hive 14 and built it. I have replaced my
/home/hadoop/hive directory with the package that was built.
However, if I try to execute the same command, I get an error:
scan complete in 6ms
Connecting to jdbc:hive2://master
Error: Could not open client transport with JDBC Uri: jdbc:hive2://master:
Cannot open without port. (state=08S01,code=0)
Beeline version 0.14.0 by Apache Hive
0: jdbc:hive2://master (closed)>
Running it with the port provided works correctly:
beeline -u jdbc:hive2://master:10000
I would like to be able to run the command without providing the
default port number.
Can anyone point me to instructions for this?
Thanks,
Beeline connects to Hive in two modes:
1. Embedded Mode:
If the Hive client and the Hive server run on the same machine, connect Beeline with the URL below:
!connect jdbc:hive2://
2. Remote Mode:
If the server runs on one machine and the client on another, connect Beeline with the URL below:
!connect jdbc:hive2://<host>:<port>
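Since the client in the question rejects a remote URL without a port, one workaround is a small wrapper that fills in the port. Below is a minimal sketch, assuming 10000 is the HiveServer2 port in use. The function name hive2_url is hypothetical.

```shell
# Hypothetical helper: build a Beeline JDBC URL.
# No host -> embedded mode URL; host without port -> port 10000 is assumed.
hive2_url() {
  if [ -z "${1:-}" ]; then
    echo "jdbc:hive2://"
  else
    echo "jdbc:hive2://$1:${2:-10000}"
  fi
}

hive2_url               # embedded mode
hive2_url master        # remote mode, default port filled in
hive2_url master 10001  # remote mode, explicit port
```

It could then be used as beeline -u "$(hive2_url master)", which keeps the command line short while still sending an explicit port to the client.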
I have a CDH 5.3 instance.
I start the hive-server2 by first starting the hive-metastore and then the hive-server from command line.
After this I use Beeline to connect to my hive-server2, but apparently it is not able to do so.
Could not open connection to jdbc:hive2://localhost:10000: java.net.ConnectException: Connection refused (state=08S01,code=0)
Another issue: I tried to see whether hive-server2 was listening on port 10000.
I ran "sudo netstat -tulpn | grep :10000" but no application came up.
I also added the following property to hive-site.xml, but to no avail. Why doesn't it show up in netstat?
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
<description>TCP port number to listen on, default 10000</description>
</property>
The connect command on beeline:
!connect jdbc:hive2://localhost:10000 org.apache.hive.jdbc.HiveDriver
When asked for a username and password, I just enter the test values "user" and "password"
respectively, and then it throws the error.
Any help will be appreciated.
Beeline can connect to Hive from a client in several modes.
1. Embedded Mode:
The server and the client run on the same machine. No TCP connection is required.
If hive.server2.authentication is "NONE" in HIVE_HOME/conf/hive-site.xml, connect Beeline with the URL below.
Connection URL:
!connect jdbc:hive2://
2. Remote Mode:
It lets multiple clients execute queries, using one of the following authentication schemes.
Authentication Schemes:
i.) SASL Authentication:
If the "hive.server2.authentication" property in HIVE_HOME/conf/hive-site.xml is set to "SASL", connect Beeline with the URL below.
Beeline URL:
!connect jdbc:hive2://<host>:<port>/<db>
ii.) NOSASL Authentication:
If "hive.server2.authentication" is set to "NOSASL", connect Beeline with the URL below.
Beeline URL:
!connect jdbc:hive2://<host>:<port>/<db>;auth=noSasl
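The mapping from the server setting to the remote-mode URL can be sketched as a small function. This is a hedged illustration: the name auth_url is hypothetical, and it only distinguishes the NOSASL case described above, treating every other value as the plain URL.

```shell
# Hypothetical helper: build the remote-mode URL for a given server auth setting.
# Arguments: <hive.server2.authentication value> <host> <port> <db>
auth_url() {
  base="jdbc:hive2://$2:$3/$4"
  # Compare case-insensitively so NOSASL and nosasl behave the same.
  mode="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  case "$mode" in
    nosasl) echo "$base;auth=noSasl" ;;
    *)      echo "$base" ;;
  esac
}

auth_url NOSASL localhost 10000 default
auth_url NONE localhost 10000 default
```

A mismatch between the client URL and the server setting is worth ruling out early when a connection attempt fails.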
Hope this really helps you
References:
https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_dataintegration/content/beeline-vs-hive-cli.html
I had the same problem.
It happens simply because hiveserver2 failed to start. The error does not show up in the console, only in the Hive logs. In my case, the Hive log is located at /tmp/ubuntu/hive.log
Your hive-server2 may have failed to start for a different reason, but it is definitely worth looking into this log file.
The following worked for me. If you have installed and configured Hive for the first time and are trying to connect from Beeline, make sure you start the Hive service with the following command in the current terminal:
>hive --service hiveserver2 &
The process ID for HiveServer2 appears in the console. Then retry connecting to Hive from Beeline in a different terminal:
>beeline -u "jdbc:hive2://localhost:10000/default" -n <username> -p <password> -d "org.apache.hive.jdbc.HiveDriver"
In this case, your hiveserver2 service was not started. Please follow the steps below to check and fix it.
Steps:
1. Look in hive.log for "Service:HiveServer2 is started."
1) find / -name hive.log
2) vim hive.log
If you cannot find "Service:HiveServer2 is started." in hive.log, then hiveserver2 has not started.
2. Start hiveserver2.
command: ./bin/hiveserver2
3. Check hive.log again.
If you can now find "Service:HiveServer2 is started." in hive.log, connect to hiveserver2 with Beeline.
4. Connect to hiveserver2.
./bin/beeline
!connect jdbc:hive2://localhost:10000
5. Output like the following should appear.
Beeline version 1.2.1 by Apache Hive
beeline> !connect jdbc:hive2://localhost:10000
Connecting to jdbc:hive2://localhost:10000
Enter username for jdbc:hive2://localhost:10000: root
Enter password for jdbc:hive2://localhost:10000: ******
Connected to: Apache Hive (version 1.2.1)
Driver: Hive JDBC (version 1.2.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
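The log check in steps 1 and 3 can also be scripted instead of opening the file in vim. Below is a minimal sketch, assuming the startup message appears in the log exactly as quoted above. The function name hs2_started and the default path /tmp/hive.log are hypothetical, so substitute the path that find reports.

```shell
# Hypothetical helper: succeeds if the log contains the startup message.
hs2_started() {
  # -F matches the text literally, so the trailing '.' is not a wildcard.
  grep -qF "Service:HiveServer2 is started." "$1"
}

# Assumed log location; locate yours with: find / -name hive.log
log="${1:-/tmp/hive.log}"
if [ -f "$log" ] && hs2_started "$log"; then
  echo "HiveServer2 reported a successful start"
else
  echo "startup message not found in $log"
fi
```

Running it once before and once after starting ./bin/hiveserver2 shows whether the service came up before you try to connect.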
You have to give the hiveserver2 username and password; check them in hive-site.xml. By default both the username and the password are anonymous. Otherwise, just press Enter without giving a username or password.
Try the verbose option so you can see more details:
beeline -u "jdbc:hive2://localhost:10000/default;user=user;password=*******" --verbose
Please check which IP the HiveServer2 service is actually deployed on.
I hit the same problem: I used the Cloudera server IP (XXX.42) to connect to HiveServer2, but the Hive Thrift service (HiveServer2) was in fact deployed on another machine (XXX.41).