Impala ODBC connection using SIMBA - hadoop

I followed the Impala ODBC connection instructions provided by Simba/CDH and installed the driver. I'm using CDH 5.3 and Impala 2.1.0.
In odbc.ini I tried ports 21050 and 21000 but could not connect to Impala. Later I used port 10000 and was able to connect. I'm not sure this port really runs the query in Impala, because when I run 'select count(*) from emp' in isql -v impala_simba, it takes much longer than running the same query from impala-shell.
Does this configuration really use Impala, or is it using Hive?
I also tried the CDH-provided Impala ODBC driver, but it has the same issue.
Can you please advise which Impala port I should specify in .odbc.ini so that the query runs in Impala?
Here is my configuration.
[ODBC Data Sources]
impala_simba=Simba Impala ODBC Driver
[impala_simba]
# Description: DSN Description.
# This key is not necessary and is only to give a description of the data source.
Description=Simba Impala ODBC Driver (64-bit) DSN
# Driver: The location where the ODBC driver is installed to.
Driver=/opt/simba/impalaodbc/lib/64/libsimbaimpalaodbc64.so
# The DriverUnicodeEncoding setting is only used for SimbaDM
# When set to 1, SimbaDM runs in UTF-16 mode.
# When set to 2, SimbaDM runs in UTF-8 mode.
DriverUnicodeEncoding=1
# Values for HOST, PORT, KrbFQDN, and KrbServiceName should be set here.
# They can also be specified on the connection string.
HOST=10.74.163.109
PORT=10000
Database=default
# The authentication mechanism.
# 0 - no authentication.
# 1 - Kerberos authentication
# 2 - Username authentication.
# 3 - Username/password authentication.
# 4 - Username/password authentication with SSL.
AuthMech=2
# Kerberos related settings.
#KrbFQDN=
#KrbRealm=
#KrbServiceName=
# Username/password authentication with SSL settings.
UID=abhi
#PWD
CAIssuedCertNamesMismatch=1
TrustedCerts=/opt/simba/impalaodbc/lib/64/cacerts.pem
# Specify the proxy user ID to use.
#DelegationUID=
# General settings
TSaslTransportBufSize=1000
RowsFetchedPerBlock=1000
SocketTimeout=0
StringColumnLength=32767
UseNativeQuery=0
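For reference, here is a minimal sketch of how the PORT value of a DSN could be checked against stock defaults. It assumes the cluster still uses default ports: 21050 for Impala's HiveServer2-protocol endpoint (the one ODBC and JDBC drivers normally target), 21000 for Impala's Beeswax endpoint (the impala-shell default in Impala 2.x), and 10000 for HiveServer2, which is Hive rather than Impala. The helper name describe_dsn_port is hypothetical, and an administrator may have changed any of these ports.

```python
import configparser

# Stock default ports (assumption: the cluster has not overridden them).
DEFAULT_PORTS = {
    21050: "Impala (HiveServer2 protocol, used by ODBC/JDBC drivers)",
    21000: "Impala (Beeswax protocol, impala-shell default in Impala 2.x)",
    10000: "HiveServer2 (Hive, not Impala)",
}


def describe_dsn_port(ini_text, dsn_name):
    """Return (port, description) for the PORT key of one DSN section.

    Hypothetical helper: it only compares the configured port with the
    stock defaults above; it does not contact the server.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case as written (HOST, PORT, ...)
    parser.read_string(ini_text)
    port = parser.getint(dsn_name, "PORT")
    return port, DEFAULT_PORTS.get(port, "not a stock Hive/Impala default")
```

Applied to the configuration above, this would report port 10000 as HiveServer2, which is consistent with the query running more slowly than in impala-shell.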

Related

How to create ODBC "DSN" for cross-platform testing?

I need a simple ODBC test scenario on Windows that I can configure easily and be sure is working, in support of another question at Unix.SE.
In a nutshell, I'm trying to set up a PyODBC/Python script connection from Debian 10 (192.168.1.2) to Windows 10 in a KVM/QEMU virtual system (192.168.1.12).
First, on the Windows 10/KVM, I see the ODBC Data Source Administrator has a File DSN tab and a Microsoft Text Driver. Can I use a File DSN to test a Python PyODBC connection to ODBC using a simple CSV file in place of a server? (My research on ODBC only finds examples with running server instances.)
Next, what I tried:
On Debian I installed the Microsoft ODBC driver for Linux.
I shut down the Windows 10 firewall, and I can ping in both directions:
$nmap -p 22 192.168.1.12 # Deb to Win
> Test-NetConnection 192.168.1.2 -p 22 # Win to Deb
On Windows 10/KVM I added a File DSN with the Microsoft Text Driver. I created a CSV file (odbc_test_01.csv) with a simple header and one row of data (i.e., {'ID' : 1, 'NAME' : 'FOO'}).
Created a Jupyter Notebook to make testing easier. Here is my connection string and the results:
cn = pyodbc.connect(r'Driver={ODBC Driver 17 for SQL Server};' # Driver installed above
r'FILEDSN=odbc_test_01.csv;' # my attempt at FileDSN
r'SERVER=192.168.1.12;' # KVM IP tested with ping
r'Trusted_Connection=no;' # explicit; use UID/PWD
r'UID=<username>;' # Windows user name
r'PWD=<password>', # Windows user password
autocommit=True)
OperationalError: ('HYT00', '[HYT00] [Microsoft][ODBC Driver 17 for SQL Server]Login timeout expired (0) (SQLDriverConnect)')
Tried isql from Debian command line with same string:
isql -v -k 'Driver={ODBC Driver 17 for SQL Server};FILEDSN=odbc_test_01.csv;SERVER=192.168.1.12;Trusted_Connection=no;UID=<username>;PWD=<password>'
Similar pages here at SO:
Authenticate from Linux to Windows SQL Server with pyodbc
Python pyodbc connect to Sql Server using SQL Server Authentication
An ODBC "File DSN" is not a driver for accessing data in a file. It is a way to specify a DSN (connection information for a target database) as values in a standalone file instead of in a standard configuration file on Linux (e.g., /etc/odbc.ini) or in the Windows registry.
If you need to "clone" a Windows DSN entry for use in a Linux environment then you may find my dump_dsn utility helpful. It retrieves an ODBC DSN from the Windows registry and presents it in a form that you could use to recreate the DSN on Linux.
For example, say I had a DSN named "mssql199" on Windows and when I ran dump_dsn.to_text("mssql199") on it I got
[mssql199]
Driver=ODBC Driver 17 for SQL Server
Description=with UseFMTONLY
Server=192.168.0.199
Database=myDb
Encrypt=No
TrustServerCertificate=No
ClientCertificate=
KeystoreAuthentication=
KeystorePrincipalId=
KeystoreSecret=
KeystoreLocation=
UseFMTONLY=Yes
Trusted_Connection=No
To use that same DSN on a Linux box I would have to
copy that block into /etc/odbc.ini (or equivalent, for a "System DSN") or ~/.odbc.ini (for a "User DSN") to use DSN=mssql199, or
save that block with an [ODBC] header instead of [mssql199] to a file, e.g., /home/gord/mssql199_file.dsn (for a "File DSN") and use FILEDSN=/home/gord/mssql199_file.dsn
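The second option can be sketched in Python as follows. This is only an illustration under stated assumptions: to_file_dsn is a hypothetical helper (it is not part of dump_dsn or pyodbc), and it simply swaps the section header for [ODBC] as described above.

```python
def to_file_dsn(dsn_block):
    """Rewrite a '[name]' DSN block so that its header becomes '[ODBC]'.

    Hypothetical helper for illustration; it does no validation beyond
    checking that the first line looks like a section header.
    """
    lines = dsn_block.strip().splitlines()
    if not lines or not (lines[0].startswith("[") and lines[0].endswith("]")):
        raise ValueError("expected the block to start with a [section] header")
    lines[0] = "[ODBC]"
    return "\n".join(lines) + "\n"


# A shortened version of the block shown above.
block = """[mssql199]
Driver=ODBC Driver 17 for SQL Server
Server=192.168.0.199
Database=myDb
"""

file_dsn_text = to_file_dsn(block)
# Saving file_dsn_text to, e.g., /home/gord/mssql199_file.dsn would then allow
# a connection string containing FILEDSN=/home/gord/mssql199_file.dsn
```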
Edit re: "Can I use FileDSN to test Python PyODBC connection to ODBC using a simple CSV file in place of Server??"
No. An ODBC DSN or FILEDSN on the Windows box will only be useful to connect from the Windows box to a data source (either locally, or on some other machine). We cannot connect from one machine (e.g., Linux) to an ODBC DSN entry on another machine.
I created an SQLite database. Then I added SQLite drivers for ODBC.

Connecting to SQL Server with pyodbc, FreeTDS, and Kerberos authentication on macOS

I never ask questions on forums, as I can generally find the answer somewhere on the interweb.
However, in this instance I cannot.
Summary: I can connect to and query the database with Kerberos authentication via Azure Data Studio and via tsql with FreeTDS. I cannot connect with pyodbc. I've tried tens of different configurations with no success.
My ultimate goal is to connect to the MSSQL server DB with python.
Thank you for any input.
Background
local machine macOS 10.15.4
Connected to VPN required for kerberos authentication
Have successfully queried DB from Azure Data Studio
database is Microsoft SQL Server 2016
FreeTDS
tsql -S -U 'directory\username' -> Works, can query DB
isql
isql dsn_name 'directory\username' 'password'
error DIAG [42000] [FreeTDS][SQL Server]Login failed. The login is from an untrusted domain and cannot be used with Windows authentication.
isql dsn_name 'directory\username'
error: DIAG [42000] [FreeTDS][SQL Server]Login failed. The login is from an untrusted domain and cannot be used with Windows authentication.
pyodbc
cnxn = pyodbc.connect('DSN=dsn_name;Trusted_Connection=yes')
error:
pyodbc.ProgrammingError: ('42000', '[42000] [FreeTDS][SQL Server]Login failed. The login is from an untrusted domain and cannot be used with Windows authentication. (18452) (SQLDriverConnect)')
cnxn = pyodbc.connect('DSN=dsn_name;UID=directory\username;PWD="password"')
error:
DIAG [01000] [FreeTDS][SQL Server]Adaptive Server connection failed
pyodbc.OperationalError: ('08001', '[08001] [FreeTDS][SQL Server]Unable to connect to data source (0) (SQLDriverConnect)')
Configuration
krb5.conf
[libdefaults]
default_realm = domain
[realms]
domain_same_as_default = {
kdc = kdc_address
}
odbc.ini
[dsn_name]
Description = MSSQL Server
Driver = FreeTDS
Servername = server_name
odbcinst.ini
[FreeTDS]
Description=FreeTDS Driver for Linux & MSSQL
Driver=/usr/local/lib/libtdsodbc.so
Setup=/usr/local/lib/libtdsodbc.so
UsageCount=1
[ODBC]
Trace=Yes
TraceFile=/dev/stdout
freetds.conf
[server_name]
host = ip_address
port = port_num
database = db_name
REALM = DOMAIN
I've avoided using DSNs with pyodbc, as I prefer to have all my configuration in one spot. Here's an example connection string I use with a domain.
import pyodbc

con = pyodbc.connect(
    r"DRIVER={FreeTDS};"
    r"SERVER=mssql.mydomain.com;"
    r"PORT=1433;"
    r"DATABASE=my_db;"
    "UID=MYDOMAIN\\my_username;"  # not a raw string: \\ yields one backslash
    "PWD=my_password;"
    r"TDS_Version=7.3;"
    r"Encrypt=yes;"
    r"Trusted_Connection=yes;"
)
Give that a whirl. The two backslashes (\\) are not a typo: they are needed for escaping when using Windows domain auth. The key thing you may be missing is TDS_Version. You can read more about TDS versions here: https://www.freetds.org/userguide/ChoosingTdsProtocol.html
Good luck!
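To illustrate the escaping point, here is a small sketch showing that the doubled backslash in a normal Python string produces a single backslash in the value the driver receives. The helper build_conn_str is hypothetical, and the server, database, and credentials are the placeholder values from the example above.

```python
def build_conn_str(**parts):
    """Join key/value pairs into a 'KEY=value;' ODBC connection string."""
    return "".join(f"{key}={value};" for key, value in parts.items())


# "\\" in a normal string literal is ONE backslash at runtime,
# the same value a raw string with a single backslash would give.
uid = "MYDOMAIN\\my_username"
assert uid == r"MYDOMAIN\my_username"

conn_str = build_conn_str(
    DRIVER="{FreeTDS}",
    SERVER="mssql.mydomain.com",  # placeholder host
    PORT=1433,
    DATABASE="my_db",
    UID=uid,
    PWD="my_password",            # placeholder password
    TDS_Version="7.3",
)
# conn_str can then be passed to pyodbc.connect(conn_str)
```

Building the string from keyword arguments keeps every setting in one place, which matches the preference for avoiding DSNs stated above.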

Connecting to Hive Database with DBeaver

I have a Hortonworks Hadoop cluster where the data nodes are on a separate network off of the master/head node. The only way to access the data nodes is through the master node or an edge node. From the edge node, I execute the hive command to connect into my hive database.
I cannot connect to the hive database from my desktop with DBeaver (4.3.0, 64-bit Windows) or the hive command line interface. Through DBeaver, I tried creating an SSH tunnel to my edge node and continually receive "Could not open client transport with JDBC Uri. jdbc:hive2://127.0.0.1:[port#]/[database]".
Configuration for Hive/Apache Hive driver:
General Tab:
Host: dataNodeName
Port: 10000
Database/Schema: databaseName
User name: myUID
SSH Tunnel Tab (Network page):
Checked Use SSH Tunnel
Host/IP: edgeNodeServerName
Port: 22
User Name: myUID
Authentication Method: Password
Password: myPWD
Advanced
Local port: 0
Keep-Alive interval (ms): 0
When I select "Test Connection" with local port set to "0", I receive the above error message with random port numbers. If I set the local port to "10000", I receive the above error with port number "10000".
It looks like DBeaver is ignoring the generic JDBC connection settings--the host name in the created JDBC string is 127.0.0.1 instead of the data node name.
What am I missing? How do I set up DBeaver to access a Hive database located on a "hidden" network?
Is your hostname configured with the IP address mentioned in the jdbc connect syntax (127.0.0.1)?
Are you able to connect to beeline from your Unix shell?
Syntax to connect to beeline(hiveserver2):
beeline -u jdbc:hive2://<hostname>:<hive listener port>/<database> -n <username> -p <password>
If you can connect with beeline, you should be able to connect to Hive from DBeaver using the same host and port number.
The Hive listener port is 10000 by default, but your admin may have changed it. Check the port number in hive-site.xml, or get it from your admin.
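As a rough sketch of the hive-site.xml check, the snippet below looks up the HiveServer2 Thrift port in the file's contents. It assumes the port is set through the hive.server2.thrift.port property and falls back to 10000 when the property is absent; the function name hiveserver2_port and the sample XML are illustrative only.

```python
import xml.etree.ElementTree as ET


def hiveserver2_port(hive_site_xml, default=10000):
    """Return the value of hive.server2.thrift.port, or `default` if unset."""
    root = ET.fromstring(hive_site_xml)
    for prop in root.iter("property"):
        name = (prop.findtext("name") or "").strip()
        if name == "hive.server2.thrift.port":
            return int((prop.findtext("value") or "").strip())
    return default


# Illustrative file contents, not taken from a real cluster.
sample = """<configuration>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10015</value>
  </property>
</configuration>"""

port = hiveserver2_port(sample)
```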
Could you please uncheck the SSH tunnel and try?
This link has all the setup from scratch, please check if you have missed any step.
https://www.linkedin.com/pulse/query-hive-hiveserver2-from-windows-using-universal-database-nimmala
Not sure if your environment is Kerberized or not but assuming it is -
Following is what worked for me while connecting to Cloudera -
Fetch the krb5.conf or krb5.ini from your admins and place it in some directory. I normally put the file in a location where I put my keytabs.
Create jaas.conf file and place it at the same location(or the location of your choice)
jaas.conf must look like below(copy paste) -
Client {
com.sun.security.auth.module.Krb5LoginModule required
debug=true
doNotPrompt=true
useKeyTab=true
keyTab="C:\Users\{user}\krb5cc_{user}"
useTicketCache=true
renewTGT=true
principal="{user}#DOMAIN.ORG" ;
};
Edit your dbeaver.ini file and add references to both of these files by appending the following lines to the existing dbeaver.ini. Back up dbeaver.ini first: reinstalling DBeaver or replacing it with a newer version may overwrite the file, in which case you can copy the lines below from your backup -
-Djavax.security.auth.useSubjectCredsOnly=false
-Djava.security.krb5.debug=true
-Dsun.security.krb5.debug=true
-Djava.security.krb5.conf=C:\Users\{User}\Documents\Keytabs\krb5.conf
-Djava.security.auth.login.config=C:\Users\{User}\Documents\Keytabs\jaas.conf
Last step (you may or may not need it)
I init my keytab before connecting. So I use Shell Commands -
Press F4 after creating the connection
Make sure the user field contains only the user name for which you are initializing the keytab, and nothing else. It should not be {user}#domain.org.
Use the shell commands to init the keytab
I was also having trouble connecting DBeaver to Hive; my solution was to use Cloudera's ODBC driver. It worked a lot better than the JDBC drivers (auto-complete works, it is quicker, and there is no need to run kinit), and I could automate its creation.
The only problem is that you must be admin to install it.

Connecting to Oracle 12 on remote server using Python 2.7.12 results into ORA-12170

Both databases are on remote server and I can get connected to and query on them using TOAD.
When I connect from Python on my desktop to the database configured with OraClient11g_Home1, the connection is established successfully. However, trying to connect to the database that uses OraClient12Home1 results in an ORA-12170 error, i.e. TNS: Connect timeout occurred. The configurations are below.
Edited to contain more information:
I connect to the database using a remote desktop connection. The code is written to automate part of my testing activities by querying two databases and checking whether a single command has been successful on multiple systems (e.g. Ericsson and Huawei).
The output of one query is the input to another. I can get output from the 11g DB and have previously written scripts for it, but this is the first time we're connecting to the DB on Ora12 using Python. I can access both DBs using TOAD on the remote desktop, and I can connect to and query the 11g DB using Python on my desktop, but Ora12 times out with the same code.
The connection part of the code, and how the databases are queried, is below:
#Get chrono number, action code and status from provisioning table
ip = '********'
port = *****
service_name = '*****'
dsn = cx_Oracle.makedsn(ip, port, service_name=service_name)  # the third positional argument is the SID, so pass the service name by keyword
connection = cx_Oracle.connect("********","********",dsn)
cursor = connection.cursor()
totalChronoList = list()
myQuery=list()
inputData = list()
myQuery = ("select CHRONO_NUM_N, ACTION_CODE_V, STATUS_V from gsm_subs_provisioning where ACTION_DT_DT > SYSDATE - 2 order by ACTION_DT_DT desc")
cursor.execute(myQuery)
inputData.append(cursor.fetchall())
The configurations are as below:
OraClient11g_home1 (11.2.0.1)
ORACLE_HOME:C:\Oracle\product\11.2.0\client_1
ORACLE_HOME_NAME:OraClient11g_home1
ORACLE_HOME_KEY:HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_OraClient11g_home1
ORACLE_SID:
NLS_LANG:AMERICAN_AMERICA.WE8MSWIN1252
SQLPATH:C:\Oracle\product\11.2.0\client_1\dbs
LOCAL:
Client DLL:C:\Oracle\product\11.2.0\client_1\oci.dll
TNSNames.ora:C:\Oracle\product\11.2.0\client_1\Network\Admin\tnsnames.ora
SQLNet.ora:C:\Oracle\product\11.2.0\client_1\Network\Admin\sqlnet.ora
LDAP.ora:C:\Oracle\product\11.2.0\client_1\Network\Admin\ldap.ora
Login.sql:
GLogin.sql:
In system PATH:Yes
Home is valid:Yes
OraClient12Home1 (12.1.0.2)
ORACLE_HOME:E:\app\client\Oracle\product\12.1.0\client_1
ORACLE_HOME_NAME:OraClient12Home1
ORACLE_HOME_KEY:HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_OraClient12Home1
ORACLE_SID:
NLS_LANG:AMERICAN_AMERICA.WE8MSWIN1252
SQLPATH:E:\app\client\Oracle\product\12.1.0\client_1\dbs
LOCAL:
Client DLL:E:\app\client\Oracle\product\12.1.0\client_1\bin\oci.dll
TNSNames.ora:
SQLNet.ora:E:\app\client\Oracle\product\12.1.0\client_1\Network\Admin\sqlnet.ora
LDAP.ora:
Login.sql:
GLogin.sql:E:\app\client\Oracle\product\12.1.0\client_1\sqlplus\admin\glogin.sql
In system PATH:Yes
Home is valid:Yes
ORA-12170: TNS:Connect timeout occurred means you can't reach the host and/or port of the DB. I bet in your case it is a firewall restriction (the most common reason, though there may be others). First, check whether the port is accessible. The easiest way is to run this PowerShell statement:
Test-NetConnection <host-or-ip> -port <port>
Then take the findings to your sysadmin or DBA.
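Since the failing script is written in Python, the same reachability check can be run from the desktop with the standard library. This is a minimal sketch, not a replacement for Test-NetConnection: port_is_open is a hypothetical helper name, and it only attempts a plain TCP connection without speaking the Oracle Net protocol.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) succeeds in time."""
    try:
        # create_connection resolves the host and applies the timeout
        # to the connection attempt.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, port_is_open(ip, port) with the values used in the makedsn call returning False would point to a network or firewall problem rather than a client configuration problem.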
Update: As you connect to DB using easy access method (ip, port, service name), you don’t have to care about tnsnames.ora.
In your 12c client we can see that no tnsnames.ora file is found.
Copy this file from the 11g client directory.

HIVE ODBC connector settings

I configured unixODBC to use the Hive connector from Cloudera on my Linux Mint machine,
but I keep receiving the following error when trying to connect to hive (e.g. using isql -v hive)
[S1000][unixODBC][Cloudera][ODBC] (11560) Unable to locate SQLGetPrivateProfileString function.
[ISQL]ERROR: Could not SQLConnect
I think I set the /etc/odbcinst.ini and the ~/.odbc.ini in the correct way:
# content of /etc/odbcinst.ini
[hive]
Description = Cloudera ODBC Driver for Apache Hive (64-bit)
Driver=/opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
ODBCInstLib=libodbcinst.a(libodbcinst.so.1)
UsageCount = 1
DriverManagerEncoding=UTF-16
ErrorMessagesPath=/opt/cloudera/hiveodbc/ErrorMessages/
LogLevel=0
SwapFilePath=/tmp
and my ~/.odbc.ini file contains:
[hive]
Description=Cloudera ODBC Driver for Apache Hive (64-bit) DSN
Driver = hive
ErrorMessagesPath=/opt/cloudera/hiveodbc/ErrorMessages/
# Values for HOST, PORT, KrbHostFQDN, and KrbServiceName should be set here.
# They can also be specified on the connection string.
HOST= <the host>
PORT= <the port>
Schema=<the schema>
# .. etc
Can you help me find out what is causing the error?
What does
ldd /opt/cloudera/hiveodbc/lib/64/libclouderahiveodbc64.so
Show you?
It may be that the driver is not linked to libodbcinst.so.
You could try a
LD_PRELOAD=/usr/local/libodbcinst.so
or wherever libodbcinst.so is on your machine.
Are you sure that your ODBCInstLib is set properly?
I was hitting the same issue with a Vertica driver and my libodbcinst.so.1 ended up needing an absolute path: /usr/lib/x86_64-linux-gnu/libodbcinst.so.1
I determined the path by running a Find for libodbcinst.so.
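The search for the library can also be scripted. The sketch below walks a few common library directories looking for files that match libodbcinst.so*; the directory list and the helper name find_shared_lib are assumptions, so adjust the roots for your distribution.

```python
import fnmatch
import os


def find_shared_lib(pattern, roots=("/usr/lib", "/usr/lib64", "/usr/local/lib", "/lib")):
    """Return sorted absolute paths of files under `roots` matching `pattern`."""
    matches = []
    for root in roots:
        # os.walk silently skips roots that do not exist or cannot be read.
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in fnmatch.filter(filenames, pattern):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


# Usage (not run here): find_shared_lib("libodbcinst.so*")
```

Any path it returns is a candidate value for an absolute ODBCInstLib setting.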

Resources