Knox Gateway database connector - hadoop

Is it possible to write a DB connector using a Knox Provider extension?
I want Knox to expose an endpoint that, when called, would put records into or fetch records from a database.

It depends on how you want to connect and what functionality you want to add. Currently, Knox supports the HTTP and WebSocket protocols, so if you are using JDBC to connect, that will not work. If you are using JDBC over HTTP (e.g. beeline), that might work. Here are a few links to the Knox documentation:
User Guide
Dev Guide
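To make the HTTP option concrete, here is a minimal Python sketch of a client calling a Knox-proxied REST endpoint; the gateway host, topology name ("default"), service path (WebHDFS) and credentials are placeholder assumptions, not something taken from your cluster:

import requests

# Hypothetical Knox gateway URL: https://<knox-host>:8443/gateway/<topology>/<service>
KNOX_URL = "https://knox.example.com:8443/gateway/default/webhdfs/v1/"

# Knox authenticates the incoming HTTP request (e.g. via its configured
# authentication provider) and then proxies it to the backing service over HTTP.
resp = requests.get(
    KNOX_URL,
    params={"op": "LISTSTATUS"},                # standard WebHDFS operation
    auth=("guest", "guest-password"),           # placeholder credentials
    verify="/path/to/knox-gateway-ca.pem",      # Knox TLS certificate / CA
)
print(resp.status_code, resp.json())

A custom service (or a database fronted by a REST layer) could be proxied the same way, but the backend still has to speak HTTP or WebSocket; Knox cannot tunnel a raw JDBC/TCP connection.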

Related

How to configure "Proxy user request" for NiFi CLI

According to the documentation, one prerequisite for using the NiFi CLI against a secured NiFi instance is to configure the "proxy user requests" policy for the node's identity (e.g. CN=localhost, OU=NIFI).
https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html#prerequisites-for-running-in-a-secure-environment
I understand how to configure it through the NiFi web user interface. However, is it possible to do the same through scripting?
The reason is that I am working on a NiFi installation script, and I would like to install NiFi and configure users/policies in one go if it is possible.
Thank you!
If you are trying to use the NiFi CLI to set up NiFi itself, then your only real option is for the NiFi CLI to perform operations as the Initial Admin identity.
It then depends on how NiFi is configured to perform authentication, meaning where your initial admin identity is coming from. Is it a DN from a client cert, a user in LDAP, a Kerberos principal, etc.?
If it is a client cert, then you can just configure the NiFi CLI to use that cert and it should work.
If it is an LDAP user, then you need to have the NiFi CLI use one of NiFi's server certs to proxy the LDAP user.
Both of these scenarios are shown in the docs:
https://nifi.apache.org/docs/nifi-docs/html/toolkit-guide.html#security-configuration
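To illustrate the second scenario (a NiFi server cert proxying the LDAP initial admin), here is a rough Python sketch of the equivalent REST call; the host, certificate paths and user DN are placeholders, and the same idea maps onto the NiFi CLI properties described in the linked security-configuration section:

import requests

NIFI_API = "https://nifi.example.com:8443/nifi-api"   # placeholder NiFi host

# The call is authenticated with one of NiFi's own server certificates, and the
# LDAP end-user identity is carried in the X-ProxiedEntitiesChain header.
resp = requests.get(
    f"{NIFI_API}/flow/current-user",
    cert=("/path/to/nifi-server-cert.pem", "/path/to/nifi-server-key.pem"),
    verify="/path/to/nifi-ca.pem",
    headers={"X-ProxiedEntitiesChain": "<cn=admin,ou=users,dc=example,dc=org>"},
)
print(resp.json())

Note that the certificate's identity also needs the "proxy user requests" policy in NiFi, which is exactly the prerequisite mentioned in the question.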

HDP: Confused between Knox proxy (API), proxy (UI) and SSO

In our HDP environment, we have already set up the first and second security layers, Kerberos and Ranger. Now we want to add the third layer, Knox.
After reading the Knox reference for HDP 3.1.4 and other documentation pages, we found three options for implementing Knox in an HDP cluster: Knox proxy (API), proxy (UI), and SSO.
For our needs we want to use the proxy (API), and we have found hints on how to implement it (combined with Kerberos).
But we are confused about how to implement the proxy (UI) and SSO features:
What is the difference between the two?
When should we use proxy (UI) and when should we use SSO?
Can we use all three Knox options together?
Based on this link, there is a step that requires configuring Ambari authentication, "Configure Ambari Authentication for LDAP/AD". Does that mean we drop Ambari authentication with Kerberos?
But what about the Knox support matrix, which states that Knox SSO can be configured in a Kerberized cluster?
We can't find how to use and configure Knox proxy (UI). Does it mean that if we want to launch the Atlas UI, the Knox authentication form appears first, or is it something different?
Regards

How to use anything but Google Shell or Web browser when oauth2.googleapis.com is blacklisted (not sure about this)?

I cannot connect to Google services from a client application if it is trying to communicate with oauth2.googleapis.com (which is probably blocked in my corporate network; I don't know how to test this for sure).
I tried BigQuery with the JDBC driver in DBeaver, with basic settings.
User-based login does this:
It generates a link for OAuth. I open the browser and log in with the right Google account. Then I paste the generated code into DBeaver and get a message that authentication has failed.
Service-based login does this:
It does not ask me to visit any web page. It just tells me:
[Simba][BigQueryJDBCDriver](100004) HttpTransport IO error : oauth2.googleapis.com.
I also tried ODBC, where a proxy can be filled in, but no luck.
When I look at the 'Proxy Options', the proxy port is always overwritten by the proxy host, which is odd.
This is what happens when I click on the 'catalog' or 'dataset' drop-down field; I can't take any further steps.
But!
When I set my HTTP proxy for the gcloud CLI, communication works and I can call BigQuery from it.
Does this mean that gcloud communicates through the HTTP proxy while DBeaver and ODBC do not? Or does it mean that gcloud does not need oauth2.googleapis.com, but ODBC and JDBC do and it is blacklisted? I am confused.
We need to migrate from our internal environment to GCP and would love to use various applications. I would ask for whitelisting oauth2.googleapis.com, but I am not sure that is the only problem, since the gcloud app works without any flaws.
I am not experienced with networking, so I am more than happy to update/correct this question or add any info (if you need it) to help me understand this issue. Thank you
According to your description, your corporate network uses a proxy to reach the Internet; this is why gcloud is able to reach the BigQuery service once proxy settings are configured on your system, either through the Cloud SDK proxy settings or the HTTP_PROXY environment variable.
You need to set the proxy settings within the JDBC connection string, as described in the Simba JDBC driver documentation, e.g.:
jdbc:bigquery:DataSetId=MyDataSetId;ProjectId=MyProjectId;OAuthType=1;ProxyHost=MyProxyHost;ProxyPort=MyProxyPort;ProxyUID=MyProxyUsername;ProxyPWD=MyProxyPassword
This connection string passes the proxy settings to the Simba JDBC driver.
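If you want to verify the same connection string outside DBeaver, a rough Python sketch using jaydebeapi could look like the following; the driver class name, JAR path and all proxy/project values are assumptions to adjust for your environment:

import jaydebeapi

# Assumed Simba BigQuery driver class and JAR location; check your driver package.
DRIVER_CLASS = "com.simba.googlebigquery.jdbc42.Driver"
DRIVER_JAR = "/path/to/GoogleBigQueryJDBC42.jar"

# Same proxy options as in the connection string above (placeholder values).
url = (
    "jdbc:bigquery:DataSetId=MyDataSetId;ProjectId=MyProjectId;OAuthType=1;"
    "ProxyHost=proxy.corp.example.com;ProxyPort=3128;"
    "ProxyUID=MyProxyUsername;ProxyPWD=MyProxyPassword"
)

conn = jaydebeapi.connect(DRIVER_CLASS, url, jars=DRIVER_JAR)
cursor = conn.cursor()
cursor.execute("SELECT 1")
print(cursor.fetchall())
conn.close()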

Authenticate Nifi using OpenID Connect using API

I am new to OpenID Connect and the security domain. I have configured NiFi to use OpenID Connect for authentication using the online documentation, and to automate a few NiFi-related tasks we are using NiPyApi.
I have already written Python code that does automated flow deployment for a basic NiFi installation (unsecured and without user authentication).
Now I have to move the code to a secured NiFi installation. How do I authenticate against OpenID Connect using NiPyApi / the REST API?
Update: as per the discussion with Bryan, I am planning to use a client certificate for authentication, but it started giving an authorization error; I have created another question with the details:
Nifi - Client Certificate Authorization Error
OpenID Connect generally requires that you follow a flow of redirects, typically in the browser. NiFi redirects you to the login page of the OIDC provider and, upon completion, the OIDC provider redirects you back to NiFi. I'm not exactly sure how, or if you even can, perform this login process from scripts. An easy alternative would be to just generate a client certificate to represent an automation user for any NiPyApi scripts. Client certificate authentication is always enabled by default for NiFi.
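For the client-certificate route, a minimal NiPyApi sketch might look like this; the host and certificate paths are placeholders, and the certificate's DN must already exist as a NiFi user with the required policies:

import nipyapi

# Placeholder NiFi API endpoint.
nipyapi.config.nifi_config.host = "https://nifi.example.com:8443/nifi-api"

# Present the automation user's client certificate on every request.
nipyapi.security.set_service_ssl_context(
    service="nifi",
    ca_file="/path/to/nifi-ca.pem",
    client_cert_file="/path/to/automation-user-cert.pem",
    client_key_file="/path/to/automation-user-key.pem",
)

# Simple smoke test: fetch the root process group id.
print(nipyapi.canvas.get_root_pg_id())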

Hive and Tableau - Proxies/ODBC?

So, I have a Hive server (Cloudera, Thrift via HTTP) set up and working, and can connect to it from Tableau using the ODBC driver for Cloudera Hive - all good, from the servers in the AWS farm.
However, no luck from the client site/their end-user PCs.
The reason for this is that they require all outbound traffic to the internet (here, my AWS instance) to go through proxies using NTLM, and I can't get the Cloudera ODBC driver to talk via the NTLM proxy. It appears to ignore the Windows proxy settings entirely, in fact.
I'm aware of two (obvious) solutions: run Fiddler/cntlm locally on the box as a local proxy, or set up a reverse proxy in the customer's network and point ODBC at that. Both of these are somewhat unpalatable to the users.
So: Is there a way to get Cloudera's ODBC driver (or Windows itself) to forcibly go via an NTLM proxy without requiring additional software/servers? Or is there a Cloudera-Hive-compatible Tableau connector that works well with proxies in the middle?
TL;DR: Need to get from Tableau client on Windows to Cloudera Hive in AWS across an NTLM proxy. Thoughts?
The Cloudera Hive ODBC driver currently doesn't support proxies or NTLM authentication. If this feature is important to you, I would suggest raising it as a feature request with Cloudera. I am not aware of any other Hive ODBC driver that supports proxies and NTLM.
Holman
