How to switch or change user in Hue - hadoop

Is there an option to switch user in HUE?
In my organization, the infrastructure team set up a usecase ID that has full access to the HDFS file system, and only the usecase ID can submit YARN jobs. Individual users can switch to the usecase ID with sudo su - xyz. There is no password for the usecase ID.
I am able to log in to Hue but can't submit any jobs because I don't have access to any queue, so I want to switch to the usecase ID after logging in to Hue. How do I switch user (sudo su - xyz) in Hue?

Hue, by default, can only run under the first account that was used to log in.
You need to ask the infrastructure team to configure Hue with PAM or LDAP login authentication, in which case a password will be required for every Hue login user.
Once that's set up, you will also be able to switch accounts.
There are other configurations, but for enterprise users, I think those are the best options other than a single sign-on OAuth/OpenID tool.
There's also SPNEGO, but that requires a completely kerberized cluster.
Realistically, it sounds like your company's cluster is not using Kerberos, so it isn't even secure.
For example, you don't even need to sudo. Just export a variable:
export HADOOP_USER_NAME=usecase
Of course, this isn't possible in Hue, but if you already have SSH access, you really can do anything you want in the cluster.

Configure core-site.xml so that Hue can impersonate its users when accessing the YARN cluster to submit jobs.
key: hadoop.proxyuser.default_user.hosts
value: *
key: hadoop.proxyuser.default_user.groups
value: *
Replace default_user with hue or whichever system user manages your application.
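The key/value pairs above end up as property elements in core-site.xml. A minimal sketch of how they could be generated, assuming the proxy user is hue; the function names proxyuser_properties and render are illustrative and not part of Hadoop or Hue:

```python
import xml.etree.ElementTree as ET


def proxyuser_properties(user, hosts="*", groups="*"):
    """Build <property> elements for the hadoop.proxyuser.<user>.* settings."""
    settings = {
        f"hadoop.proxyuser.{user}.hosts": hosts,
        f"hadoop.proxyuser.{user}.groups": groups,
    }
    elements = []
    for name, value in settings.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        elements.append(prop)
    return elements


def render(user, hosts="*", groups="*"):
    """Return the properties as XML text, one <property> per line."""
    return "\n".join(
        ET.tostring(prop, encoding="unicode")
        for prop in proxyuser_properties(user, hosts, groups)
    )


# Paste the result inside the <configuration> element of core-site.xml.
print(render("hue"))
```

A value of * allows impersonation from any host and of any group; a tighter deployment would probably list specific hosts and groups instead.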
Hue supports several backend modules for authentication and authorization:
django.contrib.auth.backends.ModelBackend
desktop.auth.backend.AllowAllBackend
desktop.auth.backend.AllowFirstUserDjangoBackend
desktop.auth.backend.LdapBackend
desktop.auth.backend.PamBackend
desktop.auth.backend.SpnegoDjangoBackend
desktop.auth.backend.RemoteUserDjangoBackend
libsaml.backend.SAML2Backend
libopenid.backend.OpenIDBackend
liboauth.backend.OAuthBackend (new OAuth backend; supports Twitter, Facebook, Google+ and LinkedIn)
Each module has its own advantages and disadvantages. For a very large-scale deployment where many users need to access the Hue application, LDAP, SAML2, or OAuth authentication is preferred over PAM and Django-based login.

Related

Evolving Tarantool instance

What should we do if we have one Tarantool instance (without Cartridge or VShard) and, sometime in the future, need to replicate it to another machine without downtime?
Or, if the easiest way is to use Cartridge, how do we connect to Tarantool Cartridge from outside the Cartridge cluster? For example, using Go (what are the username and password?):
taran, err = tarantool.Connect(cfg.Tarantool.Addr, tarantool.Opts{
    User:          cfg.Tarantool.User,
    Pass:          cfg.Tarantool.Pass,
    Reconnect:     10 * time.Second,
    MaxReconnects: 8640,
})
For example, in other databases you only need to attach a new slave to the master (one command-line call) and wait for it to sync (100% replicated).
Not sure I'll completely answer your question. But let's discuss every point separately.
Replication
You can use replication without vshard or Cartridge. vshard is a module for sharding; if you don't need sharding, you can use the replication feature alone.
Read about configuring replication in the documentation: https://www.tarantool.io/en/doc/latest/book/replication/. Cartridge is just a framework that simplifies cluster management and gives you a large number of useful features.
User password
You also asked about users and passwords. After you call box.cfg{listen=...}, you can create a user, grant it privileges, and change its password. Please read about user management in Tarantool in our documentation: https://www.tarantool.io/en/doc/latest/book/box/authentication/. Once the user exists, you can connect to the Tarantool instance as that user via a connector, the console (using tarantoolctl), or another Tarantool instance (using the net.box module).
As for Cartridge, it uses the system user admin with the cluster-cookie as its password.

What determines what user / groups Ranger can see when setting policies?

I have users on local machines with HDFS /user directories who do not show up as selectable users when setting Ranger policies.
I can see that Ranger already has a place in the settings menu of the Ranger UI where you can view and add users, but I'm not sure where this list is populated from.
So my question is: what determines whether Ranger can see cluster users when setting policies (and is there an easy way to manage this via Ambari)?
The problem was that I had assumed, based on an answer on the Hortonworks community forums, that for a user to be recognized as "existing" on the HDP cluster, all that was required was for the user to 1) exist on a cluster node and 2) have a folder in hdfs:///user/<the username>. This apparently is not correct (at least in the case of being recognized by Ranger as a valid user that can have policies set on them).
In order for a user to be recognized by Ranger (here, I do not have a cluster integrated with Kerberos or Active Directory), that user needs to exist on the usersync server machine which supports...
the ability [for Ranger] to get users and groups from the corporate AD to use in policy definitions.

How to understand the process of Kerberos (over Hadoop)?

I have deployed Kerberos in a Hadoop cluster. According to the theory, the KDC verifies that you are who you claim to be, based on the private key.
However, using the system confused me. For example, if you need access to HDFS, all you need to do is enter "kinit hdfs@MY.REALM" and the password from a client. Then you get a ticket and can manipulate HDFS as the superuser "hdfs".
Is this the real Kerberos process? If the user is verified only by a password, why don't we simply keep a list on the server and require the user to enter a username and password? Where is the private key mentioned in the theory? Can anyone explain this to me, please?

How to use the ResourceManager web interface as a user

Every time I try to use the Hadoop ResourceManager web interface (http://resource-manger.host:8088/cluster/) I show up logged in as dr.who.
My question: how can I log in as another user? In this case I want to log in as myself and have a higher level of privileges than dr.who.
The user information is obtained from HttpServletRequest#getRemoteUser().
1. If you deployed an insecure cluster, the simplest way to pass the username to the server is a URL parameter. For example, http://localhost:8088/cluster?user.name=babu
2. If you deployed a secure cluster, you probably use Kerberos authentication. You can use kinit to get a Kerberos TGT, then configure the browser to negotiate (network.negotiate-auth.trusted-uris for Firefox, and --auth-server-whitelist for Chromium; there are plenty of answers about this).
For more information, you can check the official Hadoop documentation (https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/HttpAuthentication.html).
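For the insecure-cluster case in point 1, the user.name parameter can be appended to any ResourceManager URL. A minimal sketch using only the Python standard library; the helper name with_user_name is illustrative, and the host and user are the ones from the example above:

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_user_name(url, user):
    """Return url with its user.name query parameter set to user."""
    parts = urlsplit(url)
    # Keep any existing parameters, but drop a previous user.name value.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "user.name"
    ]
    query.append(("user.name", user))
    return urlunsplit(parts._replace(query=urlencode(query)))


print(with_user_name("http://localhost:8088/cluster", "babu"))
```

This only works with simple (pseudo) authentication; on a Kerberos-secured cluster the parameter does not replace the SPNEGO negotiation described in point 2.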
You should set the access control list by changing the default configuration of:
yarn.resourcemanager.zk-acl
from
world:anyone:rwcda
to something else, which is cluster-specific.
These are the ACLs the ResourceManager uses for the znode structure that stores its internal state.

How to run "hadoop jar" as another user?

hadoop jar uses the name of the currently logged-in user. Is there a way to change this without adding a new system user?
There is, through a feature called Secure Impersonation, which lets one user submit jobs on behalf of another (that user must exist, though). If you're running as the hadoop superuser, it's as simple as setting the environment variable $HADOOP_PROXY_USER.
If you want to impersonate a user which doesn't exist, you'll have to do the above and then implement your own AuthenticationHandler.
If you don't have to impersonate too many users, I find it easiest to just create those users on the namenode and use secure impersonation in my scripts.
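A script that uses this approach might look like the following. This is a minimal sketch, assuming the hadoop CLI is on the PATH and that the calling account is permitted to impersonate the target user through the cluster's hadoop.proxyuser.* settings; the helper names, the jar path, the class name, and the user alice are all hypothetical:

```python
import os
import subprocess


def proxy_env(proxy_user, base_env=None):
    """Return a copy of the environment with HADOOP_PROXY_USER set."""
    env = dict(os.environ if base_env is None else base_env)
    env["HADOOP_PROXY_USER"] = proxy_user
    return env


def hadoop_jar_command(jar_path, main_class, *args):
    """Build the argument list for a `hadoop jar` invocation."""
    return ["hadoop", "jar", jar_path, main_class, *args]


def run_jar_as(proxy_user, jar_path, main_class, *args):
    """Run `hadoop jar` on behalf of proxy_user; raises if the command fails."""
    return subprocess.run(
        hadoop_jar_command(jar_path, main_class, *args),
        env=proxy_env(proxy_user),
        check=True,
    )


# Hypothetical usage:
# run_jar_as("alice", "wordcount.jar", "com.example.WordCount", "/in", "/out")
```

Setting the variable only in the child process environment keeps the impersonation scoped to that one job instead of the whole shell session.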

Resources