Can't get Master Kerberos principal for use as renewer for Talend Batch Jobs - hadoop

We are trying to use Talend batch (Spark) jobs to access Hive in a Kerberos cluster, but we are getting the "Can't get Master Kerberos principal for use as renewer" error below.
Using the standard (non-Spark) jobs in Talend, we can access Hive without any issue.
Below are the observations:
When we run Spark jobs, Talend is able to connect to the Hive metastore and validate the syntax. For example, if I provide a wrong table name it returns "table not found".
When we run select count(*) on a table that has no data, it returns "NULL", but if some data is present in HDFS for that table it fails with the error
"Can't get Master Kerberos principal for use as renewer".
I am not sure exactly what is causing the token problem. Could someone help us find the root cause?
One more thing to add: if I read/write to HDFS instead of Hive using Spark batch jobs, it works. So the only problem is with Hive and Kerberos.

You should include the Hadoop configuration directory in the classpath (:/path/hadoop-configuration), and that directory should contain all of the cluster's configuration files, not only core-site.xml and hdfs-site.xml. It happened to me and that solved the problem.
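For illustration, a minimal sketch of what that might look like (the directory path and exact file list are assumptions, not from the original answer); the key point is that yarn-site.xml, which carries yarn.resourcemanager.principal, sits alongside the HDFS files:
$ ls /path/hadoop-configuration
core-site.xml  hdfs-site.xml  mapred-site.xml  yarn-site.xml  hive-site.xml
$ export CLASSPATH=$CLASSPATH:/path/hadoop-configuration   # append the whole directory, not individual files
Without yarn.resourcemanager.principal visible to the job, TokenCache cannot determine the renewer principal, which matches the error above.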

I hit the same problem when starting Spark on Kubernetes:
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: java.io.IOException: Can't get Master Kerberos principal for use as renewer
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:133)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100)
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80)
at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:243)
at org.apache.spark.input.WholeTextFileInputFormat.setMinPartitions(WholeTextFileInputFormat.scala:52)
at org.apache.spark.rdd.WholeTextFileRDD.getPartitions(WholeTextFileRDD.scala:54)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:273)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:269)
at scala.Option.getOrElse(Option.scala:121)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:269)
I fixed it by adding a yarn-site.xml to the HADOOP_CONF_DIR.
The yarn-site.xml only needs to contain yarn.resourcemanager.principal:
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.principal</name>
    <value>yarn/_HOST@DM.COM</value>
  </property>
</configuration>
This worked for me.

Related

Write to HDFS/Hive using NiFi

I'm using NiFi 1.6.0.
I'm trying to write to HDFS and to Hive (Cloudera) with NiFi.
On "PutHDFS" I configured the "Hadoop Configuration Resources" with the hdfs-site.xml and core-site.xml files and set the directories, but when I try to start it I get the following error:
"Failed to properly initialize processor, If still shcedule to run,
NIFI will attempt to initalize and run the Processor again after the
'Administrative Yield Duration' has elapsed. Failure is due to
java.lang.reflect.InvocationTargetException:
java.lang.reflect.InvicationTargetException"
On "PutHiveStreaming" I'm configure the "Hive Metastore URI" with
thrift://..., the database and the table name and on "Hadoop
Confiugration Resources" I'm put the Hive-site.xml location and when
I'm trying to Start it I got the following error:
"Hive streaming connect/write error, flow file will be penalized and routed to retry.
org.apache.nifi.util.hive.HiveWritter$ConnectFailure: Failed connectiong to EndPoint {metaStoreUri='thrift://myserver:9083', database='mydbname', table='mytablename', partitionVals=[]}:".
How can I solve the errors?
Thanks.
For #1, if you got your *-site.xml files from the cluster, it's possible that they use internal IPs to refer to components like the DataNodes, and you won't be able to reach those directly. Try setting dfs.client.use.datanode.hostname to true in your hdfs-site.xml on the client.
For #2, I'm not sure PutHiveStreaming will work against Cloudera, IIRC they use Hive 1.1.x and PutHiveStreaming is based on 1.2.x, so there may be some Thrift incompatibilities. If that doesn't seem to be the issue, make sure the client can connect to the metastore port (looks like 9083).
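For the metastore connectivity check, a quick hedged sketch from the NiFi host (hostname and port taken from the error message above; nc is just one way to test reachability):
$ nc -vz myserver 9083   # should report the metastore port as open/reachable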

hadoop BlockMissingException

I am getting the below error:
Diagnostics: org.apache.hadoop.hdfs.BlockMissingException: Could not obtain block: BP-467931813-10.3.20.155-1514489559979:blk_1073741991_1167 file=/user/oozie/share/lib/lib_20171228193421/oozie/hadoop-auth-2.7.2-amzn-2.jar
Failing this attempt. Failing the application.
I have set a replication factor of 3 for the /user/oozie/share/lib/ directory. Most of the jars under this path are available on 3 DataNodes, but a few jars are missing.
Can anybody suggest why this is happening and how to prevent it?
I was getting the same exception while trying to read a file from hdfs. The solution under the section "Clients use Hostnames when connecting to DataNodes" from this link worked for me:
https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsMultihoming.html#Clients_use_Hostnames_when_connecting_to_DataNodes
I added this XML block to "hdfs-site.xml" and restarted the datanode and namenode servers:
<property>
  <name>dfs.client.use.datanode.hostname</name>
  <value>true</value>
  <description>Whether clients should use datanode hostnames when
    connecting to datanodes.
  </description>
</property>
Please check the file's owner in the HDFS directory. I hit this issue because the owner was "root"; it was solved when I changed it to "your_user".
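A hedged sketch of that check and fix (the path and the user/group names are placeholders):
$ hdfs dfs -ls /user/oozie/share/lib/                                # inspect the owner column
$ hdfs dfs -chown -R your_user:your_group /user/oozie/share/lib/     # change ownership recursively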
I got the same error when using Trino to connect to Hive. I tried to connect to HDFS from a Trino worker and found that port 9866 was not open on the HDFS DataNode; opening the port solved the problem. Related documents: https://www.ibm.com/docs/en/spectrum-scale-bda?topic=requirements-firewall-recommendations-hdfs-transparency and https://hadoop.apache.org/docs/r3.0.0/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
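For reference, a hedged way to reproduce that check from a Trino worker (the hostname is a placeholder; 9866 is the default dfs.datanode.address data-transfer port in Hadoop 3):
$ nc -vz <datanode-host> 9866   # fails if a firewall blocks the DataNode data port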

RDD to HDFS - authentication error - RetryInvocationHandler

I have an RDD that I wish to write to HDFS.
data.saveAsTextFile("hdfs://path/vertices")
This returns:
WARN RetryInvocationHandler: Exception while invoking ClientNamenodeProtocolTranslatorPB.getFileInfo over null. Not retrying because try once and fail.
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[TOKEN, KERBEROS]
I have checked Kerberos and the client is properly authenticated.
How do I solve this?
Well,
You need to check the path /etc/security/keytabs and verify that your Spark keytab is there.
This is the recommended path for Kerberos keytabs, but it may be somewhere else in your setup.
Most importantly, this keytab should be present on all worker machines at the same path.
Another thing you can check is the Spark configuration, which should be installed in:
SPARK_HOME/conf
This folder should contain the file spark-defaults.conf, which needs to have the following settings:
spark.history.kerberos.enabled true
spark.history.kerberos.keytab /etc/security/keytabs/spark.keytab
spark.history.kerberos.principal user@DOMAIN.LOCAL
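A hedged way to verify the keytab actually works on each worker (principal and keytab path taken from the settings above; adjust to your realm):
$ kinit -kt /etc/security/keytabs/spark.keytab user@DOMAIN.LOCAL
$ klist   # should show a valid ticket for user@DOMAIN.LOCAL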
The issue was actually related to how you reference a file in HDFS when using Kerberos.
Rather than hdfs://<HOST>:<HTTP_PORT>,
it is webhdfs://<HOST>:<HTTP_PORT>.
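Applied to the call from the question, that would look like the following (the host and port placeholders are kept as-is):
data.saveAsTextFile("webhdfs://<HOST>:<HTTP_PORT>/path/vertices")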

Hive not fully honoring fs.default.name/fs.defaultFS value in core-site.xml

I have the NameNode service installed on a machine called hadoop.
The core-site.xml file has the fs.defaultFS (equivalent to fs.default.name) set to the following:
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://hadoop:8020</value>
</property>
I have a very simple table called test_table that currently exists in the Hive server on the HDFS. That is, it is stored under /user/hive/warehouse/test_table. It was created using a very simple command in Hive:
CREATE TABLE test_table (record_id INT);
If I attempt to load data into the table locally (that is, using LOAD DATA LOCAL), everything proceeds as expected. However, if the data is stored on the HDFS and I want to load from there, an issue occurs.
I run a very simple query to attempt this load:
hive> LOAD DATA INPATH '/user/haduser/test_table.csv' INTO TABLE test_table;
Doing so leads to the following error:
FAILED: SemanticException [Error 10028]: Line 1:17 Path is not legal ''/user/haduser/test_table.csv'':
Move from: hdfs://hadoop:8020/user/haduser/test_table.csv to: hdfs://localhost:8020/user/hive/warehouse/test_table is not valid.
Please check that values for params "default.fs.name" and "hive.metastore.warehouse.dir" do not conflict.
As the error states, it is attempting to move from hdfs://hadoop:8020/user/haduser/test_table.csv to hdfs://localhost:8020/user/hive/warehouse/test_table. The first path is correct because it references hadoop:8020; the second path is incorrect, because it references localhost:8020.
The core-site.xml file clearly states to use hdfs://hadoop:8020. The hive.metastore.warehouse.dir value in hive-site.xml correctly points to /user/hive/warehouse. So I doubt the error message's suggestion points to the real problem.
How can I get the Hive server to use the correct NameNode address when creating tables?
I found that the Hive metastore tracks the location of each table. You can see that location by running the following in the Hive console:
hive> DESCRIBE EXTENDED test_table;
This issue occurs if the NameNode in core-site.xml was changed while the metastore service was still running. To resolve it, restart the metastore service on that machine:
$ sudo service hive-metastore restart
The metastore will then use the new fs.defaultFS for newly created tables.
Already Existing Tables
The location for tables that already exist can be corrected by running the following set of commands. These were obtained from the Cloudera documentation on configuring the Hive metastore for High Availability.
$ /usr/lib/hive/bin/metatool -listFSRoot
...
Listing FS Roots..
hdfs://localhost:8020/user/hive/warehouse
hdfs://localhost:8020/user/hive/warehouse/test.db
Correcting the NameNode location:
$ /usr/lib/hive/bin/metatool -updateLocation hdfs://hadoop:8020 hdfs://localhost:8020
Now the listed NameNode is correct.
$ /usr/lib/hive/bin/metatool -listFSRoot
...
Listing FS Roots..
hdfs://hadoop:8020/user/hive/warehouse
hdfs://hadoop:8020/user/hive/warehouse/test.db

Cloudera Hadoop access with Kerberos gives TokenCache error : Can't get Master Kerberos principal for use as renewer

I am trying to access a Cloudera Hadoop setup (Hive + Impala) from a MacBook Pro running OS X 10.8.4.
We have Cloudera CDH-4.3.0 installed on Linux servers. I have extracted the CDH-4.2.0 tarball to my MacBook Pro.
I have set up the proper configuration and Kerberos credentials so that commands like 'hadoop fs -ls /' work and the Hive shell starts up.
However, when I run the 'show databases' command it gives the following error:
> hive
> show databases;
>
Failed with exception java.io.IOException:java.io.IOException: Can't get Master Kerberos principal for use as renewer
The error is related to TokenCache.
When I searched for the error, it seems the method 'obtainTokensForNamenodesInternal' throws it when it tries to get a delegation token for a specific FS and fails.
http://hadoop.apache.org/docs/current/api/src-html/org/apache/hadoop/mapreduce/security/TokenCache.html
On the client side I don't see any error in the Hive shell logs. I have also tried the CDH 4.3.0 tarballs with the same configuration and I get the same error.
Any help or pointers for resolving this error would be highly appreciated.
It seems that you have not configured Kerberos for YARN.
Add the following configuration to your yarn-site.xml:
<property>
  <name>yarn.nodemanager.principal</name>
  <value>yarn_principal/fqdn@_HOST</value>
</property>
<property>
  <name>yarn.resourcemanager.principal</name>
  <value>yarn_principal/fqdn@_HOST</value>
</property>
Create a new YARN Gateway role instance on the host from Cloudera Manager. It will automatically set up and update the yarn-site.xml.
