Hadoop configuration error (on Mac): Cannot set priority of namenode and secondarynamenode process xxxx

I've been receiving these errors when running start-dfs.sh.
I've checked every hint I could find, without success. I'm not sure whether it has to do with the dfs.namenode.name.dir folder not existing and/or having the wrong permissions; how can I find out? (One way to check is sketched below.)
Using Hadoop 3.3.4 and Java 11.0.17 on a MacBook Air M1 with Ventura 13.1.
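For reference, a minimal way to check both, assuming defaults (the /tmp path below is only Hadoop's built-in fallback when hadoop.tmp.dir is not overridden):

# Print the directory the NameNode will actually use; when the property
# is unset it falls back to file://${hadoop.tmp.dir}/dfs/name
hdfs getconf -confKey dfs.namenode.name.dir

# Then check that the directory exists and that your user owns it
ls -ld /tmp/hadoop-thi/dfs/name   # substitute the path printed above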
Here are my configurations:
.bash_profile
export HADOOP_HOME=/Users/thi/hadoop-3.3.4
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk-11.0.17.jdk/Contents/Home
PATH=$HADOOP_HOME/bin/:$HADOOP_HOME/sbin/:$JAVA_HOME/bin/:$PATH
export PATH
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://Namenode:9000</value>
</property>
</configuration>
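One thing worth double-checking, since fs.defaultFS points at the hostname Namenode: that name has to resolve on this machine, or the NameNode cannot bind its RPC address. A quick sanity check (sketch):

# Does the hostname resolve?
ping -c 1 Namenode

# If it doesn't, either map it in /etc/hosts, e.g.
#   127.0.0.1   Namenode
# or use hdfs://localhost:9000 instead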
hadoop-env.sh
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk-11.0.17.jdk/Contents/Home
export HADOOP_HOME=/Users/thi/hadoop-3.3.4
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
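For context, the "Cannot set priority of namenode process" line from start-dfs.sh is the generic message printed when a daemon dies right after launch; the underlying exception lands in the daemon's own log file. A way to look at it, assuming logs live in the default $HADOOP_HOME/logs:

# The real error (missing directory, bad permissions, unformatted
# NameNode, ...) shows up here, not in the start-dfs.sh output
tail -n 50 $HADOOP_HOME/logs/hadoop-*-namenode-*.log

# If the log reports that the NameNode is not formatted, format it once:
hdfs namenode -format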

Related

How to start datanode in hadoop slave machine?

I'm creating a Hadoop cluster using the YARN configuration. I have 2 VMs in VirtualBox, and when I run start-all.sh (start-dfs.sh and start-yarn.sh), I get a positive answer from jps on both the master and the slave terminal; but when I access master-ip:9870 in the browser, there is no datanode started.
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop-master:9000</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoopuser/hadoop/data/nameNode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoopuser/hadoop/data/dataNode</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.acl.enable</name>
<value>0</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
workers
hadoop-slave1
/etc/hosts
master-ip hadoop-master
slave-ip hadoop-slave1
The configuration above is on both the master and the slave machine.
I also have JAVA_HOME, HADOOP_HOME, and PDSH_RCMD_TYPE in my .bashrc, and I created an SSH key on the master and shared it with the slave's authorized_keys to allow SSH connections.
On the master machine the jps output looks fine, and the same on the slave. Still, I have 0 nodes in my HDFS web visualization, but I can see the slave node in the YARN configuration.
I deleted the Hadoop tmp files and the datanode folders before formatting HDFS on the master and starting all processes. I'm using Hadoop 3.2.1.
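Two checks that often narrow this down (a sketch, not a definitive diagnosis): whether any DataNode ever registered with the NameNode, and whether the slave's DataNode log reports "Incompatible clusterIDs", which happens when HDFS is reformatted while an old dataNode directory is kept around:

# On the master: how many live datanodes does the namenode report?
hdfs dfsadmin -report

# On the slave: why did the datanode give up?
grep -iE "clusterID|ERROR" $HADOOP_HOME/logs/hadoop-*-datanode-*.log | tail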

Hadoop can't execute a basic Example

The software I'm using:
System: macOS Mojave 10.14.2
Hadoop: 3.1.1
JDK: 10.0.2
I executed this command: hadoop jar /usr/local/Cellar/hadoop/3.1.1/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar pi 2 5, and it failed.
I need help, thank you!
In hadoop-env.sh, I just added this line:
export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk-10.0.2.jdk/Contents/Home
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
I solved it.
It was caused by the Java version.
Adding these two lines to yarn-env.sh didn't work for me:
export YARN_RESOURCEMANAGER_OPTS="--add-modules=ALL-SYSTEM"
export YARN_NODEMANAGER_OPTS="--add-modules=ALL-SYSTEM"
In the end, I changed the Java version to Java 8 and deleted the two lines above, and it worked for me.
You can set JAVA_HOME in hadoop-env.sh.
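For example, on macOS a minimal hadoop-env.sh line might look like this (a sketch; /usr/libexec/java_home is the standard macOS helper and assumes a Java 8 JDK is installed):

# Point Hadoop at a Java 8 JDK instead of JDK 10
export JAVA_HOME=$(/usr/libexec/java_home -v 1.8)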
Thx

Hadoop: resourcemanager doesn't run on localhost

So I can't manage to access http://localhost:8088/ on hadoop 3.1.1
Here is what I did:
bin/hdfs namenode -format
sbin/start-dfs.sh
bin/hdfs dfs -mkdir /user
bin/hdfs dfs -mkdir /user/username
The web interface for the NameNode works but it doesn't for the Resource Manager.
core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*:$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.address</name>
<value>127.0.0.1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>127.0.0.1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>127.0.0.1:8031</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.env-whitelist</name>
<value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
</configuration>
.bash_profile
export PATH="/usr/local/sbin:$PATH"
export SCALA_HOME=/usr/local/scala
export JAVA_HOME=/Library/Java/JavaVirtualMachines/openjdk-11.0.1.jdk/Contents/Home
export SPARK_HOME=/usr/local/spark
export HADOOP_HOME=/usr/local/Cellar/hadoop/3.1.1/libexec/
export HADOOP_CONF_DIR=/usr/local/Cellar/hadoop/3.1.1/libexec/etc/hadoop
export PATH=$PATH:/usr/local/hadoop/bin
export PATH=$PATH:/usr/local/spark/bin
export PATH=$PATH:/usr/local/scala/bin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS=-Djava.library.path=$HADOOP_HOME/lib
export PATH=$HADOOP_HOME/bin:$PATH
export PATH=$HADOOP_HOME/sbin:$PATH
The problem is that when I run sbin/start-yarn.sh, this is the result:
Starting resourcemanagers on []
Starting nodemanagers
Shouldn't it say: Starting resourcemanagers on [localhost]?
The default value for the hostname in the documentation is 0.0.0.0, not localhost. If you want to configure it explicitly, these are the relevant properties with their documented defaults, which you can override:
yarn.resourcemanager.hostname (default: 0.0.0.0): the hostname of the RM.
yarn.resourcemanager.address (default: ${yarn.resourcemanager.hostname}:8032): the address of the applications manager interface in the RM.
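So one option is to set the hostname once and drop the three per-address overrides, since every RM address (including the 8088 web UI) derives from it by default. A minimal sketch for yarn-site.xml:

<property>
<name>yarn.resourcemanager.hostname</name>
<value>localhost</value>
</property>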

Apache Kylin not able to load models/configuration

I'm new to Hadoop, Hive, HBase, and Kylin. I installed the first three, and they seem to be working.
After that I installed Apache Kylin, ran the sample.sh, and it succeeded.
After running the script I restarted Kylin and opened the web interface. Some pages cannot be opened, e.g. /cube, /models, /admin/config.
The problem is: I can see 5 tables created in Hive, and also 2 cubes created. But when I open the web GUI, the models are stuck in a loading state and I cannot build the cube.
When I try to build the cube, I cannot find any informative log (or maybe there is one and I don't know about it).
kylin.log
https://pastebin.com/TUZkQepa
hadoop-hadoop-namenode-master.log
https://pastebin.com/T8eNt3PY
hadoop-hadoop-secondarynamenode-master.log
https://pastebin.com/iMJDNFfU
yarn-hadoop-resourcemanager-master.log
https://pastebin.com/TGwJWTRF
hbase-hadoop-zookeeper-master.log
https://pastebin.com/Ym6eky5h
hbase-hadoop-master-master.log
https://pastebin.com/p1ygfw4W
Here is the configuration for Hadoop:
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8031</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8032</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Configuration for hbase
regionservers
slave2
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://master:9000/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/datadir</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>master,slave2</value>
</property>
</configuration>
Configuration for hive
hive-site.xml
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://master:3306/metastore?createDatabaseIfNotExist=true</value>
<description>metadata is stored in a MySQL server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>MySQL JDBC driver class</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>root</value>
<description>user name for connecting to mysql server</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>gwudainget</value>
<description>password for connecting to mysql server</description>
</property>
<property>
<name>hive.cli.print.current.db</name>
<value>true</value>
<description>Whether to include the current database in the Hive prompt.</description>
</property>
</configuration>
For Kylin, I use the default configuration, because I don't really know what to change in it.
What I use:
hadoop 2.7.5 binary
hbase 1.2.6 binary
hive 1.2.2 binary
kylin 2.2.0 source (I just added logs)

error in command format hdfs

I installed hadoop-0.20.2 and java-1.7.0 on CentOS. I've configured Hadoop as shown below:
bashrc:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75/jre
export HADOOP_HOME=/home/hadoop/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.75/jre
When I run the following command, it fails.
Command:
bin/hadoop namenode -format
Error:
Error: Could not find or load main class org.apache.hadoop.hdfs.server.namenode.NameNode
Please help me. Can anyone advise me?
Try this command:
$ /home/hadoop/opt/hadoop/bin/hadoop namenode -format
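If the absolute path alone doesn't help, it's worth confirming that the script can actually see the Hadoop core jar, since "Could not find or load main class" usually points at a broken HADOOP_HOME or classpath. A couple of sanity checks, assuming the .bashrc paths from the question:

echo $HADOOP_HOME                  # should print /home/hadoop/opt/hadoop
which hadoop                       # should resolve to $HADOOP_HOME/bin/hadoop
ls $HADOOP_HOME/hadoop-*-core.jar  # the jar containing the NameNode class in 0.20.x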
