I tried to install Hadoop by following this video:
https://www.youtube.com/watch?v=CtOhsZ0Sb1E&t=126s
When I run the last command,
start-all.sh
I get this message:
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [localhost]
localhost: namenode running as process 6283. Stop it first.
localhost: starting datanode, logging to /home/myname/hadoop-2.7.3/logs/hadoop-myname-datanode-MYNAME.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: secondarynamenode running as process 6379. Stop it first.
starting yarn daemons
starting resourcemanager, logging to /home/myname/hadoop-2.7.3/logs/yarn-myname-resourcemanager-MYNAME.out
Error: Could not find or load main class org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
localhost: starting nodemanager, logging to /home/myname/hadoop-2.7.3/logs/yarn-myname-nodemanager-MYNAME.out
localhost: Error: Could not find or load main class org.apache.hadoop.yarn.server.nodemanager.NodeManager
My .bashrc file:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export HADOOP_INSTALL=/home/myname/hadoop-2.7.3
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib/native"
My hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/myname/hadoop-2.7.3/etc/hadoop/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/home/myname/hadoop-2.7.3/etc/hadoop/hadoop_store/hdfs/datanode</value>
</property>
</configuration>
My core-site.xml:
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/myname/hadoop-2.7.3/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
</configuration>
My mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
</configuration>
I have tried a lot of things, but the error is still there.
Any ideas?
Add the following line to your .bashrc file:
export HADOOP_PREFIX=/path_to_hadoop_location
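After adding it (with the path pointing at your Hadoop install, e.g. /home/myname/hadoop-2.7.3 from your .bashrc), reload your shell and verify that the variable is set, for example:

source ~/.bashrc
echo $HADOOP_PREFIX   # should print the Hadoop install directory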
You also have to include a yarn-site.xml file when configuring Hadoop:
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Also add this to mapred-site.xml:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
I think you can resolve this issue by adding these properties.
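Since your log also shows the NameNode and SecondaryNameNode are already running ("Stop it first"), it may help to stop everything and then restart with the non-deprecated scripts after adding the properties. A rough sequence (a sketch, assuming $HADOOP_INSTALL/sbin is on your PATH as in your .bashrc):

stop-yarn.sh
stop-dfs.sh
start-dfs.sh
start-yarn.sh
jps   # NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager should all appear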
I have a problem with Hadoop on macOS when I try to launch my node.
I installed Hadoop this way:
brew install hadoop
I also configured the different files as follows:
hadoop-env.sh:
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.security.krb5.realm= -Djava.security.krb5.kdc="
export JAVA_HOME="/Library/Java/JavaVirtualMachines/jdk-17.0.2.jdk/Contents/Home"
core-site.xml:
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
<description>A base for other temporary directories</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:8020</value>
</property>
</configuration>
mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>localhost:8021</value>
</property>
</configuration>
hdfs-site.xml:
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
I finally executed this command:
hdfs namenode -format
Finally, when I launch the command ./start-dfs.sh I get this error:
"ERROR: Cannot set priority of secondarynamenode process 31231"
I would like to point out that my main node launches correctly.
I can't find a solution on the internet. Has anyone faced the same situation as me?
I tried all the suggested solutions, but none of them work; I still get: localhost: ERROR: Cannot set priority of datanode process 32156
Sincerely,
For people who have the same problem as me, here is a tutorial that might work:
https://techblost.com/how-to-install-hadoop-on-mac-with-homebrew/
I get this exception when trying to execute any MR2 (MapReduce 2) job on Fedora. Hadoop 2.7.3 and 2.8.0 have the same issue, and it also affects Hive.
[hadoop@master hadoop]$ yarn classpath
/opt/hadoop/hadoop-2.7.3/conf
/opt/hadoop/hadoop-2.7.3/conf
/opt/hadoop/hadoop-2.7.3/conf:/opt/hadoop/hadoop/share/hadoop/common/lib/*
/opt/hadoop/hadoop/share/hadoop/common/*
/opt/hadoop/hadoop/share/hadoop/hdfs
/opt/hadoop/hadoop/share/hadoop/hdfs/lib/*
/opt/hadoop/hadoop/share/hadoop/hdfs/*
/opt/hadoop/hadoop/share/hadoop/yarn/lib/*
/opt/hadoop/hadoop/share/hadoop/yarn/*
/opt/hadoop/hadoop/share/hadoop/mapreduce/share/hadoop/mapreduce/*
/opt/hadoop/hadoop/contrib/capacity-scheduler/*.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.7.3.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.7.3.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.7.3.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.7.3.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-plugins-2.7.3.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.7.3-tests.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.7.3.jar
/opt/hadoop/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar
/opt/hadoop/hadoop/share/hadoop/yarn/*
/opt/hadoop/hadoop/share/hadoop/yarn/lib/*
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
And yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.application.classpath</name>
<value>$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/*,$HADOOP_MAPRED_HOME/share/hadoop/mapreduce/lib/*</value>
</property>
</configuration>
Last but not least, env setup:
export HADOOP_HOME=/opt/hadoop/hadoop
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_YARN_HOME=$HADOOP_HOME
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
I am fairly sure I am missing something obvious. I have set this up multiple times, but something must be off.
The missing class is actually part of the jars in the classpath. The /opt/hadoop/hadoop folder is owned by user hadoop and has all access rights needed.
Set YARN_HOME=$HADOOP_HOME instead of HADOOP_YARN_HOME=$HADOOP_HOME
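In the environment setup above, that would look like this (a sketch, keeping the same paths):

export HADOOP_HOME=/opt/hadoop/hadoop
export YARN_HOME=$HADOOP_HOME   # replaces the HADOOP_YARN_HOME line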
I faced the same issue (Java 1.8 u291, Hadoop 2.8.0), which was resolved after setting the YARN application classpath property in yarn-site.xml.
Step 1: Execute "hadoop classpath".
This command displays the list of paths to be passed as the value in yarn-site.xml.
Step 2: Edit the yarn-site.xml as below:
<property>
<name>yarn.application.classpath</name>
<value>output from step 1</value>
</property>
Restart YARN before triggering MapReduce jobs.
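For example (a sketch, assuming the Hadoop sbin scripts are on your PATH):

stop-yarn.sh
start-yarn.sh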
When I run start-hbase.sh, I get the following error:
localhost: starting zookeeper, logging to /usr/lib/HBase/bin/../logs/hbase-hduser-zookeeper-nkhl.out
localhost: java.io.FileNotFoundException: /home/hduser/zookeeperpropertydataDir/myid (Permission denied)
localhost: at java.io.FileOutputStream.open0(Native Method)
localhost: at java.io.FileOutputStream.open(FileOutputStream.java:270)
localhost: at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
localhost: at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
localhost: at java.io.PrintWriter.<init>(PrintWriter.java:263)
localhost: at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.writeMyID(HQuorumPeer.java:162)
localhost: at org.apache.hadoop.hbase.zookeeper.HQuorumPeer.main(HQuorumPeer.java:70)
starting master, logging to /usr/lib/HBase/logs/hbase-hduser-master-nkhl.out
OpenJDK 64-Bit Server VM warning: ignoring option PermSize=128m; support was removed in 8.0
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=128m; support was removed in 8.0
regionserver running as process 25123. Stop it first.
After this, when I run hbase shell, it does open up, but when I run list it throws this error:
ERROR: Can't get master address from ZooKeeper; znode data == null
Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list
hbase> list 'abc.*'
hbase> list 'ns:abc.*'
hbase> list 'ns:.*'
This is the output of jps:
25123 HRegionServer
23975 SecondaryNameNode
23767 DataNode
24168 ResourceManager
26456 HMaster
26665 Jps
24297 NodeManager
23613 NameNode
Zookeeper starts fine:
ZooKeeper JMX enabled by default Using config:
/usr/lib/zookeeper/conf/zoo.cfg Starting zookeeper ... STARTED
My hbase-site.xml configuration:
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:54433/hbase</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>localhost</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/zookeeperpropertydataDir</value>
</property>
<property>
<name>hbase.master.port</name>
<value>60010</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2181</value>
<description> The port at which the clients will connect.</description>
</property>
</configuration>
This is my hbase-env.sh configuration:
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64/jre/
export HBASE_REGIONSERVERS=${HBASE_HOME}/conf/regionservers
export HBASE_MANAGES_ZK=true
Any help with this will be appreciated.
The HBase ZooKeeper daemon HQuorumPeer is not running. One possible reason is that the directory below does not exist, as shown in the logs:
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/zookeeperpropertydataDir</value>
</property>
Make sure the file that ZooKeeper is using has been created and has the right privileges.
Use chmod to grant access to all users. This fixed my problem.
chmod -R 777 path/file_name
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hduser/zookeeperpropertydataDir</value>
</property>
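A more targeted alternative to the blanket chmod (a sketch, assuming HBase runs as the hduser user) is to create the data directory up front with the right owner:

mkdir -p /home/hduser/zookeeperpropertydataDir
chown -R hduser:hduser /home/hduser/zookeeperpropertydataDir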
I am new to Hadoop. When I run the wordcount test project, everything works fine, but I can't access the JobTracker at http://localhost:50030. In fact, when I look at my secondary node's log file, I get this exception message:
java.io.IOException: Bad edit log manifest (expected txid = 3: [[21,22], [23,24]
[8683,8684], [8685,8686], [8687,8688], [8689,8690], [8691,8692], [8693,8694], [8695,8696], [8697,8698], [8699,8700]]...
....
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.downloadCheckpointFiles(SecondaryNameNode.java:438)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:540)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:395)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:361)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:357)
at java.lang.Thread.run(Thread.java:745)
By the way, when I run jps, I get:
53745 JobHistoryServer
77259 Jps
UPDATE: here's my config.
in core-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/Cellar/hadoop/hdfs/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
in hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9010</value>
</property>
</configuration>
And nothing is set in my yarn-site.xml.
If you are using a recent version of Hadoop, the JobTracker will not be available; it has been replaced by the ResourceManager and the JobHistoryServer.
If you want to access past job details, go to http://hostname:19888. This is the web UI address of the JobHistoryServer.
Please refer to the Hadoop Cluster Setup documentation for further details.
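For example, on Hadoop 2.x the history server can be started with the bundled script, and the standard web UIs are then available (a sketch, assuming $HADOOP_HOME/sbin is on your PATH):

mr-jobhistory-daemon.sh start historyserver
# ResourceManager web UI:  http://localhost:8088
# JobHistoryServer web UI: http://localhost:19888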
I am trying to connect my HBase to HDFS. I have my HDFS NameNode (bin/hdfs namenode) and DataNode (bin/hdfs datanode) running. I can also start my HBase (sudo ./bin/start-hbase.sh) and local region servers (sudo ./bin/local-regionservers.sh start 1 2). But when I try to execute a command from the HBase shell, it gives the following error:
cis655stu#cis655stu-VirtualBox:/teaching/14f-cis655/proj-dtracing/hbase/hbase-0.99.0-SNAPSHOT$ ./bin/hbase shell
HBase Shell; enter 'help<RETURN>' for list of supported commands.
Type "exit<RETURN>" to leave the HBase Shell
Version 0.99.0-SNAPSHOT, rUnknown, Sat Aug 9 08:59:57 EDT 2014
hbase(main):001:0> list
TABLE
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/teaching/14f-cis655/proj-dtracing/hbase/hbase-0.99.0-SNAPSHOT/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2015-01-19 13:33:07,179 WARN [main] util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
ERROR: Connection refused
Here is some help for this command:
List all tables in hbase. Optional regular expression parameter could
be used to filter the output. Examples:
hbase> list
hbase> list 'abc.*'
hbase> list 'ns:abc.*'
hbase> list 'ns:.*'
Below are my configuration files for HBase and Hadoop:
hbase-site.xml
<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://localhost:9000/hbase</value>
</property>
<!-- for pseudo-distributed execution -->
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.master.wait.on.regionservers.mintostart</name>
<value>1</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/teaching/14f-cis655/tmp/zk-deploy</value>
</property>
<!-- for enabling collection of traces -->
<property>
<name>hbase.trace.spanreceiver.classes</name>
<value>org.htrace.impl.LocalFileSpanReceiver</value>
</property>
<property>
<name>hbase.local-file-span-receiver.path</name>
<value>/teaching/14f-cis655/tmp/server-htrace.out</value>
</property>
</configuration>
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/yarn/yarn_data/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/yarn/yarn_data/hdfs/datanode</value>
</property>
<property>
<name>hadoop.trace.spanreceiver.classes</name>
<value>org.htrace.impl.LocalFileSpanReceiver</value>
</property>
<property>
<name>hadoop.local-file-span-receiver.path</name>
<value>/teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/logs/htrace.out</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Please check whether your HDFS is available from the shell:
$ hdfs dfs -ls /hbase
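If that hangs or fails, a quick sanity check of the NameNode itself can help (a sketch):

$ hdfs dfsadmin -report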
Also make sure that you have all of these environment variables set in your hdfs-env.sh file:
HADOOP_CONF_LIB_NATIVE_DIR="/hadoop/lib/native"
HADOOP_OPTS="-Djava.library.path=/hadoop/lib"
HADOOP_HOME=/hadoop
YARN_HOME=/hadoop
HBASE_HOME=/hbase
HADOOP_HDFS_HOME=/hadoop
HBASE_MANAGES_ZK=true
Do you run Hadoop and HBase as the same OS user? If you use separate users, please check that the HBase user is allowed to access HDFS.
Make sure that you have copies of (or symlinks to) the hdfs-site.xml and core-site.xml files in the ${HBASE_HOME}/conf directory.
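For example (a sketch, assuming the standard etc/hadoop layout of your install at /teaching/14f-cis655/proj-dtracing/hadoop-2.6.0):

ln -s /teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/etc/hadoop/hdfs-site.xml ${HBASE_HOME}/conf/hdfs-site.xml
ln -s /teaching/14f-cis655/proj-dtracing/hadoop-2.6.0/etc/hadoop/core-site.xml ${HBASE_HOME}/conf/core-site.xml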
Also, the fs.default.name option is deprecated (though it still works); consider using fs.defaultFS instead.
Do you use ZooKeeper? You've specified the hbase.zookeeper.property.dataDir option, but hbase.zookeeper.quorum and other significant options are missing. Please read http://hbase.apache.org/book.html#zookeeper for more information.
Please add the following options to hdfs-site.xml to make HBase work correctly (replace the $HBASE_USER variable with the system user that runs HBase):
<property>
<name>hadoop.proxyuser.$HBASE_USER.groups</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.$HBASE_USER.hosts</name>
<value>*</value>
</property>
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>