Syntax error in hadoop-env.sh file - hadoop

I decided to use Hadoop 2.5.0 and set HADOOP_PREFIX, but when I try to check the version or format the namenode, this error happens:
[hdfs@master1 bin]$ ./hadoop version
: command not found.5.0/etc/hadoop/hadoop-env.sh: line 16:
: command not found.5.0/etc/hadoop/hadoop-env.sh: line 18:
: command not found.5.0/etc/hadoop/hadoop-env.sh: line 23:
: command not found.5.0/etc/hadoop/hadoop-env.sh: line 29:
: command not found.5.0/etc/hadoop/hadoop-env.sh: line 30:
: command not found.5.0/etc/hadoop/hadoop-env.sh: line 32:
'usr/local/hadoop-2.5.0/etc/hadoop/hadoop-env.sh: line 34: syntax error near unexpected token `do
'usr/local/hadoop-2.5.0/etc/hadoop/hadoop-env.sh: line 34: `for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
Error: Could not find or load main class org.apache.hadoop.util.VersionInfo
OS: CentOS 6.5.
Mode: Fully distributed with 4 nodes: 1 master + 3 slaves.

Please verify the steps below:
Step-1 Create a dedicated user (hduser) for Hadoop on all three machines from the terminal
Command-1 sudo addgroup hadoop
Command-2 sudo adduser --ingroup hadoop hduser
Command-3 sudo adduser hduser sudo
Step-2 Log in as the user (hduser) on all three machines from the terminal
Command-1 su hduser
Step-3 Create passwordless ssh between all the machines
Command-1 ssh-keygen -t rsa -P ""
Command-2 cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
Command-3 ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@slave
hduser@slave means hduser@IP-of-slave (e.g. hduser@192.168.213.25)
Execute Command-3 from the master machine for each of the two slave machines (see the consolidated sketch below)
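For clarity, a minimal sketch of Step-3 run entirely from the master, assuming the slaves are reachable at 192.168.213.94 and 192.168.213.47 (substitute your own IPs):
# On the master, as hduser: generate a key pair with an empty passphrase
ssh-keygen -t rsa -P "" -f $HOME/.ssh/id_rsa
# Authorize the key for the master itself
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
# Copy the public key to each slave (you will be prompted for hduser's password once per slave)
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@192.168.213.94
ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@192.168.213.47
# Verify that login now works without a password
ssh hduser@192.168.213.94 hostname
ssh hduser@192.168.213.47 hostname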
Step-4 Setup JAVA JDK on all the machines
Command-1 sudo apt-get install sun-java6-jdk
Command-2 sudo update-java-alternatives -s java-6-sun
Step-5 Download hadoop-2.x tar and unzip it on all machines
Command-1 cd /usr/local
Command-2 sudo tar xzf hadoop-2.x.tar.gz
Command-3 sudo mv hadoop-2.x hadoop
Command-4 sudo chown -R hduser:hadoop hadoop
Step-6 Open $HOME/.bashrc on all machines
Command-1 vi $HOME/.bashrc
Step-7 Add the following lines to the end of opened .bashrc file on all machines
(Find location of JAVA_HOME on all of the machines. It should be set accordingly on each machine)
export JAVA_HOME=/usr/local/java/jdk1.6.0_20
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_INSTALL/bin:$PATH
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
export YARN_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
Press Esc and type :wq! to save and close the file
Now execute this command:
Command-1 source .bashrc
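A quick sanity check after sourcing .bashrc (just a sketch; the output will vary with your install paths):
# Confirm the variables resolved and the hadoop binary is on the PATH
echo $JAVA_HOME
echo $HADOOP_INSTALL
which hadoop
hadoop version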
Step-8 Update the /etc/hosts file on all machines
I. Add the IP and hostname of every machine to the /etc/hosts file on each machine
eg:-
192.168.213.25 N337
192.168.213.94 N336
192.168.213.47 UBUNTU
II. Comment out all other entries
Step-9 Tweak Config Files on all machines
I.hadoop-config.sh
Command-1 cd $HADOOP_INSTALL
Command-2 vi libexec/hadoop-config.sh
Now add the following line at the start of hadoop-config.sh (use the appropriate location of JAVA_HOME on each machine)
export JAVA_HOME=/usr/local/java/jdk1.6.0_20
II.yarn-env.sh
Command-1 cd $HADOOP_INSTALL/etc/hadoop
Command-2 vi yarn-env.sh
# Now add the following lines
export JAVA_HOME=/usr/local/java/jdk1.6.0_20
export HADOOP_INSTALL=/usr/local/hadoop
export PATH=$JAVA_HOME/bin:$HADOOP_INSTALL/bin:$PATH
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
export YARN_CONF_DIR=$HADOOP_INSTALL/etc/hadoop
# Press Esc and type :wq! to save and close the file
III.core-site.xml
Command-1 vi $HADOOP_INSTALL/etc/hadoop/core-site.xml
Add the following lines
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://N364U:9000</value>
</property>
</configuration>
IV.yarn-site.xml
Command-1 vi $HADOOP_INSTALL/etc/hadoop/yarn-site.xml
Add the following lines (change the machine name according to your machine)
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>N337:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>N337:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>N337:8040</value>
</property>
</configuration>
V.hdfs-site.xml
Add the following lines
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
</configuration>
VI.mapred-site.xml
Add the following lines
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Don't forget to save all the configuration files. Cross check.
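One way to cross-check that the XML files are well formed, assuming xmllint is available (it ships with libxml2 on most distributions):
cd $HADOOP_INSTALL/etc/hadoop
# Any file with a missing or mismatched tag will be reported here
xmllint --noout core-site.xml hdfs-site.xml yarn-site.xml mapred-site.xml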
Step-10 Add slaves on master machine only
Command-1 vi $HADOOP_INSTALL/etc/hadoop/slaves
Add the IPs of the two slave machines to the slaves file on the master machine
eg:
192.168.213.94
192.168.213.47
Step-11 Format namenode once
Command-1 cd $HADOOP_INSTALL
Command-2 bin/hdfs namenode -format
Step-12 Now start hadoop
Command-1 cd $HADOOP_INSTALL
Command-2 sbin/start-all.sh
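After start-all.sh completes, a quick way to confirm the daemons are up (jps comes with the JDK; daemon names below are for Hadoop 2.x):
# On the master you should see NameNode, SecondaryNameNode and ResourceManager
jps
# On each slave you should see DataNode and NodeManager
# (if jps is not found over ssh, log in to the slave and run it there)
ssh hduser@192.168.213.94 jps
ssh hduser@192.168.213.47 jps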

Related

Hive shell not opening when I have hive-site.xml and other errors

I'm relatively new to Hive. I installed Hive 2.3.2 on a Vagrant VM but I have been having issues when running Hive.
I installed it as follows:
wget http://ftp.heanet.ie/mirrors/www.apache.org/dist/hive/hive-2.3.2/apache-hive-2.3.2-bin.tar.gz
tar -xzf apache-hive-2.3.2-bin.tar.gz
sudo mv apache-hive-2.3.2-bin /usr/local/hive
I then added the following to my bashrc:
export HIVE_HOME=/usr/local/hive
export HIVE_CONF_DIR=/usr/local/hive/conf
export PATH=$HIVE_HOME/bin:$PATH
export CLASSPATH=$CLASSPATH:/usr/local/hadoop/lib/*:.
export CLASSPATH=$CLASSPATH:/usr/local/hive/lib/*:.
I then added the following to my hive-config.sh
export HADOOP_HOME=/usr/local/hadoop
And ran the following commands:
sudo $HADOOP_HOME/bin/hadoop fs -mkdir -p /tmp
sudo $HADOOP_HOME/bin/hadoop fs -mkdir -p /user/hive/warehouse
sudo $HADOOP_HOME/bin/hadoop fs -chmod g+w /tmp
sudo $HADOOP_HOME/bin/hadoop fs -chmod g+w /user/hive/warehouse
sudo $HIVE_HOME/bin/schematool -dbType derby -initSchema
After this, when I run the hive command, the Hive command line comes up as expected. However, when I try any command such as show tables; I get the following error:
FAILED: SemanticException org.apache.hadoop.hive.qlmetadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
I was getting this error a few days ago and managed to fix it. However without changing anything yesterday everything stopped working.
I originally fixed it by doing the following:
First I checked the conf folder and saw that there were only template files in there, so I ran the following commands:
sudo cp hive-default.xml.template hive-site.xml
sudo cp hive-env.sh.template hive-env.sh
I edited the hive-env.sh file to include the following:
export HADOOP_HEAPSIZE=1024
# Set HADOOP_HOME to point to a specific hadoop install directory
export HADOOP_HOME=/usr/local/hadoop
#hive
export HIVE_HOME=/usr/local/hive
# Hive Configuration Directory can be controlled by:
export HIVE_CONF_DIR=$HIVE_HOME/conf
and then ensured I had access by running:
sudo chmod a+rwx . --recursive
Originally this worked, and when I typed show tables; it ran. However, yesterday it stopped working, the Hive command line will no longer appear, and I get the following error message:
Logging initialized using configuration in jar:file:/usr/local/hive/lib/hive-common-2.3.2.jar!/hive-log4j2.properties Async: true
Exception in thread "main" java.lang.RuntimeException: java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: ${system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
at org.apache.hadoop.fs.Path.initialize(Path.java:254)
at org.apache.hadoop.fs.Path.<init>(Path.java:212)
at org.apache.hadoop.fs.Path.ql.session.SessionState.createSessionDirs(SessionState.java:659)
at org.apache.hadoop.fs.Path.ql.session.SessionState.start(SessionState.java:582)
at org.apache.hadoop.fs.Path.ql.session.SessionState.beginStart(SessionState.java:549)
at org.apache.hadoop.fs.Path.cli.CliDriver.run(CliDriver.java.750)
at org.apache.hadoop.fs.Path.cli.CliDriver.main(CliDriver:java.686)
at sun.reflect.NativeMethodAccessorImpl.invoked(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoked(NativeMethodAccessorimpl.java:62)
at sun.reflect.DelegatingMethodAcessorImpl.invoke(DelegatingMethodAccessorimpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.util.Runjar.run(Runjar.java:239)
at org.apache.hadoop.util.Runjar.main(Runjar.java:153)
Caused by: java.netURISyntaxException: Relative path in absolute URI: $(system:java.io.tmpdir%7D/$%7Bsystem:user.name%7D
at java.net.URI.checkPath(URI.java:1823)
at java.net.URI.<init>(URI.java:745)
at org.apache.hadoop.fs.Path.initialize(Path.java:251)
... 12 more
I've been looking online and can't find anything that fixes the above. The Hive command line doesn't even appear anymore with the above error.
Any help on this would be really appreciated. Thanks in advance.
Check the configuration below; I hope it will work.
(1) .bashrc file
#HIVE VARIABLE START
export HIVE_HOME=/usr/local/hive
export PATH=$PATH:$HIVE_HOME/bin
#HIVE VARIABLE END
(2) hive-env.sh
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_HEAPSIZE=512
export HIVE_CONF_DIR=/usr/local/hive/conf
(3) hive-site.xml
<property>
<name>hive.exec.local.scratchdir</name>
<value>/tmp/${user.name}</value>
<description></description>
</property>
<property>
<name>hive.downloaded.resources.dir</name>
<value>/tmp/${user.name}_resources</value>
<description></description>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://localhost/metastore_db1?createDatabaseIfNotExist=true&amp;autoReconnect=true&amp;useSSL=false</value>
<description></description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hiveuser</value>
<description></description>
</property>
(4) Copy the database connector jar file into the /usr/local/hive/lib folder, i.e. if you are using MySQL, copy the mysql-connector.jar file into the lib folder.
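A rough sketch of step (4), assuming the MySQL connector jar has already been downloaded to the current directory and the connection settings above are in place:
# Put the JDBC driver where Hive can load it (the exact jar name depends on the connector version)
sudo cp mysql-connector-java-*.jar /usr/local/hive/lib/
# Re-create the metastore schema against MySQL instead of Derby
$HIVE_HOME/bin/schematool -dbType mysql -initSchema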

Hadoop hdfs: input/output error when creating user folder

I've followed the instructions in Hadoop: The Definitive Guide, 4th edition, Appendix A to configure Hadoop in pseudo-distributed mode. Everything works well, except when I try to make a directory:
hadoop fs -mkdir -p /user/$USER
The command returns the following message: mkdir: /user/my_user_name': Input/output error.
However, when I first log into my root account with sudo -s and then type the hadoop fs -mkdir -p /user/$USER command, the directory '/user/root' is created (all directories in the path).
I think I'm having Hadoop permission issues.
Any help would be really appreciated,
Thanks.
It means that you have a mistake in the 'core-site.xml' file. For instance, I had an error in the first line (name) in which I wrote 'fa.defaultFS' instead of 'fs.defaultFS'.
After that, you have to execute the 'stop-all.sh' script to stop Hadoop. You will probably also have to format the namenode here, with the commands 'rm -Rf /app/tmp/your-username/*' and 'hdfs namenode -format'. Next, you have to start Hadoop with the 'start-all.sh' script.
Maybe, you have to reboot the system when you have executed the stop script.
After these steps, I could run that command again.
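Putting those steps together as a sketch, assuming Hadoop's sbin directory is on the PATH and hadoop.tmp.dir is /app/tmp/your-username (adjust to your own core-site.xml value):
stop-all.sh
# Clear the old namenode data before reformatting
rm -Rf /app/tmp/your-username/*
hdfs namenode -format
start-all.sh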
I corrected the core-site.xml file based on the standard settings and it works fine now.
<property>
<name>hadoop.tmp.dir</name>
<value>/home/your_user_name/hadooptmpdata</value>
<description>Where Hadoop will place all of its working files</description>
</property>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
<description>Where HDFS NameNode can be found on the network</description>
</property>

Namenode not starting -su: /home/hduser/../libexec/hadoop-config.sh: No such file or directory

Installed Hadoop 2.7.1 on Ubuntu 15.10
Everything is working fine, except that when I run jps I can see all the daemons running except the namenode.
At startup it shows: -su: /home/hduser/../libexec/hadoop-config.sh: No such file or directory
When I googled it, I came to know that I can ignore this, as my
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
are set properly and hduser (the user which runs Hadoop) owns these folders.
Any clue?
After spending some time, this simple change worked for me.
Run ifconfig.
Copy the IP address.
sudo gedit /etc/hosts
Comment out this line:
#127.0.0.1 localhost
Add the following line:
10.0.2.15 (your IP address) Hadoop-NameNode
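The resulting /etc/hosts would then look roughly like this (10.0.2.15 is just the example address; use whatever ifconfig reports on your machine):
# /etc/hosts after the change
#127.0.0.1 localhost
10.0.2.15 Hadoop-NameNode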
This might be a problem caused by formatting the namenode repeatedly. Please check the namenode logs.
Probable solution :
Check your hadoop.tmp.dir in core-site.xml.
In that location, make sure that you have the same clusterID for the namenode and datanode (otherwise make them the same).
You can see the clusterID inside the VERSION file in dfs/name/current and dfs/data/current, if that makes sense.
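A sketch of that check, using the directories from the question's hdfs-site.xml (adjust the paths to your own layout):
# The clusterID lines should match between namenode and datanode
grep clusterID /usr/local/hadoop_store/hdfs/namenode/current/VERSION
grep clusterID /usr/local/hadoop_store/hdfs/datanode/current/VERSION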

Hadoop permission issue

I've installed Hadoop with Homebrew but am now having permission problems when running the
hadoop namenode -format and ./start-all.sh commands.
I think it's because of the settings I put in "core-site.xml": I set "hadoop.tmp.dir" to "/tmp/${name}".
Now namenode -format gives me the error: can't create folder, permission denied.
Even if I sudo that command, start-all.sh still fails with many permission-denied errors. I tried sudo start-all.sh with my password (the only one I use for my admin account on this Mac) but that was denied as well.
I think it's because of permission issues. Is there any way I can fix it?
Thanks!
On your local system, it looks like you do not have the hduser user created.
As part of a typical setup, it is good practice to create a hadoop group and add an hduser user to that group.
You can do that with the root/super user account with the following command:
$ sudo adduser --ingroup hadoop hduser
This assumes you have the hadoop group setup. If that is not setup, you can create a group with:
$ sudo addgroup hadoop
So when you run Hadoop, it stores things in the data, name, and tmp dirs that you configure in the hdfs-site.xml file. If you don't set these, they will point to ${hadoop.tmp.dir}/dfs/data, which in your case is under the /tmp dir. This is not where you want your data stored. You will first need to add these to your hdfs config file, among other settings.
On master :
<property>
<name>dfs.data.dir</name>
<value>/app/hadoop/data</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/app/hadoop/name</value>
</property>
On slaves :
<property>
<name>dfs.data.dir</name>
<value>/app/hadoop/data</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>master:/app/hadoop/name</value>
</property>
Once this is done you must actually create those directories. So create the following dirs on the master:
/app/hadoop/name, /app/hadoop/data, and /app/hadoop/tmp.
Create the same on the slaves, except for the name dir.
Now you need to set the permissions so that they can be used by Hadoop.
The second line below (chmod) is just to be sure.
sudo chown <hadoop user>:<hadoop user> /app/hadoop/name /app/hadoop/data /app/hadoop/tmp
sudo chmod 0777 /app/hadoop/name /app/hadoop/data /app/hadoop/tmp
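A consolidated sketch of the directory setup on the master, assuming the Hadoop user is hduser in group hadoop (substitute your own user/group; on the slaves, drop /app/hadoop/name):
# Create the directories, then hand them to the Hadoop user
sudo mkdir -p /app/hadoop/name /app/hadoop/data /app/hadoop/tmp
sudo chown -R hduser:hadoop /app/hadoop/name /app/hadoop/data /app/hadoop/tmp
sudo chmod 0777 /app/hadoop/name /app/hadoop/data /app/hadoop/tmp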
Try that, see if it works. I can answer questions if it's not the whole answer.

Getting error when trying to run Hadoop 2.4.0 (-bash: bin/start-all.sh: No such file or directory)

I am doing the following to install and run Hadoop on my Mac:
First I install HomeBrew as the Package Manager
ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
Then I install Hadoop using the Brew command:
brew install hadoop
Then the following:
cd /usr/local/Cellar/hadoop/1.1.2/libexec
export HADOOP_OPTS="-Djava.security.krb5.realm= -Djava.security.krb5.kdc="
Then I configure Hadoop by adding the following to proper .xml files:
core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
hdfs-site.xml
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
mapred-site.xml
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
I then Enable SSH to localhost:
System Preferences > Sharing > “Remote Login” is checked.
ssh-keygen -t rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
I then Format Hadoop filesystem:
bin/hadoop namenode -format
And then Start Hadoop (or at least try...this is where I get the error)
bin/start-all.sh
I get the error -bash: bin/start-all.sh: No such file or directory.
The one "odd" thing I did during setup was, since there is no longer a mapred-site.xml file in 2.4.0, I simply copied the mapred-site.xml.template file to my desktop, renamed it to mapred-site.xml, and put that new copy in the folder. I also tried running without any mapred-site.xml configuration but I still get this error.
AFAIK, brew installs hadoop-2.4.0 by default. See here: https://github.com/Homebrew/homebrew/blob/master/Library/Formula/hadoop.rb
And in Hadoop 2.x there is no start-all.sh file in the bin folder; it has moved to sbin. You also need some more configuration. These links may be useful: http://codesfusion.blogspot.in/2013/10/setup-hadoop-2x-220-on-ubuntu.html and https://hadoop.apache.org/docs/r2.2.0/
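For a brew-installed Hadoop 2.x, a sketch of the corrected commands, assuming the formula installed to /usr/local/Cellar/hadoop/2.4.0 (check your actual version directory):
cd /usr/local/Cellar/hadoop/2.4.0/libexec
# Format once, then start HDFS and YARN from sbin rather than bin
bin/hdfs namenode -format
sbin/start-dfs.sh
sbin/start-yarn.sh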
