How to edit where Hadoop stores log files? - hadoop

I need to edit the logs directory (containing the application_* files) for Hadoop. Currently they are at:
/home/hadoop/hadoop-3.2.0/logs/userlogs/

HADOOP_LOG_DIR is the variable that can be configured to change the location of the logs generated by Hadoop. You can set it in your hadoop-env.sh file.
You can find the hadoop-env.sh file in the $HADOOP_HOME/etc/hadoop location.
echo $HADOOP_HOME
cd /path/to/hadoop_home/etc
vi hadoop-env.sh
Set the desired path as the value of HADOOP_LOG_DIR and it should be done.
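For example, a minimal sketch of the change in hadoop-env.sh (the path /var/log/hadoop is only an illustration; use whichever directory you prefer):
# in $HADOOP_HOME/etc/hadoop/hadoop-env.sh
export HADOOP_LOG_DIR=/var/log/hadoop
Restart the Hadoop daemons afterwards so the new location takes effect.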

Related

how to add a jar file in hive

I'm trying to add hive-contrib-0.10.0.jar in Hive using the ADD JAR hive-contrib-0.10.0.jar command, but it keeps saying hive-contrib-0.10.0.jar does not exist.
I'm using HDP 2.1 right now. I also added this jar file into the /user/root folder using Hue and ran the command
ADD JAR hdfs:///hive-contrib-0.10.0.jar
but it gives me the same error: the jar file doesn't exist.
Is there any way to solve this problem?
Where should I keep this jar file so that it will run successfully, and what is the command to use?
Upload the JAR file to an HDFS path.
Add the JAR file using the ADD JAR command with the full HDFS path.
Example:
hadoop fs -put ~/Downloads/hive.jar /lib/
open hive shell
add jar hdfs:///lib/hive.jar
I see the following issues with your approach. Before adding the jar, make sure you are able to list the file on the local file system or HDFS, wherever it exists.
The jar you are trying to add is by default in the Hive classpath, as it is part of $HIVE_HOME/lib (on the local file system, wherever you have the Hive client/service installed).
Regarding your question about how to add jars in Hive: you can add them from the local file system or from the Hadoop distributed file system (HDFS).
ADD JAR file:///root/hive-contrib-0.10.0.jar (given that you copied this jar to the root directory on the local file system)
ADD JAR hdfs://<namenode_hostname>:8020/user/root/hive-contrib-0.10.0.jar (given that you copied it to the HDFS home directory of root)
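Before either variant, it can help to confirm the file is really where you expect it (paths here just follow the examples above):
ls -l /root/hive-contrib-0.10.0.jar
hadoop fs -ls /user/root/hive-contrib-0.10.0.jar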
If you want to permanently add the jars, you need to do the following.
1. hive-site.xml (/etc/hive/conf):
<property>
<name>hive.aux.jars.path</name>
<value>file:///mnt1/hive-jars/hive-contrib-2.1.1.jar</value>
</property>
2. Add hive-contrib-2.1.1.jar to the path "/mnt1/hive-jars" configured in hive-site.xml.
This should ideally work after restarting hive-server2:
3. sudo stop hive-server2
4. sudo start hive-server2
But sometimes it does not work; I am not sure why, so you can use the following dirty way.
Put your jar file in the following path so that Hive automatically picks it up on restart:
Add hive-contrib-2.1.1.jar to /usr/lib/hive-hcatalog/share/hcatalog
sudo stop hive-server2
sudo start hive-server2
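If you need to register more than one jar this way, hive.aux.jars.path takes a comma-separated list; a sketch (the second jar, my-udfs.jar, is a made-up name for illustration):
<property>
<name>hive.aux.jars.path</name>
<value>file:///mnt1/hive-jars/hive-contrib-2.1.1.jar,file:///mnt1/hive-jars/my-udfs.jar</value>
</property>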
I have read the answers above, which were very useful, and I combined them all into one solution:
Put the jars on local disk and give them read/write permission:
chmod -R 777 /tmp/json.jar
upload to hdfs file system and give permissions too:
hdfs dfs -put /tmp/json.jar hdfs://1.1.1.1:8020/jars/
hdfs dfs -chmod -R 777 hdfs://1.1.1.1:8020/jars/
Add the jar in your Hive session:
add jar hdfs://1.1.1.1:8020/jars/json.jar
You have to give the full path to the JAR and not only its name.
Don't guess the location. Check the file system to see that it is there, before trying to add it.

Hadoop copying file to hadoop filesystem

I have copied a file from the local to the HDFS file system and the file got copied to /user/hduser/in:
hduser#vagrant:/usr/local/hadoop/hadoop-1.2.1$ bin/hadoop fs -copyFromLocal /home/hduser/afile in
Question:
1. How does Hadoop by default copy the file to this directory (/user/hduser/in)? Where is this mapping specified in the conf files?
If you write the command like above, the file gets copied to your user's HDFS home directory, which is /user/<username>. See also here: HDFS Home Directory.
You can use an absolute pathname (one starting with "/") just like in a Linux filesystem, if you want to write the file to a different location.
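For example, a sketch with the same file from the question (the absolute target path is just an illustration):
bin/hadoop fs -copyFromLocal /home/hduser/afile in            # relative path: ends up as /user/hduser/in
bin/hadoop fs -copyFromLocal /home/hduser/afile /data/afile   # absolute path: ends up exactly at /data/afile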
Are you using a default VM? Basically, if you configure Hadoop from the binaries without using a preconfigured yum package, it doesn't have a default path. But if you use yum via a Hortonworks or Cloudera VM, it comes with a default path, I guess.
Check core-site.xml to see the default filesystem URI (fs.default.name, or fs.defaultFS in newer releases). "/" points to the base URI set in that file, and any path given in the command without a leading "/" is resolved against your HDFS home directory.
Hadoop picks up the default path defined there and writes the data.
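A sketch of the relevant core-site.xml entry (the host and port are illustrative; newer releases use fs.defaultFS instead of fs.default.name):
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>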

hadoop installation - HADOOP_HOME set error

I'm installing the Hanborq optimized Hadoop Distribution (fully distributed mode). I followed all the steps exactly as in the following links, and no errors happened. When I reach the step that formats HDFS:
$ hadoop namenode -format
an error occurs saying "HADOOP_HOME is not set correctly.
Please set your HADOOP_HOME variable to the absolute path of the directory that contains hadoop-core-VERSION.jar"
installation_steps_1
installation_steps_2
It seems you did not set HADOOP_HOME correctly in your .bashrc file. Add the lines below to your .bashrc file and apply them with ". .bashrc". Please reply if it works.
#HADOOP_HOME setup
export HADOOP_HOME="/usr/local/hadoop/hadoop-2.6"
PATH=$PATH:$HADOOP_HOME/bin
export PATH
Note: HADOOP_HOME is the location of the hadoop directory
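To check that the variable is actually picked up after sourcing .bashrc, something like:
echo $HADOOP_HOME
which hadoop
hadoop version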

Moving data to hdfs using copyFromLocal switch

I don't know what's going on here but I am trying to copy a simple file from a directory in my local filesystem to the directory specified for hdfs.
In my hdfs-site.xml I have specified that the directory for hdfs will be /home/vaibhav/Hadoop/dataNodeHadoopData using the following properties -
<name>dfs.data.dir</name>
<value>/home/vaibhav/Hadoop/dataNodeHadoopData/</value>
and
<name>dfs.name.dir</name>
<value>/home/vaibhav/Hadoop/dataNodeHadoopData/</value>
I am using the following command -
bin/hadoop dfs -copyFromLocal /home/vaibhav/ml-100k/u.data /home/vaibhav/Hadoop/dataNodeHadoopData
to copy the file u.data from its local filesystem location to the directory that I specified as the HDFS directory. But when I do this, nothing happens - no error, nothing. And no file gets copied to HDFS. Am I doing something wrong? Could there be a permissions issue?
Suggestions needed.
I am using pseudo distributed single node mode.
Also, on a related note, I want to ask: in my MapReduce program I have set the configuration to point the inputFilePath to /home/vaibhav/ml-100k/u.data. So would it not automatically copy the file from the given location to HDFS?
I believe dfs.data.dir and dfs.name.dir have to point to two different, existing directories. Furthermore, make sure you have formatted the namenode FS after changing the directories in the configuration.
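A sketch of what that might look like in hdfs-site.xml (the separate name-node directory is a made-up example), followed by reformatting the namenode; note that reformatting wipes any existing HDFS data:
<property>
<name>dfs.name.dir</name>
<value>/home/vaibhav/Hadoop/nameNodeHadoopData/</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/home/vaibhav/Hadoop/dataNodeHadoopData/</value>
</property>
bin/hadoop namenode -format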
While copying to HDFS you're incorrectly specifying the target. The correct syntax for copying a local file to HDFS is:
bin/hadoop dfs -copyFromLocal <local_FS_filename> <target_on_HDFS>
Example:
bin/hadoop dfs -copyFromLocal /home/vaibhav/ml-100k/u.data my.data
This would create a file my.data in your user's home directory in HDFS.
Before copying files to HDFS, make sure you master listing directory contents and creating directories first.
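For example, using the same Hadoop 1.x style commands as above (the directory name is illustrative):
bin/hadoop dfs -ls /
bin/hadoop dfs -mkdir /user/vaibhav/input
bin/hadoop dfs -ls /user/vaibhav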

Hadoop:/usr/lib/hadoop-0.20.2/conf/slaves: No such file or directory

I followed the steps on the Hadoop official site exactly as it says, but it always shows the following error:
starting namenode, logging to /home/videni/Tools/hadoop-1.0.3/libexec/../logs/hadoop-videni-namenode-videni-Latitude-E6400.out
cat: /usr/lib/hadoop-0.20.2/conf/slaves: No such file or directory
cat: /usr/lib/hadoop-0.20.2/conf/masters: No such file or directory
starting jobtracker, logging to /home/videni/Tools/hadoop-1.0.3/libexec/../logs/hadoop-videni-jobtracker-videni-Latitude-E6400.out
cat: /usr/lib/hadoop-0.20.2/conf/slaves: No such file or directory
Right now I just want to set up standalone operation. Do I have to set the three XML files: core-site.xml, hdfs-site.xml, and mapred-site.xml?
I checked the log files; some of them say "Error: JAVA_HOME is not set", but I set it in the hadoop-env.sh file.
The configuration files depend on the $HADOOP_HOME path. Make sure that you have $HADOOP_HOME set up properly in ~/.bashrc.
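A sketch of what that might look like in ~/.bashrc, using the install path that appears in the log output above (adjust to your own layout):
export HADOOP_HOME=/home/videni/Tools/hadoop-1.0.3
export PATH=$PATH:$HADOOP_HOME/bin
Then re-source the file with ". ~/.bashrc" and try starting the daemons again.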
