I'm Beginner in Hadoop. I wanted to view fs-image and Edit logs in hadoop. I have searched it in many blogs, nothing is clear. Please can any one tell me step by step procedure to view the Edit log/fs-image file in hadoop.
My version: Apache Hadoop: Hadoop-1.2.1
My Installed director is ![/home/students/hadoop-1.2.1]
I'm listing steps what i have tried based on some blogs.
Ex.1. $ hdfs dfsadmin -fetchImage /tmp
Ex.2. hdfs oiv -i /tmp/fsimage_0000000000000001386 -o /tmp/fsimage.txt
Nothing works for me.
It shows that hdfs is not a directory or a file.
For edit log, navigate to
/var/lib/hadoop-hdfs/cache/hdfs/dfs/name/current
then;
ls -l
to view the complete name of the log file you want to extract; after then
hdfs oev -i editFileName -o /home/youraccount/Desktop/edits_your.xml -p XML
For the fsimage;
hdfs oiv -i fsimage -o /home/youraccount/Desktop/fsimage_your.xml
Go to the bin directory and try to execute the same commands
Related
I am running Hadoop Mapreduce and Yarn in psuedo distributed mode and I want to get job history log. To get that, I tried solution 2 in this question and so, from the directory.
hadoop-3.0.0/bin
I executed
$ ./hdfs dfs -ls /tmp/hadoop-uname/mapred.
Following is what I get as response:
ls: `/tmp/hadoop-uname/mapred': No such file or directory
I get same response for:
$ ./hdfs dfs -ls /tmp/hadoop-uname/mapred/staging
also.
My questions are:
1) Are job history logs generated in psuedo history mode?
2) Is logging turned on by default? Or I need to do some other setting to turn it on?
3) Am I missing anything else?
I have created a folder in the root directly and I'm trying to copy a folder to hdfs hadoop but I'm getting an error message. This is the steps that I have followed:
[root#dh] ls
XXdirectoryXX
[root#dh] sudo –u hdfs hadoop fs –mkdir /user/uname
[root#hd] uname
[root#hd] sudo –u hdfs hadoop fs –chown uname /user/uname
[root#hd] su - uname
[uname#hd] hadoop fs –copyFromLocal XXdirectoryXX/ /user/uname
copyFromLocal: 'XXdirectoryXX/': No such file or directory
Is there a problem in the command or what I've done or should I use another command to copy the files over?
I'm using Centos 6.8 in the machine
Any ideas?
Thanks
Thanks to the comments I've managed to resolve the issue. Here is the code it it helps someone:
[root#dh] sudo -u hdfs hadoop fs -chown -R root /user/uname
[root#dh] hadoop fs –copyFromLocal XXdirectoryXX/ /user/uname
Regards
I was trying to unzip a zip file, stored in Hadoop file system, & store it back in hadoop file system. I tried following commands, but none of them worked.
hadoop fs -cat /tmp/test.zip|gzip -d|hadoop fs -put - /tmp/
hadoop fs -cat /tmp/test.zip|gzip -d|hadoop fs -put - /tmp
hadoop fs -cat /tmp/test.zip|gzip -d|hadoop put - /tmp/
hadoop fs -cat /tmp/test.zip|gzip -d|hadoop put - /tmp
I get errors like gzip: stdin has more than one entry--rest ignored, cat: Unable to write to output stream., Error: Could not find or load main class put on terminal, when I run those commands. Any help?
Edit 1: I don't have access to UI. So, only command lines are allowed. Unzip/gzip utils are installed on my hadoop machine. I'm using Hadoop 2.4.0 version.
To unzip a gzipped (or bzipped) file, I use the following
hdfs dfs -cat /data/<data.gz> | gzip -d | hdfs dfs -put - /data/
If the file sits on your local drive, then
zcat <infile> | hdfs dfs -put - /data/
I use most of the times hdfs fuse mounts for this
So you could just do
$ cd /hdfs_mount/somewhere/
$ unzip file_in_hdfs.zip
http://www.cloudera.com/content/www/en-us/documentation/archive/cdh/4-x/4-7-1/CDH4-Installation-Guide/cdh4ig_topic_28.html
Edit 1/30/16: In case if you use hdfs ACLs: In some cases fuse mounts don't adhere to hdfs ACLs, so you'll be able to do file operations that are permitted by basic unix access privileges. See https://issues.apache.org/jira/browse/HDFS-6255, comments at the bottom that I recently asked to reopen.
To stream the data through a pipe to hadoop, you need to use the hdfs command.
cat mydatafile | hdfs dfs -put - /MY/HADOOP/FILE/PATH/FILENAME.EXTENSION
gzip use -c to read data from stdin
hadoop fs -put doesnt support read the data from stdin
I tried a lots of things and would help.I cant find the zip input support of hadoop.So it left me no choice but download the hadoop file to local fs ,unzip it and upload to hdfs again.
I am using HDP mahout version 0.8. I have set MAHOUT_LOCAL="". When I run mahout, I see the message HADOOP LOCAL NOT SET RUNNING ON HADOOP but my program is not writing output to HDFS directory.
Can anyone tell me how to make my mahout program take input from HDFS and write output to HDFS?
Did you set the $MAHOUT_HOME/bin and $HADOOP_HOME/bin on the PATH ?
For example on Linux:
export PATH=$PATH:$MAHOUT_HOME/bin/:$HADOOP_HOME/bin/
export HADOOP_CONF_DIR=$HADOOP_HOME/conf/
Then, almost all the Mahout's commands use the options -i (input) and -o (output).
For example:
mahout seqdirectory -i <input_path> -o <output_path> -chunk 64
Assuming you have your mahout jar build which takes input and write to hdfs. Do the following:
From hadoop bin directory:
./hadoop jar /home/kuntal/Kuntal/BIG_DATA/mahout-recommender.jar mia.recommender.RecommenderIntro --tempDir /home/kuntal/Kuntal/BIG_DATA --recommenderClassName org.apache.mahout.cf.taste.impl.recommender.GenericItemBasedRecommender
#Input Output Args specify if required
-Dmapred.input.dir=./ratingsLess.txt -Dmapred.output.dir=/input/output
Please check this:
http://chimpler.wordpress.com/2013/02/20/playing-with-the-mahout-recommendation-engine-on-a-hadoop-cluster/
I have constructed a single-node Hadoop environment on CentOS using the Cloudera CDH repository. When I want to copy a local file to HDFS, I used the command:
sudo -u hdfs hadoop fs -put /root/MyHadoop/file1.txt /
But,the result depressed me:
put: '/root/MyHadoop/file1.txt': No such file or directory
I'm sure this file does exist.
Please help me,Thanks!
As user hdfs, do you have access rights to /root/ (in your local hdd)?. Usually you don't.
You must copy file1.txt to a place where local hdfs user has read rights before trying to copy it to HDFS.
Try:
cp /root/MyHadoop/file1.txt /tmp
chown hdfs:hdfs /tmp/file1.txt
# older versions of Hadoop
sudo -u hdfs hadoop fs -put /tmp/file1.txt /
# newer versions of Hadoop
sudo -u hdfs hdfs dfs -put /tmp/file1.txt /
--- edit:
Take a look at the cleaner roman-nikitchenko's answer bellow.
I had the same situation and here is my solution:
HADOOP_USER_NAME=hdfs hdfs fs -put /root/MyHadoop/file1.txt /
Advantages:
You don't need sudo.
You don't need actually appropriate local user 'hdfs' at all.
You don't need to copy anything or change permissions because of previous points.
try to create a dir in the HDFS by usig: $ hadoop fs -mkdir your_dir
and then put it into it $ hadoop fs -put /root/MyHadoop/file1.txt your_dir
Here is a command for writing df directly to hdfs file system in python script:
df.write.save('path', format='parquet', mode='append')
mode can be append | overwrite
If you want to put in in hdfs using shell use this command:
hdfs dfs -put /local_file_path_location /hadoop_file_path_location
You can then check on localhost:50070 UI for verification