How to decompress .snappy files in Hadoop HDFS?

I have some Snappy-compressed files in a directory in HDFS. I need to decompress each file and load it into a text file. Are any Hadoop DFS commands available for this? I am new here. Kindly help.
Thanks,
Praveen.

One way you can achieve this is via the hadoop fs -text command:
hadoop fs -text /hdfs_path/hdfs_file.snappy > some_unix_file.txt
hadoop fs -put some_unix_file.txt /hdfs_path
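If you need to handle every .snappy file in the directory, here is a rough shell sketch (assuming the files live under /hdfs_path and /tmp has enough local space; both paths are placeholders):
for f in $(hadoop fs -ls /hdfs_path | awk '{print $NF}' | grep '\.snappy$'); do
  name=$(basename "$f" .snappy)
  # -text decompresses the Snappy data to stdout; save a local plain-text copy
  hadoop fs -text "$f" > "/tmp/$name.txt"
  # upload the plain-text copy back to HDFS
  hadoop fs -put "/tmp/$name.txt" "/hdfs_path/$name.txt"
done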

Related

How to decompress the gz files in hadoop

Wanted to know if there is any Hadoop command to decompress a gz file sitting on HDFS and display its content to stdout.
Just use the text command:
hdfs dfs -text file.gz
Hadoop knows how to detect gzip files and uncompresses them for you.
You can also do it easily with:
hdfs dfs -cat /path/to/file.gz | zcat
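If you want the uncompressed result stored back in HDFS rather than only printed, a small sketch (the .txt destination path is just an example):
hdfs dfs -cat /path/to/file.gz | gunzip | hdfs dfs -put - /path/to/file.txt
Here gunzip reads the gzip stream from stdin and -put - writes its stdin to the HDFS destination, so nothing has to land on the local disk.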

Most efficient way to write data to hadoop

I am new to Hadoop HDFS. I am trying to learn how to write data read from a local file to Hadoop HDFS. I want to know how to do this efficiently. Please help.
You can try it like this:
hadoop fs -put localpath hdfspath
Example
hadoop fs -put /user/sample.txt /sample.txt
You can google it to find more HDFS shell commands.
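If you have many local files, a single -put call can take several sources (or a whole directory), which avoids starting a separate JVM for each hadoop fs invocation. A sketch with placeholder paths:
hadoop fs -mkdir -p /ingest
# copy a whole local directory in one call
hadoop fs -put /data/logs /ingest
# or several files at once, destination last
hadoop fs -put a.txt b.txt c.txt /ingest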

No such file or directory in copying file to hadoop

I'm a beginner in Hadoop. When I use
hadoop fs -ls /
and
hadoop fs -mkdir /pathname
everything is ok. But I want to use my csv file in Hadoop; my file is on the C drive. I used the -put, wget, and copyFromLocal commands like these:
Hadoop fs -put c:/ path / myhadoopdir
Hadoop fs copyFromLoacl c:/...
Wget ftp://c:/...
But with the first two I get the error no such file or directory /myfilepathinc:
And for the third:
Unable to resolve host address "c"
Thanks for your help
Looking at your commands, it seems that there could be a couple of reasons for this issue.
Hadoop fs -put c:/ path / myhadoopdir
Hadoop fs copyFromLoacl c:/...
Use hadoop fs -copyFromLocal correctly.
Check your local file permission. You have to give full access to that file.
You have to give your absolute path location both in local and in hdfs.
Hope it will work for you.
salmanbw's answer is correct. To be more clear, suppose your file is "c:\testfile.txt"; then use the command below.
Also make sure you have write permission to your target directory in HDFS.
hadoop fs -copyFromLocal c:\testfile.txt /HDFSdir/testfile.txt
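A quick end-to-end check, using the same placeholder paths (create the HDFS directory first if it does not exist):
hadoop fs -mkdir -p /HDFSdir
hadoop fs -copyFromLocal c:\testfile.txt /HDFSdir/testfile.txt
# confirm the file arrived and is readable
hadoop fs -ls /HDFSdir
hadoop fs -cat /HDFSdir/testfile.txt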

How to read a .deflate file in hadoop

I got some Pig-generated files with the part-r-00000.deflate extension. I know this is a compressed file. How do I generate a normal file in a readable format? When I use hadoop fs -text, I cannot get plain-text output; the output is still binary. How can I fix this problem?
You might be using quite an old Hadoop version (e.g. 0.20.0) in which fs -text can't inflate the compressed file.
As a workaround you may try this one-liner:
hadoop fs -text file.deflate | perl -MCompress::Zlib -e 'undef $/; print uncompress(<>)'
You can decompress it on the fly by using this command:
hdfs dfs -text file.deflate | hdfs dfs -put - uncompressed_destination_file
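If the Pig output directory contains several part files, you can decompress them all and store a single plain-text copy back in HDFS (the /pig_output and /pig_output_text paths are placeholders):
hdfs dfs -mkdir -p /pig_output_text
hdfs dfs -text /pig_output/part-r-* | hdfs dfs -put - /pig_output_text/part-all.txt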

How do I use hadoop fs -getmerge to download .deflate files?

I've tried running
hadoop fs -getmerge
on a directory of .deflate files. The result is a compressed file on my local machine.
What is the easiest way to download the entire directory in uncompressed format on to my local machine?
Try this:
hadoop fs -text /some/where/job-output/part-*
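-text decompresses each part file and concatenates the results to stdout, so to get an uncompressed merge on your local machine just redirect the output (the local file name is a placeholder):
hadoop fs -text /some/where/job-output/part-* > merged-output.txt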
