sqoop installation error on fedora 15 - hadoop

I am trying to install sqoop on my machine. I downloaded the tar file from here and am trying to install it by following the steps here.
When I run the command below, I get the following error:
[root@065 local]# (cd /usr/local/ && sudo tar -zxvf </home/local/user/Desktop/sqoop-1.4.2.bin__hadoop-0.20.tar.gz>)
Error
gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
What is wrong with the above command?
Can anyone please let me know the easiest way of installing sqoop?
Actually, I have a 1GB XML file that needs to be processed and saved into a MySQL database. I used Hadoop with Python to do this, but it takes hours to process and save, so I decided to use sqoop to process the XML file and save the data into the database.
Please also point me to a basic, easy tutorial for working with sqoop.
Also, please provide basic code that processes the XML file and saves the data into the database, as I am a newbie to sqoop.

Validate the tar by executing this command:
ls -l /home/local/user/Desktop/
and check whether the file size is consistent with the expected 4.6M, or close to it.
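If the size looks right but you still suspect a truncated download, gzip can test the archive without extracting it. A quick check, assuming gzip is on your PATH:
gzip -t /home/local/user/Desktop/sqoop-1.4.2.bin__hadoop-0.20.tar.gz
A silent exit means the archive is intact; an "unexpected end of file" here means the download itself is corrupt and should be re-downloaded.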
After that, try running the following:
tar -zxvf /home/local/user/Desktop/sqoop-1.4.2.bin__hadoop-0.20.tar.gz
Then copy the extracted directory to /usr/local/.
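For example, a sketch assuming the archive extracted into the current directory:
sudo cp -r sqoop-1.4.2.bin__hadoop-0.20 /usr/local/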
UPDATE:
You have copied and pasted the exact command from the Cloudera documentation. <path_to_sqoop.tar.gz> is a placeholder used in the documentation; you don't need the < and >.
Run this:
(cd /usr/local/ && sudo tar -zxvf /home/local/user/Desktop/sqoop-1.4.2.bin__hadoop-0.20.tar.gz)

Please also add $HADOOP_HOME to your ~/.bashrc file:
vim ~/.bashrc
Add this to your .bashrc file:
export HADOOP_HOME=/home/local/user/name/Hadoop/hadoop-1.0.4/
Save the file and then run:
source ~/.bashrc
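To confirm the variable is now visible in the current shell, a quick sanity check:
echo "$HADOOP_HOME"
This should print the path you just exported.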
Also, you need to copy sqoop-env-template.sh to sqoop-env.sh. As the name suggests, it is only a template:
cp /home/local/user/name/Desktop/sqoop-1.4.2.bin__hadoop-0.20/conf/sqoop-env-template.sh /home/local/user/name/Desktop/sqoop-1.4.2.bin__hadoop-0.20/conf/sqoop-env.sh
Edit sqoop-env.sh:
vim /home/local/user/name/Desktop/sqoop-1.4.2.bin__hadoop-0.20/conf/sqoop-env.sh
Add the following line to sqoop-env.sh:
export HADOOP_HOME=/home/local/user/name/Hadoop/hadoop-1.0.4/
Now test sqoop:
./bin/sqoop help
To make your life simpler, you can also add sqoop to your .bashrc file.
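A sketch, assuming the extracted directory from above:
export SQOOP_HOME=/home/local/user/name/Desktop/sqoop-1.4.2.bin__hadoop-0.20
export PATH=$PATH:$SQOOP_HOME/bin
After another source ~/.bashrc, you can run sqoop help from any directory.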

Related

Download zip file from s3 using s3cmd issue

I have two issues I need help with involving bash, Linux, and s3cmd.
First, I'm running into a Linux permission issue. I am trying to download zip files from an S3 bucket using s3cmd, with the following command in a bash script, script.sh:
/usr/bin/s3cmd get s3://<bucketname>/<folder>/zipfilename.tar.gz
I am seeing the following error: permission denied.
If I try to run this command manually from the command line on a Linux machine, it works and downloads the file:
sudo /usr/bin/s3cmd get s3://<bucketname>/<folder>/zipfilename.tar.gz
I really don't want to use sudo in front of the command in the script. How do I get this command to work? Do I need to give chown permission to script.sh, which sits at a path like /foldername/script.sh, or how else do I get this get command to work?
Second: once I get this command to work, how do I get it to download from S3 to the Linux home directory ~/? Do I have to specifically issue a command in the bash script, cd ~/, before the above download command?
I really appreciate any help and guidance.
First, determine what's failing and the reason, otherwise you won't find the answer.
You can specify the destination in order to avoid permission problems when the script is invoked from a directory that's not writeable by that process:
/usr/bin/s3cmd get s3://<bucketname>/<folder>/zipfilename.tar.gz /path/to/writeable/destination/zipfilename.tar.gz
First of all, ask one question at a time.
For the first one, you can simply change the ownership with chown, like:
chown "usertorunscript" filename
For the second:
If it is the user's home directory, you can just specify it with
~user
as you said, but I think writing out the whole path is safer, so it will work for more users (if you need it to).
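A minimal sketch combining both suggestions, keeping the placeholder bucket and file names from the question and assuming the invoking user's home directory is writeable:
# explicit destination: download straight into the invoking user's home
/usr/bin/s3cmd get s3://<bucketname>/<folder>/zipfilename.tar.gz "$HOME/zipfilename.tar.gz"
Using $HOME instead of a hard-coded path keeps the script working for whichever user runs it.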

Bash `install -d` for existing file with same name

A part of my bash script copies installation files to a directory with install -d DIRECTORY, but there is already a file named DIRECTORY, so bash reports an error that the file already exists. How can I resolve this?
E.g.: install -d bash_completion.d. Some software, previously installed, has a file named bash_completion.d in the exact same location, so make install fails.
Is the only option to check using if-else, or is there some neater way of doing this? Some theory could help.
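One plain if-based approach, as asked about (a sketch; it assumes the stray file is safe to move aside, so inspect its contents first):
# if a regular file is squatting on the directory name, move it out of the way
if [ -f bash_completion.d ]; then
    mv bash_completion.d bash_completion.d.bak
fi
install -d bash_completion.d
The test-then-move keeps the old file's contents around in case another package still expects them.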

Stanford CS231n: how to download the dataset (using .sh file in windows)?

I am self-learning convolutional neural networks from the Stanford CS231n course. I am trying to finish assignment1. However, I got totally confused about how to download the data. I followed the instructions and saw:
Download data: Once you have the starter code, you will need to
download the CIFAR-10 dataset. Run the following from the assignment1
directory:
cd cs231n/datasets
./get_datasets.sh
I don't understand what it means by "run" the following. Run what exactly? Previously I used R, so I understand what "run R" means. But here it does not say run "what" or run the code "where".
So I tried to run the code in the Command Prompt, Anaconda Prompt, PowerShell, and even Git Bash. The Command Prompt gave an error that "." is not recognized as an internal or external command. PowerShell does not give an error but does not give any result either; it just opens a text document of the code:
# Get CIFAR10
wget http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
tar -xzvf cifar-10-python.tar.gz
rm cifar-10-python.tar.gz
Git Bash gives me the following error:
get_datasets.sh: line 2: wget: command not found
tar (child): cifar-10-python.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
rm: cannot remove 'cifar-10-python.tar.gz': No such file or directory
How do I download this data? Please help! Thanks.
Download wget
and copy wget.exe into your Git installation:
C:\Program Files\Git\mingw64
Then restart Git Bash and navigate to the folder where the get_datasets.sh file is present.
Run the following command in Git Bash:
sh get_datasets.sh
Navigate to the folder where the get_datasets.sh file is present using cmd.
Use the command .\get_datasets.sh in the Command Prompt.
Make sure you are not missing wget. If wget is missing, install it on your local machine and then run the shell script.
After installing wget properly, use the instructions mentioned at the start.
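If installing wget is more trouble than it's worth, note that Git Bash ships with curl, so a workaround (a sketch mirroring get_datasets.sh, just swapping the downloader) is:
# run from the cs231n/datasets directory in Git Bash
curl -O http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
tar -xzvf cifar-10-python.tar.gz
rm cifar-10-python.tar.gz
curl -O saves the file under its remote name, so the tar and rm lines work unchanged.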

Unable to run tar command - invalid option -- '▒'

I've run into a problem while running a tar script. I am getting an invalid option, as shown in the screenshots, which stops the script from running. I don't understand why, however, as the command worked outside the script. Can anyone help me?
The script: [screenshot]
The error: [screenshot]
Thanks to Paul R I have an answer. No idea how to mark his comment as the answer, though, so here it is:
Some older versions of tar don't like the - at the start of the
commands - try tar cvpfz .... – Paul R
Are you copying and pasting the command instead of hand-typing it in the terminal?
In my case, I was getting:
tar: invalid option -- '�'
I was pasting into the terminal a command from a raw text file, which I had copied from a tutorial:
tar –xvzf bitcoin-0.20.0-x86_64-linux-gnu.tar.gz
I hand-typed the entire command:
tar -xvzf bitcoin-0.20.0-x86_64-linux-gnu.tar.gz
and it worked. The likely culprit is a non-ASCII character: the pasted command above starts the options with an en-dash (–) where tar expects a plain hyphen (-).
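A quick way to spot such characters before running a pasted command (a sketch assuming GNU grep, with the pasted command saved in a hypothetical cmd.txt):
grep --color=auto -nP '[^\x00-\x7F]' cmd.txt
Any highlighted match is a byte outside plain ASCII, such as the en-dash above, which tar will reject.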
In my case, I ran chmod 777 'FILE_NAME' to unlock the file.
Then the installation worked well!

file not found error while trying to move hadoop 2.6

I tried to install hadoop using the link below:
http://www.bogotobogo.com/Hadoop/BigData_hadoop_Install_on_ubuntu_single_node_cluster.php
I was moving the files to /usr/local/hadoop, but I got the following error:
hduser@vijaicricket-Lenovo-G50-70:~$ ~/hadoop-2.6.0$ sudo mv * /usr/local/hadoop
-bash: /home/hduser/hadoop-2.6.0$: No such file or directory
Where did you extract the hadoop tar file? Looking at the error from the shell, it appears the /home/hduser/hadoop-2.6.0 directory doesn't exist. Also make sure the user has valid permissions.
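For what it's worth, the transcript also suggests that ~/hadoop-2.6.0$ was a shell prompt pasted along with the command, so bash tried to execute it as a program. A sketch of the intended sequence, assuming the tarball really was extracted to ~/hadoop-2.6.0:
cd ~/hadoop-2.6.0
sudo mv * /usr/local/hadoop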
