Hadoop HBase scripts on Linux Mint give strange errors - shell

I have installed Hadoop and HBase on my Mint (Ubuntu-like) virtual machine. I keep getting strange error messages when running scripts. Hadoop commands also produce errors, but at least they work; with HBase it just fails.
For example, running sh hadoop -rmr /home/myoutput I get:
hadoop: 102: [: fs: unexpected operator
Deleted hdfs://localhost/home/myoutput
When I run start-hbase it starts fine.
When I run sh hbase shell I get:
hbase: 163: hbase: [[: not found
hbase: 163: hbase: [[: not found
hbase: 197: hbase: Syntax error: "(" unexpected
These lines in the hbase script are:
163: if [[ $f = *sources.jar ]]
197: function append_path() {
What am I missing?

Mint is not just Ubuntu-like, it is actually built on Ubuntu, so you should be able to find the answer for that pretty easily.
Also, my suggestion is to tag this question with ubuntu, not mint.
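For what it's worth, both symptoms are characteristic of bash-only syntax being fed to dash, which is what /bin/sh points to on Ubuntu and Mint: dash has no [[ ... ]] test and no function keyword, which matches the two failing lines quoted above, and invoking the scripts through sh overrides their bash shebang. A quick sketch of the difference (exact messages may vary by dash version):
$ sh -c 'if [[ a = a ]]; then echo ok; fi'
sh: 1: [[: not found
$ bash -c 'if [[ a = a ]]; then echo ok; fi'
ok
So running hbase shell (or bash hbase shell) instead of sh hbase shell should make these errors disappear.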

Related

Anyone know how to fix hadoop-functions.sh "syntax error near unexpected token `<'"?

I've configured Hadoop 3.1.1 on my Mac Pro running macOS 10.14.2, and I'm getting the following errors when I run start-all.sh
$ sudo /usr/local/Cellar/hadoop/3.1.1/sbin/start-all.sh
Starting namenodes on [localhost]
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-functions.sh: line 398: syntax error near unexpected token `<'
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-functions.sh: line 398: ` done < <(for text in "${input[@]}"; do'
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 70: hadoop_deprecate_envvar: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 87: hadoop_bootstrap: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 104: hadoop_parse_args: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 105: shift: : numeric argument required
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 244: hadoop_need_reexec: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 252: hadoop_verify_user_perm: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/hdfs: line 213: hadoop_validate_classname: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/hdfs: line 214: hadoop_exit_with_usage: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 263: hadoop_add_client_opts: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 270: hadoop_subcommand_opts: command not found
/usr/local/Cellar/hadoop/3.1.1/libexec/bin/../libexec/hadoop-config.sh: line 273: hadoop_generic_java_subcmd_handler: command not found
Same issues starting the datanodes, secondary namenodes, resourcemanager, and nodemanagers.
I have found a similar bug reference online: https://issues.apache.org/jira/browse/HDFS-12571.
Update
After some debugging, the root cause is that the bash "< <(command)" process-substitution syntax is not being accepted for some reason, even though both bash builds on the system (/bin/bash and /usr/local/bin/bash from Homebrew) handle it correctly when tested directly.
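One way to check whether a POSIX-mode shell is the culprit (bash 3.2, which macOS ships, disables process substitution when run as sh) is a rough test like the following; the exact error text may differ slightly by version:
$ bash -c 'while read -r line; do echo "$line"; done < <(echo hi)'
hi
$ sh -c 'while read -r line; do echo "$line"; done < <(echo hi)'
sh: -c: line 0: syntax error near unexpected token `<'
The second failure mirrors the error from hadoop-functions.sh, which would point at the script being interpreted by sh rather than bash.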
Maybe you should set HDFS_NAMENODE_USER, HDFS_DATANODE_USER, and so on in hadoop-env.sh to the current user instead of root. Then, before running sudo ./start-all.sh, you may need to recreate the HDFS namenode with hdfs namenode -format.
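A minimal sketch of that change, assuming the standard Hadoop 3.x variable names and that all the daemons should run as your login user (the file lives under your install's etc/hadoop directory; the Homebrew path will differ):
# etc/hadoop/hadoop-env.sh
export HDFS_NAMENODE_USER="$(whoami)"
export HDFS_DATANODE_USER="$(whoami)"
export HDFS_SECONDARYNAMENODE_USER="$(whoami)"
export YARN_RESOURCEMANAGER_USER="$(whoami)"
export YARN_NODEMANAGER_USER="$(whoami)"
With these set you should also be able to drop the sudo and run start-all.sh as yourself.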

How to run hive script from hive cli

I have a Hive script, custsales.hql, and now I want to run it from the Hive CLI as
hive (pract5)> run /user/training/hdfs_location/custsales.hql
but it does not execute. Please guide. I know we can run it from the command line with
$ hive -f /home/training/local_location/custsales.hql
but this is not my requirement.
Use the source path/to/script command.
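For example, using the local copy of the script from the question (note that source reads from the local filesystem, not HDFS, so point it at a local path):
hive (pract5)> source /home/training/local_location/custsales.hql;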

CentOS 7 errors on each command

All of a sudden, when I execute any command in my CentOS 7 shell, I receive errors that prevent it from running. The errors look like this:
$ ls
-bash: /usr/bin/ls: Input/output error
$ df
-bash: df: command not found
$ top
-bash: /usr/bin/top: Input/output error
I tried rebooting the machine to no avail, and no services like fsftp or http work either. Please help me as soon as possible, as this is the main server of my back office.
It was an issue on the provider's side (Crissic); once notified, they acknowledged the issue, and it took a day to fix.
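For future readers: Input/output error on every binary usually points at failing storage or a root filesystem the kernel remounted read-only, so it is worth checking the kernel log and mount state before suspecting the shell (assuming enough of the system is still readable to run these):
$ dmesg | tail            # look for disk (ata/sd) or filesystem errors
$ mount | grep ' / '      # see whether / has been remounted ro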

Trouble running pig in both local and mapreduce modes

I already have Hadoop 1.2 running on my Ubuntu VM, which runs on a Windows 7 machine. I recently installed Pig 0.12.0 on the same Ubuntu VM, downloaded as pig-0.12.0.tar.gz from the Apache website. I have all the variables such as JAVA_HOME, HADOOP_HOME, and PIG_HOME set correctly. When I try to start Pig in local mode, this is what I see:
chandeln@ubuntu:~$ pig -x local
pig: invalid option -- 'x'
usage: pig
chandeln@ubuntu:~$ echo $JAVA_HOME
/usr/lib/jvm/java7
chandeln@ubuntu:~$ echo $HADOOP_HOME
/usr/local/hadoop
chandeln@ubuntu:~$ echo $PIG_HOME
/usr/local/pig
chandeln@ubuntu:~$ which pig
/usr/games/pig
chandeln@ubuntu:~$ echo $PATH
/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/jvm/java7/bin:/usr/local/hadoop/bin:/usr/local/pig/bin
Since I am not a Unix expert, I am not sure whether this is the problem, but which pig returns /usr/games/pig instead of /usr/local/pig. Is this the root cause of the problem?
Please guide.
I was able to fix the problem by changing the following line in my .bashrc, which gives the /usr/local/pig directory precedence over /usr/games:
BEFORE: export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$PIG_HOME/bin
AFTER: export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PIG_HOME/bin:$PATH
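The /usr/games/pig that was shadowing Apache Pig is almost certainly the bsdgames Pig Latin toy, which explains the invalid option -- 'x' message. After editing .bashrc, a quick way to confirm the fix in an already-open shell:
$ source ~/.bashrc
$ hash -r       # drop bash's cached lookup of the old pig
$ which pig
/usr/local/pig/bin/pig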

pig beginner's example [unexpected error]

I am new to Linux and Apache Pig. I am following this tutorial to learn pig:
http://salsahpc.indiana.edu/ScienceCloud/pig_word_count_tutorial.htm
This is a basic word counting example. The data file 'input.txt' and the program file 'wordcount.pig' are in the Wordcount package, linked on the site.
I already have Pig 0.11.1 downloaded on my local machine, as well as Hadoop and Java 6.
When I downloaded the Wordcount package, it came as a "tar.gz" file. I was unfamiliar with this type and wasn't sure how to extract it.
It contains the files 'input.txt' and 'wordcount.pig', plus a Readme file. I saved 'input.txt' to my Desktop. I wasn't sure where to save 'wordcount.pig', so I decided to just type in the commands line by line in the shell.
I ran pig in local mode as follows: pig -x local
and then I just copy-pasted each line of the wordcount.pig script at the grunt> prompt like this:
A = load '/home/me/Desktop/input.txt';
B = foreach A generate flatten(TOKENIZE((chararray)$0)) as word;
C = group B by word;
D = foreach C generate COUNT(B), group;
dump D;
This generates the following errors:
...
Retrying connect to server: localhost/127.0.0.1:8021. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 2043: Unexpected error during execution.
My questions:
1.
Should I save 'input.txt' and the original 'wordcount.pig' script to some special folder inside the pig-0.11.1 directory? That is, should I create a folder called word inside pig-0.11.1, put 'wordcount.pig' and 'input.txt' there, and then type wordcount.pig at the grunt> prompt?
In general, if I have data in, say, 'dat.txt' and a script in, say, 'program.pig', where should I save them to run 'program.pig' from the grunt shell? I think they should both go in pig-0.11.1, so I can do $ pig -x local wordcount.pig, but I am not sure.
2.
Why am I not able to run the script line by line as I tried to?
I have specified the location of the file 'input.txt' in the load statement.
So why does it not just run the commands line by line and dump the contents of D to my screen?
3.
When I try to run Pig in mapreduce mode using $ pig, it gives this error:
retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
2013-06-03 23:57:06,956 [main] ERROR org.apache.pig.Main - ERROR 2999: Unexpected internal error. Failed to create DataStorage
This error indicates that Pig is unable to connect to Hadoop to run the job. You say you have downloaded Hadoop -- have you installed it? If you have installed it, have you started it up according to its docs -- have you run the bin/start-all.sh script? Using -x local tells Pig to use the local filesystem instead of HDFS, but it still needs a running Hadoop instance to perform the execution. Before trying to run Pig, follow the Hadoop docs to get your local "cluster" set up and make sure your NameNode, DataNodes, etc. are up and running.
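For a Hadoop 1.x tarball install like this one, a minimal sanity check might look like the following (jps ships with the JDK; the daemon names are the Hadoop 1.x set):
$ $HADOOP_HOME/bin/start-all.sh
$ jps
The jps output should list NameNode, DataNode, SecondaryNameNode, JobTracker, and TaskTracker before you retry pig.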
The 2043 error occurs when Hadoop and Pig fail to communicate with each other.
Never do a right click --> Extract Here when dealing with tar.gz files; always extract them with tar -xzvf *.tar.gz in the terminal.
I noticed that Pig doesn't get installed properly when you right-click the pig..tar.gz file and select Extract Here; it's better to run tar -xzvf pig..tar.gz from the terminal.
Make sure Hadoop is running before you execute commands like pig -x local.
If you want to run *.pig files from the grunt> prompt, use:
grunt> exec *.pig
If you want to run pig files outside the grunt> prompt, use:
$ pig -x local *.pig
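Putting it together for this tutorial, a typical session might look like the following (the archive name is a guess; use whatever the Wordcount download is actually called, and make sure the load path inside wordcount.pig points at your copy of input.txt):
$ tar -xzvf Wordcount.tar.gz
$ cd Wordcount
$ pig -x local wordcount.pig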
