Installing Hadoop on CentOS 7 but the command is not working - hadoop

I am trying to install a Hadoop cluster on CentOS 7, but the command is not responding. I manually downloaded Hadoop from this link on Windows, then copied it to the VMware machine and tried to install it using this command, but it is not working.

What if you run:
tar xvzf /home/hadoop.2.7.0.tar.gz
Also could you please run and paste results for:
du /home/hadoop.2.7.0.tar.gz
tar tvzf /home/hadoop.2.7.0.tar.gz
md5sum /home/hadoop.2.7.0.tar.gz
You can find the distribution with checksums here.
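For reference, a minimal sketch of how the checksum comparison could look; the archive URL below follows the usual Apache layout and is an assumption, so use the link above if it differs:
md5sum /home/hadoop.2.7.0.tar.gz
# fetch the published checksum file (assumed URL, standard Apache archive layout)
curl -O https://archive.apache.org/dist/hadoop/common/hadoop-2.7.0/hadoop-2.7.0.tar.gz.mds
# compare the MD5 line in the .mds file with the md5sum output above
cat hadoop-2.7.0.tar.gz.mds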

Related

How to fix 'character map file `UTF-8' not found'

I'm setting up a UBI rhel8 container. I need to execute this command:
localedef -f UTF-8 -i en_US en_US.UTF-8
which failed with:
character map file `UTF-8' not found: No such file or directory
cannot read character map directory `/usr/share/i18n/charmaps': No such file or directory
You need to install these packages:
yum -y install glibc-locale-source glibc-langpack-en
and then re-run the localedef command.
On Debian/Ubuntu, I was able to solve this via apt install locales.
I ran into this same symptom (locale-gen can't find the charmap files) after upgrading from Ubuntu 20.04 to 22.04 in WSL 1. The problem is a bug in WSL 1 that prevents gunzip from running. That matters for locale-gen because, at least in Ubuntu 22.04, the charmap files in /usr/share/i18n/charmaps are stored in gzip *.gz format. locale-gen depends on gunzip to decompress the charmap files, and when it can't run it, it gets stuck.
The solution was to:
copy UTF-8.gz to the Windows filesystem
unzip it to UTF-8 with a Windows tool (such as 7-zip)
copy UTF-8 back to /usr/share/i18n/charmaps
Then locale-gen worked correctly.
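A rough sketch of that workaround from the WSL side, assuming your Windows home directory is mounted at /mnt/c/Users/<you> (adjust the path to your setup):
cp /usr/share/i18n/charmaps/UTF-8.gz /mnt/c/Users/<you>/
# unzip UTF-8.gz to UTF-8 on the Windows side with 7-Zip or a similar tool
sudo cp /mnt/c/Users/<you>/UTF-8 /usr/share/i18n/charmaps/
sudo locale-gen en_US.UTF-8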

Installing spark on hadoop

I installed Hadoop 2.7 on my Mac. Now I want to install Spark on it, but there is no documentation for this. Can anybody explain step by step how to install Spark on Hadoop?
Steps to Install Apache Spark
1) Open the Apache Spark website: http://spark.apache.org/
2) Click on the Downloads tab; a new page will open
3) Choose Pre-built for Hadoop 2.7 and later
4) Choose Direct Download
5) Click on Download Spark: spark-2.0.2-bin-hadoop2.7.tgz and save it on your desired location.
6) Go to the downloaded tar file and extract it.
7) Extract the resulting spark-2.0.2-bin-hadoop2.7.tar [file name will differ as the version changes] to get the spark-2.0.2-bin-hadoop2.7 folder
8) Now open a shell prompt and go to the bin directory of the spark-2.0.2-bin-hadoop2.7 folder [folder name will differ as the version changes]
9) Execute the command ./spark-shell (a command-line sketch of these steps follows this answer)
You will be in the Spark shell and can execute Spark commands
https://spark.apache.org/docs/latest/quick-start.html <-- Quick start Guide from spark
Hope this Helps!!!
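Put together, steps 6-9 look roughly like this on the command line (a sketch; the archive name depends on the version you downloaded):
cd ~/Downloads                          # or wherever you saved the archive
tar xvzf spark-2.0.2-bin-hadoop2.7.tgz  # extracts the spark-2.0.2-bin-hadoop2.7 folder
cd spark-2.0.2-bin-hadoop2.7/bin
./spark-shell                           # drops you into the Spark shell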
For running Spark on a YARN cluster there are quite a few steps to install Hadoop, Spark and everything else, so I wrote a step-by-step blog post on it; following it you can install everything and run the Spark shell on YARN. See the link below:
https://blog.knoldus.com/2016/01/30/spark-shell-on-yarn-resource-manager-basic-steps-to-create-hadoop-cluster-and-run-spark-on-it/
Here are the steps I took to install Apache Spark on a Linux CentOS system with Hadoop:
Install a default Java system (ex: sudo yum install java-11-openjdk)
Download latest release of Apache Spark from spark.apache.org
Extract the Spark tarball (tar xvf spark-2.4.5-bin-hadoop2.7.tgz)
Move Spark folder created after extraction to the /opt/ directory (sudo mv spark-2.4.5-bin-hadoop2.7/ /opt/spark)
Execute with command /opt/spark/bin/spark-shell if you wish to work with Scala or /opt/spark/bin/pyspark if you want to work with Python
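Optionally, you can add Spark to your environment so spark-shell and pyspark work from any directory; this sketch assumes the /opt/spark location from the steps above and a bash shell:
echo 'export SPARK_HOME=/opt/spark' >> ~/.bashrc
echo 'export PATH=$PATH:$SPARK_HOME/bin' >> ~/.bashrc
source ~/.bashrc
spark-shell   # should now start from anywhere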

Installing Apache Mahout 0.11.0 on RHEL on multinode hadoop cluster

Need help on how to configure Mahout to run on a multinode Hadoop cluster, and also how to run the sample programs shipped with Mahout.
Steps done till now:
Download the tar.gz file.
sudo tar -zxvf mahout-distribution-x.x.tar.gz
sudo mv mahout-distribution-x.x /usr/lib/mahout
sudo gedit ~/.bashrc
export MAHOUT_HOME=/usr/lib/mahout
source ~/.bashrc
You also need to add:
export PATH=$PATH:$MAHOUT_HOME/bin
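A sketch of the resulting ~/.bashrc entries; the Mahout path matches the mv step above, while the Hadoop paths are assumptions for a typical install (Mahout picks up the cluster configuration from Hadoop's environment):
export MAHOUT_HOME=/usr/lib/mahout
export PATH=$PATH:$MAHOUT_HOME/bin
# assumed Hadoop locations; point these at your own cluster install
export HADOOP_HOME=/usr/local/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop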

Where is mongoimport installed on Mac OS X

I'm trying to setup a cronjob for a regularly scheduled import of json data into a mongo database. To conduct the import, I have the following command in the Python script that the cronjob runs:
os.system("mongoimport --jsonArray --db %s --collection %s --file .../data.txt" %(db_name,collection_name))
However, the log file of the cronjob keeps displaying the following error:
sh: mongoimport: command not found
I think I need to call mongoimport with the full file path in the code, but I'm not sure where mongodb/mongod/mongoimport is installed on my system. whereis mongoimport, whereis mongodb, whereis mongod all return nothing.
I installed mongodb with Homebrew. Packages installed with Homebrew are located in /Library/Caches/Homebrew. However, in my system that folder only has a mongodb-2.6.4_1 tar file. Do I have to unpack this tar file to access mongoimport?
Thanks for your help.
As of June 2020,
I installed the latest MongoDB version using brew as per the documentation, and I faced the same issue: command not found: mongoimport.
I had to install mongodb-database-tools:
brew install mongodb/brew/mongodb-database-tools
Then I could use mongoimport
Just adding this solution, in case it helps someone.
Got the same issue, but I installed MongoDB via MacPorts. Unfortunately, from version 3 of MongoDB these tools are maintained as a separate project, so I updated MacPorts to the latest version and then installed the Mongo tools separately.
sudo port install mongo-tools
Hope this helps someone who is installing MongoDB via MacPorts.
If you installed MongoDB correctly, you need to create a ~/.bash_profile and add /usr/local/mongodb/bin to the $PATH environment variable.
After that you should be able to access the mongoimport command.
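A sketch of the ~/.bash_profile entry described above; the /usr/local/mongodb/bin prefix is taken from this answer, so adjust it to wherever your install actually lives:
echo 'export PATH="/usr/local/mongodb/bin:$PATH"' >> ~/.bash_profile
source ~/.bash_profile
mongoimport --version   # confirms the command is now found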
If you used brew for installation, mongod is in /usr/local/bin/ directory. Other utilities (mongoimport, mongoexport etc.) are in the same path. All you need to do is open another terminal.
Visit https://www.mongodb.com/download-center/community and you can download a tarball for MacOS, which contains all the tools including mongoimport.
Untar it, add it to your PATH and voilà!
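A sketch of that route, with placeholder archive and folder names since they depend on the version you download:
mkdir -p ~/mongodb
tar xzf mongodb-macos-x86_64-<version>.tgz -C ~/mongodb   # placeholder archive name
export PATH="$HOME/mongodb/mongodb-macos-x86_64-<version>/bin:$PATH"
mongoimport --version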
Try using ./mongoimport or sudo ./mongoimport
After following all of these examples, I was able to use it that way from bash.
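Since cron runs with a minimal PATH, another option is to locate the binary once and then call it by its absolute path in the script; a sketch (the database, collection and file names below are placeholders):
# find the full path of mongoimport (Homebrew usually links it under /usr/local/bin)
which mongoimport
# then use that full path in the cron job / os.system call, e.g.
/usr/local/bin/mongoimport --jsonArray --db mydb --collection mycoll --file /path/to/data.txt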

Hadoop showing old version despite latest version installation

I am trying to install Hadoop on my Ubuntu OS. I followed each and every step exactly from this link Hadoop Install Tutorial and everything was going as expected until I tried to run the
$ start-dfs.sh and $ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5 commands. These commands don't work as expected. After some research I somehow came to know that I was using the older Hadoop version, 1.0.2, despite having installed the latest 2.2.0 version.
As I could not solve this, I tried to uninstall Hadoop completely. Now when I try doing it, it says:
$ sudo dpkg -r hadoop
dpkg: dependency problems prevent removal of hadoop:
hadoop-native depends on hadoop (= 1.0.2-0ubuntu1~hadoop1).
dpkg: error processing hadoop (--remove):
dependency problems - not removing
Errors were encountered while processing:
hadoop
Appreciate any help !
I don't know whether it's the proper way to remove Hadoop or not, but I removed it using the method below.
1. I first manually deleted the /usr/local/hadoop folder for all users (if any). If you are not able to remove it due to lack of permissions, check the permissions on the folder: set its group to "sudo" with "Creating and deleting files" access so that every user can delete it from their own instance.
Then, from a terminal in /usr/local, $ rm -r hadoop does the job.
After this, I checked $ hadoop version again in the terminal, and boom, it again showed its existence. Then I did the step below.
2. Go to a terminal and run sudo apt-get purge hadoop or sudo apt-get remove hadoop; then it worked.
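Based on the dpkg error above (hadoop-native depends on hadoop), a sketch of how one might remove the packaged install and confirm the old version is gone:
sudo apt-get purge hadoop-native hadoop   # remove the dependent package together with hadoop
dpkg -l | grep hadoop                     # should list nothing from the 1.0.2 packages
which hadoop                              # check which binary is left on the PATH
hadoop version                            # should now report the manually installed 2.2.0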
