Hadoop installation on Ubuntu - hadoop

Can anybody provide me with the commands to install Hadoop 2.2 on Ubuntu 14.04?
I have checked various sites but they all seem to have different procedures.

I successfully installed it using this step-by-step guide (in my case it was on 14.10, but I doubt there will be any difference)
Here is the important part:
wget http://apache.mirrors.pair.com/hadoop/common/stable2/hadoop-2.2.0.tar.gz
tar -xvzf hadoop-2.2.0.tar.gz
mv hadoop-2.2.0 hadoop
sudo mv hadoop /usr/local/
sudo chown -R hduser:hadoop /usr/local/hadoop
You can further configure it for your convenience, choose interface etc.
Hope this helps

Related

Installing hadoop on Centos 7 but command is not working

I am trying to install a Hadoop cluster on CentOS 7, but the command is not responding. I manually downloaded Hadoop from this link on Windows, copied it to the VMware guest, and tried to install it using this command, but it is not working.
What if you run:
tar xvzf /home/hadoop.2.7.0.tar.gz
Also could you please run and paste results for:
du /home/hadoop.2.7.0.tar.gz
tar tvzf /home/hadoop.2.7.0.tar.gz
md5sum /home/hadoop.2.7.0.tar.gz
You can find distribution with checksums here
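A minimal sketch of the checksum comparison, assuming md5sum is available as in the commands above. The function name verify_md5 is made up for illustration, and the expected digest has to come from the download page.

```shell
#!/bin/sh
# verify_md5 FILE EXPECTED  (hypothetical helper, not part of Hadoop)
# Succeeds only if FILE exists and its MD5 digest equals EXPECTED.
verify_md5() {
    file=$1
    expected=$2
    # Return 2 when the file is missing so it is distinguishable from a mismatch.
    [ -f "$file" ] || return 2
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$file" | cut -d ' ' -f 1)
    [ "$actual" = "$expected" ]
}
```

Usage would look like: verify_md5 /home/hadoop.2.7.0.tar.gz "<digest from the download page>" && echo "checksum OK". A mismatch usually means the copy from Windows was truncated or corrupted.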

Installing Apache Mahout 0.11.0 on RHEL on multinode hadoop cluster

Need help on how to configure mahout to run on multinode hadoop cluster. Also how to run sample programs shipped with Mahout.
Steps done till now:
Download tar gz file.
sudo tar -zxvf mahout-distribution-x.x.tar.gz
sudo mv mahout-distribution-x.x /usr/lib/mahout
sudo gedit ~/.bashrc
export MAHOUT_HOME=/usr/local/mahout
source ~/.bashrc
You also need to add:
export PATH=$PATH:$MAHOUT_HOME/bin
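Note that the steps above move the distribution to /usr/lib/mahout but export MAHOUT_HOME=/usr/local/mahout, so it is worth checking that the variable points at a directory that really contains the launcher. A small sketch (check_tool_home is a hypothetical helper name):

```shell
#!/bin/sh
# check_tool_home DIR TOOL  (hypothetical helper)
# Succeeds if DIR/bin/TOOL exists and is executable; otherwise reports what is missing.
check_tool_home() {
    home_dir=$1
    tool=$2
    if [ -x "$home_dir/bin/$tool" ]; then
        printf 'ok: %s\n' "$home_dir/bin/$tool"
    else
        printf 'missing: %s\n' "$home_dir/bin/$tool" >&2
        return 1
    fi
}
```

After sourcing ~/.bashrc, check_tool_home "$MAHOUT_HOME" mahout should print an "ok:" line. If it reports "missing", MAHOUT_HOME and the install directory do not agree.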

Hadoop showing old version despite latest version installation

I am trying to install Hadoop on Ubuntu. I followed every step from this link, Hadoop Install Tutorial, and everything went as expected until I tried to run
$ start-dfs.sh and $ hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar pi 2 5. These commands don't work as expected. After some digging I found that I was running the older Hadoop 1.0.2, even though I had downloaded the latest 2.2.0.
As I could not solve this, I tried to uninstall Hadoop completely. When I try, it says:
$ sudo dpkg -r hadoop
dpkg: dependency problems prevent removal of hadoop:
hadoop-native depends on hadoop (= 1.0.2-0ubuntu1~hadoop1).
dpkg: error processing hadoop (--remove):
dependency problems - not removing
Errors were encountered while processing:
hadoop
Appreciate any help !
I don't know whether this is the proper way to remove Hadoop, but this is how I removed it.
1. I first manually deleted the /usr/local/hadoop folder for all users (if any). If you cannot remove it for lack of permissions, check the folder's permissions and make sure your user is allowed to create and delete files in it (or use sudo).
From a terminal in /usr/local, $ rm -r hadoop does the job.
After this I ran $ hadoop version again in the terminal, and it still reported the old installation. Then I did the step below.
2. In a terminal, run sudo apt-get purge hadoop or sudo apt-get remove hadoop. Then it worked.
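When an old version keeps showing up, it helps to list every hadoop executable the shell can see, in the order it searches for them. A rough sketch of that lookup (find_on_path is a hypothetical helper; it does roughly what which -a does):

```shell
#!/bin/sh
# find_on_path NAME  (hypothetical helper)
# Prints every executable file called NAME found in the PATH directories, in PATH order.
find_on_path() {
    (
        # Work in a subshell so the IFS and globbing changes do not leak out.
        IFS=:
        set -f
        for dir in $PATH; do
            # An empty PATH entry means the current directory.
            candidate="${dir:-.}/$1"
            if [ -f "$candidate" ] && [ -x "$candidate" ]; then
                printf '%s\n' "$candidate"
            fi
        done
    )
}
```

Running find_on_path hadoop prints the first match first, and that is the one the shell executes. For a match such as /usr/bin/hadoop, dpkg -S /usr/bin/hadoop should tell you whether a package owns it, in which case apt-get or dpkg is the way to remove it rather than rm.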

How to uninstall Hadoop 1.0.0

I set up my Hadoop clusters with Hadoop 2.0.2. Then, today I tried to test 1.0.0. So I downloaded the deb file from the Hadoop website and installed it: It did mess up everything.
Now, when I type "which -a hadoop" I get 2 results
one pointing to my old Hadoop installation folder
and the other one pointing to /usr/bin/hadoop.
So the question is: how do I get rid of Hadoop 1.0.0 completely?
Try using dpkg -r hadoop; this should remove the Hadoop package from the system, but leave the config files intact. If you want to lose the config files as well, try dpkg -P hadoop instead.
> $HADOOP_HOME
> /home/shiv/hadoop
> sudo rm -r /home/shiv/hadoop
And Hadoop is uninstalled!
I struggled through this for longer than a while and then decided to share it here:
The trick is to basically delete all the symlinks pointing back to locations where HDP components reside since that is what causes 80% of the problem. Here is a step by step tutorial for that:
http://www.yourtechchick.com/hadoop/how-to-completely-remove-and-uninstall-hdp-components-hadoop-uninstall-on-linux-system/
Hope that helps!

Installing CouchDB in AWS EC2 Free Tier

Does anyone know of a step by step installation guide for CouchDB in the free tier 32bit AWS EC2 instance?
Keep in mind that YUM is limited by default and I would need to add yum.repos to get extra stuff. I've tried all different articles and RPMs but none seem to work.
I also tried couchbase but it has extremely poor post-install instructions. The server start but then what? I couldn't find the files, configs, or install directories. And, how do I access it?
CouchDB sounds like such a great database but it really needs to break these barriers of entry. MongoDb has better docs, although I couldn't get that to work either (I spent a fraction of the time trying, though).
Thanks :)
The apache team put together this quick script that installs CouchDB (thanks #_jhs
for build-couchdb!) on an Amazon Linux AMI:
https://gist.github.com/1171217
If you are using cloudinit + the EC2 command line tools, simply use
ec2-run-instances with --user-data-file (you will need some mods to
the script to save the password or locally generate one) and voila'.
Relaxing FTW.
Worked like a charm for me!
Enable the EPEL repository first and then install it with yum install couchdb
You can enable EPEL using the instructions here.
EDIT:
More information at http://wiki.apache.org/couchdb/Installing_on_RHEL5. Keep in mind that the Linux EC2 AMI is a cut down version of CentOS and you can add custom repositories and install as you wish.
Here is a quick run down of the steps I use to install couchdb 1.5.1 on Amazon Linux 2014.03.1. See also this post on my blog http://www.everyhaironyourhead.com/installing-couchdb-1-5-1-on-amazon-linux-ami-2014-03-1/.
Core deps and dev tools.
Enable the EPEL Repo by editing the file /etc/yum.repos.d/epel.repo and setting it to enabled.
Next install the deps and tools.
sudo yum install gcc gcc-c++ libtool libicu-devel openssl-devel autoconf-archive erlang python27 python-sphinx help2man
Get the SpiderMonkey JS Engine and build it...
wget http://ftp.mozilla.org/pub/mozilla.org/js/js185-1.0.0.tar.gz
tar xvfz js185-1.0.0.tar.gz
cd js-1.8.5/js/src
./configure
make
sudo make install
You should see it installed under /usr/local/lib
Build CouchDB.
Download the source package for CouchDB, unpack it and cd in.
Point it to the required libs and configure.
./configure --with-erlang=/usr/lib64/erlang/usr/include --with-js-lib=/usr/local/lib/ --with-js-include=/usr/local/include/js/
make
sudo make install
Prepare the CouchDB installation.
Make a couchdb user.
sudo useradd -r -d /usr/local/var/lib/couchdb -M -s /bin/bash couchdb
Set the file ownerships.
sudo chown -R couchdb:couchdb /usr/local/etc/couchdb
sudo chown -R couchdb:couchdb /usr/local/var/lib/couchdb
sudo chown -R couchdb:couchdb /usr/local/var/log/couchdb
sudo chown -R couchdb:couchdb /usr/local/var/run/couchdb
sudo chmod 0775 /usr/local/etc/couchdb
sudo chmod 0775 /usr/local/var/lib/couchdb
sudo chmod 0775 /usr/local/var/log/couchdb
sudo chmod 0775 /usr/local/var/run/couchdb
Prepare the init scripts.
Link the init script and copy the log rotate script to /etc.
sudo cp /usr/local/etc/logrotate.d/couchdb /etc/logrotate.d
sudo ln -s /usr/local/etc/rc.d/couchdb /etc/init.d/couchdb
This and most other linux distros don’t include /usr/local/lib in ld, so CouchDB will have problems finding the SpiderMonkey libs we installed there earlier. One way to solve this is to add the following line to the top of the /etc/init.d/couchdb startup script.
export LD_LIBRARY_PATH=/usr/local/lib
See man page for ldconfig for more info, and please comment with a better solution.
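One possible alternative to exporting LD_LIBRARY_PATH, offered as a sketch and not tested on this exact AMI: register /usr/local/lib with the dynamic linker through a drop-in file and refresh the cache. This assumes /etc/ld.so.conf includes ld.so.conf.d/*.conf, which is the usual layout on RHEL-family systems. The helper name register_lib_dir and the file name usr-local-lib.conf are made up for illustration.

```shell
#!/bin/sh
# register_lib_dir CONF_FILE LIB_DIR  (hypothetical helper)
# Appends LIB_DIR to CONF_FILE unless an identical line is already present,
# so running it twice does not create duplicates.
register_lib_dir() {
    conf_file=$1
    lib_dir=$2
    if [ -f "$conf_file" ] && grep -qxF "$lib_dir" "$conf_file"; then
        return 0
    fi
    printf '%s\n' "$lib_dir" >> "$conf_file"
}
```

Run as root, that would be register_lib_dir /etc/ld.so.conf.d/usr-local-lib.conf /usr/local/lib followed by ldconfig. The init script then needs no LD_LIBRARY_PATH line.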
You may want to edit /usr/local/etc/default/couchdb to turn off the auto respawn.
To get it to autostart, just use the standard linux setup tools for running service scripts.
sudo chkconfig --add couchdb
It should pick up the default run levels needed from the script, but in case it doesn’t, you can do it manually like this...
sudo chkconfig --level 3 couchdb on
sudo chkconfig --level 4 couchdb on
sudo chkconfig --level 5 couchdb on
You can run sudo chkconfig --list to confirm it's there. See man chkconfig for more details.
Relax.
Finally, reboot (or just start CouchDB from the script) and confirm it's running with curl http://127.0.0.1:5984/
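If you want to script that check, here is a small sketch. It assumes the root URL answers with a JSON body containing "couchdb":"Welcome", which is what CouchDB normally returns there. The helper name is_couchdb_welcome is made up.

```shell
#!/bin/sh
# is_couchdb_welcome  (hypothetical helper)
# Reads a response body on stdin and succeeds if it contains the
# "couchdb":"Welcome" greeting, tolerating spaces around the colon.
is_couchdb_welcome() {
    grep -q '"couchdb" *: *"Welcome"'
}
```

Usage: curl -s http://127.0.0.1:5984/ | is_couchdb_welcome && echo "CouchDB is up". If curl cannot connect, the body is empty and the check fails.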
Comments, corrections, improvements, and criticisms are appreciated.
Add the EPEL repository first and then install it with yum install couchdb
Yeah, not exactly. I'm running AWS Free Tier standard and installing Couch has been hell on earth: lots and lots of dependency issues around Erlang and various graphics libs. I'll report back here when I get a process that works.
okay, the issue for me was wxGTK.x86_64 - It had a list of 15 or so dependencies that wouldn't install through yum (even with epel) and I had to manually install the rpms and dependencies before yum install couchdb would work.
Not sure the default AMI is a good idea if you want couch!
I googled: "build couchdb"
and followed the steps
I am installing it.
I can tell you it is very painful. After pressing "rake", you need to wait 2, maybe 3 hours until all the dependencies are compiled. I am still installing it right now in my free tier server. You have to make sure you have that time to keep your terminal busy out there!
However, it is the only working solution so far for me. It is installing automatically for real.
I also tried couchbase but it has extremely poor post-install instructions. The server start but then what? I couldn't find the files, configs, or install directories. And, how do I access it?
Sorry to hear about the experience you are having! We have recently been focused on making Couchbase highly performant and scalable, so we understand developer-experience pain points such as documentation. Hopefully these two step-by-step guides will help!
This is on how to install the Couchbase Server and Couchbase Sync Gateway Amazon AMI on AWS and then how to connect Couchbase Sync Gateway to a mobile application:
Part 1 : Database on Amazon: Installing Couchbase AMI on AWS
The first part goes over how to install and access the Couchbase Web Console.
Part 2 : Database on Amazon: Connecting Couchbase Sync Gateway to Couchbase AMI on AWS
The second part goes over how to access the Couchbase contents/directory
You mentioned CouchDB and Couchbase together in this thread and they have different APIs but the Couchbase Sync Gateway component would be able to sit in front of CouchDB through the REST APIs as another option.
For those specifically installing on AWS Linux 2
Installing Couchdb on AWS Linux 2
This page uses Apache Couchdb binary installation
Instructions
Using the Centos installation instructions.
Create the bintray-apache-couchdb-rpm.repo file in the /etc/yum.repos.d directory
Fill in the full path to the repository rather than using the $releasever and $basearch variables.
[bintray--apache-couchdb-rpm]
name=bintray--apache-couchdb-rpm
baseurl=http://apache.bintray.com/couchdb-rpm/el7/x86_64/
gpgcheck=0
repo_gpgcheck=0
enabled=1
Yum install after enabling epel
sudo yum update && sudo yum install -y couchdb
Continue with the CouchDB setup and configuration as normal

Resources