Where is the source code for apache hadoop examples - hadoop

Can anyone please direct me to the source code for the Apache Hadoop YARN examples? The 2.2.0 distribution comes with a jar named hadoop-mapreduce-examples-2.2.0.jar, and I am trying to find the source code for the examples in it.
Any pointers would be helpful.
Thanks, Amit

Did you look at the source code in SVN? Here it is.
http://svn.apache.org/viewvc/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples/src/

Mirror link on GitHub (for Git users):
https://github.com/apache/hadoop/tree/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples

Here is the SVN link for checking the source code out to your local machine.
On Debian or Ubuntu you can install svn with the first command below, then run the checkout.
sudo apt-get install subversion
svn checkout http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/hadoop-mapreduce-examples
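The links above point at trunk, while the question is about the 2.2.0 jar. Here is a minimal sketch for building the GitHub URL of the examples module at a given tag. `examples_url` is a hypothetical helper, and the tag naming scheme (`release-2.2.0`) is an assumption that should be checked against the repository's tag list.

```shell
#!/bin/sh
# Hypothetical helper: print the GitHub browse URL of the examples module
# for a given tag or branch. The release tag format (release-X.Y.Z) is an
# assumption; confirm it in the repository before relying on it.
examples_url() {
    ref="$1"
    printf 'https://github.com/apache/hadoop/tree/%s/hadoop-mapreduce-project/hadoop-mapreduce-examples\n' "$ref"
}
```

For example, `examples_url trunk` prints the mirror link given above, and `examples_url release-2.2.0` would print the matching path for that tag, if the tag exists under that name.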

Related

mapr installation with mapr-setup script

I'm facing an issue when running the MapR installer: it is unable to read from the repository. Please check the link below for the detailed error, and let me know how to overcome this.
http://justpaste.it/q6vl
You may try to clean the repo cache. If it is on RedHat, run the command "yum clean all" and attempt the install again.

Installing serf with scons as a prerequisite for SVN 1.8.9 -> no serf binary

I am running SVN 1.8.9 on Mac OSX 10.8.5. Currently the command "svn log" fails in a given repo with the error message:
svn: E170000: Unrecognized URL scheme for 'https://...'
My research showed that this is due to SVN having been compiled from source without the flag "--with-serf".
So, I downloaded and built serf (with its dependencies APU and APR) using the scons build tool as per the instructions. All went fine, but after installation, there is no serf binary command available. When I type 'serf' in the shell, I get the command not found error. Searching for a serf binary on my machine also doesn't give any results.
What might have gone wrong during the installation?
Where should the binary be, and why isn't it there?
Are there any workarounds to install SVN with serf?
(I tried 'brew install --build-from-source svn', but this doesn't seem to include the serf dependency either)
Thanks a lot in advance.
Cheers,
Martin
Here is what I did:
Download the latest SVN.
Download the latest SCons.
cd ~/Downloads/
tar -zxvf "latest SCONS".tar.gz
tar -zxvf "latest SVN".tar.gz
cd "latest SCONS"
python setup.py install
cd ~/Downloads/"latest SVN"
sh get-deps.sh serf
cd serf
scons install
cd ..
./configure --with-serf
make install
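Serf is a library that svn links against, so no standalone `serf` command is expected after installing it. One way to check whether a rebuilt svn picked it up is to look for the `ra_serf` module in the output of `svn --version`. This is a minimal sketch, and `has_serf_support` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `svn --version` on stdin and
# succeed (exit status 0) only if the ra_serf access module is listed.
has_serf_support() {
    grep -q 'ra_serf'
}
```

Usage: `svn --version | has_serf_support && echo "http/https support is built in"`. If the check fails, the svn on your PATH was built without serf, or an older svn is shadowing the new one.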
I got the same error here and solved it by following this post:
https://ahmadawais.com/installing-svn-subversion-on-yosemite-after-removing-the-old-version/
I had to remove some old references to subversion inside:
/usr/local/include/subversion-1/
/usr/local/include/serf-1/
Building Subversion is a pain due to the dozens of dependency issues. Usually Apache httpd has to be rebuilt with Subversion too, and then there's the APR library.
The easiest solution is to download a package that has everything you need. CollabNet doesn't have a Macintosh server package, but Wandisco does. (Look for Yosemite down the Macintosh list). This will include Apache, Subversion, and the Subversion client all in one package.
I haven't used Wandisco's package before. However, I can tell you that CollabNet installs everything under /opt/collabnet including a new and complete Apache server. This also sets up /etc/init.d to start this Apache server and disables the original. I assume a similar thing happens with Wandisco (although Mac OS X doesn't use /etc/init.d, but Launch Services).
This is probably way easier than attempting to configure your Mac with everything you need for Subversion.

how to determine the cloudera minor release in the one click install debian package ? (i.e., 5.1 ? 5.2 ?)

I've managed to get a Cloudera single-node Hadoop cluster up and running from this package: http://archive.cloudera.com/cdh5/one-click-install/precise/amd64/cdh5-repository_1.0_all.deb
My colleagues asked me which minor release of Cloudera this installs, and I am stumped. Is there some info, readme, config, or license file that gives this information for Cloudera Hadoop distros once they are installed?
Or maybe someone just knows which minor release the above URL will install (if you could provide that info along with a link to a documentation source that would be fantastic.)
thanks in advance
-chris
The one-click install repo currently points to the latest Cloudera version, which is 5.3.0 as of earlier this week.
To check the version you installed, just list the package name. There should be some version number like '5.2.x' appended to the package name. An example command:
dpkg -l | grep 'cloudera'
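If the package listing is long, the release can be pulled out of the version strings instead of read by eye. This is a minimal sketch, assuming the CDH packages embed a token such as `cdh5.3.0` in their version string. `cdh_versions` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read package listing lines on stdin and print each
# distinct "cdhX.Y.Z" token found in them, one per line.
# Assumes the packages carry the CDH release in their version string.
cdh_versions() {
    grep -oE 'cdh[0-9]+\.[0-9]+\.[0-9]+' | sort -u
}
```

Usage: `dpkg -l | cdh_versions`. More than one line of output would mean packages from different CDH releases are installed side by side.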

How to uninstall Hadoop 1.0.0

I set up my Hadoop clusters with Hadoop 2.0.2. Then, today, I tried to test 1.0.0, so I downloaded the deb file from the Hadoop website and installed it. It messed everything up.
Now, when I type "which -a hadoop" I get 2 results
one pointing to my old Hadoop installation folder
and the other one pointing to /usr/bin/hadoop.
So the question is: how do I get rid of Hadoop 1.0.0 completely?
Try using dpkg -r hadoop; this should remove the Hadoop package from the system, but leave the config files intact. If you want to lose the config files as well, try dpkg -P hadoop instead.
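Before removing anything, it can help to see which of the two hadoop entries on the PATH actually came from the deb. This is a minimal sketch that lists every match on the PATH, roughly what `which -a` reports. `find_on_path` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print every executable file named $1 found on PATH,
# in lookup order. The body runs in a subshell so the IFS change stays local.
find_on_path() (
    name="$1"
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
)
```

Each path printed by `find_on_path hadoop` can then be passed to `dpkg -S` to see which package owns it. A path that dpkg does not know about belongs to the manual installation and is not touched by `dpkg -r` or `dpkg -P`.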
> $HADOOP_HOME
> /home/shiv/hadoop
> sudo rm -r /home/shiv/hadoop
And Hadoop is uninstalled!
I struggled with this for quite a while and then decided to share it here:
The trick is to basically delete all the symlinks pointing back to locations where HDP components reside since that is what causes 80% of the problem. Here is a step by step tutorial for that:
http://www.yourtechchick.com/hadoop/how-to-completely-remove-and-uninstall-hdp-components-hadoop-uninstall-on-linux-system/
Hope that helps!

Is there any link to download ab Apache benchmark

Can anyone please provide a direct link to download the ab.exe Apache benchmark utility?
On Ubuntu, I can install ab without installing all of Apache via the apache2-utils package. So:
sudo apt-get install apache2-utils
Just download Apache (www.apache.org). It comes with it (in ApacheX.X/bin)
...Guessing (from one of your other questions) that you're using a Mac... there appear to be instructions here:
http://switch.richard5.net/isp-in-a-box-v2/installing-apache-on-mac-os-x/
(if not, I can probably help with a Windows installation, but in general, Google is your friend!)
A list of mirrors for the windows binaries can be found here.
There are some basic instructions here:
http://www.ricocheting.com/how-to-install-on-windows/apache
...basically, install it, and the ab.exe will be in the 'bin' subdirectory of the installation
TMB nailed it with the link to XAMPP in the following question: How to install apache bench on windows 7?
For Windows users there is no direct binary download.
One has to install the Apache server via the MSI setup package, grab ab.exe from the Program Files/Apache Group/bin folder, and uninstall Apache again.
link to the msi package I used was once here:
http://mirror2.klaus-uwe.me/apache//httpd/binaries/win32/httpd-2.0.65-win32-x86-no_ssl.msi
but this is 404 as of today.
http://mirror2.klaus-uwe.me/apache//httpd/binaries/win32/
might help.
Download the Apache MSI-package (https://httpd.apache.org/download.cgi -> Binaries for the version you want -> win32 directory -> httpd-...-win32-x86-no_ssl.msi), open this .msi file with 7-Zip, find ab.exe, select it, and copy it to wherever you want.
I've just done that with http://.../httpd/binaries/win32/httpd-2.2.25-win32-x86-no_ssl.msi, and it works for me.
CentOS and Fedora users can find Apache Benchmark in package httpd-tools.
Example:
# CentOS
sudo yum install httpd-tools
# Fedora
sudo dnf install httpd-tools
The installed Apache Benchmark binary is invoked as ab.
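Once ab is installed, a quick run confirms that it works. This is a minimal sketch for pulling the throughput figure out of its report. `requests_per_second` is a hypothetical helper name, and the target URL in the usage line is a placeholder for a server you control.

```shell
#!/bin/sh
# Hypothetical helper: read an ab report on stdin and print the number from
# its "Requests per second:" line.
requests_per_second() {
    awk -F'[: ]+' '/^Requests per second:/ { print $4 }'
}
```

Usage: `ab -n 100 -c 10 http://localhost:8080/ | requests_per_second`, which sends 100 requests with 10 in flight at a time. Note that ab expects the URL to include a path, so keep the trailing slash.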
Get the .msi from the archive site:
http://archive.apache.org/dist/httpd/binaries/win32/
Once it is installed, open Windows Explorer and go to the installation path, for example:
C:\Program Files (x86)\Apache Software Foundation\Apache2.2\bin\ab.exe
