I have been following the official documentation as well as DigitalOcean's tutorial, but I could not follow their lead. Each of them suggests editing a file and then running Hadoop:
etc/hadoop/hadoop-env.sh in the dist directory.
I am unable to find such a file anywhere in the extracted directory, for the latest stable release as well as for the 2.7.7 release.
Where is etc/hadoop/hadoop-env.sh?
The paths in the guide may or may not match the actual bundle packaging. Run find over the extracted tar and see what it turns up.
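For example, a quick check (the extraction path is an assumption; adjust it to wherever you unpacked the tarball):
# Prints every hadoop-env.sh under the extracted tree, if one exists.
find ./hadoop-2.7.7 -name hadoop-env.sh
If this prints nothing, you most likely have the source distribution rather than the binary one.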
I am trying to follow this example, but I've hit a problem. I am trying to compile the ODL controller, but the file structure has changed compared to previous versions and I don't know which directory I have to be in to compile the controller.
I am following
git clone https://git.opendaylight.org/gerrit/p/controller.git
Check that the used Yang tools version is >= 0.5.8-SNAPSHOT.
But I have 0.8.0 (downloaded today from the same link).
And then I have to do this to compile the ODL controller:
cd controller/opendaylight/distribution/opendaylight
mvn clean install
But this path doesn't exist in the version I have downloaded.
In what directory do I have to be to run mvn clean install?
The ping example wiki is old and outdated. It dates from when everything except yangtools was in the controller project, before ODL was converted to use Karaf, so the controller/opendaylight/distribution/opendaylight directory is long gone. If you want to create and run the ping example, you would create a Karaf feature and run the Karaf distro in the controller project. You can follow what is done with the toaster sample and its associated wiki, which is fairly up to date: https://wiki.opendaylight.org/view/OpenDaylight_Controller:MD-SAL:Toaster_Step-By-Step.
Just run 'mvn clean install' in the root dir (so, the "controller" dir).
Also, to be safe, I'd delete the "repository" directory in your .m2 dir (usually ~/.m2/repository).
Finally, make sure your Maven settings.xml file is correct. Here's a link for that.
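Putting that together, a minimal sketch (assuming the repo was cloned into ./controller as above):
# Clear Maven's local cache, then build from the repo root.
rm -rf ~/.m2/repository
cd controller
mvn clean install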
I downloaded Hadoop 2.7, and every installation guide I have found mentions the /etc/hadoop/.. directory, but the distribution I downloaded doesn't have this directory.
I tried Hadoop 2.6 as well, and it doesn't have this directory either.
Should I create these directories?
Caveat: I am a complete newbie!
Thanks in advance.
It seems you have downloaded the source. Build the Hadoop source and then you will get that folder.
To build the Hadoop source, refer to the BUILDING.txt file available in the Hadoop package that you downloaded.
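A minimal sketch of that build (the version in the path is an example; prerequisites such as the JDK, Maven, and Protocol Buffers are listed in BUILDING.txt):
# From the top of the extracted source tree, build the binary distribution.
cd hadoop-2.6.0-src
mvn package -Pdist -DskipTests -Dtar
# The binary layout, including etc/hadoop/, should land under hadoop-dist/target/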
Try downloading hadoop-2.6.0.tar.gz instead of hadoop-2.6.0-src.tar.gz from the Hadoop archive. As @Kumar has mentioned, you might have the source distribution.
If you don't want to compile Hadoop from source, then download hadoop-2.6.0.tar.gz from the link given above.
Try downloading the compiled Hadoop from the link below instead of the source:
http://a.mbbsindia.com/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
You will get the etc/hadoop/.. path in it.
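For example (the mirror link is the one above; the Apache archive hosts the same file):
# Download the binary tarball, extract it, and confirm the config directory exists.
wget http://a.mbbsindia.com/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar -xzf hadoop-2.6.0.tar.gz
ls hadoop-2.6.0/etc/hadoop/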
Download it from the Apache website:
Apache Hadoop link
OK. Here's the problem and it's driving me crazy!!!
I followed the instructions online, installed Hadoop, and when running the test it said the Snappy native library can't be loaded.
It said I have to install Snappy first and then install hadoop-snappy.
I downloaded snappy-1.0.4 from Google Code and did the following:
cd ../snappy-1.0.4
./configure
make
sudo make install
Then the problem comes when I run:
mvn package -Dsnappy.prefix=/usr/local
The post online said that by default Snappy should be installed in /usr/local.
But I got the following error, and no matter how I change the path, I still get the error:
The goal you specified required a project to execute but there's no POM in the directory. Please verify you invoked the maven from the correct directory.
Is it the wrong directory for mvn? Or an improper Snappy install? It says a POM is missing, which should be an .xml file that I can't find anywhere.
Please help!
Alright, so looking at that page, you are in the wrong directory.
The directory you should be in for that step is "hadoop-snappy", which you can see has a pom.xml; you can verify this on GitHub: https://github.com/electrum/hadoop-snappy.
So, follow these steps from the guide you showed me:
Download it (hadoop-snappy) from GitHub
Install libtool, and make sure ‘libtoolize’ works
Install Maven 3 if necessary
Then change your directory to hadoop-snappy and run the command you were trying before (see the sketch below).
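A sketch of those steps, assuming Snappy was installed to /usr/local as above:
# Clone the project and build from the directory that contains pom.xml.
git clone https://github.com/electrum/hadoop-snappy.git
cd hadoop-snappy
mvn package -Dsnappy.prefix=/usr/local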
I have Hadoop installed and testing fine, but I am unable to find any instructions for a n00b on:
How to set up Cascading and cascading.jruby: where to place the Cascading JARs, and how to configure Jading to build the Ruby assemblies correctly?
Is anyone using Jenkins to build this automatically?
Edit: more details
I'm trying to build the example word count job from https://github.com/etsy/cascading.jruby
I've:
installed Hadoop, and run the tests successfully
installed JRuby
run gem install cascading.jruby
installed Jading - https://github.com/etsy/jading
installed ant
created the wordcount sample wc.rb
run jade to compile wc.rb to a jar:
jade wc.rb
I get the following compile error:
Buildfile: build.xml does not exist!
Build failed
RuntimeError: Ant retrieve failed
(root) at /usr/bin/hjade:89
This makes sense looking at the jade code, but it isn't covered in the example usage. What am I missing here?
Sorry for the delay; this is my first answer, here.
The issue you describe, Jading not being able to locate its Ant build script when called from a symlink, is indeed an issue. I'd recommend just adding your Jading clone to your PATH rather than creating symlinks (or submit a pull request to fix the issue!).
To address some of your other concerns, I've created a Getting Started page in the Jading wiki which may be of some help. It walks you through getting up and running with local and remote cascading.jruby jobs without installing anything besides the prereqs (Java, Ant, JRuby, and the Hadoop client+config). Included now is a full example wordcount script that should function both locally and on a Hadoop cluster, and has been tested on Etsy's own internal cluster.
And backing up further still to address your question about Jenkins: yes, at Etsy we use Jenkins to build and deploy cascading.jruby (and Scalding) to our cluster. However, that build process does not currently use Jading to produce the job jar. Our build predated Jading, and Jading was an attempt to release a cleaner version of the process we go through to build that jar. Our build could easily use Jading (and the original examples came from actual uses in our code), but we have slightly different requirements for the artifacts produced by our build.
If you have any other issues with Jading, feel free to submit issues or pull requests to the github project.
If you are using JRuby, you must be using Bundler as well. In that case you can add cascading.jruby as a dependency in your Gemfile (a sketch follows below).
You could also try installing it from your project folder:
gem install cascading.jruby
Hope this helps.
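A minimal sketch of the Bundler route (assuming your project already has a Gemfile):
# Record the dependency, then install it.
echo "gem 'cascading.jruby'" >> Gemfile
bundle install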
I've got this working end to end now.
I had created symlinks to the hadoop and jading binaries in /usr/local/bin.
The scripts need to be run from their own directories in order to find the supporting files.
I.e. the following works (assuming the cascading.jruby example is in ~/dev/cascading.jruby.demo/wc.rb):
cd /usr/local/jading
./jade ~/dev/cascading.jruby.demo/wc.rb
# creates a jade.jar locally in jading folder
cd /usr/local/hadoop
./bin/hadoop jar /usr/local/jading/jade.jar ~/dev/cascading.jruby.demo/wc.rb ~/dev/cascading.jruby.demo/sampledata/in.txt
I am trying to build a Hadoop library of all the jar files that I need to build a map/reduce job in Eclipse.
What are the .jar files that I need, and in what folders of a single-node CDH4 install on Ubuntu do they live?
Assuming you've downloaded the CDH4 tarball distro from https://ccp.cloudera.com/display/SUPPORT/CDH4+Downloadable+Tarballs
Unpack the tarball
Locate the build.properties file in the unpacked directory:
hadoop-2.0.0-cdh4.0.0/src/hadoop-mapreduce-project/src/contrib/eclipse-plugin
Add a property to this file for your eclipse installation directory:
eclipse.home=/opt/eclipse/jee-indigo-SR2
Finally run ant from the hadoop-2.0.0-cdh4.0.0/src/hadoop-mapreduce-project directory to build the jar
You'll now have a jar in the hadoop-2.0.0-cdh4.0.0/src/hadoop-mapreduce-project/build/contrib/eclipse-plugin/ folder
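Putting those steps together, a sketch (the eclipse.home value is the example from above):
# From the unpacked tarball: append the property, then build with ant.
cd hadoop-2.0.0-cdh4.0.0/src/hadoop-mapreduce-project
echo "eclipse.home=/opt/eclipse/jee-indigo-SR2" >> src/contrib/eclipse-plugin/build.properties
ant
ls build/contrib/eclipse-plugin/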
To finally answer your question, the dependency jars are now in:
hadoop-2.0.0-cdh4.0.0/src/hadoop-mapreduce-project/build/contrib/eclipse-plugin/
And to be really verbose, if you want the list, see this pastebin.