How to run multiple java files at once? - hadoop

I have multiple .java files for my hadoop project. How do I execute them without using eclipse?
PS: I use the default ubuntu terminal

You will need to compile your .java files into a jar and then use the hadoop jar command to execute it. If your code has external dependencies, you will need to either use the -libjars flag or create a fat jar.

You can use javac *.java to compile all the files in the current working directory. As for executing them, use java filename
Where the filename does NOT have the .class at the end. IE, you have MyProgram.java which compiles into MyProgram.class you would type: java MyProgram
You want the run the main class of your project btw (slightly vague, but it might be the only one with a main method).

Following is the simple steps for compiling hadoop java files and executing programs:
Compile:
javac -classpath < HADOOP_INSTALL_DIR>/hadoop-< Version>.jar -d < OUTPUT_DIR_NAME> ( < YOUR_MUTLIPLE_JAVAFILES_PATH> --like *.java)
Build jar file:
jar cvf < YOUR_JAR_FILE_PATH_WITH_NAME> -C <(previous compiled output)OUTPUT_DIR_NAME>
Run Hadoop program using Hadoop Jar command:
hadoop jar < JAR_FILE_PATH> < MAIN_PROGRAM_NAME_IN_JAR> < INPUT_PARAMETERS_IF-ANY>
Hope this helps!

Related

Error: Unable to access jarfile build/libs/gs-spring-boot-0.1.0.jar?

I follow the instructions in https://spring.io/guides/gs/spring-boot/#scratch, but when it says to run:
./gradlew build && java -jar build/libs/gs-spring-boot-0.1.0.jar
the build fails with the above error.
There is message before the failure that says:
Deprecated Gradle features were used in this build, making it incompatible with Gradle 5.0.
See https://docs.gradle.org/4.8.1/userguide/command_line_interface.html#sec:command_line_warnings
but everyone online says that's just a warning.
The build doesn't appear to create or download build/libs/gs-spring-boot-0.1.0.jar.
Currently completely blocked on first attempt to use Gradle.
I just had this problem.
The tutorial is in error in what you need to run. It should be
$ gradlew build && java -jar build/libs/gs-rest-service-0.1.0.jar
I think that they updated the code, but forgot to update the tutorial.
I had the same issue when build a simple project with Maven on Intellij IDEA. (Ubuntu 18.04.2).
Just typed terminal (in project directory):
$ sudo mvn package
$ java -jar ./target/(your-project-name)-(<version> at pom.xml).jar
For example my project name is hello-world-spring and version name in pom.xml is <version>0.0.1-SNAPSHOT</version>, I have to type:
$ sudo mvn package
$ java -jar ./target/hello-world-spring-0.0.1-SNAPSHOT.jar
Maybe this method can work for gradle as well.
Please check the path of the jar file build/libs/gs-spring-boot-0.1.0.jar. For your case, the jar might be in a different folder. If your code is in a module in the main project, then the jar will be in the build folder of the module.
If you git clone the repo, then the tutorial works. If you "To start from scratch, move on to Build with Gradle.", then the tutorial doesn't work. There are missing setup steps.
I got the same issue and I changed the command to java -jar target/rest-service-0.0.1-SNAPSHOT.jar (I checked the .jar file in target folder and found that the file name was incorrect).
Parent folder of my project was having spaces in it's name, i changed it to the underscore and it worked.
Looked at the command line as it was in the official guide:
./gradlew clean build && java -jar build/libs/gs-actuator-service-0.1.0.jar
First, the above command line has two parts:
(1) ./gradlew clean build //Use gradle wrapper to build
(2) java -jar build/libs/gs-actuator-service-0.1.0.jar //To run an application packaged as a JAR file
Now, one might run into issues with one part or both parts. Separating them and running just on thing at a time helped troubleshoot.
(1) didn't work for my Windows, I did the following instead and that built the application successfully.
.\gradlew.bat clean build
Now moving to (2) java -jar build/libs/gs-actuator-service-0.1.0.jar
It literally means that "Run a jar file that is called gs-actuator-service-0.1.0.jar under this directory/path: build/libs/" Again, for Windows, this translates to build\libs\ , and there's one more thing that may catch you: The jar file name can be slightly different depending on how it was actually named by the configuration in initial/setting.gradle:
rootProject.name = 'actuator-service'
Note that the official guide changed it from 'gs-actuator-service' to 'actuator-service' in their sample code but hasn't updated the tutorial accordingly. But now you know where the jar file name comes from, that doesn't matter anymore, and you have the choice to rename it however you want.
Having all the factors adjusted, below is what eventually worked in my case:
java -jar build\libs\actuator-service-0.0.1-SNAPSHOT.jar
or
java -jar C:\MyWorkspace\Spring\gs-actuator-service\initial\build\libs\actuator-service-0.0.1-SNAPSHOT.jar //with fully qualified path
If you are curious where does "-0.0.1-SNAPSHOT" come from, here it is:
in build.gradle
version = '0.0.1-SNAPSHOT'
Again, you have the choice to modify it however you want. For example, if I changed it to 0.0.2-SNAPSHOT, the command line should be adjusted accordingly
java -jar build\libs\actuator-service-0.0.2-SNAPSHOT.jar
Reference: https://docs.oracle.com/javase/tutorial/deployment/jar/basicsindex.html
Because you are trying to execute .jar file that doesn't exist. After building the project go to ./build/libs and check the name of freshly built .jar file and then in your project directory run:
./gradlew build && java -jar build/libs/name-of-your-jar-file.jar
or you can set version property to empty string in your build.gradle file
version = ''
after that:
./gradlew build && java -jar build/libs/your-project-name.jar
For Windows, these commands solved the problem: "Error: Unable to access jarfile springboot.jar":
cd target
java -jar springboot-0.0.1-SNAPSHOT.jar
run ./mvnw package
Now a folder named target is created and you can see a jar file inside it.
then execute java -jar target/<jarfilename>

Is maven JAR runnable on hadoop?

To produce jar from hadoop mapreduce program(mapreduce wordcount example) i used maven.
Here i successfully done 'clean' and 'install'.
Also 'build' successfully by running as a Java Application by including arguments(input and output).
And it provided expected result successfully.
Now the problem is not running on hadoop.
Giving the following error:
Exception in thread "main" java.lang.ClassNotFoundException: WordCount
Is maven JAR runnable on hadoop?
Maven is a build tool which creates a Java Artifact. Any JAR containing the hadoop dependencies and the class having main() method in the Manifest file should be working with the hadoop.
Try running your JAR using the below command
hadoop jar your-jar.jar wordcount input output
where "wordcount" is the name of the class with main method,"input" and "output" are the arguments.
Two things
1) I think you are missing the package details before the classname. Copy your package name and put it before the classname and it should work.
hadoop jar /home/user/examples.jar com.test.examples.WordCount /home/user/inputfolder /home/user/outputfolder
PS: If you are using a jar for which the source code is not available with you, you can do
jar -tvf /home/user/examples.jar
and it will print all the classses with their folder names. Replace the "/" with "." (dot) and you get the package name. But this needs JDK (not JRE) in the PATH.
2) You are trying to run a MapReduce program from a Windows prompt. Are you sure you have Hadoop installed on your Windows?

How do I run a java program in cmd?

I have created a program in Java, but it is not taking the inputs correctly in edit-plus(compiler) so now I want it to run in cmd. My JSK file is at : C:\Program Files\Java\jdk1.7.0_21\bin and my Java file is at: C:\TurboC4\TC\java new programs
Please tell me the steps to run it in cmd.
On the command line use:
java -jar path/to/your/jar_file.jar
if you do not have a jar file, than you have to compile first your Java classes:
javac -g Foo.java
if you have just a single file (containing a static void main()) than you can simply run it with:
java path/to/your/compiled_class_file [<command line args>, ...]
Note: Run the command above without .class extension. i.e.
java Foo
if you want to generate a jar file from your compiled .class files run:
jar cf jar-file input-file(s)
However, I would recommend you to use a IDE that compiles, packs and runs your code for you automatically with one click. i.e. IntelliJ or Eclipse
If your class is not into a package and is compiled as java.class in C:\TurboC4\TC\:
cd C:\TurboC4\TC\
C:\Program Files\Java\jdk1.7.0_21\bin java
If you just want to run your single java class as a command line program, then the answer is in This question.
Example:
java my.class.HelloWorld
If you have compiled an entire java project into a JAR file, check this Stackoverflow question for the answer.
Example:
java -cp c:\location_of_jar\myjar.jar com.mypackage.myClass

Unable to execute Map/Reduce job

I've been trying to figure out how execute my Map/Reduce job for almost 2 days now. I keep getting a ClassNotFound exception.
I've installed a Hadoop cluster in Ubuntu using Cloudera CDH4.3.0. The .java file (DemoJob.java which is not inside any package) is inside a folder called inputs and all required jar files are inside inputs/lib.
I followed http://www.cloudera.com/content/cloudera-content/cloudera-docs/HadoopTutorial/CDH4/Hadoop-Tutorial/ht_topic_5_2.html for reference.
I compile the .java file using:
javac -cp "inputs/lib/hadoop-common.jar:inputs/lib/hadoop-map-reduce-core.jar" -d Demo inputs/DemoJob.java
(In the link, it says -cp should be "/usr/lib/hadoop/:/usr/lib/hadoop/client-0.20/". But I don't have those folders in my system at all)
Create jar file using:
jar cvf Demo.jar Demo
Move 2 input files to HDFS
(Now this is where I'm confused. Do I need to move the jar file to HDFS as well? It doesn't say so in the link. But if it is not in HDFS, then how does the hadoop jar .. command work? I mean how does it combine the jar file which is in Linux system and the input files which are in HDFS?)
I run my code using:
hadoop jar Demo.jar DemoJob /Inputs/Text1.txt /Inputs/Text2.txt /Outputs
I keep getting ClassNotFoundException : DemoJob.
Somebody please help.
The class not found exception only means that some class wasn't found when class DemoJob was loaded. The missing class could have been a class referenced (imported, for example) by DemoJob. I think the problem is that you don't have the /usr/lib/hadoop/:/usr/lib/hadoop/client-0.20/ folders (classes) in your class path. It's the classes that should be there but aren't that probably are triggering the class not found exception.
Finally figured out what the problem was. Instead of creating a jar file from a folder, I directly created the jar file from the .class files using jar -cvf Demo.jar *.class
This resolved the ClassNotFound error. But I don't understand why it was not working earlier. Even when I created the jar file from a folder, I did mention the folder name when executing the class file as:hadoop jar Demo.jar Demo.DemoJob /Inputs/Text1.txt /Inputs/Text2.txt /Outputs

How run mahout in action example ReutersToSparseVectors?

I want run "ReutersToSparseVectors.java". I can compile and created JAR file without problem.
I compiled this file by below command:
javac -classpath hadoop-core-0.20.205.0.jar:lucene-core-3.6.0.jar:mahout-core-0.7.jar:mahout-math-0.7.jar ReutersToSparseVectors.java
created JAR file with below command:
jar cvf ReutersToSparseVectors.jar ReutersToSparseVectors.class
When I write java -jar ReutersToSparseVectors.jar to run, give me below error:
Failed to load Main-Class manifest attribute from
ReutersToSparseVectors.jar
Do you can help me to solve this problem?
IF this example can run with hadoop, please me that how i can run this with hadoop.
instead of using -jar option, then it's better to to run:
java -cp mahout-core.jar:... mia.clustering.ch09.ReutersToSparseVectors
or you can use mvn exec:java command, as described in README for examples...
mvn exec:java -Dexec.mainClass="mia.clustering.ch09.ReutersToSparseVectors"
Or you can run this file directly from your IDE (assuming, that you correctly imported Maven project).
P.S. your command isn't working, because to run with -jar switch, the .jar file should have special entry in manifest that describes that class should be started by default...
P.P.S. It's better to use book's examples with Mahout 0.7, as they were tested for it. You can use it with version 0.7 if you need, by then you need to take source code from mahout-0.7 branch of repository with examples (link is above)

Resources