I'm working on a recommender system using Apache Flink. The implementation runs when I test it in IntelliJ, but I would now like to run it on a cluster. I also built a jar file and tested it locally to see whether everything works, but I ran into a problem.
java.lang.NoClassDefFoundError: org/apache/flink/ml/common/FlinkMLTools$
As we can see, the class FlinkMLTools used in my code isn't found when the jar runs.
I built this jar with Maven 3.3.3 using mvn clean install, and I'm using version 0.9.0 of Flink.
First Trail
My global project contains other projects, and this recommender is one of the sub-projects. Because of that, I have to run mvn clean install in the folder of the right project; otherwise Maven always builds the jar of another project (and I don't understand why). So I'm wondering whether there is a way to tell Maven explicitly to build one specific sub-project of the global project. Perhaps the path to FlinkMLTools comes from something declared in the pom.xml file of the global project.
Any other ideas?
The problem is that Flink's binary distribution does not contain the libraries (flink-ml, gelly, etc.). This means that you either have to ship the library jar files with your job jar or that you have to copy them manually to your cluster. I strongly recommend the first option.
Building a fat-jar to include library jars
The easiest way to build a fat jar which does not contain unnecessary jars is to use Flink's quickstart archetype to set up the project's pom.
mvn archetype:generate -DarchetypeGroupId=org.apache.flink \
-DarchetypeArtifactId=flink-quickstart-scala -DarchetypeVersion=0.9.0
will create the structure for a Flink project using the Scala API. The generated pom file will have the following dependencies.
<dependencies>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-scala</artifactId>
<version>0.9.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala</artifactId>
<version>0.9.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients</artifactId>
<version>0.9.0</version>
</dependency>
</dependencies>
You can remove flink-streaming-scala and instead insert the following dependency tag in order to include Flink's machine learning library.
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-ml</artifactId>
<version>0.9.0</version>
</dependency>
When you now build the job jar with mvn package, the generated jar should contain the flink-ml jar and all of its transitive dependencies.
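Before submitting the job, it can help to check that the class from the stack trace really ended up in the built jar. The following is a minimal sketch, assuming unzip is installed; the function names class_to_entry and jar_has_class and the jar path in the usage comment are hypothetical and not part of Flink or Maven.

```shell
# Convert a fully qualified class name into the entry path it has inside a jar,
# e.g. a.b.C becomes a/b/C.class
class_to_entry() {
  printf '%s.class\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Succeed if the jar (first argument) lists the class (second argument).
# Assumes unzip is available on the PATH.
jar_has_class() {
  unzip -l "$1" | grep -qF "$(class_to_entry "$2")"
}

# Usage (hypothetical jar name; single quotes keep the trailing $ literal):
# jar_has_class target/my-job-1.0.jar 'org.apache.flink.ml.common.FlinkMLTools$' && echo "class is packaged"
```

If the check fails, the flink-ml dependency was most likely not bundled into the jar.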
Copying the library jars manually to the cluster
Flink includes all jars which are located in the <FLINK_ROOT_DIR>/lib folder in the classpath of the executed jobs. Thus, in order to use Flink's machine learning library you have to put the flink-ml jar and all needed transitive dependencies into the /lib folder. This is rather tricky, since you have to figure out which transitive dependencies are actually needed by your algorithm and, consequently, you will often end up copying all transitive dependencies.
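If you go the manual route anyway, the copying step could look roughly like the sketch below. It assumes the dependencies were first collected into a directory, for example with the maven-dependency-plugin as shown in the comment. The function name copy_jars_to_flink_lib and the directory names are hypothetical. It skips jars whose file name already exists in lib so that jars shipped with Flink are not overwritten.

```shell
# Collect the runtime dependencies first, for example with:
#   mvn dependency:copy-dependencies -DoutputDirectory=target/deps -DincludeScope=runtime
# (target/deps is an arbitrary directory chosen for this sketch)

# Copy every jar from a dependency directory into <FLINK_ROOT_DIR>/lib,
# leaving jars that already exist there untouched.
# Arguments: dependency directory, Flink root directory
copy_jars_to_flink_lib() {
  mkdir -p "$2/lib"
  for jar in "$1"/*.jar; do
    [ -e "$jar" ] || continue          # no jars matched the pattern
    name=$(basename "$jar")
    [ -e "$2/lib/$name" ] || cp "$jar" "$2/lib/"
  done
}

# Usage: copy_jars_to_flink_lib target/deps "$FLINK_ROOT_DIR"
```

This has to be repeated on every machine of the cluster, which is one more reason to prefer the fat jar.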
How to build a specific sub-module with maven
In order to build a specific sub-module X from your parent project you can use the following command:
mvn clean package -pl X -am
-pl allows you to specify which sub-modules you want to build and -am tells maven to also build other required sub-modules. It is also described here.
In cluster mode, Flink does not put all library JAR files into the classpath of its workers. When executing the program locally in IntelliJ all required dependencies are in the classpath, but not when executing on a cluster.
You have two options:
Copy the FlinkML JAR file into the lib folder of every Flink TaskManager.
Build a fat JAR file for your application that includes the FlinkML dependencies.
See the Cluster Execution Documentation for details.
Related
I have an existing Eclipse Maven project that builds a jar. The pom has one major dependency, the BIRT 4.8.0-202010080643 runtime.
<dependency>
<groupId>com.customer.birt.runtime</groupId>
<artifactId>org.eclipse.birt.runtime</artifactId>
<version>4.8.0-202010080643</version>
</dependency>
So they pushed the artifact into their own Nexus; that's why the groupId is com.customer.birt.runtime.
I really don't know how the guy did that or which tools he used. I now want to update to BIRT 4.9. Replacing the above with the only one available:
<dependency>
<groupId>org.eclipse.birt</groupId>
<artifactId>birt-runtime</artifactId>
<version>4.9.0</version>
<type>pom</type>
</dependency>
does not go well. The two are packaged completely differently, although they come from the same big project. How can I use the 4.9 Maven dependency above in my simple BIRT project? I'm only building a service for a desktop application that is hosted and run within an RCP application. I started listing the individual Maven dependencies until the Java code compiled, which I managed to do, but a few unit tests that run and render with ReportEngine still fail because of dependencies missing at runtime. This is because ReportEngine loads APIs at runtime.
I decided to post here once I noticed that I would be declaring the separate dependencies in pom.xml blindly, which is very unreliable even if the unit tests pass.
Thank you so much!
M.Abdu
My current solution is what I described in the comments, or even simpler: I manually uploaded the birt-runtime jar to Nexus using my account at the customer and then put the exact same coordinates (groupId:artifactId:version) in my pom. I also added some other dependencies, depending on what my unit tests ask for at runtime, e.g. eclipse.platform, emf.core, w3c, batik.css, etc.
I am talking about running the build with mvn clean verify, which produces a jar file.
You get the jar from here:
https://search.maven.org/remotecontent?filepath=org/eclipse/birt/birt-runtime/4.9.0/birt-runtime-4.9.0.zip
pom in my case:
<dependency>
<groupId>org.eclipse.birt</groupId>
<artifactId>runtime</artifactId>
<version>4.9.0-20220502</version>
</dependency>
I've been trying to make a runnable jar from my project (in IntelliJ IDEA), which depends on an Oracle driver jar (ojdbc6). When I package the project with all of its dependencies, the only one that is excluded is that jar, which means my DB queries fail when I run it.
I've found several similar questions*, but I could not apply their answers because I don't know the groupId and artifactId of the Oracle jar.
*like this one: build maven project with propriatery libraries included
P.S.: The jar was added through IDEA's own feature (Project Structure -> Modules), and with that setup the project runs without failure. The problem starts with the packaging.
Short Solution: Try using the below:
<dependency>
<groupId>LIB_NAME</groupId>
<artifactId>LIB_NAME</artifactId>
<version>1.0.0</version>
<scope>system</scope>
<systemPath>${basedir}/WebContent/WEB-INF/lib/YOUR_LIB.jar</systemPath> <!-- give the path where your jar is present -->
</dependency>
Make sure that the combination of groupId, artifactId and version number is unique.
Long Solution:
Download the jar file to your machine.
Navigate using the prompt to the folder where you downloaded the jar.
Run the following command to install the jar to your local repository.
mvn install:install-file -DgroupId=com.oracle -DartifactId=ojdbc6 -Dversion=11.2.0.3 -Dpackaging=jar -Dfile=ojdbc6.jar -DgeneratePom=true
Finally, add the dependency to the pom.xml.
<dependency>
<groupId>com.oracle</groupId>
<artifactId>ojdbc6</artifactId>
<version>11.2.0.3</version>
</dependency>
Also, don't forget to use the -U option when running the build.
Because Tomcat expects mysql-connector-java in its lib/ directory, so that it can serve multiple projects, I declared my dependency as provided:
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>5.1.36</version>
<scope>provided</scope>
</dependency>
And I extracted the jar archive mysql-connector-java-x.x.x-bin.jar from the downloaded dependency and copied it into the lib folder of the Tomcat server:
cp ~/.m2/repository/mysql/mysql-connector-java/5.1.36/mysql-connector-java-5.1.36.jar lib/
But when I now run a build, the test phase fails because it cannot connect to the data store.
The build succeeds if I comment out the provided scope.
There must be a simple way around this...
UPDATE: I could run the Maven Tomcat 7 command mvn clean install tomcat7:run -Denv="preprod" after adding the mysql-connector-java dependency to the tomcat7-maven-plugin plugin. But I still cannot run the tests; I get a connection failure when the maven-surefire-plugin tests run.
I used both provided and test scopes as in:
<scope>provided test</scope>
and the connector is now also available in the test phase.
The XADisk library deployed on Maven Central is packaged as 'rar' instead of 'jar'. But I just need the jar (and possibly the sources) for the project I'm working on. I was wondering what the best (Maven-style) way is to deal with this dependency.
The jar files are available on Central but are not specified in the pom, so type="jar" doesn't work.
the pom is here: https://repo1.maven.org/maven2/net/java/xadisk/xadisk/1.2.2/xadisk-1.2.2.pom
and the jars can be found here: https://repo1.maven.org/maven2/net/java/xadisk/xadisk/1.2.2/
I can't reproduce your issue, maybe they changed something in the repo?
If I add the following dependency to my project:
<dependency>
<groupId>net.java.xadisk</groupId>
<artifactId>xadisk</artifactId>
<version>1.2.2</version>
</dependency>
then the JAR file gets packaged into my project.
By the way, if no "type" is specified for the dependency, Maven uses jar as the default.
I have a maven project B which is packaged as a war B.war and has a 'local' dependency A.jar. The pom for building A.jar has a dependency on restFB and it resolves properly while compiling.
<dependency>
<groupId>com.restfb</groupId>
<artifactId>restfb</artifactId>
<version>${com.restfb-version}</version>
</dependency>
However, when I package B.war, restFB's jar is not present in the WEB-INF/lib directory of B.war, and execution throws a NoClassDefFoundError. What is also baffling is that this happens only when I build on AWS Amazon Linux and not when I build on Ubuntu. There are similar questions on SO which suggest adding
<packaging>war</packaging>
which I already have, but it doesn't seem to solve the problem. Any ideas how to solve this?
The likely explanation is simple: on Ubuntu, you probably ran
mvn install
for A locally, so A is in the local .m2 repository there.
Does the AWS Amazon Linux machine also have this artifact in its local repository? If not, install it there and try to package again.
Also, you can explicitly include and exclude certain artifacts in the build configuration of the maven-war-plugin.
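A quick way to check this on each machine is to look for the jar at the place where Maven stores installed artifacts. The sketch below assumes the default local repository layout under ~/.m2/repository; the function name local_repo_jar and the coordinates com.example:A:1.0 are hypothetical, so substitute the real groupId, artifactId and version from A's pom.

```shell
# Print the path at which Maven stores a jar in the local repository.
# Arguments: groupId artifactId version [repository root]
# The repository root defaults to ~/.m2/repository.
local_repo_jar() {
  repo="${4:-$HOME/.m2/repository}"
  group_path=$(printf '%s' "$1" | tr '.' '/')   # dots in the groupId become directories
  printf '%s/%s/%s/%s/%s-%s.jar\n' "$repo" "$group_path" "$2" "$3" "$2" "$3"
}

# Usage with hypothetical coordinates for A:
# [ -f "$(local_repo_jar com.example A 1.0)" ] || echo "A is not installed here; run mvn install in A first"
```

Running the same check on Ubuntu and on Amazon Linux shows whether the two machines differ in this respect.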