Linkage failure when running Apache Flink jobs - maven

I have a job developed in Flink 0.9 that uses the graph module (Gelly). The job runs successfully within the IDE (Eclipse), but after exporting it to a JAR using Maven (mvn clean install) it fails to execute on the local Flink instance with the following error:
"The program's entry point class 'myclass' could not be loaded due to a linkage failure"
java.lang.NoClassDefFoundError: org/apache/flink/graph/GraphAlgorithm
Any idea why this is happening and how to solve it?

It looks like the code of flink-gelly did not end up in your jar file.
The most obvious reason for this issue is the missing maven dependency in your project's pom file. But I assume the dependency is present, otherwise developing the job in the IDE would be impossible.
Most likely, the jar file has been created by the maven-jar-plugin, which is not including dependencies.
Try adding the following fragment to your pom.xml:
<build>
<plugins>
<!-- We use the maven-shade plugin to create a fat jar that contains all dependencies
except Flink and its transitive dependencies. The resulting fat jar can be executed
on a cluster. Change the value of mainClass if your program entry point changes. -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>2.3</version>
<executions>
<!-- Run shade goal on package phase -->
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<filters>
<filter>
<artifact>org.apache.flink:*</artifact>
<excludes>
<exclude>org/apache/flink/shaded/**</exclude>
<exclude>web-docs/**</exclude>
</excludes>
</filter>
</filters>
<transformers>
<!-- add Main-Class to manifest file -->
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>YOURMAINCLASS</mainClass>
</transformer>
</transformers>
<createDependencyReducedPom>false</createDependencyReducedPom>
</configuration>
</execution>
</executions>
</plugin>
</plugins>
</build>
<profiles>
<profile>
<!-- A profile that does everything correctly:
We set the Flink dependencies to provided -->
<id>build-jar</id>
<activation>
<activeByDefault>false</activeByDefault>
</activation>
<dependencies>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-java</artifactId>
<version>0.9-SNAPSHOT</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-core</artifactId>
<version>0.9-SNAPSHOT</version>
<scope>provided</scope>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-clients</artifactId>
<version>0.9-SNAPSHOT</version>
<scope>provided</scope>
</dependency>
</dependencies>
</profile>
</profiles>
Now, you can build the jar using mvn clean package -Pbuild-jar.
The jar file will now be located in the target/ directory.
You can manually check whether the jar (zip) file contains class files in /org/apache/flink/graph/
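For the shade plugin to bundle Gelly, its dependency has to stay in the default compile scope in the main dependencies section of the pom, outside the build-jar profile that marks the core Flink modules as provided. A minimal sketch, assuming the artifact ID flink-gelly and the same 0.9-SNAPSHOT version used in the profile above; adjust both to whatever your project actually declares:

```xml
<dependencies>
  <!-- Assumption: Gelly is published as org.apache.flink:flink-gelly.
       No <scope> element means compile scope, so the shade plugin
       copies these classes into the fat jar. -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-gelly</artifactId>
    <version>0.9-SNAPSHOT</version>
  </dependency>
</dependencies>
```

If this dependency were moved into the profile with the provided scope, the classes under org/apache/flink/graph/ would be left out of the jar again.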

Related

Missing "Artifacts" from uber jar using IntelliJ / Maven

I am developing a Flink application, and I'm very new to building Java applications.
I am using IntelliJ 2022.2.3 Community Edition, and Maven for dependency management.
I have the following dependencies in my POM file:
<!-- https://mvnrepository.com/artifact/com.amazonaws/amazon-sqs-java-messaging-lib -->
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>amazon-sqs-java-messaging-lib</artifactId>
<version>2.0.1</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.postgresql/postgresql -->
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>42.5.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-kinesisanalytics-runtime -->
<dependency>
<groupId>com.amazonaws</groupId>
<artifactId>aws-kinesisanalytics-runtime</artifactId>
<version>1.2.0</version>
</dependency>
When I build the artifact and view its contents, I notice that some of the dependencies are included and others are missing. I expect to see the PostgreSQL driver at org/postgresql/... but that folder does not exist.
I have a copy of the project where the artifacts folder does contain the expected folders and when I look at the project settings/artifacts/output layout view, the postgres jars are in the list, but not in my problem project?
I read "How can I create an executable/runnable JAR with dependencies using Maven?" and I don't have that section in the POM, but in my case, as I mentioned, the two projects I have seem to have different artifacts missing from the jar.
Sorry for my lack of correct terminology.
UPDATE:
I should add this section is in my POM
<!-- We use the maven-shade plugin to create a fat jar that contains all necessary dependencies. -->
<!-- Change the value of <mainClass>...</mainClass> if your program entry point changes. -->
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-shade-plugin</artifactId>
<version>3.1.1</version>
<executions>
<!-- Run shade goal on package phase -->
<execution>
<phase>package</phase>
<goals>
<goal>shade</goal>
</goals>
<configuration>
<artifactSet>
<excludes>
<exclude>org.apache.flink:flink-shaded-force-shading</exclude>
<exclude>com.google.code.findbugs:jsr305</exclude>
<exclude>org.slf4j:*</exclude>
<exclude>org.apache.logging.log4j:*</exclude>
</excludes>
</artifactSet>
<filters>
<filter>
<!-- Do not copy the signatures in the META-INF folder.
Otherwise, this might cause SecurityExceptions when using the JAR. -->
<artifact>*:*</artifact>
<excludes>
<exclude>META-INF/*.SF</exclude>
<exclude>META-INF/*.DSA</exclude>
<exclude>META-INF/*.RSA</exclude>
</excludes>
</filter>
</filters>
<transformers>
<transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
<mainClass>vendor.flink.StreamProcessingNoJoin</mainClass>
</transformer>
</transformers>
</configuration>
</execution>
</executions>
</plugin>

Maven: How to execute a dependency in a forked JVM?

Using exec-maven-plugin and its java goal, I execute a jar program that validates some files in my project. When the validation fails, it calls System.exit to return a non-zero return code.
The problem is that it executes in the same JVM as Maven, so when it calls exit, the whole build stops, since the java goal does not fork.
I configured it to execute with exec-maven-plugin and the java goal. The executable jar is in my Nexus repository, so I want to download it as a dependency in my pom.xml.
A very nice feature of configuring the dependency through exec-maven-plugin is that it downloads the jar and all its dependencies, so it isn't necessary to use the maven-assembly-plugin to bundle all jars into the executable.
How do I configure my pom.xml to execute a jar dependency and correctly stop when it fails?
I solved my problem. Basically, instead of using the java goal, I must use the exec goal, and run the java executable. The code below sets the classpath and the class with the main method.
This solution using the pom.xml and a Nexus repository has a lot of advantages over just handling a jar file for my users:
No need to install anything in the machine that will run it, be it a developer machine or a continuous integration one.
The validation tool developer can release new versions and it will be automatically updated.
The developer can turn it off with a simple parameter.
Also solves the original problem: the validation tool will execute in a separate process, so the maven process won't abort when it calls System.exit.
Here is a commented pom.xml:
<project xmlns="http://maven.apache.org/POM/4.0.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<groupId>com.company</groupId>
<artifactId>yourId</artifactId>
<version>1.0</version>
<properties>
<!--
Skip the validation by running Maven with the parameter below:
mvn integration-test -Dvalidation.skip
-->
<validation.skip>false</validation.skip>
<java.version>1.8</java.version>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
</properties>
<build>
<plugins>
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<version>1.6.0</version>
<executions>
<execution>
<id>MyValidator</id>
<phase>integration-test</phase> <!-- you can associate to any maven phase -->
<goals>
<goal>exec</goal> <!-- forces execution in another process -->
</goals>
</execution>
</executions>
<configuration>
<executable>java</executable> <!-- java must be in your PATH -->
<includeProjectDependencies>false</includeProjectDependencies>
<includePluginDependencies>false</includePluginDependencies>
<skip>${validation.skip}</skip>
<arguments>
<argument>-classpath</argument>
<classpath/> <!-- will include your class path -->
<mainClass>com.company.yourpackage.AppMain</mainClass> <!-- the class that has your main method -->
<argument>argument.xml</argument> <!-- any argument for your executable -->
</arguments>
</configuration>
</plugin>
</plugins>
</build>
<dependencies>
<dependency>
<!-- Specify your executable jar here -->
<groupId>com.company.validator</groupId>
<artifactId>validatorId</artifactId>
<version>RELEASE</version> <!-- you can specify a fixed version here -->
<type>jar</type>
</dependency>
</dependencies>
</project>
If you define more than one execution, you can run a specific one by passing its id: mvn exec:exec@MyValidator
I stumbled upon the same issue: System.exit halts Maven when using exec:java.
I experimented with the exec:exec goal and made it work with the following configuration:
(using exec-maven-plugin 3.1.0)
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>exec-maven-plugin</artifactId>
<executions>
<execution>
<id>generate-observability-docs</id>
<phase>prepare-package</phase>
<goals>
<goal>exec</goal>
</goals>
<configuration>
<executable>java</executable>
<arguments>
<argument>-jar</argument>
<argument>${settings.localRepository}/io/micrometer/micrometer-docs-generator/${micrometer-docs-generator.version}/micrometer-docs-generator-${micrometer-docs-generator.version}.jar</argument>
<argument>${micrometer-docs-generator.inputPath}</argument>
<argument>${micrometer-docs-generator.inclusionPattern}</argument>
<argument>${micrometer-docs-generator.outputPath}</argument>
</arguments>
</configuration>
</execution>
</executions>
<dependencies>
<dependency>
<groupId>io.micrometer</groupId>
<artifactId>micrometer-docs-generator</artifactId>
<version>${micrometer-docs-generator.version}</version>
<type>jar</type>
</dependency>
</dependencies>
</plugin>

Maven Assembly Plugin jar-with-dependencies -> No Dependencies in Jar

The following reference mentions the descriptorRef jar-with-dependencies. AFAIK it is a predefined assembly that packs all jar dependencies into a single big self-contained jar file. This is great if you have multiple dependencies and need to copy your project to another machine, because you don't need to update or delete obsolete libraries separately.
https://newfivefour.com/category_maven-assembly.html
I added the maven-assembly-plugin to my pom, and the MyTool.jar-with-dependency.jar is created. I expected that the jar contains all external dependencies, but it is the same as the normal MyTool.jar and does not contain any dependencies like apache.commons or apache.logging.
The important detail is that the dependencies' scope is set to provided. Without this it works as expected. But I rely on that scope later on, with the maven-dependency-plugin, to copy all dependencies in the provided scope to a specific directory.
[...]
<dependency>
<groupId>org.apache.logging.log4j</groupId>
<artifactId>log4j-core</artifactId>
<version>2.5</version>
<scope>provided</scope>
</dependency>
</dependencies>
<build>
<!--pluginManagement-->
<plugins>
<plugin> <!-- This is the plugin I added. -->
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<descriptorRefs>
<descriptorRef>jar-with-dependencies</descriptorRef>
</descriptorRefs>
</configuration>
<executions>
<execution>
<id>make-assembly</id>
<phase>package</phase> <!-- bind to the packaging phase -->
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
[...]
I use Apache Maven 2.2.1 (rdebian-14).
How can I include the dependencies from the provided scope? Or is there another solution?
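One approach sometimes used for this is a custom assembly descriptor in place of the predefined jar-with-dependencies one, since the predefined descriptor only collects runtime-scope dependencies. The sketch below is an assumption and has not been confirmed against Maven 2.2.1: the file name src/main/assembly/with-provided.xml and the id are made up for illustration, and the second dependencySet asks for the provided scope explicitly.

```xml
<!-- Hypothetical file: src/main/assembly/with-provided.xml -->
<assembly xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2 http://maven.apache.org/xsd/assembly-1.1.2.xsd">
  <id>jar-with-provided-dependencies</id>
  <formats>
    <format>jar</format>
  </formats>
  <includeBaseDirectory>false</includeBaseDirectory>
  <dependencySets>
    <!-- The project's own classes plus the usual runtime dependencies -->
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <useProjectArtifact>true</useProjectArtifact>
      <unpack>true</unpack>
      <scope>runtime</scope>
    </dependencySet>
    <!-- Additionally unpack dependencies declared with the provided scope -->
    <dependencySet>
      <outputDirectory>/</outputDirectory>
      <useProjectArtifact>false</useProjectArtifact>
      <unpack>true</unpack>
      <scope>provided</scope>
    </dependencySet>
  </dependencySets>
</assembly>
```

To try it, the plugin configuration would point at this file with a descriptors/descriptor element instead of the descriptorRefs element, which leaves the provided scope in the pom untouched for the maven-dependency-plugin step.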

liquibase maven plugin multiple changeLogFile

I'm using liquibase maven plugin to update the database changes via jenkins automated builds.
I have this in my pom.xml
<plugin>
<groupId>org.liquibase</groupId>
<artifactId>liquibase-maven-plugin</artifactId>
<version>3.4.2</version>
<dependencies>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>9.5</version>
</dependency>
</dependencies>
<configuration>
<changeLogFile>${basedir}/src/main/resources/schema.sql</changeLogFile>
<changeLogFile>${basedir}/src/main/resources/data.sql</changeLogFile>
<driver>org.postgresql.Driver</driver>
<url>jdbc:postgresql://${db.url}</url>
<promptOnNonLocalDatabase>false</promptOnNonLocalDatabase>
</configuration>
</plugin>
I need to run schema.sql before data.sql. When I run this locally it works. When I run it via Jenkins, the schema changeLogFile executes second, so in order to make it work I reversed the commands.
Question: What's the order of execution? Am I doing something wrong?
The official goal documentation specifies that only one entry is foreseen:
changeLogFile:
Specifies the change log file to use for Liquibase.
Type: java.lang.String
Required: No
Expression: ${liquibase.changeLogFile}
You can add further entries, but they will be ignored and Maven will not complain: it cannot validate the content of a plugin's configuration, because that part is defined by the plugin and not known to Maven upfront. In other words, the configuration section is generic.
To ensure a deterministic order and have both changeLogFile entries executed, you should specify several plugin executions, as follows:
<plugin>
<groupId>org.liquibase</groupId>
<artifactId>liquibase-maven-plugin</artifactId>
<version>3.4.2</version>
<dependencies>
<dependency>
<groupId>org.postgresql</groupId>
<artifactId>postgresql</artifactId>
<version>9.5</version>
</dependency>
</dependencies>
<configuration>
<driver>org.postgresql.Driver</driver>
<url>jdbc:postgresql://${db.url}</url>
<promptOnNonLocalDatabase>false</promptOnNonLocalDatabase>
</configuration>
<executions>
<execution>
<id>update-schema</id>
<phase>process-resources</phase>
<goals>
<goal>update</goal>
</goals>
<configuration>
<changeLogFile>${basedir}/src/main/resources/schema.sql</changeLogFile>
</configuration>
</execution>
<execution>
<id>update-data</id>
<phase>process-resources</phase>
<goals>
<goal>update</goal>
</goals>
<configuration>
<changeLogFile>${basedir}/src/main/resources/data.sql</changeLogFile>
</configuration>
</execution>
</executions>
</plugin>
Note: the configuration shared by all executions is declared once, outside the executions section; each execution then defines only its own additional configuration, which here is the changelog file.
The deterministic order is guaranteed by Maven: for the same plugin, for the same phase, the order of declaration in the POM will be respected.
However, these executions will now run in every build as part of the process-resources phase, which is probably not what you want. In that case it is better to move them to a profile, as follows:
<profiles>
<profile>
<id>liquibase-executions</id>
<build>
<defaultGoal>process-resources</defaultGoal>
<plugins>
<!-- MOVE HERE liquibase plugin configuration and executions -->
</plugins>
</build>
</profile>
</profiles>
And then execute the following (according also to your comment):
mvn -Pliquibase-executions -Ddb.url=IP:PORT/DB -Dliquibase.username=USERNAME

How to exclude GWT dependency code from OSGi bundle generated by Maven+BND?

I have several Maven modules with Vaadin library dependency in the root pom.xml file.
I'm trying to build a set of OSGI bundles (1 per Maven module) using Maven+BND.
I added this to my "root" pom.xml file:
<dependencies>
<dependency>
<groupId>com.vaadin</groupId>
<artifactId>vaadin</artifactId>
<version>6.6.6</version>
</dependency>
<dependency>
<groupId>com.google.gwt</groupId>
<artifactId>gwt-user</artifactId>
<version>2.3.0</version>
</dependency>
<dependency>
<groupId>org.apache.felix</groupId>
<artifactId>org.osgi.core</artifactId>
<version>1.0.0</version>
</dependency>
</dependencies>
Unfortunately, the resulting JAR files (bundles) include GWT (com.google.gwt) classes. This:
1) makes the bundles huge, with lots of duplicated dependencies.
2) generates thousands of build warnings about "split packages".
QUESTION: how to prevent adding GWT classes into my JAR files?
I tried setting "scope" of GWT to "provided", setting "type" to "bundle", and even optional=true. None of it helped.
Here's the part of my root pom.xml that is responsible for the Vaadin/GWT stuff:
<plugins>
<plugin>
<groupId>org.apache.felix</groupId>
<artifactId>maven-bundle-plugin</artifactId>
<version>2.3.5</version>
<extensions>true</extensions>
<configuration>
<instructions>
<Export-Package>mycompany.*</Export-Package>
<Private-Package>*.impl.*</Private-Package>
<Bundle-SymbolicName>${project.artifactId}</Bundle-SymbolicName>
<!-- <Bundle-Activator>com.alskor.publicpackage.MyActivator</Bundle-Activator>-->
</instructions>
</configuration>
</plugin>
<!-- Compiles your custom GWT components with the GWT compiler -->
<plugin>
<groupId>org.codehaus.mojo</groupId>
<artifactId>gwt-maven-plugin</artifactId>
<!-- Version 2.1.0-1 works at least with Vaadin 6.5 -->
<version>2.3.0-1</version>
<configuration>
<!-- if you don't specify any modules, the plugin will find them -->
<!--modules>
..
</modules-->
<webappDirectory>${project.build.directory}/${project.build.finalName}/VAADIN/widgetsets
</webappDirectory>
<extraJvmArgs>-Xmx512M -Xss1024k</extraJvmArgs>
<runTarget>clean</runTarget>
<hostedWebapp>${project.build.directory}/${project.build.finalName}</hostedWebapp>
<noServer>true</noServer>
<port>8080</port>
</configuration>
<executions>
<execution>
<goals>
<goal>resources</goal>
<goal>compile</goal>
</goals>
</execution>
</executions>
</plugin>
<!-- Updates Vaadin 6.2+ widgetset definitions based on project dependencies -->
<plugin>
<groupId>com.vaadin</groupId>
<artifactId>vaadin-maven-plugin</artifactId>
<version>1.0.1</version>
<executions>
<execution>
<configuration>
<!-- if you don't specify any modules, the plugin will find them -->
<!--
<modules>
<module>${package}.gwt.MyWidgetSet</module>
</modules>
-->
</configuration>
<goals>
<goal>update-widgetset</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
The wildcards in your Export-Package and Private-Package statements strike me as exceedingly dangerous. It's possible that the GWT packages are being dragged in because of the *.impl.* pattern in Private-Package.
Also you should never use wildcards in Export-Package: exports should be tightly controlled and versioned.
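As a rough illustration of what tighter instructions could look like, the sketch below lists packages explicitly and versions the export. The package names mycompany.api and mycompany.impl and the version 1.0.0 are hypothetical placeholders, not taken from the question; substitute the packages your bundle really owns.

```xml
<configuration>
  <instructions>
    <!-- Export only the named API package, with an explicit version -->
    <Export-Package>mycompany.api;version="1.0.0"</Export-Package>
    <!-- Name private packages explicitly so a pattern like *.impl.*
         cannot match packages from dependencies on the classpath -->
    <Private-Package>mycompany.impl</Private-Package>
    <Bundle-SymbolicName>${project.artifactId}</Bundle-SymbolicName>
  </instructions>
</configuration>
```

With no wildcard in either header, bnd only copies the packages you list, so classes from GWT or other dependencies are not pulled into the bundle by an accidental pattern match.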
Use mvn dependency:tree to see where the GWT dependency comes from.
Add an <exclusions/> element with an appropriate <exclusion/> to the dependency in question to suppress it.
I had a similar problem: the final WAR file came to almost 90 MB!
One of the culprits was the aforementioned jar, so I did this:
<dependencies>
<dependency>
<groupId>${project.groupId}</groupId>
<artifactId>widgetset</artifactId>
<version>3.2</version>
<exclusions>
<exclusion>
<groupId>com.vaadin.external.gwt</groupId>
<artifactId>gwt-user</artifactId>
</exclusion>
</exclusions>
</dependency>
...
</dependencies>

Resources