parallelize code in a gradle task - gradle

I have a task which essentially performs the following:
['subproj1', 'subproj2'].each { proj ->
    GradleRunner.create()
        .withProjectDir(file("./examples/${proj}/"))
        .withArguments('check')
        .build()
}
The check is a system test and requires connecting to 3rd party services, so I would like to parallelize this.
Can this be done in gradle? If so, how?
I tried using Java threads, but the builds failed. I can't remember the exact errors, but they suggested that Gradle's internal state had been corrupted.

Have you tried the experimental parallel task execution? At first glance it's quite simple: you just call ./gradlew --parallel check (or, if it turns out to work well for you, you can also define this in your gradle.properties). This starts n worker threads (where n is the number of CPU cores) that execute your tasks. Each thread owns a certain project, so the tasks of one project are never executed in parallel.
You can override the number of workers with --max-workers on the command line or by setting org.gradle.workers.max=n in your gradle.properties.
If you are only interested in executing tests in parallel, then you might try Test.setMaxParallelForks(int). That causes the tests of a single project to run in parallel (if I understood this right), using up to the number of forks you set.
Hope that helps. Maybe the gradle documentation gives you some more insights: https://docs.gradle.org/current/userguide/multi_project_builds.html#sec:parallel_execution
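As a rough sketch of the test-forking option, the following build.gradle fragment sets maxParallelForks for every Test task in the subprojects. The fork count and the worker limit in the comments are illustrative values, not recommendations, and the sketch assumes the subprojects apply a plugin that creates Test tasks (for example java).

```groovy
// Root build.gradle (sketch; the numbers are illustrative assumptions)
subprojects {
    tasks.withType(Test) {
        // Run test classes in up to half as many forked JVMs as there are
        // cores, but always at least one.
        maxParallelForks = Runtime.runtime.availableProcessors().intdiv(2) ?: 1
    }
}

// Project-level parallelism is configured separately, e.g. in gradle.properties:
//   org.gradle.parallel=true
//   org.gradle.workers.max=4
```

Both settings act on tasks inside a single Gradle build, so they only come into play once the example projects are part of one multi-project build.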

While a multi-project build is the right way to go in the long term, my short-term approach was to use GPars:
import groovyx.gpars.GParsPool
buildscript {
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath "org.codehaus.gpars:gpars:1.1.0"
    }
}
task XXXX {
    // doLast replaces the `<<` shorthand, which was removed in Gradle 5.0
    doLast {
        GParsPool.withPool(10) {
            ['subproj1', 'subproj2'].eachParallel { proj ->
                GradleRunner.create()
                    .withProjectDir(file("./examples/${proj}/"))
                    .withArguments('check')
                    .build()
            }
        }
    }
}

Related

A code generator task in a multi-project gradle build

I have studied a thousand similar questions on SO and I am still lost. I have a simple multi-project build (settings.gradle):
rootProject.name = 'mwe'
include ":Generator"
include ":projectB"
include ":projectC"
with a top-level build.gradle as follows:
plugins { id "java" }
allprojects { repositories { jcenter() } }
and with two kinds of project build.gradle files. The first one (Generator) exposes a run command that runs the generator taking the command line argument:
plugins {
    id "application"
    id "scala"
}
dependencies { compile "org.scala-lang:scala-library:2.12.3" }
mainClassName = "Main"
ext { cmdlineargs = "" }
run { args cmdlineargs }
The code generator is to be called from projectB (and an analogous projectC, and many others). I am trying to do this as follows (projectB/build.gradle):
task TEST {
    project (":Generator").ext.cmdlineargs = "Hurray!"
    println ("Value set:" + project(":Generator").ext.cmdlineargs )
    dependsOn (":Generator:run")
}
Whatever I try to do (a gradle newbie here) I am not getting what I need. I have two problems:
The property cmdlineargs is not set at the point that task :projectB:TEST is run. The println sees the right value, but the argument passed to the executed main method is the one configured in Generator/build.gradle, not the one in projectB/build.gradle. As pointed out in responses, this can be worked around using lazy property evaluation, but that does not solve the second problem.
The generator is only run once, even if I build both projectB and projectC. I need to run Generator:run for each of projectB and projectC separately (to generate different sources for each dependent project).
How can I get this to work? I suppose a completely different strategy is needed. I don't have to use command line and run; I can also try to run the main class of the generator more directly and pass arguments to it, but I do find the run task quite convenient (the complex classpath is set up automatically, etc.). The generator is a Java/Scala project itself that is compiled within the same multi-project build.
Note: tasks aren't like methods in Java. A task will execute either 0 or 1 times per Gradle invocation; it will never execute twice (or more) in a single Gradle invocation.
I think you want two or more tasks, e.g.:
task run1(type:xxx) {
    args 'foo'
}
task run2(type:xxx) {
    args 'bar'
}
Then you can depend on run1 or run2 in your other projects.
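One way to get such per-project tasks while keeping the generator's classpath handling is a JavaExec task in each consuming project. This is a sketch under assumptions: the task name generateSources is hypothetical, Main and the :Generator project path come from the question, and the main property matches the older Gradle versions the question appears to use (newer versions use mainClass instead).

```groovy
// projectB/build.gradle (sketch; generateSources is a hypothetical task name)

// Make sure :Generator is configured first, so its sourceSets exist here.
evaluationDependsOn(':Generator')

task generateSources(type: JavaExec) {
    // Compile the generator before running it.
    dependsOn ':Generator:classes'
    // The runtime classpath holds the generator's classes and its dependencies.
    classpath = project(':Generator').sourceSets.main.runtimeClasspath
    main = 'Main'        // on recent Gradle versions: mainClass = 'Main'
    args 'Hurray!'       // projectC would pass its own arguments here
}
```

Because each project declares its own task, the generator runs once per project rather than once per build.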

Execute more than one command in a task without breaking incremental build

We use Gradle incremental builds and want to run two commands (ideally from one task). One of the solutions here got the two commands running, but it breaks incremental builds. It looks something like:
task myTask() {
    inputs.files(inputFiles)
    project.exec {
        workingDir web
        commandLine('mycmd')
    }
    project.exec {
        workingDir web
        commandLine('mysecond-cmd')
    }
}
When running a single command, incremental builds work. The task looked similar to this; the thing that seems to make the difference is the workingDir:
task myTask(type: Exec) {
    workingDir myDir // this seems to trigger/enable continuous compilation
    commandLine ('myCmd')
}
The best alternative so far is to create three tasks: one for each of the command-line tasks I want to run and a third to group them, which seems dirty.
The question is: Is there a way to run two or more commands in one task with incremental builds still working?
I'll try to answer the question from the comments: how can I signal from a task that has no output files that the build should watch certain files. Unfortunately, this is hard to answer without knowing the exact use case.
To start with, Gradle requires some form of declared output that it can check to see whether a task has run or whether it needs to run. Consider that the task may have failed during the previous run, but the input files haven't changed since then. Should Gradle run the task?
If you have a task that doesn't have any outputs, that means you need to think about why that task should run or not in any given build execution. What's it doing if it's not creating or modifying files or interacting with another part of the system?
Assuming that incremental build is the right solution (it may not be), Gradle does allow you to handle unusual scenarios via TaskOutputs.upToDateWhen(Spec), which also accepts a closure. Here's an example that has a straightforward task (gen) that generates a file that acts as an input for a task (runIt) with no outputs:
task gen {
    ext.outputDir = "$buildDir/stuff"
    outputs.dir outputDir
    doLast {
        new File(outputDir, "test.txt").text = "Hurray!\n"
    }
}
task runIt {
    inputs.dir gen
    outputs.upToDateWhen { !gen.didWork }
    doLast {
        println "Running it"
    }
}
This specifies that runIt will only run if the input files have changed or the task gen actually did something. In this example, those two scenarios equate to the same thing. If gen runs, then the input files for runIt have changed.
The key thing to understand is that the condition you provide to upToDateWhen() is evaluated during the execution phase of the build, not the configuration phase.
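Applied to the two-command task from the question, one possible shape is a single ad-hoc task that declares inputs and outputs and runs both commands in a task action. The directory names below (web/src, web/dist) are assumptions for illustration; mycmd and mysecond-cmd are the commands from the question, and exec is the same Project.exec the question already uses.

```groovy
task myTask {
    // Declared inputs and outputs are what Gradle compares for up-to-date checks.
    inputs.dir 'web/src'      // assumed location of the files the commands read
    outputs.dir 'web/dist'    // assumed location of the files the commands write

    doLast {
        // Both commands run at execution time, and only when the task is out of date.
        exec {
            workingDir 'web'
            commandLine 'mycmd'
        }
        exec {
            workingDir 'web'
            commandLine 'mysecond-cmd'
        }
    }
}
```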
Hope that helps, and if you need any clarification, let me know.

How to run two methods in Gradle in parallel mode?

I have a Gradle task which calls two methods. The methods are independent of each other, so I want to run them in parallel. What is the best way to achieve this?
Example:
task cucumber(dependsOn: 'testClasses') {
    doLast {
        // ...
        // I want to call the next 2 methods in parallel
        runSequentialTests(testEnvironment, tags)
        runParallelTests(testEnvironment, tags)
    }
}
def runSequentialTests(testEnvironment, tags) {
    // execute cucumber tests via javaexec
}
def runParallelTests(testEnvironment, tags) {
    // execute cucumber tests via exec sh script for each feature file
    // (parallelism is done via GParsPool.withPool(5) {...})
}
It's worth looking into GPars, the Groovy parallel systems library:
GParsPool.withPool {
    GParsPool.executeAsyncAndWait({runSequentialTests(testEnvironment, tags)}, {runParallelTests(testEnvironment, tags)})
}
Or maybe create a @ParallelizableTask.
But I'm not sure which version of Gradle you are using, and parallel execution is/was an incubating feature.
It also requires running the build with --parallel; if the tasks have no dependencies between them, Gradle should run them independently.
Alternatively, you can specify a property in your gradle.properties
C:\Users\<user>\.gradle\gradle.properties
org.gradle.parallel=true
I used the asynchronous invocations of GPars: http://www.gpars.org/webapp/guide/#_asynchronous_invocations
I got it to work with this. Thanks
GParsPool.withPool {
    GParsPool.executeAsyncAndWait({runSequentialTests(testEnvironment, tags)}, {runParallelTests(testEnvironment, tags)})
}
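For comparison, the same fan-out can be sketched with the JDK's ExecutorService, which needs no extra buildscript dependency. This swaps GPars for java.util.concurrent and is not what the answer above used; the method bodies, the task name cucumberBoth, and the placeholder values are hypothetical stand-ins. Whether Gradle APIs such as exec or javaexec behave well when called from your own threads is an assumption to verify in your build.

```groovy
import java.util.concurrent.Callable
import java.util.concurrent.Executors

// Hypothetical stand-ins for the two methods in the question.
def runSequentialTests(testEnvironment, tags) { println "sequential: $testEnvironment $tags" }
def runParallelTests(testEnvironment, tags) { println "parallel: $testEnvironment $tags" }

task cucumberBoth {
    doLast {
        def testEnvironment = 'staging'   // placeholder value
        def tags = '@smoke'               // placeholder value
        def pool = Executors.newFixedThreadPool(2)
        try {
            // invokeAll blocks until both callables have finished.
            def futures = pool.invokeAll([
                { runSequentialTests(testEnvironment, tags) } as Callable,
                { runParallelTests(testEnvironment, tags) } as Callable
            ])
            // get() rethrows a worker's failure (wrapped in ExecutionException),
            // so the task fails if either method throws.
            futures.each { it.get() }
        } finally {
            pool.shutdown()
        }
    }
}
```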

Getting a list of all Gradle tasks from build script

So I'm trying to write a list of all Gradle tasks to a file. I could of course use the tasks command for this, but I want to cache it to a file every time any other Gradle command is called. So whenever I run ./gradlew build, for example, I want the available tasks to be written to a file.
This seemed simple enough, and I wrote the below task to try it out:
task cacheTasks() {
    doLast {
        allprojects { p ->
            p.tasks*.each { t ->
                println(p.name + ":" + t.name)
            }
        }
    }
}
The problem is that I only get a subset of all the tasks available. When I run the ./gradlew tasks --all command, many more are printed. It seems that none of the built-in tasks (like build, clean or help) are in the tasks* list when I loop over it, but oddly enough I can reference them directly:
tasks.build { t ->
    println("DEBUG:" + t.name)
}
It seems so simple, yet I've been searching in vain for a solution. I even tried looking in the gradle source code to see how the tasks Task works, but I couldn't find any clue as to why this doesn't work.
rootProject.getAllTasks(true) looks like it's retrieving more tasks than rootProject.tasks.
I highly doubt there is a task in the container named build.
I am not saying gradle build does not run, but it may take another form, perhaps an instance of org.gradle.api.tasks.GradleBuild. (I am not sure, because I find the Gradle source code very hard to compile and run.)
When using
tasks.build { t ->
    println("DEBUG:" + t.name)
}
You actually call org.gradle.api.tasks.TaskContainer#create(java.util.Map<java.lang.String,?>, groovy.lang.Closure) and create a new task named build.
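Building on getAllTasks(true), a minimal sketch for writing the task list on every invocation might hook into projectsEvaluated, which fires once all projects are configured. The file name task-cache.txt is a hypothetical choice, and note that getAllTasks may force lazily registered tasks to be created.

```groovy
// Root build.gradle (sketch; task-cache.txt is a hypothetical file name)
gradle.projectsEvaluated {
    // getAllTasks(true) returns a Map<Project, Set<Task>> that covers subprojects too;
    // Task.path is the fully qualified name, e.g. :projectB:build.
    def taskPaths = rootProject.getAllTasks(true).values().flatten()*.path.sort()

    def cacheFile = new File(rootProject.buildDir, 'task-cache.txt')
    cacheFile.parentFile.mkdirs()
    cacheFile.text = taskPaths.join('\n') + '\n'
}
```

Because the hook runs at the end of configuration rather than as a task, the file is refreshed whichever task was requested on the command line.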

Why does my large multi-project gradle build start up slowly?

I am on a team that has a large multi-project build (~350 modules), and we have noticed that building an individual project (-a) still takes a long time. When building the entire set of projects the overhead isn't so bad, but when we do something like:
cd my/individual/project
gradle -a --configure-on-demand --daemon build
it still takes 30-40 seconds just in the configuration phase (before it starts building anything, which we measure using the --dry-run option).
We have approximately 10 custom tasks and we set the inputs and outputs for those tasks, but we still see this inordinately long configuration time. We are using Gradle 1.10.
It's hard to say from a distance. A common mistake is to do work in the configuration phase that should be done in the execution phase. Here is an example:
task myTask {
    copy {
        from ...
        into ...
    }
}
Here, the project.copy method will be called (and therefore copying will take place) every single time the task is configured, which is for every single build invocation - no matter which tasks are run! To fix this, you'd move the work into a task action:
task myTask {
    doLast {
        copy {
            from ...
            into ...
        }
    }
}
Or even better, use a predefined task type:
task myTask(type: Copy) {
    from ...
    into ...
}
Some steps you can take to find the cause(s) for performance problems:
Review the build
Run with --info, --debug, or --profile
Profile the build with a Java profiler
PS: Specifying inputs and outputs doesn't shorten configuration time (but likely execution time).
