Optionally build multiple jobs in a row

We have created several jobs for our components. These components all depend on each other like this:
A -> B -> C
Currently it is possible to run these jobs separately, independently of each other. If someone runs C, the build uses the A and B artifacts from a previous build.
Now it should be possible to optionally build these jobs in a row. My first thought was some kind of BuildAll job which starts the other jobs in the right order, but it does not seem to be possible to start other jobs in a build step.
Using the "Build other projects" option is not a solution, because that would always trigger the other builds whenever someone starts A, for example.
Does anyone have an idea how to solve this? Is something like this possible? Perhaps I missed an option/plugin that lets me use other jobs as build steps?

I would look at using the Parameterized Trigger plugin:
https://wiki.jenkins-ci.org/display/JENKINS/Parameterized+Trigger+Plugin
It allows you to trigger another job as a build step, with parameters if you need them. This would let you create a BuildAll job that calls A, then B, then C in sequence.

Have you considered the Join plugin?
https://wiki.jenkins-ci.org/display/JENKINS/Join+Plugin
It can help you with the "BuildAll" step if you want to go down that path.
However, one part I do not understand: if A -> B -> C, how are any of them optional? If you can clarify, I might be able to help you better.

Does Kedro support Checkpointing/Caching of Results?

Let's say we have multiple long-running pipeline nodes.
It seems quite straightforward to checkpoint or cache the intermediate results, so that when nodes after a checkpoint are changed or added, only those nodes must be executed again.
Does Kedro provide functionality to make sure that, when I run the pipeline, only those steps are executed that have changed?
Also the reverse: is there a way to make sure that all steps that have changed are executed? Let's say a pipeline producing some intermediate result changed; will it be executed when I run a pipeline depending on the output of the first?
TL;DR: Does Kedro have Makefile-like tracking of what needs to be done and what not?
I think my question is similar to issue #341, but I do not require support for cyclic graphs.
You might want to have a look at the IncrementalDataSet alongside the partitioned dataset documentation, specifically the section on incremental loads with the incremental dataset. It has a notion of "checkpointing", although the checkpoint is confirmed manually and is not automated like a Makefile.
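If it helps, here is a minimal sketch of that incremental pattern via the Python API (assuming a Kedro 0.17-era install; the path and the underlying CSV dataset are made up for illustration):

from kedro.io import IncrementalDataSet

# Hypothetical partitioned CSV data living under data/01_raw/partitioned/
dataset = IncrementalDataSet(
    path="data/01_raw/partitioned",
    dataset="pandas.CSVDataSet",
)

# load() returns only the partitions added since the stored checkpoint
new_partitions = dataset.load()
for partition_id, dataframe in new_partitions.items():
    ...  # process only the new data here

# After a successful run, advance the checkpoint (this is the manual step)
dataset.confirm()

Partitions before the checkpoint are skipped on the next run, but only because you confirmed the checkpoint explicitly; nothing tracks whether upstream code changed, which is the gap compared to make.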

Retrigger build chain and pick up changed source

I have three build configurations in a build chain. Let's say A, B and C. C depends on B, and B depends on A.
Suppose I trigger a build manually on C. Now A, B and C will be queued to build.
Step A might cause a source update and a commit to source control. When this happens, I want to stop the entire build chain first. Then I want to retrigger C automatically, with the same parameters as when it was run manually the first time, but using the new source.
Is there any way to get this done?
Yes, you can do this with native functionality, in three steps:
Force a failure (for A)
Configure "Fail when dependency fails" (for B and C)
Add a "Retry Build Trigger" (for C)
Step 1
You can make a build fail manually by setting the build status. There is a post on this in the JetBrains support issues:
To fail the build, it should be enough to print something like this:
"##teamcity[buildStatus status='FAILURE']"
For the sake of maintainability, I recommend doing this in a separate command-line build step.
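As an illustration, a sketch of such a command-line step (assuming, hypothetically, that step A drops a marker file when it commits a source update; the marker name and script are made up, the service message is the one quoted above):

import os
import sys

# Hypothetical marker written by step A when it commits a source update.
if os.path.exists("source_updated.marker"):
    # The service message sets the build status to FAILURE, which cancels
    # the snapshot-dependent builds (B and C) and lets the Retry Build
    # Trigger on C re-run the chain with fresh sources.
    print("##teamcity[buildStatus status='FAILURE' "
          "text='Source updated; failing so the chain retriggers']")

sys.exit(0)  # exit code can stay 0; the service message sets the status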
Step 2
You can have a build cancelled when a dependency fails, and this chains transitively. For your configurations B and C, set the snapshot dependency's "on failed dependency" option so that a failure of A also fails (or cancels) B and C.
Step 3
There is a build trigger called "Retry Build Trigger" that retriggers a build when it has failed. You can decide what retry count is appropriate, depending on how often A changes. Important: uncheck "Trigger new build with the same revisions", so that the retried build picks up the new source.

Is Snakemake the right tool for handling output-mediated workflows?

I'm new to Snakemake (started trying it out in the last week or so) in order to hand off more of the small details of workflows; previously I coded my own specific workflows in Python.
I generated a small workflow which, among other steps, takes Illumina PE reads and runs Kraken against them. I then parse the Kraken output to detect the most common species (within a set of allowable ones) if a species value wasn't provided (running with snakemake -s test.snake --config R1_reads= R2_reads= species='').
I have 2 questions.
What is the recommended approach given the dynamic output/input?
Currently my strategy is to create a temp file which contains the detected species and then cat {input.species} into other shell commands. This doesn't seem elegant, but looking through the docs I couldn't quite find an adequate alternative. I noticed PersistentDicts would let me pass variables between run: blocks, but I'm unsure whether I can use that to load variables into a shell: section. I also noticed that wrappers could let me handle it, but from the point where I need that variable onward, I'd be wrapping the remainder of my workflow.
Is Snakemake the right tool if I want to use the species afterwards to run a set of scripts specific to that species (with multiple species-specific workflows)?
Right now my impression is that I would have multiple workflow files, one per species, plus a switch which calls the associated species workflow depending on the species.
Appreciate any insight on these questions.
-Kim
You can mark output as dynamic (e.g. expecting one file per species). Snakemake will then determine the downstream DAG of jobs only after those files have been generated. See http://snakemake.readthedocs.io/en/stable/snakefiles/rules.html#dynamic-files
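A minimal Snakefile sketch of that pattern (the report path and splitting script are hypothetical; for what it's worth, dynamic() was later superseded by checkpoints in newer Snakemake releases):

rule all:
    input:
        dynamic("species/{species}.txt")

# Produces one file per species found in the Kraken report; the exact
# set of output files is unknown until this rule has actually run.
rule detect_species:
    input:
        "kraken/report.txt"
    output:
        dynamic("species/{species}.txt")
    shell:
        "python scripts/split_by_species.py {input} species/"

Once detect_species has run, Snakemake re-evaluates the DAG, and rules declaring dynamic("species/{species}.txt") as input are expanded to the files that actually appeared.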

Better to use task dependencies or task.doLast in Gradle?

After building my final output file with Gradle I want to do two things: update a local version.properties file, and copy the final output file to a specific directory for archiving. Let's assume I already have two methods implemented that do exactly this, updateVersionProperties() and archiveOutputFile().
I'm now wondering what's the best way to do this...
Alternative A:
assembleRelease.doLast {
    updateVersionProperties()
    archiveOutputFile()
}
Alternative B:
task myBuildTask(dependsOn: assembleRelease) << {
    updateVersionProperties()
    archiveOutputFile()
}
And here I would call myBuildTask instead of assembleRelease as in alternative A.
Which one is the recommended way of doing this, and why? Is there any advantage of one over the other? I would like some clarification please... :)
Whenever you can, model new activities as separate tasks. (In your case, you might add two more tasks.) This has many advantages:
Better feedback as to which activity is currently executing or has failed
Ability to declare task inputs and outputs (reaping all benefits that come from this)
Ability to reuse existing task types
More possibilities for Gradle to execute tasks in parallel
Etc.
Sometimes it isn't easily possible to model an activity as a separate task. (One example is when it's necessary to post-process the outputs of an existing task in place; doing this in a separate task would result in the original task never being up to date on subsequent runs.) Only then should the activity be attached to an existing task with doLast.

Design to distribute work when generating task-oriented input for a legacy DOS application?

I'm attempting to automate a really old DOS application. I've decided the best way to do this is via input redirection. The legacy app (menu driven) has many tasks within tasks with branching logic. In order to easily understand and reuse the input for these tasks, I'd like to break them into bite-size pieces. Since I'll need to start a fresh app on each run, repeating a context to consume a piece might be messy.
I'd like to create an object model that:
allows me to concentrate on the task at hand
allows me to reuse common tasks from different start points
prevents me from calling a task from the wrong start point
To be more explicit, given I have the following task hierarchy:
START
A
A1
A1a
A1b
A2
A2a
B
B1
B1a
I'd like an object model that lets me generate an input file for task "A1b" by using building blocks like:
START -> do_A, do_A1, do_A1b
but prevents me from:
START -> do_A1 // because I'm assuming a different call chain from above
This will help me write "do_A1b" because I can always assume the same starting context, and will simplify writing "do_A1a" because it has THE SAME starting context. What patterns will help me out here? I'm using Ruby at the moment, so if dynamic language features can help, I'm game.
EDIT: after re-reading your question, I realized I misunderstood it. Let me answer what you actually asked...
I would create a hierarchy of classes. The simplest ones would have functions like "do task A1b" that output the appropriate steps to accomplish this. On top of that, I would build functions that call the sub-tasks in specific orders to accomplish specific goals.
Pretending Vim were the program being controlled, the first-level tasks would be things like 'enter insert mode', 'enter command mode', 'write the file' or 'input this arbitrary set of inputs'. On top of these I would build functions like 'insert "foobar" into the open file at the start of line 5' which would call the lower-level tasks.
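To make the shape concrete, here is a minimal sketch (in Python for brevity; the same shape works in Ruby, and every class, method, and keystroke here is made up). Each menu context is a class, and each do_* method both emits the keystrokes and returns the next context, so an illegal chain like START -> do_A1 simply isn't callable:

class Context:
    """Base class: accumulates the input lines to feed the DOS app."""
    def __init__(self, script):
        self.script = script

    def emit(self, *lines):
        self.script.extend(lines)

class Start(Context):
    def do_A(self):
        self.emit("A")              # keystroke that opens menu A
        return MenuA(self.script)   # only A's sub-tasks are reachable now

class MenuA(Context):
    def do_A1(self):
        self.emit("1")
        return MenuA1(self.script)

class MenuA1(Context):
    def do_A1b(self):
        self.emit("b")
        return self

# Usage: Start has no do_A1 method, so the wrong call chain fails fast.
script = []
Start(script).do_A().do_A1().do_A1b()
print("\n".join(script))            # the generated input file for task A1b

In Ruby you could get the same guarantee with one class per context, or generate the context classes from a declarative menu map; method_missing could then raise a friendlier error naming the expected starting context.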
