Snakemake: need to rerun workflow 20 times from interim rule - multiprocessing

I need to rerun my Snakemake workflow many times with different config parameter values. I already use --config, which is great, but at the moment I just run the workflow repeatedly with different output folders. Is there a way to rerun the workflow many times without using wildcards (so that I wouldn't have to append something related to the changed parameters to each output file name)?
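A minimal sketch of the kind of driver loop the question describes, assuming a hypothetical config key named param and placeholder values; --config overrides the config value and --directory points each run at its own working/output directory:

#!/bin/bash
# rerun the same workflow once per parameter value,
# writing each run's output into its own directory
for p in 0.1 0.2 0.3; do
    snakemake --cores 4 --config param="$p" --directory "results_param_${p}"
done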

Related

Choice parameter in Jenkins as an output from a shell script (pipeline)

I'm looking for a way to use the output of a shell command as a Jenkins parameter, but in a pipeline; I don't want to use any UI-only plugins.
For example I want to use output from command
ls /home
as a choice parameter (so I would be able to choose users from the list). Is it possible to do something like that?
It must be done in the pipeline. I'm not looking for plugins that let you do something like this but require you to do all the setup in the UI; if a plugin supports pipelines, then it's also fine.
For a pipeline to run, its parameters need to be completely defined before the pipeline is started.
For the pipeline to parse the output of ls /home, it needs to run and allocate a node and a workspace, which can only happen after the pipeline is started.
So you are in a kind of chicken-and-egg problem, where you need to run some code before you start the pipeline, but you can't run the pipeline before you run that code.
So your issue boils down to "How can I run some arbitrary Groovy script before I start my pipeline?"
There are two options for this:
The Active Choices plugin allows you to define a parameter whose choices are generated by a script. Jenkins will then run that script (don't forget to approve it) in order to show you the "Build with parameters" page. Debugging this is notoriously hard, but it can take you a long way.
Alternatively, you may want to run a scripted pipeline before you run the Declarative (main) one, as outlined e.g. in this answer. This may look a bit like this:
def my_choices_list = []
node('master') {
    stage('prepare choices') {
        // read the folder contents (one entry per line)
        def my_choices = sh script: "ls /home", returnStdout: true
        // make a list out of it - I haven't tested this!
        my_choices_list = my_choices.trim().split("\n") as List
    }
}
pipeline {
    agent any
    parameters {
        // on older Jenkins versions 'choices' may need a newline-separated string instead of a list
        choice(name: 'OPTION', choices: my_choices_list)
    }
    stages {
        stage('use the choice') {
            steps {
                echo "Selected user: ${params.OPTION}"
            }
        }
    }
}

Script for running multiple Make Commands

I would like to get insight on how to get started, or what general direction to look in, when trying to make a script or makefile that will run 3 make commands at once that take the same input. These three commands all ask for the same input but output different Excel files, because each manipulates the pulled data in a different way. Therefore, if I were able to create a script or makefile that ran all three commands after giving the input one time, it would SAVE ME A TON OF TIME.
This is all being done in putty pretty much (in terms of the commands)
Thanks,
NP
You want to use a shell script.
For instance, you can create run.sh with:
#!/bin/bash
make FLAG1=ON "$@"
make FLAG2=ON "$@"
make FLAG3=ON "$@"
Make it executable (chmod +x run.sh) and run ./run.sh MYCOMMONFLAG1=ON MYCOMMONFLAG2=OFF ...

How to execute a list of commands in parallel in Snakemake

I have a snakemake rule that creates a text file with many shell commands as its output. I would like to design a second rule that would take the file as an input and run all the commands specified in the file in parallel - taking advantage of multiple threads/cores or submitting the commands to a cluster if --cluster is specified. Is that possible?
You could write python code, either using the "run:" block in snakemake or "script:" with an external python script, which uses the subprocess module to run each command in the file in a separate process. I don't think there's a way to do it more directly in Snakemake.
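A rough, untested sketch of that idea, with hypothetical rule and file names, assuming the input file holds one shell command per line; the thread pool only limits how many subprocesses run at the same time:

rule run_commands:
    input:
        "commands.txt"
    output:
        touch("commands.done")
    threads: 8
    run:
        import subprocess
        from concurrent.futures import ThreadPoolExecutor

        with open(input[0]) as fh:
            cmds = [line.strip() for line in fh if line.strip()]

        # each command runs in its own shell process; at most `threads` at once
        with ThreadPoolExecutor(max_workers=threads) as pool:
            results = list(pool.map(lambda c: subprocess.run(c, shell=True), cmds))

        # fail the rule if any command failed
        for r in results:
            r.check_returncode()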

Want to run a list of commands, but be able to edit the list while it's running

I have a list of (bash) commands I want to run:
<Command 1>
<Command 2>
...
<Command n>
Each command takes a long time to run, and sometimes after seeing the output of (e.g.) <Command 1>, I'd like to update a parameter of <Command 5>, or add a new <Command k> at an arbitrary position in the list. But I want to be able to walk away from my machine at any time, and have it keep working through my last update to the list.
This is similar to the question here: Edit shell script while it's running. Some of those answers could be made to serve, but that question had the additional constraint of wanting to edit the script file itself, and I suspect there is a simpler answer because I don't have that exact constraint.
My current solution is to end my script with a call to a second script. I can edit the second file while the first one runs, which lets me append new commands to the end of my list, but I can't make any changes to the list of commands in the first file. And once execution has started in the second file, I can't make any more changes. So I often stop my script to insert updates, and this sometimes means stopping a long command that is almost complete, only so that I can update later items on the list before I leave my machine for a while. I could of course chain together many files in this way, but that seems a mess for something that (hopefully) has a simple solution.
This is more of a conceptual answer than one where I provide the full code. My idea would be to run Redis (Redis description here) - it is pretty simple to install - and use it as a data-structure server. In your case, the data structure would be a list of jobs.
So, you basically add each job to a Redis list which you can do using LPUSH at the command-line:
echo "lpush jobs job1" | redis-cli
You can then start one or more workers, in parallel if you wish; each sits in a loop, doing a repeated BRPOP (a blocking pop that waits until there are jobs) off the list and processing each job:
#!/bin/bash
# worker: block until a job is available, then run it
while :; do
    # BRPOP returns the list name and the job string; keep only the job
    job=$(redis-cli brpop jobs 0 | tail -1)
    bash -c "$job"
done
And then you are at liberty to modify the list while the worker(s) is/are running using deletions and insertions.
Example here.
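For instance (the job strings below are hypothetical), the queue can be edited from another shell while the worker runs:

# remove one queued job that is no longer wanted
redis-cli lrem jobs 1 "make target5"
# insert a new job before an existing one
redis-cli linsert jobs before "make target7" "make target6b"
# inspect what is still queued
redis-cli lrange jobs 0 -1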
I would suggest putting each command that you want to run in its own file, and listing all of the command files in a main file.
ex: main.sh
#!/bin/bash
# Here you define the absolute path of your script
scriptPath="/home/script/"
# Name of your script
scriptCommand1="command_1.sh"
scriptCommand2="command_2.sh"
...
scriptCommandN="command_N.sh"
# Here you execute your script
$scriptPath/$scriptCommand1
$scriptPath/$scriptCommand2
...
$scriptPath/$scriptCommandN
I suppose that while one script is running you can modify the others, since they are separate files.

Get execution time information for large number of bash scripts

Given a project that consists of a large number of bash scripts launched periodically from crontab, how can one track the execution time of each script?
There is a straightforward approach: edit each of those files to add a call to date.
But what I really want is some kind of daemon that could track execution times and submit the results somewhere several times a day.
So the question is:
Is it possible to gather information about execution time of 200 bash scripts without editing each of them?
Using time is considered a fallback solution, if nothing better can be found.
Depending on your system's cron implementation, you may define the log level of the cron daemon. For Ubuntu's default vixie-cron, a log level of 3 will log the start and end of each job execution, which can then be analyzed.
On current LTS Ubuntu this is done by defining the log level in /etc/init/cron,
appending the -L 3 option to the exec line so that it looks like:
exec cron -L 3
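A rough sketch of the analysis side, assuming rsyslog writes cron's messages to /var/log/syslog; myscript.sh is a placeholder, and the exact wording of the start/end entries varies between cron versions:

# with -L 3 each run produces a start entry and an end entry, so the
# timestamps of adjacent entries for the same job give its duration
grep 'CRON\[' /var/log/syslog | grep 'myscript.sh'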
You could change your cron to run your scripts under time?
time scriptname
And pipe the output to your logs.
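For example, a sketch of a crontab entry, assuming GNU time is installed as /usr/bin/time; the schedule, script path and log path are placeholders (note that % must be escaped as \% inside a crontab line):

# append "<command> took <seconds> s" to a log file after every run
*/30 * * * * /usr/bin/time -a -o /var/log/script_times.log -f '\%C took \%e s' /path/to/myscript.sh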
