Taskwarrior: How do I find the tasks that depend on a specific task?

How do I find out which task(s) depend on a specific task without reading the information of all tasks?
Reproduction
System
Version
$ task --version
2.5.1
.taskrc
# Taskwarrior program configuration file.
# Files
data.location=~/.task
alias.cal=calendar
rc.date.iso=Y-M-D
default.command=ready
journal.info=no
rc.regex=on
Here are the tasks that I created for testing purposes:
$ task list
ID Age  Description                        Urg
 1 2min Something to do                      0
 2 1min first do this                        0
 3 1min do this whenever you feel like it    0
3 tasks
Create the dependency from task#1 to task#2:
$ task 1 modify depends:2
Modifying task 1 'something to do'.
Modified 1 task.
$ task list
ID Age  D Description                        Urg
 2 4min   first do this                        8
 3 4min   do this whenever you feel like it    0
 1 4min D Something to do                     -5
3 tasks
Goal
Now I want to find the tasks that are dependent on task#2, which should be task#1.
Trials
Unfortunately, this does not result in any matches:
$ task list depends:2
No matches.
$ # I can filter by blocked tasks
$ task blocked
ID Deps Age   Description
 1 2    18min Something to do
1 task
$ # But when I want to list only the tasks that are blocked by task#2, task#3 is also returned
$ task blocked:2
[task ready ( blocked:2 )]
ID Age   Description                        Urg
 2 20min first do this                        8
 3 19min do this whenever you feel like it    0
2 tasks
Suggestions?
How would you approach this?
Parsing the Taskwarrior output with a script seems like overkill.

You have the right command, but you have actually hit a bug: the depends attribute does not match on short IDs; internally it is a comma-delimited string of UUIDs.
It will work if you use the UUID instead. Use task <id> _uuid to resolve an ID to its UUID.
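For example, using the _get DOM helper that also appears in the transcript below (the UUID shown is purely illustrative):
$ task _get 2.uuid
3e1a9d6c-0000-0000-0000-000000000000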
$ task --version
2.5.1
# Create tasks
$ task rc.data.location: add -- Something to do
$ task rc.data.location: add -- first do this
$ task rc.data.location: add -- do this whenever you feel like it
$ task rc.data.location: list
ID Age Description                        Urg
 1 -   Something to do                    1.8
 2 -   first do this                      1.8
 3 -   do this whenever you feel like it  1.8
3 tasks
# Set up dependency
$ task rc.data.location: 1 modify depends:2
Modifying task 1 'Something to do'.
Modified 1 task.
# Query using depends:UUID
$ task rc.data.location: list "depends.has:$(task rc.data.location: _get 2.uuid)"
ID Age D Description      Urg
 1 -   D Something to do  -3.2
1 task
# Query using depends:SHORT ID
# This does not work, despite the documentation. Likely a bug.
$ task rc.data.location: list "depends.has:$(task rc.data.location: _get 2.id)"
No matches.
A small correction to your attempt to find blocked tasks:
There is no blocked attribute, and you're using the ready report:
$ task blocked:2
[task ready ( blocked:2 )]
The ready report filters out exactly what we're looking for; the blocked report is what we need. To unmagickify this: these are simply useful default reports that come with preset filters on top of task all.
$ task show filter | grep -e 'blocked' -e 'ready'
report.blocked.filter status:pending +BLOCKED
report.ready.filter +READY
report.unblocked.filter status:pending -BLOCKED
Blocked tasks will have the virtual tag +BLOCKED, which is mutually exclusive with +READY.
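You can filter on the virtual tag directly. With the sample tasks above, that would look roughly like this (output approximate):
$ task +BLOCKED list
ID Age   D Description      Urg
 1 18min D Something to do   -5
1 task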
The blocked attribute doesn't exist; use task _columns to list the available attributes (e.g. depends). Unfortunately, the CLI parser most likely attempts to apply the filter blocked:2 and ends up ignoring it. For your workflow, the useful command is task blocked "depends.has:$(task _get 2.uuid)". It's advisable to wrap this in a shell function to make it easier to use:
#!/bin/bash
# Untested, but it gets the point across.
function task_blocked {
    local blocker=$1
    shift
    # Resolve the blocker's short ID to its UUID, then filter blocked tasks on it;
    # any remaining arguments are passed through as extra filters.
    task blocked "depends.has:$(task _get "${blocker}.uuid")" "$@"
}
# Find tasks of project "foo" that are blocked on task 2
task_blocked 2 project:foo
# Another project that is also impacted
task_blocked 2 project:bar

You could use this Taskwarrior hook script, which adds a "blocks" attribute to tasks: https://gist.github.com/wbsch/a2f7264c6302918dfb30

Related

Ansible playbook to compare 2 register outputs

I have created two tasks that fetch the two outputs below:
first output:
NAME                               STATUS  ROLES   AGE   VERSION
control1.eee-dev.dd.k8s.c0.ms.com  Ready   master  146d  v1.18
control2.eee-dev.dd.k8s.c0.ms.com  Ready   master  146d  v1.18
control3.eee-dev.dd.k8s.c0.ms.com  Ready   master  146d  v1.18
dd900xc15xx.nodes.c0.ms.com        Ready   worker  146d  v1.18
dd900xc16xx.nodes.c0.ms.com        Ready   worker  146d  v1.18
second output:
Transaction ID: xxxx-xxxx-xxxxx-xxxxxx
bootstrap.eee-dev.dd.k8s.c0.ms.com
control1.eee-dev.dd.k8s.c0.ms.com
control2.eee-dev.dd.k8s.c0.ms.com
control3.eee-dev.dd.k8s.c0.ms.com
dd900xc15xx.nodes.c0.ms.com
dd900xc16xx.nodes.c0.ms.com
Now, how do I compare the above two outputs stored in registers and print PASSED if the NAMEs from the first output (meaning only the node names from the first column) match the second output, and FAILED if not? Note: we need to ignore the first two lines of the second output.
My idea is to filter out the node names and compare them, but I'm unsure how to turn that into an Ansible playbook. I'd highly appreciate your suggestions and comments on how to achieve this.
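In case it helps frame an answer, the core comparison can be sketched in plain shell first (the file names are placeholders for the two registered outputs; an Ansible playbook could run this from a shell task or re-express it with Jinja filters):
#!/bin/bash
# Extract the NAME column from the first output, skipping its header row.
awk 'NR > 1 {print $1}' first_output.txt | sort > expected_nodes.txt
# Drop the first two lines of the second output, as the question requires.
tail -n +3 second_output.txt | sort > actual_nodes.txt
# comm -23 prints expected nodes missing from the actual list;
# PASSED only if nothing is missing.
if comm -23 expected_nodes.txt actual_nodes.txt | grep -q .; then
    echo "FAILED"
else
    echo "PASSED"
fi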

Any way to print the task graph before Gradle actually performs the task?

Not exactly a newb, but I am trying to dig a bit deeper and understand a bit better ... just thought it might be a nice idea to do this.
Thanks to Opal I did this:
gradle.taskGraph.whenReady { taskGraph ->
    println "Tasks"
    taskGraph.getAllTasks().eachWithIndex { task, n ->
        println "${n + 1} $task"
        task.dependsOn.eachWithIndex { depObj, m ->
            println "    ${m + 1} $depObj"
        }
    }
}
Output for a Java build:
Tasks
1 task ':compileJava'
2 task ':processResources'
3 task ':classes'
    1 compileJava
    2 dirs
    3 processResources
4 task ':jar'
5 task ':assemble'
    1 org.gradle.api.internal.artifacts.DefaultPublishArtifactSet$ArtifactsTaskDependency@48db7705
6 task ':compileTestJava'
7 task ':processTestResources'
8 task ':testClasses'
    1 processTestResources
    2 dirs
    3 compileTestJava
9 task ':test'
10 task ':check'
    1 value: task ':test'
11 task ':build'
    1 check
    2 assemble
For me, as a Gradle neophyte (one up from a newb), this is great! Although it leaves me slightly puzzled:
1) "build" depends only on "check" and "assemble", and these have 1 dependency each, each with no dependencies. So how does it know to run all the other tasks (which obviously it has to)... I must be missing something.
2) what is dependency "dirs" and "org.gradle.api.internal.artifacts.DefaultPublishArtifactSet$ArtifactsTaskDependency#48db7705"? More importantly, where does these actually come from? getDependsOn() returns a Set<Object> so these may be something other than Tasks.
Plenty to investigate...
The gradle-taskinfo plugin probably does just what Opal describes in his answer.
I wrote it, so I might be biased, but I use it a lot and it works great for me. 😊
Maybe it saves you some time coding!
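For reference, a rough usage sketch (the plugin ID and task name are taken from the gradle-taskinfo README; double-check them against the current docs):
# In build.gradle: plugins { id 'org.barfuin.gradle.taskinfo' version '<latest>' }
# Then print the task tree for a given task:
./gradlew tiTree build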
The TaskExecutionGraph interface provides the getAllTasks method, which (from the docs):
Returns the tasks which are included in the execution plan. The tasks are returned in the order that they will be executed.
It seems that the output of this method might be used to print the task graph.

Can a snakemake input rule be defined with different paths/wildcards

I want to know if one can define an input rule that has dependencies on different wildcards.
To elaborate, I am running this Snakemake pipeline on different fastq files using qsub, which submits each job to a different node:
1. fastqc on the original fastq - no downstream dependency on other jobs
2. adapter/quality trimming to generate a trimmed fastq
3. fastqc_after on the trimmed fastq (output from step 2) - no downstream dependency
4. star-rsem pipeline on the trimmed fastq (output from step 2 above)
5. rsem and tximport (output from step 4)
6. Run multiqc
MultiQC - https://multiqc.info/ - runs on the results folder, which holds the results from fastqc, star, rsem, etc. However, because each job runs on a different node, sometimes the fastqc jobs (steps 1 and/or 3) are still running on the nodes while the other steps (2, 4 and 5) have finished, or vice versa.
Currently, I can create a MultiQC rule that waits on the results from steps 2, 4 and 5, because those are linked to each other by input/output rules.
I have attached my pipeline as a png to this post. Any suggestions would help.
What I need: I want to create a "collating" step where MultiQC waits until all steps (1 through 5) have finished. In other words, using my attached png as a guide, I want to define multiple inputs for the MultiQC rule so that it also waits on the results from fastqc.
Thanks in advance.
Note: Based on comments I received from 'colin' and 'bli' after my original post, I have shared the code for the different rules here.
Step 1 - fastqc
rule fastqc:
    input: "raw_fastq/{sample}.fastq"
    output: "results/fastqc/{sample}_fastqc.zip"
    log: "results/logs/fq_before/{sample}.fastqc.log"
    params: ...
    shell: ...
Step 2 - bbduk
rule bbduk:
    input: R1 = "raw_fastq/{sample}.fastq"
    output: R1 = "results/bbduk/{sample}_trimmed.fastq"
    params: ...
    log: "results/logs/bbduk/{sample}.bbduk.log"
    priority: 95
    shell: ...
Step 3 - fastqc_after
rule fastqc_after:
    input: "results/bbduk/{sample}_trimmed.fastq"
    output: "results/bbduk/{sample}_trimmed_fastqc.zip"
    log: "results/logs/fq_after/{sample}_trimmed.fastqc.log"
    priority: 70
    params: ...
    shell: ...
Step 4 - star_align
rule star_align:
    input: R1 = "results/bbduk/{sample}_trimmed.fastq"
    output:
        out_1 = "results/bam/{sample}_Aligned.toTranscriptome.out.bam",
        out_2 = "results/bam/{sample}_ReadsPerGene.out.tab"
    params: ...
    log: "results/logs/star/{sample}.star.log"
    priority: 90
    shell: ...
Step 5 - rsem_norm
rule rsem_norm:
    input:
        bam = "results/bam/{sample}_Aligned.toTranscriptome.out.bam"
    output:
        genes = "results/quant/{sample}.genes.results"
    params: ...
    threads: 16
    priority: 85
    shell: ...
Step 6 - rsem_model
rule rsem_model:
    input: "results/quant/{sample}.genes.results"
    output: "results/quant/{sample}_diagnostic.pdf"
    params: ...
    shell: ...
Step 7 - tximport_rsem
rule tximport_rsem:
    input: expand("results/quant/{sample}_diagnostic.pdf", sample=samples)
    output: "results/rsem_tximport/RSEM_GeneLevel_Summarization.csv"
    shell: ...
Step 8 - multiqc
rule multiqc:
    input: expand("results/quant/{sample}.genes.results", sample=samples)
    output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
    log: "results/log/multiqc"
    shell: ...
If you want rule multiqc to run only after fastqc has completed, you can add the output of fastqc to the input of multiqc:
rule multiqc:
    input:
        expand("results/quant/{sample}.genes.results", sample=samples),
        expand("results/fastqc/{sample}_fastqc.zip", sample=samples)
    output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
    log: "results/log/multiqc"
    shell: ...
Or, if you need to be able to refer to the output of rsem_norm in your shell section:
rule multiqc:
    input:
        rsem_out = expand("results/quant/{sample}.genes.results", sample=samples),
        fastqc_out = expand("results/fastqc/{sample}_fastqc.zip", sample=samples)
    output: "results/multiqc/project_QS_STAR_RSEM_trial.html"
    log: "results/log/multiqc"
    shell: "... {input.rsem_out} ..."
In one of your comments, you wrote:
MultiQC needs directory as input - I give it the 'results' directory in my shell command.
If I understand correctly, this means that results/quant/{sample}.genes.results are directories, not plain files. If this is the case, you should make sure no downstream rule writes files inside those directories. Otherwise, the directories will be considered as updated after the output of multiqc, and multiqc will be re-run every time you run the pipeline.
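The underlying reason is that creating or modifying a file inside a directory updates the directory's modification time, which is what Snakemake compares when deciding whether to re-run a rule. A quick way to see this at the filesystem level (paths and timestamps are illustrative; stat -c is GNU coreutils):
$ stat -c %Y results/quant        # directory mtime, in epoch seconds
1700000000
$ touch results/quant/newfile
$ stat -c %Y results/quant        # the directory now looks newer
1700000042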

Snakemake tabular config

I'm using Snakemake with a tabular configuration. This table is a bunch of rows that I first take out of a very large sample overview. The resulting samples.tsv has a lot of columns. I read the samples file like this, somewhere in my Snakefile:
samples = pd.read_table('samples.tsv').set_index('samples', drop=False)
When I run snakemake:
snakemake --cluster-config cluster.json --cluster "qsub -l nodes={cluster.nodes}:ppn={cluster.ppn}" --jobs 256
I get the following error:
16:51 nlv24077#kiato ~/temp/test_snakemake > run_snakemake.sh
Building DAG of jobs...
MissingInputException in line 6 of /home/nlv24077/temp/test_snakemake/rseqc.smk:
Missing input files for rule bam_stat:
analyzed/I7_index.STAR.genome.sorted.bam
The strange thing is that I7_index is one of the column names in my samples table. Why is it using the column names as sample names?
Below I can show you part of the samples table (I can't show all data publicly):
Edit:
I was calling the samples like this:
rule bcl2fastq:
    input:
        config['bcl_dir']
    output:
        expand([os.path.join(fastq_dir, '{sample}_R1_001.fastq.gz'),
                os.path.join(fastq_dir, '{sample}_R2_001.fastq.gz')],
               sample=samples)
    threads: 6
    shell:
        '''
        # Run bcl2fastq
        ...
Whereas I should have used the version below. Iterating over a pandas DataFrame yields its column names (which is why I7_index showed up as a "sample"), whereas samples['samples'] yields the actual sample names from that column:
rule bcl2fastq:
    input:
        config['bcl_dir']
    output:
        expand([os.path.join(fastq_dir, '{sample}_R1_001.fastq.gz'),
                os.path.join(fastq_dir, '{sample}_R2_001.fastq.gz')],
               sample=samples['samples'])
    threads: 6
    shell:
        '''
        # Run bcl2fastq
        ...
Thanx JeeYem.

Shell scripting to compare the value of the current iteration with that of the previous iteration

I have an infinite loop which uses the AWS CLI to get the microservice names and their parameters, such as desired task count and running task count, for an environment.
There are hundreds of microservices running in an environment. I have a requirement to compare the value of the AWS ECS "running task" metric for a particular microservice in the current loop iteration with that of the previous iteration.
Say a microservice X has a running-task value of 5. As it is an infinite loop, after some time the loop comes around to microservice X again. Now, let's assume the running-task value is 4. I want to compare the value from the current iteration, which is 4, with the value from the previous run, which is 5.
If you are asking a generic question of how to keep a previous value around so it can be compared to the current value, just store it in a variable. You can use the following as a starting point:
#!/bin/bash
previousValue=0
while read -r v; do
    # Report both values, then remember the current one for the next iteration.
    echo "Previous value=${previousValue}; Current value=${v}"
    previousValue=${v}
done
exit 0
If the above script is called testval.sh and you have an input file called test.in with the following values:
2
1
4
6
3
0
5
Then running
./testval.sh <test.in
will generate the following output:
Previous value=0; Current value=2
Previous value=2; Current value=1
Previous value=1; Current value=4
Previous value=4; Current value=6
Previous value=6; Current value=3
Previous value=3; Current value=0
Previous value=0; Current value=5
If the skeleton script works for you, feel free to modify it for whatever comparisons you need to do.
Hope this helps.
I don't know what your input looks like exactly, but something like this might be useful for you:
The script
#!/bin/bash
# Remember the last task count seen for each app name.
declare -A app_stats
while read -r app tasks; do
    # Report only when we have seen this app before and its task count changed.
    if [[ -n ${app_stats[$app]} && ${app_stats[$app]} -ne $tasks ]]; then
        echo "Number of tasks for $app has changed from ${app_stats[$app]} to $tasks"
    fi
    app_stats[$app]=$tasks
done < input.txt
The input
App1 2
App2 5
App3 6
App1 6
The output
Number of tasks for App1 has changed from 2 to 6
Regards!
