Gitlab CI not displaying complete output in terminal - ansible

I have created a pipeline which runs ansible-lint on $CI_PROJECT_DIR. The problem is that the complete output is not shown in the UI, compared to running the same command on my local machine.
You can see the difference in the two outputs below.
Below is the output from my local machine (Ubuntu with ansible-lint installed):
**ansible-lint create-dir.yaml -v**
WARNING Overriding detected file kind 'yaml' with 'playbook' for given positional argument: create-dir.yaml
INFO Executing syntax check on create-dir.yaml (0.31s)
WARNING Listing 1 violation(s) that are fatal
syntax-check: 'file' is not a valid attribute for a Play
create-dir.yaml:4:3
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
ERROR! 'file' is not a valid attribute for a Play
The error appears to be in '/tmp/create-dir.yaml': line 4, column 3, but may be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
---
- name: Create a directory if it does not exist
^ here
Finished with 1 failure(s), 0 warning(s) on 1 files.
Below is the output from GitLab CI:
$ find ./ -not \( -name "*.ansible-lint" -o -name ".gitlab-ci.yml" \) \( -name "*yml" -o -name "*yaml" \) | xargs ansible-lint -v
WARNING Overriding detected file kind 'yaml' with 'playbook' for given positional argument: ./ansible/create-dir.yaml
INFO Executing syntax check on ansible/create-dir.yaml (0.39s)
Cleaning up project directory and file based variables 00:01
Job succeeded
I would like to know why there is a difference in output and how to print the complete message in GitLab CI.

Job logs in GitLab CI/CD have a limited length. This is by design, for operational reasons: jobs can produce arbitrary output, so one could - by mistake or on purpose - emit gigabytes or even terabytes of text.
But if you scroll up in the output, you can view the full log by pressing 'Complete Raw'.
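If the truncation is happening on the runner side rather than just in the web viewer, the per-job log limit can also be raised on a self-managed runner. A minimal config.toml sketch, assuming the default 4 MB output_limit is what you are hitting (the runner name, url and executor here are placeholders):
# /etc/gitlab-runner/config.toml on the runner host
[[runners]]
  name = "my-runner"                     # placeholder
  url = "https://gitlab.example.com/"    # placeholder
  executor = "docker"
  # per-job log limit in kilobytes; default is 4096 (4 MB)
  output_limit = 16384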

Related

schema file at path is invalid because no version is specified

I run my dbt command in a loop and save its output to a .yml file. This works well and accurately generates a schema in my .yml file:
for file in models/l30_mart/*.sql; do
    table=$(basename "$file" .sql)
    dbt run-operation generate_model_yaml --args "{\"model_name\": \"$table\"}" > test.yml
done
However, in the example above, I am just saving the test.yml file in the root directory. When I try to save the file in another path, for example models/l30_mart/test.yml, it doesn't work:
for file in models/l30_mart/*.sql; do
    table=$(basename "$file" .sql)
    dbt run-operation generate_model_yaml --args "{\"model_name\": \"$table\"}" > models/l30_mart/test.yml
done
In this case, when I open the test.yml file, I see this:
12:06:42 Running with dbt=1.0.1
12:06:43 Encountered an error:
Compilation Error
The schema file at models/l30_mart/test.yml is invalid because no version is specified. Please consult the documentation for more information on schema.yml syntax:
https://docs.getdbt.com/docs/schemayml-files
It's probably a syntax mistake. Do I need quotes or brackets to specify the path? Why can't I save the .yml file in the same folder? Similarly, if I try something like this to save separate files named after each table, it also doesn't work:
for file in models/l30_mart/*.sql; do
    table=$(basename "$file" .sql)
    dbt run-operation generate_model_yaml --args "{\"model_name\": \"$table\"}" > models/l30_mart/$table.yml
done
In this case, the files either have this output:
20:39:44 Running with dbt=1.0.1
20:39:45 Encountered an error:
Compilation Error
The schema file at models/l30_mart/firsttable.yml is invalid because no version is specified. Please consult the documentation for more information on schema.yml syntax:
https://docs.getdbt.com/docs/schemayml-files
or this (e.g. in the second table's .yml file):
20:39:48 Running with dbt=1.0.1
20:39:49 Encountered an error:
Parsing Error
Error reading dbt_4flow: l30_mart/firstablename.yml - Runtime Error
Syntax error near line 2
------------------------------
1 | 20:39:44 Running with dbt=1.0.1
2 | 20:39:45 Encountered an error:
3 | Compilation Error
4 | The schema file at models/l30_mart/firsttablename.yml is invalid because no version is specified. Please consult the documentation for more information on schema.yml syntax:
5 |
Raw Error:
------------------------------
mapping values are not allowed in this context
in "<unicode string>", line 2, column 31
Note that the secondtablename.yml mentions the firsttablename.yml.
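For reference, dbt treats every .yml file under models/ as a schema/properties file and expects it to start with a version key, which is likely why a file that is empty or still being written triggers this error. A minimal valid schema file looks roughly like the following sketch (the model and column names here are illustrative):
version: 2

models:
  - name: firsttable
    description: "generated model documentation"
    columns:
      - name: id
        description: "primary key"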

'Wildcards' object has no attribute 'output'

I get an error for a rather simple rule. I have to write a task file for another program that expects a TSV file. I read a number of parameters from my config file and write them to a file with a shell command.
Code:
rule create_tasks:
    output:
        temp("tasks_{sample}.tsv")
    params:
        ID="{sample}",
        file=lambda wc: samples["path"][wc.sample],
        bigwig=lambda wc: samples["bigwig"][wc.sample],
        ambig=lambda wc: samples["ambig"][wc.sample]
    shell:
        'echo -e "{params.ID}\t{params.file}" > {output}'
When I execute the workflow, I get the following error:
Building DAG of jobs...
Using shell: /usr/bin/bash
Provided cluster nodes: 1
Job counts:
count jobs
1 create_tasks
1
[Mon Oct 12 14:48:15 2020]
rule create_tasks:
output: tasks_sampleA.tsv
jobid: 0
wildcards: sample=sampleA
echo -e "sampleA /Path/To/sampleA.bed " > tasks_sampleA.tsv
WorkflowError in line 23 of /path/to/workflow.snakefile:
'Wildcards' object has no attribute 'output'
File "/path/to/miniconda/envs/snakemake_submit/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 111, in run_jobs
File "/path/to/miniconda/envs/snakemake_submit/lib/python3.8/site-packages/snakemake/executors/__init__.py", line 1233, in run
I should mention that two of the variables are empty and that I expect the tabs/whitespace in the echo command.
Does anybody have an explanation for why Snakemake is trying to find output in the wildcards? I am especially confused because it prints the correct command.
I've run into this same problem.
The issue is probably in how you invoked Snakemake from the command line.
For example, this was my Snakefile rule:
rule sort:
    input:
        "{file}.bam",
    output:
        "{file}.sorted.bam",
        "{file}.sorted.bai",
    shell:
        "sambamba sort {input}"
I don't even have params or wildcards explicitly anywhere in there.
But when I run it on my Slurm HPC I get the same error:
snakemake -j 10 -c "sbatch {cluster.params}" -u cluster.yaml
The Wildcards (note the capital "W") and params objects weren't from the rule.
They came from the cluster execution of the rule, and the error was thrown when trying to parse the cluster.yaml file.
There was no cluster parameter specification in my cluster.yaml file for the sort rule, so the error was thrown.
I fixed this by adding
sort:
    params: "..."
to my cluster.yaml file.
In your case, add cluster submission options under a create_tasks: entry.
You can also add a __default__: entry with the default submission parameters for any job that does not match another rule.
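A minimal cluster.yaml sketch along those lines; the key names under each rule must match whatever {cluster.*} placeholders your --cluster command references, and the values here are illustrative:
__default__:
    params: "--partition=normal --time=01:00:00 --mem=2G"
create_tasks:
    params: "--partition=short --time=00:10:00 --mem=1G"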

Run ansible-lint through subdirectories within a gitlab role

I am trying to add a validation step to a gitlab repo holding a single ansible role (with no playbook).
The structure of the role looks like:
.gitlab-ci.yml
tasks/
templates/
files/
vars/
handlers/
With the .gitlab-ci.yml looking like:
stages:
  - lint

job-lint:
  image:
    name: cytopia/ansible-lint:latest
    entrypoint: ["/bin/sh", "-c"]
  stage: lint
  script:
    - ansible-lint --version
    - ansible-lint . -x 106 tasks/*.yml
I need to skip the naming rule, thus ignoring rule 106.
Otherwise, I would like all files in the repo to be checked. Since there is no playbook, lint has to be given the files that need to be checked... or at least, that is what I understood: I may have this point wrong. But anyway, if I give no name, lint returns OK but actually performs no check.
My problem is that I don't know how to tell it to check all the YAML files recursively, or even within a subdirectory. The above code returns an error:
ansible-lint: error: unrecognized arguments: tasks/deploy.yml tasks/localhost.yml tasks/main.yml tasks/managedata.yml tasks/psqlconf.yml
Any idea on how to check all the files from a subdirectory or through the whole role?
PS: I am using the cytopia image for ansible-lint, but I have no problem using another one, provided it's hosted on Docker Hub.
You should certainly be able to pass multiple YAML files as arguments to ansible-lint. I have version 4.1.1a0, and I'm able to use it like this, for example:
ansible-lint -x 106 roles/*/tasks/*.yml
I notice that you seem to have placed a . before your -x 106; that looks like an error. It doesn't look like ansible-lint will accept a directory name as an argument (it doesn't cause it to fail; it just doesn't accomplish anything).
I've tried this both with a locally installed ansible-lint and using the cytopia/ansible-lint image, which appears to perform identically:
docker run --rm -v $PWD:/src -w /src cytopia/ansible-lint -x 106 roles/*/tasks/*.yml
If you want to check all the YAML files, you can pipe find into xargs, something like this:
find ./ -not -name ".gitlab-ci.yml" -name "*.yml" | xargs ansible-lint -x 106
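Wired into the job from the question, that might look roughly like this (same cytopia image and skipped rule as above; the find pattern only matches *.yml, so add *.yaml too if your files use that extension):
job-lint:
  image:
    name: cytopia/ansible-lint:latest
    entrypoint: ["/bin/sh", "-c"]
  stage: lint
  script:
    - find ./ -not -name ".gitlab-ci.yml" -name "*.yml" | xargs ansible-lint -x 106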
However, ansible-lint -x 106 ./ should work. Are you sure that your role really has errors? I've tested it both on ansible-galaxy init generated roles (with meta and all that stuff) and on roles containing only a tasks directory, and it worked every time.
EDIT: I tried creating an error in an existing role, replacing "present" with "latest" in a package install task:
$ ansible-galaxy install geerlingguy.nfs
$ cd ~/.ansible/roles/geerlingguy.nfs
$ sed -i "s/present/latest/g" tasks/setup-RedHat.yml
$ ansible-lint ./
Examining tasks/main.yml of type tasks
Examining tasks/setup-Debian.yml of type tasks
Examining tasks/setup-RedHat.yml of type tasks
Examining handlers/main.yml of type handlers
Examining meta/main.yml of type meta
[403] Package installs should not use latest
tasks/setup-RedHat.yml:2
Task/Handler: Ensure NFS utilities are installed.
and it actually worked, so you may want to run with verbose output to check what is happening; maybe the rules applied to individual YAML files differ from those applied to whole roles.
When I ran my find-based check I got a lot of extra [204] Lines should be no longer than 160 chars warnings.

problem with snakemake submitting jobs with multiple wildcards on SGE

I used Snakemake on an LSF cluster before and everything worked just fine. However, I recently migrated to an SGE cluster and I am getting a very strange error when I try to run a job with more than one wildcard.
When I try to submit a job based on this rule
rule download_reads:
    threads: 1
    output: "data/{sp}/raw_reads/{accesion}_1.fastq.gz"
    shell: "scripts/download_reads.sh {wildcards.sp} {wildcards.accesion} data/{wildcards.sp}/raw_reads/{wildcards.accesion}"
I get the following error (snakemake_clust.sh details below):
./snakemake_clust.sh data/Ecol1/raw_reads/SRA123456_1.fastq.gz
Building DAG of jobs...
Using shell: /bin/bash
Provided cluster nodes: 10
Job counts:
count jobs
1 download_reads
1
[Thu Jul 30 12:08:57 2020]
rule download_reads:
output: data/Ecol1/raw_reads/SRA123456_1.fastq.gz
jobid: 0
wildcards: sp=Ecol1, accesion=SRA123456
scripts/download_reads.sh Ecol1 SRA123456 data/Ecol1/raw_reads/SRA123456
Unable to run job: ERROR! two files are specified for the same host
ERROR! two files are specified for the same host
Exiting.
Error submitting jobscript (exit code 1):
Shutting down, this might take some time.
When I replace the sp wildcard with a constant, it works as expected:
rule download_reads:
    threads: 1
    output: "data/Ecol1/raw_reads/{accesion}_1.fastq.gz"
    shell: "scripts/download_reads.sh Ecol1 {wildcards.accesion} data/Ecol1/raw_reads/{wildcards.accesion}"
I.e. I get
Submitted job 1 with external jobid 'Your job 50731 ("download_reads") has been submitted'.
I wonder why I have this problem; I am sure I used exactly the same rule on the LSF-based cluster before without any issue.
some details
The Snakemake submission script looks like this:
#!/usr/bin/env bash
mkdir -p logs
snakemake $# -p --jobs 10 --latency-wait 120 --cluster "qsub \
-N {rule} \
-pe smp64 \
{threads} \
-cwd \
-b y \
-o \"logs/{rule}.{wildcards}.out\" \
-e \"logs/{rule}.{wildcards}.err\""
-b y makes the command execute as-is, -cwd changes the working directory on the compute node to the working directory from which the job was submitted. The other flags / specifications are clear, I hope.
Also, I am aware of the --drmaa flag, but I think our cluster is not well configured for that. --cluster has so far been the more robust solution.
-- edit 1 --
When I execute exactly the same Snakefile locally (on the frontend, without the --cluster flag), the script runs as expected. It seems to be a problem in the interaction between Snakemake and the scheduler.
-o \"logs/{rule}.{wildcards}.out\" \
-e \"logs/{rule}.{wildcards}.err\""
This is a random guess... Multiple wildcards are concatenated with a space before being substituted into logs/{rule}.{wildcards}.err. So although you use double quotes, SGE treats the resulting string as two files and throws the error. What if you use single quotes instead? Like:
-o 'logs/{rule}.{wildcards}.out' \
-e 'logs/{rule}.{wildcards}.err'
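Applied to the submission script from the question, that would be an otherwise unchanged script with only the quoting of the -o/-e arguments swapped (untested sketch):
#!/usr/bin/env bash
mkdir -p logs
snakemake $# -p --jobs 10 --latency-wait 120 --cluster "qsub \
 -N {rule} \
 -pe smp64 \
 {threads} \
 -cwd \
 -b y \
 -o 'logs/{rule}.{wildcards}.out' \
 -e 'logs/{rule}.{wildcards}.err'"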
Alternatively, you could concatenate the wildcards in the rule and use the result on the command line. E.g.:
rule one:
    params:
        wc=lambda wc: '_'.join(wc)
    output: ...
Then use:
-o 'logs/{rule}.{params.wc}.out' \
-e 'logs/{rule}.{params.wc}.err'
(This second solution, if it works, kind of sucks though)
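For the download_reads rule from the question, that pattern might look roughly like the sketch below. It assumes all wildcard values are strings and that iterating the Wildcards object yields them in a sensible order, so treat the exact joined name as illustrative:
rule download_reads:
    threads: 1
    params:
        wc=lambda wc: '_'.join(wc)   # e.g. "Ecol1_SRA123456"
    output: "data/{sp}/raw_reads/{accesion}_1.fastq.gz"
    shell: "scripts/download_reads.sh {wildcards.sp} {wildcards.accesion} data/{wildcards.sp}/raw_reads/{wildcards.accesion}"
The submission command would then reference -o 'logs/{rule}.{params.wc}.out' and -e 'logs/{rule}.{params.wc}.err' instead of {wildcards}, as shown above.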

Postprocess drmemory error stacks with new symbols after process exits

After running a set of tests with Dr. Memory overnight, I am trying to resolve the error stacks by providing PDB symbols. The PDBs come from a large Samba-mapped repository, and using _NT_SYMBOL_PATH at runtime slowed things down too much.
Does anyone know of a tool that post-processes results.txt and pulls new symbols (via _NT_SYMBOL_PATH or otherwise) as required to produce more detailed stacks? If not, any hints for adapting asan_symbolize.py to do this?
https://llvm.org/svn/llvm-project/compiler-rt/trunk/lib/asan/scripts/asan_symbolize.py
What I came up with so far using dbghelp.dll is below. Works but could be better.
https://github.com/patraulea/postpdb
OK, this query does not pertain to the use of WinDbg and doesn't really have anything to do with _NT_SYMBOL_PATH.
Dr. Memory is a memory diagnostic tool akin to Valgrind; it is based on the DynamoRIO instrumentation framework and works on raw, unmodified binaries.
On Windows you can invoke it like drmemory.exe calc.exe from a command prompt (cmd.exe).
As soon as the binary finishes execution, a log file named results.txt is written to a default location.
If you have set up _NT_SYMBOL_PATH, Dr. Memory honors it and resolves symbol information from pre-pulled symbol files (viz. *.pdb). It does not seem to download files from the MS symbol server; it appears to ignore the SRV* cache and use only the downstream symbol folder.
So if the PDB file is missing or hasn't been downloaded yet, results.txt will contain stack frames like
# 6 USER32.dll!gapfnScSendMessage +0x1ce (0x75fdc4e7 <USER32.dll+0x1c4e7>)
# 7 USER32.dll!gapfnScSendMessage +0x2ce (0x75fdc5e7 <USER32.dll+0x1c5e7>)
while if the symbol file were available it would show
# 6 USER32.dll!InternalCallWinProc
# 7 USER32.dll!UserCallWinProcCheckWow
So basically you need the symbol files for the application in question.
As I commented, you need to fetch the symbols for the exe in question.
You can use symchk on a running process to create a manifest file, and you can use symchk on a machine that is connected to the internet to download the symbols and copy them to a local folder on a non-internet machine, then point _NT_SYMBOL_PATH to that folder.
>tlist | grep calc.exe
1772 calc.exe Calculator
>symchk /om calcsyms.txt /ip 1772
SYMCHK: GdiPlus.dll FAILED - MicrosoftWindowsGdiPlus-
1.1.7601.17514-gdiplus.pdb mismatched or not found
SYMCHK: FAILED files = 1
SYMCHK: PASSED + IGNORED files = 27
>head -n 4 calcsyms.txt
calc.pdb,971D2945E998438C847643A9DB39C88E2,1
calc.exe,4ce7979dc0000,1
ntdll.pdb,120028FA453F4CD5A6A404EC37396A582,1
ntdll.dll,4ce7b96e13c000,1
>tail -n 4 calcsyms.txt
CLBCatQ.pdb,00A720C79BAC402295B6EBDC147257182,1
clbcatq.dll,4a5bd9b183000,1
oleacc.pdb,67620D076A2E43C5A18ECD5AF77AADBE2,1
oleacc.dll,4a5bdac83c000,1
So, assuming you have fetched the symbols, the easiest option is to rerun the tests with locally cached copies of the symbol files.
If you have fetched the symbols but cannot rerun the tests and have to work solely with the output from results.txt, you have some text-processing work ahead (sed, grep, awk, or a custom parser).
The Dr. Memory suite comes with symquery.exe in the bin folder, which can be used to resolve the symbols from results.txt.
In the example above you can see the offset relative to the module base, e.g. 0x1c4e7 in the line
# 6 USER32.dll!gapfnScSendMessage +0x1ce (0x75fdc4e7 <USER32.dll+0x1c4e7>)
So for each line in results.txt you have to parse out the offset and invoke symquery on the module, as below.
:\>symquery.exe -f -e c:\Windows\System32\user32.dll -a +0x1c4e7
InternalCallWinProc+0x23
??:0
:\>symquery.exe -f -e c:\Windows\System32\user32.dll -a +0x1c5e7
UserCallWinProcCheckWow+0xb3
A simple text-processing example from a results.txt, with trimmed output:
:\>grep "^#" results.txt | sed s/".*<"//g
# 0 system call NtUserBuildPropList parameter #2
USER32.dll+0x649d9>)
snip
COMCTL32.dll+0x2f443>)
Notice the COMCTL32.dll: there is a default comctl32.dll in system32 and several others in winsxs, so you have to consult the other log files such as global.log to see the actual DLL load path.
symquery.exe -f -e c:\Windows\winsxs\x86_microsoft.windows.common-
controls_6595b64144ccf1df_6.0.7601.17514_none_41e6975e2bd6f2b2\comctl32.dll -a +0x2f443
CallOriginalWndProc+0x1a
??:0
symquery.exe -f -e c:\Windows\system32\comctl32.dll -a +0x2f443
DrawInsert+0x120 <----- wrong symbol due to wrong module (late binding / forwarding reasons)
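Putting those pieces together, a rough post-processing loop might look like the sketch below. It assumes the frames in results.txt keep the <module+0xoffset> form shown above and that symquery.exe from the Dr. Memory bin folder is on PATH; the module-name-to-path mapping is deliberately naive (see the winsxs caveat above), so treat it as illustrative only.
# crude re-symbolizer sketch: extract unique <module+0xoffset> frames and feed them to symquery.exe
grep -o "<[^>]*+0x[0-9a-fA-F]*>" results.txt | tr -d '<>' | sort -u |
while IFS='+' read -r module offset; do
    # naive module-name-to-path mapping; adjust per global.log for winsxs DLLs
    modpath="c:/Windows/System32/$module"
    echo "== $module+$offset =="
    symquery.exe -f -e "$modpath" -a "+$offset"
done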
