I have a task to count how many files exist. Is there something wrong with the syntax?
task:
- name: "Getting Local File Count"
  kubernetes.core.k8s_exec:
    namespace: "{{namespace}}"
    pod: "{{pod}}"
    command: "find '{{local_dir}}' -type f | wc -l"
  register: command_status
but after executing the playbook I get
"find: paths must precede expression: |",
That's a common mistake for people coming from Docker's CMD, which has two forms: free text (shell form) and exec form. In Kubernetes descriptors, it's always the exec form. If you want shell features such as pipes, &&, functions, or redirection, you must ask for them explicitly via an sh -c style construct.
The bad news is that the Ansible module's command: argument doesn't accept a list[str]; the string is split with shlex.split instead. So you either have to copy the complicated command into the pod as a script and use command: /path/to/my/script.sh, or take your chances with shlex, which does seem to understand sh -c quoting:
>>> shlex.split("""sh -c "find '{{local_dir}}' -type f | wc -l" """)
['sh', '-c', "find '{{local_dir}}' -type f | wc -l"]
making your ansible parameter look like:
command: >-
  sh -ec "find '{{local_dir}}' -type f | wc -l"
As always when using Jinja2 templates in a shell context, the safer choice is to use | quote rather than wrapping the value in single quotes and hoping no one puts a single quote in local_dir. In this case you need to be extra cautious, since it is a shell literal inside a shell literal :-(
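To make the nested-quoting concern concrete, here is a minimal Python sketch (it is not part of the Ansible module) that builds the same sh -c string with shlex.quote applied at both levels and then checks how shlex.split tokenizes it. The helper name build_exec_command and the sample directory are made up for illustration.

```python
import shlex


def build_exec_command(local_dir):
    """Return one string that shlex.split turns into ['sh', '-c', <script>].

    Hypothetical helper: quoting is applied twice because the directory is a
    shell literal inside another shell literal.
    """
    # Inner level: protect the directory from the shell started by `sh -c`.
    script = f"find {shlex.quote(local_dir)} -type f | wc -l"
    # Outer level: protect the whole script from the first round of splitting.
    return f"sh -c {shlex.quote(script)}"


command = build_exec_command("/data/it's here")
argv = shlex.split(command)
print(argv)
```

The directory deliberately contains a single quote and a space. After the first split the script arrives as one argument, and the inner quoting still keeps the directory in one piece when the script itself is parsed.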
Oftentimes I encounter commands being executed with /bin/bash -c or /bin/sh -c instead of directly. For example, instead of cp /tmp/file1 /tmp/file2, it'd be /bin/bash -c "cp /tmp/file1 /tmp/file2".
What are some reasons for doing this instead of executing the command directly? In recent memory, I've seen this the most in Docker and K8s commands. The only thing I can really think of is because you specifically want to run the command with a specific shell, but this seems like a pretty rare/niche use-case?
Here is a specific example, the k8s deployment uses:
command: ["/bin/sh"]
args: ["-c", ". /config/dynamicenv.sh && /app/bin/docker-entrypoint server"]
Instead of what I would expect the default to be:
. /config/dynamicenv.sh && /app/bin/docker-entrypoint server
Without specific examples it's hard to tell, but a common reason for doing this is that you want to make use of shell i/o redirection, pipes, etc. For example, this fragment of a Kubernetes pod manifest would fail because it involves a pipe, which requires the shell to execute the command line:
containers:
  image: docker.io/alpine:latest
  command:
    - echo hello world | sed s/world/container/
But this would work:
containers:
  image: docker.io/alpine:latest
  command:
    - /bin/sh
    - -c
    - echo hello world | sed s/world/container/
This is one relatively common situation in which you'll see things explicitly execute with a shell. If you'd like to update your question with some specific examples, we can provide a more thorough answer.
Your example is very close to what I've already included here in my answer. The command . /config/dynamicenv.sh && /app/bin/docker-entrypoint server isn't a simple command; it's a shell script that makes use of both the . and the && operators.
If they were to write:
command: [". /config/dynamicenv.sh && /app/bin/docker-entrypoint server"]
It would fail with an error along the lines of:
exec: "[\". /config/dynamicenv.sh && /app/bin/docker-entrypoint server\"]": stat [". /config/dynamicenv.sh && /app/bin/docker-entrypoint server"]: no such file or directory: unknown.
The command needs to be wrapped with sh -c in order to execute correctly.
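The same distinction can be reproduced outside Kubernetes. The sketch below uses Python's subprocess module, which also executes an argument list without a shell, as a stand-in for the container runtime. It assumes /bin/sh, echo and sed are available on the machine running it; the helper name run is just for illustration.

```python
import subprocess

SCRIPT = "echo hello world | sed s/world/container/"


def run(argv):
    """Run argv without a shell, like the exec form; return stdout or an error note."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True, check=False)
    except OSError as exc:
        # No program is literally named "echo hello world | sed ...".
        return f"exec failed: {exc.strerror}"
    return result.stdout


exec_form = run([SCRIPT])                    # whole line treated as a program name
shell_form = run(["/bin/sh", "-c", SCRIPT])  # the shell interprets the pipe

print(exec_form)
print(shell_form)
```

Only the second call involves a shell, so only there is the | treated as a pipe.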
I suppose the answer here might be trivial, but it might require some intricate bash knowledge. I have been browsing bash docs for a few hours now and can't seem to find the answer.
I'm working on a python repository, and came up with a simple script to lint only the files that differ between the current branch and master. Here's the minimal working example, extracted from said script (lint.sh):
#!/bin/bash
paths=$(git diff --name-only -r origin/master...HEAD | grep \.py$)
flake8 $paths
For testing purposes, let's say I only committed one file, bad.py, with the following contents:
hello
there
The expected output of bash lint.sh is:
bad.py:1:1: F821 undefined name 'hello'
bad.py:2:1: F821 undefined name 'there'
However, the output is empty. When run in debug mode, bash shows the following commands:
++ git diff --name-only -r origin/master...HEAD
++ grep '.py$'
+ paths='bad.py'
+ flake8 'bad.py'
Which is what I expect. Also, when I simply run flake8 bad.py, the output is as expected.
I expect this might have something to do with parameter passing which varies between different bash versions. The output of bash --version:
GNU bash, version 4.4.23(1)-release (x86_64-apple-darwin17.5.0)
I will appreciate all insights
Very sorry this isn't exactly an answer, but it surely didn't fit in a comment!
The hint here to me is the following:
+ paths='bad.py'
+ flake8 'bad.py'
In my execution of the same script, I get the following:
$ bash -x lint.sh
++ git diff --name-only -r origin/master...HEAD
++ grep '.py$'
+ paths=bar.py
+ flake8 bar.py
bar.py:1:1: F821 undefined name 'hello'
bar.py:2:1: F821 undefined name 'world'
Notice here how my output does not contain quotes around the filename or the assignment. bash won't usually add quotes unless they are necessary. What this tells me is there's probably some sort of control character in that string (my best guess is either colors or \b + some other characters (this might be one of the few cases where a screenshot is actually helpful!)).
Here's one way that I was able to reproduce your findings:
mkdir -p bin
cat > bin/grep << EOF
#!/usr/bin/env bash
exec /bin/grep --color=always "\$@"
EOF
chmod +x bin/grep
# simulate having this `grep` on your path
PATH=$PWD/bin:$PATH bash -x lint.sh
(While this seems like an odd thing to do, in the past I've put my own grep in ~/bin so I could add --line-buffered --color=auto now that GREP_OPTIONS is deprecated. One might erroneously add --color=always and have it work... for the most part. Today I use an alias instead, since I ran into sharp edges even with that.)
The output in that case matches yours above:
$ PATH=$PWD/bin:$PATH bash -x lint.sh
++ git diff --name-only -r origin/master...HEAD
++ grep '.py$'
+ paths='bar.py'
+ flake8 'bar.py'
But the tricky hint is in the highlighting
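If no screenshot is at hand, the hidden characters can also be made visible programmatically. The following Python sketch assumes a typical shape for what grep --color=always might emit around the matched .py suffix (the exact escape sequences depend on your grep and its configuration) and shows that the string no longer equals the plain filename until the escapes are stripped.

```python
import re

# Matches ANSI CSI escape sequences such as ESC[01;31m and ESC[K.
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")

# Assumed example of what `grep --color=always '.py$'` might emit for bad.py.
colored = "bad\x1b[01;31m\x1b[K.py\x1b[m\x1b[K"

print(repr(colored))                    # repr() makes the hidden bytes visible
cleaned = ANSI_ESCAPE.sub("", colored)
print(cleaned == "bad.py", colored == "bad.py")
```

flake8 receives the colored string as the path, which names a file that does not exist, hence the empty output.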
addendum
While unrelated to your problem, here's probably a better way to accomplish what you want:
# if you have GNU xargs
git diff -z --name-only origin/master...HEAD -- '*.py' | xargs --null --no-run-if-empty flake8
# if you need to be more portable (I see you're probably on macos)
git diff -z --name-only origin/master...HEAD -- '*.py' | xargs -0 flake8 /dev/null
Explanation of the different parts:
git diff -z: output filenames delimited by null bytes. This prevents incorrect splitting if filenames contain spaces or other special characters
xargs --null: split the input on null bytes when splatting arguments
xargs --no-run-if-empty: don't run the executable at all if there are no arguments (this is a GNU extension)
xargs -0: same as xargs --null; if you're stuck with non-GNU xargs you won't have access to the long options
flake8 /dev/null: this is a sneaky trick. Since BSD xargs has no "no run if empty" option, it will always invoke flake8. If flake8 is invoked with zero arguments, it defaults to recursing through your current working directory (and linting all your files). Putting /dev/null at the beginning prevents this behaviour and lints an empty file instead!
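For comparison, the same pipeline can be sketched in Python, where the null-delimited output is split explicitly and the "no run if empty" case is an ordinary if. This is only an illustration under the assumption that git and flake8 are on PATH; the function names are invented here, and only split_nul is exercised at the bottom.

```python
import os
import subprocess


def split_nul(output):
    """Split NUL-delimited bytes (as produced by `git diff -z`) into paths."""
    return [os.fsdecode(part) for part in output.split(b"\0") if part]


def changed_python_files(base="origin/master"):
    """List *.py files that differ from `base`; requires a git checkout."""
    result = subprocess.run(
        ["git", "diff", "-z", "--name-only", f"{base}...HEAD", "--", "*.py"],
        capture_output=True,
        check=True,
    )
    return split_nul(result.stdout)


def lint_changed_files():
    """Run flake8 on the changed files, or do nothing if there are none."""
    paths = changed_python_files()
    if not paths:
        return 0  # same effect as --no-run-if-empty
    return subprocess.run(["flake8", *paths]).returncode


print(split_nul(b"pkg/a file.py\0b.py\0"))
```

Because the paths are passed as a list, a filename containing a space stays a single argument.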
Addendum 2: you might want to consider using a git hooks framework to handle all of this for you. I maintain pre-commit, which aims to smooth out a lot of the rough edges around git (such as this one!).
I'm having difficulty trying to pass a bash parameter to find as arguments.
Best explained via code:
$ echo $BASH_VERSION
4.1.2(1)-release
$ find /etc -path '*/init.d' 2>/dev/null | wc -l
2
$ args="-path '*/init.d'"
$ echo $args
-path '*/init.d'
$ find /etc $args 2>/dev/null | wc -l
0
$ set -x; !!
set -x; find /etc $args 2>/dev/null | wc -l
+ wc -l
+ find /etc -path ''\''*/init.d'\'''
0
Where are the extra '' quotes coming from in the set -x output?
I've tried a number of different ways to get this to work but haven't had success so far. This guide seems to indicate that this should be possible, since parameter expansion takes place before command execution: http://stuff.lhunath.com/parser.png
Use an array:
args=( -path '*/init.d' )
find /etc "${args[@]}"
The reason the other approach doesn't work is discussed in detail in BashFAQ #50. In short: Quote characters are handled during syntax parsing -- before parameter expansion occurs. Thus, when quote characters come from parameter expansion, they're not parsed as syntax, but treated as any other data: Run through string-splitting and glob expansion, and passed through to the program being run.
The extra quotes/escaping in set -x output are just how the shell tells you that your primary quotes are data, distinguishing them from quotes that are present as syntax.
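The difference between quotes as syntax and quotes as data can be imitated in Python. In this rough sketch, str.split stands in for the word splitting bash applies to an unquoted $args (bash would additionally attempt glob expansion on each word), while shlex.split stands in for the parsing bash applies to text typed directly on the command line.

```python
import shlex

args = "-path '*/init.d'"

# Roughly what an unquoted $args gives find: split on whitespace only,
# so the quote characters stay in the word as literal data.
as_expanded = args.split()

# Roughly what typing the same text on the command line gives: quotes are
# parsed as syntax and removed.
as_typed = shlex.split(args)

print(as_expanded)
print(as_typed)
```

find never matches a path that literally contains the quote characters, which is why the first form returns nothing.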
I am trying to grep for a string as below but am running into the error shown below. Can anyone suggest how to fix it?
find . | xargs grep 'bin data doesn't exist for HY11' -sl
Error:-
xargs: unmatched single quote; by default quotes are special to xargs unless you use the -0 option
Your grep pattern contains a quotation mark!
Use double quotes around the pattern: "bin data doesn't exist for HY11" rather than 'bin ... HY11'.
You also want to add -print0 to the find command, and -0 to xargs.
The better way is to do this all directly:
find . -type f -exec grep -H "bin data doesn't exist for HY11" "{}" "+"
That doesn't even need xargs.
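The quoting problem can be reproduced with Python's shlex, which follows POSIX-like quoting rules. This is only an analogy for how a shell-style parser sees the two spellings of the grep command, not a reimplementation of bash or xargs; try_split is a helper invented for this sketch.

```python
import shlex

single_quoted = "grep 'bin data doesn't exist for HY11' -sl"
double_quoted = 'grep "bin data doesn\'t exist for HY11" -sl'


def try_split(command):
    """Return the parsed words, or the parser's complaint as a string."""
    try:
        return shlex.split(command)
    except ValueError as exc:
        return f"error: {exc}"


print(try_split(single_quoted))
print(try_split(double_quoted))
```

With single quotes, the apostrophe in doesn't closes the quoted string early and the final quote is left unmatched. With double quotes, the apostrophe is ordinary data and the pattern stays one word.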
If you have GNU Parallel you can run:
find . | parallel -X -q grep "bin data doesn't exist for HY11" -sl
All new computers have multiple cores, but most programs are serial in nature and will therefore not use the multiple cores. However, many tasks are extremely parallelizable:
Run the same program on many files
Run the same program for every line in a file
Run the same program for every block in a file
GNU Parallel is a general parallelizer and makes it easy to run jobs in parallel on the same machine or on multiple machines you have ssh access to.
If you have 32 different jobs you want to run on 4 CPUs, a straightforward way to parallelize is to run 8 jobs on each CPU.
GNU Parallel instead spawns a new process when one finishes, keeping the CPUs active and thus saving time.
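The difference between the two strategies can be illustrated with a small simulation. This Python sketch is not how GNU Parallel is implemented; it only compares the total elapsed time of a fixed 8-jobs-per-CPU split with that of handing out jobs as CPUs become free, for a hypothetical workload whose job durations are invented for the example.

```python
import heapq


def static_makespan(durations, workers):
    """Split jobs into equal consecutive chunks, one chunk per worker.

    Assumes len(durations) is a multiple of workers.
    """
    chunk = len(durations) // workers
    return max(sum(durations[i * chunk:(i + 1) * chunk]) for i in range(workers))


def dynamic_makespan(durations, workers):
    """Give each job, in order, to whichever worker becomes free first."""
    free_at = [0] * workers
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + duration)
    return max(free_at)


# Hypothetical workload: 32 jobs taking 1, 2, ..., 32 time units.
jobs = list(range(1, 33))
print(static_makespan(jobs, 4), dynamic_makespan(jobs, 4))
```

For this workload the fixed split leaves three CPUs idle while the fourth works through the longest jobs, so the dynamic strategy finishes sooner.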
Installation
A personal installation does not require root access. It can be done in 10 seconds by doing this:
(wget -O - pi.dk/3 || curl pi.dk/3/ || fetch -o - http://pi.dk/3) | bash
For other installation options see http://git.savannah.gnu.org/cgit/parallel.git/tree/README
Learn more
See more examples: http://www.gnu.org/software/parallel/man.html
Watch the intro videos: https://www.youtube.com/playlist?list=PL284C9FF2488BC6D1
Walk through the tutorial: http://www.gnu.org/software/parallel/parallel_tutorial.html
Sign up for the email list to get support: https://lists.gnu.org/mailman/listinfo/parallel
I used to use fswatch v0.0.2 like so (in this instance to run the Django test suite when a file changed):
$>fswatch . 'python manage.py test'
this works fine.
I wanted to exclude some files that were causing the tests to run more than once per save (Sublime Text was saving a .tmp file, and I suspect .pyc files were also causing this).
So I upgraded fswatch to get the -e option.
However, the way fswatch works has changed, which is causing me trouble. It is now meant to be piped into another program, like so:
$>fswatch . | xargs -n1 program
I can't figure out how to pass in arguments to the program here. e.g. this does not work:
$>fswatch . | xargs -n1 python manage.py test
nor does this:
$>fswatch . | xargs -n1 'python manage.py test'
How can I do this without packaging up my command in a bash script?
The fswatch documentation (the Texinfo manual, the wiki, and the README) has examples of how this is done:
$ fswatch [opts] -0 -o path ... | xargs -0 -n1 -I{} your full command goes here
Pitfalls:
xargs -0, fswatch -0: use them together so that paths containing newlines are interpreted correctly.
fswatch -o: use it to have fswatch "bubble" all the events in a set into a single record, printing only the number of events in the set.
-I{}: specifying a placeholder is the trick you missed. With it, xargs substitutes the record only where {} appears, so when you do not want the record (here, since -o was used, the number of events in the set) passed to the command, you leave {} out and your command's arguments are interpreted as written.
An alternative answer that doesn't fight xargs' reason for being, which is passing its input on as arguments to the command being run:
fswatch . | (while read; do python manage.py test; done)
Which is still a bit wordy/syntaxy, so I have created a super simple bash script fswatch-do that simplifies things for me:
#!/bin/bash
(while read; do "$@"; done)
usage:
fswatch -r -o -e 'pyc' somepath | fswatch-do python manage.py test someapp.SomeAppTestCase
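The same helper can be sketched in Python for anyone who would rather not rely on bash. This is an assumed equivalent, not part of fswatch: it runs the given command once per line read from standard input and ignores what the line says. The script name fswatch_do.py in the comment is hypothetical.

```python
import subprocess
import sys


def run_for_each_line(lines, argv, run=subprocess.call):
    """Run argv once per input line, ignoring the line's content."""
    runs = 0
    for _ in lines:
        run(argv)
        runs += 1
    return runs


def main():
    # Intended use (hypothetical script name):
    #   fswatch -o somepath | python fswatch_do.py python manage.py test
    return run_for_each_line(sys.stdin, sys.argv[1:])
```

The run parameter exists only so the loop can be exercised without starting real processes; calling main() from a script gives the behaviour of the bash one-liner.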