Is there a visual bash pipeline editor - bash

I have plenty of bash scripts with various files being piped into various scripts and it does my head in a bit.
I wondered if there was a way of visualising the pipeline in a bash script so I can easily see the flow.

You can add newlines after the pipe, and bash will continue to see it as a single pipeline:
foo | bar | baz | qux
can be written as
foo |
bar |
baz |
qux
Or, use line continuations, if the look appeals more:
foo \
| bar \
| baz \
| qux
Newlines are acceptable after |, && and ||
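As a quick check that the two layouts are equivalent, here is a minimal sketch that uses standard tools in place of the placeholder commands foo, bar, baz and qux. The sample words and the variable names one_line and multi_line are only illustrative. It also shows that a trailing comment may follow each pipe, which helps document a long pipeline.

```shell
#!/usr/bin/env bash
# Find the most frequent word, written on one line...
one_line=$(printf '%s\n' pear apple pear fig | sort | uniq -c | sort -rn | head -n 1)

# ...and the same pipeline with one stage per line.
multi_line=$(
  printf '%s\n' pear apple pear fig |   # illustrative sample input
    sort |                              # group identical words together
    uniq -c |                           # count each group
    sort -rn |                          # highest count first
    head -n 1                           # keep the top entry
)

[ "$one_line" = "$multi_line" ] && echo "same result: $multi_line"
```

Both forms run the same five commands; only the layout differs.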

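Try pv. It shows you how much data is being transferred through each stage of the pipe.
pv prints its progress bars to stderr, so the data on stdout passes through unchanged. -N labels each bar, and -c uses cursor positioning so that several pv instances in one pipeline do not overwrite each other. In your case, you could use: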
foo | pv -cN foo |
bar | pv -cN bar |
baz | pv -cN baz |
qux | pv -cN qux
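If pv is not installed, a rough alternative for inspecting a pipeline is to copy each intermediate stream to a file with tee and look at the files afterwards. This is a different technique from pv (snapshots rather than live progress). The commands, sample data, and the stage1/stage2 file names below are stand-ins for illustration.

```shell
#!/usr/bin/env bash
tmp=$(mktemp -d)   # scratch directory for the per-stage snapshots

result=$(
  printf '%s\n' 3 1 2 1 |
    tee "$tmp/stage1" |   # snapshot of the raw input
    sort -n |
    tee "$tmp/stage2" |   # snapshot after sorting
    uniq
)

# Report how many lines passed through each stage.
for f in "$tmp"/stage1 "$tmp"/stage2; do
  printf '%s: %d lines\n' "${f##*/}" "$(wc -l < "$f")"
done
printf 'final output:\n%s\n' "$result"
```

The snapshots cost disk space proportional to the data, so this suits debugging on small samples rather than production runs.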

Related

yq v4: print all key value pairs with full key path

I'm trying to determine the correct syntax for using yq to print all key/value pairs from a given yaml input using yq v4 - with the desired output having the full key "path". This was possible using v3 such as this:
$ cat << EOF | yq r -p pv - '**'
> a:
>   b: foo
>   c: bar
> EOF
a.b: foo
a.c: bar
but I'm having difficulty wrapping my head around the new syntax.
Any help is greatly appreciated.
$ cat << EOF | yq e '.. | select(. == "*") | {(path | join(".")): .} ' -
> a:
>   b: foo
>   c: bar
> EOF
a.b: foo
a.c: bar
What does this do? Let's go over it:
.. recursively select all values
select(. == "*") filter for scalar values (i.e. filter out the value of a)
(path | join(".")) gets the path as array and joins the elements with .
{…: .} create a mapping, having the joined paths as keys and their values as values
Edit: to get sequence indexes in square brackets ([0] etc), do
$ cat << EOF | yq e '.. | select(. == "*") | {(path | . as $x | (.[] | select((. | tag) == "!!int") |= (["[", ., "]"] | join(""))) | $x | join(".") | sub(".\[", "[")): .} ' -
This seems like there should be a simpler way to do it, but I don't know yq well enough to figure it out.

yq v4 get root keys based on existence of deeper keys

I have this structure:
foo:
  image: 123
bar:
  image: 456
baz:
  config: "my config"
and I'd like to print the root keys (i.e. foo, bar, baz) based on the existence of the child "image"
In yq version 3 I could do this:
$ yq read test.yaml --printMode p "*.image" | awk -F'.' '{print $1}'
foo
bar
But I can't find the equivalent in v4. The yq + jq solution would be:
$ yq -j e test.yaml | jq -r 'to_entries[] | select(.value | has("image")) | [.key][]'
foo
bar
Any idea how to do this with yq v4?
You can use the path operator to get the path of each matching object that contains the key image:
yq e '.[] | select(has("image")) | path | .[]' test.yaml

bash get dirname from urls.txt

$ cat urls.txt
/var/www/example.com.com/upload/email/email-inliner.html
/var/www/example.com.com/upload/email/email.html
/var/www/example.com.com/upload/email/email2-inliner.html
/var/www/example.com.com/upload/email/email2.html
/var/www/example.com.com/upload/email/AquaTrainingBag.png
/var/www/example.com.com/upload/email/fitex/fitex-ecr7.jpg
/var/www/example.com.com/upload/email/fitex/fitex-ect7.jpg
/var/www/example.com.com/upload/email/fitex/fitex-ecu7.jpg
/var/www/example.com.com/upload/email/fitex/fitex.html
/var/www/example.com.com/upload/email/fitex/logo.png
/var/www/example.com.com/upload/email/fitex/form.html
/var/www/example.com.com/upload/email/fitex/fitex.txt
/var/www/example.com.com/upload/email/bigsale.html
/var/www/example.com.com/upload/email/logo.png
/var/www/example.com.com/upload/email/bigsale.png
/var/www/example.com.com/upload/email/bigsale-shop.html
/var/www/example.com.com/upload/email/bigsale.txt
Can anyone help me get the dirname of each of these paths?
dirname /var/www/example.com.com/upload/email/sss.png works fine, but what about a whole list of URLs?
Is it possible to do this without any form of loop (for or while)? The number of URLs can exceed several tens of millions. Ideally the result would be redirected (tee) to a file.
As always when it boils down to things like this, Awk comes to the rescue:
awk 'BEGIN{FS=OFS="/"}{NF--}1' <file>
Be aware that this is an extremely simplified version of dirname and does not behave identically to dirname in every case, but it will work for most inputs. A correct version, which covers all cases, is:
awk 'BEGIN{FS=OFS="/"}{gsub("/+","/")}
{s=$0~/^\//;NF-=$NF?1:2;$0=$0?$0:(s?"/":".")};1' <file>
The following table shows the difference:
| path | dirname | awk full | awk short |
|------------+---------+----------+-----------|
| . | . | . | |
| / | / | / | |
| foo | . | . | |
| foo/ | . | . | foo |
| foo/bar | foo | foo | foo |
| foo/bar/ | foo | foo | foo/bar |
| /foo | / | / | |
| /foo/ | / | / | /foo |
| /foo/bar | /foo | /foo | /foo |
| /foo/bar/ | /foo | /foo | /foo/bar |
| /foo///bar | /foo | /foo | /foo// |
note: various alternative solutions can be found in Extracting directory name from an absolute path using sed or awk. The solutions of Kent will all work, the solution of Solid Kim just needs a tiny tweak to fix the multiple slashes (and misses upvotes!)
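To see the short awk version at work, here is a small sketch that runs it on three of the paths from urls.txt and compares the result with dirname itself. The variable names are illustrative, the while loop is used only to produce the reference output for this tiny sample, and the sketch assumes an awk that rebuilds the record when NF is decremented (gawk and mawk do).

```shell
#!/usr/bin/env bash
# Three sample paths taken from urls.txt above.
paths='/var/www/example.com.com/upload/email/email.html
/var/www/example.com.com/upload/email/fitex/logo.png
/var/www/example.com.com/upload/email/logo.png'

# Short awk version: drop the last /-separated field of every line.
awk_dirs=$(printf '%s\n' "$paths" | awk 'BEGIN{FS=OFS="/"}{NF--}1')

# Reference result from dirname; a loop is fine for three lines.
ref_dirs=$(printf '%s\n' "$paths" | while IFS= read -r p; do dirname "$p"; done)

if [ "$awk_dirs" = "$ref_dirs" ]; then
  echo "awk and dirname agree on this sample"
fi

# Collapse the result to the unique directories.
printf '%s\n' "$awk_dirs" | sort -u
```

For the full list, the same awk command can read urls.txt directly and have its output redirected to a file.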

How to redirect stderr to a file for the whole pipe?

I am running a command like this:
mycmd1 | mycmd2 | mycmd3 | lp
Is there a way to redirect stderr to a file for the whole pipe instead of repeating it for each command?
That is to say, I'd rather avoid doing this:
mycmd1 2>/myfile | mycmd2 2>/myfile | mycmd3 2>/myfile | lp 2>/myfile
Either
{ mycmd1 | mycmd2 | mycmd3 | lp; } 2>> logfile
or
( mycmd1 | mycmd2 | mycmd3 | lp ) 2>> logfile
will work. (The first version might have a slightly faster (~1ms) startup time, depending on the shell.)
I tried the following, and it seems to work:
(mycmd1 | mycmd2 | mycmd3 | lp) 2>>/var/log/mylogfile.log
I use >> because I want to append to the logfile rather than overwriting it every time.
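The grouping form can be tried without the real commands. In this minimal sketch, stage1 and stage2 are hypothetical stand-ins for mycmd1 and mycmd2, and a temporary file stands in for the log file; each stage writes one diagnostic line to stderr.

```shell
#!/usr/bin/env bash
log=$(mktemp)   # stand-in for the real log file

# Hypothetical stages that each write a diagnostic to stderr.
stage1() { echo "stage1: warning" >&2; echo "data"; }
stage2() { echo "stage2: warning" >&2; tr 'a-z' 'A-Z'; }

# One redirection on the group covers every command inside it.
out=$( { stage1 | stage2; } 2>>"$log" )

echo "stdout: $out"
echo "captured stderr:"
cat "$log"
```

Because the stages run concurrently, the order of the lines in the log is not guaranteed.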

How to delete lines with same middle block?

This is probably an easy question for command-line ninjas, but I can't figure it out for the life of me. As of now, I'm using a PHP script to accomplish this, but I need to do it using awk/sed/cut or similar.
I have got a log file like this:
123 | foo | 12.13
756 | bar | 14.25
236 | baz | 11.23
536 | foo | 10.13
947 | bar | 34.25
134 | baz | 11.26
I need to delete all lines that share the same middle element. If there is a duplicate, the newer (later) line needs to be kept. The output of the file after the removal should look like:
536 | foo | 10.13
947 | bar | 34.25
134 | baz | 11.26
I'm new to this and have no idea how to do this, so a little nudge in the right direction would be of great help.
$ tac file | awk -F' +[|] +' '!seen[$2]++' | tac
536 | foo | 10.13
947 | bar | 34.25
134 | baz | 11.26
or if you prefer an awk-only solution:
$ awk -F' +[|] +' 'NR==FNR{fnr[$2]=FNR; next} FNR==fnr[$2]' file file
536 | foo | 10.13
947 | bar | 34.25
134 | baz | 11.26
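The two-pass awk command is easier to follow when spread over several lines with comments. This sketch writes the sample log to a temporary file (an assumption, so the example is self-contained) and renames the array to last for readability; the logic is otherwise the same.

```shell
#!/usr/bin/env bash
file=$(mktemp)
cat > "$file" <<'EOF'
123 | foo | 12.13
756 | bar | 14.25
236 | baz | 11.23
536 | foo | 10.13
947 | bar | 34.25
134 | baz | 11.26
EOF

# The file is read twice: NR == FNR is true only during the first read.
kept=$(awk -F' +[|] +' '
  NR == FNR { last[$2] = FNR; next }   # pass 1: remember the last line number per key
  FNR == last[$2]                      # pass 2: print only those last occurrences
' "$file" "$file")

printf '%s\n' "$kept"
```

Reading the file twice keeps the original line order without holding every line in memory, at the cost of a second pass over the input.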
You can use this awk command using a custom field separator:
awk -F' *\\| *' '!data[$2]{a[++k]=$2} {data[$2]=$0}
END{for (i=1; i<=k; i++) print data[a[i]]}' file
536 | foo | 10.13
947 | bar | 34.25
134 | baz | 11.26
If you don't care about the output order
perl -F'\s*\|\s*' -lanE '$s{$F[1]}=$_}{say $s{$_} for keys %s' <ca.txt
prints
134 | baz | 11.26
947 | bar | 34.25
536 | foo | 10.13
sed -e ":a
$ !{N;ba
}
:b
s/[0-9]* | \([^ ]*\) | [0-9.]*\n\(.*\)\1/\2\1/g
t b" YourFile
This is a POSIX sed version (so use --posix with GNU sed, mainly because | is used inside s///).
