Testing whether stdin is a file vs. a pipe vs. a tty - bash

I know that in bash and zsh one can use, e.g., [ -t 0 ] to determine whether stdin is an interactive tty session.
However, there doesn't seem to be a way to test whether stdin is being redirected from a file, versus being piped in from a command:
foo < ./file
bar | foo
Is there any way to detect the difference between these two? Separately, is there any way to get the path of the file being redirected from (outside of /proc/self, which is unavailable on macOS)?

You can check if /dev/stdin is a regular file or a pipe:
$ cat tmp.sh
#!/bin/bash
if [ -f /dev/stdin ]; then
echo "file"
elif [ -p /dev/stdin ]; then
echo "pipe"
fi
$ bash tmp.sh < foo.txt
file
$ echo foo | bash tmp.sh
pipe
This relies on /dev/stdin being in your file system, though.
You can also use the stat command, which will return information about standard input given no file name argument. As you mentioned you are using macOS, you can use the %HT format:
$ stat -f %HT
Character Device
$ stat -f %HT < foo.txt
Regular File
$ echo foo | stat -f %HT
Fifo File
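Putting the two checks together, a small helper (a sketch; the name stdin_kind is made up for this example) can classify stdin in one place. As noted above, the middle two tests rely on /dev/stdin existing on your system:

```shell
# stdin_kind is a hypothetical helper: it classifies the calling
# process's stdin as a tty, a regular file, a pipe, or something else.
stdin_kind() {
  if [ -t 0 ]; then
    echo tty                      # interactive terminal
  elif [ -f /dev/stdin ]; then
    echo file                     # redirected from a regular file
  elif [ -p /dev/stdin ]; then
    echo pipe                     # fed from a pipeline
  else
    echo other                    # socket, /dev/null, etc.
  fi
}
```

With this, `stdin_kind < ./file` prints "file" and `bar | stdin_kind` prints "pipe".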


Capture output of piped command while still knowing if first command wrote to stderr

Is it possible to capture the output of cmd2 from cmd1 | cmd2 while still knowing if cmd1 wrote to stderr?
I am using exiftool to strip exif data from files:
exiftool "/path/to/file.ext" -all= -o -
This writes the output to stdout. This works for most files. If the file is corrupt or not a video/image file it will not write anything to stdout and, instead, write an error to stderr. For example:
Error: Writing of this type of file is not supported - /path/to/file.ext
I ultimately need to capture the md5 of files that don't result in an error. Right now I am doing this:
md5=$(exiftool "/path/to/file.ext" -all= -o - | md5sum | awk '{print $1}')
Regardless of whether the file is an image/video, it'll calculate an md5.
If the file is an image/video, it'll capture the file's md5 as expected.
If the file is not an image/video, exiftool doesn't write anything to stdout and so md5sum calculates the md5 of the null input. But that line will also write an error to stderr.
I need to be able to check if something was written to stderr so I know to scrap the calculated md5.
I know one alternative is to run the exiftool twice: one time without the md5sum and without capturing to see if anything was written to stderr and then a second time with the md5sum and capturing. But this means I have to run exiftool twice. I want to avoid that because it can take a long time for big files. I'd rather only run it once.
Update
Also, I can't capture the output of just exiftool because it yields this error:
bash: warning: command substitution: ignored null byte in input
And I cannot ignore this error because the md5 result is not the same. That is to say:
file=$(exiftool "/path/to/file.ext" -all= -o -)
echo "$file" | md5sum
Will print the above null byte error and will not have the same md5 result as:
exiftool "/path/to/file.ext" -all= -o - | md5sum
There is a special array variable for this, PIPESTATUS. A simple example, where file and file2 exist:
$ ls file &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
0 0
And here file3 does not exist:
$ ls file3 &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
2 0
$ ls file3; echo $?
ls: cannot access 'file3': No such file or directory
2
A triple pipe:
$ ls file 2> /dev/null | ls file3 &> /dev/null | ls file2 &> /dev/null; echo ${PIPESTATUS[@]}
0 2 0
A pipe captured into a variable, tested with grep:
$ test=$(ls file | grep .; ((${PIPESTATUS[1]} > 0)) && echo error)
$ echo $test
file
$ test=$(ls file3 | grep .; ((${PIPESTATUS[1]} > 0)) && echo error)
ls: cannot access 'file3': No such file or directory
$ echo $test
error
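To answer the original question (did exiftool write to stderr?) in a single run, one option, sketched here with a hypothetical stand-in function named tool instead of exiftool, is to send stderr to a temporary file during the pipeline and test it afterwards:

```shell
# "tool" is a hypothetical stand-in for the exiftool invocation: it
# writes data to stdout and an error message to stderr in one run.
tool() { printf 'payload'; echo 'Error: not supported' >&2; }

err=$(mktemp)
md5=$(tool 2>"$err" | md5sum | awk '{print $1}')  # stderr bypasses the pipe
if [ -s "$err" ]; then                            # -s: file exists and is non-empty
  echo "stderr had content; discarding md5"
fi
rm -f "$err"
```

Because stderr never enters the pipe, the md5 is computed from stdout alone, and the tool only runs once.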
Another approach is to check first whether the file type is an image or video:
type=$(file "/path/to/file.ext")
case $type in
    *image*|*Media*) echo "is an image or video";;
esac
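The exact strings file prints vary by platform and file type (and the glob matching is case-sensitive), so patterns like *image*|*Media* may need tuning. The mechanism itself looks like this, with a canned string standing in for file's output:

```shell
# Canned example: "type" stands in for $(file "/path/to/file.ext").
type="PNG image data, 16 x 16, 8-bit/color RGBA"
case $type in
  *image*|*Media*) kind="media" ;;   # matched one of the glob patterns
  *)               kind="other" ;;
esac
echo "$kind"                         # → media
```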
A coprocess can be used for this:
#!/usr/bin/env bash
case $BASH_VERSION in [0-3].*) echo "ERROR: Bash 4+ required" >&2; exit 1;; esac
coproc STDERR_CHECK { seen=0; while IFS= read -r; do seen=1; done; echo "$seen"; }
{
    md5=$(exiftool "/path/to/file.ext" -all= -o - | md5sum | awk '{print $1}')
} 2>&${STDERR_CHECK[1]}
exec {STDERR_CHECK[1]}>&-
read stderr_seen <&"${STDERR_CHECK[0]}"
if (( stderr_seen )); then
    echo "exiftool emitted stdout with md5 $md5, and had content on stderr"
else
    echo "exiftool emitted stdout with md5 $md5, and did not emit any content on stderr"
fi
md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep -q .)
This opens file descriptor 3 and redirects it to file descriptor 1 (a.k.a. stdout).
The trick is to redirect exiftool outputs:
exiftool ... 2>&1 redirects file descriptor 2 (a.k.a. stderr) to stdout
exiftool ... 1>&3 redirects stdout to file descriptor 3 which, at this moment, points to the original stdout
Then fd 3 is redirected to another chain of commands using process substitution, i.e. 3> >(md5sum | awk '{print $1}' >&3) where 3> tells to redirect fd3 and >(...) is the process substitution itself.
At the same time, the standard error of exiftool is written to the standard output which is piped into grep -q . which will return 0 if there is at least one character.
Because grep -q . is the last command executed in the main chain of commands, you can simply check the results of $?:
md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep -q .)
if [ $? -eq 0 ]
then
    # something was written to exiftool's stderr
fi
With grep -q ., the error itself is not written anywhere. If you want to see the error but not capture it in md5, replace grep -q . with grep . >&2:
md5=$(exec 3>&1; (exiftool "/path/to/file.ext" -all= -o - 2>&1 1>&3) 3> >(md5sum | awk '{print $1}' >&3) | grep . >&2)
It is very important that you redirect exiftool's outputs in this exact order. If you redirected like this:
exiftool "/path/to/file.ext" -all= -o - 1>&3 2>&1
then stdout would first be redirected to fd 3, and 2>&1 would then send stderr to wherever stdout points at that moment, namely fd 3 as well, so the error would end up in the md5 input. You definitely don't want that.
The end of the process substitution chain writes to fd3 with >&3 because you want to keep the result to fd3. Without >&3, the result of awk would end up in fd1 which would be piped to grep -q . or grep . >&2 and, again, you definitely don’t want that.
PS: you don't need to close fd 3 because it was opened in the subshell of the command substitution that assigns md5. Should you need to close the file descriptor, call exec 3>&-
Just capture the output, and then conditionally use it, e.g.:
if out="$(exiftool "/path/to/file.ext" -all= -o -)"; then
    md5=$(echo "$out" | md5sum | awk '{print $1}')
fi
This assigns exiftool's output to out, and the if checks exiftool's exit status. Note that this construction assumes exiftool returns a meaningful exit value; also, as noted in the question's update, command substitution drops null bytes, so the resulting md5 may differ from that of a direct pipe.

From within a shell script I want to access input given from other command

I am messing around with shell scripts and I am trying to get my script to take input from another command, like say ls. It is called like this:
ls | ./example.sh
I try to access the input from within example.sh
#!/bin/bash
echo $1
but it echoes back nothing. Is there another way to reference parameters given to bash by other commands, because it works if I type in:
./example.sh poo
Parameters and input aren't the same thing:
$ ls
example.sh foo
$
$ cat example.sh
for param; do
    printf 'argument 1: "%s"\n' "$param"
done
while IFS= read -t 1 -r input; do
    printf 'input line: "%s"\n' "$input"
done
$
$ ls | ./example.sh
input line: "example.sh"
input line: "foo"
$
$ ls | ./example.sh bar
argument 1: "bar"
input line: "example.sh"
input line: "foo"
$
$ ./example.sh $(ls)
argument 1: "example.sh"
argument 1: "foo"
Parameters are the arguments passed to the script to start it running, input is what the script reads while it's running.
You can use xargs to do that.
ls | xargs ./example.sh
Resources:
https://ss64.com/bash/xargs.html
https://www.cyberciti.biz/faq/linux-unix-bsd-xargs-construct-argument-lists-utility/
Or read the man page for it (man xargs).
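The difference xargs makes can be seen with printf: xargs turns lines on stdin into command-line arguments.

```shell
# Without xargs, printf never sees the lines; with xargs, each line
# becomes a separate argument to the command.
printf 'a\nb\nc\n' | xargs printf 'arg: %s\n'
# → arg: a
# → arg: b
# → arg: c
```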

Bash script: how to get the whole command line which ran the script

I would like to run a bash script and be able to see the command line used to launch it:
sh myscript.sh arg1 arg2 1> output 2> error
in order to know whether the user used the "std redirections" 1> and 2>, and adapt my script's output accordingly.
Is this possible with built-in variables?
Thanks.
On Linux and some Unix-like systems, /proc/self/fd/1 and /proc/self/fd/2 are symlinks to wherever your standard streams point. Using readlink, we can tell whether they were redirected by comparing them against the parent process's file descriptors.
We use $$ rather than self, however, because $(readlink /proc/self/fd/1) spawns a subshell, so self would refer to that subshell rather than to the current bash script.
$ cat test.sh
#!/usr/bin/env bash
#errRedirected=false
#outRedirected=false
parentStderr=$(readlink /proc/"$PPID"/fd/2)
currentStderr=$(readlink /proc/"$$"/fd/2)
parentStdout=$(readlink /proc/"$PPID"/fd/1)
currentStdout=$(readlink /proc/"$$"/fd/1)
[[ "$parentStderr" == "$currentStderr" ]] || errRedirected=true
[[ "$parentStdout" == "$currentStdout" ]] || outRedirected=true
echo "$0 ${outRedirected:+>$currentStdout }${errRedirected:+2>$currentStderr }$@"
$ ./test.sh
./test.sh
$ ./test.sh 2>/dev/null
./test.sh 2>/dev/null
$ ./test.sh arg1 2>/dev/null # You will lose the argument order!
./test.sh 2>/dev/null arg1
$ ./test.sh arg1 2>/dev/null >file ; cat file
./test.sh >/home/camusensei/file 2>/dev/null arg1
$
Do not forget that the user can also redirect to a 3rd file descriptor which is open on something else...!
Not really possible. You can check whether stdout and stderr are pointing to a terminal: [ -t 1 -a -t 2 ]. But if they do, it doesn't necessarily mean they weren't redirected (think >/dev/tty5). And if they don't, you can't distinguish between stdout and stderr being closed and them being redirected. And even if you know for sure they are redirected, you can't tell from the script itself where they point after redirection.
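As a quick illustration of the [ -t 1 ] half of that test: inside a command substitution, stdout is a pipe rather than a terminal, so the check reports "not a tty" there even in an interactive shell (check_tty is a made-up name for this sketch):

```shell
# [ -t FD ] is true only when that file descriptor is attached
# to a terminal.
check_tty() { if [ -t 1 ]; then echo tty; else echo "not a tty"; fi; }

out=$(check_tty)   # stdout inside $(...) is the substitution's pipe
echo "$out"        # → not a tty
```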

Piping and redirecting with cat

Looking over the Dokku source code, I noticed two uses of pipe and redirect that I am not familiar with.
One is: cat | command
Example: id=$(cat | docker run -i -a stdin progrium/buildstep /bin/bash -c "mkdir -p /app && tar -xC /app")
The other is cat > file
Example: id=$(cat "$HOME/$APP/ENV" | docker run -i -a stdin $IMAGE /bin/bash -c "mkdir -p /app/.profile.d && cat > /app/.profile.d/app-env.sh")
What is the use of pipe and redirect in the two cases?
Normally, both usages are completely useless.
cat without arguments reads from stdin, and writes to stdout.
cat | command is equivalent to command.
&& cat > file is equivalent to > file, assuming the previous command has consumed the stdin input.
Looking at it more closely, the sole purpose of that cat command in the second example is to read from stdin. Without it, you would redirect the output of mkdir to the file. So the command first makes sure the directory exists, then writes to the file whatever you feed to it through the stdin.
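The same pattern can be reproduced without docker; here a plain bash -c stands in for the container shell:

```shell
# Mirror of the Dokku snippet with bash -c instead of docker: the
# inner shell creates the directory, then cat copies its stdin into
# a file inside it.
tmp=$(mktemp -d)
echo "export FOO=bar" | bash -c "mkdir -p '$tmp/app' && cat > '$tmp/app/env.sh'"
cat "$tmp/app/env.sh"   # → export FOO=bar
```

The mkdir produces no stdout of its own, so everything the inner shell receives on stdin ends up in env.sh.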

How to refer to redirection file from within a bash script?

I'd like to write a bash script myscript such that issuing this command:
myscript > filename.txt
would return the name of the filename that it's output is being redirected to, filename.txt. Is this possible?
If you are running on Linux, check where /proc/self/fd/1 links to.
For example, the script can do the following:
#!/bin/bash
readlink /proc/self/fd/1
And then run it:
$ ./myscript > filename.txt
$ cat filename.txt
/tmp/filename.txt
Note that if you want to save the value of the output file to a variable or something, you can't use /proc/self since it will be different in the subshell, but you can still use $$:
outputfile=$(readlink /proc/$$/fd/1)
Using lsof:
outfile=$(lsof -p $$ | awk '/1w/{print $NF}')
echo $outfile
