Workaround for vint (vim lint) not allowing input from stdin - bash

I have a program that doesn't accept input on stdin, but stdin is the only way I can pass my file to it.
I tried to write a wrapper, but for some reason process substitution does not work:
#!/bin/bash
vint "$#" <(cat /dev/stdin)
Yields:
ERROR: no such file or directory: `/dev/fd/63`
Output of ls -ld /dev/fd:
> dr-xr-xr-x 1 root wheel 0 Nov 29 10:57 /dev/fd/
Steps to reproduce:
Create a vim file:
cat <<EOF > myvim.vim
set nocompatible
EOF
Install vint
Create the wrapper script (as mentioned above)
Run this:
cat myvim.vim | vint-wrapper
How can I achieve this workaround?

The Short Answer: Fixing The Problem
vint doesn't accept any kind of socket or FIFO: It only allows regular files as input. Thus, you'll need to dump stdin into a regular file before running it:
#!/usr/bin/env bash
tempfile=$(mktemp "${TMPDIR:-/tmp}/vint.XXXXXX") || exit
trap 'rm -f "$tempfile"' EXIT
cat >"$tempfile" || exit
vint "$tempfile"
...or, if you're willing to rely on undocumented implementation details, use a heredoc or herestring to make the shell do the temporary-file management on your behalf, while passing - as the input filename (which vint explicitly exempts from its regular-file check):
#!/usr/bin/env bash
vint - <<<"$(cat)"
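Whether that trick works depends on how the shell implements herestrings. On Linux you can inspect what actually lands on fd 0 (a quick probe assuming a Linux /proc filesystem; note that newer bash releases may back small herestrings with a pipe rather than a temporary file, which would reintroduce the problem):
# Inspect what the shell puts on fd 0 for a herestring vs. a pipe:
readlink /proc/self/fd/0 <<<"some input"      # typically /tmp/sh-thd-... (deleted), a regular file
echo "some input" | readlink /proc/self/fd/0  # pipe:[123456], not seekable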
The Longer Answer: Why This Happens
The error takes place because vint explicitly checks whether a filename it is given is a file, and rejects anything else:
if not self._should_read_from_stdin(env):
    for path_to_lint in paths_to_lint:
        if not path_to_lint.exists() or not path_to_lint.is_file():
            logging.error('no such file or directory: `{path}`'.format(
                path=str(path_to_lint)))
            parser.exit(status=1)
...however, see that _should_read_from_stdin? That's important; vint can read from stdin, but it must be given - as the filename to use.
cat myvim.vim | vint -
However, even then, vint requires that the file it is given be seekable, which FIFOs and pipelines are not. Thus, this will result in:
io.UnsupportedOperation: underlying stream is not seekable
...because vint checks whether a file has a BOM (indicating multi-byte characters) by reading the beginning of the file and then rewinding; pipelines cannot be rewound, but can only be read once, front-to-back.
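You can reproduce that failure without vint at all; seeking on a pipe raises the same exception (a minimal sketch, assuming python3 is installed):
# Rewinding a pipe fails the same way vint's BOM check does:
echo 'set nocompatible' | python3 -c 'import sys; sys.stdin.seek(0)'
# io.UnsupportedOperation: underlying stream is not seekable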

Related

BASH - Best practice to use output from one script in another

I'm developing a BASH script which invokes another BASH script, which prints a line to stdout. That output is captured by the first BASH script and used later. It works, but it has the downside that any other output printed by the second script will cause this part to behave unexpectedly, because there will be extra content.
main.sh
#!/bin/bash
# Invoke worker.sh and capture its standard output to stats
stats=$(worker.sh --generate-stats)
echo "stats=$stats"
worker.sh
#!/bin/bash
[[ $1 == "--generate-stats" ]] && echo "cpu=90 mem=50 disk=15"
In this over-simplified example, it's not a problem to use this construct, but as worker.sh grows in size and complexity, it's hard to remember that no other command can print to stdout without confounding the behavior, and if someone else works on worker.sh without realizing they can't print to stdout, things can easily get fouled up. So what is considered good practice for generating output in one script and using it in another?
I'm wondering if a fifo would be appropriate, or another file descriptor, or just a plain file. Or if exec should be used in this case, something like what is shown here https://www.tldp.org/LDP/abs/html/x17974.html:
#!/bin/bash
exec 6>&1 # Link file descriptor #6 with stdout.
# Saves stdout.
exec >&2 # stdout now goes to stderr
echo "Didn't know I shouldn't print to stdout"
exec 1>&6 6>&- # Restore stdout and close file descriptor #6.
[[ $1 == "--generate-stats" ]] && echo "cpu=90 mem=50 disk=15"
But I wouldn't want to use that if it's not considered good practice.
Many command-line utilities have quiet and verbose modes; it's generally considered good practice to have the most verbose output (debugging, tracing, etc.) be separated to standard error anyway, but it's common to have normal output be formatted for human legibility (e.g. include table headings and column separators) and quiet mode output be just the bare data for programmatic use. (For one example, see docker images vs docker images -q). So that would be my recommendation - have worker.sh take a flag indicating whether its output is being consumed programmatically, and write it such that its output is all sent via a function that checks that flag and filters appropriately.
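A minimal sketch of that pattern (the --porcelain flag name and the emit helper are illustrative, not from the original scripts):
#!/bin/bash
# worker.sh: every line of output goes through emit(), which honors the mode flag.
porcelain=0
[[ $1 == "--porcelain" ]] && porcelain=1

emit() {  # $1 = human-readable line, $2 = bare machine-readable line
    if (( porcelain )); then
        printf '%s\n' "$2"
    else
        printf '%s\n' "$1"
    fi
}

emit "CPU: 90%  MEM: 50%  DISK: 15%" "cpu=90 mem=50 disk=15"
main.sh would then capture stats=$(./worker.sh --porcelain), while a human running ./worker.sh directly gets the readable form.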
Maybe a different approach would be for the second script to test whether its stdout is being used programmatically:
gash.sh:
#!/bin/bash
data=$(./another.sh)
echo "Received $data"
another.sh:
#!/bin/bash
# for -t see man isatty(3). 1 is file descriptor 1 - stdout
if [ -t 1 ]; then
echo "stdout is a terminal"
else
echo "stdout is not a terminal"
fi
Gives (where $ is the shell prompt):
$ bash gash.sh
Received stdout is not a terminal
$ bash another.sh
stdout is a terminal
You could then set a flag to change script behaviour (ls(1) does a similar thing). However, you should be prepared for this:
$ bash another.sh|more
stdout is not a terminal
$ bash another.sh > out.txt
$ cat out.txt
stdout is not a terminal
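You could combine this test with the flag idea (a sketch; the format strings are placeholders, not from the original answer):
#!/bin/bash
# another.sh: pick an output format based on isatty(stdout).
if [ -t 1 ]; then
    printf 'CPU: %s%%\n' 90    # terminal: formatted for people
else
    printf 'cpu=%s\n' 90       # captured via $(...) or a pipe: bare data
fi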

How does the shell preserve command file input line boundaries?

The description of STDIN of sh in SUSv4 2016 edition says
It shall not read ahead in such a manner that any characters intended to be read by the invoked command are consumed by the shell
I did an experiment to see it in action, by redirecting the following script file into sh -s:
#!/bin/sh
./rewind # a C program that reads its stdin to EOF, prints it to stdout, then seeks its standard input back to offset 0 (SEEK_SET)
echo 1234-1234
And it keeps printing out "echo 1234-1234". (Had the shell consumed the whole file as one block, it would only have printed "1234-1234".)
So obviously, shells (at least mine) do read at line boundaries.
However, when I examined the FreeBSD ash source ("input.c"), it reads in BUFSIZ-byte blocks, and I don't understand how that preserves line boundaries.
What I want to know is: how do shells preserve line boundaries when their source code apparently shows that they read in blocks?
Standard input isn't seekable in some cases, for example if it is redirected from a pipe or from a terminal. E.g., given a file called rew with the content:
#!/bin/bash
echo 123
perl -E 'seek(STDIN,0,0) or die "$!"' #the rewind
and using it as
bash -s < rew
prints
123
123
...
123
^C
so, when STDIN is seekable, it will work as expected, but trying the same from a pipe, such as:
cat rew | bash -s #the cat is intentional here :)
will print
123
Illegal seek at -e line 1.
So your C program rewind should print an error when it tries to seek on un-seekable input.
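Tying this back to the question: when its input is seekable, a shell may read ahead in large blocks and then lseek backwards to the byte just after the last line it executed, which is how block-sized reads and the SUSv4 no-read-ahead rule coexist. On Linux you can look for this with strace (a hedged probe, not from the original answers; yourscript.sh is any multi-line script, and the exact trace varies by shell):
strace -e trace=read,lseek sh < yourscript.sh 2>&1 | grep -E '^(read|lseek)\(0'
# Expect read(0, "...", 8192) pulling in a block, then lseek(0, -N, SEEK_CUR)
# pushing the shared offset back to a line boundary before a command inherits fd 0.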
You can demonstrate this (surprising) behaviour with the following script:
$ cat test.sh
cut -f 1
printf '%s\t%s\n' script file
If you pass it to standard input of your shell, line 2 onward should become the standard input of the cut command:
$ sh < test.sh
printf '%s\t%s\n' script file
If you instead run it normally and type literally foo, followed by Tab, bar, Enter, and Ctrl-d, you should see standard input being shared with the commands as normal:
$ sh test.sh
foo bar
foo
script file
Annotated:
$ sh test.sh # Your command
foo bar # Your input to `cut`
foo # The output of `cut`
script file # The output of `printf`
When I compiled the FreeBSD ash with the NO_HISTORY macro defined as 1 in "shell.h", it consumed the whole file and output only 1234-1234 with my rewind testing program. Apparently, the FreeBSD ash relies on libedit to preserve I/O line boundaries.

Reading full file from Standard Input and Supplying it to a command in ksh

I am trying to read the contents of a file given on standard input into a script. Any ideas how to do that?
Basically what I want is:
someScript.ksh < textFile.txt
Inside the ksh script, I am using a binary which will read data from "textFile.txt" if the file is given on standard input.
Any ideas how I can "pass" the contents of the given input file, if any, to another binary inside the script?
You haven't really given us enough information to answer the question, but here are a few ideas.
If you have a script that you want to accept data on stdin, and that script calls something else that expects data to be passed in as a filename on the command line, you can take stdin and dump it to a temporary file. Something like:
#!/bin/sh
tmpfile=$(mktemp tmpXXXXXX)
cat > "$tmpfile"
/some/other/command "$tmpfile"
rm -f "$tmpfile"
(In practice, you would probably use trap to clean up the temporary file on exit).
If instead the script is calling another command that also expects input on stdin, you don't really have to do anything special. Inside your script, stdin of anything you call will be connected to stdin of the calling script, and as long as you haven't previously consumed the input you should be all set.
E.g., given a script like this:
#!/bin/sh
sed s/hello/goodbye/
I can run:
echo hello world | sh myscript.sh
And get:
goodbye world

Bash shell read error: 0: Resource temporarily unavailable

When writing a bash script, you sometimes run a command which opens up another program, such as npm, composer, etc. But at the same time you need to use read in order to prompt the user.
Inevitably, you hit this kind of error:
read: read error: 0: Resource temporarily unavailable
After doing some research, there seems to be a solution: redirecting the STDIN of those programs which manipulate the STDIN of your bash script to /dev/null.
Something like:
npm install </dev/null
Other research has shown it has something to do with STDIN being set to some sort of blocking/non-blocking status which isn't being reset after the program finishes.
The question: is there some foolproof, elegant way of reading user-prompted input without being affected by those programs that manipulate STDIN, and without having to hunt down which programs need their STDIN redirected to /dev/null? You may even need to use the STDIN of those programs!
Usually it is important to know what input the invoked program expects and from where, so it is not a problem to redirect stdin from /dev/null for those that shouldn't be getting any.
Still, it is possible to do it for the shell itself and all invoked programs. Simply move stdin to another file descriptor and open /dev/null in its place. Like this:
exec 3<&0 0</dev/null
The above duplicates stdin file descriptor (0) under file descriptor 3 and then opens /dev/null to replace it.
After this any invoked command attempting to read stdin will be reading from /dev/null. Programs that should read original stdin should have redirection from file descriptor 3. Like this:
read -r var 0<&3
The < redirection operator assumes destination file descriptor 0 if it is omitted, so the above two commands could be written like this:
exec 3<&0 </dev/null
read -r var <&3
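Putting the two pieces together into the npm scenario from the question, a wrapper might look like this (npm install stands in for any command that grabs or breaks stdin):
#!/bin/bash
exec 3<&0 </dev/null    # save real stdin on fd 3; everything else reads /dev/null

npm install             # reads /dev/null instead of consuming our input

read -r -p "Continue? [y/N] " answer <&3    # prompt against the saved stdin
echo "answer=$answer"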
When this happens, run bash from within your bash shell, then exit it (thus returning to the original bash shell). I found a mention of this trick in https://github.com/fish-shell/fish-shell/issues/176 and it worked for me; it seems bash restores the STDIN state. Example:
bash> do something that exhibits the STDIN problem
bash> bash
bash> exit
bash> repeat something: STDIN problem fixed
I had a similar issue, but the command I was running did need a real STDIN; /dev/null wasn't good enough. Instead, I was able to do:
TTY=$(/usr/bin/tty)
cmd-using-stdin < $TTY
read -r var
or combined with spbnick's answer:
TTY=$(/usr/bin/tty)
exec 3<&0 < $TTY
cmd-using-stdin
read -r var 0<&3
which leaves a clean STDIN in 3 for you to read and 0 becomes a fresh stream from the terminal for the command.
I had the same problem. I solved it by reading directly from the tty like this, redirecting stdin:
read -p "Play both [y]? " -n 1 -r </dev/tty
instead of simply:
read -p "Play both [y]? " -n 1 -r
In my case, the use of exec 3<&0 ... didn't work.
Clearly (resource temporarily unavailable is EAGAIN) this is caused by programs that exit but leave STDIN in non-blocking mode.
Here is another solution (easiest to script?):
perl -MFcntl -e 'fcntl STDIN, F_SETFL, fcntl(STDIN, F_GETFL, 0) & ~O_NONBLOCK'
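For example, run it immediately after the offending command (the command name below is a placeholder):
some-offending-command    # exits leaving O_NONBLOCK set on fd 0
perl -MFcntl -e 'fcntl STDIN, F_SETFL, fcntl(STDIN, F_GETFL, 0) & ~O_NONBLOCK'
read -r -p "Your name: " name    # read works again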
The answers here which suggest using redirection are good. Fortunately, Bash's read should soon no longer need such fixes. The author of Readline, Chet Ramey, has already written a patch: http://gnu-bash.2382.n7.nabble.com/read-may-fail-due-to-nonblocking-stdin-td18519.html
However, this problem is more general than just the read command in Bash. Many programs presume stdin is blocking (e.g., mimeopen) and some programs leave stdin non-blocking after they exit (e.g., cec-client). Bash has no builtin way to turn off non-blocking input, so, in those situations, you can use Python from the command line:
$ python3 -c $'import os\nos.set_blocking(0, True)'
You can also have Python print the previous state so that it may be changed only temporarily:
$ o=$(python3 -c $'import os\nprint(os.get_blocking(0))\nos.set_blocking(0, True)')
$ somecommandthatreadsstdin
$ python3 -c $'import os\nos.set_blocking(0, '$o')'

How do I capture the output of a command to a file descriptor in Bourne shell?

The standard way to capture command output in Bourne shell is to use the $() syntax:
output=$(mycommand)
For commands that have a lot of output, however, this requires that the shell allocate memory for the whole thing as one long string. I'd prefer to find something that does the moral equivalent of the Unix C function popen, to get a new file descriptor I could read from:
newfd=popen(mycommand)
while read -u $newfd LINE; do
#process output
done
Is this even possible?
#!/bin/bash
ls | while read -r X
do
    echo "$X is a directory entry"
done
Replace 'ls' with the command of your choice
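If bash specifically is available (rather than a strict Bourne shell), process substitution plus exec gets you something much closer to the popen-style descriptor the question asks for (a sketch; the {fd}< redirection form needs bash 4.1+):
#!/bin/bash
exec {newfd}< <(mycommand)    # open a new fd reading mycommand's output

while read -r -u "$newfd" LINE; do
    # process output
    printf 'got: %s\n' "$LINE"
done

exec {newfd}<&-               # close the descriptor when done
Unlike the ls | while pipeline above, the while loop here runs in the current shell, so variables set inside it survive the loop.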
