How does shell preserve command file input line boundary? - bash

The description of STDIN of sh in SUSv4 2016 edition says
It shall not read ahead in such a manner that any characters intended to be read by the invoked command are consumed by the shell
I did an experiment to see it in action, by redirecting the following script file into sh -s:
#!/bin/sh
./rewind # a C program that reads all of its stdin, prints it to stdout, then seeks its stdin back to offset 0 (SEEK_SET)
echo 1234-1234
And it keeps printing out "echo 1234-1234". (Had the shell consumed the whole block of the file, it would only print "1234-1234".)
So obviously, shells (at least mine) do read at line boundaries.
However, when I examined the FreeBSD ash "input.c" source code, I saw that it reads in BUFSIZ-byte blocks, and I don't understand how it preserves line boundaries.
What I want to know is: how do shells preserve line boundaries when their source code apparently shows that they read in blocks?

Standard input isn't seekable in some cases, for example if it is redirected from a pipe or from a terminal. For example, take a file called rew with the content:
#!/bin/bash
echo 123
perl -E 'seek(STDIN,0,0) or die "$!"' #the rewind
and using it as
bash -s < rew
prints
123
123
...
123
^C
So when stdin is seekable it works as expected, but trying the same from a pipe, such as:
cat rew | bash -s #the cat is intentional here :)
will print
123
Illegal seek at -e line 1.
So your C program rewind should print an error when it tries to seek on unseekable input.
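Whether the rewind can succeed depends on what stdin is attached to. A minimal sketch for checking this from the shell, assuming a Linux-style /dev/stdin (stdin_kind is a hypothetical helper name, not part of any tool mentioned above):

```shell
# Hypothetical helper: report what kind of object stdin is attached to.
# Assumes /dev/stdin resolves to the object behind file descriptor 0.
stdin_kind() {
  if [ -t 0 ]; then
    echo terminal
  elif [ -p /dev/stdin ]; then
    echo pipe            # not seekable
  elif [ -f /dev/stdin ]; then
    echo 'regular file'  # seekable
  else
    echo other
  fi
}

sample=$(mktemp)
echo data > "$sample"
kind_file=$(stdin_kind < "$sample")   # stdin redirected from a file
kind_pipe=$(echo data | stdin_kind)   # stdin fed through a pipe
rm -f "$sample"
echo "$kind_file / $kind_pipe"
```

Only the regular-file case corresponds to the bash -s < rew run above; the pipe case corresponds to cat rew | bash -s.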

You can demonstrate this (surprising) behaviour with the following script:
$ cat test.sh
cut -f 1
printf '%s\t%s\n' script file
If you pass it to standard input of your shell, line 2 onward should become the standard input of the cut command:
$ sh < test.sh
printf '%s\t%s\n' script file
If you instead run it normally and type literally foo, followed by Tab, bar, Enter, and Ctrl-d, standard input stays connected to the terminal as usual:
$ sh test.sh
foo bar
foo
script file
Annotated:
$ sh test.sh # Your command
foo bar # Your input to `cut`
foo # The output of `cut`
script file # The output of `printf`
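The two runs above can be reproduced without typing anything by hand. This is only a sketch: it calls bash explicitly, because, as this thread shows, not every sh build stops reading at the line boundary. The names demo_script, first_line and second_line are made up for the example.

```shell
# Recreate test.sh in a temporary file.
demo_script=$(mktemp)
first_line='cut -f 1'
second_line="printf '%s\\t%s\\n' script file"
printf '%s\n' "$first_line" "$second_line" > "$demo_script"

# Script on stdin: cut inherits the shell's stdin and reads line 2 as data.
# The line contains no literal tab, so cut prints it unchanged.
out_stdin=$(bash < "$demo_script")

# Script as an argument: stdin is free, so cut reads the piped text instead
# and the printf line is executed as a command afterwards.
out_file=$(printf 'foo\tbar\n' | bash "$demo_script")

rm -f "$demo_script"
printf 'stdin run:\n%s\n\nfile run:\n%s\n' "$out_stdin" "$out_file"
```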

When I compiled the FreeBSD ash with the NO_HISTORY macro defined to 1 in "shell.h", it consumed the whole file and output only 1234-1234 with my rewind test program. Apparently, the FreeBSD ash relies on libedit to preserve I/O line boundaries.

Related

Workaround for vint (vim lint) not allowing input from stdin

I have a program that doesn't accept stdin and I must pass my file as stdin.
I tried to write a wrapper but for some reason process-substitution does not work:
#!/bin/bash
vint "$#" <(cat /dev/stdin)
Yields:
ERROR: no such file or directory: `/dev/fd/63`
Output of ls -ld /dev/fd:
> dr-xr-xr-x 1 root wheel 0 Nov 29 10:57 /dev/fd/
Steps to reproduce:
Create a vim file:
cat <<EOF > myvim.vim
set nocompatible
EOF
Install vint
Create the wrapper script (as mentioned above)
Run this:
cat myvim.vim | vint-wrapper
How can I achieve this workaround?
The Short Answer: Fixing The Problem
vint doesn't accept any kind of socket or FIFO: It only allows regular files as input. Thus, you'll need to dump stdin into a regular file before running it:
#!/usr/bin/env bash
tempfile=$(mktemp "${TMPDIR:-/tmp}/vint.XXXXXX") || exit
trap 'rm -f "$tempfile"' EXIT
cat >"$tempfile" || exit
vint "$tempfile"
...or, if you're willing to rely on undocumented implementation details, use a heredoc or herestring to make the shell do the temporary-file-management on your behalf, while using - as the input filename (which vint explicitly exempts):
#!/usr/bin/env bash
vint - <<<"$(cat)"
The Longer Answer: Why This Happens
The error takes place because vint explicitly checks whether a filename it is given is a file, and rejects anything else:
if not self._should_read_from_stdin(env):
for path_to_lint in paths_to_lint:
if not path_to_lint.exists() or not path_to_lint.is_file():
logging.error('no such file or directory: `{path}`'.format(
path=str(path_to_lint)))
parser.exit(status=1)
...however, see that _should_read_from_stdin? That's important; vint can read from stdin, but it must be given - as the filename to use.
cat myvim.vim | vint -
However, even then, vint requires that the file it is given be seekable, which FIFOs and pipelines are not. Thus, this will result in:
io.UnsupportedOperation: underlying stream is not seekable
...because the way vint checks whether a file has a BOM indicating whether it has multi-byte characters involves reading the beginning and then rewinding; pipelines cannot be rewound, but can only be read once, front-to-back.

Understanding how input is redirected into a while construct from a file

I came across this syntax for a "while read" loop in a bash script:
$> while read line; do echo $line; done < f1 # f1 is a file in my current directory
It prints the file line by line.
My search for "while read" in the GNU Bash manual (https://www.gnu.org/software/bash/manual/) came up short, and while other tutorial sites give some usage examples, I would still like to understand the full syntax options for this construct.
Can it be used for "for" loops as well?
Something like:
for line in read; do echo $line; done < f1
The syntax for a while loop is
while list-1; do list-2; done
where list-1 is one or more commands (usually one). The loop continues while list-1 is successful (returns an exit status of zero); list-2 is the "body" of the loop.
The syntax of a for loop is different:
for name in word; do list ; done
where word is usually a list of strings, not a command (although it can be hacked to use a command which returns word).
The purpose of a for loop is to iterate through word; the purpose of while is to loop while a command is successful. They are used for different tasks.
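To make the difference concrete, here is a small sketch that runs both forms from the question against a throwaway file (f1 here is a temporary file created for the example, not the asker's file):

```shell
f1=$(mktemp)
printf '%s\n' 'first line' 'second line' > "$f1"

# while: "read" is a command; the loop runs once per successful read.
while_out=$(while IFS= read -r line; do printf '<%s>\n' "$line"; done < "$f1")

# for: "read" is just a word in the list; the file on stdin is never touched.
for_out=$(for line in read; do printf '<%s>\n' "$line"; done < "$f1")

rm -f "$f1"
printf 'while loop:\n%s\n\nfor loop:\n%s\n' "$while_out" "$for_out"
```

The for loop iterates exactly once, with line set to the literal string read, which is why it cannot stand in for the while read idiom.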
Redirection changes a file descriptor to refer to another file or file descriptor.
< changes file descriptor 0 (zero), also known as stdin
> changes file descriptor 1 (one), also known as stdout
So somecommand < foo changes stdin to read from foo rather than the terminal keyboard.
somecommand > foo changes stdout to write to foo rather than the terminal screen (if foo exists it will be overwritten).
In your case somecommand is while, but it can be any other - note that not all commands read from stdin, yet the command syntax with < is still valid.
A common mistake is:
# WRONG!
while read < somefile
do
....
done
In that case somecommand is read and the effect is that it will read the first line of somefile, then proceed with the body of the loop, come back, then read the first line of the file again! It will continually loop just reading the first line, since while has no knowledge or interest in what read is doing, only its return value of success or fail. (read uses the variable REPLY if you don't specify one)
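A bounded sketch of that mistake next to the correct form. The loop is cut off after three iterations, since otherwise it would never end; somefile is a temporary file created for the example.

```shell
somefile=$(mktemp)
printf '%s\n' first second third > "$somefile"

# WRONG: the redirection belongs to read, so the file is reopened on every
# iteration and read always returns its first line.
iterations=0
seen=''
while read -r line < "$somefile"; do
  seen="$seen$line "
  iterations=$((iterations + 1))
  [ "$iterations" -ge 3 ] && break   # safety stop for the demonstration
done

# RIGHT: the redirection belongs to the whole loop, so the file is opened
# once and each read continues where the previous one stopped.
correct=''
while read -r line; do
  correct="$correct$line "
done < "$somefile"

rm -f "$somefile"
echo "wrong: $seen"
echo "right: $correct"
```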
Redirection examples ($ indicates a prompt):
$ cat > file1
111111
<CTRL+D>
$ cat > file2
222222
<CTRL+D>
cat reads from stdin if we don't specify a filename, so it reads from the keyboard. Instead of writing to the screen we redirect to a file. The <CTRL+D> indicates End-Of-File sent from the keyboard.
This redirects stdin to read from a file:
$ cat < file1
111111
Can you explain this?
$ cat < file1 file2
222222

Error when using exec vi

#!/bin/bash
if [ $# -ne 1 ]
then
echo "USAGE:vitest filename"
else
FILENAME=$1
exec vi $FILENAME <<EOF
i
Line 1.
Line 2.
^[
ZZ
EOF
fi
exit 0
I'm trying to insert Line 1. and Line 2. with exec vi, using a here-document that contains the text and the editor commands.
When running the script it gives me the following:
Vim(?):Warning: Input is not from a terminal
Vim: Error reading input, exiting...
Press ENTER or type command to continueVim: Finished.
Vim: Error reading input, exiting...
Vim: Finished.
You want to start vi in ex mode, with a few minor changes to the script.
vi -e "$FILENAME" <<EOF
i
Line 1.
Line 2.
.
wq
EOF
exec is almost certainly unnecessary, especially since you have an exit command following vi. exec is used to replace the current script with the given command; it is not needed simply to execute a command.
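The effect of exec can be seen in a short sketch that does not involve vi at all: anything after an exec line in the same script is never reached, because the shell process has been replaced.

```shell
# With exec: the inner shell becomes "echo", so the second command never runs.
with_exec=$(sh -c 'exec echo replaced; echo "never printed"')

# Without exec: the inner shell runs echo as a child and then carries on.
without_exec=$(sh -c 'echo ran; echo "still here"')

printf 'with exec:\n%s\n\nwithout exec:\n%s\n' "$with_exec" "$without_exec"
```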
A brief history of UNIX text editors:
ed was the original editor, designed to work with a teletype rather than a video terminal.
ex was an extended version of ed, designed to take advantage of a video terminal.
vi was an editor that provided ex with a full-screen visual mode, in contrast with the line-oriented interface employed by ed and ex.
As suggested, use ed:
ed file << END
1i
line1
line2
.
wq
END
The "dot" line means "end of input".
It can be written less legibly as a one-liner
printf "%s\n" 1i "line1" "line2" . wq | ed file
Use cat.
$ cat file1.txt file2.txt | tee file3.txt
Line 1
Line 2
aaaa
bbbb
cccc
Using sed
If I understand correctly, you want to add two lines to the beginning of a file. In that case, as per Cyrus' suggestion, run:
#!/bin/bash
if [ $# -ne 1 ]
then
echo "USAGE:vitest filename"
exit 1
fi
sed -i.bak '1 s/^/line1\nline2\n/' "$1"
Notes:
When a shell variable is used, it should be in double-quotes unless you want word splitting and pathname expansion to be performed. This is important for file names, for example, as it is now common for them to contain whitespace.
It is best practice to use lower or mixed case names for shell variables. The system uses upper case names for its variables and you don't want to overwrite one of them accidentally.
In the check for the argument, the if statement should include an exit to prevent the rest of the script from being run in the case that no argument was provided. In the above, we added exit 1 which sets the exit code to 1 to signal an error.
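The first note is easy to check with a sketch. count_args is a hypothetical helper that reports how many arguments it received, and the default IFS is assumed:

```shell
# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

name='my file.txt'

unquoted=$(count_args $name)     # word splitting yields "my" and "file.txt"
quoted=$(count_args "$name")     # one argument, whitespace preserved

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

A command such as sed or vi would see the unquoted form as two separate file names.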
Using vi
Let's start with this test file:
$ cat File
some line
Now, let's run vi and see what is in File afterward:
$ vi -s <(echo $'iline1\nline2\n\eZZ') File
$ cat File
line1
line2
some line
The above requires bash or similar.

Reading input while also piping a script via stdin

I have a simple Bash script:
#!/usr/bin/env bash
read X
echo "X=$X"
When I execute it with ./myscript.sh it works. But when I execute it with cat myscript.sh | bash it actually puts echo "X=$X" into $X.
So this script prints hello world when executed with cat myscript.sh | bash:
#!/usr/bin/env bash
read X
hello world
echo "$X"
What's the benefit of executing a script with cat myscript.sh | bash? Why doesn't it do the same thing as when I execute it with ./myscript.sh?
How can I stop Bash from executing line by line and make it execute all the lines only after stdin has reached its end?
Instead of just running
read X
...replace it with...
read X </dev/tty || {
X="some default because we can't read from the TTY here"
}
...if you want to read from the console. Of course, this only works if you have a /dev/tty, but if you wanted to do something robust, you wouldn't be piping from curl into a shell. :)
Another alternative, of course, is to pass in your value of X on the command line.
curl https://some.place/with-untrusted-code-only-idiots-will-run-without-reading \
| bash -s "value of X here"
...and refer to "$1" in your script when you want X.
(By the way, I sure hope you're at least using SSL for this, rather than advising people to run code they download over plain HTTP with no out-of-band validation step. Lots of people do it, sure, but that's making sites they download from -- like rvm.io -- big targets. Big, easy-to-man-in-the-middle-or-DNS-hijack targets).
When you cat a script to bash the code to execute is coming from standard input.
Where does read read from? That's right, also standard input. This is why you can cat input to programs that take standard input (like sed, awk, etc.).
So you are not running "a script" per se when you do this. You are running a series of input lines.
Where would you like read to read data from in this setup?
You can manually do that (if you can define such a place). Alternatively you can stop running your script like this.
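Both behaviours discussed in this thread can be reproduced without a script file or a download. This is a sketch that assumes bash is installed; the script text is passed through printf only to keep the example self-contained.

```shell
# The script and read share one stdin: read consumes the line after it.
piped_out=$(printf '%s\n' 'read X' 'hello world' 'echo "$X"' | bash)

# Passing the value as a positional parameter with -s leaves stdin to the
# script text alone; the script refers to it as "$1".
arg_out=$(printf '%s\n' 'echo "X=$1"' | bash -s 'value of X here')

echo "$piped_out"
echo "$arg_out"
```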

Open a shell in the second process of a pipe

I'm having problems understanding what's going on in the following situation. I'm not familiar with UNIX pipes and UNIX at all but have read documentation and still can't understand this behaviour.
./shellcode is an executable that successfully opens a shell:
seclab$ ./shellcode
$ exit
seclab$
Now imagine that I need to pass data to ./shellcode via stdin, because it reads a string from the console and then prints "hello " plus that string. I do it in the following way (using a pipe) and the read and write work:
seclab$ printf "world" | ./shellcode
seclab$ hello world
seclab$
However, a new shell is not opened (or at least I can't see it and interact with it), and if I run exit I'm logged out of the system, so I'm not in a new shell.
Can someone give some advice on how to solve this? I need to use printf because I need to input binary data to the second process and I can do it like this: printf "\x01\x02..."
When you use a pipe, you are telling Unix that the output of the command before the pipe should be used as the input to the command after the pipe. This replaces the default output (screen) and default input (keyboard). Your shellcode command doesn't really know or care where its input is coming from. It just reads the input until it reaches the EOF (end of file).
Try running shellcode and pressing Control-D. That will also exit the shell, because Control-D sends an EOF (your shell might be configured to say "type exit to quit", but it's still responding to the EOF).
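The same thing happens with any shell whose stdin is a pipe, which a one-line sketch can show: the shell runs what the pipe delivers and exits as soon as the writer closes it, so there is never a prompt to interact with.

```shell
# sh reads its commands from the pipe, runs them, then sees EOF and exits.
eof_out=$(printf '%s\n' 'echo "from the pipe"' | sh)
eof_status=$?

echo "output: $eof_out (exit status $eof_status)"
```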
There are two solutions you can use:
Solution 1:
Have shellcode accept command-line arguments:
#!/bin/sh
echo "Arguments: $*"
exec sh
Running:
outer$ ./shellcode foo
Arguments: foo
$ echo "inner shell"
inner shell
$ exit
outer$
To feed the argument in from another program, instead of using a pipe, you could:
$ ./shellcode `echo "something"`
This is probably the best approach, unless you need to pass in multi-line data. In that case, you may want to pass in a filename on the command line and read it that way.
Solution 2:
Have shellcode explicitly redirect its input from the terminal after it's processed your piped input:
#!/bin/sh
while read input; do
echo "Input: $input"
done
exec sh </dev/tty
Running:
outer$ echo "something" | ./shellcode
Input: something
$ echo "inner shell"
inner shell
$ exit
outer$
If you see an error like this after exiting the inner shell:
sh: 1: Cannot set tty process group (No such process)
Then try changing the last line to:
exec bash -i </dev/tty
