Why does redirect (<) not create a subshell - bash

I wrote the following code
var=0
cat $file | while read line; do
var=$line
done
echo $var
Now as I understand it, the pipe (|) will cause a subshell to be created, and therefore the variable var on line 1 will have the same value on the last line.
However this will solve it:
var=0
while read line; do
var=$line
done < $file
echo $var
My question is why does the redirect not cause a subshell to be created, or if you like why does pipe cause one to be created?
Thanks

cat is a separate command, which means it needs its own process and has its own STDIN and STDOUT. You're basically taking the STDOUT produced by the cat command and redirecting it into the process of the while loop.
When you use redirection, you're not using a separate process. Instead, you're merely redirecting the STDIN of the while loop from the console to the lines of the file.
Needless to say, the second way is more efficient. In the old Usenet days, before all of you little whippersnappers got ahold of our Internet (Hey you kids! Get off of my Internet!) and destroyed it with your fancy graphics and all them web pages, some people used to give out the Useless Use of Cat award to people who contributed to the comp.unix.shell group and had a spurious cat command, because the use of cat is almost never necessary and is usually less efficient.
If you're using cat in your code, you probably don't need it. The cat command comes from concatenate and is supposed to be used only to concatenate files together. For example, when we used to use SneakerNet on 800K floppies, we would have to split up long files with the Unix split command and then use cat to merge them back together.

A pipe is there to hook the stdout of one program to the stdin of another one. Two processes, possibly two shells. When you do redirection (> and <), all you're doing is remapping stdin (or stdout) to a file. Reading/writing a file can be done without another process or shell.
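The difference described in the answers above can be seen in a minimal sketch. It assumes bash with default options (the lastpipe option is off), and the temporary file and its three lines are made up for illustration:

```shell
#!/usr/bin/env bash
# Hypothetical demo file; its name and contents are invented for this sketch.
file=$(mktemp)
printf 'one\ntwo\nthree\n' > "$file"

# Pipe: the while loop runs in a subshell, so its assignment is lost
# when the pipeline finishes.
piped=0
cat "$file" | while read -r line; do
    piped=$line
done
echo "after pipe:     $piped"

# Redirect: the loop runs in the current shell, so the assignment survives.
redirected=0
while read -r line; do
    redirected=$line
done < "$file"
echo "after redirect: $redirected"

rm -f "$file"
```

With bash defaults the first echo shows 0 and the second shows three. Shells such as ksh and zsh run the last pipeline element in the current shell, so the piped loop behaves differently there.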

Related

Pipe command in Bash

The pipe command shows its results properly. When I try to use it with cat or >, it doesn't show the output.
I have tried running the command with different spacing, but it didn't help.
sort spiderman.txt | cat > superman.txt
sort spiderman.txt | > superman.txt
In the first command above, cat does not show its output (the contents of superman.txt are not displayed); however, if I run the cat command separately, it shows the contents.
In the second command, nothing happens to superman.txt.
Ideally it should have replaced the contents of superman.txt with the sorted contents of spiderman.txt, but nothing happens.
If you're trying simple output redirection you shouldn't pipe (|), just redirect (>):
sort spiderman.txt > superman.txt
If you want to show the content as well as redirect to a file - perhaps what you're looking for is tee?
sort spiderman.txt | tee superman.txt
Description:
The tee utility copies standard input to standard output, making a copy in zero or more files. The output is unbuffered.
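A small sketch of the tee approach, using a throwaway directory and made-up sample lines in place of the real spiderman.txt:

```shell
# Hypothetical sample data standing in for spiderman.txt.
workdir=$(mktemp -d)
printf 'web\nmask\nsuit\n' > "$workdir/spiderman.txt"

# tee writes the sorted lines to superman.txt and also passes them
# through to its own stdout, which is captured here for comparison.
shown=$(sort "$workdir/spiderman.txt" | tee "$workdir/superman.txt")
saved=$(cat "$workdir/superman.txt")

echo "$shown"
rm -rf "$workdir"
```

Here shown and saved should hold the same three sorted lines: one copy went to the terminal side, the other to the file.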
> superman.txt (with no command) is processed as follows:
superman.txt is opened for writing and truncated
The output redirection is removed from the current command.
Since there is nothing left, the empty command is treated as having
run and exited successfully. Nothing actually reads from the pipe
or writes to superman.txt.
cat is necessary as a command which does read from standard input and writes to standard output.
It sometimes seems a little odd to me that more shells don't provide a minimal built-in that simply copies input to output with no frills, to avoid otherwise having to fork and exec cat. (I should say "no" rather than "more", as I'm not aware of any shell that does. zsh might, if I bothered to search through the documentation to find it.)
(Some shells will optimize away an extra fork when processing a command line; bash is not one of them, though. It forks once to create a process for the write end of the pipe, then forks again to run cat. I believe ksh would simply exec cat directly instead of unnecessarily forking, in which case a built-in cat is less necessary.)
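The empty-command behaviour described above can be checked with a sketch like the following. It assumes bash (or another POSIX-style sh), and the file contents are invented for the demo:

```shell
# Hypothetical sample data; superman.txt starts out non-empty on purpose.
workdir=$(mktemp -d)
printf 'web\nmask\n' > "$workdir/spiderman.txt"
echo "old contents" > "$workdir/superman.txt"

# The right-hand side has a redirection but no command: the file is
# opened and truncated, and nothing reads from the pipe.
sort "$workdir/spiderman.txt" | > "$workdir/superman.txt"
empty_size=$(wc -c < "$workdir/superman.txt" | tr -d ' ')
echo "size after '| > file': $empty_size"

# With cat on the right-hand side, something reads the pipe and
# writes what it reads to the file.
sort "$workdir/spiderman.txt" | cat > "$workdir/superman.txt"
cat_result=$(cat "$workdir/superman.txt")
echo "$cat_result"

rm -rf "$workdir"
```

So the file is not left untouched by the second command in the question: it is truncated to zero bytes, which is easy to miss if it was empty to begin with.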

How does find and printf work when using pipes in bash scripting

Suppose I use the printf in the find command like this:
find ./folder -printf "%f\n" | other command which uses the result of printf
In the other command part, I may have a sort or something similar.
What exactly does printf do in this case? Where does it print the file names before the part after | runs?
If I sort the file names, for example, it will first sort them and then print them sorted on the monitor. But before that, how exactly does the part after | get the unsorted file names in order to sort them? Does printf in this case give the file names as input to the part after |, which then prints them sorted?
sorry for my english :(
Your shell calls pipe() which creates two file descriptors. Data written into one is buffered in the kernel and is available to be read from the other. Then it calls fork() to make a new process for the find command. After the fork(), the child closes stdout (always fd 1) and uses dup2() to copy the write end of the pipe to stdout. Then it uses exec() to run find (replacing the copy of the shell in the subprocess with find). When find runs it just prints to stdout as normal, but the stdout it inherited from the shell is now the pipe. Meanwhile the shell does the same thing for other command... with stdin, so that it is created with fd 0 connected to the read end of the pipe.
Yes, that is how pipes work. The output from the first process is the input to the second. In terms of implementation, the shell creates a pipe which receives what the first process writes to its standard output, and delivers it to the second process on its standard input.
... You should perhaps read an introduction to Unix shell programming if you have questions of this type.
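To see this from the shell side, here is a rough sketch. printf stands in for find -printf, and the /dev/stdin check is an assumption that holds on Linux and macOS but is not guaranteed everywhere:

```shell
# The producer writes unsorted names to its stdout; sort receives them
# on its stdin through the pipe and prints them in order.
sorted=$(printf '%s\n' pear apple fig | sort)
echo "$sorted"

# Inside a pipeline, the consumer's stdin is the read end of a pipe.
# Assumption: /dev/stdin exists and refers to fd 0 (true on Linux and macOS).
kind=$(echo hi | { if [ -p /dev/stdin ]; then echo pipe; else echo other; fi; })
echo "stdin of the right-hand side is: $kind"
```

Nothing reaches the monitor from the left-hand command; only the output of the last command in the pipeline is printed to the terminal.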

Diff output from two programs without temporary files

Say I have two programs a and b that I can run with ./a and ./b.
Is it possible to diff their outputs without first writing to temporary files?
Use <(command) to pass one command's output to another program as if it were a file name. Bash pipes the program's output to a pipe and passes a file name like /dev/fd/63 to the outer command.
diff <(./a) <(./b)
Similarly you can use >(command) if you want to pipe something into a command.
This is called "Process Substitution" in Bash's man page.
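A self-contained sketch of the same idea, assuming bash. The functions a and b are hypothetical stand-ins for ./a and ./b:

```shell
#!/usr/bin/env bash
# Requires bash: process substitution is not part of POSIX sh.
# a and b are made-up stand-ins for the two programs.
a() { printf 'alpha\nbeta\n'; }
b() { printf 'alpha\ngamma\n'; }

# The substitution expands to a path that the outer command can open.
path=$(echo <(a))
echo "diff is handed a path like: $path"

# diff exits with 0 when the inputs match and 1 when they differ.
status=0
diff <(a) <(b) || status=$?
echo "diff exit status: $status"
```

Because a and b differ in their second line, diff reports that line and the recorded status is 1.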
Adding to both the answers, if you want to see a side by side comparison, use vimdiff:
vimdiff <(./a) <(./b)
One option would be to use named pipes (FIFOs):
mkfifo a_fifo b_fifo
./a > a_fifo &
./b > b_fifo &
diff a_fifo b_fifo
... but John Kugelman's solution is much cleaner.
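If you do go the FIFO route, a variant that keeps the pipes in a scratch directory and removes them afterwards might look like this. The functions a and b are hypothetical stand-ins for ./a and ./b:

```shell
# a and b are made-up stand-ins for the two programs.
a() { printf 'alpha\nbeta\n'; }
b() { printf 'alpha\ngamma\n'; }

# Keep the FIFOs in a private scratch directory so they are easy to remove.
workdir=$(mktemp -d)
mkfifo "$workdir/a_fifo" "$workdir/b_fifo"

# Each writer blocks until diff opens the matching FIFO for reading.
a > "$workdir/a_fifo" &
b > "$workdir/b_fifo" &

status=0
diff "$workdir/a_fifo" "$workdir/b_fifo" || status=$?
wait    # reap the two background writers
echo "diff exit status: $status"

rm -rf "$workdir"
```

The extra bookkeeping (creating, waiting, deleting) is what process substitution saves you from.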
For anyone curious, this is how you perform process substitution in the Fish shell:
Bash:
diff <(./a) <(./b)
Fish:
diff (./a | psub) (./b | psub)
Unfortunately the implementation in fish is currently deficient; fish will either hang or use a temporary file on disk. You also cannot use psub for output from your command.
Adding a little more to the already good answers (helped me!):
The command docker outputs its help to STDERR (i.e. file descriptor 2)
I wanted to see if docker attach and docker attach --help gave the same output
$ docker attach
$ docker attach --help
Having just typed those two commands, I did the following:
$ diff <(!-2 2>&1) <(!! 2>&1)
!! is the same as !-1 which means run the command 1 before this one - the last command
!-2 means run the command two before this one
2>&1 means send file descriptor 2 output (STDERR) to the same place as file descriptor 1 output (STDOUT)
Hope this has been of some use.
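History expansion (!! and !-2) is only enabled by default in interactive shells, so in a script the commands have to be spelled out. A sketch of the same comparison, assuming bash, with two hypothetical functions standing in for docker attach and docker attach --help:

```shell
#!/usr/bin/env bash
# Requires bash for process substitution.
# usage_plain and usage_help are made-up stand-ins for the two docker
# invocations; like them, they write their text to stderr.
usage_plain() { echo "usage: tool attach CONTAINER" >&2; }
usage_help()  { echo "usage: tool attach CONTAINER" >&2; }

# 2>&1 inside each substitution routes stderr into the pipe diff reads.
# Without it, diff would see two empty inputs and the text would go
# straight to the terminal.
status=0
diff <(usage_plain 2>&1) <(usage_help 2>&1) || status=$?
echo "diff exit status: $status"
```

Both stand-ins print the same text, so diff finds no differences and the status is 0.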
For zsh, using =(command) automatically creates a temporary file and replaces =(command) with the path of the file itself. With command substitution, $(command) is replaced with the output of the command.
This zsh feature is very useful and can be used like so to compare the output of two commands using a diff tool, for example Beyond Compare:
bcomp =(ulimit -Sa | sort) =(ulimit -Ha | sort)
For Beyond Compare, note that you must use bcomp for the above (instead of bcompare), since bcomp launches the comparison and waits for it to complete. bcompare launches the comparison and exits immediately, so the temporary files created to store the output of the commands disappear.
Read more here: http://zsh.sourceforge.net/Intro/intro_7.html
Also notice this:
Note that the shell creates a temporary file, and deletes it when the command is finished.
and the following, which explains the difference between <(...) and =(...):
If you read zsh's man page, you may notice that <(...) is another form of process substitution which is similar to =(...). There is an important difference between the two. In the <(...) case, the shell creates a named pipe (FIFO) instead of a file. This is better, since it does not fill up the file system; but it does not work in all cases. In fact, if we had replaced =(...) with <(...) in the examples above, all of them would have stopped working except for fgrep -f <(...). You can not edit a pipe, or open it as a mail folder; fgrep, however, has no problem with reading a list of words from a pipe. You may wonder why diff <(foo) bar doesn't work, since foo | diff - bar works; this is because diff creates a temporary file if it notices that one of its arguments is -, and then copies its standard input to the temporary file.

When the input is from a pipe, does STDIN.read run until EOF is reached?

Sorry if this is a naïve question, but let's say I have a Ruby program called processor.rb that begins with data = STDIN.read. If I invoke this program like this
cat textfile.txt | processor.rb
Does STDIN.read wait for cat to pipe the entire textfile.txt in? Or does it assign some indeterminate portion of textfile.txt to the data variable?
I'm asking this because I recently saw a strange bug in one of my programs that suggests that the latter is the case.
The read method should read the entire input, as-is, and return only when the process producing the output has finished and closed its end of the pipe, which the reader sees as EOF. With output from cat, a subsequent call to read should return 0 bytes.
In simple terms, a process is allowed to append to its output at any time, which is the case of things like 'tail -f', so you can't be assured that you have read all the data from STDIN without actually checking.
Your OS may implement cat or shell pipes slightly differently, though. I'm not familiar with what POSIX dictates for behavior here.
It is probably line buffered and reads until it encounters a newline or EOF.
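A shell analogue of read-until-EOF (not Ruby, just the same pipe behaviour) can be sketched with cat inside a command substitution. The chunk texts and the one-second pause are arbitrary:

```shell
# The producer writes two chunks with a pause between them. The reader
# only sees EOF once the producer has exited and closed its end of the
# pipe, so both chunks end up in data.
data=$( { echo "first chunk"; sleep 1; echo "second chunk"; } | cat )
echo "$data"
```

If a read-until-EOF call appears to return early with partial data, it is worth checking whether the reader is actually using a different call that reads one line or one buffer at a time.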

Switch from file contents to STDIN in piped command? (Linux Shell)

I have a program (that I did not write) which is not designed to read in commands from a file. Entering commands on STDIN is pretty tedious, so I'd like to be able to automate it by writing the commands in a file for re-use. Trouble is, if the program hits EOF, it loops infinitely trying to read in the next command dropping an endless torrent of menu options on the screen.
What I'd like to be able to do is cat a file containing the commands into the program via a pipe, then use some sort of shell magic to have it switch from the file to STDIN when it hits the file's EOF.
Note: I've already considered using cat with the '-' for STDIN. Unfortunately (I didn't know this before), piped commands wait for the first program's output to terminate before starting the second program -- they do not run in parallel. If there's some way to get the programs to run in parallel with that kind of piping action, that would work!
Any thoughts? Thanks for any assistance!
EDIT:
I should note that my goal is not only to prevent the system from hitting the end of the commands file. I would like to be able to continue typing in commands from the keyboard when the file hits EOF.
I would do something like
(cat your_file_with_commands; cat) | sh your_script
That way, when the file with commands is done, the second cat will feed your script with whatever you type on stdin afterwards.
Same as Idelic's answer, with simpler syntax ;)
cat your_file_with_commands - | sh your_script
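The way cat switches from the file to stdin can be tried without the real program. In this sketch the file name, the command strings and the while loop (standing in for the program) are all made up, and a printf simulates the keyboard so that it runs unattended:

```shell
# Hypothetical commands file.
workdir=$(mktemp -d)
printf 'cmd-from-file-1\ncmd-from-file-2\n' > "$workdir/commands.txt"

# cat first copies the file, then keeps copying its own stdin ("-").
# The leading printf simulates typed input; interactively you would
# drop it and type on the terminal instead.
received=$(printf 'typed-later\n' | cat "$workdir/commands.txt" - |
    while read -r cmd; do
        echo "got: $cmd"
    done)
echo "$received"

rm -rf "$workdir"
```

The consumer starts receiving lines while cat is still running: both sides of a pipe run concurrently, and the consumer only sees EOF when cat's stdin is closed (Ctrl-D at a terminal).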
I would think expect would work for this.
Have you tried using something like tail -f commandfile | command? I think that should pipe the lines of the file to command without closing the file descriptor afterwards. Use -n to specify the number of lines to be piped if tail -f doesn't catch all of them.

Resources