Why does `cat <(cat)` produce EIO? - bash

I have a program that reads from two input files simultaneously. I'd like to have this program read from standard input. I thought I'd use something like this:
$program1 <(cat) <($program2)
but I've just discovered that
cat <(cat)
produces
....
mmap2(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb758e000
read(0, 0xb758f000, 131072) = -1 EIO (Input/output error)
....
cat: -: Input/output error
and similarly,
$ cat <(read -n 1)
bash: read: read error: 0: Input/output error
So... Linux is failing to read at the syscall level. That's interesting. Is bash not wiring stdin to the subshell? :(
Is there a solution to this? I specifically need to use process substitution (the ... <(...) format) because $program1 (tail, incidentally) expects files, and I need to do some preprocessing (with od) on standard input before I can pass it to tail - I can't just specify /dev/stdin et al.
EDIT:
What I actually want to do is read from a file (which another process will be writing to) while I also read from standard input so I can accept commands and such. I was hoping I could do
tail -f <(od -An -vtd1 -w1) <(cat fifo)
to read from standard input and the FIFO simultaneously and drop that into a single stdout stream I could run through awk (or similar). I know I could solve this trivially in any scripting language, but I like learning how to make bash do everything :P
EDIT 2: I've asked a new question that more fully explains the context I described just above.

1. Explain why cat <(cat) produces EIO
( I'm using Debian Linux 8.7, Bash 4.4.12 )
Let's replace <(cat) with the long running <(sleep) to see what's happening.
From pty #1:
$ echo $$
906
$ tty
/dev/pts/14
$ cat <(sleep 12345)
Go to another pty #2:
$ ps t pts/14 j
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
903 906 906 906 pts/14 29999 Ss 0 0:00 bash
906 29998 906 906 pts/14 29999 S 0 0:00 bash
29998 30000 906 906 pts/14 29999 S 0 0:00 sleep 12345
906 29999 29999 906 pts/14 29999 S+ 0 0:00 cat /dev/fd/63
$ ps p 903 j
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
1 903 903 903 ? -1 Ss 0 0:07 SCREEN -T linux -U
$
Let me explain it (according to the APUE book, 2nd edition):
The TPGID of 29999 shows that cat (PID 29999, PGID 29999) is in the foreground process group of the terminal (pts/14), while sleep is in a background process group (PGID 906).
The process group 906 is now an orphaned process group because "the parent of every member is either itself a member of the group or is not a member of the group’s session". (PID 906's PPID is 903, and 903 is in a different session.)
When a process in an orphaned background process group reads from its controlling terminal, read() fails with EIO.
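To look at these IDs without ps, here is a minimal sketch that reads them from /proc/PID/stat. It assumes Linux; job_ids and is_session_leader are names I made up for illustration, not existing tools.

```shell
# Print "PID PPID PGID SID TPGID" for a process (default: this shell).
# Assumption: Linux, where /proc/PID/stat exposes these fields.
job_ids() (
    pid=${1:-$$}
    stat=$(cat "/proc/$pid/stat" 2>/dev/null) || exit 1
    # Field 2 (the command name) is parenthesised and may contain spaces,
    # so drop everything up to the last ") " before splitting.
    rest=${stat##*") "}
    set -- $rest
    # Now: $1=state $2=ppid $3=pgrp $4=session $5=tty_nr $6=tpgid
    printf '%s %s %s %s %s\n' "$pid" "$2" "$3" "$4" "$6"
)

# Succeed if the process is a session leader (its PID equals its SID).
is_session_leader() (
    ids=$(job_ids "${1:-$$}") || exit 2
    set -- $ids
    [ "$1" = "$4" ]
)
```

Running job_ids on each PID from the ps listing above should show the same PGID, SID and TPGID columns that ps j prints.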
2. Explain why cat <(cat) sometimes works (not really!)
Daniel Voina mentioned in a comment that cat <(cat) works on OS X with Bash 3.2.57. I just managed to reproduce it also on Linux with Bash 4.4.12.
From pty #1:
bash-4.4# echo $$
10732
bash-4.4# tty
/dev/pts/0
bash-4.4# cat <(cat)
cat: -: Input/output error
bash-4.4#
bash-4.4#
bash-4.4# bash --norc --noprofile # start a new bash
bash-4.4# tac <(cat)
<-- It's waiting here so looks like it's working.
(The first cat <(cat) failing with EIO was explained in the first part of my answer.)
Go to another pty #2:
bash-4.4# ps t pts/0 j
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
10527 10732 10732 10732 pts/0 10805 Ss 0 0:00 bash
10732 10803 10803 10732 pts/0 10805 S 0 0:00 bash --norc --noprofile
10803 10804 10803 10732 pts/0 10805 S 0 0:00 bash --norc --noprofile
10804 10806 10803 10732 pts/0 10805 T 0 0:00 cat
10803 10805 10805 10732 pts/0 10805 S+ 0 0:00 tac /dev/fd/63
bash-4.4# ps p 10527 j
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
10526 10527 10527 10527 ? -1 Ss 0 0:00 SCREEN -T dtterm -U
bash-4.4#
Let's see what's happening:
The TPGID of 10805 shows that tac (PID 10805) is in the foreground process group of the terminal (pts/0), while cat (PID 10806) is in a background process group (PGID 10803).
But this time the pgrp 10803 is not orphaned, because its member PID 10803 (bash) has a parent (PID 10732, bash) that is in another pgrp (PGID 10732) within the same session (SID 10732).
According to the APUE book, SIGTTIN will be "generated by the terminal driver when a process in a (non-orphaned) background process group tries to read from its controlling terminal". So when cat reads stdin, SIGTTIN is sent to it, and by default this signal stops the process. That's why cat's STAT column shows T (stopped) in the ps output. Since it's stopped, the data we type on the keyboard is never delivered to it. So it only looks like it's working; it isn't really.
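The stopped state can also be checked from a script. This is a small sketch, again assuming Linux and /proc; proc_state is a hypothetical helper name. It prints the one-letter state that ps shows first in the STAT column.

```shell
# Print the one-letter state of a process (R, S, T, Z, ...).
# Assumption: Linux /proc/PID/stat; T means stopped by a job-control signal.
proc_state() (
    stat=$(cat "/proc/$1/stat" 2>/dev/null) || exit 1
    # Skip "PID (comm) "; the next field is the state.
    rest=${stat##*") "}
    printf '%s\n' "${rest%% *}"
)
```

For the session above, proc_state 10806 would be expected to print T while cat is stopped by SIGTTIN.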
Conclusion:
So the different behaviors (EIO vs. SIGTTIN) depend on whether the current Bash is a session leader or not. (In the 1st part of my answer, the bash of PID 906 is the session leader, but the bash of PID 10803 in the 2nd part is not the session leader.)

The accepted answer explains why this happens, but I found a workaround: wrap the command in an additional subshell with (), such as:
(cat <(cat))
Please find the solution details here:
https://unix.stackexchange.com/a/244333/89706

Related

Executing a shell with a command and returning

man bash seems to suggest that if I want to execute a command in a separate bash shell all I have to do is bash -c command:
-c string If the -c option is present, then commands are read from string.
I want to do that because I need to run a few things in different environments:
bash --rcfile ~/.bashrc.a -c mytest.a
bash --rcfile ~/.bashrc.b -c mytest.b
However, that didn't work as expected; one can see that by the number of bash shells running, for example:
$ bash
$ ps
PID TTY TIME CMD
7554 pts/0 00:00:00 bash
7573 pts/0 00:00:00 ps
28616 pts/0 00:00:00 bash
$ exit
exit
$ ps
PID TTY TIME CMD
7582 pts/0 00:00:00 ps
28616 pts/0 00:00:00 bash
$ bash -c ps
PID TTY TIME CMD
7583 pts/0 00:00:00 ps
28616 pts/0 00:00:00 bash
How should the invocation of bash be modified so that it starts a new shell with the specified rc, executes the given command in that shell (with the env modified according to the rc), and exits back?
It's already working exactly the way you want it to. The lack of an extra process is simply due to bash's tail-call optimization.
Bash recognizes that there's no point in having a shell instance whose only job is to wait for a process and exit. It will instead skip the fork and exec the process directly. This is a huge win for e.g. var=$(ps), where it cuts the number of expensive forks from 2 to 1.
If you give it additional commands to run afterwards, this optimization is no longer valid, and then you'll see the additional process:
$ bash -c 'ps'
PID TTY TIME CMD
4531 pts/10 00:00:00 bash
4540 pts/10 00:00:00 ps
$ bash -c 'ps; exit $?'
PID TTY TIME CMD
4531 pts/10 00:00:00 bash
4549 pts/10 00:00:00 bash
4550 pts/10 00:00:00 ps
bash --rcfile ~/.bashrc.a mytest.a will already run mytest.a in a separate process. -c is for specifying a shell command directly, rather than running a script.
# NO!
bash for x in 1 2 3; do echo "$x"; done
# Yes.
bash -c 'for x in 1 2 3; do echo "$x"; done'
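One caveat: bash only reads the --rcfile file when the shell is interactive, so a non-interactive bash -c will not pick it up on its own. A possible way around that is to source the file explicitly inside the command string. The helper name run_with_rc below is made up for this sketch; the file names in the usage line are the ones from the question.

```shell
# Run a command in a fresh bash after sourcing the given rc file.
# Arguments after the rc file are the command and its arguments.
run_with_rc() {
    # Inside the -c string: $0 is "bash", $1 is the rc file, the rest is the command.
    bash -c '. "$1" || exit; shift; "$@"' bash "$@"
}
```

Usage would look like run_with_rc ~/.bashrc.a mytest.a, and the environment changes made by the rc file disappear when that bash exits.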

Every time I run ps, it returns the usual PID and CMD, but

In every book I've read, it never returns like this:
PID CMD
2748 -bash
8114 awk
7900 -bash
Which is what my ps returns. Is it normal for the - to be in front of bash? I've only ever seen 2290 bash, never with the - in front of it. Trivial question, but I assume it isn't normal. Thank you, and sorry for the stupid question.
This means a login shell. Take a look at man bash:
A login shell is one whose first character of argument zero is a -, or one started with the --login option.
If you run cat /proc/2748/cmdline you will see the hyphen there. This is where ps is getting it from.
-f will look at /proc/[pid]/cmdline, whereas by default it will look at /proc/[pid]/comm.
tom@riki:~$ ps
PID TTY TIME CMD
9230 pts/2 00:00:00 bash
9429 pts/2 00:00:00 ps
tom@riki:~$ ps -f
UID PID PPID C STIME TTY TIME CMD
tom 9230 9229 0 17:39 pts/2 00:00:00 -bash
tom 9427 9230 0 18:22 pts/2 00:00:00 ps -f
tom@riki:~$ cat /proc/9230/comm
bash
tom@riki:~$ cat /proc/9230/cmdline
-bash
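The same check can be scripted. This is a rough sketch for Linux that reads argument zero from /proc/PID/cmdline (which is NUL-separated) and tests for the leading hyphen; argv0_of and is_login_name are hypothetical names.

```shell
# Print argument zero of a process. /proc/PID/cmdline separates
# arguments with NUL bytes, so turn those into newlines and keep line 1.
argv0_of() {
    tr '\0' '\n' < "/proc/$1/cmdline" | sed -n 1p
}

# Succeed if a command name carries the login-shell marker (leading "-").
is_login_name() {
    case $1 in
        -*) return 0 ;;
        *)  return 1 ;;
    esac
}
```

Something like is_login_name "$(argv0_of 9230)" would then succeed for the shell in the listing above. It only detects the hyphen form, not a shell started with --login. Inside bash itself, shopt -q login_shell answers the question for the current shell.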

Retrieving full command line (w/ pipes &c) from a running bash script

How can I get the complete command line that is running in bash from within a script launched by that same line?
ping -c 2 google.com & ping -c 2 aol.com | grep aol & sh myscript.sh
where I want to retrieve the complete upper line in myscript.sh somehow.
My current approach is:
ping -c 2 google.com & ping -c 2 aol.com | grep aol & ps -ef --sort=start_time
And then correlate the PPID and the start time of the process to get what was run.
UID PID PPID C STIME TTY TIME CMD
nm+ 2881 6599 0 12:09 pts/1 00:00:00 ping -c 2 google.com
nm+ 2882 6599 0 12:09 pts/1 00:00:00 ping -c 2 aol.com
nm+ 2883 6599 0 12:09 pts/1 00:00:00 grep --color=auto abc
nm+ 2884 6599 0 12:09 pts/1 00:00:00 ps -ef --sort=start_time
I don't like it, since I am unable to say how the processes are connected (pipes or just parallel execution), and therefore it's impossible to reconstruct the exact line that was run in bash. Also, it feels too hackish to be the right way.
You can grep for "pipe" in the lsof output, match up the pipe IDs to see which processes share a pipe, and then look up the details of those processes by their process IDs.
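On Linux the same pipe information is visible without lsof: each entry in /proc/PID/fd is a symlink, and for a pipe it reads pipe:[inode]. Two processes whose descriptors point at the same inode are connected by that pipe. A minimal sketch, with pipe_id as a made-up helper name:

```shell
# Print the pipe inode behind a file descriptor of a process,
# or fail if that descriptor is not a pipe. Assumption: Linux /proc.
pipe_id() (
    target=$(readlink "/proc/$1/fd/$2" 2>/dev/null) || exit 1
    case $target in
        'pipe:['*']') ;;      # looks like pipe:[12345]
        *) exit 1 ;;          # a tty, file, socket, ...
    esac
    target=${target#'pipe:['}
    printf '%s\n' "${target%']'}"
)
```

With the PIDs from the question's listing, pipe_id 2882 1 and pipe_id 2883 0 would print the same number if ping's stdout feeds grep's stdin, while a plain & job would fail the check because its stdout is still the terminal.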
Assuming bash 4.0 or newer:
#!/usr/bin/env bash
exec 3>"$1"; shift
BASH_XTRACEFD=3 PS4=':$BASH_SOURCE:$LINENO:+'
set -x
source "$@"
...if saved as bash_trace, used as:
bash_trace logfile scriptname arg1 arg2 ...
...then, to look up the actual line number, one can use something like the following:
IFS=: read -r filename lineno _ < <(tail -n 1 logfile)
sed -e "${lineno}q;d" <"$filename"

xterm -e not terminating when script finishes execution

I'm running two levels of xterms. In the first level I run "xterm -e bsub -Ip master.tcl". The master.tcl script invokes yet another xterm with "xterm -e bsub -Ip slave.tcl".
For some reason, when slave.tcl finishes executing, the second xterm does not close. However, the second xterm does display the following message once the slave script finishes:
<< JobExitInfo: Job <128309> is done successfully. >>
Also, when looking at the LSF system, the job does not appear, as if it really finished. But the xterm window stays open, instead of closing.
Any idea why?
Thanks.
It is unlikely that xterm would stay open unless there was something still running there.
I'd check (using ps -ef for instance) to see what processes are still running in the remaining xterm. xterm would only be still open if there were something running, e.g., waiting for input.
Using ps -ef (assuming this is not a BSD system), you would see a listing with a heading like this:
UID PID PPID C STIME TTY TIME CMD
and later in the listing, the relevant information, e.g.,
tom 3647 20185 0 06:17 pts/2 00:00:00 sh -c xterm -e vile
tom 3648 3647 0 06:17 pts/2 00:00:00 xterm -e vile
tom 3649 3648 0 06:17 pts/3 00:00:00 vile
tom 3650 3649 0 06:17 pts/3 00:00:00 sh -c ps -ef
tom 3651 3650 0 06:17 pts/3 00:00:00 ps -ef
xterm's process-id (PID) is the place to start. It will be found in the PPID (parent's process-ID) column in at least one other place. In turn, that process's PID may appear as the PPID of further child processes.
BSD systems use a different set of options (see FreeBSD for example), but in general you can obtain the necessary information from ps.
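Following the PPID column by hand gets tedious, so here is a rough sketch that lists the direct children of a PID. It assumes Linux and reads /proc instead of parsing ps output; children_of is a hypothetical name.

```shell
# Print the PIDs whose parent is the given PID, one per line.
# Assumption: Linux, where field 4 of /proc/PID/stat is the PPID.
children_of() (
    parent=$1
    for f in /proc/[0-9]*/stat; do
        # A process may exit while we scan; just skip it.
        stat=$(cat "$f" 2>/dev/null) || continue
        rest=${stat##*") "}       # drop "PID (comm) "
        set -- $rest              # $1=state $2=ppid ...
        if [ "$2" = "$parent" ]; then
            pid=${f#/proc/}
            printf '%s\n' "${pid%/stat}"
        fi
    done
)
```

In the listing above, children_of 3648 would be expected to print 3649, the vile process still running inside the xterm.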

Orphan Child (ksh Shell Script not terminating first upon CTRL-X)

I have a program (C++ Executable) on AIX 5.3 that launches a Shell Script (ksh).
When I launch the program and the shell script, I see two processes:
AIX:>ps -ef | grep 3657892
u001 **3657892** 3670248 0 18:16:34 pts/11 0:00 /u0012006/bin/Launcher
u001 3723398 **3657892** 0 18:16:41 pts/11 0:00 /usr/bin/ksh /u0012006/shell/Trjt_Slds.sh -m
Now, When I do a CTRL-X key combination on the Keyboard to end and go out of the Shell Script, the main launching program (C++ Executable) process gets killed while the shell script continues to execute.
AIX:>ps -ef | grep 3723398
u001 3723398 1 106 18:16:41 pts/11 0:01 /usr/bin/ksh /u0012006/shell/Trjt_Slds.sh -m
u001 3731504 3723398 0 0:00 <defunct>
u001 3735612 3723398 0 0:00 <defunct>
u001 3739838 3723398 0 0:00 <defunct>
This is leading to the CPU Consumption going to 100% and a lot of defunct processes get launched.
Is there a way to have the AIX Shell Script terminate first when I do a CTRL-X?
Note: Launcher is broken and should be fixed. Thus, any "solution" will be a hack.
One thought is to check $PPID in various places in the script. If it is set to 1 (init), then exit the script.
I don't understand the use of control-X. That is not going to generate any tty signal. I guess that is what you want. Perhaps the tty is also in raw mode. But you might consider hooking control-X up to one of the various tty signals like SIGINT. e.g. stty intr ^X but you will also need to remember to unset it with stty intr ^C
Last, you could wrap the script in a script and use the technique to kill the child and exit. e.g. (untested)
#!/bin/ksh
# launch original program in background
/path/to/real/program "$@" &
# get child's pid
child=$!
while : ; do
# when we become an orphan
if [[ $PPID -eq 1 ]] ; then
# kill the child and exit
kill $child
exit
fi
# poll once a second
sleep 1
done
Update
./s1 is:
#!/bin/ksh
./s2 &
sleep 10
exit
./s2 is:
#!/bin/ksh
while : ; do
if kill -0 $PPID ; then
echo still good
else
echo orphaned
exit
fi
sleep 1
done
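The kill -0 test in s2 can be pulled out into a small helper. This is only a sketch, and is_alive is a made-up name. Signal 0 delivers nothing; kill just reports whether the PID exists and may be signalled.

```shell
# Succeed if the given PID exists and we are allowed to signal it.
# Note: this also fails for a live process owned by another user.
is_alive() {
    kill -0 "$1" 2>/dev/null
}
```

In s2 the loop condition would then read if is_alive "$PPID", which relies on $PPID still holding the PID of the original parent.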
ksh always does this. I just got bitten by it: unlike bash, ksh does not forward HUP signals when you exit. If you can find the child PIDs, you can HUP them yourself.
