I'm running two levels of xterms. In the first level I run "xterm -e bsub -Ip master.tcl". The master.tcl script invokes yet another xterm with "xterm -e bsub -Ip slave.tcl".
For some reason, when slave.tcl finishes executing, the second xterm does not close. However, it does display the following message once the slave script finishes:
<< JobExitInfo: Job <128309> is done successfully. >>
Also, when looking at the LSF system, the job does not appear, as if it really finished. But the xterm window stays open, instead of closing.
Any idea why?
Thanks.
It is unlikely that xterm would stay open unless there was something still running there.
I'd check (using ps -ef for instance) to see what processes are still running in the remaining xterm. xterm would only be still open if there were something running, e.g., waiting for input.
Using ps -ef (assuming this is not a BSD system), you would see a listing with a heading like this:
UID PID PPID C STIME TTY TIME CMD
and later in the listing, the relevant information, e.g.,
tom 3647 20185 0 06:17 pts/2 00:00:00 sh -c xterm -e vile
tom 3648 3647 0 06:17 pts/2 00:00:00 xterm -e vile
tom 3649 3648 0 06:17 pts/3 00:00:00 vile
tom 3650 3649 0 06:17 pts/3 00:00:00 sh -c ps -ef
tom 3651 3650 0 06:17 pts/3 00:00:00 ps -ef
xterm's process-id (PID) is the place to start. It will appear in the PPID (parent's process-ID) column in at least one other place. In turn, that process's PID may appear as the PPID of further child processes.
BSD systems use a different set of options (see FreeBSD for example), but in general you can obtain the necessary information from ps.
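The parent/child walk described above can be scripted. The following is a minimal sketch, assuming a ps that accepts the POSIX -e and -o options; children_of is a hypothetical helper name, not something from the original answer.

```shell
# children_of PID: print the direct children of PID (hypothetical helper).
# Assumes a ps that supports the POSIX options -e and -o.
children_of() {
    ps -e -o pid= -o ppid= -o args= |
        awk -v parent="$1" '$2 == parent'
}
```

With the listing above, children_of 3648 would show whatever is running inside that xterm; repeating the call with each child's PID walks further down the tree.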
I have a program that reads from two input files simultaneously. I'd like to have this program read from standard input. I thought I'd use something like this:
$program1 <(cat) <($program2)
but I've just discovered that
cat <(cat)
produces
....
mmap2(NULL, 139264, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb758e000
read(0, 0xb758f000, 131072) = -1 EIO (Input/output error)
....
cat: -: Input/output error
and similarly,
$ cat <(read -n 1)
bash: read: read error: 0: Input/output error
So... Linux is failing to read at the syscall level. That's interesting. Is bash not wiring stdin to the subshell? :(
Is there a solution to this? I specifically need to use process substitution (the ... <(...) format) because $program1 (tail, incidentally) expects files, and I need to do some preprocessing (with od) on standard input before I can pass it to tail - I can't just specify /dev/stdin et al.
EDIT:
What I actually want to do is read from a file (which another process will be writing to) while I also read from standard input so I can accept commands and such. I was hoping I could do
tail -f <(od -An -vtd1 -w1) <(cat fifo)
to read from standard input and the FIFO simultaneously and drop that into a single stdout stream I could run through awk (or similar). I know I could solve this trivially in any scripting language, but I like learning how to make bash do everything :P
EDIT 2: I've asked a new question that more fully explains the context I described just above.
1. Explain why cat <(cat) produces EIO
( I'm using Debian Linux 8.7, Bash 4.4.12 )
Let's replace <(cat) with the long running <(sleep) to see what's happening.
From pty #1:
$ echo $$
906
$ tty
/dev/pts/14
$ cat <(sleep 12345)
Go to another pty #2:
$ ps t pts/14 j
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
903 906 906 906 pts/14 29999 Ss 0 0:00 bash
906 29998 906 906 pts/14 29999 S 0 0:00 bash
29998 30000 906 906 pts/14 29999 S 0 0:00 sleep 12345
906 29999 29999 906 pts/14 29999 S+ 0 0:00 cat /dev/fd/63
$ ps p 903 j
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
1 903 903 903 ? -1 Ss 0 0:07 SCREEN -T linux -U
$
Let me explain it (according to the APUE book, 2nd edition):
The TPGID being 29999 indicates that cat (PID 29999) is the foreground process group which is now controlling the terminal (pts/14). And sleep is in the background process group (PGID 906).
The process group of 906 is now an orphaned process group because "the parent of every member is either itself a member of the group or is not a member of the group’s session". (The PID 906's PPID is 903 and 903 is in a different session.)
When the process in an orphaned background process group reads from its controlling terminal, read() would fail with EIO.
2. Explain why cat <(cat) sometimes works (not really!)
Daniel Voina mentioned in a comment that cat <(cat) works on OS X with Bash 3.2.57. I just managed to reproduce it also on Linux with Bash 4.4.12.
From pty #1:
bash-4.4# echo $$
10732
bash-4.4# tty
/dev/pts/0
bash-4.4# cat <(cat)
cat: -: Input/output error
bash-4.4#
bash-4.4#
bash-4.4# bash --norc --noprofile # start a new bash
bash-4.4# tac <(cat)
<-- It's waiting here so looks like it's working.
(The first cat <(cat) failing with EIO was explained in the first part of my answer.)
Go to another pty #2:
bash-4.4# ps t pts/0 j
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
10527 10732 10732 10732 pts/0 10805 Ss 0 0:00 bash
10732 10803 10803 10732 pts/0 10805 S 0 0:00 bash --norc --noprofile
10803 10804 10803 10732 pts/0 10805 S 0 0:00 bash --norc --noprofile
10804 10806 10803 10732 pts/0 10805 T 0 0:00 cat
10803 10805 10805 10732 pts/0 10805 S+ 0 0:00 tac /dev/fd/63
bash-4.4# ps p 10527 j
PPID PID PGID SID TTY TPGID STAT UID TIME COMMAND
10526 10527 10527 10527 ? -1 Ss 0 0:00 SCREEN -T dtterm -U
bash-4.4#
Let's see what's happening:
The TPGID being 10805 indicates that tac (PID 10805) is the foreground process group which is now controlling the terminal (pts/0). And cat (PID 10806) is in the background process group (PGID 10803).
But this time the pgrp 10803 is not orphaned, because the parent (PID 10732, bash) of its member PID 10803 (bash) is in another pgrp (PGID 10732) but in the same session (SID 10732).
According to the APUE book, SIGTTIN will be "generated by the terminal driver when a process in a (non-orphaned) background process group tries to read from its controlling terminal". So when cat reads stdin, SIGTTIN will be sent to it and by default this signal would stop the process. That's why the cat's STAT column was shown as T (stopped) in the ps output. Since it's stopped the data we input from keyboard are not sent to it at all. So it just looks like it's working but it's not really.
Conclusion:
So the different behaviors (EIO vs. SIGTTIN) depend on whether the current Bash is a session leader or not. (In the 1st part of my answer, the bash of PID 906 is the session leader, but the bash of PID 10803 in the 2nd part is not the session leader.)
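To check which of the two cases applies to a given shell, one could compare its PID with its session ID. This is only a sketch: is_session_leader is a hypothetical helper, and it assumes a ps that knows the sid output keyword (procps on Linux does). The messages simply restate the conclusion above; the underlying condition is whether the background process group is orphaned.

```shell
# is_session_leader [PID]: succeed if PID (default: this shell) leads its session.
# Hypothetical helper; assumes ps understands the "sid" output keyword.
is_session_leader() {
    pid=${1:-$$}
    sid=$(ps -o sid= -p "$pid" | tr -d ' ')
    [ -n "$sid" ] && [ "$sid" = "$pid" ]
}

if is_session_leader; then
    echo "session leader: a background read from the tty is expected to fail with EIO"
else
    echo "not a session leader: a background read is expected to stop on SIGTTIN"
fi
```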
The accepted answer explains why this happens, but here is a workaround I found: wrap the command in a subshell with an additional pair of parentheses, such as:
(cat <(cat))
Please find the solution details here:
https://unix.stackexchange.com/a/244333/89706
I need a command that shows all processes attached to a terminal. ps -a looks good, except that it does not print the username. It prints:
PID TTY TIME CMD
26969 pts/34 0:00 man
27636 pts/2 0:00 awk
25215 pts/35 0:00 bash
I would like it to be similar to this:
PID TTY TIME CMD USER
26969 pts/34 0:00 man name
27636 pts/2 0:00 awk name
25215 pts/35 0:00 bash name
Column order does not matter.
Use:
ps a -o pid,tty,etime,cmd,user
From ps manual:
SIMPLE PROCESS SELECTION
a ... An alternate description is that this option causes ps to list all processes with a terminal (tty), or to list all processes when used together with the x option.
STANDARD FORMAT SPECIFIERS
Here are the different keywords that may be used to control the output format (e.g. with option -o) or to sort the selected processes with the GNU-style --sort option.
For example: ps -eo pid,user,args --sort user
I have found that ps -af works the way I want
When using Upstart, controlling subprocesses (child processes) is quite important. What confuses me is the following, which goes beyond Upstart itself:
scenario 1:
root#ubuntu-jstorm:~/Desktop# su cr -c 'sleep 20 > /tmp/a.out'
I got 3 processes by: cr#ubuntu-jstorm:~$ ps -ef | grep -v grep | grep sleep
root 8026 6544 0 11:11 pts/2 00:00:00 su cr -c sleep 20 > /tmp/a.out
cr 8027 8026 0 11:11 ? 00:00:00 bash -c sleep 20 > /tmp/a.out
cr 8028 8027 0 11:11 ? 00:00:00 sleep 20
scenario 2:
root#ubuntu-jstorm:~/Desktop# su cr -c 'sleep 20'
I got 2 processes by: cr#ubuntu-jstorm:~$ ps -ef | grep -v grep | grep sleep
root 7975 6544 0 10:03 pts/2 00:00:00 su cr -c sleep 20
cr 7976 7975 0 10:03 ? 00:00:00 sleep 20
The sleep 20 process is the one I care about: it is the process Upstart should manage. In scenario 1, however, Upstart ends up managing bash -c sleep 20 > /tmp/a.out rather than sleep 20, so it does not work correctly.
So why does scenario 1 produce 3 processes? This doesn't make sense to me. I know I can use exec to fix it, but I want to understand what actually happens when each of the two commands runs.
su -c starts the shell and passes it the command via its -c option. The shell may spawn as many processes as it likes (it depends on the given command).
It appears that the shell executes the command directly, without forking, in some cases. E.g., if you run su -c '/bin/sleep $$', the apparent behaviour is as if:
su starts a shell process (e.g., /bin/sh)
the shell gets its own process id (PID) and substitutes $$ with it
the shell exec()s /bin/sleep.
You should see in ps output that sleep's argument is equal to its pid in this case.
If you run su -c '/bin/sleep $$ >/tmp/sleep', then the argument of /bin/sleep is different from its PID (it is equal to an ancestor's PID), i.e.:
su starts a shell process (e.g., /bin/sh)
the shell gets its own process id (PID) and substitutes $$ with it
the shell double forks and exec()s /bin/sleep.
The double fork indicates that the actual sequence of events might be different e.g., su could orchestrate the forking or not forking, not the shell (I don't know). It seems the double fork is there to make sure that the command won't get a controlling terminal.
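The same $$ trick can be used without su to see what the exec fix mentioned in the question changes. This is a sketch, not a statement about how any particular su or shell behaves: show_owner is a hypothetical helper, sh stands in for the shell su would start, and ps -o args= -p is assumed to be available.

```shell
# show_owner CMDSTRING: run CMDSTRING via sh -c with a temporary file as $1,
# then print what the command wrote there (hypothetical helper).
show_owner() {
    out=$(mktemp) || return 1
    sh -c "$1" sh "$out"
    cat "$out"
    rm -f "$out"
}

# Without exec: whether sh forks here depends on the shell and its version,
# so the PID that was $$ may still belong to the shell.
show_owner 'ps -o args= -p $$ > "$1"'

# With exec: the shell is replaced, so the PID that was $$ now belongs to ps.
show_owner 'exec ps -o args= -p $$ > "$1"'
```

In the second call the reported command line is that of ps itself, which is why prefixing the command with exec leaves no intermediate shell process for a supervisor to track.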
command > file
This is not an atomic action; it is actually done in 2 processes:
one executes the command;
the other does the output redirection.
These two actions cannot be done in one process.
Am I right?
I have a program (C++ Executable) on AIX 5.3 that launches a Shell Script (ksh).
When I launch the program and the shell script, I see two processes:
AIX:>ps -ef | grep 3657892
u001 3657892 3670248 0 18:16:34 pts/11 0:00 /u0012006/bin/Launcher
u001 3723398 3657892 0 18:16:41 pts/11 0:00 /usr/bin/ksh /u0012006/shell/Trjt_Slds.sh -m
Now, when I press CTRL-X on the keyboard to end the shell script and leave it, the main launching program (the C++ executable) gets killed while the shell script continues to execute.
AIX:>ps -ef | grep 3723398
u001 3723398 1 106 18:16:41 pts/11 0:01 /usr/bin/ksh /u0012006/shell/Trjt_Slds.sh -m
u001 3731504 3723398 0 0:00 <defunct>
u001 3735612 3723398 0 0:00 <defunct>
u001 3739838 3723398 0 0:00 <defunct>
This drives CPU consumption to 100% and leaves a lot of defunct processes behind.
Is there a way to have the AIX Shell Script terminate first when I do a CTRL-X?
Note: Launcher is broken and should be fixed. Thus, any "solution" will be a hack.
One thought is to check $PPID in various places in the script. If it is set to 1 (init), then exit the script.
I don't understand the use of control-X. That is not going to generate any tty signal. I guess that is what you want. Perhaps the tty is also in raw mode. But you might consider hooking control-X up to one of the various tty signals like SIGINT. e.g. stty intr ^X but you will also need to remember to unset it with stty intr ^C
Last, you could wrap the script in a script and use the technique to kill the child and exit. e.g. (untested)
#!/bin/ksh
# launch original program in background
/path/to/real/program "$@" &
# get child's pid
child=$!
while : ; do
# when we become an orphan
if [[ $PPID -eq 1 ]] ; then
# kill the child and exit
kill $child
exit
fi
# poll once a second
sleep 1
done
Update
./s1 is:
#!/bin/ksh
./s2 &
sleep 10
exit
./s2 is:
#!/bin/ksh
while : ; do
if kill -0 $PPID ; then
echo still good
else
echo orphaned
exit
fi
sleep 1
done
ksh always does this; I just got bitten by it. Unlike bash, ksh does not forward HUP signals to its children when it exits. If you can find the child PIDs, you can send them HUP yourself.
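One way to "find the child PIDs" is to record them at launch time and signal them from an EXIT trap. A minimal sketch under stated assumptions: start_with_cleanup is a hypothetical helper, it tracks only a single child, and it uses the POSIX name() form of function definition so that the trap belongs to the script rather than to the function.

```shell
# start_with_cleanup CMD [ARGS...]: run CMD in the background and arrange for
# it to receive SIGHUP when this shell exits (hypothetical helper name).
start_with_cleanup() {
    "$@" &
    child=$!
    # Fires when the script exits; errors are ignored if the child is already gone.
    trap 'kill -HUP "$child" 2>/dev/null' EXIT
}
```

In the s1 example above, start_with_cleanup ./s2 would take the place of ./s2 &.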
I was trying to write a set of functions that could check whether a process name was running when I encountered some unexpected output. I've condensed the issue into the following script, named isRunning.sh, which depends on a system ps command that accepts the '-fC' arguments...
#!/bin/bash
progname=isRunning.sh
ps -fC isRunning.sh
pRet=`ps -fC ${progname} | wc -l`
echo pRet $pRet
psOut=`ps -fC ${progname}`
wcOut=`echo "${psOut}" | wc -l`
echo
echo ps output
echo "${psOut}"
echo
echo wcOut $wcOut
The first attempt at piping the ps output to wc gets a return of 3. The second attempt gets the expected return value of 2. Can anyone explain this behavior? I figure it's got to be something stupid I am overlooking.
Thanks,
bbb
edit: my output
UID PID PPID C STIME TTY TIME CMD
root 6717 5940 0 13:10 pts/0 00:00:00 /bin/bash ./isRunning.sh
pRet 3
ps output
UID PID PPID C STIME TTY TIME CMD
root 6717 5940 0 13:10 pts/0 00:00:00 /bin/bash ./isRunning.sh
wcOut 2
I get 2 on both attempts. Your ps might be outputting an extra blank line, or somesuch, with your shell's backtick expansion then stripping it. Or maybe you actually had two processes matching the first time you ran it.
If you just want to see if it's running, check the exit code from your ps:
if ps -C "${progname}" > /dev/null; then
echo its running
else
echo not running
fi
Even better, you should take a look at pidof and pgrep if you can rely on them being present on whichever systems you're targeting. Or use the LSB functions, if you're on Linux.
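As a rough sketch of the pgrep route, assuming pgrep is installed (is_running is a hypothetical helper name):

```shell
# is_running NAME: succeed if some process has exactly this name.
# Hypothetical helper; assumes pgrep is available.
is_running() {
    pgrep -x -- "$1" > /dev/null
}
```

pgrep does not report itself, but a script that looks for its own name will still match its own process, so that case needs separate handling.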
edit: Actually, since you're looking for copies of yourself running, you might be picking up the shell doing a fork to implement the pipe.
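The forking described in this edit can be observed from inside bash. This is a small sketch that relies on bash's BASHPID variable, so it assumes the script runs under bash; the variable names are made up for illustration.

```shell
#!/bin/bash
# $$ stays the PID of the main script, while $BASHPID is the PID of whichever
# bash process is actually evaluating the expansion.
main_pid=$$
sub_pid=$(echo "$BASHPID")           # evaluated in the fork made for the substitution
pipe_pid=$(echo "$BASHPID" | cat)    # evaluated in a further fork made for the pipeline
echo "main=$main_pid substitution=$sub_pid pipeline=$pipe_pid"
```

Each of those forked copies carries the script's name until it execs something else, so a ps -C that runs at that moment can see it as an extra match.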