How do I get the unix terminal to say how many files are left to run, and which file it is working on? - shell

I have a shell script that is sending in about 150 files to a python program I wrote.
I have no idea how long it is going to take, so I was wondering if there was a terminal command to either:
a) tell me which file is currently being worked on
b) how many files are left to run
Here's my shell:
#!/bin/bash
ls fp00t*g*k2odfnew.dat | while read line; do
echo $line
python file_editor.py $line
done

PipeViewer will probably do what you need: http://www.catonmat.net/blog/unix-utilities-pipe-viewer/
Something like this might work, putting both ls and pv in line mode:
#!/bin/bash
ls -1 fp00t*g*k2odfnew.dat | pv -l | while read -r line; do
echo "$line"
python file_editor.py "$line"
done
You can also supply a total to pv with -s, so it knows how many lines to expect and the progress bar works properly:
#!/bin/bash
ls -1 fp00t*g*k2odfnew.dat | pv -l -s $(ls -1 fp00t*g*k2odfnew.dat | wc -l) | while read -r line; do
echo "$line"
python file_editor.py "$line"
done
Full pv docs here: http://www.ivarch.com/programs/quickref/pv.shtml
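If installing pv is not an option, a plain counter in the loop can answer both (a) and (b). The following is only a sketch: run_with_progress and process_file are names made up for this example, and process_file is a placeholder for the python file_editor.py call.

```shell
#!/bin/bash
# process_file is a hypothetical placeholder for: python file_editor.py "$1"
process_file() {
    : "$1"    # replace this no-op with the real command
}

# Print "[i/N] name (k left after this one)" before handling each file.
run_with_progress() {
    local total=$# i=0 f
    for f in "$@"; do
        i=$((i + 1))
        printf '[%d/%d] %s (%d left after this one)\n' "$i" "$total" "$f" "$((total - i))"
        process_file "$f"
    done
}

# Usage with the glob from the question:
#   run_with_progress fp00t*g*k2odfnew.dat
```

Passing the glob directly as arguments also avoids parsing the output of ls, so file names containing spaces are handled as single items.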

Related

Catch output of several piped commands

Until today I have always been able to find answers to my bash questions, but now I am stuck. I am testing a 20TB RAID6 configuration running on an LSI 9265.
I wrote a script that creates files from /dev/urandom, and I am writing a second one that calculates the md5 of every file, with two additions:
One is to use the time command to measure the md5sum execution time.
The second is to use the pv command to show the progress of each md5sum command.
My command looks like this:
filename="2017-03-13_12-38-08"
/usr/bin/time -f "real read %E" pv $filename | md5sum | sed "s/-/$filename /"
This is example terminal printout:
/usr/bin/time -f "real read %E" pv $i | md5sum | sed "s/-/$i/"
1GiB 0:00:01 [ 551MiB/s] [==================================================================================================>] 100%
real read 0:01.85
f561af8cc0927967c440fe2b39db894a 2017-03-13_12-38-08
I want to log all of this to a file. Every attempt with 2>&1, tee, or brackets has failed. I know pv writes to stderr, but that has not helped me find a solution. I can only capture "f561af8cc0927967c440fe2b39db894a 2017-03-13_12-38-08_done", which is not enough.
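This is the solution. The parentheses group the whole pipeline in a subshell, so 2>&1 applies to the stderr of time and pv as well, not only to sed; pv -f forces pv to draw its progress display even though stderr is no longer a terminal:
(time pv -f "$filename" | md5sum | sed "s/-/$filename/") 2>&1 | tee output.log
Or the equivalent that writes only to output.log, without printing to the terminal:
(time pv -f "$filename" | md5sum | sed "s/-/$filename/") > output.log 2>&1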
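The effect of grouping can be checked without pv or a large file. In this sketch noisy_stage is a hypothetical stand-in for pv (status text on stderr, data on stdout), and the two log files are temporary files created only for the demonstration.

```shell
#!/bin/bash
# noisy_stage is a hypothetical stand-in for pv: status on stderr, data on stdout.
noisy_stage() {
    echo "progress 100%" >&2
    echo "payload"
}

log_grouped=$(mktemp)
log_last_only=$(mktemp)

# Redirection on the group: stderr of every stage ends up in the log.
{ noisy_stage | tr 'a-z' 'A-Z'; } > "$log_grouped" 2>&1

# Redirection on the last stage only: the stderr of noisy_stage escapes the log.
# (The outer 2>/dev/null just keeps the escaped line off the terminal here.)
{ noisy_stage | tr 'a-z' 'A-Z' > "$log_last_only" 2>&1; } 2>/dev/null

echo "grouped log lines:   $(grep -c '' "$log_grouped")"
echo "last-stage-only log: $(grep -c '' "$log_last_only")"
```

The grouped log should contain both the progress line and the data, while the second log should contain the data only. Braces are used here instead of parentheses; either form groups the pipeline for the purpose of redirection.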

netcat inside a while read loop returning immediately

I am writing a menu script for myself, because I sometimes need to find out (with nmap) which ports are open on a host.
The command picked from the menu should behave exactly as if I had run it on the command line.
Here is a piece of my code:
nmap $1 | grep open | while read line; do
serviceport=$(echo $line | cut -d' ' -f1 | cut -d'/' -f1);
if [ $i -eq $choice ]; then
echo "Running command: netcat $1 $serviceport";
netcat $1 $serviceport;
fi;
i=$(($i+1));
done;
netcat exits immediately, right after nmap has finished scanning.
Don't use FD 0 (stdin) for both your read loop and netcat. If you don't distinguish these streams, netcat can consume content emitted by the nmap | grep pipeline rather than leaving that content to be read by read.
This has a few undesirable effects: Further parts of the while/read loop don't get executed, and netcat sees a closed stdin stream and exits when the pipeline's contents are consumed (so you don't get interactive control of netcat, if that's what you're trying to accomplish). An easy way to work around this issue is to feed the output of your nmap pipeline in on a non-default file descriptor; below, I'm using FD 3.
There's a lot wrong with this code beyond the scope of the question, so please don't consider the parts I've copied-and-pasted an endorsement, but:
while read -r -u 3 line; do
serviceport=${line%% *}; serviceport=${serviceport%%/*}
if [ "$i" -eq "$choice" ]; then
echo "Running command: netcat $1 $serviceport"
netcat "$1" "$serviceport"
fi
i=$((i + 1))
done 3< <(nmap "$1" | grep open)
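The difference between the two loop styles can be reproduced without nmap or netcat. In this sketch cat plays the role of netcat (any command that reads stdin), and the function names count_fd3 and count_stdin are invented for the example.

```shell
#!/bin/bash
# Loop input arrives on FD 3; the inner command keeps its own, separate stdin.
count_fd3() {
    local n=0 line
    while read -r -u 3 line; do
        n=$((n + 1))
        cat > /dev/null          # reads stdin (here /dev/null), never FD 3
    done 3< <(printf '%s\n' one two three) < /dev/null
    echo "$n"
}

# Loop input arrives on stdin; the inner command shares it and drains it.
count_stdin() {
    local n=0 line
    while read -r line; do
        n=$((n + 1))
        cat > /dev/null          # swallows the remaining lines meant for read
    done < <(printf '%s\n' one two three)
    echo "$n"
}

echo "iterations with FD 3:  $(count_fd3)"
echo "iterations with stdin: $(count_stdin)"
```

The FD 3 version should run once per input line, while the shared-stdin version should stop after the first iteration because the inner command has consumed the rest.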

bash script inside here document not behaving as expected

Here is a minimal test case which fails
#!/bin/tcsh
#here is some code in tcsh I did not write which spawns many processes.
#let us pretend that it spawns 100 instances of stupid_test which the user kills
#manually after an indeterminate period
/bin/bash <<EOF
#!/bin/bash
while true
do
if [[ `ps -e | grep stupid_test | wc -l` -gt 0 ]]
then
echo 'test program is still running'
echo `ps -e | grep stupid_test | wc -l`
sleep 10
else
break
fi
done
EOF
echo 'test program finished'
The stupid_test program consists of
#!/bin/bash
while true; do sleep 10; done
The intended behavior is to run until stupid_test is killed (in this case manually by the user), and then terminate within the next ten seconds. The observed behavior is that the script does not terminate, and evaluates ps -e | grep stupid_test | wc -l == 1 even after the program has been killed (and it no longer shows up under ps)
If the bash script is run directly, rather than in a here document, the intended behavior is recovered.
I feel like I am doing something very stupid; I am not the most experienced shell hacker. Why is it doing this?
Usually when you try to grep the name of a process, you get an extra matching line for grep itself, for example:
$ ps xa | grep something
57386 s002 S+ 0:00.01 grep something
So even when there is no matching process, you will get one matching line. You can fix that by adding a grep -v grep in the pipeline:
ps -e | grep stupid_test | grep -v grep | wc -l
As tripleee suggested, an even better fix is writing the grep like this:
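ps -e | grep '[s]tupid_test'
The pattern matches exactly the same text, but this way it no longer matches grep itself, because the string "grep [s]tupid_test" doesn't match the regular expression /[s]tupid_test/. Quoting the pattern keeps the shell from expanding it as a glob when a file named stupid_test exists in the current directory.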
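The bracket trick can be checked against a canned process listing instead of a live ps. The listings below are invented sample data, and count_matches is a helper made up for this sketch.

```shell
#!/bin/bash
# count_matches PATTERN LISTING: number of lines in LISTING that match PATTERN.
count_matches() {
    printf '%s\n' "$2" | grep -c -- "$1"
}

# Simulated ps output while running: ps ... | grep stupid_test
listing_plain='101 stupid_test
202 grep stupid_test'

# Simulated ps output while running: ps ... | grep '[s]tupid_test'
listing_bracket='101 stupid_test
202 grep [s]tupid_test'

echo "plain pattern:   $(count_matches 'stupid_test' "$listing_plain")"
echo "bracket pattern: $(count_matches '[s]tupid_test' "$listing_bracket")"
```

The plain pattern counts the grep line as well as the real process, whereas the bracketed pattern counts only the real process.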
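I would also rewrite your script more cleanly, like this. Note the quoted 'EOF': with an unquoted delimiter, the calling shell performs variable and command substitution on the here document body before bash ever sees it, so $(...) and "$s" would not reach bash intact.
/bin/bash <<'EOF'
while :; do
s=$(ps -e | grep '[s]tupid_test')
test "$s" || break
echo test program is still running
echo "$s"
sleep 10
done
EOF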
Or a more lazy but perhaps sufficient variant (hinted by bryn):
/bin/bash <<EOF
while ps -e | grep '[s]tupid_test'
do
echo test program is still running
sleep 10
done
EOF
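One more thing worth checking when a script behaves differently inside a here document: with an unquoted delimiter, the outer shell substitutes variables and command substitutions (backticks included) once, while it builds the document, so the inner shell only ever sees the frozen result. The sketch below uses bash as the outer shell for convenience; as far as I know tcsh applies the same rule to here documents with an unquoted word. The variable name marker is made up for the example.

```shell
#!/bin/bash
marker=outer

# Unquoted delimiter: "$marker" is replaced by the OUTER shell before bash runs.
unquoted=$(bash <<EOF
marker=inner
echo "$marker"
EOF
)

# Quoted delimiter: the body is passed through untouched and the INNER bash expands it.
quoted=$(bash <<'EOF'
marker=inner
echo "$marker"
EOF
)

echo "unquoted delimiter prints: $unquoted"
echo "quoted delimiter prints:   $quoted"
```

Applied to the test case in the question, this would mean the backticked ps -e | grep stupid_test | wc -l is evaluated a single time, which would explain a count that never changes.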

bash output redirect prob

I want to count the number of lines output from a command in a bash script. i.e.
COUNT=ls | wc -l
But I also want the script to output the original output from ls. How to get this done? (My actual command is not ls and it has side effects. So I can't run it twice.)
The tee(1) utility may be helpful:
$ ls | tee /dev/tty | wc -l
CHANGES
qpi.doc
qpi.lib
qpi.s
4
info coreutils "tee invocation" includes the following example, which might be more instructive of tee(1)'s power:
wget -O - http://example.com/dvd.iso \
| tee >(sha1sum > dvd.sha1) \
>(md5sum > dvd.md5) \
> dvd.iso
That downloads the file once, sends output through two child processes (as started via bash(1) process substitution) and also tee(1)'s stdout, which is redirected to a file.
ls | tee tmpfile | first command
cat tmpfile | second command
tee is a good way to do that, but you could do something simpler:
ls > __tmpfile
cat __tmpfile | wc -l
cat __tmpfile
rm __tmpfile
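If a temporary file is not wanted either, the output can be captured once in a variable and then both printed and counted. This is a sketch under assumptions: produce_lines is a hypothetical stand-in for the real command with side effects, and note that command substitution strips trailing newlines, so trailing blank lines would not be counted.

```shell
#!/bin/bash
# produce_lines is a hypothetical stand-in for the real command.
produce_lines() {
    printf '%s\n' CHANGES qpi.doc qpi.lib qpi.s
}

output=$(produce_lines)        # the command runs exactly once
printf '%s\n' "$output"        # show the original output

# Count lines; an empty capture counts as zero lines rather than one.
if [ -n "$output" ]; then
    COUNT=$(printf '%s\n' "$output" | grep -c '')
else
    COUNT=0
fi
echo "COUNT=$COUNT"
```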

Using xargs to assign stdin to a variable

All that I really want to do is make sure everything in a pipeline succeeded and assign the last stdin to a variable. Consider the following dumbed down scenario:
x=`exit 1|cat`
When I run declare -a, I see this:
declare -a PIPESTATUS='([0]="0")'
I need some way to notice the exit 1, so I converted it to this:
exit 1|cat|xargs -I {} x={}
And declare -a gave me:
declare -a PIPESTATUS='([0]="1" [1]="0" [2]="0")'
That is what I wanted, so I tried to see what would happen if the exit 1 didn't happen:
echo 1|cat|xargs -I {} x={}
But it fails with:
xargs: x={}: No such file or directory
Is there any way to have xargs assign {} to x? What about other methods of having PIPESTATUS work and assigning the stdin to a variable?
Note: these examples are dumbed down. I'm not really doing an exit 1, echo 1 or a cat, but used these commands to simplify so we can focus on my particular issue.
When you use backticks (or the preferred $()) you're running those commands in a subshell. The PIPESTATUS you're getting is for the assignment rather than the piped commands in the subshell.
When you use xargs, it knows nothing about the shell so it can't make variable assignments.
Try set -o pipefail then you can get the status from $?.
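A small sketch of the pipefail approach: the output still lands in the variable, and the failure of an early stage becomes visible in the exit status of the assignment. failing_stage is a made-up function standing in for the real first command.

```shell
#!/bin/bash
# failing_stage is hypothetical: it produces output and then fails.
failing_stage() {
    echo data
    return 1
}

# Without pipefail the status is that of the last stage (cat), so the failure is hidden.
without=0
x=$(failing_stage | cat) || without=$?

# With pipefail the pipeline fails if any stage fails.
set -o pipefail
with=0
x=$(failing_stage | cat) || with=$?
set +o pipefail

echo "x=$x without=$without with=$with"
```

The option set in the parent shell is inherited by the subshell that runs the command substitution, which is why the assignment itself reports the failure.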
xargs is run in a child process, as are all the commands you call, so they can't affect the environment of your shell.
You might be able to do something with named pipes (mkfifo), or possibly bash's read builtin?
EDIT:
Maybe just redirect the output to a file, then you can use PIPESTATUS:
command1 | command2 | command3 >/tmp/tmpfile
## Examine PIPESTATUS
X=$(cat /tmp/tmpfile)
How about ...
read x <<<"$(echo 1)"
read x < <(echo 1)
echo "$x"
Why not just populate a new array?
IFS=$'\n' read -r -d '' -a result < <(echo a | cat | cat; echo "PIPESTATUS='${PIPESTATUS[*]}'" )
IFS=$'\n' read -r -d '' -a result < <(echo a | exit 1 | cat; echo "PIPESTATUS='${PIPESTATUS[*]}'" )
echo "${#result[@]}"
echo "${result[@]}"
echo "${result[0]}"
echo "${result[1]}"
There are already a few helpful solutions. It turns out that I actually had an example that matches the question as framed above; close-enough anyway.
Consider this:
XX=$(ls -l *.cpp | wc -l | xargs -I{} echo {})
echo $XX
3
This means that I had three .cpp files in my working directory. Now $XX is 3 and I can make use of that result in my script. The example is contrived, because I don't actually need xargs here, but it works.
In the example from the question ...
x=`exit 1|cat`
I don't think that will give you what was specified. Each stage of a pipeline runs in its own subshell, so exit 1 ends only that stage; cat still runs, and without pipefail the pipeline reports cat's status.
On that note, I might start with something like
declare -a PIPESTATUS='([0]="0")'
x=$?
x now has the status from the last command.
Assign each line of input to an array, e.g. all Python files in a directory:
declare -a pyFiles=($(ls -l *.py | awk '{print $9}'))
where $9 is the ninth field of the ls -l output, corresponding to the filename. Note that this breaks on filenames that contain whitespace.
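For the specific case of file names, a glob can fill the array directly without going through ls and awk. The sketch below builds a throwaway directory so it can run anywhere; workdir and the file names in it are invented for the example.

```shell
#!/bin/bash
# Throwaway directory with sample files (one name contains a space).
workdir=$(mktemp -d)
touch "$workdir/a.py" "$workdir/b c.py" "$workdir/notes.txt"

shopt -s nullglob              # an unmatched glob expands to nothing
pyFiles=("$workdir"/*.py)      # one array element per matching file
shopt -u nullglob

echo "${#pyFiles[@]} python files"
printf '%s\n' "${pyFiles[@]##*/}"   # print the names without the directory part

rm -r "$workdir"
```

Each matching file becomes exactly one array element, including names with spaces.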
