How to pipe cmdline output of program to run multiple times and halt when a keyword appears? - shell

Say I want to run a C program 1000 times. This program is basically a test script that tests the functionality of a simple kernel I have written, and it outputs a "SUCCESS" every time it fails. Because of various race conditions that are hard to track down, we often have to run the test manually literally a few hundred times before it fails. I have searched the net in vain for Perl or bash scripts that can help us run this command:
pintos -v -k -T 60 --qemu -j 2 --filesys-size=2 -p tests/vm/page-parallel -a page-parallel -p tests/vm/child-linear -a child-linear --swap-size=4 -- -q -f run page-parallel < /dev/null
and pipe its output to something that checks for a keyword, so it can halt or continue depending on whether that keyword appears.
Can anyone point me in the right direction?

In bash you can just run it in a while loop:
while true; do
    if pintos -v -k -T 60 --qemu -j 2 --filesys-size=2 -p tests/vm/page-parallel -a page-parallel -p tests/vm/child-linear -a child-linear --swap-size=4 -- -q -f run page-parallel < /dev/null | grep -c KEYWORD; then
        break
    fi
done
Note that the command itself should not be wrapped in quotes: quoting the whole thing would make bash look for a single command whose name is that entire string.
grep -c counts the matches and exits with success only when at least one is found: if the count is 0 the KEYWORD was not found and the loop runs again; if it is greater than 0 the KEYWORD was found and the loop breaks out.
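If you also want to keep the output of the failing run and see how many iterations it took, a variation along these lines may help (a sketch; KEYWORD and run.log are placeholders, not part of the original question):
#!/bin/bash
# Sketch: rerun the test until KEYWORD shows up, saving each run's output.
# KEYWORD and run.log are placeholders; substitute your own.
run=0
while true; do
    run=$((run + 1))
    pintos -v -k -T 60 --qemu -j 2 --filesys-size=2 \
        -p tests/vm/page-parallel -a page-parallel \
        -p tests/vm/child-linear -a child-linear \
        --swap-size=4 -- -q -f run page-parallel < /dev/null > run.log 2>&1
    if grep -q KEYWORD run.log; then
        echo "KEYWORD found on run $run; output saved in run.log"
        break
    fi
done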

Related

Making a Bash script that can open multiple terminals and run wget in each

I have to download over 100,000 docs in bulk from a databank using this script:
#!/usr/bin/bash
IFS=$'\n'
set -f
for line in $(cat < "$1")
do
wget https://www.uniprot.org/uniprot/${line}.txt
done
The first time it took over a week to download all the files (each under 8 KB), so I tried opening multiple terminals and running a split of the total.txt (10 equal splits of 10,000 files in 10 terminals), and in just 14 hours I had all the documents downloaded. Is there a way to make a script do that for me?
This is a sample of what the list looks like:
D7E6X7
A0A1L9C3F2
A3K3R8
W0K0I7
To open a new terminal and run a command in it, you can use one of:
gnome-terminal -e command
or
xterm -e command
or
konsole -e command
or
terminal -e command
There is another alternative to make it faster.
Right now your downloads are sequential, i.e. the next download does not start until the current one is finished.
Search for how to run a command asynchronously / in the background on Unix.
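A minimal sketch of that idea, reusing your loop but starting wget in the background in batches (the batch size of 10 is an arbitrary assumption, adjust to taste):
#!/usr/bin/bash
# Sketch: background the downloads in batches; "$1" is the accession list,
# as in the original script. The batch size of 10 is an arbitrary choice.
count=0
while IFS= read -r line; do
    wget -q "https://www.uniprot.org/uniprot/${line}.txt" &
    count=$((count + 1))
    if (( count % 10 == 0 )); then
        wait    # let the current batch finish before starting the next
    fi
done < "$1"
wait    # wait for the last partial batch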
When you were doing this by hand, opening multiple terminals made sense. If you want to script this, you can run multiple processes from one terminal/script. You could use xargs to start multiple processes simultaneously:
xargs -a list.txt -n 1 -P 8 -I # bash -c "wget https://www.uniprot.org/uniprot/#.txt"
Where:
-a list.txt tells xargs to use the list.txt file as input.
-n 1 tells xargs to use a maximum of one argument (from the input) for each command it runs.
-P 8 tells xargs to run 8 commands at a time; you can change this to suit your system/requirements.
-I # tells xargs to use "#" to represent the input (i.e. the line from your file).
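Since xargs can substitute the replacement string straight into the wget argument, the bash -c wrapper isn't strictly needed; an equivalent sketch:
# Equivalent sketch: let xargs substitute # directly into the URL.
# -P 8 still limits it to 8 parallel downloads.
xargs -a list.txt -P 8 -I # wget -q https://www.uniprot.org/uniprot/#.txt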

Cron job won't start again after I stopped it?

I wrote a script to run constantly on startup. If for whatever reason the script were to fail, I wrote a second script to check if it has failed, and if so, run the first script again. I then set this second script as a cronjob to run every minute so that it is constantly checking if the first script is alive.
So to test this, I reboot my system. I can see in htop that the first script is running from start up as expected. Good. I kill the process to test the second script. Sure enough, the second script starts the first script again. Still good. I then kill this process, but the second script won't run again now. It still updates a txt file when I manually start the first script, but the second script just doesn't start the first script like it's supposed to. Is it because I killed the cronjob? Restarting the cron service doesn't fix anything though, so I don't know why my second script isn't running again at all.
First script:
#!/bin/bash
stamp=$(date +%Y%m%d-%H%M)
timeout 10d tcpdump -i eth0 -s 96 -z gzip -C 10 -w /home/user/Documents/${stamp}
Second script:
#!/bin/bash
echo "not running" > /home/working.txt
if (( $(ps -ef | grep -v grep | grep tcpdump.sh | wc -l) > 0 ))
then
echo "tcpdump is running!!!" > /home/working.txt
else
/usr/local/bin/tcpdump.sh start
fi
Any help?
You would probably be better off running a simple while loop as the main script, which kicks off the tcpdump script in the background, something like:
#!/bin/bash
while true; do
    if ps -ef | grep -v grep | grep -q tcpdump.sh; then
        : tcpdump.sh running OK
    else
        # tcpdump.sh not running - start it off
        nohup /usr/local/bin/tcpdump.sh start &
    fi
    sleep 30
done
This checks that "tcpdump.sh" appears in the output of the "ps -ef" command. If it does, do nothing (note that you must have an actual command between the "then" and the "else"; the ":" command, which just takes its arguments and ignores them, is sufficient). If it isn't running, start the first script in the background, then sleep 30 seconds and check again. (Yes, the test could be inverted to avoid the empty "then" arm, but that would make the code less obvious.)
You put this script as the one which starts at boot time.
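One way to do that (a sketch; the script path and log file are hypothetical, not from the original answer) is an @reboot crontab entry:
# Hypothetical crontab entry: start the watchdog loop once at boot.
@reboot /usr/local/bin/watchdog.sh >> /var/log/watchdog.log 2>&1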
Edit: Do you really want to check for "tcpdump.sh"? Is that what the first script is actually called? Assuming that you actually want to check for the tcpdump program, you could use:
if pgrep tcpdump; then

Terminal Application to Keep Web Server Process Alive

Is there an app that can, given a command and options, execute for the lifetime of the process and ping a given URL indefinitely on a specific interval?
If not, could this be done on the terminal as a bash script? I'm almost positive it's doable through terminal, but am not fluent enough to whip it up within a few minutes.
Found this post that has a portion of the solution, minus the ping bits. ping runs indefinitely on Linux until it's actively killed. How would I kill it from bash after, say, two pings?
General Script
As others have suggested, use this approach, in pseudocode:
execute command and save PID
while PID is active, ping and sleep
exit
This results in the following script:
#!/bin/bash
# execute command, use '&' at the end to run in background
<command here> &
# store pid
pid=$!
while ps -p "$pid" > /dev/null; do
    ping <address here>
    sleep <timeout here in seconds>
done
Note that the placeholders inside <> should be replaced with actual values, be it a command or an IP address.
Break from Loop
To answer your second question: that depends on the loop. In the loop above, simply track the loop count using a variable. To do that, add a ((count++)) inside the loop, and then: [[ $count -eq 2 ]] && break. Now the loop will break when we're pinging for the second time.
Something like this:
...
while ...; do
...
((count++))
[[ $count -eq 2 ]] && break
done
Ping twice
To ping only a few times, use the -c option:
ping -c <count here> <address here>
Example:
ping -c 2 www.google.com
Use man ping for more information.
Better practice
As hek2mgl noted in a comment below, the current solution may not suffice to solve the problem: it answers the question, but the core problem will still persist. To address that problem, a cron job is suggested, in which a simple wget or curl HTTP request is sent periodically. This results in a fairly easy script containing but one line:
#!/bin/bash
curl <address here> > /dev/null 2>&1
This script can be added as a cron job. Leave a comment if you would like more information on how to set up such a scheduled job. Special thanks to hek2mgl for analyzing the problem and suggesting a sound solution.
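For example, a crontab entry along these lines would fire the request every five minutes (the path and interval are placeholders, adjust to your setup):
# Hypothetical crontab entry: run the keep-alive request every 5 minutes.
*/5 * * * * /path/to/keepalive.sh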
Say you want to start a download with wget and, while it is running, ping the URL:
wget http://example.com/large_file.tgz &    # put in background
pid=$!
while kill -s 0 $pid 2> /dev/null           # test if process is running
do
    ping -c 1 127.0.0.1                     # ping your address once
    sleep 5                                 # and sleep for 5 seconds
done
A nice little generic utility for this is Daemonize. Its relevant options:
Usage: daemonize [OPTIONS] path [arg] ...
-c <dir> # Set daemon's working directory to <dir>.
-E var=value # Pass environment setting to daemon. May appear multiple times.
-p <pidfile> # Save PID to <pidfile>.
-u <user> # Run daemon as user <user>. Requires invocation as root.
-l <lockfile> # Single-instance checking using lockfile <lockfile>.
Here's an example of starting/killing in use: flickd
To get more sophisticated, you could turn your ping script into a systemd service, now standard on many recent Linuxes.
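A minimal sketch of such a unit file (the service name, script path and restart policy are assumptions, not from the original answer):
# Sketch of /etc/systemd/system/keepalive.service (hypothetical name/paths).
# Enable and start it with: systemctl enable --now keepalive.service
[Unit]
Description=Periodic keep-alive ping for the web server

[Service]
ExecStart=/path/to/keepalive.sh
Restart=always

[Install]
WantedBy=multi-user.target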

Maintaining a set number of concurrent jobs w/ args from a file in bash

I found this script on the net. I don't know much about working in bash and it all seems a bit weird to me, but here's my script:
CONTOR=0
for i in `cat targets`
do
    CONTOR=`ps aux | grep -c php`
    while [ $CONTOR -ge 250 ]; do
        CONTOR=`ps aux | grep -c php`
        sleep 0.1
    done
    if [ $CONTOR -le 250 ]; then
        php b $i > /dev/null &
    fi
done
My targets are URLs, and the b PHP file is a crawler which saves some links into a file. The problem is that the maximum number of threads only reaches 50-60, because the crawler finishes very fast and the bash script doesn't have time to open all 250 of my threads. Is there any chance to get all 250 threads open? Is it possible to run more than one thread per ps aux check? Right now it seems to open one thread after each ps aux.
First: Bash has no multithreading support whatsoever. foo & starts a separate process, not a thread.
Second: launching ps to check for children is both prone to false positives (treating unrelated invocations of php as if they were jobs in the current process) and extremely inefficient if done in a loop (since every invocation involves a fork()/exec()/wait() cycle).
Thus, don't do it that way: Use a release of GNU xargs with -P, or (if you must) GNU parallel.
Assuming your targets file is newline-delimited, and has no special quoting or characters, this could be as simple as:
xargs -d $'\n' -n 1 -P 250 php b <targets
...or, for pure POSIX shells:
xargs -d "
" -n 1 -P 250 php b <targets
With GNU Parallel it looks like this (choose the style you like best):
cat targets | parallel -P 250 php b
parallel -a targets -P 250 php b
parallel -P 250 php b :::: targets
There is no risk of false positives if there are other php processes running. And unlike xargs, there is no risk if the file targets contains spaces, " or '.
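If you do want to stay in plain bash, here is a sketch that caps the number of concurrent jobs without polling ps (assuming bash 4.3+ for wait -n; not part of the original answers):
#!/bin/bash
# Sketch: keep up to 250 crawler processes running using bash job control.
# Requires bash 4.3+ for "wait -n"; "targets" is the newline-delimited file.
max_jobs=250
while IFS= read -r url; do
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
        wait -n    # block until any background job finishes
    done
    php b "$url" > /dev/null &
done < targets
wait    # wait for the remaining jobs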

bash script parallel ssh remote command

I have a script that fires remote commands on several different machines through an ssh connection. The script goes something like:
for server in list; do
echo "output from $server"
ssh to server execute some command
done
The problem with this is evidently the time, as it needs to establish the ssh connection, fire the command, wait for the answer, and print it. What I would like is a script that tries to establish all the connections at once and prints "output from $server" and the command's output as soon as it gets it, so not necessarily in list order.
I've been googling this for a while but didn't find an answer. I cannot cancel the ssh session after the command runs, as one thread suggested, because I need the output, and I cannot use GNU parallel as suggested in other threads. Also, I cannot use any other tool or bring/install anything onto this machine; the only usable tool is GNU bash, version 4.1.2(1)-release.
Another question: how are ssh sessions like this limited? If I simply paste 5 or more lines of "ssh connect, do some command", it actually doesn't do anything, or executes only the first one from the list (it works if I paste 3-4 lines). Thank you.
Have you tried this?
for server in list; do
ssh user@server "command" &
done
wait
echo finished
Update: Start subshells:
for server in list; do
(echo "output from $server"; ssh user#server "command"; echo End $server) &
done
wait
echo All subshells finished
There are several parallel SSH tools that can handle that for you:
http://code.google.com/p/pdsh/
http://sourceforge.net/projects/clusterssh/
http://code.google.com/p/sshpt/
http://code.google.com/p/parallel-ssh/
Also, you could be interested in configuration deployment solutions such as Chef, Puppet, Ansible, Fabric, etc. (see this summary).
A third option is to use a terminal broadcast such as pconsole.
If you can only use GNU commands, you can write your script like this:
for server in $servers ; do
    ( { echo "output from $server" ; ssh user@$server "command" ; } | \
        sed -e "s/^/$server:/" ) &
done
wait
and then sort the output to reconcile the lines.
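Usage sketch (the script name is hypothetical): pipe the combined output through sort so each server's lines end up grouped together:
# Hypothetical usage: group the prefixed lines back together by server name.
./run_remote.sh | sort -t: -k1,1 > all_servers.log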
I started with the shell hacks mentioned in this thread, then proceeded to something somewhat more robust: https://github.com/bearstech/pussh
It's my daily workhorse, and I basically run anything against 250 servers in 20 seconds (it's actually rate limited, otherwise the connection rate kills my ssh-agent). I've been using this for years.
See for yourself in the man page (clone it and run 'man ./pussh.1'): https://github.com/bearstech/pussh/blob/master/pussh.1
Examples
Show all servers' rootfs usage in descending order:
pussh -f servers df -h / |grep /dev |sort -rn -k5
Count the number of processors in a cluster:
pussh -f servers grep ^processor /proc/cpuinfo |wc -l
Show the processor models, sorted by occurrence:
pussh -f servers sed -ne "s/^model name.*: //p" /proc/cpuinfo |sort |uniq -c
Fetch a list of installed packages in one file per host:
pussh -f servers -o packages-for-%h dpkg --get-selections
Mass copy a file tree (broadcast):
tar czf files.tar.gz ... && pussh -f servers -i files.tar.gz tar -xzC /to/dest
Mass copy several remote file trees (gather):
pussh -f servers -o '|(mkdir -p %h && tar -xzC %h)' tar -czC /src/path .
Note that the pussh -u feature (upload and execute) was the main reason why I programmed this; no other tool seemed to be able to do it. I still wonder if that's the case today.
You may like the parallel-ssh project with the pssh command:
pssh -h servers.txt -l user command
It will output one line per server when the command is successfully executed. With the -P option you can also see the output of the command.
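For example, a quick sketch checking uptime across the list (uptime is just a stand-in command, not from the original answer):
# Hypothetical example: run uptime on every host in servers.txt,
# printing each host's output as it arrives thanks to -P.
pssh -h servers.txt -l user -P uptime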
