Making nested awk fault tolerant - shell

I have this command (part of a for loop) that tries to get the start time of a process in seconds and store it in a variable for later use. The command parses /proc/<pid>/stat. The processes involved can be ephemeral, so by the time I actually run the command the directory may no longer exist. The nested awk then fails and produces a syntax error in the outer division. How can I prevent this?
starttime=$(($(awk '{print $22}' $d/stat) / systick));
$d is /proc/<pid>

I would do like this:
starttime=$(($(awk '{print $22}' $d/stat 2>/dev/null || echo 0) / systick));
Basically, if awk fails for some reason, such as the PID's stat file no longer being there, then 0 is fed to the division instead.
This way you need only a small change to your code.
PS: I am assuming that systick is never 0.
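If you would rather skip vanished processes entirely instead of recording 0, a small variation (a sketch, assuming the surrounding for loop iterates over /proc/<pid> directories) would be:
for d in /proc/[0-9]*; do
    # the process may still exit between the test and the awk, so keep the fallback
    [ -r "$d/stat" ] || continue
    starttime=$(( $(awk '{print $22}' "$d/stat" 2>/dev/null || echo 0) / systick ))
    # ... use $starttime ...
done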

How to add a shell command and use the result in a Fortran program?

Is it possible to call shell command from a Fortran script?
My problem is that I analyze really big files. These files have a lot of lines, e.g. 84084002 or similar.
I need to know how many lines the file has before I start the analysis, so I usually used the shell command wc -l "filename" and then used this number as the value of a variable in my script.
But I would like to call this command from my program and use the number of lines and store it into the variable value.
Since the 2008 standard, which is already implemented by most of the commonly-encountered Fortran compilers including gfortran, there is a standard intrinsic subroutine execute_command_line which does, approximately, what the widely-implemented but non-standard subroutine system does. As @MarkSetchell has (almost) written, you could try
CALL execute_command_line('wc -l < file.txt > wc.txt' )
OPEN(unit=nn,file='wc.txt')
READ(nn,*) count
What Fortran doesn't have is a standard way in which to get the number of lines in a file without recourse to the kind of operating-system-dependent workaround above. Other, that is, than opening the file, counting the number of lines, and then rewinding to the start of the file to commence reading.
You should be able to do something like this:
command='wc -l < file.txt > wc.txt'
CALL system(command)
OPEN(unit=nn,file='wc.txt')
READ(nn,*) count
You can output the number of lines to a file (fort.1)
wc -l file|awk '{print $1}' > fort.1
In your Fortran program, you can then store the number of lines to a variable (e.g. count) by reading the file fort.1:
read (1,*) count
then you can loop over the variable count and read your whole file
do i = 1, count
   read (file,*)      ! here, file is the unit number your data file is opened on
end do

bash: intensive r/w operations cause damaged files

I have big txt files (for example, 1,000,000 lines each) and I want to sort them by some field and write the data to different output files in several dirs (one input file - one output dir). I can do it simply with awk:
awk '{print $0 >> "dir_"'$i'"/"$1".some_suffix"}' some_file;
If I process the files one by one it always works well, but if I try to work on many files at the same time, I usually (not always) get some output files truncated (I know exactly how many fields there should be, and it's always the same in my case, so it's easy to find the bad files). I use a command like
for i in <input_files>; do
    awk '{print $0 >> "dir_"'$i'"/"$1".some_suffix"}' < $i &
done
so each process creates files in its own output dir. I also tried to parallelize it with xargs and got the same results - some random files were truncated.
How could this happen? Is it a RAM or filesystem cache problem? Any suggestions?
Hardware: non-ECC RAM, AMD Opteron 6378 processors. I used an SSD (Plextor M5S) and tmpfs, with ext4 and reiserfs (the output files are small).
You are probably running out of file descriptors in your awk process; if you check carefully you'll find that maybe only the first 1021 (just under a power of 2, check ulimit -n for the limit) unique filenames work. Using print ... >> does not have the same behaviour as in a shell: it leaves the file open.
I assume you are using something more contemporary than a vintage awk, e.g. GNU's gawk:
https://www.gnu.org/software/gawk/manual/html_node/Close-Files-And-Pipes.html
Similarly, when a file or pipe is opened for output, awk remembers the file name or command associated with it, and subsequent writes to the same file or command are appended to the previous writes. The file or pipe stays open until awk exits.
This implies that special steps are necessary in order to read the same file again from the beginning, or to rerun a shell command (rather than reading more output from the same command). The close() function makes these things possible:
close(filename)
Try it with close():
gawk '{
    outfile="dir_"'$i'"/"$1".some_suffix"
    print $0 >> outfile
    close(outfile)
}' some_file;
gawk offers the special ERRNO variable which can be used to catch certain errors; sadly it is not set on output redirection errors, so this condition cannot easily be detected that way. However, gawk detects this condition internally (error EMFILE during an open operation) and attempts to close a not recently used file descriptor so that it can continue, but this isn't guaranteed to work in every situation.
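As an extra safeguard (a sketch, not part of the original answer), you can check the return value of close() yourself and report failures on stderr; passing the directory in with -v is an alternative to splicing the shell variable into the awk quoting:
gawk -v dir="dir_$i" '{
    outfile = dir "/" $1 ".some_suffix"
    print $0 >> outfile
    if (close(outfile) != 0)
        print "failed to close " outfile > "/dev/stderr"
}' some_file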
With gawk, you can use --lint for various run-time checks, including hitting the file-descriptor limit and failure to explicitly close files:
$ seq 1 1050 | gawk --lint '{outfile="output/" $1 ".out"; print $0 >> outfile;}'
gawk: cmd. line:1: (FILENAME=- FNR=1022) warning: reached system limit for open files:
starting to multiplex file descriptors
gawk: (FILENAME=- FNR=1050) warning: no explicit close of file `output/1050.out' provided
gawk: (FILENAME=- FNR=1050) warning: no explicit close of file `output/1049.out' provided
[...]
gawk: (FILENAME=- FNR=1050) warning: no explicit close of file `output/1.out' provided

Bash script execute shell command with Bash variable as argument

I have one loop that creates a group of variables like DISK1, DISK2... where the number at the end of the variable name gets created by the loop and then loaded with a path to a device name. Now I want to use those variables in another loop to execute a shell command, but the variable doesn't give its contents to the shell command.
for (( counter=1 ; counter<=devcount ; counter++))
do
TEMP="\$DISK$counter"
# $TEMP should hold the variable name of the disk, which holds the device name
# TEMP was only for testing, but still has same problem as $DISK$counter
eval echo $TEMP #This echos correctly
STATD$counter=$(eval "smartctl -H -l error \$DISK$counter" | grep -v "5.41" | grep -v "Joe")
eval echo \$STATD$counter
done
Don't use eval ever, except maybe if there is no other way AND you really know what you are doing.
The STATD$counter=$(...) line should give an error: it is not a valid assignment, because the string "STATD$counter" is not a valid variable name, so bash does not treat the line as an assignment at all. Using a concrete example: if counter happened to be 3 and your pipeline in the $( ) printed "output", bash would expand the line to "STATD3=output" and then try to find a command with that name and run it. Odds are this is not what you intended.
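A quick (hypothetical) session showing the failure mode:
$ counter=3
$ STATD$counter=$(echo output)
bash: STATD3=output: command not found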
It sounds like everything you want to do can be accomplished with arrays instead. If you are not familiar with bash arrays take a look at Greg's Wiki, in particular this page or the bash man page to find out how to use them.
For example, in the loop you didn't post in your question: make disk (not DISK: don't use all upper case variable names) an array like so
disk+=( "new value" )
or even
disk[counter]="new value"
Then in the loop in your question, you can make statd an array as well and assign it with values from disk by
statd[counter]="... ${disk[counter]} ..."
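Putting that together, a sketch of the loop from the question using arrays instead of eval (assuming the earlier loop has filled disk[1]..disk[devcount] with device paths) might look like:
for (( counter=1; counter<=devcount; counter++ )); do
    echo "${disk[counter]}"
    statd[counter]=$(smartctl -H -l error "${disk[counter]}" | grep -v "5.41" | grep -v "Joe")
    echo "${statd[counter]}"
done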
It's worth saying again: avoid using eval.

Best way for testing compiled code to return expected output/errors

How do you test if compiled code returns the expected output or fails as expected?
I have worked out a working example below, but it is not easily extendable. Every additional test would require additional nested parentheses. Of course I could split this into other files, but do you have any suggestions on how to improve this? Also, I'm planning to use this from the make test stanza in a makefile, so I don't expect other people to install something that isn't installed by default just for testing. And stdout should also remain interleaved with stderr.
simplified example:
./testFoo || echo execution failed
./testBar && echo expected failure
(./testBaz && (./testBaz 2>&1 | cmp -s - foo.tst && ( ./testFoo && echo and so on
|| echo testFoo's execution failed )|| echo testBaz's does not match )
|| echo testBaz's execution failed )
my current tester looks like this (for one test):
#!/bin/bash
compiler1 $1 && (compiler2 -E --make $(echo $1 | sed 's/^\(.\)\(.*\)\..*$/\l\1\2/') && (./$(echo $1 | sed 's/^\(.\)\(.*\)\..*$/\l\1\2/') || echo execution failed) || less $(echo $1 | sed 's/^\(.\)\(.*\)\..*$/\l\1\2/').err) || echo compile failed
I suggest to start looking for patterns here. For example, you could use the file name as the pattern and then create some additional files that encode the expected result.
You can then use a simple script to run the command and verify the result (instead of repeating the test code again and again).
For example, a file testFoo.exec with the content 0 means that it must succeed (or at least return with 0) while testBar.exec would contain 1.
testBaz.out would then contain the expected output. You don't need to call testBaz several times; you can redirect the output in the first call and then look at $? to see if the call succeeded or not. If it did, then you can directly verify the output (without starting the command again).
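A minimal sketch of such a runner (the .exec/.out naming is just the convention suggested above; the test names are illustrative):
#!/bin/bash
run_test() {
    local name=$1 expected actual
    expected=$(cat "$name.exec")        # expected exit status
    "./$name" > "$name.tmp" 2>&1        # run once, keeping stdout and stderr interleaved
    actual=$?
    if [ "$actual" -ne "$expected" ]; then
        echo "$name: expected exit $expected, got $actual"
    elif [ -f "$name.out" ] && ! cmp -s "$name.tmp" "$name.out"; then
        echo "$name: output differs from $name.out"
    fi
    rm -f "$name.tmp"
}

for t in testFoo testBar testBaz; do
    run_test "$t"
done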
My own simple minded test harness works like this:
- every test is represented by a bash script with an extension .test - these all live in the same directory
- when I create a test, I run the test script and examine the output carefully; if it looks good it goes into a directory called good_results, in a file with the same name as the test that generated it
- the main testing script finds all the .test scripts and executes each of them in turn, producing a temporary output file. This is diff'd with the matching file in the good_results directory and any differences reported
It took me about half an hour to write this and get it working, but it has proved invaluable!
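Something along these lines (a sketch of the harness described above; the directory and file names are illustrative):
#!/bin/bash
status=0
for t in *.test; do
    tmp=$(mktemp)
    "./$t" > "$tmp" 2>&1
    if ! diff -u "good_results/$t" "$tmp"; then
        echo "FAIL: $t"
        status=1
    fi
    rm -f "$tmp"
done
exit $status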

How to deal with NFS latency in shell scripts

I'm writing shell scripts where quite regularly some stuff is written to a file, after which an application is executed that reads that file. I find that through our company the network latency differs vastly, so a simple sleep 2, for example, will not be robust enough.
I tried to write a (configurable) timeout loop like this:
waitLoop()
{
    local timeout=$1
    local test="$2"
    if ! $test
    then
        local counter=0
        while ! $test && [ $counter -lt $timeout ]
        do
            sleep 1
            ((counter++))
        done
        if ! $test
        then
            exit 1
        fi
    fi
}
This works for test="[ -e $somefilename ]". However, testing existence is not enough, I sometimes need to test whether a certain string was written to the file. I tried
test="grep -sq \"^sometext$\" $somefilename", but this did not work. Can someone tell me why?
Are there other, less verbose options to perform such a test?
You can set your test variable this way:
test=$(grep -sq "^sometext$" $somefilename)
The reason your grep isn't working is that quotes are really hard to pass in arguments. You'll need to use eval:
if ! eval $test
I'd say the way to check for a string in a text file is grep.
What's your exact problem with it?
Also, you might adjust your NFS mount parameters to address the root problem. A sync might also help. See the NFS docs.
If you're wanting to use waitLoop in an "if", you might want to change the "exit" to a "return", so the rest of the script can handle the error situation (there's not even a message to the user about what failed before the script dies otherwise).
The other issue is that using "$test" to hold a command means that when it is executed you only get word splitting, not quote removal. So if you say test="grep \"foo\" \"bar baz\"", rather than looking for the three-letter string foo in the file with the seven-character name bar baz, it'll look for the five-character string "foo" in the nine-character file "bar baz".
So you can either decide you don't need the shell magic, and set test='grep -sq ^sometext$ somefilename', or you can get the shell to handle the quoting explicitly with something like:
if /bin/sh -c "$test"
then
...
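For example, a sketch of the first option with the waitLoop function from the question (the timeout value is illustrative):
test='grep -sq ^sometext$ somefilename'
waitLoop 30 "$test"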
Try using the file modification time to detect when it is written without opening it. Something like
old_mtime=`stat --format="%Z" file`   # note: %Z is the status-change time; %Y would give the modification time
# Write to file.
new_mtime=$old_mtime
while [[ "$old_mtime" -eq "$new_mtime" ]]; do
sleep 2;
new_mtime=`stat --format="%Z" file`
done
This won't work, however, if multiple processes try to access the file at the same time.
I just had the exact same problem. I used a similar approach to the timeout wait that you include in your OP; however, I also included a file-size check. I reset my timeout timer if the file had increased in size since last it was checked. The files I'm writing can be a few gig, so they take a while to write across NFS.
This may be overkill for your particular case, but I also had my writing process calculate a hash of the file after it was done writing. I used md5, but something like crc32 would work, too. This hash was broadcast from the writer to the (multiple) readers, and the reader waits until a) the file size stops increasing and b) the (freshly computed) hash of the file matches the hash sent by the writer.
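A rough sketch of that handshake (the file names and the way the hash file is shared are illustrative, not from the original setup):
# writer side, run after the write has finished:
md5sum bigfile > bigfile.md5

# reader side: wait until the locally computed hash matches the published one
while ! md5sum --check --status bigfile.md5 2>/dev/null; do
    sleep 5
done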
We have a similar issue, but for different reasons. We are reading a file which is sent to an SFTP server. The machine running the script is not the SFTP server.
What I have done is set it up in cron (although a loop with a sleep would work too) to do a cksum of the file. When the old cksum matches the current cksum (the file has not changed for the determined amount of time) we know that the writes are complete, and transfer the file.
Just to be extra safe, we never overwrite a local file before making a backup, and only transfer at all when the remote file has two cksums in a row that match, and that cksum does not match the local file.
If you need code examples, I am sure I can dig them up.
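In case it helps here, a bare-bones sketch of the cksum polling (the file name and interval are made up; the real setup also compares against the local copy before transferring):
file=incoming.dat
old=$(cksum "$file" 2>/dev/null)
while sleep 60; do
    new=$(cksum "$file" 2>/dev/null)
    if [ -n "$new" ] && [ "$new" = "$old" ]; then
        break                    # unchanged for a whole interval: writes are complete
    fi
    old=$new
done
# ... back up the local copy and transfer "$file" here ...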
The shell was splitting your predicate into words. Grab it all with "$*" as in the code below:
#! /bin/bash
waitFor()
{
    local tries=$1
    shift
    local predicate="$*"
    while [ $tries -ge 1 ]; do
        (( tries-- ))
        if $predicate >/dev/null 2>&1; then
            return
        else
            [ $tries -gt 0 ] && sleep 1
        fi
    done
    exit 1
}
pred='[ -e /etc/passwd ]'
waitFor 5 $pred
echo "$pred satisfied"
rm -f /tmp/baz
(sleep 2; echo blahblah >>/tmp/baz) &
(sleep 4; echo hasfoo >>/tmp/baz) &
pred='grep ^hasfoo /tmp/baz'
waitFor 5 $pred
echo "$pred satisfied"
Output:
$ ./waitngo
[ -e /etc/passwd ] satisfied
grep ^hasfoo /tmp/baz satisfied
Too bad the typescript isn't as interesting as watching it in real time.
Ok...this is a bit whacky...
If you have control over the file: you might be able to create a 'named pipe' here.
So (depending on how the writing program works) you can monitor the file in a synchronized fashion.
At its simplest:
Create the named pipe:
mkfifo file.txt
Set up the sync'd receiver:
while :
do
    process.sh < file.txt
done
Create a test sender:
echo "Hello There" > file.txt
The 'process.sh' is where your logic goes: this will block until the sender has written its output. In theory the writer program won't need modifying...
WARNING: if the receiver is not running for some reason, you may end up blocking the sender!
Not sure it fits your requirement here, but might be worth looking into.
Or, to avoid the synchronized approach, try 'lsof'?
http://en.wikipedia.org/wiki/Lsof
Assuming that you only want to read from the file when nothing else is writing to it (i.e. the writing process has finished), you could check whether nothing else has a file handle to it?
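A sketch of that lsof check (the file name and polling interval are illustrative); lsof exits non-zero when no process has the file open, so you can simply poll it:
file=somefile.txt
while lsof -- "$file" > /dev/null 2>&1; do
    sleep 1
done
# nothing holds the file open any more; read it here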
