IBM LSF handling of stdout for bsub jobs entering SSUSP state - cluster-computing

For those familiar with IBM LSF: I am wondering if someone knows how to configure a job, either via the job itself or a global config file, to handle stdout when a job enters the SSUSP state. As far as I understand my issue, a job restarts from the beginning after entering SSUSP, but its stdout is appended to rather than overwritten. Below is an example of a submission. This happens for a variety of functions regardless of output type (compressed or plain text file).
bsub -R 'span[hosts=1]' -N -o stdout.file myFunction ...

One potential workaround, depending on the specific use case, is to pass -oo instead of -o, as this overwrites the output file instead of appending to it, but I haven't been able to address the original issue...
https://scicomp.ethz.ch/wiki/LSF_mini_reference
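For reference, a minimal sketch of the same submission using -oo (the function name and options are just the placeholders from the example above):
bsub -R 'span[hosts=1]' -N -oo stdout.file myFunction ...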

Related

logs are not getting written in log file when I specify command in crontab with bash script [duplicate]

I want to know how I can see exactly what the cron jobs are doing on each execution. Where are the log files located? Or can I send the output to my email? I have set the email address to send the log when the cron job runs but I haven't received anything yet.
* * * * * myjob.sh >> /var/log/myjob.log 2>&1
will log all output from the cron job to /var/log/myjob.log
You might use mail to send emails. Most systems will send unhandled cron job output by email to root or the corresponding user.
By default cron logs to /var/log/syslog so you can see cron related entries by using:
grep CRON /var/log/syslog
https://askubuntu.com/questions/56683/where-is-the-cron-crontab-log
There are at least three different types of logging:
The logging BEFORE the program is executed, which only records whether the cronjob TRIED to execute the command. That one is located in /var/log/syslog, as already mentioned by Matthew Lock.
The logging of errors AFTER the program tried to execute, which can be sent to an email or to a file, as mentioned by Spliffster. I prefer logging to a file, because with email you then have a NEW source of problems: checking whether email sending and reception are working perfectly. Sometimes they are, sometimes they aren't. For example, on a simple desktop machine where you are not interested in configuring an SMTP server, you will sometimes prefer logging to a file:
* * * * * COMMAND_ABSOLUTE_PATH > /ABSOLUTE_PATH_TO_LOG 2>&1
I would also check the permissions of /ABSOLUTE_PATH_TO_LOG, and run the command with that user's permissions, just to verify whether permissions might be a potential source of problems while you test.
The logging of the program itself, with its own error handling and logging for tracking purposes.
There are some common sources of problems with cronjobs:
* The ABSOLUTE PATH of the binary to be executed. When you run it from your shell it might work, but the cron process seems to use another environment, and hence it doesn't always find binaries if you don't use the absolute path (see the short example after this list).
* The LIBRARIES used by a binary. This is more or less the same as the previous point, but make sure that, if you simply put the NAME of the command, it refers to exactly the binary that uses the very same libraries; or better, check that the binary you refer to with the absolute path is the very same one you get when you use the console directly. The binaries can be found using the locate command, for example:
$ locate python
Be sure that the binary you refer to is the very same binary you are calling in your shell, or simply test again in your shell using the absolute path that you plan to put in the cronjob.
Another common source of problems is the syntax in the cronjob. Remember that there are special characters you can use for lists (commas), to define ranges (dashes), to define step values within ranges (slashes), etc. Take a look:
http://www.softpanorama.org/Utilities/cron.shtml
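As a concrete illustration of the absolute-path point above (the interpreter and script paths here are placeholders, not taken from the question):
# likely to fail under cron, which runs with a minimal PATH
* * * * * python /home/user/myjob.py >> /var/log/myjob.log 2>&1
# safer: use the absolute path to the interpreter, found beforehand with 'which python' in your shell
* * * * * /usr/bin/python /home/user/myjob.py >> /var/log/myjob.log 2>&1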
Here is my code:
* * * * * your_script_fullpath >> your_log_path 2>&1
On Ubuntu you can enable a cron.log file to contain just the CRON entries.
Uncomment the line that mentions cron in the /etc/rsyslog.d/50-default.conf file:
# Default rules for rsyslog.
#
# For more information see rsyslog.conf(5) and /etc/rsyslog.conf
#
# First some standard log files. Log by facility.
#
auth,authpriv.* /var/log/auth.log
*.*;auth,authpriv.none -/var/log/syslog
#cron.* /var/log/cron.log
Save and close the file and then restart the rsyslog service:
sudo systemctl restart rsyslog
You can now see cron log entries in its own file:
sudo tail -f /var/log/cron.log
Sample outputs:
Jul 18 07:05:01 machine-host-name CRON[13638]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
However, you will not see more information about what scripts were actually run inside /etc/cron.daily or /etc/cron.hourly, unless those scripts direct output to the cron.log (or perhaps to some other log file).
If you want to verify that a crontab is running, without having to search for it in cron.log or syslog, create a crontab entry that redirects output to a log file of your choice - something like:
# For more information see the manual pages of crontab(5) and cron(8)
#
# m h dom mon dow command
30 2 * * 1 /usr/local/sbin/certbot-auto renew >> /var/log/le-renew.log 2>&1
Steps taken from: https://www.cyberciti.biz/faq/howto-create-cron-log-file-to-log-crontab-logs-in-ubuntu-linux/
cron already sends the standard output and standard error of every job it runs by mail to the owner of the cron job.
You can use MAILTO=recipient in the crontab file to have the emails sent to a different account.
For this to work, you need to have mail working properly. Delivering to a local mailbox is usually not a problem (in fact, chances are ls -l "$MAIL" will reveal that you have already been receiving some) but getting it off the box and out onto the internet requires the MTA (Postfix, Sendmail, what have you) to be properly configured to connect to the world.
If there is no output, no email will be generated.
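For example, a minimal crontab sketch using MAILTO (the address and job path are placeholders):
MAILTO=admin@example.com
42 17 * * * /usr/local/bin/nightly-report.sh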
A common arrangement is to redirect output to a file, in which case of course the cron daemon won't see the job return any output. A variant is to redirect standard output to a file (or write the script so it never prints anything - perhaps it stores results in a database instead, or performs maintenance tasks which simply don't output anything?) and only receive an email if there is an error message.
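A sketch of that variant (reusing the placeholder name script from the entry below): only standard output goes to the file, so cron mails you only what the job writes to standard error:
42 17 * * * script >>stdout.log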
To redirect both output streams, the syntax is
42 17 * * * script >>stdout.log 2>>stderr.log
Notice how we append (double >>) instead of overwrite, so that any previous job's output is not replaced by the next one's.
As suggested in many answers here, you can have both output streams be sent to a single file; replace the second redirection with 2>&1 to say "standard error should go wherever standard output is going". (But I don't particularly endorse this practice. It mainly makes sense if you don't really expect anything on standard output, but may have overlooked something, perhaps coming from an external tool which is called from your script.)
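With that change, the entry above (same placeholder script name) becomes:
42 17 * * * script >>stdout.log 2>&1
Both streams now end up in stdout.log, so a single file holds the whole history of the job's output.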
cron jobs run in your home directory, so any relative file names should be relative to that. If you want to write outside of your home directory, you obviously need to separately make sure you have write access to that destination file.
A common antipattern is to redirect everything to /dev/null (and then ask Stack Overflow to help you figure out what went wrong when something is not working; but we can't see the lost output, either!)
From within your script, make sure to keep regular output (actual results, ideally in machine-readable form) and diagnostics (usually formatted for a human reader) separate. In a shell script,
echo "$results" # regular results go to stdout
echo "$0: something went wrong" >&2
Some platforms (and e.g. GNU Awk) allow you to use the file name /dev/stderr for error messages, but this is not properly portable; in Perl, warn and die print to standard error; in Python, write to sys.stderr, or use logging; in Ruby, try $stderr.puts. Notice also how error messages should include the name of the script which produced the diagnostic message.
Use the command crontab -e, and then edit the cron jobs as
* * * * * /path/file.sh > /pathToKeepLogs/logFileName.log 2>&1
Here, 2>&1 indicates that standard error (2>) is redirected to the same file descriptor that standard output (&1) points to.
If you'd still like to check your cron jobs you should provide a valid email account when setting the cron jobs in cPanel.
When you specify a valid email you will receive the output of the cron job that is executed. Thus you will be able to check it and make sure everything has been executed correctly. Note that you will not receive an email if there is no output from the cron job command.
Please bear in mind that you will receive an email for each of the executed cron jobs. This may flood your inbox in case your crons run too often.
In case you're running some command with sudo, it won't allow it. Sudo needs a tty.

When data is piped from one program via | is there a way to detect what that program was from the second program?

Say you have a shell command like
cat file1 | ./my_script
Is there any way from inside the 'my_script' command to detect the command run first as the pipe input (in the above example cat file1)?
I've been digging into it and so far I've not found any possibilities.
I've been unable to find any environment variables set in the process space of the second command that record the full command line; the command data that my_script sees (via /proc etc.) is just ./my_script and doesn't include any information about it being run as part of a pipe. Even checking the process list from inside the second command doesn't seem to provide any data, since the first process seems to exit before the second starts.
The best information I've been able to find suggests that in bash you can, in some cases, get the exit codes of the processes in the pipe via PIPESTATUS; unfortunately nothing similar seems to exist for the names of the commands/files in the pipe. My research suggests it's impossible to do in a generic manner (I can't control how people decide to run my_script, so I can't force third-party pipe replacement tools to be used over built-in shell pipes), but at the same time it doesn't seem like it should be impossible, since the shell has the full command line present as the command is run.
(update adding in later information following on from comments below)
I am on Linux.
I've investigated the /proc/$$/fd data and it almost does the job. If the first command doesn't exit for several seconds while piping data to the second command, you can read /proc/$$/fd/0 to see the pipe:[PIPEID] value it symlinks to. That can then be used to search through the rest of the /proc/*/fd/ data for other running processes, to find another process with a pipe open using the same PIPEID, which gives you the first process's pid.
However, in most real-world tests of piping I've done, you can't trust that the first command will stay running long enough for the second one to locate its pipe fd in /proc before it exits (which removes the proc data, preventing it from being read). So I can't rely on this method returning any information.
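A rough sketch of the /proc approach described above (Linux only; it is inherently racy and only works while the upstream process is still alive, and the variable names and loop here are illustrative rather than taken from the question):
pipe_id=$(readlink /proc/$$/fd/0)                # e.g. pipe:[123456]
for fd in /proc/[0-9]*/fd/*; do
  # look for another process holding the same pipe open
  if [ "$(readlink "$fd" 2>/dev/null)" = "$pipe_id" ]; then
    pid=${fd#/proc/}; pid=${pid%%/*}             # extract the pid from the /proc path
    [ "$pid" != "$$" ] && tr '\0' ' ' < "/proc/$pid/cmdline" && echo
  fi
done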

Bash script - run process & send to background if good, or else

I need to start up a Golang web server and leave it running in the background from a bash script. If the script in question is syntactically correct (as it will be most of the time), this is simply a matter of issuing a
go run /path/to/index.go &
However, I have to allow for the possibility that index.go is somehow erroneous. I should explain that in Golang this can happen for something as "trivial" as importing a module that you then fail to use. In this case the go run /path/to/index.go bit will return an error message. In the terminal this would be something along the lines of
index.go:4:10: expected...
What I need to be able to do is to somehow change that command above so I can funnel any error messages into a file for examination at a later stage. I tried variants on go run /path/to/index.go >> errors.txt with the terminating & in different positions but to no avail.
I suspect that there is a bash way to do this by altering the priority of evaluation of the command via some judiciously used braces/brackets etc. However, that is way beyond my bash capabilities. I would be most obliged to anyone who might be able to help.
Update
A few minutes later... After a few more experiments I have found that this works
go run /path/to/index.go &> errors.txt &
Quite apart from the fact that I don't actually understand why it works, there remains the issue that it produces a 0-byte errors.txt file when the command runs to completion without Golang throwing up any error messages. Can someone shed light on what is going on and how it might be improved?
Taken from man bash.
Redirecting Standard Output and Standard Error
This construct allows both the standard output (file descriptor 1) and the standard error output (file descriptor 2) to be redirected to the file whose name is the expansion of word.
There are two formats for redirecting standard output and standard error:
&>word
and
>&word
Of the two forms, the first is preferred. This is semantically equivalent to
>word 2>&1
Appending Standard Output and Standard Error
This construct allows both the standard output (file descriptor 1) and the standard error output (file descriptor 2) to be appended to the file whose name is the expansion of word.
The format for appending standard output and standard error is:
&>>word
This is semantically equivalent to
>>word 2>&1
Narūnas K's answer covers why the &> redirection works.
The reason why the file is created anyway is because the shell creates the file before it even runs the command in question.
You can see this by trying no-such-command > file.out and observing that, even though the shell errors because no-such-command doesn't exist, the file still gets created (using &> on that test will put the shell's error in the file).
This is why you can't do things like sed 'pattern' file > file to edit a file in place.
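As a small follow-up to the empty-file point (a sketch, not taken from the original answers): once the backgrounded go run from the question has finished, you can simply discard errors.txt if it ended up empty:
# keep errors.txt only if the run actually wrote something to it
[ -s errors.txt ] || rm -f errors.txt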

Redirection of standard error & output streams is postponed

I need to redirect the output & error streams from one Windows process (GNU make.exe executing the armcc toolchain) to a filter written in Perl. The command I am running is:
Make Release 2>&1 | c:\cygwin\bin\perl ../tools/armfilt.pl
The compilation process emits some output which should then be written to STDOUT after some modifications. But I encountered a problem: all output generated by make is actually postponed until the end of make's run, and only then shown to the user. So, my questions are:
Why does this happen? I have tried changing the priority of the second process (perl.exe) from "Normal" to "Above normal", but it didn't help...
How can I overcome this problem?
I think one possible workaround may be to send only STDERR output to the Perl filter (that is what I actually need), not STDOUT+STDERR. But I don't know how to do that on Windows.
The Microsoft explanation concerning pipe operator usage says:
The pipe operator (|) takes the output (by default, STDOUT) of one command and directs it into the input (by default, STDIN) of another command.
But how to change this default STDOUT piping is not explained. Is it possible at all?
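One possible direction, offered only as a hedged sketch (it is not from the question, and it assumes cmd.exe applies the redirections left to right, so that standard error is attached to the pipe before standard output is discarded to NUL):
Make Release 2>&1 1>NUL | c:\cygwin\bin\perl ../tools/armfilt.pl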

How to get error output and store it in a variable or file

I'm having a little trouble figuring out how to get error output and store it in a variable or file in ksh. So in my script I have cp -p source.file destination inside a while loop.
When I get the below error
cp: source.file: The file access permissions do not allow the specified action.
I want to grab it and store it in a variable or file.
Thanks
You can redirect the error output of the command like so:
cp -p source.file destination 2>> my_log.txt
It will append the error message to the my_log.txt file.
In case you want a variable you can redirect stderr to stdout and assign the command output to a variable:
my_error_var=$(cp -p source.file destination 2>&1)
In ksh (as per the question), as in bash and other sh derivatives, you can get all of (or just) the stderr output from cp using redirection, then grab it in a variable (using $(), which is better than backticks if you're using a vaguely recent version):
output=$(cp -p source.file destination 2>&1)
cp doesn't normally output anything to stdout, though this would capture both stdout and stderr; to capture just stderr this way, also add 1>/dev/null. The other solutions that redirect to a file could use cat or various other commands to output/process the logfile.
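For instance, a minimal sketch capturing only the error text (the variable name is illustrative; the order of the redirections matters, since stderr must be pointed at the captured stream before stdout is discarded):
err_output=$(cp -p source.file destination 2>&1 1>/dev/null)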
Reason why I don't suggest using outputting to temporary files:
Redirecting to a file and then reading it back in (via the read command, or less efficiently via $(cat file)), particularly for just a single line, is slower; though it's not so bad if you want to append to the file across multiple operations before displaying the errors. You'll also leave the temporary file around unless you ALWAYS clean it up; don't forget the cases where people interrupt (i.e. Ctrl-C) or kill the script.
Using temporary files also could be a problem if the script is run multiple times at once (eg. could happen via cron if filesystem/other delays cause massive overruns or just from multiple users), unless the temporary filename is unique.
Generating temporary files is also a security risk unless done very carefully, especially if the file data is processed again or the contents could be rewritten before display by something else to confuse/phish the user/break the script. Don't get into a habit of doing it too casually, read up on temporary files (eg. mktemp) first via other questions here/google.
You can do STDERR redirects by doing:
command 2> /path/to/file.txt
