We have application logs that are rotated on a size basis, i.e. each time the log reaches 1 MB, the log file changes from abc.log to abc.log.201110329656, and so on. When that happens, abc.log starts again from 0 MB. The log rotates roughly every 30 minutes.
We have a cron batch job running in the background against abc.log every 30 minutes to check for NullPointerException.
The problem is that sometimes the log is rotated before the next batch job runs, so a NullPointerException goes undetected because the batch job never gets a chance to scan the rotated file.
Is there a way to solve this problem? No, I cannot change the behavior of the application logging: size, name, or rotation. I cannot change the frequency of the cron interval, which is fixed at 30 minutes. However, I can freely change other aspects of the batch job, which is a bash script.
How can this be solved?
find(1) is your friend:
$ find /var/log/myapp -cmin -30 -type f -name 'abc.log*'
This gives you a list of all log files under /var/log/myapp touched in the last 30 minutes. Let your cron job script work on all these files.
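For example, here is a minimal sketch of what the 30-minute cron script could do with that list (the directory, the file pattern, and the echo-based alerting are assumptions, and -print0 assumes GNU find; adapt them to your setup):
#!/bin/bash
# Scan every abc.log* file changed in the last 30 minutes for NullPointerException.
log_dir=/var/log/myapp    # assumed location of the rotated logs

find "$log_dir" -cmin -30 -type f -name 'abc.log*' -print0 |
while IFS= read -r -d '' file
do
    if grep -q 'NullPointerException' "$file"; then
        echo "NullPointerException found in $file"    # replace with your real alerting
    fi
done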
You've pretty much stated what the problem is:
You have a log that automatically rolls when the log gets to a certain size.
You have another job that runs against the log file, and the log file only.
You can't adjust the log roll, and you can't adjust when the check of the log happens.
So, if the log file changes, you are searching the wrong file. Can you check all the log files that you haven't previously checked with your batch script? Or are you only allowed to check the current log file?
One way to do this is to track when you last checked the log files, and then check all the log files that are newer than that. You can use a file called last.check for this. The file has no contents (the contents are irrelevant); you only use its timestamp to figure out when your check last ran. You then use touch to update that timestamp once you've successfully checked the logs:
last_check="$log_dir/last.check"
if [ ! -e "$last_check" ]
then
echo "Error: $last_check doesn't exist"
exit 2
fi
find $log_dir -newer "$last_check" | while read file
do
[Whatever you do to check for nullpointerexception]
done
touch "$last_check"
You can create the original $last_check file using the touch command:
$ touch -m -t 201111301200.00 "$log_dir/last.check"  # Date is in YYYYMMDDhhmm.SS format
Using a touch file provides a bit more flexibility in case things change. For example, what if you decide in the future to run the crontab every hour instead of every 30 minutes?
I have a bash script that successfully deletes catalina.out files for one or more tomcat log directories (we run multiple instances) once the file exceeds a certain size. I run this script nightly as a cron job. It essentially looks like this:
find /apache-tomcat-blah*/. -name catalina.out -size +1000M -delete
However, my problem is I need to automatically create a new empty one in its place as soon as the old one is deleted.
The challenge is I will not know ahead of time which catalina.out from which tomcat instance was deleted. Also, I do not want to assume I know all the tomcat instances corresponding to /apache-tomcat-blah*/. We change them from time to time.
I assume the find command knows what it just deleted (maybe I should not assume that) so that I could theoretically pipe that information as in:
$ echo "" > /apache-tomcat-justDeletedFromDir/logs/catalina.out
if I could figure out what to put in the apache-tomcat-justDeletedFromDir part of the string.
I would be grateful for any ideas. Thank you!
Why not just use something like:
for f in $(find /apache-tomcat-blah*/. -name catalina.out -size +1000M); do
    rm "$f"
    touch "$f"
done
Your find command is now executed in a subshell via command substitution $( ), and your bash script iterates through its output (the list of files), removes each one (via rm), and creates a new empty one in its place (via touch).
You would need to be careful with the above if your file paths contain spaces, since the unquoted command substitution is split on whitespace.
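If that is a concern, here is a minimal space-safe sketch of the same loop (assuming bash and GNU find for -print0):
# Null-delimited output avoids word-splitting on spaces in file paths.
find /apache-tomcat-blah*/. -name catalina.out -size +1000M -print0 |
while IFS= read -r -d '' f; do
    rm "$f"
    touch "$f"
done
Piping find's null-delimited output into a while read -d '' loop keeps each path intact regardless of embedded whitespace.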
I'm relatively new to unix scripting, so apologies for the newbie question.
I need to create a script which will permanently run in the background, and monitor for a file to arrive in an FTP landing directory, then copy it to a different directory, and lastly remove the file from the original directory.
The script is running on a Ubuntu box.
The landing directory is /home/vpntest
The file needs to be copied as /etc/ppp/chap-secrets
So far, I've got this
#/home/vpntest/FTP_file_copy.sh
if [ -f vvpn_azure.txt ]; then
cp vvpn_azure.txt /etc/ppp/chap-secrets
rm vvpn_azure.txt
fi
I can run this as root, and it works, but only as a one-off (I need it to run permanently in the background and trigger each time a new file is received in the landing zone).
If I don't run it as root, I get issues with permissions (even if I run it from within the directory /home/vpntest).
Any help would be much appreciated.
One way to have a check-and-move process running in the background with root permissions is the "polling" approach: run your script from the root user's crontab.
Steps:
Revise your /home/vpntest/FTP_file_copy.sh:
#!/bin/bash
new_file=/home/vpntest/vvpn_azure.txt
if [ -f "$new_file" ]; then
    mv "$new_file" /etc/ppp/chap-secrets
fi
Log out, then log in as the root user.
Add a cron task to run the script:
crontab -e
If this is a new machine and it is your first time running crontab, you may first be prompted to choose an editor for crontab; pick one and continue into the editor.
The format is m h dom mon dow command, so if checking every 5 minutes is sufficiently frequent, do:
*/5 * * * * /home/vpntest/FTP_file_copy.sh
Save and close to apply.
It will now automatically run the script every 5 minutes in the background, moving the file whenever one is found.
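If you want a quick sanity check that the entry was installed, you can list the root user's crontab while logged in as root:
$ crontab -l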
Explanation
Root user: you mentioned the script only worked for you as root, so we put it in the root user's crontab to give it sufficient permissions.
man 5 crontab informs us:
Steps are also permitted after an asterisk, so if you want to say
'every two hours', just use '*/2'.
Thus we write */5 in the first column, which is the minutes column, to mean "every 5 minutes".
FTP_file_copy.sh:
uses absolute paths, so it can run from anywhere
is rearranged so the single variable new_file can be reused
quotes the value being checked inside the [ ] test, which is good practice
uses mv to overwrite the destination while removing the file from the source directory
I use the command:
nohup <command to run> &
and it logs to the nohup.out file. However, this log file could get pretty big, so I was wondering if there is a way to automatically save off the output to nohup1.out, nohup2.out, nohup3.out, etc. when nohup.out gets too big, all without terminating the original command.
logrotate(8) is likely what you need, and it is probably already in your Linux distro. Specify in /etc/logrotate.conf the size limit and how many rotations you want:
From an example on thegeekstuff.com:
/tmp/output.log {
    size 1k
    create 700 user user
    rotate 4
}
size 1k – logrotate runs only if the file size is equal to (or greater than) this size.
create – rotate the original file and create the new file with the specified permissions, user, and group.
rotate – limits the number of log file rotations, so this would keep only the 4 most recent rotated log files.
Then run logrotate /etc/logrotate.conf
I ignored the example's file status location since logrotate already runs as a cron job on my system and a status file already exists (in a different location from the example's).
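For a nohup.out that a running command is still writing to, a configuration sketch like the following may be closer to what you want (the path and sizes here are assumptions): copytruncate makes logrotate copy the file and then truncate it in place, so the still-running process keeps writing to the same open file descriptor instead of the rotated copy.
/path/to/nohup.out {
    size 100M
    rotate 3
    copytruncate
    missingok
}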
I have a folder with a series of 750 CSV files. I wrote a Stata program that runs through each of these files and performs a single task. The files are quite big, and my program has been running for more than 12 hours.
Is there a way to know which of the CSV files Stata used last? I would like to know if the code is anywhere near finishing. I thought that sorting the 750 files by "last used" would do the trick, but it does not.
Next time I should be more careful about signalling how the process is going...
Thank you
From the OS X terminal, cd to the directory containing the CSV files, and run the command
ls -lut | head
which should show your files, sorted by the most recent access time, limited to the 10 most recently accessed.
On the most basic level you can use display and log your session:
clear
set more off
local myfiles file1 file2 file3
foreach f of local myfiles {
    display "processing `f'"
    // <do-something-with-file>
}
See also help log.
I have a .jar file that is compiled on a server and later copied down to a local machine. Doing ls -l on the local machine just gives me the time it was copied down onto the local machine, which could be much later than when it was created on the server. Is there a way to find that time on the command line?
UNIX-like systems do not record file creation time.
Each directory entry has 3 timestamps, all of which can be shown by running the stat command or by providing options to ls -l:
Last modification time (ls -l)
Last access time (ls -lu)
Last status (inode) change time (ls -lc)
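For example, assuming the copied file is named abc.jar (a hypothetical name), you can view the timestamps like this:
$ stat abc.jar        # GNU stat: shows the access, modify, and change times together
$ ls -l abc.jar       # last modification time
$ ls -lu abc.jar      # last access time
$ ls -lc abc.jar      # last status (inode) change time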
For example, if you create a file, wait a few minutes, then update it, read it, and do a chmod to change its permissions, there will be no record in the file system of the time you created it.
If you're careful about how you copy the file to the local machine (for example, using scp -p rather than just scp), you might be able to avoid updating the modification time. I presume that a .jar file probably won't be modified after it's first created, so the modification time might be good enough.
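For instance, with a hypothetical host and path:
$ scp -p user@buildserver:/path/to/app.jar .   # -p preserves modification and access times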
Or, as Etan Reisner suggests in a comment, there might be useful information in the .jar file itself (which is basically a zip file). I don't know enough about .jar files to comment further on that.
wget and curl have options that allow you to preserve the file's modified time stamp. This is close enough to what I was looking for.