My log directory contains the following files:
access.log
defaultAuditrecorder20110901.log (log file generated on 31 Jun)
defaultAuditrecorder20110901.log (log file generated on 1 Aug)
defaultAuditrecorder20110902.log (log file generated on 2 Aug)
defaultAuditrecorder.log (the currently running log file)
mng1.log001
mng1.log002
mng1.log003 .............. and so on.
My requirement: using a shell script, I need to delete only the defaultAuditrecorder log files, except for the current and previous day's files.
Consider using logrotate. It lets you delete (or compress, rotate, etc.) log files, and is quite configurable. It is likely more robust than rolling your own script.
Edit: Here's a tutorial.
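For example, a minimal sketch of a drop-in config, assuming logrotate manages defaultAuditrecorder.log directly (the file name comes from the question; the path and policy values are assumptions to adjust):
# Hypothetical /etc/logrotate.d/auditrecorder:
# rotate daily, keep two days of defaultAuditrecorder logs, and
# tolerate a missing or empty log file.
/path/to/logs/defaultAuditrecorder.log {
    daily
    rotate 2
    missingok
    notifempty
}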
The simplest mechanism is to use the find command.
find /var/log -mtime +2 -type f -print
This will find all regular files that were last modified more than 2 days ago. To chain it into a removal command you would use:
find /var/log -mtime +2 -type f -print0 | xargs -0 rm
In this example I used /var/log; you would substitute the directory that contains the logs. The reason for using -print0 and xargs -0 is that a filename containing whitespace would otherwise not be passed to rm properly.
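Adapted to the original question, a sketch (the log directory path is a placeholder) that deletes only the defaultAuditrecorder files, keeping today's and yesterday's:
# /path/to/logs is a placeholder; -mtime +1 matches files modified more
# than 48 hours ago, so today's and yesterday's logs survive, and the
# -name test leaves access.log and mng1.log* untouched.
find /path/to/logs -maxdepth 1 -type f -name 'defaultAuditrecorder*.log' -mtime +1 -print0 | xargs -0 rm -f
Note this keys off modification time rather than the date embedded in the file name, which matches the requirement only if each file was last written on the day its name indicates.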
First I made a question here: Unzip a file and then display it in the console in one step
The answer there works and helped me a lot (please read it first).
Now I have a second issue. I do not have a single zipped log file; I have a lot of them in different folders, which I need to find first. The files all have the same name. For example:
/somedir/server1/log.gz
/somedir/server2/log.gz
/somedir/server3/log.gz
and so on...
What I need is a way to:
find all the files like: find /somedir/server* -type f -name log.gz
unzip the files like: gunzip -c log.gz
use grep on the content of the files
Important! The whole thing should be done in one step.
I cannot first store the extracted files in the filesystem, because it is a read-only filesystem. I need to connect the output of one command to the input of the next with pipes.
Before, the log files were in text format (.txt), so I did not have to unzip them first. In that case it was easy, e.g.:
find /somedir/server* -type f -name log.txt | xargs grep "term"
Now I have to deal with zipped files. That means that after I find the files, I need to somehow unzip them first and then send the contents to grep.
With one file I do:
gunzip -c /somedir/server1/log.gz | grep term
But for multiple files I don't know how to do it. For example, how do I pass the output of find to gunzip and then to grep?
Also, if there is another way or a "best practice" for doing this, it is welcome :)
find lets you invoke a command on the files it finds:
find /somedir/server* -type f -name log.gz -exec gunzip -c '{}' + | grep ...
From the man page:
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number
of matched files. The command line is built in much the same
way that xargs builds its command lines. Only one instance of
{} is allowed within the command, and (when find is being
invoked from a shell) it should be quoted (for example, '{}')
to protect it from interpretation by shells. The command is
executed in the starting directory. If any invocation with
the + form returns a non-zero value as exit status, then
find returns a non-zero exit status. If find encounters an
error, this can sometimes cause an immediate exit, so some
pending commands may not be run at all. This variant of -exec
always returns true.
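Since the question also asks about best practice: if zgrep is available (it ships with gzip on most systems), it folds the decompress step into grep, which reads a little more directly. A sketch using the same find expression:
# zgrep decompresses each file and greps it; with several files on the
# command line, matching lines are prefixed with the file they came from.
find /somedir/server* -type f -name log.gz -exec zgrep "term" '{}' +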
Part of a script I currently use runs "ls -FCRlhLoprt" to recursively list every file inside a root directory into a text document. The problem is that every time I run the script, ls includes that document in its output, so the text document grows with each run. I believe I can use -I or --ignore, but how can I use that when the ls command is built from a few variables? I keep getting errors:
ls "$lsopt" "$masroot"/ >> "$masroot"/"$client"_"$jobnum"_"$mas"_drive_contents.txt . #this works
If I try:
ls -FCRlhLoprt --ignore=""$masroot"/"$client"_"$jobnum"_"$mas"_drive_contents.txt"" "$masroot"/ >> "$masroot"/"$client"_"$jobnum"_"$mas"_drive_contents.txt #this does not work
I get errors. Basically, I don't want the output file included the second time I run this command.
Additionally, all I am trying to do is create an easy-to-read document listing every file inside a directory, recursively. If there is a better way, please let me know.
To list every file in a directory recursively, the find command does exactly what you want, and admits further programmatic manipulation of the files found if you wish.
Examples:
To list every file under the current directory, recursively:
find ./ -type f
To list files under /etc/ and /usr/share, showing their owners and permissions:
find /etc /usr/share -type f -printf "%-100p %#m %10u %10g\n"
To show line counts of all files recursively, but ignoring subdirectories of .git:
find ./ -type f ! -regex ".*\.git.*" -exec wc -l {} +
To search under $masroot but ignore files generated by past searches, and dump the results into a file:
find "$masroot" -type f ! -regex ".*/[a-zA-Z]+_[0-9]+_.+_drive_contents.txt" | tee "$masroot/${client}_${jobnum}_${mas}_drive_contents.txt"
(Some of that might be slightly different on a Mac. For more information see man find.)
I have a source folder that consists of nested sub-directories. I want to move all the .txt files which are older than 2 days, present in the
source and its nested sub-directories, to a target directory in Hadoop.
Something like this might move files from source to target.
hadoop fs -mv /user/source/*.txt /user/target
How do I move the .txt files which are older than 2 days?
You can use find's -exec option, which lets you run a command on each file that it finds:
find /user/source -type f -name '*.txt' -mtime +2 -exec mv '{}' /user/target \;
But sometimes this causes problems with certain files, so in that case you can try the following script instead. If you need to recreate the directory tree (e.g. subdir1/subdir2/) under the target, you could do, for example:
find /user/source -type f -name '*.txt' -mtime +2 -print0 | while IFS= read -r -d '' file; do
    dir="${file%/*}"                     # directory part of the path
    mkdir -p ../yourfilearchive/"$dir"   # recreate the tree under the target
    mv "$file" ../yourfilearchive/"$file"
done
This script recreates the source directory tree under the target, rather than moving every file into one flat directory.
Of course, these will only work for your .txt files older than 2 days on a local filesystem. If you want to use HDFS's own commands instead, I found a great answer for that:
Get files which are created in last 5 minutes in hadoop using shell script
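For completeness, a sketch of the HDFS-side approach adapted to this question; it parses the default hadoop fs -ls -R output, so the column positions (6 for the date, 8 for the path) are an assumption to verify against your Hadoop version:
# List everything under the source recursively, keep .txt paths whose
# modification date (yyyy-MM-dd, which compares correctly as a string)
# is older than the cutoff, then move them into a flat target directory.
cutoff=$(date -d "-2 days" +%Y-%m-%d)
hadoop fs -ls -R /user/source |
    awk -v c="$cutoff" '$6 < c && $8 ~ /\.txt$/ {print $8}' |
    while read -r f; do
        hadoop fs -mv "$f" /user/target/
    done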
I am busy writing a bash script that can delete a file according to the date in its file name, for example xxxxxxxxx140516.log. It should delete the file a day after the 16th of May 2014. I would like the script to read the file name from character 25, because this is where the date starts, and it should do regular checks to see if there are any old files. Currently my script looks like this:
#!/bin/bash
rm *$(date -d "-1 day" +"%y%m%d")*
The problem with this script is that if the computer is not up and running for a couple of days, it will not delete old files that are past the date, and it does not start at character 25. Please help. Thanks!
for day in {1..7}; do
rm -f ????????????????????????$(date -d "-$day day" +"%y%m%d").log
done
This allows for the script not having run for up to a week; you can change the range to handle longer periods.
There are 24 ? characters in the wildcard, so it will find the date starting at character 25.
The -f option keeps rm from printing an error message if no matching files are found. Warning: it also prevents it from asking for confirmation if you don't have write permission to the file.
The notation {start..end} expands into a sequence of numbers from start to end, so {1..10} is short for 1 2 3 4 5 6 7 8 9 10. You can also use {char1..char2} to get a sequence of characters, e.g. {a..d} is short for a b c d.
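If you specifically want to read the date starting at character 25, and handle downtime of any length, here is a sketch using bash substring expansion; the yymmdd format and .log suffix come from the question, the rest is an assumption:
cutoff=$(date -d "-1 day" +%y%m%d)       # keep yesterday's file and newer
for f in *.log; do
    d="${f:24:6}"                        # characters 25-30: the yymmdd date
    [[ "$d" =~ ^[0-9]{6}$ ]] || continue # skip names with no date there
    [[ "$d" < "$cutoff" ]] && rm -f -- "$f"
done
The string comparison works because yymmdd dates sort chronologically as text (within the same century).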
If you want, you can purge such log files using the timestamps of those files as follows:
find <path_to_logs_dir> -mtime +0 -exec rm {} \;
The above command will purge any file last modified more than 24 hours ago. Please note that this command doesn't specifically look for log files of any particular format; I am just assuming here that the log directory only contains log files.
Further, you can filter for log files as below:
find <path_to_logs_dir> -name "*.log" -type f -mtime +0 -exec rm {} \;
"-type f" ensures it lists only files and not directories.
I want to copy all the files in a directory that were modified this month. I can list those files like this:
ls -l * | grep Jul
And then to copy them I was trying to pipe the result into cp via xargs, but had no success, (I think) because I couldn't figure out how to parse the ls -l output to grab just the filename for cp.
I'm sure there are many ways of doing this; I'll accept the answer that shows me how to parse ls -l in this manner (or talks me down from that position), though I'd be interested in seeing other methods as well.
Thanks!
Of course, just doing grep Jul is bad because you might have files with Jul in their name.
Actually, find is probably the right tool for your job. Something like this:
find "$DIR" -maxdepth 1 -type f -mtime -30 -exec cp {} "$DEST"/ \;
where $DIR is the directory where your files are (e.g. '.') and $DEST is the target directory.
The -maxdepth 1 flag means it doesn't look inside sub-directories for files (i.e., it isn't recursive).
The -type f flag means it looks only at regular files (e.g. not directories).
The -mtime -30 test matches files modified within the last 30 days (+30 would be older than 30 days).
The -exec flag means it executes the following command on each file found, where {} is replaced with the file name and \; marks the end of the command.
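An equivalent without -exec, for comparison; note that cp -t (target directory first) is a GNU coreutils option, so this is an assumption about your platform:
# -print0 / xargs -0 keep filenames with whitespace intact;
# GNU cp -t names the destination directory up front.
find "$DIR" -maxdepth 1 -type f -mtime -30 -print0 | xargs -0 cp -t "$DEST"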
I'd be interested in seeing how this might be done with zsh.
Files modified this month:
ls -lt *.*(mM0)
Last month:
ls -lt *.*(mM1)
Or for precise date ranges:
autoload -U age
ls -tl *.*(e#age 2014/06/07 now#)
ls -tl *.*(e#age 2014/06/01 2014/06/20#)
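These qualifiers work anywhere a glob does, so to tie this back to the copy question you could, as a sketch (with $DEST standing in for your target directory, as in the find answer above), copy this month's files directly:
# (mM0) keeps only files whose modification age, measured in months, is 0.
cp -- *.*(mM0) "$DEST"/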