How to read output from bzcat instead of specifying a filename - bash

I need to use 'last' to search through a list of users who logged into a system, i.e.
last -f /var/log/wtmp <username>
Considering the number of bzipped archive files in that directory, and considering I am on a shared system, I am trying to include an inline bzcat, but nothing seems to work. I have tried the following combinations with no success:
last -f <"$(bzcat /var/log/wtmp-*)"
last -f <$(bzcat /var/log/wtmp-*)
bzcat /var/log/wtmp-* | last -f -
Driving me bonkers. Any input would be great!

last (assuming the Linux version) can't read from a pipe. You'll need to temporarily bunzip2 the files to read them.
tempfile=`mktemp` || exit 1
for wtmp in /var/log/wtmp-*; do
bzcat "$wtmp" > "$tempfile"
last -f "$tempfile"
done
rm -f "$tempfile"

You can only use < I/O redirection on one file at a time.
If anything is going to work, then the last line of your examples is it, but does last recognize - as meaning standard input? (Comments in another answer indicate "No, last does not recognize -". Now you see why it is important to follow all the conventions - it makes life difficult when you don't.) Failing that, you'll have to do it the classic way with a shell loop.
for file in /var/log/wtmp-*
do
last -f <(bzcat "$file")
done
Well, using process substitution like that is pure Bash...the classic way would be more like:
tmp=/tmp/xx.$$ # Or use mktemp
trap "rm -f $tmp; exit 1" 0 1 2 3 13 15
for file in /var/log/wtmp-*
do
bzcat $file > $tmp
last -f $tmp
done
rm -f $tmp
trap 0
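A slightly more defensive variant of the same classic loop, as a sketch: mktemp as the comment suggests, quoted variables, and a cleanup trap so the temporary file is removed however the script exits:
tmp=$(mktemp) || exit 1
trap 'rm -f "$tmp"' EXIT HUP INT TERM   # remove the temp file on exit or interruption
for file in /var/log/wtmp-*; do
    bzcat "$file" > "$tmp"
    last -f "$tmp"
done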

Related

grep multiple strings from multiple files and stop processing other files when first match found

I have about 270 .bz2 log files (25 days of logs) and one text file with about 1500 usernames. What I need to do is find which of these users have logged in during the last 25 days. So I need to grep the usernames against the list of files and stop grepping a username as soon as it is found in some file (on the first match).
My code works, but once a match is found in the first file I do not need to process the other files; it should break and move on to the next username. Likewise, if a username is first found in, say, the third file, it should break there and move on to the next username:
for i in $(cat /tmp/usernames.txt); do for j in $(ls *.bz2); do
bzgrep -o -m1 $i $j; done; done
Here, if a match is found in the first file it stops after that match (the -m1 flag), but then it still searches for the same username in the second file, which I no longer need.
Problem: I need to identify users who have not logged in during the last 25 days, so I can reduce their permissions in the application. If a user has logged in at least once in the last 25 days, I do not reduce their permissions.
The question is: I need to find which of these usernames appear in my log files. If a username is found at least once in any of the files, stop searching for that user and start searching for the next one.
Example: if user1 is found in file1, print it and stop searching for this user in this or any other file. If user2 is found in file8, print it once and stop searching in file9, file10, file11 ... file250. Hope that makes sense.
Can't you just do this to get the list of user names that appear in any of the bzipped files:
bzgrep -o -w -F -f /tmp/usernames.txt *.bz2 | sort -u
and then a diff of that output against usernames.txt to see who has/hasn't logged in? Wrap it in a loop if it turns out to be more efficient to check one .bz2 file at a time:
for file in *.bz2; do
bzgrep -o -w -F -f /tmp/usernames.txt "$file"
done | sort -u
and you could remove already-found user names after each iteration if that improves performance too:
sort -u /tmp/usernames.txt > /tmp/names.txt
for file in *.bz2; do
bzgrep -o -w -F -f /tmp/names.txt "$file" | sort -u > /tmp/found.txt &&
comm -23 /tmp/names.txt /tmp/found.txt > /tmp/left.txt &&
mv /tmp/left.txt /tmp/names.txt &&
cat /tmp/found.txt
[[ -s /tmp/names.txt ]] || break
done
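To get the list the question is ultimately after, the users with no login at all in the last 25 days, you can then compare the collected names against the full list with comm. A sketch (comm needs both inputs sorted; bash's <(...) is used for the on-the-fly sort):
for file in *.bz2; do
    bzgrep -o -w -F -f /tmp/usernames.txt "$file"
done | sort -u > /tmp/active.txt
# names that are in usernames.txt but were never matched in any log
comm -23 <(sort -u /tmp/usernames.txt) /tmp/active.txt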
You could use a conditional:
if [ -n "$var" ]; then
echo "Match!"
break
fi
This structure means that the conditional is True only when $var is not empty. The loop will stop when the condition becomes True.
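Applied to the loops from the question, a sketch along those lines: bzgrep -q stops at the first match in a file, and break then skips the remaining files for that user (this assumes your bzgrep forwards -q, -w and -F to grep, as the usual wrapper script does):
while IFS= read -r user; do
    for f in *.bz2; do
        if bzgrep -qwF "$user" "$f"; then   # quiet, whole-word, fixed-string match
            echo "$user"                    # logged in at least once
            break                           # stop checking further files for this user
        fi
    done
done < /tmp/usernames.txt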
Good luck!
If disk space isn't a concern I would ask bzip2 to decompress all the archives into a single file and invoke grep -m1 on that file for each username:
bzcat *.bz2 > merged
while IFS='' read -r username; do
grep -om1 "$username" merged
done < /tmp/usernames.txt
rm merged

Monitor Pre-existing and new files in a directory with bash

I have a script using inotify-tool.
This script notifies me when a new file arrives in a folder. It performs some work with the file, and when done it moves the file to another folder (it looks something like this):
inotifywait -m -e modify "${path}" |
while read NEWFILE
do
    work on/with NEWFILE
    move NEWFILE to a new directory
done
By using inotifywait, one can only monitor new files. A similar procedure using for OLDFILE in path instead of inotifywait will work for existing files:
for OLDFILE in "${path}"/*
do
    work on/with OLDFILE
    move OLDFILE to a new directory
done
I tried combining the two loops, first running the second loop. But if files arrive quickly and in large numbers, there is a chance that files will arrive while the second loop is still running. Those files would then not be caught by either loop.
Given that files already exists in a folder, and that new files will arrive quickly inside the folder, how can one make sure that the script will catch all files?
Once inotifywait is up and waiting, it will print the message Watches established. to standard error. So you need to go through existing files after that point.
So, one approach is to write something that will process standard error, and when it sees that message, lists all the existing files. You can wrap that functionality in a function for convenience:
function list-existing-and-follow-modify() {
    local path="$1"
    inotifywait --monitor \
                --event modify \
                --format %f \
                -- \
                "$path" \
        2> >( while IFS= read -r line ; do
                  printf '%s\n' "$line" >&2
                  if [[ "$line" = 'Watches established.' ]] ; then
                      for file in "$path"/* ; do
                          if [[ -e "$file" ]] ; then
                              basename "$file"
                          fi
                      done
                      break
                  fi
              done
              cat >&2
            )
}
and then write:
list-existing-and-follow-modify "$path" \
| while IFS= read -r file; do
    # ... work on/with "$file"
    # move "$file" to a new directory
done
Notes:
If you're not familiar with the >(...) notation that I used, it's called "process substitution"; see https://www.gnu.org/software/bash/manual/bash.html#Process-Substitution for details.
The above will now have the opposite race condition from your original one: if a file is created shortly after inotifywait starts up, then list-existing-and-follow-modify may list it twice. But you can easily handle that inside your while-loop by using if [[ -e "$file" ]] to make sure the file still exists before you operate on it (see the sketch after these notes).
I'm a bit skeptical that your inotifywait options are really quite what you want; modify, in particular, seems like the wrong event. But I'm sure you can adjust them as needed. The only change I've made above, other than switching to long options for clarity/explicitness and adding -- for robustness, is to add --format %f so that you get the filenames without extraneous details.
There doesn't seem to be any way to tell inotifywait to use a separator other than newlines, so, I just rolled with that. Make sure to avoid filenames that include newlines.
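Putting those notes together, a sketch of the consuming loop with the existence check (the echo is just a placeholder for the real work):
list-existing-and-follow-modify "$path" |
while IFS= read -r file; do
    [[ -e "$path/$file" ]] || continue    # it may have been listed twice and already moved
    echo "processing $path/$file"         # work on it here, then move it elsewhere
done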
By using inotifywait, one can only monitor new files.
I would ask for a definition of a "new file". The inotifywait man page specifies a list of events, which also includes events like create, delete and delete_self, and inotifywait can also watch "old files" (defined as files existing prior to inotifywait execution) and directories. You specified only a single event, -e modify, which notifies about modification of files within ${path}; that includes modification of both preexisting files and files created after inotifywait starts.
... how can one make sure that the script will catch all files?
Your script is enough to catch all the events that happen inside the path. If you have no means of synchronization between the part that generates files and the part that receives them, there is nothing you can do and there will always be a race condition. What if your script receives 0% of the CPU time and the part that generates the files gets 100%? There is no guarantee of CPU time between processes (unless you are using a certified real-time system...). Implement synchronization between them.
You can watch some other event. If the generating side closes files when it is done with them, watch for the close event. You could also run work on/with NEWFILE in parallel in the background to speed up execution and keep reading new files. But if the receiving side is slower than the sending side, i.e. your script works on NEWFILEs more slowly than new files are generated, there is nothing you can do...
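For example, a minimal sketch that watches for completed files instead (close_write fires when a process closes a file it had open for writing; the printf is a placeholder for the real processing):
inotifywait -m -e close_write --format '%w%f' "${path}" |
while IFS= read -r f; do
    printf 'finished: %s\n' "$f"    # replace with the real work on "$f"
done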
If you have no special characters and spaces in filenames, I would go with:
inotifywait -m -e modify "${path}" |
while IFS=' ' read -r path event file; do
    lock "${path}"
    work on "${path}/${file}"
    # e.g. mv "${path}/${file}" "${new_location}"
    unlock "${path}"
done
where lock and unlock are placeholders for some locking mechanism implemented between your script and the generating part. You can create a communication channel between the-creation-of-files process and the-processing-of-the-files process.
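One possible implementation of those placeholders, as a sketch only, uses flock from util-linux; both the generating process and this script would have to agree on the same lock file (the .dirlock name here is an assumption):
lock()   { exec 9> "${1}/.dirlock" && flock 9; }   # block until we hold the lock
unlock() { flock -u 9 && exec 9>&-; }              # release the lock and close the descriptor
The writer would take the same lock before it starts creating a file and release it when done, so the two sides never touch the directory at the same time.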
I think you could use some transactional file system that would let you "lock" a directory from the other scripts until you are done working on it, but I have no experience in that field.
I tried combining the two loops. But if files arrive quickly and in large numbers there is a chance that the files will arrive while the second loop is running.
Run the process_new_files_loop in the background prior to running the process_old_files_loop. Also make sure (i.e. synchronize) that inotifywait has successfully started before you continue to the processing-existing-files loop, so that there is no race condition between them either.
Maybe a simple example and/or starting point would be:
work() {
    local file="$1"
    some work "$file"
    mv "$file" "$predefined_path"
}
process_new_files_loop() {
    # let's work on modified files in parallel, so that it is faster
    trap 'wait' INT
    inotifywait -m -e modify "${path}" |
    while IFS=' ' read -r path event file; do
        work "${path}/${file}" &
    done
}
process_old_files_loop() {
    # maybe we should process in parallel here too?
    # maybe export -f work; find "${path}" -type f | xargs -P0 -n1 -- bash -c 'work "$1"' -- ?
    find "${path}" -type f |
    while IFS= read -r file; do
        work "${file}"
    done
}
process_new_files_loop &
child=$!
sleep 1
if ! ps -p "$child" >/dev/null 2>&1; then
echo "ERROR running processing-new-file-loop" >&2
exit 1
fi
process_old_files_loop
wait # wait for process_new_files_loop
If you really care about execution speed and want to do it faster, switch to Python or to C (or to anything but shell). Bash is not fast; it is a shell, and should be used to interconnect two processes (passing the stdout of one to the stdin of another). Parsing a stream line by line with while IFS= read -r line is extremely slow in bash and should generally be used as a last resort. Maybe using xargs like xargs -P0 -n1 sh -c "work on $1; mv $1 $path" -- or parallel would be a means to speed things up, but an average Python or C program will probably be many times faster.
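As a concrete sketch of that xargs idea (GNU find/xargs assumed; -print0/-0 keep unusual filenames safe, -P 4 is an arbitrary parallelism level, and work has to be exported so the child bash processes can see it):
export -f work
find "${path}" -type f -print0 |
    xargs -0 -n 1 -P 4 bash -c 'work "$1"' _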
A simpler solution is to add an ls in front of the inotifywait in a subshell, with awk to create output that looks like inotifywait.
I use this to detect and process existing and new files:
(ls ${path} | awk '{print "'${path}' EXISTS "$1}' && inotifywait -m ${path} -e close_write -e moved_to) |
while read dir action file; do
echo $action $dir $file
# DO MY PROCESSING
done
So it runs the ls, formats the output and sends it to stdout, then runs inotifywait in the same subshell, sending its output to stdout as well for processing.

Unix script - Comparing number of filename date with my single input date

I am new to Unix scripting. I have been trying to create a Unix script for a week now, but I couldn't. Please help me with this.
I have more than 100 files (all the filenames are different) in a directory, and each filename contains a date string (e.g. 20171101). I want to compare these filename dates with my input date (today - 10 days = 20171114), using only the date string in the filename; if it is less than my input date then I have to delete the file. Could anyone please help with this? Thanks.
My script:
ten_days_ago=$(date -d "10 days ago" +%Y%m%d)
cd "$destination_dir" ;
ls *.* | awk -F '-' '{print $2}'
ls *.* | awk -F '-' '{print $2}' > removal.txt
while read filedate
do
if [ "$filedate" -lt "$ten_days_ago" ] ; then
cd "$destination_dir" ;
rm *-"$filedate"*
echo "deletion done"
fi
done <removal.txt
This script is working fine, but I need to send an email as well: if the deletion has been done, send a success email, otherwise a failure email.
But if I write the email inside the while loop, it will be sent on every iteration.
You're probably trying to pipe to mail from the middle of your loop. (Your question should really show this code, otherwise we can't say what's wrong.) A common technique is to redirect the loop's output to a file, and then send that. (Using a temporary file is slightly ugly, but avoids sending an empty message when there is no output from the loop.)
Just loop over the files and decide which to remove.
#!/bin/bash
t=$(mktemp -t tendays.XXXXXXXX) || exit
# Remove temp file if interrupted, or when done
trap 'rm -f "$t"' EXIT HUP INT TERM
ten_days_ago=$(date -d "10 days ago" +%Y%m%d)
for file in *-[1-9]*[1-9]-*; do
    date=${file#*-} # strip prefix up through first dash
    date=${date%-*} # strip from last dash from the previous result
    if [ "$date" -lt "$ten_days_ago" ]; then
        rm -v "$file"
    fi
done >"$t" 2>&1
test -s "$t" || exit # Quit if empty
mail -s "Removed files" recipient#example.net <"$t"
I removed the (repeated!) cd so this can be run in any directory -- just switch to the directory you want before running the script. This also makes it easier to test in a directory with a set of temporary files.
Collecting the script's standard error also means the mail message will contain any error messages if rm fails for some reason or you have other exceptions.
By the by you should basically never use ls in scripts.
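For example, the loop over filenames can be driven by a glob instead of ls output; a generic sketch that stays safe with spaces and other odd characters in names:
for file in ./*; do
    [ -e "$file" ] || continue    # the glob matched nothing
    printf '%s\n' "$file"         # do something with "$file" here
done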

how to compare same file and send mail in linux

I am creating a bash script to watch a file and send the diff by mail.
For the case below, I have created two files, "xyz.conf" and "xyz.conf_bkp", to compare.
So far, I have come up with the script below:
file="/a/b/c/xyz.conf"
while true
do
    sleep 1
    cmp -s "$file" "${file}_bkp"
    if [ $? -gt 0 ]; then
        diff "$file" "${file}_bkp" > compare.log
        mailx -s "file compare" abc@xyz.com < compare.log
        sleep 2
        cp "$file" "${file}_bkp"
        exit
    fi
done
I have scheduled the above script via cron, so the file is effectively checked every second:
* * * * 0-5 script.sh
This is working fine, but I am looking for a different approach, like below:
I would like this to work without creating another backup file. Imagine if I have to watch multiple files; that would lead me to create that many backup copies, which doesn't look like a good solution.
Can anyone suggest how to implement this approach?
I would write it this way.
while cmp "$file" "${file}_bkp"; do
sleep 2
done
diff "$file" "${file}_bkp" | mailx -s "file compare" abc#xyz.com
cp "$file" "${file}_bkp"
I wanted to avoid running both cmp and diff, but I'm not sure that is possible without a temporary file or pipe to hold the data from diff until you determine if mailx should run.
while diff "$file" "${file}_bkp"; do
sleep 2
done | {
mailx -s "file compare" abc#xyz.com
cp "$file" "${file}_bkp"
exit
}
diff will produce no output when its exit status is 0, so when it finally has a non-zero exit status its output is piped (as the output of the while loop) to the compound command that runs mailx and cp. Technically, both mailx and cp can read from the pipe, but mailx will exhaust all the data before cp runs, and this cp command ignores its standard input anyway.
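A quick throwaway demonstration of that behaviour (identical files: no output and status 0; differing files: output plus a non-zero status):
printf 'a\n' > f1 && cp f1 f2
diff f1 f2; echo "status: $?"    # prints nothing, then "status: 0"
printf 'b\n' >> f2
diff f1 f2; echo "status: $?"    # prints the difference, then "status: 1"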

Tailing Rolling Files

I have a directory full of rolling log files that I would like to be able to use tail on.
The files are named as such:
name modified
00A.txt Dec 27 19:00
00B.txt Dec 27 19:01
00C.txt Dec 27 19:02
00D.txt Dec 27 19:03
On an older unix system, I'm trying to come up with a shell script that will tail the most recently modified file in a specific directory, and if that file gets administratively closed (rolls to the next file) I want the program to automatically begin tailing the new file without me having to break out of tail to rerun.
tail -100f `ls -t | head -1`
The desired behavior, given the filenames above, would go like this:
./logtailer.sh
Then the script would begin tailing 00D.txt. As soon as the logger was finished writing to 00D.txt and the newest log file was now named 00E.txt, the program would automatically begin tailing that file.
One could write this script by watching the output of tail for the text "File Administratively Closed" and then having the following command run again.
tail -100f `ls -t | head -1`
Is there a more elegant way to do this than by watching for the text "file administratively closed"? How can I even read the output of tail line by line in a shell script?
Edit: I should explain that the -F flag for tail is not an option for me on this system. It uses a different version of tail that does not contain this feature.
OS version - Solaris 10
You can use the -F option for tail which implies --follow=name --retry.
From the man page:
-F
The -F option implies the -f option, but tail will also check to see if the
file being followed has been renamed or rotated. The file is closed and
reopened when tail detects that the filename being read from has a new inode
number. The -F option is ignored if reading from standard input rather than
a file.
You may need inotify to detect the creation of the new file; a workaround for that would be to keep polling the filesystem while running tail in the background:
#!/bin/bash
get_latest() {
    # the glob expands in sorted order, so the last entry is the newest name
    local files=(*.log)
    local latest=${files[${#files[@]}-1]}
    echo "$latest"
    [[ $latest == "$1" ]]    # status 0 while the latest file has not changed
}
run_tail() {
    tail -c +0 -f "$1"
}
while true; do
    while current=$(get_latest "$current"); do
        sleep 1    # poll until a newer file appears
    done
    [[ $pid ]] && kill "$pid"    # stop tailing the previous file
    run_tail "$current" & pid=$!
done
(untested, unnecessarily hacky, and be careful with the limitations of your old system!)
