How to find the value of a key on tail -f - bash

My log files are in key-value format. I want to find the value of a particular key while following a log with tail -f.
Suppose the log contains lines like:
ts=2016-12-23-18-31-34-849 | deviceType=LENOVO Lenovo A6000 | elapsed=11 | firstHomePage=null | installId=37797b61-0bb1-4c1a-844c-5904c7e83de8 | ip=157.48.104.146
ts=2016-12-23-18-31-34-849 | deviceType=LENOVO Lenovo A6000 | elapsed=15 | firstHomePage=null | installId=37797b61-0bb1-4c1a-844c-5904c7e83de8 | ip=157.48.104.146
I am not sure how to pipe the output of my tail -f so that the output is the following:
11
15

Use GNU grep with the --line-buffered option to flush output line by line, which matters when the input is a continuously growing file. The -o flag prints only the matching part of each line, and -P enables Perl-style regexes, whose \K discards everything matched before it from the output.
tail -f file | grep --line-buffered -oP "elapsed=\K(\d+)"
11
15
From the grep man page:
--line-buffered
Use line buffering on output.
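If your grep lacks -P (it is a GNU extension), a rough equivalent is possible with sed. This is only a sketch, assuming GNU sed, whose -u switch likewise unbuffers output:
tail -f file | sed -un 's/.*elapsed=\([0-9]*\).*/\1/p'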

Try grep:
tail log_file | grep -o '\<elapsed=[^[:space:]]*' | cut -d= -f2

awk -F'[=|]' '{print $6}' file
11
15
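Printing $6 works only while elapsed is the third key=value pair. Here is a sketch that looks the key up by name instead, assuming the fields are separated by ' | '; fflush() keeps the output streaming under tail -f:
tail -f file | awk -F' \\| ' '{for (i=1; i<=NF; i++) if ($i ~ /^elapsed=/) {sub(/^elapsed=/, "", $i); print $i; fflush()}}'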

Related

How to feed xargs to a piped grep for a piped cat command

Command 1:
(Generates grep patterns with unique PIDs for a particular date and time, read from runtime.log)
cat runtime.log | grep -e '2018/09/13 14:50' | awk -F'[ ]' '{print $4}' | awk -F'PID=' '{print $2}' | sort -u | xargs -I % echo '2018/09/13 14:50.*PID='%
The output of the above command is (a set of custom grep patterns):
2018/09/13 14:50.*PID=13109
2018/09/13 14:50.*PID=14575
2018/09/13 14:50.*PID=15741
Command 2:
(Reads runtime.log and fetches the appropriate lines based on the grep pattern; ideally the grep pattern should come from Command 1)
cat runtime.log | grep '2018/09/13 14:50.*PID=13109'
The question is: how do I combine Command 1 and Command 2?
The combined version below doesn't give the expected output (the produced output had lines with dates other than '2018/09/13 14:50'):
cat runtime.log | grep -e '2018/09/13 14:50' | awk -F'[ ]' '{print $4}' | awk -F'PID=' '{print $2}' | sort -u | xargs -I % echo '2018/09/13 14:50.*PID='% | cat runtime.log xargs grep
grep has an option -f. From man grep:
-f FILE, --file=FILE
Obtain patterns from FILE, one per line. The empty file contains zero patterns, and therefore matches nothing. (-f is specified by POSIX.)
So you could use
cat runtime.log | grep -e '2018/09/13 14:50' | awk -F'[ ]' '{print $4}' | awk -F'PID=' '{print $2}' | sort -u | xargs -I % echo '2018/09/13 14:50.*PID='% > a_temp_file
cat runtime.log | grep -f a_temp_file
The shell has a syntax, <(), that avoids having to create the temporary file. From man bash:
Process Substitution
Process substitution is supported on systems that support named pipes
(FIFOs) or the /dev/fd method of naming open files. It takes the form
of <(list) or >(list). The process list is run with its input or
output connected to a FIFO or some file in /dev/fd. The name of this
file is passed as an argument to the current command as the result of
the expansion. If the >(list) form is used, writing to the file will
provide input for list. If the <(list) form is used, the file passed
as an argument should be read to obtain the output of list.
So you can combine it to:
cat runtime.log | grep -f <(cat runtime.log | grep -e '2018/09/13 14:50' | awk -F'[ ]' '{print $4}' | awk -F'PID=' '{print $2}' | sort -u | xargs -I % echo '2018/09/13 14:50.*PID='%)
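For what it's worth, the pattern-building half could arguably be collapsed into a single sed. A sketch, assuming the PID value is numeric and the date and PID each occur once per line:
grep -f <(sed -n 's|.*\(2018/09/13 14:50\).*PID=\([0-9]*\).*|\1.*PID=\2|p' runtime.log | sort -u) runtime.log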

How to write a shell script that reads all the file names in a directory and finds a particular string in the file names?

I need a shell script to find a string in file names like the following one:
FileName_1.00_r0102.tar.gz
And then pick the highest value from multiple occurrences.
I am interested in the "1.00" part of the file name.
I am able to get this part separately in the UNIX shell using the commands:
find /directory/*.tar.gz | cut -f2 -d'_' | cut -f1 -d'.'
1
2
3
1
find /directory/*.tar.gz | cut -f2 -d'_' | cut -f2 -d'.'
00
02
05
00
The problem is that there are multiple files with this pattern:
FileName_1.01_r0102.tar.gz
FileName_2.02_r0102.tar.gz
FileName_3.05_r0102.tar.gz
FileName_1.00_r0102.tar.gz
I need to pick the file with FileName_("highest value")_r0102.tar.gz.
But since I am new to shell scripting, I am not able to figure out how to handle these multiple instances in a script.
The script which I came up with just for the integer part is as follows:
#!/bin/bash
for file in /directory/*
file_version = find /directory/*.tar.gz | cut -f2 -d'_' | cut -f1 -d'.'
done
OUTPUT: file_version:command not found
Kindly help.
Thanks!
If you just want the latest version number:
cd /path/to/files
printf '%s\n' *r0102.tar.gz | cut -d_ -f2 | sort -n -t. -k1,2 | tail -n1
If you want the file name:
cd /path/to/files
latest=$(printf '%s\n' *r0102.tar.gz | cut -d_ -f2 | sort -n -t. -k1,2 | tail -n1)
printf '%s\n' *${latest}_r0102.tar.gz
You could try the following which finds all the matching files, sorts the filenames, takes the last in that list, and then extracts the version from the filename.
#!/bin/bash
file_version=$(find ./directory -name "FileName*r0102.tar.gz" | sort | tail -n1 | sed -r 's/.*_(.+)_.*/\1/g')
echo ${file_version}
I have tried this and the following one-liner works; it may be what you need:
ls ./*.tar.gz | sort | sed -n '/[0-9]\.[0-9][0-9]/p' | tail -n 1
It's unnecessary to parse the filename's version number prior to finding the actual filename. Use GNU ls's -v option (natural sort of version numbers within text):
ls -v FileName_[0-9.]*_r0102.tar.gz | tail -1
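If you'd rather not parse ls output at all, GNU sort's -V (version sort) gives the same ordering; a sketch:
printf '%s\n' FileName_[0-9.]*_r0102.tar.gz | sort -V | tail -n 1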

sort -R is not an option in my OS

I have a couple of OSes whose sort does not have -R, which I would use to generate a random list from a text file. For example, I am trying to use the following command:
sort -R file | head -20000 > newfile
I looked up the man pages on these OSes and, sure enough, the -R option is not listed.
What is an alternative that can generate a random list from a file and print to a new file?
CentOS 5
Try:
shuf file | head -n 20000 > newfile
or:
cat file | perl -MList::Util=shuffle -e 'print shuffle(<STDIN>);'
You can use the shuf command, if it is installed.
shuf can either take a file as its input
shuf file | head -n 20000 > newfile
or read from stdin
cat file | shuf | head -n 20000 > newfile
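GNU shuf can also cap the sample size itself with -n, which makes the head unnecessary:
shuf -n 20000 file > newfile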
If neither shuf nor Perl is available, you can prefix each line with a random number, sort numerically on that prefix, and then strip it off:
cat file | awk 'BEGIN{srand();}{print rand()"\t"$0}' | sort -k1 -n | cut -f2 | head -20000 > newfile
This is what worked for me. With tee in place of the redirection you can watch the output as it is produced:
cat ALLEMAILS.txt | awk 'BEGIN{srand();}{print rand()"\t"$0}' | sort -k1 -n | cut -f2 | head -20000 | tee 20000random.txt

How to find most frequent string in file

I have a question about a bash script. Let's say there is a file which contains lines; each line has a path to a file and a date. The problem is how to find the most frequent path.
Thanks in advance.
Here's a suggestion
$ cut -d' ' -f1 file.txt | sort | uniq -c | sort -rn | head -n1
# \_____________________/ \__/ \_____/ \______/ \_______/
# select the file column sort print sort on print top
# files counts count result
Example use:
$ cat file.txt
/home/admin/fileA jan:17:13:46:27:2015
/home/admin/fileB jan:17:13:46:27:2015
/home/admin/fileC jan:17:13:46:27:2015
/home/admin/fileA jan:17:13:46:27:2015
/home/admin/fileA jan:17:13:46:27:2015
$ cut -d' ' -f1 file.txt | sort | uniq -c | sort -rn | head -n1
3 /home/admin/fileA
You can strip the leading 3 from the final result with another cut (or, more robustly, awk), for example:
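A sketch using awk, since uniq -c left-pads the count with a variable amount of whitespace and a fixed cut field would be fragile (this still assumes no whitespace in the paths):
cut -d' ' -f1 file.txt | sort | uniq -c | sort -rn | head -n1 | awk '{print $2}'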
Reverse each line, cut off the beginning (the reversed date), reverse again, then sort and count unique lines:
cat file.txt | rev | cut -b 22- | rev | sort | uniq -c
If you're absolutely sure you won't have whitespace in your paths, you can avoid rev altogether:
cat file.txt | cut -d " " -f 1 | sort | uniq -c
If the output is too long to inspect visually, aioobe's suggestion of following this with sort -rn | head -n1 will serve you well.
It's worth noticing, as aioobe mentioned, that many unix commands optionally take a file argument. By using it, you can avoid the extra cat command at the beginning, supplying the file directly to the next command:
cat file.txt | rev | ... vs rev file.txt | ...
While I personally find the first option both easier to remember and understand, the second is preferred by many (most?) people, as it saves system resources (specifically, the memory and references used by an additional process) and can perform better in some specific use cases. Wikipedia's cat article discusses this in detail.

How to determine the latest major and full kernel version string as compactly as possible

So what I'm intending to do here is to determine both the latest major and the full kernel version string as compactly as possible (without a zillion pipes to grep).
I'm already quite content with the result, but if anybody has ideas on how to squash the first line even slightly, that would be very awesome (it has to work when there are no minor patches as well).
The index page of kernel.org is only 36 kB, compared to 136 kB for http://www.kernel.org/pub/linux/kernel/v3.x/, which is why I'm using it:
_major=$(curl -s http://www.kernel.org/ -o /tmp/kernel && cat /tmp/kernel | grep -A1 mainline | tail -1 | cut -d ">" -f3 | cut -d "<" -f1)
pkgver=${_major}.$(cat /tmp/kernel | grep ${_major} | head -1 | cut -d "." -f6)
It's just a thought exercise at this stage as the real answer is in the comments above, but here are some possible improvements.
Original:
_major=$(curl -s http://www.kernel.org/ -o /tmp/kernel && cat /tmp/kernel | grep -A1 mainline | tail -1 | cut -d ">" -f3 | cut -d "<" -f1)
Use tee instead of cat:
_major=$(curl -s http://www.kernel.org/ | tee /tmp/kernel | grep -A1 mainline | tail -1 | cut -d ">" -f3 | cut -d "<" -f1)
Use sed to minimise the number of pipes, and to make the command unreadable
_major=$(curl -s http://www.kernel.org/ | tee /tmp/kernel | sed -n '/ainl/,/<\/s/ s|.*>\([0-9\.]*\)</st.*|\1|p')
Cheap tricks: shorten the URL
_major=$(curl -s kernel.org | tee /tmp/kernel | sed -n '/ainl/,/<\/s/ s|.*>\([0-9\.]*\)</st.*|\1|p')
kernel.org provides a plaintext listing of all the current versions at https://www.kernel.org/finger_banner
For mainline:
curl -s https://www.kernel.org/finger_banner | grep mainline | awk '{print $NF}'
For latest stable:
curl -s https://www.kernel.org/finger_banner | grep -m1 stable | awk '{print $NF}'
The mainline and latest stable versions will never be EOL, but other versions often are, so the above awk commands will not work correctly for all versions. A general solution as a bash function:
latest_kernel() {
curl -s https://www.kernel.org/finger_banner | grep -m1 $1 | sed -r 's/^.+: +([^ ]+)( .+)?$/\1/'
}
Examples:
$ latest_kernel mainline
4.18-rc2
$ latest_kernel stable
4.17.3
$ latest_kernel 4.16
4.16.18
You've got a useless use of cat. You can replace:
cat /tmp/kernel | grep -A1 mainline
with simply:
grep -A1 mainline /tmp/kernel
In your case, you don't even need the file at all. Curl by default will emit to standard output, so you can just do:
curl -s http://www.kernel.org/ | grep -A1 mainline
Expanding on @Justin Brewer's answer, you probably want to know when a kernel is EOL, since this is useful information. The following single awk command preserves that information for you.
latest_kernel() {
curl -s https://www.kernel.org/finger_banner |awk -F ':' -v search="$1" '{if ($1 ~ search) {gsub(/^[ ]+/, "", $2); print $2}}'
}
-F ':' -- field separator, because everything after the : is the version string.
-v search="$1" -- passes the search string as an awk internal variable.
if statement -- checks whether field $1 matches the search string.
gsub -- in-place modification of field $2 to strip leading spaces.
Then just print field $2 for any matching record (I presume your search string will only match the left-hand side of one line; if it is important to exit after the first match, use print $2; exit).
The search string can include spaces, etc. Using an awk variable and matching with ~ search, rather than embedding the pattern as '.../'"$1"'/...', avoids the need to exit single-quote mode and avoids syntax errors when the search string contains "/".
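Hypothetical usage; the version number below is illustrative only, not live data, but an EOL series would keep its (EOL) marker since the whole of field $2 is printed:
$ latest_kernel '3.2'
3.2.101 (EOL)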
