Trying to parse logfile based on start and end time - bash

I am trying to parse a large gzipped logfile and would like to collect all matching parameters within a certain time range:
Wed Nov 3 09:27:20 2010 : remote IP address 209.151.64.18
Wed Nov 3 11:57:22 2010 : secondary DNS address 204.117.214.10
I am able to grep other parameters using the line below:
gzcat jfk-gw10-asr1.20100408.log.gz | egrep 'gabriel|98.126.209.144|13.244.137.58|16.151.65.121'
I have been unable to parse for the start time and/or end time.
Any assistance is greatly appreciated.

Assuming that the log file is chronologically sorted you could do e.g.:
gzcat jfk-gw10-asr1.20100408.log.gz | sed -n '/Nov 3 09:/,/Nov 3 11:/p'
to get the log entries between 09:00:00 and 11:59:59 on Nov 3rd.
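If you need tighter bounds than whole hours, the same address-range syntax takes more precise patterns, e.g. (a sketch based on the sample lines above; substitute timestamps that actually occur in your log, since sed prints through to the end of the file if the closing pattern never matches):
gzcat jfk-gw10-asr1.20100408.log.gz | sed -n '/Nov 3 09:27:20/,/Nov 3 11:57:22/p'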

You can access space-separated fields using awk:
awk '{ print $1 }' your_file_name
Use $n (for example, $2 or $3) to print the desired field.
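Applied to the original question, a minimal sketch that compares the time stamp directly, assuming the time is field 4 (as in the sample lines) and all entries are from the same day:
gzcat jfk-gw10-asr1.20100408.log.gz | awk '$4 >= "09:27:20" && $4 <= "11:57:22"'
The string comparison works because HH:MM:SS timestamps sort lexically.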

converting space separated file into csv format in linux

I have a file that contain data in the format
2 Mar 1 1234 141.98.80.59
1 Mar 1 1234 171.239.249.233
5 Mar 1 admin 116.110.119.156
4 Mar 1 admin1 177.154.8.15
2 Mar 1 admin 141.98.80.63
2 Mar 1 Admin 141.98.80.63
I tried this command to convert it into CSV format, but it gives me output with an extra comma (,) at the front:
cat data.sql | tr -s '[:blank:]' ',' > data1.csv
,2,Mar,1,1234,141.98.80.59
,1,Mar,1,1234,171.239.249.233
,5,Mar,1,admin,116.110.119.156
,4,Mar,1,admin1,177.154.8.15
,2,Mar,1,admin,141.98.80.63
,2,Mar,1,Admin,141.98.80.63
In my file there are 6 space characters in front of every record.
How can I remove the extra comma from the front?
To remove the extra comma from the front using awk:
$ awk -v OFS=, '{$1=$1}1' file
Output:
2,Mar,1,1234,141.98.80.59
1,Mar,1,1234,171.239.249.233
5,Mar,1,admin,116.110.119.156
...
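The $1=$1 assignment forces awk to rebuild the record using the output field separator (a comma here), which also discards the leading blanks, and the trailing 1 is an always-true condition that prints the rebuilt record. To write straight to the CSV file:
awk -v OFS=, '{$1=$1}1' data.sql > data1.csv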
An improved version of your current method is:
cat data.sql | sed -E -e 's/^[[:blank:]]+//g' -e 's/[[:blank:]]+/,/g' > data1.csv
But do be aware that replacing blanks with commas isn't a real way of turning this format into CSV: if any commas or spaces are present in the actual data, this approach will fail.
The fact that your example source file has the .sql extension suggests that perhaps you got this file by exporting a database and have already stripped parts of it away with other tr statements? If that is the case, a better approach would be to export to CSV (or another format) directly.
edit: Made the sed statement more portable, as recommended by Quasímodo in the comments.
Using Miller, it is:
mlr --n2c -N remove-empty-columns ./input.txt >./output.txt
The output will be
2,Mar,1,1234,141.98.80.59
1,Mar,1,1234,171.239.249.233
5,Mar,1,admin,116.110.119.156
4,Mar,1,admin1,177.154.8.15
2,Mar,1,admin,141.98.80.63
2,Mar,1,Admin,141.98.80.63

Solaris/Unix: How to display only new lines from log file since last check

The machine runs SunOS 5.10, so some commands behave a bit differently than on Linux.
I have an error log file that is filled by multiple users.
I get my entries by:
tail error.log | grep USER1
I get results in the format:
Jan 16 09:06:18 XYZ USER1
Jan 16 09:22:12 XYZ USER1
Jan 16 11:22:30 XYZ USER1
What I need is a command that would only print my errors that were logged after the last time I checked and print nothing otherwise.
I decided to get my last entry into another file by:
grep USER1 error.log | tail -1 > temp_error.log
From there I'd cut only the first chars containing the date by:
less temp_error.log | cut -c 1-15
This would give me the result:
Jan 16 11:22:30
I then use a 'sed' command to get all lines after this one in the file:
sed -e '1,/Jan 16 11:22:30/d' error.log | grep USER1
This would work but of course I have to paste the date every time. I decided to automate this with a variable by executing the below commands:
LAST_ENTRY=`less temp_error.log | cut -c 1-15`
sed -e '1,/$LAST_ENTRY/d' error.log | grep USER1
This doesn't work because the variable is not passed correctly to the 2nd command and I don't understand why. Please, help me to understand what is wrong with this.
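The likely culprit is quoting: inside single quotes the shell does not expand $LAST_ENTRY, so sed receives the literal text $LAST_ENTRY rather than the date. A minimal sketch of the same two commands with double quotes instead:
LAST_ENTRY=`grep USER1 error.log | tail -1 | cut -c 1-15`
sed -e "1,/$LAST_ENTRY/d" error.log | grep USER1
# the double quotes let the shell substitute $LAST_ENTRY before sed sees it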

Adding months using shell script

Currently I have a below record in a file.
ABC,XYZ,123,Sep-2018
I am looking for a command in Linux which will add months and give the output. For example, if I want to add 3 months, the expected output is:
ABC,XYZ,123,Dec-2018
Well,
date -d "1-$(echo "ABC,XYZ,123,Sep-2018" | awk -F "," '{ print $4 }')+3 months" "+%b-%Y"
shows you how to get it working. Just replace the echo with a shell variable as you loop through the dates.
Basically, you use awk to grab just the date portion, add a 1- to the front to turn it into a real date, then use the date command to do the math, and finally tell it to give you just the month abbreviation and year.
The line above gives just the new date portion. The first part of the record can be captured using:
stub=`echo "ABC,XYZ,123,Dec-2018" | awk -F "," '{ printf("%s,%s,%s,",$1,$2,$3) }'`
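Putting the two parts together, a minimal sketch of a loop over the whole file (assuming GNU date and that the records live in a file named file):
while IFS=, read -r f1 f2 f3 d; do
    printf '%s,%s,%s,%s\n' "$f1" "$f2" "$f3" "$(date -d "1-$d +3 months" "+%b-%Y")"
done < file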
You can use the external date command or (g)awk's datetime-related functions to do it; however, you have to prepare the string for parsing. Here is another way to do the job:
First prepare an index file; we name it month.txt:
Jan
Feb
Mar
Apr
May
Jun
Jul
Aug
Sep
Oct
Nov
Dec
Then run this:
awk -F'-|,' -v OFS="," 'NR==FNR{m[NR]=$1;a[$1]=NR;next}
{i=a[$4]; if(i==12){i=1;++$5}else i++
$4=m[i]"-"$5;NF--}7' month.txt file
With this example file:
ABC,XYZ,123,Jan-2018
ABC,XYZ,123,Nov-2018
ABC,XYZ,123,Dec-2018
You will get:
ABC,XYZ,123,Feb-2018
ABC,XYZ,123,Dec-2018
ABC,XYZ,123,Jan-2019
Update:
Oh, I didn't notice that you want to add 3 months. Here is the updated code for it:
awk -F'-|,' -v OFS="," 'NR==FNR{m[NR]=$1;a[$1]=NR;next}
{i=a[$4]+3; if(i>12){i=i-12;++$5}
$4=m[i]"-"$5;NF--}7' month.txt file
Now with the same input, you get:
ABC,XYZ,123,Apr-2018
ABC,XYZ,123,Feb-2019
ABC,XYZ,123,Mar-2019
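For reference, a sketch of the same approach without the separate month.txt, building the month lookup in a BEGIN block instead (same input file, and the same assumption that your awk allows decrementing NF, as gawk does):
awk -F'-|,' -v OFS="," 'BEGIN{n=split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec",m); for(j=1;j<=n;j++) a[m[j]]=j}
{i=a[$4]+3; if(i>12){i=i-12;++$5}
$4=m[i]"-"$5;NF--}7' file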

how to grep multiple words from different lines in same log

I want to grep for two words that appear on different lines of the same log. The words are checkCredit?msisdn=766117506 and creditLimit.
The log file is like this
freeMemory 103709392time Mon Mar 12 04:02:13 IST 2018
http://127.0.0.1:8087/DialogProxy/services/ServiceProxy/checkCredit?msisdn=767807544&transid=45390124
freeMemory 73117016time Mon Mar 12 04:02:14 IST 2018
statusCode200
{statusCode=200, response=outstandingAmount 0.0 creditLimit 0.0, errorResponse=, responseTime=0}
this is balnce 0.0
What is the best way to do this?
Give this a try:
grep 'checkCredit?msisdn\|creditLimit' inputfile
You can use
$ grep -e 'checkCredit?msisdn=766117506' -e 'creditLimit' <filename>
Simply try
grep creditLimit log.txt | grep checkCredit
When I run grep 'checkCredit?msisdn=766117506\|creditLimit' inputfile catalina.out_bckp, it displays only the creditLimit details. Why doesn't it show the checkCredit?msisdn=766117506 details?
The log file as shown in your question post does not contain the 766117506, so it's no wonder that grep doesn't find it. If you really have data with 766117506, add them to the question.
I have used this command: grep 'checkCredit?msisdn\|this is balnce' catalina.out_bckp | awk '$4 < 10 {print ;}'. It gave me a good result but there are some missing values.
You haven't used creditLimit in that pattern, so it's no wonder that those lines are missing.
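If the goal is to see each matching request together with the creditLimit response that follows it, a minimal sketch with awk (assuming the response line always appears after its request in the log):
awk '/checkCredit[?]msisdn=766117506/ {req=$0; want=1; next}
     want && /creditLimit/ {print req; print; want=0}' catalina.out_bckp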

Shell script to count number of logins per day

I'm new to shell programming. I'm trying to write a shell script to count the number of logins per day of the week for users on some machine.
Output should look like this:
123 Mon
231 Tue
555 Wed
21 Thu
44 Fri
123 Sat
10 Sun
I've tried to do it using the commands last, uniq and sort, like this:
last -s -7days | awk '{print $1, $4,$5,$6}' | uniq -cd |sort -u
but I think I'm missing something because I'm somehow getting duplicated results. Also, I'm not sure how to get overall counts separated by days.
The problem with uniq is that it only collapses adjacent duplicate lines. In your case, -d on uniq is hiding the lines that break up the duplicates; I am guessing you have some lines similar to reboot 4.4.5-1-ARCH Wed Mar between login attempts for the day. You will also have problems with multiple users logging in and breaking up the counts for other users.
Typically you sort | uniq to get a true list of unique rows, but if you remove the -d you end up with lines you do not want. These are best filtered out separately, either before or after the sort | uniq.
Finally, the last sort -u will delete data if two rows happen to match exactly, which I do not think is what you want. Instead it is better to sort on the date column (this will cause a small issue on the month rollover) or on another column you care about with the -k FIELDNUM argument, if you need to sort the counts at all.
Combine this together and you get:
last -s -7days | awk '/reboot/ {next}; /wtmp/ {next}; /^$/ {next}; {print $1, $4,$5,$6}' | sort | uniq -c | sort -k 5
Note that /reboot/ {next} causes awk to ignore lines that match the pattern between the slashes.
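To get overall counts per weekday in the format shown in the question, a minimal sketch that aggregates in awk instead (assuming, as above, that the weekday is field 4 of last's output; the loop's output order is arbitrary, so sort it afterwards if you need a fixed order):
last -s -7days | awk '!/reboot|wtmp|^$/ {count[$4]++} END {for (d in count) print count[d], d}'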
