grep date ranges from feed - shell

I saw it posted here that this would work, but it is not working for me. I need something short and sweet like this command. I am using the Python version of rsstail.
rsstail -dl -e 1 -U -a -u https://threatpost.com/feed/ -n 10 | grep -A 2 "2021/03/15 20[1-5]"
This should grab the last 5 hours but it doesn't.
Sample line from the feed follows
Updated: 2021/03/12 21:42:59 Title: Critical Security Hole Can Knock Smart Meters Offline Author: Tara Seals Link: https://threatpost.com/critical-security-smart-meter-offline/164753/ Description: Unpatched Schneider Electric PowerLogic ION/PM smart meters are open to dangerous attacks

You were on the right track with grep, but the pattern could never match: 20[1-5] matches the literal characters 20 followed by one digit in the range 1-5, immediately after the date. That would match
2021/03/15 2022 but not 2021/03/15 20:22.
Assuming you run this during the day, you don't have to worry about spanning two days. If you ran it at 2:00am you would need yesterday's 21:XX, 22:XX, 23:XX and today's 00:XX, 01:XX.
So say you run at 11am today: the previous 5 hours are 06, 07, 08, 09 and 10, so you could do something like this.
grep -A 2 -E -e '2021/03/17 (06|07|08|09|10):'
OR even
grep -A 2 -E -e '2021/03/17 (0[6789]|10):'
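You can auto-generate part of the pattern like this (again ignoring hour cross-over). NOTE: the macOS (BSD) date and GNU date take different options; this is the macOS form.
FROMMIN=$( date -v -4M +'%M' )
TOMIN=$( date +'%M' )
## Gives something like 2021/03/17 20:(44|45|46|47|48)
## %02g keeps the leading zero; sed strips the trailing "|" that BSD seq appends,
## which would otherwise add an empty alternative that matches every minute
MATCH=$( date +'%Y/%m/%d %H:' )"($( seq -f '%02g' -s '|' "$FROMMIN" "$TOMIN" | sed 's/|$//' ))"
grep -A 2 -E -e "$MATCH"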
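The hour cross-over that the examples above ignore can be handled by letting date do the arithmetic. The following is only a sketch, assuming GNU date (for -d "N hours ago"); the function name last_hours_pattern is made up for illustration.

```shell
# Hypothetical helper: build an ERE that matches "YYYY/MM/DD HH:" for each
# of the last N clock hours, including the current one. Assumes GNU date.
last_hours_pattern() {
    n=$1
    pattern=""
    i=0
    while [ "$i" -lt "$n" ]; do
        # date handles midnight, month and year boundaries for us
        stamp=$(date -d "$i hours ago" +'%Y/%m/%d %H:')
        pattern="${pattern:+$pattern|}$stamp"
        i=$((i + 1))
    done
    printf '(%s)\n' "$pattern"
}

# Possible usage with the feed from the question:
# rsstail -dl -e 1 -U -a -u https://threatpost.com/feed/ -n 10 | grep -A 2 -E -e "$(last_hours_pattern 5)"
```

Because each alternative carries its own date, a run at 2:00am produces entries for both yesterday and today without any special casing.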

Related

Last Day of Month in csvfile

I am trying to delete every row of a CSV file that is not the last day of a month, but I can't find the right solution.
date,price
2018-07-02,162.17
2018-06-29,161.94
2018-06-28,162.22
2018-06-27,162.32
2018-06-12,163.01
2018-06-11,163.53
2018-05-31,164.87
2018-05-30,165.59
2018-05-29,165.42
2018-05-25,165.96
2018-05-02,164.94
2018-04-30,166.16
2018-04-27,166.69
The output I want is
date,price
2018-06-29,161.94
2018-05-31,164.87
2018-04-30,166.16
I tried it with cut + grep
cut -d, -f1 file.csv | grep -E "28|29|30"
This works, but it returns nothing when I combine it with -f1,2.
I found csvkit, which seems to be the right tool, but I can't find a way to search for several values at once.
csvgrep -c 1 -m 30 file.csv
This gives me the right result, but how can I combine multiple search options? I tried -m 28,29,30 and -m 28 -m 29 -m 30; neither works. Ideally it would pick the last day of every month.
Maybe someone here has an idea.
Thank you and nice Sunday
Silvio
You want to get all records for the LAST day of each month, but months vary in length (28, 29, 30 or 31 days).
I don't see why you used cut to extract the first field (the date): the second field (the price) never looks like a date, so you can safely match against the whole line.
I suggest using grep directly to display the lines that match the pattern mm-dd, where mm is the month number and dd is the last day of that month.
This command should do the trick (note that 02-28 also matches in leap years, when it is not the last day):
grep -E "01-31|02-(28|29)|03-31|04-30|05-31|06-30|07-31|08-31|09-30|10-31|11-30|12-31" file.csv
This command will give the following output:
2018-05-31,164.87
2018-04-30,166.16
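The calendar-based grep cannot return 2018-06-29, which the question lists as wanted output (it is the last row recorded for June, not the last calendar day). A possible alternative is to keep the newest row of each month instead. This is only a sketch: it assumes the rows are sorted newest first, as in the sample, and that the newest month in the file is incomplete and should be skipped. The function name last_day_per_month is made up for illustration.

```shell
# Hypothetical helper: print the header plus the last recorded day of each month.
# Assumes rows are sorted newest first and the newest month is still incomplete.
last_day_per_month() {
    awk -F, '
        NR == 1 { print; next }               # keep the header line
        {
            month = substr($1, 1, 7)          # e.g. "2018-06"
            if (newest == "") newest = month  # remember the newest, possibly partial, month
            # the first row seen for a month is its latest date
            if (month != newest && !(month in seen)) print
            seen[month] = 1
        }
    ' "$1"
}

# Possible usage:
# last_day_per_month file.csv
```

If the newest month should be kept as well, dropping the month != newest condition is enough.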

Prepend message to rsstail

I am trying to prepend a message to the output of rsstail, this is what I have right now:
rsstail -o -i 15 --initial 0 http://feeds.bbci.co.uk/news/world/europe/rss.xml | awk -v time=$( date +\[%H:%M:%S_%d/%m/%Y\] ) '{print time,$0}' | tee someFile.txt
which should give me the following:
[23:46:49_23/10/2014] Title: someTitle
After the command I have a | while read line do ... end which never gets called because the above command does not output a single thing. What am I doing wrong?
PS: I am using the python version of rsstail, since the other one kept on crashing (https://github.com/gvalkov/rsstail.py)
EDIT:
As requested in the comments the command:
rsstail -o -i 15 --initial 0 http://feeds.bbci.co.uk/news/world/europe/rss.xml
Will give back a message like the following when a new article is found
Title: Sweden calls off search for sub
It seems that my rsstail is different from yours, but mine supports the option
-Z x add heading 'x'
so that
rsstail -Z"$( date +\[%H:%M:%S_%d/%m/%Y\] ) " ...
does the job without awk. On the other hand, you do have a problem with buffering. Is it possible to ask rsstail to stop after a given number of titles?
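Note that in the original pipeline the $( date ... ) substitution is evaluated once, when the command starts, so every line would carry the same time. A minimal sketch of a filter that stamps each line as it arrives is shown below; the function name stamp_lines is made up for illustration, and the loop does nothing about buffering inside rsstail itself.

```shell
# Hypothetical helper: prefix every line read from stdin with the time at
# which it was read, in the [HH:MM:SS_DD/MM/YYYY] format from the question.
stamp_lines() {
    while IFS= read -r line; do
        # date runs once per line, so each line gets its own timestamp
        printf '%s %s\n' "$(date +'[%H:%M:%S_%d/%m/%Y]')" "$line"
    done
}

# Possible usage:
# rsstail -o -i 15 --initial 0 http://feeds.bbci.co.uk/news/world/europe/rss.xml | stamp_lines | tee someFile.txt
```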

Shell Script to get exception from logs for last one hour

I am developing a script for the Solaris platform that will grep the logs of the last hour, check for any exceptions, and send an email.
I took the following steps:
grep -n -h "$(date +'%Y-%m-%d %H:%M')" test.logs
The above command gives me a line number, and then I do the following:
tail +6183313 test.logs | grep 'exception'
sample logs
2014-02-17 10:15:02,625 | WARN | m://mEndpoint | oSccMod | 262 - com.sm.sp-client - 0.0.0.R2D03-SNAPSHOT | 1201 or 101 is returned as exception code from SP, but it is ignored
2014-02-17 10:15:02,625 | WARN | m://mEndpoint | oSccMod | 262 - com.sm.sp-client - 0.0.0.R2D03-SNAPSHOT | SP error ignored and mock success returned
2014-02-17 10:15:02,626 | INFO | 354466740-102951 | ServiceFulfill | 183 - org.apache.cxf | Outbound Message
Please suggest any better alternative to perform above task.
With GNU date, one can use:
grep "^$(date -d -1hour +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
The first step above is to select all log entries from the last hour. This is done with grep by looking for all lines beginning with the year-month-day and hour that matches one hour ago:
grep "^$(date -d -1hour +'%Y-%m-%d %H')" test.logs
The next step in the pipeline is to select from those lines the ones that have exceptions:
grep 'exception'
The last step in the pipeline is to send out the mail:
mail -s "exceptions in last hour of test.logs" ImranRazaKhan
The above sends mail to ImranRazaKhan (or whatever email address you chose) with the subject line of "exceptions in last hour of test.logs".
The convenience of having the -d option to date should not be underestimated. It might seem simple to subtract 1 from the current hour but, if the current hour is 12am, then we need to adjust both the day and the hour. If the hour was 12am on the first of the month, we would also have to change the month. And likewise for year. And, of course, February requires special consideration during leap years.
Adapting the above to Solaris:
Consider three cases:
Under Solaris 11 or better, the GNU date utility is available at /usr/gnu/bin/date. Thus, we need simply to specify a path for date:
grep "^$(/usr/gnu/bin/date -d -1hour +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
Under Solaris 10 or earlier, one can download & install GNU date
If GNU date is still not available, we need another way to find the date and time for one hour ago. The simplest workaround is likely to select a timezone that is one hour behind your timezone. If that timezone was, say, Hong Kong, then use the following (the zone name must be spelled exactly as in the timezone database, because an unknown name typically falls back to UTC without any warning):
grep "^$(TZ=Asia/Hong_Kong date +'%Y-%m-%d %H')" test.logs | grep 'exception'| mail -s "exceptions in last hour of test.logs" ImranRazaKhan
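The pipelines above select the previous clock hour (for example all of 10:xx when run at 11:20) rather than the last 60 minutes. Since the log timestamps are in YYYY-MM-DD HH:MM:SS order, they compare correctly as plain strings, which allows a cutoff-based filter. This is only a sketch; the function name exceptions_since is made up for illustration, and the usage line assumes GNU date.

```shell
# Hypothetical helper: print lines whose leading timestamp is at or after
# the cutoff and that mention "exception".
# Relies on "YYYY-MM-DD HH:MM:SS" prefixes sorting correctly as strings.
exceptions_since() {
    cutoff=$1
    logfile=$2
    awk -v cutoff="$cutoff" 'substr($0, 1, 19) >= cutoff && /exception/' "$logfile"
}

# Possible usage (GNU date assumed):
# exceptions_since "$(date -d '1 hour ago' +'%Y-%m-%d %H:%M:%S')" test.logs
```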
You can do it like this (the date format has to match the one used in the log, down to the hour):
dt="$(date -d '1 hour ago' '+%Y-%m-%d %H')"
awk -v dt="$dt" '$0 ~ dt && /exception/' test.logs
Scanning through millions of lines of log sounds terribly inefficient. I would suggest changing the log4j configuration of your application (that is what the format looks like) to cut a new log file every hour. This way, tailing the most recent file becomes a breeze.

Trim text and add timestamp?

So basically I have my output as the following:
<span id="PlayerCount">134,015 people currently online</span>
What I want is a way to trim it to show:
134,015 - 3:24:20AM - Oct 24
Can anyone help? Also note the number may change so is it possible output everything between ">" and the "c" in currently? And add a timestamp somehow?
Using commands from terminal in Linux, so that's called bash right?
Do you perhaps mean something like:
$ echo '<span id="PlayerCount">134,015 people currently online</span>' | sed \
    -e 's/^[^>]*>//' \
    -e "s/currently.*$/$(date '+%r %b %d %Y')/"
which generates:
134,015 people 03:36:30 PM Oct 24 2011
The echo is just for the test data. The first sed command will change everything up to the first > character into nothing (ie, delete it).
The second one will change everything from the currently to the end of the line with the current date in your desired format (although I have added the year since I'm a bit of a stickler for detail).
The relevant arguments for date here are:
%r locale's 12-hour clock time (e.g., 11:11:04 PM)
%b locale's abbreviated month name (e.g., Jan)
%d day of month (e.g., 01)
%Y year
A full list of format specifiers can be obtained from the date man page (execute man date from a shell).
A small script which will give you the desired information from the page you mentioned in the comments is:
#!/usr/bin/bash
wget --output-document=- http://runescape.com/title.ws 2>/dev/null \
| grep PlayerCount \
| head -n 1 \
| sed 's/^[^>]*>//' \
| sed "s/currently.*$/$(date '+%r %b %d %Y')/"
Running this gives me:
pax$ ./online.sh
132,682 people 04:09:17 PM Oct 24 2011
In detail:
The wget bit pulls down the web page and writes it on standard output. The standard error (progress bar) is thrown away.
The grep extracts only lines with the word PlayerCount in them.
The head throws away all but the first of those.
The first sed strips up to the first > character.
The second sed changes the trailing text to the current date and time.
Quickhack(tm):
$ people=$(echo '<span id="PlayerCount">134,015 people currently online</span>' | \
sed -e 's/^.*>\(.*\) people.*$/\1/')
$ echo $people - $(date)
134,015 - Mon Oct 24 09:36:23 CEST 2011
produce_OUTPUT | grep -o '[0-9,]\+' | while read count; do
    printf "%s - %s\n" "$count" "$(date +'%l:%M:%S %p - %b %e')"
done
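None of the answers above produces exactly the layout asked for (134,015 - 3:24:20AM - Oct 24). A sketch that combines the sed extraction with a date format close to that layout follows; the function name player_count_line is made up for illustration, the hour is zero-padded (03 rather than 3) because %I is used, and LC_ALL=C is set only so that AM/PM and the month name are predictable.

```shell
# Hypothetical helper: read the HTML snippet on stdin and print
# "<count> - <time> - <date>", e.g. "134,015 - 03:24:20AM - Oct 24".
player_count_line() {
    # keep only the digits and commas between ">" and " people"
    count=$(sed -n 's/^[^>]*>\([0-9,]*\) people.*/\1/p' | head -n 1)
    printf '%s - %s\n' "$count" "$(LC_ALL=C date '+%I:%M:%S%p - %b %d')"
}

# Possible usage:
# echo '<span id="PlayerCount">134,015 people currently online</span>' | player_count_line
```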

find latest version of rpms from a mirror

I want to write a script to find the latest version of rpm of a given package available from a mirror for eg: http://mirror.centos.org/centos/5/updates/x86_64/RPMS/
The script should be able to run on majority of linux flavors (eg centos, redhat, ubuntu). So yum based solution is not an option. Is there any existing script that does this? Or can someone give me a general idea on how to go about this?
Thx to levislevis85 for the wget cli. Try this:
ARCH="i386"
PKG="pidgin-devel"
URL=http://mirror.centos.org/centos/5/updates/x86_64/RPMS
DL=`wget -O- -q $URL | sed -n 's/.*rpm.>\('$PKG'.*'$ARCH'.rpm\).*/\1/p' | sort | tail -1`
wget $URL/$DL
I will put my comment here, otherwise the code will not be readable.
Try this:
ARCH="i386"
PKG="pidgin-devel"
URL=http://mirror.centos.org/centos/5/updates/x86_64/RPMS
DL=`wget -O- -q $URL | sed -n 's/.*rpm.>\('$PKG'.*'$ARCH'.rpm\).*<td align="right">\(.*\)-\(.*\)-\(.*\) \(..\):\(..\) <\/td><td.*/\4 \3 \2 \5 \6 \1/p' | sort -k1n -k2M -k3n -k4n -k5n | cut -d ' ' -f 6 | tail -1`
wget $URL/$DL
What it does is:
wget - get the index file
sed - cut out some parts and put it together in different order. Should result in Year Month Day Hour Minute and Package, like:
2009 Oct 27 01 14 pidgin-devel-2.6.2-2.el5.i386.rpm
2009 Oct 30 10 49 pidgin-devel-2.6.3-2.el5.i386.rpm
sort - order the columns; n stands for numerical and M for month
cut - cut out field 6
tail - show only the last entry
The problem with this is that if an older package release is uploaded after a newer one, the script picks the wrong file. If the layout of the page changes, the script will also fail. There are always plenty of places where a script like this can break.
using wget and gawk
#!/bin/bash
pkg="kernel-headers"
wget -O- -q http://mirror.centos.org/centos/5/updates/x86_64/RPMS | awk -vpkg="$pkg" 'BEGIN{
    RS="\n";FS="</a>"
    # map month names to two-digit numbers
    z=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",D,"|")
    for(i=1;i<=z;i++){
        date[D[i]]=sprintf("%02d",i)
    }
    temp=0
}
$1~pkg{
    p=$1
    t=$2
    # strip everything around the file name in the link
    gsub(/.*href=\042/,"",p)
    gsub(/\042>.*/,"",p)
    m=split(t,timestamp," ")
    n=split(timestamp[1],d,"-")
    q=split(timestamp[2],hm,":")
    # build a sortable YYYYMMDDhhmm number
    datetime=d[3]date[d[2]]d[1]hm[1]hm[2]
    if ( datetime >= temp ){
        temp=datetime
        filepkg = p
    }
}
END{
    print "Latest package: " filepkg ", date: " temp
}'
an example run of the above:
linux$ ./findlatest.sh
Latest package: kernel-headers-2.6.18-164.6.1.el5.x86_64.rpm, date: 200911041457
Try this (which requires lynx):
lynx -dump -listonly -nonumbers http://mirror.centos.org/centos/5/updates/x86_64/RPMS/ |
grep -E '^.*xen-libs.*i386.rpm$' |
sort --version-sort |
tail -n 1
If your sort doesn't have --version-sort, then you'll have to parse the version out of the filename or hope that a regular sort will do the right thing.
You may be able to do something similar with wget or curl or even a Bash script using redirections with /dev/tcp/HOST/PORT. The problem with these is that you would then have to parse HTML.
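To illustrate why the plain sort used in the first answer can pick the wrong file while --version-sort does not: a lexical sort puts 2.6.10 before 2.6.9. The sketch below assumes GNU sort (for -V, the short form of --version-sort); the function name latest_rpm and the file names are made up for illustration.

```shell
# Hypothetical helper: read RPM file names on stdin and print the one with
# the highest version. Assumes GNU sort, whose -V compares embedded numbers
# numerically, so 2.6.10 sorts after 2.6.9.
latest_rpm() {
    sort -V | tail -n 1
}

# Possible usage with made-up file names:
# printf '%s\n' pidgin-devel-2.6.9-1.el5.i386.rpm pidgin-devel-2.6.10-1.el5.i386.rpm | latest_rpm
```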

Resources