Adding months using shell script - bash

Currently I have the record below in a file:
ABC,XYZ,123,Sep-2018
I'm looking for a Linux command that will add months and give the output. For example, if I want to add 3 months, the expected output is:
ABC,XYZ,123,Dec-2018

Well,
date -d "1-$(echo "ABC,XYZ,123,Sep-2018" | awk -F "," '{ print $4 }')+3 months" "+%b-%Y"
Shows you how to get it working. Just replace the echo with a shell variable as you loop through the dates.
Basically, you use awk to grab just the date portion, prepend 1- to turn it into a real date, then use the date command to do the math and print just the month abbreviation and year.
The command above produces only the date portion. The leading fields can be captured separately with:
stub=`echo "ABC,XYZ,123,Dec-2018" | awk -F "," '{ printf("%s,%s,%s,",$1,$2,$3) }'`
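Putting the pieces together, a minimal sketch of the full loop (assumes GNU date; the records are fed inline here, so replace the here-document with your real file). Starting from day 1 also avoids any end-of-month overflow when date adds the months:

```shell
#!/bin/sh
# For each CSV record, recompute the month field and print the row back out.
while IFS=, read -r f1 f2 f3 dt; do
  newdt=$(date -d "1-$dt +3 months" "+%b-%Y")   # GNU date does the month math
  printf '%s,%s,%s,%s\n' "$f1" "$f2" "$f3" "$newdt"
done <<EOF
ABC,XYZ,123,Sep-2018
ABC,XYZ,123,Nov-2018
EOF
# → ABC,XYZ,123,Dec-2018
#   ABC,XYZ,123,Feb-2019
```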

You can use the external date command, or (g)awk's date/time functions, to do it. However, you have to prepare the string for parsing. Here is another way to do the job:
First prepare an index file with the twelve month abbreviations, one per line; we name it month.txt:
Jan
Feb
...
Nov
Dec
Then run this:
awk -F'-|,' -v OFS="," 'NR==FNR{m[NR]=$1;a[$1]=NR;next}
{i=a[$4]; if(i==12){i=1;++$5}else i++
$4=m[i]"-"$5;NF--}7' month.txt file
With this example file:
ABC,XYZ,123,Jan-2018
ABC,XYZ,123,Nov-2018
ABC,XYZ,123,Dec-2018
You will get:
ABC,XYZ,123,Feb-2018
ABC,XYZ,123,Dec-2018
ABC,XYZ,123,Jan-2019
Update
Oh, I didn't notice that you want to add 3 months. Here is the updated code:
awk -F'-|,' -v OFS="," 'NR==FNR{m[NR]=$1;a[$1]=NR;next}
{i=a[$4]+3; if(i>12){i=i-12;++$5}
$4=m[i]"-"$5;NF--}7' month.txt file
Now with the same input, you get:
ABC,XYZ,123,Apr-2018
ABC,XYZ,123,Feb-2019
ABC,XYZ,123,Mar-2019
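A variation on the same idea, sketched with the month table built inside the script so no month.txt file is needed (the NF-- trick, like the answer above, relies on gawk/mawk behaviour; input is piped in here for illustration):

```shell
printf '%s\n' ABC,XYZ,123,Jan-2018 ABC,XYZ,123,Dec-2018 |
awk -F'-|,' -v OFS="," 'BEGIN{
  split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
  for (k = 1; k <= 12; k++) a[m[k]] = k   # month name -> number
}
{
  i = a[$4] + 3                           # add three months
  if (i > 12) { i -= 12; ++$5 }           # roll over into the next year
  $4 = m[i] "-" $5
  NF--                                    # drop the now-redundant year field
}7'
# → ABC,XYZ,123,Apr-2018
#   ABC,XYZ,123,Mar-2019
```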

Related

How to extract multiple fields with specific character lengths in Bash?

I have a file (test.csv) with a few fields, and what I want is the Title and Path, with 10 characters for the title and a few levels removed from the path. What I have done is use the awk command to pick the two fields:
$ awk -F "," '{print substr($4, 1, 10)","$6}' test.csv [1]
The three levels in the path that need to be removed are not always the same. It can be /article/17/1/ or /open-organization/17/1, so I can't use substr for field $6.
Here the result I have:
Title,Path
Be the ope,/article/17/1/be-open-source-supply-chain
Developing,/open-organization/17/1/developing-open-leaders
Wanted result would be:
Title,Path
Be the ope,be-open-source-supply-chain
Developing,developing-open-leaders
The title is ok with 10 characters but I still need to remove 3 levels off the path.
I could use the cut command:
cut -d'/' -f5- to remove the "/.../17/1/"
But I'm not sure how this can be piped into [1].
I tried to use a for loop to get the title and the path one by one, but I had difficulty getting the awk command to run one line at a time.
I have spent hours on this with no luck. Any help would be appreciated.
Dummy Data for testing:
test.csv
Post date,Content type,Author,Title,Comment count,Path,Tags,Word count
31 Jan 2017,Article,Scott Nesbitt,Book review: Ours to Hack and to Own,0,/article/17/1/review-book-ours-to-hack-and-own,Books,660
31 Jan 2017,Article,Jason Baker,5 new guides for working with OpenStack,2,/article/17/1/openstack-tutorials,"OpenStack, How-tos and tutorials",419
You can replace the string using a regex with sed:
stringZ="Be the ope,/article/17/1/be-open-source-supply-chain"
sed -E "s/((\\/\\w+){3}\\/)//" <<< "$stringZ"
Note that you need -i instead of the here-string if you want sed to edit a file in place.
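Alternatively, the trim and the strip can happen in a single awk call, avoiding the extra pipe entirely. This is a sketch with two sample rows standing in for test.csv; sub(".*/", "", $6) is greedy, so it deletes everything up to the last slash no matter how many path levels there are, and fields 4 and 6 sit before the quoted Tags field, so the plain comma split is safe for them:

```shell
printf '%s\n' \
  'Post date,Content type,Author,Title,Comment count,Path,Tags,Word count' \
  '31 Jan 2017,Article,Scott Nesbitt,Book review: Ours to Hack and to Own,0,/article/17/1/review-book-ours-to-hack-and-own,Books,660' |
awk -F, '{ sub(".*/", "", $6)                 # drop all leading path levels
           print substr($4, 1, 10) "," $6 }'
# → Title,Path
#   Book revie,review-book-ours-to-hack-and-own
```

The header line passes through untouched because "Path" contains no slash for sub() to match.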

How to grep a file from a specific date to EOF with awk

I have a little problem with printing data from file from date to end of file, namely, I have file:
2016/08/10-12:45:14.970000 <some_data> <some_data> ...
2016/08/10-12:45:15.970000 <some_data> <some_data> ...
2016/08/10-12:45:18.970000 <some_data> <some_data> ...
2016/08/10-12:45:19.970000 <some_data> <some_data> ...
And this file has hundreds of lines.
I have to print the file from one point in time to the end of the file, but I don't know the precise time at which a row appeared in the logfile.
I need to print the data from the date 2016/08/10-12:45:16 to the end of the file; I want to receive output that looks like this:
2016/08/10-12:45:18.970000
2016/08/10-12:45:19.970000
OK, if I know the specific date from which I want to print data, everything is easy:
awk '/<start_time>/,/<end/'
awk '/2016\/08\/10-12:45:18/,/<end/'
But if I don't know the specific date, only the approximate date 2016/08/10-12:45:16, it's harder.
Can anyone please help me?
You can benefit from the fact that your zero-padded time format compares correctly as plain strings: lexicographic order matches chronological order. With awk the command can look like this:
awk -v start='2016/08/10-12:45:16' '$1>=start' file
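For example, fed the four sample lines, the string comparison selects the right records even though the boundary time 12:45:16 never occurs in the log:

```shell
# Zero-padded timestamps sort lexicographically in time order, so a plain
# string >= comparison finds the cut-off without parsing any dates.
printf '%s x\n' 2016/08/10-12:45:14.970000 2016/08/10-12:45:15.970000 \
                2016/08/10-12:45:18.970000 2016/08/10-12:45:19.970000 |
awk -v start='2016/08/10-12:45:16' '$1 >= start'
# → 2016/08/10-12:45:18.970000 x
#   2016/08/10-12:45:19.970000 x
```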
You can use the mktime function of gawk to check the time:
awk -v TIME="2016/08/10-12:45:16" '
BEGIN{
gsub("[/:-]"," ",TIME)
reftime=mktime(TIME)
}
{
t=$1
sub("\\.[0-9]*$","",t)
gsub("[/:-]"," ",t)
if(mktime(t)>reftime)
print
}' file
This script takes your reference time, converts it into a number, and then compares it with the time found on each line of the file.
Note that the sub and gsub calls only convert your specific time format into the "YYYY MM DD HH MM SS" form understood by mktime.
You should be able to do this simply with awk:
awk '{m = "2016/08/10-12:45:18"} $0 ~ m,0 {print}' file
If you aren't sure of the exact time or date, you could do:
awk '{m = "2016/08/10-12:45:1[6-8]"} $0 ~ m,0 {print}' file
This prints from your specified date and time (12:45:16 through 12:45:18) to the end of the file. The character class [6-8] turns the final seconds digit into a range on top of 12:45:1.
Output:
2016/08/10-12:45:18.970000 <some_data> <some_data> ...
2016/08/10-12:45:19.970000 <some_data> <some_data> ...

Changing date format in CSV file using Ubuntu bash/awk

I have CSV files where the date is in the wrong format. The incoming format is e.g. 15.11.2015 and I should change it to %Y-%m-%d (2015-11-15). I've tried to create a bash/awk script to change this value, which is in column 43. The first row is a header. So far I've managed to create a script which finds the value and rebuilds it with slashes:
awk -v FS=";" 'NR>1{split($43,a,".");$43=a[2]"/"a[1]"/"a[3]}1' OFS=";" fileIn
I've tried to change the format with the date command, but I haven't found a way to use it inside the awk script. This would print the date in the right format:
date -d 11/25/2015 +%Y-%m-%d
EDIT. I need to handle the padding; otherwise the leading zeros are lost when the day or month is < 10.
I followed 123's advice, used sprintf with zero-padding, and my working solution is now:
awk -v FS=";" 'NR>1{split($43,a,".");$43=sprintf("%04d-%02d-%02d",a[3],a[2],a[1])}1' OFS=";" fileIn
EDIT. Cleaned after 123's comment.
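A quick way to see the zero-padding in action (shortened to column 2 of a two-field record purely for illustration):

```shell
echo "id;5.3.2015" |
awk -F';' -v OFS=';' '{ split($2, a, ".")
  # %02d zero-pads day and month, %04d the year
  $2 = sprintf("%04d-%02d-%02d", a[3], a[2], a[1]) }1'
# → id;2015-03-05
```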

Comparing and transforming a date that is piped to (probably) awk?

I've got a reasonably complicated string of piped shell commands (let's assume it's bunch | of | commands), which together produces several rows of output, in this format:
some_path/some_file.csv 1439934121
...where 1439934121 is the file's last-modified timestamp.
What I need to do is see if it's a timestamp on the current day, i.e. on or after last midnight, and then include just the lines where that is true.
I assume this means that some string (e.g. the word true) should either replace or be appended to the timestamps of those lines for grep to distinguish them from ones where the timestamps are those of an earlier date.
To put it in shell command terms:
bunch | of | commands | ????
...should produce:
some_path/some_file.csv true or some_path/some_file.csv 1439934121 true
...for which I could easily grep (obviously assuming that last midnight <= 1439934121 <= current time).
What kind of ???? would do this? I'm almost certain awk can do what I need, so I've looked at it and at date, but I'm basically doing awk-by-Google with no skills and getting nowhere.
Don't feel constrained by my tool assumptions; if you can achieve this with alternate means, given the output of bunch | of | commands but still using shell tools and piping, I'm all ears. I'd like to avoid temp files or Perl, if possible :-)
I'm using gawk + bash 4.3 on Ubuntu Linux, specifically, and have no portability concerns.
Since date -d 'today 00:00:00' with the %s format returns the Unix timestamp of last midnight:
$ date -d'today 00:00:00'
Thu Sep 3 00:00:00 CEST 2015
$ date -d 'today 00:00:00' "+%s"
1441231200
You can probably pipe to an awk doing something like:
... | awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '{$2= ($2>midnight) ? "true" : "false"}1'
That is, use the ternary operator to check the value of $2 and replace with either of the values true/false depending on the result:
awk -v midnight="$(date ...)" '{$2= ($2>midnight) ? "true" : "false"}1'
Test
$ cat a
hello 1441231201
bye 23
$ awk -v midnight="$(date -d 'today 00:00:00' '+%s')" '{$2= ($2>midnight) ? "true" : "false"}1' a
hello true
bye false
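Putting it together with the grep step the question asks for; printf stands in for the real bunch | of | commands, and >= is used so a timestamp of exactly last midnight still counts as today:

```shell
printf '%s\n' "some_path/new.csv $(date +%s)" "some_path/old.csv 23" |
awk -v midnight="$(date -d 'today 00:00:00' '+%s')" \
    '{ $2 = ($2 >= midnight) ? "true" : "false" }1' |
grep ' true$'
# → some_path/new.csv true
```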

Get lines from a specific date afterwards/backwards

I'm working on a shell script. I need to get the lines which contain information from a given day or older. In the file each line is a record; the first line is the oldest and the last is the newest.
The file contains:
some info \t the date \t other info
If I simply grep the given date I find what I'm looking for, but only if that date is present in the file: I find its last occurrence and take the lines from the start of the file up to it. I tried awk but totally failed. It should give me each line that contains that date or an older one. My failed last attempt:
awk '$1 <= "2015/03/17"'
So I need something similar to egrep that gives me all lines with a date of 2015/03/15 or older. Or do I have to go through each line, compare the two dates, and write the line out if it's older?
Should be pretty easy, since the date format you are using is zero-padded and can be compared as plain strings:
File: test.txt
a 2015/03/17 b
c 2015/03/12 d
e 2014/02/10 f
g 2016/01/01 h
Awk command:
awk -v d="2015/03/15" '{if ($2 <= d) {print $0}}' test.txt
Just change d= to whatever date value you want.
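The same thing in condensed form: a bare pattern with no action prints the matching record, so the if/print block can be dropped. With the test.txt rows piped in:

```shell
printf '%s\n' 'a 2015/03/17 b' 'c 2015/03/12 d' 'e 2014/02/10 f' 'g 2016/01/01 h' |
awk -v d="2015/03/15" '$2 <= d'   # string comparison: keep rows on/before d
# → c 2015/03/12 d
#   e 2014/02/10 f
```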
