Create a set of files based off a date range - bash

How would you go about writing a few lines of Bash to accomplish the following? I'm trying to build up my skills in Bash and learn how to handle more small tasks directly from the command line.
Steps:
Specify a start date and an end date. Load all the dates in between, including the start and end date, into a "list".
Loop over the list, creating a file like this each time (requires date formatting):
2017-11-10.w
2017-11-11.w
2017-11-12.w

You could convert the input dates to Unix timestamps, then add the number of seconds per day and touch a file named after the result until you are past the end date:
#!/bin/bash
startstamp=$(date -d "$1" +'%s')
endstamp=$(date -d "$2" +'%s')
secs_per_day=$(( 24 * 3600 ))
for (( thedate = startstamp; thedate <= endstamp; thedate += secs_per_day )); do
    touch "$(date -d "@$thedate" '+%F.w')"
done
The %s formatting string (a GNU extension) prints the number of seconds since the Unix epoch, and @ in the argument to the -d option indicates that the date is in that format. %F is short for %Y-%m-%d, which translates to YYYY-MM-DD.
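A quick demonstration of those two pieces in isolation (assuming GNU date; -u is added here only so the numbers don't depend on your timezone):
$ date -ud 2017-11-10 +%s       # date string to seconds since the epoch
1510272000
$ date -ud @1510272000 +%F.w    # seconds since the epoch back to the YYYY-MM-DD.w name
2017-11-10.w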
Example usage:
$ ./dates 2017-11-10 2017-11-15
$ ls -1
2017-11-10.w
2017-11-11.w
2017-11-12.w
2017-11-13.w
2017-11-14.w
2017-11-15.w
dates

Related

Identifying the files older than x-months by the filename only and deleting them

I have 4 different files with different fileName.date formats, with a date embedded as part of the name. I want to identify the files older than 3 months based on their names only, because the files may be edited/changed later as well. I want to create a shell script and run it as a cron job.
Below are the files, all under the same directory:
fileone.log.2018-03-23
file_two_2018-03-23.log
filethree.log.2018-03-23
file_four_file_four_2018-03-23.log
I have checked existing examples but have not found what I am actually looking for.
Working on the premise that you mean 90 days - if you specifically need months, we can handle that too, but it's different logic.
Here's some code you could work from.
(You said you don't want to work from a list, so I edited it to use the current directory.)
$: cat chkDates
# while read f   # replaced with -
for f in *[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*
do  # first get the epoch timestamp of the file based on the date string embedded in the name
    filedate=$(
        date +%s -d $(
            echo $f | sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/'
        ) # this returns the date substring
    )     # this converts it to an epoch integer of seconds since 1/1/70
    # now see if it's > 90 days (you said 3 months; if you need *months* we have to do some more...)
    daysOld=$(( ( $(date +%s) - $filedate ) / 86400 )) # this should give you an integer result, btw
    if (( 90 < $daysOld ))
    then echo $f is old
    else echo $f is not
    fi
done # < listOfFileNames # not reading list now
You can pass date a date to report, and a format to present it.
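For example (GNU date; -u is added only so the number is reproducible regardless of timezone):
$ date -u +%s -d 2018-03-23   # epoch seconds for one of the embedded dates
1521763200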
sed pattern explanation
Note the sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/' command. This assumes the date format will be consistently YYYY-MM-DD, and does no validations of reasonableness. It will happily accept any 4 digits, then 2, then 2, delimited by dashes.
-E uses extended regexes, so parens () can denote values to be remembered without needing backslashes. . means any character, and * means any number (including zero) of the previous pattern, so .* means zero or more characters, eating up all of the line before the date. [0-9] means any digit. {x,y} sets a minimum (x) and maximum (y) number of consecutive matches; with only one value, {4} means exactly 4 of the previous pattern will do. So '.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*' means: ignore as many characters as you can until seeing 4 digits, a dash, 2 digits, a dash, and 2 more digits; remember that pattern (the ()'s), then ignore any characters after it.
In a substitution, \1 means the first remembered match, so
sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/'
means find and remember the date pattern in the filenames, and replace the whole name with just that part in the output. This assumes the date will be present - on a filename where there is no date, the pattern will not match, and the whole filename will be returned, so be careful with that.
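To see the extraction in isolation, here is a quick check against one of the sample names, plus a name with no date to show the caveat above:
$ echo file_two_2018-03-23.log | sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/'
2018-03-23
$ echo no_date_here.log | sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/'
no_date_here.log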
(hope that helped.)
By isolating the date string from the filenames with sed (your examples were format-consistent, so I used that) we pass it in and ask for the UNIX Epoch timestamp of that date string using date +%s -d $(...), to represent the file with a math-handy number.
Subtract that from the current date in the same format and you get the approximate age of the file in seconds. Divide that by the number of seconds in a day and you get the age in days. The file date will default to midnight, but the math drops fractions, so it sorts itself out.
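As a rough worked example (dates picked only for illustration), with "today" pinned to 2018-10-20 and a name containing 2018-03-23:
$ now=$(date -ud 2018-10-20 +%s)
$ filedate=$(date -ud 2018-03-23 +%s)
$ echo $(( ( now - filedate ) / 86400 ))
211
211 is well past 90, so that file would be reported as old.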
here's the file list I made, working from your examples
$: cat listOfFileNames
fileone.log.2018-03-23
fileone.log.2018-09-23
file_two_2018-03-23.log
file_two_2018-08-23.log
filethree.log.2018-03-23
filethree.log.2018-10-02
file_four_file_four_2018-03-23.log
file_four_file_four_2019-03-23.log
I added a file for each that would be within the 90 days as of this posting - including one that is "post-dated", which can easily happen with this sort of thing.
Here's the output.
$: ./chkDates
fileone.log.2018-03-23 is old
fileone.log.2018-09-23 is not
file_two_2018-03-23.log is old
file_two_2018-08-23.log is not
filethree.log.2018-03-23 is old
filethree.log.2018-10-02 is not
file_four_file_four_2018-03-23.log is old
file_four_file_four_2019-03-23.log is not
That what you had in mind?
An alternate pure-bash way to get just the date string
(You still need date to convert to the epoch seconds...)
instead of
filedate=$(
    date +%s -d $(
        echo $f | sed -E 's/.*([0-9]{4}-[0-9]{2}-[0-9]{2}).*/\1/'
    ) # this returns the date substring
)     # this converts it to an epoch integer of seconds since 1/1/70
which doesn't seem to be working for you, try this:
tmp=${f%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*} # unwanted prefix
d=${f#$tmp} # prefix removed
tmp=${f#*[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]} # unwanted suffix
filedate=${d%$tmp} # suffix removed
filedate=$( date +%s --date=$filedate ) # epoch time
This is hard to read, but doesn't have to spawn as many subprocesses to get the work done. :)
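If it helps to see what each expansion leaves behind, here is a quick trace with one of the sample filenames (the value each step produces is shown as a comment):
$ f=file_two_2018-03-23.log
$ tmp=${f%[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*}   # file_two_
$ d=${f#$tmp}                                            # 2018-03-23.log
$ tmp=${f#*[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]}   # .log
$ filedate=${d%$tmp}                                     # 2018-03-23
$ echo "$filedate"
2018-03-23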
If that doesn't work, then I'm suspicious of your version of date. Mine:
$: date --version
date (GNU coreutils) 8.26
UPDATE:
Simple Version:
A method for using the date inside the file's name:
typeset stamp=$(date --date="90 day ago" +%s)
for file in /directory/*.log; do
    fdate="$(echo "$file" | sed 's/[^0-9-]*//g')"
    fstamp=$(date -d "${fdate} 00:00:00" +"%s")
    if [ ${fstamp} -le ${stamp} ] ; then
        echo "${file} : ${fdate} (${fstamp})"
    fi
done
A More Complete Version:
This version will look at all files; if it fails to make a date value from a filename, it moves on.
typeset stamp=$(date --date="90 day ago" +%s)
for file in /tmp/* ; do
    fdate="$(echo "$file" | sed 's/[^0-9-]*//g')"
    fstamp=$(date -d "${fdate} 00:00:00" +"%s" 2> /dev/null)
    [[ $? -ne 0 ]] && continue
    if [ ${fstamp} -le ${stamp} ] ; then
        echo "${file} : ${fdate} (${fstamp})"
    fi
done
output:
/tmp/file_2016-05-23.log : 2016-05-23 (1463976000)
/tmp/file_2017-05-23.log : 2017-05-23 (1495512000)
/tmp/file_2018-05-23.log : 2018-05-23 (1527048000)
/tmp/file_2018-06-23.log : 2018-06-23 (1529726400)
/tmp/file_2018-07-23.log : 2018-07-23 (1532318400)
In this example, the following were ignored:
/tmp/file_2018-08-23.log : 2018-08-23 (1534996800)
/tmp/file_2018-10-18.log : 2018-10-18 (1539835200)

convert timestamp to date in bash

I have time logs in timestamp (Unix epoch time) format:
1515365117236
1515365123162
1515365139963
I would like to convert them to a regular date like
2017-01-07 23:48:01
2017-01-07 23:48:02
2017-01-07 23:48:03
Any ideas what approach would be the fastest?
cat ff1.csv | while read line ; do echo $line\;$(date -d +"%Y-%m-%d %H:%M:%S") ; done > somefile.csv
This takes an awfully long time and just appends the current time.
Another approach that should be much faster, using the printf builtin of bash version 4.2 or newer:
$ printf '%(datefmt)T\n' epoch
For datefmt you need a string accepted by strftime(3) - see man 3 strftime
Testing:
$ cat file10
1515365117236
1515365123162
1515365139963
$ printf '%(%F %H:%M:%S)T\n' $(cat file10)
49990-01-04 04:47:16
49990-01-04 06:26:02
49990-01-04 11:06:03
In this case, the printf format string is:
%F Equivalent to %Y-%m-%d (the ISO 8601 date format). (C99)
%H The hour as a decimal number using a 24-hour clock (range 00 to 23).(Calculated from tm_hour.)
%M The minute as a decimal number (range 00 to 59). (Calculated from tm_min.)
%S The second as a decimal number (range 00 to 60). (The range is up to 60 to allow for occasional leap seconds. Calculated from tm_sec.)
Update to remove milliseconds:
$ printf '%(%F %T)T\n' $(printf '%s/1000\n' $(<file10) |bc)
2018-01-08 00:45:17
2018-01-08 00:45:23
2018-01-08 00:45:39
The way to transform an epoch time to a date is date -d @epochtime +format.
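For a single value, that looks like this (GNU date; -u added so the output doesn't depend on your timezone):
$ date -ud @1515365117 +"%Y-%m-%d %H:%M:%S"
2018-01-07 22:45:17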
An alternative way is to use the date --file switch to read dates from a file directly.
$ cat file10
1515365117236
1515365123162
1515365139963
For date to understand that these lines are epoch times, you need to add @ at the beginning of each line.
This can be done like below:
$ sed -i 's/^/@/g' file10 # caution - this will make changes in your file
$ date --file file10 +"%Y-%m-%d %H:%M:%S"
Alternatively, you can do it on the fly without affecting the original file:
$ sed 's/^/@/g' file10 | date --file - +"%Y-%m-%d %H:%M:%S"
PS: in this case --file reads from - == stdin == pipe
In both cases, the result is
49990-01-04 04:47:16
49990-01-04 06:26:02
49990-01-04 11:06:03
PS: by the way, the timestamps you provided seem invalid as plain seconds, since they refer to the year 49990.
Your input data isn't plain Unix epoch time; it includes milliseconds. If you wish to use any method in bash, first you must convert it to a timestamp in seconds:
cat ff1.csv | while read LINE; do echo "@$(expr $LINE \/ 1000)" | date +"%Y-%m-%d %H:%M:%S" --file - ; done
First divide by 1000 to drop the milliseconds part; the rest is the same as George Vasiliou explains.
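If spawning expr per line bothers you, here is a minimal sketch of the same idea using bash arithmetic for the division (assuming the same ff1.csv input of millisecond timestamps, one per line):
while read -r ms; do
    date -d "@$(( ms / 1000 ))" +"%Y-%m-%d %H:%M:%S"
done < ff1.csv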

How could I use bash to work out how many tuesdays there are in a month? [duplicate]

I need to sort data on a weekly basis and all I have are dates in a logfile.
Therefore, to sort the data per week, I would like to create a list with the dates of all Mondays for a given year. I have tried to work something out, and the only idea I currently have is to use ncal with year and month as arguments, looping over all months and extracting all Mondays. Isn't there a more efficient way?
To get all Mondays, by generating all dates and filtering for Mondays:
for i in `seq 0 365`
do date -d "+$i day"
done | grep Mon
Of course, you could also take a Monday and keep incrementing by 7 days (there is a sketch of that below).
Hope that's what you mean. The date format can be changed to vary the output format of the dates.
The date command can be used for that; I don't know whether ncal is any more or less efficient.
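A minimal sketch of that alternative, assuming GNU date and hard-coding the first Monday of the year for brevity:
#!/bin/bash
d="2011-01-03"                             # first Monday of 2011 (find it however you like)
while [ "$(date -d "$d" +%Y)" = "2011" ]; do
    echo "$d"
    d=$(date -d "$d + 7 days" +%F)         # jump straight to the next Monday
done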
I know you went for "binning" now, but here is a more readable version.
$ cat /tmp/1.sh
#!/bin/bash
test -z "$year" && {
    echo "I expect you to set \$year environment variable"
    echo "In return I will display you the Mondays of this year"
    exit 1
}
# change me if you would like the date format to be different
# man date would tell you all the combinations you can use here
DATE_FORMAT="+%Y-%m-%d"
# change me if you change the date format above. I need to be
# able to extract the year from the date I'm showing you
GET_YEAR="s/-.*//"
# this value is a week, in seconds. Changing it would change
# what I'm doing.
WEEK_INC=604800
# Use another 3-digit week day name here, to see dates for other week days
DAY_OF_WEEK=Mon
# stage 1, let's find us the first day of the week in this year
d=1
# is it DAY_OF_WEEK yet?
while test "$(date -d ${year}-1-${d} +%a)" != "$DAY_OF_WEEK"; do
    # no, so let's look at the next day
    d=$((d+1));
done;
# let's ask for the epoch seconds of that DAY_OF_WEEK that I found above
umon=$(date -d ${year}-1-${d} +%s)
# let's loop until we break from inside
while true; do
    # ndate is the date that we are testing right now
    ndate=$(date -d @$umon "$DATE_FORMAT");
    # let's extract the year
    ny=$(echo $ndate|sed "$GET_YEAR");
    # did we go over this year? If yes, then break out
    test $ny -ne $year && { break; }
    # move on to next week
    umon=$((umon+WEEK_INC))
    # display the date so far
    echo "$ndate"
done
No need to iterate over all 365 or 366 days in the year. The following executes date at most 71 times.
#!/bin/bash
y=2011
for d in {0..6}
do
    if (( $(date -d "$y-1-1 + $d day" '+%u') == 1 ))   # +%w: Mon == 1 also
    then
        break
    fi
done
for (( w = d; w <= $(date -d "$y-12-31" '+%j') - 1; w += 7 ))
do
    date -d "$y-1-1 + $w day" '+%Y-%m-%d'
done
Output:
2011-01-03
2011-01-10
2011-01-17
2011-01-24
2011-01-31
2011-02-07
2011-02-14
2011-02-21
2011-02-28
2011-03-07
. . .
2011-11-28
2011-12-05
2011-12-12
2011-12-19
2011-12-26
Another option that I've come up with, based on the above answers. The start and end dates can now be specified.
#!/bin/bash
datestart=20110101
dateend=20111231
for tmpd in {0..6}
do
    date -d "$datestart $tmpd day" | grep -q Mon
    if [ $? = 0 ];
    then
        break
    fi
done
for (( tmpw = $tmpd; $(date -d "$datestart $tmpw day" +%s) <= $(date -d "$dateend" +%s); tmpw += 7 ))
do
    echo `date -d "$datestart $tmpw day" +%d-%b-%Y`
done
You can get the current week number using date. Maybe you can sort on that:
$ date +%W -d '2011-02-18'
07

BASH ERROR: syntax error: operand expected (error token is ")

I am new to bash scripting, and I'm having an issue with one of my scripts. I'm trying to compose a list of Drivers Under 25 after reading their birthdates in from a folder filled with XML files and calculating their ages. Once I have determined they are under 25, the filename of the driver's data is saved to a text file. The script is working up until a certain point and then it stops. The error I'm getting is:
gdate: extra operand ‘+%s’
Try 'gdate --help' for more information.
DriversUnder25.sh: line 24: ( 1471392000 - )/60/60/24 : syntax error: operand expected (error token is ")/60/60/24 ")
Here is my code:
#!/bin/bash
# define directory to search and current date
DIRECTORY="/*.xml"
CURRENT_DATE=$(date '+%Y%m%d')
# loop over files in a directory
for FILE in $DIRECTORY;
do
    # grab user's birth date from XML file
    BIRTH_DATE=$(sed -n '/Birthdate/{s/.*<Birthdate>//;s/<\/Birthdate.*//;p;}' $FILE)
    # calculate the difference between the current date
    # and the user's birth date (seconds)
    DIFFERENCE=$(( ( $(gdate -ud $CURRENT_DATE +'%s') - $(gdate -ud $BIRTH_DATE +'%s') )/60/60/24 ))
    # calculate the number of years between
    # the current date and the user's birth date
    YEARS=$(($DIFFERENCE / 365))
    # if the user is under 25
    if [ "$YEARS" -le 25 ]; then
        # save file name only
        FILENAME=`basename $FILE`
        # output filename to text file
        echo $FILENAME >> DriversUnder25.txt
    fi
done
I'm not sure why it correctly outputs the first 10 filenames and then stops. Any ideas why this may be happening?
You need to quote the expansion of $BIRTH_DATE to prevent word splitting on the whitespace in the value. (It is good practice to quote all your parameter expansions, unless you have a good reason not to, for this very reason.)
DIFFERENCE=$(( ( $(gdate -ud "$CURRENT_DATE" +'%s') - $(gdate -ud "$BIRTH_DATE" +'%s') )/60/60/24 ))
(Based on your comment, this would probably at least allow gdate to give you a better error message.)
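To see the word splitting in isolation: if a file contains more than one <Birthdate>, the sed output is multi-line, and the unquoted expansion hands gdate several separate words. A quick illustration with hypothetical values, using printf only to count the arguments:
$ BIRTH_DATE='19900101
19950615'
$ printf '<%s> ' $BIRTH_DATE; echo      # unquoted: split into two words
<19900101> <19950615>
$ printf '<%s> ' "$BIRTH_DATE"; echo    # quoted: one word, newline preserved
<19900101
19950615>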
A best-practices implementation would look something like this:
directory=/ # patch as appropriate
current_date_unix=$(date +%s)
for file in "$directory"/*.xml; do
    while IFS= read -r birth_date; do
        birth_date_unix=$(gdate -ud "$birth_date" +'%s')
        difference=$(( ( current_date_unix - birth_date_unix ) / 60 / 60 / 24 ))
        years=$(( difference / 365 ))
        if (( years < 25 )); then
            echo "${file%.*}"
        fi
    done < <(xmlstarlet sel -t -m '//Birthdate' -v . -n <"$file")
done >DriversUnder25.txt
If this script needs to be usable by folks who don't have xmlstarlet installed, you can generate an XSLT template and then use xsltproc (which is available out-of-the-box on modern operating systems).
That is to say, if you run this once, and bundle its output with your script:
xmlstarlet sel -C -t -m '//Birthdate' -v . -n >get-birthdays.xslt
...then the script can be modified to replace xmlstarlet with:
xsltproc get-birthdays.xslt - <"$file"
Notes:
The XML input files are being read with an actual XML parser.
When expanding for file in "$directory"/*.xml, the expansion is quoted but the glob is not (thus allowing the script to operate on directories with spaces, glob characters, etc. in their names).
The output file is being opened once, for the loop, rather than once per line of output (reducing overhead unnecessarily opening and closing files).
Lower-case variable names are in use to comply with POSIX conventions (specifying that variables with meaning to the operating system and shell have all-upper-case names, and that the set of names with at least one lower-case character is reserved for application use; while the docs in question are with respect to environment variables, shell variables share a namespace, making the convention relevant).
The issue was that there were multiple drivers in some files, thus importing multiple birth dates into the same string. My solution is below:
#!/bin/bash
# define directory to search and current date
DIRECTORY="/*.xml"
CURRENT_DATE=$(date '+%Y%m%d')
# loop over files in a directory
for FILE in $DIRECTORY;
do
    # set flag for output to false initially
    FLAG=false
    # grab user's birth date from XML file
    BIRTH_DATE=$(sed -n '/Birthdate/{s/.*<Birthdate>//;s/<\/Birthdate.*//;p;}' $FILE)
    # loop through birth dates in file (there can be multiple drivers)
    for BIRTHDAY in $BIRTH_DATE;
    do
        # calculate the difference between the current date
        # and the user's birth date (seconds)
        DIFFERENCE=$(( ( $(gdate -ud $CURRENT_DATE +'%s') - $(gdate -ud $BIRTHDAY +'%s') )/60/60/24 ))
        # calculate the number of years between
        # the current date and the user's birth date
        YEARS=$(($DIFFERENCE / 365))
        # if the user is under 25
        if [ "$YEARS" -le 25 ]; then
            # save file name only
            FILENAME=`basename $FILE`
            # set flag to true (driver is under 25 years of age)
            FLAG=true
        fi
    done
    # if there is a driver under 25 in the file
    if [ "$FLAG" = true ]; then
        # output filename to text file
        echo $FILENAME >> DriversUnder25.txt
    fi
done

Add mins in existing custom date format via shell

I have a problem. I am using a custom date format like date +%y%j.%H%M%S.
What I want is to add 15 minutes to this date (just the current date of the system), so that I can use it for further calculations in my process.
I have tried the code below:
$uprBond=`date +%y%j.%H%M%S`
$ echo $uprBond
16079.031135
$ date -d "$(uprBond) + 5 minutes" +%y%j.%H%M%S
op > bash: uprBond: command not found
16079.035920
I am failing while passing the above date format. Can anybody please help with this?
Just as a note, the piece of code below works when I use the date command directly instead of a predefined date variable, i.e. $uprBond (I don't want to use only the current date, because we have some old dates in the same format that need the minutes added).
date +%y%j.%H%M%S -d "`date` + 5 minutes";
op > 16079.040724
With GNU date, GNU bash 4 and its Parameter Expansion:
#!/bin/bash
uprBond="$(date +%y%j.%H%M%S)"
year="20${uprBond:0:2}"
doy="${uprBond#${uprBond:0:2}}"
doy="${doy%.*}"
time="${uprBond#*.}"
time="${time:0:2}:${time:2:2}:${time:4:2}"
in5min=$(date -d "${year}-01-01 +${doy} days -1 day +5 minutes ${time}" "+%y%j.%H%M%S")
echo "now: $uprBond"
echo "in 5min: $in5min"
Output:
now: 16079.145026
in 5min: 16079.145526
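Since the question also mentions applying this to previously stored values in the same format, the same parsing works when you start from a saved string instead of calling date. Here is a small sketch using the stored value from the question and the 15-minute offset that was actually asked about; the ${uprBond:2:3} slice is an assumed shortcut equivalent to the prefix/suffix trimming above, since %j is always three digits:
#!/bin/bash
uprBond="16079.031135"                        # a previously stored %y%j.%H%M%S value (from the question)
year="20${uprBond:0:2}"                       # -> 2016
doy="${uprBond:2:3}"                          # -> 079 (day of year, %j is zero-padded to 3 digits)
time="${uprBond#*.}"
time="${time:0:2}:${time:2:2}:${time:4:2}"    # -> 03:11:35
in15min=$(date -d "${year}-01-01 +${doy} days -1 day +15 minutes ${time}" "+%y%j.%H%M%S")
echo "stored:   $uprBond"
echo "in 15min: $in15min"                     # should print 16079.032635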
