this is an example of my data
ip=1.2.3.4, setup_time=05:58:38.617 GMT Tue Mar 16 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 16 2021, foo=moshe2, bar=haim2
I would like to be able to sort by the setup_time column in bash. As far as I know I can't use plain sort, because sort compares strings and this timestamp is not in a format like YYYY-MM-DD HH:mm:ss, so string sorting does not give chronological order.
so any ideas would be greatly appreciated.
thank you
#update
OK, to make it clearer what I'm trying to achieve, I created the following file named 1:
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Mar 16 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 17 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2021, foo=moshe2, bar=haim2
so I executed this:
cat 1 | sed 's/, /!/g' | sort -t '!' -k2,2
What I did here is replace ", " with "!" so that I can use it as a delimiter in sort. The problem is that sort does string sorting and not timestamp sorting, so the output is:
ip=2.3.4.5!setup_time=05:59:30.260 GMT Tue Mar 17 2021!foo=moshe2!bar=haim2
ip=2.3.4.5!setup_time=06:50:30.260 GMT Tue Mar 18 2021!foo=moshe2!bar=haim2
ip=1.2.3.4!setup_time=06:58:38.617 GMT Tue Mar 16 2021!foo=moshe!bar=haim
sort is able to deal with month names, thanks to the M option.
There is no need to change ", " into "!". Use whitespace as the delimiter and just issue:
LC_ALL=en sort -k7nr -k5Mr -k6nr -k2r sample
If you use this as content of the file sample:
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Apr 1 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 17 2021, foo=moshe2, bar=haim2
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Mar 16 2021, foo=moshe, bar=haim
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Feb 28 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2020, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2021, foo=moshe2, bar=haim2
you will get this as output:
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Apr 1 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 17 2021, foo=moshe2, bar=haim2
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Mar 16 2021, foo=moshe, bar=haim
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Feb 28 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2020, foo=moshe2, bar=haim2
Specifying -k7 means to sort on the seventh field. The r option reverses the sort order to descending. The M option sorts according to the name of the month. The n option sorts numerically. To sort on the time, just treat the whole second field (beginning with the string setup_time=) as a fixed-length string using -k2.
LC_ALL=en at the beginning of the command line tells the system to use the English names of the months.
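As a self-contained sketch of the same idea, the command can be wrapped in a function and tried on inline data. The function name sort_by_setup_time is made up for this illustration, each key is limited to a single field (-k7,7 rather than -k7), and LC_ALL=C is assumed to give English month abbreviations. The M option is an extension found in GNU and BSD sort rather than something POSIX requires.

```shell
#!/usr/bin/env bash
# Hypothetical helper: newest entries first.
# Whitespace-separated fields: 2 = setup_time=HH:MM:SS.mmm, 5 = month name,
# 6 = day of month, 7 = year (the trailing comma is ignored by -n).
sort_by_setup_time() {
    LC_ALL=C sort -k7,7nr -k5,5Mr -k6,6nr -k2,2r
}

sorted=$(sort_by_setup_time <<'EOF'
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 17 2021, foo=moshe2, bar=haim2
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Feb 28 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Apr 1 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2020, foo=moshe2, bar=haim2
EOF
)
printf '%s\n' "$sorted"
```

Limiting each key to one field keeps later fields from leaking into the comparison when two lines tie on a key.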
A solution involving awk:
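awk '
{
# strip the trailing comma from the year field
year = substr($7, 1, length($7)-1)
# build the date command; "year" is an awk variable, so no leading $
cmd = "date --date=\"" $3 " " $4 " " $5 " " $6 " " year "\" +%s"
cmd | getline var
print var, $0
close(cmd)
}' file | sort -n -k1,1 | cut -f 2- -d' '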
The trick is that date --date="GMT Tue Mar 18 2021" will parse the date heuristically (meaning it will also work with date --date="GMT Tue 18 Mar 2021"), and then you can print only the seconds since epoch.
awk will output the seconds as first column, you sort by it, then you remove the first column from the result.
Biggest advantage of this solution is that it will work for other types of date formats (within reason of course).
Note1: for this to work you need GNU date (on Mac OS gdate, for example)
Note2: instead of awk you could use also bash with while/read (as in Read a file line by line assigning the value to a variable), but awk is rather standard, so not sure if it is a big difference for you.
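The while/read variant mentioned in Note2 could look roughly like the sketch below. It assumes GNU date and bash; the function name sort_by_epoch is invented for this example. Unlike the awk version it also passes the time of day to date, and it leaves the weekday out, because GNU date can move the result forward to the named weekday when the weekday and the date disagree (as they do in the sample data).

```shell
#!/usr/bin/env bash
# Decorate / sort / undecorate using GNU date (an assumption; gdate on macOS).
sort_by_epoch() {
    local line ts t zone weekday mon day year key
    while IFS= read -r line; do
        ts=${line#*setup_time=}   # "06:58:38.617 GMT Tue Mar 16 2021, foo=..."
        ts=${ts%%,*}              # "06:58:38.617 GMT Tue Mar 16 2021"
        read -r t zone weekday mon day year <<< "$ts"
        # Seconds and nanoseconds since the epoch as one integer sort key;
        # the weekday is deliberately not passed to date.
        key=$(date -u -d "$day $mon $year $t $zone" +%s%N) || continue
        printf '%s\t%s\n' "$key" "$line"
    done | LC_ALL=C sort -t $'\t' -k1,1n | cut -f2-
}

sorted_lines=$(sort_by_epoch <<'EOF'
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2021, foo=moshe2, bar=haim2
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Mar 16 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 17 2021, foo=moshe2, bar=haim2
EOF
)
printf '%s\n' "$sorted_lines"
```

Lines that date cannot parse are skipped rather than aborting the whole run, which is a design choice of this sketch and not part of the original answer.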
If you have a sort with month name support -- use that. Pierre's solution is elegant!
If you don't, convert the date to ISO 8601 (which sorts lexicographically) and use a Schwartzian transform or a Decorate / Sort / Undecorate pattern.
The easiest approach, since the date you have is non-standard, is to use Perl to decorate, sort to sort on the first field, then cut to undecorate (remove the added field):
perl -lnE '
BEGIN{
%m2n = qw(Jan 01 Feb 02 Mar 03 Apr 04 May 05 Jun 06
Jul 07 Aug 08 Sep 09 Oct 10 Nov 11 Dec 12
);}
m/setup_time=([\d:]+).*?(\w\w\w) (\d\d?) (\d\d\d\d),/;
$mon=$m2n{$2};
say sprintf("%s%s%02d%s\t%s", $4, $mon, $3, $1, $_)' YourFile | sort -t $'\t' -r -k1,1 | cut -d $'\t' -f2-
Using pierre's data, prints:
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Apr 1 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2021, foo=moshe2, bar=haim2
ip=2.3.4.5, setup_time=05:59:30.260 GMT Tue Mar 17 2021, foo=moshe2, bar=haim2
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Mar 16 2021, foo=moshe, bar=haim
ip=1.2.3.4, setup_time=06:58:38.617 GMT Tue Feb 28 2021, foo=moshe, bar=haim
ip=2.3.4.5, setup_time=06:50:30.260 GMT Tue Mar 18 2020, foo=moshe2, bar=haim2
I have a list of files with the substring YYYYMMDDHH in them (example: 2016112200 means November 22nd, 2016 at 00 hours). These files are: temp_2016102200.data, temp_2016102212.data, temp_2016102300.data, temp_2016102312.data, ..., temp_20170301.data. I also have another family of files with wind in place of temp.
For each string YYYYMMDDHH I want to create a tar with the temp file and its corresponding wind file. I don't want this process to stop if one or both files are missing.
My idea was to loop in 12-hour steps, but I am having some problems because to specify the date I did: b=$(date -d '2016111400' +'%Y%m%d%H') but bash informs me that that is not a valid date...
Thanks.
It's not bash telling you the date format is wrong: date is telling you. Not everything you type is a bash command.
As Kamil comments, you have to split it up so that date can parse it. The YYYY-mm-dd HH:MM:SS format is parsable. Using bash parameter expansion to extract the relevant substrings:
$ d=2016111400
$ date -d "${d:0:4}-${d:4:2}-${d:6:2} ${d:8:2}:00:00"
Mon Nov 14 00:00:00 EST 2016
Now, when you want to add 12 hours, you have to be careful to do it in the right place in the datetime string: if you add a + character after the time, it will be parsed as a timezone offset, so put the relative part either first or between the date and the time.
$ date -d "+12 hours ${d:0:4}-${d:4:2}-${d:6:2} ${d:8:2}:00:00"
Mon Nov 14 12:00:00 EST 2016
As a loop, you could do:
d=2016111400
for ((i=1; i<=10; i++)); do
# print this datetime
date -d "${d:0:4}-${d:4:2}-${d:6:2} ${d:8:2}:00:00"
# add 12 hours
d=$( date -d "+12 hours ${d:0:4}-${d:4:2}-${d:6:2} ${d:8:2}:00:00" "+%Y%m%d%H" )
done
outputs:
Mon Nov 14 00:00:00 EST 2016
Mon Nov 14 12:00:00 EST 2016
Tue Nov 15 00:00:00 EST 2016
Tue Nov 15 12:00:00 EST 2016
Wed Nov 16 00:00:00 EST 2016
Wed Nov 16 12:00:00 EST 2016
Thu Nov 17 00:00:00 EST 2016
Thu Nov 17 12:00:00 EST 2016
Fri Nov 18 00:00:00 EST 2016
Fri Nov 18 12:00:00 EST 2016
OK, a "nicer" way to loop
start=2019043000
end=2019050300
plus12hours() {
local d=$1
date -d "+12 hours ${d:0:4}-${d:4:2}-${d:6:2} ${d:8:2}:00:00" "+%Y%m%d%H"
}
for (( d = start; d <= end; d = $(plus12hours "$d") )); do
printf "%d\t%s\n" "$d" "$(date -d "${d:0:4}-${d:4:2}-${d:6:2} ${d:8:2}:00:00")"
done
2019043000 Tue Apr 30 00:00:00 EDT 2019
2019043012 Tue Apr 30 12:00:00 EDT 2019
2019050100 Wed May 1 00:00:00 EDT 2019
2019050112 Wed May 1 12:00:00 EDT 2019
2019050200 Thu May 2 00:00:00 EDT 2019
2019050212 Thu May 2 12:00:00 EDT 2019
2019050300 Fri May 3 00:00:00 EDT 2019
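Putting the loop together with the original goal, a minimal sketch might look like the following. Several things here are assumptions and not from the question: the archive name bundle_YYYYMMDDHH.tar, the function name make_tars, treating the timestamps as UTC (date -u, which avoids daylight-saving jumps), and archiving whichever of the two files exists while skipping a timestamp only when both are missing.

```shell
#!/usr/bin/env bash
# Advance a YYYYMMDDHH stamp by 12 hours (GNU date assumed, stamps taken as UTC).
plus12hours() {
    local d=$1
    date -u -d "+12 hours ${d:0:4}-${d:4:2}-${d:6:2} ${d:8:2}:00:00" "+%Y%m%d%H"
}

# make_tars START END DIR (hypothetical helper)
# For each stamp, archive temp_<stamp>.data and wind_<stamp>.data if present.
make_tars() {
    local d=$1 end=$2 dir=$3 f
    local -a present
    while (( d <= end )); do
        present=()
        for f in "temp_$d.data" "wind_$d.data"; do
            [[ -f $dir/$f ]] && present+=( "$f" )
        done
        if (( ${#present[@]} > 0 )); then
            tar -cf "$dir/bundle_$d.tar" -C "$dir" "${present[@]}"
        else
            # Nothing to archive: report and carry on instead of stopping.
            printf 'skipping %s: no files found\n' "$d" >&2
        fi
        d=$(plus12hours "$d") || return 1
    done
}

# Example call (commented out so the sketch has no side effects):
# make_tars 2016102200 2017030100 /path/to/data
```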
I want to add 10 seconds 10 times, but I don't know how to add time to the value.
This is my code.
./time.sh
time=$(date)
counter=1
while [ $counter -le 10 ]
do
echo "$time"
time=$('$time + 10 seconds') //error occurred.
((counter++))
done
echo All done
Using GNU Date
Assuming GNU date, replace:
time=$('$time + 10 seconds')
with:
time=$(date -d "$time + 10 seconds")
Putting it all together, try:
$ cat a.sh
t=$(date)
counter=1
while [ "$counter" -le 10 ]
do
echo "$t"
t=$(date -d "$t + 10 seconds")
((counter++))
done
echo All done
(I renamed time to t because time is also a bash reserved word and it is best to avoid potential confusion.)
When run, the output looks like:
$ bash a.sh
Tue Jan 16 19:19:44 PST 2018
Tue Jan 16 19:19:54 PST 2018
Tue Jan 16 19:20:04 PST 2018
Tue Jan 16 19:20:14 PST 2018
Tue Jan 16 19:20:24 PST 2018
Tue Jan 16 19:20:34 PST 2018
Tue Jan 16 19:20:44 PST 2018
Tue Jan 16 19:20:54 PST 2018
Tue Jan 16 19:21:04 PST 2018
Tue Jan 16 19:21:14 PST 2018
All done
Using Bash (>=4.2)
Recent versions of bash support date calculations without external utilities. Try:
$ cat b.sh
#!/bin/bash
printf -v t '%(%s)T' -1
counter=1
while [ "$counter" -le 10 ]
do
((t=t+10))
printf '%(%c)T\n' "$t"
((counter++))
done
echo All done
Here, t is time since epoch in seconds.
When run, the output looks like:
$ bash b.sh
Tue 16 Jan 2018 07:31:44 PM PST
Tue 16 Jan 2018 07:31:54 PM PST
Tue 16 Jan 2018 07:32:04 PM PST
Tue 16 Jan 2018 07:32:14 PM PST
Tue 16 Jan 2018 07:32:24 PM PST
Tue 16 Jan 2018 07:32:34 PM PST
Tue 16 Jan 2018 07:32:44 PM PST
Tue 16 Jan 2018 07:32:54 PM PST
Tue 16 Jan 2018 07:33:04 PM PST
Tue 16 Jan 2018 07:33:14 PM PST
All done
I know how to delete files that are more than 60 days old, but I also have to satisfy the conditions below. Please help me get a correct script to automate this.
I have the files below for each day, on a monthly basis, going back 3 years.
vtm_data_12month_20140301.txt
vtm_data_12month_20140301.control
vtm_mtd_20130622.txt
vtm_mtd_20130622.control
vtm_ytd_20131031.txt
vtm_ytd_20131031.control
I'd like to write a script that finds all files that are more than 60 days old and deletes them, except for the last file of each month.
Suppose that for January I want to keep the last (latest) file, vtm_data_12month_20140131.txt, and delete the other 30 files. The issue is that the latest file I received might be from January 30th; in that case I should keep that latest file and delete the rest.
Please advise me on how to achieve this with a shell script. Your response is highly appreciated.
There are many ways to do this. The two primary approaches are to (1) use the actual file date to decide whether a file is removed, or (2) use the date embedded in the filename. Both have advantages and pitfalls. What you seem to be asking for is approach (2): remove files more than 60 days older than the latest date embedded in the filenames.
As you have indicated, you may have several files with dates close to the end of the month, so you may need to adjust the end date. Rather than having the script search for the maximum date string contained in the filenames, you can prompt for the end date to measure 60 days back from. Otherwise, just scan each embedded date, find the maximum, and subtract 60 days from it. The following script prompts for an end date.
In fact, the following script contains code to remove files by both methods. The code that removes files based on the actual file modification date ( (1) above ) is commented out below the code that uses the embedded date. Look over the script and understand what it does. It is fairly well commented. NOTE: the actual rm command is commented out to prevent accidents (even though the script requires you to enter YES to confirm removal). Uncomment the rm line to actually remove files. Drop a comment if you have questions:
#!/bin/bash
oifs="$IFS" # save current IFS (internal field separator) (default ' \t\n')
IFS=$'\n' # set IFS to only break on newline
## prompt for path containing files & read
printf "\n enter the path to files to remove (no ending '/'): "
read -r rmpath
## validate directory
[ -d "$rmpath" ] || { printf "\nerror: bad path '%s'\n\n" "$rmpath"; exit 1; }
## prompt for ending date of files to keep
printf "\n enter the _end_ date of files to keep 'yyyymmdd' : "
read -r enddatestr
IFS="$oifs" # reset IFS to original
enddt=$(date -d "$enddatestr" +%s) # get enddt in seconds since epoch
enddt=$((enddt - (60 * 24 * 3600))) # subtract 60 days
declare -a rmarray
## Using embedded filename date
mdate=$(date -d "@$enddt" +%Y%m%d) # get mdate string to compare to filename
## fill rmarray with file dates older than mdate
for i in $(find "$rmpath" -maxdepth 1 -type f); do
ffname="${i##*/}" # full filename component
fname=${ffname%.*} # filename w/o extension
fdate="${fname##*_}" # get file date string
## if fdate before mdate, add to remove array
[ "$mdate" -gt "$fdate" ] && rmarray+=( "$i" )
done
# ### Using actual file modification date
# tgtfile=/tmp/tgt_$(date +%s) # tmp filename to measure against
#
# ## create temp file to measure against with find & set trap to remove
# touch -t $(date -d "@${enddt}" +%Y%m%d%H%M.%S) "$tgtfile" &&
# trap 'rm -rf "$tgtfile"' 0
#
# ## fill array with filenames to remove
# rmarray=( $(find "$rmpath" -maxdepth 1 -type f ! -newer $tgtfile) )
## verify files are contained in rmarray
[ "${#rmarray[@]}" -lt 1 ] && {
printf "\n No files matched the dates for removal.\n\n"
exit 1
}
## print files that will be removed
printf "\n ** the following files will be removed **\n\n"
for i in "${rmarray[@]}"; do
ls -al "$i"
done
## prompt for actual removal
printf "\n Continue with ACTUAL removal (YES to remove) : "
read -r ans
if [ "$ans" = "YES" ]; then
for i in "${rmarray[@]}"; do
: # no-op so the loop body is not empty while 'rm' is commented out
# rm "$i" # NOTE: 'rm' is commented, uncomment to really delete
done
else
printf "\n You entered '%s' (not YES), no removal performed.\n\n" "$ans"
fi
exit 0
test directory:
$ ls -l dat/fstst
total 0
-rw-r--r-- 1 david david 0 Nov 27 01:10 vtm_data_12month_20140301.control
-rw-r--r-- 1 david david 0 Nov 27 01:10 vtm_data_12month_20140301.txt
-rw-r--r-- 1 david david 0 Nov 27 01:10 vtm_mtd_20130622.control
-rw-r--r-- 1 david david 0 Nov 27 01:10 vtm_mtd_20130622.txt
-rw-r--r-- 1 david david 0 Nov 27 01:10 vtm_ytd_20131031.control
-rw-r--r-- 1 david david 0 Nov 27 01:10 vtm_ytd_20131031.txt
use:
$ bash rmfiles_60days.sh
enter the path to files to remove (no ending '/'): dat/fstst
enter the _end_ date of files to keep 'yyyymmdd' : 20140301
** the following files will be removed **
-rw-r--r-- 1 david david 0 Nov 27 01:10 dat/fstst/vtm_mtd_20130622.txt
-rw-r--r-- 1 david david 0 Nov 27 01:10 dat/fstst/vtm_ytd_20131031.control
-rw-r--r-- 1 david david 0 Nov 27 01:10 dat/fstst/vtm_ytd_20131031.txt
-rw-r--r-- 1 david david 0 Nov 27 01:10 dat/fstst/vtm_mtd_20130622.control
Continue with ACTUAL removal (YES to remove) : YES
result:
$ ls -l dat/fstst
total 0
-rw-r--r-- 1 david david 0 Nov 27 01:10 vtm_data_12month_20140301.control
-rw-r--r-- 1 david david 0 Nov 27 01:10 vtm_data_12month_20140301.txt
The following is an example using the actual file date:
test directory:
$ ls -l dat/tst
total 324
-rw-r--r-- 1 david david 74 Sep 9 01:23 1.txt
-rw-r--r-- 1 david david 74 Sep 9 01:23 2.txt
-rw-r--r-- 1 david david 201 Aug 1 03:47 3line.dat
-rw-r--r-- 1 david david 205 Aug 1 03:35 3line.dat.sav
-rw-r--r-- 1 david david 88 Aug 13 04:05 catfile.txt
-rw-r--r-- 1 david david 39 Jul 4 14:40 comma
-rw-r--r-- 1 david david 291 Sep 23 03:00 createfile.txt
-rw-r--r-- 1 david david 11 Jul 17 03:54 data.dat
-rw-r--r-- 1 david david 8 Jul 17 03:54 datb.dat
-rw-r--r-- 1 david david 369 Oct 2 14:25 dia.txt
-rw-r--r-- 1 david david 36 Nov 6 15:51 dicta.dat
-rw-r--r-- 1 david david 23895 Sep 9 17:14 dna.dat
-rw-r--r-- 1 david david 243 Nov 4 23:07 domain.dat
-rw-r--r-- 1 david david 276 Nov 23 00:32 ecread.dat
(snip)
use:
$ bash rmfiles_60days.sh
enter the path to files to remove (no ending '/'): dat/tst
enter the _end_ date of files to keep 'yyyymmdd' : 20141031
** the following files will be removed **
-rw-r--r-- 1 david david 205 Aug 1 03:35 dat/tst/3line.dat.sav
-rw-r--r-- 1 david david 29 Jun 29 02:23 dat/tst/f1f2.dat
-rw-r--r-- 1 david david 8 Jul 17 03:54 dat/tst/datb.dat
-rw-r--r-- 1 david david 60 Jul 27 23:24 dat/tst/vowels.txt
-rw-r--r-- 1 david david 134 Aug 11 00:32 dat/tst/outfile.txt
-rw-r--r-- 1 david david 4622 Jun 26 02:49 dat/tst/single.xml
-rw-r--r-- 1 david david 99 Jul 4 14:51 dat/tst/hostnm
-rw-r--r-- 1 david david 115 Aug 7 01:35 dat/tst/ltags.txt
-rw-r--r-- 1 david david 122 Aug 29 11:11 dat/tst/hh.dat
-rw-r--r-- 1 david david 509 Jul 21 17:28 dat/tst/orders.txt
-rw-r--r-- 1 david david 205 Jun 27 01:06 dat/tst/table.html
(snip)
Continue with ACTUAL removal (YES to remove) : YES
result:
$ ls -l dat/tst
total 168
-rw-r--r-- 1 david david 74 Sep 9 01:23 1.txt
-rw-r--r-- 1 david david 74 Sep 9 01:23 2.txt
-rw-r--r-- 1 david david 291 Sep 23 03:00 createfile.txt
-rw-r--r-- 1 david david 369 Oct 2 14:25 dia.txt
-rw-r--r-- 1 david david 36 Nov 6 15:51 dicta.dat
-rw-r--r-- 1 david david 23895 Sep 9 17:14 dna.dat
-rw-r--r-- 1 david david 243 Nov 4 23:07 domain.dat
-rw-r--r-- 1 david david 276 Nov 23 00:32 ecread.dat
-rw-r--r-- 1 david david 93 Nov 2 21:43 empdata.dat
(snip)
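The script above does not yet cover the "keep the last file of each month" requirement from the question. One possible way to select the files, sketched here under several assumptions: bash 4 or later (associative arrays), filenames of the form prefix_yyyymmdd.ext, and "last file of the month" meaning the newest embedded date per prefix, extension and month. The function name list_removable is hypothetical, and it only prints candidates without deleting anything.

```shell
#!/usr/bin/env bash
# list_removable CUTOFF < filenames
# Prints names whose embedded yyyymmdd is before CUTOFF, except the newest
# file of each month within each family (same prefix and extension).
list_removable() {
    local cutoff=$1 name base date key
    local re='^(.*)_([0-9]{8})(\..*)$'
    local -A newest=()
    local -a names=()
    # Pass 1: remember the newest date per family and month.
    while IFS= read -r name; do
        base=${name##*/}
        [[ $base =~ $re ]] || continue
        names+=( "$name" )
        date=${BASH_REMATCH[2]}
        key="${BASH_REMATCH[1]}${BASH_REMATCH[3]}:${date:0:6}"
        if [[ -z ${newest[$key]:-} || $date > ${newest[$key]} ]]; then
            newest[$key]=$date
        fi
    done
    # Pass 2: print files older than the cutoff that are not a month's newest.
    for name in "${names[@]}"; do
        base=${name##*/}
        [[ $base =~ $re ]] || continue
        date=${BASH_REMATCH[2]}
        key="${BASH_REMATCH[1]}${BASH_REMATCH[3]}:${date:0:6}"
        if [[ $date < $cutoff && $date != "${newest[$key]}" ]]; then
            printf '%s\n' "$name"
        fi
    done
    return 0
}

# Demo: January's latest file is the 30th here, and it is kept.
removable=$(list_removable 20140220 <<'EOF'
vtm_mtd_20140101.txt
vtm_mtd_20140129.txt
vtm_mtd_20140130.txt
vtm_mtd_20140215.txt
vtm_mtd_20140301.txt
EOF
)
printf '%s\n' "$removable"
```

The cutoff can be computed the same way as mdate in the script above, and the printed names can be reviewed before anything is passed to rm.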
i have a file with:
....1342477599376
1342479596867
1342480248580
1342480501995
1342481198309
1342492256524
1342506099378....
Here the "...." stands for various characters. I'd like to read this file with cat (it is essential that I use cat), extract these numbers with sed, and then convert each epoch value to a date:
cat myfile.log | sed '...*//' | sed 's/...*//' | date -d @$1
Unfortunately this doesn't work.
One way, using sed:
cat file.txt | sed "s/^.*\([0-9]\{13\}\).*/date -d @\1/" | sh
Results:
Thu Jun 4 14:16:16 EST 44511
Sat Jun 27 17:07:47 EST 44511
Sun Jul 5 06:09:40 EST 44511
Wed Jul 8 04:33:15 EST 44511
Thu Jul 16 05:58:29 EST 44511
Sat Nov 21 05:42:04 EST 44511
Fri Apr 29 10:56:18 EST 44512
(The year comes out as 44511 because the 13-digit values are milliseconds since the epoch, while date -d @N expects seconds. Dropping the last three digits gives dates in July 2012.)
HTH
This is a similar solution, but it will find a timestamp anywhere in the stream:
cat test.txt | sed 's/^/echo "/; s/\([0-9]\{13\}\)/`date -d @\1`/; s/$/"/' | bash
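If the 13-digit values are milliseconds since the epoch (an assumption, though their magnitude suggests it), they can be divided by 1000 before being handed to date. The sketch below does this in bash without generating shell code from the input; the function name ms_to_date is made up, GNU date is assumed, and the output is forced to UTC so that it does not depend on the local time zone.

```shell
#!/usr/bin/env bash
# Print the first 13-digit millisecond timestamp on each input line as a UTC date.
ms_to_date() {
    local line ms
    while IFS= read -r line; do
        [[ $line =~ [0-9]{13} ]] || continue   # skip lines without a timestamp
        ms=${BASH_REMATCH[0]}
        # 10# forces base 10; integer division drops the milliseconds.
        date -u -d "@$(( 10#$ms / 1000 ))" '+%Y-%m-%d %H:%M:%S'
    done
}

# Works at the end of a cat pipeline, e.g.: cat myfile.log | ms_to_date
printf '%s\n' 'abc1342477599376xyz' | ms_to_date   # prints 2012-07-16 22:26:39
```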