Pick oldest file on the basis of the date in the file name - bash

I am stuck in a situation where I have a bunch of files and need to pick the oldest one based only on the time present in the name, not on the file's timestamp: I am copying the files from one system to another with SCP, so the modification timestamps are the same for all the files once the SCP runs.
I have files like
UAT-2019-03-21-16-31.csv
UAT-2019-03-21-17-01.csv
AIT-2019-03-21-17-01.csv
Here, 2019 represents the year, 03 the month, 21 the day, 16 the hours in 24-hour format, and 31 the minutes.
I need to pick the UAT-2019-03-21-16-31.csv file from the above files first.
How can I do this in shell scripting?
I tried ls -1, but it sorts alphabetically, which means AIT-2019-03-21-17-01.csv would be picked first; I need the files ordered by the time mentioned in the file name.

You can try this:
ls -1 | sort -t"-" -k2 -k3 -k4 -k5 -k6 | head -n1
Output :
UAT-2019-03-21-16-31.csv
Curious about alternative answers, as I know that parsing ls output is not ideal.
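One alternative that avoids parsing ls: let the shell expand the glob and hand the names to sort via printf (a sketch; it still assumes the file names contain no newlines):

```shell
# Sort by the date fields embedded in the name: after splitting on "-",
# fields 2-6 are year, month, day, hour and minute.
printf '%s\n' *.csv | sort -t- -k2,2n -k3,3n -k4,4n -k5,5n -k6,6n | head -n1
```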

An efficient way to do this is to convert the filename timestamp to epoch time and find the oldest among them.
You need to write a script that does the following, in order:
Get all the filename timestamp into a variable.
Convert all filename timestamp to epoch time.
Find the oldest and get the filename.
The command to convert a filename timestamp to epoch time would be:
date -d"2019-03-21T17:01" +%s
date -d"YYYY-MM-DDTHH:MM" +%s
You can try these steps in a script.
Hope this helps you start writing it.
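The steps above might be sketched like this (an outline only; it assumes GNU date and names of the PREFIX-YYYY-MM-DD-HH-MM.csv shape shown in the question):

```shell
#!/bin/bash
# Pick the file whose name carries the oldest embedded timestamp.
oldest_epoch=
oldest_file=
for f in *-????-??-??-??-??.csv; do
    [ -e "$f" ] || continue
    ts=${f%.csv}            # drop the extension
    ts=${ts#*-}             # drop the prefix, leaving 2019-03-21-16-31
    IFS=- read -r y mo d h mi <<<"$ts"
    epoch=$(date -d "$y-$mo-$d $h:$mi" +%s)   # filename time -> epoch
    if [ -z "$oldest_epoch" ] || [ "$epoch" -lt "$oldest_epoch" ]; then
        oldest_epoch=$epoch
        oldest_file=$f
    fi
done
echo "$oldest_file"
```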

Related

tail a log file from a specific line number

I know how to tail a text file with a specific number of lines,
tail -n50 /this/is/my.log
However, how do I make that line count a variable?
Let's say I have a large log file which is appended to daily by some program, all lines in the log file start with a datetime in this format:
Day Mon YY HH:MM:SS
Every day I want to output the tail of the log file, but only for the previous day's records. Let's say this output runs just after midnight; I'm not worried about the tail spilling over into the next day.
I just want to be able to work out how many rows to tail, based on the first occurrence of yesterday's date...
Is that possible?
Answering the question in the title, for anyone who comes here that way: head and tail can both accept an offset for how much of the file to exclude.
For tail, use -n +num for the line number num to start at
For head, use -n -num for the number of lines not to print
This is relevant to the actual question if you remembered the number of lines from the previous time you ran the command; you can then use tail -n +$prevlines to get the next portion of the partial log, regardless of how often the log is checked.
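For example, with the question's log path and a known line number:

```shell
# Print from line 51 onward, i.e. skip the first 50 lines:
tail -n +51 /this/is/my.log
# Print everything except the last 50 lines:
head -n -50 /this/is/my.log
```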
Answering the actual question: one way to print everything after a certain line that you can grep for is to use the -A option with a ridiculously large count. This may be more useful than the other answers here, as you can get several days of results. So, to get everything from yesterday and so far today:
grep "^`date -d yesterday '+%d %b %y'`" -A1000000 log_file.txt
You can combine 2 greps to print between 2 date ranges.
Note that this relies on the date actually occurring in the log file. It has the weakness that if no events were logged on a particular day used as the range marker, then it will fail to find anything.
To resolve that, you could inject dummy records for the start and end dates and sort the file before grepping. This is probably overkill, though, and the sort may be expensive, so I won't give an example.
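A sketch of the two-grep combination mentioned above (assumes GNU date and log lines that start with a "DD Mon YY" stamp; both boundary dates must actually occur in the log):

```shell
# Everything from the first line stamped two days ago through the last
# line stamped with yesterday's date:
grep "^$(date -d '2 days ago' '+%d %b %y')" -A1000000 log_file.txt |
    grep -B1000000 "^$(date -d yesterday '+%d %b %y')"
```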
I don't think tail has any functionality like this.
You could work out the beginning and ending line numbers using awk, but if you just want to extract those lines from the log file, the simplest way is probably to use grep combined with date. Matching yesterday's date at the beginning of the line should work:
grep "^`date -d yesterday '+%d %b %y'`" < log_file.txt
You may need to adjust the date format to match exactly what you've got in the log file.
You can do it without tail, just grep the rows with the previous day's date:
grep "$(date -d "yesterday 13:00" '+%d %b %y')" my.log
And if you need line count you can add
| wc -l
I worked this out through trial and error, by getting the line number of the first line containing the date and the total line count, as follows:
lines=$(wc -l < myfile.log)
start=$(grep -n "$datestring" myfile.log | head -n1 | cut -f1 -d:)
n=$((lines - start + 1))   # +1 so the first matching line is included
and then a tail, based on that:
tail -n "$n" myfile.log
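The same from-first-match-to-end extraction can be done in one step with sed (assuming $datestring contains no sed metacharacters):

```shell
# Print from the first line matching the date through the end of file.
sed -n "/$datestring/,\$p" myfile.log
```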

Last Day of Month in csvfile

I am trying to delete all rows of a CSV file whose dates are not the last day of a month, but I have not found the right solution.
date,price
2018-07-02,162.17
2018-06-29,161.94
2018-06-28,162.22
2018-06-27,162.32
2018-06-12,163.01
2018-06-11,163.53
2018-05-31,164.87
2018-05-30,165.59
2018-05-29,165.42
2018-05-25,165.96
2018-05-02,164.94
2018-04-30,166.16
2018-04-27,166.69
The output I want to get is:
date,price
2018-06-29,161.94
2018-05-31,164.87
2018-04-30,166.16
I tried it with cut + grep:
cut -d, -f1 file.csv | grep -E "28|29|30"
This works, but returns nothing when I combine fields with -f1,2.
I found csvkit, which seems to be the right tool, but I cannot find a solution for matching multiple patterns.
csvgrep -c 1 -m 30 file.csv
This brings the right result, but how can I combine multiple search options? I tried -m 28,29,30 and -m 28 -m 29 -m 30; neither works. Ideally it would work with the last day of every month.
Maybe someone here has an idea.
Thank you, and have a nice Sunday.
Silvio
You want to get all records for the LAST day of each month, but months vary in length (28, 29, 30 or 31 days).
I don't see why you used cut to extract the first field (the date part): the second field holds prices, which cannot be confused with an mm-dd date pattern anyway.
I suggest using grep directly to display the lines that match the pattern mm-dd, where mm is the month number and dd is the last day of that month.
This command should do the trick:
grep -E "01-31|02-(28|29)|03-31|04-30|05-31|06-30|07-31|08-31|09-30|10-31|11-30|12-31" file.csv
This command will give the following output:
2018-05-31,164.87
2018-04-30,166.16
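Note the pattern only matches calendar month-ends, so 2018-06-29 from the desired output is not found. A sketch that instead keeps the newest row per month actually present in the data (it relies on the file being sorted newest-first, as in the sample, and it also keeps the current, possibly partial month, 2018-07-02 here):

```shell
awk -F, 'NR==1 { print; next }      # keep the header
         { m = substr($1, 1, 7) }   # group key: YYYY-MM
         !seen[m]++                 # first (newest) row per month
' file.csv
```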

Unix Shell Scripting using Date Command

OK, so I'm trying to write a script to wc files using the date command. The format of the files, for example, goes like this: testfile20170104.gz.
Now the files are set up to have yesterday's date with the format yyyymmdd. So if today is 1/5/2017 the file will have the previous day of 1/4/2017 in the yyyymmdd format, as you see in the example above.
Normally, to count the file, all one needs to do is simply run gzcat testfile20170104.gz | wc -l to get the line count.
However, what I want is a script, or even a for loop, that gzcats the file, but instead of having to copy and paste the filename on the command line, I want to use the date command to put yesterday's date into the filename in the yyyymmdd format.
So as a template something like this:
gzcat testfile*.gz|wc -l | date="-1 days"+%Y%m%d
Now I know what I have above is COMPLETELY wrong but you get the picture. I want to replace the '*' with the output from the date command, if that makes sense...
Any help will be much much appreciated!
Thanks!
You want:
filename="testfile$( date -d yesterday +%Y%m%d ).gz"
zcat "$filename"
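Putting it together with the count from the question (a sketch; assumes GNU date, and uses zcat where some systems spell it gzcat):

```shell
# Build yesterday's filename and count its lines.
filename="testfile$(date -d yesterday +%Y%m%d).gz"
zcat "$filename" | wc -l
```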

store output of ls -lrt in two different variables

I want to store the output of ls -lrt | tail -2 in two different variables and get the base file name. The file names have the pattern YYYYMMDD_filename. I want to compare both files with the current date and pick the previous day's file. Please help; I am new to shell scripting.
This may not answer your question. To get all the files with yesterday's date:
yesterday=$( date -d yesterday +%Y%m%d )
files=( "$yesterday"_* )
It's generally advised to avoid parsing the output of ls.
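A sketch of the whole idea, including the base-name extraction the question asks for (assumes names of the YYYYMMDD_filename shape and GNU date):

```shell
yesterday=$(date -d yesterday +%Y%m%d)
for f in "$yesterday"_*; do
    [ -e "$f" ] || continue   # the glob matched nothing
    echo "${f##*/}"           # base file name, without any directory part
done
```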

Sorting git timestamp in the shell

I have a list of Git timestamps in the format Mon Jan 1 01:01:01 2013 +0500. I need to sort them in the shell somehow and have no clue how to approach this. So far I've created two arrays: one for months and one for days.
Any suggestions?
Thanks.
EDIT: This is not a git log that I'm going through, this is just a bunch of git timestamps that I have pulled out from different repos.
You can use date to convert to a format that's easier to sort, such as epoch. I'll assume you have a file called dates.in, with one date per line.
#!/bin/bash
# Convert each date to epoch seconds, sort numerically,
# then convert each result back to a readable date.
while read -r d; do
    date -d "$d" +%s
done <dates.in | sort -n | while read -r d; do
    date -d "@$d"
done
