Bash function to "use" the most recent "dated" file in a dir

I have a dir with a crap load (hundreds) of log files accumulated over time. In certain cases I want to make a note regarding the most recent log (most recent by the date in the filename, not by creation time), or I just need some piece of info from it and want to view it quickly. It is usually the last one created, but it always has the newest date. So I want to make a "simple" function in my bashrc to handle this: a function that goes to a specific dir, finds the latest log by date (always in the same format), and opens it with less or whatever pager I want.
The logs are formatted like this:
typeoflog-short-description-$(date "+%-m-%-d-%y")
basically the digits in between the last 3 dashes are what I'm interested in, for example(s):
update-log-2-24-18
removed-cuda-opencl-nvidia-12-2-19
whatever-changes-1-18-19
Now say it was January 20, 2019 and this was the last log added to the dir. I need a way to find the highest number in the last set of digits of the filename, the year (that part I don't really have a problem with); then check for the highest month, which sits two "dashes" before the last set of digits, whether the month has 1 or 2 digits; and then do the same for the day of the month. Then I'd set the result as a local variable and use it like the following example.
Something like this:
viewlatestlog(){
    local loc=~/.logdir   # note: unquoted tilde, so it expands
    local name=$(echo "$loc"/*-19 | ...) # awk or cut or sort, or I could even loop from 1-31 and 1-12 for the days and months
    # I have ideas, but I know there has to be a better way to do this and it's not coming to me, maybe with expr or a couple of sort commands; I'm not sure. It would have been easier if I had made it so that each date number always had 2 digits... But I didn't.
    ## But the ultimate goal is that I can run something like this command at the end:
    less "$loc/$name"
}
PS. For bonus points you could also tell me if there is a way to automatically copy the filename (with the location and all or without, I don't really care) to my linux clipboard, so when I'm making my note I can "link" to the log file if I ever need to go back to it...
Edit: Cleaned up the post a little bit; I tend to make my questions way too wordy, I apologize.

GNU sort can sort by fields:
$ find . -name whatever-changes-\* | sort -n -t- -k5 -k3 -k4
./whatever-changes-3-01-18
./whatever-changes-1-18-19
./whatever-changes-2-12-19
./whatever-changes-11-01-19
The option -t specifies the field delimiter and the option -k selects the sort keys; field numbering starts at 1. The option -n specifies numeric sort.
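If the prefix is fixed (so the date always sits in fields 3-5), the newest file is then just the last line of the sorted output; for example, a sketch along those lines:

# newest "whatever-changes" log; assumes the fixed two-word prefix from above
newest=$(find . -name 'whatever-changes-*' | sort -n -t- -k5 -k3 -k4 | tail -n 1)
less "$newest"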

Assuming your filenames do not contain tabs or newlines, how about:
loc="~/.logdir"
for f in "$loc"/* ; do
if [[ $f =~ -([0-9]{1,2})-([0-9]{1,2})-([0-9]{2})$ ]]; then
mm=${BASH_REMATCH[1]}
dd=${BASH_REMATCH[2]}
yy=${BASH_REMATCH[3]}
printf "%02d%02d%02d\t%s\n" "$yy" "$mm" "$dd" "$f"
fi
done | sort -r | head -n 1 | cut -f 2
First, extract the month, day, and year from the filename.
Then create a date string formatted as "YYMMDD" and prepend it to the
filename, delimited by a tab character.
Then you can run the sort command on the list.
Finally, you can obtain the desired (latest) filename by extracting it with head and cut.
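Wrapped into the function from the question, it could look like this (a sketch; xclip is one assumed option for the clipboard bonus, swap in xsel or wl-copy as needed):

viewlatestlog() {
    local loc=~/.logdir   # unquoted tilde so it expands
    local name
    name=$(for f in "$loc"/*; do
        if [[ $f =~ -([0-9]{1,2})-([0-9]{1,2})-([0-9]{2})$ ]]; then
            # YYMMDD sort key, a tab, then the full path
            printf "%02d%02d%02d\t%s\n" \
                "${BASH_REMATCH[3]}" "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "$f"
        fi
    done | sort -r | head -n 1 | cut -f 2)
    printf '%s' "$name" | xclip -selection clipboard   # bonus: copy the path
    less "$name"
}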
Hope this helps.

Related

Iterating with awk over some thousand files and writing to the same files in one or two runs

I have a lot of files in their own directory. All have the same name structure:
2019-10-18-42-IV-Friday.md
2019-10-18-42-IV-Saturday.md
2019-10-18-42-IV-Sunday.md
2019-10-18-43-43-IV-Monday.md
2019-10-18-42-IV Tuesday.md
and so on.
This is in detail:
yyyy-mm-dd-week of year-actual quarter-day of week.md
I want to write one line to each file as a second line:
With awk I want to extract and expand the dates from the file name and then write them to the appropriate file.
This is the point where I fail.
%!awk -F"-"-" '{print "Today is $6 ", the " $3"."$2"."$1", Kw "$4", in the" $5 ". Quarter."}'
That works well, I get the sentence I want to write into the files.
So put the whole thing in a loop:
ze.sh
#!/bin/bash
for i in *.md;
j = awk -F " " '{ print "** Today is " $6 ", the" $3"." $2"." $1", Kw " $4 ", in the " $5 ". Quarter. **"}' $i
Something with CAT, I suppose.
end
What do I have to do to make variable i iterate over all files, extract the values for j from $i, and then write $j to the second line of each file?
Thanks a lot for your help.
[Using manjaro linux and bash]
GNU bash, Version 5.0.11(1)-release (x86_64-pc-linux-gnu)
Linux version 5.2.21-1-MANJARO
Could you please try the following (I haven't tested it; GNU awk is needed for this). For writing the date on the 2nd line, I have chosen the same format in which your Input_file has the date.
awk -i inplace '
FNR==2{                                   # just before the 2nd line of each file...
  split(FILENAME,array,"-")               # split the filename on "-"
  print array[1]"-"array[2]"-"array[3]    # ...print the yyyy-mm-dd part as a new line
}
1                                         # print every original line
' *.md
If possible, try it without the -i inplace option first, so that changes are not saved into Input_file; once you are happy with the results, add the option back as shown above to make the changes in place.
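For instance, a dry run on a single file prints the would-be result to stdout and leaves the file untouched:

awk 'FNR==2{split(FILENAME,a,"-"); print a[1]"-"a[2]"-"a[3]} 1' 2019-10-18-42-IV-Friday.md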
For awk versions that support in-place updating, see the link James posted:
Save modifications in place with awk
For updating a file in-place, sed is better suited than awk, because:
You don't need a recent version, older versions can do it too
Can work in both GNU and BSD flavors -> more portable
But first, to split a filename into its parts, you don't need an extra process; the read builtin can do it too. From your examples, we need to extract year, month, day, and week numbers, a quarter string, and a weekday name string:
2019-10-18-42-IV-Friday.md
2019-10-18-42-IV-Saturday.md
2019-10-18-42-IV-Sunday.md
2019-10-18-43-43-IV-Monday.md
2019-10-18-42-IV Tuesday.md
For the first 3 lines, this simple expression would work:
IFS=-. read year month day week q dayname rest <<< "$filename"
The last line has a space before the weekday name instead of a -, but that's easy to fix:
IFS='-. ' read year month day week q dayname rest <<< "$filename"
Line 4 is harder to fix, because it has a different number of fields. To handle the extra field, we should add an extra variable, ext:
IFS='-. ' read year month day week q dayname ext rest <<< "$filename"
And then, if we can assume that the second 43 on that line can be ignored, we can just shift the arguments using a conditional on the value of $ext.
That is, for most lines the value of ext will be md (the file extension).
If the value is different that means we have an extra field, and we should shift the values:
if [[ $ext != "md" ]; then
q=$dayname
dayname=$ext
fi
Now, we can use the variables to format the line you want to insert into the file:
line="Today is $dayname, the $day.$month.$year, Kw $week, in the $q. Quarter."
Finally, we can formulate a sed statement, for example to append our custom formatted line after the first one, ideally in a way that will work with both GNU and BSD flavors of sed.
This will work equivalently with both GNU and BSD versions:
sed -i.bak -e "1 a\\"$'\n'"$line"$'\n' "$filename" && rm *.bak
Notice that .bak backup files are created that must be manually removed.
If you don't want backup files to be created, then I'm afraid you need to use slightly different format for GNU and BSD flavors:
# GNU
sed -i'' -e "1 a\\"$'\n'"$line"$'\n' "$filename"
# BSD
sed -i '' -e "1 a\\"$'\n'"$line"$'\n' "$filename"
In fact if you only need to support GNU flavor, then a simpler form will work too:
sed -i'' "1 a$line" "$filename"
You can put all of that together in a for filename in *.md; do ...; done loop.
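A sketch of the assembled loop (using the portable .bak form, but removing the backup per file rather than with rm *.bak):

for filename in *.md; do
    IFS='-. ' read year month day week q dayname ext rest <<< "$filename"
    if [[ $ext != "md" ]]; then   # extra field, as in the 43-43 line: shift the values
        q=$dayname
        dayname=$ext
    fi
    line="Today is $dayname, the $day.$month.$year, Kw $week, in the $q. Quarter."
    sed -i.bak -e "1 a\\"$'\n'"$line"$'\n' "$filename" && rm "$filename.bak"
done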
You probably want to feed the file name into the AWK script, using the '-' to separate the components.
This script assumes the AWK output just needs to be appended to the end of the file:
for i in *.md ; do
echo "$i" | awk -F- 'AWK COMMAND HERE' >> "$i"
done
If the new text has to be inserted (as the second line) into the file, the sed program can be used to update the file in place (using the in-place edit option '-i'). Something like:
for i in *.md ; do
mark=$(echo "$i" | awk -F- 'AWK COMMAND HERE')
sed -i -e "2i$mark" "$i"
done
This is the best solution for me, especially because it copes with the different delimiters.
Many thanks to everyone who was interested in this question and especially to those who posted solutions.
I wish I hadn't made it so hard because I mistyped the example data.
This is now "my" variant of the solution:
for filename in *.md; do
IFS='-. ' read year month day week q dayname rest <<< "$filename"
line="Today is $dayname, the $day.$month.$year, Kw $week, in the $q. Quarter."
sed -i.bak -e "1 a\\"$'\n'"$line"$'\n' "$filename" && rm *.bak;
done
Because it copes with the multiple field separators, this result is the best one for me to use.
But perhaps I am wrong, and the other solutions also offer the possibility of using different separators: at least '-' and '.' are required.
I am very surprised and pleased how quickly I received very good answers as a newcomer. Hopefully I can give something back.
And I'm also amazed how many different solutions are possible for the problems that arise.
If anyone is interested in what I've done, read on here:
I've had a fatal autoimmune disease for two years. Little by little, my brain is destroyed, intermittently.
Especially my memory has suffered a lot; I often don't remember what I did or learned yesterday, or what still has to be done.
That's why I created day files until 31.12.2030, with a markdown template for each day. There I then record what I have done and learned on those days and what still has to be done.
It was important to me to have the correct date within the individual file. Why no database, why markdown?
I want to have a format that I can use anywhere, on any device and with any OS. A format that doesn't belong to a company, that can change it or make it more expensive, that can take it off the market or limit it with licenses.
It's fast enough. The changes to 4,097 files as described above took less than 2 seconds on my i5 laptop (12 GB RAM, SSD).
Searching with fzf over all files is also very fast. I can simply have the files converted and output as what I just need.
My memory won't come back from this, but I have a chance to log what I forgot.
Thank you very much for your help and attention.

tail a log file from a specific line number

I know how to tail a text file with a specific number of lines,
tail -n50 /this/is/my.log
However, how do I make that line count a variable?
Let's say I have a large log file which is appended to daily by some program, all lines in the log file start with a datetime in this format:
Day Mon YY HH:MM:SS
Every day I want to output the tail of the log file but only for the previous days records. Let's say this output runs just after midnight, I'm not worried about the tail spilling over into the next day.
I just want to be able to work out how many rows to tail, based on the first occurrence of yesterday's date...
Is that possible?
Answering the question of the title, for anyone who comes here that way: head and tail can both accept an offset for how much of the file to include or exclude.
For tail, use -n +num for the line number num to start at
For head, use -n -num for the number of lines not to print
This is relevant to the actual question if you have remembered the number of lines from the previous time you did the command, and then used that number for tail -n +$prevlines to get the next portion of the partial log, regardless of how often the log is checked.
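For example (head -n -num is a GNU extension, so this assumes GNU coreutils):

tail -n +51 /this/is/my.log   # print from line 51 to the end (skip the first 50)
head -n -50 /this/is/my.log   # print everything except the last 50 lines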
Answering the actual question, one way to print everything after a certain line that you can grep is to use the -A option with a ridiculous count. This may be more useful than the other answers here as you can get a number of days of results. So to get everything from yesterday and so-far today:
grep "^`date -d yesterday '+%d %b %y'`" -A1000000 log_file.txt
You can combine 2 greps to print between 2 date ranges.
Note that this relies on the date actually occurring in the log file. It has the weakness that if no events were logged on a particular day used as the range marker, then it will fail to find anything.
To resolve that you could inject dummy records for the start and end dates and sort the file before grepping. This is probably overkill, though, and the sort may be expensive, so I won't example it.
I don't think tail has any functionality like this.
You could work out the beginning and ending line numbers using awk, but if you just want to extract those lines from the log file, the simplest way is probably to use grep combined with date to do it. Matching yesterday's date at the beginning of the line should work:
grep "^`date -d yesterday '+%d %b %y'`" < log_file.txt
You may need to adjust the date format to match exactly what you've got in the log file.
You can do it without tail, just grep rows with previous date:
cat my.log | grep "$( date -d "yesterday 13:00" '+%d %b %y')"
And if you need line count you can add
| wc -l
I worked this out through trial and error by getting the line numbers for the first line containing the date and the total lines, as follows:
lines=$(wc -l < myfile.log)
start=$(grep -n "$datestring" myfile.log | head -n1 | cut -f1 -d:)
n=$((lines - start + 1))   # +1 so the first matching line is included
and then a tail, based on that:
tail -n$n myfile.log
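A single-pass alternative (a sketch, not from the answer above): let awk start printing at the first line that begins with the date, which avoids the separate line count entirely:

# print from the first line starting with $datestring through the end of file
awk -v d="$datestring" 'index($0, d) == 1 {found=1} found' myfile.log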

Last Day of Month in csvfile

I'm trying to delete all rows of a CSV file that don't match the last day of a month, but I haven't found the right solution yet.
date,price
2018-07-02,162.17
2018-06-29,161.94
2018-06-28,162.22
2018-06-27,162.32
2018-06-12,163.01
2018-06-11,163.53
2018-05-31,164.87
2018-05-30,165.59
2018-05-29,165.42
2018-05-25,165.96
2018-05-02,164.94
2018-04-30,166.16
2018-04-27,166.69
The output I want to get is:
date,price
2018-06-29,161.94
2018-05-31,164.87
2018-04-30,166.16
I tried it with cut + grep:
cut -d, -f1 file.csv | grep -E "28|29|30"
That works, but returns nothing when I combine it with -f1,2.
I found csvkit, which seems to be the right tool, but I haven't found a solution for multiple search patterns.
csvgrep -c 1 -m 30 file.csv
That gives me the right result, but how can I combine multiple search options? I tried -m 28,29,30 and -m 28 -m 29 -m 30, but neither works. Best would be if it worked with the last day of every month.
Maybe someone here has an idea.
Thank you and nice Sunday
Silvio
You want to get all records of the LAST day of the month. But months vary in length (28-29-30-31).
I don't see why you used cut to extract the first field (the date part): the data in the second field does not look like a date at all (it is xx.xx), so a month-day pattern cannot accidentally match it.
I suggest to use grep directly to display the lines that matches the following pattern mm-dd; where mm is the month number, and dd is the last day of the month.
This command should do the trick:
grep -E "01-31|02-(28|29)|03-31|04-30|05-31|06-30|07-31|08-30|09-31|10-30|11-31|12-30" file.csv
This command will give the following output:
2018-05-31,164.87
2018-04-30,166.16
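If you want the last available day of each month rather than the last calendar day (so that 2018-06-29 is matched, as in the desired output), here is one sketch: since the file is sorted newest-first, the first row seen for each year-month is that month's newest. Note it also emits the newest row of the still-incomplete current month (2018-07-02 here), which you may want to drop:

# print the header, then the first (newest) row of each yyyy-mm group
awk -F, 'NR==1 {print; next} !seen[substr($1,1,7)]++' file.csv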

sorting with terminal after grepping

I was hoping someone might be able to shed some light on how I could sort a set of grepped values in unix.
for example if I have a list such as;
qp_1_v2
qp_50_v1
qp_51_v4
qp_52_v1
qp_53_v1
qp_54_v2
qp_2_v1,
is there a way to sort numerically using the wildcard, i.e. sort qp_*_v1, where * would be read as a number and sorted accordingly (ignoring anything that comes before and after the *)? The problem I'm finding currently is that qp_52_v2 is always read as a string, so I have to cut away qp_ and _v to leave only the number and then sort.
I hope this makes sense...
Thanks in advance.
edit: A little addition that would be nice if anyone knows how to do it: a way to grep and list values with only the highest version, i.e. if qp_50 exists 3 times with the suffixes _v1, _v2, _v3, it only lists qp_50_v3. As such the list will still consist of files with various versions, but only the highest version of each file will be output to the terminal.
ls | cut -d '_' -f 2 | sort
in your case substitute ls for your grep command
Edit: In the example I put before, the output is cut; if you want the original name of the file, use this:
ls | sort -k2,2g -t '_'
-k selects the field to compare (here field 2, and only field 2)
g means general numeric comparison, so the field is sorted as a number
-t sets the field delimiter
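For the edit about listing only the highest version of each file, one possibility (a sketch; the V version-sort modifier needs GNU sort):

# sort by the number (field 2), then by version descending,
# then keep only the first line seen per number
ls | sort -t_ -k2,2n -k3,3Vr | awk -F_ '!seen[$2]++'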

Shell: Get a list of latest files by filename

I have files like
update-1.0.1.patch
update-1.0.2.patch
update-1.0.3.patch
update-1.0.4.patch
update-1.0.5.patch
And I have a variable that contains the last applied patch (e.g. update-1.0.3.patch). So, now I have to get a list of files to apply (in the example, updates 1.0.4 and 1.0.5). How can I get such a list?
To clarify, I need a method to get a list of files that come alphabetically after a given file, and this list must also be in alphabetical order (obviously it is not always possible to apply patch 1.0.5 before 1.0.4).
sed is your go-to guy for printing ranges of lines. You give it a starting/ending address or pattern and a command to run in between.
ls update-1.0.* | sort | sed -ne "/$ENVVAR/,// p"
The sort probably isn't necessary because ls already sorts by name, but it might be good to include as a courtesy to maintainers, to make the ordering requirement explicit. The -n to sed means "don't print every line automatically" and the -e means "I'm giving you a script on the command line". I used " to enclose the script so that $ENVVAR would be expanded. The ending address is empty (//), which reuses the starting pattern, and the p means "print the line".
Oh, and I just noticed you only want the ones later. There's probably a way to tell sed to start on the line after your address, but instead I'd pipe it through tail -n +2 to start on the second line.
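With GNU sed, the special 0,/regex/ address form lets a range end at the first match, so deleting that range prints only the files after the last applied patch (a sketch; GNU sed only):

ENVVAR="update-1.0.3.patch"   # example value for the last applied patch
ls update-1.0.* | sort | sed "0,/^$ENVVAR\$/d"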
