Incrementing Numbers & Counting with sed syntax - bash

I am trying to wrap my head around sed and thought it would be best to try using something simple yet useful. At work I want to keep count on a small LCD display each time a specific script is run by users. I am currently doing this with a total count using the following syntax:
oldnum=`cut -d ':' -f2 TotalCount.txt`
newnum=`expr $oldnum + 1`
sed -i "s/$oldnum\$/$newnum/g" TotalCount.txt
This modifies the file that has this one line in it:
Total Recordings:0
Now I want to elaborate a little and increment the numbers starting at midnight and resetting to zero at 23:59:59 each day. I created a secondary .txt file for the display to read from with only one single line in it:
Total Recordings Today:0
But the syntax is not going to be the same. How must the above sed command be changed to update the number in that line of the second file?
I can change and reset the files using sed/bash together with a simple cron job on a schedule. The problem is that I can't figure out the sed syntax to replicate the effect I originally got working. Can anyone help? I have been reading about this for hours and finally decided to post and make a pot of coffee. I have a 4-line LCD and would love to track counts across schedules if the syntax is easy enough to learn.

sed should work fine for incrementing either Total Recordings: or Total Recordings Today: in your files, since it's matching the same pattern. To reset the count each day at a certain time I would recommend a cron job:
0 0 * * * echo "Total Recordings Today:0" > /path/to/TotalCount.txt 2>/dev/null
The other things I would encourage are to use the newer-style $( ... ) syntax for command substitution, and to create a variable for your TotalCount.txt path.
#!/bin/bash
totals=/path/to/TotalCount.txt
oldnum=$(cut -d ':' -f2 "$totals")
newnum=$((oldnum + 1))
sed -i "s/$oldnum\$/$newnum/g" "$totals"
This way you can easily reuse it for whatever else you want to do with it, quote it properly and simplify your code. Note: on OS X, sed's in-place editing needs an explicit (empty) backup suffix, i.e. sed -i ''.
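For example, the same in-place edit on BSD/macOS sed would be (only the explicit empty backup suffix changes):
sed -i '' "s/$oldnum\$/$newnum/g" "$totals"   # BSD/macOS sed: empty backup suffix required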
Whenever in doubt, http://shellcheck.net is a really nice tool to help find mistakes in your code.

Although you're looking for a sed solution, I can't resist posting how it can be done in awk:
$ awk -F: -v OFS=: '{$2++}1' file > temp && mv temp file
-F: sets the input field separator and -v OFS=: sets the output field separator to :. awk increments the second field by one; the trailing 1 is shorthand for print (it can be replaced with any "true" value). The output is written to a temp file and, if the command succeeds, moved over the original input file (to mimic an in-place edit).
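For instance, against a counter file like the one in the question (illustrative contents), a run would look like this:
$ cat TotalCount.txt
Total Recordings:41
$ awk -F: -v OFS=: '{$2++}1' TotalCount.txt > temp && mv temp TotalCount.txt
$ cat TotalCount.txt
Total Recordings:42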

Sed is a fine tool, but notoriously not the best for arithmetic. You could make what you already have work by initializing the counter to zero prior to incrementing it, if the file was not last modified today (or does not exist):
[ `date +%Y-%m-%d` != "`stat --printf %z TotalCount.txt 2> /dev/null|cut -d ' ' -f 1`" ] && echo "Total Recordings Today:0" > TotalCount.txt
To do the same with shifts, you would likely calculate a shift "ordinal number" by subtracting the offset of the first shift's start from midnight (say 7 * 3600 seconds) from the seconds since the epoch (the epoch itself falls on a midnight) and dividing by the shift length (8 * 3600 seconds), then reinitialize the counter whenever that number changes. Something like:
[ $(((`date +%s` - 7 * 3600) / (8 * 3600))) -gt $(((`stat --printf %Z TotalCount.txt 2> /dev/null` - 7 * 3600) / (8 * 3600))) ] && echo "Total Recordings This Shift:0" > TotalCount.txt
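Spelled out as a small script, the same shift arithmetic might look like this (a sketch; it assumes shifts start at 07:00 and last 8 hours, so adjust the constants to your schedule):
file=TotalCount.txt
mtime=$(stat --printf %Z "$file" 2>/dev/null)               # empty if the file does not exist yet
shift_now=$(( ( $(date +%s) - 7 * 3600 ) / (8 * 3600) ))    # shift ordinal for "now"
shift_file=$(( ( ${mtime:-0} - 7 * 3600 ) / (8 * 3600) ))   # shift ordinal when the file was last touched
if [ "$shift_now" -gt "$shift_file" ]; then
    echo "Total Recordings This Shift:0" > "$file"
fi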

Related

Associate all the filenames at different paths along with their time interval in Unix

I have multiple files (with .txt or .ext format) in different directories.
The file paths are stored in a variable, say var.
I want to pick all the filenames as well as the time interval (in hours) since each file was last placed.
The time interval will be current time - the last modification time.
Let's say
The file is in the /Files/New directory with the time below:
-rwxrwxrwx 1 ad.sam unx_9998_access 0 Nov 9 08:43 out.txt
I want the file name (i.e. out.txt) and the interval (in hours) together.
I want to do this for all the files in the different paths (in the var variable).
So expected output is :
out.txt,12
abc.txt,9
pqr.txt,7
I am able to pull those details separately in different variables like below:
Files_in_Path=`ls -ltr | awk '{ print $9 }'`
TIMEDIFF=echo $(( ($(date +%s) - $(stat $Files_in_Path -c %Y)) / 3600 ))
But I am not able to associate them together as filename,interval for all the files.
It's not really clear what your expected output is. If it's enough to print the file name and its age side by side, try
now=$(date +%s)
for file in ./*; do
    then=$(stat "$file" -c '%Y')
    printf '%s,%i\n' "$file" $(( (now - then) / 3600))
done
Notice also how we don't use ls in scripts and, more tangentially, that
TIMEDIFF=echo $((1))
doesn't actually assign the evaluated value of $((1)) to TIMEDIFF -- instead, it temporarily assigns the string echo to TIMEDIFF and attempts to run the expanded value as a command (so you would get a 1: command not found unless you happen to have a command whose name is 1).
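A minimal corrected form of that assignment (assuming $file holds the path) would be:
TIMEDIFF=$(( ( $(date +%s) - $(stat "$file" -c %Y) ) / 3600 ))   # no echo; the arithmetic result is assigned directly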

Iterating with awk over some thousand files and writing to the same files in one or two runs

I have a lot of files in their own directory. All have the same name structure:
2019-10-18-42-IV-Friday.md
2019-10-18-42-IV-Saturday.md
2019-10-18-42-IV-Sunday.md
2019-10-18-43-43-IV-Monday.md
2019-10-18-42-IV Tuesday.md
and so on.
This is in detail:
yyyy-mm-dd-dd-week of year-actual quarter-day of week.md
I want to write one line to each file as a second line:
With awk I want to extract and expand the dates from the file name and then write them to the appropriate file.
This is the point where I fail.
%!awk -F"-" '{print "Today is " $6 ", the " $3 "." $2 "." $1 ", Kw " $4 ", in the " $5 ". Quarter."}'
That works well, I get the sentence I want to write into the files.
So put the whole thing in a loop:
ze.sh
#!/bin/bash
for i in *.md;
j = awk -F " " '{ print "** Today is " $6 ", the" $3"." $2"." $1", Kw " $4 ", in the " $5 ". Quarter. **"}' $i
Something with CAT, I suppose.
end
What do I have to do to make variable i iterate over all files, extract the values for j from $i, and then write $j to the second line of each file?
Thanks a lot for your help.
[Using manjaro linux and bash]
GNU bash, Version 5.0.11(1)-release (x86_64-pc-linux-gnu)
Linux version 5.2.21-1-MANJARO
Could you please try the following (not tested; GNU awk is needed for this). For writing the date on the 2nd line, I have kept the same format in which your Input_file has the date.
awk -i inplace '
FNR==2{
split(FILENAME,array,"-")
print array[1]"-"array[2]"-"array[3]
}
1
' *.md
If possible, try it without the -i inplace option first so that the changes are not saved into the Input_file; once you are happy with the results, add the option back as shown above to make in-place changes to the Input_file.
For the awk versions that support in-place updates, see the link James posted:
Save modifications in place with awk
For updating a file in-place, sed is better suited than awk, because:
You don't need a recent version, older versions can do it too
It works in both GNU and BSD flavors -> more portable
But first, to split a filename into its parts, you don't need an extra process; the read builtin can do it too. From your examples, we need to extract year, month, day and week numbers, a quarter string, and a weekday name string:
2019-10-18-42-IV-Friday.md
2019-10-18-42-IV-Saturday.md
2019-10-18-42-IV-Sunday.md
2019-10-18-43-43-IV-Monday.md
2019-10-18-42-IV Tuesday.md
For the first 3 lines, this simple expression would work:
IFS=-. read year month day week q dayname rest <<< "$filename"
The last line has a space before the weekday name instead of a -, but that's easy to fix:
IFS='-. ' read year month day week q dayname rest <<< "$filename"
Line 4 is harder to fix, because it has a different number of fields. To handle the extra field, we add an extra variable, ext:
IFS='-. ' read year month day week q dayname ext rest <<< "$filename"
Then, if we can assume that the second 43 on that line can be ignored and we can simply shift the arguments along, we use a conditional on the value of $ext.
That is, for most lines the value of ext will be md (the file extension).
If the value is different that means we have an extra field, and we should shift the values:
if [[ $ext != "md" ]]; then
q=$dayname
dayname=$ext
fi
Now, we can use the variables to format the line you want to insert into the file:
line="Today is $dayname, the $day.$month.$year, Kw $week, in the $q. Quarter."
Finally, we can formulate a sed command to append our custom formatted line after the first one, ideally in a way that works with both GNU and BSD flavors of sed. This form works equivalently with both:
sed -i.bak -e "1 a\\"$'\n'"$line"$'\n' "$filename" && rm *.bak
Notice that .bak backup files are created; the trailing rm *.bak cleans them up.
If you don't want backup files to be created, then I'm afraid you need to use slightly different format for GNU and BSD flavors:
# GNU
sed -i'' -e "1 a\\"$'\n'"$line"$'\n' "$filename"
# BSD
sed -i '' -e "1 a\\"$'\n'"$line"$'\n' "$filename"
In fact if you only need to support GNU flavor, then a simpler form will work too:
sed -i'' "1 a$line" "$filename"
You can put all of that together in a for filename in *.md; do ...; done loop.
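For instance, a sketch of that assembled loop (same variable names as above; untested, so adjust to taste) could be:
for filename in *.md; do
    IFS='-. ' read -r year month day week q dayname ext rest <<< "$filename"
    if [[ $ext != "md" ]]; then     # extra field, as in line 4: shift the values along
        q=$dayname
        dayname=$ext
    fi
    line="Today is $dayname, the $day.$month.$year, Kw $week, in the $q. Quarter."
    sed -i.bak -e "1 a\\"$'\n'"$line"$'\n' "$filename" && rm "$filename.bak"
done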
You probably want to feed the file name into the AWK script, using the '-' to separate the components.
This script assumes that appending the AWK output to the end of the file is acceptable (rather than inserting it as the second line):
for i in *.md ; do
echo $i | awk -F- 'AWK COMMAND HERE' >> $i
done
If the new text has to be inserted (as the second line) into the file, the sed program can be used to update the file (using the in-place edit option -i). Something like:
for i in *.md ; do
mark=$(echo $i | awk -F- 'AWK COMMAND HERE')
sed -i -e "2i$mark" $i
done
This is the best solution for me, especially because it copes with the different delimiters.
Many thanks to everyone who was interested in this question and especially to those who posted solutions.
I wish I hadn't made it so hard because I mistyped the example data.
This is now "my" variant of the solution:
for filename in *.md; do
    IFS='-. ' read year month day week q dayname rest <<< "$filename"
    line="Today is $dayname, the $day.$month.$year, Kw $week, in the $q. Quarter."
    sed -i.bak -e "1 a\\"$'\n'"$line"$'\n' "$filename" && rm *.bak;
done
Because it copes with the multiple field separators, this solution is the easiest for me to use.
But perhaps I am wrong, and the other solutions also offer the possibility of using different separators: at least '-' and '.' are required.
I am very surprised and pleased how quickly I received very good answers as a newcomer. Hopefully I can give something back.
And I'm also amazed how many different solutions are possible for the problems that arise.
If anyone is interested in what I've done, read on here:
I've had a fatal autoimmune disease for two years. Little by little, my brain is destroyed, intermittently.
My memory especially has suffered a lot; I often don't remember what I did or learned yesterday, or what still has to be done.
That's why I created day files until 31.12.2030, with a markdown template for each day. There I then record what I have done and learned on those days and what still has to be done.
It was important to me to have the correct date within the individual file. Why no database, why markdown?
I want to have a format that I can use anywhere, on any device and with any OS. A format that doesn't belong to a company, that can change it or make it more expensive, that can take it off the market or limit it with licenses.
It's fast enough. The changes to 4,097 files as described above took less than 2 seconds on my i5 laptop (12 GB Ram, SSD).
Searching with fzf over all files is also very fast. I can simply have the files converted and output as what I just need.
My memory won't come back from this, but I have a chance to log what I forgot.
Thank you very much for your help and attention.

Delete lines in file over an hour old using timestamps bash

Having a bit of bother trying to get the following to work.
I have a file containing hostname:timestamp as below:
hostname1:1445072150
hostname2:1445076364
I am trying to create a bash script that will query this file (using a cron job) to check if the timestamp is over 1 hour old and if so, remove the line.
Below is what I have so far but it doesn't appear to be removing the line in the file.
#!/bin/bash
hosts=/tmp/hosts
current_timestamp=$(date +%s)
while read line; do
    hostname=`echo $line | sed -e 's/:.*//g'`
    timestamp=`echo $line | cut -d ":" -f 2`
    diff=$(($current_timestamp-$timestamp))
    if [ $diff -ge 3600 ]; then
        echo "$hostname - Timestamp over an hour old. Deleting line."
        sed -i '/$hostname/d' $hosts
    fi
done <$hosts
I have managed to get the timestamp part working correctly in identifying hosts that are over an hour old, but I'm having trouble removing the line from the file.
I suspect it may be due to the while loop keeping the file open but not 100% sure how to work around it. Also tried making a copy of the file and editing that but still nothing.
ALTERNATIVELY: If there is a better way to get this to work and produce the same result, I am open to suggestions :)
Any help would be much appreciated.
Cheers
The problem in your script was just this line:
sed -i '/$hostname/d' $hosts
Variables inside single quotes are not expanded to their values,
so the command is looking for the literal string $hostname instead of its value. If you replace the single quotes with double quotes,
the variable will be expanded to its value, which is what you need here:
sed -i "/$hostname/d" $hosts
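A quick way to see the difference (illustrative value):
$ hostname=server01
$ echo '/$hostname/d'     # single quotes: sed would see the literal string
/$hostname/d
$ echo "/$hostname/d"     # double quotes: the value is substituted
/server01/d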
There are improvements possible:
#!/bin/bash
hosts=/tmp/hosts
current_timestamp=$(date +%s)
while read line; do
    set -- ${line/:/ }
    hostname=$1
    timestamp=$2
    ((diff = current_timestamp - timestamp))
    if ((diff >= 3600)); then
        echo "$hostname - Timestamp over an hour old. Deleting line."
        sed -i "/^$hostname:/d" $hosts
    fi
done <$hosts
The improvements:
A stricter pattern in the sed command, to make it more robust and avoid some potential errors
Simpler way to extract hostname part and timestamp part without any sub-shells
Simpler arithmetic operations by enclosing within ((...))
You ask for alternatives — use awk:
awk -F: -v ts=$(date +%s) '$2 <= ts-3600 { next } { print }' $hosts > $hosts.$$
mv $hosts.$$ $hosts
The ts=$(date +%s) sets the awk variable ts to the value from date. The script skips any lines where the value in the second column (after the first colon) is at or below the threshold ts-3600; every other line is printed. You could do the subtraction once in a BEGIN block if you wanted to. Decide whether <= or < is correct for your purposes.
If you need to know which lines are deleted, you can add
printf "Deleting %s - timestamp %d older than %d\n", $1, $2, (ts-3600) > "/dev/stderr";
before the next to print the information on standard error. If you must write that to standard output, then you need to arrange for retained lines to be written to a file with print > file as an alternative action after the filter condition (passing -v file="$hosts.$$" as another pair of arguments to awk). The tweaks that can be made are endless.
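For instance, a sketch of that variant (deletion messages on standard error, and the threshold computed once in a BEGIN block as mentioned above) might look like this:
awk -F: -v ts="$(date +%s)" '
    BEGIN { limit = ts - 3600 }
    $2 <= limit {
        printf "Deleting %s - timestamp %d older than %d\n", $1, $2, limit > "/dev/stderr"
        next
    }
    { print }
' "$hosts" > "$hosts.$$" && mv "$hosts.$$" "$hosts"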
If the file is of any significant size, it will be quicker to copy the relevant subsection of the file once to a temporary file and then to the final file than to edit the file in place multiple times as in the original code. If the file is small enough, there isn't a problem.

Doubts about bash script efficiency

I have to accomplish a relatively simple task; basically, I have an enormous number of files with the following format:
"2014-01-27","07:20:38","data","data","data"
Basically I would like to extract the first 2 fields, convert them into a Unix epoch date, add 6 hours to it (due to the timezone difference), and replace the first 2 original columns with the resulting milliseconds (Unix epoch since 1970-01-01, converted to milliseconds).
I have written a script that works fine; the issue is that it is very, very slow. I need to run this over 150 files with a total line count of more than 5,000,000, and I was wondering if you had any advice about how I could make it faster. Here it is:
#!/bin/bash
function format()
{
    while read line; do
        entire_date=$(echo ${line} | cut -d"," -f1-2);
        trimmed_date=$(echo ${entire_date} | sed 's/"//g;s/,/ /g');
        seconds=$(date -d "${trimmed_date} + 6 hours" +%s);
        millis=$((${seconds} * 1000));
        echo ${line} | sed "s/$entire_date/\"$millis\"/g" >> "output"
    done < $*
}
format $*
You are spawning a significant number of processes for each input line. At a quick glance, probably half of those could easily be factored away, but I would definitely recommend a switch to Perl or Python instead.
perl -MDate::Parse -pe 'die "$0:$ARGV:$.: Unexpected input $_"
unless s/(?<=^")([^"]+)","([^"]+)(?=")/ (str2time("$1 $2")+6*3600)*1000 /e'
I'd like to recommend Text::CSV but I do not have it installed here, and if you have requirements to not touch the fields after the second at all, it might not be what you need anyway. This is quick and dirty but probably also much simpler than a "proper" CSV solution.
The real meat is the str2time function from Date::Parse, which I imagine will be a lot quicker than repeatedly calling date (ISTR it does some memoization internally so it can do nearby dates quickly). The regex replaces the first two fields with the output; note the /e flag which allows Perl code to be evaluated in the replacement part. The (?<=^") and (?=") zero-width assertions require these matches to be present but do not include them in the substitution operation. (I originally substituted the enclosing double quotes, but with this change, they are retained, as apparently you want to keep them.)
Change the die to a warn if you want the script to continue in spite of errors (maybe redirect standard error to a file then!)
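An illustrative run of the substitution on a single input line (the exact number depends on your local timezone; the value shown assumes UTC, and the die guard is omitted for brevity):
$ echo '"2014-01-27","07:20:38","data","data","data"' | perl -MDate::Parse -pe 's/(?<=^")([^"]+)","([^"]+)(?=")/ (str2time("$1 $2")+6*3600)*1000 /e'
"1390828838000","data","data","data"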
I have tried to avoid external commands (except date) to gain time. Tests show that it is 4 times faster than your code. (Okay, tripleee's perl solution is 40 times faster than mine!)
#! /bin/bash
function format()
{
    while IFS=, read date0 date1 datas; do
        date0="${date0//\"/}"
        date1="${date1//\"/}"
        seconds=$(date -d "$date0 $date1 + 6 hours" +%s)
        echo "\"${seconds}000\",$datas"
    done
}
output="output.txt"
# Process each file in argument
for file ; do
format < "$file"
done >| "$output"
exit 0
Using the existing mktime function in awk (tested), it is faster than perl.
awk '{t=$2 " " $4;gsub(/[-:]/," ",t);printf "\"%s\",%s\n",(mktime(t)+6*3600)*1000,substr($0,25)}' FS=\" OFS=\" file
Here is the test result.
$ wc -l file
1244 file
$ time awk '{t=$2 " " $4;gsub(/[-:]/," ",t);printf "\"%s\",%s\n",(mktime(t)+6*3600)*1000,substr($0,25)}' FS=\" OFS=\" file > /dev/null
real 0m0.172s
user 0m0.140s
sys 0m0.046s
$ time perl -MDate::Parse -pe 'die "$0:$ARGV:$.: Unexpected input $_"
unless s/(?<=^")([^"]+)","([^"]+)(?=")/ (str2time("$1 $2")+6*3600)*1000 /e' file > /dev/null
real 0m0.328s
user 0m0.218s
sys 0m0.124s

Grep outputs multiple lines, need while loop

I have a script which uses grep to find lines in a text file (ics calendar to be specific)
My script finds a date match, then goes up and down a few lines to copy the summary and start time of the appointment into a separate variable. The problem I have is that I'm going to have multiple appointments at the same time, and I need to run through the whole process for each result in grep.
Example:
LINE=`grep -F -n 20130304T232200 /path/to/calendar.ics | cut -f1 -d:`
And it outputs only the lines, such as
86 89
Then it goes on to capture my other variables, as such:
SUMMARYLINE=$(( $LINE + 5 ))
SUMMARY=`sed -n "$SUMMARYLINE"p /path/to/calendar.ics`
My script runs fine with one result, but it obviously won't work with more than one, and I need it to. Should I send the grep results into an array? A separate text file to read from? I'm sure I'll need a while loop in here somehow. Need some help please.
You can call grep from a loop quite easily:
while IFS=':' read -r LINE notused # avoids the use of cut
do
# First field is now in $LINE
# Further processing
done < <(grep -F -n 20130304T232200 /path/to/calendar.ics)
However, if the file is not too large then it might be easier to read the whole file into an array and move around in that.
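A minimal sketch of that array approach (bash 4+ for mapfile; it assumes, as in your script, that the summary sits five lines below the matched date):
mapfile -t lines < /path/to/calendar.ics
for i in "${!lines[@]}"; do
    if [[ ${lines[i]} == *20130304T232200* ]]; then
        summary=${lines[i+5]}      # same +5 offset as SUMMARYLINE in the question
        echo "$summary"
    fi
done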
With your proposed solution, you are reading through the file several times. Using awk, you can do it in one pass:
awk -F: -v time=20130304T232200 '
$1 == "SUMMARY" {summary = substr($0,9)}
/^DTSTART/ {start = $2}
/^END:VEVENT/ && start == time {print summary}
' calendar.ics
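For instance, against a simplified, made-up VEVENT block (real .ics files may order the properties differently or carry parameters such as DTSTART;TZID=..., but the idea is the same):
$ cat calendar.ics
BEGIN:VEVENT
DTSTART:20130304T232200
SUMMARY:Dentist appointment
END:VEVENT
$ awk -F: -v time=20130304T232200 '$1 == "SUMMARY" {summary = substr($0,9)} /^DTSTART/ {start = $2} /^END:VEVENT/ && start == time {print summary}' calendar.ics
Dentist appointment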
