Shell: Get a list of latest files by filename - shell

I have files like
update-1.0.1.patch
update-1.0.2.patch
update-1.0.3.patch
update-1.0.4.patch
update-1.0.5.patch
And I have a variable that contains the last applied path (e.g. update-1.0.3.patch). So, now I have to get a list of files to apply (in the example updates 1.0.4 and 1.0.5). How can I get such list?
To clarify, I need a method to get a list of files that comes alphabetically later of given file and this list of files must be also in alphabetical order (obviously is not allways possible to apply the patch 1.0.5 before 1.0.4).

sed is your go-to guy for printing ranges of lines. You give it a starting/ending address or pattern and command to do between.
ls update-1.0.* | sort | sed -ne "/$ENVVAR/,// p"
The sort probably isn't necessary because ls can sort by name, but it might be good to include as a courtesy to maintainers to show the necessity. The -n to sed means "don't print every line automatically" and the -e means "I'm giving you a script on the command line". I used " to enclose the script to that $ENVVAR would be eval'd. The ending address is empty (//) and the p means "print the line".
Oh, and I just noticed you only want the ones later. There's probably a way to tell sed to start on the line after your address, but instead I'd pipe it through tail -n +2 to start on the second line.

Related

How to check if stdout of a program is in a file?

I've attempted numerous times and tried different methods but cannot seem to get this to work. I am trying to run a python script and grep the output to see if it is contained in a file and if it is not I want to append it to said file.
$./scan_network.py 22 192.168.1.1 192.168.1.20 | if ! grep -q - ./results.log; then - >> results.log; fi
I understand that it macOS grep does not understand - as stdout and that then - >> would not work because it would not pick up stdout either. I am not sure what to do.
As stated before the primary goal is to check the output of the script against a file and if the IP address is not found in the file, it needs to be appended.
Edit:
results.log is currently an empty file. Output of scan_network.py on would be 192.168.1.6 for now. When I go to run it on another network the output would be numerous addresses in a range example being 10.234.x.y where x and y would be any number between 0 and 255.
One simple solution is to merge the log file and the output of the program into a new log file:
sort -u <(./scan_network.py 22 192.168.1.1 192.168.1.20) results.log > newresults.log
The -u flag causes duplicate lines to be removed from the output, so you will get only one of each line.
That has the side effect of reordering the lines (so that they are sorted alphabetically). It is possible to preserve order if necessary, but it gets more complicated.
With a reasonably modern gnu sort, you can use a "version number" sort, which will do a reasonable job of keeping IP numbers in logical order; you can use the -V flag to do that. Or you can sort the octets individually with sort -u -t. -k1,1n -k2,2n -k3,3n -k4,4n .... Or you can just live with lexicographic ordering. Do not just use -n for standard numeric sorting, because it will only examine the first octet, and that will have an unfortunate interaction with the -u option, since two lines which compare equal are considered duplicates. As numeric sort only considers the numeric prefix, there will be many false duplicates.
If you don't mind sorting and rewriting your log file, rici's helpful answer works well (note that simply using -V for true per-component numerical IP-address sorting is not an option on macOS, unfortunately).[1].
Here's an alternative that only appends to the existing log file on demand, in-place, without reordering existing lines:
grep -f results.log -xFv <(./scan_network.py 22 192.168.1.1 192.168.1.20) >> results.log
Note: This assumes that ./scan_network.py's output is line-based; pipe to tr to transform to line-based output, if necessary.
-f treats each line in the specified file as a separate search term, where a match of any term is considered an overall match.
-x matches lines in full
-F performs literal matching (doesn't interpret search terms as regular expressions)
-v only outputs lines that do not match
The net effect is that only lines output by ./scan_network.py ... that aren't already present in results.log are appended to results.log.
Note, however, that performance will likely suffer the larger results.log becomes, so rici's approach may be preferable in the long run, particularly, if the log file keeps growing and/or you want the log sorted by IP addresses anyway.
As for what you've tried:
Both GNU and BSD/macOS grep optionally accept - as a placeholder for stdin to accept the input from, but note that this operand is never needed, because grep reads input from stdin by default.
By contrast, only GNU grep accepts - as the option-argument to -f, i.e., the file containing the search terms to apply.
BSD/macOS requires either an explicit filename, a process substitution (as above), or, in a pinch, /dev/stdin to refer to stdin.
The logic of your search must be reversed: as in the command above, the existing log file contents must serve as the search terms (passed to -f), and the ./scan_network.py ... output must serve as the input in order to determine which lines are not (-v) already in the log file.
using - to represent stdin or stdout, depending on context, is a mere convention that only works as a command argument, so your attempt to refer to stdout output with if ...; then - >> results.log cannot work, because - is invariably interpreted as a command name.
If you use grep -q, stdout output is by definition suppressed, so there's nothing to pass on (even if you used a pipe).
[1] macOS's (OS X's) sort does not support -V for per-component version-number sorting (which can be applied to IP addresses too). Even though the macOS sort is a GNU sort, it is an ancient one - v5.93 as of macOS 10.12 - that predates support for -V.
Assuming that your script returns a single line of text, you can store the output in a variable and then grep for that string. For example:
logfile="results.log"
# save output to a shell variable
str=$(./scan_network.py 22 192.168.1.1 192.168.1.20)
# don't call grep twice for the same pattern
grep=$(grep -F "$str" "$logfile")
# append if grep results are empty
if [[ -z "$grep" ]]; then
echo "$grep" >> "$logfile"
fi

grep pipe searching for one word, not line

For some reason I cannot get this to output just the version of this line. I suspect it has something to do with how grep interprets the dash.
This command:
admin#DEV:~/TEMP$ sendemail
Yields the following:
sendemail-1.56 by Brandon Zehm
More output below omitted
The first line is of interest. I'm trying to store the version to variable.
TESTVAR=$(sendemail | grep '\s1.56\s')
Does anyone see what I am doing wrong? Thanks
TESTVAR is just empty. Even without TESTVAR, the output is empty.
I just tried the following too, thinking this might work.
sendemail | grep '\<1.56\>'
I just tried it again, while editing and I think I have another issue. Perhaps im not handling the output correctly. Its outputting the entire line, but I can see that grep is finding 1.56 because it highlights it in the line.
$ TESTVAR=$(echo 'sendemail-1.56 by Brandon Zehm' | grep -Eo '1.56')
$ echo $TESTVAR
1.56
The point is grep -Eo '1.56'
from grep man page:
-E, --extended-regexp
Interpret PATTERN as an extended regular expression (ERE, see below). (-E is specified by POSIX.)
-o, --only-matching
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output
line.
Your regular expression doesn't match the form of the version. You have specified that the version is surrounded by spaces, yet in front of it you have a dash.
Replace the first \s with the capitalized form \S, or explicit set of characters and it should work.
I'm wondering: In your example you seem to know the version (since you grep for it), so you could just assign the version string to the variable. I assume that you want to obtain any (unknown) version string there. The regular expression for this in sed could be (using POSIX character classes):
sendemail |sed -n -r '1 s/sendemail-([[:digit:]]+\.[[:digit:]]+).*/\1/ p'
The -n suppresses the normal default output of every line; -r enables extended regular expressions; the leading 1 tells sed to only work on line 1 (I assume the version appears in the first line). I anchored the version number to the telltale string sendemail- so that potential other numbers elsewhere in that line are not matched. If the program name changes or the hyphen goes away in future versions, this wouldn't match any longer though.
Both the grep solution above and this one have the disadvantage to read the whole output which (as emails go these days) may be long. In addition, grep would find all other lines in the program's output which contain the pattern (if it's indeed emails, somebody might discuss this problem in them, with examples!). If it's indeed the first line, piping through head -1 first would be efficient and prudent.
jayadevan#jayadevan-Vostro-2520:~$ echo $sendmail
sendemail-1.56 by Brandon Zehm
jayadevan#jayadevan-Vostro-2520:~$ echo $sendmail | cut -f2 -d "-" | cut -f1 -d" "
1.56

Remove files base on integer.name

Ok what I am trying to do is very specific. I need some code that will remove files from a directory based on integer.name.
The files in the directory are listed like this
441.TERM (the # is actually a PID so it'll be random)
442.TERM
No matter what I always want to keep the first .TERM file & remove any .TERM file after that as no more than one should ever be created by my script, but it does happen sometimes due to some issues with the system I am scripting on. I only want it to effect my 000.TERM files any other files it finds in the directory can stay. So if directory contain any .TERM file an with an integer higher than the first one found then remove the .TERM files with higher integers.
PS. .TERM is not an extension just in case there is any confusion.
find /your/path -name "*.TERM" | sort -t. -k1 -n | tail -n +2 | xargs -r rm
Let's break it down:
find /your/path -name "*.TERM" will output a list of all .TERM files.
You could also use ls /your/path/*.TERM, but you may find the output unpredictable. (Example: your implementation may have -F on by default, which would cause every socket to end in a = in the list.)
sort sorts them by the first field (-k1) using a period as a separator (-t.). -n guarantees a numeric sort (such that 5 comes before 06).
tail -n +2 skips the first line and returns the rest
xargs rm sends every output line to an rm command, removing them. -r skips running the rm if there's no output piped in, but is listed as a GNU extension.
The script as above is fairly robust for your needs, but may fail if you have so many files in the directory that they don't fit on one command line, and might get you into trouble if any matching filenames somehow contain a newline.

How to find/replace and increment a matched number with sed/awk?

Straight to the point, I'm wondering how to use grep/find/sed/awk to match a certain string (that ends with a number) and increment that number by 1. The closest I've come is to concatenate a 1 to the end (which works well enough) because the main point is to simply change the value. Here's what I'm currently doing:
find . -type f | xargs sed -i 's/\(\?cache_version\=[0-9]\+\)/\11/g'
Since I couldn't figure out how to increment the number, I captured the whole thing and just appended a "1". Before, I had something like this:
find . -type f | xargs sed -i 's/\?cache_version\=\([0-9]\+\)/?cache_version=\11/g'
So at least I understand how to capture what I need.
Instead of explaining what this is for, I'll just explain what I want it to do. It should find text in any file, recursively, based on the current directory (isn't important, it could be any directory, so I'd configure that later), that matches "?cache_version=" with a number. It will then increment that number and replace it in the file.
Currently the stuff I have above works, it's just that I can't increment that found number at the end. It would be nicer to be able to increment instead of appending a "1" so that the future values wouldn't be "11", "111", "1111", "11111", and so on.
I've gone through dozens of articles/explanations, and often enough, the suggestion is to use awk, but I cannot for the life of me mix them. The closest I came to using awk, which doesn't actually replace anything, is:
grep -Pro '(?<=\?cache_version=)[0-9]+' . | awk -F: '{ print "match is", $2+1 }'
I'm wondering if there's some way to pipe a sed at the end and pass the original file name so that sed can have the file name and incremented number (from the awk), or whatever it needs that xargs has.
Technically, this number has no importance; this replacement is mainly to make sure there is a new number there, 100% for sure different than the last. So as I was writing this question, I realized I might as well use the system time - seconds since epoch (the technique often used by AJAX to eliminate caching for subsequent "identical" requests). I ended up with this, and it seems perfect:
CXREPLACETIME=`date +%s`; find . -type f | xargs sed -i "s/\(\?cache_version\=\)[0-9]\+/\1$CXREPLACETIME/g"
(I store the value first so all files get the same value, in case it spans multiple seconds for whatever reason)
But I would still love to know the original question, on incrementing a matched number. I'm guessing an easy solution would be to make it a bash script, but still, I thought there would be an easier way than looping through every file recursively and checking its contents for a match then replacing, since it's simply incrementing a matched number...not much else logic. I just don't want to write to any other files or something like that - it should do it in place, like sed does with the "i" option.
I think finding file isn't the difficult part for you. I therefore just go to the point, to do the +1 calculation. If you have gnu sed, it could be done in this way:
sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' file
let's take an example:
kent$ cat test
ello
barbaz?cache_version=3fooooo
bye
kent$ sed -r 's/(.*)(\?cache_version=)([0-9]+)(.*)/echo "\1\2$((\3+1))\4"/ge' test
ello
barbaz?cache_version=4fooooo
bye
you could add -i option if you like.
edit
/e allows you to pass matched part to external command, and do substitution with the execution result. Gnu sed only.
see this example: external command/tool echo, bc are used
kent$ echo "result:3*3"|sed -r 's/(result:)(.*)/echo \1$(echo "\2"\|bc)/ge'
gives output:
result:9
you could use other powerful external command, like cut, sed (again), awk...
Pure sed version:
This version has no dependencies on other commands or environment variables.
It uses explicit carrying. For carry I use the # symbol, but another name can be used if you like. Use something that is not present in your input file.
First it finds SEARCHSTRING<number> and appends a # to it.
It repeats incrementing digits that have a pending carry (that is, have a carry symbol after it: [0-9]#)
If 9 was incremented, this increment yields a carry itself, and the process will repeat until there are no more pending carries.
Finally, carries that were yielded but not added to a digit yet are replaced by 1.
sed "s/SEARCHSTRING[0-9]*[0-9]/&#/g;:a {s/0#/1/g;s/1#/2/g;s/2#/3/g;s/3#/4/g;s/4#/5/g;s/5#/6/g;s/6#/7/g;s/7#/8/g;s/8#/9/g;s/9#/#0/g;t a};s/#/1/g" numbers.txt
This perl command will search all files in current directory (without traverse it, you will need File::Find module or similar for that more complex task) and will increment the number of a line that matches cache_version=. It uses the /e flag of the regular expression that evaluates the replacement part.
perl -i.bak -lpe 'BEGIN { sub inc { my ($num) = #_; ++$num } } s/(cache_version=)(\d+)/$1 . (inc($2))/eg' *
I tested it with file in current directory with following data:
hello
cache_version=3
bye
It backups original file (ls -1):
file
file.bak
And file now with:
hello
cache_version=4
bye
I hope it can be useful for what you are looking for.
UPDATE to use File::Find for traversing directories. It accepts * as argument but will discard them with those found with File::Find. The directory to begin the search is the current of execution of the script. It is hardcoded in the line find( \&wanted, "." ).
perl -MFile::Find -i.bak -lpe '
BEGIN {
sub inc {
my ($num) = #_;
++$num
}
sub wanted {
if ( -f && ! -l ) {
push #ARGV, $File::Find::name;
}
}
#ARGV = ();
find( \&wanted, "." );
}
s/(cache_version=)(\d+)/$1 . (inc($2))/eg
' *
This is ugly (I'm a little rusty), but here's a start using sed:
orig="something1" ;
text=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\1/"` ;
num=`echo $orig | sed "s/\([^0-9]*\)\([0-9]*\)/\2/"` ;
echo $text$(($num + 1))
With an original filename ($orig) of "something1", sed splits off the text and numeric portions into $text and $num, then these are combined in the final section with an incremented number, resulting in something2.
Just a start since it doesn't consider cases with numbers within the file name or names with no number at the end, but hopefully helps with your original goal of using sed.
This can actually be simplified within sed by using buffers, I believe (sed can operate recursively), but I'm really rusty with that aspect of it.
perl -pi -e 's/(\?cache_version=)(\d+)/$1.($2+1)/ge' FILE [FILE...]
or for a complete solution:
find . -type f | xargs perl -pi -e 's/(\?cache_version=)(\d+)/$1.($2+1)/ge'
perl substitution operator
/e modifier evaluates the replacement as if it were a Perl statement, using its return value as the replacement text.
. operator concatenates strings in Perl. The parentheses ensures that the arithmetic operation $2+1 takes precedence over concatenation.
/g modifier applies substitution to all matched strings within line
perl options
-p ensures that perl will execute the command on every line of each file
-i ensures that each file will be edited inplace
-e specifies the perl command(s) that are executed (in this case, the substitution operation)

Need help interpreting this sed command

I'm looking through an Oracle script I found online, but it runs a sed command to filter results from a trace file. I'm running Oracle on a Windows server, so the sed command isn't recognized.
host sed -n '/scattered/s/.*p3=//p' &Trace_Name | sort -n | tail -1
I've tried reading the online documentation, but am still not sure how to interpret what this command is trying to filter. Would anyone be so kind as to help me interpret what this command is trying to filter? Or better yet, what I can run from a Windows command prompt to achieve the same result.
Thanks!
It says "on lines that contain 'scattered' replace zero or more of any character followed by 'p3=' with nothing (delete it, in other words) and print the result" (-n says don't print lines unless there's an explicit print command).
For this example input:
abc organized p3=123
def scattered p3=456
ghi ordered p3=789
The output would be:
456
The sed command is searching a file for strings that match a pattern. The "-n" option will suppress output that is not explicitly printed. The "p" at the end says to print the lines that match the preceding pattern.
sort -n does a numerical sort.
tail -1 prints the last line, only.
So, it seems to be searching for scattered disk reads and printing the line with the biggest value.
I think the regular expression pattern is eliminating everything up to and including "p3=". The s/from/to/" is a substitute command.
The sed command works with cygwin, a unix-like shell for Windows.
To address the other part of your question, the unxutils project ports many GNU utilities to Win32, including sed. Find out more.

Resources