How to sort files by mtime and pass to grep? [duplicate]

I'm trying to parse log files: extract some values from certain lines and write them to a file.
First, I get the list of files sorted by mtime.
find . -name 'log*' -printf '%TY%Tm%Td%TH%TM%TS %p\n' | sort | awk '{print $2}'
It works correctly and prints the list of files.
For example
./2015195/log/log.08
./2015486/log/log.10
./2015418/log/log.13
./2015415/log/log.14
./2015015/log/log.18
./2015715/log/log.19
./2015115/log/2015-09-10/log.21
...
Next, I pass through this list and print words from the lines matching a specific pattern:
grep 'pattern' $(find . -name 'log*' -printf '%TY%Tm%Td%TH%TM%TS %p\n' | sort | awk '{print $2}') | awk '{print $1" "$4}' > prsd.txt
It works, but it prepends the file name to every output line, like:
./2015195/log/log.08:02:01:09,811 12345ABCD
./2015195/log/log.08:02:02:01:09,975 12345CDEF
./2015195/log/log.08:12:02:02:01:09,978 12345EFGF
./2015195/log/log.08:02:02:01:10,223 12345LJIG
./2015195/log/log.08:02:01:10,275 12345IIUY
...
Here is the problem: how do I remove those prefixes?
Thanks in advance.

From man grep,
-h, --no-filename
Suppress the prefixing of file names on output. This is the
default when there is only one file (or only standard input)
to search.
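For example, adding -h to the pipeline from the question (a sketch under the same assumption the original makes, namely that none of the file names contain whitespace, since the command substitution is unquoted):
grep -h 'pattern' $(find . -name 'log*' -printf '%TY%Tm%Td%TH%TM%TS %p\n' | sort | awk '{print $2}') | awk '{print $1" "$4}' > prsd.txt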


How to get "Find" command to ONLY return files and no directories [duplicate]

I'm developing a little desktop application that lists the files in a given directory.
The table on the left gets compared to the table on the right. If anything on the right is missing, the text of the files on the left will turn red. Simple.
What I'm throwing at these tables is a path that contains multiple subfolders of images.
These are saved as a variable so I cannot explicitly declare "omit a directory with this name".
How do I get the find command to return ONLY the file names? No directory names at all.
Is this possible?
As of now I have:
"find -x "+ path + " | sed 's!.*/!!' | grep -v CaptureOne | grep -v .DS_Store | grep -v .cop | grep -v .cof | grep -v .cot | grep -v .cos | grep -v Settings131 | grep -v Proxies | grep -v Thumbnails | grep -v Cache | sort"
This does get me only the file names and not the full paths, which is what I want, and it filters out the extensions and folder names that I know will be there.
Like I said - I've never gone down this path and the above code could probably be done in a much easier way. Open to suggestions!
To limit yourself to files, use -type f (to exclude directories, you'd use ! -type d). To get just the basename, pipe the output through basename. All together, that'd be:
find . -type f -print0 | xargs -0 -n1 basename
(The -print0 and -0 options use NUL as the separator so that whitespace in pathnames is preserved; -n1 makes xargs run basename once per path, since plain basename treats a second argument as a suffix to strip.)
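A sketch combining -type f with the filters from the question (the path variable and the excluded names are the ones from the question, and the sed step is kept for stripping directories because BSD find on macOS has no -printf):
find -x "$path" -type f \
    ! -name '.DS_Store' ! -name '*.cop' ! -name '*.cof' ! -name '*.cot' ! -name '*.cos' \
    ! -path '*CaptureOne*' ! -path '*Settings131*' ! -path '*Proxies*' \
    ! -path '*Thumbnails*' ! -path '*Cache*' \
    | sed 's!.*/!!' | sort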

filename group by a pattern and select only one from each group

I have the following files (as an example; 60,000+ actually), and all the log files follow this pattern:
analyse-ABC008795-84865-201911261249.log
analyse-ABC008795-84866-201911261249.log
analyse-ABC008795-84867-201911261249.log
analyse-ABC008795-84868-201911261249.log
analyse-ABC008795-84869-201911261249.log
analyse-ABC008796-84870-201911261249.log
analyse-ABC008796-84871-201911261249.log
analyse-ABC008796-84872-201911261249.log
analyse-ABC008796-84873-201911261249.log
Only the numbers change between log files. I want to take one file from each category, where the files are categorized by the ABC... number. So, as you can see, there are only two categories here:
analyse-ABC008795
analyse-ABC008796
So, what I want is one file (let's say the first file) from each category. The output should look like this:
analyse-ABC008795-84865-201911261249.log
analyse-ABC008796-84870-201911261249.log
This should be done in a Bash/Linux environment, so that once I have this list I can use grep to check whether my "searching string" is contained in those files:
ls -l | <what should I do to group and get one file from each category> | grep "searching string"
With bash and awk:
files=(*.log)
printf '%s\n' "${files[@]}" | awk -F- '!seen[$2]++'
Or use find instead of a bash array for a more portable approach.
find . -type f -name '*.log' | awk -F- '!seen[$2]++'
If your find has the -printf option and you don't want the leading ./ on the file names, add this before the pipe |:
-printf '%f\n'
The !seen[$2]++ removes the second and subsequent occurrences of each key without having to sort the input first; $2 is the second field of each line as split by the -F- separator, i.e. the ABC... number.
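Putting it together with the grep step from the question (a sketch; "searching string" is the placeholder from the question, and grep -l just lists the files that contain it):
find . -type f -name '*.log' -printf '%f\n' | awk -F- '!seen[$2]++' |
while IFS= read -r f; do
    grep -l "searching string" "$f"
done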

Find all occurrences of a word in .txt files with GIT Bash on Windows [duplicate]

Trying to find all occurrences of a word in a range of different .txt files in a directory.
I'm looking for commands that will give me an exact count if possible.
So far I have tried:
$ grep -w 'string' *
and:
$ grep --include=\*.{txt} -rnw desktop/testfiles/ -e "string"
The first outputs its entire contents and the second doesn't seem to do anything.
Any ideas?
Simply:
grep -ow "string" *.txt | wc -l
(-o prints each match on its own line, so wc -l counts every occurrence rather than just matching lines.)
GNU awk also allows it to be done in a single command:
awk -v w="string" '$1==w{n++} END{print n}' RS=' |\n' *.txt
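If a per-file breakdown is also wanted, a sketch (assuming GNU grep, which Git Bash ships with):
grep -row --include='*.txt' "string" desktop/testfiles/ | cut -d: -f1 | sort | uniq -c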

SHELL printing just right part after . (DOT)

I need to find just the extensions of all the files in a directory (if two files have the same extension, it should be listed only once). I already have something, but the output of my script looks like this:
test.txt
test2.txt
hello.iso
bay.fds
hellllu.pdf
I'm using grep -e -e '.' and it just highlights the dots.
And I need just the extensions, in one variable, like txt,iso,fds,pdf.
Is there anyone who could help? I already had this working once, but that version used an array. Today I found out it has to work in dash too.
You can use find with awk to get all unique extensions:
find . -type f -name '?*.?*' -print0 |
awk -F. -v RS='\0' '!seen[$NF]++{print $NF}'
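To get the result into a single comma-separated variable, as the question asks, a sketch building on the command above (it still assumes GNU awk for RS='\0'):
ext=$(find . -type f -name '?*.?*' -print0 |
      awk -F. -v RS='\0' '!seen[$NF]++{print $NF}' | paste -sd,)
echo "$ext"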
It can be done with find as well, but I think this is easier:
for f in *.*; do echo "${f##*.}"; done | sort -u
If you want to assign a comma-separated list of the unique extensions to a variable, you can do this:
ext=$(for f in *.*; do echo "${f##*.}"; done | sort -u | paste -sd,)
echo $ext
csv,pdf,txt
Alternatively, with ls:
ls -1 *.* | rev | cut -d. -f1 | rev | sort -u | paste -sd,
The rev | ... | rev is required if you have more than one dot in a filename, assuming the extension is whatever follows the last dot. For any other directory, simply change the *.* part to dirpath/*.* in all of the scripts.
I'm not sure I understand your comment. If you don't assign the result to a variable, it is printed to standard output by default. If you want to pass the directory name to a script, put the code into a script file and replace dirpath with $1, assuming the directory will be the script's first argument:
#!/bin/bash
# print unique extension in the directory passed as an argument, i.e.
ls -1 "$1"/*.* ...
If you have subdirectories whose names contain dots, the scripts above include them as well; to limit the results to regular files only, replace the ls ... with:
find . -maxdepth 1 -type f -name "*.*" | ...
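For example, the script body could then look like this (a sketch reusing the rev/cut/paste pipeline from above, with $1 as the directory argument and #!/bin/sh since the question mentions dash):
#!/bin/sh
# print the unique extensions of the regular files in the directory given as $1
ext=$(find "$1" -maxdepth 1 -type f -name "*.*" |
      rev | cut -d. -f1 | rev | sort -u | paste -sd,)
echo "$ext"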

How to sort the results of find (including nested directories) alphabetically in bash

I have a list of directories based on the results of running the find command in bash. As an example, the results of find are the files:
test/a/file
test/b/file
test/file
test/z/file
I want to sort the output so it appears as:
test/file
test/a/file
test/b/file
test/z/file
Is there any way to sort the results within the find command, or by piping the results into sort?
If you have the GNU version of find, try this:
find test -type f -printf '%h\0%d\0%p\n' | sort -t '\0' -n | awk -F '\0' '{print $3}'
To use these file names in a loop, do
find test -type f -printf '%h\0%d\0%p\n' | sort -t '\0' -n | awk -F '\0' '{print $3}' | while read file; do
# use $file
done
The find command prints three things for each file: (1) its directory, (2) its depth in the directory tree, and (3) its full name. By including the depth in the output we can use sort -n to sort test/file above test/a/file. Finally we use awk to strip out the first two columns since they were only used for sorting.
Using \0 as a separator between the three fields allows us to handle file names with spaces and tabs in them (but not newlines, unfortunately).
$ find test -type f
test/b/file
test/a/file
test/file
test/z/file
$ find test -type f -printf '%h\0%d\0%p\n' | sort -t '\0' -n | awk -F'\0' '{print $3}'
test/file
test/a/file
test/b/file
test/z/file
If you are unable to modify the find command, then try this convoluted replacement:
find test -type f | while read file; do
printf '%s\0%s\0%s\n' "${file%/*}" "$(tr -dc / <<< "$file")" "$file"
done | sort -t '\0' | awk -F'\0' '{print $3}'
It does the same thing, with ${file%/*} being used to get a file's directory name and the tr command being used to count the number of slashes, which is equivalent to a file's "depth".
(I sure hope there's an easier answer out there. What you're asking doesn't seem that hard, but I am blanking on a simple solution.)
find test -type f -printf '%h\0%p\n' | sort | awk -F'\0' '{print $2}'
The result of find is, for example,
test/a'\0'test/a/file
test'\0'test/file
test/z'\0'test/z/file
test/b'\0'test/b/text file.txt
test/b'\0'test/b/file
where '\0' stands for null character.
These compound strings can be properly sorted with a simple sort:
test'\0'test/file
test/a'\0'test/a/file
test/b'\0'test/b/file
test/b'\0'test/b/text file.txt
test/z'\0'test/z/file
And the final result is
test/file
test/a/file
test/b/file
test/b/text file.txt
test/z/file
(Based on John Kugelman's answer, with the "depth" element removed, since it is redundant.)
If you want to sort alphabetically, the best way is:
find test -print0 | sort -z
(The example in the original question actually wanted files before directories, which is not the same thing and requires extra steps.)
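To consume that NUL-separated output safely in a loop, a sketch (bash-specific because of read -d ''):
find test -print0 | sort -z | while IFS= read -r -d '' path; do
    printf '%s\n' "$path"    # use "$path" here; names with spaces survive intact
done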
Try this. For reference, it first sorts on the second character of the second field, which only exists for the test/file entry (the other entries have a single-letter directory there), and the r reverses that key so it comes first; after that it sorts on the first character of the second field. (-t is the field delimiter, -k is the key.)
find test -name file | sort -t'/' -k2.2r -k2.1
Run info sort for more details; there are a ton of different ways to use -t and -k together to get different results.
