How to get "Find" command to ONLY return files and no directories [duplicate] - macos

This question already has answers here:
How to list only files and not directories of a directory Bash?
(12 answers)
Closed 2 years ago.
I'm developing a little desktop application that lists the files in a given directory.
The table on the left gets compared to the table on the right. If anything on the right is missing, the text of the files on the left will turn red. Simple.
What I'm throwing at these tables is a path that contains multiple subfolders of images.
These are saved as a variable so I cannot explicitly declare "omit a directory with this name".
Using this as an example:
I get this output:
And I'd like to get this output:
How do I get the find command to return ONLY the file names? No directory names at all.
Is this possible?
As of now I have:
"find -x "+ path + " | sed 's!.*/!!' | grep -v CaptureOne | grep -v .DS_Store | grep -v .cop | grep -v .cof | grep -v .cot | grep -v .cos | grep -v Settings131 | grep -v Proxies | grep -v Thumbnails | grep -v Cache | sort"
This does get me only the file names and not the full paths. Which I want.
And it doesn't include other file extensions and folders that I know will exist.
Like I said - I've never gone down this path and the above code could probably be done in a much easier way. Open to suggestions!

To limit yourself to files, use -type f (to exclude directories, you'd use ! -type d). To get just the basename, pipe the output through basename. All together, that'd be:
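find . -type f -print0 | xargs -0 -n 1 basename
(The -print0 and -0 options use NUL as the separator so that whitespace in pathnames is preserved. The -n 1 runs basename once per file; given several arguments at once, basename would treat the second one as a suffix to strip, or fail outright.)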
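The chain of grep -v filters in the question can also be folded into find itself. The following is only a sketch under some assumptions: the function name list_file_names is made up for illustration, and pruning whole directories such as Cache or Proxies (rather than only hiding names that contain those words) is a guess at what the asker wants.

```shell
#!/bin/sh
# list_file_names DIR
# Print the base name of every regular file under DIR, sorted.
# Assumption: the directories named below should be skipped entirely,
# and the listed file names/extensions should be left out.
list_file_names() {
    find "$1" \
        -type d \( -name CaptureOne -o -name Settings131 -o -name Proxies \
                   -o -name Thumbnails -o -name Cache \) -prune \
        -o -type f ! -name .DS_Store ! -name '*.cop' ! -name '*.cof' \
           ! -name '*.cot' ! -name '*.cos' -exec basename {} \; | sort
}
```

Called as list_file_names "$path", this keeps the path in a variable, as in the question, and needs no sed or grep stage.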

Related

How to sort files based on filename length with subdirectories?

I am trying to look at a directory named Forever, which has sub-directories Pure and Mineral that are filled with .csv files. I was able to list all the .csv files in the directory, but I am having a hard time sorting them by the length of the filename.
As for current directory, I am at Forever. So I am looking at both sub-directories Pure and Mineral.
What I did was:
find -name ".*csv" | tr ' ' '_' | sort -n -r
This just sorts the files in reverse alphabetical order, without considering the length. (I had to replace the spaces in some of the file names with underscores.)
I think this answer is more helpful than the marked duplicate because it also accounts for sub-dirs (which the dupe didn't):
find . -name '*.csv' -exec bash -c 'for f; do b=${f##*/}; printf "%d\t%s\n" "${#b}" "$f"; done' _ {} + | sort -nr | cut -f2-
FWIW using fd -e csv -x ... was quite a bit faster for me (0.153s vs find's 2.084s)
The counted length includes the .csv extension, but that doesn't matter, since find ensures that every file has it.
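For reuse, the same idea can be wrapped in a small function. This is a sketch in plain POSIX sh; the name sort_by_name_length is hypothetical, and it assumes no file name contains a newline.

```shell
#!/bin/sh
# sort_by_name_length DIR
# List the .csv files under DIR, longest file name first.
# Only the last path component is measured, not the whole path.
sort_by_name_length() {
    find "$1" -type f -name '*.csv' -exec sh -c '
        for path in "$@"; do
            name=${path##*/}                      # strip the directory part
            printf "%d\t%s\n" "${#name}" "$path"  # length, TAB, full path
        done' sh {} + |
        sort -nr |   # numeric sort on the leading length, largest first
        cut -f2-     # drop the length column again
}
```

Passing the paths as arguments to sh (instead of pasting {} into the script text) keeps names with spaces or quotes intact.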

How to grep files in date order

I can list the Python files in a directory from most recently updated to least recently updated with
ls -lt *.py
But how can I grep those files in that order?
I understand one should never try to parse the output of ls as that is a very dangerous thing to do.
You may use this pipeline to achieve this with GNU utilities:
find . -maxdepth 1 -name '*.py' -printf '%T@:%p\0' |
sort -z -t : -rnk1 |
cut -z -d : -f2- |
xargs -0 grep 'pattern'
This will handle filenames with special characters such as space, newline, glob etc.
find finds all *.py files in current directory and prints modification time (epoch value) + : + filename + NUL byte
sort command performs reverse numeric sort on first column that is timestamp
cut command removes 1st column (timestamp) from output
xargs -0 grep command searches pattern in each file
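The four steps above can be bundled into a function. This sketch assumes GNU find, sort, cut and xargs, as the answer does; the name grep_newest_first and its argument order are made up for illustration.

```shell
#!/bin/sh
# grep_newest_first PATTERN [DIR]
# Search the *.py files directly inside DIR (default: .) for PATTERN,
# reporting matches from the most recently modified file first.
# Requires GNU utilities (-printf, sort -z, cut -z, xargs -r).
grep_newest_first() {
    pattern=$1
    dir=${2:-.}
    find "$dir" -maxdepth 1 -type f -name '*.py' -printf '%T@:%p\0' |
        sort -z -t : -k1,1rn |   # newest (largest epoch time) first
        cut -z -d : -f2- |       # strip the timestamp column
        xargs -0 -r grep -H -e "$pattern"
}
```

The -H keeps the file name in the output even when only one file is left, and -r stops xargs from running grep when nothing was found.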
There is a very simple way to get the list of files that contain the pattern in chronological order:
grep -sil <searchpattern> <files-to-grep> | xargs ls -ltr
That is, you grep for e.g. "hello world" in *.txt; -sil makes grep case-insensitive (-i), suppresses error messages about unreadable files (-s), and lists only the names of matching files (-l). You then pass these to ls (| xargs), which sorts by modification time (-t), shows the date (-l), and reverses the order so the newest file comes last (-r).

Grep to Print all file content [duplicate]

This question already has answers here:
Colorized grep -- viewing the entire file with highlighted matches
(24 answers)
Closed 4 years ago.
How can I make grep print the full file if it contains a match for the pattern, instead of printing just the matching line?
I tried using (say) grep -C2 to print two lines above and two below, but this doesn't always work, as the number of lines is not fixed.
I am not searching just a single file; I am searching an entire directory where some files may contain the given pattern, and I want those files to be printed completely.
I am also running a second grep on the result of the first one, without printing the first grep's output.
Simple grep + cat combination:
grep -q 'pattern' file && cat file
Use grep's -l option to list the paths of files with matching contents, then print the contents of these files using cat.
grep -lR 'regex' 'directory' | xargs -d '\n' cat
The command from above cannot handle filenames with newlines in them.
To overcome the filename with newlines issue and also allow more sophisticated checks you can use the find command.
The following command prints the content of all regular files in directory.
find 'directory' -type f -exec cat {} +
To print only the content of files whose content matches the regexes regex1 and regex2, use
find 'directory' -type f \
-exec grep -q 'regex1' {} \; -and \
-exec grep -q 'regex2' {} \; \
-exec cat {} +
The linebreaks are only for better readability. Without the \ you can write everything into one line.
Note the -q for grep. That option suppresses grep's output. grep's exit status tells find whether to print a file or not.
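When several files match, their contents run together in the output. A possible variation, sketched here with a hypothetical function name print_matching_files, prints a header line before each file; the header format is my own choice and not part of the answer above.

```shell
#!/bin/sh
# print_matching_files REGEX [DIR]
# Print the full content of every regular file under DIR (default: .)
# that contains a match for REGEX, each preceded by a header line.
print_matching_files() {
    regex=$1
    dir=${2:-.}
    # The first -exec acts as a test: find only continues to the second
    # -exec for files where grep -q exits with status 0 (a match).
    find "$dir" -type f -exec grep -q -e "$regex" {} \; \
        -exec sh -c 'printf "==> %s <==\n" "$1"; cat "$1"' sh {} \;
}
```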

How to print the number of files in a folder (recursively) separated by extension?

For example, I have a folder containing files of different types (.jpg, .png, .txt, ...) and would like to know how many files of each extension there are in my folder.
The output would be something like this:
.jpg : 255
.png : 123
.txt : 12
No extension : 1
For now, I only know how to find how many files exist for one given extension using this command:
find /folderpath -type f -name '*.jpg' | wc -l
However, I would like it to discover the file extensions by itself.
Thanks for your help.
You can do this for a single directory with:
ls | grep '\.' | sed 's/.*\././' | sort | uniq -c
(I'm ignoring files with no . - tweak if you want something else)
I'd suggest fleshing this out into a script (say, extension_counts) that takes a list of directories, and for each one outputs the path followed by the report in the format you wish.
Quick and dirty version:
#!/bin/sh
# Print each directory given as an argument, followed by its extension counts.
for dir in "$@"; do
    echo "$dir"
    (cd "$dir" && ls | grep '\.' | sed 's/.*\././' | sort | uniq -c)
done
... but you should consider hardening this.
Then for the recursive part, you can use find and xargs:
find . -type d | xargs extension_counts
You could be a bit smarter and do it all in one script file by defining extension_counts as a function, but that's an optimisation.
There are some pitfalls to parsing the output of ls (or find). In this case the only potential issue I can think of is filenames containing a newline (yes, this is possible). You could just accept that you're using a tool not designed for weird filenames, or you could write something more robust in a language with firmer data structures, such as Python, Perl, Ruby, Go, etc.
This could be done with a quick awk one liner as well:
find /folderpath -type f -name '*.*' | awk -F"." 'BEGIN{OFS=" : "}{extensions[$NF]++}END{for (ext in extensions) { print ext, extensions[ext]}};'
That awk script splits each line on periods: -F"."
It sets OFS (the output field separator) to " : ": BEGIN{OFS=" : "}
It fills an array keyed by file extension, extensions[$NF], where $NF is the last field in the record. The value stored is a count, incremented with ++.
When all the lines are processed, we iterate over the array with for (ext in extensions) and print each index and value: {print ext, extensions[ext]}
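A variant of the awk approach that also produces the "No extension" row from the question might look like the sketch below. The function name count_extensions is hypothetical, and two choices are my own assumptions: names that only start with a dot (such as .bashrc) count as having no extension, and file names are assumed not to contain newlines.

```shell
#!/bin/sh
# count_extensions DIR
# Count the regular files under DIR by extension, recursively, and
# print lines of the form ".jpg : 255" plus a "No extension" row.
count_extensions() {
    find "$1" -type f | awk -F/ '
        {
            name = $NF   # last path component, so dots in directories are ignored
            # An extension is a final ".xyz" that is not at the very start of the name.
            ext = (match(name, /\.[^.]+$/) && RSTART > 1) ? substr(name, RSTART) : "No extension"
            count[ext]++
        }
        END { for (ext in count) print ext " : " count[ext] }
    ' | LC_ALL=C sort
}
```

Splitting on / rather than on . means the extension is taken from the file name only, and the final sort just makes the order stable, since awk's for-in order is unspecified.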
I would proceed this way:
List the file names (rather than the paths produced by find):
find . -type f | rev | cut -d/ -f1 | rev
We reverse each line so that we can easily address the last field.
Reduce them to their extension:
sed -E 's/^.*\././;t end;s/.*/No extension/;:end'
Here we remove everything up to the last dot (the .* is greedy), or, if the substitution could not be done (because there was no dot), we replace everything with "No extension".
Sort the result:
sort
Group by extension and add the count:
uniq -c
The complete command is as follows:
find . -type f | rev | cut -d/ -f1 | rev | sed -E 's/^.*\././;t end;s/.*/No extension/;:end' | sort | uniq -c
Note that the presentation differs from yours, which could be easily fixed with an additional sed :
2 .119
1 .147
[...]
1 .Xauthority
1 .xml
1 .xsession-errors
2 .zip
1 .zshrc
48 No extension
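The additional sed mentioned above could look something like this. It is a sketch that assumes the usual uniq -c layout (optional leading spaces, the count, one space, then the text); the function name format_counts is made up for illustration.

```shell
#!/bin/sh
# format_counts
# Read "uniq -c" output on stdin and rewrite each line from
# "  COUNT TEXT" to "TEXT : COUNT".
# Example: ... | sort | uniq -c | format_counts
format_counts() {
    sed -E 's/^ *([0-9]+) (.*)$/\2 : \1/'
}
```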

SHELL printing just right part after . (DOT)

I need to find just the extensions of all files in a directory (if two files share an extension, it should be listed only once). I already have the file list, but the output of my script looks like this:
test.txt
test2.txt
hello.iso
bay.fds
hellllu.pdf
I'm using grep -e -e '.' and it just highlights the dots.
I need just these extensions in one variable, like txt,iso,fds,pdf
Can anyone help? I already had it working once, but with an array. Today I found out that it has to work on dash too.
You can use find with awk to get all unique extensions:
find . -type f -name '?*.?*' -print0 |
awk -F. -v RS='\0' '!seen[$NF]++{print $NF}'
This can be done with find as well, but I think this is easier:
for f in *.*; do echo "${f##*.}"; done | sort -u
If you want to assign a comma-separated list of the unique extensions to a variable, you can do this:
ext=$(for f in *.*; do echo "${f##*.}"; done | sort -u | paste -sd, -)
echo "$ext"
csv,pdf,txt
Alternatively, with ls:
ls -1 *.* | rev | cut -d. -f1 | rev | sort -u | paste -sd, -
rev/rev is required if you have more than one dot in the filename, assuming the extension is after the last dot. For any other directory simply change the part *.* to dirpath/*.* in all scripts.
I'm not sure I understand your comment. If you don't assign to a variable, by default it will print to the output. If you want to pass directory name as a variable to a script, put the code into a script file and replace dirpath with $1, assuming that will be your first argument to the script
#!/bin/bash
# print unique extension in the directory passed as an argument, i.e.
ls -1 "$1"/*.* ...
If you have subdirectories whose names contain a dot, the scripts above include them as well. To limit the results to regular files, replace ls ... with
find . -maxdepth 1 -type f -name "*.*" | ...
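Since the question asks for something that also runs under dash, here is a sketch that uses only POSIX sh features and skips directories without needing find. The function name unique_extensions is hypothetical, and file names are assumed not to contain newlines.

```shell
#!/bin/sh
# unique_extensions DIR
# Print the unique extensions of the regular files directly inside DIR
# as one comma-separated line, e.g. "fds,iso,pdf,txt".
unique_extensions() {
    for f in "$1"/*.*; do
        # Skip directories, and the literal pattern if nothing matched.
        [ -f "$f" ] || continue
        printf '%s\n' "${f##*.}"   # keep only what follows the last dot
    done | LC_ALL=C sort -u | paste -sd, -
}
```

The result can be captured in a variable with ext=$(unique_extensions "$dir").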

Resources