Find all files with a filename beginning with a specified string? - bash

I have a directory with roughly 100000 files in it, and I want to perform some function on all files beginning with a specified string, which may match tens of thousands of files.
I have tried
ls mystring*
but this returns the bash error 'Too many arguments'. My next plan was to use
find ./mystring* -type f
but this has the same issue.
The code needs to look something like
for FILE in `find ./mystring* -type f`
do
#Some function on the file
done

Use find with a wildcard:
find . -name 'mystring*'
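If you then need to run a function on each match, as in the loop in the question, one robust sketch is to feed find's output into a while read loop; note that -print0 and read -d '' are GNU/BSD and bash features rather than POSIX, and the wc -l call below is just a stand-in for whatever function you need:
find . -name 'mystring*' -type f -print0 |
while IFS= read -r -d '' FILE
do
    # some function on "$FILE", e.g. count its lines
    wc -l "$FILE"
done
This avoids the word-splitting problems of looping over backtick output when file names contain spaces.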

ls | grep "^abc"
will give you all files beginning with the string abc (which is what the OP specifically required).
It operates only on the current directory, whereas find recurses into subfolders.
To make find match only files starting with your string, try
find . -name 'abc*'
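If you only want matches from the current directory, mimicking ls rather than recursing, a sketch using -maxdepth (a GNU/BSD extension, not required by POSIX) is:
find . -maxdepth 1 -type f -name 'abc*'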

If you want to restrict your search to files only, consider using -type f in your search.
You can also use -iname for a case-insensitive search.
Example:
find /path -iname 'yourstring*' -type f
You can also perform operations on the results directly, without a pipe or xargs.
Example:
Search for files and show their size in MB
find /path -iname 'yourstring*' -type f -exec du -sm {} \;
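If there are many matches, running du once per file can be slow; assuming your find supports -exec ... {} + (POSIX requires it, though some very old versions lack it), a sketch that batches the arguments is:
find /path -iname 'yourstring*' -type f -exec du -sm {} +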

Related

Script to find recursively the number of files with a certain extension

We have a highly nested directory structure, where we have a directory, let's call it 'my Dir', appearing many times in our hierarchy. I am interested in counting the number of "*.csv" files in all directories named 'my Dir' (yes, there is a whitespace in the name). How can I go about it?
I tried something like this, but it does not work:
find . -type d -name "my Dir" -exec ls "{}/*.csv" \; | wc -l
If you want to count the number of files matching the pattern '*.csv' under "my Dir", then:
don't ask for -type d; ask for -type f
don't ask for -name "my Dir" if you really want -name '*.csv'
don't try to ls "{}/*.csv" on each match: find runs ls directly without a shell, so the * is never expanded and ls looks for a file literally named *.csv; counting lines of ls output is unreliable in any case
also beware of embedding {} in -exec code!
For counting files from find, I like to use a trick I learned from Stéphane Chazelas on Unix & Linux Stack Exchange; see, for example, Counting files in Linux:
find "my Dir" -type f -name '*.csv' -printf . | wc -c
This requires GNU find, as -printf is a GNU extension to the POSIX standard.
It works by looking within "my Dir" (from the current working directory) for files that match the pattern; for each matching file, it prints a single dot (period); that's all piped to wc, which counts the number of characters (periods) that find produced -- the number of matching files.
You would exclude all paths that are not my Dir:
find . -type f -not '(' -not -path '*/my Dir/*' -prune ')' -name '*.csv'
Another solution is to use the -path predicate to select your files.
find . -path '*/my Dir/*.csv'
Counting the number of occurrences could be a simple matter of piping to wc -l, though this will obviously produce the wrong result if some of the files contain newlines in their names. (This is slightly pathological, but definitely something you want to cover in production code.) A common arrangement is to just print a newline for every found file, instead of its name.
find . -path '*/my Dir/*.csv' -printf '.\n' | wc -l
(The -printf predicate is not in POSIX but it's not hard to replace with an -exec or similar.)
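For example, a more portable sketch of the same count (slower, because it runs printf once per matching file):
find . -path '*/my Dir/*.csv' -exec printf '.\n' \; | wc -l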

Bash Get all files containing file extension, recursively

Using Bash, how would I get all file names (not paths) of files containing ".cpp", given a root folder to check recursively?
Just use find:
find /root/folder/to/check -name '*.cpp' -printf "%P\n"
For that purpose you can use the -printf option of the find command with the following format directive:
%f File's name with any leading directories removed (only the last element).
so the full command may look like this:
find / -type f -name "*.cpp" -printf "%f\n"
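If your find lacks -printf (it is a GNU extension; BSD/macOS find does not have it), a portable sketch is to strip the leading directories per file with basename instead:
find /root/folder/to/check -type f -name '*.cpp' -exec basename {} \;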

Finding all PHP files within a certain directory containing a string

I'm wondering if someone can help me out.
I'm currently using the following to find all PHP files in a certain directory:
find /home/mywebsite -type f -name "*.php"
How would I extend that to search through those PHP files and get all files containing the string base64_decode?
Any help would be great.
Cheers,
find /home/mywebsite -type f -name '*.php' -exec grep -l base64_decode {} +
The -exec option to find executes a command on the files found. {} is replaced by the filenames, and the + means that find appends as many filenames as fit to each invocation rather than running the command once per file. grep looks for a string in each file, and the -l option tells it to print just the filename when there's a match, not all the matching lines.
If you're getting an error from find, you may have an old version that doesn't support the + feature of -exec. Use this command instead:
find /home/mywebsite -type f -name '*.php' | xargs grep -l base64_decode
xargs reads file names from its standard input and appends them as arguments to the command given on its own command line.
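If you have GNU grep, you can also skip find entirely and let grep do the recursion and file-name filtering itself (-r and --include are GNU extensions):
grep -rl --include='*.php' base64_decode /home/mywebsite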

Exclude specified directory when using `find` command

I have a directory which contains a number of files (no subdirectories). I wish to find these files. The following gets me close:
$ find docs
docs
docs/bar.txt
docs/baz.txt
docs/foo.txt
I don't want the directory itself to be listed. I could do this instead:
$ find docs -type f
docs/bar.txt
docs/baz.txt
docs/foo.txt
Using a wildcard seems to do the trick as well:
$ find docs/*
docs/bar.txt
docs/baz.txt
docs/foo.txt
My understanding is that these work in different ways: with -type, we're providing a single path to find, whereas in the latter case we're using wildcard expansion to pass several paths to find. Is there a reason to favour one approach over the other?
You have a UNIX tag, and your example has a *. Some versions of find have a problem with that.
That works if the directory has no subdirectories.
FYI: generally, the first arguments to find have to be a directory or a list of directories:
find /dir1 /dir2 -print
find is recursive, so it will follow each directory down, listing everything: symlinks, directories, pipes, and regular files. This can be confusing. -type restricts your search:
find /dir1 /dir2 -type f -print
You can also have find act on what it finds; for example, have it rm files older than 30 days:
find /dir1 /dir2 -type f -mtime +30 -exec rm {} \;
Or print complete information:
find /dir1 /dir2 -type f -mtime +30 -exec ls -l {} \;
find /dir1 /dir2 -type f -mtime +30 -ls # works on some systems
To answer your question: because find can be dangerous, ALWAYS fully specify each directory, file type, etc., when you are using a nasty command like rm. You might have forgotten your favorite directory is also in there, or the one used to generate your paycheck. Using a wildcard is OK for just looking around.
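A cautious sketch of that advice: preview the matches with -print first, then swap in the destructive action only once the list looks right:
# dry run: see what would be removed
find /dir1 /dir2 -type f -mtime +30 -print
# then actually delete
find /dir1 /dir2 -type f -mtime +30 -exec rm {} \;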
Using *
find /path/to/files -type f -name 'foo*'
-- ticks or quotes are needed around strings with a star in them on some UNIX systems, so the shell does not expand the pattern before find sees it.
find docs -type f
will get you a listing of every non-directory file in docs and in every subdirectory of docs
find docs/*
will get you a listing of every file and every subdirectory of docs (the shell expands the wildcard, so hidden entries at the top level are skipped, and a very large directory can overflow the argument list)
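If the goal is only to omit the starting directory itself while keeping everything else, two more sketches: -mindepth is a GNU/BSD extension, while the -path test is portable:
find docs -mindepth 1
find docs ! -path docs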

Grepping from a text file list

I know I can find specific types of files and then grep them in one shot, i.e.
find . -type f -name "*.log" -exec grep -o "some-pattern" {} \;
But I need to do this in two steps. This is because the find operation is expensive (there are lots of files and subdirectories to search). I'd like to save down the file-list to a text file, and then repeatedly grep for different patterns on this precomputed set of files whenever I need to. The first part is easy:
find . -type f -name "*.log" > my-file-list.txt
Now I have a file that looks like this:
./logs/log1.log
./logs/log2.log
etc
What does the grep look like? I've tried a few combinations but can't get it right.
xargs grep "your pattern" < my-file-list.txt
