How to find all files containing a given file? - bash

I want to find all files in a given directory that contain the entire content of a given file (not necessarily a text file).
I tried to achieve this with find and grep but wasn't successful, because the content spans several lines and grep matches line by line.

"SearchBin" is a simple python script that can check if one file exists inside some others: https://github.com/Sepero/SearchBin
find . -type f -print0 | xargs -0 searchbin.py -f needle
where needle is the file whose contents are to be found.
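If installing a script is not an option, the same check can be approximated in plain shell by comparing hex dumps. contains_file below is a hypothetical sketch (not part of SearchBin); note that a match starting at an odd hex-digit offset can produce a rare false positive:

```shell
# Sketch: does the byte sequence of file $1 occur anywhere inside file $2?
# Both files are converted to one long hex string with POSIX od, then the
# needle's hex is searched for as a fixed string.
contains_file() {
  needle_hex=$(od -An -v -tx1 -- "$1" | tr -d ' \n')
  od -An -v -tx1 -- "$2" | tr -d ' \n' | grep -qF -- "$needle_hex"
}

# Example: print every regular file in dir/ that contains needle
# for f in dir/*; do [ -f "$f" ] && contains_file needle "$f" && printf '%s\n' "$f"; done
```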

Related

Given a text file with file names, how can I find files in subdirectories of the current directory?

I have a bunch of files with different names in different subdirectories. I created a txt file with those names, but I cannot make find work using that file. I have seen posts on problems creating the list, and on not using find at all (though I do not understand the reason). Suggestions? It is difficult for me to come up with an example because I do not know how to reproduce the directory structure.
The following are the names of the files (just in case there is a formatting problem)
AO-169
AO-170
AO-171
The best that I came up with is:
cat ExtendedList.txt | xargs -I {} find . -name {}
It obviously dies in the first directory that it finds.
I also tried
ta="AO-169 AO-170 AO-171"
find . -name $ta
but it complains find: AO-170: unknown primary or operator
If you are trying to ask "how can I find files with any of these names in subdirectories of the current directory", the answer to that would look something like
xargs printf -- '-o\0-name\0%s\0' <ExtendedList.txt |
xargs -r0 find . -false
The -false is just a cute way to let the list of actual predicates start with "... or".
If the list of names in ExtendedList.txt is large, this could fail if the second xargs decides to break it up between -o and -name.
The option -0 is not portable, but should work e.g. on Linux or wherever you have GNU xargs.
If you can guarantee that the list of strings in ExtendedList.txt does not contain any characters which are problematic to the shell (like single quotes), you could simply say
sed "s/.*/-o -name '&'/" ExtendedList.txt |
xargs -r find . -false
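If that quoting guarantee cannot be made, the predicate list can instead be built in the shell itself, which sidesteps both the quoting problem and the xargs splitting caveat. A sketch (find_named is a hypothetical helper; -false is a GNU/BSD extension, as above):

```shell
# Sketch: run find with one -name predicate per line of a list file.
# The arguments are accumulated in the positional parameters, so no
# word-splitting or quoting problems can occur.
find_named() {  # usage: find_named <list-file> <search-dir>
  list=$1 dir=$2
  set --                                   # start with an empty argument list
  while IFS= read -r name; do
    [ -n "$name" ] && set -- "$@" -o -name "$name"
  done < "$list"
  find "$dir" -false "$@"
}
```

For example, find_named ExtendedList.txt . would search the current directory for every name listed in ExtendedList.txt.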

Find files given a pattern and permission bits

I am trying to figure out how to list files in a directory beginning with a certain letter and also have certain permissions.
I know how to list/find files beginning with a for example.
find /[directory-name] -type f -name 'a*'
I have also found how to list the permissions of files.
The problem I seem to have is that I don't know how to combine the two together.
My desired outcome would be something along the lines to display files beginning with a, and have the permissions 770.
It's a little crude, but maybe you can try something like this (tested on Debian):
find ./[directory-name] -type f -name 'a*'|xargs stat --format "%a %n"| grep "^770"
xargs feeds the results of the find command to stat, which prints the octal permission bits and the file name (stat supports many other format options). Finally, grep selects the lines that start with the interesting permissions. Note that this pipeline breaks on file names containing whitespace.
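That said, find can test permission bits itself with -perm, which avoids the extra processes and the pipeline's trouble with whitespace in file names. A sketch (find_a770 is a hypothetical wrapper name):

```shell
# A bare octal mode means "exactly these bits"; -perm -770 would instead
# mean "at least these bits are set".
find_a770() {  # usage: find_a770 <dir>
  find "$1" -type f -name 'a*' -perm 770
}
```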

How to move files based on file names in a.csv doc - macOS Terminal?

Terminal noob need a little help :)
I have a 98 row long filename list in a .csv file. For example:
name01; name03, etc.
I have an external hard drive with a lot of files in chaotic file
structure. BUT the file names are consistent, something like:
name01_xy; name01_zq; name02_xyz etc.
I would like to copy every file and directory from the external hard
drive which begins with the filename stored in the .csv file to my
computer.
So basically it's a search and copy based on a text file from an eHDD to my computer. I guess the easiest way to do is a Terminal command. Do you have any advice? Thanks in advance!
The task can be split into three: read search criteria from file; find files by criteria; copy found files. We discuss each one separately and combine them in a one-liner step-by-step:
Read search criteria from .csv file
Since your .csv file is pretty much just a text file with one criterion per line, it's pretty easy: just cat the file.
$ cat file.csv
bea001f001
bea003n001
bea007f005
bea008f006
bea009n003
Find files
We will use find. Example: you have a directory /Users/me/where/to/search and want to find all files in there whose names start with bea001f001:
$ find /Users/me/where/to/search -type f -name "bea001f001*"
If you want to find all files that end with bea001f001, move the star wildcard (zero-or-more) to the beginning of the search criterion:
$ find /Users/me/where/to/search -type f -name "*bea001f001"
Now you can already guess what the search criterion for all files containing the name bea001f001 would look like: "*bea001f001*".
We use -type f to tell find that we are interested only in finding files and not directories.
Combine reading and finding
We use xargs to pass each line of the file to find as a -name argument:
$ cat file.csv | xargs -I [] find /Users/me/where/to/search -type f -name "[]*"
/Users/me/where/to/search/bea001f001_xy
/Users/me/where/to/search/bea001f001_xyz
/Users/me/where/to/search/bea009n003_zq
Copy files
We use cp. It is pretty straightforward: cp file target copies file into target if target is a directory, and otherwise creates or overwrites a file named target.
Complete one-liner
We pass results from find to cp not by piping, but by using the -exec argument passed to find:
$ cat file.csv | xargs -I [] find /Users/me/where/to/search -type f -name "[]*" -exec cp {} /Users/me/where/to/copy \;
Sorry, this is my first post here. In response to the comments above, only the last file is selected most likely because every other line ends with a carriage return \r. If you first prepend the directory path to each filename in the csv, you can perform the copy with the following command, which strips the \r.
cp `tr -d '\r' < file.csv` /your/target/directory
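A more robust variant reads the csv line by line, strips any trailing \r, and lets find match each prefix. copy_listed is a hypothetical helper, and the paths in the usage comment are placeholders:

```shell
# Sketch: for each name in the csv, copy everything under the source
# directory whose name starts with that prefix into the destination.
copy_listed() {  # usage: copy_listed <csv> <source-dir> <dest-dir>
  cr=$(printf '\r')
  while IFS= read -r name || [ -n "$name" ]; do
    name=${name%"$cr"}            # drop a trailing carriage return, if any
    [ -n "$name" ] || continue
    find "$2" -name "${name}*" -exec cp -R {} "$3" \;
  done < "$1"
}
# e.g. copy_listed file.csv /Volumes/ExtDrive ~/Destination
```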

Find files in current directory, list differences from list within script

I am attempting to find differences for a directory and a list of files located in the bash script, for portability.
For example, search a directory with phpBB installed. Compare recursive directory listing to list of core installation files (excluding themes, uploads, etc). Display additional and missing files.
Thus far, I have attempted using diff, comm, and tr, and got "argument too long" errors. This is likely because, when handed the lists of file names, these commands attempt to compare the actual files rather than the lists themselves.
The file list in the script looks something like this (But I am willing to format differently):
./file.php
./file2.php
./dir/file.php
./dir/.file2.php
I am attempting to use one of the following to print the list:
find ./ -type f -printf "%P\n"
or
find ./ -type f -print
Then use any command you can think of to compare the results to the list of files inside the script.
The following are difficult to use, as there are often thousands of files to check, each version can change the listings, and it is a pain to update the whole script every time there is a new release.
find . ! -wholename './file.php' ! -wholename './file2.php'
find . ! -name './file.php' ! -name './file2.php'
find . ! -path './file.php' ! -path './file2.php'
With the lists being in different orders to accommodate any additional files, it can't be a straight comparison.
I'm just stumped. I greatly appreciate any advice or if I could be pointed in the right direction. Ask away for clarification!
You can use the -r option of diff command, to recursively compare the contents of the two directories. This way you don't need all the file names on the command line; just the two top level directory names.
It will give you missing files, newly added files, and the difference of changed files. Many things can be controlled by different options.
If you mean you have a list of expected files somewhere, and only one directory to be compared against it, then you can try using the tree command. The list can be first created using the tree command, and then at the time of comparison you can run the tree command again on the directory, and compare it with the stored "expected output" using the diff command.
Do you have to use coreutils? If so:
Put your list in a file, say list.txt, with one file path per line.
comm -23 <(find path/to/your/directory -type f | sort) \
<(sort path/to/list.txt) \
> diff.txt
diff.txt will have one line per file in path/to/your/directory that is not in your list.
If you care about files in your list that are not in path/to/your/directory, do comm -13 with the same parameters.
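Both directions can also be reported in one pass with comm -3. compare_tree below is a hypothetical wrapper; in its output, column 1 lists files present but not in the list, and column 2 (tab-indented) lists files in the list but not present:

```shell
# Sketch: compare a directory tree against a stored file list in one pass.
compare_tree() {  # usage: compare_tree <dir> <list-file>
  found=$(mktemp)
  find "$1" -type f | sort > "$found"
  sort "$2" | comm -3 "$found" -    # "-" reads the second input from stdin
  rm -f "$found"
}
```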
Otherwise, you can also use sd (stream diff), which requires neither sorting nor process substitution and supports infinite streams, like so:
find path/to/your/directory -type f | sd 'cat path/to/list.txt' > diff.txt
And just invert the streams to get the second requirement, like so:
cat path/to/list.txt | sd 'find path/to/your/directory -type f' > diff.txt
Probably not much of a benefit in this example other than succinctness, but still consider it; in some cases you won't be able to use comm, grep -F, or diff.
Here's a blogpost I wrote about diffing streams on the terminal, which introduces sd.

Bash script to traverse a directory

I have a directory with XML files and other directories. All other directories have XML files and subdirectories, etc.
I need to write a script (bash probably) that for each directory runs java XMLBeautifier directory and since my skills at bash scripting are a bit rubbish, I would really appreciate a bit of help.
If you have to get the directories, you can use:
$ find . -type d
just pipe this to your program like this:
$ find . -type d | xargs java XMLBeautifier
Another approach would be to get all the files with find and pipe that to your program like this:
$ find . -name "*.xml" | xargs java XMLBeautifier
This takes all .xml files from the current directory and recursively through all subdirectories, then hands them over to java XMLBeautifier via xargs. Note that xargs passes arguments in batches, not one by one; add -n 1 if the program accepts only a single file per invocation.
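If XMLBeautifier really must be invoked once per directory, as the question states, find's -exec does exactly that. A generic sketch (for_each_dir is a hypothetical helper; whether java XMLBeautifier accepts a directory argument is taken from the question, not verified):

```shell
# Sketch: run a command once per directory under the given root; the
# directory path is appended as the command's last argument.
for_each_dir() {  # usage: for_each_dir <root> <command> [args...]
  root=$1; shift
  find "$root" -type d -exec "$@" {} \;
}
# e.g. for_each_dir . java XMLBeautifier
```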
Find is an awesome tool ... however, if you are not sure of the file names but have a vague idea of what those xml files contain, then you can use grep.
For instance, if you know for sure that all your xml files contain a phrase "correct xml file" (change this phrase to whatever is appropriate), then run the following at your command line ...
grep -IRw "correct xml file" /path/to/directory/*
-I option ignores binary files
-R option searches your directory recursively
-w ensures the pattern matches only as a whole word, not as part of a larger word
(add -l if you want just the names of the matching files rather than the matching lines)
Hope this helps!
