Enable wildcards to behave recursively [duplicate] - bash

I'd like to compute statistics of the words from all the .txt files in the current directory and its subdirectories.
In [39]: ls
about.txt distutils/ installing/ whatsnew/
bugs.txt extending/ library/ word.txt
c-api/ faq/ license.txt words_frequency.txt
contents.txt glossary.txt reference/
copyright.txt howto/ tutorial/
distributing/ install/ using
I first tried the command:
In [46]: !grep -Eoh '[a-zA-Z]+' *.txt | nl
The problem is that files in the subdirectories were not found:
In [45]: !echo *.txt
about.txt bugs.txt contents.txt copyright.txt glossary.txt license.txt word.txt words_frequency.txt
I tried to improve it:
In [48]: ! echo */*.txt | grep "about.txt"
In [49]:
Again a problem: this misses the files in the top-level directory and cannot traverse subdirectories of arbitrary depth.
It's interesting that Python has a solution to this problem:
In [50]: files = glob.glob("**/*.txt", recursive=True)
In [54]: files.index('about.txt')
Out[54]: 4
It traverses directories recursively and finds all the .txt files.
However, Python is cumbersome for moving files around and munging text data compared with grep "pattern" *.txt.
How can I make the shell's wildcards behave recursively?
As an alternative, the find command helps:
find . -name '*.txt' -exec grep -Eoh '[a-zA-Z]+' {} \; | nl
which is not as handy as a recursive wildcard would be.
globstar could not be activated on macOS:
$ shopt -s globstar
-bash: shopt: globstar: invalid shell option name
$ bash --version
GNU bash, version 4.4.19(1)-release (x86_64-apple-darwin17.3.0)

If I understood the question correctly, you may use something like this:
find . -type f -name '*.txt' -exec /bin/grep -hEo '\w+' {} \; \
| sort \
| uniq -c \
| sort -k1,1n
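
A note on the globstar failure: macOS's default /bin/bash is 3.2, which predates globstar (added in bash 4.0), so the shell that rejected shopt was likely an older bash than the 4.4 that bash --version found on the PATH. In a bash >= 4 session the same word statistics work with a plain recursive glob (a sketch, assuming such a shell):
shopt -s globstar
grep -hEo '[a-zA-Z]+' **/*.txt | sort | uniq -c | sort -k1,1n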


Search and replace a word in directory and its subdirectories [duplicate]

I need to find and replace a word in all the files across a directory and its subdirectories.
I used this line of code:
perl -pi -w -e 's/foo/bar/g;' *.*
But it changes the words only in files in the current directory and doesn't affect the subdirectories.
How can I change the word in every directory and subdirectory?
You could do a pure Perl solution that recursively traverses your directory structure, but that'd require a lot more code to write.
The easier solution is to use the find command which can be told to find all files and run a command against them.
find . -type f -exec perl -pi -w -e 's/foo/bar/g;' \{\} \;
(I've escaped the {} and ; just in case but you might not need this)
Try this:
find . -type f | xargs -r -I {} perl -pi -w -e 's/foo/bar/g;' {}
That should run recursively.
In case your file names contain spaces:
find . -type f -print0 | xargs -r0 -I {} perl -pi -w -e 's/foo/bar/g;' {}
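If you only want to rewrite files that actually contain the pattern (an assumption about intent; it avoids touching the timestamps of unaffected files), GNU grep can pre-filter the list:
grep -rlZ foo . | xargs -0 -r perl -pi -w -e 's/foo/bar/g;'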

Recursively compare specific files in different directories

Similar posts here:
Diff files present in two different directories
and here:
https://superuser.com/q/602877/520666
But not quite what I'm looking for.
I have 2 directories (containing subdirectories and different file types -- binary, images, html, etc.).
I want to be able to recursively compares files with specific extensions (e.g. .html, .strings, etc.) between the two directories -- they may or may not exist in either (sub)directory.
How can I accomplish this? diff only seems to support exclusions, and I'm not sure how I can leverage find for this.
Advice?
You could exclude all unwanted file endings with find:
(this version only matches against file endings)
diff -r -x `find . -type f -name '*.*' | sed 's|.*\.|.*\.|' | sort -u | grep -v YOURFILETYPE | paste -sd "|"` ...rest of diff command
Or you generate the list of excluded files upfront and pass it to the diff:
(this version also matches against filenames and every other regex you specify in include.file)
find /dirA -type f | grep -v YOURFILEENDING > exclude.list
find /dirB -type f | grep -v YOURFILEENDING >> exclude.list
diff -X exclude.list -r /dirA /dirB
If you chain these commands via && you'll get a handy oneliner ;)
WITH INCLUDE FILE
If you want to use an include file, you can use this method:
You specify the include file
grep matches against all files in the folders and turns your include file into an exclude file for diff (diff only takes exclude files)
Here is an example:
Complicated inline version:
(this version only matches against file endings)
diff -r -x `find . -type f -name '*.*' | sed 's|.*\.|.*\.|' | sort -u | grep -v -f include.file | paste -sd "|"` /dirA /dirB
Slightly longer simpler version:
(this version also matches against filenames and every other regex you specify in include.file)
find /dirA -type f | grep -v -f include.file > exclude.list
find /dirB -type f | grep -v -f include.file >> exclude.list
diff -X exclude.list -r /dirA /dirB
with each line in include.file being a grep regex/expression:
log
txt
fileending3
whateverfileendingyoulilke
fullfilename.txt
someotherregex.*
NOTE
I did not run these because I'm nowhere near a computer.
I hope I got all syntax correct.
The simplest thing you can do is to compare the whole directories:
diff -r /path/the/first /path/the/second
It will show which files are only in one of the directories, which files differ in a binary fashion, and the full diff for any textual files in both directories.
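If you only need to know which files differ, not how, the standard -q (--brief) option keeps the report to one line per differing file:
diff -rq /path/the/first /path/the/second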
You can loop over a set of relative paths by simply reading a file with a path per line thusly:
while IFS= read -u 9 relative_path
do
diff "/path/the/first/%{relative_path}" "/path/the/second/%{relative_path}"
done 9< relative_paths.txt
Doing this for a specific set of extensions is similarly easy:
shopt -s globstar
while IFS= read -u 9 extension
do
diff "/path/the/first/"**/*."${extension}" "/path/the/second/"**/*."${extension}"
done 9< extensions.txt
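Here extensions.txt holds one extension per line; using the extensions from the question it might look like:
html
strings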

find folders with executable files

I wrote a script to find all folders that contain executable files. I was first looking for a one-liner command but couldn't find one (I especially tried to use sort -k -u). The script works fine, but my initial question remains: is there a one-liner command to do that?
#! /bin/bash
find "$1" -type d | while read -r Path
do
    X=$(ls -l "$Path" | grep '^-rwx' | wc -l)
    if ((X>0))
    then
        echo "$Path"
    fi
done
Using find:
find "$1" -type f -perm /111 -exec dirname {} \; | sort -u
-perm /111 matches files with at least one execute bit set (user, group, or other); we then output only the directory name. To avoid duplicates, sort -u is used.
As pointed out by Paulo Almeida in the comments, this would also work:
find "$1" -type f -perm /111 -printf "%h\n" | sort -u
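Since a one-liner was the original goal: if your bash has globstar (bash >= 4; an assumption about your shell), a find-free sketch of the same idea is:
shopt -s globstar
for f in "$1"/**/*; do [[ -f $f && -x $f ]] && printf '%s\n' "${f%/*}"; done | sort -u
Note that [[ -x ]] tests executability for the current user rather than the raw permission bits, so it is close to, but not identical to, -perm /111.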

How to list only files and not directories of a directory Bash?

How can I list all the files of one folder, but not its subfolders or their contents? In other words: how can I list only the files?
Using find:
find . -maxdepth 1 -type f
Using the -maxdepth 1 option ensures that you only look in the current directory (or, if you replace the . with some path, that directory). If you want a full recursive listing of all files in that and subdirectories, just remove that option.
ls -p | grep -v /
ls -p appends a / to folder names, which acts as a tag for grep -v to remove.
carlpett's find-based answer (find . -maxdepth 1 -type f) works in principle, but is not quite the same as using ls: you get a potentially unsorted list of filenames all prefixed with ./, and you lose the ability to apply ls's many options;
also, find invariably finds hidden items too, whereas ls's behavior depends on the presence or absence of the -a or -A options.
An improvement, suggested by Alex Hall in a comment on the question, is to combine shell globbing with find:
find * -maxdepth 0 -type f # find -L * ... includes symlinks to files
However, while this addresses the prefix problem and gives you alphabetically sorted output, you still have neither (inline) control over inclusion of hidden items nor access to ls's many other sorting / output-format options.
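If you do want hidden items included with this globbing approach, bash's dotglob option (a shell option, not a find one) makes * match dotfiles too:
shopt -s dotglob
find * -maxdepth 0 -type f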
Hans Roggeman's ls + grep answer is pragmatic, but locks you into using long (-l) output format.
To address these limitations I wrote the fls (filtering ls) utility, which provides the output flexibility of ls while adding type-filtering capability:
simply place type-filtering characters such as f for files, d for directories, and l for symlinks before a list of ls arguments (run fls --help or fls --man to learn more).
Examples:
fls f # list all files in current dir.
fls d -tA ~ # list dirs. in home dir., including hidden ones, most recent first
fls f^l /usr/local/bin/c* # List matches that are files, but not (^) symlinks (l)
Installation
Supported platforms
When installing from the npm registry: Linux and macOS
When installing manually: any Unix-like platform with Bash
From the npm registry
Note: Even if you don't use Node.js, its package manager, npm, works across platforms and is easy to install; try
curl -L https://git.io/n-install | bash
With Node.js installed, install as follows:
[sudo] npm install fls -g
Note:
Whether you need sudo depends on how you installed Node.js / io.js and whether you've changed permissions later; if you get an EACCES error, try again with sudo.
The -g ensures global installation and is needed to put fls in your system's $PATH.
Manual installation
Download this bash script as fls.
Make it executable with chmod +x fls.
Move it or symlink it to a folder in your $PATH, such as /usr/local/bin (macOS) or /usr/bin (Linux).
Listing content of some directory, without subdirectories
I like using ls options, for example:
-l use a long listing format
-t sort by modification time, newest first
-r reverse order while sorting
-F, --classify append indicator (one of */=>#|) to entries
-h, --human-readable with -l and -s, print sizes like 1K 234M 2G etc...
Sometimes --color, and all the others (see ls --help).
Listing everything but folders
This will show files, symlinks, devices, pipes, sockets, etc.:
find /some/path -maxdepth 1 ! -type d
could be sorted by date easily:
find /some/path -maxdepth 1 ! -type d -exec ls -hltrF {} +
Listing files only:
find /some/path -maxdepth 1 -type f
sorted by size:
find /some/path -maxdepth 1 -type f -exec ls -lSF --color {} +
Prevent listing of hidden entries:
To not show hidden entries, whose names begin with a dot, you can add ! -name '.*':
find /some/path -maxdepth 1 ! -type d ! -name '.*' -exec ls -hltrF {} +
Then, you can replace /some/path with . to list the current directory, or with .. for the parent directory.
You can also use ls with grep or egrep and put it in your profile as an alias:
ls -l | egrep -v '^d'
ls -l | grep -v '^d'
find files: ls -l /home | grep "^-" | tr -s ' ' | cut -d ' ' -f 9
find directories: ls -l /home | grep "^d" | tr -s ' ' | cut -d ' ' -f 9
find links: ls -l /home | grep "^l" | tr -s ' ' | cut -d ' ' -f 9
tr -s ' ' squeezes repeated spaces, turning the output into a single-space-delimited stream
the cut command sets the delimiter to a space and returns the 9th field (always the filename/directory name/link name).
I use this all the time!
You are welcome!
ls -l | grep '^-'
Looking just for the name, pipe to cut or awk.
ls -l | grep '^-' | awk '{print $9}'
ls -l | grep '^-' | cut -d " " -f 13
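All of these ls -l parsers assume filenames without spaces, since cut and awk split on whitespace; with GNU find you can sidestep the parsing entirely (a sketch):
find . -maxdepth 1 -type f -printf '%f\n'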
{ find . -maxdepth 1 -type f | xargs ls -1t | less; }
Added xargs to make it work, and used -1 instead of -l to show only filenames without additional ls info.
You can use one of these:
echo *.* | cut -d ' ' -f 1- --output-delimiter=$'\n'
echo *.* | tr ' ' '\n'
echo *.* | sed 's/\s\+/\n/g'
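All three split on whitespace, so names containing spaces break; printf avoids the issue by printing each glob match on its own line:
printf '%s\n' *.*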
ls -Ap | sort | grep -v /
This method does not use external commands.
bash$ res=$( IFS=$'\n'; AA=(`compgen -d`); IFS='|'; eval compgen -f -X '#("${AA[*]}")' )
bash$ echo "$res"
. . .
Just adding on to carlpett's answer.
For a more useful view of the files, you can pipe the output through xargs to ls.
find . -maxdepth 1 -type f | xargs ls -lt | less
Shows the most recently modified files in a list format, quite useful when you have downloaded a lot of files, and want to see a non-cluttered version of the recent ones.
"find '-maxdepth' " does not work with my old version of bash, therefore I use:
for f in $(ls) ; do if [ -f $f ] ; then echo $f ; fi ; done
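A glob avoids parsing ls output and survives most odd filenames (a sketch with the same logic):
for f in * ; do [ -f "$f" ] && echo "$f" ; done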

How to echo directories containing matching file with Bash?

I want to write a bash script which will use a list of all the directories containing specific files. I can use find to echo the path of each and every matching file. I only want to list the path to the directory containing at least one matching file.
For example, given the following directory structure:
dir1/
matches1
matches2
dir2/
no-match
The command (looking for 'matches*') will only output the path to dir1.
As extra background, I'm using this to find each directory which contains a Java .class file.
find . -name '*.class' -printf '%h\n' | sort -u
From man find:
-printf format
%h Leading directories of file’s name (all but the last element). If the file name contains no slashes (since it is in the current directory) the %h specifier expands to ".".
On OS X and FreeBSD, with a find that lacks the -printf option, this will work:
find . -name '*.class' -print0 | xargs -0 -n1 dirname | sort --unique
The -n1 option tells xargs to pass at most one argument from standard input to each invocation of dirname.
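If your dirname accepts multiple operands (GNU coreutils does; an assumption about your platform), you can drop -n1 and spawn far fewer processes:
find . -name '*.class' -print0 | xargs -0 dirname | sort --unique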
GNU find
find /root_path -type f -iname "*.class" -printf "%h\n" | sort -u
OK, I've come way too late, but you could also do it without find, to answer specifically the "with Bash" part (or at least a POSIX shell).
ls */*.class | while read -r; do
    echo "${REPLY%/*}"
done | sort -u
The ${VARNAME%/*} will strip everything after the last / (if you wanted to strip everything after the first, it would have been ${VARNAME%%/*}).
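A quick illustration with a hypothetical path:
p=dir1/sub/matches1
echo "${p%/*}"   # dir1/sub  -> strips the last component
echo "${p%%/*}"  # dir1      -> strips everything after the first /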
Regards.
find / -name '*.class' -printf '%h\n' | sort --unique
Far too late, but this might be helpful to future readers:
I personally find it more helpful to have the list of folders printed into a file, rather than to Terminal (on a Mac).
For that, you can simply output the paths to a file, e.g. folders.txt, by using:
find . -name '*.sql' -print0 | xargs -0 -n1 dirname | sort --unique > folders.txt
How about this?
find dirs/ -name '*.class' -exec dirname '{}' \; | awk '!seen[$0]++'
For the awk command, see #43 on this list
