Adding a simple piped command in bash - bash

I'm learning bash scripting and needed some simple help.
Here is what I have thus far:
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep \;
So what this does is starts from a root path, finds all directories inside this root path that are empty and do not have a .git folder, and then when that operation is successful it runs -exec touch {}/.gitkeep to create a file .gitkeep inside that empty directory to ensure proper git commits.
What I want now is to echo out the current file path for the gitkeep file just created.
My first question is:
Should I be piping | as so:
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep | outputFilenameDisplayFunction \;
Or maybe repeat what -exec does as so:
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep - exec outputFilenameDisplayFunction \;
Or maybe use >
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep > outputFilenameDisplayFunction \;
None of these commands has been tested yet. I am really looking for explanations so I can be knowledgeable in the future.

As mentioned here, find accepts multiple -exec portions to the command.
In your case, the second one can call a script, as in here:
find . -type d -empty -not -path "./.git/*" -exec touch {}/.gitkeep \; -exec myscript {} \;
Note the \;.
The script would be:
#!/bin/sh
echo "$1" >> "afile"   # append: a plain > would overwrite the file on every call
Charles Duffy actually proposes in the comments, for the second -exec:
-exec sh -c 'echo "$1" >>aFile' _ {} \;
This avoids the need for an external file storing your script.
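To see how chained -exec actions behave, here is a minimal sketch that runs in a throwaway directory. The directory names and the log file are made up for illustration. Each -exec counts as true when its command exits with status 0, so the second one runs only after touch has succeeded.

```shell
# Sketch: chain two -exec actions; the second runs only if the first succeeds.
# All paths are illustrative and live in a throwaway temp directory.
demo=$(mktemp -d)
log=$(mktemp)
mkdir -p "$demo/empty1" "$demo/sub/empty2" "$demo/full" "$demo/.git/objects"
touch "$demo/full/file.txt"

cd "$demo" || exit 1
find . -type d -empty -not -path "./.git/*" \
    -exec touch {}/.gitkeep \; \
    -exec sh -c 'printf "%s\n" "$1/.gitkeep" >>"$2"' _ {} "$log" \;

# Show the paths that were recorded.
sort "$log"
```

A | or > typed directly on the find command line is interpreted by the calling shell before find starts, so it applies to find as a whole rather than to each -exec. Wrapping the command in sh -c is what makes a per-directory redirection possible.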

Let's start from your stated requirements:
So what this does is starts from a root path, finds all directories inside this root path that are empty and do not have a .git folder, and then when that operation is successful it runs -exec touch {}/.gitkeep to create a file .gitkeep inside that empty directory to ensure proper git commits.
If a directory is empty, it "can't have a .git folder" in the sense of having a child named .git by definition -- if it had any subdirectory, it wouldn't be empty. So we can completely ignore that part of your description in prose -- or interpret it as referring to what the code actually appears to be intended to do, pruning any directory which is under .git.
Should that be your intent, -path is the wrong tool for that job altogether, as it still searches the .git tree (and then excludes all the things that it found); instead, use -prune to stop find from recursing down that path at all:
while IFS= read -r -d '' dirname; do
    touch -- "${dirname}/.gitkeep"
    printf '%q\n' "$dirname" # this goes to the logfile, since we open it for the whole loop
done < <(find . -name .git -prune -o -type d -empty -print0) >logFile
Why prefer this approach?
Instead of starting a shell per directory found (as would happen if you used -exec to start a shell script or a shell), it keeps your initial/primary shell running, and iterates through the loop once per item found.
Because it's running code in that shell, you can use shell functions, modify shell variables (for example, (( ++directoriesFound )) to keep a counter), or perform redirections scoped to the loop (i.e. >logFile) to open an output file just once and use it repeatedly within.
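The points above can be sketched as follows. This is only an illustration: report_dir is a hypothetical helper, and the sample tree and log file live in temporary locations. It needs bash, since it relies on process substitution and read -d.

```shell
#!/usr/bin/env bash
# Sketch: a shell function and a counter used inside the read loop.
# report_dir and the sample directories are illustrative only.
demo=$(mktemp -d)
logFile=$(mktemp)
mkdir -p "$demo/a" "$demo/b/c" "$demo/.git/refs"

report_dir() {    # hypothetical helper; any shell function works here
    printf 'created %s/.gitkeep\n' "$1"
}

directoriesFound=0
cd "$demo" || exit 1
while IFS= read -r -d '' dirname; do
    touch -- "$dirname/.gitkeep"
    report_dir "$dirname"         # output goes to the log file
    (( ++directoriesFound ))      # survives the loop: no subshell is involved
done < <(find . -name .git -prune -o -type d -empty -print0) >"$logFile"

echo "directories found: $directoriesFound"
```

Feeding the loop through < <(...) rather than find ... | while keeps the loop in the current shell, which is why the counter is still set after done.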

On GNU/anything, find has -printf, which makes doing what you want straightforward:
find -name .git -prune \
-o -type d -empty -printf %p/.gitkeep\\n -execdir touch {}/.gitkeep \;
(Note: this fixes an omitted {}/. GNU find's -execdir doesn't change the behavior here, but it is safer than -exec on systems that may find themselves under attack: the exec'd command is run directly in the location find got to, rather than causing the executed command to re-walk the path.)

Related

Delete files that are 5 days old using bash script

I currently use a command that searches a directory and deletes 5 day old files.
find /path/to/files* -mtime +5 -exec rm {} \;
I run it from the command line and it works fine. But when I put it in a .sh file it says find /path/to/files*: No such file or directory.
There are only two lines in the shell script:
#! /usr/bin/env bash
find /path/to/files* -mtime +5 -exec rm {} \;
How can I rewrite the script so that it works?
The error happens if there are currently no files matching the wildcard, presumably because none have been created since you deleted them previously.
The argument to find should be the directory containing the files, not the filenames themselves, since find will automatically search the directory. If you want to restrict the filenames, use the -name option to specify the wildcard.
And if you don't want to go into subdirectories, use the -maxdepth option.
find /path/to -maxdepth 1 -type f -name 'files*' -mtime +5 -delete
This works:
#! /usr/bin/env bash
find /home/ubuntu/directory -type f -name 'filename*' -mtime +4 -delete
Here is an example:
find /home/ubuntu/processed -type f -name 'output*' -mtime +4 -delete
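Before deleting anything, it can help to run the same tests with -print first. The sketch below does that in a temporary directory; the file names are made up, and touch -t is used only to fake an old modification time.

```shell
# Sketch: preview with -print, then delete with the same tests.
# Directory and file names are illustrative.
dir=$(mktemp -d)
touch -t 200001010000 "$dir/files_old.log"   # mtime set to 1 Jan 2000
touch "$dir/files_new.log"                   # mtime is now
touch -t 200001010000 "$dir/other_old.log"   # old, but the name does not match

# Dry run: list what would be removed.
find "$dir" -maxdepth 1 -type f -name 'files*' -mtime +5 -print

# Real run: identical tests, with -delete in place of -print.
find "$dir" -maxdepth 1 -type f -name 'files*' -mtime +5 -delete

ls "$dir"
```

Quoting 'files*' matters: the pattern reaches find unexpanded, so the command behaves the same whether or not any matching file currently exists.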

Getting the contents of a directory excluding everything inside .git in bash

I need to get the number of the contents of a directory that is a git repository.
I have to get the number of:
1) Other directories inside the directory I am currently iterating (and the other sub-directories inside them if they exist)
2) .txt files inside the directory and its sub-directories
3) All the non-txt files inside the directory and its sub-directories
In all the above cases I must ignore the .git directory, along with all the files and directories that are inside of it.
Also I must use bash script exclusively. I can't use another programming language.
Right now I am using the following commands to achieve this:
To get all the .txt files I use : find . -type f \( -name "*.txt" \). There are no .txt files inside .git so this is working.
To get all the non-txt files I use: find . -type f \( ! -name "*.txt" \). The problem is that I also get all the files from .git and I don't know how to ignore them.
To get all the directories and sub-directories I use: find . -type d. I don't know how to ignore the .git directory and its sub-directories.
The easy way is to just add these extra tests:
find . ! -path './.git/*' ! -path ./.git -type f -name '*.txt'
The problem with this is ./.git is still traversed, unnecessarily, which takes time.
Instead, -prune can be used. -prune is not a test (like -path, or -type). It's an action. The action is "don't descend the current path, if it's a directory". It must be used separately to the print action.
# .txt files (item 2 in the question)
find . -path './.git' -prune -o -type f -name '*.txt' -print
# non-.txt files (item 3 in the question)
find . -path './.git' -prune -o -type f ! -name '*.txt' -print
# directories (item 1 in the question)
find . -path './.git' -prune -o -type d -print
If -print isn't specified, ./.git is also printed as the default action.
I used -path ./.git, because you said "the .git directory". If for some reason there are other .git directories in the tree, they will be traversed and printed. To ignore all directories in the tree named .git, replace -path ./.git with -name .git.
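Since the question asks for counts rather than listings, one possible way to get them is to pipe each command into wc -l. The sample tree below is invented for the demonstration, and the sketch assumes no file name contains a newline (wc -l counts lines, not names).

```shell
# Sketch: count directories, .txt files and other files while skipping .git.
# The sample tree is illustrative.
repo=$(mktemp -d)
mkdir -p "$repo/src/lib" "$repo/docs" "$repo/.git/objects"
touch "$repo/README.txt" "$repo/docs/notes.txt" "$repo/src/main.c" \
      "$repo/.git/config" "$repo/.git/objects/pack"
cd "$repo" || exit 1

# -mindepth 1 keeps "." itself out of the directory count.
dirs=$(find . -mindepth 1 -name .git -prune -o -type d -print | wc -l)
txt=$(find . -name .git -prune -o -type f -name '*.txt' -print | wc -l)
other=$(find . -name .git -prune -o -type f ! -name '*.txt' -print | wc -l)

# $(( )) strips the padding some wc implementations add.
echo "directories: $((dirs)) txt: $((txt)) other: $((other))"
```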
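Sometimes a bash loop is clearer than a one-liner. Reading NUL-delimited names keeps paths with spaces intact, and the pattern match skips both .git and everything under it:
while IFS= read -r -d '' f; do
    if [[ $f == ./.git || $f == ./.git/* ]]; then
        echo "skipping $f"
    else
        echo "do something with $f"
    fi
done < <(find . -print0)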

Bash script for removing specific file from certain subdirectories

On a unix server, I'm trying to figure out how to remove a file, say "example.xls", from any subdirectories that start with v0 ("v0*").
I have tried something like:
find . -name "v0*" -type d -exec find . -name "example.xls" -type f
-exec rm {} \;
But I get errors. I have a solution but it works too well, i.e. it will delete the file in any subdirectory, regardless of its name:
find . -type f -name "example.xls" -exec rm -f {} \;
Any ideas?
You will probably have to do it in two steps -- i.e. first find the directories, and then the files -- you can use xargs to make it in a single line, like
find . -name "v0*" -type d | \
xargs -l -I[] \
find [] -name "example.xls" -type f -exec rm {} \;
The first find generates the list of matching directory names, and xargs then calls the second find once per name to locate the file within that directory.
Try:
find -path '*/v0*/example.xls' -delete
This matches only files named example.xls that, somewhere in their path, have a parent directory whose name starts with v0.
Note that since find offers -delete as an action, it is not necessary to invoke the external executable rm.
Example
Consider this directory structure:
$ find .
.
./a
./a/example.xls
./a/v0
./a/v0/b
./a/v0/b/example.xls
./a/v0/example.xls
We can identify the example.xls files that have one of their parent directories named v0*:
$ find -path '*/v0*/example.xls'
./a/v0/b/example.xls
./a/v0/example.xls
To delete those files:
find -path '*/v0*/example.xls' -delete
Alternative: find only those files directly under directory v0*
find -regex '.*/v0[^/]*/example.xls'
Using the above directory structure, this approach returns one file:
$ find -regex '.*/v0[^/]*/example.xls'
./a/v0/example.xls
To delete such files:
find -regex '.*/v0[^/]*/example.xls' -delete
Compatibility
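Although my tests were performed with GNU find, -path is specified by POSIX and is also supported on OSX. -regex and -delete are not part of POSIX (as of the 2017 edition), but both GNU find and the BSD find on OSX support them.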
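Where -delete is not available, a version that stays within POSIX find might look like the sketch below. It rebuilds the sample tree from the example in a temporary directory and uses -exec rm in place of -delete.

```shell
# Sketch: rebuild the sample tree and remove the matches without -delete.
tree=$(mktemp -d)
mkdir -p "$tree/a/v0/b"
touch "$tree/a/example.xls" "$tree/a/v0/example.xls" "$tree/a/v0/b/example.xls"
cd "$tree" || exit 1

# -path and "-exec ... {} +" are both in POSIX; rm is called once per batch.
find . -path '*/v0*/example.xls' -type f -exec rm -f {} +

# Only the copy outside any v0* directory should remain.
find . -name example.xls
```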

How to print the deleted file names along with path in shell script

I am deleting the files in all the directories and subdirectories using the command below:
find . -type f -name "*.txt" -exec rm -f {} \;
But I want to know which are the files deleted along with their paths. How can I do this?
Simply add a -print argument to your find.
$ find . -type f -name "*.txt" -print -exec rm -f {} \;
As noted by @JonathanRoss below, you can achieve an equivalent result with the -v option to rm.
It's outside the scope of your question, but more generally it gets more interesting if you want to delete directories recursively. Then:
a simple -exec rm -r argument keeps it silent
a -print -exec rm -r argument reports the toplevel directories you're operating on
a -exec rm -rv argument reports all you're removing
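The position of -print relative to -exec also matters. In the sketch below (file names invented for the demonstration), -print comes after -exec, so a name is reported only when rm exited with status 0; with -print first, the name would be printed before the removal is even attempted.

```shell
# Sketch: report only the files whose removal succeeded.
# The sample files are illustrative.
dir=$(mktemp -d)
mkdir -p "$dir/sub"
touch "$dir/a.txt" "$dir/sub/b.txt" "$dir/keep.md"
cd "$dir" || exit 1

# -exec is true only if rm exits 0, so -print runs only after a successful rm.
deleted=$(find . -type f -name '*.txt' -exec rm -f {} \; -print | sort)

printf '%s\n' "$deleted"
```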

How to add .txt to all files in a directory using terminal

I have many files without a file extension. Now I want to add .txt to all files. I tried the following but it gives an error, mv: rename . to ..txt: Invalid argument.
How can I achieve this?
find . -iname "*.*" -exec bash -c 'mv "$0" "$0.txt"' {} \;
You're nearly there!
Just add -type f to only deal with files:
find . -type f -exec bash -c 'mv "$0" "$0.txt"' {} \;
If your mv handles the -n option, you might want to use it (that's the option to not overwrite existing files).
The error you're having occurs because . is one of the first entries found by find, and your system complains (rightly) when you try to rename .! With -type f you can be sure this won't happen. Now if you wanted to act on everything inside your directory, you would, e.g., add -mindepth 1 at the beginning of the find command (as . is considered depth 0).
It is not very clear in your question, but what if you want to add the extension .txt to all files that don't have an extension? (we'll agree that to have an extension means to have a period in the name). In this case, you'll use the negation of -name '*.*' as follows:
find . -type f \! -name '*.*' -exec bash -c 'mv "$0" "$0.txt"' {} \;
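A variation worth considering is to end -exec with + rather than \;, so that one shell handles a batch of names, and to pass the names as positional parameters after a placeholder for $0. The sketch below is one way to do that; the sample files are made up, and the existence check stands in for mv -n on systems where that option is unavailable.

```shell
# Sketch: add .txt to extension-less files, one shell per batch of names.
# The sample files are illustrative.
dir=$(mktemp -d)
touch "$dir/notes" "$dir/report" "$dir/data.csv"
cd "$dir" || exit 1

# "_" fills $0; the file names arrive as "$@", which "for f do" iterates over.
find . -type f ! -name '*.*' -exec sh -c '
    for f do
        [ -e "$f.txt" ] || mv -- "$f" "$f.txt"   # skip if the target exists
    done
' _ {} +

ls
```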
