I have git repositories in a directory. For example,
$ ls
repo1 repo2 repo3 repo4
I want to quickly see the last k commit logs of all the repositories.
(k is something like 3.)
For repo1, I can print the last 3 commit logs and go back to the directory like this:
$ cd repo1; git log -3 ; cd ../
But I do not want to repeat this for all the repositories. I'm looking for a smart way to do it easily. (Maybe use xargs?)
I'm using Bash.
Thank you.
Often it's pointless to use xargs, since find can execute stuff on its own:
find ~/src/ -maxdepth 2 -name .git -execdir git log \;
Explanation:
find ~/src/
Look for stuff under ~/src/. You can pass multiple arguments if you want, possibly as a result of a shell glob.
-maxdepth 2
Don't recurse deeply. This saves a lot of time, since hitting the filesystem is relatively slow.
-maxdepth 2 will find ~/src/.git (if it exists) and ~/src/foo/.git, so it can be used whether you pass the repo directory itself or just the directory containing all the repos.
-maxdepth 1 would work (and be easier on IO) if you only want to pass the repo directories themselves.
-maxdepth 3 might have occasional use for certain source hierarchies, but for them you're probably better just passing additional directories to find in the first place.
-name .git
We're looking for the .git directory (or file! Yes, git does that), because -execdir runs the command in the directory that contains the match, which here is the repository root. -execdir normally takes a {} argument which it passes to the command. The passed filename would just be the basename (so that e.g. mv works; remember that find often works with files), with the working directory set to whatever contains it.
-execdir ... \;
Run a command in the repo itself. The command (here ...) can include anything, notably options starting with -, which find itself will not interpret. The exceptions are that {} anywhere in a word is replaced by the filename, a lone ; terminates the command (it is escaped here for the shell), and + terminates the command but passes multiple files at once.
For this use case, we don't need to pass a filename, since the directory the program is run in provides all the needed information.
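As a rough sketch of how this could be wrapped up for reuse, the hypothetical helper below (the name in_each_repo is made up for illustration) takes a directory and a command and runs that command in every repository root found by the find invocation described above. It assumes a find that supports -maxdepth and -execdir, such as GNU or BSD find.

```shell
# Hypothetical helper (not from the original answer): run a command in
# every repository found at, or directly under, the given directory.
in_each_repo() {
    dir=$1
    shift
    # -execdir runs the command in the directory that contains the
    # matched .git entry, which is the repository root.
    find "$dir" -maxdepth 2 -name .git -execdir "$@" \;
}

# Example invocation (commented out, since ~/src may not exist):
# in_each_repo ~/src git log -3
```

For the question as asked, something like in_each_repo . git log -3 from the directory holding repo1 to repo4 should print the last three commits of each repository.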
I have something similar, which can easily be adapted to your requirement:
CODE_BASE=(/parentdir/to/your/repositories
/another/parent/dirs
/another/parent/dirs/if/you/have
)
EXCLUDE_PATT="gitRepoYouWantToIgnore" #this is regex
for base in "${CODE_BASE[@]}"; do
echo "##########################"
echo " scanning $base"
echo "##########################"
for line in $(find "$base" -name ".git"|grep -v "$EXCLUDE_PATT"); do
line=$(sed 's#/\.git##'<<<"$line")
repo=$(awk -F'/' '$0=$NF' <<<"$line")
echo "##########################"
echo "====> Showing log of Repository: $repo <===="
echo "##########################"
git -C "$line" log -3
done
done
Save to showlog.sh for example, then execute it. You can add more log parameters to make the log output fit your needs.
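If only the list of repository directories is needed, the find and grep steps of the script can be separated out. The following is a minimal sketch under that assumption; repo_roots is a hypothetical name, and the approach is line based, so it would not cope with directory names containing newlines.

```shell
# Hypothetical helper: print the root directory of every repository
# under $1, skipping paths that match the regular expression in $2.
repo_roots() {
    base=$1
    exclude=$2
    # -prune stops find from descending into the .git directories;
    # dirname turns ".../repo/.git" into ".../repo".
    find "$base" -name .git -prune -exec dirname {} \; |
        grep -v -e "$exclude"
}

# Example invocation (commented out; the paths are the placeholders
# used in the script above):
# repo_roots /parentdir/to/your/repositories gitRepoYouWantToIgnore |
#     while IFS= read -r repo; do git -C "$repo" log -3; done
```

Reading the list with while IFS= read -r keeps directory names with spaces intact, which the for loop over $(find ...) does not.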
I first asked a question here: Unzip a file and then display it in the console in one step
It works and helped me a lot. (Please read it.)
Now I have a second issue. I do not have a single zipped log file; I have a lot of them in different folders, which I need to find first. The files have the same names. For example:
/somedir/server1/log.gz
/somedir/server2/log.gz
/somedir/server3/log.gz
and so on...
What I need is a way to:
find all the files like: find /somedir/server* -type f -name log.gz
unzip the files like: gunzip -c log.gz
use grep on the content of the files
Important! The whole thing should be done in one step.
I cannot first store the extracted files in the filesystem because it is a readonly filesystem. I need somehow to connect, with pipes, the output from one command to the input of the next.
Before, the log files were in text format (.txt), so I did not have to unzip them first. In that case it was easy:
ex.
find /somedir/server* -type f -name log.txt | xargs grep "term"
Now I have to deal with zipped files. That means that after I find the files, I first need to unzip them somehow and then send the contents to grep.
With one file I do:
gunzip -c /somedir/server1/log.gz | grep term
But for multiple files I don't know how to do it. For example, how do I pass the output of find to gunzip and then to grep?
Also if there is another way / "best practise" how to do that, it is welcome :)
find lets you invoke a command on the files it finds:
find /somedir/server* -type f -name log.gz -exec gunzip -c '{}' + | grep ...
From the man page:
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number
of matched files. The command line is built in much the same
way that xargs builds its command lines. Only one instance of
{} is allowed within the command, and (when find is being
invoked from a shell) it should be quoted (for example, '{}')
to protect it from interpretation by shells. The command is
executed in the starting directory. If any invocation with
the + form returns a non-zero value as exit status, then
find returns a non-zero exit status. If find encounters an
error, this can sometimes cause an immediate exit, so some
pending commands may not be run at all. This variant of -exec
always returns true.
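One limitation of piping everything into a single grep is that the matches no longer say which log file they came from. A possible workaround, sketched below with a hypothetical helper named grep_gz_logs, is to let -exec ... + start a small sh loop that decompresses each file separately and prefixes every match with the file name. Nothing is written to disk, so it should also work on a read-only filesystem.

```shell
# Hypothetical helper: search all log.gz files under the given
# directories for a pattern, prefixing each match with its file name.
grep_gz_logs() {
    term=$1
    shift
    # The search term is passed as the first positional parameter of
    # the inner shell; find appends the file names after it.
    find "$@" -type f -name log.gz -exec sh -c '
        term=$1
        shift
        for f in "$@"; do
            gunzip -c "$f" | grep -e "$term" |
                while IFS= read -r line; do
                    printf "%s:%s\n" "$f" "$line"
                done
        done
    ' _ "$term" {} +
}

# Example invocation (commented out; paths are from the question):
# grep_gz_logs term /somedir/server*
```

Where zgrep is installed, find ... -exec zgrep -H term {} + is likely a shorter way to get similar output.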
I have a file I would like to copy into about 300,000 different directories, these are themselves split between two directories, e.g.
DirA/Dir001/
...
DirB/Dir149000/
However when I try:
cp file.txt */*
It returns:
bash: /bin/cp: Argument list too long
What is the best way of copying a file into multiple directories, when you have too many to use cp?
The answer to the question as asked is find.
find . -mindepth 2 -maxdepth 2 -type d -exec cp script.py {} \;
But of course @triplee is right... why make so many copies of a file?
You could, of course, instead create links to the file...
find . -mindepth 2 -maxdepth 2 -type d -exec ln script.py {} \;
The options -mindepth 2 -maxdepth 2 limit the recursive search of find to elements exactly two levels deep from the current directory (.). The -type d matches all directories. -exec then executes the command (up to the closing \;), for each element found, replacing the {} with the name of the element (the two-levels-deep subdirectory).
The links created are hard links. That means that if you edit the script in one place, the change shows up in all places. The script is, for all intents and purposes, in all the places, with none of them being any less "real" than the others. (This concept can be surprising to those not used to it.) Use ln -s if you instead want to create "soft" links, which are mere references to "the one, true" script.py in the original location.
The beauty of find ... -exec ... {}, as opposed to many other ways to do it, is that it will work correctly even for filenames with "funny" characters in them, including but not limited to spaces or newlines.
But still, you should really only need one script. You should fix the part of your project where you need that script in every directory; that is the broken part...
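For completeness, the find ... -exec cp approach from above could be wrapped as follows. This is only a sketch: copy_into_subdirs is a hypothetical name, and it assumes a find with -mindepth and -maxdepth (GNU or BSD).

```shell
# Hypothetical helper: copy one file into every directory that sits
# exactly two levels below the given root (e.g. DirA/Dir001).
copy_into_subdirs() {
    file=$1
    root=$2
    # find starts one cp per directory, so the argument list never
    # grows beyond the limit that "cp file.txt */*" ran into.
    find "$root" -mindepth 2 -maxdepth 2 -type d -exec cp -- "$file" {} \;
}

# Example invocation (commented out):
# copy_into_subdirs file.txt .
```

Starting one cp per directory is slow for 300,000 directories but avoids the "Argument list too long" error, because no single command line ever holds more than one directory name.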
Extrapolating from the answer to your other question you seem to have code which looks something like
for TGZ in $(find . -name "file.tar.gz")
do
mkdir -p work
cd work
tar xzf $TGZ
python script.py
cd ..
rm -rf work
done
Of course, the trivial fix is to replace
python script.py
with
python ../script.py
and voilà, you no longer need a copy of the script in each directory at all.
I would further advise refactoring out the cd and changing script.py so that you can pass it the directory to operate on as a command-line argument. (Briefly, import sys and examine the value of sys.argv[1], though you'll often want to have option parsing and support for multiple arguments; argparse from the Python standard library is slightly intimidating, but there are friendly third-party wrappers like click.)
As an aside, many beginners seem to think the location of your executable is going to be the working directory when it executes. This is obviously not the case; otherwise /bin/ls would only list files in /bin.
To get rid of the cd problem mentioned in a comment, a minimal fix is
for tgz in $(find . -name "file.tar.gz")
do
mkdir -p work
tar -C work -x -z -f "$tgz"
(cd work; python ../script.py)
rm -rf work
done
Again, if you can change the Python script so it doesn't need its input files in the current directory, this can be simplified further. Notice also the preference for lower case for your variables, and the use of quoting around variables which contain file names. The use of find in a command substitution is still slightly broken (it can't work for file names which contain whitespace or shell metacharacters) but maybe that's a topic for a separate question.
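One way the remaining whitespace problem might be avoided is to let find start the per-archive work itself instead of looping over a command substitution. The sketch below makes two assumptions beyond the answer: the helper name for_each_tarball is made up, and it uses a fresh mktemp -d directory instead of ./work. Because of the second change, the command to run has to be given by absolute path rather than as ../script.py.

```shell
# Hypothetical helper: for every file.tar.gz under $1, unpack it into a
# temporary directory, run the remaining arguments as a command inside
# that directory, then remove the directory again.
for_each_tarball() {
    root=$1
    shift
    find "$root" -type f -name file.tar.gz -exec sh -c '
        tgz=$1
        shift
        work=$(mktemp -d) || exit 1
        # tar -C unpacks into $work without changing our own directory;
        # the subshell keeps the cd local to the command being run.
        tar -C "$work" -x -z -f "$tgz" && (cd "$work" && "$@")
        rm -rf "$work"
    ' _ {} "$@" \;
}

# Example invocation (commented out):
# for_each_tarball . python "$PWD/script.py"
```

Since find hands each path to sh as a single argument, names containing spaces or shell metacharacters reach tar unchanged.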
I am new to bash and I am trying to cd into all subdirectories of a parent directory and execute a command on all the files these subdirectories contain. But it's not working.
for subdir in $parentdirectory
do
for file in $subdir
do
ngram -lm somefilename.lm -ppl file
done
done
There are many ways to do this, but one of them requires you to change into each directory explicitly. Assuming $parentdirectory is correctly initialized, you could look into something like:
for subdir in "$parentdirectory"/*/
do
( # subshell, so the cd does not affect the outer loop
cd "$subdir" || exit # go into the subdir
for file in * # glob expansion
do
ngram -lm somefilename.lm -ppl "$file"
done
) # leaving the subshell takes us back to where we started
done
Also have a look at the excellent Advanced Bash-Scripting Guide: http://tldp.org/LDP/abs/html/loops1.html
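The cd can also be avoided altogether by globbing the files through their subdirectory path. Here is a minimal sketch of that idea; each_file_in_subdirs is a hypothetical name, and the command to run is passed in so that the file path is appended as its last argument.

```shell
# Hypothetical helper: run a command once for every regular file in
# every immediate subdirectory of $1. The file path is appended as the
# last argument of the command.
each_file_in_subdirs() {
    parent=$1
    shift
    for subdir in "$parent"/*/; do
        [ -d "$subdir" ] || continue      # glob matched nothing
        for file in "$subdir"*; do
            [ -f "$file" ] || continue    # skip nested directories
            "$@" "$file"
        done
    done
}

# Example invocation (commented out; ngram is the tool from the question):
# each_file_in_subdirs "$parentdirectory" ngram -lm somefilename.lm -ppl
```

Because the working directory never changes, a relative path such as somefilename.lm keeps referring to the same file for every subdirectory.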
If you want to do this compactly, you could use find -exec.
Such as:
# add a file called foo into every subdirectory
find . -type d -exec sh -c 'touch "$0/foo"' {} \;
Or, if you wanted to echo a string into each of those files you just created:
# find all files and append 'ABC' into them
find . -type f -exec sh -c 'echo "ABC" >> "$0"' {} \;
The find -exec combo is an extremely powerful tool that can save you a fair bit of directory and file navigation. It lets you achieve what sounds like the desired functionality without having to descend and ascend through the directory structure yourself.
Also, as you can probably guess, this kind of thing can go horribly wrong if you're not careful, so use with great caution.
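A slightly more conventional variant of the same idea, sketched here under the assumption that a helper function is acceptable (append_to_all is a hypothetical name), fills $0 with a placeholder and reads the file names from the positional parameters. With + instead of \; find starts far fewer shells.

```shell
# Hypothetical helper: append one line of text to every regular file
# below the given directory.
append_to_all() {
    root=$1
    text=$2
    # "_" fills $0 of the inner shell; the text becomes $1 and the
    # file names collected by find follow it.
    find "$root" -type f -exec sh -c '
        text=$1
        shift
        for f in "$@"; do
            printf "%s\n" "$text" >> "$f"
        done
    ' _ "$text" {} +
}

# Example invocation (commented out, since it modifies files):
# append_to_all . ABC
```

Passing the text and the file names as arguments, rather than pasting them into the script string, means the inner shell never re-parses them.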
I have been given a list of folders which need to be found and copied to a new location.
I have basic knowledge of bash and have created a script to find and copy.
The basic command I am using is working, to a certain degree:
find ./ -iname "*searchString*" -type d -maxdepth 1 -exec cp -r {} /newPath/ \;
The problem I want to resolve is that each found folder contains the files that I want, but also contains subfolders which I do not want.
Is there any way to limit the recursion so that only the files at the root level of the found folder are copied: all subdirectories and files therein should be ignored.
Thanks in advance.
If you remove -r, cp doesn't copy directories:
cp *searchstring*/* /newpath
The command above copies dir1/file1 to /newpath/file1, but these commands copy it to /newpath/dir1/file1:
cp --parents *searchstring*/*(.) /newpath
for GNU cp and zsh
. is a qualifier for regular files in zsh
cp --parents dir1/file1 dir2 copies file1 to dir2/dir1 in GNU cp
t=/newpath;for d in *searchstring*/;do mkdir -p "$t/$d";cp "$d"* "$t/$d";done
find *searchstring*/ -type f -maxdepth 1 -exec rsync -R {} /newpath \;
-R (--relative) is like --parents in GNU cp
find . -ipath '*searchstring*/*' -type f -maxdepth 2 -exec ditto {} /newpath/{} \;
ditto is only available on OS X
ditto file dir/file creates dir if it doesn't exist
So ... you've been given a list of folders. Perhaps in a text file? You haven't provided an example, but you've said in comments that there will be no name collisions.
One option would be to use rsync, which is available as an add-on package for most versions of Unix and Linux. Rsync is basically an advanced copying tool -- you provide it with one or more sources, and a destination, and it makes sure things are synchronized. It knows how to copy things recursively, but it can't be told to limit its recursion to a particular depth, so the following will copy each item specified to your target, but it will do so recursively.
xargs -L 1 -J % rsync -vi -a % /path/to/target/ < sourcelist.txt
If sourcelist.txt contains a line with /foo/bar/slurm, then the slurm directory will be copied in its entirety to /path/to/target/slurm/. But this would include directories contained within slurm.
This will work in pretty much any shell, not just bash. But it will fail if one of the lines in sourcelist.txt contains whitespace, or various special characters. So it's important to make sure that your sources (on the command line or in sourcelist.txt) are formatted correctly. Also, rsync has different behaviour if a source directory includes a trailing slash, and you should read the man page and decide which behaviour you want.
You can sanitize your input file fairly easily in sh, or bash. For example:
#!/bin/sh
# Avoid commented lines...
grep -v '^[[:space:]]*#' sourcelist.txt | while IFS= read -r line; do
# Remove any trailing slash, just in case
source=${line%%/}
# make sure source exist before we try to copy it
if [ -d "$source" ]; then
rsync -vi -a "$source" /path/to/target/
fi
done
But this still uses rsync's -a option, which copies things recursively.
I don't see a way to do this using rsync alone. Rsync has no equivalent of find's -maxdepth option. But I can see doing this in two passes -- once to copy all the directories, and once to copy the files from each directory.
So I'll make up an example, and assume further that folder names do not contain special characters like spaces or newlines. (This is important.)
First, let's do a single-pass copy of all the directories themselves, not recursing into them:
xargs -L 1 -J % rsync -vi -d % /path/to/target/ < sourcelist.txt
The -d option creates the directories that were specified in sourcelist.txt, if they exist.
Second, let's walk through the list of sources, copying each one:
# Basic sanity checking on input...
grep -v '^[[:space:]]*#' sourcelist.txt | while IFS= read -r line; do
if [ -d "$line" ]; then
# Strip trailing slashes, as before
source=${line%%/}
# Grab the directory name from the source path
target=${source##*/}
rsync -vi -a "$source/" "/path/to/target/$target/"
fi
done
Note the trailing slash after $source on the rsync line. This causes rsync to copy the contents of the directory, rather than the directory.
Does all this make sense? Does it match your requirements?
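If rsync is not a hard requirement, the same list-driven loop could be sketched with plain mkdir and cp, which makes it easy to leave subdirectories out. Note that this swaps cp in for rsync, so it is not the method shown above; the name copy_listed_dirs is hypothetical, and hidden files are skipped because the * glob does not match them.

```shell
# Hypothetical helper: for every directory listed in the file $1, copy
# only the regular files at its top level into $2/<directory name>.
copy_listed_dirs() {
    list=$1
    target=$2
    # Skip commented lines, then read one path per line.
    grep -v '^[[:space:]]*#' "$list" | while IFS= read -r line; do
        [ -d "$line" ] || continue     # ignore blank or missing entries
        src=${line%/}                  # strip one trailing slash
        name=${src##*/}                # last path component
        mkdir -p "$target/$name" || continue
        for f in "$src"/*; do
            # Regular files only, so subdirectories are left behind.
            if [ -f "$f" ]; then cp -- "$f" "$target/$name/"; fi
        done
    done
}

# Example invocation (commented out; names are from the answer above):
# copy_listed_dirs sourcelist.txt /path/to/target
```

Reading with IFS= read -r and quoting every expansion lets folder names with spaces through, though names containing newlines would still break the line-based list.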
You can use find's -ipath test:
find . -maxdepth 2 -ipath './*searchString*/*' -type f -exec cp '{}' '/newPath/' ';'
Notice the path starts with ./ to match find's search directory, ends with /* in order to exclude files in the top level directory, and maxdepth is set to 2 to only recurse one level deep.
Edit:
Re-reading your comments, it seems like you want to preserve the directory you're copying from? E.g. when searching for foo*:
./foo1/* ---> copied to /newPath/foo1/* (not to /newPath/*)
./foo2/* ---> copied to /newPath/foo2/* (not to /newPath/*)
Also, the other requirement is to keep maxdepth at 1 for speed reasons.
(As pointed out in the comments, the following solution has security issues for specially crafted names)
Combining both, you could use this:
find . -maxdepth 1 -type d -iname '*searchString*' -exec sh -c "mkdir -p '/newPath/{}'; cp "{}/*" '/newPath/{}/' 2>/dev/null" ';'
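One way the quoting problem with specially crafted names might be sidestepped is to keep {} out of the script text and hand the directory name to sh as a positional parameter. The following is a sketch under that assumption; copy_top_level_files is a hypothetical name, and the destination is expected to be an absolute path because the function changes into the source directory in a subshell.

```shell
# Hypothetical helper: inside $1, find directories whose name matches
# the pattern $2 (case-insensitively) and copy only their top-level
# regular files to $3/<directory name>.
copy_top_level_files() {
    src=$1
    pattern=$2
    dest=$3    # assumed to be an absolute path
    (
        cd "$src" || exit 1
        # The script below never contains {}, so find has nothing to
        # substitute into shell code; names arrive as plain arguments.
        find . -maxdepth 1 -type d -iname "$pattern" -exec sh -c '
            dest=$1
            dir=$2
            mkdir -p "$dest/$dir" || exit 1
            for f in "$dir"/*; do
                if [ -f "$f" ]; then cp -- "$f" "$dest/$dir/"; fi
            done
        ' _ "$dest" {} \;
    )
}

# Example invocation (commented out; /newPath is from the question):
# copy_top_level_files . '*searchString*' /newPath
```

The [ -f ] test replaces the 2>/dev/null trick: subdirectories are skipped on purpose instead of having cp's error messages hidden.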
Edit 2:
Why not ditch find altogether and use a pure bash solution:
for d in *searchString*/; do mkdir -p "/newPath/$d"; cp "$d"* "/newPath/$d"; done
Note the / at the end of the search string, causing only directories to be considered for matching.
For going over some recovered data, I am working on a script that recursively goes through folders & files and finally runs file on them, to check if they are likely fully recovered from a certain backup or not. (recovered files play, and are identified as mp3 or other audio, non-working files as ASCII-Text)
For now I would just be satisfied with having it go over my test folder structure, print all folders & corresponding files. (printing them mainly for testing, but also because I would like to log where the script currently is and how far along it is in the end, to verify what has been processed)
I tried using 2 for loops, one for the folders, then one for the files, so that ideally it would take 1 folder, list the files in there (or potentially delve into subfolders), show below each folder only the files in that folder, and then move on to the next.
Such as:
Folder1
- File 1
- File 2
-- Subfolder
-- File3
-- File4
Folder2
- File5
However, this doesn't seem to work with the approaches that are normally proposed (such as for loops). I got as far as using "find . -type d" for the directories and "find . -type f" or "find * -type f" (so that it doesn't go into subdirectories). However, when just printing the paths/files in order to check whether it ran as I wanted it to, it became obvious that it didn't work.
It always seemed to first print all the directories (first loop) and then all the files (second loop). For keeping track of what it is doing and for making it easier to know what was checked/recovered I would like to do this in a more orderly fashion as explained above.
So is it that I just did something wrong, or is this maybe a general limitation of the for loop in bash?
Another problem that could be related: Although assigning the output of find to an array seemed to work, it wasn't accessible as an array ...
Example for loop:
for folder in '$(find . -type d)' ; do
echo $folder
let foldercounter++
done
Arrays:
folders=("$(find . -type d)")
#As far as I know this should assign the output as an array
#However, it is not really assigned properly somehow as
echo "$folders[1]"
# does not work (quotes necessary for spaces)
A find ... -exec ... solution @H.-Dirk Schmitt was referring to might look something like:
find . -type f -exec sh -c '
case $(file "$1") in
*Audio file*)
echo "$1 is an audio file"
;;
*ASCII text*)
echo "$1 is an ascii text file"
;;
esac
' _ {} ';'
For going over some recovered data, I am working on a script that recursively goes through folders & files and finally runs file on them, to check if they are likely fully recovered from a certain backup or not. (recovered files play, and are identified as mp3 or other audio, non-working files as ASCII-Text)
If you want to run file on every file and directory in the current directory, including its subdirectories and so on, you don't need to use a Bash for-loop, because you can just tell find to run file:
find -exec file '{}' ';'
(The -exec ... ';' option runs the command ... on every matched file or directory, replacing the argument {} with the path to the file.)
If you only want to run file on regular files (not directories), you can specify -type f:
find -type f -exec file '{}' ';'
If you (say) want to just print the names of directories, but run the above on regular files, you can use the -or operator to connect one directive that uses -type d and one that uses -type f:
find -type d -print -or -type f -exec file '{}' ';'
Edited to add: If desired, the effect of the above commands can be achieved in pure Bash (plus the file command, of course), by writing a recursive shell function. For example:
function foo () {
local file
for file in "$1"/* ; do
if [[ -d "$file" ]] ; then
echo "$file"
foo "$file"
else
file "$file"
fi
done
}
foo .
This differs from the find command in that it will sort the files more consistently, and perhaps in gritty details such as handling of dot-files and symbolic links, but is broadly the same, so may be used as a starting-point for further adjustments.
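To get closer to the folder-by-folder listing asked for in the question, the recursive function could carry an indentation string along. The sketch below is one possible shape, not the answer's code: walk is a hypothetical name, it only prints names (a call to file "$entry" could be added in the else branch), it skips dot-files, and it follows symbolic links to directories.

```shell
# Hypothetical helper: print the tree below $1, indenting each level by
# two more spaces. The ( ... ) body runs in a subshell, so the variables
# of one recursion level cannot clobber those of another.
walk() (
    dir=$1
    indent=$2
    for entry in "$dir"/*; do
        [ -e "$entry" ] || continue    # empty directory: glob unmatched
        name=${entry##*/}
        if [ -d "$entry" ]; then
            printf '%s%s/\n' "$indent" "$name"
            walk "$entry" "$indent  "
        else
            printf '%s%s\n' "$indent" "$name"
        fi
    done
)

# Example invocation (commented out):
# walk . ""
```

Because each directory is printed immediately before its own contents, the output stays grouped per folder, unlike two separate find loops over directories and files.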