I have an assignment to count all the changes from all the files from a git open-source project. I know that if I use:
git log --pretty=oneline <filename> | wc -l
I will get the number of changes to that file (I use Git Bash on Windows 10).
My idea is to use
find .
and to pipe its output into the git command. How can I do the redirecting? I tried:
$ find . > git log --pretty=online | wc -l
0
find: unknown predicate `--pretty=online'
and
$ find . | git log --pretty=online | wc -l
fatal: invalid --pretty format: online
0
You can do much better than that,
git log --pretty='' --name-only | sort | uniq -c
That's "show only the names of the files changed in each commit, with no other metadata, then sort that list so uniq can easily count the occurrences of each name."
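If you also want that list ordered by how often each file changed, append a numeric sort (a small extension of the command above; the added grep just drops the blank separator lines git may emit):

```shell
# Commits-per-file, most frequently changed first.
git log --pretty='' --name-only | grep -v '^$' | sort | uniq -c | sort -rn
```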
You'll need to loop over the results of find.
find -type f | grep -v '^\./\.git' |
while IFS= read -r f; do
count=$(git log --oneline -- "$f" | wc -l)
echo "${f} - ${count}"
done | grep -v ' 0$'
Your find is okay, but I'd restrict it to just files (git doesn't track directories explicitly) and remove the .git folder (we don't care about those files). Pipe that into a loop (I'm using a while), and then your git log command works just fine. Lastly, I'm going to strip anything with a count of 0, since I may have files that are part of .gitignore I don't want to show up (e.g., things in __pycache__).
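A variation (a sketch, not the answer's method): if you only care about tracked files, you can let git produce the file list instead of find, which skips the .git folder and ignored files automatically:

```shell
# Count commits per tracked file; `git ls-files` lists only tracked paths.
git ls-files |
while IFS= read -r f; do
    count=$(git log --oneline -- "$f" | wc -l)
    echo "$f - $count"
done
```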
Related
How would I modify this code to give me the full file path of the last modified file in the code directory, including nested sub-directories?
# Gets the last modified file in the code directory.
get_filename(){
cd "$code_directory" || no_code_directory_error # Stop script if directory doesn't exist.
last_modified=$(ls -t | head -n1)
echo "$last_modified"
}
Use find instead of ls, because parsing the output of ls is an anti-pattern.
Use a Schwartzian transform to prefix your data with a sort key.
Sort the data.
Take what you need.
Remove the sort key.
Post process the data.
find "$code_directory" -type f -printf '%T@ %p\n' |
sort -rn |
head -1 |
sed 's/^[0-9.]\+ //' |
xargs readlink -f
You can use the realpath utility.
# Gets the last modified file in the code directory.
get_filename(){
cd "$code_directory" || no_code_directory_error # Stop script if directory doesn't exist.
last_modified=$(ls -t | head -1)
echo "$last_modified"
realpath "$last_modified"
}
Output:
blah.txt
/full/path/to/blah.txt
ls -t sorts by modification time, and if you want just the first one you can add | head -1. The -R flag lists files recursively, but note that ls -tR sorts within each directory rather than pooling all files and sorting them together, so instead you can use
find . -type f -printf "%T@ %f\n" | sort -rn > out.txt
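To pull just the newest file out of that listing instead of writing the whole thing to a file (a sketch; note it uses %p so the full path survives, where %f would give only the basename):

```shell
# Print the path of the most recently modified file under the current directory.
find . -type f -printf '%T@ %p\n' | sort -rn | head -1 | cut -d' ' -f2-
```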
I have a directory like this:
A/B/C/D/data. Inside it exist folders like 202012, 202013, etc.
Now, I want to find all folders starting with 2020 inside data folder and then obtain the name of the one which was created most recently. So, I did this,
find /A/B/C/D/data/ -name "2020*" -type d. This gave me all folders starting with 2020. Now, when I am piping the output of this to ls -t | head -1 using the | operator, it simply returns the data folder. My expectation is that it should return the latest folder inside the data folder.
I am doing like this,
find /A/B/C/D/data/ -name "2020*" -type d | ls -t | head -1
How can I do this?
Thanks!
shopt -s globstar # Enable **
ls -dFt /A/B/C/D/data/**/2020*
Note that this would also list files starting with 2020, not only directories. For this reason, I used the -F flag. This appends a / to each directory, so you can distinguish files and directories more easily. If you are sure that your directory entries don't contain newline characters or slashes, you can pipe the output to | grep '/$' to get only the directories.
If you need this for a quick inspection in an interactive shell, I would do a ls -dFtr .... to get them sorted in reverse order. This makes sure that the ones you are interested in, show up at the end of the list.
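If you want the glob itself to match only directories, a trailing slash in the pattern does it (a sketch, using the question's hypothetical tree under data/):

```shell
shopt -s globstar                 # bash 4+: enable ** recursive matching
ls -dt /A/B/C/D/data/**/2020*/    # trailing / makes the glob match directories only
```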
You need to run the output of find through xargs to give it as command line arguments to ls:
find /A/B/C/D/data/ -name "2020*" -type d | xargs ls -t -d | head -1
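If any of the directory names might contain spaces, a NUL-separated variant (GNU find and xargs; a sketch) is safer:

```shell
find /A/B/C/D/data/ -name "2020*" -type d -print0 | xargs -0 ls -td | head -1
```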
git diff --name-only or git diff --name-status will list all files that have been changed; however, there is no command that lists only the folder names containing changed files.
For example, with this directory tree:
test/
|
|__B/
|
|____b1.txt
|
|__C/
|
|____c1.txt
If b1.txt and c1.txt have changed, I'd like to get B and C as the output.
Here's a way:
git diff --name-only | xargs -L1 dirname | uniq
Explanation
git diff --name-only: list all changed files
xargs -L1 dirname: remove filenames to keep only directories
uniq: remove duplicates
You can create an alias so you don't have to type the whole command each time you need it:
alias git-diff-folders="git diff --name-only | xargs -L1 dirname | uniq"
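One caveat: uniq only collapses adjacent duplicates. git diff --name-only happens to emit paths in sorted order, so the pipeline works, but a variant that does not depend on that ordering is (a sketch):

```shell
git diff --name-only | xargs -L1 dirname | sort -u
```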
This command would give results of top level directories.
git status | cut -d'/' -f1
For a more in-depth solution I am currently working with
${PWD##*/} in bash to find a more complete solution to your problem.
I am trying to write a solution in git that is similar to this npm solution:
get the path where npm gulp called from
I'm no longer searching for the answer on this one;
git diff --name-only | xargs -L1 dirname | uniq
was the correct answer, provided by Pᴇʜ.
It works on my Mac after his edits.
this command allows me to login to a server, to a specific directory from my pc
ssh -t xxx.xxx.xxx.xxx "cd /directory_wanted ; bash"
How can I then run an operation in that directory? I want to be able to delete all files except the N newest:
find ./tmp/ -maxdepth 1 -type f -iname '*.tgz' | sort -n | head -n -10 | xargs rm -f
This command should work:
ls -t *.tgz | tail -n +11 | xargs rm -f
Warning: Before doing rm -f, confirm that the files being listed by ls -t *.tgz | tail -n +11 are as expected.
How it works:
ls lists the contents of the directory; the -t flag sorts by modification time (newest first). See the man page of ls.
tail -n +11 outputs starting from line 11. Refer to the man page of tail for more details.
If the system is Mac OS X, you can also sort by creation time: use ls with the -Ut flags. This will sort the contents based on creation time.
You can use this command,
ssh -t xxx.xxx.xxx.xxx "cd /directory_wanted; ls -t *.tgz | tail -n +11 | xargs rm -f; bash"
Inside the quotes, you can add whatever operations are to be performed on the remote machine, but every command should be terminated with a semicolon (;).
Note: Included the same command suggested by silentMonk. It is simple and it is working. But verify it once before performing the operation.
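A cautious pattern for that verification step (a sketch): prefix the destructive command with echo first, inspect what would be removed, then drop the echo. xargs -r (GNU) avoids running rm when the list is empty:

```shell
# Dry run: print what WOULD be deleted (everything but the 10 newest .tgz).
ls -t *.tgz | tail -n +11 | xargs -r echo rm -f
# When satisfied, remove the 'echo':
#   ls -t *.tgz | tail -n +11 | xargs -r rm -f
```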
I want to copy the files from the most recent directory created. How would I do so in unix?
For example, if I have the directories names as date stamp as such:
/20110311
/20110318
/20110325
This is the answer to the question I think you are asking.
When I deal with many directories that have date/time stamps in the name, I always take the approach that you have, which is YYYYMMDD - the great thing about that is that date order is then also alphabetical order. In most shells (certainly in bash, and I am 90% sure of the others), '*' expansion is done alphabetically, and by default 'ls' returns alphabetical order. Hence
ls | head -1
ls | tail -1
give you the earliest and the latest dates in the directory.
This can be extended to only keep the last 5 entries etc.
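For example, relying on that same name-equals-date ordering (GNU head for the negative count):

```shell
ls | tail -5      # the five most recent YYYYMMDD entries
ls | head -n -5   # everything EXCEPT the five most recent (GNU head)
```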
lastdir=`ls -tr <parentdir> | tail -1`
After some experimenting, I came up with the following:
The unix stat command is useful here. The '-t' option causes stat to print its output in terse mode (all in one line), and the 13th element of that terse output is the unix timestamp (seconds since epoch) for the last-modified time. This command will list all directories (and sub-directories) in order from newest-modified to oldest-modified:
find -type d -exec stat -t {} \; | sort -r -n -k 13,13
Hopefully the "terse" mode of stat will remain consistent in future releases of stat!
Here's some explanation of the command-line options used:
find -type d # only find directories
find -exec [command] {} \; # execute given command against each *found* file.
sort -r # reverse the sort
sort -n # numeric sort (100 should not appear before 2!)
sort -k M,N # only sort the line using elements M through N.
Returning to your original request, to copy files, maybe try the following. To output just a single directory (the most recent), append this to the command (notice the initial pipe), and feed it all into your 'cp' command with backticks.
| head --lines=1 | sed 's/\ .*$//'
The trouble with the ls based solutions is that they are not filtering just for directories. I think this:
cp `find . -mindepth 1 -maxdepth 1 -type d -exec stat -c "%Y %n" {} \; |sort -n -r |head -1 |awk '{print $2}'`/* /target-directory/.
might do the trick, though note that that will only copy files in the immediate directory. If you want a more general answer for copying anything below your newest directory over to a new directory I think you would be better off using rsync like:
rsync -av `find . -mindepth 1 -maxdepth 1 -type d -exec stat -c "%Y %n" {} \; |sort -n -r |head -1 |awk '{print $2}'`/ /target-directory/
but it depends a bit which behaviour you want. The explanation of the stuff in the backticks is:
. - the current directory (you may want to specify an absolute path here)
-mindepth/-maxdepth - restrict the find command only to the immediate children of the current directory
-type d - only directories
-exec stat .. - outputs the modified time and the name of the directory from find
sort -n -r |head -1 | awk '{print $2}' - date orders the directory and outputs the name of the most recently modified
If your directories are named YYYYMMDD like your question suggests, take advantage of the alphabetic globbing.
Put all directories in an array, and then pick the first one:
dirs=(*/); first_dir="$dirs";
(This is actually a shortcut for first_dir="${dirs[0]}";.)
Similarly, for the last one:
dirs=(*/); last_dir="${dirs[$((${#dirs[@]} - 1))]}";
Ugly syntax, but this is what it breaks down to:
# Create an array of all directories inside the working directory.
dirs=(*/);
# Get the number of entries in the array.
num_dirs=${#dirs[@]};
# Calculate the index of the last entry.
last_index=$(($num_dirs - 1));
# Get the value at the last index.
last_dir="${dirs[$last_index]}";
I know this is an old question with an accepted answer, but I think this method is preferable as it does everything in Bash. No reason to spawn extra processes, let alone parse the output of ls. (Which, admittedly, should be fine in this particular case of YYYYMMDD names.)
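Worth noting: bash 4.3 and newer also support negative array indices, which shortens the last-entry lookup considerably (a sketch):

```shell
dirs=(*/)
first_dir="${dirs[0]}"
last_dir="${dirs[-1]}"   # negative index counts from the end (bash 4.3+)
```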
Please try the following command:
ls -1tr | tail -1
find ~ -type d | xargs ls -dltra
This one is simple and useful, and I learned it recently. (Note the xargs and the -d flag: ls ignores its standard input, so piping find straight into ls would just list the current directory.)
This command shows the results in reverse chronological order.
I wrote a command that identifies which folder or file inside a folder was created most recently. It seems to work well :)
#!/bin/sh
path=/var/folder_name
newest=`find "$path" -maxdepth 1 -exec stat -t {} \; | sed 1d | sort -r -k 14 | head -1 | awk '{print $1}' | sed 's/\.\///g'`
find "$path" -maxdepth 1 | sed 1d | grep -v "$newest"