SVN: How to know in which revision a file was deleted? (Windows)

Given that I'm using the svn command line on Windows, how do I find the revision number in which a file was deleted? On Windows there is no fancy stuff like grep, and I'm attempting to use the command line only, without TortoiseSVN. Thanks in advance!
EDIT:
I saw a few posts, like examining the history of a deleted file, but they did not answer my question.
Is there any way other than running svn log -v url > log.out and searching the output in Notepad?

Install Cygwin.
I use this:
svn log -v --limit <nr> | grep -E '<fileName>|^r' | grep -B 1 <fileName>
where
fileName - the name of the file or any pattern which matches it
nr - the number of latest revisions in which to look
This will give you the revisions for all the actions (add, delete, modify, replace) concerning the file, but with a simple tweak to the grep you can get the revisions for deletions only.
(Obviously, --limit is optional; however, you usually have an idea of how deep you need to search, which gains you some performance.)
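For example, since svn log -v prefixes each deleted path with D (the action letter is indented by three spaces in the verbose output), anchoring the first grep on that letter yields deletions only; a sketch using the same placeholders as above:
svn log -v --limit <nr> | grep -E '^   D .*<fileName>|^r' | grep -B 1 '<fileName>'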

The log is the place to look. I know you don't want to hear that answer, but that is where you need to look for deleted files in SVN.
The reason for this is simply that a deleted file is not visible after it's been deleted. The only place to find out about its existence at all is either in the logs, or by fetching out an earlier revision prior to it being deleted.
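For example, once the log tells you the file was deleted in some revision, you can fetch its last surviving version with a peg revision; a sketch with a hypothetical URL and revision number (if the file was deleted in r1234, its last version exists at r1233):
svn cat "https://server/repo/path/to/file@1233" > file.last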
The easiest way I know of to deal with this problem is to move away from the command line and use a GUI tool such as TortoiseSVN.
TortoiseSVN links itself into the standard Windows File Explorer, so it's very easy to use. In the context of answering this question, you would still use it to look at the logs, but it becomes a much quicker exercise:
Browse to the SVN folder you want to examine, then right-click on the folder icon and select TortoiseSVN -> Show Log from the context menu.
You'll now get a window showing all the revisions made in that folder. In particular, it is easy to see which revisions have had additions and deletions, because the list includes a set of Action icons for each revision. You can double-click on a revision to get a list of files that were changed (or go straight into a diff view if only one file was changed).
So you can easily see which revisions have had deletions, and you can quickly click them to find out which files were involved. It really is that easy.
I know you're asking about the command-line, but for administrative tasks like this, a GUI browser really does make sense. It makes it much quicker to see what's happening compared with trying to read through pages of cryptic text (no matter how well versed you are at reading that text).

This question was posted and answered some time ago.
In this answer I'll try to show a flexible way to get the information asked for, and to extend it.
In Cygwin, use svn log in combination with awk:
REPO_URL=https://<hostname>/path/to/repo
FILENAME=/path/to/file
svn log ${REPO_URL} -v --search "${FILENAME}" | \
awk -v var="^ [D] ${FILENAME}$" \
'/^r[0-9]+/{rev=$1}; \
$0 ~ var {print rev $0}'
svn log ${REPO_URL} -v --search "${FILENAME}" asks svn log for a verbose log containing ${FILENAME}. This reduces the data transfer.
The result is piped to awk. awk receives ${FILENAME} via -v in the variable var, together with the search pattern: var="^ [D] ${FILENAME}$".
In the awk program, /^r[0-9]+/ {rev=$1} assigns the revision number to rev if the line matches /^r[0-9]+/.
For every line matching ^ [D] ${FILENAME}$, awk prints the stored revision number rev and the line: $0 ~ var {print rev $0}
If you're interested not only in the deletion of the file but also in its creation, modification, or replacement, change the D in var="^ [D] ${FILENAME}$" to DAMR.
The following will give you all the changes:
svn log ${REPO_URL} -v --search "${FILENAME}" | \
awk -v var="^ [DAMR] ${FILENAME}$" \
'/^r[0-9]+/ {rev=$1}; \
$0 ~ var {print rev $0}'
And if you're interested in username, date and time:
svn log ${REPO_URL} -v --search "${FILENAME}" | \
awk -v var="^ [DAMR] ${FILENAME}$" \
'/^r[0-9]+/ {rev=$1;user=$3;date=$5;time=$6}; \
$0 ~ var {print rev " | " user " | " date " | " time " | " $0}'
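With hypothetical values filled in, a matching deletion record would then print along these lines:
r1234 | jdoe | 2015-03-04 | 10:02:31 |    D /path/to/file
(the first four fields come from svn log's revision header line; the rest is the verbose action line itself).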

Related

How can I use Git to identify function changes across different revisions of a repository?

I have a repository with a bunch of C files. Given the SHA hashes of two commits,
<commit-sha-1> and <commit-sha-2>,
I'd like to write a script (probably bash/ruby/python) that detects which functions in the C files in the repository have changed across these two commits.
I'm currently looking at the documentation for git log, git commit and git diff. If anyone has done something similar before, could you give me some pointers about where to start or how to proceed?
That doesn't look too good, but you could combine git with your favorite tagging system, such as GNU Global, to achieve that. For example:
#!/usr/bin/env sh
global -f main.c | awk '{print $NF}' | cut -d '(' -f1 | while read i
do
    if [ $(git log -L:"$i":main.c HEAD^..HEAD | wc -l) -gt 0 ]
    then
        printf "%s() changed\n" "$i"
    else
        printf "%s() did not change\n" "$i"
    fi
done
First, you need to create a database of functions in your project:
$ gtags .
Then run the above script to find functions in main.c that were
modified since the last commit. The script could of course be more
flexible, for example it could handle all *.c files changed between two commits, as reported by git diff --stat (see the sketch after the excerpt below).
Inside the script we use -L option of git log:
-L <start>,<end>:<file>, -L :<funcname>:<file>
Trace the evolution of the line range given by
"<start>,<end>" (or the function name regex <funcname>)
within the <file>. You may not give any pathspec
limiters. This is currently limited to a walk starting from
a single revision, i.e., you may only give zero or one
positive revision arguments. You can specify this option
more than once.
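A sketch of the two-commit extension mentioned above, reusing the same tag-extraction pipeline; the <commit-sha-1>/<commit-sha-2> placeholders are to be filled in:
#!/usr/bin/env sh
# list the *.c files changed between the two commits,
# then test each function found by GNU Global in that range
git diff --name-only <commit-sha-1> <commit-sha-2> -- '*.c' | while read -r f
do
    global -f "$f" | awk '{print $NF}' | cut -d '(' -f1 | while read -r i
    do
        # a non-empty -L log means the function changed in that range
        if [ $(git log -L:"$i":"$f" <commit-sha-1>..<commit-sha-2> | wc -l) -gt 0 ]
        then
            printf "%s: %s() changed\n" "$f" "$i"
        fi
    done
done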
See this question.
Bash script:
#!/usr/bin/env bash
git diff | \
grep -E '^(@@)' | \
grep '(' | \
sed 's/@@.*@@//' | \
sed 's/(.*//' | \
sed 's/\*//' | \
awk '{print $NF}' | \
uniq
Explanation:
1: Get diff
2: Get only lines with hunk headers; if the 'optional section heading' of a hunk header exists, it will be the function definition of a modified function
3: Pick only hunk headers containing open parentheses, as they will contain function definitions
4: Get rid of the '@@ [old-file-range] [new-file-range] @@' part of the lines
5: Get rid of everything after opening parentheses
6: Get rid of '*' from pointers
7: [See 'awk']: Print the last field (i.e. column) of the records (i.e. lines).
8: Get rid of duplicate names.
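For reference, a hunk header with such a section heading might look like this:
@@ -24,7 +24,8 @@ static int parse_args(int argc, char **argv)
which the pipeline above reduces to just parse_args.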

Mac OS terminal solution to remove from one text file the lines found in another text file

I work in SEO and sometimes I have to manage lists of domains to be considered for certain actions in our campaigns. On my iMac, I have 2 lists, one provided for consideration - unfiltered.txt - and another that has listed the domains I've already analyzed - used.txt. The one provided for consideration, the new one (unfiltered.txt), looks like this:
site1.com
site2.com
domain3.net
british.co.uk
england.org.uk
auckland.co.nz
... etc
The list of domains to be used as a filter, i.e. to be eliminated (used.txt), looks like this:
site4.org
site5.me
site6.co.nz
gland.org.uk
kland.co.nz
site7.de
site8.it
... etc
Is there a way to use my OS X terminal to remove from unfiltered.txt all the lines found in used.txt? I found a software solution that partially solves the problem, but aside from the exact entries in used.txt it also eliminates domains that merely contain them as substrings. That makes the filter broader than intended and eliminates domains I still need.
For example, if my unfiltered.txt contains a domain named fogland.org.uk it will be automatically eliminated if in my used.txt file I have a domain named gland.org.uk.
Files are pretty big (close to 100k lines). I have a pretty good configuration, with an SSD, a 7th-gen i7 and 16GB RAM, but I'd rather not let this run for hours just for this operation.
... hope it makes sense.
TIA
You can do that with awk. Pass both files to awk. While parsing the first file (where the overall record number NR equals the per-file record number FNR), make a note of each domain you have seen. Then, when parsing the second file, only print records that you have not seen in the first file:
awk 'FNR==NR{seen[$0]++;next} !seen[$0]' used.txt unfiltered.txt
Sample Output for your input data
site1.com
site2.com
domain3.net
british.co.uk
england.org.uk
auckland.co.nz
awk is included and delivered as part of macOS - no need to install anything.
I have always used
grep -v -x -F -f expunge.txt filewith.txt > filewithout.txt
to do this (-F matches fixed strings rather than regexes, and -x requires the match to cover the whole line, which avoids exactly the substring problem you describe). When "expunge.txt" is too large, you can do it in stages, cutting it into manageable chunks and filtering one after another:
cp filewith.txt original.txt
and loop as required:
grep -v -x -F -f chunkNNN.txt filewith.txt > filewithout.txt
mv filewithout.txt filewith.txt
You could even do this in a pipe:
grep -v -x -F -f chunk01.txt original.txt |\
grep -v -x -F -f chunk02.txt |\
grep -v -x -F -f chunk03.txt \
> purged.txt
(Note that only the first grep reads original.txt; the later stages must read from the pipe, otherwise each one would filter the original file again and undo the previous stages.)
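If you need to create the chunks in the first place, the standard split utility can do it; a minimal sketch (note that split's default alphabetic suffixes chunkaa, chunkab, ... differ from the chunkNNN.txt naming above):
split -l 20000 expunge.txt chunk
for f in chunk??; do
    grep -v -x -F -f "$f" filewith.txt > filewithout.txt
    mv filewithout.txt filewith.txt
done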
You can use comm. I haven't got a Mac here to check, but I expect it will be installed by default. Note that both files must be sorted. Then try:
comm -2 -3 unfiltered.txt used.txt
Check the man page for further details.
You can use comm and process substitution to do everything in one line:
comm -23 <(sort unfiltered.txt) <(sort used.txt) > unfiltered_new.txt
(-23 keeps only the lines unique to the first file, so unfiltered.txt has to come first.)
P.S. tested on my Mac running OSX 10.11.6 (El Capitan)

GREP: exclude file extensions in specific directory

My code takes the added, modified, deleted, renamed and copied files from git status -s and compares them with a list of file paths from a file.
git status -s |
grep -E "^M|^D|^A|^R|^C" |
awk '{if ($1~/M+/ || $1~/D+/ || $1~/A+/ || $1~/R+/ || $1~/C+/) print $2}' |
grep --file=$list_of_files --fixed-strings |
grep -r --exclude="*.jar" "SVCS/bus/projects/Resources/"
Prints out git status like M foo.txt
Does some "filtering" operations
More filtering operations
Takes the paths of the files to compare from the text file
Here I am trying to make so the last step would exclude .jar files from specific directory.
How can I do the last step? Or need to add something to the 4th step?
The simple fix is to change the last line to
grep -v 'SVCS/bus/projects/Resources/.*\.jar$'
but that really is some horrible code you have there.
Keeping in mind that grep | awk and awk | grep is an antipattern, how about this refactoring?
git status -s |
grep -E "^M|^D|^A|^R|^C" |
awk '{if ($1~/M+/ || $1~/D+/ || $1~/A+/ || $1~/R+/ || $1~/C+/)
... Hang on, what's the point of that? The grep already made sure that $1 contains one or more of those letters. The + quantifier is completely redundant here.
print $2}'
Will break on files with whitespace in them. This is a very common error which is aggravating because a lot of the time, the programmer knew it would break, but just figured "can't happen here".
git status -s | awk 'NR==FNR { files[$0] = 1; next }
/^[MDARC]/ { gsub(/^[MDARC]+ /, "");
if ($0 ~ /SVCS\/bus\/projects\/Resources\/.*\.jar$/)
next;
if ($0 in files) print }' "$list_of_files" -
The NR==FNR thing is a common idiom to read the first file into an array, then fall through to the next input file. So we read $list_of_files into the keys of the associative array files; then if the file name we read from git status is present in the keys, we print it. The condition to skip .jar files in a particular path is then a simple addition to this Awk script.
This assumes $list_of_files really is a list of actual files, as suggested by the file name. Your code will look for a match anywhere in that file, so a partial file name would also match (for example, if the file contains path/to/ick, a file named somepath/to/icktys/mackerel would match, and thus be printed). If that is the intended functionality, the above script will require some rather drastic modifications.

How to list installed go packages

To my knowledge, the Go distribution comes with some sort of package manager. After installing Go 1.4.1 I ran go help in order to find a sub-command capable of listing locally installed Go packages, but unfortunately found none.
So how do I do it?
goinstall is now history
goinstall was replaced by go get. go get is used to manage external / 3rd party libraries (e.g. to download them, update them, install them etc).
Type go help get to see command line help, or check out these pages:
Command go
About the go command (blog post)
If you want to list installed packages, you can do that with the go list command:
Listing Packages
To list packages in your workspace, go to your workspace folder and run this command:
go list ./...
./ tells it to start from the current folder, and ... tells it to go down recursively. Of course this works in any other folder, not just in your Go workspace (but usually that is what you're interested in).
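For example, in a hypothetical workspace containing a single module example.com/myapp, the output might look like:
$ go list ./...
example.com/myapp
example.com/myapp/cmd/server
example.com/myapp/internal/db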
List All Packages
Executing
go list ...
in any folder lists all the packages, including packages of the standard library first followed by external libraries in your go workspace.
Packages and their Dependencies
If you also want to see the imported packages by each package, you can try this custom format:
go list -f "{{.ImportPath}} {{.Imports}}" ./...
-f specifies an alternate format for the list, using the syntax of package template. The struct whose fields can be referenced is documented by the go help list command.
If you want to see all the dependencies recursively (dependencies of imported packages recursively), you can use this custom format:
go list -f "{{.ImportPath}} {{.Deps}}" ./...
But usually this is a long list, and just the direct imports ("{{.Imports}}") of each package are what you want.
Also see related question: What's the Go (mod) equivalent of npm-outdated?
Start Go documentation server:
godoc --http :6060
Visit http://localhost:6060/pkg
There will be list of all your packages.
When you install new ones they do not appear automatically. You need to restart godoc.
go list ... is quite useful, but there were two possible issues with it for me:
It will list all packages, including standard library packages. There is no way to get only the explicitly installed packages (which I assume is the more interesting inquiry).
A lot of times I need only the packages used in my projects (i.e. those listed in the respective go.mod files), and I don't care about other packages lying around (which may have been installed just to try them out). go list ... doesn't help with that.
So here's a somewhat different take. Assuming all projects are under ~/work:
find ~/work -type f -name go.mod \
-exec sed $'/^require ($/,/^)$/!d; /^require ($/d;/^)$/d; /\\/\\/ indirect$/d; s/^\t*//' {} \; \
| cut -d' ' -f1 \
| sort | uniq
A line by line explanation:
find all go.mod files
apply sed to each file to filter its content as follows (explained expression by expression):
extract just the require( ... ) chunks
remove the require( and ) lines, so just lines with packages remain
remove all indirect packages
remove leading tabs 1)
extract just the qualified package name (drop version information)
remove duplicate package names
1) Note the sed expression argument uses bash quoting to escape the TAB character as "\t" for readability over a literal TAB.
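To illustrate, take a hypothetical ~/work/demo/go.mod:
module example.com/demo

require (
	github.com/pkg/errors v0.9.1
	golang.org/x/text v0.3.7 // indirect
)
The pipeline above would print just github.com/pkg/errors: the // indirect line is filtered out and the version column is cut away.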
On *nix systems (possibly on Windows with bash tools like msysgit or cmder), to see what packages I have installed, I can run:
history | grep "go get"
But that's messy output. For whatever reason I decided to see if I could clean that output up a little, so I made an alias for this command:
history | grep 'go get' | grep -v ' history ' | sed -e $'s/go get /\\\\\ngo get /g' | grep 'go get ' | sed -e $'s/-u //g' | sed -e $'s/-v //g' | sed -e $'s/ &&//g' | grep -v '\\\n' | egrep 'get [a-z]' | sed -e $'s/go get //g' | sed -e $'s/ //g' | sort -u
Please don't ask why I did this. Challenge, maybe? Let me explain the parts:
history the history
grep "go get" grep over history and only show lines where we went and got something
grep -v " history " and remove times when we have searched for "got get" in history
sed -e $'s/go get /\\\\\ngo get /g' Now we take any instances of "go get " and shove a new line in front of it. Now they're all at the beginning.
grep "go get " filter only lines that now start with "go get"
sed -e $'s/-u //g' and sed -e $'s/-v //g' remove flags we have searched for. You could possibly leave them in but may get duplicates when output is done.
sed -e $'s/ &&//g' sometimes we install with multiple commands chained with '&&', so let's remove that from the ends of the lines.
grep -v "\\\n" my output had other lines with newlines printed I didnt need. So this got rid of them
egrep "get [a-z]" make sure to get properly formatted go package urls only.
sed -e $'s/go get //g' remove the "go get " text
sed -e $'s/ //g' strip any whitespace (needed to filter out duplicates)
sort -u now sort the remaining lines and remove duplicates.
This is totally untested on other systems. Again, I am quite sure there is a cleaner way to do this. Just thought it would be fun to try.
It would also probably be more fun to make a go ls command to show the actual packages you explicitly installed. But that's a lot more work, especially since I'm still only learning Go.
Output:
> gols
code.google.com/p/go.crypto/bcrypt
github.com/golang/lint/golint
github.com/kishorevaishnav/revelgen
github.com/lukehoban/go-find-references
github.com/lukehoban/go-outline
github.com/newhook/go-symbols
github.com/nsf/gocode
github.com/revel/cmd/revel
github.com/revel/revel
github.com/rogpeppe/godef
github.com/tpng/gopkgs
golang.org/x/tools/cmd/goimports
golang.org/x/tools/cmd/gorename
gopkg.in/gorp.v1
sourcegraph.com/sqs/goreturns

bash: shortest way to get n-th column of output

Let's say that during your workday you repeatedly encounter the following form of columnized output from some command in bash (in my case from executing svn st in my Rails working directory):
? changes.patch
M app/models/superman.rb
A app/models/superwoman.rb
In order to work with the output of your command - in this case the filenames - some sort of parsing is required so that the second column can be used as input for the next command.
What I've been doing is to use awk to get at the second column, e.g. when I want to remove all files (not that that's a typical use case :), I would do:
svn st | awk '{print $2}' | xargs rm
Since I type this a lot, a natural question is: is there a shorter (thus cooler) way of accomplishing this in bash?
NOTE:
What I am asking is essentially a shell command question even though my concrete example is on my svn workflow. If you feel that workflow is silly and suggest an alternative approach, I probably won't vote you down, but others might, since the question here is really how to get the n-th column command output in bash, in the shortest manner possible. Thanks :)
You can use cut to access the second field:
cut -f2
Edit:
Sorry, didn't realise that SVN doesn't use tabs in its output, so that's a bit useless. You can tailor cut to the output but it's a bit fragile - something like cut -c 10- would work, but the exact value will depend on your setup.
Another option is something like: sed 's/.\s\+//'
To accomplish the same thing as:
svn st | awk '{print $2}' | xargs rm
using only bash you can use:
svn st | while read a b; do rm "$b"; done
Granted, it's not shorter, but it's a bit more efficient and it handles whitespace in your filenames correctly.
I found myself in the same situation and ended up adding these aliases to my .profile file:
alias c1="awk '{print \$1}'"
alias c2="awk '{print \$2}'"
alias c3="awk '{print \$3}'"
alias c4="awk '{print \$4}'"
alias c5="awk '{print \$5}'"
alias c6="awk '{print \$6}'"
alias c7="awk '{print \$7}'"
alias c8="awk '{print \$8}'"
alias c9="awk '{print \$9}'"
Which allows me to write things like this:
svn st | c2 | xargs rm
Try the zsh. It supports global aliases, so you can define X in your .zshrc to be
alias -g X="| cut -d' ' -f2"
then you can do:
cat file X
You can take it one step further and define it for the nth column:
alias -g X2="| cut -d' ' -f2"
alias -g X1="| cut -d' ' -f1"
alias -g X3="| cut -d' ' -f3"
so that, e.g., cat file X2 will output the second column of file "file". You can do this for grep output or less output, too. This is very handy and a killer feature of the zsh.
You can go one step further and define D to be:
alias -g D="|xargs rm"
Now you can type:
cat file X1 D
to delete all files mentioned in the first column of file "file".
If you know the bash, the zsh is not much of a change except for some new features.
HTH Chris
Because you seem to be unfamiliar with scripts, here is an example.
#!/bin/sh
# usage: svn st | x 2 | xargs rm
col=$1
shift
awk -v col="$col" '{print $col}' "$@"
If you save this in ~/bin/x and make sure ~/bin is in your PATH (now that is something you can and should put in your .bashrc), you have the shortest possible command for generally extracting column n: x n.
The script should do proper error checking and bail if invoked with a non-numeric argument or the incorrect number of arguments, etc; but expanding on this bare-bones essential version will be in unit 102.
Maybe you will want to extend the script to allow a different column delimiter. Awk by default parses input into fields on whitespace; to use a different delimiter, use -F ':' where : is the new delimiter. Implementing this as an option to the script makes it slightly longer, so I'm leaving that mostly as an exercise for the reader (see the sketch at the end of this answer).
Usage
Given a file file:
1 2 3
4 5 6
You can either pass it via stdin (using a useless cat merely as a placeholder for something more useful);
$ cat file | sh script.sh 2
2
5
Or provide it as an argument to the script:
$ sh script.sh 2 file
2
5
Here, sh script.sh is assuming that the script is saved as script.sh in the current directory; if you save it with a more useful name somewhere in your PATH and mark it executable, as in the instructions above, obviously use the useful name instead (and no sh).
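For completeness, here is one shape the delimiter extension mentioned earlier could take; a minimal sketch that assumes a leading -F <delim> option and skips the error checking discussed above:
#!/bin/sh
# usage: x [-F delim] col [file ...]
if [ "$1" = "-F" ]; then
    delim=$2
    shift 2
fi
col=$1
shift
if [ -n "$delim" ]; then
    awk -F "$delim" -v col="$col" '{print $col}' "$@"
else
    awk -v col="$col" '{print $col}' "$@"
fi
For instance, x -F : 1 /etc/passwd would print the user names.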
It looks like you already have a solution. To make things easier, why not just put your command in a bash script (with a short name) and just run that instead of typing out that 'long' command every time?
If you are ok with manually selecting the column, you could be very fast using pick:
svn st | pick | xargs rm
Just go to any cell of the 2nd column, press c and then hit enter
Note that the file path does not have to be in the second column of svn st output. For example, if you modify a file and also modify its property, it will be in the third column.
See possible output examples in:
svn help st
Example output:
M wc/bar.c
A + wc/qax.c
I suggest cutting the first 8 characters:
svn st | cut -c8- | while read FILE; do echo whatever with "$FILE"; done
If you want to be 100% sure and deal with fancy filenames (with whitespace at the end, for example), you need to parse the XML output:
svn st --xml | grep -o 'path=".*"' | sed 's/^path="//; s/"$//'
Of course you may want to use some real XML parser instead of grep/sed.
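For instance, xmlstarlet (not preinstalled on macOS, so this assumes you have installed it, e.g. via Homebrew; the trailing - reads from stdin) prints each path on its own line:
svn st --xml | xmlstarlet sel -t -m '//entry' -v '@path' -n -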
