How to get all revisions in subversion URL (trunk/branch) based on a string in svn comments? - bash

I need a shell command that lists all revisions under a Subversion trunk URL whose log message contains a given string.
I worked out how to do it for a single file, but not for a URL.
I tried svn log URL --stop-on-copy and svn log URL --xml to get the revisions, without success.
Thanks!

Another way, using sed. It's probably not perfect, but it also works with multi-line comments. Replace SEARCH_STRING with your own search term.
svn log -l100 | sed -n '/^r/{h;d};/SEARCH_STRING/{g;s/^r\([[:digit:]]*\).*/\1/p}'

For Subversion 1.8 and later it's
svn log URL --search STRING

Try the following.
x="refactoring"; svn log --limit 10 | egrep -i --color=none "($x|^r[0-9]+ \|.*lines$)" | egrep -B 1 -i --color=none $x | egrep --color=none "^r[0-9]+ \|.*lines$" | awk '{print $1}' | sed 's/^r//g'
Replace refactoring with your search string.
Change the svn log parameters to suit your needs.
Case-insensitive matching is used (egrep -i).
Edit based on comment.
x="ILIES-113493"; svn log | egrep -i --color=none "($x|^r[0-9]+ \|.*lines$)" | egrep -B 1 -i --color=none $x | egrep --color=none "^r[0-9]+ \|.*lines$" | awk '{print $1}' | sed 's/^r//g'
Notes:
x is the variable that holds the search string; it is used in two places in the command.
To use x as a shell variable this way, put the entire command on a single line (from x=".."; through sed '...'). A semicolon (;) separates multiple commands on the same line.
I used --limit 10 in the example to limit the number of log entries; change that, and use other svn log parameters, to suit your needs. Using --limit 10 restricts the search to the 10 most recent log entries.

Thanks all for the help !! This worked for me:
svn log "$URL" --stop-on-copy | grep -B 2 "$STRING" | grep "^r" | cut -d"r" -f2 | cut -d" " -f1
Use "--stop-on-copy" or "--limit" options depending on the requirement.
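Some of the pipelines above rely on the match sitting a fixed number of lines below the revision header. As a rough alternative, here is a minimal awk sketch that reads svn log output on stdin, remembers the most recent header line, and prints that revision once when any line of its message contains the search string. The function name revs_matching is made up for illustration, the match is case-sensitive, and it assumes the default (non-XML) svn log layout, where each entry starts with a line like r123 | user | date | N lines.

```shell
# Hypothetical helper: print the revision numbers whose log message contains
# the fixed string given as $1. Reads default-format `svn log` output on stdin.
revs_matching() {
    awk -v needle="$1" '
        # Header line, e.g. "r123 | alice | 2020-01-01 ... | 2 lines"
        /^r[0-9]+ [|] / { rev = substr($1, 2); printed = 0; next }
        # Any message line containing the string: report the revision once.
        rev != "" && !printed && index($0, needle) { print rev; printed = 1 }
    '
}

# Usage (URL and search string are placeholders):
#   svn log "$URL" --stop-on-copy | revs_matching "ILIES-113493"
```

Because the header is matched more strictly than /^r/, a message line that merely starts with "r" (such as "refactoring ...") is not mistaken for a new revision.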

Related

How to grep only matching string from this result?

I am just simply trying to grab the commit ID, but not quite sure what I'm missing:
➜ ~ curl https://github.com/microsoft/vscode/releases -s | grep -oE 'microsoft/vscode/commit/(.*?)/hovercard'
microsoft/vscode/commit/ccbaa2d27e38e5afa3e5c21c1c7bef4657064247/hovercard
The only thing I need back from this is ccbaa2d27e38e5afa3e5c21c1c7bef4657064247.
This works just fine on regex101.com and in ruby/python. What am I missing?
If supported, you can use grep -oP
echo "microsoft/vscode/commit/ccbaa2d27e38e5afa3e5c21c1c7bef4657064247/hovercard" | grep -oP "microsoft/vscode/commit/\K.*?(?=/hovercard)"
Output
ccbaa2d27e38e5afa3e5c21c1c7bef4657064247
Another option is to use sed with a capture group
echo "microsoft/vscode/commit/ccbaa2d27e38e5afa3e5c21c1c7bef4657064247/hovercard" | sed -E 's/microsoft\/vscode\/commit\/([^\/]+)\/hovercard/\1/'
Output
ccbaa2d27e38e5afa3e5c21c1c7bef4657064247
The point is that grep does not support extracting capturing group submatches. If you install pcregrep you could do that with
curl https://github.com/microsoft/vscode/releases -s | \
pcregrep -o1 'microsoft/vscode/commit/(.*?)/hovercard' | head -1
The | head -1 part is to fetch the first occurrence only.
I would suggest using awk here:
awk 'match($0,/microsoft\/vscode\/commit\/[^\/]*\/hovercard/){print substr($0,RSTART+24,RLENGTH-34);exit}'
The regex will match a line containing
microsoft\/vscode\/commit\/ - microsoft/vscode/commit/ fixed string
[^\/]* - zero or more chars other than /
\/hovercard - a /hovercard string.
substr($0,RSTART+24,RLENGTH-34) prints the part of the line that starts at index RSTART+24 (24 is the length of microsoft/vscode/commit/) and is RLENGTH-34 characters long: the length of the whole match minus the 24 characters of microsoft/vscode/commit/ and the 10 characters of /hovercard.
The exit command will fetch you the first occurrence. Remove it if you need all occurrences.
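The offsets 24 and 34 need not be hard-coded. Here is a small sketch, assuming the same prefix and suffix as above, that lets awk compute them with length(). The function name first_commit_id and the variables pre and suf are illustrative only.

```shell
# Hypothetical helper: print the first commit ID found on stdin.
first_commit_id() {
    awk -v pre='microsoft/vscode/commit/' -v suf='/hovercard' '
        match($0, /microsoft\/vscode\/commit\/[^\/]*\/hovercard/) {
            # Skip the prefix, and shorten the match by prefix plus suffix.
            print substr($0, RSTART + length(pre), RLENGTH - length(pre) - length(suf))
            exit
        }
    '
}

# Usage:
#   curl -s https://github.com/microsoft/vscode/releases | first_commit_id
```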
You can use sed:
curl -s https://github.com/microsoft/vscode/releases |
sed -En 's=.*microsoft/vscode/commit/([^/]+)/hovercard.*=\1=p' |
head -n 1
head -n 1 prints only the first match (there are 10). grep -o prints only the matched text, but that still includes microsoft/ etc.
Your task cannot be achieved with Mac's grep. grep -o prints all of the matching text (instead of the default behaviour of printing whole matching lines), including microsoft/ etc. A grep that implements Perl regexes (like GNU grep on Linux) can use lookahead/lookbehind (grep -Po '(?<=microsoft/vscode/commit/)[^/]+(?=/hovercard)'), but that is not available in Mac's grep.
On macOS you don't have GNU utilities available by default. You can just pipe your output to a simple awk like this:
curl https://github.com/microsoft/vscode/releases -s |
grep -oE 'microsoft/vscode/commit/[^/]+/hovercard' |
awk -F/ '{print $(NF-1)}'
ccbaa2d27e38e5afa3e5c21c1c7bef4657064247
3a6960b964327f0e3882ce18fcebd07ed191b316
f4af3cbf5a99787542e2a30fe1fd37cd644cc31f
b3318bc0524af3d74034b8bb8a64df0ccf35549a
6cba118ac49a1b88332f312a8f67186f7f3c1643
c13f1abb110fc756f9b3a6f16670df9cd9d4cf63
ee8c7def80afc00dd6e593ef12f37756d8f504ea
7f6ab5485bbc008386c4386d08766667e155244e
83bd43bc519d15e50c4272c6cf5c1479df196a4d
e7d7e9a9348e6a8cc8c03f877d39cb72e5dfb1ff
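Since grep -o can only return the whole match, another portable option is to trim each match afterwards with POSIX shell parameter expansion. This is a minimal sketch; the function name commit_id_from_match is made up here.

```shell
# Hypothetical helper: strip the fixed prefix and suffix from one matched string.
commit_id_from_match() {
    id=${1#*microsoft/vscode/commit/}   # drop everything up to and including the prefix
    id=${id%%/hovercard*}               # drop the suffix and anything after it
    printf '%s\n' "$id"
}

# Usage:
#   curl -s https://github.com/microsoft/vscode/releases |
#       grep -oE 'microsoft/vscode/commit/[^/]+/hovercard' |
#       while IFS= read -r m; do commit_id_from_match "$m"; done
```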

sed: Argument list too long when running sed -n

I am running this command from Why is my git repository so big? on a very big git repository, https://github.com/python/cpython:
git rev-list --all --objects | sed -n $(git rev-list --objects --all | cut -f1 -d' ' | git cat-file --batch-check | grep blob | sort -n -k 3 | tail -n800 | while read hash type size; do size_in_kibibytes=$(echo $size | awk '{ foo = $1 / 1024 ; print foo "KiB" }'); echo -n "-e s/$hash/$size_in_kibibytes/p "; done) | sort -n -k1;
It works fine if I replace tail -n800 by tail -n40:
1160.94KiB Lib/ensurepip/_bundled/pip-8.0.2-py2.py3-none-any.whl
1169.59KiB Lib/ensurepip/_bundled/pip-8.1.1-py2.py3-none-any.whl
1170.86KiB Lib/ensurepip/_bundled/pip-8.1.2-py2.py3-none-any.whl
1225.24KiB Lib/ensurepip/_bundled/pip-9.0.0-py2.py3-none-any.whl
...
I found this question Bash : sed -n arguments saying I could use awk instead of sed.
Do you know how to fix this sed: Argument list too long error when tail is -n800 instead of -n40?
It seems you have used this answer in the linked question: Some scripts I use:.... There is a telling comment in that answer:
This function is great, but it's unimaginably slow. It can't even finish on my computer if I remove the 40 line limit. FYI, I just added an answer with a more efficient version of this function. Check it out if you want to use this logic on a big repository, or if you want to see the sizes summed per file or per folder. – piojo Jul 28 '17 at 7:59
And luckily piojo has written another answer addressing this. Just use his code.
As an alternative, check whether git sizer works on your repository: it would help isolate what takes up space in your repository.
If not, there are other commands in "How to find/identify large commits in git history?", which loop over each object and avoid the sed -n part.
The alternative would be to redirect the result of your command to a file, then run sed on that file.
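The error comes from the operating system's limit on the total length of a command line: 800 -e expressions no longer fit. Below is a rough sketch of the file-based route, which writes the substitutions to a temporary script and hands it to sed with -f instead. build_size_script and annotate_sizes are hypothetical names, and the usage comment assumes the same git pipeline as in the question.

```shell
# Turn "hash type size" lines (as printed by git cat-file --batch-check)
# into one sed command per line: s/<hash>/<size>KiB/p
build_size_script() {
    awk '{ printf "s/%s/%sKiB/p\n", $1, ($3 / 1024) }'
}

# Apply the generated script to "hash path" lines read from stdin.
# $1: file holding the "hash type size" lines (hypothetical argument).
annotate_sizes() {
    script=$(mktemp) || return 1
    build_size_script < "$1" > "$script"
    # The expressions live in a file, so the argument list stays short.
    sed -n -f "$script" | sort -n -k1
    rm -f "$script"
}

# Usage:
#   git rev-list --objects --all | cut -f1 -d' ' | git cat-file --batch-check |
#       grep blob | sort -n -k 3 | tail -n800 > sizes.txt
#   git rev-list --all --objects | annotate_sizes sizes.txt
```

This only removes the length limit; sed still tries every substitution on every line, so it stays slow on a large repository.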

grep pipe with sed

This is my bash command
grep -rl "System.out.print" Project1/ |
xargs -I{} grep -H -n "System.out.print" {} |
cut -f-2 -d: |
sed "s/\(.*\):\(.*\)/filename is \1 and line number is \2/"
What I'm trying to do here is iterate through the subfolders and check which files contain "System.out.print" (using grep),
use the second grep to get the file names and line numbers,
and use the sed command to display them on the console.
From here, I want to replace "System.out.print" with "XXXXX". How can I pipe a sed command into this?
pls help me
thanxx
GNU sed has an option to change files in place:
find Project1/ -type f | xargs sed -i 's/System\.out\.print/XXXXX/g'
Btw, your script could be written as:
grep -rsn 'root' /etc/ |
awk -F: '{ print "filename is", $1, "and line number is", $2 }'
I'm just building on hop's answer, which I found to be more useful than find -exec. I had search_text dispersed all over my computer, in logs, config files and so on, but I didn't want to search (or especially change) anything in /dev, /sys, /proc, and so on. One note, read man xargs; it doesn't like file names with spaces.
grep -HriIl --exclude-dir=dev --exclude-dir=proc --exclude-dir=sys search_text / | xargs sed -i 's/search_text/replace_text/g'
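On the note about file names with spaces: one common workaround is to pass NUL-terminated names between grep and xargs. The sketch below assumes a grep that supports --null and an xargs that supports -0 (both GNU and BSD versions do, but neither option is POSIX). The function name replace_in_tree is invented for this example.

```shell
# Hypothetical helper: replace System.out.print with XXXXX in every file
# under the directory $1, coping with spaces in file names.
replace_in_tree() {
    # --null makes grep end each file name with a NUL byte and xargs -0 splits
    # on NUL, so names containing spaces stay intact.
    # -i.bak edits in place and keeps a .bak copy of each changed file.
    grep -rl --null 'System\.out\.print' "$1" |
        xargs -0 sed -i.bak 's/System\.out\.print/XXXXX/g'
}

# Usage:
#   replace_in_tree Project1/
```

If nothing matches, some xargs implementations still run sed once with no file arguments, which only produces an error message.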

Pipe shell output to svn del command?

I have a rather complicated deploy setup for our Drupal site that is a combination of CVS and SVN. We use CVS to get the newest versions of modules, and we deploy with SVN. Unfortunately, when CVS updates remove files, Subversion complains because they weren't removed in SVN. I am trying to do some shell scripting and Perl to run an svn rm command on all these files that have already been deleted from the filesystem, but I have not gotten far. What I have so far is this:
svn st | grep !
This outputs a list of all the files deleted from the filesystem like so:
! panels_views/panels_views.info
! panels_views/panels_views.admin.inc
! contexts/term.inc
! contexts/vocabulary.inc
! contexts/terms.inc
! contexts/node_edit_form.inc
! contexts/user.inc
! contexts/node_add_form.inc
! contexts/node.inc
etc. . .
However, I want to somehow run an svn del on each of these lines. How can I get this output into my Perl script, or alternatively, how can I run svn del on each of these lines?
Edit: The exact command I used, with some help from all, was
svn st | grep ^! | cut -c 9- | xargs svn del
Try using xargs, like this:
svn st | grep ^! | cut -f2 | xargs svn rm
The xargs command takes lines on its standard input and passes them as command-line parameters to svn rm. By default, xargs passes multiple lines to each invocation of its command, which in the case of svn rm is fine.
You may also have to experiment with the cut command to get it just right. By default, cut uses a tab as a delimiter and Subversion may output spaces there. In that case, you may have to use cut -d' ' -f6 or something.
As always when building such a command pipeline, run portions at a time to make sure things look right. So run everything up to the cut command to ensure that you have the list of file names you expect, before running it again with "| xargs svn rm" on the end.
svn st | egrep ^! | cut -b 9- | xargs svn del
Just as an alternative to the ones above, I'd use something like:
svn st | awk '$1=="!"{print $2}' | xargs svn del
I find awk's pattern-matching language very handy for tasks like this.
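One caveat with the pipelines above: xargs splits on whitespace and awk's $2 stops at the first space, so paths containing spaces get mangled. Here is a minimal sketch that reads one path per line instead, assuming (as the cut -c 9- command above does) that the path starts at column 9 of the svn st output. The function names are made up for illustration.

```shell
# Print the paths of missing ("!") entries from `svn st` output on stdin.
missing_paths() {
    grep '^!' | cut -c 9-
}

# Run a command once per missing path, e.g. `svn st | each_missing svn del`.
# Prefix the command with echo for a dry run first.
each_missing() {
    missing_paths | while IFS= read -r f; do
        # </dev/null keeps the command from consuming the list of paths.
        "$@" "$f" < /dev/null
    done
}

# Usage:
#   svn st | each_missing echo svn del   # dry run
#   svn st | each_missing svn del
```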

How do you pipe input through grep to another utility?

I am using 'tail -f' to follow a log file as it's updated; next I pipe the output of that to grep to show only the lines containing a search term ("org.springframework" in this case); finally, I'd like to pipe the output from grep to a third command, 'cut':
tail -f logfile | grep org.springframework | cut -c 25-
The cut command would remove the first 24 characters of each line for me if it could get the input from grep! (It works as expected if I eliminate 'grep' from the chain.)
I'm using cygwin with bash.
Actual results: When I add the second pipe to connect to the 'cut' command, the result is that it hangs, as if it's waiting for input (in case you were wondering).
Assuming GNU grep, add --line-buffered to your command line, eg.
tail -f logfile | grep --line-buffered org.springframework | cut -c 25-
Edit:
I see grep buffering isn't the only problem here, as cut doesn't offer line-wise buffering.
You might want to try replacing it with something you can control, such as sed:
tail -f logfile | sed -u -n -e '/org\.springframework/ s/^.\{0,24\}//p'
or awk:
tail -f logfile | awk '/org\.springframework/ {print substr($0, 25); fflush("")}'
On my system, about 8K was buffered before I got any output. This sequence worked to follow the file immediately:
tail -f logfile | while read line ; do echo "$line"| grep 'org.springframework'|cut -c 25- ; done
What you have should work fine -- that's the whole idea of pipelines. The only problem I see is that, in the version of cut I have (GNU coreutils 6.10), you should use the syntax cut -c 25- (i.e. use a minus sign instead of a plus sign) to remove the first 24 characters.
You're also searching for different patterns in your two examples, in case that's relevant.
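A single awk process can do both the filtering and the column cut from the question's pipeline, flushing after every line so nothing sits in a buffer. The following is only a sketch: filter_from_col is an invented name, and it matches a fixed string rather than a regular expression.

```shell
# Hypothetical helper: print matching lines from column $2 onwards,
# flushing after each line so output appears immediately in a pipeline.
# $1: fixed string to look for, $2: first column to keep (1-based).
filter_from_col() {
    awk -v needle="$1" -v col="$2" '
        index($0, needle) { print substr($0, col); fflush() }
    '
}

# Usage:
#   tail -f logfile | filter_from_col org.springframework 25
```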
