SVN checkout files that have been committed within given time period - bash

I'm writing a deployment bash script that will publish recent changes in source control to a different machine. I'm new to svn from the command line (have used it in development for years) and new to bash scripting.
I need a way to check out only files that have been modified recently. Something like this:
svn checkout svn://server/repo/project/trunk -mtime -1d4h
The idea being that this would check out only files that have been committed within the last 28 hours.

You can't check out "changes": a checkout gives you some state of some part of the repository (i.e. a "revision"), which includes every file that exists in that node (subdirectory) at that revision (i.e. anything added or modified in any revision up to and including it).
Subversion's date specification doesn't allow a "relative date-time", only absolute values.
Scripts that export or save (outside the repository) all files changed in a revision or revision range exist and can be found on the net ("Subversion Command Line Script to export changed files" is a good bash sample).
Consequences of the above notes
You must supply a correct Subversion-style date or a revision number as the starting point.
A relative date with a free-form specification is easy to construct with the bash date command (e.g. date -d "28 hours ago") and can be stored in a variable, which can then be used as a parameter for electrictoolbox's script; see the sketch after this list.
Deploying the files from the export directory to their final destination is the last part (heavily environment-specific, so no suggestions here).
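For illustration, a minimal sketch (assuming GNU date and that the repository URL from the question is reachable): turn "28 hours ago" into an absolute Subversion date and use it as the start of a revision range.
#!/bin/bash
# Hypothetical sketch: build an absolute SVN date for 28 hours in the past.
SINCE=$(date -u -d "28 hours ago" +"%Y-%m-%dT%H:%M:%SZ")
# List the paths committed since then; an export script could consume this list.
svn log -vq -r "{$SINCE}:HEAD" svn://server/repo/project/trunk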

I found something simpler to suit my needs, based on this:
How can I keep the original file [commit] timestamp on Subversion?
I check out or update a working copy of the project using --config-option config:miscellany:use-commit-times=yes, which sets the timestamps on the filesystem to each file's last commit time. I then use a standard find command. For example:
#!/bin/bash
# Filter out paths under .svn administrative directories.
function listfn {
    while read -r file; do
        if [[ ! ( $file =~ ^.*\.svn.*$ ) ]]; then
            echo "$file"
        fi
    done
}
svn checkout --config-option config:miscellany:use-commit-times=yes svn://server/repo/project/trunk project-fordeploy
# Note: the -1d4h unit suffixes are BSD find syntax; with GNU find use -mmin -1680 instead.
find project-fordeploy -mtime -1d4h | listfn

Related

How to stage specific lines to git

I want to be able to stage specific lines of code that match a pattern (MARKETING_VERSION in my case).
I've got the awk command which will show me the lines that match the pattern MARKETING_VERSION but I don't know how to stage the lines from that result to git.
awk '/MARKETING_VERSION/{print NR}' exampleFile.txt
The result in the terminal is:
1191
1245
How can I use this result to stage those specific lines in that file to git?
I know you can use git add -p but I want to use this in a shell script so I need a non-interactive version.
TIA
What git add -p <file> does is, very roughly, this:
tmpfile=$(mktemp)
tf2=$(mktemp)
tf3=$(mktemp)
git diff <file> > "$tmpfile"
while [ -s "$tmpfile" ]; do
    # extract the first diff hunk from $tmpfile into $tf2 and the rest into $tf3
    # show you $tf2 and ask whether you want to include this hunk
    # (with options to edit the hunk, etc.); repeat until ready
    # if you say to *add* the hunk, run: git apply --cached "$tf2"
    cat < "$tf3" > "$tmpfile"
done
rm -f "$tmpfile" "$tf2" "$tf3"
That is, git add -p uses git apply --cached (a specialized sub-variant of git apply --index that ignores the working tree copy of the file). The key takeaway you need, from the above, is this: There are three versions of the file!
The first one (completely ignored here) is frozen for all time and is in the HEAD commit.
The second one is in Git's index aka staging area. That's used by git diff above as the "old version".
The third one is in your working tree. That's used by git diff above as the "new version".
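For orientation, those three versions are exactly what the standard diff invocations compare:
git diff            # index ("old") vs working tree ("new"); the source of add -p hunks
git diff --cached   # HEAD vs index
git diff HEAD       # HEAD vs working tree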
The patches that Git lets you take or skip are simply the result of comparing the "old" (index) and "new" (working tree) version. If you take some patch, Git updates the in-index copy by applying the patch.
Hence, if there are some set of lines in the working tree version (say, lines 100 through 110 inclusive) that you'd like to use to replace some other set of lines (say, lines 90 through 92 inclusive) in the index version, the way to construct that is:
extract the index version;
scrape out lines 1-89 from the index version; concatenate lines 100-110 from the working tree version; concatenate lines 93-end from the index version, all into a temporary file;
replace the index copy with the temporary file.
To read the index version, use git show or git cat-file -p with the name of the index version of the file. If the file's name is path/to/file, the index version's name is :path/to/file (short for :0:path/to/file: we want the copy in staging slot zero; a copy exists in slot zero only when there is no copy in slots 1, 2, or 3, so you can simply attempt to read slot zero and, if that fails, assume the file either isn't in the index or is conflicted).
Reading the working tree file (some select subset of lines) is left as an exercise, as is the concatenation part, and any error checking you wish to include.
Assuming the final resulting file is in a temporary file named $tf (as a shell variable), to update the index copy, you must first make sure an appropriate blob hash ID exists:
hash=$(git hash-object -w -t blob --path="$path" -- "$tf")
for instance (this assumes you want to run the usual .gitattributes filters, if any, and know that the path is $path). Then, if that goes well, use that hash ID with git update-index:
git update-index --cacheinfo "$mode,$hash,$path"
where $mode is either 100644 or 100755 as appropriate for the file. If you don't want to change the mode, you can read the previous mode with git ls-files --stage or similar. Otherwise, provided core.fileMode is true, read the mode from the working tree copy of the file, to match the behavior of git add: convert "has any executable bit set" to 100755 and "has no executable bit set" to 100644. (When core.fileMode is false, which you can read with git config --get --type bool core.filemode, git add uses the existing mode for this add-patch case.)
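Putting those pieces together, here is a rough, untested sketch of the whole recipe; the path and the line ranges (90-92 replaced in the index copy, 100-110 taken from the working tree) are the illustrative numbers from above, and the existing index mode is reused rather than re-read from the working tree:
path=path/to/file
tf=$(mktemp)
# New content: index lines 1-89, working-tree lines 100-110, index lines 93-end.
git show ":$path" | sed -n '1,89p'  > "$tf"
sed -n '100,110p' "$path"          >> "$tf"
git show ":$path" | sed -n '93,$p' >> "$tf"
# Store the blob (running the usual attribute filters for this path)...
hash=$(git hash-object -w -t blob --path="$path" -- "$tf")
# ...keep the mode the index entry already has...
mode=$(git ls-files --stage -- "$path" | awk '{print $1}')
# ...and point the index entry at the new blob.
git update-index --cacheinfo "$mode,$hash,$path"
rm -f "$tf"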

svn: how to get the changed files within a specific date range?

I want to gather statistics via svn, with input and output like this:
input: svn url, startDate, endDate
output:
person1 20Lines
person2 30Lines
eg: my.sh http://svn.xxx.com/trunck/xxx/xxx/test.php 2014-12-01 2014-12-10
I'm using a shell script, and my idea is as follows:
1. svn co the whole project
2. get the files changed within the specific date range, based on the given url
3. use the command 'svn blame [file url]' to get the per-person change info
But now I don't know how to get the changed files in a directory within a specific date range.
Please give me any ideas, thanks!
You don't want to use svn status, as this only shows if a file has changed in the working copy, whereas you are looking for commit info. You can, however, use
svn log -vq -r {2014-12-01}:{2014-12-10} http://svn.xxx.com/trunck/xxx/xxx/test.php
for your purposes. The -v adds information on the changed paths, the -q suppresses the commit message and only shows the changed paths.
You can then parse this output for the changed paths, user, and revision, and then use svn diff to find out for a given path how many lines were changed by that user in the specified revision.
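As a rough, untested sketch of that approach (the URL and date range are the ones from the question; plain awk and grep are assumed): pull revision/author pairs out of the quiet log, count the lines each revision changed with svn diff, and sum per author.
#!/bin/bash
URL=http://svn.xxx.com/trunck/xxx/xxx/test.php
# Turn "r123 | alice | ..." log lines into "123 alice" pairs.
svn log -q -r '{2014-12-01}:{2014-12-10}' "$URL" |
awk -F'|' '/^r[0-9]+/ { gsub(/ /, "", $1); gsub(/ /, "", $2); print substr($1, 2), $2 }' |
while read -r rev author; do
    # Count added/removed lines in that revision, excluding the +++/--- headers.
    changed=$(svn diff -c "$rev" "$URL" | grep -c '^[+-][^+-]')
    echo "$author $changed"
done |
awk '{ sum[$1] += $2 } END { for (a in sum) print a, sum[a] "Lines" }'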

How to copy only new files using bash scripting

I have to use bash scripting to copy files from one folder to another, but a file should only be copied if it is newer than the copy already in the destination folder (the behavior of cp -u, which I was asked not to use). Essentially I have to use the test command's -ot ("older than") operator. Please let me know how this could be done. I believe two for loops could be used, one reading the files in the source directory and one in the destination directory, with the timestamps compared. The problem is that both for loops produce absolute path names along with the file name, so I'm not sure how to compare them.
Thanks
You can take advantage of parameter substitution:
for file in "$folder1"/* ; do
    filename=${file##*/} # remove everything up to the last slash
Or, you can change the directory:
cd "$folder1"
for file in * ; do
## you have to use full or relative path to $folder2 here
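Putting the first variant together, a minimal sketch (it assumes $folder1 is the source and $folder2 the destination, both already set, and uses test's -ot operator as the question requires):
for file in "$folder1"/*; do
    filename=${file##*/}                 # strip the directory part
    dest="$folder2/$filename"
    # Copy only if the destination is missing or older than the source.
    if [ ! -e "$dest" ] || [ "$dest" -ot "$file" ]; then
        cp "$file" "$dest"
    fi
done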

Split large repo into multiple subrepos and preserve history (Mercurial)

We have a large base of code that contains several shared projects, solution files, etc in one directory in SVN. We're migrating to Mercurial. I would like to take this opportunity to reorganize our code into several repositories to make cloning for branching have less overhead. I've already successfully converted our repo from SVN to Mercurial while preserving history. My question: how do I break all the different projects into separate repositories while preserving their history?
Here is an example of what our single repository (OurPlatform) currently looks like:
/OurPlatform
---- Core
---- Core.Tests
---- Database
---- Database.Tests
---- CMS
---- CMS.Tests
---- Product1.Domain
---- Product1.Stresstester
---- Product1.Web
---- Product1.Web.Tests
---- Product2.Domain
---- Product2.Stresstester
---- Product2.Web
---- Product2.Web.Tests
==== Product1.sln
==== Product2.sln
All of those are folders containing VS Projects except for the solution files. Product1.sln and Product2.sln both reference all of the other projects. Ideally, I'd like to take each of those folders, and turn them into separate Hg repos, and also add new repos for each project (they would act as parent repos). Then, If someone was going to work on Product1, they would clone the Product1 repo, which contained Product1.sln and subrepo references to ReferenceAssemblies, Core, Core.Tests, Database, Database.Tests, CMS, and CMS.Tests.
So, it's easy to do this by just hg init'ing in the project directories. But can it be done while preserving history? Or is there a better way to arrange this?
EDIT:
Thanks to Ry4an's answer, I was able to accomplish my goal. I wanted to share how I did it here for others.
Since we had a lot of separate projects, I wrote a small bash script to automate creating the filemaps and to create the final bat script to actually do the conversion. What wasn't completely apparent from the answer is that the convert command needs to be run once for each filemap, to produce a separate repository for each project. This script would be placed in the directory above an svn working copy that you have previously converted. I used the working copy since its file structure best matched what I wanted the final new hg repos to be.
#!/bin/bash
# this requires you to be in: /path/to/svn/working/copy/, and issue: ../filemaplister.sh ./
for filename in *
do
    extension=${filename##*.} #$filename|awk -F . '{print $NF}'
    if [ "$extension" == "sln" -o "$extension" == "suo" -o "$extension" == "vsmdi" ]; then
        base=${filename%.*}
        echo "#$base.filemap" >> "$base.filemap"
        echo "include $filename" >> "$base.filemap"
        echo "C:\Applications\TortoiseHgPortable\hg.exe convert --filemap $base.filemap ../hg-datesort-converted ../hg-separated/$base > $base.convert.output.txt" >> "MASTERGO.convert.bat"
    else
        echo "#$filename.filemap" >> "$filename.filemap"
        echo "include $filename" >> "$filename.filemap"
        echo "rename $filename ." >> "$filename.filemap"
        echo "C:\Applications\TortoiseHgPortable\hg.exe convert --filemap $filename.filemap ../hg-datesort-converted ../hg-separated/$filename > $filename.convert.output.txt" >> "MASTERGO.convert.bat"
    fi
done
mv *.filemap ../hg-conversion-filemaps/
mv *.convert.bat ../hg-conversion-filemaps/
This script looks at every file in an svn working copy, and depending on the type either creates a new filemap file or appends to an existing one. The if is really just to catch misc visual studio files, and place them into a separate repo. This is meant to be run on bash (cygwin in my case), but running the actual convert command is accomplished through the version of hg shipped with TortoiseHg due to forking/process issues on Windows (gah, I know...).
So you run the MASTERGO.convert.bat file, which looks at your converted hg repo, and creates separate repos using the supplied filemap. After it is complete, there is a folder called hg-separated that contains a folder/repo for each project, as well as a folder/repo for each solution. You then have to manually clone all the projects into a solution repo, and add the clones to the .hgsub file. After committing, an .hgsubstate file is created and you're set to go!
With the example given above, my .hgsub file looks like this for "Product1":
Product1.Domain = /absolute/path/to/Product1.Domain
Product1.Stresstester = /absolute/path/to/Product1.Stresstester
Product1.Web = /absolute/path/to/Product1.Web
Product1.Web.Tests = /absolute/path/to/Product1.Web.Tests
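As a rough illustration of that manual wiring for Product1 (paths and the commit message are placeholders; only one subrepo clone is shown):
cd ../hg-separated/Product1
hg clone /absolute/path/to/Product1.Domain Product1.Domain
# ...repeat for Product1.Stresstester, Product1.Web and Product1.Web.Tests...
hg add .hgsub                          # the .hgsub listing shown above
hg commit -m "add subrepo references"  # committing also creates .hgsubstate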
Once I transfer these repos to a central server, I'll be manually changing the paths to be urls.
Also, there is no analog to the initial OurPlatform svn repo, since everything is separated now.
Thanks again!
This can absolutely be done. You'll want to use the hg convert command. Here's the process I'd use:
convert everything to a single hg repository using hg convert with a source type of svn and a dest type of hg (it sounds like you've already done this step)
create a collection of filemap files for use with hg convert's --filemap option
run hg convert with source type hg and dest type hg and the source being the mercurial repo created in step one -- and do it for each of the filemaps you created in step two.
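For example, steps one and three might look roughly like this (the repository URL and repo names are placeholders):
hg convert --source-type svn --dest-type hg http://svn.example.com/OurPlatform OurPlatform-hg
hg convert --filemap Core.filemap OurPlatform-hg Core-hg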
The filemap syntax is shown in the hg help convert output, but here's the gist:
The filemap is a file that allows filtering and remapping of files and
directories. Comment lines start with '#'. Each line can contain one of
the following directives:
include path/to/file
exclude path/to/file
rename from/file to/file
So in your example your filemaps would look like this:
# this is Core.filemap
include Core
rename Core .
Note that if you have any include directives, the exclusion of everything else is implied. Also, that rename line ends in a dot, which moves everything up one level.
# this is Core.Tests
include Core.Tests
rename Core.Tests .
and so on.
Once you've created the broken-out repositories for each of the new repos, you can delete the has-everything initial repo created in step one and start setting up your subrepo configuration in .hgsub files.

Sync File Modification Time Across Multiple Directories

I have a computer A with two directory trees. The first directory contains the original mod dates, which span back several years. The second directory is a copy of the first with a few additional files. There is a second computer, B, which contains a directory tree that is the same as the second directory on computer A (new mod times and additional files). How can I update the files in the two newer directories on both machines so that the mod times on the files are the same as the originals? Note that these directory trees are on the order of tens of gigabytes, so the solution would have to include some method of sending only the date information to the second computer.
The answer by Paul is partly correct: rsync is able to do this, but with different parameters. The correct command is
rsync -Prt --size-only original_dir copy_dir
where -P enables partial transfers and displays a progress indicator, -r recurses through subdirectories, -t preserves time stamps and --size-only doesn't transfer files that match in size.
The following command will make sure that TEST2 gets the same date assigned that TEST1 has
touch -t `stat -t '%Y%m%d%H%M.%S' -f '%Sa' TEST1` TEST2
Now instead of using hard-coded values here, you could find the files using "find" utility and then run touch via SSH on the remote machine. However, that means you may have to enter the password for each file, unless you switch SSH to cert authentication. I'd rather not do it all in a super fancy one-liner. Instead let's work with temp files. First go to the directory in question and run a find (you can filter by file type, size, extension, whatever pleases you, see "man find" for details. I'm just filtering by type file here to exclude any directories):
find . -type f -print -exec stat -t '%Y%m%d%H%M.%S' -f '%Sm' "{}" \; > /tmp/original_dates.txt
Now we have a file that looks like this (in my example there are only two entries there):
# cat /tmp/original_dates.txt
./test1
200809241840.55
./test2
200809241849.56
Now just copy the file over to the other machine and place it in the directory (so the relative file paths match) and apply the dates:
cat original_dates.txt | (while read FILE && read DATE; do touch -t $DATE "$FILE"; done)
Will also work with file names containing spaces.
One note: I used the last "modification" date in stat, as that's what you wrote in the question. However, it rather sounds as if you want to use the "creation" date (every file has a creation date, a last modification date and a last access date); in that case you need to alter the stat call a bit.
'%Sm' - last modification date
'%Sc' - last status change date (note: not the creation/birth time, which on BSD stat is '%SB')
'%Sa' - last access date
However, touch can only change the modification time and access time, I think it can't change the creation time of a file ... so if that was your real intention, my solution might be sub-optimal... but in that case your question was as well ;-)
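For reference, a quick way to print all three side by side with BSD/macOS stat (the file name is a placeholder):
stat -t '%Y-%m-%d %H:%M:%S' -f 'mtime=%Sm  ctime=%Sc  atime=%Sa' somefile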
I would go through all the files in the source directory tree and gather the modification times from them into a script that I could run on the other directory trees. You will need to be careful about a few 'gotchas'. First, make sure that your output script has relative paths, and make sure you run it from the proper target directory, which should be the root directory of the target tree. Also, when changing machines make sure you are using the same timezone as you were on the machine where you generated the script.
Here's a Perl script I put together that will output the touch commands needed to update the times on the other directory trees. Depending on the target machines, you may need to tweak the date formats or command options, but this should give you a place to start.
#!/usr/bin/perl
my $STARTDIR = "$ENV{HOME}/test";
chdir $STARTDIR;
my @files = `find . -type f`;
chomp @files;
foreach my $file (@files) {
    my $mtime = localtime((stat($file))[9]);
    print qq(touch -m -d "$mtime" "$file"\n);
}
The other approach you could try is to attach the remote directory using NFS and then copy the times using find and touch -r.
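A rough sketch of that idea, assuming the original tree is mounted (for example via NFS) at the placeholder path /mnt/original and that you run it from the root of the copy; it relies on GNU find substituting {} even inside a larger argument:
find . -type f -exec touch -r "/mnt/original/{}" "{}" \;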
I think rsync (with the right options) will do this - it claims to only send file differences, so presumably it will work out that there are no differences to be transferred.
--times preserves the modification times, which is what you want.
See (for instance)
http://linux.die.net/man/1/rsync
Also add -I, --ignore-times ("don't skip files that match size and time")
so that all files are "transferred", and trust rsync's file-differences optimisation to make it "fairly efficient" - see the excerpt from the man page below.
-t, --times
This tells rsync to transfer modification times along with the files and update them on the remote system. Note that if this option is not used, the optimization that excludes files that have not been modified cannot be effective; in other words, a missing -t or -a will cause the next transfer to behave as if it used -I, causing all files to be updated (though the rsync algorithm will make the update fairly efficient if the files haven't actually changed, you're much better off using -t).
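Concretely, something along these lines (host and paths are placeholders): -t carries the modification times over, and -I forces files whose size and time already match to be processed anyway.
rsync -rt -I /path/to/original_dir/ user@computerB:/path/to/copy_dir/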
I used the following Python scripts instead.
Python scripts run much faster than an approach creating new processes for each file (like using find and stat). The solution below also works in case of timezone differences between systems, as it uses UTC times. It also works with paths containing spaces (but not paths containing newline!). It doesn't set times for symlinks, because the operating system provides no mechanism to modify the timestamp of a symlink, but in a file manager the time of the file the symlink points at is shown instead anyway. It uses a maxTime parameter to avoid resetting dates for files that are actually modified after copying from the original directory.
listMTimes.py:
import os
from datetime import datetime
from pytz import utc

for dirpath, dirnames, filenames in os.walk('./'):
    for name in filenames+dirnames:
        path = os.path.join(dirpath, name)
        # Avoid symlinks because os.path.getmtime and os.utime get and
        # set the time of the pointed file, and in the new directory,
        # the link may have been redirected.
        if not os.path.islink(path):
            mtime = datetime.fromtimestamp(os.path.getmtime(path), utc)
            print(mtime.isoformat()+" "+path)
setMTimes.py:
import datetime, fileinput, os, sys, time
import dateutil.parser
from pytz import utc

# Based on
# http://stackoverflow.com/questions/6999726/python-getting-millis-since-epoch-from-datetime
def unix_time(dt):
    epoch = datetime.datetime.fromtimestamp(0, utc)
    delta = dt - epoch
    return delta.total_seconds()

if len(sys.argv) != 2:
    print('Syntax: '+sys.argv[0]+' <maxTime>')
    print('  where <maxTime> is an ISO time, e.g. "2013-12-02T23:00+02:00".')
    exit(1)

# A file with modification time newer than maxTime is not reset to
# its original modification time.
maxTime = unix_time(dateutil.parser.parse(sys.argv[1]))
for line in fileinput.input([]):
    (datetimeString, path) = line.rstrip('\r\n').split(' ', 1)
    mtime = dateutil.parser.parse(datetimeString)
    if os.path.exists(path) and not os.path.islink(path):
        if os.path.getmtime(path) <= maxTime:
            os.utime(path, (time.time(), unix_time(mtime)))
Usage: in the first directory (the original) run
python listMTimes.py >/tmp/original_dates.txt
Then in the second directory (a copy of the original, possibly with some files modified/added/deleted) run something like this:
python setMTimes.py 2013-12-02T23:00+02:00 </tmp/original_dates.txt
