Access git log data using ruby rugged gem? - ruby

For a given file in a git repo, I'd like to look up the SHA of the last commit in which the file was modified, along with the timestamp.
At the command line, this data is visible with git log for a particular file path, e.g.
git log -n 1 path/to/file
Using the "git" gem for ruby I can also do this:
require 'git'
g = Git.open("/path/to/repo")
modified = g.log(1).object("relative/path/to/file").first.date
sha = g.log(1).object("relative/path/to/file").first.sha
Which is great, but is running too slowly for me when looping through lots of paths. As Rugged uses C libraries instead, I was hoping it would be faster but cannot see how to construct the right query in the rugged syntax. Any suggestions?

This should work:
require 'rugged'
repo = Rugged::Repository.new("/path/to/repo")
walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_DATE)
walker.push(repo.head.target)
# Walk back from HEAD, newest first, and stop at the first non-merge commit that touches the file.
commit = walker.find do |commit|
  commit.parents.size == 1 && commit.diff(paths: ["relative/path/to/file"]).size > 0
end
sha = commit.oid
Taken and adapted from https://github.com/libgit2/pygit2/issues/200#issuecomment-15899713
As an aside: Just because rugged is written in C does not mean that costly operations suddenly become cheap and quick. Obviously, you save a lot of string parsing and stuff like that, but this is not always the bottleneck.
As you're not interested in the actual textual diff here, the libgit2 GIT_DIFF_FORCE_BINARY might be something that could also help in increasing the performance of this lookup - unfortunately this is not yet available in Rugged (but will be, soon).
Testing this with the Rugged repo itself, it works correctly:
repo = Rugged::Repository.new(".")
walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_DATE)
walker.push(repo.head.target)
commit = walker.find do |commit|
  commit.parents.size == 1 && commit.diff(paths: ["Gemfile"]).size > 0
end
sha = commit.oid # => "8f5c763377f5bf0fb88d196b7c45a7d715264ad4"
walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_DATE)
walker.push(repo.head.target)
commit = walker.find do |commit|
  commit.parents.size == 1 && commit.diff(paths: [".travis.yml"]).size > 0
end
sha = commit.oid # => "4e18e05944daa2ba8d63a2c6b149900e3b93a88f"
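Since the question is about looping over many paths and also wants the timestamp, one option is to walk the history once and resolve all paths in the same pass. The following is only a sketch under assumptions: `last_changes` is a hypothetical helper name, it keeps the same `parents.size == 1` filter as above (so merges and the root commit are skipped), and it relies on `commit.time` and on the delta `:path` entries that Rugged exposes. Whether it is actually faster depends on the repository.

```ruby
# Hypothetical helper (not part of Rugged): resolves several paths in one walk.
# `commits` is anything enumerable that yields commits newest first, for example
# a Rugged::Walker set up as in the answer above.
def last_changes(commits, paths)
  remaining = paths.dup
  found = {}
  commits.each do |commit|
    break if remaining.empty?
    # Same filter as the answer above: skip merges and the root commit.
    next unless commit.parents.size == 1
    # One diff per commit, restricted to the paths that are still unresolved.
    commit.diff(paths: remaining).each_delta do |delta|
      [delta.old_file[:path], delta.new_file[:path]].each do |path|
        next unless remaining.delete(path)
        found[path] = { sha: commit.oid, time: commit.time }
      end
    end
  end
  found
end

# Usage against a real repository (assumes the rugged gem is installed):
#   require 'rugged'
#   repo = Rugged::Repository.new("/path/to/repo")
#   walker = Rugged::Walker.new(repo)
#   walker.sorting(Rugged::SORT_DATE)
#   walker.push(repo.head.target)
#   last_changes(walker, ["Gemfile", ".travis.yml"])
```

Paths that are never touched by a single-parent commit are simply missing from the returned hash.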

Related

How do I apply a diff or patch file?

When using the Rugged git library, how can I apply that diff to my dest branch as a commit?
# @param src [Rugged::Object] - the rugged object or string to compare from
# @param dst [Rugged::Object] - the rugged object or string to compare to, defaults to parent
# @return [Rugged::Diff] a rugged diff object between src and dst
def create_diff(src, dst = nil)
  src = repo.lookup(find_ref(src))
  # parents.first is already a Rugged::Commit, so it needs no extra lookup
  dst ||= src.parents.first
  dst = find_ref(dst)
  src.diff(dst)
end
# @param sha_or_ref [String] - the name or sha of the ref
# @return [String] the oid of the sha or ref
def find_ref(sha_or_ref)
  case sha_or_ref
  when Rugged::Object
    sha_or_ref.oid
  else
    repo.rev_parse_oid(sha_or_ref)
  end
end
Is there not an easy way to apply a patch or diff? Seems silly that I would need to loop through each change in the diff and either add/rm the file.
Consider that:
libgit2/rugged is the Ruby bindings to libgit2.
libgit2's apply-patch feature was requested a long time ago (2013) and was merged only recently (a year ago), for libgit2 0.24.4.
So you would have to wait for that feature to be ported to Rugged.
The last official release, v0.24.0, does not include 0.24.4.

get all commit of a file/path with rugged

I would like to get the list of all the commits for a file/path, but I don't know how to do it.
For example, I want all the commits of the file "test", so that I can get the oid of each commit and, using that oid, get the blob of every revision of this file.
Is it possible?
Thanks!
We can get all the commits this way:
tab = []
walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_DATE)
walker.push(repo.head.target)
walker.each do |commit|
  if commit.diff(paths: ["path_of_file"]).size > 0
    tab.push(commit)
  end
end
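The question also asks for the blob of each revision. A possible follow-up step, sketched here under assumptions, is to look the path up in each matching commit's tree and load the blob from its oid. `file_revisions` is a hypothetical helper name, and the sketch assumes that `tree.path` raises `Rugged::TreeError` when the path is absent from a tree (for instance in a commit that deleted the file).

```ruby
# Hypothetical helper (not part of Rugged): returns [commit_oid, blob_content]
# pairs for every commit in `commits` that touches `path`.
def file_revisions(repo, commits, path)
  commits.filter_map do |commit|
    # Same test as above: keep only commits whose diff touches the path.
    next if commit.diff(paths: [path]).size.zero?
    begin
      entry = commit.tree.path(path)
    rescue Rugged::TreeError
      # Assumption: raised when the path is not in this commit's tree,
      # e.g. because this commit deleted the file.
      next
    end
    [commit.oid, repo.lookup(entry[:oid]).content]
  end
end

# Usage against a real repository (assumes the rugged gem is installed):
#   require 'rugged'
#   repo = Rugged::Repository.new("/path/to/repo")
#   walker = Rugged::Walker.new(repo)
#   walker.sorting(Rugged::SORT_DATE)
#   walker.push(repo.head.target)
#   file_revisions(repo, walker, "path_of_file")
```

`filter_map` needs Ruby 2.7 or later; on older versions `map` followed by `compact` does the same job.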

How to detect a file rename using Rugged?

I'm a novice Rugged user, and I'm attempting to detect file renames in the commit history. I'm diffing each commit against its first parent, as follows:
repo = Rugged::Repository.discover("foo")
walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_TOPO)
walker.push("master")
walker.each.take(200).each do |commit|
  puts commit.oid
  puts commit.message
  diffs = nil
  # Handle Root commit
  if commit.parents.count > 0 then
    diffs = commit.parents[0].diff(commit)
  else
    diffs = commit.diff(nil)
  end
  (files,additions,deletions) = diffs.stat
  puts "Files changed: #{files}, Additions: #{additions}, Deletions: #{deletions}"
  paths = [];
  diffs.each_delta do |delta|
    old_file_path = delta.old_file[:path]
    new_file_path = delta.new_file[:path]
    puts delta.status
    puts delta.renamed?
    puts delta.similarity
    paths += [delta]
  end
  puts "Paths:"
  puts paths
  puts "===================================="
end
walker.reset
However, when I do have a rename, the program will output an addition and a removal (A and D status). This matches the output of git log --name-status.
On the other hand, I found out that using git log --name-status --format='%H' --follow -- b.txt correctly shows the rename as R100.
The repo history and the outputs of git can be seen in the following gist: https://gist.github.com/ifigueroap/60716bbf4aa2f205b9c9
My question is how to use the Diff, or Delta objects of Rugged to detect such a file rename...
Thanks
Before accessing diffs.stat, you should call diffs.find_similar! with :renames => true. That modifies the diffs object in place so that it includes rename information. This is not done by default, as the underlying operation is quite complex and not needed in most cases.
Check the documentation for find_similar! here: https://github.com/libgit2/rugged/blob/e96d26174b2bf763e9dd5dd2370e79f5e29077c9/ext/rugged/rugged_diff.c#L310-L366 for more options.
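As a rough illustration of that advice, the sketch below runs find_similar! on a diff and then collects the old and new path of every delta that reports itself as renamed. `renames_in` is a hypothetical helper name; the diff is assumed to be built the same way as in the question.

```ruby
# Hypothetical helper (not part of Rugged): returns [old_path, new_path] pairs
# for every rename detected in `diff`.
def renames_in(diff)
  # Rewrites the diff in place so that matching add/delete pairs become renames.
  diff.find_similar!(renames: true)
  renames = []
  diff.each_delta do |delta|
    renames << [delta.old_file[:path], delta.new_file[:path]] if delta.renamed?
  end
  renames
end

# Usage inside the walker loop from the question (assumes the rugged gem):
#   diffs = commit.parents[0].diff(commit)
#   renames_in(diffs).each { |old_path, new_path| puts "#{old_path} -> #{new_path}" }
```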

Get all open pull requests from an organisation using the Github API Ruby gem

For our organisation's dashboard, I'd like to keep a count of all the open PRs on all our repositories. At the moment, all I've got is to loop through all the repos, and count through all the open PRs on each repo like so (which often results in a rate limit error):
connection = Github.new oauth_token: MY_OAUTH_TOKEN
pulls = 0
connection.repos.list(:org => GITHUB_ORGANISATION).each do |repo|
  pulls += connection.pull_requests.list(:user => repo['owner']['login'], :repo => repo['name']).count
end
I know there must be a nicer way round this. Any ideas? (short of screen scraping!)
OK, so I think I've cracked this now. Pull requests are issues, so I can get all issues, and loop through the issues like so:
pulls = 0
issues = connection.issues.list(:org => GITHUB_ORGANISATION, :filter => 'all', :auto_pagination => true)
issues.each do |issue|
  if issue["pull_request"]
    pulls += 1
  end
end
Once you remember that pull requests are issues too, everything just falls into place.
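The counting step can also be written with Enumerable#count. This is only a sketch: `count_pull_requests` is a hypothetical helper name, and the sample hashes below are made-up stand-ins for the issue objects the API client returns.

```ruby
# Hypothetical helper: counts the issues that carry a "pull_request" entry.
def count_pull_requests(issues)
  issues.count { |issue| issue["pull_request"] }
end

# Made-up sample data, standing in for the result of connection.issues.list
sample_issues = [
  { "number" => 1, "pull_request" => { "url" => "..." } },
  { "number" => 2 },
  { "number" => 3, "pull_request" => { "url" => "..." } }
]
puts count_pull_requests(sample_issues) # prints 2
```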

Why must I use local path rather than 'svn://' with SVN bindings?

I'm using the Ruby SVN bindings built with SWIG. Here's a little tutorial.
When I do this
@repository = Svn::Repos.open('/path/to/repository')
I can access the repository fine. But when I do this
@repository = Svn::Repos.open('svn://localhost/some/path')
It fails with
/SourceCache/subversion/subversion-35/subversion/subversion/libsvn_subr/io.c:2710: 2: Can't open file 'svn://localhost/format': No such file or directory
When I do this from the command line, I do get output
svn ls svn://localhost/some/path
Any ideas why I can't use the svn:// protocol?
EDIT
Here's what I ended up doing, and it works.
require 'svn/ra'
class SvnWrapper
  def initialize(repository_uri, repository_username, repository_password)
    # Remove any trailing slashes from the path, as the SVN library will choke
    # if it finds any.
    @repository_uri = repository_uri.gsub(/[\/]+$/, '')
    # Initialize repository session.
    @context = Svn::Client::Context.new
    @context.add_simple_prompt_provider(0) do |cred, realm, username, may_save|
      cred.username = repository_username
      cred.password = repository_password
      cred.may_save = true
    end
    config = {}
    callbacks = Svn::Ra::Callbacks.new(@context.auth_baton)
    @session = Svn::Ra::Session.open(@repository_uri, config, callbacks)
  end
  def ls(relative_path, revision = nil)
    relative_path = relative_path.gsub(/^[\/]+/, '').gsub(/[\/]+$/, '')
    entries, properties = @session.dir(relative_path, revision)
    return entries.keys.sort
  end
  def info(relative_path, revision = nil)
    path = File.join(@repository_uri, relative_path)
    data = {}
    @context.info(path, revision) do |dummy, infoStruct|
      # These values are enumerated at http://svn.collab.net/svn-doxygen/structsvn__info__t.html.
      data['url'] = infoStruct.URL
      data['revision'] = infoStruct.rev
      data['kind'] = infoStruct.kind
      data['repository_root_url'] = infoStruct.repos_root_url
      data['repository_uuid'] = infoStruct.repos_UUID
      data['last_changed_revision'] = infoStruct.last_changed_rev
      data['last_changed_date'] = infoStruct.last_changed_date
      data['last_changed_author'] = infoStruct.last_changed_author
      data['lock'] = infoStruct.lock
    end
    return data
  end
end
Enjoy.
The svn command is a client. It communicates with the Subversion server using several protocols (http(s)://, svn:// and file:///).
Repos.open is a repository function (much like svnadmin for instance). It operates directly on the database, and doesn't use a client protocol to communicate with the server.
