How to select the appropriate diff/patch for a commit with rugged

I am trying to get the commits made after a given date in a local copy of a git repository, and then extract the corresponding modifications to the files.

The equivalent git command would be:

git log -p --reverse --after="2016-10-01"

Here is the script I use:

require "rugged"
require "date"

git_dir = "../ruby-gnome2/"

repo = Rugged::Repository.new(git_dir)
walker = Rugged::Walker.new(repo)
walker.sorting(Rugged::SORT_DATE | Rugged::SORT_REVERSE)
walker.push(repo.head.target)

walker.each do |commit|
  c_time = Time.at(commit.time)
  next unless c_time >= Date.new(2016, 10, 1).to_time

  puts c_time
  puts commit.diff.size
  puts commit.diff.stat.inspect
end

The problem is that a lot of files appear to be modified. Here is the end of the output of this script:

2016-10-22 17:33:37 +0200
2463
[2463, 0, 271332]

This means that 2463 files are reported as modified, deleted, or replaced, while git log -p --reverse --after="2016-10-22" shows that only 2 files were modified.

How can I get the same results as with the git command? That is, how can I find the files that are actually modified by this commit?

2

There are 2 best solutions below

0
cedlemo On BEST ANSWER

As I did not get an answer from the rugged team, I wrote a Ruby GObject-Introspection loader for libgit2-glib, available at https://github.com/ruby-gnome2/ggit.

Now I can get the diffs and logs that correspond to the output of the git command line interface:

require "ggit"

PATH = File.expand_path(File.dirname(__FILE__))

repo_path = "#{PATH}/ruby-gnome2/.git"

file = Gio::File.path(repo_path)

begin
  repo = Ggit::Repository.open(file)
  revwalker = Ggit::RevisionWalker.new(repo)
  revwalker.sort_mode = [:time, :topological, :reverse]
  head = repo.head
  revwalker.push(head.target)
rescue => error
  STDERR.puts error.message
  exit 1
end

def signature_to_string(signature)
  name = signature.name
  email = signature.email
  time = signature.time.format("%c")

  "#{name} <#{email}> #{time}"
end

while oid = revwalker.next do
  commit = repo.lookup(oid, Ggit::Commit.gtype)

  author = signature_to_string(commit.author)
  date = commit.committer.time
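  # Compare the date as a whole: testing year, month and day separately
  # would skip later dates such as 2016-12-03 or 2017-01-10.
  next unless ([date.year, date.month, date.day_of_month] <=> [2016, 11, 5]) > 0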
  committer = signature_to_string(commit.committer)

  subject = commit.subject
  message = commit.message

  puts "SHA: #{oid}"
  puts "Author:  #{author}"
  puts "Committer: #{committer}"
  puts "Subject: #{subject}"
  puts "Message: #{message}"
  puts "----------------------------------------"

  commit_parents = commit.parents
  if commit_parents.size > 0
    parent_commit = commit_parents.get(0)
    commit_tree = commit.tree
    parent_tree = parent_commit.tree

    diff = Ggit::Diff.new(repo, :old_tree => parent_tree,
                          :new_tree => commit_tree, :options => nil)

    diff.print(Ggit::DiffFormatType::PATCH).each do |_delta, _hunk, line|
      puts "\t | #{line.text}"
      0
    end

  end

end
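A note on the date filter in the loop above: a cutoff is safest when the date is compared as a whole rather than one field at a time. Below is a small sketch in plain Ruby, using the standard library's Date in place of GLib::DateTime. The constant and method names are hypothetical and only serve the illustration.

```ruby
require "date"

# Hypothetical cutoff, matching the "after 5 November 2016" intent of the loop.
CUTOFF = Date.new(2016, 11, 5)

# Compare each field independently (shown only for contrast).
def fieldwise_after?(year, month, day)
  year >= CUTOFF.year && month >= CUTOFF.month && day > CUTOFF.day
end

# Compare the date as a whole.
def whole_date_after?(year, month, day)
  Date.new(year, month, day) > CUTOFF
end

# 3 January 2017 is later than the cutoff, but its month and day are both
# smaller than the cutoff's, so the two checks disagree.
puts fieldwise_after?(2017, 1, 3)   # prints false
puts whole_date_after?(2017, 1, 3)  # prints true
```

The same idea applies to any object exposing year, month and day accessors: build one comparable value from them and compare once.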
8
VonC On

When I clone ruby-gnome2/ruby-gnome2, it reports 2400+ files, so a count of 2463 suggests that every file is being treated as modified.

This differs from the normal behavior of rugged's commit.diff, which by default diffs the current commit (returned by the Walker) against its first parent commit.

Check whether a setting such as git config core.autocrlf is set to true, which might change the line endings (EOL) in your local repo.
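One way to rule out the default comparison as the cause is to diff each commit against its first parent explicitly. The stat in the question shows 0 insertions and 271332 deletions, which would be consistent with the commit's tree being compared against an empty tree rather than against its parent. That is an assumption about the installed rugged version, not something confirmed here. A minimal sketch, assuming the commit objects respond to parents and diff(other) the way Rugged::Commit does (the helper name is hypothetical):

```ruby
# Hypothetical helper (not part of rugged): diff a commit against its first
# parent explicitly instead of relying on what `commit.diff` does when it is
# called without arguments. `commit` is expected to behave like a
# Rugged::Commit, i.e. respond to `parents` and `diff(other)`.
def diff_against_first_parent(commit)
  parent = commit.parents.first
  # A root commit has no parent, so there is nothing to compare against.
  return nil if parent.nil?

  # Old side: parent, new side: commit, the same direction as `git log -p`.
  parent.diff(commit)
end

# Intended use inside the walker loop from the question:
#   diff = diff_against_first_parent(commit)
#   puts diff.stat.inspect if diff
```

If the explicit diff reports the expected 2 files, the difference comes from what the commit is compared against and not from line-ending settings.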