Re: Hacking on PostgreSQL via GIT

From: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
To: Martin Langhoff <martin(at)catalyst(dot)net(dot)nz>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hacking on PostgreSQL via GIT
Date: 2007-04-19 00:41:01
Message-ID: 20070419004100.GB72669@nasby.net
Lists: pgsql-hackers

On Thu, Apr 19, 2007 at 10:07:08AM +1200, Martin Langhoff wrote:
> Jim C. Nasby wrote:
> > Then how do you tell what version a file is if it's outside of a
> > checkout?
>
> It's trivial for git to answer that - the file will either be pristine,
> and then we can just scan for the matching SHA1, or modified, and we can
> scan (taking a wee bit more time) for the "closest matches" in
> your history, across branches and commits.
>
> The actual scripting for this isn't written just yet -- Linus posted a
> proof-of-concept shell implementation along the lines of
>
> git rev-list --no-merges --full-history v0.5..v0.7 -- \
>     src/widget/widget.c > rev-list
>
> best_commit=none
> best=1000000
> while read commit
> do
>     git cat-file blob "$commit:src/widget/widget.c" > tmpfile
>     lines=$(diff reference-file tmpfile | wc -l)
>     if [ "$lines" -lt "$best" ]
>     then
>         echo "Best so far: $commit $lines"
>         best=$lines
>         best_commit=$commit
>     fi
> done < rev-list
>
> and it's fast. One of the good properties of this is that you can ask
> for a range of your history (v0.5 to v0.7 in the example) and an exact
> path (src/widget/widget.c) but you can also say --all (meaning "in all
> branches") and a handwavy "over there", like src. And git will take an
> extra second or two on a large repo, but tell you about all the good
> candidates across the branches.
>
> Metadata is metadata, and we can fish it out of the SCM easily - and
> data is data, and it's silly to pollute it with metadata that is mostly
> incidental.
>
> If I find time today I'll post to the git list a cleaned up version of
> Linus' shell script as
>
> git-findclosestmatch <head or range or --all> path/to/scan/ \
> randomfile.c

Not bad... took you 40 lines to answer my question. Let's see if I can
beat that...

> > Then how do you tell what version a file is if it's outside of a
> > checkout?

Answer: you look at the $Id$ (or in this case, $PostgreSQL$) tag.

Sorry, tried to get it to 2 lines, but couldn't. ;)
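
For a file sitting outside any checkout, reading that tag could look
roughly like the sketch below. The function name show_version_tag is
made up for illustration, and it assumes the keyword was expanded when
the file was exported and that your grep supports -o (GNU and BSD grep
do).

```shell
# Hypothetical helper: print the first expanded $Id$ or $PostgreSQL$
# keyword found in the file given as $1. Prints nothing if the keyword
# is absent or was never expanded (i.e. is still a bare "$Id$").
show_version_tag() {
    grep -o -E '\$(Id|PostgreSQL): [^$]*\$' "$1" | head -n 1
}
```

Calling show_version_tag randomfile.c would then print the whole
keyword string, which carries the file path, revision and date.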

I understand the argument about metadata and all, and largely agree with
it. But on the other hand I think a version identifier is a critical
piece of information; it's just as critical as the file name when it
comes to identifying the information contained in the file.

Or does GIT not use filenames, either? :)
--
Jim Nasby jim(at)nasby(dot)net
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
