Re: Hacking on PostgreSQL via GIT

From: Martin Langhoff <martin(at)catalyst(dot)net(dot)nz>
To: "Jim C(dot) Nasby" <jim(at)nasby(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Aidan Van Dyk <aidan(at)highrise(dot)ca>, "Florian G(dot) Pflug" <fgp(at)phlo(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Hacking on PostgreSQL via GIT
Date: 2007-04-19 01:20:43
Message-ID: 4626C3EB.3080308@catalyst.net.nz
Lists: pgsql-hackers

Jim C. Nasby wrote:
> Not bad... took you 40 lines to answer my question. Let's see if I can
> beat that...

Sure - it'll be 1 line when it's wrapped in a shell script. And then
we'll be even.

> I understand the argument about metadata and all, and largely agree with
> it. But on the other hand I think a version identifier is a critical
> piece of information; it's just as critical as the file name when it
> comes to identifying the information contained in the file.

Surely. It is important, but it's metadata and belongs elsewhere. That
metadata _is_ important doesn't mean you corrupt _data_ with it.

Just imagine that MySQL users were used to having their SQL engine
expand $Oid$ $Tablename$ $PrimaryKey$ in TEXT fields, and that those were
collapsed again on INSERT/UPDATE, and in comparisons too. Wouldn't
you say "that's metadata, it can be queried in a thousand ways, and it
does not belong in the middle of the data"?

And the _really_ interesting version identifier is usually the "commit"
identifier, which gives you a SHA1 of the whole src directory and the
history. Projects that use git usually include that SHA1 in their build
script, so even if a user compiles off a daily snapshot or a checkout on
a random branch of your SCM, you can just ask them "what's the build
identifier?" and they'll give you a SHA1.
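
As a rough sketch of what such a build-script step might look like (POSIX shell; the function name `build_sha1` and the "unknown" fallback are illustrative assumptions, not taken from any particular project):

```shell
#!/bin/sh
# Hypothetical build-script helper: print the SHA1 of the commit being built.
# Falls back to "unknown" when not run inside a git checkout
# (for example, in an unpacked tarball).
build_sha1() {
    git rev-parse --verify HEAD 2>/dev/null || echo "unknown"
}

# A build script could record this where the binary can report it later,
# for example by handing it to the compiler as a preprocessor define:
#   cc -DBUILD_ID="\"$(build_sha1)\"" ...
echo "build identifier: $(build_sha1)"
```

Whatever the build does with the value, the point is that one string identifies the whole tree, not a single file.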

Actually, git can spit out a nicer build identifier that includes the latest
tag, so if you see the identifier being

v8.2.<sha1>

you know it's not the 8.2 "release" but a commit soon after it, identified
by that SHA1. Git uses that during its own build to insert the version
identifier, so:

$ git --version
git version 1.5.1.gf8ce

With that in your hand, you can say

# show me what commits on top of the tagged 1.5.1 have I got
# (the leading "g" is only a marker; the abbreviated SHA1 is f8ce):
$ git log v1.5.1..f8ce

# file src/lib/foo.c at this exact commit
$ git show f8ce:src/lib/foo.c
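
To feed such an identifier back into git, it has to be split into its tag part and its hash part. A minimal sketch, assuming the dotted `<tag>.g<sha1>` form shown above; the function names are made up for illustration:

```shell
#!/bin/sh
# Split an identifier of the assumed form <tag>.g<abbreviated-sha1>.
# The "g" is only a marker in front of the hash, so it is dropped.
# An input without ".g" is returned unchanged by both functions.

# Everything before the last ".g": the tag part.
build_id_tag() {
    printf '%s\n' "${1%.g*}"
}

# Everything after the last ".g": the abbreviated SHA1.
build_id_sha() {
    printf '%s\n' "${1##*.g}"
}

id=1.5.1.gf8ce
echo "tag:  $(build_id_tag "$id")"
echo "sha1: $(build_id_sha "$id")"
```

The two parts can then be passed to `git log <tag>..<sha1>`, keeping in mind that the tag in the repository may carry a prefix (such as `v`) that the version string does not.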

So you can use this identifier (just call `git describe`) to

- name your tarballs
- create a "build-id" file at tarball creation time
- tag your builds with a version id

Then, when you have code out there in the wild and people report
bugs or send you patches, there's a good identifier you can ask for that
covers _all_ the files.
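
The tarball and build-id steps could be sketched like this. It assumes `git describe`, which in current git prints `<tag>-<n>-g<sha1>` rather than the dotted form; the project name, file names and function names are all hypothetical:

```shell
#!/bin/sh
# Hypothetical packaging helpers; nothing here is an existing project script.

# Identifier for the current checkout: nearest tag plus abbreviated SHA1,
# just the abbreviated SHA1 if no tag is reachable, or "unknown" outside
# a git checkout.
current_build_id() {
    git describe --tags --always 2>/dev/null || echo "unknown"
}

# Build a tarball file name from a project name and an identifier.
tarball_name() {
    printf '%s-%s.tar.gz\n' "$1" "$2"
}

# Write a build-id file and archive HEAD into a tarball named after the
# identifier. Defined but not called here, because it writes files.
make_snapshot() {
    project=$1
    id=$(current_build_id)
    printf '%s\n' "$id" > build-id
    git archive --format=tar --prefix="$project-$id/" HEAD |
        gzip > "$(tarball_name "$project" "$id")"
}

echo "would create: $(tarball_name postgresql "$(current_build_id)")"
```

In this sketch the build-id file ends up next to the tarball; a real release script would more likely place it inside the archive.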

If it happens that someone reports a bug and says they have 8.2.gg998
and you don't seem to have any gg998 commit after 8.2, you can say with
confidence: you are running a patched Pg - please repro with a
pristine copy (or show us your code!) :-)
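
That check can be scripted too. A minimal sketch, assuming the release is tagged `v8.2` in your repository (a made-up tag name) and using a hypothetical helper `commit_follows_tag`:

```shell
#!/bin/sh
# Hypothetical triage helper: does the reported hash name a commit in this
# repository that descends from the given release tag?
# Returns 0 if so, non-zero otherwise (also when git, the repository or
# the tag is missing).
commit_follows_tag() {
    tag=$1
    sha=$2
    # The hash must resolve to a commit we actually have.
    git rev-parse --verify --quiet "${sha}^{commit}" >/dev/null 2>&1 || return 1
    # The tag must be an ancestor of that commit.
    git merge-base --is-ancestor "$tag" "$sha" 2>/dev/null
}

if commit_follows_tag v8.2 gg998; then
    echo "gg998 is a known commit after v8.2"
else
    echo "gg998 not found after v8.2: ask for a repro on a pristine copy"
fi
```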

cheers,

m
--
-----------------------------------------------------------------------
Martin @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St
OFFICE: +64(4)916-7224 UK: 0845 868 5733 ext 7224 MOB: +64(21)364-017
Make things as simple as possible, but no simpler - Einstein
-----------------------------------------------------------------------
