Re: git: uh-oh

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Max Bowsher <maxb(at)f2s(dot)com>, Magnus Hagander <magnus(at)hagander(dot)net>, Michael Haggerty <mhagger(at)alum(dot)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: git: uh-oh
Date: 2010-08-25 03:21:24
Message-ID: 11026.1282706484@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> On Fri, Aug 20, 2010 at 1:56 PM, Max Bowsher <maxb(at)f2s(dot)com> wrote:
>> My guess at this point is that there may be a (very old?) version of cvs
>> which, when adding a file to a branch, actually misrecorded the file as
>> having existed on the branch from the moment it was first added to trunk
>> - this would explain this anomaly.

> I think this is what is happening, except I'm unable to account for it
> by the age of the CVS version we're running. The machine the CVS
> repo is running on is running 1.11.17-FreeBSD (client/server).

Um, how old do you think that is? A look at the cvs sources says 2004...

It looks to me like the bogus commits for back-branch additions are
indeed part of our CVS history. While perhaps it would be nice if the
git conversion cleaned them up, I'm not sure that we want to put off
doing the conversion for however long it might take to make that happen.

> The odder cases are the ones involving deletion. There are a couple
> of branches/tags that, or so I'm guessing, are only present for a
> subset of the files in the repository: ecpg_big_bison, creation,
> Release-1-6-0, MANUAL_1_0, REL2_0B, and SUPPORT. I'm wondering if we
> shouldn't just nuke those, or at least nuke them from the copy of the
> repository upon which we are running the conversion.

Yeah, I noticed some of those in my copy of the test repository too,
but I see a slightly different set:

remotes/origin/REL2_0B
remotes/origin/REL6_4
remotes/origin/Release_1_0_3
remotes/origin/WIN32_DEV
remotes/origin/ecpg_big_bison

I doubt they're of any more than archaeological interest, but do we want
to be deleting history? What seemed more likely to be artifacts were
these:

remotes/origin/unlabeled-1.44.2
remotes/origin/unlabeled-1.51.2
remotes/origin/unlabeled-1.59.2
remotes/origin/unlabeled-1.87.2
remotes/origin/unlabeled-1.90.2

Any idea where those came from?

> This series of commits also seems pretty messed up:
> http://archives.postgresql.org/pgsql-committers/2007-04/msg00222.php
> http://archives.postgresql.org/pgsql-committers/2007-04/msg00223.php

You can find out about the reasons for that in this *other* discussion
of conversion to git:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00670.php
particularly here:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00685.php

> ... pretty crazy. I think we should try to do something to clean this up,
> perhaps by doctoring the file on the CVS side.

On the whole I feel that you're moving the goalposts. AFAIR the agreed
criteria for an acceptable SCM conversion were that it reproduce the
historical states of our tree at least at all the release tags, and that
it provide a close approximation of the CVS commit logs. I think that
manufactured commits that correspond to CVS's artifacts might be a bit
ugly, but trying to get rid of them sounds way too much like putting
lipstick on a pig. And if it means removing real, if ugly, history,
I'm not sure I'm in favor of it at all.

regards, tom lane
