Re: git: uh-oh

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Max Bowsher <maxb(at)f2s(dot)com>, Michael Haggerty <mhagger(at)alum(dot)mit(dot)edu>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: git: uh-oh
Date: 2010-08-20 22:59:15
Message-ID: 29443.1282345155@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> I have now pushed a complete copy of the latest migrated repository to
> http://git.postgresql.org/gitweb?p=git-migration-test.git;a=summary.

> This one has tkey keyword expansion on, which we decided we want. My
> script that compares branch tips and tags to cvs now shows *zero*
> differences. Which in itself is an improvement over the old version of
> cvs2git :-)

Cool --- this alone is probably worth the delay in converting.

> I have not checked the state of the "newly added files issue" that
> Robert found, nor in general verified anything other than branch tips
> and tags so far. Anybody who has time to do that, please go right
> ahead :-)

I just spent some time comparing the REL8_3_STABLE history to what
I have from cvs2cl. It is *much* better --- no more unrelated commits.
I do see the "manufactured commits" that Robert is on about, but what
is now apparent is that those correspond to artifact commits on the CVS
side too. For example, here's what cvs2cl claims happened in the 8.3
branch on Feb 28 (all times EST = GMT-5):

2010-02-28 22:41 tgl

* contrib/xml2/: Makefile, xpath.c, xslt_proc.c, expected/xml2.out,
sql/xml2.sql (REL8_3_STABLE): Back-patch today's memory management
fixups in contrib/xml2.

Prior to 8.3, these changes are not critical for compatibility with
core Postgres, since core had no libxml2 calls then. However there
is still a risk if contrib/xml2 is used along with libxml2
functionality in Perl or other loadable modules. So back-patch to
all versions.

Also back-patch addition of regression tests. I'm not sure how
many of the cases are interesting without the interaction with core
xml code, but a silly regression test is still better than none at
all.

2010-02-28 21:21 tgl

* src/: backend/access/transam/xact.c, backend/utils/adt/xml.c,
include/utils/xml.h (REL8_3_STABLE): Back-patch changes of
2009-05-13 in xml.c's memory management.

I was afraid to do this when these changes were first made, but now
that 8.4 has seen some field use it should be all right to
back-patch. These changes are really quite necessary in order to
give xml.c any hope of co-existing with loadable modules that also
wish to use libxml2.

2010-02-28 16:31 tgl

* contrib/xml2/: expected/xml2.out, sql/xml2.sql: Fix up memory
management problems in contrib/xml2.

Get rid of the code that attempted to funnel libxml2's memory
allocations into palloc. We already knew from experience with the
core xml datatype that trying to do this is simply not reliable.
Unlike the core code, I did not bother adding a lot of
PG_TRY/PG_CATCH logic to try to ensure that everything is cleaned
up on error exit. Hence, we might leak some memory if one of these
functions fails partway through. Given the deprecated status of
this contrib module and the fact that errors partway through the
functions shouldn't be too common, it doesn't seem worth worrying
about.

Also fix a separate bug in xpath_table, that it did the wrong
things if given a result tuple descriptor with less than 2 columns.
While such a case isn't very useful in practice, we shouldn't fail
or stomp memory when it occurs.

Add some simple regression tests based on all the reported crash
cases that I have on hand.

This should be back-patched, but let's see if the buildfarm likes
it first.

Notice that that last entry doesn't say (REL8_3_STABLE). Which is
correct, because *there was no such commit against 8.3*. This entry is
quoting the commit message, and the commit time, of the HEAD commit that
added xml2.sql and xml2.out --- and notice it only lists those two
files, not the other ones touched by that HEAD commit.

In the git conversion, there is a "manufactured commit" corresponding to
this one, although the timestamp seems slightly different, and it
appears to inject the HEAD versions of these test files into the branch.
Then in the later git commit corresponding to the first cvs2cl entry
above, there is a diff that makes the test files match the way they
really look in 8.3.

I suspect that this happened every time we did a back-branch file
addition, and that in most cases the oddity got masked because the HEAD
and back-branch commits happened at approximately the same time and with
identical commit messages. So they got folded into one report by
cvs2cl. In this example, with the messages being different and a few
hours elapsed between the commits, we can see that some weirdness did
happen in the CVS history too. This also becomes apparent when you
look at cvsweb:
http://anoncvs.postgresql.org/cvsweb.cgi/pgsql/contrib/xml2/sql/xml2.sql
The back-branch versions of the file are shown as being updates from the
HEAD version 1.1, which is surely not the way things happened in
reality, but ...

So at this point I'm willing to buy Max and Michael's assertion that
this is a faithful conversion of the CVS history. The fact that the
commit messages are tagged as manufactured seems like it might be a good
thing not a bad thing --- they're manufactured on the CVS side too.

We need to do more testing of this conversion, but right at the moment
I'm thinking it might be OK as-is.

regards, tom lane

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Joshua D. Drake 2010-08-20 23:28:30 Re: Version Numbering
Previous Message Josh Berkus 2010-08-20 22:41:01 Re: Version Numbering