Re: PostgreSQL Developer meeting minutes up

From: "Markus Wanner" <markus(at)bluegap(dot)ch>
To: "Aidan Van Dyk" <aidan(at)highrise(dot)ca>
Cc: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>, "Magnus Hagander" <magnus(at)hagander(dot)net>, "Andrew Dunstan" <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PostgreSQL Developer meeting minutes up
Date: 2009-05-29 15:05:59
Message-ID: 20090529170559.99155w414h94hazb@mail.bluegap.ch
Lists: pgsql-hackers

Hi,

Quoting "Aidan Van Dyk" <aidan(at)highrise(dot)ca>:
>> Ok, so seeing the interest in having a "good conversion", I took a stab at
>> parsecvs this afternoon, probably what I consider the leading "static"
>> conversion tool.

Here are some results from a conversion with cvs2git.

>> It takes about 10 minutes to run on my old xeon.

The conversion with cvs2git certainly took a bit longer, but I don't
think that matters at all. Anything under a day or two is good enough,
IMO. What counts is the result.

The first step is running cvs2git itself:

cvs2svn Statistics:
------------------
Total CVS Files: 6873
Total CVS Revisions: 140191
Total CVS Branches: 36057
Total CVS Tags: 457515
Total Unique Tags: 171
Total Unique Branches: 21
CVS Repos Size in KB: 377337
Total SVN Commits: 32889
First Revision Date: Tue Jul 9 08:21:07 1996
Last Revision Date: Thu May 28 22:02:10 2009

(The number of files matches my own algorithm pretty well. The total
number of SVN commits, however, is a bit lower than the ~ 40'000 blobs
I got.)

The output of cvs2git can then be imported with git fast-import:

git-fast-import statistics:
---------------------------------------------------------------------
Alloc'd objects: 350000
Total objects: 349405 ( 19563 duplicates )
blobs : 132672 ( 3255 duplicates 119032 deltas)
trees : 183967 ( 16308 duplicates 165582 deltas)
commits: 32766 ( 0 duplicates 0 deltas)
tags : 0 ( 0 duplicates 0 deltas)
Total branches: 194 ( 664 loads )
marks: 1073741824 ( 168693 unique )
atoms: 5280
Memory total: 16532 KiB
pools: 2860 KiB
objects: 13671 KiB
---------------------------------------------------------------------
pack_report: getpagesize() = 4096
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit = 8589934592
pack_report: pack_used_ctr = 124414
pack_report: pack_mmap_calls = 3674
pack_report: pack_open_windows = 1 / 1
pack_report: pack_mapped = 199500913 / 199500913
---------------------------------------------------------------------
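
As a quick sanity check on the statistics above: the per-type object
counts should add up to the reported total, and likewise for the
duplicates. A minimal sketch in Python, assuming the report has been
pasted into a string; parse_stats and is_consistent are made-up helper
names for illustration, not part of git or cvs2git:

```python
import re

# Relevant lines copied from the git fast-import report above.
STATS = """\
Total objects: 349405 ( 19563 duplicates )
blobs : 132672 ( 3255 duplicates 119032 deltas)
trees : 183967 ( 16308 duplicates 165582 deltas)
commits: 32766 ( 0 duplicates 0 deltas)
tags : 0 ( 0 duplicates 0 deltas)
"""

# "<label> : <count> ( <n> duplicates ..."
LINE_RE = re.compile(r"^\s*([A-Za-z ]+?)\s*:\s*(\d+)\s*\(\s*(\d+) duplicates")


def parse_stats(text):
    """Return {label: (count, duplicates)} for every recognised line."""
    stats = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            stats[m.group(1)] = (int(m.group(2)), int(m.group(3)))
    return stats


def is_consistent(stats):
    """True if the per-type counts and duplicates sum to the totals."""
    kinds = ("blobs", "trees", "commits", "tags")
    count = sum(stats[k][0] for k in kinds)
    dups = sum(stats[k][1] for k in kinds)
    return (count, dups) == stats["Total objects"]


if __name__ == "__main__":
    print(is_consistent(parse_stats(STATS)))
```

For this run the numbers do add up: 132672 + 183967 + 32766 + 0 =
349405 objects, and 3255 + 16308 = 19563 duplicates.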

The resulting repository contains the following branches. The
unlabeled ones contain only 1-2 files and seem rather irrelevant. On
the next try I'd disable their creation completely; I just wanted to
check.

REL2_0B
REL6_4
REL6_5_PATCHES
REL7_0_PATCHES
REL7_1_STABLE
REL7_2_STABLE
REL7_3_STABLE
REL7_4_STABLE
REL8_0_0
REL8_0_STABLE
REL8_1_STABLE
REL8_2_STABLE
REL8_3_STABLE
Release_1_0_3
WIN32_DEV
ecpg_big_bison
* master
unlabeled-1.44.2 -> from src/backend/commands/tablecmds.c
unlabeled-1.51.2 -> from src/test/regress/expected/alter_table.out
unlabeled-1.59.2 -> from src/backend/executor/execTuples.c
unlabeled-1.87.2 -> from src/backend/executor/nodeAgg.c
unlabeled-1.90.2 -> from src/backend/parser/parse_target.c and
                    src/backend/access/common/tupdesc.c

Comparison of the head of each branch between git and CVS (modulo CVS
keyword expansion, which I've filtered out):

ecpg_big_bison.diff: 0 files changed
master.diff: 0 files changed
REL2_0B.diff: 0 files changed
REL6_4.diff: 0 files changed
REL6_5_PATCHES.diff: 0 files changed
REL7_0_PATCHES.diff: 0 files changed
REL7_1_STABLE.diff: 0 files changed
REL7_2_STABLE.diff: 0 files changed
REL7_3_STABLE.diff: 0 files changed
REL7_4_STABLE.diff: 0 files changed
REL8_0_0.diff: 0 files changed
REL8_0_STABLE.diff: 0 files changed
REL8_1_STABLE.diff: 0 files changed
REL8_2_STABLE.diff: 0 files changed
REL8_3_STABLE.diff: 0 files changed
Release_1_0_3.diff: 0 files changed
WIN32_DEV.diff: 0 files changed
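
For anyone wanting to repeat such a comparison, here is one possible
way to ignore keyword expansion, as a rough Python sketch. It is not
necessarily how the filtering was done for the diffs above. The idea is
to collapse every expanded keyword such as "$Id: ... $" back to its
bare form "$Id$" on both sides before comparing. The keyword list is
the standard RCS/CVS set plus "PostgreSQL", which I'm assuming is the
custom keyword used in this repository; the function names are
invented for the example.

```python
import re

# Standard RCS/CVS keywords, plus "PostgreSQL" (assumed to be the
# repository-local keyword; adjust the list to your setup).
KEYWORDS = ("Author", "Date", "Header", "Id", "Locker", "Log", "Name",
            "RCSfile", "Revision", "Source", "State", "PostgreSQL")

# Matches an expanded keyword on a single line, e.g. "$Id: foo.c,v 1.2 ... $".
KEYWORD_RE = re.compile(r"\$(" + "|".join(KEYWORDS) + r"):[^$\n]*\$")


def collapse_keywords(text):
    """Replace every expanded keyword with its unexpanded form."""
    return KEYWORD_RE.sub(r"$\1$", text)


def same_modulo_keywords(a, b):
    """Compare two file contents, ignoring keyword expansion."""
    return collapse_keywords(a) == collapse_keywords(b)


if __name__ == "__main__":
    cvs_side = "/* $Id: foo.c,v 1.2 2009/05/28 22:02:10 tgl Exp $ */\n"
    git_side = "/* $Id$ */\n"
    print(same_modulo_keywords(cvs_side, git_side))
```

Note that this only handles keywords expanded within a single line. The
multi-line history that $Log$ inserts into a file is not covered and
would need separate treatment.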

I plan to compare the tags as well and check which branch each one is
on, but so far cvs2git seems to keep its promises. I'll report back
again within the next few days.

Regards

Markus Wanner
