Re: PostgreSQL GIT mirror status

From: "Daniel Farina" <drfarina(at)acm(dot)org>
To: "Heikki Linnakangas" <heikki(dot)linnakangas(at)enterprisedb(dot)com>
Cc: "Peter Eisentraut" <peter_e(at)gmx(dot)net>, pgsql-www(at)postgresql(dot)org, "Jeff Davis" <pgsql(at)j-davis(dot)com>
Subject: Re: PostgreSQL GIT mirror status
Date: 2009-01-09 17:56:09
Message-ID: 7b97c5a40901090956n2eb0478x3133a1edddcfc102@mail.gmail.com
Lists: pgsql-www

On Fri, Jan 9, 2009 at 3:06 AM, Heikki Linnakangas
<heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
> Wow, that's impressive! How long does a "git gc --aggressive" run take?

Actually, not that long. The step that takes forever at this point
(starting from scratch) is counting all those objects. The actual
gc --aggressive run probably takes minutes, and under an hour on a
reasonably fast machine.

> That could be because of the duplicated history we had there in December,
> that I then fixed. I reset the branches to just before the screwup, and then
> ran fromcvs to catch up with CVS HEAD again. That duplicated history is
> probably still there, but not reachable from any branches or tags.
>
> Should we run "git prune" to get rid of the garbage?
>

Sounds like a good candidate, but I don't think that alone will do
it. I've had to do something like this before when I temporarily added
some large blobs to my git repository to move them between home and
work.

I have isolated the problem to the reflog, which sounds about right.
The "git reflog" man page says it has ways to delete and/or expire
entries so that the objects they reference can be pruned, so try that
first (and then tell me if it worked as you expected, and what you did).
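
A minimal sketch of the expiry route, for reference. The helper name
expire_reflogs and the path in the usage line are hypothetical; the
flags are the ones documented for "git reflog expire". I have not
checked how this behaves on the mirror itself.

```shell
# expire_reflogs: hypothetical helper, not part of git.
# Drops every reflog entry in the given repository so that objects kept
# alive only by the reflog become unreachable.
expire_reflogs() {
    repo=${1:-.}
    (
        cd "$repo" || exit 1
        # --expire=now removes entries regardless of age;
        # --expire-unreachable=now does the same for entries that are not
        # reachable from the current tip of the ref;
        # --all processes the reflogs of every ref, not just HEAD.
        git reflog expire --expire=now --expire-unreachable=now --all
    )
}
```

Usage would be something like: expire_reflogs /path/to/repo.git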

If it doesn't (i.e. for some reason it is not pruning properly) and
you are sure you won't need the reflog, it seems you can just delete
the 'logs' directory under the git repository (you may notice that the
repository at lolrus.org works fine but has no 'logs' directory). That
seems to be the same state as having no reflog at all, after which a
regular 'git gc' will collect most of those objects.

"But wait, there's more!"

You'll then want to run 'git prune', since gc seems to keep some
objects around anyway because they are still inside the gc grace
period, which I believe is distinct from the reflog. In this case we
really want them gone.
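
Before deleting anything, it may be worth looking at what prune would
actually remove. A small sketch, assuming a hypothetical helper named
list_garbage; "git count-objects -v" and the -n (dry run) flag of
"git prune" are standard, everything else is illustration.

```shell
# list_garbage: hypothetical helper, not part of git.
# Reports repository size and the unreachable loose objects that a
# plain 'git prune' would delete, without deleting anything.
list_garbage() {
    repo=${1:-.}
    (
        cd "$repo" || exit 1
        # Object counts and on-disk size, loose and packed.
        git count-objects -v
        # -n: do not remove anything, just list "<sha> <type>" for each
        # unreachable loose object that would be pruned.
        git prune -n
    )
}
```

Note that prune only looks at loose objects, so garbage that is still
inside a pack will not show up here until a gc has repacked the
repository.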

Given this information it seems like the right steps are something
like this:

1. Somehow expire and/or delete the reflogs so they register as
garbage.

* By making use of the 'git reflog' expiration/deletion commands
(preferred, if one can figure out their behavior exactly)

* Or just deleting $GITREPO/logs. (works for me at the moment)

2. Run 'git gc --aggressive'

3. Run 'git prune'
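
The steps above can be sketched as a single shell function. The name
shrink_repo is hypothetical. One deliberate deviation from the list:
gc is given --prune=now, which is my addition and tells gc to ignore
its grace period itself; newer git versions may otherwise park recent
garbage in a pack where a plain 'git prune' does not reach it. Treat
this as a sketch under those assumptions, and only run it on a
repository whose reflog you are sure you don't need.

```shell
# shrink_repo: hypothetical helper, not part of git.
# Expires the reflog, repacks aggressively, then prunes what is left.
shrink_repo() {
    repo=${1:-.}
    (
        cd "$repo" || exit 1
        # Step 1: expire all reflog entries so the old history registers
        # as garbage.
        git reflog expire --expire=now --all || exit 1
        # Step 2: aggressive repack. --prune=now (an addition to the
        # steps in this mail) drops unreachable objects regardless of age.
        git gc --aggressive --prune=now --quiet || exit 1
        # Step 3: remove any unreachable loose objects that remain.
        git prune
    )
}
```

Comparing "git count-objects -v" before and after should show whether
the duplicated history was really dropped.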

Alternatively, just steal the pack from fdr.lolrus.org, as mentioned
above.

fdr
