Re: Vacuum questions...

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Vacuum questions...
Date: 2005-09-30 22:18:12
Message-ID: 20050930221812.GG40138@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Sep 29, 2005 at 12:50:13AM +0300, Hannu Krosing wrote:
> On T, 2005-09-27 at 17:57 -0500, Jim C. Nasby wrote:
> > On Tue, Sep 27, 2005 at 02:47:46PM -0400, Jan Wieck wrote:
> > > On 9/24/2005 8:17 PM, Jim C. Nasby wrote:
> > >
> > > >Would it be difficult to vacuum as part of a dump? The reasoning behind
> > > >this is that you have to read the table to do the dump anyway,
> > >
> > > I think aside from what's been said so far, it would be rather difficult
> > > anyway. pg_dump relies on MVCC and requires to run in one transaction to
> > > see a consistent snapshot while vacuum jiggles around with transactions
> > > in some rather non-standard way.
> >
> > Is this true even if they were in different connections?
> >
> > My (vague) understanding of the vacuum process is that it first vacuums
> > indexes, and then vacuums the heap.
>
> actually (lazy) vacuum does this
>
> 1) scan heap, collect ctids of rows to remove
> 2) clean indexes
> 3) clean heap
>
> > Since we don't dump indexes, there's
> > nothing for backup to do while those are vacuumed, so my idea is:
> >
> > pg_dump:
> > foreach (table)
> > spawn vacuum
> > wait for vacuum to hit heap
> > start copy
> > wait for analyze to finish
> > next;
>
> probably the first heap scan of vacuum would go faster than dump as it
> does not have to write out anything, and the second scan ( nr 3 in above
> list ) would be either faster or slower, as it has to lock each page and
> rearrange tuples there.
>
> so it would be very hard to synchronize vacuum with either of them.

Well, I guess it depends on what the dump was writing to. Also depends
on available cache I expect.

Is this something that could be hacked together fairly easy just for
testing purposes? Would firing off a VACUUM tablename at the same time
as a COPY tablename be a good enough approximation?
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2005-09-30 22:21:16 Re: [HACKERS] A Better External Sort?
Previous Message Jim C. Nasby 2005-09-30 22:13:09 Bug 1473, pthread python on FreeBSD