Skip site navigation (1) Skip section navigation (2)

Re: Vacuum questions...

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Hannu Krosing <hannu(at)skype(dot)net>
Cc: Jan Wieck <JanWieck(at)Yahoo(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Vacuum questions...
Date: 2005-09-30 22:18:12
Message-ID: 20050930221812.GG40138@pervasive.com (view raw or flat)
Thread:
Lists: pgsql-hackers
On Thu, Sep 29, 2005 at 12:50:13AM +0300, Hannu Krosing wrote:
> On T, 2005-09-27 at 17:57 -0500, Jim C. Nasby wrote:
> > On Tue, Sep 27, 2005 at 02:47:46PM -0400, Jan Wieck wrote:
> > > On 9/24/2005 8:17 PM, Jim C. Nasby wrote:
> > > 
> > > >Would it be difficult to vacuum as part of a dump? The reasoning behind
> > > >this is that you have to read the table to do the dump anyway, 
> > > 
> > > I think aside from what's been said so far, it would be rather difficult 
> > > anyway. pg_dump relies on MVCC and requires to run in one transaction to 
> > > see a consistent snapshot while vacuum jiggles around with transactions 
> > > in some rather non-standard way.
> > 
> > Is this true even if they were in different connections?
> > 
> > My (vague) understanding of the vacuum process is that it first vacuums
> > indexes, and then vacuums the heap. 
> 
> actually (lazy) vacuum does this
> 
> 1) scan heap, collect ctids of rows to remove
> 2) clean indexes
> 3) clean heap
> 
> > Since we don't dump indexes, there's
> > nothing for backup to do while those are vacuumed, so my idea is:
> > 
> > pg_dump:
> > foreach (table)
> >     spawn vacuum
> >     wait for vacuum to hit heap
> >     start copy
> >     wait for analyze to finish
> > next;
> 
> probably the first heap scan of vacuum would go faster than dump as it
> does not have to write out anything, and the second scan ( nr 3 in above
> list ) would be either faster or slower, as it has to lock each page and
> rearrange tuples there.
> 
> so it would be very hard to synchronize vacuum with either of them.

Well, I guess it depends on what the dump was writing to. Also depends
on available cache I expect.

Is this something that could be hacked together fairly easy just for
testing purposes? Would firing off a VACUUM tablename at the same time
as a COPY tablename be a good enough approximation?
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby(at)pervasive(dot)com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

In response to

pgsql-hackers by date

Next:From: Simon RiggsDate: 2005-09-30 22:21:16
Subject: Re: [HACKERS] A Better External Sort?
Previous:From: Jim C. NasbyDate: 2005-09-30 22:13:09
Subject: Bug 1473, pthread python on FreeBSD

Privacy Policy | About PostgreSQL
Copyright © 1996-2014 The PostgreSQL Global Development Group