Re: "Compacting" a relation

From: Hannu Krosing <hannu(at)skype(dot)net>
To: Jim Nasby <decibel(at)decibel(dot)org>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: "Compacting" a relation
Date: 2006-12-05 07:35:50
Message-ID: 1165304150.3117.5.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

Ühel kenal päeval, P, 2006-12-03 kell 22:14, kirjutas Jim Nasby:
> On Nov 29, 2006, at 2:29 AM, Hannu Krosing wrote:
> > Ühel kenal päeval, K, 2006-11-29 kell 11:19, kirjutas Peter
> > Eisentraut:
> >> vacuumlazy.c contains a hint "Consider compacting this relation"
> >> but AFAICT,
> >> there is no indication anywhere how "compacting" is supposed to be
> >> achieved.
> >> I guess this means VACUUM FULL or CLUSTER, but I don't think the
> >> hint can be
> >> processed effectively by a user.
> >
> > I once had an online/concurrent/non-locking compacting script,
> > which did
> > for each const_pk_col_with_largest_ctid staring starting from end of
> > relation
> >
> > UPDATE rel
> > SET pk_col=pk_col
> > WHERE pk_col = const_pk_col_with_largest_ctid
> >
> > until the tuple moved to another page as determined by
> >
> > SELECT ctid FROM rel where pk_col=const_pk_col_with_largest_ctid
> >
> > if the tuple moved to a larger page number then it was time for
> > another
> > lazy vacuum.
>
> Larger or smaller?

Larger, smaller is the expected behaviour without vacuum.

> There's a TODO about allowing control over what pages in a relation
> you get back from FSM that would make this a lot easier. In the case
> of a bloated table, you'd want to have the FSM favor handing out
> pages at the beginning of the heap. If you combined that with a
> special mode where new tuples would not be created on any page in the
> last X percent of the heap, it would be trivial to clean up a bloated
> table. Theoretically, you might be able to apply the same kind of
> technique to cleaning up a bloated index.
>
> BTW, the other reason to allow selecting where the FSM hands out data
> is for keeping a table clustered.

FSM is consulted only in case the new tuple does not fit on the same
page as old. for clustering putposes, this should also be optional.

> You might also be able to keep
> indexes in a more optimal order on-disk (as I understand it, over
> time the physical order of an index can become very different from
> the index order).
> --
> Jim Nasby jim(at)nasby(dot)net
> EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
>
>
--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

NOTICE: This communication contains privileged or other confidential
information. If you have received it in error, please advise the sender
by reply email and immediately delete the message and any attachments
without copying or disclosing the contents.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Markus Schiltknecht 2006-12-05 08:08:33 Re: Integrating Replication into Core
Previous Message Hannu Krosing 2006-12-05 07:20:14 Re: old synchronized scan patch

Browse pgsql-patches by date

  From Date Subject
Next Message Michael Meskes 2006-12-05 09:48:34 Re: ECPG docs
Previous Message Iannsp 2006-12-05 02:05:36 Re: [DOCS] 8.2.0 pdf