Re: Maintaining cluster order on insert

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Heikki Linnakangas <heikki(at)enterprisedb(dot)com>
Cc: Jaime Casanova <systemguards(at)gmail(dot)com>, "Jim C(dot) Nasby" <decibel(at)decibel(dot)org>, PostgreSQL-patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: Maintaining cluster order on insert
Date: 2007-05-18 13:54:59
Message-ID: 200705181354.l4IDsxp16082@momjian.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.

---------------------------------------------------------------------------

Heikki Linnakangas wrote:
> Jaime Casanova wrote:
> > On 5/16/07, Heikki Linnakangas <heikki(at)enterprisedb(dot)com> wrote:
> >> Jim C. Nasby wrote:
> >> > What about adding the ability to ask the FSM for a page that's near a
> >> > given page? That way if you did have to go to the FSM you could at
> >> least
> >> > try and insert close to the page you originally wanted.
> >>
> >> Yeah, there's always room for improvement. I made the patch when I was
> >> working on clustered indexes, and was mostly concerned about getting
> >> inserts to the same page as other tuples with similar values so that the
> >> clustered index stays clustered.
> >>
> >
> > the patch doesn't apply in cvs... you'll need to update it...
>
> Oh, here you are.
>
> The implementation has changed a bit since August. I thought I had
> submitted an updated version in the winter but couldn't find it. Anyway,
> I updated and dusted off the source tree, tidied up the comments a
> little bit, and fixed some inconsistencies in pg_proc entries that made
> opr_sanity to fail.
>
> The beef of the patch is two new optional indexam API functions:
> amprepareinsert and amfinishinsert. amprepareinsert is called before
> inserting the heap tuple. It descends the tree and finds and pins the
> right leaf page to insert to, and returns a suggestion on where the heap
> tuple should be inserted. amfinishinsert is called after inserting the
> heap tuple to actually insert the index tuple. Documentation for these
> functions need to be added indexam.sgml, I noticed that that's not done yet.
>
> The cluster_inserts GUC option that you can use to enable/disable the
> feature should be removed before committing.
>
> The performance characteristics of this patch hasn't been thoroughly
> discussed yet. The reason why you want to cluster your tables is to
> speed up SELECTs that return a bunch of tuples with similar values, for
> example range queries. The reason for keeping them clustered on inserts
> is to reduce the need to run CLUSTER as often.
>
> It doesn't come without a cost, however. In the worst case, there never
> is room for new inserts on pages, and each insert needs to do one extra
> I/O to fetch the optimal heap page where the insert should go, see that
> there's no room, and then insert somewhere else. Using a non-zero
> fillfactor helps, but even when there is room on the page, it's often
> cheaper to just append to the end of the table and running CLUSTER at
> night for example, than do random access to insert to the "right" pages
> in the heap.
>
> So, should we have a WITH-option on the table to enable/disable this
> feature, and what would be the default?
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com

>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
> subscribe-nomail command to majordomo(at)postgresql(dot)org so that your
> message can get through to the mailing list cleanly

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2007-05-18 13:59:37 Re: Lack of urgency in 8.3 reviewing
Previous Message Alvaro Herrera 2007-05-18 13:53:17 Re: Lack of urgency in 8.3 reviewing

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2007-05-18 14:45:01 Re: CREATE TABLE LIKE INCLUDING INDEXES support
Previous Message Pavel Stehule 2007-05-18 13:02:01 Re: Updateable cursors patch