Re: [HACKERS] Maintaining cluster order on insert

From: stark <stark(at)enterprisedb(dot)com>
To: genekhart(at)gmail(dot)com
Cc: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, "Heikki Linnakangas" <heikki(at)enterprisedb(dot)com>, pgsql-patches(at)postgresql(dot)org
Subject: Re: [HACKERS] Maintaining cluster order on insert
Date: 2006-08-10 10:59:11
Message-ID: 87d5b87t6o.fsf@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches


Gene <genekhart(at)gmail(dot)com> writes:

> "Your best bet might be to partition the table into two subtables, one
> with "stable" data and one with the fresh data, and transfer rows from
> one to the other once they get stable. Storage density in the "fresh"
> part would be poor, but it should be small enough you don't care."
>
> This sounds interesting, I could create a RULE/INSERT on the unstable table,
> I will know during the update if it is ready to be put in the stable table.
> What would be an efficient way to do the transfer? Since the updates occur
> somewhat randomly, wouldnt the tuples in the stable table then be out of
> natural timestamp order?

You may find it easier to handle some of the logic in a low level application
layer or layer of stored procedures rather than trying to make it entirely
transparent with rules. If you do want it to be transparent you might also
consider whether you want triggers instead of rules.

Another direction you might want to consider is whether the columns that
you're updating would be more normalized in a separate table. You might really
want to have a record of those past states as well. So you might find having
three records in this other table for each of your regular records in your
main table might actually work out better.

Even if you only have a 1-1 relationship sometimes this kind of horizontal
partitioning (or do people consider this vertical partitioning?) is still
worthwhile. If the columns being updated are very small or often not needed at
all then it may be reasonably efficient to look them up separately and still
let you store the bulk of the data efficiently and access it in a fast
sequential scan.

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-08-10 12:57:48 Re: Forcing current WAL file to be archived
Previous Message Heikki Linnakangas 2006-08-10 08:23:39 Re: [PATCHES] Maintaining cluster order on insert

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2006-08-10 12:57:48 Re: Forcing current WAL file to be archived
Previous Message Heikki Linnakangas 2006-08-10 08:23:39 Re: [PATCHES] Maintaining cluster order on insert