Re: TODO idea - implicit constraints across child tables with a common column as primary key (but obviously not a shared index)

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: "Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TODO idea - implicit constraints across child tables with a common column as primary key (but obviously not a shared index)
Date: 2007-04-23 18:46:52
Message-ID: 87ps5v2ajn.fsf@oxford.xeocode.com
Lists: pgsql-hackers


"Andrew Hammond" <andrew(dot)george(dot)hammond(at)gmail(dot)com> writes:

> If you have a table with a bunch of children, and these children all
> have a primary key which is generated from the same sequence, assuming
> that you're partitioning based on date (ie, this is a transaction
> record table), it would be nice if the planner could spot that all
> tables have a primary key on a column used as a join condition, check
> the min / max to see if there is overlap between tables, then apply
> CBE as if constraints existed.

The problem is that it's not really true that sequences and time move
together. It's quite possible to have two transactions which both start just
before the date-based partition cutoff but have one land in each partition
with the greater sequence number landing in the old partition.

It would be rare (but still possible) if you always insert using quick
autocommitted inserts with nextval() in a values list. But it would be quite
likely if you use one of the other coding styles such as doing one query to
look up the nextval() and then doing various inserts based on that value in
multiple statements within a single transaction.
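As a rough illustration of that interleaving, here is a minimal Python sketch. It is not PostgreSQL code: `seq` stands in for the shared sequence, and `CUTOFF`, `partition_for` and the timestamps are hypothetical values chosen only to show the ordering problem.

```python
import itertools
from datetime import datetime

# Hypothetical partition boundary between the "old" and "new" child tables.
CUTOFF = datetime(2007, 5, 1)

# Stand-in for a shared sequence: each next() behaves like one nextval() call.
seq = itertools.count(1)


def partition_for(ts):
    """Route a row to a child table based on its timestamp."""
    return "old" if ts < CUTOFF else "new"


partitions = {"old": [], "new": []}

# Both transactions fetch their ids just before the cutoff.
id_a = next(seq)  # transaction A gets the lower id
id_b = next(seq)  # transaction B gets the greater id

# B inserts immediately; A does more work first and inserts after the cutoff.
ts_b = datetime(2007, 4, 30, 23, 59, 59)
ts_a = datetime(2007, 5, 1, 0, 0, 1)
partitions[partition_for(ts_b)].append(id_b)
partitions[partition_for(ts_a)].append(id_a)

# The greater id sits in the old partition, so the id ranges of the two
# partitions overlap and cannot be treated as implied constraints.
ranges_overlap = max(partitions["old"]) > min(partitions["new"])
```

Here the old partition ends up holding the greater id and the new partition the lower one, so a min/max check taken at any one moment says nothing reliable about rows that arrive later.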

What I've been considering instead is using the statistics. If we provided a
way to mark partitions read-only, then once a table (or partition) is so marked
a subsequent VACUUM ANALYZE could mark the resulting statistics as
"authoritative". Now that we have plan invalidation we could use this kind of
information in planning.

The main data from the statistics that are of interest here are the extreme
values of the histogram. If a query isn't interested in any values between
those extremes then we can exclude the partition entirely.

This has a number of nice properties. It requires little additional work for
the DBA and "read-only" is a nice simple concept for a DBA to understand. It's
even a useful feature for other purposes. It can also catch a lot more cases
than the one you describe. In particular it would eliminate the parent table
if it has no rows, which gives us a chance to eliminate the Append node
altogether.
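
The exclusion test described above could look something like the following Python sketch. Everything in it is an assumption made for illustration: `PartitionStats`, the `authoritative` flag, `can_exclude` and `prune` are hypothetical names, not existing PostgreSQL structures or planner functions, and the sample partitions are made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionStats:
    name: str
    # Hypothetical flag: table is read-only and was analyzed after being marked.
    authoritative: bool
    # Extreme histogram values; None means the table has no rows.
    min_value: Optional[int]
    max_value: Optional[int]


def can_exclude(stats: PartitionStats, lo: int, hi: int) -> bool:
    """Return True if no row in the partition can fall within [lo, hi]."""
    if not stats.authoritative:
        # Statistics may be stale, so they prove nothing.
        return False
    if stats.min_value is None or stats.max_value is None:
        # Authoritatively empty, e.g. a parent table with no rows.
        return True
    return stats.max_value < lo or stats.min_value > hi


def prune(partitions: List[PartitionStats], lo: int, hi: int) -> List[str]:
    """Names of the partitions that still have to be scanned."""
    return [p.name for p in partitions if not can_exclude(p, lo, hi)]


parts = [
    PartitionStats("parent", True, None, None),
    PartitionStats("p2007_03", True, 1, 1000),
    # Still being written to, so its statistics are not authoritative.
    PartitionStats("p2007_04", False, 900, 2000),
]

only_current = prune(parts, 1500, 1600)
both_children = prune(parts, 950, 960)
```

With this sample data, `only_current` keeps just `p2007_04` and `both_children` keeps both child tables; the empty parent is dropped in both cases. A partition whose statistics are not authoritative is never excluded, which is the conservative choice.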

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
