Re: effective SELECT from child tables

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Martijn van Oosterhout <kleptog(at)svana(dot)org>, g(at)pervasive(dot)com
Cc: mark(at)mark(dot)mielke(dot)cc, Simon Riggs <simon(at)2ndquadrant(dot)com>, Ilia Kantor <ilia(at)obnovlenie(dot)ru>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: effective SELECT from child tables
Date: 2005-10-01 15:57:27
Message-ID: 20051001155727.GB40138@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

To clarify, this is a hard-coded implementation of what I'm asking for:
http://cvs.distributed.net/viewcvs.cgi/stats-sql/logdb/ in a nutshell:

CREATE TABLE log_other (
project_id smallint NOT NULL
...
)

CREATE TABLE log_8 (
-- No project_id
...
)
CREATE TABLE log_24, log_25, log_5...
CREATE VIEW log AS
SELECT * FROM log_other
UNION ALL SELECT 8 AS project_id, * FROM log_8
...

So the end result is that for cases where project_id is 5, 8, 24, or 25,
the data will be stored in tables that don't have the project_id.

If I were to use this on the main table for
http://stats.distributed.net, which has ~130M rows, I would be able to
save 130M*4 bytes (4 instead of 2 due to alignment), or 520MB. The logdb
will have many times that number of rows, so the savings will be even
larger.

Note that this technique wouldn't help at all for something like date
partitioning, because you have to store the date in the partitioned
table.

On Sat, Oct 01, 2005 at 04:35:49PM +0200, Martijn van Oosterhout wrote:
> On Sat, Oct 01, 2005 at 10:05:22AM -0400, mark(at)mark(dot)mielke(dot)cc wrote:
> > It has the 'side or additional benefit' being requested here. The ability
> > to filter the child table by some attribute. For example, if the child
> > tables are used for partitioning, and the attribute were to keep a date
> > range, the field restriction optimization could be used to automatically
> > determine the set of tables to use for the date range specified. With
> > such a change, it would even work automatically if the date ranges
> > overlapped for some reason. Selecting a table name by date is hacky. This
> > sort of solution would be a general solution to the problem.
>
> This is what "Constraint Exclusion" does. It uses CHECK constraints on
> a table to filter out tables that obviously don't apply to a query.
> It's just the the specific case of "tableoid = XXX" is not supported
> right now.
>
> Have a nice day,
> --
> Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> > tool for doing 5% of the work and then sitting around waiting for someone
> > else to do the other 95% so you can sue them.

--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-10-01 15:57:48 Re: [PATCHES] Proposed patch for sequence-renaming problems
Previous Message Tom Lane 2005-10-01 15:55:12 Re: Proposed patch for sequence-renaming problems