Re: Partitioning feature ...

From: Emmanuel Cecchet <manu(at)asterdata(dot)com>
To: Kedar Potdar <kedar(dot)potdar(at)gmail(dot)com>
Cc: <pgsql-hackers(at)postgresql(dot)org>, Amit Gupta <amit(dot)pc(dot)gupta(at)gmail(dot)com>
Subject: Re: Partitioning feature ...
Date: 2009-03-26 22:08:02
Message-ID: 49CBFCC2.9050502@asterdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Kedar,

First of all, congratulations for the excellent work.
I have some comments and questions.

In get_relevent_partition (btw, relevant is spelled with an a) you are
maintaining 2 lists. I guess this is only useful for multi-column
partitions, right?
If you have a single column partition (without subpartitions), I think
you could directly return on the first match (without maintaining any
list) since you guarantee that there is no overlap between partitions.
A simple but effective optimization for inserts consists of caching the
last partition used (consecutive inserts often go to the same partition)
and try it first before going through the whole loop.

The update trigger should first check if the tuple needs to be moved. If
the updated tuple still matches the constraints of the partitions it
will not have to be moved and will save a lot of overhead.

The COPY operation should probably be optimized to use the same code as
the one in the insert trigger for partitioned tables. I guess some code
could be factorized in COPY to make the inserts more efficient.

The current trigger approach should prevent other triggers to be added
to the table, or you should make sure that the partition trigger is
always the one to execute last.

As we don't have automatic partition creation, it would be interesting
to have an optional mechanism to deal with tuples that don't match any
partition (very useful when you do bulk insert and some new data require
a new partition). Having a simple overflow partition or an error logging
mechanism would definitely help to identify these tuples and prevent
things like large COPY operations to fail.

Looking forward to your responses,
Emmanuel

>
> We are implementing table partitioning feature to support Range and
> Hash partitions. Please find attached, the WIP patch and test-cases.
>
> The syntax used conforms to most of the suggestions mentioned in
> http://archives.postgresql.org/pgsql-hackers/2008-01/msg00413.php,
> barring the following:
> -- Specification of partition names is optional. System will be able
> to generate partition names in such cases.
> -- Sub partitioning
>
> We are maintaining a system catalog(pg_partition) for partition
> meta-data. System will look-up this table to find appropriate
> partition to operate on.
> System internally uses low-level 'C' triggers to row-movement.
>
> Regards,
> --
> Kedar.
>
>
>
> ------------------------------------------------------------------------
>
>
>

--
Emmanuel Cecchet
Aster Data Systems
Web: http://www.asterdata.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2009-03-26 22:28:00 Re: gettext, plural form and translation
Previous Message Bernd Helmle 2009-03-26 21:29:16 Re: maintenance_work_mem and autovacuum