Re: Partitioning option for COPY

From: Jan Urbański <wulczer(at)wulczer(dot)org>
To: Emmanuel Cecchet <manu(at)asterdata(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partitioning option for COPY
Date: 2009-11-22 00:00:05
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Emmanuel Cecchet wrote:
> Hi Jan,
> Here is the updated patch.
> Note that the new code in trigger is a copy/paste of the before row
> insert trigger code modified to use the pointers of the after row
> trigger functions.


ok, this version applied, compiled and ran the regression tests fine. I
tried a few things and was not able to break it this time.

A couple of nitpicks first:

o) the route_tuple_to_child recurses to child tables of child tables,
which is undocumented and requires a check_stack_depth() call if it's
really desirable

o) the error messages given when a trigger modifies the tuple should be
one sentence, I suggest dropping the "Aborting insert" part

o) there are two places with "Close the relation but keep the lock"
comments. Why is in necessary to keep the locks? I confess I don't know
why *wouldn't* it be necessary, but maybe the comment could explain
that? Or is it just my lack of understanding and it should be obvious
that the lock needs to be kept?

o) the result of relation_open is explicitly cast to Relation, the
result of try_relation_open is not (a minor gripe)

And a couple of more important things:

o) the code added in trigger.c (ExecARInsertTriggersNow) is copy/pasted
from just above, I guess there was a reason why you needed that code,
but I also suspect that's a string indication that something's wrong
with the abstractions in your patch. Again I don't really know how else
you could achieve what you want. It just looks fishy if you need to
modify trigger.c to add an option to COPY.

o) publicizing ExecRelCheck might also indicate a problem, but I guess
that can be defended, as the patch is basically based on using that
function for each incoming tuple

o) the LRU OID cache is a separate optimisation that could be
separated from the patch. I didn't do any performance tests, and I trust
that a cache like that helps with some workloads, but I think we could
do a better effort that a simplistic cache like that. Also, I'm not 100%
sure it's OK to just stick it into CacheMemoryContext... Maybe it could
go into the COPY statement context? You said you don't want to start
with a cold cache always, but OTOH if you're loading into different
tables in the same backend, the cache will actually hurt...

[thinks of something really bad... types up a quick test...]

Oh, actually, the cache is outright *wrong*, as the attached test6.sql
shows. Ugh, let's just forget about that LRU cache for now.

o) the patch could use some more docs, especially about descending into
child tables.

o) my main concern is still valid: the design was never agreed upon.
The approach of using inheritance info for automatic partitioning is, at
least IMHO, too restricted. Regular INSERTs won't get routed to child
tables. Data from writable CTEs won't get routed. People wanting to do
partitioning on something else that constraints are stuffed.

I strongly suspect the patch will get rejected on the grounds of lack of
community agreement on partitioning, but I'd hate to see your work
wasted. It's not too late to open a discussion on how automatic
partitioning could work (or start working out a common proposal with the
people discussing in the "Syntax for partitioning" thread).

Marking as Waiting on Author, although I'd really like to see a solid
design being agreed upon, and then the code.


Attachment Content-Type Size
test6.sql text/plain 468 bytes

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Alex Hunsaker 2009-11-22 00:05:10 Re: Ignoring white space in regression tests really a good idea?
Previous Message Peter Eisentraut 2009-11-21 23:59:18 Re: UTF8 with BOM support in psql