Re: Declarative partitioning - another take

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Declarative partitioning - another take
Date: 2016-11-17 16:43:14
Message-ID: CA+TgmoYvp6DD1jpwb9sNe08E7jSFEFky32TJU+sJ8OStHYW3nA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 17, 2016 at 6:27 AM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> OK, I will share the performance results soon.

Thanks.

>> Also, in 0006:
>>
>> - I doubt that PartitionTreeNodeData's header comment will survive
>> contact with pgindent.
>
> Fixed by adding "/* ----" at the top of the comment.

OK. Hopefully you also tested that. In general, with a patch set
this large, the more you can do to make it pgindent-clean, the better.
Ideally none of your code would get changed by pgindent, or at least
not the new files. But note that you will probably have have to
update typedefs.list if you actually want to be able to run it without
having it mangle things that involve your new structs.

>> - The code in make_modifytable() to swap out the rte_array for a fake
>> one looks like an unacceptable kludge. I don't know offhand what a
>> better design would look like, but what you've got is really ugly.
>
> Agree that it looks horrible. The problem is we don't add partition
> (child table) RTEs when planning an insert on the parent and FDW
> partitions can't do without some planner handling - planForeignModify()
> expects a valid PlannerInfo for deparsing target lists (basically, to be
> able to use planner_rt_fetch()).
>
> Any design beside this kludge would perhaps mean that we will be adding
> valid partition RTEs at some earlier planning stage much like what
> expand_inherited_rtentry() does during non-insert queries. Perhaps, we
> could have expand_inherited_rtentry() make an exception in partitioned
> tables' case and create root->append_rel_list members even in the insert
> case. We could then go over the append_rel_list in make_modifytable() to
> get the valid RT index of foreign child tables to pass to
> PlanForeignModify(). Using append_rel_list for insert planning is perhaps
> equally ugly though. Thoughts?

If it's only needed for foreign tables, how about for v1 we just throw
an error and say errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("cannot route inserted tuples to a foreign table") for now. We
can come back and fix it later. Doing more inheritance expansion
early is probably not a good idea because it likely sucks for
performance, and that's especially unfortunate if it's only needed for
foreign tables. Coming up with some new FDW API or some modification
to the existing one is probably better, but I don't really want to get
hung up on that right now.

>> - I don't understand how it can be right for get_partition_for_tuple()
>> to be emitting an error that says "range partition key contains null".
>> A defective partition key should be detected at partition creation
>> time, not in the middle of tuple routing.
>
> That error is emitted when the partition key of the *input tuple* is found
> to contain NULLs. Maybe we need to consider something like how btree
> indexes treat NULLs - order NULLs either before or after all the non-NULL
> values based on NULLS FIRST/LAST config. Thoughts?

Well, I'd say my confusion suggests that the error message needs to be
clearer about what the problem is. I think this is basically a case
of there being no partition for the given row, so maybe just arrange
to use that same message here ("no partition of relation \"%s\" found
for row").

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2016-11-17 17:05:45 Re: Hash Indexes
Previous Message Alvaro Herrera 2016-11-17 16:37:57 Re: Avoiding pin scan during btree vacuum