Re: partition routing layering in nodeModifyTable.c

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: partition routing layering in nodeModifyTable.c
Date: 2020-10-19 11:55:28
Message-ID: 1d6516c6-9766-3d35-9acd-6358aa51a1ce@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 17/10/2020 18:54, Alvaro Herrera wrote:
> On 2020-Oct-17, Amit Langote wrote:
>> As I said in my previous email, I don't see how we can make
>> initializing the map any lazier than it already is. If a partition
>> has a different tuple descriptor than the root table, then we know for
>> sure that any tuples that are routed to it will need to be converted
>> from the root tuple format to its tuple format, so we might as well
>> build the map when the ResultRelInfo is built. If no tuple lands into
>> a partition, we would neither build its ResultRelInfo nor the map.
>> With the current arrangement, if the map field is NULL, it simply
>> means that the partition has the same tuple format as the root table.
>
> I see -- makes sense.

It's probably true that there's no performance gain from initializing
them more lazily. But the reasoning and logic around the initialization
is complicated. After tracing through various path through the code, I'm
convinced enough that it's correct, or at least these patches didn't
break it, but I still think some sort of lazy initialization on first
use would make it more readable. Or perhaps there's some other
refactoring we could do.

Perhaps we should have a magic TupleConversionMap value to mean "no
conversion required". NULL could then mean "not initialized yet".

>> On Fri, Oct 16, 2020 at 11:45 PM Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> wrote:
>
>>> BTW it is curious that ExecInitRoutingInfo is called both in
>>> ExecInitPartitionInfo() (from ExecFindPartition when the ResultRelInfo
>>> for the partition is not found) *and* from ExecFindPartition again, when
>>> the ResultRelInfo for the partition *is* found. Doesn't this mean that
>>> ri_PartitionInfo is set up twice for the same partition?
>>
>> No. ExecFindPartition() directly calls ExecInitRoutingInfo() only for
>> reused update result relations, that too, only the first time a tuple
>> lands into such a partition. For the subsequent tuples that land into
>> the same partition, ExecFindPartition() will be able to find that
>> ResultRelInfo in the proute->partitions[] array. All ResultRelInfos
>> in that array are assumed to have been processed by
>> ExecInitRoutingInfo().
>
> Doh, right, sorry, I was misreading the if/else maze there.

I think that demonstrates my point that the logic is hard to follow :-).

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2020-10-19 12:07:01 Re: Use PointerGetDatum(cstring_to_text_with_len()) instead of CStringGetTextDatum() to avoid duplicate strlen
Previous Message Heikki Linnakangas 2020-10-19 11:48:26 Re: partition routing layering in nodeModifyTable.c