Re: partition routing layering in nodeModifyTable.c

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: partition routing layering in nodeModifyTable.c
Date: 2020-10-20 12:57:31
Message-ID: CA+HiwqEs18rLT4OivHtLhUOZ1XU5qFdTyMk8qG7yEC8OJ49G_A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Oct 19, 2020 at 8:55 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 17/10/2020 18:54, Alvaro Herrera wrote:
> > On 2020-Oct-17, Amit Langote wrote:
> >> As I said in my previous email, I don't see how we can make
> >> initializing the map any lazier than it already is. If a partition
> >> has a different tuple descriptor than the root table, then we know for
> >> sure that any tuples that are routed to it will need to be converted
> >> from the root tuple format to its tuple format, so we might as well
> >> build the map when the ResultRelInfo is built. If no tuple lands into
> >> a partition, we would neither build its ResultRelInfo nor the map.
> >> With the current arrangement, if the map field is NULL, it simply
> >> means that the partition has the same tuple format as the root table.
> >
> > I see -- makes sense.
>
> It's probably true that there's no performance gain from initializing
> them more lazily. But the reasoning and logic around the initialization
> is complicated. After tracing through various path through the code, I'm
> convinced enough that it's correct, or at least these patches didn't
> break it, but I still think some sort of lazy initialization on first
> use would make it more readable. Or perhaps there's some other
> refactoring we could do.

So the other patch I have mentioned is about lazy initialization of
the ResultRelInfo itself, not the individual fields, but maybe with
enough refactoring we can get the latter too.

Currently, ExecInitModifyTable() performs ExecInitResultRelation() for
all relations in ModifyTable.resultRelations, which sets most but not
all ResultRelInfo fields (whatever InitResultRelInfo() can set),
followed by initializing some other fields based on the contents of
the ModifyTable plan. My patch moves those two steps into a function
ExecBuildResultRelation() which is called lazily during
ExecModifyTable() for a given result relation on the first tuple
produced by that relation's plan. Actually, there's a "getter" named
ExecGetResultRelation() which first consults es_result_relations[rti -
1] for the requested relation and if it's NULL then calls
ExecBuildResultRelation().

Would you mind taking a look at that as a starting point? I am
thinking there's enough relevant discussion here that I should post
the rebased version of that patch here.

> Perhaps we should have a magic TupleConversionMap value to mean "no
> conversion required". NULL could then mean "not initialized yet".

Perhaps, a TupleConversionMap with its attrMap set to NULL means "no
conversion required".

--
Amit Langote
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Drouvot, Bertrand 2020-10-20 12:58:10 Re: Add Information during standby recovery conflicts
Previous Message Andrew Dunstan 2020-10-20 12:55:37 Re: Make procedure OUT parameters work with JDBC