partitioning - changing a slot's descriptor is expensive

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Langote <amitlangote09(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: partitioning - changing a slot's descriptor is expensive
Date: 2018-06-27 05:09:28
Message-ID: 20180627050928.jeejsw5eub2ivilh@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

(sorry if I CCed the wrong folks, the code has changed a fair bit)

I noticed that several places in the partitioning code look like:

	/*
	 * If the tuple has been routed, it's been converted to the
	 * partition's rowtype, which might differ from the root
	 * table's.  We must convert it back to the root table's
	 * rowtype so that val_desc shown error message matches the
	 * input tuple.
	 */
	if (resultRelInfo->ri_PartitionRoot)
	{
		TableTuple	tuple = ExecFetchSlotTuple(slot);
		TupleConversionMap *map;

		rel = resultRelInfo->ri_PartitionRoot;
		tupdesc = RelationGetDescr(rel);
		/* a reverse map */
		map = convert_tuples_by_name(orig_tupdesc, tupdesc,
									 gettext_noop("could not convert row type"));
		if (map != NULL)
		{
			tuple = do_convert_tuple(tuple, map);
			ExecSetSlotDescriptor(slot, tupdesc);
			ExecStoreTuple(tuple, slot, InvalidBuffer, false);
		}
	}

Unfortunately, calling ExecSetSlotDescriptor() is far from cheap: it has
to reallocate the values/isnull arrays (and potentially do desc pinning
etc). Nor is it always cheap to call ExecFetchSlotTuple(slot) - it might
be a virtual slot, in which case a tuple has to be formed first. Calling
heap_deform_tuple(), as done in do_convert_tuple(), might also be
superfluous work, because the input tuple might already be entirely
deformed in the input slot.

I think it'd be good to rewrite the code so there's an input and an
output slot that each will keep their slot descriptors set.
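
To make the idea concrete, here is a rough, standalone sketch of the
two-slot pattern. It is a toy model only: ToyDesc, ToySlot,
toy_set_descriptor() and toy_convert() are hypothetical names made up
for illustration and are not PostgreSQL APIs. The assumption is that
each slot gets its descriptor (and thus its values/isnull arrays) set
once, and per-row conversion just copies already-deformed columns
through an attribute map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins, NOT PostgreSQL's TupleDesc / TupleTableSlot. */
typedef struct ToyDesc
{
	int			natts;			/* number of columns */
} ToyDesc;

typedef struct ToySlot
{
	const ToyDesc *desc;
	long	   *values;			/* one entry per column */
	bool	   *isnull;			/* one entry per column */
	int			nallocs;		/* how often the arrays were rebuilt */
} ToySlot;

/*
 * Hypothetical analogue of changing a slot's descriptor: the per-column
 * arrays are thrown away and allocated again.  This is the cost we want
 * to pay once per slot instead of once per row.
 */
static void
toy_set_descriptor(ToySlot *slot, const ToyDesc *desc)
{
	free(slot->values);
	free(slot->isnull);
	slot->values = calloc(desc->natts, sizeof(long));
	slot->isnull = calloc(desc->natts, sizeof(bool));
	slot->desc = desc;
	slot->nallocs++;
}

/*
 * Copy columns from the input slot to the output slot.  attrmap[i] is
 * the input column feeding output column i, or -1 if that output column
 * has no source and must be NULL.  Neither slot's descriptor changes.
 */
static void
toy_convert(const ToySlot *in, ToySlot *out, const int *attrmap)
{
	for (int i = 0; i < out->desc->natts; i++)
	{
		int			src = attrmap[i];

		if (src < 0)
		{
			out->values[i] = 0;
			out->isnull[i] = true;
		}
		else
		{
			assert(src < in->desc->natts);
			out->values[i] = in->values[src];
			out->isnull[i] = in->isnull[src];
		}
	}
}
```

A caller would call toy_set_descriptor() once for the partition-shaped
input slot and once for the root-shaped output slot, then call
toy_convert() per row. In this model the allocation count of each slot
stays at one however many rows are converted.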

Besides the performance of the code as it stands, this will also be
important for pluggable storage (as we'd otherwise get a *lot* of
back/forth conversions between tuple formats).

Anybody interested in writing a patch that redoes this?

Greetings,

Andres Freund
