Re: partitioning - changing a slot's descriptor is expensive

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Langote <amitlangote09(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: partitioning - changing a slot's descriptor is expensive
Date: 2018-06-27 13:03:29
Message-ID: CAFjFpRcxb=d9=wdDxterEssKeLd2uAWo9578hRntBptxLkrOYg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jun 27, 2018 at 10:39 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi,
>
> (sorry if I CCed the wrong folks, the code has changed a fair bit)
>
> I noticed that several places in the partitioning code look like:
>
> /*
> * If the tuple has been routed, it's been converted to the
> * partition's rowtype, which might differ from the root
> * table's. We must convert it back to the root table's
> * rowtype so that val_desc shown error message matches the
> * input tuple.
> */
> if (resultRelInfo->ri_PartitionRoot)
> {
> TableTuple tuple = ExecFetchSlotTuple(slot);
> TupleConversionMap *map;
>
> rel = resultRelInfo->ri_PartitionRoot;
> tupdesc = RelationGetDescr(rel);
> /* a reverse map */
> map = convert_tuples_by_name(orig_tupdesc, tupdesc,
> gettext_noop("could not convert row type"));
> if (map != NULL)
> {
> tuple = do_convert_tuple(tuple, map);
> ExecSetSlotDescriptor(slot, tupdesc);
> ExecStoreTuple(tuple, slot, InvalidBuffer, false);
> }
> }
>
>
> Unfortunately calling ExecSetSlotDescriptor() is far from cheap, it has
> to reallocate the values/isnull arrays (and potentially do desc pinning
> etc).

I bumped into this code yesterday while looking at ExecFetchSlotTuple
and ExecMaterializeSlot usages. I was wondering the same.

ExecSetSlotDescriptor() always frees tts_values and tts_isnull array
even if the tuple descriptor being set has same number of attributes
as previous one. Most of the times that will be the case. I think we
should optimize that case. If you agree, I will submit a patch for
that.

Amit Langote has clarified that this doesn't happen often since
partitioned tables will not require tuple conversion usually. But I
agree that in pathological case this is a performance eater and should
be fixed.

> Nor is it always cheap to call ExecFetchSlotTuple(slot) - it might
> be a virtual slot. Calling heap_deform_tuple(), as done in
> do_convert_tuple(), might be superfluous work, because the input tuple
> might already be entirely deformed in the input slot.
>
> I think it'd be good to rewrite the code so there's an input and an
> output slot that each will keep their slot descriptors set.

+1 for that.

But I am worried that the code will have to create thousand slots if
there are thousand partitions. I think we will need to see how much
effect that has.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2018-06-27 13:06:51 Re: Few cosmetic suggestions for commit 16828d5c (Fast Alter Table Add Column...)
Previous Message Daniel Gustafsson 2018-06-27 12:32:19 Re: [HACKERS] Support for Secure Transport SSL library on macOS as OpenSSL alternative