Re: partitioning - changing a slot's descriptor is expensive

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: partitioning - changing a slot's descriptor is expensive
Date: 2018-06-29 06:20:06
Message-ID: CAFjFpRdg7DnOZ0rS9+Eo=A3yKz23R6-Yhk-grvfDonzYEqDgkw@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jun 29, 2018 at 11:29 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2018-06-29 11:20:53 +0530, Amit Khandekar wrote:
>> On 27 June 2018 at 18:33, Ashutosh Bapat
>> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> > On Wed, Jun 27, 2018 at 10:39 AM, Andres Freund <andres(at)anarazel(dot)de> wrote:
>> >> Unfortunately calling ExecSetSlotDescriptor() is far from cheap, it has
>> >> to reallocate the values/isnull arrays (and potentially do desc pinning
>> >> etc).
>> >
>> > I bumped into this code yesterday while looking at ExecFetchSlotTuple
>> > and ExecMaterializeSlot usages. I was wondering the same.
>> >
>> > ExecSetSlotDescriptor() always frees the tts_values and tts_isnull
>> > arrays even if the tuple descriptor being set has the same number of
>> > attributes as the previous one. Most of the time that will be the
>> > case. I think we should optimize that case.
>>
>> +1
>
> This doesn't strike me as a great optimization. Any place where we change
> descriptors with any regularity, we're doing something wrong or at least
> decidedly suboptimal. We shouldn't react to that by optimizing the wrong
> thing, we should do the wrong thing less often.

I agree with all of that, but I think this tiny optimization can be
done independently of the partitioning problem as well.
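
To make the idea concrete, here is a rough, self-contained sketch of
the kind of check I mean. It is not the actual ExecSetSlotDescriptor()
code: ToyDatum, ToyTupleDesc, ToySlot and toy_set_descriptor() are
hypothetical stand-ins invented for illustration, and it ignores
descriptor pinning and memory contexts entirely.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are NOT the real PostgreSQL definitions. */
typedef uintptr_t ToyDatum;

typedef struct ToyTupleDesc
{
    int         natts;      /* number of attributes */
} ToyTupleDesc;

typedef struct ToySlot
{
    ToyTupleDesc *desc;     /* descriptor currently set on the slot */
    ToyDatum   *values;     /* per-attribute values */
    bool       *isnull;     /* per-attribute null flags */
    int         nalloc;     /* number of entries the arrays can hold */
} ToySlot;

/*
 * Set a new descriptor on the slot.  The arrays are freed and reallocated
 * only when the attribute count differs from what is already allocated.
 * Returns true if the arrays were (re)allocated, false if they were reused.
 */
static bool
toy_set_descriptor(ToySlot *slot, ToyTupleDesc *desc)
{
    bool        reallocated = false;

    if (slot->values == NULL || slot->nalloc != desc->natts)
    {
        free(slot->values);
        free(slot->isnull);
        slot->values = malloc((size_t) desc->natts * sizeof(ToyDatum));
        slot->isnull = malloc((size_t) desc->natts * sizeof(bool));
        if (desc->natts > 0 && (slot->values == NULL || slot->isnull == NULL))
            abort();        /* out of memory; a real implementation would report an error */
        slot->nalloc = desc->natts;
        reallocated = true;
    }

    slot->desc = desc;
    return reallocated;
}
```

With a zero-initialized ToySlot, the first call allocates the arrays, a
second call with another ten-attribute descriptor reuses them, and only
a descriptor with a different attribute count forces a reallocation.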

>
>
>> >> I think it'd be good to rewrite the code so there's an input and an
>> >> output slot that each will keep their slot descriptors set.
>> >
>> > +1 for that.
>> >
>> > But I am worried that the code will have to create a thousand slots
>> > if there are a thousand partitions. I think we will need to see how
>> > much effect that has.
>>
>> I agree that it does not make sense to create as many slots, if at all
>> we go by this approach. Suppose the partitioned table is the only one
>> having different tuple descriptor, and rest of the partitions have
>> same tuple descriptor. In that case, keep track of unique descriptors,
>> and allocate a slot per unique descriptor.
>
> Why? Compared to the rest of the structures created, a slot is not
> particularly expensive? I don't see what you're optimizing here.

The size of a slot depends upon the number of attributes of the table. A
ten-column table will take 80 bytes for the datum array and 10 bytes (+
padding) for the isnull array, which for a thousand partitions would
translate to about 90KB of memory. That may be small compared to the
relation cache memory consumed, but it's some memory.
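
For reference, the arithmetic behind that estimate can be sketched as
follows. The helper names are hypothetical (not PostgreSQL functions),
and the numbers assume an 8-byte Datum and a 1-byte bool; padding and
the slot structure itself are ignored.

```c
#include <stddef.h>

/*
 * Bytes needed for one slot's values and isnull arrays, ignoring padding.
 * datum_size and isnull_size are passed in explicitly because they are
 * platform-dependent assumptions.
 */
static size_t
toy_slot_array_bytes(int natts, size_t datum_size, size_t isnull_size)
{
    return (size_t) natts * datum_size + (size_t) natts * isnull_size;
}

/* Total across nslots slots that all have the same attribute count. */
static size_t
toy_total_array_bytes(int natts, size_t datum_size, size_t isnull_size,
                      int nslots)
{
    return toy_slot_array_bytes(natts, datum_size, isnull_size) * (size_t) nslots;
}
```

For ten attributes this gives 80 + 10 = 90 bytes per slot, and 90,000
bytes for a thousand slots.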

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
