Re: Removed extra memory allocations from create_list_bounds

From: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
To: Zhihong Yu <zyu(at)yugabyte(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removed extra memory allocations from create_list_bounds
Date: 2021-05-17 14:52:25
Message-ID: CAMm1aWYr9=qPCTeNzVCEbF=Nr=RLSnt_vdCDTm9VjLS5L5JuOw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> > While working on [1], I observed that extra memory is allocated in
> > [1] https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV

I am really sorry for this. I wanted to point to the thread subjected
to 'Multi-Column List Partitioning'.

> If it's worth counting list elements in advance, then you can also allocate the
> PartitionListValue as a single chunk, rather than palloc in a loop.
> This may help large partition heirarchies.
>
> And the same thing in create_hash_bounds with hbounds.

I agree and thanks for creating those patches. I am not able to apply
the patch on the latest HEAD. Kindly check and upload the modified
patches.

> I'm not able to detect that this is saving more than about ~1% less RAM, to
> create or select from 1000 partitions, probably because other data structures
> are using much more, and savings here are relatively small.

Yes it does not save huge memory but it's an improvement.

> I'm going to add to the next CF. You can add yourself as an author, and watch
> that the patch passes tests in cfbot.
> https://commitfest.postgresql.org/
> http://cfbot.cputube.org/

Thanks for creating the commitfest entry.

> Since the function returns the total number of non null bound values, should it be named get_non_null_list_bounds_count ?
> It can also be named get_count_of_... but that's longer.

Changed it to 'get_non_null_list_bounds_count'.

> The palloc() call would take place even if ndatums is 0. I think in that case, palloc() doesn't need to be called.

I feel there is no such case where the 'ndatums' is 0 because as we
can see below, there is an assert in the 'partition_bounds_create'
function from where we call the 'create_list_bounds' function. Kindly
provide such a case if I am wrong.

PartitionBoundInfo
partition_bounds_create(PartitionBoundSpec **boundspecs, int nparts,
PartitionKey key, int **mapping)
{
int i;

Assert(nparts > 0);

Thanks & Regards,
Nitin Jadhav
On Sun, May 16, 2021 at 10:52 PM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:
>
>
>
> On Sun, May 16, 2021 at 10:00 AM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
>>
>> On Sat, May 15, 2021 at 02:40:45PM +0530, Nitin Jadhav wrote:
>> > While working on [1], I observed that extra memory is allocated in
>> > [1] https://mail.google.com/mail/u/2/#search/multi+column+list/KtbxLxgZZTjRxNrBWvmHzDTHXCHLssSprg?compose=CllgCHrjDqKgWCBNMmLqhzKhmrvHhSRlRVZxPCVcLkLmFQwrccpTpqLNgbWqKkTkTFCHMtZjWnV
>>
>> This is a link to your gmail, not to anything public.
>>
>> If it's worth counting list elements in advance, then you can also allocate the
>> PartitionListValue as a single chunk, rather than palloc in a loop.
>> This may help large partition heirarchies.
>>
>> And the same thing in create_hash_bounds with hbounds.
>>
>> create_range_bounds() already doesn't call palloc in a loop. However, then
>> there's an asymmetry in create_range_bounds, which is still takes a
>> double-indirect pointer.
>>
>> I'm not able to detect that this is saving more than about ~1% less RAM, to
>> create or select from 1000 partitions, probably because other data structures
>> are using much more, and savings here are relatively small.
>>
>> I'm going to add to the next CF. You can add yourself as an author, and watch
>> that the patch passes tests in cfbot.
>> https://commitfest.postgresql.org/
>> http://cfbot.cputube.org/
>>
>> Thanks,
>> --
>> Justin
>
> Hi,
> For 0001-Removed-extra-memory-allocations-from-create_list_bo.patch :
>
> +static int
> +get_non_null_count_list_bounds(PartitionBoundSpec **boundspecs, int nparts)
>
> Since the function returns the total number of non null bound values, should it be named get_non_null_list_bounds_count ?
> It can also be named get_count_of_... but that's longer.
>
> + all_values = (PartitionListValue **)
> + palloc(ndatums * sizeof(PartitionListValue *));
>
> The palloc() call would take place even if ndatums is 0. I think in that case, palloc() doesn't need to be called.
>
> Cheers
>

Attachment Content-Type Size
v2_removed_extra_mem_alloc_from_create_list_bounds.patch application/x-patch 3.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2021-05-17 14:56:05 Re: Corrected documentation of data type for the logical replication message formats.
Previous Message Tom Lane 2021-05-17 14:38:15 Re: pg_dumpall misses --no-toast-compression