Re: Removed extra memory allocations from create_list_bounds

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Removed extra memory allocations from create_list_bounds
Date: 2021-07-05 13:48:52
Message-ID: CAApHDvpouaAv6P+UHbwZ1OL=z1sBcf5vuMJXGauhBZZMEAHb2w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 19 May 2021 at 05:28, Nitin Jadhav
<nitinjadhavpostgres(at)gmail(dot)com> wrote:
> I have rebased all the patches on top of
> 'v2_0001-removed_extra_mem_alloc_from_create_list_bounds.patch'.
> Attaching all the patches here.

I had a look over these and I think what's being done here is fine.

I think this will help speed up building the partition bound.
Unfortunately, it won't help any for speeding up things like finding
the correct partition during SELECT or DML on partitioned tables.
The reason for this is that RelationBuildPartitionDesc first builds
the partition bound using the functions being modified here, but it
then copies it into the relcache into a memory context for the
partition using partition_bounds_copy(). It looks like
partition_bounds_copy() suffers from the same palloc explosion type
problem as is being fixed in each of the create_*_bounds() functions
here. The good news is, we can just give partition_bounds_copy() the
same treatment. 0004 does that.

I tried to see if the better cache locality of the
PartitionBoundInfo's datum array would help speed up inserts into a
partitioned table. I figured a fairly small binary search in a LIST
partitioned table of 10 partitions might have all Datum visited all on
the same cache line. However, I was unable to see any performance
gains. I think the other work being done is likely just going to drown
out any gains in cache efficiency in the binary search. COPY into a
partitioned table might have more chance of becoming a little faster,
but I didn't try.

I've attached another set of patches. I squashed all the changes to
each create_*_bounds function into a patch of their own. Perhaps 0002
and 0003 which handle range and hash partitioning can be one patch
since Justin seemed to write that one. I kept 0001 separate since
that's Nitin's patch plus Justin's extra parts. It seems easier to
credit properly having the patches broken out like this. I think it's
excessive to break down 0001 into Nitin and Justin's individual parts.

I did make a few adjustments to the patches renaming a variable or two
and I changed how we assign the boundinfo->datums[i] pointers to take
the address of the Nth element rather than incrementing the variable
pointing to the array each item. I personally like p = &array[i];
more than p = array; array++, others may not feel the same.

Nitin and Justin, are you both able to have another look over these
and let me know what you think. If all is well I'd like to push all 4
patches.

David

Attachment Content-Type Size
v3-0001-Reduce-the-number-of-pallocs-in-create_list_bound.patch application/octet-stream 6.7 KB
v3-0004-Reduce-the-number-of-pallocs-in-partition_bounds_.patch application/octet-stream 1.5 KB
v3-0003-Reduce-the-number-of-palloc-calls-in-create_range.patch application/octet-stream 1.9 KB
v3-0002-Reduce-the-number-of-pallocs-in-create_hash_bound.patch application/octet-stream 4.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Victor Spirin 2021-07-05 13:53:06 Atomic rename feature for Windows.
Previous Message Andrey Lepikhov 2021-07-05 13:20:01 Re: Removing unneeded self joins