Re: Optimizing ResouceOwner to speed up COPY

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Subject: Re: Optimizing ResouceOwner to speed up COPY
Date: 2025-10-21 07:10:38
Message-ID: 73dffc49-942a-4502-9963-1b7838958959@iki.fi
Lists: pgsql-hackers

On 18/10/2025 01:49, Tomas Vondra wrote:
> On 10/17/25 12:32, Tomas Vondra wrote:
>>
>>
>> On 10/17/25 10:31, Heikki Linnakangas wrote:
>>>>  typedef struct ResourceElem
>>>>  {
>>>>      Datum        item;
>>>> +    uint32        count;                /* number of occurrences */
>>>>      const ResourceOwnerDesc *kind;    /* NULL indicates a free hash table slot */
>>>>  } ResourceElem;
>>>
>>> Hmm, the 'count' is not used when the entry is stored in the array.
>>> Perhaps we should have a separate struct for array and hash elements
>>> now. Keeping the array small helps it to fit in CPU caches.
>>
>> Agreed. I had the same idea yesterday, but I haven't done it yet.
>
> The attached v2 does that - it adds a separate ResourceHashElem for the
> hash table, and it works. But I'm not sure I like it very much, because
> there are two places that relied on both the array and hash table using
> the same struct to "walk" it the same way.
>
> For ResourceOwnerSort() it's not too bad, but ResourceOwnerReleaseAll()
> now duplicates most of the code. It's not terrible, but also not pretty.
> I can't think of a better way, though.

Looks fine to me. The code duplication is not too bad IMO.
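
To make the trade-off concrete, here is a minimal standalone sketch of the
split, not the code from the patch. Datum and ResourceOwnerDesc are stand-in
definitions, and DemoHash, demo_remember(), demo_forget() and demo_used() are
hypothetical names that exist only for this illustration. The array element
stays at two pointer-sized words, while only the hash element carries the
count used to fold duplicate entries together.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the PostgreSQL types; assumptions for this sketch only. */
typedef uintptr_t Datum;
typedef struct ResourceOwnerDesc { const char *name; } ResourceOwnerDesc;

/* Array element: no counter, so it stays at two pointer-sized words. */
typedef struct ResourceElem
{
    Datum       item;
    const ResourceOwnerDesc *kind;
} ResourceElem;

/* Hash element: carries the occurrence count used for deduplication. */
typedef struct ResourceHashElem
{
    Datum       item;
    uint32_t    count;              /* number of occurrences */
    const ResourceOwnerDesc *kind;  /* NULL indicates a free slot */
} ResourceHashElem;

#define DEMO_HASH_SIZE 64           /* hypothetical fixed size, power of two */

typedef struct DemoHash
{
    ResourceHashElem slots[DEMO_HASH_SIZE];
} DemoHash;

static uint32_t
demo_hash(Datum item)
{
    return (uint32_t) (item * 2654435761u) & (DEMO_HASH_SIZE - 1);
}

/*
 * Remember one occurrence; returns the new count, or 0 if the table is full.
 * The whole table is scanned because demo_forget() can leave holes.
 */
static uint32_t
demo_remember(DemoHash *h, Datum item, const ResourceOwnerDesc *kind)
{
    uint32_t    start = demo_hash(item);
    ResourceHashElem *free_slot = NULL;

    assert(kind != NULL);
    for (uint32_t i = 0; i < DEMO_HASH_SIZE; i++)
    {
        ResourceHashElem *e = &h->slots[(start + i) & (DEMO_HASH_SIZE - 1)];

        if (e->kind == kind && e->item == item)
            return ++e->count;      /* duplicate: bump the counter */
        if (e->kind == NULL && free_slot == NULL)
            free_slot = e;          /* first free slot seen */
    }
    if (free_slot == NULL)
        return 0;
    free_slot->item = item;
    free_slot->kind = kind;
    free_slot->count = 1;
    return 1;
}

/* Forget one occurrence; returns the remaining count, or -1 if not found. */
static int
demo_forget(DemoHash *h, Datum item, const ResourceOwnerDesc *kind)
{
    uint32_t    start = demo_hash(item);

    assert(kind != NULL);
    for (uint32_t i = 0; i < DEMO_HASH_SIZE; i++)
    {
        ResourceHashElem *e = &h->slots[(start + i) & (DEMO_HASH_SIZE - 1)];

        if (e->kind == kind && e->item == item)
        {
            if (--e->count == 0)
                e->kind = NULL;     /* last occurrence: free the slot */
            return (int) e->count;
        }
    }
    return -1;
}

/* Number of occupied slots, to show that duplicates share one slot. */
static int
demo_used(const DemoHash *h)
{
    int         used = 0;

    for (int i = 0; i < DEMO_HASH_SIZE; i++)
        used += (h->slots[i].kind != NULL);
    return used;
}
```

With a zero-initialized DemoHash, remembering the same (item, kind) pair
twice occupies a single slot with count 2, and two demo_forget() calls free
it again. On a typical 64-bit build the array element is 16 bytes and the
hash element 24, which is the cache-footprint argument for keeping the two
structs separate.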

Here's a rebased version of the micro-benchmark I used when I was
working on the ResourceOwner refactoring
(https://www.postgresql.org/message-id/d746cead-a1ef-7efe-fb47-933311e876a3%40iki.fi).

I ran it again on my laptop. It's a different machine from the one I used
back then, so the results are not comparable with the results from that
old thread.

Unpatched (commit 18d26140934):

postgres=# \i contrib/resownerbench/snaptest.sql
 numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
       0 |        1 |         11.6 |         11.1
       0 |        5 |         12.1 |         13.1
       0 |       10 |         12.3 |         13.5
       0 |       60 |         14.6 |         19.4
       0 |       70 |         16.0 |         18.1
       0 |      100 |         16.7 |         18.0
       0 |     1000 |         18.1 |         20.7
       0 |    10000 |         21.9 |         29.5
       9 |       10 |         11.0 |         11.1
       9 |      100 |         14.9 |         20.0
       9 |     1000 |         16.1 |         24.4
       9 |    10000 |         21.9 |         25.7
      65 |       70 |         11.7 |         12.5
      65 |      100 |         13.9 |         14.8
      65 |     1000 |         16.7 |         17.8
      65 |    10000 |         22.5 |         27.8
(16 rows)

v2-0001-Deduplicate-entries-in-ResourceOwner.patch:

postgres=# \i contrib/resownerbench/snaptest.sql
 numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
       0 |        1 |         10.8 |         10.6
       0 |        5 |         11.5 |         12.3
       0 |       10 |         12.1 |         13.0
       0 |       60 |         13.9 |         19.4
       0 |       70 |         15.9 |         18.7
       0 |      100 |         16.0 |         18.5
       0 |     1000 |         19.2 |         21.6
       0 |    10000 |         22.4 |         29.0
       9 |       10 |         11.2 |         11.3
       9 |      100 |         14.4 |         19.9
       9 |     1000 |         16.4 |         23.8
       9 |    10000 |         22.4 |         25.7
      65 |       70 |         11.4 |         12.1
      65 |      100 |         14.8 |         17.0
      65 |     1000 |         16.9 |         18.1
      65 |    10000 |         22.5 |         28.5
(16 rows)

v20251016-0001-Deduplicate-entries-in-ResourceOwner.patch:

postgres=# \i contrib/resownerbench/snaptest.sql
 numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
       0 |        1 |         11.3 |         11.1
       0 |        5 |         12.3 |         13.0
       0 |       10 |         13.0 |         14.1
       0 |       60 |         14.7 |         20.5
       0 |       70 |         16.3 |         19.0
       0 |      100 |         16.5 |         18.4
       0 |     1000 |         19.0 |         22.4
       0 |    10000 |         23.2 |         29.6
       9 |       10 |         11.2 |         11.1
       9 |      100 |         14.8 |         20.5
       9 |     1000 |         16.8 |         24.3
       9 |    10000 |         23.3 |         26.5
      65 |       70 |         12.4 |         13.0
      65 |      100 |         15.2 |         16.6
      65 |     1000 |         16.9 |         18.4
      65 |    10000 |         23.4 |         29.3
(16 rows)

These are just single runs on my laptop, so the error bars on the
individual numbers are significant. But it seems to me that v2 is maybe a
little faster when the entries fit in the array.

- Heikki

Attachment Content-Type Size
v3-0001-resownerbench.patch text/x-patch 8.9 KB
