From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
Subject: | Re: Optimizing ResouceOwner to speed up COPY |
Date: | 2025-10-21 07:10:38 |
Message-ID: | 73dffc49-942a-4502-9963-1b7838958959@iki.fi |
Lists: | pgsql-hackers |
On 18/10/2025 01:49, Tomas Vondra wrote:
> On 10/17/25 12:32, Tomas Vondra wrote:
>>
>>
>> On 10/17/25 10:31, Heikki Linnakangas wrote:
>>>> typedef struct ResourceElem
>>>> {
>>>>     Datum item;
>>>> +   uint32 count;    /* number of occurrences */
>>>>     const ResourceOwnerDesc *kind;    /* NULL indicates a free hash
>>>>                                        * table slot */
>>>> } ResourceElem;
>>>
>>> Hmm, the 'count' is not used when the entry is stored in the array.
>>> Perhaps we should have a separate struct for array and hash elements
>>> now. Keeping the array small helps it to fit in CPU caches.
>>
>> Agreed. I had the same idea yesterday, but I haven't done it yet.
>
> The attached v2 does that - it adds a separate ResourceHashElem for the
> hash table, and it works. But I'm not sure I like it very much, because
> there are two places that relied on both the array and hash table using
> the same struct to "walk" it the same way.
>
> For ResourceOwnerSort() it's not too bad, but ResourceOwnerReleaseAll()
> now duplicates most of the code. It's not terrible, but also not pretty.
> I can't think of a better way, though.
Looks fine to me. The code duplication is not too bad IMO.
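
For readers following along, here is a minimal sketch of the struct split discussed above. It is not the code from the patch: the Datum and ResourceOwnerDesc typedefs are stand-ins so the example compiles on its own, and the fixed-size table and toy_hash_remember() are hypothetical helpers that only illustrate how a per-entry count can absorb duplicates in the hash table while the array element stays small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for PostgreSQL's types (assumptions, not the real definitions). */
typedef uintptr_t Datum;
typedef struct ResourceOwnerDesc
{
    const char *name;
} ResourceOwnerDesc;

/* Array element: no count field, so the small array stays compact. */
typedef struct ResourceElem
{
    Datum       item;
    const ResourceOwnerDesc *kind;
} ResourceElem;

/* Hash element: carries the occurrence count used for deduplication. */
typedef struct ResourceHashElem
{
    Datum       item;
    uint32_t    count;          /* number of occurrences */
    const ResourceOwnerDesc *kind;  /* NULL indicates a free hash table slot */
} ResourceHashElem;

/* The array element must not be larger than the hash element. */
static_assert(sizeof(ResourceElem) <= sizeof(ResourceHashElem),
              "array element should be the smaller of the two");

#define TOY_HASH_SIZE 8         /* hypothetical size; must be a power of two */

static ResourceHashElem toy_hash[TOY_HASH_SIZE];

/*
 * Hypothetical helper: remember (item, kind) using linear probing.
 * A repeated (item, kind) pair bumps the count instead of taking a new slot.
 * Returns the count after the call, or 0 if the table is full.
 */
static uint32_t
toy_hash_remember(Datum item, const ResourceOwnerDesc *kind)
{
    size_t      start = (size_t) item & (TOY_HASH_SIZE - 1);

    for (size_t i = 0; i < TOY_HASH_SIZE; i++)
    {
        ResourceHashElem *e = &toy_hash[(start + i) & (TOY_HASH_SIZE - 1)];

        if (e->kind == NULL)
        {
            /* Free slot: store a new entry with a single occurrence. */
            e->item = item;
            e->kind = kind;
            e->count = 1;
            return 1;
        }
        if (e->kind == kind && e->item == item)
            return ++e->count;  /* duplicate: only the counter changes */
    }
    return 0;
}
```

Remembering the same item twice with the same kind leaves one slot occupied with count 2, while the array path would keep using the smaller ResourceElem. On a typical 64-bit build the array element is 16 bytes versus 24 for the hash element, which is the cache-footprint argument made in the quoted text.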
Here's a rebased version of the micro-benchmark I used when I was
working on the ResourceOwner refactoring
(https://www.postgresql.org/message-id/d746cead-a1ef-7efe-fb47-933311e876a3%40iki.fi).
I ran it again on my laptop. It's a different laptop from the one I used
back then, so the results are not comparable with the results from that
old thread.
Unpatched (commit 18d26140934):
postgres=# \i contrib/resownerbench/snaptest.sql
numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 11.6 | 11.1
0 | 5 | 12.1 | 13.1
0 | 10 | 12.3 | 13.5
0 | 60 | 14.6 | 19.4
0 | 70 | 16.0 | 18.1
0 | 100 | 16.7 | 18.0
0 | 1000 | 18.1 | 20.7
0 | 10000 | 21.9 | 29.5
9 | 10 | 11.0 | 11.1
9 | 100 | 14.9 | 20.0
9 | 1000 | 16.1 | 24.4
9 | 10000 | 21.9 | 25.7
65 | 70 | 11.7 | 12.5
65 | 100 | 13.9 | 14.8
65 | 1000 | 16.7 | 17.8
65 | 10000 | 22.5 | 27.8
(16 rows)
v2-0001-Deduplicate-entries-in-ResourceOwner.patch:
postgres=# \i contrib/resownerbench/snaptest.sql
numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 10.8 | 10.6
0 | 5 | 11.5 | 12.3
0 | 10 | 12.1 | 13.0
0 | 60 | 13.9 | 19.4
0 | 70 | 15.9 | 18.7
0 | 100 | 16.0 | 18.5
0 | 1000 | 19.2 | 21.6
0 | 10000 | 22.4 | 29.0
9 | 10 | 11.2 | 11.3
9 | 100 | 14.4 | 19.9
9 | 1000 | 16.4 | 23.8
9 | 10000 | 22.4 | 25.7
65 | 70 | 11.4 | 12.1
65 | 100 | 14.8 | 17.0
65 | 1000 | 16.9 | 18.1
65 | 10000 | 22.5 | 28.5
(16 rows)
v20251016-0001-Deduplicate-entries-in-ResourceOwner.patch:
postgres=# \i contrib/resownerbench/snaptest.sql
numkeep | numsnaps | lifo_time_ns | fifo_time_ns
---------+----------+--------------+--------------
0 | 1 | 11.3 | 11.1
0 | 5 | 12.3 | 13.0
0 | 10 | 13.0 | 14.1
0 | 60 | 14.7 | 20.5
0 | 70 | 16.3 | 19.0
0 | 100 | 16.5 | 18.4
0 | 1000 | 19.0 | 22.4
0 | 10000 | 23.2 | 29.6
9 | 10 | 11.2 | 11.1
9 | 100 | 14.8 | 20.5
9 | 1000 | 16.8 | 24.3
9 | 10000 | 23.3 | 26.5
65 | 70 | 12.4 | 13.0
65 | 100 | 15.2 | 16.6
65 | 1000 | 16.9 | 18.4
65 | 10000 | 23.4 | 29.3
(16 rows)
These are just single runs on my laptop, so the error bars on individual
numbers are significant. But it seems to me that v2 is maybe a little
faster when the entries fit in the array.
- Heikki
Attachment | Content-Type | Size |
---|---|---|
v3-0001-resownerbench.patch | text/x-patch | 8.9 KB |