Re: Optimizing ResouceOwner to speed up COPY

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Optimizing ResouceOwner to speed up COPY
Date: 2025-10-17 03:13:44
Message-ID: 69AFF12C-4E27-43D0-B509-9CF488A0FC44@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Oct 17, 2025, at 01:46, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> --
> Tomas Vondra<v20251016-0001-Deduplicate-entries-in-ResourceOwner.patch>

Nice patch! I eyeball reviewed the patch, only got a few small comments:

1
```
@@ -250,12 +257,21 @@ ResourceOwnerAddToHash(ResourceOwner owner, Datum value, const ResourceOwnerDesc
idx = hash_resource_elem(value, kind) & mask;
for (;;)
{
+ /* found an exact match - just increment the counter */
+ if ((owner->hash[idx].kind == kind) &&
+ (owner->hash[idx].item == value))
+ {
+ owner->hash[idx].count += count;
+ return;
+ }
+
if (owner->hash[idx].kind == NULL)
break; /* found a free slot */
idx = (idx + 1) & mask;
}
```

I think this is the “trade-off” you mention in your email: if a free slot found earlier, then still gets duplicated entries. I have no concern to this “trade-off”, but please add a comment here, otherwise future readers may be confused, and might potentially consider that were a bug, without reading your original design email.

2
```
typedef struct ResourceElem
{
Datum item;
+ uint32 count; /* number of occurrences */
```

Nit suggestion. People usually name this type of count as “refcnt”, which looks more meaningful. But I don’t have a strong opinion here. I am absolutely fine if you don’t want to change.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Xuneng Zhou 2025-10-17 03:17:25 Re: doc: Improve description of io_combine_limit and io_max_combine_limit GUCs
Previous Message Xuneng Zhou 2025-10-17 02:31:38 Re: Fix an unnecessary cast calling elog in ExecHashJoinImpl