From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Optimizing ResouceOwner to speed up COPY |
Date: | 2025-10-16 18:12:47 |
Message-ID: | 1534176.1760638367@sss.pgh.pa.us |
Lists: | pgsql-hackers |
Tomas Vondra <tomas(at)vondra(dot)me> writes:
> The reason is pretty simple - ResourceOwner tracks the resources in a
> very simple hash table, with O(n^2) behavior with duplicates. This
> happens with COPY, because COPY creates an array of 1000 tuple slots,
> and each slot references the same tuple descriptor. And the descriptor
> is added to ResourceOwner for each slot.
> ...
> There's an easy way to improve this by allowing a single hash entry to
> represent multiple references to the same resource. The attached patch
> adds a "count" to the ResourceElem, tracking how many times that
> resource was added. So if you add 1000 tuple slots, the descriptor will
> have just one ResourceElem entry with count=1000.
Hmm. I don't love the 50% increase in sizeof(ResourceElem) ... maybe
that's negligible, or maybe it isn't. Can you find evidence of this
change being helpful for anything except this specific scenario in
COPY? Because we could probably find some way to avoid registering
all the doppelganger slots, if that's the only culprit.
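
For concreteness, here is a minimal standalone sketch of the counted-entry
idea and of the size cost in question. It is not the patch and not the code
in resowner.c: ElemPlain, ElemCounted, CountedTable, slot_for(), remember()
and forget() are hypothetical names, and the table is a toy fixed-size
linear-probing one. The assumption is that an entry is a pointer-sized item
plus a pointer, so on a typical 64-bit build it is 16 bytes, and adding a
32-bit count pads it to 24.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; not the structs in resowner.c. */
typedef struct ElemPlain
{
    uintptr_t   item;       /* the resource, pointer-sized */
    const void *kind;       /* identifies the resource type */
} ElemPlain;

typedef struct ElemCounted
{
    uintptr_t   item;
    const void *kind;       /* NULL marks a never-used slot */
    uint32_t    count;      /* times (item, kind) is currently remembered */
} ElemCounted;

#define TABLE_SIZE 64       /* toy fixed size, must be a power of two */

typedef struct CountedTable
{
    ElemCounted slots[TABLE_SIZE];
    uint32_t    nused;      /* slots ever claimed, including count == 0 */
} CountedTable;

/* Cheap mixing function, assumed good enough for a demo. */
uint32_t
slot_for(uintptr_t item, const void *kind)
{
    uint64_t    h = (uint64_t) item ^ ((uint64_t) (uintptr_t) kind << 1);

    h *= UINT64_C(0x9E3779B97F4A7C15);
    return (uint32_t) (h >> 32) & (TABLE_SIZE - 1);
}

/* Remember a resource; a duplicate only bumps the existing entry's count. */
void
remember(CountedTable *t, uintptr_t item, const void *kind)
{
    uint32_t    i = slot_for(item, kind);

    assert(kind != NULL);
    for (;;)
    {
        ElemCounted *e = &t->slots[i];

        if (e->kind == NULL)
        {
            /* Never-used slot: claim it, keeping one slot free for probing. */
            assert(t->nused < TABLE_SIZE - 1);
            e->item = item;
            e->kind = kind;
            e->count = 1;
            t->nused++;
            return;
        }
        if (e->item == item && e->kind == kind)
        {
            e->count++;
            return;
        }
        i = (i + 1) & (TABLE_SIZE - 1);
    }
}

/* Forget one reference; returns 1 if found, 0 if not remembered. */
int
forget(CountedTable *t, uintptr_t item, const void *kind)
{
    uint32_t    i = slot_for(item, kind);

    for (;;)
    {
        ElemCounted *e = &t->slots[i];

        if (e->kind == NULL)
            return 0;       /* hit a never-used slot: not present */
        if (e->item == item && e->kind == kind)
        {
            if (e->count == 0)
                return 0;   /* entry kept as a placeholder, no live refs */
            e->count--;
            return 1;
        }
        i = (i + 1) & (TABLE_SIZE - 1);
    }
}
```

In this sketch, remembering the same (item, kind) pair 1000 times leaves
nused at 1, so each forget() stops at the first matching probe instead of
walking past the duplicates. The price is the extra field in every entry,
duplicated or not.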
regards, tom lane