|From:||Michael Paquier <michael(at)paquier(dot)xyz>|
|To:||Daniel Gustafsson <daniel(at)yesql(dot)se>|
|Cc:||Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>|
|Subject:||Re: Ought to use heap_multi_insert() for pg_attribute/depend insertions?|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On Fri, Jun 26, 2020 at 02:26:50PM +0200, Daniel Gustafsson wrote:
> On 26 Jun 2020, at 10:11, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>> + /* TODO is nreferenced a reasonable allocation of slots? */
>> + slot = palloc(sizeof(TupleTableSlot *) * nreferenced);
>> It seems to me that we could just apply the same rule as for
>> pg_attribute and pg_shdepend, no?
> I think so, I see no reason not to.
I was actually looking a patch to potentially support REINDEX for
partitioned table, and the CONCURRENTLY case may need this patch,
still if a lot of dependencies are switched at once it may be a
problem, so it is better to cap it. One thing though is that we may
finish by allocating more slots than what's necessary if some
dependencies are pinned, but using multi-inserts would be a gain
anyway, and the patch does not insert in more slots than needed.
> I like it, thanks for hacking on it. I will take another look at it later
> today when back at my laptop.
Thanks. I have been able to apply the independent part of pg_type.c
as of 68de144.
Attached is a rebased set, with more splitting work done after a new
round of review. 0001 is more refactoring to use more
ObjectAddress[Sub]Set() where we can, leading to some more cleanup:
5 files changed, 43 insertions(+), 120 deletions(-)
In this round, I got confused with the way ObjectAddress items are
assigned, assuming that we have to use the same dependency type for a
bunch of dependencies to attach. Using add_exact_object_address() is
fine for this purpose, but this also makes me wonder if we should try
to think harder about this interface so as we would be able to insert
a bunch of dependency tuples with multiple types of dependencies
handled. So this has made me remove reset_object_addresses() from the
patch series, as it was used when dependency types were mixed up. We
could also discuss that separately, but that's not strongly mandatory
There are however cases where it is straight-forward to gather
and insert multiple records, like in InsertExtensionTuple() (as does
already tsearchcmds.c), which is what 0002 does. An opposite example
is StorePartitionKey(), where there is a mix of normal and internal
dependencies, so I have removed it for now.
0003 is the main patch, where I have capped the number of slots used
for pg_depend similarly what is done for pg_shdepend and
pg_attribute to flush tuples in batches of 64kB.
ExecDropSingleTupleTableSlot() was also called for an incorrect number
of slots when it came to pg_shdepend. I was thinking if it could be
possible to do more consolidation between the three places where we
calculate the number of slots to use, but that would also mean to have
more tuple slot dependency moving around, which is not great. Anyway,
this leaves in patch 0003 only the multi-INSERT logic, without the
pieces that manipulated the dependency recording and insertions (we
still have three ObjectAddress[Sub]Set calls in heap.c but the same
area of the code is reworked for attribute insertions).
0001 and 0002 are committable and independent pieces, while 0003 still
needs more attention. One thing I also wanted to do with 0003 is
measure the difference in WAL produced (say pg_shdepend when using the
regression database as template) to have an idea of the gain.
|Next Message||Julien Rouhaud||2020-06-29 07:05:18||Re: track_planning causing performance regression|
|Previous Message||Bharath Rupireddy||2020-06-29 06:40:42||Re: Creating a function for exposing memory usage of backend process|