Re: Avoid overhead open-close indexes (catalog updates)

From: Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Avoid overhead open-close indexes (catalog updates)
Date: 2022-09-01 11:42:15
Message-ID: CAEudQAqgBCXO13jj-ykB0ygTC3RFNSaNjr59W1OhEXr5fggoww@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 31, 2022 at 22:12, Kyotaro Horiguchi <
horikyota(dot)ntt(at)gmail(dot)com> wrote:

> At Wed, 31 Aug 2022 08:16:55 -0300, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>
> wrote in
> > Hi,
> >
> > The commit
> > https://github.com/postgres/postgres/commit/b17ff07aa3eb142d2cde2ea00e4a4e8f63686f96
> > introduced the CopyStatistics function.
> >
> > To do its work, CopyStatistics uses a less efficient function
> > to update/insert tuples in the system catalogs.
> >
> > The comment at indexing.c says:
> > "Avoid using it for multiple tuples, since opening the indexes
> > * and building the index info structures is moderately expensive.
> > * (Use CatalogTupleInsertWithInfo in such cases.)"
> >
> > So, inspired by the comment, I changed CatalogTupleInsert/CatalogTupleUpdate
> > in a few places to the more efficient
> > CatalogTupleInsertWithInfo/CatalogTupleUpdateWithInfo.
> >
> > Quick tests show a small performance gain.
>
Hi,
Thanks for taking a look at this.

>
> Considering that the whole operation usually takes far longer, I'm not
> sure whether that amount of performance gain is useful, but I like the
> change as a matter of tidiness or as an example for later code.
>
Yeah, this serves as an example for future code.

> > There are other places that this could be useful,
> > but a careful analysis is necessary.
>
> What kind of concern do you have in mind?
>
Code bloat.
Each converted call (CatalogTupleInsert/CatalogTupleUpdate) needs 3 more lines.
Also, not every code path reaches the insert/update, so the cost of calling
CatalogOpenIndexes unconditionally should be considered.
The ideal case would be something like CopyStatistics, I think,
with little or no filtering in the tuple loop.
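
To illustrate that trade-off, here is a rough, self-contained sketch in plain C.
It does not use the real indexing.c API: every type and function below
(MockCatalog, mock_open_indexes, mock_insert, and so on) is a made-up stand-in
that only counts how often the indexes would be opened.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real types. */
typedef struct MockCatalog
{
    int index_opens;    /* how many times the indexes were opened */
    int rows_inserted;  /* how many rows were stored */
} MockCatalog;

typedef struct MockIndexState
{
    MockCatalog *catalog;
    bool         open;
} MockIndexState;

/* Models the expensive step: opening indexes and building index info. */
static MockIndexState mock_open_indexes(MockCatalog *cat)
{
    MockIndexState st = { cat, true };

    cat->index_opens++;
    return st;
}

static void mock_close_indexes(MockIndexState *st)
{
    st->open = false;
}

/* Models the WithInfo variants: the caller passes already-open indexes. */
static void mock_insert_with_info(MockCatalog *cat, int row, MockIndexState *st)
{
    assert(st->open && st->catalog == cat);
    (void) row;
    cat->rows_inserted++;
}

/* Models the simple variants: open, insert, close on every call. */
static void mock_insert(MockCatalog *cat, int row)
{
    MockIndexState st = mock_open_indexes(cat);

    mock_insert_with_info(cat, row, &st);
    mock_close_indexes(&st);
}

/* A made-up filter standing in for "skip this tuple" logic. */
static bool mock_keep(int row)
{
    return row % 2 == 0;
}

/* Simple style: one index open per row that passes the filter. */
static void copy_rows_simple(MockCatalog *cat, const int *rows, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (mock_keep(rows[i]))
            mock_insert(cat, rows[i]);
}

/* WithInfo style: one open per loop, paid even if every row is filtered out. */
static void copy_rows_with_info(MockCatalog *cat, const int *rows, size_t n)
{
    MockIndexState st = mock_open_indexes(cat);

    for (size_t i = 0; i < n; i++)
        if (mock_keep(rows[i]))
            mock_insert_with_info(cat, rows[i], &st);
    mock_close_indexes(&st);
}
```

In this toy model, with six rows of which the filter keeps three, the simple
loop opens the indexes three times and the WithInfo-style loop once. When the
filter rejects every row, the simple loop never opens them, while the
WithInfo-style loop still pays for one open.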

>
> By the way, there is another similar function
> CatalogTupleMultiInsertWithInfo() which would be more time-efficient
> (but not space-efficient), which is used in InsertPgAttributeTuples. I
> don't see a clear criterion for choosing between the two, though.
>
I don't think CatalogTupleMultiInsertWithInfo would be useful in the
cases reported here.
I think the cost of building the slots would be prohibitive and would add
unnecessary complexity.

> I think the overhead of opening the catalog indexes is significant when
> no other time-consuming task is involved in the whole operation.
> In that sense, in terms of performance, storeOperators and
> storeProcedures (called under DefineOpClass) might benefit more
> from this, if we disregard how rarely the command is used.
>
Yeah, storeOperators and storeProcedures are good candidates.
Let's wait for the patch to be accepted and committed, and then we can try to
change them.

I will create a CF entry.

regards,
Ranier Vilela
