Re: logical replication worker accesses catalogs in error context callback

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: logical replication worker accesses catalogs in error context callback
Date: 2021-01-27 02:19:44
Message-ID: CALNJ-vR3YPD+sh8x4kR8_i3BdHgqmJn5Sfpz2+7zqvHevk2KHA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

For 0002, a few comments on the description:

bq. Avoid accessing system catalogues inside conversion_error_callback

catalogues -> catalog

bq. error context callback, because the the entire transaction might

There is redundant 'the'

bq. Since get_attname() and get_rel_name() could search the sys cache by
store the required information

store -> storing

Cheers

On Tue, Jan 26, 2021 at 5:17 PM Bharath Rupireddy <
bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:

> On Tue, Jan 26, 2021 at 11:29 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
> wrote:
> > > IMHO, adding an assertion in SearchCatCacheInternal(which is a most
> > > generic code part within the server) with a few error context global
> > > variables may not be always safe. Because we might miss using the
> > > error context variables properly. Instead, we could add a comment in
> > > ErrorContextCallback structure saying something like, "it's not
> > > recommended to access any system catalogues within an error context
> > > callback when the callback is expected to be called while processing
> > > an error, because the transaction might have been broken in that
> > > case." And let the future callback developers take care of it.
> > >
> > > Thoughts?
> >
> > That sounds good to me. But I still also see the value to add an
> > assertion in SearchCatCacheInternal(). If we had it, we could find
> > these two bugs earlier.
> >
> > Anyway, this seems to be unrelated to this bug fixing so we can start
> > a new thread for that.
>
> +1 to start a new thread for that.
>
> > > As I said earlier [1], currently only two(there could be more) error
> > > context callbacks access the sys catalogues, one is in
> > > slot_store_error_callback which will be fixed with your patch. Another
> > > is in conversion_error_callback, we can also fix this by storing the
> > > relname, attname and other required info in ConversionLocation,
> > > something like the attached. Please have a look.
> > >
> > > Thoughts?
> >
> > Thank you for the patch!
> >
> > Here are some comments:
>
> Thanks for the review comments.
>
> > +static void set_error_callback_info(ConversionLocation *errpos,
> > + Relation rel, int cur_attno,
> > + ForeignScanState *fsstate);
> >
> > I'm concerned a bit that the function name is generic. How about
> > set_conversion_error_callback_info() or something to make the name
> > clear?
>
> Done.
>
> > ---
> > +static void
> > +conversion_error_callback(void *arg)
> > +{
> > + ConversionLocation *errpos = (ConversionLocation *) arg;
> > +
> > + if (errpos->show_generic_message)
> > + errcontext("processing expression at position %d in
> > select list",
> > + errpos->cur_attno);
> > +
> >
> > I think we can set this generic message to the error context when
> > errpos->relname is NULL instead of using show_generic_message.
>
> Right. Done.
>
> > ---
> > + /*
> > + * Set error context callback info, so that we could avoid
> accessing
> > + * the system catalogues while processing the error in
> > + * conversion_error_callback. System catalogue accesses are not
> safe in
> > + * error context callbacks because the transaction might have
> been
> > + * broken by then.
> > + */
> > + set_error_callback_info(&errpos, rel, i, fsstate);
> >
> > Looking at other code, we use "catalog" rather than "catalogue" in
> > most places. Is it better to use "catalog" for consistency? FYI, the
> > "catalogue" is used at only three places in the server code:
>
> Changed it to "catalog".
>
> > FYI I've added those bug fixes to the next Commitfest -
> https://commitfest.postgresql.org/32/2955/
>
> Thanks. I'm attaching 2 patches to make it easy for reviewing and also
> they will get a chance to be tested by cf bot.
>
> 0001 - for avoiding system catalog access in slot_store_error_callback.
> 0002 - for avoiding system catalog access in conversion_error_callback
>
> Please review it further.
>
> With Regards,
> Bharath Rupireddy.
> EnterpriseDB: http://www.enterprisedb.com
>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2021-01-27 02:35:00 Re: archive status ".ready" files may be created too early
Previous Message Michael Paquier 2021-01-27 01:53:00 Re: Add SQL function for SHA1