Re: Design of pg_stat_subscription_workers vs pgstats

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Design of pg_stat_subscription_workers vs pgstats
Date: 2022-02-02 03:07:40
Message-ID: CAA4eK1K36U=cTpt704p_iiz+riF9TXd1xBEpOni9RsZgjQZFbA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Feb 1, 2022 at 11:47 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Jan 28, 2022 at 2:59 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Fri, Jan 28, 2022 at 1:49 AM David G. Johnston
> > <david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> > >
> > >
> > > In short, it was convenient to use the statistics collector here even if doing so resulted in a non-user friendly (IMO) design. Given all of the limitations to the statistics collection infrastructure, and the fact that this data is not statistical in the usual usage of the term, I find that to be less than satisfying.
> > >
> >
> > I think the failures/conflicts are also important information for
> > users to know, so having a view of those doesn't appear to be a bad
> > idea. All this data is less suitable for system catalogs like
> > pg_subscription_rel or others for the reasons quoted in my previous
> > email [1].
>
> I see that it's better to use a better IPC for ALTER SUBSCRIPTION SKIP
> feature to pass error-XID or error-LSN information to the worker
> whereas I'm also not sure of the advantages in storing all error
> information in a system catalog. Since what we need to do for this
> purpose is only error-XID/LSN, we can store only error-XID/LSN in the
> catalog? That is, the worker stores error-XID/LSN in the catalog on an
> error, and ALTER SUBSCRIPTION SKIP command enables the worker to skip
> the transaction in question. The worker clears the error-XID/LSN after
> successfully applying or skipping the first non-empty transaction.
>

Where do you propose to store this information? I think we can't use
pg_subscription_rel for the reasons I gave in email [1]. We could
store it in pg_subscription, but that won't cover the tablesync cases.
I think it can work if we store it in both places. That would also be
extensible if someone wants to introduce parallelism on the apply
side, as we could store the values in an array. The other possibility
is to invent a new catalog for this info, but I guess it would then
have to duplicate some info from pg_subscription/_rel.
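
As a rough illustration of the "store in both places" idea combined
with the lifecycle proposed above (record on error, skip on request,
clear after the first non-empty transaction), here is a toy model in
Python. It is not PostgreSQL code: the Catalog class, its method
names, and the key layout are all assumptions made for this sketch. A
key of (subid, None) stands in for a pg_subscription row (apply
worker) and (subid, relid) for a pg_subscription_rel row (tablesync
worker).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical key: (subid, None) models a pg_subscription row (apply
# worker); (subid, relid) models a pg_subscription_rel row (tablesync).
WorkerKey = Tuple[int, Optional[int]]


@dataclass
class ErrorInfo:
    xid: int                      # error-XID of the failed transaction
    lsn: int                      # error-LSN of the failed transaction
    skip_requested: bool = False  # set by the (modelled) SKIP command


@dataclass
class Catalog:
    """Toy stand-in for catalog storage; all names are assumptions."""
    errors: Dict[WorkerKey, ErrorInfo] = field(default_factory=dict)

    def record_error(self, key: WorkerKey, xid: int, lsn: int) -> None:
        # Worker side: remember which transaction failed.
        self.errors[key] = ErrorInfo(xid, lsn)

    def request_skip(self, key: WorkerKey) -> bool:
        # ALTER SUBSCRIPTION ... SKIP side: mark the recorded transaction.
        info = self.errors.get(key)
        if info is None:
            return False          # nothing recorded, nothing to skip
        info.skip_requested = True
        return True

    def should_skip(self, key: WorkerKey, xid: int) -> bool:
        # Worker side: skip only the transaction that was recorded.
        info = self.errors.get(key)
        return info is not None and info.skip_requested and info.xid == xid

    def finish_transaction(self, key: WorkerKey, is_empty: bool) -> None:
        # Clear after the first non-empty transaction is applied or skipped.
        if not is_empty:
            self.errors.pop(key, None)


catalog = Catalog()
apply_key = (16394, None)      # made-up subscription OID
sync_key = (16394, 16385)      # made-up subscription OID and relation OID

catalog.record_error(apply_key, xid=740, lsn=0x16B3748)
catalog.record_error(sync_key, xid=812, lsn=0x16B4A10)
catalog.request_skip(apply_key)

print(catalog.should_skip(apply_key, 740))   # True
print(catalog.should_skip(sync_key, 812))    # False: no skip requested
catalog.finish_transaction(apply_key, is_empty=False)
print(apply_key in catalog.errors)           # False: entry cleared
```

Keeping one entry per worker key is what would let a later change
replace the single value with an array, one element per parallel
apply worker, without changing the lifecycle itself.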

The other point is: after this, do we want an interface where the user
can also specify error_lsn or error_xid? I think it would be better to
have such flexibility, as it could be extended later to allow users to
skip specific operations like 'update', 'insert', etc., or other
similar things.
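
To make the shape of such an interface a bit more concrete, here is a
small Python sketch of how a user-specified skip target might be
matched against incoming changes. It is only an illustration: the
SkipSpec class, the change_is_skipped function, and the idea of
comparing against a single per-transaction LSN are assumptions of this
sketch, not an existing or proposed PostgreSQL API.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class SkipSpec:
    """Hypothetical user-supplied skip target; None means 'not given'."""
    error_xid: Optional[int] = None
    error_lsn: Optional[int] = None
    # Possible later extension: restrict the skip to some operations,
    # e.g. frozenset({"update"}). None means every operation.
    operations: Optional[FrozenSet[str]] = None


def change_is_skipped(spec: SkipSpec, xid: int, lsn: int,
                      operation: str) -> bool:
    """Return True if a change in transaction (xid, lsn) should be skipped."""
    if spec.error_xid is None and spec.error_lsn is None:
        return False              # no transaction targeted at all
    if spec.error_xid is not None and spec.error_xid != xid:
        return False
    if spec.error_lsn is not None and spec.error_lsn != lsn:
        return False
    return spec.operations is None or operation in spec.operations


# Skip only the updates of transaction 740 (made-up values).
spec = SkipSpec(error_xid=740, operations=frozenset({"update"}))
print(change_is_skipped(spec, 740, 0x16B3748, "update"))   # True
print(change_is_skipped(spec, 740, 0x16B3748, "insert"))   # False
print(change_is_skipped(spec, 741, 0x16B3748, "update"))   # False
```

With operations left as None the sketch reduces to skipping the whole
transaction, so the operation filter can be added later without
changing how error_xid and error_lsn are specified.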

[1] - https://www.postgresql.org/message-id/CAA4eK1%2BMDngbOQfMcAMsrf__s2a-MMMHaCR0zwde3GVeEi-bbQ%40mail.gmail.com

--
With Regards,
Amit Kapila.
