Re: Detecting File Damage & Inconsistencies

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Simon Riggs <simon(dot)riggs(at)enterprisedb(dot)com>
Cc: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, cleyssondba(at)gmail(dot)com, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Detecting File Damage & Inconsistencies
Date: 2021-07-14 04:01:35
Message-ID: CAA4eK1LDCokGz-W-K16eN8V4_vdixusgaymbnK1ZS03mbfSnFg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 13, 2021 at 8:29 PM Simon Riggs
<simon(dot)riggs(at)enterprisedb(dot)com> wrote:
>
> On Tue, Jul 6, 2021 at 4:21 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > >
> > > > If you don't think the sorts of use cases I presented are worth the trouble that's fair enough. I'm not against adding it on the commit record. It's just that with logical decoding moving toward a streaming model I suspect only having it at commit time may cause us some pain later.
> > >
> > > I think you have some good ideas about how to handle larger
> > > transactions with streaming. As a separate patch it might be worth
> > > keeping track of transaction size and logging something when a
> > > transaction gets too large.
> > >
> >
> > If we want this additional information for streaming mode in logical
> > replication then can't we piggyback it on the very first record
> > written for a transaction when this info is required? Currently, we do
> > something similar for logging top_level_xid for subtransaction in
> > XLogRecordAssemble().
>
> It's possible, but I'm struggling to believe anybody would accept that
> as an approach because it breaks simplicity, modularity and makes it
> harder to search for this info in the WAL.
>
> I was imagining that we'd keep track of amount of WAL written by a
> transaction and when it reaches a certain size generate a "streaming
> info" record as an early warning that we have a big transaction coming
> down the pipe.
>

I am not sure if that satisfies Craig's requirement of making it
available during the streaming of in-progress xacts during logical
replication. It is quite possible that by the time we decide to start
streaming a transaction this information won't be logged yet.
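
To make the timing concern concrete, here is a toy Python model (not
PostgreSQL code). It compares logging the information on the first record
of a transaction with logging it in a separate record once the transaction
has written a certain amount of WAL. All names (WalRecord,
write_transaction, info_threshold, stream_threshold) are hypothetical, the
info record is treated as zero-sized for simplicity, and stream_threshold
is only a stand-in for whatever makes the decoder start streaming (in
PostgreSQL that decision is driven by logical_decoding_work_mem, not by a
per-transaction WAL count).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WalRecord:
    xid: int
    size: int                # bytes of WAL this record contributes
    has_info: bool = False   # True if the extra transaction info is attached


def write_transaction(xid: int, record_sizes: List[int], strategy: str,
                      info_threshold: int = 0) -> List[WalRecord]:
    """Return the WAL records of one transaction under a given strategy.

    "first_record":   attach the info to the first record of the transaction.
    "size_threshold": emit a separate info record once the transaction has
                      written at least info_threshold bytes of WAL.
    """
    records: List[WalRecord] = []
    written = 0
    info_logged = False
    for i, size in enumerate(record_sizes):
        attach = strategy == "first_record" and i == 0
        records.append(WalRecord(xid, size, has_info=attach))
        written += size
        info_logged = info_logged or attach
        if (strategy == "size_threshold" and not info_logged
                and written >= info_threshold):
            # Hypothetical "streaming info" record, modelled as zero-sized.
            records.append(WalRecord(xid, 0, has_info=True))
            info_logged = True
    return records


def info_known_when_streaming_starts(records: List[WalRecord],
                                     stream_threshold: int) -> Optional[bool]:
    """Decode records in order and report whether the info was already seen
    at the moment streaming begins.

    Returns None if the transaction never becomes large enough to stream.
    """
    decoded = 0
    info_seen = False
    for rec in records:
        info_seen = info_seen or rec.has_info
        decoded += rec.size
        if decoded >= stream_threshold:
            return info_seen
    return None


sizes = [100] * 10
late = write_transaction(1, sizes, "size_threshold", info_threshold=800)
early = write_transaction(1, sizes, "first_record")
print(info_known_when_streaming_starts(late, stream_threshold=300))
print(info_known_when_streaming_starts(early, stream_threshold=300))
```

In this model the size-threshold approach only helps when the info
threshold is reached before streaming starts, whereas attaching the
information to the first record makes it available regardless of when
streaming begins.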

> I'm feeling that a simple patch is expanding well beyond its original
> scope and timeline. How can we do this simply?
>

The patch is simple, but its use is not very clear. You have mentioned
its use for future PITR patches, and Craig mentioned some use cases in
logical decoding. It appears to me that to support the use cases
mentioned by Craig, it is important to log this information earlier
than at commit time. As there are no details about how it will be used
for PITR patches, or whether such patch ideas will be accepted, it is
hard to judge the value of this patch.

I think if we had patches (even at WIP/POC stage) for the ideas you
and Craig have in mind, it would be much easier to see the value of
this patch.

--
With Regards,
Amit Kapila.
