Re: Deficient error handling in pg_dump and pg_basebackup

From:	Michael Paquier <michael(at)paquier(dot)xyz>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	pgsql-hackers(at)lists(dot)postgresql(dot)org, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>
Subject:	Re: Deficient error handling in pg_dump and pg_basebackup
Date:	2021-11-17 05:24:59
Message-ID:	YZSSK4+soCmN+aeT@paquier.xyz
Lists:	pgsql-hackers

On Tue, Nov 16, 2021 at 10:26:11PM -0500, Tom Lane wrote:
> I feel like doing an immediate exit() for these errors and not other
> ones is a pretty terrible idea, mainly because it doesn't account for
> the question of what to do with a failure that prevents us from getting
> to the fsync() call in the first place. So I'd like to see a better
> design rather than more quick hacking. I confess I don't have a
> clear idea of what "a better design" would look like.

[ .. thinks .. ]

We cannot really have something equivalent to data_sync_retry in the
frontends. But I'd like to think that it would be fine for
pg_basebackup to just exit() on this failure so that callers are
able to retry a base backup. pg_receivewal is trickier, though. An
exit() would allow flush retries of the previous WAL segment where
things failed, but that holds when using --no-loop (though the code
path triggered by this option would still not be used). When not using
--no-loop, it may be better to just retry streaming from the
previous point, so the error should be reported from walmethods.c to
the upper stack anyway.
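
To make the distinction above concrete, here is a minimal sketch of the policy being discussed, not actual PostgreSQL code. All names in it (sync_path, decide_after_sync, SyncAction) are hypothetical and exist only for illustration. It assumes the low-level sync routine reports failure to its caller instead of calling exit() itself, and that the caller then chooses between exiting and restarting the stream.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical outcome of a sync attempt, as seen by the top-level tool. */
typedef enum
{
	SYNC_OK,				/* nothing to do */
	SYNC_EXIT,				/* exit() so the caller of the tool can retry */
	SYNC_RESTART_STREAM		/* retry streaming from the previous point */
} SyncAction;

/*
 * Flush one regular file to disk.  Returns 0 on success, -1 on failure
 * with errno preserved.  It never exits; the decision is left to the caller.
 */
static int
sync_path(const char *path)
{
	int			fd;
	int			save_errno;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	if (fsync(fd) != 0)
	{
		save_errno = errno;
		close(fd);
		errno = save_errno;	/* close() must not hide the fsync error */
		return -1;
	}

	return close(fd);
}

/*
 * Assumed policy, following the paragraph above: a base backup exits on
 * failure, and so does WAL reception in --no-loop mode; otherwise the
 * error is passed up and streaming is retried.
 */
static SyncAction
decide_after_sync(int sync_result, bool is_receivewal, bool no_loop)
{
	if (sync_result == 0)
		return SYNC_OK;
	if (!is_receivewal || no_loop)
		return SYNC_EXIT;
	return SYNC_RESTART_STREAM;
}
```

A caller would do something like decide_after_sync(sync_path(fname), true, false) and act on the result. Keeping the decision out of sync_path() is the point of the sketch: the error reaches the upper stack, which is the only place that knows whether a retry makes sense.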

> However, that's largely orthogonal to any of the things my proposed
> patches are trying to fix. If you want to review the patches without
> considering the fsync-error-handling problem, that'd be great.

I have looked at them upthread, FWIW:
https://www.postgresql.org/message-id/YYtSj5vlWp5faVXz@paquier.xyz
Your proposals still look rather sane to me, after a second look.
--
Michael

In response to

Re: Deficient error handling in pg_dump and pg_basebackup at 2021-11-17 03:26:11 from Tom Lane

Responses

Re: Deficient error handling in pg_dump and pg_basebackup at 2021-11-17 19:19:20 from Tom Lane
