|From:||Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>|
|To:||Andres Freund <andres(at)anarazel(dot)de>|
|Cc:||Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, tejeswarm(at)hotmail(dot)com, pgsql-hackers(at)postgresql(dot)org, hexexpert(at)comcast(dot)net|
|Subject:||Re: Corruption during WAL replay|
|Views:||Raw Message | Whole Thread | Download mbox | Resend email|
On 2020-Mar-30, Andres Freund wrote:
> If we are really concerned with truncation failing - I don't know why we
> would be, we accept that we have to be able to modify files etc to stay
> up - we can add a pre-check ensuring that permissions are set up
> appropriately to allow us to truncate.
I remember I saw a case where the datadir was NFS or some other network
filesystem thingy, and it lost connection just before autovacuum
truncation, or something like that -- so there was no permission
failure, but the truncate failed and yet PG soldiered on. I think the
connection was re-established soon thereafter and things went back to
"normal", with nobody realizing that a truncate had been lost.
Corruption was discovered a long time afterwards IIRC (weeks or months,
I don't remember).
I didn't review Teja's patch carefully, but the idea of panicking on
failure (causing WAL replay) seems better than the current behavior.
I'd rather put the server to wait until storage is really back.
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
|Next Message||Andres Freund||2020-04-11 00:54:31||Re: Corruption during WAL replay|
|Previous Message||Andres Freund||2020-04-11 00:41:18||Re: pg_basebackup, manifests and backends older than ~12|