Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amul Sul <sulamul(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Noah Misch <noah(at)leadboat(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Neha Sharma <neha(dot)sharma(at)enterprisedb(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [CLOBBER_CACHE]Server crashed with segfault 11 while executing clusterdb
Date: 2021-03-24 14:39:42
Message-ID: 1056468.1616596782@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amul Sul <sulamul(at)gmail(dot)com> writes:
> On Tue, Mar 23, 2021 at 8:59 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> On closer inspection, I believe the true culprit is c6b92041d,
>> which did this:
>> - heap_sync(state->rs_new_rel);
>> + smgrimmedsync(state->rs_new_rel->rd_smgr, MAIN_FORKNUM);
>> heap_sync was careful about opening rd_smgr, the new code not so much.

> So we also need to make sure of the
> RelationOpenSmgr() call before smgrimmedsync() as proposed previously.

I wonder if we should try to get rid of this sort of bug by banning
direct references to rd_smgr? That is, write the above and all
similar code like

smgrimmedsync(RelationGetSmgr(state->rs_new_rel), MAIN_FORKNUM);

where we provide something like

static inline struct SMgrRelationData *
RelationGetSmgr(Relation rel)
{
if (unlikely(rel->rd_smgr == NULL))
RelationOpenSmgr(rel);
return rel->rd_smgr;
}

and then we could get rid of most or all other RelationOpenSmgr calls.

This might create more code bloat than it's really worth, but
it'd be a simple and mechanically-checkable scheme.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2021-03-24 14:43:22 Re: pg_amcheck contrib application
Previous Message Alvaro Herrera 2021-03-24 14:11:41 psql lacking clearerr()