Re: In-placre persistance change of a relation

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: Jakub(dot)Wartak(at)tomtom(dot)com
Cc: tsunakawa(dot)takay(at)fujitsu(dot)com, osumi(dot)takamichi(at)fujitsu(dot)com, sfrost(at)snowman(dot)net, masao(dot)fujii(at)oss(dot)nttdata(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: In-placre persistance change of a relation
Date: 2021-12-20 08:39:27
Message-ID: 20211220.173927.2048573811665828411.horikyota.ntt@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Mon, 20 Dec 2021 07:59:29 +0000, Jakub Wartak <Jakub(dot)Wartak(at)tomtom(dot)com> wrote in
> Hi Kyotaro, I'm glad you are still into this
>
> > I didn't register for some reasons.
>
> Right now in v8 there's a typo in ./src/backend/catalog/storage.c :
>
> storage.c: In function 'RelationDropInitFork':
> storage.c:385:44: error: expected statement before ')' token
> pending->unlink_forknum != INIT_FORKNUM)) <-- here, one ) too much

Yeah, I thought that I had removed it. v9 patch I believe is correct.

> > 1. I'm not sure that we want to have the new mark files.
>
> I can't help with such design decision, but if there are doubts maybe then add checking return codes around:
> a) pg_fsync() and fsync_parent_path() (??) inside mdcreatemark()
> b) mdunlinkmark() inside mdunlinkmark()
> and PANIC if something goes wrong?

The point is it is worth the complexity it adds. Since the mark file
can resolve another existing (but I don't recall in detail) issue and
this patchset actually fixes it, it can be said to have a certain
extent of persuasiveness. But that doesn't change the fact that it's
additional complexity.

> > 2. Aside of possible bugs, I'm not confident that the crash-safety of
> > this patch is actually water-tight. At least we need tests for some
> > failure cases.
> >
> > 3. As mentioned in transam/README, failure in removing smgr mark files
> > leads to immediate shut down. I'm not sure this behavior is acceptable.
>
> Doesn't it happen for most of the stuff already? There's even data_sync_retry GUC.

Hmm. Yes, actually it is "as water-tight as possible". I just want
others' eyes on that perspective. CF could be the entry point of
others but I'm a bit hesitent to add a new entry..

> > 4. Including the reasons above, this is not fully functionally.
> > For example, if we execute the following commands on primary,
> > replica dones't work correctly. (boom!)
> >
> > =# CREATE UNLOGGED TABLE t (a int);
> > =# ALTER TABLE t SET LOGGED;
> >
> >
> > The following fixes are done in the attched v8.
> >
> > - Rebased. Referring to Jakub and Justin's work, I replaced direct
> > access to ->rd_smgr with RelationGetSmgr() and removed calls to
> > RelationOpenSmgr(). I still separate the "ALTER TABLE ALL IN
> > TABLESPACE SET LOGGED/UNLOGGED" statement part.
> >
> > - Fixed RelationCreate/DropInitFork's behavior for non-target
> > relations. (From Jakub's work).
> >
> > - Fixed wording of some comments.
> >
> > - As revisited, I found a bug around recovery. If the logged-ness of a
> > relation gets flipped repeatedly in a transaction, duplicate
> > pending-delete entries are accumulated during recovery and work in a
> > wrong way. sgmr_redo now adds up to one entry for a action.
> >
> > - The issue 4 above is not fixed (yet).
>
> Thanks again, If you have any list of crush tests ideas maybe I'll have some minutes
> to try to figure them out. Is there is any goto list of stuff to be checked to add confidence
> to this patch (as per point #2) ?

Just causing a crash (kill -9) after executing problem-prone command
sequence, then seeing recovery works well would sufficient.

For example:

create unlogged table; begin; insert ..; alter table set logged;
<crash>. Recovery works.

"create logged; begin; {alter unlogged; alter logged;} * 1000; alter
logged; commit/abort" doesn't pollute pgdata.

> BTW fast feedback regarding that ALTER patch (there were 4 unlogged tables):
> # ALTER TABLE ALL IN TABLESPACE tbs1 set logged;
> WARNING: unrecognized node type: 349

lol I met a server crash. Will fix. Thanks!

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Gunnar "Nick" Bluth 2021-12-20 08:43:44 Re: [PATCH] pg_stat_toast
Previous Message Jakub Wartak 2021-12-20 07:59:29 RE: In-placre persistance change of a relation