Re: Make relfile tombstone files conditional on WAL level

From: Amul Sul <sulamul(at)gmail(dot)com>
To: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Make relfile tombstone files conditional on WAL level
Date: 2022-05-12 10:57:20
Message-ID: CAAJ_b97nhXKba_xSEM+a_=9BMXohJCY6U9GRL8N4DFan1iAV+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Dilip,

On Fri, Mar 4, 2022 at 11:07 AM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
>
> On Mon, Feb 21, 2022 at 1:21 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Thu, Jan 6, 2022 at 1:43 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > 2) GetNewRelFileNode() will not loop for checking the file existence
> > > and retry with other relfilenode.
> >
> >
> > Open Issues- there are currently 2 open issues in the patch 1) Issue
> > as discussed above about removing the loop, so currently in this patch
> > the loop is removed. 2) During upgrade from the previous version we
> > need to advance the nextrelfilenode to the current relfilenode we are
> > setting for the object in order to avoid the conflict.
>
>
> In this version I have fixed both of these issues. Thanks Robert for
> suggesting the solution for both of these problems in our offlist
> discussion. Basically, for the first problem we can flush the xlog
> immediately because we are actually logging the WAL every time after
> we allocate 64 relfilenode so this should not have much impact on the
> performance and I have added the same in the comments. And during
> pg_upgrade, whenever we are assigning the relfilenode as part of the
> upgrade we will set that relfilenode + 1 as nextRelFileNode to be
> assigned so that we never generate the conflicting relfilenode.
>
> The only part I do not like in the patch is that before this patch we
> could directly access the buftag->rnode. But since now we are not
> having directly relfilenode as part of the buffertag and instead of
> that we are keeping individual fields (i.e. dbOid, tbsOid and relNode)
> in the buffer tag. So if we have to directly get the relfilenode we
> need to generate it. However those changes are very limited to just 1
> or 2 file so maybe not that bad.
>

v5 patch needs a rebase and here are a few comments for 0002, I found
while reading that, hope that helps:

+/* Number of RelFileNode to prefetch (preallocate) per XLOG write */
+#define VAR_RFN_PREFETCH 8192
+

Should it be 64, as per comment in XLogPutNextRelFileNode for XLogFlush() ?
---

+ /*
+ * Check for the wraparound for the relnode counter.
+ *
+ * XXX Actually the relnode is 56 bits wide so we don't need to worry about
+ * the wraparound case.
+ */
+ if (ShmemVariableCache->nextRelNode > MAX_RELFILENODE)

Very rare case, should use unlikely()?
---

+/*
+ * Max value of the relfilnode. Relfilenode will be of 56bits wide for more
+ * details refer comments atop BufferTag.
+ */
+#define MAX_RELFILENODE ((((uint64) 1) << 56) - 1)

Should there be 57-bit shifts here? Instead, I think we should use
INT64CONST(0xFFFFFFFFFFFFFF) to be consistent with PG_*_MAX
declarations, thoughts?
---

+ /* If we run out of logged for use RelNode then we must log more */
+ if (ShmemVariableCache->relnodecount == 0)

Might relnodecount never go below, but just to be safer should check
<= 0 instead.
---

Few typos:
Simmialr
Simmilar
agains
idealy

Regards,
Amul

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Valeriy Meleshkin 2022-05-12 11:02:12 Reproducible coliisions in jsonb_hash
Previous Message Amit Kapila 2022-05-12 08:55:36 Re: First draft of the PG 15 release notes