Re: Adding SMGR discriminator to buffer tags

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Shawn Debnath <sdn(at)amazon(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Adding SMGR discriminator to buffer tags
Date: 2019-07-15 10:58:16
Message-ID: CA+hUKG+8awhBh0ZZ9=2rRuXj8MA4JF2AKo5fqMxNrOX0kN8Ctw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 12, 2019 at 11:19 AM Shawn Debnath <sdn(at)amazon(dot)com> wrote:
> On Fri, Jul 12, 2019 at 10:16:21AM +1200, Thomas Munro wrote:
> +
> + /* TODO: How should we handle other smgr IDs? */
> + if (smgrid != SMGR_MD)
> continue;
>
> All files are copied verbatim from source to target except for relation
> files. So this would include slru data and undo data. From what I read
> in the docs, I do not believe we need any special handling for either
> new SMGRs and your current code should suffice.
>
> process_block_change() is very relation specific so if different
> handling is required by different SMGRs, it would make sense to call on
> smgr specific functions instead.

Right. And since undo and slru etc data will be WAL-logged with block
references, it's entirely possible to teach it to scan them properly,
though it's not clear whether it's worth doing that. Ok, good, TODO
removed.

> Can't wait for the SMGR_MD to SMGR_REL change :-) It will make
> understanding this code a tad bit easier.

Or could we retrofit different words that start with M and D?

Here's a new version of the patch set (ie the first 3 patches in the
undo patch set, and the part that I think you need for slru work),
this time with the pg_buffercache changes as a separate commit since
it's somewhat independent and has a different (partial) reviewer.

I was starting to think about whether I might be able to commit these,
but now I see that this increase in WAL size is probably not
acceptable:

@@ -727,6 +734,8 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
}
if (!samerel)
{
+ memcpy(scratch, &regbuf->smgrid, sizeof(SmgrId));
+ scratch += sizeof(SmgrId);
memcpy(scratch, &regbuf->rnode, sizeof(RelFileNode));
scratch += sizeof(RelFileNode);
}

@@ -1220,8 +1221,10 @@ DecodeXLogRecord(XLogReaderState *state,
XLogRecord *record, char **errormsg)
}
if (!(fork_flags & BKPBLOCK_SAME_REL))
{
+ COPY_HEADER_FIELD(&blk->smgrid, sizeof(SmgrId));
COPY_HEADER_FIELD(&blk->rnode,
sizeof(RelFileNode));
rnode = &blk->rnode;
+ smgrid = blk->smgrid;
}

That's an enum, so it works out to a word per record. The obvious way
to avoid increasing the size is shove the SMGR ID into the same space
that holds the forknum. Unlike BufferTag, where forknum currently
swims in 32 bits which this patch chops in half, XLogRecorBlockHeader
is already crammed into a uint8 fork_flags of which it has only the
lower nibble, and the upper nibble is used for eg BKP_BLOCK_xxx flag
bits, and there isn't even a spare bit to say 'has non-zero SMGR ID'.
Rats. I suppose I could change it to a byte. I wonder if one extra
byte per WAL record is acceptable. Anyone?

--
Thomas Munro
https://enterprisedb.com

Attachment Content-Type Size
0001-Add-SmgrId-to-smgropen-and-BufferTag-v3.patch application/octet-stream 57.1 KB
0002-Add-smgrid-column-to-pg_buffercache-v3.patch application/octet-stream 9.0 KB
0003-Move-tablespace-dir-creation-from-smgr.c-to-md.c-v3.patch application/octet-stream 2.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Westermann (DWE) 2019-07-15 11:01:00 Documentation fix for adding a column with a default value
Previous Message Paul Guo 2019-07-15 10:52:01 Re: standby recovery fails (tablespace related) (tentative patch and discussion)