Re: making relfilenodes 56 bits

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: making relfilenodes 56 bits
Date: 2022-07-11 19:08:57
Message-ID: CA+TgmobwFU-D_cWAqUooUGaQrC91B5QOW79HF=n-1=ApmFh9=Q@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 11, 2022 at 2:57 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> ISTM that we should redefine pg_class_tblspc_relfilenode_index to only cover
> relfilenode - afaics there's no real connection to the tablespace
> anymore. That'd a) reduce the size of the index b) guarantee uniqueness across
> tablespaces.

Sounds like a good idea.

> I don't know where we could fit a sanity check that connects to all databases
> and detects duplicates across all the pg_class instances. Perhaps pg_amcheck?

Unless we're going to change the way CREATE DATABASE works, uniqueness
across databases is not guaranteed.

> I think that's a very good idea. My concern around doing an XLogFlush() is
> that it could lead to a lot of tiny f[data]syncs(), because not much else
> needs to be written out. But the scheme you describe would likely lead the
> XLogFlush() flushing plenty other WAL writes out, addressing that.

Oh, interesting. I hadn't considered that angle.

> Maybe the easiest fix here would be to replace the file atomically. Then we
> don't need this <= 512 byte stuff. These are done rarely enough that I don't
> think the overhead of creating a separate file, fsyncing that, renaming,
> fsyncing, would be a problem?

Anything we can reasonably do to reduce the number of places where
we're relying on things being <= 512 bytes seems like a step in the
right direction to me. It's very difficult to know whether such code
is correct, or what the probability is that crossing the 512-byte
boundary would break anything.
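
For illustration only, the sequence Andres describes (create a separate
file, fsync it, rename it over the original, fsync the directory) might be
sketched as below. This is not PostgreSQL code: the function name
replace_file_atomically and the ".tmp-" prefix are made up for the example,
and it assumes a POSIX filesystem where rename within one directory is
atomic and a directory can be opened and fsynced.

```python
import os
import tempfile


def replace_file_atomically(path: str, data: bytes) -> None:
    """Replace `path` so a reader sees either the old or the new contents.

    Hypothetical helper for illustration; assumes POSIX rename semantics.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Create the temporary file in the same directory, so the rename
    # below never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            # First fsync: make the new contents durable before they
            # become visible under the final name.
            os.fsync(tmp.fileno())
        # Atomically swap the new file into place.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Second fsync: make the rename itself durable by syncing the directory.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

With this shape the size of the payload no longer matters for atomicity,
which is the point of dropping the <= 512 byte assumption; the cost is the
extra file creation and two fsyncs per update.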

--
Robert Haas
EDB: http://www.enterprisedb.com
