Re: MaxOffsetNumber for Table AMs

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: MaxOffsetNumber for Table AMs
Date: 2021-04-30 17:56:08
Message-ID: CA+TgmobU68bat=x=WjnkHPBzvOUz+wnoEu6u03VYRkHiwoAqYA@mail.gmail.com
Lists: pgsql-hackers

On Fri, Apr 30, 2021 at 1:37 PM Jeff Davis <pgsql(at)j-davis(dot)com> wrote:
> The particular problem I have now is that Table AMs seem to support
> indexes just fine, but TIDs are under-specified so I don't know what I
> really have to work with. BlockNumber seems well-specified as
> 0..0XFFFFFFFE (inclusive), but I don't know what the valid range of
> OffsetNumber is for the purposes of a TableAM.

I agree that this is a problem.
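
To make the gap concrete, here is a minimal sketch of how a table AM
might map a logical row number onto a 6-byte TID (32-bit block number
plus 16-bit offset number). All names are hypothetical stand-ins, not
PostgreSQL's ItemPointerData API, and the offset limit of 2048 is purely
an assumption: the usable offset range is exactly what is not specified.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a 6-byte TID; not PostgreSQL's ItemPointerData. */
typedef struct DemoTid
{
    uint32_t block;   /* valid range: 0..0xFFFFFFFE (inclusive) */
    uint16_t offset;  /* ASSUMED valid range: 1..DEMO_MAX_OFFSET */
} DemoTid;

#define DEMO_MAX_BLOCK  0xFFFFFFFEu
#define DEMO_MAX_OFFSET 2048u   /* assumption; the real limit is unspecified */

/* Map a zero-based logical row number to a TID; false if it does not fit. */
static bool
demo_tid_from_rownum(uint64_t rownum, DemoTid *tid)
{
    uint64_t block = rownum / DEMO_MAX_OFFSET;

    if (block > DEMO_MAX_BLOCK)
        return false;

    tid->block = (uint32_t) block;
    tid->offset = (uint16_t) (rownum % DEMO_MAX_OFFSET + 1); /* offsets start at 1 */
    return true;
}

/* Inverse mapping: recover the logical row number from a TID. */
static uint64_t
demo_rownum_from_tid(DemoTid tid)
{
    return (uint64_t) tid.block * DEMO_MAX_OFFSET + (uint64_t) (tid.offset - 1);
}
```

The number of addressable rows scales directly with DEMO_MAX_OFFSET, so
a table AM author cannot size this mapping without knowing which offset
values the rest of the system will accept.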

> Part of changing to uint64 would be specifying the TIDs in a way that I
> could rely on in the future.

I mean, from my perspective, the problem here is that the abstraction
layer is leaky and things outside of the table AM layer know what heap
is doing under the hood, and rely on it. If we could refactor the
abstraction to be less leaky, it would be clearer what assumptions
table AM authors can make. If we can't, any specification doesn't seem
worth much.

> In the future we may support primary unique indexes at the table AM
> layer, which would get more interesting. I can see an argument for a
> TID being an arbitrary datum in that case, but I haven't really
> considered the design implications. Is this what you are suggesting?

I think that would be the best long-term plan. I guess I have two
distinguishable concerns. One is that I want to be able to have
indexes with a payload that's not just a 6-byte TID; e.g. adding a
partition identifier to support global indexes, or replacing the
6-byte TID with a primary key reference to support indirect indexes.
The other concern is to be able to have table AMs that use arbitrary
methods to identify a tuple. For example, if somebody implemented an
index-organized table, the "TID" would really be the primary key.

Even though these are distinguishable concerns, they basically point
in the same direction as far as index layout is concerned. The
implications for the table AM layer are somewhat different in the two
cases, but both argue that some places that are now talking about TIDs
should be changed to talk about Datums or something of that sort.
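
As a rough illustration only, a generalized tuple identifier covering
the cases above could be sketched as a tagged union. Every type and
function name here is hypothetical; nothing like this exists in
PostgreSQL today, and a real design would more likely be expressed in
terms of Datums.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical kinds of tuple identifier an index entry might carry. */
typedef enum DemoIdKind
{
    DEMO_ID_TID,            /* today's 6-byte block/offset pair */
    DEMO_ID_PARTITION_TID,  /* partition identifier + TID (global index) */
    DEMO_ID_KEY             /* opaque primary key bytes (indirect index, IOT) */
} DemoIdKind;

typedef struct DemoTupleId
{
    DemoIdKind kind;
    union
    {
        struct { uint32_t block; uint16_t offset; } tid;
        struct { uint32_t partition; uint32_t block; uint16_t offset; } ptid;
        struct { const void *data; size_t len; } key;
    } u;
} DemoTupleId;

/* Compare two identifiers; identifiers of different kinds never match. */
static bool
demo_tuple_id_equal(const DemoTupleId *a, const DemoTupleId *b)
{
    if (a->kind != b->kind)
        return false;

    switch (a->kind)
    {
        case DEMO_ID_TID:
            return a->u.tid.block == b->u.tid.block &&
                   a->u.tid.offset == b->u.tid.offset;
        case DEMO_ID_PARTITION_TID:
            return a->u.ptid.partition == b->u.ptid.partition &&
                   a->u.ptid.block == b->u.ptid.block &&
                   a->u.ptid.offset == b->u.ptid.offset;
        case DEMO_ID_KEY:
            if (a->u.key.len != b->u.key.len)
                return false;
            return a->u.key.len == 0 ||
                   memcmp(a->u.key.data, b->u.key.data, a->u.key.len) == 0;
    }
    return false;
}
```

Code that handles only the DEMO_ID_TID case is the kind of place that
currently talks about TIDs and would have to change.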

--
Robert Haas
EDB: http://www.enterprisedb.com
