Re: MaxOffsetNumber for Table AMs

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc:	Peter Geoghegan <pg(at)bowt(dot)ie>, Andres Freund <andres(at)anarazel(dot)de>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: MaxOffsetNumber for Table AMs
Date:	2021-05-06 01:07:12
Message-ID:	CA+TgmoZ0S5zU4OpBxQvJ_ifu1LDcvc1z6i=XAXnnM29GvB6Hfw@mail.gmail.com
Lists:	pgsql-hackers

On Wed, May 5, 2021 at 3:43 PM Matthias van de Meent
<boekewurm+postgres(at)gmail(dot)com> wrote:
> I believe that it cannot be "just" an additive thing, at least not
> through a normal INCLUDEd column, as you'd get duplicate TIDs in the
> index, with its related problems. You also cannot add it as a key
> column, as this would disable UNIQUE indexes; one of the largest use
> cases of global indexes. So, you must create specialized
> infrastructure for this identifier.
>
> And when we're already adding specialized infrastructure, then this
> should probably be part of a new TID infrastructure.
>
> And if we're going to change TID infrastructure to allow for more
> sizes (as we'd need normal TableAM TIDs, and global index
> partition-identifying TIDs), I'd argue that it should not be too much
> more difficult to create an infrastructure for 'new TID' in which the
> table AM supplies type, size and strict ordering information for these
> 'new TID's.
>
> And if this 'new TID' size is not going to be defined by the index AM
> but by the indexed object (be it a table or a 'global' or whatever
> we'll build indexes on), I see no reason why this 'new TID'
> infrastructure couldn't eventually support variable length TIDs; or
> constant sized usertype TIDs (e.g. the 3 int columns of the primary
> key of a clustered table).
>
> The only requirements that I believe to be fundamental for any kind of TID are
>
> 1.) Uniqueness during the lifecycle of the tuple, from creation to
> life to dead to fully dereferenced from all indexes;
> 2.) There exists a strict ordering of all TIDs of that type;
>
> And maybe to supply some form of efficiency to the underlying tableAM:
>
> 3.) There should be an equivalent of bitmap for that TID type.
>
> For the nbtree deduplication subsystem, and for gin posting lists to
> be able to work efficiently, the following must also hold:
>
> 4.) The TID type has a fixed size, preferably efficiently packable.
>
> Only the last requirement cannot be met with varlena TID types. But,
> as I also believe that not all indexes can be expected to work (well)
> for all kinds of TableAM, I don't see how this would be a blocking
> issue.

+1 to all of that.
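
As a rough illustration of requirements 2 and 4 above, here is a minimal sketch of a fixed-size, strictly ordered TID encoding. It assumes a 32-bit block number plus a 16-bit offset packed into 6 bytes, which loosely mirrors the size of the current heap TID; the names pack_tid, unpack_tid and TID_SIZE are hypothetical and not part of any PostgreSQL API. Using big-endian unsigned fields means plain byte comparison gives the same order as comparing (block, offset) pairs.

```python
import struct
from typing import Tuple

# Hypothetical fixed TID size: 4-byte block number + 2-byte offset number.
TID_SIZE = 6


def pack_tid(block: int, offset: int) -> bytes:
    """Pack (block, offset) into 6 bytes whose byte order matches TID order."""
    if not (0 <= block < 2**32 and 0 <= offset < 2**16):
        raise ValueError("block or offset out of range for this sketch")
    # ">IH" = big-endian, unsigned 32-bit then unsigned 16-bit, no padding.
    return struct.pack(">IH", block, offset)


def unpack_tid(raw: bytes) -> Tuple[int, int]:
    """Inverse of pack_tid."""
    if len(raw) != TID_SIZE:
        raise ValueError("unexpected TID length")
    block, offset = struct.unpack(">IH", raw)
    return block, offset


# Sorting the packed form sorts by (block, offset): a strict total order.
tids = [pack_tid(2, 0), pack_tid(1, 65535), pack_tid(1, 2)]
ordered = [unpack_tid(t) for t in sorted(tids)]
```

A varlena TID could still satisfy requirements 1 and 2 with a suitable comparator, but it would give up the fixed-width packing that this sketch relies on.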

> Storage gains for index-oriented tables can become as large as the
> size of the primary key by not having to store all primary key values
> in both the index and the table; which can thus be around 100% of a
> table in the least efficient cases of having a PK over all columns.
>
> Yes, this might be indeed only a 'small gain' for access latency, but
> not needing to store another copy of your data (and keeping it in
> cache, etc.) is a significant win in my book.

This is a really good point. Also, if the table is ordered by a
synthetic logical TID, range scans on the primary key will be less
efficient than if the primary key is itself the TID. We have the
ability to CLUSTER on an index for good reasons, and "Automatically
maintain clustering on a table" has been on the todo list forever.
It's hard to imagine this will ever be achieved with the current heap,
though: the way to get there is to have a table AM for which this is
an explicit goal.
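
The range-scan point can be sketched with a toy model. It assumes 100 rows, 10 rows per page, and a made-up arrival order for the synthetic-TID case; none of these numbers come from a real workload, and pages_touched is a hypothetical helper. It only counts how many distinct pages hold keys in a given primary-key range under each layout.

```python
ROWS_PER_PAGE = 10  # assumed page capacity for this toy model
N = 100             # assumed number of rows


def pages_touched(layout, lo, hi):
    """Count distinct pages holding keys in [lo, hi].

    layout[i] is the primary-key value stored at physical position i.
    """
    return len({
        pos // ROWS_PER_PAGE
        for pos, key in enumerate(layout)
        if lo <= key <= hi
    })


# Table ordered by a synthetic TID: rows sit in arrival order, which is
# unrelated to key order (a fixed permutation here, so the run is repeatable).
arrival_order = [(i * 37) % N for i in range(N)]

# Table whose TID is the primary key: rows sit in key order.
key_order = sorted(arrival_order)

clustered_pages = pages_touched(key_order, 20, 29)
synthetic_pages = pages_touched(arrival_order, 20, 29)
```

In this model the key-ordered layout reads a single page for keys 20 through 29, while the arrival-ordered layout has to visit most of the table's pages for the same ten rows.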

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Re: MaxOffsetNumber for Table AMs at 2021-05-05 19:43:01 from Matthias van de Meent

Responses

Re: MaxOffsetNumber for Table AMs at 2021-05-06 01:32:28 from Hannu Krosing
