Re: Status of the table access method work

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers(at)postgresql(dot)org
Cc: Haribabu Kommi <kommi(dot)haribabu(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: Status of the table access method work
Date: 2019-04-08 11:53:53
Message-ID: d0fc97bd-7ec8-2388-e4a6-0fda86d71a43@iki.fi
Lists: pgsql-hackers

On 05/04/2019 23:25, Andres Freund wrote:
> I think what's in v12 - I don't know of any non-cleanup / bugfix work
> pending for 12 - is a pretty reasonable initial set of features.

Hooray!

> - the (optional) bitmap heap scan API - that's fairly intrinsically
> block based. An AM could just internally subdivide TIDs in a different
> way, but I don't think a bitmap scan like we have would e.g. make a
> lot of sense for an index oriented table without any sort of stable
> tid.

If an AM doesn't implement the bitmap heap scan API, what happens?
Bitmap scans are disabled?

Even if an AM isn't block-oriented, the bitmap heap scan API still makes
sense as long as there's some correlation between TIDs and physical
location. The only really broken thing about that currently is the
prefetching: nodeBitmapHeapScan.c calls PrefetchBuffer() directly with
the TID's block numbers. It would be pretty straightforward to wrap that
in a callback, so that the AM could do something different.
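
As a rough illustration of the callback idea, here is a standalone sketch.
It is not PostgreSQL code: BitmapAmSketch, bm_prefetch_fn,
bitmap_prefetch_block() and the heap-like stand-in are all hypothetical
names invented for this example. It only shows the shape of the
indirection, with the executor calling through the AM rather than calling
PrefetchBuffer() itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t BlockNumber;	/* 32 bits, like PostgreSQL's BlockNumber */

/* Hypothetical callback: the AM decides what prefetching "block" blkno means. */
typedef void (*bm_prefetch_fn) (void *am_private, BlockNumber blkno);

/* Hypothetical slice of a table AM's callback table. */
typedef struct BitmapAmSketch
{
	bm_prefetch_fn prefetch;	/* NULL: this AM does no prefetching */
	void	   *am_private;		/* AM-specific state, opaque to the caller */
} BitmapAmSketch;

/*
 * Stand-in for a heap-like AM. A real heap AM would issue the buffer
 * prefetch here; this sketch only records the request.
 */
typedef struct HeapLikeState
{
	int			nrequests;
	BlockNumber last_blkno;
} HeapLikeState;

static void
heap_like_prefetch(void *am_private, BlockNumber blkno)
{
	HeapLikeState *state = (HeapLikeState *) am_private;

	state->nrequests++;
	state->last_blkno = blkno;
}

/* What the executor node would call instead of prefetching directly. */
static void
bitmap_prefetch_block(const BitmapAmSketch *am, BlockNumber blkno)
{
	assert(am != NULL);
	if (am->prefetch != NULL)
		am->prefetch(am->am_private, blkno);
}
```

An AM that has nothing useful to prefetch would simply leave the callback
NULL, and one with a different physical layout could translate the block
number into whatever it actually needs to read.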

Or move even more of the logic to the AM, so that the AM would get the
whole TIDBitmap in table_beginscan_bm(). It could then implement the
fetching and prefetching as it sees fit.

I don't think it's urgent, though. We can cross that bridge when we get
there, with the first AM that needs that flexibility.

> The most constraining factor for storage, I think, is that currently the
> API relies on ItemPointerData style TIDs in a number of places (i.e. a 6
> byte tuple identifier).

I think 48 bits would be just about enough, but it's even more limited
than you might think at the moment. There are a few places that assume that
the offsetnumber <= MaxHeapTuplesPerPage. See ginpostinglist.c, and
MAX_TUPLES_PER_PAGE in tidbitmap.c. Also, offsetnumber can't be 0,
because that makes the ItemPointer invalid, which is inconvenient if you
tried to use ItemPointer as just an arbitrary 48-bit integer.
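
To make those two restrictions concrete, here is a standalone sketch of
squeezing an arbitrary integer into a TID-shaped value while keeping the
offset in the range 1..MaxHeapTuplesPerPage. It is not PostgreSQL code:
TidSketch, SKETCH_MAX_OFFSET and both functions are hypothetical, and the
value 291 is an assumption (what I'd expect MaxHeapTuplesPerPage to be with
the default 8 kB block size).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed upper bound on the offset; stands in for MaxHeapTuplesPerPage. */
#define SKETCH_MAX_OFFSET 291

/* Number of distinct values that fit: 2^32 blocks times the usable offsets. */
#define SKETCH_TID_COUNT ((uint64_t) SKETCH_MAX_OFFSET << 32)

/* Hypothetical stand-in for ItemPointerData: 32-bit block, 16-bit offset. */
typedef struct TidSketch
{
	uint32_t	block;
	uint16_t	offset;
} TidSketch;

/*
 * Map n to a TID whose offset is never 0 and never exceeds the bound.
 * Returns false if n does not fit in the restricted space.
 */
static bool
tid_from_uint64(uint64_t n, TidSketch *tid)
{
	if (n >= SKETCH_TID_COUNT)
		return false;
	tid->block = (uint32_t) (n / SKETCH_MAX_OFFSET);
	tid->offset = (uint16_t) (n % SKETCH_MAX_OFFSET + 1);	/* 1-based */
	return true;
}

/* Inverse of tid_from_uint64(). */
static uint64_t
tid_to_uint64(TidSketch tid)
{
	assert(tid.offset >= 1 && tid.offset <= SKETCH_MAX_OFFSET);
	return (uint64_t) tid.block * SKETCH_MAX_OFFSET + (tid.offset - 1);
}
```

Under those assumptions the usable space is 291 * 2^32 values, only a
little more than 2^40, rather than the 2^48 that six bytes would suggest.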

- Heikki
