Re: Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Josh berkus <josh(at)agliodbs(dot)com>, Justin Clift <justin(at)postgresql(dot)org>, Merlin Moncure <mmoncure(at)gmail(dot)com>, PostgreSQL Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Lets (not) break all the things. Was: [pgsql-advocacy] 9.6 -> 10.0
Date: 2016-04-12 18:44:34
Message-ID: CA+TgmoYQTEoDBg8xKCv5vnVWW70Pi40vvWZo8qv0BHbTk1hNOA@mail.gmail.com
Lists: pgsql-hackers

On Tue, Apr 12, 2016 at 2:27 PM, Andres Freund <andres(at)anarazel(dot)de> wrote:
> none but 2) seem likely to require a substantial compatibility break.

And even that doesn't require one, if you keep the old system around
and make the new system optional via some sort of pluggable storage
API. Which, to me, seems like the only sensible approach from a
development perspective. If you decide to rip out the entire heapam
and replace it with something new in one fell swoop, you might as well
not bother writing the patch. It's got the same chance of being
accepted either way.

I really think the time has come that we need an API for the heap the
same way we already have for indexes. Regardless of exactly how we
choose to implement that, I think a large part of it will end up
looking similar to what we already have for FDWs. We can either use
the FDW API itself and add whatever additional methods we need for
this purpose, or copy it to a new file, rename everything, and have
two slightly different versions. AFAICS, the things we need that the
FDW API doesn't currently provide are:

1. The ability to have a local relfilenode associated with the data.
Or, ideally, several, so you have a separate set of files for each
index.

2. The ability to WAL-log changes to that relfilenode (or those
relfilenodes) in a sensible way. Not sure whether the new generic
XLOG stuff is good enough for a first go-round here or if more is
needed.

3. The ability to intercept DDL commands directed at the table and
handle them in some arbitrary way. This is really optional; people
could always provide a function-based API until we devise something
better.

4. The ability to build standard PostgreSQL indexes on top of the
data, if the underlying format still has a useful notion of CTIDs.
That is, if the underlying format is basically like our heap format,
but optimized in some way - e.g. append-only table that can't update
or delete with a smaller tuple header and page compression - then it
can reuse our indexing. If it does something else, like an
index-organized table where rows can move around to different physical
positions, then it has to provide its own indexing facilities.
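
To make the four items above a bit more concrete, here is a rough sketch
in C of what a routine struct for a pluggable heap might look like,
loosely modeled on the idea of a table of callbacks as used for FDWs.
Every name in it (StorageRoutine, TupleId, the storage_* helpers, and
the two toy handlers) is hypothetical and invented for illustration;
none of it is an existing PostgreSQL API, and the real design would
surely differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Every name below is hypothetical; none of it is an existing PostgreSQL API. */

typedef struct TupleId {            /* stand-in for a CTID: (block, offset) */
    uint32_t block;
    uint16_t offset;
} TupleId;

typedef struct StorageRoutine {
    const char *name;
    int  relfile_count;             /* item 1: local relfilenodes owned by the table */
    bool (*log_change)(int relfile, const void *data, size_t len);  /* item 2 */
    bool (*handle_ddl)(const char *command);  /* item 3: optional, may be NULL */
    bool stable_tuple_ids;          /* item 4: rows never move once stored */
    bool (*insert)(const void *tuple, size_t len, TupleId *tid_out);
} StorageRoutine;

/* Check the fields every handler is assumed to supply. */
void storage_validate(const StorageRoutine *r)
{
    assert(r != NULL);
    assert(r->relfile_count >= 1);
    assert(r->log_change != NULL);
    assert(r->insert != NULL);
}

/* Standard indexes point at tuple ids, so they only work if ids are stable. */
bool storage_can_use_standard_indexes(const StorageRoutine *r)
{
    return r->stable_tuple_ids;
}

/* Returns false when the handler does not intercept DDL at all. */
bool storage_dispatch_ddl(const StorageRoutine *r, const char *command)
{
    return r->handle_ddl != NULL && r->handle_ddl(command);
}

/* --- Toy handlers, just enough to exercise the struct --- */

bool toy_log_change(int relfile, const void *data, size_t len)
{
    (void) relfile;
    (void) data;
    return len > 0;                 /* pretend a WAL record was written */
}

bool toy_handle_ddl(const char *command)
{
    return command != NULL && command[0] != '\0';
}

static uint16_t append_offset = 0;

/* Append-only: each row gets the next slot and never moves. */
bool toy_append_insert(const void *tuple, size_t len, TupleId *tid_out)
{
    (void) tuple;
    (void) len;
    tid_out->block = 0;
    tid_out->offset = ++append_offset;
    return true;
}

/* Index-organized: the position returned is only provisional. */
bool toy_iot_insert(const void *tuple, size_t len, TupleId *tid_out)
{
    (void) tuple;
    (void) len;
    tid_out->block = 0;
    tid_out->offset = 0;
    return true;
}

const StorageRoutine append_only_routine = {
    .name = "append_only",
    .relfile_count = 1,
    .log_change = toy_log_change,
    .handle_ddl = NULL,             /* no DDL interception */
    .stable_tuple_ids = true,       /* can reuse standard indexes */
    .insert = toy_append_insert,
};

const StorageRoutine index_organized_routine = {
    .name = "index_organized",
    .relfile_count = 2,
    .log_change = toy_log_change,
    .handle_ddl = toy_handle_ddl,
    .stable_tuple_ids = false,      /* must provide its own indexing */
    .insert = toy_iot_insert,
};
```

In this sketch the append-only handler reports stable tuple ids and so
could sit underneath ordinary indexes, while the index-organized one
does not and would have to bring its own indexing. Leaving handle_ddl
as NULL corresponds to the "really optional" case in item 3.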

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
