Re: Sequence Access Method WIP

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Petr Jelinek <petr(at)2ndquadrant(dot)com>, Vik Fearing <vik(at)2ndquadrant(dot)fr>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequence Access Method WIP
Date: 2015-07-28 18:11:36
Message-ID: 55B7C5D8.80502@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

So, we have this patch in the commitfest again. Let's see where we are,
and try to find a consensus on what needs to be done before this can be
committed.

On 06/17/2015 06:51 PM, Petr Jelinek wrote:
> On 2015-06-15 11:32, Vik Fearing wrote:
>> I've been looking at these patches a bit and here are some comments:
>
> Thanks for looking at this.

+1, thanks Vik.

>> As mentioned upthread, this patch isn't a seamless replacement for
>> what's already there because of the amdata field. I wasn't part of the
>> conversation of FOSDEM unfortunately, and there's not enough information
>> in this thread to know why this solution is preferred over each seqam
>> having its own table type with all the columns it needs. I see that
>> Heikki is waffling a bit between the two, and I have a fairly strong
>> opinion that amdata should be split into separate columns. The patch
>> already destroys and recreates what it needs when changing access method
>> via ALTER SEQUENCE, so I don't really see what the problem is.
>
> FOSDEM was just about agreeing that amdata is simpler after we discussed
> it with Heikki. Nothing too important you missed there I guess.
>
> I can try to summarize what are the differences:
> - amdata is somewhat simpler in terms of code for both init, alter and
> DDL, since with custom columns you have to specify them somehow and deal
> with them in catalog, also ALTER SEQUENCE USING means that there are
> going to be colums marked as deleted which produces needless waste, etc
> - amdata make it easier to change the storage model as the tuple
> descriptor is same for all sequences
> - the separate columns are much nicer from user point of view
> - my opinion is that separate columns also more nicely separate state
> from options and I think that if we move to separate storage model,
> there can be state table per AM which solves the tuple descriptor issue
> - there is probably some slight performance benefit to amdata but I
> don't think it's big enough to be important
>
> I personally have slight preference to separate columns design, but I am
> ok with both ways honestly.

Regarding the amdata issue, I'm also leaning towards set of columns.
I've felt that way all along, but not very strongly, so I relented at
some point when Andres felt strongly that a single column would be
better. But the more I think about it, the more I feel that separate
columns really would be better. As evidence, I offer this recent thread:

Tom said
(http://www.postgresql.org/message-id/8739.1436893588@sss.pgh.pa.us):
> I really don't see what's wrong with "SELECT last_value FROM sequence",
> especially since that has worked in every Postgres version since 6.x.
> Anyone slightly worried about backwards compatibility wouldn't use
> an equivalent function even if we did add one.

If we went with the single amdata column, that would break. Or we'd need
to leave last_value as a separate column anyway, and leave it unused for
sequence AMs where it's not applicable. But that's a bit ugly too.

Jim Nasby said in the same thread:
> FWIW, I think it'd be better to have a pg_sequences view that's the
> equivalent of SELECT * FROM <sequence> for every sequence in the
> database. That would let you get whatever info you needed.

Creating such a view would be difficult if all the sequences have a
different set of columns. But when you think about it, it's not really
any better with a single amdata column. You can't easily access the data
in the amdata column that way either.

Anyway, that's my opinion. Several others have weighed in to support
separate columns, too, so I think that is the consensus. Separate
columns it is.

>> There is no \d command for sequence access methods. Without querying
>> pg_seqam directly, how does one discover what's available?
>
> Good point.

Well, you can query pg_seqam. I don't think this deserves a \d command.

>> On the whole, I think this is a pretty good patchset. Aside from the
>> design decision of whether amdata is a single opaque column or a set of
>> columns, there are only a few things that need to be changed before it's
>> ready for committer, and those things are mostly documentation.
>
> Unfortunately the amdata being opaque vs set of columns is the main
> issue here.

There was discussion on another thread on how the current sequence AM
API is modeled after the indexam API, at
http://www.postgresql.org/message-id/3896.1437059303@sss.pgh.pa.us. Will
need to do something about that too.

Petr, is this enough feedback on this patch for this commitfest, or are
there some other issues you want to discuss before I mark this as returned?

- Heikki

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2015-07-28 18:16:12 Re: Shouldn't we document "don't use a mountpoint as $PGDATA"?
Previous Message Tom Lane 2015-07-28 18:01:18 Shouldn't we document "don't use a mountpoint as $PGDATA"?