Re: Sequence Access Method WIP

From: Petr Jelinek <petr(at)2ndquadrant(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: Simon Riggs <simon(at)2ndQuadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sequence Access Method WIP
Date: 2014-09-14 23:38:52
Message-ID: 5416270C.5000005@2ndquadrant.com
Lists: pgsql-hackers

On 18/11/13 11:50, Heikki Linnakangas wrote:
>
> I don't think the sequence AM should be in control of 'cached'. The
> caching is done outside the AM. And log_cnt probably should be passed to
> the _alloc function directly as an argument, ie. the server code asks
> the AM to allocate N new values in one call.
>
> I'm thinking that the alloc function should look something like this:
>
> seqam_alloc(Relation seqrel, int nrequested, Datum am_private)
>

I was looking at this a bit today and what I see is that it's not that
simple.

The minimum input that seqam_alloc needs is:
- Relation seqrel
- int64 minv, maxv, incby, bool is_cycled - these are basically options
giving info about how the new numbers are allocated (I guess some
implementations are not going to support all of those)
- bool is_called - the current built-in sequence generator behaves
differently based on it and I am not sure we can get around it (it could
perhaps be handled in the backend, independently of the AM?)
- int64 nrequested - number of requested values
- Datum am_private - current private data

In this light I agree with what Andres wrote - let's just send the whole
Form_pg_sequence object.

This also makes me think that the seqam options interface should be
passed the minv/maxv/incby/is_cycled etc. options for validation, not
just the amoptions.
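
To illustrate the validation point, here is a minimal sketch of what such
a check could look like for an AM that supports only part of the standard
options. The function name, its signature and the restrictions it enforces
are all made up for illustration; nothing like this exists in the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical validation hook for a toy sequence AM.
 * Returns NULL if the standard options are acceptable to this AM,
 * otherwise a message describing the first problem found.
 */
static const char *
toy_seqam_validate(int64_t minv, int64_t maxv, int64_t incby, bool is_cycled)
{
    if (minv > maxv)
        return "minv must not be greater than maxv";
    if (incby <= 0)
        return "this AM supports only positive increments"; /* assumed limit */
    if (is_cycled)
        return "this AM does not support cycling";          /* assumed limit */
    return NULL;
}
```

The backend would call something like this alongside the amoptions check
when the sequence is created or altered, and raise an error on non-NULL.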

> And it should return:
>
> int64 value - the first value allocated.
> int nvalues - the number of values allocated.
> am_private - updated private data.
>

More than this is needed; you need:
- int64 value - first value allocated (value to be returned)
- int64 nvalues - number of values allocated
- int64 last - last cached value (used for cached/last_value)
- int64 next - last logged value (used for wal logging)
- am_private - updated private data, must be possible to return as null

I personally don't like that we need all of "nvalues", "next" and
"last", as it makes the seqam a little too aware of the sequence
logging internals in my opinion, but I haven't found a way around it.
It's impossible for the backend to know how the AM will act around
incby/maxv/minv/cycling, so it can't really calculate these values by
itself, unless of course we fix the behavior and require seqams to
behave predictably, but that somewhat breaks the whole idea of leaving
the allocation to the seqam. Obviously it would also work to return a
list of allocated values, and then the backend could calculate "value",
"nvalues", "last" and "next" from that list by itself, but I am worried
about the performance of that approach.
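
To make the inputs and outputs listed above concrete, here is a rough
sketch of a local-style allocator. Everything in it is hypothetical:
SeqParams stands in for the relevant Form_pg_sequence fields, the ncache
argument (how many of the requested values go to the cache) is my own
assumption, am_private is left out, and only positive increments are
handled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the Form_pg_sequence fields discussed above. */
typedef struct SeqParams
{
    int64_t last_value; /* last value handed out, or the start value */
    int64_t minv;
    int64_t maxv;
    int64_t incby;      /* this sketch assumes incby > 0 */
    bool    is_cycled;
    bool    is_called;  /* false: last_value itself was not returned yet */
} SeqParams;

/* Hypothetical result structure carrying the four outputs. */
typedef struct SeqAllocResult
{
    int64_t value;      /* first value allocated (returned to the caller) */
    int64_t nvalues;    /* number of values actually allocated */
    int64_t last;       /* last value that goes into the cache */
    int64_t next;       /* last value covered by the WAL record */
} SeqAllocResult;

/* Would cur + incby exceed maxv?  Assumes cur <= maxv and incby > 0,
 * and is written so that neither branch can overflow. */
static bool
step_exceeds_max(int64_t cur, int64_t incby, int64_t maxv)
{
    return (maxv >= 0) ? (cur > maxv - incby) : (cur + incby > maxv);
}

/*
 * Allocate up to nrequested values, the first ncache of them for the cache.
 * Returns false when the sequence is exhausted and does not cycle.
 * A batch never wraps around: it stops at maxv, and cycling happens only
 * on the first value of a later batch.
 */
static bool
seq_alloc_sketch(const SeqParams *p, int64_t ncache, int64_t nrequested,
                 SeqAllocResult *out)
{
    int64_t cur;

    assert(p->incby > 0);
    assert(p->minv <= p->last_value && p->last_value <= p->maxv);
    assert(ncache >= 1 && nrequested >= ncache);

    if (!p->is_called)
        cur = p->last_value;            /* start value not consumed yet */
    else if (step_exceeds_max(p->last_value, p->incby, p->maxv))
    {
        if (!p->is_cycled)
            return false;               /* exhausted */
        cur = p->minv;                  /* wrap around */
    }
    else
        cur = p->last_value + p->incby;

    out->value = out->last = out->next = cur;
    out->nvalues = 1;

    while (out->nvalues < nrequested &&
           !step_exceeds_max(cur, p->incby, p->maxv))
    {
        cur += p->incby;
        out->nvalues++;
        out->next = cur;
        if (out->nvalues <= ncache)
            out->last = cur;
    }
    return true;
}
```

For example, with minv=1, maxv=10, incby=3 and an uncalled sequence at 1,
asking for 5 values with 2 of them cached gives value=1, nvalues=4,
last=4, next=10: the batch is cut short at maxv, which is exactly the kind
of thing the backend cannot predict without knowing the AM's rules.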

>
> The backend code handles the caching and logging of values. When it has
> exhausted all the cached values (or doesn't have any yet), it calls the
> AM's alloc function to get a new batch. The AM returns the new batch,
> and updates its private state as necessary. Then the backend code
> updates the relation file with the new values and the AM's private data.
> WAL-logging and checkpointing is the backend's responsibility.
>

Agreed here.
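
For reference, the division of labour described in the quoted paragraph
could be sketched roughly as below. This is a deliberately simplified toy:
the callback type, the SeqCache struct and the WAL counter are all
hypothetical, am_private is a plain int64 rather than a Datum, and the
cache steps by a fixed incby with no bounds or cycling, which is the very
simplification the rest of this mail argues a real AM may not allow.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical AM callback: allocate nrequested values, return the first
 * one and update the AM's private state in place. */
typedef int64_t (*seqam_alloc_fn)(int64_t nrequested, int64_t incby,
                                  int64_t *am_private);

/* Hypothetical backend-side state for one sequence. */
typedef struct SeqCache
{
    int64_t next_value;   /* next cached value to hand out */
    int64_t remaining;    /* cached values left */
    int64_t incby;
    int64_t batch;        /* how many values to request per AM call */
    int64_t am_private;   /* AM state, stored by the backend */
    int     wal_records;  /* stand-in for "backend wrote a WAL record" */
} SeqCache;

/* Toy AM: am_private simply holds the next unallocated value. */
static int64_t
toy_alloc(int64_t nrequested, int64_t incby, int64_t *am_private)
{
    int64_t first = *am_private;

    *am_private += nrequested * incby;
    return first;
}

/* Backend side: serve from the cache, call the AM only when it is empty. */
static int64_t
backend_nextval(SeqCache *c, seqam_alloc_fn alloc)
{
    int64_t v;

    assert(c->batch > 0);
    if (c->remaining == 0)
    {
        c->next_value = alloc(c->batch, c->incby, &c->am_private);
        c->remaining = c->batch;
        /* The backend, not the AM, would update the relation file and
         * WAL-log the new batch together with am_private here. */
        c->wal_records++;
    }
    v = c->next_value;
    c->next_value += c->incby;
    c->remaining--;
    return v;
}
```

With a batch of 3, four calls to backend_nextval() reach the AM twice and
the other two calls are served purely from the backend's cache.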

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
