Re: Timeline following for logical slots

From: Andres Freund <andres(at)anarazel(dot)de>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Timeline following for logical slots
Date: 2016-04-04 06:39:17
Message-ID: 20160404063917.GD2431@awork2.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016-04-04 14:18:53 +0800, Craig Ringer wrote:
> I want to go over the rest of your reply in more detail, but regarding this
> and the original comment, I'm going by:
>
> if (start_lsn == InvalidXLogRecPtr)
> {
> /* continue from last position */
> start_lsn = slot->data.confirmed_flush;
> }
> ....
> ctx = StartupDecodingContext(output_plugin_options,
> start_lsn, InvalidTransactionId,
> read_page, prepare_write, do_write);
>
> ... in CreateDecodingContext(), which in turn does ...
>
> ctx->snapshot_builder =
> AllocateSnapshotBuilder(ctx->reorder, xmin_horizon, start_lsn);
>
> ... in StartupDecodingContext(...).
>
> The snapshot builder controls when the snapshot builder returns
> SNAPBUILD_CONSISTENT, therefore when we start returning decoded commits to
> the client. Right?

> We pass InvalidXLogRecPtr as the start_lsn
> in pg_logical_slot_get_changes_guts(...).
>
> So, when I wrote that:
>
> /*
> * We start reading xlog from the restart lsn, even though in
> * CreateDecodingContext we set the snapshot builder up using the
> * slot's confirmed_flush. This means we might read xlog we don't
> * actually decode rows from, but the snapshot builder might need it
> * to get to a consistent point. The point we start returning data
> to
> * *users* at is the confirmed_flush lsn set up in the decoding
> * context.
> */
>
> I don't see what's wrong there at all.

They're independent. What the snapshot builder gets as the start
position isn't the same as where we start reading WAL.

> We do "start reading xlog from the slot's restart_lsn". That's what gets
> passed to set up the xlogreader. That's where we read the first xlog
> records from.
>
> In CreateDecodingContext we _do_ "set the snapshot builder up using the
> slot's confirmed_flush" [unless overridden by explicit argument to
> CreateDecodingContext, which isn't given here].

Yes. But again, that hasn't really got anything to do with where we read
WAL from. We always have to read older WAL than were confirmed_flush is
set to; to assemble all the changes from transactions that are
in-progress at the point of confirmed_flush.

> This is presumably what you're taking issue with:
>
> * This means we might read xlog we don't
> * actually decode rows from, but the snapshot builder might need it
> * to get to a consistent point.
>
> and would be better worded as something like:
>
> This means we will read and decode xlog from before the point we actually
> start returning decoded commits to the client, but the snapshot builder may
> need this additional xlog to get to a consistent point.

> right?

No. The snapshot builder really hasn't gotten that much to do with
things. The fundamental thing we have to do is re-assemble all
in-progress transactions (as in reorderbuffer.c, not snapbuild.c).

> I think
>
> The point we start returning data to
> * *users* at is the confirmed_flush lsn set up in the decoding
> * context.
>
> is still right.
>
> What I was proposing instead is, in pg_logical_slot_get_changes_guts,
> changing:
>
> ctx = CreateDecodingContext(InvalidXLogRecPtr,
> options,
> logical_read_local_xlog_page,
> LogicalOutputPrepareWrite,
> LogicalOutputWrite);
>
> so that instead of InvalidXLogRecPtr we explicitly pass
>
> MyReplicationSlot->data.confirmed_flush;
>
> thus making it way easier to see what's going on without having to chase
> way deeper to figure out why data isn't returned from the restart_lsn.
> Where, y'know, you'd expect decoding to restart from... and where it does
> restart from, just not where it resumes sending output to the client from.

I just don't buy this. It's absolutely fundamental to understand that
the commit LSN isn't the point where all the data of transactions is
stored.

Andres

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2016-04-04 06:42:02 Re: pgsql: Avoid archiving XLOG_RUNNING_XACTS on idle server
Previous Message Craig Ringer 2016-04-04 06:36:29 Re: Timeline following for logical slots