Re: BUG #13725: Logical Decoding - wrong results with large transactions and unfortunate timing

From: Andres Freund <andres(at)anarazel(dot)de>
To: "Shulgin, Oleksandr" <oleksandr(dot)shulgin(at)zalando(dot)de>
Cc: ofir(dot)manor(at)gmail(dot)com, Pg Bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: Re: BUG #13725: Logical Decoding - wrong results with large transactions and unfortunate timing
Date: 2015-10-26 11:05:02
Message-ID: 20151026110502.GA620@awork2.anarazel.de
Lists: pgsql-bugs

On 2015-10-26 10:43:08 +0100, Shulgin, Oleksandr wrote:
> > Test case, script 1.
> > I clean up the previous test, create a new table and two replication slots.
> > Then, I run a script in the background (it will trigger the bug), insert
> > 100K rows, try to get them with pg_logical_slot_get_changes from the first
> > slot.
> > This works correctly every time - I get back the 100,000 rows + one BEGIN +
> > one COMMIT.
> > psql --quiet << EOF
> > SELECT pg_drop_replication_slot('test1_slot');
> > SELECT pg_drop_replication_slot('test2_slot');
> > DROP TABLE test;
> > CREATE TABLE test (id int primary key, v varchar);
> > SELECT * FROM pg_create_logical_replication_slot('test1_slot',
> > 'test_decoding');
> > SELECT * FROM pg_create_logical_replication_slot('test2_slot',
> > 'test_decoding');
> > EOF
> > ./get_loop.sh &
> > psql --quiet --tuples-only -o out1 << EOF
> > INSERT INTO test SELECT i, (i*3)::varchar FROM generate_series(1,100000)
> > i;
> > SELECT * FROM pg_logical_slot_get_changes('test1_slot', NULL, NULL);
> > EOF
> > cat --squeeze-blank out1 | wc -l
> >
> > Script 2 (get_loop.sh) continuously polls for changes in a loop of
> > pg_logical_slot_get_changes calls, using the second replication slot.
> > Since the test pg database is idle, only one call returns any rows.

If this is anything more than a reproducer: don't do that, use the
streaming protocol.
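For reference, the streaming-protocol equivalent of that polling loop would be something like the following (a sketch only — it assumes a local server and the test2_slot created in the script above, so it is untested here):

```shell
# Stream changes from the slot over the walsender protocol instead of
# repeatedly calling pg_logical_slot_get_changes() via SQL.
pg_recvlogical --dbname=postgres --slot=test2_slot --start -f out2
```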

> > for i in `seq 1 10000`; do
> > echo "SELECT * FROM pg_logical_slot_get_changes('test2_slot', NULL,
> > NULL);"
> > done | psql --quiet --tuples-only -o out2
> > cat --squeeze-blank out2 | wc -l
> >
> > However, I get 116K or 148K rows in the output (after the 100K inserts).
> > I can clearly see that the output jumps back after 16K or 48K rows,
> > starting to stream the transaction from the beginning again (see the
> > offsets in the first column):
> > ...
> > 1/68F6E1B8 | 450854 | table public.test: INSERT: id[integer]:49149
> > v[character varying]:'147447'
> > 1/68F6E248 | 450854 | table public.test: INSERT: id[integer]:49150
> > v[character varying]:'147450'
> > 1/68F6E2D8 | 450854 | table public.test: INSERT: id[integer]:49151
> > v[character varying]:'147453'
> > 1/68F6E368 | 450854 | table public.test: INSERT: id[integer]:49152
> > v[character varying]:'147456'
> > 1/68891010 | 450854 | table public.test: INSERT: id[integer]:1 v[character
> > varying]:'3'
> > 1/688910D8 | 450854 | table public.test: INSERT: id[integer]:2 v[character
> > varying]:'6'
> > 1/68891168 | 450854 | table public.test: INSERT: id[integer]:3 v[character
> > varying]:'9'
> > 1/688911F8 | 450854 | table public.test: INSERT: id[integer]:4 v[character
> > varying]:'12'
> > ...
> >
> > With a larger transaction (inserting 1M or 10M rows), it gets worse,
> > jumping back to the beginning multiple times.
> >
> > BTW - I don't know if it is a bug in pg_logical_slot_get_changes or in the
> > test_decoding output plugin that I'm using to evaluate logical decoding.
> >
> > Any fix / workaround?
> >
>
> I didn't try it, but it looks like this part of the code in
> logicalfuncs.c is to blame:
>
> static Datum
> pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm,
>                                  bool binary)
> {
> [snip]
>
>     /* compute the current end-of-wal */
>     if (!RecoveryInProgress())
>         end_of_wal = GetFlushRecPtr();
>     else
>         end_of_wal = GetXLogReplayRecPtr(NULL);
>
> [snip]
>     while ((startptr != InvalidXLogRecPtr && startptr < end_of_wal) ||
>            (ctx->reader->EndRecPtr && ctx->reader->EndRecPtr < end_of_wal))
>     {
>
> So we compute end_of_wal before entering the loop, but the WAL keeps
> growing as we read through it.

So?

> If we do it correctly, there's potential that the loop will never finish if
> the WAL grows faster than we can decode it.
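FWIW, stopping at a snapshot taken on entry is exactly what keeps the loop bounded. A toy sketch of the same pattern in plain shell (nothing here is actual backend code; the file stands in for the WAL):

```shell
# Snapshot the "end of WAL" once before the loop; decode only up to it,
# even though the log keeps growing underneath us.
log=$(mktemp)
printf 'rec1\nrec2\nrec3\n' > "$log"
end_of_wal=$(wc -l < "$log")              # snapshot once, like GetFlushRecPtr()
printf 'rec4\n' >> "$log"                 # "WAL" keeps growing while we decode
decoded=$(head -n "$end_of_wal" "$log")   # scan stops at the snapshot
echo "$decoded"                           # rec4 is left for the next call
rm -f "$log"
```

A later call would take a fresh snapshot and pick up rec4, so nothing is lost — the cutoff only guarantees each call terminates.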

> Shouldn't we also try to re-write this SRF to support
> SFRM_ValuePerCall?

Why? ValuePerCall gets materialized into a tuplestore as well, unless
you call it from the select list.
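I.e. (sketch only, reusing the slot names from the reproducer; this needs a live server, so it is untested here) the two call sites differ like so:

```shell
psql --quiet --tuples-only <<'EOF'
-- SRF in the FROM clause: the result set is materialized into a
-- tuplestore regardless of the SRF's return mode.
SELECT * FROM pg_logical_slot_peek_changes('test2_slot', NULL, NULL);
-- SRF in the select list: rows could actually be returned
-- value-per-call, without building the full tuplestore first.
SELECT pg_logical_slot_peek_changes('test2_slot', NULL, NULL);
EOF
```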

> Another (quite minor) problem I see is this:
>
> --- a/src/backend/replication/logical/logicalfuncs.c
> +++ b/src/backend/replication/logical/logicalfuncs.c
> @@ -301,7 +301,7 @@ pg_logical_slot_get_changes_guts(FunctionCallInfo fcinfo, bool confirm, bool bin
>  	upto_lsn = PG_GETARG_LSN(1);
>
>  	if (PG_ARGISNULL(2))
> -		upto_nchanges = InvalidXLogRecPtr;
> +		upto_nchanges = 0;
>  	else
>  		upto_nchanges = PG_GETARG_INT32(2);

Ugh, yes.

Greetings,

Andres Freund
