From: | Arseny Sher <a(dot)sher(at)postgrespro(dot)ru> |
---|---|
To: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | a(dot)sher(at)postgrespro(dot)ru, k(dot)knizhnik(at)postgrespro(dot)ru, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Possible bug in logical replication. |
Date: | 2018-05-25 06:43:23 |
Message-ID: | 87a7soqlpw.fsf@ars-thinkpad |
Lists: | pgsql-hackers |
Hello,
Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> writes:
> restart_lsn stays at the beginning of a transaction until the
> transaction ends so just using restart_lsn allows repeated
> decoding of a transaction, in short, rewinding occurs. The
> function works only for inactive slot so the current code works
> fine on this point.
Sorry, I do not follow. restart_lsn is advanced whenever there is a
consistent snapshot dumped (in xl_running_xacts) which is old enough to
wholly decode all xacts not yet confirmed by the client. Could you
please elaborate on what is wrong with that?
> In addition to that, restart_lsn can also be on a
> page boundary.
Do you have an example of that? restart_lsn is initially set to the WAL
insert position in ReplicationSlotReserveWal, and later it always points
to an xl_running_xacts record with a consistent snapshot dumped.
> So directly setting ctx->reader->EndRecPtr to startlsn fixes the
> problem, but I found another problem here.
There is a minor issue with the patch: slot advancement now hangs
polling for new WAL on my example from [1], most probably because we
must exit the loop once ctx->reader->EndRecPtr == moveto.
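
To make the suspected hang concrete, here is a toy model of the loop. It is
only a sketch and not the actual slot advancement code: ToyLsn, ToyReader,
toy_read_record and toy_advance are hypothetical names, end_rec_ptr merely
stands in for ctx->reader->EndRecPtr, and "no more records" stands in for
the real reader blocking while it polls for new WAL. The exit_on_equal flag
switches between a strict comparison (exit when EndRecPtr == moveto) and a
non-strict one that asks for one more record.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an LSN; not PostgreSQL's XLogRecPtr. */
typedef uint64_t ToyLsn;

/* Hypothetical reader; end_rec_ptr mimics ctx->reader->EndRecPtr. */
typedef struct ToyReader
{
    const ToyLsn *record_ends;  /* end LSN of each available record, ascending */
    size_t        nrecords;     /* number of records already written */
    size_t        next;         /* index of the next record to read */
    ToyLsn        end_rec_ptr;  /* end of the last record read */
} ToyReader;

/*
 * Read one record. Returns false when no record is available; the real
 * reader would instead wait for new WAL at this point.
 */
bool
toy_read_record(ToyReader *r)
{
    if (r->next >= r->nrecords)
        return false;
    r->end_rec_ptr = r->record_ends[r->next++];
    return true;
}

/*
 * Advance from startlsn towards moveto and return the number of records
 * read. *would_block is set when the loop asked for a record that does
 * not exist yet, i.e. where the real code would hang polling for WAL.
 */
size_t
toy_advance(ToyReader *r, ToyLsn startlsn, ToyLsn moveto,
            bool exit_on_equal, bool *would_block)
{
    size_t nread = 0;

    assert(would_block != NULL);
    *would_block = false;
    r->end_rec_ptr = startlsn;

    /* With exit_on_equal the loop stops as soon as end_rec_ptr == moveto. */
    while (exit_on_equal ? r->end_rec_ptr < moveto
                         : r->end_rec_ptr <= moveto)
    {
        if (!toy_read_record(r))
        {
            *would_block = true;
            break;
        }
        nread++;
    }
    return nread;
}
```

In this model, when moveto coincides with the end of the last written
record, the strict variant finishes cleanly, while the non-strict variant
requests a record that has not been written yet.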
> The function accepts any LSN even if it is not at the beginning
> of a record. We will see errors or crashes or infinite waiting or
> maybe any kind of trouble by such values. The moved LSN must
> always be at the "end of a record" (that is, at the start of the
> next record). The attached patch also fixes this.
Indeed, but we have these problems only if we are trying to read WAL
starting from confirmed_flush.
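
For illustration only, the "end of a record" requirement can be modelled as
follows. This is a sketch under stated assumptions, not what the patch does:
it pretends the record end positions are known up front as a sorted array,
and ToyLsn, toy_round_up_to_record_end and toy_is_record_end are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for an LSN; not PostgreSQL's XLogRecPtr. */
typedef uint64_t ToyLsn;

/*
 * Find the smallest record end that is >= lsn in the ascending array
 * ends[0..n-1]. Returns false if lsn lies beyond the last record end.
 */
bool
toy_round_up_to_record_end(const ToyLsn *ends, size_t n,
                           ToyLsn lsn, ToyLsn *out)
{
    size_t lo = 0;
    size_t hi = n;

    assert(out != NULL);

    /* Binary search for the first element that is not less than lsn. */
    while (lo < hi)
    {
        size_t mid = lo + (hi - lo) / 2;

        if (ends[mid] < lsn)
            lo = mid + 1;
        else
            hi = mid;
    }
    if (lo == n)
        return false;
    *out = ends[lo];
    return true;
}

/* True if lsn is exactly the end of some record. */
bool
toy_is_record_end(const ToyLsn *ends, size_t n, ToyLsn lsn)
{
    ToyLsn rounded;

    return toy_round_up_to_record_end(ends, n, lsn, &rounded) &&
           rounded == lsn;
}
```

An LSN in the middle of a record fails the check and rounds up to the end of
that record, which in this model is the only kind of position a reader could
safely start from.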
[1] https://www.postgresql.org/message-id/873720e4hf.fsf%40ars-thinkpad
--
Arseny Sher
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company