Re: snapbuild woes

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: snapbuild woes
Date: 2017-02-22 02:05:42
Message-ID: 7286ebe9-cb1b-cf1a-f85b-fbe3402ea0bf@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 13/12/16 00:38, Petr Jelinek wrote:
> On 12/12/16 23:33, Andres Freund wrote:
>> On 2016-12-12 23:27:30 +0100, Petr Jelinek wrote:
>>> On 12/12/16 22:42, Andres Freund wrote:
>>>> Hi,
>>>>
>>>> On 2016-12-10 23:10:19 +0100, Petr Jelinek wrote:
>>>>> Hi,
>>>>> First one is outright bug, which has to do with how we track running
>>>>> transactions. What snapbuild basically does while doing initial snapshot
>>>>> is read the xl_running_xacts record, store the list of running txes and
>>>>> then wait until they all finish. The problem with this is that
>>>>> xl_running_xacts does not ensure that it only logs transactions that are
>>>>> actually still running (to avoid locking PGPROC) so there might be xids
>>>>> in xl_running_xacts that already committed before it was logged.
>>>>
>>>> I don't think that's actually true? Notice how LogStandbySnapshot()
>>>> only releases the lock *after* the LogCurrentRunningXacts() iff
>>>> wal_level >= WAL_LEVEL_LOGICAL. So the explanation for the problem you
>>>> observed must actually be a bit more complex :(
>>>>
>>>
>>> Hmm, interesting, I did see the transaction commit in the WAL before the
>>> xl_running_xacts that contained the xid as running. I only seen it on
>>> production system though, didn't really manage to easily reproduce it
>>> locally.
>>
>> I suspect the reason for that is that RecordTransactionCommit() doesn't
>> conflict with ProcArrayLock in the first place - only
>> ProcArrayEndTransaction() does. So they're still running in the PGPROC
>> sense, just not the crash-recovery sense...
>>
>
> That looks like reasonable explanation. BTW I realized my patch needs
> bit more work, currently it will break the actual snapshot as it behaves
> same as if the xl_running_xacts was empty which is not correct AFAICS.
>

Hi,

I got to work on this again. Unfortunately I haven't found solution that
I would be very happy with. What I did is in case we read
xl_running_xacts which has all transactions we track finished, we start
tracking from that new xl_running_xacts again with the difference that
we clean up the running transactions based on previously seen committed
ones. That means that on busy server we may wait for multiple
xl_running_xacts rather than just one, but at least we have chance to
finish unlike with current coding which basically waits for empty
xl_running_xacts. I also removed the additional locking for logical
wal_level in LogStandbySnapshot() since it does not work.

I also identified another bug in snapbuild while looking at the code.
That is the logical decoding will try to use on disk serialized snapshot
for initial snapshot export when it can. The problem is that these
snapshots are quite special and are not really usable as snapshots for
data (ie, the logical decoding snapshots regularly have xmax smaller
than xmin). So then when client tries to use this exported snapshot it
gets completely wrong data as the snapshot is broken. I think this is
explanation for Erik Rijker's problems with the initial COPY patch for
logical replication. At least for me the issues go away when I disable
use of the ondisk snapshots.

I didn't really find better solution than that though (disabling the use
of ondisk snapshots for initial consistent snapshot).

So to summarize attached patches:
0001 - Fixes performance issue where we build tons of snapshots that we
don't need which kills CPU.

0002 - Disables the use of ondisk historical snapshots for initial
consistent snapshot export as it may result in corrupt data. This
definitely needs backport.

0003 - Fixes bug where we might never reach snapshot on busy server due
to race condition in xl_running_xacts logging. The original use of extra
locking does not seem to be enough in practice. Once we have agreed fix
for this it's probably worth backpatching. There are still some comments
that need updating, this is more of a PoC.

Thoughts or better ideas?

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
0001-Skip-unnecessary-snapshot-builds.patch text/x-patch 6.2 KB
0002-Don-t-use-on-disk-snapshots-for-snapshot-export-in-l.patch text/x-patch 2.7 KB
0003-Fix-xl_running_xacts-usage-in-snapshot-builder.patch text/x-patch 6.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2017-02-22 02:07:19 Re: Speedup twophase transactions
Previous Message Amit Langote 2017-02-22 01:56:54 Re: dropping partitioned tables without CASCADE