Re: snapbuild woes

From: Petr Jelinek <petr(dot)jelinek(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Erik Rijkers <er(at)xs4all(dot)nl>, Robert Haas <robertmhaas(at)gmail(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>
Subject: Re: snapbuild woes
Date: 2017-05-02 06:55:53
Message-ID: ba7a3052-6e71-d162-d4a0-b5bdd7145c19@2ndquadrant.com
Thread:
Lists: pgsql-hackers

On 01/05/17 21:14, Andres Freund wrote:
> On 2017-05-01 11:09:44 +0200, Petr Jelinek wrote:
>> On 01/05/17 10:03, Andres Freund wrote:
>>> On 2017-05-01 03:54:49 +0200, Petr Jelinek wrote:
>
>>>> But, I still think we need to restart the tracking after new
>>>> xl_running_xacts. Reason for that is afaics any of the catalog snapshots
>>>> that we assigned to transactions at the end of SnapBuildCommitTxn might
>>>> be corrupted otherwise as they were built before we knew one of the
>>>> supposedly running txes was actually already committed and that
>>>> transaction might have done catalog changes.
>>>
>>> I'm afraid you're right. But I think this is even more complicated: The
>>> argument in your version that this can only happen once, seems to also
>>> be holey: Just imagine a pg_usleep(3000 * 1000000) right before
>>> ProcArrayEndTransaction() and enjoy the picture.
>
>> Well yes, a transaction can in theory have written its commit/abort
>> xlog record and stayed in the proc array for more than a single
>> xl_running_xacts write. But then the condition we test, that the new
>> xl_running_xacts has a bigger xmin than the previously tracked one's
>> xmax, would not be satisfied and we would not enter the relevant code
>> path yet. So I think we should not be able to get any xids we didn't
>> see. But we have to restart tracking from the beginning (after first
>> checking whether we already saw anything that the xl_running_xacts
>> considers as running), which is what my code did.
>
> But to get that correct, we'd have to not only track ->committed, but
> also somehow maintain ->aborted, and not just for the transactions in
> the original set of running transactions. That'd be fairly complicated
> and large. The reason I was trying - and it's definitely not correct as
> I had proposed - to use the original running_xacts record is that that
> only required tracking as many transaction statuses as in the first
> xl_running_xacts. Am I missing something?

Aah, now I understand; we were talking about slightly different things.
I considered the running-xacts handling to be a first step towards
tracking aborted transactions everywhere. I am not sure it's that
complicated: it would be exactly the same as the tracking of committed
transactions, except we'd do it only before we reach
SNAPBUILD_CONSISTENT. It would definitely be a larger patch, I agree,
but I can give it a try.

If you think that adding SNAPBUILD_BUILD_INITIAL_SNAPSHOT would be a
less invasive/smaller patch, I am okay with doing that for PG10. I
think we'll have to revisit tracking of aborted transactions in PG11
then though, because of the 'snapshot too large' issue when exporting;
at least I don't see any other way to fix that.

>
> The probabilistic tests catch the issues here fairly quickly, btw, if
> you run with synchronous_commit=on while pgbench is running, because
> the WAL flushes make this more likely. Run this query:
>
> SELECT account_count, teller_count, account_sum - teller_sum s
> FROM
> (
> SELECT count(*) account_count, SUM(abalance) account_sum
> FROM pgbench_accounts
> ) a,
> (
> SELECT count(*) teller_count, SUM(tbalance) teller_sum
> FROM pgbench_tellers
> ) t
>
> which, for my scale, should always return:
> ┌─────────┬─────┬───┐
> │ a │ t │ s │
> ├─────────┼─────┼───┤
> │ 2000000 │ 200 │ 0 │
> └─────────┴─────┴───┘
> but with my POC patch occasionally returns things like:
> ┌─────────┬─────┬───────┐
> │ a │ t │ s │
> ├─────────┼─────┼───────┤
> │ 2000000 │ 212 │ 37358 │
> └─────────┴─────┴───────┘
>
> which obviously shouldn't be the case.
>

Very nice (the test, not the failures ;))!

--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
