Re: TRAP: FailedAssertion("!(TransactionIdPrecedesOrEquals

From: Erik Rijkers <er(at)xs4all(dot)nl>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: TRAP: FailedAssertion("!(TransactionIdPrecedesOrEquals
Date: 2017-12-20 06:33:45
Message-ID: f4bc19a726ac9fce7e47c69ddf018cbc@xs4all.nl
Lists: pgsql-hackers

On 2017-12-20 06:27, Michael Paquier wrote:
> On Wed, Dec 20, 2017 at 7:46 AM, Erik Rijkers <er(at)xs4all(dot)nl> wrote:

TRAP: FailedAssertion("!(TransactionIdPrecedesOrEquals(safeXid,
snap->xmin))", File: "snapbuild.c", Line: 580)

>> Sorry, that was probably too terse, I should explain that a little.
>>
>> After initing 50 instances, I set up and run a pgbench session in the
>> master
>> session; the pgbench lines are:
>>
>> init: pgbench --port=6515 --quiet --initialize --scale=1 postgres
>> run: pgbench -M prepared -c 16 -j 8 -T 1 -P 1 -n postgres -- scale 1
>>
>> the other instances then catch up. The whole takes 5 minutes or so
>>
>> I vary scale, duration, and number of instances. I haven't had it
>> fail in
>> this way yet but I mostly tried with lower number of instances (up to
>> 25 or
>> so).
>
> Hm. Are you saying that it takes at least 50 cascading instances to
> see the problem you are seeing? And that you are not seeing any
> problems with a lower number of cascading instances? Are you enabling
> hot_standby_feedback?

That sounds more definitive than I meant it, but yes, only now that I
tried a higher number of instances did I see this. But it also often
succeeds at up to 100 instances (100 is the highest I have tried).

These 50 instances were a logical replication chain, and
hot_standby_feedback is off.

Overnight I ran the test that failed yesterday 80 times: this time all 80
runs succeeded. I am not sure what makes one run fail and another succeed.

(Logical replication does the initial sync of the instances one by one,
sequentially, so the machine isn't as busy as one might expect; it just
takes a long time.)

I wrote a simple perl program to test logical replication (attached,
FWIW), running:

./cascade.pl --instances=50 --scale=1 --clients=16 --threads=8
--duration=1 --repeats=3 --waiting=10

This cascade.pl program relies on knowledge of my setup, so it probably
won't run elsewhere as is, but it shows how the failing test was done.
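
For readers without the attachment, here is a rough sketch of the kind of
setup described above. It is not taken from cascade.pl. It only builds the
command lines and SQL for a logical replication chain and does not execute
anything. The consecutive port numbering starting at 6515, the pub/sub
object names, and the use of FOR ALL TABLES are assumptions made for
illustration; the pgbench options are the ones quoted earlier in this mail.

```python
import shlex


def pgbench_commands(port=6515, scale=1, clients=16, threads=8, duration=1):
    """Build the pgbench init and run command lines as argv lists.

    The options mirror the lines quoted in the mail. The quoted run line
    carries no port (presumably taken from the environment); passing it
    explicitly here is an assumption.
    """
    init = ["pgbench", f"--port={port}", "--quiet", "--initialize",
            f"--scale={scale}", "postgres"]
    run = ["pgbench", f"--port={port}", "-M", "prepared",
           "-c", str(clients), "-j", str(threads),
           "-T", str(duration), "-P", "1", "-n", "postgres"]
    return init, run


def chain_sql(instances, base_port=6515, dbname="postgres"):
    """Return (port, sql) pairs that wire instance i to instance i-1.

    Assumptions: instance i listens on base_port + i, every instance
    except the last publishes all tables, and the tables already exist
    on each subscriber.
    """
    steps = []
    for i in range(instances):
        port = base_port + i
        if i < instances - 1:
            # Every node except the tail of the chain is a publisher.
            steps.append((port, f"CREATE PUBLICATION pub{i} FOR ALL TABLES;"))
        if i > 0:
            # Every node except the head subscribes to its predecessor.
            upstream = base_port + i - 1
            steps.append((
                port,
                f"CREATE SUBSCRIPTION sub{i} "
                f"CONNECTION 'port={upstream} dbname={dbname}' "
                f"PUBLICATION pub{i - 1};",
            ))
    return steps


if __name__ == "__main__":
    init, run = pgbench_commands()
    print(shlex.join(init))
    print(shlex.join(run))
    for port, sql in chain_sql(3):
        print(port, sql)
```

The loop emits each publication before the subscription that depends on
it, so the statements can be applied in the order returned. Publishers
also need wal_level = logical, which this sketch does not configure.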

Erik

Attachment Content-Type Size
cascade.pl text/x-perl 26.0 KB
