Re: BUG #16125: Crash of PostgreSQL's wal sender during logical replication

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Andrey Salnikov <andrey(dot)salnikov(at)dataegret(dot)com>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #16125: Crash of PostgreSQL's wal sender during logical replication
Date: 2019-11-26 20:50:58
Message-ID: 20191126205058.d5cny7brrqyqm4et@development
Lists: pgsql-bugs

On Tue, Nov 26, 2019 at 11:35:51PM +0300, Andrey Salnikov wrote:
>Hi, I’m sorry for the late response.
>
>> On Nov 26, 2019, at 21:27, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>>
>>
>> I think having more information about the structure (tables, indexes,
>> mapping for relfilenodes) and a better idea what the transaction is
>> doing, would be helpful.
>
>Here is information about tables
>
> filenode | relation
>----------+---------------------------
> 88964815 | table1
> 88964795 | pg_toast.pg_toast_4029268 (toast table2)
> 88964792 | table2
>(1 row)
>
> Table "public.table1"
> Column  | Type     | Collation | Nullable | Default | Storage  | Stats target | Description
>---------+----------+-----------+----------+---------+----------+--------------+-------------
> column1 | jsonb    |           |          |         | extended |              |
> column3 | smallint |           | not null |         | plain    |              |
> column4 | integer  |           | not null |         | plain    |              |
> column5 | integer  |           | not null |         | plain    |              |
> column6 | integer  |           | not null |         | plain    |              |
> column7 | smallint |           | not null |         | plain    |              |
>Indexes:
> "table1_pkey" PRIMARY KEY, btree (column6, column7, column5, column4, column3) WITH (fillfactor='70')
> "table1_index1" btree (column7, column5, column4) WHERE column7 = ? OR column7 = ? OR column7 = ?
>Publications:
> "pub1"
>Replica Identity: FULL
>Options: autovacuum_vacuum_scale_factor=0.0, autovacuum_vacuum_threshold=1000, autovacuum_analyze_scale_factor=0.0, autovacuum_analyze_threshold=1000, autovacuum_enabled=true, fillfactor=100
>
> Table "public.table2"
> Column   | Type                  | Collation | Nullable | Default | Storage  | Stats target | Description
>----------+-----------------------+-----------+----------+---------+----------+--------------+-------------
> column1  | integer               |           | not null |         | plain    |              |
> column2  | jsonb                 |           |          |         | extended |              |
> column3  | jsonb                 |           |          |         | extended |              |
> column5  | bigint                |           |          |         | plain    |              |
> column6  | double precision      |           |          |         | plain    |              |
> column7  | character varying(32) |           |          |         | extended |              |
> column8  | bigint                |           |          |         | plain    |              |
> column10 | smallint              |           |          |         | plain    |              |
> column11 | bigint                |           |          |         | plain    |              |
> column12 | bigint                |           |          |         | plain    |              |
> column13 | integer               |           |          |         | plain    |              |
> column14 | bigint                |           |          |         | plain    |              |
>Indexes:
> "table2_pkey" PRIMARY KEY, btree (column1)
> "table2_index1" btree (column1, column14, column12) WITH (fillfactor='90')
> "table2_index2" btree (column11, column14, column12, column8, column1) WITH (fillfactor='50')
>Publications:
> "pub1"
>Replica Identity: FULL
>Options: autovacuum_vacuum_scale_factor=0.0, autovacuum_vacuum_threshold=1000, autovacuum_analyze_scale_factor=0.0, autovacuum_analyze_threshold=1000, autovacuum_enabled=true, fillfactor=100
>
> attrelid | attrelid | attname | atttypid | attstattarget | attlen | attnum | attndims | attcacheoff | atttypmod | attbyval | attstorage | attalign | attnotnull | atthasdef | attidentity | attisdropped | attislocal | attinhcount | attcollation | attacl | attoptions | attfdwoptions
>--------------+----------+------------------------------+----------+---------------+--------+--------+----------+-------------+-----------+----------+------------+----------+------------+-----------+-------------+--------------+------------+-------------+--------------+--------+------------+---------------
> table1 | 4029244 | tableoid | 26 | 0 | 4 | -7 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | cmax | 29 | 0 | 4 | -6 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | xmax | 28 | 0 | 4 | -5 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | cmin | 29 | 0 | 4 | -4 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | xmin | 28 | 0 | 4 | -3 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | ctid | 27 | 0 | 6 | -1 | 0 | -1 | -1 | f | p | s | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | column1 | 3802 | -1 | -1 | 1 | 0 | -1 | -1 | f | x | i | f | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | ........pg.dropped.2........ | 0 | 0 | -1 | 2 | 0 | -1 | -1 | f | x | i | f | f | | t | t | 0 | 0 | | |
> table1 | 4029244 | column3 | 21 | -1 | 2 | 3 | 0 | -1 | -1 | t | p | s | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | column4 | 23 | -1 | 4 | 4 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | column5 | 23 | -1 | 4 | 5 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | column6 | 23 | -1 | 4 | 6 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table1 | 4029244 | column7 | 21 | -1 | 2 | 7 | 0 | -1 | -1 | t | p | s | t | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | tableoid | 26 | 0 | 4 | -7 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | cmax | 29 | 0 | 4 | -6 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | xmax | 28 | 0 | 4 | -5 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | cmin | 29 | 0 | 4 | -4 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | xmin | 28 | 0 | 4 | -3 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | ctid | 27 | 0 | 6 | -1 | 0 | -1 | -1 | f | p | s | t | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column1 | 23 | -1 | 4 | 1 | 0 | -1 | -1 | t | p | i | t | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column2 | 3802 | -1 | -1 | 2 | 0 | -1 | -1 | f | x | i | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column3 | 3802 | -1 | -1 | 3 | 0 | -1 | -1 | f | x | i | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | ........pg.dropped.4........ | 0 | 0 | -1 | 4 | 0 | -1 | -1 | f | x | i | f | f | | t | t | 0 | 0 | | |
> table2 | 4029268 | column5 | 20 | -1 | 8 | 5 | 0 | -1 | -1 | t | p | d | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column6 | 701 | -1 | 8 | 6 | 0 | -1 | -1 | t | p | d | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column7 | 1043 | -1 | -1 | 7 | 0 | -1 | 36 | f | x | i | f | f | | f | t | 0 | 100 | | |
> table2 | 4029268 | column8 | 20 | -1 | 8 | 8 | 0 | -1 | -1 | t | p | d | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | ........pg.dropped.9........ | 0 | 0 | 4 | 9 | 0 | -1 | -1 | t | p | i | f | f | | t | t | 0 | 0 | | |
> table2 | 4029268 | column10 | 21 | -1 | 2 | 10 | 0 | -1 | -1 | t | p | s | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column11 | 20 | -1 | 8 | 11 | 0 | -1 | -1 | t | p | d | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column12 | 20 | -1 | 8 | 12 | 0 | -1 | -1 | t | p | d | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column13 | 23 | -1 | 4 | 13 | 0 | -1 | -1 | t | p | i | f | f | | f | t | 0 | 0 | | |
> table2 | 4029268 | column14 | 20 | -1 | 8 | 14 | 0 | -1 | -1 | t | p | d | f | f | | f | t | 0 | 0 | | |
>
>The information extracted from the WAL file with pg_waldump -s 25EE/D66F0438 -e 25EE/D6DE6F00 is in the attached file.

Can you also show how those relations map to the relfilenodes referenced
by the WAL? This should do the trick, I think:

SELECT relfilenode, relname FROM pg_class
WHERE relfilenode IN (88964795, 88964797, 88964795, 88964792,
88964798, 88964799, 88964800, 88964815);
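
As a possible alternative (only a sketch, not run against your system),
pg_filenode_relation() can resolve each filenode directly, which also
covers indexes and TOAST tables. Passing 0 as the tablespace assumes the
relations live in the database's default tablespace; a filenode that
does not match any relation comes back as NULL.

```sql
-- Resolve each relfilenode referenced by the WAL to a relation name.
-- Assumption: all relations are in the database's default tablespace,
-- which pg_filenode_relation() accepts as tablespace OID 0.
SELECT f AS filenode,
       pg_filenode_relation(0, f) AS relation
FROM unnest(ARRAY[88964795, 88964797, 88964792, 88964798,
                  88964799, 88964800, 88964815]::oid[]) AS f;
```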

Also, any idea what the transaction does? It seems it inserts 2 rows
into 88964795, then one row into 88964792, and then it deletes those new
records in the same subxact. And then it does a delete on 88964815 which
triggers the segfault.

How do you create the subtransactions? A PL/pgSQL procedure with
exception blocks, or savepoints? I'm trying to reproduce it and I'm not
sure whether those details matter.
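
For reference, the two ways of creating a subtransaction mentioned
above could look roughly like this. It is only a sketch of the pattern
seen in the WAL (insert and delete inside a subxact, then a delete on
another table): the repro_t1/repro_t2 tables, their columns and the data
are invented stand-ins, and it is not known to reproduce the crash.

```sql
-- Hypothetical stand-ins for table1/table2 (names and columns invented).
CREATE TABLE repro_t1 (id integer PRIMARY KEY, payload jsonb);
CREATE TABLE repro_t2 (id integer PRIMARY KEY, payload jsonb);
ALTER TABLE repro_t1 REPLICA IDENTITY FULL;
ALTER TABLE repro_t2 REPLICA IDENTITY FULL;
INSERT INTO repro_t1 VALUES (1, '{}'), (2, '{}');

-- Variant 1: subtransaction created with an explicit savepoint.
BEGIN;
SAVEPOINT s1;
-- The value is made large in the hope that it is stored out of line
-- (TOASTed); whether it really is depends on how well it compresses.
INSERT INTO repro_t2
SELECT 1, jsonb_build_object('k', string_agg(md5(i::text), ''))
FROM generate_series(1, 500) AS i;
DELETE FROM repro_t2 WHERE id = 1;   -- delete the new row in the same subxact
RELEASE SAVEPOINT s1;
DELETE FROM repro_t1 WHERE id = 1;   -- delete on the other table
COMMIT;

-- Variant 2: subtransaction created by a PL/pgSQL exception block.
DO $$
BEGIN
    BEGIN  -- a block with an EXCEPTION clause runs as a subtransaction
        INSERT INTO repro_t2 VALUES (2, '{"k": "v"}');
        DELETE FROM repro_t2 WHERE id = 2;
    EXCEPTION WHEN OTHERS THEN
        RAISE NOTICE 'caught: %', SQLERRM;
    END;
    DELETE FROM repro_t1 WHERE id = 2;
END;
$$;
```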

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
