Re: Some pgq table rewrite incompatibility with logical decoding?

From: Jeremy Finzel <finzelj(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Some pgq table rewrite incompatibility with logical decoding?
Date: 2018-08-28 17:41:54
Message-ID: CAMa1XUhw1e4=w8Wxq-FQj=ou4eujmcB2RNL7X0haDMd=Hacufw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

>
> Jeremy, are you able to reproduce the issue locally, using pgq? That would
>> be very valuable.
>
>
Tomas et al:

We have hit this error again, and we plan to snapshot the database as to be
able to do whatever troubleshooting we can. If someone could provide me
guidance as to what exactly you would like me to do, please let me know. I
am able to provide an xlog dump and also debugging information upon request.

This is actually a different database system that also uses skytools, and
the exact same table (pgq.event_58_1) is again the cause of the relfilenode
error. I did a point-in-time recovery to a point after this relfilenode
appears using pg_xlogdump, and verified this was the table that appeared,
then disappeared.

Here is the exact issue again I am having:

Provider logs:
2018-08-28 08:29:30.334
CDT,"foo_logical","foo_prod",70236,"0.0.0.0:48314",5b854e3a.1125c,2,"authentication",2018-08-28
08:29:30 CDT,289/78900643,0,LOG,00000,"replication connection authorized:
user=foo_logical SSL enabled (protocol=TLSv1.2,
cipher=ECDHE-RSA-AES256-GCM-SHA384, compression=off)",,,,,,,,,""
2018-08-28 08:29:30.366
CDT,"foo_logical","foo_prod",70236,"0.0.0.0:48314",5b854e3a.1125c,3,"idle",2018-08-28
08:29:30 CDT,289/0,0,LOG,00000,"starting logical decoding for slot
""pgl_foo_prod_providerb97b25d_foo336ddc1""","streaming transactions
committing after C0A5/EEBBD4E8, reading WAL from
C0A5/EC723AC8",,,,,,,,"bar_foo_foo_a"
2018-08-28 08:29:30.366
CDT,"foo_logical","foo_prod",70236,"0.0.0.0:48314",5b854e3a.1125c,4,"idle",2018-08-28
08:29:30 CDT,289/0,0,LOG,00000,"logical decoding found consistent point at
C0A5/EC723AC8","There are no running transactions.",,,,,,,,"bar_foo_foo_a"
2018-08-28 08:29:30.448
CDT,"foo_logical","foo_prod",70236,"0.0.0.0:48314",5b854e3a.1125c,5,"idle",2018-08-28
08:29:30 CDT,289/0,0,ERROR,XX000,"could not map filenode
""base/16418/2800559918"" to relation OID",,,,,,,,,"bar_foo_foo_a"
2018-08-28 08:29:30.463
CDT,"foo_logical","foo_prod",70236,"0.0.0.0:48314",5b854e3a.1125c,6,"idle",2018-08-28
08:29:30 CDT,289/0,0,LOG,08006,"could not receive data from client:
Connection reset by peer",,,,,,,,,"bar_foo_foo_a"
2018-08-28 08:29:30.463
CDT,"foo_logical","foo_prod",70236,"0.0.0.0:48314",5b854e3a.1125c,7,"idle",2018-08-28
08:29:30 CDT,,0,LOG,00000,"disconnection: session time: 0:00:00.135
user=foo_logical database=foo_prod host=0.0.0.0
port=48314",,,,,,,,,"bar_foo_foo_a"

Subscriber logs:
2018-08-28 08:32:10.040 CDT,,,71810,,5b854eda.11882,1,,2018-08-28 08:32:10
CDT,7/0,0,LOG,00000,"starting apply for subscription
bar_foo_foo_a",,,,,,,,,"pglogical apply 16418:862837778"
2018-08-28 08:32:10.175 CDT,,,71810,,5b854eda.11882,2,,2018-08-28 08:32:10
CDT,7/242099,123886525,ERROR,XX000,"data stream ended",,,,,,,,,"pglogical
apply 16418:862837778"
2018-08-28 08:32:10.175 CDT,,,71810,,5b854eda.11882,3,,2018-08-28 08:32:10
CDT,7/0,0,LOG,00000,"apply worker [71810] at slot 1 generation 366 exiting
with error",,,,,,,,,"pglogical apply 16418:862837778"
2018-08-28 08:32:10.179 CDT,,,27173,,5b61d336.6a25,373,,2018-08-01 10:35:18
CDT,,0,LOG,00000,"worker process: pglogical apply 16418:862837778 (PID
71810) exited with exit code 1",,,,,,,,,""

Thanks,
Jeremy

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-08-28 18:47:17 Re: pg_dump test instability
Previous Message Alexander Korotkov 2018-08-28 17:18:56 Re: Dimension limit in contrib/cube (dump/restore hazard?)