Re: Perform streaming logical transactions by background workers and parallel apply

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Perform streaming logical transactions by background workers and parallel apply
Date: 2023-04-26 09:00:02
Message-ID: 2185d65f-5aae-3efa-c48f-fb42b173ef5c@gmail.com
Lists: pgsql-hackers

Hello hackers,

Please look at a new anomaly that can be observed starting from commit 216a7848.

The following script:
echo "CREATE SUBSCRIPTION testsub CONNECTION 'dbname=nodb' PUBLICATION testpub WITH (connect = false);
ALTER SUBSCRIPTION testsub ENABLE;" | psql

sleep 1
rm $PGINST/lib/libpqwalreceiver.so
sleep 15
pg_ctl -D "$PGDB" stop -m immediate
grep 'TRAP:' server.log

Leads to multiple assertion failures:
CREATE SUBSCRIPTION
ALTER SUBSCRIPTION
waiting for server to shut down.... done
server stopped
TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899323
TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899416
TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899427
TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899439
TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899538
TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899547

server.log contains:
2023-04-26 11:00:58.797 MSK [2899300] LOG:  database system is ready to accept connections
2023-04-26 11:00:58.821 MSK [2899416] ERROR:  could not access file "libpqwalreceiver": No such file or directory
TRAP: failed Assert("MyProc->backendId != InvalidBackendId"), File: "lock.c", Line: 4439, PID: 2899416
postgres: logical replication apply worker for subscription 16385 (ExceptionalCondition+0x69)[0x558b2ac06d41]
postgres: logical replication apply worker for subscription 16385 (VirtualXactLockTableCleanup+0xa4)[0x558b2aa9fd74]
postgres: logical replication apply worker for subscription 16385 (LockReleaseAll+0xbb)[0x558b2aa9fe7d]
postgres: logical replication apply worker for subscription 16385 (+0x4588c6)[0x558b2aa2a8c6]
postgres: logical replication apply worker for subscription 16385 (shmem_exit+0x6c)[0x558b2aa87eb1]
postgres: logical replication apply worker for subscription 16385 (+0x4b5faa)[0x558b2aa87faa]
postgres: logical replication apply worker for subscription 16385 (proc_exit+0xc)[0x558b2aa88031]
postgres: logical replication apply worker for subscription 16385 (StartBackgroundWorker+0x147)[0x558b2aa0b4d9]
postgres: logical replication apply worker for subscription 16385 (+0x43fdc1)[0x558b2aa11dc1]
postgres: logical replication apply worker for subscription 16385 (+0x43ff3d)[0x558b2aa11f3d]
postgres: logical replication apply worker for subscription 16385 (+0x440866)[0x558b2aa12866]
postgres: logical replication apply worker for subscription 16385 (+0x440e12)[0x558b2aa12e12]
postgres: logical replication apply worker for subscription 16385 (BackgroundWorkerInitializeConnection+0x0)[0x558b2aa14396]
postgres: logical replication apply worker for subscription 16385 (main+0x21a)[0x558b2a932e21]

I understand that removing libpqwalreceiver.so (or the whole pginst/) is not
something that happens in a production environment every day, but it is
nonetheless a new failure mode, and it can produce many core dumps when testing.

IIUC, that assert will fail if any error is raised between
ApplyWorkerMain()->logicalrep_worker_attach()->before_shmem_exit() and
ApplyWorkerMain()->InitializeApplyWorker()->BackgroundWorkerInitializeConnectionByOid()->InitPostgres().
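
To make the suspected ordering problem concrete, here is a toy model in plain C. It is not PostgreSQL code: register_exit_callback(), run_exit_callbacks(), lock_cleanup() and simulate_worker() are hypothetical stand-ins for before_shmem_exit(), shmem_exit(), the lock cleanup reached through LockReleaseAll(), and the worker startup path, and INVALID_BACKEND_ID merely mimics InvalidBackendId. It only shows how a cleanup callback registered before the backend ID is assigned can run while that ID is still invalid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_BACKEND_ID (-1)  /* stand-in for InvalidBackendId */
#define MAX_EXIT_CALLBACKS 8

typedef void (*exit_callback)(void);

/* Per-process state of the toy model. */
static int my_backend_id = INVALID_BACKEND_ID;
static exit_callback callbacks[MAX_EXIT_CALLBACKS];
static size_t n_callbacks = 0;
static bool cleanup_saw_invalid_id = false;

/* Hypothetical stand-in for before_shmem_exit(): remember a callback. */
static void register_exit_callback(exit_callback cb)
{
    assert(n_callbacks < MAX_EXIT_CALLBACKS);
    callbacks[n_callbacks++] = cb;
}

/*
 * Hypothetical stand-in for the lock cleanup on the exit path.  The real
 * code asserts that the backend ID is valid; here we only record whether
 * that condition would have been violated.
 */
static void lock_cleanup(void)
{
    if (my_backend_id == INVALID_BACKEND_ID)
        cleanup_saw_invalid_id = true;
}

/* Hypothetical stand-in for shmem_exit(): run callbacks in reverse order. */
static void run_exit_callbacks(void)
{
    while (n_callbacks > 0)
        callbacks[--n_callbacks]();
}

/*
 * Model one worker lifetime.  If fail_before_init is true, an "error" ends
 * the worker after the callback is registered but before the backend ID is
 * assigned.  Returns true if the cleanup ran with an invalid backend ID.
 */
bool simulate_worker(bool fail_before_init)
{
    my_backend_id = INVALID_BACKEND_ID;
    n_callbacks = 0;
    cleanup_saw_invalid_id = false;

    register_exit_callback(lock_cleanup);  /* the "attach" step */

    if (!fail_before_init)
        my_backend_id = 1;                 /* the "InitPostgres" step */

    run_exit_callbacks();                  /* the "proc_exit" step */
    return cleanup_saw_invalid_id;
}
```

In this model, simulate_worker(true) reports the violated condition and simulate_worker(false) does not, which corresponds to the window described above.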

Best regards,
Alexander
