| From: | Antonin Houska <ah(at)cybertec(dot)at> |
|---|---|
| To: | Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com> |
| Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Mihail Nikalayeu <mihailnikalayeu(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Robert Treat <rob(at)xzilla(dot)net> |
| Subject: | Re: Adding REPACK [concurrently] |
| Date: | 2026-03-18 20:07:09 |
| Message-ID: | 98417.1773864429@localhost |
| Lists: | pgsql-hackers |
Srinath Reddy Sadipiralla <srinath2133(at)gmail(dot)com> wrote:
> TRAP: failed Assert("RelationGetRelid(relation) == ((RepackDecodingState *) ctx->output_writer_private)->relid"), File: "pgoutput_repack.c",
> Line: 97, PID: 397007
> This crash happens if we run REPACK (concurrently) on a table while a heavy
> pgbench workload is concurrently executing multi-table (setup.sql) transactions (dual_chaos.sql).
> It triggers after a few back-to-back REPACK (concurrently) runs.
>
> I think I found the cause of this crash: some changes slipped past the
> change_useless_for_repack filter, so changes unrelated to the relation
> currently being processed by REPACK (concurrently) were decoded and added
> to the reorderbuffer queue. The reason is that repacked_rel_locator.relNumber
> is set to InvalidOid by default. It is set to the target relation in
> setup_logical_decoding, but only after DecodingContextFindStartpoint has run.
> During DecodingContextFindStartpoint, changes are not filtered even if they
> are unrelated to the target relation: in the call chain
> rm_decode -> change_useless_for_repack -> am_decoding_for_repack,
> repacked_rel_locator.relNumber is still InvalidOid, so the filtering is skipped.
> Such changes are added to the reorder buffer queue, and when the reorder
> buffer is processed, plugin_change is called and the assertion fails.
> I have attached a diff patch to solve this.
Thanks a lot! Yes, your explanation makes sense. I'll include the fix in the
next version. I think it might also explain the other crash [1] you reported
earlier. I'll try to reproduce that.
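To make the ordering problem from the quoted analysis concrete, here is a minimal standalone model of it. This is a sketch, not the patch or the real PostgreSQL code: the function names are borrowed from the description above, but their bodies, the plain Oid variable standing in for repacked_rel_locator.relNumber, the sample OIDs, and the simulate() helper are all assumptions made for illustration. The "set the target first" ordering is only one plausible remedy and may not be what the attached diff does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical stand-in for repacked_rel_locator.relNumber. */
static Oid repacked_rel = InvalidOid;

/* The filter is only active once the target relation is known. */
static bool
am_decoding_for_repack(void)
{
    return repacked_rel != InvalidOid;
}

/* Returns true if a change to "rel" can be skipped during decoding. */
static bool
change_useless_for_repack(Oid rel)
{
    if (!am_decoding_for_repack())
        return false;           /* filter inactive: nothing is skipped */
    return rel != repacked_rel;
}

/*
 * Decode a batch of changes and count how many changes for relations other
 * than "target" pass the filter, i.e. would reach the reorder buffer queue.
 */
static size_t
decode_and_count_foreign(const Oid *changes, size_t n, Oid target)
{
    size_t      foreign = 0;

    assert(changes != NULL);
    for (size_t i = 0; i < n; i++)
    {
        if (change_useless_for_repack(changes[i]))
            continue;           /* filtered out */
        if (changes[i] != target)
            foreign++;          /* unrelated change got queued */
    }
    return foreign;
}

/*
 * Model both orderings. With set_target_first == false the target is set
 * only after the decoding pass that stands in for
 * DecodingContextFindStartpoint, as in the reported crash.
 */
size_t
simulate(bool set_target_first)
{
    static const Oid wal[] = {16384, 16390, 16384, 16400}; /* made-up OIDs */
    const Oid   target = 16384;
    size_t      foreign;

    repacked_rel = InvalidOid;
    if (set_target_first)
        repacked_rel = target;

    foreign = decode_and_count_foreign(wal, sizeof wal / sizeof wal[0], target);

    repacked_rel = target;      /* the late assignment in the buggy order */
    return foreign;
}
```

In this model, simulate(false) lets the two unrelated changes through, which is the situation where the assertion in plugin_change would later fire, while simulate(true) queues none of them.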
--
Antonin Houska
Web: https://www.cybertec-postgresql.com