Re: logical decoding / rewrite map vs. maxAllocatedDescs

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: logical decoding / rewrite map vs. maxAllocatedDescs
Date: 2018-08-09 17:47:46
Message-ID: 20180809174746.aa6rbpsu4zjipzif@alvherre.pgsql
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018-Aug-09, Tomas Vondra wrote:

> I suppose there are reasons why it's done this way, and admittedly the test
> that happens to trigger this is a bit extreme (essentially running pgbench
> concurrently with 'vacuum full pg_class' in a loop). I'm not sure it's
> extreme enough to deem it not an issue, because people using many temporary
> tables often deal with bloat by doing frequent vacuum full on catalogs.

Actually, it seems to me that ApplyLogicalMappingFile is just leaking
the file descriptor for no good reason. There's a different
OpenTransientFile call in ReorderBufferRestoreChanges that is not
intended to be closed immediately, but the other one seems a plain bug,
easy enough to fix.

> But wait, there's more - what happens after hitting the limit? We restart
> the decoding process, and end up getting this:
>
> ERROR: 53000: exceeded maxAllocatedDescs (492) while trying to open
> file "pg_logical/mappings/map-4000-4eb-1_60DE1E08-5376b5-537c6b"
> LOCATION: OpenTransientFile, fd.c:2161
> LOG: 00000: starting logical decoding for slot "s"
> DETAIL: streaming transactions committing after 1/6097DD48, reading
> WAL from 1/60275848
> LOCATION: CreateDecodingContext, logical.c:414
> LOG: 00000: logical decoding found consistent point at 1/60275848
> DETAIL: Logical decoding will begin using saved snapshot.
> LOCATION: SnapBuildRestore, snapbuild.c:1872
> ERROR: XX000: subtransaction logged without previous top-level txn
> record

Hmm, I wonder if we introduced some bug in f49a80c481f7.

> I'd say this implies the "can't open file" is handled in a way that corrupts
> the mapping files, making it impossible to restart the decoding. AFAICS the
> only option at this point is to drop the subscription and start over.

Wow, that seems pretty serious. Clearly that's a different bug that
must be fixed.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message hubert depesz lubaczewski 2018-08-09 17:49:31 xact_start meaning when dealing with procedures?
Previous Message Tomas Vondra 2018-08-09 17:43:58 Re: Some pgq table rewrite incompatibility with logical decoding?