Re: logical replication worker accesses catalogs in error context callback

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Zhihong Yu <zyu(at)yugabyte(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter_e(at)gmx(dot)net>, Petr Jelinek <petr(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Subject: Re: logical replication worker accesses catalogs in error context callback
Date: 2021-04-16 09:53:14
Message-ID: CALj2ACVLtCD9x00XrJCKwrjgTPQqjx=KhAZeGK4bygFzK4=+nA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 16, 2021 at 2:21 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> writes:
> > Thanks for pointing to the changes in the commit message. I corrected
> > them. Attaching v4 patch set, consider it for further review.
>
> I took a quick look at this. I'm quite worried about the potential
> performance cost of the v4-0001 patch (the one for fixing
> slot_store_error_callback). Previously, we didn't pay any noticeable
> cost for having the callback unless there actually was an error.
> As patched, we perform several catalog lookups per column per row,
> even in the non-error code path. That seems like it'd be a noticeable
> performance hit. Just to add insult to injury, it leaks memory.
>
> I propose a more radical but simpler solution: let's just not bother
> with including the type names in the context message. How much are
> they really buying?

<< Attaching v5-0001 here again for completion >>
I'm attaching v5-0001 patch that avoids printing the column type names
in the context message thus no cache lookups have to be done in the
error context callback. I think the column name is enough to know on
which column the error occurred and if required it's type can be known
by the user. This patch gets rid of printing local and remote type
names in slot_store_error_callback and also
logicalrep_typmap_gettypname because it's unnecessary. Thoughts?

> v4-0002 (for postgres_fdw's conversion_error_callback) has the same
> problems, although mitigated a bit by not needing to do any catalog
> lookups in the non-join case. For the join case, I wonder whether
> we could push the lookups back to plan setup time, so they don't
> need to be done over again for each row. (Not doing lookups at all
> doesn't seem attractive there; it'd render the context message nearly
> useless.) A different idea is to try to get the column names out
> of the query's rangetable instead of going to the catalogs.

I'm attaching v5-0002 which stores the required attribute information
for foreign joins in postgresBeginForeignScan which is a one time job
as opposed to the per-row system catalog lookup v4-0001 was doing. I'm
storing the foreign table relid(as key), relname and the retrieved
attributes' attnum and attname into a hash table. Whenever a
conversion error occurs, using relid, the hash table is looked up to
fetch the relname and attname. Thoughts?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
v5-0001-Avoid-Catalogue-Accesses-In-slot_store_error_call.patch application/octet-stream 6.3 KB
v5-0002-Avoid-Catalogue-Accesses-In-conversion_error_call.patch application/octet-stream 7.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2021-04-16 11:25:23 Re: Replication slot stats misgivings
Previous Message Amit Kapila 2021-04-16 09:48:31 Re: Truncate in synchronous logical replication failed