Re: BUG #14785: Logical replication does not work after adding a column. Bug?

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, yxq(at)o2(dot)pl, pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #14785: Logical replication does not work after adding a column. Bug?
Date: 2017-09-26 15:56:33
Message-ID: 20170926155633.GM4628@tamriel.snowman.net
Lists: pgsql-bugs

Tom,

* Tom Lane (tgl(at)sss(dot)pgh(dot)pa(dot)us) wrote:
> Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com> writes:
> > On 9/25/17 15:16, Andres Freund wrote:
> >> This'll accept tablenames like pg_temp_1foo, right? Might be worth
> >> being a bit narrower in the test.
>
> > Committed with that change. Thanks.
>
> This patch is using the wrong approach entirely. Every other place in
> the backend that is trying to exclude temp relations uses a test on the
> containing namespace, not the relname.

The specific issue here is that the new pg_class entry is created in the
same namespace as the original table, not in the temp one. The commit
mentions make_new_heap() specifically because that is where the problem
comes from: it creates this new pg_temp_XXX table in the regular user
namespace.

I'm not a huge fan of this approach either, really, but I'm not sure
that there's a better answer.

> I also note that as committed, the patch will dump core on a
> concurrently-dropped relation, because get_rel_name returns NULL
> under such circumstances.

That's certainly no good and should be checked for.
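
To make the two review points concrete (the narrower name test Andres
asked for, and the NULL check for a concurrently dropped relation), here
is a minimal sketch in standalone C. It is not the committed code: the
helper name is_transient_heap_name() is hypothetical, and it assumes the
XXX in pg_temp_XXX is a plain run of digits.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not the committed code.
 *
 * Returns true only for names of the form "pg_temp_<digits>" (assumed
 * shape of the names discussed in this thread). A NULL name, which is
 * what get_rel_name() is said to return for a concurrently dropped
 * relation, is treated as "no match" instead of being dereferenced.
 */
bool
is_transient_heap_name(const char *relname)
{
    static const char prefix[] = "pg_temp_";
    const size_t prefix_len = sizeof(prefix) - 1;
    const char *p;

    if (relname == NULL)
        return false;           /* relation vanished; nothing to match */

    if (strncmp(relname, prefix, prefix_len) != 0)
        return false;

    p = relname + prefix_len;
    if (*p == '\0')
        return false;           /* require at least one digit */

    for (; *p != '\0'; p++)
    {
        if (!isdigit((unsigned char) *p))
            return false;       /* rejects names such as pg_temp_1foo */
    }
    return true;
}
```

With this shape, "pg_temp_16384" matches, while "pg_temp_1foo", "pg_temp_"
and a NULL pointer do not.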

> BTW, get_rel_sync_entry has some other serious problems: it is not being
> at all careful about whether persistent data structures are left in sane
> states if it gets an error partway through. In particular it'll leave
> behind a new hash entry in entirely-unknown state, and if LoadPublications
> gets an error, it will also leave a time bomb behind in the form of
> not nil, but already-list-freed, data->publications. And I sure do not
> understand why a single static variable publications_valid is being used
> to remember validity of data->publications ... couldn't there be more
> than one of those?

This all certainly doesn't sound good, but I'm not as familiar with
these bits.
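
For readers following along, the general pattern Tom is asking for can
be sketched in standalone C as below. This is only an illustration under
stated assumptions, not the pgoutput code: SyncEntry, LoadFn,
sync_entry_refresh() and the two loaders are hypothetical names, the
publication list is modeled as a plain int array, and a failed load is
modeled as a NULL return rather than a thrown error. The two ideas shown
are a validity flag kept per entry instead of in one static variable,
and clearing the old pointer before running anything that can fail.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical cache entry; validity is tracked per entry. */
typedef struct SyncEntry
{
    bool        valid;          /* is the list below up to date? */
    int        *publications;   /* NULL whenever no list is owned */
    int         npublications;
} SyncEntry;

/* Hypothetical loader signature: returns NULL on failure. */
typedef int *(*LoadFn) (int *count);

bool
sync_entry_refresh(SyncEntry *entry, LoadFn load)
{
    int         count = 0;
    int        *fresh;

    if (entry->valid)
        return true;

    /* Release the stale list and clear the pointer before anything can fail. */
    free(entry->publications);
    entry->publications = NULL;
    entry->npublications = 0;

    fresh = load(&count);
    if (fresh == NULL)
        return false;           /* entry stays invalid but consistent */

    /* Publish the new state only after the load has fully succeeded. */
    entry->publications = fresh;
    entry->npublications = count;
    entry->valid = true;
    return true;
}

/* Demo loaders, for illustration only. */
int *
load_fail(int *count)
{
    (void) count;
    return NULL;
}

int *
load_two(int *count)
{
    int        *ids = malloc(2 * sizeof(int));

    if (ids == NULL)
        return NULL;
    ids[0] = 1;
    ids[1] = 2;
    *count = 2;
    return ids;
}
```

If the load fails, the entry is left with valid == false and a NULL
list, so a later call retries cleanly instead of touching freed memory.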

Thanks!

Stephen
