Re: some review comments on logical rep code

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: sawada(dot)mshk(at)gmail(dot)com
Cc: masao(dot)fujii(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: some review comments on logical rep code
Date: 2017-04-17 10:39:55
Message-ID: 20170417.193955.42456174.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoB3L50jWwuyaGLxJzmrmXmYJAOBHzDPR=FKwjw29mxj0Q(at)mail(dot)gmail(dot)com>
> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
> > Hi,
> >
> > Though I've read only a part of the logical rep code yet, I'd like to
> > share some (relatively minor) review comments that I got so far.
>
> It seems nobody is working on dealing with these review comments, so
> I've attached patches. Since there are lots of review comment I
> numbered each review comment. The prefix of patches I attached is
> corresponding to review comment number.
>
> 1.
> >
> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer
> > value from the argument, instead of DatumGetObjectId().
>
> Attached 001 patch fixes it.

Hmm. I looked at the launcher side and found that it assigns bare
integer to bgw_main_arg. It also should use Int32GetDatum.

> logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid,
> Oid relid)
> {
> int slot;
...
> for (slot = 0; slot < max_logical_replication_workers; slot++)
...
> bgw.bgw_main_arg = slot;

> 2.
> >
> > No one resets on_commit_launcher_wakeup flag to false.
>
> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke
> up the launcher process.

It is omitting the abort case. Maybe it should be
AtEOXact_ApplyLauncher(bool commit)?

> 3.
> >
> > ApplyLauncherWakeup() should be static function.
>
> Attached 003 patch fixes it (and also fixes #5 review comment).
>
> 4.
> >
> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's
> > subcommands (e.g., ENABLED one) should wake the launcher up on commit.
>
> This is also reported by Horiguchi-san on another thread[1], and patch exists.
>
> 5.
> >
> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()?
>
> I also guess it's necessary. This change is included in #3 patch.

IsBackendPid() is not free, I suppose that just ignoring failure
with ESRCH is enough.

> 6.
> >
> > Normal statements that the walsender for logical rep runs are logged
> > by log_replication_commands. So if log_statement is also set,
> > those commands are logged twice.
>
> Attached 006 patch fixes it by adding "-c log_statement = none" to
> connection parameter of libpqrcv if logical = true.

The control by log_replication_commands is performed on
walsender, so this also shuld be done on the same place. Anyway
log_statement is irrelevant to walsender.

> 7.
> >
> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits
> > an error. The callback function to reset it needs to be registered
> > via on_shmem_exit().
>
> Attached 007 patch adds callback function to reset LogicalRepCtx->launcher_pid.
>
> 8.
> >
> > Typo: "an subscriber" should be "a subscriber" in some places.
>
> It seems that there is no longer these typo.
>
> 9.
> > /* Get conninfo */
> > datum = SysCacheGetAttr(SUBSCRIPTIONOID,
> > tup,
> > Anum_pg_subscription_subconninfo,
> > &isnull);
> > Assert(!isnull);
> >
> > This assertion makes me think that pg_subscription.subconnifo should
> > have NOT NULL constraint. Also subslotname and subpublications
> > should have NOT NULL constraint because of the same reason.
>
> Agreed. Attached 009 patch add NOT NULL constraint to all other
> columns of pg_subscription. pg_subscription.subsynccommit should have
> it as well.

I'm not sure the policy here, but I suppose that the assertions
there are still required irrelevantly from the nun-nullness of
the attribute.

> 10.
> >
> > SpinLockAcquire(&MyLogicalRepWorker->relmutex);
> > MyLogicalRepWorker->relstate =
> > GetSubscriptionRelState(MyLogicalRepWorker->subid,
> > MyLogicalRepWorker->relid,
> > &MyLogicalRepWorker->relstate_lsn,
> > false);
> > SpinLockRelease(&MyLogicalRepWorker->relmutex);
> >
> > Non-"short-term" function like GetSubscriptionRelState() should not
> > be called while spinlock is being held.
> >
>
> One option is adding new LWLock for the relation state change but this
> lock is used at many locations where the operation actually doesn't
> take time. I think that the discussion would be needed to get
> consensus, so patch for it is not attached.

From the point of view of time, I agree that it doesn't seem to
harm. Bt I thing it as more significant problem that
potentially-throwable function is called within the mutex
region. It potentially causes a kind of dead lock. (That said,
theoretically dosn't occur and I'm not sure what happens by the
dead lock..)

> [1] https://www.postgresql.org/message-id/20170406.212441.177025736.horiguchi.kyotaro@lab.ntt.co.jp

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Fabien COELHO 2017-04-17 11:00:07 Re: pgbench - allow to store select results into variables
Previous Message Nikhil Sontakke 2017-04-17 09:32:01 Re: Failed recovery with new faster 2PC code