Re: some review comments on logical rep code

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: some review comments on logical rep code
Date: 2017-04-17 12:13:12
Message-ID: CAD21AoDQFxeJ==n3o3ce1Wzp3gdxFU+pJYQJnuOH9FhHJ58dKw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI
<horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoB3L50jWwuyaGLxJzmrmXmYJAOBHzDPR=FKwjw29mxj0Q(at)mail(dot)gmail(dot)com>
>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>> > Hi,
>> >
>> > Though I've read only a part of the logical rep code yet, I'd like to
>> > share some (relatively minor) review comments that I got so far.
>>
>> It seems nobody is working on dealing with these review comments, so
>> I've attached patches. Since there are lots of review comment I
>> numbered each review comment. The prefix of patches I attached is
>> corresponding to review comment number.
>>

Thank you for reviewing.

>> 1.
>> >
>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer
>> > value from the argument, instead of DatumGetObjectId().
>>
>> Attached 001 patch fixes it.
>
> Hmm. I looked at the launcher side and found that it assigns bare
> integer to bgw_main_arg. It also should use Int32GetDatum.

Yeah, I agreed. Will fix it.

>
>> logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid,
>> Oid relid)
>> {
>> int slot;
> ...
>> for (slot = 0; slot < max_logical_replication_workers; slot++)
> ...
>> bgw.bgw_main_arg = slot;
>
>
>
>> 2.
>> >
>> > No one resets on_commit_launcher_wakeup flag to false.
>>
>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke
>> up the launcher process.
>
> It is omitting the abort case. Maybe it should be
> AtEOXact_ApplyLauncher(bool commit)?

Should we wake up the launcher process even when abort?

>
>> 3.
>> >
>> > ApplyLauncherWakeup() should be static function.
>>
>> Attached 003 patch fixes it (and also fixes #5 review comment).
>>
>> 4.
>> >
>> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's
>> > subcommands (e.g., ENABLED one) should wake the launcher up on commit.
>>
>> This is also reported by Horiguchi-san on another thread[1], and patch exists.
>>
>> 5.
>> >
>> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()?
>>
>> I also guess it's necessary. This change is included in #3 patch.
>
> IsBackendPid() is not free, I suppose that just ignoring failure
> with ESRCH is enough.

After thought, since LogicalRepCtx->launcher_pid could be 0 should, we
check if launcher_pid != 0?

>
>> 6.
>> >
>> > Normal statements that the walsender for logical rep runs are logged
>> > by log_replication_commands. So if log_statement is also set,
>> > those commands are logged twice.
>>
>> Attached 006 patch fixes it by adding "-c log_statement = none" to
>> connection parameter of libpqrcv if logical = true.
>
> The control by log_replication_commands is performed on
> walsender, so this also shuld be done on the same place. Anyway
> log_statement is irrelevant to walsender.

Patch 006 emits all logs executed by the apply worker as a log of
log_replication_command but perhaps you're right. It would be better
to leave the log of log_statement if the command executed by the apply
worker is a SQL command. Will fix.

>
>> 7.
>> >
>> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits
>> > an error. The callback function to reset it needs to be registered
>> > via on_shmem_exit().
>>
>> Attached 007 patch adds callback function to reset LogicalRepCtx->launcher_pid.
>>
>> 8.
>> >
>> > Typo: "an subscriber" should be "a subscriber" in some places.
>>
>> It seems that there is no longer these typo.
>>
>> 9.
>> > /* Get conninfo */
>> > datum = SysCacheGetAttr(SUBSCRIPTIONOID,
>> > tup,
>> > Anum_pg_subscription_subconninfo,
>> > &isnull);
>> > Assert(!isnull);
>> >
>> > This assertion makes me think that pg_subscription.subconnifo should
>> > have NOT NULL constraint. Also subslotname and subpublications
>> > should have NOT NULL constraint because of the same reason.
>>
>> Agreed. Attached 009 patch add NOT NULL constraint to all other
>> columns of pg_subscription. pg_subscription.subsynccommit should have
>> it as well.
>
> I'm not sure the policy here, but I suppose that the assertions
> there are still required irrelevantly from the nun-nullness of
> the attribute.

You're right. Will fix it.

>> 10.
>> >
>> > SpinLockAcquire(&MyLogicalRepWorker->relmutex);
>> > MyLogicalRepWorker->relstate =
>> > GetSubscriptionRelState(MyLogicalRepWorker->subid,
>> > MyLogicalRepWorker->relid,
>> > &MyLogicalRepWorker->relstate_lsn,
>> > false);
>> > SpinLockRelease(&MyLogicalRepWorker->relmutex);
>> >
>> > Non-"short-term" function like GetSubscriptionRelState() should not
>> > be called while spinlock is being held.
>> >
>>
>> One option is adding new LWLock for the relation state change but this
>> lock is used at many locations where the operation actually doesn't
>> take time. I think that the discussion would be needed to get
>> consensus, so patch for it is not attached.
>
> From the point of view of time, I agree that it doesn't seem to
> harm. Bt I thing it as more significant problem that
> potentially-throwable function is called within the mutex
> region. It potentially causes a kind of dead lock. (That said,
> theoretically dosn't occur and I'm not sure what happens by the
> dead lock..)
>
>
>> [1] https://www.postgresql.org/message-id/20170406.212441.177025736.horiguchi.kyotaro@lab.ntt.co.jp
>
> --
> Kyotaro Horiguchi
> NTT Open Source Software Center
>

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2017-04-17 12:20:58 Re: Optimization for updating foreign tables in Postgres FDW
Previous Message Remi Colinet 2017-04-17 12:09:01 [PATCH] New command to monitor progression of long running queries