Re: some review comments on logical rep code

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: some review comments on logical rep code
Date: 2017-04-17 14:29:28
Message-ID: CAD21AoA+yFJTbw0PCw-ttmh9TsTygM3=iavXF+5Bx4O9-UXdsQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Apr 17, 2017 at 9:13 PM, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> On Mon, Apr 17, 2017 at 7:39 PM, Kyotaro HORIGUCHI
> <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> At Mon, 17 Apr 2017 18:02:57 +0900, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote in <CAD21AoB3L50jWwuyaGLxJzmrmXmYJAOBHzDPR=FKwjw29mxj0Q(at)mail(dot)gmail(dot)com>
>>> On Fri, Apr 14, 2017 at 4:47 AM, Fujii Masao <masao(dot)fujii(at)gmail(dot)com> wrote:
>>> > Hi,
>>> >
>>> > Though I've read only a part of the logical rep code yet, I'd like to
>>> > share some (relatively minor) review comments that I got so far.
>>>
>>> It seems nobody is working on dealing with these review comments, so
>>> I've attached patches. Since there are lots of review comment I
>>> numbered each review comment. The prefix of patches I attached is
>>> corresponding to review comment number.
>>>
>
> Thank you for reviewing.
>
>>> 1.
>>> >
>>> > In ApplyWorkerMain(), DatumGetInt32() should be used to get integer
>>> > value from the argument, instead of DatumGetObjectId().
>>>
>>> Attached 001 patch fixes it.
>>
>> Hmm. I looked at the launcher side and found that it assigns bare
>> integer to bgw_main_arg. It also should use Int32GetDatum.
>
> Yeah, I agreed. Will fix it.
>
>>
>>> logicalrep_worker_launch(Oid dbid, Oid subid, const char *subname, Oid userid,
>>> Oid relid)
>>> {
>>> int slot;
>> ...
>>> for (slot = 0; slot < max_logical_replication_workers; slot++)
>> ...
>>> bgw.bgw_main_arg = slot;
>>
>>
>>
>>> 2.
>>> >
>>> > No one resets on_commit_launcher_wakeup flag to false.
>>>
>>> Attached 002 patch makes on_commit_launcher_wakeup turn off after woke
>>> up the launcher process.
>>
>> It is omitting the abort case. Maybe it should be
>> AtEOXact_ApplyLauncher(bool commit)?
>
> Should we wake up the launcher process even when abort?
>
>>
>>> 3.
>>> >
>>> > ApplyLauncherWakeup() should be static function.
>>>
>>> Attached 003 patch fixes it (and also fixes #5 review comment).
>>>
>>> 4.
>>> >
>>> > Seems not only CREATE SUBSCRIPTION but also ALTER SUBSCRIPTION's
>>> > subcommands (e.g., ENABLED one) should wake the launcher up on commit.
>>>
>>> This is also reported by Horiguchi-san on another thread[1], and patch exists.
>>>
>>> 5.
>>> >
>>> > Why is IsBackendPid() check necessary in ApplyLauncherWakeup()?
>>>
>>> I also guess it's necessary. This change is included in #3 patch.
>>
>> IsBackendPid() is not free, I suppose that just ignoring failure
>> with ESRCH is enough.
>
> After thought, since LogicalRepCtx->launcher_pid could be 0 should, we
> check if launcher_pid != 0?
>
>>
>>> 6.
>>> >
>>> > Normal statements that the walsender for logical rep runs are logged
>>> > by log_replication_commands. So if log_statement is also set,
>>> > those commands are logged twice.
>>>
>>> Attached 006 patch fixes it by adding "-c log_statement = none" to
>>> connection parameter of libpqrcv if logical = true.
>>
>> The control by log_replication_commands is performed on
>> walsender, so this also shuld be done on the same place. Anyway
>> log_statement is irrelevant to walsender.
>
> Patch 006 emits all logs executed by the apply worker as a log of
> log_replication_command but perhaps you're right. It would be better
> to leave the log of log_statement if the command executed by the apply
> worker is a SQL command. Will fix.
>
>>
>>> 7.
>>> >
>>> > LogicalRepCtx->launcher_pid isn't reset to 0 when the launcher emits
>>> > an error. The callback function to reset it needs to be registered
>>> > via on_shmem_exit().
>>>
>>> Attached 007 patch adds callback function to reset LogicalRepCtx->launcher_pid.
>>>
>>> 8.
>>> >
>>> > Typo: "an subscriber" should be "a subscriber" in some places.
>>>
>>> It seems that there is no longer these typo.
>>>
>>> 9.
>>> > /* Get conninfo */
>>> > datum = SysCacheGetAttr(SUBSCRIPTIONOID,
>>> > tup,
>>> > Anum_pg_subscription_subconninfo,
>>> > &isnull);
>>> > Assert(!isnull);
>>> >
>>> > This assertion makes me think that pg_subscription.subconnifo should
>>> > have NOT NULL constraint. Also subslotname and subpublications
>>> > should have NOT NULL constraint because of the same reason.
>>>
>>> Agreed. Attached 009 patch add NOT NULL constraint to all other
>>> columns of pg_subscription. pg_subscription.subsynccommit should have
>>> it as well.
>>
>> I'm not sure the policy here, but I suppose that the assertions
>> there are still required irrelevantly from the nun-nullness of
>> the attribute.
>
> You're right. Will fix it.
>
>>> 10.
>>> >
>>> > SpinLockAcquire(&MyLogicalRepWorker->relmutex);
>>> > MyLogicalRepWorker->relstate =
>>> > GetSubscriptionRelState(MyLogicalRepWorker->subid,
>>> > MyLogicalRepWorker->relid,
>>> > &MyLogicalRepWorker->relstate_lsn,
>>> > false);
>>> > SpinLockRelease(&MyLogicalRepWorker->relmutex);
>>> >
>>> > Non-"short-term" function like GetSubscriptionRelState() should not
>>> > be called while spinlock is being held.
>>> >
>>>
>>> One option is adding new LWLock for the relation state change but this
>>> lock is used at many locations where the operation actually doesn't
>>> take time. I think that the discussion would be needed to get
>>> consensus, so patch for it is not attached.
>>
>> From the point of view of time, I agree that it doesn't seem to
>> harm. Bt I thing it as more significant problem that
>> potentially-throwable function is called within the mutex
>> region. It potentially causes a kind of dead lock. (That said,
>> theoretically dosn't occur and I'm not sure what happens by the
>> dead lock..)
>>
>>
>>> [1] https://www.postgresql.org/message-id/20170406.212441.177025736.horiguchi.kyotaro@lab.ntt.co.jp
>>
>> --
>> Kyotaro Horiguchi
>> NTT Open Source Software Center
>>
>
>

I've attached latest v2 three patches; 001, 006 and 009. The review
comments I got so far are incorporated.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment Content-Type Size
001_change_DatumGetObjectId_to_DatumGetInt32_v2.patch application/octet-stream 1.1 KB
006_Prevent_to_emit_statement_log_double_v2.patch application/octet-stream 1.4 KB
009_Add_not_null_constraint_v2.patch application/octet-stream 1.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2017-04-17 14:47:40 Re: Re: extended stats not friendly towards ANALYZE with subset of columns
Previous Message Peter Eisentraut 2017-04-17 14:08:20 Re: Logical replication and inheritance