Re: Walsender timeouts and large transactions

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: funny(dot)falcon(at)postgrespro(dot)ru
Cc: pgsql-hackers(at)postgresql(dot)org, pjmodos(at)pjmodos(dot)net
Subject: Re: Walsender timeouts and large transactions
Date: 2017-10-02 03:44:22
Message-ID: 20171002.124422.224472998.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello Sokolov.

At Fri, 29 Sep 2017 15:19:23 +0300, Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru> wrote in <d076dae18b437be89c787a854034f3f2(at)postgrespro(dot)ru>
> I don't want to make test to lasts so long and generate so many data.
> That is why I used such small timeouts for tests.

I understand your point, but still *I* think such a short timeout
is out of expectation by design. (But it can be set.)

Does anyone have opinions on this?

> Test is failing if there is "short quit" after
> `!pq_is_send_pending()`,
> so I doubt your patch will pass the test.

It is because I think that the test "should" fail since the
timeout is out of expected range. I (and perhaps also Petr) is
thinking that the problem is just that a large transaction causes
a timeout with an ordinary timeout. My test case is based on the
assumption.

Your test is for a timeout during replication-startup with
extremely short timeout. This may be a different problem to
discuss, but perhaps better to be solved together.

I'd like to have opnions from others on this point.

> And you've change calculated sleep time with sane waiting on all
> insteresting events (using WaitLatchOrSocket) to semi-busy loop.
> It at least could affect throughput.

Uggh! I misunderstood there. It wais for writing socket so the
sleep is wrong and WaitLatchOrSocket is right.

After all, I put +1 for Petr's latest patch. Sorry for my
carelessness.

> And why did you remove `SetLatch(MyLatch)` in the end of function?
> Probably this change is correct, but not obvious.

Latch is needless there if it waited a fixed duration, but if it
waits writefd events there, also latch should be waited.

> > Any thoughts?
>
> It certainly could be my test and my patch is wrong. But my point
> is that test should be written first. Especially for such difficult
> case. Without test it is impossible to say does our patches fix
> something. And it is impossible to say if patch does something
> wrong. And impossible to say if patch fixes this problem but
> introduce new problem.
>
> Please, write test for your remarks. If you think, my patch breaks
> something, write test for the case my patch did broke. If you think
> my test is wrong, write your test that is more correct.
>
> Without tests it will be just bird's hubbub.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bossart, Nathan 2017-10-02 04:43:39 Re: [Proposal] Allow users to specify multiple tables in VACUUM commands
Previous Message Andres Freund 2017-10-02 03:14:23 Re: Logging idle checkpoints