Re: Walsender timeouts and large transactions

From: Sokolov Yura <funny(dot)falcon(at)postgrespro(dot)ru>
To: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org, pjmodos(at)pjmodos(dot)net
Subject: Re: Walsender timeouts and large transactions
Date: 2017-09-29 12:19:23
Message-ID: d076dae18b437be89c787a854034f3f2@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Good day, Kyoutaro

On 2017-09-29 11:26, Kyotaro HORIGUCHI wrote:
> Hello,
>
> At Wed, 27 Sep 2017 14:28:37 +0300, Sokolov Yura
> <funny(dot)falcon(at)postgrespro(dot)ru> wrote in
> <90bb67da7131e6186b50897c4b0f0ec3(at)postgrespro(dot)ru>
>> On 2017-09-12 11:28, Kyotaro HORIGUCHI wrote:
>> > Hello,
>> > At Wed, 06 Sep 2017 13:46:16 +0000, Yura Sokolov
>> > <funny(dot)falcon(at)postgrespro(dot)ru> wrote in
>> > <20170906134616(dot)18925(dot)88390(dot)pgcf(at)coridan(dot)postgresql(dot)org>
>> > As the result, I think that the functions should be modified as
>> > the following.
>> > - Forcing slow-path if time elapses a half of a ping period is
>> > right. (GetCurrentTimestamp is anyway requried.)
>> > - The slow-path should not sleep waiting Latch. It should just
>> > pg_usleep() for maybe 1-10ms.
>> > - We should go to the fast path just after keepalive or response
>> > message has been sent. In other words, the "if (now <" block
>> > should be in the "for (;;)" loop. This avoids needless runs on
>> > the slow-path.
>> > It would be refactorable as the following.
>> > prepare for the send buffer;
>> > for (;;)
>> > {
>> > now = GetCurrentTimeStamp();
>> > if (now < )...
>> > {
>> > fast-path
>> > }
>> > else
>> > {
>> > slow-path
>> > }
>> > return if finished
>> > sleep for 1ms?
>> > }
>> > What do you think about this?
>> > regards,
>> > --
>> > Kyotaro Horiguchi
>> > NTT Open Source Software Center
>>
>> Good day, Petr, Kyotaro
>>
>> I've created failing test for issue (0001-Add-failing-test...) .
>> It tests insertion of 20000 rows with 10ms wal_sender_timeout
>> (it fails in WalSndWriteData on master) and then deletion of
>> those rows with 1ms wal_sender_timeout (it fails in WalSndLoop).
>>
>> Both Peter's patch and my simplified suggestion didn't pass the
>> test. I didn't checked Kyotaro's suggestion, though, cause I
>> didn't understand it well.
>
> Mmm. The test seems broken. wal_sender_timeout = 10ms with
> wal_receiver_status_interval=10s immediately causes a
> timeout. Avoiding the timeout is just breaking the sane code.

I think you're not right. `wal_receiver_status_interval` is just
for status update, not for replies. Before I made my patch version,
I've added logging to every `now` and `last_reply_timestamp` during
test run. `last_reply_timestamp` definitely updates more often than
once in 10s. So, `wal_receiver_status_interval = 10s` has nothing
in common with receiver's replies, as I see.

(btw, logging slows down sender enough to "fix" test :-) )

And my patch doesn't avoid timeout check, so it doesn't break
anything. It just delays timeout on 1ms. Given, it is unpractical
to set wal_sender_timeout less than 50ms in production, this 1ms
increase in timeout check is negligible.

And I've checked just now that my patch passes test even with
wal_receiver_status_interval = 10s.

>
> wal_sender_timeout = 3s and wal_receiver_status_interval=1s
> effectively causes the problem with about 1000000 lines of (int)
> insertion on UNIX socket connection, on my poor box.

I don't want to make test to lasts so long and generate so many data.
That is why I used such small timeouts for tests.

> The original complain here came from the fact that
> WalSndWriteData skips processing of replies for a long time on a
> fast network. However Petr's patch fixed the problem, I pointed
> that just letting the function take the slow path leads to
> another problem, that is, waiting for new WAL records can result
> in a unwanted pause in the slow path.
>
> Combining the solutions for the two problem is my proposal sited
> above. The sentence seems in quite bad style but the attached
> file is the concorete patch of that.

Test is failing if there is "short quit" after `!pq_is_send_pending()`,
so I doubt your patch will pass the test.
And you've change calculated sleep time with sane waiting on all
insteresting events (using WaitLatchOrSocket) to semi-busy loop.
It at least could affect throughput.

And why did you remove `SetLatch(MyLatch)` in the end of function?
Probably this change is correct, but not obvious.

> Any thoughts?

It certainly could be my test and my patch is wrong. But my point
is that test should be written first. Especially for such difficult
case. Without test it is impossible to say does our patches fix
something. And it is impossible to say if patch does something
wrong. And impossible to say if patch fixes this problem but
introduce new problem.

Please, write test for your remarks. If you think, my patch breaks
something, write test for the case my patch did broke. If you think
my test is wrong, write your test that is more correct.

Without tests it will be just bird's hubbub.

> regards,

regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2017-09-29 13:04:51 Re: Rewriting the test of pg_upgrade as a TAP test
Previous Message Amit Kapila 2017-09-29 11:48:15 Re: Parallel Append implementation