Re: Excessive PostmasterIsAlive calls slow down WAL redo

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>, Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Excessive PostmasterIsAlive calls slow down WAL redo
Date: 2018-04-06 17:09:30
Message-ID: 29ebd68e-a6cf-3dac-8954-16b22d6b11da@iki.fi
Lists: pgsql-hackers

On 06/04/18 19:39, Andres Freund wrote:
> On 2018-04-06 07:39:28 -0400, Stephen Frost wrote:
>> While I tend to agree that it'd be nice to just make it cheaper, that
>> doesn't seem like something that we'd be likely to back-patch and I tend
>> to share Heikki's feelings that this is a performance regression we
>> should be considering fixing in released versions.

To be clear, this isn't a performance *regression*. It's always been bad.

I'm not sure if I'd backpatch this. Maybe after it's been in 'master'
for a while and we've gotten some field testing of it.

> I'm doubtful about fairly characterizing this as a performance bug. It's
> not like we've O(n^2) behaviour on our hand, and if your replay isn't of
> a toy workload normally that one syscall isn't going to make a huge
> difference because you've actual IO and such going on.

If all the data fits in the buffer cache, then there would be no I/O.
Think of a smallish database that's heavily updated. There are a lot of
real applications like that.

> I'm also doubtful that it's sane to just check every 32 records. There's
> records that can take a good chunk of time, and just continuing for
> another 31 records seems like a bad idea.

It's pretty arbitrary, I admit. It's the best I could come up with, though.
If we could get a signal on postmaster death, that'd be best, but that's
a much bigger patch, and I'm worried that it would bring new portability
and reliability issues.

I'm not too worried about 32 records being too long an interval. True,
replaying 32 CREATE DATABASE records would take a long time. But pretty
much all other WAL records are fast enough to apply. We could make it
every 8 records rather than 32, if that makes you feel better. Or add
some extra conditions, like always check it when stepping to a new WAL
segment. In any case, the fundamental difference, though, would be to not
check it between every record.
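
To make the idea concrete, here is a minimal standalone sketch of a replay
loop that probes for postmaster death only once every N records. It is not
the actual patch: replay_with_sparse_checks, ReplayStats, liveness_probe and
countdown_probe are hypothetical names invented for illustration, and the
probe callback merely stands in for an expensive check such as
PostmasterIsAlive().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an expensive liveness check. */
typedef bool (*liveness_probe) (void *arg);

typedef struct ReplayStats
{
    size_t      records_applied;    /* records replayed before stopping */
    size_t      probes_made;        /* how often the probe was called */
    bool        parent_died;        /* true if a probe reported death */
} ReplayStats;

/*
 * Replay up to nrecords records, calling the probe before the first record
 * and then once every `interval` records, instead of before every record.
 */
static ReplayStats
replay_with_sparse_checks(size_t nrecords, unsigned interval,
                          liveness_probe probe, void *probe_arg)
{
    ReplayStats stats = {0, 0, false};
    unsigned    since_check = 0;

    assert(interval > 0);
    assert(probe != NULL);

    for (size_t i = 0; i < nrecords; i++)
    {
        if (since_check == 0)
        {
            stats.probes_made++;
            if (!probe(probe_arg))
            {
                stats.parent_died = true;
                break;
            }
        }
        since_check = (since_check + 1) % interval;

        /* A real redo loop would apply the WAL record here. */
        stats.records_applied++;
    }
    return stats;
}

/* Example probe: reports "alive" until its countdown reaches zero. */
static bool
countdown_probe(void *arg)
{
    int        *remaining = (int *) arg;

    if (*remaining <= 0)
        return false;
    (*remaining)--;
    return true;
}
```

With an interval of 32, replaying 100 records calls the probe 4 times rather
than 100. The cost is that a death is noticed up to 31 records late, which is
exactly the trade-off being argued about above.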

- Heikki
