From: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robert(dot)haas(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
Subject: | Re: Blocking I/O, async I/O and io_uring |
Date: | 2020-12-09 01:28:43 |
Message-ID: | CAGRY4nzG_U=nnSFq8FwuJhhYVFWNOtdTNTKt_8rrRDqpwTSGyg@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, 8 Dec 2020 at 15:04, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Hi,
>
> On 2020-12-08 04:24:44 +0000, tsunakawa(dot)takay(at)fujitsu(dot)com wrote:
> > I'm looking forward to this from the async+direct I/O, since the
> > throughput of some write-heavy workload decreased by half or more
> > during checkpointing (due to fsync?)
>
> Depends on why that is. I think the most common cause is that your WAL
> volume increases drastically just after a checkpoint starts, because
> initially all page modifications will trigger full-page writes. There's
> a significant slowdown even if you prevent the checkpointer from doing
> *any* writes at that point. I got the WAL AIO stuff to the point that I
> see a good bit of speedup at high WAL volumes, and I see it helping in
> this scenario.
>
> There's of course also the issue that checkpoint writes cause other IO
> (including WAL writes) to slow down and, importantly, cause a lot of
> jitter leading to unpredictable latencies. I've seen some good and some
> bad results around this with the patch, but there's a bunch of TODOs to
> resolve before delving deeper really makes sense (the IO depth control
> is not good enough right now).
>
> A third issue is that sometimes checkpointer can't really keep up - and
> that I think I've seen pretty clearly addressed by the patch. I have
> managed to get to ~80% of my NVMe disks' top write speed (> 2.5GB/s) by
> the checkpointer, and I think I know what to do for the remainder.
>
>
Thanks for explaining this. I'm really glad you're looking into it. If I
get the chance I'd like to try to apply some wait-analysis and blocking
stats tooling to it. I'll report back if I make any progress there.