From: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com> |
---|---|
To: | Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robert(dot)haas(at)enterprisedb(dot)com>, Petr Jelinek <petr(dot)jelinek(at)enterprisedb(dot)com> |
Subject: | Re: Blocking I/O, async I/O and io_uring |
Date: | 2020-12-08 03:27:34 |
Message-ID: | CA+hUKGKwArCVb=rdv252yX0GrzkiS+vw7ExAjK7O0bJDUkfzJQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Dec 8, 2020 at 3:56 PM Craig Ringer
<craig(dot)ringer(at)enterprisedb(dot)com> wrote:
> I thought I'd start the discussion on this and see where we can go with it. What incremental steps can be done to move us toward parallelisable I/O without having to redesign everything?
>
> I'm thinking that redo is probably a good first candidate. It doesn't depend on the guts of the executor. It is much less sensitive to ordering between operations in shmem and on disk since it runs in the startup process. And it hurts REALLY BADLY from its single-threaded blocking approach to I/O - as shown by an extension written by 2ndQuadrant that can double redo performance by doing read-ahead on btree pages that will soon be needed.

About the redo suggestion: https://commitfest.postgresql.org/31/2410/
does exactly that! It currently uses POSIX_FADV_WILLNEED because
that's what PrefetchSharedBuffer() does, but when combined with a
"real AIO" patch set (see earlier threads and conference talks on this
by Andres) and a few small tweaks to control batching of I/O
submissions, it does exactly what you're describing. I tried to keep
the WAL prefetcher project entirely disentangled from the core AIO
work, though, hence the "poor man's AIO" for now.
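
To make the "poor man's AIO" idea concrete, here is a minimal sketch of the general pattern: hint the kernel with POSIX_FADV_WILLNEED for blocks that will be needed soon, then do the ordinary blocking read later, by which time the data is hopefully already in the page cache. This is not code from the WAL prefetcher patch or from PrefetchSharedBuffer(); the names prefetch_blocks(), read_block() and BLOCK_SIZE are hypothetical, and the sketch assumes a Linux-like system that provides posix_fadvise().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption for this sketch: 8 kB blocks, matching PostgreSQL's default. */
#define BLOCK_SIZE 8192

/*
 * Hypothetical helper: advise the kernel that the listed blocks will be
 * read soon.  The call only queues read-ahead; it does not wait for the
 * data.  Returns the number of hints the kernel accepted.
 */
static int
prefetch_blocks(int fd, const uint32_t *blocks, int nblocks)
{
    int accepted = 0;

    assert(nblocks >= 0);
    for (int i = 0; i < nblocks; i++)
    {
        off_t offset = (off_t) blocks[i] * BLOCK_SIZE;

        /* posix_fadvise() returns 0 on success, an error number otherwise. */
        if (posix_fadvise(fd, offset, BLOCK_SIZE, POSIX_FADV_WILLNEED) == 0)
            accepted++;
    }
    return accepted;
}

/*
 * Hypothetical helper: the later, still-blocking read of one block.  If the
 * earlier hint did its job, this is served from the page cache.
 */
static ssize_t
read_block(int fd, uint32_t block, char *buf)
{
    return pread(fd, buf, BLOCK_SIZE, (off_t) block * BLOCK_SIZE);
}
```

In a redo-like loop, one would call prefetch_blocks() for the blocks referenced by upcoming records and call read_block() only when replay actually reaches them. A real AIO interface would replace the hint with a submitted read whose completion can be waited on, which is where control over batching of submissions comes in.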