Re: Let's make PostgreSQL multi-threaded

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Let's make PostgreSQL multi-threaded
Date: 2023-06-07 13:08:38
Message-ID: CAExHW5uPNB57_3FM-vpwYMFC5rZLJ6Ni6Kk3UpC6sODT+qvhAQ@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jun 5, 2023 at 8:22 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
>
> I spoke with some folks at PGCon about making PostgreSQL multi-threaded,
> so that the whole server runs in a single process, with multiple
> threads. It has been discussed many times in the past, last thread on
> pgsql-hackers was back in 2017 when Konstantin made some experiments [0].
>
> I feel that there is now pretty strong consensus that it would be a good
> thing, more so than before. Lots of work to get there, and lots of
> details to be hashed out, but no objections to the idea at a high level.
>
> The purpose of this email is to make that silent consensus explicit. If
> you have objections to switching from the current multi-process
> architecture to a single-process, multi-threaded architecture, please
> speak up.
>
> If there are no major objections, I'm going to update the developer FAQ,
> removing the excuses there for why we don't use threads [1]. And we can
> start to talk about the path to get there. Below is a list of some
> hurdles and proposed high-level solutions. This isn't an exhaustive
> list, just some of the most obvious problems:
>
> # Transition period
>
> The transition surely cannot be done fully in one release. Even if we
> could pull it off in core, extensions will need more time to adapt.
> There will be a transition period of at least one release, probably
> more, where you can choose multi-process or multi-thread model using a
> GUC. Depending on how it goes, we can document it as experimental at first.
>
> # Thread per connection
>
> To get started, it's most straightforward to have one thread per
> connection, just replacing backend process with a backend thread. In the
> future, we might want to have a thread pool with some kind of a
> scheduler to assign active queries to worker threads. Or multiple
> threads per connection, or spawn additional helper threads for specific
> tasks. But that's future work.

With multiple processes, we can use all the available cores (at least
in theory, if the processes are independent of one another). But is
that guaranteed with a single-process, multi-threaded model? A web
search did not turn up a definitive answer. It usually depends on the
OS and the architecture.

Maybe a good start is to use threads instead of parallel workers,
e.g. for parallel vacuum, parallel query and so on, while keeping
processes for connections and leaders. That by itself might take
significant time. Based on that experience, we could then move to a
completely threaded model. From my experience with other similar
products, I think we will settle on a multi-process, multi-threaded
model.
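
To make the "threads instead of parallel workers" idea concrete, here
is a minimal sketch assuming POSIX threads. It is not PostgreSQL code:
SumTask, sum_worker and parallel_sum are hypothetical names, and
summing an array merely stands in for whatever a parallel worker would
do. The leader stays an ordinary process and only the helpers become
threads.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-worker task: sum one contiguous slice of the input. */
typedef struct SumTask
{
    const long *values;     /* start of this worker's slice */
    size_t      count;      /* number of elements in the slice */
    long        result;     /* written by the worker, read after join */
} SumTask;

/* Worker thread body: plays the role of a parallel worker. */
static void *
sum_worker(void *arg)
{
    SumTask    *task = (SumTask *) arg;
    long        total = 0;

    for (size_t i = 0; i < task->count; i++)
        total += task->values[i];
    task->result = total;
    return NULL;
}

/*
 * Leader: split the array across nworkers threads, wait for all of
 * them, and combine the partial sums. Returns 0 on success, -1 on
 * failure (bad argument, out of memory, or thread creation failed).
 */
static int
parallel_sum(const long *values, size_t count, int nworkers, long *sum_out)
{
    if (nworkers < 1)
        return -1;

    pthread_t  *threads = malloc((size_t) nworkers * sizeof(pthread_t));
    SumTask    *tasks = malloc((size_t) nworkers * sizeof(SumTask));
    int         started = 0;
    int         rc = 0;

    if (threads == NULL || tasks == NULL)
    {
        free(threads);
        free(tasks);
        return -1;
    }

    size_t      chunk = count / (size_t) nworkers;
    size_t      extra = count % (size_t) nworkers;
    size_t      offset = 0;

    for (int i = 0; i < nworkers; i++)
    {
        /* The first "extra" workers take one additional element. */
        size_t      len = chunk + ((size_t) i < extra ? 1 : 0);

        tasks[i].values = values + offset;
        tasks[i].count = len;
        tasks[i].result = 0;
        offset += len;

        if (pthread_create(&threads[i], NULL, sum_worker, &tasks[i]) != 0)
        {
            rc = -1;
            break;
        }
        started++;
    }

    /* Join every thread that was started, even on the error path. */
    long        total = 0;

    for (int i = 0; i < started; i++)
    {
        pthread_join(threads[i], NULL);
        total += tasks[i].result;
    }

    free(threads);
    free(tasks);
    if (rc == 0)
        *sum_out = total;
    return rc;
}
```

A caller would do something like parallel_sum(values, 1000, 4, &sum)
and check for a zero return. Because the workers share the leader's
address space, the partial results come back through ordinary memory
rather than a shared memory segment or message queue, which is the
main simplification threads would buy here.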

--
Best Wishes,
Ashutosh Bapat
