Re: Let's make PostgreSQL multi-threaded

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Let's make PostgreSQL multi-threaded
Date: 2023-06-08 13:56:37
Message-ID: CA+TgmoaVGyHebYLwmuooHC0f58-=-NYQ-Mz4OWX8aeX5VB=W0A@mail.gmail.com
Lists: pgsql-hackers

On Wed, Jun 7, 2023 at 5:30 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-06-05 17:51:57 +0300, Heikki Linnakangas wrote:
> > If there are no major objections, I'm going to update the developer FAQ,
> > removing the excuses there for why we don't use threads [1].
>
> I think we should do this even if there's no consensus to slowly change to
> threads. There's clearly no consensus on the opposite either.

This is a very fair point.

> One interesting bit around the transition is what tooling we ought to provide
> to detect problems. It could e.g. be reasonably feasible to write something
> checking how many read-write global variables an extension has on Linux
> systems.

Yes, this would be great.
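
A minimal sketch of such a check, assuming the tool is fed the text
printed by "nm" for the extension's shared library and that symbols
typed D/d (initialized data) or B/b (.bss) are a good enough proxy for
read-write globals. The function name count_rw_globals is made up for
illustration, and the heuristic is rough: it says nothing about whether
a given variable is actually unsafe under threads.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Count symbols that nm places in writable data sections:
 * D/d (initialized data) and B/b (zero-initialized data, .bss).
 * Input is the text printed by "nm <library>", one symbol per line.
 * Hypothetical helper, not part of PostgreSQL.
 */
size_t count_rw_globals(const char *nm_output)
{
    size_t      count = 0;
    const char *line = nm_output;

    while (*line != '\0')
    {
        const char *eol = strchr(line, '\n');
        size_t      len = eol ? (size_t) (eol - line) : strlen(line);
        char        buf[512];
        char        addr[64], type[8], name[256];

        if (len >= sizeof(buf))
            len = sizeof(buf) - 1;      /* truncate overlong lines */
        memcpy(buf, line, len);
        buf[len] = '\0';

        /*
         * Defined symbols have three fields: address, type, name.
         * Undefined ones ("U") have no address and yield two fields,
         * so they are skipped here.
         */
        if (sscanf(buf, "%63s %7s %255s", addr, type, name) == 3 &&
            type[1] == '\0' && strchr("BbDd", type[0]) != NULL)
            count++;

        line = eol ? eol + 1 : line + strlen(line);
    }
    return count;
}
```

Text symbols (T/t), read-only data (R/r) and undefined symbols (U) are
deliberately not counted.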

> I don't think the control file is the right place - that seems more like
> something that should be signalled via PG_MODULE_MAGIC. We need to check this
> not just during CREATE EXTENSION, but also during loading of libraries - think
> of shared_preload_libraries.

+1.

> Yea, we definitely need the supervisor function in a separate
> process. Presumably that means we need to split off some of the postmaster
> responsibilities - e.g. I don't think it'd make sense to handle connection
> establishment in the supervisor process. I wonder if this is something that
> could end up being beneficial even in the process world.

Yeah, I've had similar thoughts. I'm not exactly sure what the
advantages of such a refactoring might be, but the current structure
feels pretty limiting. It works OK because we don't do anything in the
postmaster other than fork a new backend, but I'm not sure if that's
the best strategy. It means, for example, that if there's a ton of new
connection requests, we're spawning a ton of new processes, which
means that you can put a lot of load on a PostgreSQL instance even if
you can't authenticate. Maybe we'd be better off with a pool of
processes accepting connections; if authentication fails, that
process goes back into the pool and tries again with the next
connection. If authentication
succeeds, either that process transitions to being a regular backend,
leaving the authentication pool, or perhaps hands off the connection
to a "real backend" at that point and loops around to accept() the
next request.
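
To make the idea concrete, here is a rough sketch of a pre-forked
authentication pool. It is only an illustration under stated
assumptions, not a proposal for how PostgreSQL should implement it:
all function names are hypothetical, "authentication" is a toy check
that the client sent the string "secret\n", and the transition or
hand-off to a real backend is left as a comment.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for real authentication: the client must send
 * exactly "secret\n".  Returns 1 on success, 0 on failure. */
static int toy_authenticate(int fd)
{
    char    buf[64];
    ssize_t n = read(fd, buf, sizeof(buf) - 1);

    if (n <= 0)
        return 0;
    buf[n] = '\0';
    return strcmp(buf, "secret\n") == 0;
}

/* Authenticate one already-accepted connection and tell the client the
 * outcome.  Returns 1 if the client authenticated, 0 otherwise. */
int handle_client(int fd)
{
    int         ok = toy_authenticate(fd);
    const char *reply = ok ? "OK\n" : "NO\n";

    if (write(fd, reply, 3) != 3)
        return 0;
    return ok;
}

/* Worker loop: every pool member blocks in accept() on the same
 * listening socket, so the kernel distributes connections among them. */
static void auth_worker(int listen_fd)
{
    for (;;)
    {
        int fd = accept(listen_fd, NULL, NULL);

        if (fd < 0)
        {
            if (errno == EINTR)
                continue;
            _exit(1);           /* unexpected error: give up in this sketch */
        }
        if (handle_client(fd))
        {
            /* Authenticated.  This is where the process would either
             * become a regular backend or pass fd to one; the sketch
             * does neither. */
        }
        close(fd);              /* either way, loop back to accept() */
    }
}

/* Fork nworkers pool members, recording their PIDs in pids[].
 * Returns 0 on success, -1 if fork() fails. */
int start_auth_pool(int listen_fd, int nworkers, pid_t *pids)
{
    for (int i = 0; i < nworkers; i++)
    {
        pid_t pid = fork();

        if (pid < 0)
            return -1;
        if (pid == 0)
            auth_worker(listen_fd);     /* child: never returns */
        pids[i] = pid;
    }
    return 0;
}
```

A caller would create a listening socket with socket(), bind() and
listen(), then call start_auth_pool() with it. The number of processes
is fixed up front, so a flood of unauthenticated connection attempts
keeps the existing workers busy but does not spawn new processes.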

Whether that's a good idea in detail or not, the point remains that
having the postmaster handle this task is quite limiting. It forces us
to hand off the connection to a new process at the earliest possible
stage, so that the postmaster remains free to handle other duties.
Giving the responsibility to another process would let us make
decisions about where to perform the hand-off based on real
architectural thought rather than being forced to do it a certain way
because nothing else will work.

--
Robert Haas
EDB: http://www.enterprisedb.com
